John Baldwin [Tue, 16 Sep 2008 16:18:36 +0000 (16:18 +0000)]
- Only set i_offset in the parent directory's i-node during a lookup for
non-LOOKUP operations.
- Relax a VOP assertion for a DELETE lookup. rename() uses WANTPARENT
instead of LOCKPARENT when looking up the source pathname. ufs_rename()
uses a relookup() to lock the parent directory when it decides to finally
remove the source path. Thus, it is ok for a DELETE with WANTPARENT set
instead of LOCKPARENT to use a shared vnode lock rather than an exclusive
vnode lock.
John Baldwin [Tue, 16 Sep 2008 16:15:38 +0000 (16:15 +0000)]
vdropl() drops the vnode interlock. Thus, the code in the QUOTA case that
upgrades the vnode lock if it is share locked was dropping the interlock
before actually checking VI_DOOMED. Fix this by do the vdropl() after the
check and relying on it to drop the vnode interlock.
Ed Schouten [Tue, 16 Sep 2008 14:57:23 +0000 (14:57 +0000)]
Fix minor TTY API inconsistency.
Unlike tty_rel_gone() and tty_rel_sess(), the tty_rel_pgrp() routine
does not unlock the TTY. I once had the idea to make the code call
tty_rel_pgrp() and tty_rel_sess(), picking up the TTY lock once. This
turned out a little harder than I expected, so this is how it works now.
It's a lot easier if we just let tty_rel_pgrp() unlock the TTY, because
the other routines do this anyway.
When attempt is made to suspend a filesystem that is already syspended,
wait until the current suspension is lifted instead of silently returning
success immediately. The consequences of calling vfs_write() resume when
not owning the suspension are not well-defined at best.
Add the vfs_susp_clean() mount method to be called from
vfs_write_resume(). Set it to process_deferred_inactive() for ffs, and
stop calling it manually.
Add the thread flag TDP_IGNSUSP that allows to bypass the suspension
point in the vn_start_write. It is intended for use by VFS in the
situations where the suspender want to do some i/o requiring calls to
vn_start_write(), and this i/o cannot be done later.
Add the ffs structures introspection functions for ddb.
Show the b_dep value for the buffer in the show buffer command.
Add a comand to dump the dirty/clean buffer list for vnode.
Reviewed by: tegge
Tested and used by: pho
MFC after: 1 month
When downgrading the read-write mount to read-only, do_unmount() sets
MNT_RDONLY flag before the VFS_MOUNT() is called. In ufs_inactive()
and ufs_itimes_locked(), UFS verifies whether the fs is read-only by
checking MNT_RDONLY, but this may cause loss of the IN_MODIFIED flag
for inode on the fs being remounted rw->ro.
Introduce UFS_RDONLY() struct ufsmount' method that reports the value
of the fs_ronly. The later is set to 1 only after the remount is
finished.
The struct inode *ip supplied to softdep_freefile is not neccessary the
inode having number ino. In r170991, the ip was marked IN_MODIFIED, that
is not quite correct.
Mark only the right inode modified by checking inode number.
David Xu [Tue, 16 Sep 2008 01:46:11 +0000 (01:46 +0000)]
Allow multiple locks to be acquired by detecting corresponding
bit flag, otherwise if a thread acquired a lock, another thread
or the current thread itself can no longer acquire another lock
because thread_mask_set() return whole flag word, this results
bit leaking in the word and misbehavior in later locking and
unlocking.
Remove the tracing from the AP startup. The AP is known
to start and the tracing can interfere with AP startup.
Instead, use the available space in the reset vector
for the initial stack.
Sam Leffler [Mon, 15 Sep 2008 22:45:14 +0000 (22:45 +0000)]
Make ddb command registration dynamic so modules can extend
the command set (only so long as the module is present):
o add db_command_register and db_command_unregister to add and remove
commands, respectively
o replace linker sets with SYSINIT's (and SYSUINIT's) that register
commands
o expose 3 list heads: db_cmd_table, db_show_table, and db_show_all_table
for registering top-level commands, show operands, and show all operands,
respectively
While here also:
o sort command lists
o add DB_ALIAS, DB_SHOW_ALIAS, and DB_SHOW_ALL_ALIAS to add aliases
for existing commands
o add "show all trace" as an alias for "show alltrace"
o add "show all locks" as an alias for "show alllocks"
John Baldwin [Mon, 15 Sep 2008 22:26:32 +0000 (22:26 +0000)]
Rework the handling of interrupt handlers for children of ppc and ppbus:
- Retire IVARs for passing IRQs around. Instead, ppbus and ppc now allow
child devices to access the interrupt by via a rid 0 IRQ resource
using bus_alloc_resource_any().
- ppc creates its own interrupt event to manage the interrupt handlers of
child devices. ppc does not allow child devices to use filters. It
could allow this if needed, but none of the current drivers use them
and it adds a good bit of complication. It uses
intr_event_execute_handlers() to fire the child device interrupt handlers
from its threaded interrupt handler.
- Remove the ppbus_dummy_intr() hack. Now the ppc device always has an
interrupt handler registered and we no longer bounce all the way up to
nexus to manage adding/removing ppbus child interrupt handlers. Instead,
the child handlers are added and removed to the private interrupt event
in the ppc device.
John Baldwin [Mon, 15 Sep 2008 22:19:44 +0000 (22:19 +0000)]
Expose a new public routine intr_event_execute_handlers() which executes
all the non-filter handlers attached to an interrupt event. This can be
used by device drivers which multiplex their interrupt onto the interrupt
handlers for child devices.
Ed Schouten [Mon, 15 Sep 2008 15:09:35 +0000 (15:09 +0000)]
Allow COMPAT_SVR4 to be built without COMPAT_43.
It seems we only depend on COMPAT_43 to implement the send() and recv()
routines. We can easily implement them using sendto() and recvfrom(),
just like we do inside our very own C library.
I wasn't able to really test it, apart from simple compilation testing.
I've heard rumours that COMPAT_SVR4 is broken inside execve() anyway.
It's still worth to fix this, because I suspect we'll get rid of
COMPAT_43 somewhere in the future...
Joseph Koshy [Mon, 15 Sep 2008 06:47:52 +0000 (06:47 +0000)]
Correct a callchain capture bug on the i386.
On the i386 architecture, the processor only saves the current value
of `%esp' on stack if a privilege switch is necessary when entering
the interrupt handler. Thus, `frame->tf_esp' is only valid for
an entry from user mode. For interrupts taken in kernel mode, we
need to determine the top-of-stack for the interrupted kernel
procedure by adding the appropriate offset to the current frame
pointer.
Reported by: kris, Fabien Thomas
Tested by: Fabien Thomas <fabien.thomas at netasq dot com>
rewrite rt_check. Ztake into account that whiel teh rtentry is unlocked,
someone else might change it, so after we re-acquire the lock on it,
we need to check it is still valid. People have been panicing in this
function due to soem edge cases which I have hopefully removed.
Dont worry about PSL_RI (restartable interrupt indicator) in
common PowerPC code when all we want to achieve is to enable
external interrupts. We can set PSL_RI at any time before we
allow interrupts and/or exceptions, so move it to the AIM
specific initialization and do it when we also set PSL_ME
(machine check enable).
Widen psaddr_t from uintptr_t to uint64_t. This results in an
ABI change on ILP32 platforms and relating to events. However
it's harmless on little-endian ILP32 platforms in the sense
that it doesn't cause breakages. Old ILP32 thread libraries
write a 32-bit th_p and new thread libraries write a 64-bit
th_p. But due to the fact that we have an unused 32-bit data
field right after th_p and that field is always initialized to
zero, little-endian ILP32 machines effectively have a valid
64-bit th_p by accident. Likewise for new thread libraries and
old libthread_db: little endian ILP32 is unaffected.
At this time we don't support big-endian threaded applications
in GDB, so the breakage for the ILP32 case goes unnoticed.
Allow psaddr_t to be widened by using thr_pread_{int,long,ptr},
where critical. Some places still use ps_pread/ps_pwrite directly,
but only need changed when byte-order comes into the picture.
Also, change th_p in td_event_msg_t from a pointer type to
psaddr_t, so that events also work when psaddr_t is widened.
Ed Schouten [Sun, 14 Sep 2008 11:50:19 +0000 (11:50 +0000)]
Make `quot -a' work when we've got slashes in the device name.
A very long time ago we had raw device nodes. quot(8) was supposed to
use these when running `quot -a'. For some reason the code got once
changed to strip the device name until it reaches the last slash. This
is not reliable, because this means /dev/mirror/foo will be stripped to
/dev/foo.
This bug also exists on RELENG_7 and RELENG_6, but I think I'll just
merge them back somewhere after the upcoming releases. There's no rush.
Revert a part of the MRT commit that proved un-needed.
rt_check() in its original form proved to be sufficient and
rt_check_fib() can go away (as can its evil twin in_rt_check()).
I believe this does NOT address the crashes people have been seeing
in rt_check.
Tim Kientzle [Sun, 14 Sep 2008 03:49:00 +0000 (03:49 +0000)]
Clean up flags support just a tad: FreeBSD support depends on
HAVE_STRUCT_STAT_ST_FLAGS, Linux support depends on the
existence of the appropriate ioctl() options. In particular,
this should fix some nagging compile errors on Linux platforms
that don't have e2fsprogs-devel installed.
Tim Kientzle [Sun, 14 Sep 2008 02:16:04 +0000 (02:16 +0000)]
Test handling of restores relative to symlinks.
In particular:
* tar -x -P follows symlinks to existing dirs, but not without -P
* symlinks to files are always replaced
* broken symlinks are always replaced
Instead of building up a "struct nfs_args" to pass to the kernel
via nmount(), build up an iovec where each iovec member is an NFS mount
option, and pass the iovec down to the kernel via nmount(). These options
are then parsed in the kernel.
This should make it easier to add new NFS mount options in future.
Many, many thanks to Doug Rabson for taking my initial patches,
and cleaning them up. In addition, Doug added a fallback_mount()
function so that the newer mount_nfs program will work against older
kernels, to facilitate upgrading/downgrading scenarios.
Doug also re-wrote the mount_nfs.8 man page.
Add code to parse NFS mount options passed as individual
items of the nmount() iovec. This will allow us to move
away from gathering up all the NFS mount options as a single
"struct nfs_args" to be passed down through nmount().
This will make adding new NFS mount options much easier.
Many, many thanks to Doug Rabson, who took my initial patches and
cleaned them up.
Alexander Motin [Sat, 13 Sep 2008 16:56:03 +0000 (16:56 +0000)]
My massive snd_hda driver update.
Because of using more clear and same time more functional codec parser
new driver is able to handle more codecs, use them better then before and
without most of previous quirks. All of tested codecs itself manage playback,
record, input mixing and monitoring quite fine. In all of investigated
trouble cases problem was found or in nonstandard codec usage or incorrect
codec configuration made by BIOS. Most of that cases could be fixed using
device hints, some of which are already included to the driver.
New driver supports multiple codecs per HDA bus, multiple audio function
groups per codec and multiple logical sound devices per audio function group.
So don't worry when you get several PCM devices instead of one, it is normal.
It is usual situation with powerful codecs to provide, for example, 3 PCM
devices: one for 7.1 playback and main recording, one for headset and one
for digital SPDIF I/O.
New driver implements Universal Audio Architecture (UAA) much better then
previous one. Most information about recommended codec usage now taken from
the codec configuration registers initialized by BIOS. User may alter that
configuration using device hints to reconfigure logical audio devices to
his needs in a very broad range up to the limits of the codec functionality.
New driver supports digital PCM playback and AC3 pass-through. I am not sure
about completeness of this implementation, but I have several success stories
including my own. Vchans subsystem does not support AC3 pass-through so it
had to be disabled for that devices at this moment.
New driver is ready for multichannel playback, but until our OSS is unable
to use this it will just duplicate same stereo stream into all channel
pairs.
New driver supports suspend/resume. I am unable to really test this part
myself, but I have got several success stories.
Driver has very informative verbose boot messages. So if you have any
questions or problems - enable and read them first.
Alexander Motin [Sat, 13 Sep 2008 09:17:02 +0000 (09:17 +0000)]
We can't implicitly trust the hook on NGQF_FN/NGQF_FN2 processing in
ng_apply_item(). There are possible (and I have got one) use-after-free
class panics because of it.
If hook is specified, require it to be valid at the apply time. The only
exceptions are the internal ng_con_part2(), ng_con_part3() and
ng_rmhook_part2() functions which are specially made to work with invalid
hooks.
- For any lock list we hold the head in order to reduce allocation from
the free list and in this way avoid contention on the w_mtx.
In order to make the code simple, we rely on the rule that when the head
has not a child it also doesn't have other subsequent entries.
Actually this assertion is broken because we can free all the head
children and quit witness_unlock() with the head still allocated, with no
children and subsequent entries present.
Fix this by shifting the head if other entries are present and still
freeing the object, but leaving always an head.
- Fix witness_thread_has_locks() in order to report, correctly, if the
lock list linked to a specific thread has children or not based on the
above explained rule.
- Fix a printout into DDB's "show alllocks" command in order to show,
correctly, the process name that is really what we want.
- Fix style(9) for a comment.
Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
Reported by: Marko Kiiskila <marko dot kiiskila at nokia dot com>
Sponsored by: Nokia
Make mlxcontrol work with more than one system drive:
- When searching for the next system drive, return the next one instead
of always returning the first one.
- Plug fd lead and make sure that the MLX_NEXT_CHILD ioctl is called
on the controller fd, not the disk's one.
When doing rfork(0), i.e. separating curproc VM from any other user of
the same vmspace, decrement the reference count of the shared LDT instead
of a newly-made copy. Code factually removed LDT from the process that
did rfork(0).
Introduce user_ldt_deref() function that does decrement of refcount for
the struct proc_ldt, and call it in the rfork(0) case on the shared LDT.
The user_ldt_alloc() function shall return with dt_lock locked.
The user_ldt_free() function shall return with dt_lock unlocked.
Error handling code in both functions do not handle this, fix it by
doing necessary lock/unlock.
Remove warning about static LDT segment allocation. Applications
continue using it after ~7 years since warning was introduced, and there
is no reason to discourage them.
Bandaid: disable interrupts to make sure intr_enabled and the IER register
are in sync. I'm not sure why it is needed, and why it wouldn't be on other
arm platforms, but it prevents a lockup under heavy I/O.
Remove the unused field "pc_prvspace" from the MD fields for the struct
pcpu. There's not even a thing such as a "struct pcup".
While I'm there, remove a comment that makes no sense for arm.
John Baldwin [Thu, 11 Sep 2008 18:33:57 +0000 (18:33 +0000)]
Update the comments above the 0xcf9 register reset attempt to match the
code. We only attempt a single reset using this method (a "hard" reset),
and we use two writes to ensure there is a 0 -> 1 transition in bit 2 to
force a reset.
- Fix nexus_setup_intr() abuse of setting up multiple IRQs in one go. Calling
arm_setup_irqhandler() in loop is bogus, as there's just one cookie given
from the caller and it is overwritten in each iteration so that only the
last handler's cookie value prevails.
- Proper intr masking/unmasking handling: the IRQ source is masked at PIC level
only after the last handler has been removed from the list.
Reviewed by: cognet, imp, sam, stass
Obtained from: Grzegorz Bernacki gjb ! semihalf dot com