pjd [Sun, 16 Dec 2012 14:53:27 +0000 (14:53 +0000)]
Move expand_name() after process lock is released.
This fixes a panic where we hold mutex (process lock) and try to obtain sleepable
lock (vnode lock in expand_name()). The panic could occur when %I was used
in kern.corefile.
Additionally we avoid expand_name() overhead when coredumps are disabled.
pjd [Sat, 15 Dec 2012 22:26:16 +0000 (22:26 +0000)]
sbuf_trim() cannot be used on sbuf with drain function set.
This fixes panic when listing sysctls on INVARIANTS-enabled kernel while
having wbwd loaded.
This panic was not fatal, at worst one additional space was printed.
Also sbuf_trim() makes some sense even if drain function is set. The drain
function is called only when buffer is to be expanded. So we could still trim
existing buffer before drain is called. In this case it worked just fine - the
trailing space was correctly trimmed.
The problem is clang will move the two arrays out of the .ctors and .dtors
sections causing these sections to contain a single null address. By not
defining these macros we use the version of the code that places the arrays
in their sections by using __attribute__((section(".ctors"))) and similar
for .dtors.
Submitted by: Daisuke Aoyama <aoyama AT peach.ne.jp>
ae [Sat, 15 Dec 2012 20:04:24 +0000 (20:04 +0000)]
In addition to the tailq of IPv6 addresses, add a hash table.
For now use 256 buckets and fnv_hash function. Use xor'ed 32-bit
s6_addr32 parts of in6_addr structure as a hash key. Update
in6_localip and in6_is_addr_deprecated to use the hash table for faster
lookup.
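
The bucket selection described above can be sketched in userland C as follows.
This is a minimal illustration under stated assumptions, not the committed
code: the names in6_hash_key(), in6_hash_bucket() and fnv1_32() are
hypothetical, and the FNV-1 constants are the standard 32-bit ones, which may
differ from what the kernel's fnv_hash implementation uses.

```c
#include <stddef.h>
#include <stdint.h>

#define IN6_HASH_BUCKETS 256u	/* bucket count named in the commit message */

/* Hypothetical helper: fold the four 32-bit words of an address into a key. */
static uint32_t
in6_hash_key(const uint32_t addr32[4])
{
	return (addr32[0] ^ addr32[1] ^ addr32[2] ^ addr32[3]);
}

/* Standard 32-bit FNV-1 over a byte buffer (constants are an assumption). */
static uint32_t
fnv1_32(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t h = 2166136261u;

	while (len-- > 0) {
		h *= 16777619u;
		h ^= *p++;
	}
	return (h);
}

/* Hypothetical helper: map an address to one of the 256 buckets. */
static uint32_t
in6_hash_bucket(const uint32_t addr32[4])
{
	uint32_t key = in6_hash_key(addr32);

	return (fnv1_32(&key, sizeof(key)) % IN6_HASH_BUCKETS);
}
```

A lookup would then walk only the chain of the selected bucket instead of the
whole tailq.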
trociny [Sat, 15 Dec 2012 18:21:09 +0000 (18:21 +0000)]
Change `iostat -Ix` to display total duration of transactions instead
of average duration, and total busy time instead of %.
This looks more useful when one runs `iostat -Ix` periodically to
collect statistics: e.g. now it is possible to calculate busy %
between two runs subtracting total busy times and dividing per time
period.
Average duration and % busy are still available via `iostat -x`.
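
The busy % calculation mentioned above could look roughly like this. It is a
sketch only: busy_percent() is a hypothetical helper, and it assumes that both
the cumulative busy times and the sampling interval are expressed in seconds.

```c
/*
 * Busy percentage between two samples of cumulative busy time.
 * busy_prev and busy_now are totals from two consecutive runs,
 * interval is the wall-clock time between the runs (same unit).
 */
static double
busy_percent(double busy_prev, double busy_now, double interval)
{
	if (interval <= 0.0)
		return (0.0);	/* avoid dividing by a zero or negative period */
	return (100.0 * (busy_now - busy_prev) / interval);
}
```

For example, totals of 10 and 25 seconds taken 60 seconds apart give
100 * (25 - 10) / 60 = 25% busy for that period.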
rwatson [Sat, 15 Dec 2012 15:21:09 +0000 (15:21 +0000)]
Four .c files from OpenBSM are used, in modified form, by the kernel to
implement the BSM audit trail format. Rename the kernel versions of the
files to match the userspace filenames so that it's easier to work out
what they correspond to, and therefore ensure they are kept in-sync.
rwatson [Sat, 15 Dec 2012 14:59:00 +0000 (14:59 +0000)]
Merge OpenBSM 1.2-alpha3 from the vendor branch to 10-CURRENT; this version
included various upstreamed patches from the FreeBSD base to make OpenBSM
compile more easily with bmake, higher warning levels, clang, and several
other loose ends.
kib [Sat, 15 Dec 2012 02:04:46 +0000 (02:04 +0000)]
When mnt_vnode_next_active iterator cannot lock the next vnode and
yields, specify the user priority for the yield. Otherwise, a
higher-priority (kernel) thread could fall into priority inversion
with the thread owning the mutex lock.
On single-processor machines or UP kernels, do not loop adaptively
when the next vnode cannot be locked, instead yield unconditionally.
Restructure the iteration initializer and the iterator to remove code
duplication. Put the code to fetch and lock a vnode next to the
current marker, into the mnt_vnode_next_active() function, and use it
instead of repeating the loop.
Reported by: hrs, rmacklem
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
kib [Sat, 15 Dec 2012 02:02:11 +0000 (02:02 +0000)]
Remove a special case for XEN, which is erroneous and makes vfork(2)
behaviour differ from the documented one, only on XEN. If there are
any issues with XEN pmap left, they should be fixed in pmap.
rmacklem [Fri, 14 Dec 2012 21:49:06 +0000 (21:49 +0000)]
The group list for a non-default export entry (a host/subnet one)
was being copied from the wrong place. This patch fixes that.
This could cause access failures for mapped users, when the group
permissions were needed.
PR: 147998
Submitted by: Christopher Key (cjk32 at cam.ac.uk)
MFC after: 2 weeks
pjd [Fri, 14 Dec 2012 15:12:08 +0000 (15:12 +0000)]
- When checking if a dump exists on the given device there is no need to
provide dump directory. Eliminate this redundant argument. This changes
the usage, but the only risk here is that a warning will be printed
about directory given as device.
- Update usage of -C option.
- When clearing dump header from the given device there is also no need to
provide dump directory, although additional arguments for -c were not
documented.
- Document that -v can be used with -c and that list of devices can be given.
bryanv [Fri, 14 Dec 2012 05:27:56 +0000 (05:27 +0000)]
virtio: Start taskqueues threads after attach cannot fail
If virtio_setup_intr() failed during boot, we would hang in
taskqueue_free() -> taskqueue_terminate() waiting for all the taskq
threads to terminate. This will never happen since the
scheduler is not running by this point.
jimharris [Thu, 13 Dec 2012 21:40:11 +0000 (21:40 +0000)]
Add bus_space_read_8 and bus_space_write_8 for amd64.
Rather than trying to KASSERT for callers that invoke this on
IO tags, either do nothing (for write_8) or return ~0 (for read_8).
Using KASSERT here would make bus.h too messy, both by polluting bus.h
with systm.h (for any number of drivers that include bus.h without first
including systm.h) and for ports that use bus.h directly
(e.g. libpciaccess), as reported by zeising@.
Also don't try to implement all of the other bus_space functions for
8 byte access since realistically only these two are needed for some
devices that expose 64-bit memory-mapped registers.
Put the amd64-specific functions here rather than sys/amd64/include/bus.h
so that we can keep this header unified for x86, as requested by mdf@
and tijl@.
Submitted by: Carl Delsey <carl.r.delsey@intel.com>
MFC after: 3 days
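
The behaviour described in this entry (do nothing for write_8 and return ~0
for read_8 on IO tags) can be illustrated with the userland sketch below. All
names here (sketch_tag, SKETCH_SPACE_IO, SKETCH_SPACE_MEM, sketch_read_8(),
sketch_write_8()) are hypothetical stand-ins and not the bus.h definitions;
the memory path is modelled as a plain volatile access.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the I/O-port and memory-mapped tags. */
enum sketch_tag { SKETCH_SPACE_IO, SKETCH_SPACE_MEM };

static uint64_t
sketch_read_8(enum sketch_tag tag, uintptr_t handle, uintptr_t offset)
{
	if (tag == SKETCH_SPACE_IO)
		return (~(uint64_t)0);	/* no 8-byte port I/O: report all ones */
	return (*(volatile uint64_t *)(handle + offset));
}

static void
sketch_write_8(enum sketch_tag tag, uintptr_t handle, uintptr_t offset,
    uint64_t value)
{
	if (tag == SKETCH_SPACE_IO)
		return;			/* silently ignore writes on IO tags */
	*(volatile uint64_t *)(handle + offset) = value;
}
```

Returning all ones mirrors what a read from a non-responding device typically
yields, so callers need no assertion machinery in the header.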
smh [Thu, 13 Dec 2012 17:06:38 +0000 (17:06 +0000)]
Upgrades trim free request sizes before inserting them into the free map,
making range consolidation much more effective particularly for small
deletes.
This reduces memory used by the free map as well as reducing the number
of bio requests down to geom required to process all deletes.
In tests this achieved a factor of 10 reduction of trim ranges / geom
call downs.
While I'm here correct the description of zio_vdev_io_start.
PR: kern/173254
Submitted by: Steven Hartland
Approved by: pjd (mentor)
glebius [Thu, 13 Dec 2012 12:48:57 +0000 (12:48 +0000)]
Initialize state id prior to attaching state to key hash. Otherwise a
race can happen, when pf_find_state() finds state via key hash, and locks
id hash slot 0 instead of appropriate to state id slot.
glebius [Thu, 13 Dec 2012 11:11:15 +0000 (11:11 +0000)]
Fix problem in r238990. The LLE_LINKED flag should be tested prior to
entering llentry_free(), and if we lose the race, we should simply
perform LLE_FREE_LOCKED(). Otherwise, if the race is lost by the thread
performing arptimer(), it will remove two references from the lle instead
of one.
dteske [Wed, 12 Dec 2012 17:49:01 +0000 (17:49 +0000)]
Fix a regression caused by SVN r222417.
Prior to r222417, setting `password' in loader.conf(5) did not prevent boot
but instead only prevented changes to boot options by prompting for password
if autoboot failed or the user interrupted the countdown sequence.
After r222417 the same machine with `password' set in loader.conf(5) would no
longer boot without _always_ entering the password.
This patch restores the old (8.x and older) functionality for password in
loader.conf(5) while adding a new bootlock_password feature to replace the
edge-case should anybody desire the regressed functionality (HINT: great for
PXE servers and/or private distributions).
loader.conf(5) was updated to be clearer with respect to password setting
(previous text was misleading).
Documentation (loader.conf(5) and check-password.4th(8)) has been updated to
include notes on the new bootlock_password setting.
Special thanks to Alex Verbod for bringing this to my attention and helping to
refine the loader.conf(5) text.
glebius [Wed, 12 Dec 2012 17:41:21 +0000 (17:41 +0000)]
Fix a crash in tcp_input(), that happens when mbuf has a fwd_tag on it,
but later after processing and freeing the tag, we need to jump back again
to the findpcb label. Since the fwd_tag pointer wasn't NULL we tried to
process and free the tag a second time.
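
The failure mode above is the usual stale-pointer-across-retry pattern. The
sketch below shows one way to avoid it, assumed here for illustration and not
taken from the actual tcp_input() change: the pointer is cleared right after
the tag is freed, so a later jump back to the label skips it. The names
sketch_tag and sketch_input() are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

struct sketch_tag { int unused; };

/*
 * Process an optional heap-allocated tag, then loop back to the label
 * 'retries' more times. Returns how many times the tag was processed
 * and freed; with the pointer cleared this is at most one.
 */
static int
sketch_input(struct sketch_tag *fwd_tag, int retries)
{
	int processed = 0;

findpcb:
	if (fwd_tag != NULL) {
		processed++;
		free(fwd_tag);
		fwd_tag = NULL;	/* without this, a retry would free it again */
	}
	if (retries-- > 0)
		goto findpcb;
	return (processed);
}
```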
mav [Wed, 12 Dec 2012 11:53:15 +0000 (11:53 +0000)]
Add IDs for SATA controllers on AMD Hudson-2 series chipsets.
I am not exactly sure about the naming due to lack of specs on AMD site,
but it is better to have some identification than none at all.
glebius [Tue, 11 Dec 2012 08:37:08 +0000 (08:37 +0000)]
Merge 1.127 from OpenBSD, that closes a regression from 1.125 (merged
as r242694):
do better detection of when we have a better version of the tcp sequence
windows than our peer.
this resolves the last of the pfsync traffic storm issues I've been able to
produce, and therefore makes it possible to do usable active-active
stateful firewalls with pf.
alfred [Tue, 11 Dec 2012 01:23:50 +0000 (01:23 +0000)]
Switch the hardwired WITNESS panics to kassert_panic.
This is an ongoing effort to provide runtime debug information
useful in the field that does not panic existing installations.
This gives us the flexibility needed when shipping images to a
potentially large audience with WITNESS enabled without worrying
about formerly non-fatal LORs hurting a release.
alfred [Mon, 10 Dec 2012 23:17:08 +0000 (23:17 +0000)]
Add CTLFLAG_STATS to sysctl flags
In preparation for sysctl(8) growing the ability to only print
out boot/run-time tunables we need a way to differentiate between
RW sysctl nodes that tune a particular thing, or simply export
a stat that we want to allow the sysadmin to reset to 0 (or some
other value).
To do so, we add the CTLFLAG_STATS which should be OR'd into the
CTLFLAGs when exporting a "writable/resettable" statistic node via
sysctl.
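
How a tool might use the new flag to separate tunables from resettable
statistics can be sketched as follows. The SK_-prefixed macros use
illustrative bit values chosen for this example only (the real definitions
live in sys/sysctl.h), and sk_is_tunable() is a hypothetical helper, not part
of sysctl(8).

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; not the real sys/sysctl.h definitions. */
#define SK_CTLFLAG_RD		0x1u
#define SK_CTLFLAG_WR		0x2u
#define SK_CTLFLAG_RW		(SK_CTLFLAG_RD | SK_CTLFLAG_WR)
#define SK_CTLFLAG_STATS	0x4u

/* A node counts as a tunable if it is writable and not marked as a stat. */
static bool
sk_is_tunable(uint32_t flags)
{
	return ((flags & SK_CTLFLAG_WR) != 0 &&
	    (flags & SK_CTLFLAG_STATS) == 0);
}
```

A writable counter exported with SK_CTLFLAG_RW | SK_CTLFLAG_STATS is then
skipped when only tunables are requested, while a plain SK_CTLFLAG_RW node is
still listed.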
kib [Mon, 10 Dec 2012 20:44:09 +0000 (20:44 +0000)]
Do not yield while owning a mutex. The Giant reacquire in the
kern_yield() is problematic then.
The owned mutex is the mount interlock, and it is in fact not needed
to guarantee the stability of the mount list of active vnodes, so fix
the issue by only taking the mount interlock for MNT_REF and
MNT_REL operations.
While there, augment the unconditional yield by some amount of
spinning [1].
Reported and tested by: pho
Reviewed by: attilio
Submitted by: attilio [1]
MFC after: 3 days