Some improvements to GEOM MULTIPATH:
- Implement "configure" command to allow switching operation mode of
running device on-fly without destroying and recreation.
- Implement Active/Read mode as hybrid of Active/Active and Active/Passive.
In this mode all paths not marked FAIL may handle reads same time,
but unlike Active/Active only one path handles write requests at any
point in time. It allows to closer follow original write request order
if above layers need it for data consistency (not waiting for requisite
write completion before sending dependent write).
- Hide duplicate messages about device status change.
- Remove periodic thread wake up with 10Hz rate.
Import jemalloc b57d3ec571c6551231be62b7bf92c084a8c8291c (dev branch,
prior to 3.0.0 release), which supports atomic operations based on atomic(9).
This should fix build failures for several platforms.
Drop export of vdestroy() function from kern/vfs_subr.c as it is
used only as a helper function in that file. Replace sole call to
vbusy() with inline code in vholdl(). Replace sole calls to vfree()
and vdestroy() with inline code in vdropl().
The Clang compiler already inlines these functions, so they do not
show up in a kernel backtrace which is confusing. Also you cannot
set their frame in kgdb which means that it is impossible to view
their local variables. So, while the produced code is unchanged,
the debugging should be easier.
Replace the MNT_VNODE_FOREACH interface with MNT_VNODE_FOREACH_ALL.
The primary changes are that the user of the interface no longer
needs to manage the mount-mutex locking and that the vnode that
is returned has its mutex locked (thus avoiding the need to check
to see if its is DOOMED or other possible end of life senarios).
To minimize compatibility issues for third-party developers, the
old MNT_VNODE_FOREACH interface will remain available so that this
change can be MFC'ed to 9. Following the MFC to 9, MNT_VNODE_FOREACH
will be removed in head.
The reason for this update is to prepare for the addition of the
MNT_VNODE_FOREACH_ACTIVE interface that will loop over just the
active vnodes associated with a mount point (typically less than
1% of the vnodes associated with the mount point).
Fix bug where NFSv4 ACL enforcement code wouldn't unconditionally
allow the owner to read and write ACL and file attributes when there
was no entry with subject matching the owner. In other words,
'getfacl meh' shouldn't fail for the owner if the ACL looks like this:
GEOM_JOURNAL: Shutting down geom gjournal 3464572051.
panic: destroying non-empty racct: 1 allocated for resource 6
which were tracked by jh@ to be caused by checking p->p_flag,
while it wasn't initialised yet. Basically, during fork, the code
checked p_flag, concluded the process isn't marked as P_SYSTEM,
incremented the counter, and later on, when exiting, checked that
the process was marked as P_SYSTEM, and thus didn't decrement it.
Also, I believe there wasn't any good reason for checking P_SYSTEM
in the first place.
Fix panic at boot with SD/MMC readers with no media present, introduced
at r234177. Note that this is a temporary fix, until I come up with something
prettier.
Import jemalloc 9ef7f5dc34ff02f50d401e41c8d9a4a928e7c2aa (dev branch,
prior to 3.0.0 release) as contrib/jemalloc, and integrate it into libc.
The code being imported by this commit diverged from
lib/libc/stdlib/malloc.c in March 2010, which means that a portion of
the jemalloc 1.0.0 ChangeLog entries are relevant, as are the entries
for all subsequent releases.
dim [Mon, 16 Apr 2012 21:23:25 +0000 (21:23 +0000)]
Upgrade our copy of llvm/clang to trunk r154661, in preparation of the
upcoming 3.1 release (expected in a few weeks). Preliminary release
notes can be found at: <http://llvm.org/docs/ReleaseNotes.html>
- Implement pipe2 syscall for Linuxulator. This syscall appeared in 2.6.27
but GNU libc used it without checking its kernel version, e. g., Fedora 10.
- Move pipe(2) implementation for Linuxulator from MD files to MI file,
sys/compat/linux/linux_file.c. There is no MD code for this syscall at all.
- Correct an argument type for pipe() from l_ulong * to l_int *. Probably
this was the source of MI/MD confusion.
- When interrupt is not requested for VM86 call, make a fake exit point and
push the address onto stack as we do for INTn emulation. This avoids stack
underflow when we encounter RETF instruction in VM86 mode. Lack of this
exit point actually caused page fault in VM86 mode with VESA module when we
resume from suspend state[1].
- Remove unnecessary CLI and STI instructions from BIOS interrupt emulation.
INTn and IRET must be able to emulate the flag correctly.
marius [Mon, 16 Apr 2012 18:29:07 +0000 (18:29 +0000)]
Turn on PREEMPTION by default. After fixing several bugs over time, the
last show-stopper keeping PREEMPTION from being usable on sparc64 should
have been dealt with in r230662.
At least on 2-way systems, PREEMPTION causes a little bit of a degradation
in worldstone performance. However, FreeBSD seems to have started building
up regressions in !PREEMPTION cases so sparc64 better should not be an
oddball in this regard.
tmpfs: Allow update mounts only for certain options.
Since r230208 update mounts were allowed if the list of mount options
contained the "export" option. This is not correct as tmpfs doesn't
really support updating all options.
When we receive an ICMP unreach need fragmentation datagram, we take
proposed MTU value from it and update the TCP host cache. Then
tcp_mss_update() is called on the corresponding tcpcb. It finds the
just allocated entry in the TCP host cache and updates MSS on the
tcpcb. And then we do a fast retransmit of what we have in the tcp
send buffer.
This sequence gets broken if the TCP host cache is exausted. In this
case allocation fails, and later called tcp_mss_update() finds nothing
in cache. The fast retransmit is done with not reduced MSS and is
immidiately replied by remote host with new ICMP datagrams and the
cycle repeats. This ping-pong can go up to wirespeed.
To fix this:
- tcp_mss_update() gets new parameter - mtuoffer, that is like
offer, but needs to have min_protoh subtracted.
- tcp_mtudisc() as notification method renamed to tcp_mtudisc_notify().
- tcp_mtudisc() now accepts not a useless error argument, but proposed
MTU value, that is passed to tcp_mss_update() as mtuoffer.
Reported by: az
Reported by: Andrey Zonov <andrey zonov.org>
Reviewed by: andre (previous version of patch)
Before r228267 the option was honored but the original content of
boot.config was not preserved. I tried to fix that but missed the idea.
Now the proper way of doing things is taken from i386/boo2.
Also, a comment is added to explain this a little bit unobvious
behavior.
adrian [Sun, 15 Apr 2012 22:59:56 +0000 (22:59 +0000)]
Add in the AP96 phy configuration from openwrt.
* arge0 doesn't (yet) work via the switch PHY ports; I'm not sure why.
* arge1 maps to the WAN port. That works.
TODO:
* The PLL register needs a different (non-default) value for Gigabit
Ethernet. The board setup code needs to be extended a bit to allow
for non-default pll_1000 values - right now, those values come out
of hard-coded values in the per-chip set_pll_ge() routines.
adrian [Sun, 15 Apr 2012 19:54:22 +0000 (19:54 +0000)]
Drop this down from 512 to 128 for now.
This may result in a bit of a throughput drop. However, any throughput
drop at this point should be investigated and root caused, as it's likely
because TX scheduling (all the way down to how preemption, scheduler work,
etc) is happening in a sub-optimal fashion.
This also makes it much more likely to be reloadable on a live machine.
Allocating 5120 TX ath_buf entries via contigmalloc is very unlikely
after a few hours of using X/Chromium.
add usr.sbin/pkg which is a bootstrap tool for pkgng.
it respects PACKAGESITE, PACKAGEROOT, and a new environment variable ABI (if a user want to use a different API from the base one for its packages)
it has no man page on purpose to avoid hidding the pkg(8) man page from the pkgng package.
for now uses pkgbeta.FreeBSD.org as default mirror to find its package
it respects MK_PKGTOOLS
adrian [Sun, 15 Apr 2012 00:04:23 +0000 (00:04 +0000)]
Override some default values to work around various issues in the deep,
dirty and murky past.
* Override the default cache line size to be something reasonable if
it's set to 0. Some NICs initialise with '0' (eg embedded ones)
and there are comments in the driver stating that various OSes (eg
older Linux ones) would incorrectly program things and 0 out this
register.
* Just default to overriding the latency timer. Every other driver
does this.
* Use a default cache line size of 32 bytes. It should be "reasonable
enough".
marius [Sat, 14 Apr 2012 17:17:55 +0000 (17:17 +0000)]
Add support for the Atmel SAM9XE familiy of microcontrollers, which
consist of a ARM926EJ-S processor core with up to 512 Kbytes of on-chip
flash. Tested with SAM9XE512.
marius [Sat, 14 Apr 2012 17:09:38 +0000 (17:09 +0000)]
Add support for the Atmel SAM9XE familiy of microcontrollers, which
consist of a ARM926EJ-S processor core with up to 512 Kbytes of on-chip
flash. Tested with SAM9XE512.
luigi [Sat, 14 Apr 2012 16:44:18 +0000 (16:44 +0000)]
i prefer this fix for the -Wformat warning (just one cast,
all the other variables are already correct for %x).
My previous attempt put the cast in the wrong place.
There seems to be an issue with Qemu (or FreeBSD VirtIO) that sets
the PCI register space for the device config to bogus values. This
only seems to happen after unloading and reloading the module.
marius [Fri, 13 Apr 2012 22:58:23 +0000 (22:58 +0000)]
Merge from x86:
r233961:
Fix interrupt load balancing regression, introduced in revision
222813, that left all un-pinned interrupts assigned to CPU 0.
In intr_shuffle_irqs(), remove CPU_SETOF() call that initialized
the "intr_cpus" cpuset to only contain CPU0.
This initialization is too late and nullifies the results of calls
to the intr_add_cpu() that occur much earlier in the boot process.
r234074 (partial):
The BSP is not added to the mask of valid target CPUs for interrupts.
Fix this by adding the BSP as an interrupt target directly in
The scandir(3) function expects fourth parameter, compar, be in type of:
int (*compar)(const struct dirent **, const struct dirent **)
The current code defines sortq() to accept two void *, then cast them
to const struct dirent **. Because the code does not really need this
cast, we can eliminate the casts by changing the function prototype
to match scandir(3) expectation.
Change SIGUSR1 to SIGTHR to properly wake up a process that is being
traced. The use of SIGUSR1 caused traced processes (those attached to
with dtrace -p) to exit when dtrace exited.
luigi [Fri, 13 Apr 2012 16:42:54 +0000 (16:42 +0000)]
Properly disable crc stripping when operating in netmap mode.
Contrarily to what i wrote in my previous commit, the 82599
does include the CRC in the length. The operating mode is
reset in ixgbe_init_locked() and so we need to hook into
the places where the two registers (HLREG0 and RDRXCTL) are
modified.
luigi [Fri, 13 Apr 2012 16:32:33 +0000 (16:32 +0000)]
add the new memory allocator for netmap, which allocates memory
in small clusters instead of one big contiguous chunk.
This was already enabled in the previous commit.