Jaakko Heinonen [Fri, 20 Apr 2012 10:08:30 +0000 (10:08 +0000)]
The value of flags matching VNOVAL can't be supported. Return EOPNOTSUPP
from setfflags() in this case. This fixes the return value of
chflags(path, -1).
Adrian Chadd [Fri, 20 Apr 2012 08:26:05 +0000 (08:26 +0000)]
Introduce the matching PCI ath(4) fixup code from ar71xx_pci into
ar724x_pci.c.
* Move out the code which populates the firmware into ar71xx_fixup.c
* Shuffle around the ar724x fixup code to match what the ar71xx fixup
code does.
I've validated this on an AR7240 with AR9285 on-board NIC. It doesn't
yet load, as the AR9285 EEPROM code needs to be made "flash aware."
TODO:
* Validate that I haven't broken AR71xx
* Test AR9285/AR9287 onboard NICs, complete with EEPROM code changes
* Port over the needed BAR hacks for AR7240, AR7241 and AR7242 from
Linux OpenWRT. The current WAR has only been tested on the AR7240
and I'm not sure the way the BAR register is treated is "right".
The "fixup" method here is right when setting the BAR for local access -
ie, the BAR address is either 0xffff (AR7240) or 0x1000ffff (AR7241/AR7242),
but the ath9k-fixup.c code (Linux OpenWRT) does this when setting the
initial "fixup" BAR. It then restores the original BAR.
I'll have to read the ar724x PCI bus glue to see what other special cases
await.
This update uses the MNT_VNODE_FOREACH_ACTIVE interface that loops
over just the active vnodes associated with a mount point to replace
MNT_VNODE_FOREACH_ALL in the vfs_msync, ffs_sync_lazy, and qsync
routines.
The vfs_msync routine is run every 30 seconds for every writably
mounted filesystem. It ensures that any files mmap'ed from the
filesystem with modified pages have those pages queued to be
written back to the file from which they are mapped.
The ffs_lazy_sync and qsync routines are run every 30 seconds for
every writably mounted UFS/FFS filesystem. The ffs_lazy_sync routine
ensures that any files that have been accessed in the previous
30 seconds have had their access times queued for updating in the
filesystem. The qsync routine ensures that any files with modified
quotas have those quotas queued to be written back to their
associated quota file.
In a system configured with 250,000 vnodes, less than 1000 are
typically active at any point in time. Prior to this change all
250,000 vnodes would be locked and inspected twice every minute
by the syncer. For UFS/FFS filesystems they would be locked and
inspected six times every minute (twice by each of these three
routines since each of these routines does its own pass over the
vnodes associated with a mount point). With this change the syncer
now locks and inspects only the tiny set of vnodes that are active.
This change creates a new list of active vnodes associated with
a mount point. Active vnodes are those with a non-zero use or hold
count, e.g., those vnodes that are not on the free list. Note that
this list is in addition to the list of all the vnodes associated
with a mount point.
To avoid adding another set of linkage pointers to the vnode
structure, the active list uses the existing linkage pointers
used by the free list (previously named v_freelist, now renamed
v_actfreelist).
This update adds the MNT_VNODE_FOREACH_ACTIVE interface that loops
over just the active vnodes associated with a mount point (typically
less than 1% of the vnodes associated with the mount point).
Ed Schouten [Thu, 19 Apr 2012 21:12:08 +0000 (21:12 +0000)]
Do a better job at determining the username of the login session.
When multiple users share the same UID, the old code will simply pick an
arbitrary username to attach to the utmpx entries. Make the code a bit
more accurate by first checking whether getlogin() returns a username
which corresponds to the uid of the calling process. If this fails,
simply fall back to picking an arbitrary username.
Ed Schouten [Thu, 19 Apr 2012 15:28:15 +0000 (15:28 +0000)]
Properly use SHA1_Final() instead of SHA_Final().
In this case it doesn't really matter, as long as we turn a TTY name
into a set of shuffled bytes. Still, for correctness we should use the
proper function.
Michael Tuexen [Thu, 19 Apr 2012 12:43:19 +0000 (12:43 +0000)]
Fix a bug where we copy out more data from a mbuf chain that are
actually in it. This happens when SCTP receives an unknown chunk, which
requires the sending of an ERROR chunk, and there is no final padding but
the chunk is not 4-byte aligned.
Reported by yueting via rwatson@
Adrian Chadd [Thu, 19 Apr 2012 03:26:21 +0000 (03:26 +0000)]
Stop using the hardware register value byte order swapping for now,
at least until I can root cause what's going on.
The only platform I've seen this on is the AR9220 when attached to
the AR71xx CPUs. I get immediate PCIe bus errors and all subsequent
accesses cause further MIPS bus exceptions. I don't have any other
big-endian platforms to test this on.
If I get a chance (or two), I'll try to whack this on a bus analyser
and see exactly what happens.
I'd rather leave this on, especially for slower, embedded platforms.
But the #ifdef hell is something I'm trying to avoid.
Alexander Motin [Wed, 18 Apr 2012 09:42:14 +0000 (09:42 +0000)]
Some improvements to GEOM MULTIPATH:
- Implement "configure" command to allow switching operation mode of
running device on-fly without destroying and recreation.
- Implement Active/Read mode as hybrid of Active/Active and Active/Passive.
In this mode all paths not marked FAIL may handle reads same time,
but unlike Active/Active only one path handles write requests at any
point in time. It allows to closer follow original write request order
if above layers need it for data consistency (not waiting for requisite
write completion before sending dependent write).
- Hide duplicate messages about device status change.
- Remove periodic thread wake up with 10Hz rate.
Jason Evans [Tue, 17 Apr 2012 22:05:55 +0000 (22:05 +0000)]
Import jemalloc b57d3ec571c6551231be62b7bf92c084a8c8291c (dev branch,
prior to 3.0.0 release), which supports atomic operations based on atomic(9).
This should fix build failures for several platforms.
Drop export of vdestroy() function from kern/vfs_subr.c as it is
used only as a helper function in that file. Replace sole call to
vbusy() with inline code in vholdl(). Replace sole calls to vfree()
and vdestroy() with inline code in vdropl().
The Clang compiler already inlines these functions, so they do not
show up in a kernel backtrace which is confusing. Also you cannot
set their frame in kgdb which means that it is impossible to view
their local variables. So, while the produced code is unchanged,
the debugging should be easier.
Replace the MNT_VNODE_FOREACH interface with MNT_VNODE_FOREACH_ALL.
The primary changes are that the user of the interface no longer
needs to manage the mount-mutex locking and that the vnode that
is returned has its mutex locked (thus avoiding the need to check
to see if its is DOOMED or other possible end of life senarios).
To minimize compatibility issues for third-party developers, the
old MNT_VNODE_FOREACH interface will remain available so that this
change can be MFC'ed to 9. Following the MFC to 9, MNT_VNODE_FOREACH
will be removed in head.
The reason for this update is to prepare for the addition of the
MNT_VNODE_FOREACH_ACTIVE interface that will loop over just the
active vnodes associated with a mount point (typically less than
1% of the vnodes associated with the mount point).
Fix bug where NFSv4 ACL enforcement code wouldn't unconditionally
allow the owner to read and write ACL and file attributes when there
was no entry with subject matching the owner. In other words,
'getfacl meh' shouldn't fail for the owner if the ACL looks like this:
GEOM_JOURNAL: Shutting down geom gjournal 3464572051.
panic: destroying non-empty racct: 1 allocated for resource 6
which were tracked by jh@ to be caused by checking p->p_flag,
while it wasn't initialised yet. Basically, during fork, the code
checked p_flag, concluded the process isn't marked as P_SYSTEM,
incremented the counter, and later on, when exiting, checked that
the process was marked as P_SYSTEM, and thus didn't decrement it.
Also, I believe there wasn't any good reason for checking P_SYSTEM
in the first place.
Fix panic at boot with SD/MMC readers with no media present, introduced
at r234177. Note that this is a temporary fix, until I come up with something
prettier.
Jason Evans [Tue, 17 Apr 2012 07:22:14 +0000 (07:22 +0000)]
Import jemalloc 9ef7f5dc34ff02f50d401e41c8d9a4a928e7c2aa (dev branch,
prior to 3.0.0 release) as contrib/jemalloc, and integrate it into libc.
The code being imported by this commit diverged from
lib/libc/stdlib/malloc.c in March 2010, which means that a portion of
the jemalloc 1.0.0 ChangeLog entries are relevant, as are the entries
for all subsequent releases.
Upgrade our copy of llvm/clang to trunk r154661, in preparation of the
upcoming 3.1 release (expected in a few weeks). Preliminary release
notes can be found at: <http://llvm.org/docs/ReleaseNotes.html>
Jung-uk Kim [Mon, 16 Apr 2012 21:22:02 +0000 (21:22 +0000)]
- Implement pipe2 syscall for Linuxulator. This syscall appeared in 2.6.27
but GNU libc used it without checking its kernel version, e. g., Fedora 10.
- Move pipe(2) implementation for Linuxulator from MD files to MI file,
sys/compat/linux/linux_file.c. There is no MD code for this syscall at all.
- Correct an argument type for pipe() from l_ulong * to l_int *. Probably
this was the source of MI/MD confusion.
Jung-uk Kim [Mon, 16 Apr 2012 19:31:44 +0000 (19:31 +0000)]
- When interrupt is not requested for VM86 call, make a fake exit point and
push the address onto stack as we do for INTn emulation. This avoids stack
underflow when we encounter RETF instruction in VM86 mode. Lack of this
exit point actually caused page fault in VM86 mode with VESA module when we
resume from suspend state[1].
- Remove unnecessary CLI and STI instructions from BIOS interrupt emulation.
INTn and IRET must be able to emulate the flag correctly.
Marius Strobl [Mon, 16 Apr 2012 18:29:07 +0000 (18:29 +0000)]
Turn on PREEMPTION by default. After fixing several bugs over time, the
last show-stopper keeping PREEMPTION from being usable on sparc64 should
have been dealt with in r230662.
At least on 2-way systems, PREEMPTION causes a little bit of a degradation
in worldstone performance. However, FreeBSD seems to have started building
up regressions in !PREEMPTION cases so sparc64 better should not be an
oddball in this regard.
Jaakko Heinonen [Mon, 16 Apr 2012 18:07:42 +0000 (18:07 +0000)]
tmpfs: Allow update mounts only for certain options.
Since r230208 update mounts were allowed if the list of mount options
contained the "export" option. This is not correct as tmpfs doesn't
really support updating all options.
When we receive an ICMP unreach need fragmentation datagram, we take
proposed MTU value from it and update the TCP host cache. Then
tcp_mss_update() is called on the corresponding tcpcb. It finds the
just allocated entry in the TCP host cache and updates MSS on the
tcpcb. And then we do a fast retransmit of what we have in the tcp
send buffer.
This sequence gets broken if the TCP host cache is exausted. In this
case allocation fails, and later called tcp_mss_update() finds nothing
in cache. The fast retransmit is done with not reduced MSS and is
immidiately replied by remote host with new ICMP datagrams and the
cycle repeats. This ping-pong can go up to wirespeed.
To fix this:
- tcp_mss_update() gets new parameter - mtuoffer, that is like
offer, but needs to have min_protoh subtracted.
- tcp_mtudisc() as notification method renamed to tcp_mtudisc_notify().
- tcp_mtudisc() now accepts not a useless error argument, but proposed
MTU value, that is passed to tcp_mss_update() as mtuoffer.
Reported by: az
Reported by: Andrey Zonov <andrey zonov.org>
Reviewed by: andre (previous version of patch)
Before r228267 the option was honored but the original content of
boot.config was not preserved. I tried to fix that but missed the idea.
Now the proper way of doing things is taken from i386/boo2.
Also, a comment is added to explain this a little bit unobvious
behavior.
Adrian Chadd [Sun, 15 Apr 2012 22:59:56 +0000 (22:59 +0000)]
Add in the AP96 phy configuration from openwrt.
* arge0 doesn't (yet) work via the switch PHY ports; I'm not sure why.
* arge1 maps to the WAN port. That works.
TODO:
* The PLL register needs a different (non-default) value for Gigabit
Ethernet. The board setup code needs to be extended a bit to allow
for non-default pll_1000 values - right now, those values come out
of hard-coded values in the per-chip set_pll_ge() routines.
Adrian Chadd [Sun, 15 Apr 2012 19:54:22 +0000 (19:54 +0000)]
Drop this down from 512 to 128 for now.
This may result in a bit of a throughput drop. However, any throughput
drop at this point should be investigated and root caused, as it's likely
because TX scheduling (all the way down to how preemption, scheduler work,
etc) is happening in a sub-optimal fashion.
This also makes it much more likely to be reloadable on a live machine.
Allocating 5120 TX ath_buf entries via contigmalloc is very unlikely
after a few hours of using X/Chromium.
add usr.sbin/pkg which is a bootstrap tool for pkgng.
it respects PACKAGESITE, PACKAGEROOT, and a new environment variable ABI (if a user want to use a different API from the base one for its packages)
it has no man page on purpose to avoid hidding the pkg(8) man page from the pkgng package.
for now uses pkgbeta.FreeBSD.org as default mirror to find its package
it respects MK_PKGTOOLS
Adrian Chadd [Sun, 15 Apr 2012 00:04:23 +0000 (00:04 +0000)]
Override some default values to work around various issues in the deep,
dirty and murky past.
* Override the default cache line size to be something reasonable if
it's set to 0. Some NICs initialise with '0' (eg embedded ones)
and there are comments in the driver stating that various OSes (eg
older Linux ones) would incorrectly program things and 0 out this
register.
* Just default to overriding the latency timer. Every other driver
does this.
* Use a default cache line size of 32 bytes. It should be "reasonable
enough".