attilio [Wed, 17 Oct 2012 11:30:00 +0000 (11:30 +0000)]
Disconnect non-MPSAFE NTFS from the build in preparation for dropping
GIANT from VFS. This code is particulary broken and fragile and other
in-kernel implementations around, found in other operating systems,
don't really seem clean and solid enough to be imported at all.
If someone wants to reconsider in-kernel NTFS implementation for
inclusion again, a fair effort for completely fixing and cleaning it
up is expected.
In the while NTFS regular users can use FUSE interface and ntfs-3g
port to work with their NTFS partitions.
attilio [Wed, 17 Oct 2012 11:16:17 +0000 (11:16 +0000)]
Disconnect non-MPSAFE NWFS from the build in preparation for dropping
GIANT from VFS. In addition, disconnect also netncp, which is a base
requirement for NWFS.
In the possibility of a future maintenance of the code and later
readd to the FreeBSD base, maybe we should think about a better location
for netncp. I'm not entirely sure the / top location is actually right,
however I will let network people to comment on that more specifically.
avg [Wed, 17 Oct 2012 10:59:56 +0000 (10:59 +0000)]
zfs: make use of getnewvnode_reserve in zfs_mknode and zfs_zget
getnewvnode_reserve helps to avoid "recursing" back into zfs code
via getnewvnode when that latter needs to reclaim some vnodes.
zfs code may hold a number of locks around getnewvnode and doesn't
expect any recursion to happen on those locks, because that never
happens in solaris.
I believe that this change also eleiminates a need for the delayed
znode destruction via the taskqueue.
Many thanks to kib for devising getnewvnode_reserve.
sobomax [Wed, 17 Oct 2012 00:44:34 +0000 (00:44 +0000)]
o Use nanosleep(2) to sleep exact amount of time till the next second,
not multiple of 1 second, which results in actual time to drift back
and forth every run within 1 second of the actual action has
been set for.
Suggested by: Ian Lepore
o Schedule the first run in 1 second after starting up, not on the
boundary of the next minute, which results in the every_second jobs
not being run.
emax [Tue, 16 Oct 2012 20:18:15 +0000 (20:18 +0000)]
introduce concept of ifi_baudrate power factor. the idea is to work
around the problem where high speed interfaces (such as ixgbe(4))
are not able to report real ifi_baudrate. bascially, take a spare
byte from struct if_data and use it to store ifi_baudrate power
factor. in other words,
real ifi_baudrate = ifi_baudrate * 10 ^ ifi_baudrate power factor
this should be backwards compatible with old binaries. use ixgbe(4)
as an example on how drivers would set ifi_baudrate power factor
gonzo [Tue, 16 Oct 2012 01:10:43 +0000 (01:10 +0000)]
Split sdhci driver in two parts: sdhci and sdhci_pci.
sdchi encapsulates a generic SD Host Controller logic that relies on
actual hardware driver for register access.
sdhci_pci implements driver for PCI SDHC controllers using new SDHCI
interface
No kernel config modifications are required, but if you load sdhc
as a module you must switch to sdhci_pci instead.
jhb [Mon, 15 Oct 2012 16:29:08 +0000 (16:29 +0000)]
Add locking to the dpt(4) driver and mark it MPSAFE.
- Use device_printf() and device_get_unit() instead of storing the unit
number in the softc.
- Remove use of explicit bus space handles and tags.
- Remove the global dpt_softcs list and use devclass_get_device() instead.
- Use pci_enable_busmaster() rather than frobbing the PCI command register
directly.
jhb [Mon, 15 Oct 2012 16:13:55 +0000 (16:13 +0000)]
Add locking to the bt(4) driver and mark it MPSAFE.
- Use device_printf() and device_get_unit() instead of storing the unit
number in the softc.
- Remove use of explicit bus space handles and tags.
- Return an errno value from bt_eisa_attach() if an error occurs rather
than -1.
- Use BUS_PROBE_DEFAULT rather than 0.
jhb [Mon, 15 Oct 2012 16:09:59 +0000 (16:09 +0000)]
Add locking to the aic(4) driver and mark it MPSAFE.
- Move 'free_scbs' into the softc rather than having it be a global list
and convert it to an SLIST instead of a hand-rolled linked-list.
- Use device_printf() and device_get_unit() instead of storing the unit
number in the softc.
- Remove use of explicit bus space handles and tags.
- Don't call device_set_desc() in the pccard attach routine, instead
set a default description during the pccard probe if the matching
product doesn't have a name.
jhb [Mon, 15 Oct 2012 16:05:02 +0000 (16:05 +0000)]
Add locking to the ahb(4) driver and mark it MPSAFE.
- Use device_printf() and device_get_unit() instead of storing the unit
number in the softc.
- Remove use of explicit bus space handles and tags.
- Compare pointers against NULL.
- Let new-bus allocate a softc rather than doing it by hand.
jhb [Mon, 15 Oct 2012 15:26:00 +0000 (15:26 +0000)]
Add locking to the adw(4) driver and mark it MPSAFE.
- Use device_printf() and device_get_nameunit() instead of adw_name().
- Remove use of explicit bus space handles and tags.
- Use pci_enable_busmaster() rather than frobbing the PCI command register
directly.
- Use the softc provided by new-bus rather than allocating a new one.
sobomax [Mon, 15 Oct 2012 08:21:49 +0000 (08:21 +0000)]
Add per-second scheduling into the cron(8). Right now it's
only available via the new @every_second shortcut. ENOTIME to
implement crontab(5) format extensions to allow more flexible
scheduling.
In order to address some concerns expressed by Terry Lambert
while discussing the topic few years ago, about per-second cron
possibly causing some bad effects on /etc/crontab by stat()ing
it every second instead of every minute now (i.e. atime update),
only check that database needs to be reloaded on every 60-th
loop run. This should be close enough to the current behaviour.
kientzle [Mon, 15 Oct 2012 04:10:49 +0000 (04:10 +0000)]
Fix an mbuf leak in cpsw driver, clean up mbuf management:
* Record TX mbufs when we get them so we can release them.
* Set TX/RX mbuf slots to NULL when we are no longer responsible for them
* Move dma sync on RX into RX intr routine
rmacklem [Mon, 15 Oct 2012 00:17:16 +0000 (00:17 +0000)]
Add a new '-S' option to mountd, which tells it to suspend
execution of the nfsd threads while it is reloading the exports.
This avoids clients from getting intermittent access errors
when the exports are being reloaded non-atomically.
It is not an ideal solution, since requests will back up while
the nfsd threads are suspended. Also, when this option is used,
if mountd crashes while reloading exports, mountd will have to
be restarted to get the nfsd threads to resume execution.
This has been tested by Vincent Hoffman (vince at unsane.co.uk)
and John Hickey (jh at deterlab.net).
The nfse patch offers a more comprehensive solution for this issue.
adrian [Mon, 15 Oct 2012 00:07:18 +0000 (00:07 +0000)]
Track the total number of software queued frames in an atomic variable
stashed away in ath_node.
As much as I tried to stuff that behind the ATH_NODE lock, unfortunately
the locking is just too plain hairy (for me! And I wrote it!) to do
cleanly. Hence using atomics here instead of a lock. The ATH_NODE lock
just isn't currently used anywhere besides the rate control updates.
If in the future everything gets migrated back to using a single ATH_NODE
lock or a single global ATH_TX lock (ie, a single TX lock for all TX and
TX completion) then fine, I'll remove the atomics.
rmacklem [Sun, 14 Oct 2012 22:33:17 +0000 (22:33 +0000)]
Add two new options to the nfssvc(2) syscall that allow
processes running as root to suspend/resume execution
of the kernel nfsd threads. An earlier version of this
patch was tested by Vincent Hoffman (vince at unsane.co.uk)
and John Hickey (jh at deterlab.net).
adrian [Sun, 14 Oct 2012 20:44:08 +0000 (20:44 +0000)]
Push the actual TX processing into the ath taskqueue, rather than having
it run out of multiple concurrent contexts.
Right now the ath(4) TX processing is a bit hairy. Specifically:
* It was running out of ath_start(), which could occur from multiple
concurrent sending processes (as if_start() can be started from multiple
sending threads nowdays.. sigh)
* during RX if fast frames are enabled (so not really at the moment, not
until I fix this particular feature again..)
* during ath_reset() - so anything which calls that
* during ath_tx_proc*() in the ath taskqueue - ie, TX is attempted again
after TX completion, as there's now hopefully some ath_bufs available.
* Then, the ic_raw_xmit() method can queue raw frames for transmission
at any time, from any net80211 TX context. Ew.
This has caused packet ordering issues in the past - specifically,
there's absolutely no guarantee that preemption won't occuring _during_
ath_start() by the TX completion processing, which will call ath_start()
again. It's a mess - 802.11 really, really wants things to be in
sequence or things go all kinds of loopy.
So:
* create a new task struct for TX'ing;
* make the if_start method simply queue the task on the ath taskqueue;
* make ath_start() just be called by the new TX task;
* make ath_tx_kick() just schedule the ath TX task, rather than directly
calling ath_start().
Now yes, this means that I've taken a step backwards in terms of
concurrency - TX -and- RX now occur in the same single-task taskqueue.
But there's nothing stopping me from separating out the TX / TX completion
code into a separate taskqueue which runs in parallel with the RX path,
if that ends up being appropriate for some platforms.
This fixes the CCMP/seqno concurrency issues that creep up when you
transmit large amounts of uni-directional UDP traffic (>200MBit) on a
FreeBSD STA -> AP, as now there's only one TX context no matter what's
going on (TX completion->retry/software queue,
userland->net80211->ath_start(), TX completion -> ath_start());
but it won't fix any concurrency issues between raw transmitted frames
and non-raw transmitted frames (eg EAPOL frames on TID 16 and any other
TID 16 multicast traffic that gets put on the CABQ.) That is going to
require a bunch more re-architecture before it's feasible to fix.
In any case, this is a big step towards making the majority of the TX
path locking irrelevant, as now almost all TX activity occurs in the
taskqueue.
adrian [Sun, 14 Oct 2012 20:31:38 +0000 (20:31 +0000)]
Break the RX processing up into smaller chunks of 128 frames each.
Right now processing a full 512 frame queue takes quite a while (measured
on the order of milliseconds.) Because of this, the TX processing ends up
sometimes preempting the taskqueue:
* userland sends a frame
* it goes in through net80211 and out to ath_start()
* ath_start() will end up either direct dispatching or software queuing a
frame.
If TX had to wait for RX to finish, it would add quite a few ms of
additional latency to the packet transmission. This in the past has
caused issues with TCP throughput.
Now, as part of my attempt to bring sanity to the TX/RX paths, the first
step is to make the RX processing happen in smaller 'parts'. That way
when TX is pushed into the ath taskqueue, there won't be so much latency
in the way of things.
The bigger scale change (which will come much later) is to actually
process the frames in the ath_intr taskqueue but process _frames_ in
the ath driver taskqueue. That would reduce the latency between
processing and requeuing new descriptors. But that'll come later.
The actual work:
* Add ATH_RX_MAX at 128 (static for now);
* break out of the processing loop if npkts reaches ATH_RX_MAX;
* if we processed ATH_RX_MAX or more frames during the processing loop,
immediately reschedule another RX taskqueue run. This will handle
the further frames in the taskqueue.
This should have very minimal impact on the general throughput case,
unless the scheduler is being very very strange or the ath taskqueue
ends up spending a lot of time on non-RX operations (such as TX
completion.)
kib [Sun, 14 Oct 2012 19:43:37 +0000 (19:43 +0000)]
Add a KPI to allow to reserve some amount of space in the numvnodes
counter, without actually allocating the vnodes. The supposed use of
the getnewvnode_reserve(9) is to reclaim enough free vnodes while the
code still does not hold any resources that might be needed during the
reclamation, and to consume the slack later for getnewvnode() calls
made from the innards. After the critical block is finished, the
caller shall free any reserve left, by getnewvnode_drop_reserve(9).
mav [Sun, 14 Oct 2012 08:50:05 +0000 (08:50 +0000)]
Add explicit check for not set time inside cam_periph_freeze_after_event().
System time is set later on boot process then initial bus scan by CAM.
Until that moment microtime() is equal to microuptime(), and if system
boots quickly, the value can be close to zero. That causes settle time
waiting even for buses that don't use reset during probe.
On my test system this reduces boot time by 1 second if USB enabled, or
by 4 seconds if USB disabled. CAM waited for ctl2cam0 bus "settle".
dteske [Sun, 14 Oct 2012 06:52:49 +0000 (06:52 +0000)]
Since the introduction of the new advanced boot menu (r222417), options like
"boot verbose", "single user mode", "ACPI" and more are now stateful boolean
menuitems rather than direct action-items.
A short-coming in this new menu system is that when a user sets a non-default
value in loader.conf(5), this non-default state is not reflected in the menu
-- leading to confusion as to whether the option was taking effect or not.
This patch adds dynamic menuitem constructors _and_ the necessary Forth
callbacks to initialize these stateful menuitems -- causing the aforementioned
menuitems to adhere to loader.conf(5) settings.
PR: bin/172529
Approved by: adrian (co-mentor)
MFC after: 21 days
attilio [Sat, 13 Oct 2012 23:54:26 +0000 (23:54 +0000)]
Import a FreeBSD port of the FUSE Linux module.
This has been developed during 2 summer of code mandates and being revived
by gnn recently.
The functionality in this commit mirrors entirely content of fusefs-kmod
port, which doesn't need to be installed anymore for -CURRENT setups.
In order to get some sparse technical notes, please refer to:
http://lists.freebsd.org/pipermail/freebsd-fs/2012-March/013876.html
or to the project branch:
svn://svn.freebsd.org/base/projects/fuse/
which also contains granular history of changes happened during port
refinements. This commit does not came from the branch reintegration
itself because it seems svn is not behaving properly for this functionaly
at the moment.
Partly Sponsored by: Google, Summer of Code program 2005, 2011
Originally submitted by: ilya, Csaba Henk <csaba-ml AT creo DOT hu >
In collabouration with: pho
Tested by: flo, gnn, Gustau Perez,
Kevin Oberman <rkoberman AT gmail DOT com>
MFC after: 2 months
alc [Sat, 13 Oct 2012 18:46:46 +0000 (18:46 +0000)]
Eliminate the conditional for releasing the page queues lock in
vm_page_sleep(). vm_page_sleep() is no longer called with this lock
held.
Eliminate assertions that the page queues lock is NOT held. These
assertions won't translate well to having distinct locks on the active
and inactive page queues, and they really aren't that useful.
mav [Sat, 13 Oct 2012 18:24:52 +0000 (18:24 +0000)]
Don't exclude XPT SIM from locking in xpt_create_path_unlocked().
We don't want xpt periph, device, target or bus disappeared because of
incorrect reference counting.
mav [Sat, 13 Oct 2012 11:23:16 +0000 (11:23 +0000)]
Fix XPT_DEBUG paths operations locking:
- Extend the lock to cover xpt_path_release() for the new path.
- While xpt_action() is called while holding right SIM lock for the new
bus, the old path release may require different SIM lock. So we have
to temporary drop the new lock and get the old one.
mav [Sat, 13 Oct 2012 10:18:36 +0000 (10:18 +0000)]
XPT_DEV_MATCH is probably the only xpt_action() method that is called
without holding SIM lock. It really doesn't need that lock, but adding it
removes that specific exception, allowing to assert locking there later.
dteske [Sat, 13 Oct 2012 03:56:33 +0000 (03:56 +0000)]
SVN r240684 broke the ability of the dot module to map include dependencies.
Teach the dot module about the new location these includes moved to (as part
of r240684) and clean things up a bit.
Reviewed by: adrian (co-mentor)
Approved by: adrian (co-mentor)
alc [Fri, 12 Oct 2012 23:26:00 +0000 (23:26 +0000)]
Replace all uses of the vm page queues lock by a new R/W lock.
Unfortunately, this lock cannot be defined as static under Xen because it
is (ab)used to serialize queued page table changes.