Maxim Sobolev [Mon, 15 Oct 2012 08:21:49 +0000 (08:21 +0000)]
Add per-second scheduling into the cron(8). Right now it's
only available via the new @every_second shortcut. ENOTIME to
implement crontab(5) format extensions to allow more flexible
scheduling.
In order to address some concerns expressed by Terry Lambert
while discussing the topic few years ago, about per-second cron
possibly causing some bad effects on /etc/crontab by stat()ing
it every second instead of every minute now (i.e. atime update),
only check that database needs to be reloaded on every 60-th
loop run. This should be close enough to the current behaviour.
Tim Kientzle [Mon, 15 Oct 2012 04:10:49 +0000 (04:10 +0000)]
Fix an mbuf leak in cpsw driver, clean up mbuf management:
* Record TX mbufs when we get them so we can release them.
* Set TX/RX mbuf slots to NULL when we are no longer responsible for them
* Move dma sync on RX into RX intr routine
Rick Macklem [Mon, 15 Oct 2012 00:17:16 +0000 (00:17 +0000)]
Add a new '-S' option to mountd, which tells it to suspend
execution of the nfsd threads while it is reloading the exports.
This avoids clients from getting intermittent access errors
when the exports are being reloaded non-atomically.
It is not an ideal solution, since requests will back up while
the nfsd threads are suspended. Also, when this option is used,
if mountd crashes while reloading exports, mountd will have to
be restarted to get the nfsd threads to resume execution.
This has been tested by Vincent Hoffman (vince at unsane.co.uk)
and John Hickey (jh at deterlab.net).
The nfse patch offers a more comprehensive solution for this issue.
Adrian Chadd [Mon, 15 Oct 2012 00:07:18 +0000 (00:07 +0000)]
Track the total number of software queued frames in an atomic variable
stashed away in ath_node.
As much as I tried to stuff that behind the ATH_NODE lock, unfortunately
the locking is just too plain hairy (for me! And I wrote it!) to do
cleanly. Hence using atomics here instead of a lock. The ATH_NODE lock
just isn't currently used anywhere besides the rate control updates.
If in the future everything gets migrated back to using a single ATH_NODE
lock or a single global ATH_TX lock (ie, a single TX lock for all TX and
TX completion) then fine, I'll remove the atomics.
Rick Macklem [Sun, 14 Oct 2012 22:33:17 +0000 (22:33 +0000)]
Add two new options to the nfssvc(2) syscall that allow
processes running as root to suspend/resume execution
of the kernel nfsd threads. An earlier version of this
patch was tested by Vincent Hoffman (vince at unsane.co.uk)
and John Hickey (jh at deterlab.net).
Adrian Chadd [Sun, 14 Oct 2012 20:44:08 +0000 (20:44 +0000)]
Push the actual TX processing into the ath taskqueue, rather than having
it run out of multiple concurrent contexts.
Right now the ath(4) TX processing is a bit hairy. Specifically:
* It was running out of ath_start(), which could occur from multiple
concurrent sending processes (as if_start() can be started from multiple
sending threads nowdays.. sigh)
* during RX if fast frames are enabled (so not really at the moment, not
until I fix this particular feature again..)
* during ath_reset() - so anything which calls that
* during ath_tx_proc*() in the ath taskqueue - ie, TX is attempted again
after TX completion, as there's now hopefully some ath_bufs available.
* Then, the ic_raw_xmit() method can queue raw frames for transmission
at any time, from any net80211 TX context. Ew.
This has caused packet ordering issues in the past - specifically,
there's absolutely no guarantee that preemption won't occuring _during_
ath_start() by the TX completion processing, which will call ath_start()
again. It's a mess - 802.11 really, really wants things to be in
sequence or things go all kinds of loopy.
So:
* create a new task struct for TX'ing;
* make the if_start method simply queue the task on the ath taskqueue;
* make ath_start() just be called by the new TX task;
* make ath_tx_kick() just schedule the ath TX task, rather than directly
calling ath_start().
Now yes, this means that I've taken a step backwards in terms of
concurrency - TX -and- RX now occur in the same single-task taskqueue.
But there's nothing stopping me from separating out the TX / TX completion
code into a separate taskqueue which runs in parallel with the RX path,
if that ends up being appropriate for some platforms.
This fixes the CCMP/seqno concurrency issues that creep up when you
transmit large amounts of uni-directional UDP traffic (>200MBit) on a
FreeBSD STA -> AP, as now there's only one TX context no matter what's
going on (TX completion->retry/software queue,
userland->net80211->ath_start(), TX completion -> ath_start());
but it won't fix any concurrency issues between raw transmitted frames
and non-raw transmitted frames (eg EAPOL frames on TID 16 and any other
TID 16 multicast traffic that gets put on the CABQ.) That is going to
require a bunch more re-architecture before it's feasible to fix.
In any case, this is a big step towards making the majority of the TX
path locking irrelevant, as now almost all TX activity occurs in the
taskqueue.
Adrian Chadd [Sun, 14 Oct 2012 20:31:38 +0000 (20:31 +0000)]
Break the RX processing up into smaller chunks of 128 frames each.
Right now processing a full 512 frame queue takes quite a while (measured
on the order of milliseconds.) Because of this, the TX processing ends up
sometimes preempting the taskqueue:
* userland sends a frame
* it goes in through net80211 and out to ath_start()
* ath_start() will end up either direct dispatching or software queuing a
frame.
If TX had to wait for RX to finish, it would add quite a few ms of
additional latency to the packet transmission. This in the past has
caused issues with TCP throughput.
Now, as part of my attempt to bring sanity to the TX/RX paths, the first
step is to make the RX processing happen in smaller 'parts'. That way
when TX is pushed into the ath taskqueue, there won't be so much latency
in the way of things.
The bigger scale change (which will come much later) is to actually
process the frames in the ath_intr taskqueue but process _frames_ in
the ath driver taskqueue. That would reduce the latency between
processing and requeuing new descriptors. But that'll come later.
The actual work:
* Add ATH_RX_MAX at 128 (static for now);
* break out of the processing loop if npkts reaches ATH_RX_MAX;
* if we processed ATH_RX_MAX or more frames during the processing loop,
immediately reschedule another RX taskqueue run. This will handle
the further frames in the taskqueue.
This should have very minimal impact on the general throughput case,
unless the scheduler is being very very strange or the ath taskqueue
ends up spending a lot of time on non-RX operations (such as TX
completion.)
Add a KPI to allow to reserve some amount of space in the numvnodes
counter, without actually allocating the vnodes. The supposed use of
the getnewvnode_reserve(9) is to reclaim enough free vnodes while the
code still does not hold any resources that might be needed during the
reclamation, and to consume the slack later for getnewvnode() calls
made from the innards. After the critical block is finished, the
caller shall free any reserve left, by getnewvnode_drop_reserve(9).
Alexander Motin [Sun, 14 Oct 2012 08:50:05 +0000 (08:50 +0000)]
Add explicit check for not set time inside cam_periph_freeze_after_event().
System time is set later on boot process then initial bus scan by CAM.
Until that moment microtime() is equal to microuptime(), and if system
boots quickly, the value can be close to zero. That causes settle time
waiting even for buses that don't use reset during probe.
On my test system this reduces boot time by 1 second if USB enabled, or
by 4 seconds if USB disabled. CAM waited for ctl2cam0 bus "settle".
Devin Teske [Sun, 14 Oct 2012 06:52:49 +0000 (06:52 +0000)]
Since the introduction of the new advanced boot menu (r222417), options like
"boot verbose", "single user mode", "ACPI" and more are now stateful boolean
menuitems rather than direct action-items.
A short-coming in this new menu system is that when a user sets a non-default
value in loader.conf(5), this non-default state is not reflected in the menu
-- leading to confusion as to whether the option was taking effect or not.
This patch adds dynamic menuitem constructors _and_ the necessary Forth
callbacks to initialize these stateful menuitems -- causing the aforementioned
menuitems to adhere to loader.conf(5) settings.
PR: bin/172529
Approved by: adrian (co-mentor)
MFC after: 21 days
Attilio Rao [Sat, 13 Oct 2012 23:54:26 +0000 (23:54 +0000)]
Import a FreeBSD port of the FUSE Linux module.
This has been developed during 2 summer of code mandates and being revived
by gnn recently.
The functionality in this commit mirrors entirely content of fusefs-kmod
port, which doesn't need to be installed anymore for -CURRENT setups.
In order to get some sparse technical notes, please refer to:
http://lists.freebsd.org/pipermail/freebsd-fs/2012-March/013876.html
or to the project branch:
svn://svn.freebsd.org/base/projects/fuse/
which also contains granular history of changes happened during port
refinements. This commit does not came from the branch reintegration
itself because it seems svn is not behaving properly for this functionaly
at the moment.
Partly Sponsored by: Google, Summer of Code program 2005, 2011
Originally submitted by: ilya, Csaba Henk <csaba-ml AT creo DOT hu >
In collabouration with: pho
Tested by: flo, gnn, Gustau Perez,
Kevin Oberman <rkoberman AT gmail DOT com>
MFC after: 2 months
Alan Cox [Sat, 13 Oct 2012 18:46:46 +0000 (18:46 +0000)]
Eliminate the conditional for releasing the page queues lock in
vm_page_sleep(). vm_page_sleep() is no longer called with this lock
held.
Eliminate assertions that the page queues lock is NOT held. These
assertions won't translate well to having distinct locks on the active
and inactive page queues, and they really aren't that useful.
Alexander Motin [Sat, 13 Oct 2012 18:24:52 +0000 (18:24 +0000)]
Don't exclude XPT SIM from locking in xpt_create_path_unlocked().
We don't want xpt periph, device, target or bus disappeared because of
incorrect reference counting.
Alexander Motin [Sat, 13 Oct 2012 11:23:16 +0000 (11:23 +0000)]
Fix XPT_DEBUG paths operations locking:
- Extend the lock to cover xpt_path_release() for the new path.
- While xpt_action() is called while holding right SIM lock for the new
bus, the old path release may require different SIM lock. So we have
to temporary drop the new lock and get the old one.
Alexander Motin [Sat, 13 Oct 2012 10:18:36 +0000 (10:18 +0000)]
XPT_DEV_MATCH is probably the only xpt_action() method that is called
without holding SIM lock. It really doesn't need that lock, but adding it
removes that specific exception, allowing to assert locking there later.
Devin Teske [Sat, 13 Oct 2012 03:56:33 +0000 (03:56 +0000)]
SVN r240684 broke the ability of the dot module to map include dependencies.
Teach the dot module about the new location these includes moved to (as part
of r240684) and clean things up a bit.
Reviewed by: adrian (co-mentor)
Approved by: adrian (co-mentor)
Alan Cox [Fri, 12 Oct 2012 23:26:00 +0000 (23:26 +0000)]
Replace all uses of the vm page queues lock by a new R/W lock.
Unfortunately, this lock cannot be defined as static under Xen because it
is (ab)used to serialize queued page table changes.
John Baldwin [Fri, 12 Oct 2012 21:31:44 +0000 (21:31 +0000)]
Add locking to adv(4) driver and mark it MPSAFE.
- Disable the support for the second channel on twin-channel EISA cards as
the current incarnation can't possibly work correctly (it hasn't worked
since switching to new-bus where new-bus allocates the softc). If anyone
bothers to test this again it can be fixed properly and brought back.
- Use device_printf() and device_get_nameunit() instead of adv_name().
- Remove use of explicit bus space handles and tags.
- Use PCI bus accessors and helper routines rather than accessing
config registers directly.
- Handle failures from adv_attach().
Alexander Motin [Thu, 11 Oct 2012 15:21:07 +0000 (15:21 +0000)]
Increase device CCB queue array size by CAM_RL_VALUES - 1 (4) elements.
It is required to store extra recovery requests in case of bus resets.
On ATA/SATA this fixes assertion panics on HEAD with INVARIANTS enabled or
possible memory corruptions otherwise if timeout/reset happens when device
CCB queue is already full.
Pyun YongHyeon [Thu, 11 Oct 2012 06:43:43 +0000 (06:43 +0000)]
Add APE firmware support and improve firmware handshake procedure.
This change will enable IPMI access on 5717/5718/5719/5720 and 5761
controllers. Because ASF is not available when APE firmware is
present, bge_allow_asf tunable is ignored when driver detects APE
firmware. Also bge(4) no longer performs two resets(one blind
reset and the other reset with firmware in mind) in device attach.
Now bge(4) performs a reset with enough information in bge_reset().
The APE firmware also needs special handling to make suspend/resume
work but it was not implemented yet.
With this change, bge(4) should work on any 5717/5718/5719/5720
controllers. Special thanks to Mike Hibler at Emulab who setup
remote debugging on Dell R820. Without his help I couldn't be able
to address several issues happened on Dell Rx20 systems. And many
thanks to Broadcom for continuing to support FreeBSD!
Submitted by: davidch (initial version)
H/W donated by: Broadcom
Tested by: many
Tested on: Del R820/R720/R620/R420/R320 and HP Proliant DL 360 G8
Pyun YongHyeon [Thu, 11 Oct 2012 06:07:48 +0000 (06:07 +0000)]
For 5717C/5719C/5720C and 57765 PHYs, do not perform any special
handling(jumbo, wire speed etc) in brgphy_reset(). Touching
BRGPHY_MII_AUXCTL register seems to confuse APE firmware such that
it couldn't establish a link.
Pyun YongHyeon [Thu, 11 Oct 2012 05:48:04 +0000 (05:48 +0000)]
Rework controller reset procedure. Previously driver saved
BGE_PCI_PCISTATE register before issuing global reset. After
issuing reset, it reads BGE_PCI_PCISTATE register again and
compares the saved register value and current value. It was used to
know whether the global reset operation was completed or not.
Unfortunately, this logic caused several issues on recent BCM5717/
5718/5719 and BCM5720 controllers. It seems APE firmware accesses
some registers while global reset is in progress such that reading
BGE_PCI_PCISTATE register after reset does not yield old pre-reset
state value. This resulted in consuming too much time in global
reset and sometimes it couldn't successfully complete reset.
The BGE_MISCCFG_RESET_CORE_CLOCKS of BGE_MISC_CFG register is
self-clearing bit so driver is able to know the reset completion.
But the core-lock reset will disable indirect/flat/standard access
modes such that driver cannot poll BGE_MISCCFG_RESET_CORE_CLOCKS
bit of BGE_MISC_CFG register. So just wait enough time for
core-clock reset to complete.
Data sheet says driver should wait 100us for PCI/PCI-X devices and
100ms for PCIe devices. I chose 1ms for PCI/PCI-X since this value
was used for many years in bge(4). For PCIe devices, use 100ms as
recommended by data sheet.
bge_chipinit() also cleared BGE_MAC_MODE register which shall clear
firmware configured mode information. I think this will result in
losing ASF/IPMI link in device attachment. Let bge_reset() honor
firmware configured BGE_MAC_MODE register and don't announce driver
is UP in bge_reset(). Firmware should have control over driver until
it's fully initialized by driver.
While I'm here, enable workaround for PCI-X BCM5704 A0 in
bge_reset(). This will prevent internal arbitration logic from
switching to the other DMA engine after a retry cycle.
Alexander Motin [Wed, 10 Oct 2012 22:02:11 +0000 (22:02 +0000)]
- Remove ancient checks for sim->softc == NULL. It can't be NULL, as it is
set not-NULL during SIM registration and set to UMASS_GONE on destruction.
Debug messages there look broken for at least 9 years, as they dereference
softc value that was just checked to be equal to NULL.
- Remove magic pointer value UMASS_GONE and use simple NULL instead.