Gleb Smirnoff [Fri, 15 Mar 2013 13:48:53 +0000 (13:48 +0000)]
- Use m_getcl() instead of hand allocating.
- Do not calculate constant length values at run time,
CTASSERT() their sanity.
- Remove superfluous cleaning of mbuf fields after allocation.
- Replace compat macros with function calls.
Gleb Smirnoff [Fri, 15 Mar 2013 12:52:59 +0000 (12:52 +0000)]
- Use m_getcl() instead of hand allocating.
- Convert panic() to KASSERT.
- Remove superfluous cleaning of mbuf fields after allocation.
- Add comment on possible use of m_get2() here.
Implement the helper function vn_io_fault_pgmove(), intended to use by
the filesystem VOP_READ() and VOP_WRITE() implementations in the same
way as vn_io_fault_uiomove() over the unmapped buffers. Helper
provides the convenient wrapper over the pmap_copy_pages() for struct
uio consumers, taking care of the TDP_UIOHELD situations.
Sponsored by: The FreeBSD Foundation
Tested by: pho
MFC after: 2 weeks
Gleb Smirnoff [Fri, 15 Mar 2013 10:20:15 +0000 (10:20 +0000)]
Use m_get2() + m_align() instead of hand made key_alloc_mbuf(). Code
examination shows, that although key_alloc_mbuf() could return chains,
the callers never use chains, so m_get2() should suffice.
Adrian Chadd [Fri, 15 Mar 2013 02:52:37 +0000 (02:52 +0000)]
Add locking around the new holdingbf code.
Since this is being done during buffer free, it's a crap shoot whether
the TX path lock is held or not. I tried putting the ath_freebuf() code
inside the TX lock and I got all kinds of locking issues - it turns out
that the buffer free path sometimes is called with the lock held and
sometimes isn't. So I'll go and fix that soon.
Hence for now the holdingbf buffers are protected by the TXBUF lock.
When throttling a process to enforce RACCT limits, do not use neither
PBDRY (which simply doesn't make any sense) nor PCATCH (which could
be used by a malicious process to work around the PCPU limit).
Now that ioctl(2) is allowed in capability mode and we can limit ioctls for the
given descriptors, use Capsicum sandboxing for hastd in primary and secondary
modes. Allow for DIOCGDELETE and DIOCGFLUSH ioctls on provider descriptor and
for G_GATE_CMD_MODIFY, G_GATE_CMD_START, G_GATE_CMD_DONE and G_GATE_CMD_DESTROY
on GEOM Gate descriptor.
Add currently unused flag argument to the cluster_read(),
cluster_write() and cluster_wbuild() functions. The flags to be
allowed are a subset of the GB_* flags for getblk().
Sponsored by: The FreeBSD Foundation
Tested by: pho
When pidptr was passed as NULL to pidfile_open(3), we were returning
EAGAIN/EWOULDBLOCK when another daemon was running and had the pidfile open.
We should return EEXIST in that case, fix it.
Add pmap function pmap_copy_pages(), which copies the content of the
pages around, taking array of vm_page_t both for source and
destination. Starting offsets and total transfer size are specified.
The function implements optimal algorithm for copying using the
platform-specific optimizations. For instance, on the architectures
were the direct map is available, no transient mappings are created,
for i386 the per-cpu ephemeral page frame is used. The code was
typically borrowed from the pmap_copy_page() for the same
architecture.
Only i386/amd64, powerpc aim and arm/arm-v6 implementations were
tested at the time of commit. High-level code, not committed yet to
the tree, ensures that the use of the function is only allowed after
explicit enablement.
For sparc64, the existing code has known issues and a stab is added
instead, to allow the kernel linking.
Sponsored by: The FreeBSD Foundation
Tested by: pho (i386, amd64), scottl (amd64), ian (arm and arm-v6)
MFC after: 2 weeks
Adrian Chadd [Thu, 14 Mar 2013 06:20:02 +0000 (06:20 +0000)]
Implement "holding buffers" per TX queue rather than globally.
When working on TDMA, Sam Leffler found that the MAC DMA hardware
would re-read the last TX descriptor when getting ready to transmit
the next one. Thus the whole ATH_BUF_BUSY came into existance -
the descriptor must be left alone (very specifically the link pointer
must be maintained) until the hardware has moved onto the next frame.
He saw this in TDMA because the MAC would be frequently stopping during
active transmit (ie, when it wasn't its turn to transmit.)
Fast-forward to today. It turns out that this is a problem not with
a single MAC DMA instance, but with each QCU (from 0->9). They each
maintain separate descriptor pointers and will re-read the last
descriptor when starting to transmit the next.
So when your AP is busy transmitting from multiple TX queues, you'll
(more) frequently see one QCU stopped, waiting for a higher-priority QCU
to finsh transmitting, before it'll go ahead and continue. If you mess
up the descriptor (ie by freeing it) then you're short of luck.
Thanks to rpaulo for sticking with me whilst I diagnosed this issue
that he was quite reliably triggering in his environment.
This is a reimplementation; it doesn't have anything in common with
the ath9k or the Qualcomm Atheros reference driver.
Now - it in theory doesn't apply on the EDMA chips, as long as you
push one complete frame into the FIFO at a time. But the MAC can DMA
from a list of frames pushed into the hardware queue (ie, you concat
'n' frames together with link pointers, and then push the head pointer
into the TXQ FIFO.) Since that's likely how I'm going to implement
CABQ handling in hostap mode, it's likely that I will end up teaching
the EDMA TX completion code about busy buffers, just to be "sure"
this doesn't creep up.
Tested - iperf ap->sta and sta->ap (with both sides running this code):
* AR5416 STA
* AR9160/AR9220 hostap
To validate that it doesn't break the EDMA (FIFO) chips:
* AR9380, AR9485, AR9462 STA
Using iperf with the -S <tos byte decimal value> to set the TCP client
side DSCP bits, mapping to different TIDs and thus different TX queues.
TODO:
* Make this work on the EDMA chips, if we end up pushing lists of frames
to the hardware (eg how we eventually will handle cabq in hostap/ibss
mode.)
Tijl Coosemans [Wed, 13 Mar 2013 22:01:31 +0000 (22:01 +0000)]
- Fix two possible overflows when testing if ELF program headers are on
the first page:
1. Cast uint16_t operands in a multiplication to unsigned int because
otherwise the implicit promotion to int results in a signed
multiplication that can overflow and the behaviour on integer
overflow is undefined.
2. Replace (offset + size > PAGE_SIZE) with (size > PAGE_SIZE - offset)
because the sum may overflow.
- Use the same tests to see if the path to the interpreter is on the first
page. There's no overflow here because size is already limited by
MAXPATHLEN, but the compiler optimises the new tests better. Also fix an
off-by-one error.
- Simplify tests to see if an ELF note program header is on the first page.
This also fixes an off-by-one error.
- Make quirk for reading device descriptor from broken USB devices.
Else they won't enumerate at all:
hw.usb.full_ddesc=1
- Reduce the USB descriptor read timeout from 1000ms to
500ms. Typical value for LOW speed devices is 50-100ms.
- Enumerate USB device a maximum of 3 times when a port
connection change event is detected, before giving up.
- Make the FreeBSD's USB library compile under Linux.
- Fix a compile warning where the return value of a call
to a write() function was ignored.
- Remove redundant include files from userland USB header files.
- Add some now needed include files to various C-files.
Unlike OpenBSD's, our setusercontext() will intentionally ignore the user's
own umask setting (from ~/.login.conf) unless running with the user's UID.
Therefore, we need to call it again with LOGIN_SETUMASK after changing UID.
Pyun YongHyeon [Wed, 13 Mar 2013 01:40:01 +0000 (01:40 +0000)]
r241438 broke IPMI access on Sun Fire X2200 M2(BCM5715).
Fix the IPMI regression by sending BGE_FW_DRV_STATE_UNLOAD to
ASF/IPMI firmware in driver attach phase. Sending heartheat to
ASF/IPMI is enabled only after upping interface so
setting driver state to BGE_FW_DRV_STATE_START in attach phase
broke IPMI access.
While I'm here, add NVRAM arbitration lock before performing
controller reset. ASF/IPMI firmware may be able to access the NVRAM
while controller reset is in progress. Without the arbitration
lock before resetting the controller, ASF/IPMI may not initialize
properly.
Special thanks to Miroslav Lachman who provided full remote
debugging environments.
Gleb Smirnoff [Tue, 12 Mar 2013 13:42:47 +0000 (13:42 +0000)]
Functions m_getm2() and m_get2() have different order of arguments,
and that can drive someone crazy. While m_get2() is young and not
documented yet, change its order of arguments to match m_getm2().
John Baldwin [Tue, 12 Mar 2013 12:35:02 +0000 (12:35 +0000)]
Remove fortunes-o from the base system. Debating what does or does not
belong in this file is not a useful exercise or conducive to producing a
high quality advanced operating system.
While here, simplify the make rules to autocompute BLDS and FILES from a
single DB variable.
Gleb Smirnoff [Tue, 12 Mar 2013 12:15:24 +0000 (12:15 +0000)]
In kern_sendfile() use m_extadd() instead of MEXTADD() macro, supplying
appropriate wait argument and checking return value. Before this change
m_extadd() could fail, and kern_sendfile() ignored that.
Gleb Smirnoff [Tue, 12 Mar 2013 12:12:16 +0000 (12:12 +0000)]
The m_extadd() can fail due to memory allocation failure, thus:
- Make it return int, not void.
- Add wait parameter.
- Update MEXTADD() macro appropriately, defaults to M_NOWAIT, as
before this change.
Alexander Motin [Tue, 12 Mar 2013 06:58:49 +0000 (06:58 +0000)]
Make kern_nanosleep() and pause_sbt() to use per-CPU sleep queues.
This removes significant sleep queue lock congestion on multithreaded
microbenchmarks, making them scale to multiple CPUs almost linearly.
Neel Natu [Mon, 11 Mar 2013 17:36:37 +0000 (17:36 +0000)]
Convert the offset into the bar that contains the MSI-X table to an offset
into the MSI-X table before using it to calculate the table index.
In the common case where the MSI-X table is located at the begining of the
BAR these two offsets are identical and thus the code was working by accident.
This change will fix the case where the MSI-X table is located in the middle
or at the end of the BAR that contains it.
John Baldwin [Mon, 11 Mar 2013 16:33:05 +0000 (16:33 +0000)]
Fix the 'C' field for a running thread to match the behavior described
in the manpage by having it display the current CPU (ki_oncpu) rather
than the previously used CPU (ki_lastcpu). ki_lastcpu is still used for
all other thread states.
Reported by: Chris Ross <cross+freebsd@distal.com>
MFC after: 1 week
Gleb Smirnoff [Mon, 11 Mar 2013 13:05:11 +0000 (13:05 +0000)]
Fix for quite a special case when userland emulates a netgraph node, and
userland can reply to a message with NGM_HASREPLY bit set. In this case
we should not wait for a response to a responce.
PR: 176771
Submitted by: Keith Reynolds <keith.reynolds tidalscale.com>