r220873:
- Move all Ethernet specific items from sge_eq to sge_txq. sge_eq is
now a suitable base for all kinds of egress queues.
- Add control queues (sge_ctrlq) and allocate one of these per hardware channel. They can be used to program filters and steer traffic (and
more).
r220897:
Use the correct free routine when destroying a control queue.
r220905:
Ring the freelist doorbell from within refill_fl. While here, fix a bug
that could have allowed the hardware pidx to reach the cidx even though
the freelist isn't empty. (Haven't actually seen this but it was there
waiting to happen..)
MFC: r220611
Add VOP_PATHCONF() support to the experimental NFS client
so that it can, along with other things, report whether or
not NFS4 ACLs are supported.
MFC: r220610
Fix the experimental NFSv4 client so that it recognizes server
mount point crossings correctly. It was testing the wrong flag.
Also, try harder to make sure that the fsid is different than
the one assigned to the client mount point, by hashing the
server's fsid (just to create a different value deterministically)
when it is the same.
MFC: r220585
Fix a couple of mbuf leaks introduced by r217242. I do
not believe that these leaks had a practical impact,
since the situations in which they would have occurred
would have been extremely rare.
MFC 220975:
- Clarification on kld_file_stat.size
- While here, remove a few C comments that don't seem to contribute
anything additional to the man page.
Hold the vnode around the region where object lock is dropped, until
vnode lock is acquired.
Do not drop the vnode reference for a case when the object was
deallocated during unlock. Note that in this case, VV_TEXT is cleared
by vnode_pager_dealloc().
MFC: r220546
Vrele ni_startdir in the experimental NFS server for the case
of NFSv2 getting an error return from VOP_MKNOD(). Without this
patch, the server file system remains busy after an NFSv2
VOP_MKNOD() fails.
MFC: r210933,r211397,r220518
Modify the man pages to reflect the addition of a backup
stable restart file, as done by r220510. I also merged the
typo fixes done in head as r210933, r211397 with joel@'s
permission.
This is a content change.
MFC: r220530
Add some cleanup code to the module unload operation for
the experimental NFS server, so that it doesn't leak memory
when unloaded. However, unloading the NFSv4 server is not
recommended, since all NFSv4 state will be lost by the unload
and clients will have to recover the state after a server
reload/restart as if the server crashed/rebooted.
MFC: r220519
Fix a bug in the userland rpc library, where it would use a
negative return value from write to update its position in
a buffer. The patch, courtesy of Andrey Simonenko, also simplifies
a conditional by removing the "i != cnt" clause, since it is
always true at this point in the code. The bug caused problems
for mountd, when it generated a large reply to an exports RPC
request.
MFC: r220510
Add support for a backup stable restart file to the nfsd,
used for NFSv4 restart. This permits the nfsd to create
the stable restart file as required and minimizes the risk
of trouble if the file is lost.
MFC: r220507
Add a VOP_UNLOCK() for the directory, when that is not what
VOP_LOOKUP() returned. This fixes a bug in the experimental
NFS server for the case where VFS_VGET() fails returning EOPNOTSUPP
in the ReaddirPlus RPC, forcing the use of VOP_LOOKUP() instead.
Don't clog syslog up with "inet_ntop(): Address family not supported
by protocol family" when processing requests received from the UNIX
domain socket.
The change in r206686 to allow the stop argument to work for a service
that is running even though not _enable'd had an annoying side effect.
If the service was already started at boot time by another means when
the related script came around again in rcorder it would start again,
regardless of _enable, because there was a valid pid.
So, split the test into 2 parts, one for (!rcvar && !stop), and one
for (stop && !valid_pid). This preserves the behavior from r206686
while preventing the undesired side effect.
MFC r220558.
We don't need to call EOWRITE4(sc, EHCI_USBINTR, 0) directly from each EHCI
bus driver at detach, hence ehci_detach() does exactly this since r199718.
r220649:
Fix a couple of bad races that can occur when a cxgbe interface is taken
down. The ingress queue lock was unused and has been removed as part of
these changes.
- An in-flight egress update from the SGE must be handled before the
queue that requested it is destroyed. Wait for the update to arrive.
- Interrupt handlers must stop processing rx events for a queue before
the queue is destroyed. Events that have not yet been processed
should be ignored once the queue disappears.
humanize_number(3) multiply the input number by 100, which could cause an
integer overflow when the input is very large (for example, 100 Pi would
become about 10 Ei which exceeded signed int64_t).
Solve this issue by splitting the division into two parts and avoid the
multiplication.
Changed "conscontrol unset" to accept an existing virtual
console device as an argument. Unsetting virtual console
using /dev/console seems to have never worked.
MFC r220615:
Refactor hard-reset implementation in mvs(4).
Instead of spinning in a tight loop for up to 15 seconds, polling for device
readiness while it spins up, return reset completion just after PHY reports
"connect well" or 100ms connection timeout. If device was found, use callout
for checking device readiness with 100ms period up to full 31 second timeout.
This fixes system freeze for 5-10 seconds on drives hot plug-in.
MFC r220777:
- Tune different wait loops to cut some more milliseconds from reset time.
- Do not call ahci_start() before device signature received. It is required
by the specification and caused non-fatal reset timeouts on AMD chipsets.
MFC r220657:
Some changes around hot-plug and interface power-management:
- use ATA_SE_EXCHANGED (SError.DIAG.X) bit to detect hot-plug events when
power-management enabled and ATA_SE_PHY_CHANGED (SError.DIAG.N) can't be
trusted;
- on controllers supporting staggered spin-up (SS) put unused channels
into Listen state instead of Off. It should still save some power, but
allow plug-in events to be detected;
- on controllers supporting cold presence detection (CPD), when power
management enabled, use CPD events to detect hot-plug in addition to PHY
events.
MFC r220602:
Improve SATA Asynchronous Notification feature support in CAM:
- make SATA SIMs announce capabilities to handle SDB with Notification bit;
- make PMP driver honor this SIMs capability;
- make SATA XPT to negotiate and enable this feature for ATAPI devices.
This feature allows supporting SATA ATAPI devices to inform system about
some events happened, that may require attention. In my case this allows
LG GH22LS50 SATA DVR-RW drive to report tray open/close events. Events
reported to CAM in form of AC_SCSI_AEN async. Further they could be used
as a hints for checking device status and reporting media change to upper
layers, for example, via spoiling mechanism of GEOM.
MFC r220591:
As soon as siis_reset() doesn't waits for device readiness, but only for
controller port readiness (that should set just after PHY ready signal),
reduce wait time from 10s to 1s before trying more aggressive reset method.
This should improve system responsibility in some failure conditions.
MFC r220576:
Refactor hard-reset implementation in ahci(4).
Instead of spinning in a tight loop for up to 15 seconds, polling for device
readiness while it spins up, return reset completion just after PHY reports
"connect well" or 100ms connection timeout. If device was found, use callout
for checking device readiness with 100ms period up to full 31 second timeout.
This fixes system freeze for 5-10 seconds on drives hot plug-in.
MFC r217877, r217883:
Hardware supported by siis(4) allows software control over activity LEDs.
Expose that functionality to led(4) OR-ing it with regular LED activity.
MFC r218596, r218605:
Disable NCQ for multiport Marvell 88SX61XX SATA controllers. Simultaneous
active I/O to several disks (copying large file on ZFS) causes timeout after
just a few seconds of run. Single port 88SX6111 seems like not affected.
Skip reading transferred bytes count for these controllers. It works for
88SX6111, but 88SX6145 always returns zero there. Haven't tested others,
but better to be safe.
MFC r220412, r220414, r220454, r220618, r220814:
- Make ada(4) driver to control device write cache, same as ata(4) does.
Add kern.cam.ada.write_cache sysctl/tunable to control it alike hw.ata.wc.
- Add kern.cam.ada.X.write_cache tunables/sysctls to control write caching
on per-device basis.
- While adding support for per-device sysctls, merge from graid branch
support for ADA_TEST_FAILURE kernel option, which opens few more sysctl,
allowing to simulate read and write errors for testing purposes.
MFC r214989:
When requesting sense data for SIM not doing it automatically (such as
ATAPI or USB), request only as much data as requested by consumer.
On the way back -- report how much sense data we have actually received.
Remove a check in udp6_send() that prevented v4-mapped v6 addresses from
working. We store v4 and v6 addresses as a union but for v4-mapped
addresses only store the 32bits w/o the ::ffff: word. That failed the
check as for example 127.0.0.1 would be ::7f00:1 rather than ::ffff:7f00:1
and the IN6_IS_ADDR_V4MAPPED() never worked here. Given we can hardly get
here with an unbound local address or invalid inp_vflags remove the check.
After r219579 and r219779 unbreak v4-mapped v6 sockets for UDP
some more. Similar to what we do for TCP check for v4-mapped
addresses and then handle them or the normal v6 address case.
For either set inp_vflags before calling into the pcb connect
function so that we have an unambiguous view in case we need to
set the local address or port.
In some cases as udp6_connect() without an earlier bind(2) to an
address, v4-mapped sockets allowed and a non mapped destination
address, we can end up here with both v4 and v6 indicated:
inp_vflag = (INP_IPV4|INP_IPV6|INP_IPV6PROTO)
In that case however laddrp is NULL as the IPv6 path does not
pass in a copy currently.
MFC 220430,220431,220452,220460:
If a system call does not request a full interrupt return, use a fast
path via the sysretq instruction to return from the system call. This
resolves most of the performance regression in system call microbenchmarks
between 7 and 8 on amd64.
While here, trim an instruction (and memory access) from the doreti path
and fix a typo in a comment.
Fix IPv6 ND. After r219562 we in nd6_ns_input() were erroneously always
passing the cached proxydl reference (sockaddr_dl initialized or not) to
nd6_na_output(). nd6_na_output() will thus assume a proxy NA. Revert to
conditionally passing either &proxydl or NULL if no proxy case desired.
Tested by: ipv6gw and ref9-i386
Tested by: Pete French (petefrench ingresso.co.uk on stable)
Reported by: Pete French (petefrench ingresso.co.uk on stable)
Reported by: bz, simon on Y! cluster
Reported by: kib
PR: kern/151908
X-Early-MFC: yes
Rework change made at r203146. Instead of reporting all wire errors as
SCSI status errors to CAM (that was wrong, as it too often turned retriable
wire errors into non-retriable REQUEST SENSE errors), do it only for STALL
errors on control pipe of the CBI devices. STALL on control pipe is just
a one of the ways to report error for CBI devices.
When specifying the -t option (send tag in front of message), this tag
should also be forwarded to the remote logging host, not only when the
logging is done locally.
If building (custom) FreeBSD images people tend to patch param.h. In case
this happens just before the build is started (within the same second)
CHECK_TIME actually triggers thinking param.h is in the future (see f_Xtime,
c_Xtime logi in find(1) sources for the details in !F_EXACTTIME case).
Using the -mtime -0s (seconds, rather than no unit) avoids this 1s race.
VNET socket push back:
try to minimize the number of places where we have to switch vnets
and narrow down the time we stay switched. Add assertions to the
socket code to catch possibly unset vnets as seen in r204147.
While this reduces the number of vnet recursion in some places like
NFS, POSIX local sockets and some netgraph, .. recursions are
impossible to fix.
The current expectations are documented at the beginning of
uipc_socket.c along with the other information there.
Sponsored by: The FreeBSD Foundation
Sponsored by: CK Software GmbH
Reviewed by: jhb
Tested by: zec
MFC 220156:
Clamp the initial advertised receive window when responding to a SYN/ACK
to the maximum allowed window. Growing the window too large would cause
an underflow in the calculations in tcp_output() to decide if a window
update should be sent which would prevent the persist timer from being
started if data was pending and the other end of the connection advertised
an initial window size of 0.
MFC 220126:
- Enable an extra debugging bootverbose printf when probing ISA PNP cards
listing each card as it is found on non-PC98 (PC98 already had this).
- Increase the length of the DELAY() used before timing out while reading
PNP resource data.
MFC 219865:
Add pci_find_cap() as an alias for pci_find_extcap() to ease driver
portability with 9+ where pci_find_extcap() has been renamed to
pci_find_cap().
MFC 219717,220363:
- Add more details to the 'show battery' command including more raw
capacity values, charge cycle count, temperature, and more detailed
status.
- Add the ability to manage the state of write caching when the battery
back-up is missing or dead. The current state of this field is reported
in 'mfiutil cache <volume>' and can be adjusted via
'mfiutil cache <volume> bad-bbu-write-cache <enable|disable>'. This
setting should generally be disabled to avoid data loss.
MFC r220376: Allow strerror(0) and strerror_r(0, ...).
Of course, strerror_r() may still fail with ERANGE.
Although the POSIX specification said this could fail with EINVAL and
doing this likely indicates invalid use of errno, most other
implementations permitted it, various POSIX testsuites require it to
work (matching the older sys_errlist array) and apparently some
applications depend on it.
MFC r220461:
Remove setting of PCB_FULL_IRET at the places where we are going to call
update_gdt_{f,g}sbase. The functions set the flag when td == curthread,
and sysarch is always called with curthread.