MFC r215649,r215764,r215802,r215804,r215810,r215812,r216091,r216267,r218165,r220301,r215651,r215803,r216138,r218010,r217558,r220312,r220314,r215846 and r216268.
Backport USB PF and usbdump from head to 8-stable.
This fixes a long standing bug in mxge(4) where "ifconfig mxge0 $IP"
did not bring the interface into a RUNNING state, like it does on
most (all?) other FreeBSD NIC drivers.
Thanks to gnn for mentioning the bug, and yongari for pointing out that
ether_ioctl() invokes ifp->if_init() in SIOCSIFADDR.
MFC 218794, 219026:
Fixed IPsec's HMAC_SHA256-512 support to be RFC4868 compliant.
This will break interoperability with all older versions of
FreeBSD for those algorithms.
MFC r220920:
- Fix mapping of the last two SATA ports on 6-port Intel controllers.
This improves hard-reset and hot-plug on these ports.
- Device with ID 0x29218086 is a 2-port variant of ICH9 in legacy mode.
Skip probing for nonexistent slave devices there.
MFC r220917:
Use periodic status polling added at r214671 only in ATA_CAM mode. Legacy
mode won't receive much benefit from it due to its hot-plug limitations.
MFC r220886:
Add basic support for DMA-capable ATA disks on DMA-incapable controller.
This is really rare situation these days, but still may happen in embedded.
MFC r220786:
Remove always false "< 0" check for unsgined int variable. This check is
also duplicate, as the value was already checked for 0 before decrementing.
MFC r221077.
The maximum NCM frame size must be so that it
will generate a short terminated USB transfer if
the maximum NCM frame size is greater than what
the driver can handle.
MFC 210864:
- Retire acpi_pcib_resume(). It is has just been an alias for
bus_generic_resume() since the pci_link(4) driver was added.
- Change the ACPI PCI-PCI bridge driver to inherit most of its methods
from the generic PCI-PCI bridge driver.
- This also exposes the generic PCI-PCI bridge driver as pcib_driver so
other drivers can inherit from it.
hastd(8) maintains a map of dirty extents, not hastctl(8). Fix this.
r220521:
Fix a typo in comments.
r220522:
In hast_proto_recv_data() check that the size of the data to be
received does not exceed the buffer size.
r220573 (pjd):
The replication mode that is currently support is fullsync, not memsync.
Correct this and print a warning if different replication mode is
configured.
r220523, r220744:
Remove hast_proto_recv(). It was used only in one place, where
hast_proto_recv_hdr() may be used.
r220865 (pjd):
Scenario:
- We have two nodes connected and synchronized (local counters on both sides
are 0).
- We take secondary down and recreate it.
- Primary connects to it and starts synchronization (but local counters are
still 0).
- We switch the roles.
- Synchronization restarts but data is synchronized now from new primary
(because local counters are 0) that doesn't have new data yet.
This fix this issue we bump local counter on primary when we discover that
connected secondary was recreated and has no data yet.
If we act in different role than requested by the remote node, log it
as a warning and not an error.
MFC after: 1 week
r220898 (pjd), r220899 (pjd):
When we become primary, we connect to the remote and expect it to be in
secondary role. It is possible that the remote node is primary, but only
because there was a role change and it didn't finish cleaning up (unmounting
file systems, etc.). If we detect such situation, wait for the remote node
to switch the role to secondary before accepting I/Os. If we don't wait for
it in that case, we will most likely cause split-brain.
r220873:
- Move all Ethernet specific items from sge_eq to sge_txq. sge_eq is
now a suitable base for all kinds of egress queues.
- Add control queues (sge_ctrlq) and allocate one of these per hardware channel. They can be used to program filters and steer traffic (and
more).
r220897:
Use the correct free routine when destroying a control queue.
r220905:
Ring the freelist doorbell from within refill_fl. While here, fix a bug
that could have allowed the hardware pidx to reach the cidx even though
the freelist isn't empty. (Haven't actually seen this but it was there
waiting to happen..)
MFC: r220611
Add VOP_PATHCONF() support to the experimental NFS client
so that it can, along with other things, report whether or
not NFS4 ACLs are supported.
MFC: r220610
Fix the experimental NFSv4 client so that it recognizes server
mount point crossings correctly. It was testing the wrong flag.
Also, try harder to make sure that the fsid is different than
the one assigned to the client mount point, by hashing the
server's fsid (just to create a different value deterministically)
when it is the same.
MFC: r220585
Fix a couple of mbuf leaks introduced by r217242. I do
not believe that these leaks had a practical impact,
since the situations in which they would have occurred
would have been extremely rare.
MFC 220975:
- Clarification on kld_file_stat.size
- While here, remove a few C comments that don't seem to contribute
anything additional to the man page.
Hold the vnode around the region where object lock is dropped, until
vnode lock is acquired.
Do not drop the vnode reference for a case when the object was
deallocated during unlock. Note that in this case, VV_TEXT is cleared
by vnode_pager_dealloc().
MFC: r220546
Vrele ni_startdir in the experimental NFS server for the case
of NFSv2 getting an error return from VOP_MKNOD(). Without this
patch, the server file system remains busy after an NFSv2
VOP_MKNOD() fails.
MFC: r210933,r211397,r220518
Modify the man pages to reflect the addition of a backup
stable restart file, as done by r220510. I also merged the
typo fixes done in head as r210933, r211397 with joel@'s
permission.
This is a content change.
MFC: r220530
Add some cleanup code to the module unload operation for
the experimental NFS server, so that it doesn't leak memory
when unloaded. However, unloading the NFSv4 server is not
recommended, since all NFSv4 state will be lost by the unload
and clients will have to recover the state after a server
reload/restart as if the server crashed/rebooted.
MFC: r220519
Fix a bug in the userland rpc library, where it would use a
negative return value from write to update its position in
a buffer. The patch, courtesy of Andrey Simonenko, also simplifies
a conditional by removing the "i != cnt" clause, since it is
always true at this point in the code. The bug caused problems
for mountd, when it generated a large reply to an exports RPC
request.
MFC: r220510
Add support for a backup stable restart file to the nfsd,
used for NFSv4 restart. This permits the nfsd to create
the stable restart file as required and minimizes the risk
of trouble if the file is lost.
MFC: r220507
Add a VOP_UNLOCK() for the directory, when that is not what
VOP_LOOKUP() returned. This fixes a bug in the experimental
NFS server for the case where VFS_VGET() fails returning EOPNOTSUPP
in the ReaddirPlus RPC, forcing the use of VOP_LOOKUP() instead.
Don't clog syslog up with "inet_ntop(): Address family not supported
by protocol family" when processing requests received from the UNIX
domain socket.
The change in r206686 to allow the stop argument to work for a service
that is running even though not _enable'd had an annoying side effect.
If the service was already started at boot time by another means when
the related script came around again in rcorder it would start again,
regardless of _enable, because there was a valid pid.
So, split the test into 2 parts, one for (!rcvar && !stop), and one
for (stop && !valid_pid). This preserves the behavior from r206686
while preventing the undesired side effect.
MFC r220558.
We don't need to call EOWRITE4(sc, EHCI_USBINTR, 0) directly from each EHCI
bus driver at detach, hence ehci_detach() does exactly this since r199718.
r220649:
Fix a couple of bad races that can occur when a cxgbe interface is taken
down. The ingress queue lock was unused and has been removed as part of
these changes.
- An in-flight egress update from the SGE must be handled before the
queue that requested it is destroyed. Wait for the update to arrive.
- Interrupt handlers must stop processing rx events for a queue before
the queue is destroyed. Events that have not yet been processed
should be ignored once the queue disappears.
humanize_number(3) multiply the input number by 100, which could cause an
integer overflow when the input is very large (for example, 100 Pi would
become about 10 Ei which exceeded signed int64_t).
Solve this issue by splitting the division into two parts and avoid the
multiplication.
Changed "conscontrol unset" to accept an existing virtual
console device as an argument. Unsetting virtual console
using /dev/console seems to have never worked.
MFC r220615:
Refactor hard-reset implementation in mvs(4).
Instead of spinning in a tight loop for up to 15 seconds, polling for device
readiness while it spins up, return reset completion just after PHY reports
"connect well" or 100ms connection timeout. If device was found, use callout
for checking device readiness with 100ms period up to full 31 second timeout.
This fixes system freeze for 5-10 seconds on drives hot plug-in.
MFC r220777:
- Tune different wait loops to cut some more milliseconds from reset time.
- Do not call ahci_start() before device signature received. It is required
by the specification and caused non-fatal reset timeouts on AMD chipsets.
MFC r220657:
Some changes around hot-plug and interface power-management:
- use ATA_SE_EXCHANGED (SError.DIAG.X) bit to detect hot-plug events when
power-management enabled and ATA_SE_PHY_CHANGED (SError.DIAG.N) can't be
trusted;
- on controllers supporting staggered spin-up (SS) put unused channels
into Listen state instead of Off. It should still save some power, but
allow plug-in events to be detected;
- on controllers supporting cold presence detection (CPD), when power
management enabled, use CPD events to detect hot-plug in addition to PHY
events.
MFC r220602:
Improve SATA Asynchronous Notification feature support in CAM:
- make SATA SIMs announce capabilities to handle SDB with Notification bit;
- make PMP driver honor this SIMs capability;
- make SATA XPT to negotiate and enable this feature for ATAPI devices.
This feature allows supporting SATA ATAPI devices to inform system about
some events happened, that may require attention. In my case this allows
LG GH22LS50 SATA DVR-RW drive to report tray open/close events. Events
reported to CAM in form of AC_SCSI_AEN async. Further they could be used
as a hints for checking device status and reporting media change to upper
layers, for example, via spoiling mechanism of GEOM.
MFC r220591:
As soon as siis_reset() doesn't waits for device readiness, but only for
controller port readiness (that should set just after PHY ready signal),
reduce wait time from 10s to 1s before trying more aggressive reset method.
This should improve system responsibility in some failure conditions.
MFC r220576:
Refactor hard-reset implementation in ahci(4).
Instead of spinning in a tight loop for up to 15 seconds, polling for device
readiness while it spins up, return reset completion just after PHY reports
"connect well" or 100ms connection timeout. If device was found, use callout
for checking device readiness with 100ms period up to full 31 second timeout.
This fixes system freeze for 5-10 seconds on drives hot plug-in.
MFC r217877, r217883:
Hardware supported by siis(4) allows software control over activity LEDs.
Expose that functionality to led(4) OR-ing it with regular LED activity.
MFC r218596, r218605:
Disable NCQ for multiport Marvell 88SX61XX SATA controllers. Simultaneous
active I/O to several disks (copying large file on ZFS) causes timeout after
just a few seconds of run. Single port 88SX6111 seems like not affected.
Skip reading transferred bytes count for these controllers. It works for
88SX6111, but 88SX6145 always returns zero there. Haven't tested others,
but better to be safe.
MFC r220412, r220414, r220454, r220618, r220814:
- Make ada(4) driver to control device write cache, same as ata(4) does.
Add kern.cam.ada.write_cache sysctl/tunable to control it alike hw.ata.wc.
- Add kern.cam.ada.X.write_cache tunables/sysctls to control write caching
on per-device basis.
- While adding support for per-device sysctls, merge from graid branch
support for ADA_TEST_FAILURE kernel option, which opens few more sysctl,
allowing to simulate read and write errors for testing purposes.
MFC r214989:
When requesting sense data for SIM not doing it automatically (such as
ATAPI or USB), request only as much data as requested by consumer.
On the way back -- report how much sense data we have actually received.
Remove a check in udp6_send() that prevented v4-mapped v6 addresses from
working. We store v4 and v6 addresses as a union but for v4-mapped
addresses only store the 32bits w/o the ::ffff: word. That failed the
check as for example 127.0.0.1 would be ::7f00:1 rather than ::ffff:7f00:1
and the IN6_IS_ADDR_V4MAPPED() never worked here. Given we can hardly get
here with an unbound local address or invalid inp_vflags remove the check.
After r219579 and r219779 unbreak v4-mapped v6 sockets for UDP
some more. Similar to what we do for TCP check for v4-mapped
addresses and then handle them or the normal v6 address case.
For either set inp_vflags before calling into the pcb connect
function so that we have an unambiguous view in case we need to
set the local address or port.
In some cases as udp6_connect() without an earlier bind(2) to an
address, v4-mapped sockets allowed and a non mapped destination
address, we can end up here with both v4 and v6 indicated:
inp_vflag = (INP_IPV4|INP_IPV6|INP_IPV6PROTO)
In that case however laddrp is NULL as the IPv6 path does not
pass in a copy currently.
MFC 220430,220431,220452,220460:
If a system call does not request a full interrupt return, use a fast
path via the sysretq instruction to return from the system call. This
resolves most of the performance regression in system call microbenchmarks
between 7 and 8 on amd64.
While here, trim an instruction (and memory access) from the doreti path
and fix a typo in a comment.
Fix IPv6 ND. After r219562 we in nd6_ns_input() were erroneously always
passing the cached proxydl reference (sockaddr_dl initialized or not) to
nd6_na_output(). nd6_na_output() will thus assume a proxy NA. Revert to
conditionally passing either &proxydl or NULL if no proxy case desired.
Tested by: ipv6gw and ref9-i386
Tested by: Pete French (petefrench ingresso.co.uk on stable)
Reported by: Pete French (petefrench ingresso.co.uk on stable)
Reported by: bz, simon on Y! cluster
Reported by: kib
PR: kern/151908
X-Early-MFC: yes
Rework change made at r203146. Instead of reporting all wire errors as
SCSI status errors to CAM (that was wrong, as it too often turned retriable
wire errors into non-retriable REQUEST SENSE errors), do it only for STALL
errors on control pipe of the CBI devices. STALL on control pipe is just
a one of the ways to report error for CBI devices.
When specifying the -t option (send tag in front of message), this tag
should also be forwarded to the remote logging host, not only when the
logging is done locally.
If building (custom) FreeBSD images people tend to patch param.h. In case
this happens just before the build is started (within the same second)
CHECK_TIME actually triggers thinking param.h is in the future (see f_Xtime,
c_Xtime logi in find(1) sources for the details in !F_EXACTTIME case).
Using the -mtime -0s (seconds, rather than no unit) avoids this 1s race.