scottl [Mon, 24 Dec 2018 05:54:36 +0000 (05:54 +0000)]
Commands for user-initated device resets should come from the high-priority
allocator. Prior to this change, they would leak from the normal allocator.
scottl [Mon, 24 Dec 2018 05:05:38 +0000 (05:05 +0000)]
First step in refactoring and fixing the error recovery and task management
code in the mpr and mps drivers. Eliminate duplicated code and fix some
comments.
cy [Mon, 24 Dec 2018 01:12:43 +0000 (01:12 +0000)]
Remove an empty #if block.
The interesting thing is that looking through Darren's commit logs,
the line containing an extern ppsratecheck() definition was removed
from the v5-1-RELEASE branch but not from HEAD (I have taken his
CVS tree and converted it to GIT). There is a commit adding an
additional #if defined to the empty block. I can only assume that
this was intentional for something later. Looking through HEAD the
extern ppsratecheck() is there. However if we put it back it would
conflict with a static ppsratecheck() definition in fil.c when
building ipftest.
Therefore we remove this empty block.
ppsratecheck() is a function in the FreeBSD kernel. However ipftest
cannot call the ppsratecheck() in the kernel. Therefore one exists in
fil.c for use when building the userland ipftest utility which
approximates the packet filter in userland for testing of ipfilter
rules against packets captured with tcpdump.
kib [Sun, 23 Dec 2018 18:52:02 +0000 (18:52 +0000)]
Properly test for vmio buffer in bnoreuselist().
The presence of allocated v_object does not imply that the buffer is
necessary VMIO kind. Buffer might has been allocated before the
object created, then the buffer is malloced. Although we try to avoid
such situation, it seems to be still legitimate.
Reported and tested by: pho
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
pfg [Sun, 23 Dec 2018 18:15:48 +0000 (18:15 +0000)]
gai_strerror() - Update string error messages according to RFC 3493.
Error messages in gai_strerror(3) vary largely among OSs.
For new software we largely replaced the obsoleted EAI_NONAME and
with EAI_NODATA but we never updated the corresponding message to better
match the intended use. We also have references to ai_flags and ai_family
which are not very descriptive for non-developer end users.
Bring new new error messages based on informational RFC 3493, which has
obsoleted RFC 2553, and make them consistent among the header adn
manpage.
cy [Sun, 23 Dec 2018 05:10:36 +0000 (05:10 +0000)]
Remove NETBSD_PF. NETBSD_PF is a flag that defines whether the pfil(9)
framework is available. pfil(9) has been in FreeBSD since FreeBSD 5
and according to svn log was first committed to HEAD in 2000, therefore
it is safe to say the check is no longer needed in FreeBSD.
pfil(9) first appeared in NetBSD 1.3 (hence the name NETBSD_PF).
Therefore it is safe to say that it is supported by every NetBSD system
today. The framework also exists in illumos.
As ipfilter code is shared and exchanged between FreeBSD and NetBSD, and
at some point in the future illumos too, and as all three platforms have
pfil(9), the redundant NETBSD_PF #defines and #ifdefs are removed.
bde [Sat, 22 Dec 2018 22:59:11 +0000 (22:59 +0000)]
Fix devstat on md devices, second attempt. r341765 depends on
g_io_deliver() finishing initialization of the bio, but g_io_deliver()
actually destroys the bio. INVARIANTS makes the bug obvious by
overwriting the bio with garbage.
Restore the old order for calling devstat (except don't restore not calling
it for the error case), and translate to the devstat KPI so that this order
works.
vmaffione [Sat, 22 Dec 2018 16:23:42 +0000 (16:23 +0000)]
netmap: fix txsync check in netmap poll
To check if txsync can be skipped, it is necessary to look for
unseen TX space. However, this means comparing ring->cur
against ring->tail, rather than ring->head against ring->tail
(like nm_ring_empty() does).
This change also adds some more comments to explain the optimization
performed at the beginning of netmap_poll().
MFC after: 3 days
Sponsored by: Sunny Valley Networks
vmaffione [Sat, 22 Dec 2018 15:15:45 +0000 (15:15 +0000)]
netmap: fix bug in netmap_poll() optimization
The bug was introduced by r339639, although it is present in the upstream
netmap code since 2015. It is due to resetting the want_rx variable to
POLLIN, rather than resetting it to POLLIN|POLLRDNORM.
It only affects select(), which uses POLLRDNORM. poll() is not affected,
because it uses POLLIN.
Also, it only affects FreeBSD, because Linux skips the optimization
implemented by the piece of code where the bug occurs.
MFC after: 3 days
Sponsored by: Sunny Valley Networks
eugen [Sat, 22 Dec 2018 11:38:54 +0000 (11:38 +0000)]
ifconfig.4, lagg.4: fix documentation bug: -use_flowid needs to be used
to force local hash computation and disable usage of RSS hash
provided by driver.
bde [Sat, 22 Dec 2018 09:31:55 +0000 (09:31 +0000)]
Oops, rounddown() for the start was misspelled roundup() in r342295,
so only aligned starts worked. This broke releasing caches in most
cases where the i/o size is smaller than the fs block size.
kevans [Sat, 22 Dec 2018 06:08:06 +0000 (06:08 +0000)]
config(8): Remove all instances of an option when opting out
Quick follow-up to r342362: options can appear multiple times now, so
clean up all of them as needed. For non-OPTIONS options, this has no effect
since they're already de-duplicated.
kevans [Sat, 22 Dec 2018 06:02:34 +0000 (06:02 +0000)]
config(8): Allow duplicate options to be specified
config(8)'s option handling has been written to allow duplicate options; if
the value changes, then the latest value is used and an informative message
is printed to stderr like so:
/usr/src/sys/amd64/conf/TEST: option "VERBOSE_SYSINIT" redefined from 0 to 1
Currently, this is only a possibility for cpu types, MAXUSERS, and
MACHINE_ARCH. Anything else duplicated in a config file will use the first
value set and error about duplicated options on subsequent appearances,
which is arguably unfriendly since one could specify:
include GENERIC
nooptions VERBOSE_SYSINIT
options VERBOSE_SYSINIT
imp [Fri, 21 Dec 2018 23:22:37 +0000 (23:22 +0000)]
Try the first 256 units with nvmecontrol devlist.
The nvmecontrol code that did the devlist assumed that we had a
tightly-packed allocation of units. Since pci writing exists, this
isn't the case. Loop over the first 256 units, which is a reasonable
number of possible units.
bde [Fri, 21 Dec 2018 21:17:45 +0000 (21:17 +0000)]
Fix clobbering of the fatchain cache for clustered i/o's when full
clustering is not done. The bug caused extreme slowness for large
files in some cases.
There is no way to tell VOP_BMAP() how many blocks are wanted, so for
all file systems it has to waste time in some cases by searching for
more contiguous blocks than will be accessed. For msdosfs, it also
clobbered the fatchain cache in these cases by advancing the cache to
point to the chain entry for block that won't be read. This makes
the cache useless for the next sequential i/o (or VOP_BMAP()), so the
fat chain is searched from the beginning. The cache only has 1 relevant
entry, so it is similarly useless for random i/o.
Fix this by only advancing the cache to point to the chain entry for
the first block that will be read. Clustering uses results from
VOP_BMAP(), so when more than 1 block is read by clustering, the cache
is not advanced as optimally as before, but it is at most 1 cluster
size behind and searching the chain through the blocks for this cluster
doesn't take too long.
cem [Fri, 21 Dec 2018 20:30:52 +0000 (20:30 +0000)]
mps(4), mpr(4): remove SATA ID command cancellation hack
Add a generic mechanism to override mp?_wait_command's timeout behavior,
which continues to invoke reinit by default. Invokers who set
cm_timeout_handler may avoid automatic reinit and do their own handling.
Adapt mp?sas_get_sata_identify to this mechanism and remove its callout
hack.
cem [Fri, 21 Dec 2018 20:29:16 +0000 (20:29 +0000)]
mps(4), mpr(4): Fix lifetime of command buffer for mp?sas_get_sata_identify
In the event that the ID command timed out, mps(4)/mpr(4) did not free the
command until it could be cancelled. However, it freed the associated
buffer (cm_data). Fix the lifetime issue by freeing the associated buffer
only after Abort Task or controller reset.
bde [Fri, 21 Dec 2018 20:12:43 +0000 (20:12 +0000)]
Quick fix for initialization of mnt_iosize_max. (This limit controls
mainly clustering and read-ahead.) Copy the initialization from ffs,
and also copy a couple of lines of ffs's nearby style for initialization
order and whitespace.
A correct fix would de-duplicate the initialization and fix bitrot in it
instead of adding another instance of the duplication. Complications to
use the size preferred by the device have been reduced to hard-coding
slightly pessimal and/or inconsistent defaults, using large code that was
almost needed to support the complications.
For msdosfs, the result was that mnt_iosize_max was DFTLPHYS (64K) but is
now MAXPHYS (128K).
vmaffione [Fri, 21 Dec 2018 13:56:57 +0000 (13:56 +0000)]
netmap: nmreplay: import various fixes from upstream (2704a51839906)
Changelist:
- General reformatting
- Fix packet duplication in cons(). Whenever cons() reached the
burst limit it would send all pending packets without advancing
head. This caused the last injected packet to be sent again in
the next round.
- Fix full-speed transmissions after first loop.
vmaffione [Fri, 21 Dec 2018 11:50:14 +0000 (11:50 +0000)]
netmap: move buf_size validation code to its own function
This code validates the netmap buf_size against the interface MTU
and maximum descriptor size, to make sure the values are consistent.
Moving this functionality to its own function is needed because this
function is also called by Linux-specific code.
bde [Fri, 21 Dec 2018 08:15:31 +0000 (08:15 +0000)]
Use VOP_ADVISE() with POSIX_FADV_DONTNEED instead of IO_DIRECT to
implement not double-caching for reads from vnode-backed md devices.
Use VOP_ADVISE() similarly instead of !IO_DIRECT unsimilarly for writes.
Add a "cache" option to mdconfig to allow changing the default of not
caching.
This depends on a recent commit to fix VOP_ADVISE(). A previous version
had optimizations for sequential i/o's (merge the i/o's and only uncache
for discontiguous i/o's and for full blocks), but optimizations and
knowledge of block boundaries belong in VOP_ADVISE(). Read-ahead should
also be handled better, by supporting it in md and discarding it in
VOP_ADVISE().
POSIX_FADV_DONTNEED is ignored by zfs, but so is IO_DIRECT.
POSIX_FADV_DONTNEED works better than IO_DIRECT if it is not ignored,
since it only discards from the buffer cache immediately, while
IO_DIRECT also discards from the page cache immediately.
IO_DIRECT was not used for writes since it was claimed to be too slow,
but most of the slowness for writes is from doing them synchronously by
default. Non-synchronous writes still deadlock in many cases.
IO_DIRECT only has a special implementation for ffs reads with DIRECTIO
configured. Otherwise, if it is not ignored than it uses the buffer and
page caches normally except for discarding everything after each i/o,
and then it has much the same overheads as POSIX_FADV_DONTNEED. The
overheads for reading with ffs and DIRECTIO were similar in tests of md.
bde [Fri, 21 Dec 2018 04:57:59 +0000 (04:57 +0000)]
Fix rounding in vop_stdadvise() for POSIX_FADV_NOREUSE (really
POSIX_FADV_DONTNEED). The most broken case was for applications that
advise for the whole file and then do block-aligned i/o's 1 block at
a time. Then advice is sent to VOP_ADVISE() 1 block at a time, but
in vop_stdadvise() the 1-block advice was turned into 0-block advice
for the buffer cache part.
The bugs were caused partly by callers representing the region as
(a_start, a_end), where a_end is actually the maximum, and everything
else representing the region as (start, end) where 'end' is actually
the end (1 after the maximum). The maximum a_end must be rounded up,
but was rounded down. Also, rounding to page boundaries was inconsistent.
The bugs and fixes have no effect for zfs and other file systems that
don't use the buffer cache or the page cache. Most or all file systems
currently use the default VOP_FADVISE(), but it finds a null buffer cache
and a null page cache for file systems that don't use normal methods.
mckusick [Fri, 21 Dec 2018 01:09:25 +0000 (01:09 +0000)]
Some filesystems (like cd9660 and ext3) require that VFS_STATFS()
be called before VFS_ROOT() is called. Move the call for VFS_STATFS()
so that it is done after VFS_MOUNT(), but before VFS_ROOT().
This change actually improves the robustness of the mount system
call because it returns an error rather than failing silently
when VFS_STATFS() returns failure.
rmacklem [Thu, 20 Dec 2018 22:21:41 +0000 (22:21 +0000)]
Fix the NFSv4 server to obey vfs.nfsd.nfs_privport.
When the NFSv4 server was coded, I believed that the specification authors
did not want NFSv4 servers to require a client to use a reserved port#.
However, recently it has been noted that the Linux NFSv4 server does support
a check for a reserved port#.
Since both the FreeBSD and Linux NFSv4 clients use a reserved port# by
default, enabling vfs.nfsd.nfs_privport to require a reserved port# for
NFSv4 the same as it does for NFSv2, 3 seems reasonable.
The only case where this could cause a POLA violation is a FreeBSD NFSv4
server with vfs.nfsd.nfs_privport set, but with NFSv4 clients doing mounts
without using a reserved port# (< 1024).
cem [Thu, 20 Dec 2018 20:55:33 +0000 (20:55 +0000)]
tpm(4): Fix GCC build after r342084 (TPM 2.0 driver commit)
Move static variable definition (cdevsw) to a more conventional location
(the C file it is used in), rather than a header.
This fixes the GCC warning, -Wunused-variable ("defined but not used") when
the tpm20.h header is included in files other than tpm20.c (e.g.,
tpm_tis.c).
bcran [Thu, 20 Dec 2018 19:39:37 +0000 (19:39 +0000)]
Rework UEFI ESP generation
Currently, the installer uses pre-created 800KB FAT12 filesystems that
it dd's onto the ESP partition.
This changeset improves that by having the installer generate a FAT32
filesystem directly onto the ESP using newfs_msdos and then copying
loader.efi into /EFI/freebsd.
For live installs it then runs efibootmgr to add a FreeBSD boot entry
in the BIOS.
bcran [Thu, 20 Dec 2018 19:27:46 +0000 (19:27 +0000)]
Wait a maximum of 300 seconds for network send/recv in libsa
The reason for this change is that currently, a send/recv
takes many hours to time out.
This is suboptimal in the bootloader because it means for example
that NFS will take hours to fail before allowing subsequent access
methods such as gzip to be tried.
Setting MAXWAIT to 300 seconds (5 minutes) still allows slow
connections of 1Mb to be used to download a 30MB kernel file.
tuexen [Thu, 20 Dec 2018 16:05:30 +0000 (16:05 +0000)]
Fix a regression in the TCP handling of received segments.
When receiving TCP segments the stack protects itself by limiting
the resources allocated for a TCP connections. This patch adds
an exception to these limitations for the TCP segement which is the next
expected in-sequence segment. Without this patch, TCP connections
may stall and finally fail in some cases of packet loss.
Reported by: jhb@
Reviewed by: jtl@, rrs@
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D18580
mw [Thu, 20 Dec 2018 01:05:09 +0000 (01:05 +0000)]
Fix obtaining RSP address in TPM CRB for non-amd64 platforms
On amd64 the RSP address can be read in single 8-byte transaction,
which is obviously not possible on 32-bit platforms. Fix that
by performing 2 4-byte read on them.
imp [Wed, 19 Dec 2018 23:15:49 +0000 (23:15 +0000)]
32-bit mips SMP is unsupported
Per discussions on mips@, 32-bit mips SMP is now unsupported. The
files in the tree will compile for a while longer, but when the
atomic_swap_64 or similar atomic enters into the MI part of the tree,
as currently foreseen sometime next year, these ports will start to no
longer link. The JZ4780 is the only such system we have.
The UP version of this chip is unaffected by this, and will remain
supported.
imp [Wed, 19 Dec 2018 22:56:31 +0000 (22:56 +0000)]
Fix the date
The first part of the mips pruning has been commited. This part
is uncontested. Fix the date in the UPDATING file to reflect when
I made the commit. The contested parts will be committed (or not)
once those discussions complete.
imp [Wed, 19 Dec 2018 22:54:34 +0000 (22:54 +0000)]
Remove old config file for SENTRY5
This is an older broadcom part that implements the mips32 ISA. 32-bit
FreeBSD/mips now requires mips32r2, so retire this config. Most of the
broadcom port is shared with newer ports, so what little code may be
unique to this part has not been GC'd at this time.
Discussed on: freebsd-mips@
Differential Revision: https://reviews.freebsd.org/D18543
imp [Wed, 19 Dec 2018 22:54:23 +0000 (22:54 +0000)]
Remove the GXEMUL support.
gxemul was a nice stop-gap while qemu support for mips was firmed
up. Now MALTA* + qemu is the platform of choice retire gxemul support.
It's unknown when this was last confirmed working.
Discussed on: freebsd-mips@
Differential Revision: https://reviews.freebsd.org/D18543
imp [Wed, 19 Dec 2018 22:54:03 +0000 (22:54 +0000)]
Remove support for the now very old SiByte MIPS platform. It's not
relevant and is unused. It's also getting in the way of progress in
some admittedly minor ways. Better to retire it to reduce the burden
on the project.
Discussed on: freebsd-mips@
Differential Revision: https://reviews.freebsd.org/D18543
mw [Wed, 19 Dec 2018 22:47:37 +0000 (22:47 +0000)]
Fix alignment issue in uefisign
The pe_certificate structure has to be aligned to 8 bytes. [1]
Since this is now checked in edk2, any binaries signed with
older version of this tool will fail verification.
mjg [Wed, 19 Dec 2018 22:30:26 +0000 (22:30 +0000)]
mac: reduce pessimization of sdt probe handling
Prior to the change the code would branch on return value and then check
if probes are enabled. Since vast majority of the time they are not, this
is clearly wasteful. Check probes first.
emaste [Wed, 19 Dec 2018 18:16:29 +0000 (18:16 +0000)]
bootpd: validate hardware type
Due to insufficient validation of network-provided data it may have been
possible for a malicious actor to craft a bootp packet which could cause
a stack buffer overflow.
admbugs: 850
Reported by: Reno Robert
Reviewed by: markj
Approved by: so
Security: FreeBSD-SA-18:15.bootpd
Sponsored by: The FreeBSD Foundation
markj [Wed, 19 Dec 2018 04:54:32 +0000 (04:54 +0000)]
Remove a use of a negative array index from fxp(4).
This fixes a warning seen when compiling amd64 GENERIC with clang 7.
Also remove the workaround added in r337324. clang 7 and gcc 4.2
generate the same code with or without the code change.
avos [Wed, 19 Dec 2018 03:08:10 +0000 (03:08 +0000)]
net80211: fix out-of-bounds read in ieee80211_amrr(9).
ieee80211_alloc_node() does not initialize rateset tables; that's not
expected by rate control modules and will result in array access at
index -1 - where ni_essid[] array is located (zeroed at allocation, so
there are no user-visible consequences).
Just delay rate control initialization to the moment, when rateset
tables are initiaziled; nothing will use rates here anyway.
np [Wed, 19 Dec 2018 01:37:00 +0000 (01:37 +0000)]
cxgbe/t4_tom: fixes for issues on the passive open side.
- Fix PR 227760 by getting the TOE to respond to the SYN after the call
to toe_syncache_add, not during it. The kernel syncache code calls
syncache_respond just before syncache_insert. If the ACK to the
syncache_respond is processed in another thread it may run before the
syncache_insert and won't find the entry. Note that this affects only
t4_tom because it's the only driver trying to insert and expand
syncache entries from different threads.
- Do not leak resources if an embryonic connection terminates at
SYN_RCVD because of L2 lookup failures.
- Retire lctx->synq and associated code because there is never a need to
walk the list of embryonic connections associated with a listener.
The per-tid state is still called a synq entry in the driver even
though the synq itself is now gone.
avg [Tue, 18 Dec 2018 17:17:53 +0000 (17:17 +0000)]
ichwd: add a few assertions about tco_version
Those should ensure correctness of ichwd_find_ich_lpc_bridge() and
ichwd_find_ich_lpc_bridge() as well as make it easier for both humans
and static analyzers to see the relation between tco_version and ich and
smb variables in ichwd_identify().
Reported by: Coverity
CID: 1396314, 1396317
MFC after: 10 days
markj [Mon, 17 Dec 2018 21:48:20 +0000 (21:48 +0000)]
Remove UMS support code from radeonkms.
The code is unreachable since the entries of radeon_ioctls[] are not
associated with any device: we provide only the KMS entry points.
Moreover, r600_cp_dispatch_texture() contains an integer overflow bug
that can be triggered from userspace.[1]
Reported by: Anonymous of the Shellphish Grill Team[1]
Reviewed by: dumbbell
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18516
markj [Mon, 17 Dec 2018 21:13:05 +0000 (21:13 +0000)]
Revert r336326.
In testing on a Dell Latitude 7480, having ig4.ko loaded during a
suspend caused the system to hang. It turns out that ig4iic_intr() was
being called after the device entered D3, and entered an infinite loop
because a read of the I2C status register returned all ones, causing us
to attempt to read a byte from the data buffer until one of the status
bits clears. This occured because ig4iic_pci0 shares an interrupt with
the VGA device on this laptop, so ig4iic_intr() gets called even when
there is no work to do. This is exactly the problem fixed by r342170,
which resolves the hang for me and allows suspend/resume to work with
ig4.ko loaded. So, re-enable autoloading of ig4.ko in the hope that
r342170 resolves the problem universally.
Reviewed by: gonzo
MFC after: 1 month (pending an MFC of r342170)
Differential Revision: https://reviews.freebsd.org/D18587
asomers [Mon, 17 Dec 2018 18:11:06 +0000 (18:11 +0000)]
audit(4) tests: require /etc/rc.d/auditd
These tests should be skipped if /etc/rc.d/auditd is missing, which could be
the case if world was built with WITHOUT_AUDIT set. Also, one test case
requires /etc/rc.d/accounting.