The NS_MOREFRAG flag can be set in a netmap slot to represent a
multi-fragment packet. Only the last fragment of a packet does
not have the flag set. On TX rings, the flag may be set by the
userspace application. The kernel will look at the flag and use it
to properly set up the NIC TX descriptors.
On RX rings, the kernel may set the flag if the packet received
was split across multiple netmap buffers. The userspace application
should look at the flag to know when the packet is complete.
Kyle Evans [Sun, 24 Jan 2021 19:25:34 +0000 (13:25 -0600)]
lualoader: improve loader.conf var processing
lualoader was previously not processing \ as escapes; this commit fixes
that and does better error checking on the value as well.
Additionally, loader.conf had some odd restrictions on values that make
little sense. Previously, lines like:
kernel=foo
Would simply be discarded with a malformed line complaint you might not
see unless you disable beastie.
lualoader tries to process these as well as it can and manipulates the
environment, while forthloader did minimal processing and constructed a
`set` command to do the heavy lifting instead. The lua approach was
re-envisioned from building a `set` command so that we can appropriately
reset the environment when, for example, boot environments change.
Lift the previous restrictions to allow unquoted values on the right hand
side of an expression. Note that an unquoted value is effectively:
[A-Za-z0-9-][A-Za-z0-9-_.]*
This commit also stops trying to weirdly limit what it can handle in a
quoted value. Previously it only allowed spaces, alphanumeric, and
punctuation, which is kind of weird. Change it here to grab as much as it
can between two sets of quotes, then let processEnvVar() do the needful and
complain if it finds something malformed looking.
My extremely sophisticated test suite is as follows:
Future work may provide a stub loader module in userland so that we can
formally test the loader scripts rather than sketchy setups like the above
in conjunction with the lua-* tools in ^/tools/boot.
Alexander Motin [Sun, 24 Jan 2021 19:23:04 +0000 (14:23 -0500)]
Exclude reserved iSCSI Initiator Task Tag.
RFC 7143 (11.2.1.8):
An ITT value of 0xffffffff is reserved and MUST NOT be assigned for a
task by the initiator. The only instance in which it may be seen on
the wire is in a target-initiated NOP-In PDU (Section 11.19) and in
the initiator response to that PDU, if necessary.
Alexander Motin [Sun, 24 Jan 2021 18:58:29 +0000 (13:58 -0500)]
Exclude reserved iSCSI Target Transfer Tag.
RFC 7143 (11.7.4):
The Target Transfer Tag values are not specified by this protocol,
except that the value 0xffffffff is reserved and means that the
Target Transfer Tag is not supplied.
Jessica Clarke [Sat, 23 Jan 2021 20:59:15 +0000 (20:59 +0000)]
Fix cross-build support for Ubuntu 16.04
Older glibc headers did some very nasty things that have since been
sanitised. We could also fix this by adding a linux/getopt.h wrapper
alongside the existing common/getopt.h that #undef's __need_getopt, but
that seems a little more hacky and complicated.
The reverted commit was motivated by a problem observed on stable/12,
but it turns out that a better solution was committed in r348309 but not
MFCed. So, revert this change since it is unnecessary and not really
correct: it assumes that the order in which module metadata records is
defined determines their order in the output linker set. While this
seems to hold in my testing, it is not guaranteed.
Reported by: cem
Discussed with: imp
MFC after: 3 days
Target value for val has uint32_t type, not uint, adjust used constant.
Change val type to unsigned so that left and right sides of comparision
operator do not expose different signed types of same range [*].
Switch to unsigned long long and strtoll(3) so that 0x80000000 is
accepted by conversion function [**].
Reported by: kargl [*]
Noted by: emaste [**]
Reviewed by: emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28301
nfs client: block vnode_pager_setsize() calls from nfscl_loadattrcache in nfs_write
Otherwise writing thread might wait on sbusy state of the pages which were
busied by itself, similarly to nfs_read(). But also we need to clear
NVNSETSZKSIP flag possibly set by ncl_pager_setsize(), to not undo
extension done by write.
By default, axgbe driver does a receiver reset after predefined number
of retries for the link to come up. However, this receiver reset
doesn't always suffice, due to an hardware issue.
In that case, as a workaround, a complete phy reset is necessary.
This patch introduces a sysctl that can be set to 1 to let the driver
reset the phy completely, rather than just doing receiver reset.
The workaround will be removed once the issue is fixed by means
of firmware update.
This patch also fixes the handling of the direct attach cables
properly.
Ed Maste [Fri, 22 Jan 2021 17:22:35 +0000 (12:22 -0500)]
elfctl: allow features to be specified by value
This will allow elfctl on older releases to set bits that are not yet
known there, so that the binary will have the correct settings applied
if run on a later FreeBSD version.
PR: 252629 (related)
Suggested by: kib
Reviewed by: gbe (manpage, earlier), kib
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28284
Mitchell Horne [Fri, 22 Jan 2021 18:56:56 +0000 (14:56 -0400)]
gdb: only return signal values for powerpc's gdb_cpu_signal()
Summary:
The values returned by this function are reported to the gdb client as
the reason for the break in execution, a signal value such as SIGTRAP,
SIGEMT, or SIGSEGV. As such, exact vector numbers can be misidentified.
Return SIGEMT in the default case instead.
Reviewed by: alfredo
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D28046
Jamie Gritton [Fri, 22 Jan 2021 18:56:24 +0000 (10:56 -0800)]
jail: fix dangling reference bug from 6754ae2572eb
The change to use refcounts for pr_uref was mishandled in
prison_proc_free, so killing a jail's last process could add
an extra reference, leaving it an unkillable zombie.
Kyle Evans [Wed, 20 Jan 2021 14:01:25 +0000 (08:01 -0600)]
build: remove LIBPTHREAD/LIBTHR build options
WITHOUT_LIBTHR has been broken for a little over five years now, since the
xz 5.2.0 update introduced a hard liblzma dependency on libthr, and building
a useful system without threading support is becoming increasingly more
difficult.
Additionally, in the five plus years that it's been broken more reverse
dependencies have cropped up in libzstd, libsqlite3, and libcrypto (among
others) that make it more and more difficult to reconcile the effort needed
to fix these options.
Marius Strobl [Thu, 21 Jan 2021 22:34:19 +0000 (23:34 +0100)]
sym(4): fix nits reported by Coverity
- In ___dma_getp(), remove dead code. [1]
- In sym_sir_bad_scsi_status(), add missing FALLTHROUGH. [2]
While at it:
- For getbaddrcb(), remove __unused from the nseg argument as it's in
fact used when compiling with INVARIANTS.
- In sym_int_sir(), ensure in all branches that cp is not NULL before
using it.
* Fix bug with /32 aliases introduced in 81728a538d24.
* Explicitly document business logic for IPv4 ifa routes.
* Remove remnants of rtinit()
* Deduplicate ifa->route prefix code by moving it into ia_getrtprefix()
* Deduplicate conditional check for ifa_maintain_loopback_route() by
moving into ia_need_loopback_route()
* Remove now-unused flags argument from in_addprefix().
Remove deadlock in rc caused by pwait waiting for itself.
The following situation can trigger the deadlock:
1) Long time ago a_service was started through rc.d
2) We want to restart a_service and issue service a_service restart
3) rc.subr reads current process PID (via file or process),
sends TERM signal and runs pwait with PID harvested
4) a_service process dies very quickly so it's PID becomes available.
It is possible that while original process was running,
PID counter overflowed and pwait got assigned a_service's PID.
This patch ignores pid(s) to wait that are equal to pwait PID.
Reported by: Dan McGregor, Boris Lytochkin
Submitted by: Boris Lytochkin <lytboris at gmail.com>
Reviewed By: 0mp
MFC after: 2 weeks
PR: 218598
Differential Revision: https://reviews.freebsd.org/D28240
Mark Johnston [Thu, 21 Jan 2021 19:30:19 +0000 (14:30 -0500)]
libcasper/cap_grp tests: Reset the group database handle
Some tests verify that the capgrp capability does not permit calls to
setgrent(3), but all tests need to ensure that they reset the
capability's group database handle, otherwise the local process and
casper process will be out of sync.
The cap_pwd tests already handle this.
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Mark Johnston [Thu, 21 Jan 2021 19:30:19 +0000 (14:30 -0500)]
libc/nss: Ensure that setgroupent(3) actually works as advertised
Because the "files" and "compat" implementations failed to set the
"stayopen", keyed lookups would close the database handle, contrary to
the purpose of setgroupent(3). setpassent(3)'s implementation does not
have this bug.
Mark Johnston [Thu, 21 Jan 2021 19:30:19 +0000 (14:30 -0500)]
libc/nss: Restore iterator state when doing passwd/group lookups
The getpwent(3) and getgrent(3) implementations maintain some internal
iterator state. Interleaved calls to functions which do passwd/group
lookups using a key, such as getpwnam(3), would in some cases clobber
this state, causing a subsequent getpwent() or getgrent() call to
restart iteration from the beginning of the database or to terminate
early. This is particularly troublesome in programming environments
where execution of green threads is interleaved within a single OS
thread.
Take care to restore any iterator state following a keyed lookup. The
"files" provider for the passwd database was already handling this
correctly, but "compat" was not, and both providers had this problem
when accessing the group database.
Mark Johnston [Thu, 21 Jan 2021 19:30:19 +0000 (14:30 -0500)]
libc/nss tests: Fix getpw and getgr single-pass tests
Some NSS regression tests for getgrent(3) and getpwent(3) were not
testing anything because the test incorrectly requested creation of a
database snapshot.
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Mark Johnston [Thu, 21 Jan 2021 19:30:18 +0000 (14:30 -0500)]
Define PNP info after defining driver modules
PNP info definitions currently have an unfortunate requirement in that
they must follow the associated module definition in the module metadata
linker set. Otherwise devmatch can segfault while processing the linker
hints file since kldxref maintains the order in the linker set.
A number of drivers violate this requirement. In some cases this can
cause devmatch(8) to segfault when processing the linker hints file.
Work around the problem for now simply by adjusting the drivers.
Andrew Gallatin [Thu, 21 Jan 2021 14:45:15 +0000 (09:45 -0500)]
iflib: Fix a NULL pointer deref
rxd_frag_to_sd() have pf_rv parameter as NULL with the current
code. This patch fixes the NULL pointer dereference in that
case thus avoiding a possible panic.
elf: add some definitions for i386 and amd64 relocations
I believe that rtld does not need to implement them, they are mostly for
the static linker. 'Mostly' because for amd64 our kernel linker loads
object files, and amd64 relocation types could be observed.
Defines were taken from glibc sources.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28205
Alexander Motin [Thu, 21 Jan 2021 02:33:14 +0000 (21:33 -0500)]
Remove FirstBurstLength limit for software iSCSI.
For hardware offload solicited data may potentially be handled more
efficiently than unsolicited due to direct data placement. Or there
can be some unsolicited write buffering limitations. It may create
situations where FirstBurstLength limit is really useful.
Software driver though has no those factors, having to do memcopy()
any way and having no so hard limit on the temporary storage. Same
time more active use of unsolicited transfers allows to avoid some
of Ready To Transfer (R2T) PDU round-trip times and processing.
This change effectively doubles from 64KB to 128KB the maximum size
of write command that can be transferred within one link RTT. Tests
of (64KB, 128KB] QD1 writes mixed with simultaneous QD8 reads over
the same connection, increasing RTT, shows almost double write speed
and half latency, while we should be able to afford few megabytes of
RAM for additional buffering on a target these days.
Kyle Evans [Mon, 18 Jan 2021 20:11:58 +0000 (14:11 -0600)]
pkgbase: allow update-packages for first-run of packaging
If ${REPODIR}/${PKG_ABI} does not exist when we begin real-update-packages,
skip the comparison with the non-existent previous repository and just
finish the repo off. This allows external scripts to just assume they can
run `update-packages` rather than figuring out if they'd previously run
`packages` for this Version/Arch combo.
PKG_VERSION_FROM_DIR was added so that we could perhaps detect the three
distinct cases:
1.) If the repo has not yet been created, PKG_VERSION_FROM_DIR will be
empty.
2.) If the repo is in some intermediate state between created and fully
initialized, PKG_VERSION_FROM_DIR may point to the ABI directory.
3.) If the repo is fully initialized, then PKG_VERSION_FROM_DIR points to
the latest build to compare to.
Option #2 is explicitly unhandled at the moment, but this is no different
than it was before.
Reviewed-by: manu
Differential-Revision: https://reviews.freebsd.org/D28229
Jessica Clarke [Thu, 21 Jan 2021 02:14:41 +0000 (02:14 +0000)]
virtio_mmio: Delete a stale #if 0'ed debug print
This was blindly moved in r360722 but the variable being printed is not
yet initialised. It's of little use and can easily be added back in the
right place if needed by someone debugging, so just delete it.
Jessica Clarke [Thu, 21 Jan 2021 01:54:52 +0000 (01:54 +0000)]
linux64: Don't pass unnecessary -S and -g to objcopy
Since we use --input-type binary these options are rather meaningless. Both
binutils and elftoolchain ignore the option in this case, but LLVM does not,
and instead strips all symbols from the output file, causing missing symbols at
run time if building with llvm-objcopy. Thus simply remove the options; the
linux module has never included them for building its VDSO (added in r283407),
but for some reason the original commit of linux64 (r283424) added them.
These should however eventually be changed to use template assembly files as is
now done for firmware and MFS_IMAGE.
Jessica Clarke [Thu, 21 Jan 2021 01:54:12 +0000 (01:54 +0000)]
Rename i386's Linux ELF to Linux ELF32
This is what amd64 calls the i386 Linux ABI in order to distinguish it
from the amd64 Linux ABI, and matches the nomenclature used for the
FreeBSD ABIs where they always have the size suffix in the name.
Jessica Clarke [Thu, 21 Jan 2021 01:21:35 +0000 (01:21 +0000)]
Build VirtIO modules on all architectures
Currently only amd64, i386 and powerpc build VirtIO modules, yet all other
architectures have at least one kernel configuration that includes the
transport drivers, and so they lack drivers for all the devices they don't
statically compile into the kernel. Instead, enable the build everywhere so all
architectures have the full set of device drivers available.
Jessica Clarke [Thu, 21 Jan 2021 01:07:23 +0000 (01:07 +0000)]
virtio: Reduce boilerplate for device driver module definitions
Rather than have every device register itself for both virtio_pci and
virtio_mmio, provide a VIRTIO_DRIVER_MODULE wrapper to declare both,
merge VIRTIO_SIMPLE_PNPTABLE with VIRTIO_SIMPLE_PNPINFO and make the
latter register for both buses. This also has the benefit of abstracting
away the available transports and their names.
We must check MagicValue not just Version before anything else, and then
we must check DeviceID and immediately abort if zero (and this must not
be an error).
Do all this when probing rather than at the start of attaching as that's
where this belongs, and provides a clear boundary between the device
detection and device initialisation parts of the specified driver
initialisation process. This also means we don't create empty device
instances for placeholder devices, reducing clutter on systems that
pre-allocate a large number, such as QEMU's AArch64 virt machine (which
provides 32).
John Baldwin [Thu, 21 Jan 2021 00:37:55 +0000 (16:37 -0800)]
Restructure the crypto(7) manpage and add authentication algorithms.
Add separate sections for authentication algorithms, block ciphers,
stream ciphers, and AEAD algorithms. Describe properties commmon to
algorithms in each section to avoid duplication.
Use flat tables to list algorithm properties rather than nested
tables.
John Baldwin [Thu, 21 Jan 2021 00:33:34 +0000 (16:33 -0800)]
Simplify dynamic ipfilter sysctls.
Pass the structure offset in arg2 instead of arg1. This avoids
having to undo the pointer arithmetic on arg1. Instead arg2 can
be used directly as an offset relative to the desired structure.
Jamie Gritton [Wed, 20 Jan 2021 23:08:27 +0000 (15:08 -0800)]
jail: Use refcount(9) for prison references.
Use refcount(9) for both pr_ref and pr_uref in struct prison. This
allows prisons to held and freed without requiring the prison mutex.
An exception to this is that dropping the last reference will still
lock the prison, to keep the guarantee that a locked prison remains
valid and alive (provided it was at the time it was locked).
Among other things, this honors the promise made in a comment in
crcopy(9), that it will not block, which hasn't been true for two
decades.
Ryan Libby [Wed, 20 Jan 2021 21:59:49 +0000 (13:59 -0800)]
pms: handle maximum size IO with any alignment
Define the maximum numbers of segments to allow for non-page alignment
at the beginning and end of a maxphys size transfer. Also set
ccb_pathinq.maxio consistent with maxphys.
hms: Workaround idle mouse drift in I2C sampling mode.
Many I2C "compatibility" mouse devices found on touchpads continue to
return last report data in sampling mode after touch has been ended.
That results in cursor drift. Filter out such a reports with comparing
content of current report with content of previous one.
hconf(4): Do not fetch report before writing new usage values back.
There is a report that reading of surface/button switch feature report
causes SYN1B7D touchpad malfunction. As specs does not require it to
be readable assume that report usages have default value on attach and
last written value during operation. Do not apply default usage values
on attachment and resume.
While here fix manpage typos and add avg@ to copyright header.
Andrew Turner [Wed, 20 Jan 2021 09:56:47 +0000 (09:56 +0000)]
Handle arm64 undefied instructions on msr exceptions
When userspace tries to access a special register that it doesn't have
access to the kernel receives an exception. On most cores this exception
has been observed to be the undefined instruction exception, however on
the Apple M1 under a QEMU based hypervisor it can be the MSR exception.
Handle this second case by also running the undefined exception handler
on these exceptions.
Alan Somers [Sun, 3 Jan 2021 04:25:05 +0000 (21:25 -0700)]
aio: micro-optimize the lio_opcode assignments
This allows slightly more efficient opcode testing in-kernel. It is
transparent to userland, except to applications that sneakily submit
aio fsync or aio mlock operations via lio_listio, which has never been
documented, requires the use of deliberately undefined constants
(LIO_SYNC and LIO_MLOCK), and is arguably a bug.