Call sv_onexec hook after the process VA is created.
For future use in the Linux emulation layer call sv_onexec hook right after
the new process address space is created. It's safe, as sv_onexec used only
by Linux abi and linux_on_exec() does not depend on a state of process VA.
linux(4): Remove function prototypes from the vDSO.
In preparation for vDSO code revision get rid of incomplete vDSO methods
from locore, but leave .note.Linux section commented out.
.note.Linux section is used by glibc rtld to get the kernel version, that
saves one system call call. I'll try to implement it later, if figure out
how to use it with jails.
These were internal binutils relocations that have no way to be
generated in assembly nor will ever be seen in the output, and so should
never have been defined in the psABI in the first place. They have
therefore been removed from the spec as of [1], so do so here too.
Warner Losh [Mon, 12 Jul 2021 03:26:08 +0000 (21:26 -0600)]
awk: remove proctab.c
proctab.c is a generated file and never should have been committed to
the tree. This file has been added and removed a couple of times, most
recently added by me in my 2019 updates.
Warner Losh [Tue, 20 Jul 2021 02:10:22 +0000 (20:10 -0600)]
awk: Add more details top the FS variable
The current description of the FS is true, but only part of the
truth. Add information about single characters and note that FS="" is
undefined by the standard, though the two other awk implenetations (mawk
and gawk) also have this interpretation.
Rick Macklem [Tue, 20 Jul 2021 00:35:39 +0000 (17:35 -0700)]
nfscl: Send stateid.seqid of 0 for NFSv4.1/4.2 mounts
For NFSv4.1/4.2, the client may set the "seqid" field of the
stateid to 0 in RPC requests. This indicates to the server that
it should not check the "seqid" or return NFSERR_OLDSTATEID if the
"seqid" value is not up to date w.r.t. Open/Lock operations
on the stateid. This "seqid" is incremented by the NFSv4 server
for each Open/OpenDowngrade/Lock/Locku operation done on the stateid.
Since a failure return of NFSERR_OLDSTATEID is of no use to
the client for I/O operations, it makes sense to set "seqid"
to 0 for the stateid argument for I/O operations.
This avoids server failure replies of NFSERR_OLDSTATEID,
although I am not aware of any case where this failure occurs.
This makes the FreeBSD NFSv4.1/4.2 client compatible with the
Linux NFSv4.1/4.2 client.
John Baldwin [Mon, 19 Jul 2021 22:36:31 +0000 (15:36 -0700)]
cxgbei: Don't assert F for data completion PDUs.
If a data PDU encounters an error such as a digest error, the firmware
will report that data PDU when completion moderation is active even if
it is not the final data PDU in a burst.
John Baldwin [Mon, 19 Jul 2021 22:36:31 +0000 (15:36 -0700)]
cxgbei: Remove invalid assertion.
A non-placed PDU can be delivered by CPL_RX_ISCSI_CMP in the middle of
a burst of placed PDUs (received via DDP) in which case the rcv_nxt
will not match the start of the non-placed PDU.
Michael Tuexen [Mon, 19 Jul 2021 22:29:18 +0000 (00:29 +0200)]
tcp: fix RACK and BBR when using VIMAGE enabled kernel
Fix a bug in VNET handling, which occurs when using specific NICs.
PR: 257195
Reviewed by: rrs
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D31212
acpi: Fix a repeated vm_offset_t that should be a vm_size_t
The underlying types for both are the same so arguably this doesn't
really matter, but using the wrong type is still confusing and
technically incorrect.
Alex Richardson [Mon, 19 Jul 2021 14:04:19 +0000 (15:04 +0100)]
Allow building usr.bin/vi with MK_ASAN
We have to namespace the regex functions to avoid duplicate symbol errors.
This also ensures that vi doesn't define the libc reg* functions with
mismatched signatures.
ld: error: duplicate symbol: regcomp
>>> defined at sanitizer_common_interceptors.inc:7519 (/usr/src/contrib/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc:7519)
>>> asan_interceptors.o:(__interceptor_regcomp) in archive /usr/lib/clang/10.0.1/lib/freebsd/libclang_rt.asan-x86_64.a
>>> defined at regcomp.c
>>> .../regex/regcomp.c.o:(.text+0x0)
ld: error: duplicate symbol: regerror
>>> defined at sanitizer_common_interceptors.inc:7543 (/usr/src/contrib/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc:7543)
>>> asan_interceptors.o:(__interceptor_regerror) in archive /usr/lib/clang/10.0.1/lib/freebsd/libclang_rt.asan-x86_64.a
>>> defined at regerror.c
>>> .../regex/regerror.c.o:(.text+0x0)
ld: error: duplicate symbol: regexec
>>> defined at sanitizer_common_interceptors.inc:7530 (/usr/src/contrib/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc:7530)
>>> asan_interceptors.o:(__interceptor_regexec) in archive /usr/lib/clang/10.0.1/lib/freebsd/libclang_rt.asan-x86_64.a
>>> defined at regexec.c
>>> .../regex/regexec.c.o:(.text+0x0)
ld: error: duplicate symbol: regfree
>>> defined at sanitizer_common_interceptors.inc:7553 (/usr/src/contrib/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc:7553)
>>> asan_interceptors.o:(__interceptor_regfree) in archive /usr/lib/clang/10.0.1/lib/freebsd/libclang_rt.asan-x86_64.a
>>> defined at regfree.c
>>> .../regex/regfree.c.o:(.text+0x0)
Committed upstream as https://github.com/lichray/nvi2/pull/92
Kyle Evans [Sun, 20 Jun 2021 19:36:10 +0000 (14:36 -0500)]
kenv: allow listing of static kernel environments
The early environment is typically cleared, so these new options
need the PRESERVE_EARLY_KENV kernel config(8) option. These environments
are reported as missing by kenv(1) if the option is not present in the
running kernel.
Kyle Evans [Sun, 20 Jun 2021 19:29:31 +0000 (14:29 -0500)]
kern: add an option for preserving the early kenv
Some downstream configurations do not store secrets in the
early (loader/static) environments and desire a way to preserve these
for diagnostic reasons. Provide an option to do so.
Emmanuel Vadot [Fri, 5 Feb 2021 20:41:06 +0000 (21:41 +0100)]
arm64: Add per SoC family kernel config
There is multiple reason for this :
- This makes it easier to see which driver is needed for each SoC
- This makes it easier to create a custom config for one SoC
- This really reduce boot time (which some people might want)
Some explaination about the files :
- std.arm64 contains all standard kernel option
- std.dev contains all the standard kernel devices
- std.<soc> contains all drivers needed to boot on this SoC family
- <SOC> includes std.arm64, std.dev and std.<soc>
- GENERIC includes std.arm64, std.dev and all std.<soc>
pf: locally originating connections with 'route-to' fail
Similar to the REPLY_TO shortcut (6d786845cf) we also can't shortcut
ROUTE_TO. If we do we will fail to apply transformations or update the
state, which can lead to premature termination of the connections.
Kristof Provost [Tue, 2 Mar 2021 15:57:27 +0000 (16:57 +0100)]
pf tests: Test the match keyword
The new match keyword can currently only assign queues, so we can only
test it with ALTQ.
Set up a basic scenario where we use 'match' to assign ICMP traffic to a
slow queue, and confirm that it's really getting slowed down.
Kristof Provost [Tue, 2 Mar 2021 15:01:04 +0000 (16:01 +0100)]
pf: match keyword support
Support the 'match' keyword.
Note that support is limited to adding queuing information, so without
ALTQ support in the kernel setting match rules is pointless.
For the avoidance of doubt: this is NOT full support for the match
keyword as found in OpenBSD's pf. That could potentially be built on top
of this, but this commit is NOT that.
Commit ee29e6f31111 changed the internal KAPI between the nfscommon
and nfsd modules. Bump __FreeBSD_version to 1400026 since both
modules will need to be rebuilt from sources.
Rick Macklem [Fri, 16 Jul 2021 22:01:03 +0000 (15:01 -0700)]
nfsd: Add sysctl to set maximum I/O size up to 1Mbyte
Since MAXPHYS now allows the FreeBSD NFS client
to do 1Mbyte I/O operations, add a sysctl called vfs.nfsd.srvmaxio
so that the maximum NFS server I/O size can be set up to 1Mbyte.
The Linux NFS client can also do 1Mbyte I/O operations.
The default of 128Kbytes for the maximum I/O size has
not been changed for two reasons:
- kern.ipc.maxsockbuf must be increased to support 1Mbyte I/O
- The limited benchmarking I can do actually shows a drop in I/O rate
when the I/O size is above 256Kbytes.
However, daveb@spectralogic.com reports seeing an increase
in I/O rate for the 1Mbyte I/O size vs 128Kbytes using a Linux client.
Randall Stewart [Fri, 16 Jul 2021 17:59:57 +0000 (13:59 -0400)]
tcp: When rack or bbr get a pullup failure in the common code, don't free the NULL mbuf.
There is a bug in the error path where rack_bbr_common does a m_pullup() and the pullup fails.
There is a stray mfree(m) after m is set to NULL. This is not a good idea :-)
Reviewed by: tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D31194
David Chisnall [Sat, 10 Jul 2021 16:19:52 +0000 (17:19 +0100)]
Pass the syscall number to capsicum permission-denied signals
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned. This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.
Randall Stewart [Fri, 16 Jul 2021 10:07:13 +0000 (06:07 -0400)]
tcp: Lro needs to validate that it does not go beyond the end of the mbuf as it parses.
Currently the LRO parser, if given a packet that say has ETH+IP header but the TCP header
is in the next mbuf (split), would walk garbage. Lets make sure we keep track as we
parse of the length and return NULL anytime we exceed the length of the mbuf.
Kevin Bowling [Fri, 16 Jul 2021 06:50:14 +0000 (23:50 -0700)]
ixgbe: Print FW NVM and Option ROM versions
It can be useful for system operators to see this kind of information
when correlating issues or requesting support from the OEM or Intel for
hardware and firmware issues.
Dave Fullard [Fri, 16 Jul 2021 04:02:48 +0000 (23:02 -0500)]
freebsd-update: create a ZFS boot environment on install
Updated freebsd-update to allow it to create boot environments using
bectl should the system support it. The bectl utility was updated in
r352211 (490e13c1403f) to support a 'check' to determine if the system
supports boot environments. If UFS is used, the bectl check will fail
then no attempt will be made to create the boot environment.
If freebsd-update is run inside a jail, no attempt will be made to
create a boot environment.
The boot environment function will create a new environment using the
format: current FreeBSD kernel version and date/timestamp, example:
12.0-RELEASE-p10_2019-10-03_185233
This functionality can be disabled by setting 'CreateBootEnv' in
freebsd-update.conf to 'no'.
Mark Johnston [Fri, 16 Jul 2021 02:38:46 +0000 (22:38 -0400)]
lio_listio: Don't post a completion notification if none was requested
One is allowed to use LIO_NOWAIT without specifying a sigevent. In this
case, lj->lioj_signal is left uninitialized, but several code paths
examine liov_signal.sigev_notify to figure out which notification to
post. Unconditionally initialize that field to SIGEV_NONE.
Add a dumb test case which triggers the bug.
Reported by: KMSAN+syzkaller
Reviewed by: asomers
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31197
Mark Johnston [Fri, 16 Jul 2021 02:35:28 +0000 (22:35 -0400)]
libc: Use the initial-exec TLS model
This permits more efficient accesses of thread-local variables, which
are heavily used at least by jemalloc and locale-aware code. Note that
on amd64 and i386, jemalloc's thread-local variables already have their
TLS model overridden by defining JEMALLOC_TLS_MODEL.
For now the change is applied only to tested platforms, but should in
principle be enabled everywhere.
PR: 255840
Suggested by: jrtc27
Reviewed by: kib
MFC after: 2 months
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31070
Mark Johnston [Fri, 16 Jul 2021 02:26:25 +0000 (22:26 -0400)]
rtld/arm64: Remove checks for undefined symbols when processing TPREL64
lld emits several GOT relocations referencing the null sumbol in libc.so
when compiled with -ftls-model=initial-exec. This symbol is specified
to be undefined.
We generally do not handle dynamic TLS relocations against weak,
undefined symbols, so avoid printing a warning here. This makes it
possible to compile libc.so using the initial-exec TLS model on arm64.
Reviewed by: jrtc27, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31069
This fixes a kernel panic when probing for vmd_bus on Intel TigerLake on
14-CURRENT. Apparently, vmd_bus is a type of PCI bus, but was registered
as a separate device class.
Warner Losh [Fri, 16 Jul 2021 00:30:53 +0000 (18:30 -0600)]
UPDATING: Not unusual side effect of the awk bug fixed in d4d252c49976
You might not be able to build the kernel if you have an awk between
Jul 7th and today. It does not affect all platforms due to the nature
of the bug (so amd64 is unaffected in stable/13 or current, but
is affected in stable/12. i386 seems to be affected everywhere).
Warner Losh [Thu, 15 Jul 2021 22:46:06 +0000 (16:46 -0600)]
awk: revert upstream's attempt to disallow hex strings
Upstream one-true-awk decided to disallow hex strings as numbers. This
is in line with awk's behavior prior to C99, and allowed by the POSIX
standard. The standard, however, allows them to be treated as numbers
because that's what the standard said in the 2001 through 2004 editions.
Since 2001, the nawk in FreeBSD has treated them as numbers, so restore
that behavior, allowed by the standard.
A number of scripts in the FreeBSD tree depend on this interpretation,
including scripts to build the kernel which had mysteriously started
failing for some people and not others. By re-allowing 0x hex numbers,
this fixes those scripts and restores POLA.
Upstream issue: https://github.com/onetrueawk/awk/issues/126
Sponsored by: Netflix
Reviewed by: kevans
MFC After: asap due to regression alrady merged to stable
Differential Revision: https://reviews.freebsd.org/D31199
Warner Losh [Thu, 15 Jul 2021 22:17:23 +0000 (16:17 -0600)]
nvme: Enable interrupts after qpair fully constructed
To guard against the ill effects of a spurious interrupt during
construction (or one that was bogusly pending), enable interrupts after
the qpair is completely constructed. Otherwise, we can die with null
pointer dereferences in nvme_qpair_process_completions. This has been
observed in at least one pre-release NVMe drive where the MSIX interrupt
fired while the queue was being created, before we'd started the NVMe
controller card.
The alternative of only turning on the interrupts after the rest was
tried, but was insufficient to work around this bug and made the code
more complicated w/o benefit.
nanobsd: Use gpart and create code image before full disk image
The attached patch brings two main changes to the nanobsd script:
1- gpart is used instead of fdisk;
2- the code image is created first, and then used to ``assemble'' the
full disk image.
The patch was first proposed on the freebsd-embedded list:
http://lists.freebsd.org/pipermail/freebsd-embedded/2012-June/001580.html
and is currently under discussion:
http://lists.freebsd.org/pipermail/freebsd-embedded/2014-January/002216.html
Another effect is that the -f option ("suppress code slice extraction")
now imples the -i option ("suppress disk image build").
imp@ applied Patch by hand to new legacy.sh, plus tweaked for NANO_LOG vs
NANO_OBJ confusion in original.
Mark Johnston [Thu, 15 Jul 2021 16:23:04 +0000 (12:23 -0400)]
eli: Zero pad bytes that arise when certain auth algorithms are used
When authentication is configured, GELI ensures that the amount of data
per sector is a multiple of 16 bytes. This is done in
eli_metadata_softc(). When the digest size is not a multiple of 16
bytes, this leaves some extra pad bytes at the end of every sector, and
they were not being zeroed before being written to disk. In particular,
this happens with the HMAC/SHA1, HMAC/RIPEMD160 and HMAC/SHA384 data
authentication algorithms.
This change ensures that they are zeroed before being written to disk.
Reported by: KMSAN
Reviewed by: delphij, asomers
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31170
Mark Johnston [Thu, 15 Jul 2021 16:18:17 +0000 (12:18 -0400)]
nfsclient: Avoid copying uninitialized bytes into statfs
hst will be nul-terminated but the remaining space in the buffer is left
uninitialized. Avoid copying the entire buffer to ensure that
uninitialized bytes are not leaked via statfs(2).
Reported by: KMSAN
Reviewed by: rmacklem
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31167
This can be used for variables which are only used with either
INVARIANTS or WITNESS. Without any annotation they run into dead store
warnings from cc --analyze and always annotating with __unused may hide
bad vars when it should not.
Send a zero-length-packet first when opening a BULK endpoint for USB serial
port devices. If it gets eaten it is fine. Many USB device side implementations
don't properly support the clear endpoint halt command and if they do, data is lost
because the transmit FIFO is typically reset when this command is received.
Andrew Turner [Thu, 8 Jul 2021 12:14:56 +0000 (12:14 +0000)]
Update the SCTLR_EL1 register definitions
They are valid as of the ARMv8.7 XML.
While here remove SCTLR_RES0 as it's unused and depends on which CPU
the kernel is running on and switch to shifted values as they are
easier to compare with the documentation.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31120
Warner Losh [Thu, 15 Jul 2021 03:06:08 +0000 (21:06 -0600)]
loader: make sure CPUTYPE is ignored when building
CPUTYPE?=native causes -march=native to be added to the command
line. When the host machine is haswell, this causes some versions of
clang to generate code that can't execute in the efi boot loader
environment. Set _CPUCFLAGS= to undo what's done bsd.cpu.mk. bsd.cpu.mk
is included too early to control with NO_CPU_CFLAGS here. The only other
option is to put that in all the Makefiles, and this is less tedious and
error prone.
Warner Losh [Wed, 14 Jul 2021 22:43:25 +0000 (16:43 -0600)]
loader: Create loader_simp(8) to document simple version of loader
loader_simp is a much simplified version of loader that will process a
linear sequence of commands from loader.rc. It has neither Forth nor Lua
built in and is much smaller. Document it. This is largely copied from
loader.8 since it implements those built-in commands. Future revisions
will fix this duplication.
Alfonso Gregory [Wed, 14 Jul 2021 21:48:35 +0000 (15:48 -0600)]
Remove incorrect __restricted labels from strcspn
strcspn should never have had the __restrict keywords. While both of
these strings are const, it may have unindended side effects. While this
is the kernel, the POSIX definition also omits restrict.
Rick Macklem [Wed, 14 Jul 2021 20:33:37 +0000 (13:33 -0700)]
nfscl: Avoid KASSERT() panic in cache_enter_time()
Commit 844aa31c6d87 added cache_enter_time_flags(), specifically
so that the NFS client could specify that cache enter replace
any stale entry for the same name. Doing so avoids a KASSERT()
panic() in cache_enter_time(), as reported by the PR.
This patch uses cache_enter_time_flags() for Readdirplus, to
avoid the panic(), since it is impossible for the NFS client
to know if another client (or a local process on the NFS server)
has replaced a file with another file of the same name.
This patch only affects NFS mounts that use the "rdirplus"
mount option.
There may be other places in the NFS client where this needs
to be done, but no panic() has been observed during testing.