Adjust dtrace_unload() definition to avoid clang 15 warning
With clang 15, the following -Werror warnings is produced:
In file included from sys/cddl/contrib/opensolaris/uts/common/dtrace/dtrace.c:18440:
sys/cddl/dev/dtrace/dtrace_unload.c:26:14: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
dtrace_unload()
^
void
This is because dtrace_unload() is declared with a (void) argument list,
but defined with an empty argument list. Make the definition match the
declaration.
Adjust dtrace_getf_barrier() definition to avoid clang 15 warning
With clang 15, the following -Werror warnings is produced:
sys/cddl/contrib/opensolaris/uts/common/dtrace/dtrace.c:17019:20: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
dtrace_getf_barrier()
^
void
This is because dtrace_getf_barrier() is declared with a (void) argument
list, but defined with an empty argument list. Make the definition match
the declaration.
Adjust profile_unload() definition to avoid clang 15 warning
With clang 15, the following -Werror warnings is produced:
sys/cddl/dev/profile/profile.c:640:15: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
profile_unload()
^
void
This is because profile_unload() is declared with a (void) argument
list, but defined with an empty argument list. Make the definition match
the declaration.
Adjust dtnfsclient_unload() definition to avoid clang 15 warning
With clang 15, the following -Werror warnings is produced:
sys/fs/nfsclient/nfs_clkdtrace.c:544:19: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
dtnfsclient_unload()
^
void
This is because dtnfsclient_unload() is declared with a (void) argument
list, but defined with an empty argument list. Make the definition match
the declaration.
Adjust fbd_list() definition to avoid clang 15 warning
With clang 15, the following -Werror warnings is produced:
sys/dev/fb/fbd.c:205:9: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
fbd_list()
^
void
This is because fbd_list() is declared with a (void) argument list, but
defined with an empty argument list. Make the definition match the
declaration.
Adjust db_flush_line() definition to avoid clang 15 warning
With clang 15, the following -Werror warnings is produced:
sys/ddb/db_lex.c:94:14: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
db_flush_line()
^
void
This is because db_flush_line() is declared with a (void) argument
list, but defined with an empty argument list. Make the definition match
the declaration.
Adjust dtmalloc_unload() definition to avoid clang 15 warning
With clang 15, the following -Werror warnings is produced:
sys/cddl/dev/dtmalloc/dtmalloc.c:177:16: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
dtmalloc_unload()
^
void
This is because dtmalloc_unload() is declared with a (void) argument
list, but defined with an empty argument list. Make the definition match
the declaration.
Adjust t4_tracer_mod{load,unload}() definitions to avoid clang 15 warnings
With clang 15, the following -Werror warnings are produced:
sys/dev/cxgbe/t4_tracer.c:234:18: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
t4_tracer_modload()
^
void
sys/dev/cxgbe/t4_tracer.c:243:20: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
t4_tracer_modunload()
^
void
This is because t4_tracer_modload() and t4_tracer_modunload() are
declared with a (void) argument list, but defined with an empty argument
list. Make the definitions match the declarations.
Simon J. Gerraty [Tue, 19 Jul 2022 15:59:53 +0000 (08:59 -0700)]
Add -S option to veriexec
During software installation, use veriexec -S to strictly
enforce certificate validity checks (notBefore, notAfter).
Otherwise ignore certificate validity period.
It is generally unacceptible for the Internet to stop working
just because someone did not upgrade their infrastructure for a decade.
Andrew Turner [Wed, 23 Mar 2022 17:39:58 +0000 (17:39 +0000)]
Add experimental 16k page support on arm64
Add initial 16k page support on arm64. It is considered experimental,
with no guarantee of compatibility with a userspace or kernel modules
built with the current a 4k page size as code will likely try to pass
in a too small size when working with APIs that take a multiple of a
page, e.g. mmap.
As this is experimental, and because userspace and the kernel need to
have the PAGE_SIZE macro kept in sync there is no kernel option to
enable this. To test a new image should be built with the
PAGE_{SIZE,SHIFT,MASK} macros changed to the 16k versions.
There are currently known issues with loading modules from an old
loader as it can misalign them to load on a non-16k boundary.
Testing has shown good results in kernel workloads that allocate and
free large amounts of memory as only a quarter of the number of calls
into the VM subsystem are needed in the best case.
Reviewed by: markj
Tested by: gallatin
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34793
Kristof Provost [Thu, 23 Jun 2022 20:35:29 +0000 (22:35 +0200)]
ipsec: replace SECASVAR mtx by rmlock
This mutex is a significant point of contention in the ipsec code, and
can be relatively trivially replaced by a read-mostly lock.
It does require a separate lock for the replay protection, which we do
here by adding a separate mutex.
This improves throughput (without replay protection) by 10-15%.
Alan Cox [Mon, 18 Jul 2022 00:56:39 +0000 (19:56 -0500)]
x86/iommu: Shrink the critical section in dmar_qi_task()
It is safe to test and clear the Invalidation Wait Descriptor
Complete flag before acquiring the DMAR lock in dmar_qi_task(),
rather than waiting until the lock is held.
Colin Percival [Wed, 13 Jul 2022 00:43:07 +0000 (17:43 -0700)]
x86: Remove 1 second DELAY from cpu_reset
On SMP systems, cpu_reset broadcasts a message telling the APs to stop
themselves, and then the BSP waits 1 second before actually resetting
itself; this behaviour dates back to 1998-05-17.
I assume that this delay was added in order to allow the APs to stop
themselves before the BSP resets; but we wait until the APs have all
acknowledged entering the "stopped" state, so it no longer seems to
serve any purpose.
Colin Percival [Wed, 13 Jul 2022 00:42:26 +0000 (17:42 -0700)]
Add kern.reboot_wait_time sysctl
Historic FreeBSD behaviour (dating back to 1994-04-02) when rebooting
is to print "Rebooting..." and then
/* wait 1 sec for printf's to complete and be read */
Prior to April 1994, there was a 100 ms delay (added 1993-11-12).
Since (a) most users will already be aware that the system is rebooting
and do not need to take time to read an additional message to that
effect, and (b) most FreeBSD systems don't have anyone actively looking
at the console anyway, this delay no longer serves much purpose.
This commit adds a kern.reboot_wait_time sysctl which defaults to 0;
historic behaviour can be regained by setting it to 1.
Reviewed by: imp
Relnotes: FreeBSD now reboots faster; to restore the traditional
wait after printing "Rebooting..." to the console, set
kern.reboot_wait_time=1 (or more).
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D35796
Generally, access to the kernel debugger is considered to be unsafe from
a security perspective since it presents an unrestricted interface to
inspect or modify the system state, including sensitive data such as
signing keys.
However, having some access to debugger functionality on production
systems may be useful in determining the cause of a panic or hang.
Therefore, it is desirable to have an optional policy which allows
limited use of ddb(4) while disabling the functionality which could
reveal system secrets.
This loadable MAC module allows for the use of some ddb(4) commands
while preventing the execution of others. The commands have been broadly
grouped into three categories:
- Those which are 'safe' and will not emit sensitive data (e.g. trace).
Generally, these commands are deterministic and don't accept
arguments.
- Those which are definitively unsafe (e.g. examine <addr>, search
<addr> <value>)
- Commands which may be safe to execute depending on the arguments
provided (e.g. show thread <addr>).
Safe commands have been flagged as such with the DB_CMD_MEMSAFE flag.
Commands requiring extra validation can provide a function to do so.
For example, 'show thread <addr>' can be used as long as addr can be
checked against the system's list of process structures.
The policy also prevents debugger backends other than ddb(4) from
executing, for example gdb(4).
Reviewed by: markj, pauamma_gundo.com (manpages)
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35371
Add three simple hooks to the debugger allowing for a loaded MAC policy
to intervene if desired:
1. Before invoking the kdb backend
2. Before ddb command registration
3. Before ddb command execution
We extend struct db_command with a private pointer and two flag bits
reserved for policy use.
Reviewed by: markj
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35370
This flag value can be used to indicate if a command has the property of
being "memory safe". In this instance, memory safe means that the
command does not allow/enable reads or writes of arbitrary memory,
regardless of the arguments passed to it. For example, 'backtrace' is
considered a memory-safe command since its output is deterministic,
while 'show vnode' is not, since it requires a memory address as an
argument and will print the contents beginning at that location.
Apply the flag to the "show all" command macros. It is expected that
commands added to this table will always exhibit this property.
Reviewed by: markj
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35581
Eric van Gyzen [Mon, 18 Jul 2022 18:07:20 +0000 (13:07 -0500)]
bge: tell debugnet there are 2 rx rings, not 1,024
debugnet provides the network stack for netgdb and netdump. Since it
must operate under panic/debugger conditions and can't rely on dynamic
memory allocation, it preallocates mbufs during boot or network
configuration. At that time, it does not yet know which interface
will be used for debugging, so it does not know the required size and
quantity of mbufs to allocate. It takes the worst-case approach by
calculating its requirements from the largest MTU and largest number
of receive queues across all interfaces that support debugnet.
Unfortunately, the bge NIC driver told debugnet that it supports 1,024
receive queues. It actually supports only 2 queues (with 1,024 slots,
thus the error). This greatly exaggerated debugnet's preallocation,
so with an MTU of 9000 on any interface, it allocated 600 MB of memory.
A tiny fraction of this memory would be used if netgdb or netdump were
invoked; the rest is completely wasted.
Mark Johnston [Mon, 18 Jul 2022 19:50:45 +0000 (15:50 -0400)]
sched_ule: Ensure we hold the thread lock when modifying td_flags
The load balancer may force a running thread to reschedule and pick a
new CPU. To do this it sets some flags in the thread running on a
loaded CPU. But the code assumed that a running thread's lock is the
same as that of the corresponding runqueue, and there are small windows
where this is not true. In this case, we can end up with non-atomic
modifications to td_flags.
Since this load balancing is best-effort, simply give up if the thread's
lock doesn't match; in this case the thread is about to enter the
scheduler anyway.
Kornel Dulęba [Tue, 10 May 2022 13:22:55 +0000 (15:22 +0200)]
Implement shared page address randomization
It used to be mapped at the top of the UVA.
If the randomization is enabled any address above .data section will be
randomly chosen and a guard page will be inserted in the shared page
default location.
The shared page is now mapped in exec_map_stack, instead of
exec_new_vmspace. The latter function is called before image activator
has a chance to parse ASLR related flags.
The KERN_PROC_VM_LAYOUT sysctl was extended to provide shared page
address.
The feature is enabled by default for 64 bit applications on all
architectures.
It can be toggled kern.elf64.aslr.shared_page sysctl.
Kornel Dulęba [Thu, 2 Jun 2022 07:58:12 +0000 (09:58 +0200)]
Rework how shared page related data is stored
Store the shared page address in struct vmspace.
Also instead of storing absolute addresses of various shared page
segments save their offsets with respect to the shared page address.
This will be more useful when the shared page address is randomized.
Kornel Dulęba [Thu, 2 Jun 2022 08:45:54 +0000 (10:45 +0200)]
Introduce the PROC_SIGCODE() macro
Use a getter macro instead of fetching the sigcode address directly
from a sysent of a given process. It assumes that the sigcode is stored
in the shared page, which is true in all cases, except for a.out
binaries. This will be later useful when the shared page address
randomization is introduced.
No functional change intended.
Mike Karels [Sat, 16 Jul 2022 21:05:58 +0000 (16:05 -0500)]
ofed/infiniband: fix ifdefs for new INET changes, fixing LINT-NOIP
Some of the ofed/infiniband code has INET and INET6 address handling
code without using ifdefs. This failed with a recent change to INET,
in which IN_LOOPBACK() started using a VNET variable, and which is not
present if INET is not configured. Add #ifdef INET, and INET6 for good
measure, in cma_loopback_addr(), along with inclusion of the options
headers in ib_cma.c.
arm64, qoriq_therm: fix handling sites on version 1 and 2
For version 2 extend the TMUV2_TMSAR() write loop over all site_ids
registered for a particular SoC and actually use the site_id rather
than always just the first [0] (which for the LX2080 would be a
problem given there is no site0).
Later, while version 2 adds the SITEs to enable to TMSR in bits 0..<n>,
version 1 (e.g., LS1028, LS1046, LS1088) add MSITEs to TMR
bits 16..31 or rather 15..0(16-<n>). Adjust the loops to only enable
the site_ids listed for the particular SoC for monitoring. This now
also deals with sparse site_ids (not starting at 0, or not being
contiguous).
sys/dev/cxgbe/cudbg/cudbg_lib.c:2949:6: error: variable 'i' set but not used [-Werror,-Wunused-but-set-variable]
int i = 0;
^
Apparently 'i' was meant as the current retry counter, but '1' was used
in the while loop comparison instead, making the loop potentially
infinite, if 'busy' never gets reset.
MFC after: 3 days
Reviewed by: np
Differential Revision: https://reviews.freebsd.org/D35834
Mark Johnston [Sat, 16 Jul 2022 15:29:53 +0000 (11:29 -0400)]
vm_object: Remove redundant OBJ_SWAP checks
With the removal of OBJT_DEFAULT, OBJ_ANON implies OBJ_SWAP.
Note, this means that vm_object_split() is more expensive than it used
to be, as it holds busy locks until the end of the range is reached,
even if the object has no swap blocks allocated.
Reviewed by: alc, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35789
Mark Johnston [Sat, 16 Jul 2022 15:29:19 +0000 (11:29 -0400)]
vm: Remove handling for OBJT_DEFAULT objects
Now that OBJT_DEFAULT objects can't be instantiated, we can simplify
checks of the form object->type == OBJT_DEFAULT || (object->flags &
OBJ_SWAP) != 0. No functional change intended.
Reviewed by: alc, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35788
Mark Johnston [Sat, 16 Jul 2022 15:28:09 +0000 (11:28 -0400)]
swap_pager: Removing handling for objects with OBJ_SWAP clear
With the removal of OBJT_DEFAULT, we can assume that pager operations
provide an object with OBJ_SWAP set. Also, we do not need to convert
objects from type OBJT_DEFAULT. Thus, remove checks for OBJ_SWAP and
remove code which modifies the object type. In some places, replace the
check for OBJ_SWAP with a check for whether any swap blocks are
assigned.
Reviewed by: alc, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35786
Mark Johnston [Sat, 16 Jul 2022 15:27:27 +0000 (11:27 -0400)]
vm_object: Modify vm_object_allocate_anon() to return OBJT_SWAP objects
With this change, OBJT_DEFAULT objects are no longer allocated.
Instead, anonymous objects are always of type OBJT_SWAP and always have
OBJ_SWAP set.
Modify the page fault handler to check the swap block radix tree in
places where it checked for objects of type OBJT_DEFAULT. In
particular, there's no need to invoke getpages for an OBJT_SWAP object
with no swap blocks assigned.
Reviewed by: alc, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35785
sys/dev/cxgb/cxgb_sge.c:1290:21: error: variable 'txsd' set but not used [-Werror,-Wunused-but-set-variable]
struct tx_sw_desc *txsd = &txq->sdesc[txqs->pidx];
^
It appears 'txsd' is a leftover from a previous refactoring (see 3f345a5d09b6), but is no longer used for anything, and can be removed
without any functional change.
MFC after: 3 days
Reviewed by: np
Differential Revision: https://reviews.freebsd.org/D35833
Clarify when GEOM utilities exit with success or failure.
Historically, GEOM utilities (gpart(8), gstripe(8), gmirror(8),
etc) used the gctl_error() routine to report errors. If they called
gctl_error() they would exit with EXIT_FAILURE, otherwise they would
return with EXIT_SUCCESS. If they used gctl_error() to output an
informational message, for example when run with the -v (verbose)
option, they would mistakenly exit with EXIT_FAILURE. A further
limitation of the gctl_error() function was that it could only be
called once. Messages from any additional calls to gctl_error()
would be silently discarded.
To resolve these problems a new function, gctl_msg() has been added.
It can be called multiple times to output multiple messages. It
also has an additional errno argument which should be zero if it is
an informational message or an errno value (EINVAL, EBUSY, etc) if
it is an error. When done the gctl_post_messages() function should
be called to indicate that all messages have been posted. If any
of the messages had a non-zero errno, the utility will EXIT_FAILURE.
If only informational messages (with zero errno) were posted, the
utility will EXIT_SUCCESS.
Tested by: Peter Holm
PR: 265184
MFC after: 1 week
Adjust agp_find_device() definition in agp.c to avoid clang 15 warning
With clang 15, the following -Werror warning is produced:
sys/dev/agp/agp.c:910:16: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
agp_find_device()
^
void
This is because agp_find_device() is declared with a (void) argument
list, and defined with an empty argument list. Make the definition match
the declaration.
Replace struct timeval in header with struct timespec.
To differentiate header formats, add a new KTR_VERSIONED flag
set in the header type field similar to the existing KTRDROP flag.
To make it easier to extend ktrace headers in the future,
extend the existing header with a version field (version 0 is
reserved for older records without KTR_VERSIONED) as well as
new fields holding the thread ID and CPU ID.
BSD stat and GNU stat differ significantly when it comes to using a
custom format string, both in the option name and in the format string
itself. Handle both here (assuming Linux means GNU stat rather than BSD
stat).
Makefile.inc1 release bsd.own.mk: Introduce and use TAR_CMD
Our uses of tar rely on BSDisms, and so do not work in environments
where GNU tar is the default tar. Providing a TAR_CMD variable like
some other commands allows it to be overridden to use bsdtar in such
cases.
Makefile.inc1: Set LC_COLLATE in distributeworld for glibc compatibility
distributeworld relies on "foo" sorting directly before "foo type=...",
but with glibc both en_US and en_GB have "fooa" sort between "foo" and
"foo z", resulting in some files (in particular, id due to "ident"
sorting before "id type=" but after "id") not being included in the meta
files and thus not included in the dist tarballs. Forcing use of the C
locale ensures this does not occur.
FreeBSD and macOS have a test that treats == as an alias for =, but
Linux tends to use GNU coreutils (when not a builtin) which does not.
Use the standard syntax instead for compatibility.
Makefile.inc1: Honour DB_FROM_SRC for NO_ROOT distributeworld
Currently the host's database files are used, but on non-FreeBSD these
are not necessarily sufficient; in particular, Linux does not have a
wheel group. Instead, use -N to use the in-tree database files when
creating the METALOG entries, as is done for the recursive makes via
IMAKE_MTREE.
tests/sys/cddl/zfs/bin/readmmap.c:97:9: error: call to undeclared function 'time'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
Obtained from: https://github.com/CTSRD-CHERI/cheribsd/commit/1737d8397a0
MFC after: 3 days
Remove unnecessary const and volatile qualifiers from __fp_type_select()
Since https://github.com/llvm/llvm-project/commit/ca75ac5f04f2, clang 15
has a new warning about _Generic selection expressions, such as used in
math.h:
lib/libc/gdtoa/_ldtoa.c:82:10: error: due to lvalue conversion of the controlling expression, association of type 'volatile float' will never be selected because it is qualified [-Werror,-Wunreachable-code-generic-assoc]
switch (fpclassify(u.e)) {
^
lib/msun/src/math.h:109:2: note: expanded from macro 'fpclassify'
__fp_type_select(x, __fpclassifyf, __fpclassifyd, __fpclassifyl)
^
lib/msun/src/math.h:85:14: note: expanded from macro '__fp_type_select'
volatile float: f(x), \
^
This is because the controlling expression always undergoes lvalue
conversion first, dropping any cv-qualifiers. The 'const', 'volatile',
and 'volatile const' associations will therefore never be used.
Warner Losh [Thu, 30 Jun 2022 18:16:46 +0000 (12:16 -0600)]
kboot: Implement mount(2)
Create a wrapper for the mount system call. To ensure a sane early boot
environment and to gather data we need for kexec, we may need to mount
some special filesystems.
Warner Losh [Thu, 30 Jun 2022 18:25:49 +0000 (12:25 -0600)]
kboot: Implement dup(2)
Early in boot, we need to create the normal stdin/out/err env for the
boot loader to run in. To do that, we need to open the console and
duplicate the file descriptors which requires dup(2). Implement a
wrapper as host_dup.
Warner Losh [Thu, 30 Jun 2022 18:22:33 +0000 (12:22 -0600)]
kboot: Implement symlink(2)
Linux's /dev/fd is implemented inside of /proc/self/fd, so we may need
to create a symlink to it early in boot. "/dev/fd" and "/dev/std*" might
not be strictly required for the boot loader, but should be present for
maximum flexibility.
Warner Losh [Fri, 15 Jul 2022 05:19:18 +0000 (23:19 -0600)]
kboot: Implement stat(2) and fstat(2) system calls
Implement stat(2) and fstat(2) in terms of newfstatat and newfstat
system calls respectively (assume we have a compat #define when
there's no newfstat and just a regular fstat and do so for ppc).
Snag struct kstat (the Linux kernel stat(2), et al interface) from musl
and attribute properly.
Warner Losh [Fri, 1 Jul 2022 17:57:02 +0000 (11:57 -0600)]
kboot: Add HOST_O_ constants for open, etc
Add the common O_ constants for the open, fcntl, etc system calls. They
are different than FreeBSD's. While they can differ based on
architecture, they are constant for architectures we care about, and
those architectures use the 'generic' version so future architectures
will also work.
Warner Losh [Thu, 14 Jul 2022 03:41:17 +0000 (21:41 -0600)]
kboot: Rework _start
Split _start into _start and _start_c (inspired by musl and the powerpc
impl is copied from there). This allows us to actually get the command
line arguments on all the platforms. We have a very simplified startup
that supports only static linking.