When r362031 moved local TLB invalidation after shootdown IPI send, it
moved too much. In particular, PCID-mode clearing of the pm_gen
generation counters must occur before IPIs are send, which is in fact
described by the comment before seq_cst fence in the invalidation
functions.
Fix it by extracting pm_gen clearing into new helper
pmap_invalidate_preipi(), which is executed before a call to
smp_masked_tlb_shootdown().
Rest of the local invalidation callbacks is simplified as result, and
become very similar to the remote shootdown handlers (to be merged in
some future).
Move pin of the thread to pmap_invalidate_preipi(), and do unpin in
smp_masked_tlb_shootdown().
Reported and tested by: mjg (previous version)
Reviewed by: alc, cem (previous version), markj
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D227588
Ability to load-balance traffic over multiple path is a must-have thing for routers.
It may be used by the servers to balance outgoing traffic over multiple default gateways.
The previous implementation, RADIX_MPATH stayed in the shadow for too long.
It was not well maintained, which lead us to a vicious circle - people were using
non-contiguous mask or firewalls to achieve similar goals. As a result, some routing
daemons implementation still don't have multipath support enabled for FreeBSD.
Turning on ROUTE_MPATH by default would fix it. It will allow to reduce networking
feature gap to other operating systems. Linux and OpenBSD enabled similar support
at least 5 years ago.
ROUTE_MPATH does not consume memory unless actually used. It enables around ~1k LOC.
It does not bring any behaviour changes for userland.
Additionally, feature is (temporarily) turned off by the net.route.multipath sysctl
defaulting to 0.
Michael Tuexen [Mon, 14 Dec 2020 22:13:58 +0000 (22:13 +0000)]
Improve the counting of blocks used to transfer a file from the
server to the client in case of not using an OACK: Don't miss
the first block in case of it is not also the last one.
Michal Meloun [Mon, 14 Dec 2020 13:10:19 +0000 (13:10 +0000)]
Finish implementation of ARM PMU interrupts.
The ARM PMU may use single per-core interrupt or may use multiple generic
interrupts, one per core. In this case, special attention must be paid to
the correct identification of the physical location of the core, its order
in the external database (FDT) and the associated cpuid.
Also keep in mind that a SoC can have multiple different PMUs
(usually one per cluster)
Michal Meloun [Mon, 14 Dec 2020 11:57:43 +0000 (11:57 +0000)]
Verify (and fix) the context_id argument passed to the mpentry () by PSCI.
Some older PSCI implementations corrupt (or do not pass) the context_id
argument to newly started secondary cores. Although the ideal solution to this
problem is u-boot update, we can find the correct value for the argument (cpuid)
by comparing of real core mpidr register with the value stored in pcu->mpidr.
Be bug compatible with other operating systems by allowing non-sequential
interface numbering for USB descriptors in userspace. Else certain USB
control requests using the interface number, won't be recognized by the
USB firmware.
Refer to section 9.2.3 in the USB 2.0 specification:
Interfaces are numbered from zero to one less than the number of concurrent interfaces
supported by the configuration.
Jessica Clarke [Mon, 14 Dec 2020 00:54:05 +0000 (00:54 +0000)]
loader: Ignore the .interp section on RISC-V
Without this we risk having the .interp section be placed earlier in the
file and mess with section offsets; in particular it has been seen to be
placed at the start of the file and cause the PE/COFF header to not be
at address 0. This is the same fix as was done for arm64 in r365578.
Jessica Clarke [Mon, 14 Dec 2020 00:50:45 +0000 (00:50 +0000)]
strdup.3: Function appeared in 4.3BSD-Reno, not 4.4BSD
Linux claims 4.3BSD, we claim 4.4BSD and OpenBSD claims 4.3BSD-Reno. It turns
out that OpenBSD got it right: the function was added in late 1988 a few months
after 4.3BSD-Tahoe, well in advance of 4.3BSD-Reno.
Jessica Clarke [Mon, 14 Dec 2020 00:47:59 +0000 (00:47 +0000)]
mips: Fix sub-word atomics implementation
These aligned the address but then always used the least significant
bits of the value in memory, which is the wrong half 50% of the time for
16-bit atomics and the wrong quarter 75% of the time for 8-bit atomics.
These bugs were all present in r178172, the commit that added the mips
port, and have remained for its entire existence to date.
Jessica Clarke [Mon, 14 Dec 2020 00:46:24 +0000 (00:46 +0000)]
loader: Print autoboot countdown immediately, not at 9
For the first second otime and ntime are equal so no message gets
printed. Instead we should print the countdown right from the start,
although we do it at the end of the first iteration so that if a key has
already been pressed then the message is suppressed.
Michael Tuexen [Sun, 13 Dec 2020 23:51:51 +0000 (23:51 +0000)]
Harden the handling of outgoing streams in case of an restart or INIT
collision. This avouds an out-of-bounce access in case the peer can
break the cookie signature. Thanks to Felix Wilhelm from Google for
reporting the issue.
Stefan Eßer [Sun, 13 Dec 2020 19:06:59 +0000 (19:06 +0000)]
Fix WITHOUT_ICONV build
Move the include of langinfo.h out of the WITH_ICONV condition block,
since it is not dependent on ICONV. This was correct when nl_langinfo()
had only been called in the WITH_ICONV case, but that is no longer the
case.
Martin Matuska [Sun, 13 Dec 2020 16:26:37 +0000 (16:26 +0000)]
MFV r368607:
Sync libarchive with vendor.
Vendor changes:
Issue #1461: Unbreak build without lzma
Issue #1462: warc reader: Fix build with gcc11
Issue #1463: Fix code compatibility in test_archive_read_support.c
Issue #1464: Use built-in strnlen on platforms where not available
Issue #1465: warc reader: fix undefined behaviour in deconst() function
Vendor changes:
Issue #1461: Unbreak build without lzma
Issue #1462: warc reader: Fix build with gcc11
Issue #1463: Fix code compatibility in test_archive_read_support.c
Issue #1464: Use built-in strnlen on platforms where not available
Issue #1465: warc reader: fix undefined behaviour in deconst() function
Brandon Bergren [Sun, 13 Dec 2020 03:58:43 +0000 (03:58 +0000)]
[PowerPC] Floating-point exception trap followup
* Fix incorrect operation on 32-bit caused by incorrectly-sized storage
for a temporary FPSCR.
* Fix several whitespace problems.
* Don't try to enable VSX during cleanup_fpscr().
Michael Tuexen [Sat, 12 Dec 2020 22:23:45 +0000 (22:23 +0000)]
Clean up more resouces of an existing SCTP association in case of
a restart.
This fixes a use-after-free scenario, which was reported by Felix
Wilhelm from Google in case a peer is able to modify the cookie.
However, this can also be triggered by an assciation restart under
some specific conditions.
Kyle Evans [Sat, 12 Dec 2020 21:25:38 +0000 (21:25 +0000)]
stand: liblua: add a pager module
This is nearly a 1:1 mapping of the pager API from libsa. The only real
difference is that pager.output() will accept any number of arguments and
coerce all of them to strings for output using luaL_tolstring (i.e. the
__tostring metamethod will be used).
The only consumer planned at this time is the upcoming "show-module-options"
implementation.
Kristof Provost [Sat, 12 Dec 2020 20:14:39 +0000 (20:14 +0000)]
pf: Allow net.pf.request_maxcount to be set from loader.conf
Mark request_maxcount as RWTUN so we can set it both at runtime and from
loader.conf. This avoids usings getting caught out by the change from tunable
to run time configuration.
Ian Lepore [Sat, 12 Dec 2020 18:34:15 +0000 (18:34 +0000)]
Provide userland notification of gpio pin changes ("userland gpio interrupts").
This is an import of the Google Summer of Code 2018 project completed by
Christian Kramer (and, sadly, ignored by us for two years now). The goals
stated for that project were:
FreeBSD already has support for interrupts implemented in the GPIO
controller drivers of several SoCs, but there are no interfaces to take
advantage of them out of user space yet. The goal of this work is to
implement such an interface by providing descriptors which integrate
with the common I/O system calls and multiplexing mechanisms.
The initial imported code supports the following functionality:
- A kernel driver that provides an interface to the user space; the
existing gpioc(4) driver was enhanced with this functionality.
- Implement support for the most common I/O system calls / multiplexing
mechanisms:
- read() Places the pin number on which the interrupt occurred in the
buffer. Blocking and non-blocking behaviour supported.
- poll()/select()
- kqueue()
- signal driven I/O. Posting SIGIO when the O_ASYNC was set.
- Many-to-many relationship between pins and file descriptors.
- A file descriptor can monitor several GPIO pins.
- A GPIO pin can be monitored by multiple file descriptors.
- Integration with gpioctl and libgpio.
I added some fixes (mostly to locking) and feature enhancements on top of
the original gsoc code. The feature ehancements allow the user to choose
between detailed and summary event reporting. Detailed reporting provides
a record describing each pin change event. Summary reporting provides the
time of the first and last change of each pin, and a count of how many times
it changed state since the last read(2) call. Another enhancement allows
the recording of multiple state change events on multiple pins between each
call to read(2) (the original code would track only a single event at a time).
The phabricator review for these changes timed out without approval, but I
cite it below anyway, because the review contains a series of diffs that
show how I evolved the code from its original state in Christian's github
repo for the gsoc project to what is being commited here. (In effect,
the phab review extends the VC history back to the original code.)
Submitted by: Christian Kramer
Obtained from: https://github.com/ckraemer/freebsd/tree/gsoc2018
Differential Revision: https://reviews.freebsd.org/D27398
Stefan Eßer [Sat, 12 Dec 2020 11:23:52 +0000 (11:23 +0000)]
Change getlocalbase() to not allocate any heap memory
After the commit of the current version, Scott Long pointed out, that an
attacker might be able to cause a use-after-free access if this function
returned the value of the sysctl variable "user.localbase" by freeing
the allocated memory without the cached address being cleared in the
library function.
To resolve this issue, I have proposed the originally suggested version
with a statically allocated buffer in a review (D27370). There was no
feedback on this review and after waiting for more than 2 weeks, the
potential security issue is fixed by this commit. (There was no security
risk in practice, since none of the programs converted to use this
function attempted to free the buffer. The address could only have
pointed into the heap if user.localbase was set to a non-default value,
into r/o data or the environment, else.)
This version uses a static buffer of size LOCALBASE_CTL_LEN, which
defaults to MAXPATHLEN. This does not increase the memory footprint
of the library at this time, since its data segment grows from less
than 7 KB to less than 8 KB, i.e. it will get two 4 KB pages on typical
architectures, anyway.
Compiling with LOCALBASE_CTL_LEN defined as 0 will remove the code
that accesses the sysctl variable, values between 1 and MAXPATHLEN-1
will limit the maximum size of the prefix. When built with such a
value and if too large a value has been configured in user.localbase,
the value defined as ILLEGAL_PREFIX will be returned to cause any
file operations on that result to fail. (Default value is "/dev/null/",
the review contained "/\177", but I assume that "/dev/null" exists and
can not be accessed as a directory. Any other string that can be assumed
not be a valid path prefix could be used.)
I do suggest to use LOCALBASE_CTL_LEN to size the in-kernel buffer for
the user.localbase variable, too. Doing this would guarantee that the
result always fit into the buffer in this library function (unless run
on a kernel built with a different buffer size.)
The function always returns a valid string, and only in case it is built
with a small static buffer and run on a system with too large a value in
user.localbase, the ILLEGAL_PREFIX will be returned, effectively causing
the created path to be non-existent.
Kyle Evans [Sat, 12 Dec 2020 05:57:42 +0000 (05:57 +0000)]
lualoader: provide module-manipulation commands
Specifically, we have:
- enable-module
- disable-module
- toggle-module
These can be used to add/remove modules to be loaded or force modules to be
loaded in spite of modules_blacklist. In the typical case, a user is
expected to use them to recover an issue happening due to a module directive
they've added to their loader.conf or because they discover that they've
under-specified what to load.
Brooks Davis [Fri, 11 Dec 2020 21:51:50 +0000 (21:51 +0000)]
ndis(4): expand deprecation to the whole driver
nids(4) was a clever idea in the early 2000's when the market was
flooded with 10/100 NICs with Windows-only drivers, but that hasn't been
the case for ages and the driver has had no meaningful maintenance in
ages. It only supports Windows-XP era drivers.
Brooks Davis [Fri, 11 Dec 2020 21:40:38 +0000 (21:40 +0000)]
hme(4): Remove as previous announced
The hme (Happy Meal Ethernet) driver was the onboard NIC in most
supported sparc64 platforms. A few PCI NICs do exist, but we have seen
no evidence of use on non-sparc systems.
Mitchell Horne [Fri, 11 Dec 2020 20:01:45 +0000 (20:01 +0000)]
riscv: small counter(9) improvements
Prefer atomics to critical section. This reduces the cost of the
increment operation and removes the possibility of it being interrupted
by counter_u64_zero().
Use CPU_FOREACH() macro to skip absent CPUs.
Replace hand-rolled address calculation with zpcpu_get().
Brooks Davis [Fri, 11 Dec 2020 01:00:07 +0000 (01:00 +0000)]
style(9): Correct whitespace in struct definitions
struct ifconf and struct ifreq use the odd style "struct<tab>foo".
struct ifdrv seems to have tried to follow this but was committed with
spaces in place of most tabs resulting in "struct<space><space>ifdrv".
Enji Cooper [Fri, 11 Dec 2020 00:26:49 +0000 (00:26 +0000)]
cap_enter(2): fix CAVEATS section
The CAVEATS section was misspelled as "CAVEAT" before this change. Fix the
spelling to identify issues related to the section.
Furthermore, given that the section order was incorrect, move the CAVEATS
section down to the bottom of the manpage, per the conventional section
order.
Enji Cooper [Fri, 11 Dec 2020 00:20:04 +0000 (00:20 +0000)]
posix_spawn(3): fix section that references `vfork`
`vfork(2)` should be referenced in paragraphs as `.Fn vfork`, not `vfork()`.
This change switches the reference to use `.Fn`, which in turn makes the
manpage `make manlint` clean.
Bryan Drewery [Thu, 10 Dec 2020 20:44:29 +0000 (20:44 +0000)]
contig allocs: Don't retry forever on M_WAITOK.
This restores behavior from before domain iterators were added in
r327895 and r327896.
The vm_domainset_iter_policy() will do a vm_wait_doms() and then
restart its iterator when M_WAITOK is set. It will also force
the containing loop to have M_NOWAIT. So we get an unbounded
retry loop rather than the intended bounded retries that
kmem_alloc_contig_pages() already handles.
This also restores M_WAITOK to the vmem_alloc() call in
kmem_alloc_attr_domain() and kmem_alloc_contig_domain().
Michael Tuexen [Thu, 10 Dec 2020 19:36:33 +0000 (19:36 +0000)]
Fix the TFTP client when performing a RRQ for files smaller than 512 bytes
and the server not sending an OACK:
* Close the file.
* Report the correct the number of received blocks.
Mateusz Guzik [Thu, 10 Dec 2020 17:17:22 +0000 (17:17 +0000)]
fd: make serialization in fdescfree_fds conditional on hold count
p_fd nullification in fdescfree serializes against new threads transitioning
the count 1 -> 2, meaning that fdescfree_fds observing the count of 1 can
safely assume there is nobody else using the table. Losing the race and
observing > 1 is harmless.
hyperv/vmbus: avoid crash, panic if vbe fb info is missing
Do not assume that VBE framebuffer metadata can be used. Like with the
EFI fb metadata, it may be null, so we should take care not to
dereference the null vbefb pointer. This avoids a panic when booting
-CURRENT on a gen1 VM in Azure.
Approved by: tsoome
Sponsored by: Miles AS
Differential Revision: https://reviews.freebsd.org/D27533
Stefan Eßer [Thu, 10 Dec 2020 09:31:05 +0000 (09:31 +0000)]
Lift scope of buf[] to make it extend to a potential access via *basename
It can be assumed that the contents of the buffer was still allocated and
valid at the point of the out-of-scope access, so there was no security
issue in practice.
Reported by: Coverity Scan CID 1437697
MFC after: 3 days
Thomas Munro [Thu, 10 Dec 2020 07:13:15 +0000 (07:13 +0000)]
truss: Add AIO syscalls.
Display the arguments of aio_read(2), aio_write(2), aio_suspend(2),
aio_error(2), aio_return(2), aio_cancel(2), aio_fsync(2), aio_mlock(2),
aio_waitcomplete(2) and lio_listio(2) in human-readable form.
Fix bug in ifconfig preventing proper VLAN creation.
Detection of interface type by filter must happen before detection of
interface type by prefix. Else the following sequence of commands will
try to create a LAGG interface instead of a VLAN interface, which
accidentially worked previously, because the date pointed to by the
ifr_data pointer was not parsed by VLAN create ioctl(2). This is a
regression after r368229, because the VLAN creation now parses the
ifr_data field.
How to reproduce:
# ifconfig lagg0 create
# ifconfig lagg0.256 create
Ryan Libby [Wed, 9 Dec 2020 18:43:58 +0000 (18:43 +0000)]
dmar: reserve memory windows of PCIe root port
PCI memory address space is shared between memory-mapped devices (MMIO)
and host memory (which may be remapped by an IOMMU). Device accesses to
an address within a memory aperture in a PCIe root port will be treated
as peer-to-peer and not forwarded to an IOMMU. To avoid this, reserve
the address space of the root port's memory apertures in the address
space used by the IOMMU for remapping.
Reviewed by: kib, tychon
Discussed with: Anton Rang <rang@acm.org>
Tested by: tychon
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D27503
Kyle Evans [Wed, 9 Dec 2020 15:28:56 +0000 (15:28 +0000)]
netgraph: macfilter: small fixes
Two issues:
- The DEBUG macro defined is in direct conflict with the DEBUG kernel
option, which broke the -LINT build[0]
- Building with NG_MACFILTER_DEBUG did not compile on LP64 systems due to
using %d for sizeof().
Mark Johnston [Wed, 9 Dec 2020 14:05:08 +0000 (14:05 +0000)]
Plug a race between fd table teardown and several loops
To export information from fd tables we have several loops which do
this:
FILDESC_SLOCK(fdp);
for (i = 0; fdp->fd_refcount > 0 && i <= lastfile; i++)
<export info for fd i>;
FILDESC_SUNLOCK(fdp);
Before r367777, fdescfree() acquired the fd table exclusive lock between
decrementing fdp->fd_refcount and freeing table entries. This
serialized with the loop above, so the file at descriptor i would remain
valid until the lock is dropped. Now there is no serialization, so the
loops may race with teardown of file descriptor tables.
Acquire the exclusive fdtable lock after releasing the final table
reference to provide a barrier synchronizing with these loops.
Reported by: pho
Reviewed by: kib (previous version), mjg
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27513