Stefan Eßer [Thu, 10 Dec 2020 09:31:05 +0000 (09:31 +0000)]
Lift scope of buf[] to make it extend to a potential access via *basename
It can be assumed that the contents of the buffer was still allocated and
valid at the point of the out-of-scope access, so there was no security
issue in practice.
Reported by: Coverity Scan CID 1437697
MFC after: 3 days
Thomas Munro [Thu, 10 Dec 2020 07:13:15 +0000 (07:13 +0000)]
truss: Add AIO syscalls.
Display the arguments of aio_read(2), aio_write(2), aio_suspend(2),
aio_error(2), aio_return(2), aio_cancel(2), aio_fsync(2), aio_mlock(2),
aio_waitcomplete(2) and lio_listio(2) in human-readable form.
Fix bug in ifconfig preventing proper VLAN creation.
Detection of interface type by filter must happen before detection of
interface type by prefix. Else the following sequence of commands will
try to create a LAGG interface instead of a VLAN interface, which
accidentially worked previously, because the date pointed to by the
ifr_data pointer was not parsed by VLAN create ioctl(2). This is a
regression after r368229, because the VLAN creation now parses the
ifr_data field.
How to reproduce:
# ifconfig lagg0 create
# ifconfig lagg0.256 create
Ryan Libby [Wed, 9 Dec 2020 18:43:58 +0000 (18:43 +0000)]
dmar: reserve memory windows of PCIe root port
PCI memory address space is shared between memory-mapped devices (MMIO)
and host memory (which may be remapped by an IOMMU). Device accesses to
an address within a memory aperture in a PCIe root port will be treated
as peer-to-peer and not forwarded to an IOMMU. To avoid this, reserve
the address space of the root port's memory apertures in the address
space used by the IOMMU for remapping.
Reviewed by: kib, tychon
Discussed with: Anton Rang <rang@acm.org>
Tested by: tychon
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D27503
Kyle Evans [Wed, 9 Dec 2020 15:28:56 +0000 (15:28 +0000)]
netgraph: macfilter: small fixes
Two issues:
- The DEBUG macro defined is in direct conflict with the DEBUG kernel
option, which broke the -LINT build[0]
- Building with NG_MACFILTER_DEBUG did not compile on LP64 systems due to
using %d for sizeof().
Mark Johnston [Wed, 9 Dec 2020 14:05:08 +0000 (14:05 +0000)]
Plug a race between fd table teardown and several loops
To export information from fd tables we have several loops which do
this:
FILDESC_SLOCK(fdp);
for (i = 0; fdp->fd_refcount > 0 && i <= lastfile; i++)
<export info for fd i>;
FILDESC_SUNLOCK(fdp);
Before r367777, fdescfree() acquired the fd table exclusive lock between
decrementing fdp->fd_refcount and freeing table entries. This
serialized with the loop above, so the file at descriptor i would remain
valid until the lock is dropped. Now there is no serialization, so the
loops may race with teardown of file descriptor tables.
Acquire the exclusive fdtable lock after releasing the final table
reference to provide a barrier synchronizing with these loops.
Reported by: pho
Reviewed by: kib (previous version), mjg
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27513
Justin Hibbits [Wed, 9 Dec 2020 02:07:01 +0000 (02:07 +0000)]
dev/mfi: Make a seemingly bogus conditional unconditional
Summary:
r358689 attempted to fix a clang warning/error by inferring the intent
of the condition "(cdb[0] != 0x28 || cdb[0] != 0x2A)". Unfortunately, it looks
like this broke things. Instead, fix this by making this path unconditional,
effectively reverting to the previous state.
Bryan Drewery [Tue, 8 Dec 2020 23:38:26 +0000 (23:38 +0000)]
fts_read: Handle error from a NULL return better.
This is addressing cases such as fts_read(3) encountering an [EIO]
from fchdir(2) when FTS_NOCHDIR is not set. That would otherwise be
seen as a successful traversal in some of these cases while silently
discarding expected work.
As noted in r264201, fts_read() does not set errno to 0 on a successful
EOF so it needs to be set before calling it. Otherwise we might see
a random error from one of the iterations.
gzip is ignoring most errors and could be improved separately.
Kyle Evans [Tue, 8 Dec 2020 18:47:22 +0000 (18:47 +0000)]
cpuset_set{affinity,domain}: do not allow empty masks
cpuset_modify() would not currently catch this, because it only checks that
the new mask is a subset of the root set and circumvents the EDEADLK check
in cpuset_testupdate().
This change both directly validates the mask coming in since we can
trivially detect an empty mask, and it updates cpuset_testupdate to catch
stuff like this going forward by always ensuring we don't end up with an
empty mask.
The check_mask argument has been renamed because the 'check' verbiage does
not imply to me that it's actually doing a different operation. We're either
augmenting the existing mask, or we are replacing it entirely.
Kyle Evans [Tue, 8 Dec 2020 18:45:47 +0000 (18:45 +0000)]
kern: cpuset: resolve race between cpuset_lookup/cpuset_rel
The race plays out like so between threads A and B:
1. A ref's cpuset 10
2. B does a lookup of cpuset 10, grabs the cpuset lock and searches
cpuset_ids
3. A rel's cpuset 10 and observes the last ref, waits on the cpuset lock
while B is still searching and not yet ref'd
4. B ref's cpuset 10 and drops the cpuset lock
5. A proceeds to free the cpuset out from underneath B
Resolve the race by only releasing the last reference under the cpuset lock.
Thread A now picks up the spinlock and observes that the cpuset has been
revived, returning immediately for B to deal with later.
Kyle Evans [Tue, 8 Dec 2020 18:44:06 +0000 (18:44 +0000)]
kern: cpuset: plug a unr leak
cpuset_rel_defer() is supposed to be functionally equivalent to
cpuset_rel() but with anything that might sleep deferred until
cpuset_rel_complete -- this setup is used specifically for cpuset_setproc.
Add in the missing unr free to match cpuset_rel. This fixes a leak that
was observed when I wrote a small userland application to try and debug
another issue, which effectively did:
cpuset(&newid);
cpuset(&scratch);
newid gets leaked when scratch is created; it's off the list, so there's
no mechanism for anything else to relinquish it. A more realistic reproducer
would likely be a process that inherits some cpuset that it's the only ref
for, but it creates a new one to modify. Alternatively, administratively
reassigning a process' cpuset that it's the last ref for will have the same
effect.
Mitchell Horne [Tue, 8 Dec 2020 18:24:33 +0000 (18:24 +0000)]
arm64: fix struct l_sigaction_t layout
The definition was copied from amd64, but the layout of the struct
differs slightly between these platforms. This fixes spurious
`unsupported sigaction flag 0xXXXXXXXX` messages when executing some
Linux binaries on arm64.
Reviewed by: emaste
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27460
John Baldwin [Tue, 8 Dec 2020 18:00:58 +0000 (18:00 +0000)]
Check that the frame pointer is within the current stack.
This same check is used on other architectures. Previously this would
permit a stack frame to unwind into any arbitrary kernel address
(including unmapped addresses).
Adrian Chadd [Tue, 8 Dec 2020 17:25:59 +0000 (17:25 +0000)]
[ath] replace the hard-coded magic values in if_athioctl.h with constant defines
Replace some hard-coded magic values in the ioctl stats struct with
#defines. I'm going to follow up with some more sanity checking in
the receive path that also use these values so we don't do bad
things if the hardware is (more) confused.
Gleb Smirnoff [Tue, 8 Dec 2020 16:46:00 +0000 (16:46 +0000)]
The list of ports in configuration path shall be protected by locks,
epoch shall be used only for fast path. Thus use LAGG_XLOCK() in
lagg_[un]register_vlan. This fixes sleeping in epoch panic.
Ed Maste [Tue, 8 Dec 2020 14:56:15 +0000 (14:56 +0000)]
Default to WITHOUT_GDB (GDB 6.1.1) for FreeBSD 13
As discussed on -current, -stable, -toolchain, and with jhb@ and imp@,
disable the obsolete in-tree GDB 6.1.1 by default. This was kept only
to provide kgdb for the crashinfo tool, but is long-obsolete, does not
support all architectures that FreeBSD does, and held back other work
(such as forcing the use of DWARF2 for kernel debug).
Crashinfo will use kgdb from the gdb package or devel/gdb port, and will
privde a message referencing those if no kgdb is found.
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Kyle Evans [Tue, 8 Dec 2020 14:05:25 +0000 (14:05 +0000)]
src.opts.mk: switch to bsdgrep as /usr/bin/grep
This has been years in the making, and we all knew it was bound to happen
some day. Switch to the BSDL grep implementation now that it's been a
little more thoroughly tested and theoretically supports all of the
extensions that gnugrep in base had with our libregex(3).
Folks shouldn't really notice much from this update; bsdgrep is slower than
gnugrep, but this is currently the price to pay for fewer bugs. Those
dissatisfied with the speed of grep and in need of a faster implementation
should check out what textproc/ripgrep and textproc/the_silver_searcher
can do for them.
I have some WIP to make bsdgrep faster, but do not consider it a blocker
when compared to the pros of switching now (aforementioned bugs, licensing).
Enji Cooper [Tue, 8 Dec 2020 04:18:16 +0000 (04:18 +0000)]
extattr_get_file(20: bump .Dd
This is being done for the formatting and context changes. While the net content
hasn't been changed, the content/context changes were sufficient to warrant the
date bump.
Enji Cooper [Tue, 8 Dec 2020 04:16:05 +0000 (04:16 +0000)]
extattr_get_file(2): clarify RETURN VALUES
While some of the syscalls' behavior were documented and implied in the
RETURN VALUES section by earlier, e.g., the DESCRIPTION sections, as having
behavior of the other calls (`*_fd` vs `*_file` vs `*_link`), there was a lot
of implied return value behavior in the section prior to this change.
Explicitly document the syscall behavior per the current implementation in
sys/kern/vfs_extattr.c so others can better develop based on its explicit
documented behavior instead of having to digest the context of the manpage to
understand the appropriate behavior.
Enji Cooper [Tue, 8 Dec 2020 04:01:03 +0000 (04:01 +0000)]
extattr_get_file(2): sort syscalls alphabetically
Although some sections of the manpage sort the syscalls alphabetically, many
core areas of the manpage do not. Sort the syscalls so it is easier to pick out
functional changes and to improve manpage readability.
This formatting change is also being done to make future functional changes
easier to spot.
Enji Cooper [Tue, 8 Dec 2020 03:43:00 +0000 (03:43 +0000)]
extattr_get_fd(2): fix manlint errors
- The CAVEATS section was misspelled as "CAVEAT".
- The CAVEATS section should come before the "BUGS" section and after
other existing sections by convention.
Mitchell Horne [Tue, 8 Dec 2020 00:48:50 +0000 (00:48 +0000)]
release: don't checksum images if there are none
For platforms that don't have any of the memstick, cdrom, or dvdrom
release images (i.e. riscv64), the release-install target will trip up
when invoking md5(1) on the non-existent image files. Skipping this
allows the install to complete successfully.
Mitchell Horne [Tue, 8 Dec 2020 00:42:03 +0000 (00:42 +0000)]
RISC-V release confs
Add two release flavors for RISC-V. First, the traditional "big-iron"
images, capable of generating distribution sets and VM images. Installer
images won't be built yet, but can be trivially enabled in the future
with the addition of riscv/make-memstick.sh.
Second, a GENERICSD embedded image. I've opted for this instead of
board-specific SD card images as it allows users to just dd the u-boot
they want. The RISC-V hardware ecosystem is still young, so a
configuration for e.g. the new PolarFire SoC Icicle Kit would likely see
very few users.
Mitchell Horne [Tue, 8 Dec 2020 00:37:11 +0000 (00:37 +0000)]
riscv: allow building virtual machine images
RISC-V has the same booting requirements as arm64 (loader.efi, no legacy
boot options), so generated images for both architectures have the same
partition layout.
Mitchell Horne [Tue, 8 Dec 2020 00:35:13 +0000 (00:35 +0000)]
release.sh: add support for RISC-V embedded builds
Since the few existing RISC-V hardware platforms are single board
computers, we can piggyback off of arm/arm64's embedded build support
for generating SD card images.
I don't see a pressing need to change the naming in this file at this
time.
Reviewed by: gjb, manu
Differential Revision: https://reviews.freebsd.org/D27043
Andrew Turner [Mon, 7 Dec 2020 17:54:49 +0000 (17:54 +0000)]
Ensure the boot CPU is CPU 0 on arm64
We assume the boot CPU is always CPU 0 on arm64. To allow for this reserve
cpuid 0 for the boot CPU in the ACPI and FDT cases but otherwise start the
CPU as normal. We then check for the boot CPU in start_cpu and return as if
it was started.
While here extract the FDT CPU init code into a new function to simplify
cpu_mp_start and return FALSE from start_cpu when the CPU fails to start.
Reviewed by: mmel
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D27497
Mark Johnston [Mon, 7 Dec 2020 14:52:57 +0000 (14:52 +0000)]
iflib: Detach tasks upon device registration failure
In some error paths we would fail to detach from the iflib taskqueue
groups. Also move the detach code into its own subroutine instead of
duplicating it.
Submitted by: Sai Rajesh Tallamraju <stallamr@netapp.com>
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D27342
Toomas Soome [Mon, 7 Dec 2020 11:25:18 +0000 (11:25 +0000)]
loader: xdr_array is missing count
The integer arrays are encoded in nvlist as counted array <count, i0, i1...>,
loader xdr_array() is missing the count. This will affect the pool import when
there are hole devices in pool.
Prefer using the MIN() function macro over the min() inline function
in the LinuxKPI. Linux defines min() to be a macro, while in FreeBSD
min() is a static inline function clamping its arguments to
"unsigned int".
Mark Johnston [Sun, 6 Dec 2020 22:45:50 +0000 (22:45 +0000)]
uma: Make uma_zone_set_maxcache() work better with small limits
The old implementation chose the largest bucket zone such that if the
per-CPU caches are fully populated, the total number of items cached is
no larger than the specified limit. If no such zone existed, UMA would
not do any caching.
We can now use uz_bucket_size_max to set a precise limit on the number
of items in a zone's bucket, so the total size of per-CPU caches can be
bounded more easily. Implement a new policy in uma_zone_set_maxcache():
choose a bucket size such that up to half of the limit can be cached in
per-CPU caches, with the rest going to the full bucket cache. This
fixes a problem with the kstack_cache zone: the limit of 4 * mp_ncpus
items meant that the zone would not do any caching, defeating the whole
purpose of the zone. That's because the smallest bucket size holds up
to 2 items and we may cache up to 3 full buckets per CPU, and
2 * 3 * mp_ncpus > 4 * mp_ncpus.
Reported by: mjg
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27168
Mark Johnston [Sun, 6 Dec 2020 22:45:39 +0000 (22:45 +0000)]
uma: Enforce the use of uz_bucket_size_max in the free path
uz_bucket_size_max is the maximum permitted bucket size. When filling a
new bucket to satisfy uma_zalloc(), the bucket is populated with at most
uz_bucket_size_max items. The maximum number of entries in the bucket
may be larger. When freeing items, however, we will fill per-CPPU
buckets up to their maximum number of entries, potentially exceeding
uz_bucket_size_max. This makes it difficult to precisely limit the
number of items that may be cached in a zone. For example, if one wants
to limit buckets to 1 entry for a particular zone, that's not possible
since the smallest bucket holds up to 2 entries.
Try to solve the problem by using uz_bucket_size_max to limit the number
of entries in a bucket. Note that the ub_entries field is initialized
upon every bucket allocation. Most zones are not affected since they do
not impose any specific limit on the maximum bucket size.
While here, remove the UMA_ZONE_MINBUCKET flag. It was unused and we
now have uma_zone_set_maxcache() to control the zone's cache size more
precisely.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27167
Ed Maste [Sun, 6 Dec 2020 21:34:04 +0000 (21:34 +0000)]
Add deprecation notice to mn(4)
Sync serial (T1/E1) interfaces are largely irrelevant today and phk
confirms this driver is unnecessary in review D23928.
This leaves ce(4) and cp(4) in the tree. They're likely not relevant
either, but glebius contacted the manufacturer and those devices are
still available for purchase. At glebius' suggestion leave them in
the tree as long as they do not impose a maintenace burden.
Approved by: phk
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Michael Tuexen [Sun, 6 Dec 2020 18:43:12 +0000 (18:43 +0000)]
When dropping packets (RRQ or WRQ) for debugging, report the send
operation as successful. Reporting a failure stops the transfer
instead of using timeouts.
Kyle Evans [Sun, 6 Dec 2020 17:45:42 +0000 (17:45 +0000)]
bsdgrep: don't link against libregex for bootstrap
r368355 removed the GNU_GREP_COMPAT knob (off by default) and forgot that
bsdgrep may be built/used for bootstrap on some systems.
All base uses should strive to use only POSIX-compliant expressions anyways
and we haven't had libregex by default here up to this point, so just don't
do that if we're bootstrapping.
Note that the resulting binary has the wrong `grep -V` information as it
falsely claims to be GNU compatible, but it is only for bootstrap.
Yuri Pankov [Sun, 6 Dec 2020 16:44:41 +0000 (16:44 +0000)]
update wcwidth data from utf8proc
Character width data being out of date is a constant source
of weird rendering issues and wasted time trying to diagnose
those, e.g. as reported by Jeremy Chadwick:
https://gitlab.com/muttmua/mutt/-/issues/67
Sadly, there is no real ("standard") wcwidth data source, so
this tries to rectify the problem using the utf8proc one (through
its C API) which would hopefully benefeat both FreeBSD and
utf8proc through bug reports (if any).
Kyle Evans [Sun, 6 Dec 2020 15:58:50 +0000 (15:58 +0000)]
bectl: simplify the tail end of the jail cmd
This has already confused me once (and I'm pretty sure I wrote it), so let's
clarify: unjailing after the command has completed will only happen if we're
interactive and -U has not been specified.
This just folds two conditionals together to make it obvious how -b/-U
interact with each other.
Tijl Coosemans [Sun, 6 Dec 2020 10:58:55 +0000 (10:58 +0000)]
Move V4L feature declarations and DTrace provider definitions from
linux_common.c to linux_util.c so they become available on i386.
linux_common.c defines the linux_common kernel module but this module does
not exist on i386 and linux_common.c is not included in the linux module.
linux_util.c is included in the linux_common module on amd64 and the linux
module on i386.
Remove linux_common.c from files.i386 again. It was added recently in
r367433 when the DTrace provider definitions were moved.
The V4L feature declarations were moved to linux_common in r283423.
Tijl Coosemans [Sat, 5 Dec 2020 14:53:24 +0000 (14:53 +0000)]
Fix i386 linux module after r367395.
In r367395 parts of machine dependent linux_dummy.c were moved to a new
machine independent file sys/compat/linux/linux_dummy.c and the existing
linux_dummy.c was renamed to linux_dummy_machdep.c.
Add linux_dummy_machdep.c to the linux module for i386.
Rename sys/amd64/linux32/linux_dummy.c for consistency.
Add the new linux_dummy.c to the linux module for i386.
Kyle Evans [Sat, 5 Dec 2020 14:38:46 +0000 (14:38 +0000)]
libc: regex: partial revert of r368358
Part of the libregex functionality leaked into the tests it shares with
the standard regex(3). Introduce a P flag to set the REG_POSIX cflag to
indicate that libc regex should effectively do nothing while libregex should
specifically run it in non-extended mode.
Michal Meloun [Sat, 5 Dec 2020 14:06:01 +0000 (14:06 +0000)]
Simplify startup of secondary cores and store MPIDR register to pcpu.
- record MPIDR for all started cores in pcpu, they will be used as link
between physical locality of given core, ID in external description
(FDT or ACPI) and cupid.
- because of above, cpuid can (and should) be freely assigned, only boot
CPU must have cpuid 0. Simplify startup code according this.
Please note that pure cpuid is not sufficient instrument to hold any
information about core or cluster topology, nor to determistically iterate
over subpart of cores in CPU (iterate over all cores in single cluster for
example). Situation is more complicated by fact that PSCI can reject start
of core without reporting error (because power budget for example), or by
fact that is possible that we booted on non-first core in cluster (thus with
cpuid 0 assigned to random core).
Given cores topology should be exhibited to other parts of system
(for example to scheduler for big.little or multicluster systems) by using
smp_topo interface.
Michal Meloun [Sat, 5 Dec 2020 10:55:09 +0000 (10:55 +0000)]
DesignWare PCIe driver: Don't call bus_generic_attach() twice.
bus_generic_attach() should be called from the attach function of the real
implementation, not from the common init function.
It was realized just a little too late that this was a hack that belonged in
individual regex(3)-using applications. It was surrounded in NOTYET and not
implemented in the engine, so remove it.