We generally like to avoid style changes when other changes are not
planned. In this case there are some makesyscalls.lua changes in the
pipeline, and this cleans up style nits in generated files that were
highlighted by experiments with clang-format.
Reviewed by: brooks, kevans
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30235
Remove OBJT_SWAP_TMPFS. Move tmpfs-specific swap pager bits into
tmpfs_subr.c.
There is no longer any code to directly support tmpfs in sys/vm, most
tmpfs knowledge is shared by non-anon swap object type implementation.
The tmpfs-specific methods are provided by registered tmpfs pager, which
inherits from the swap pager.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30168
vm_object_set_memattr(): handle all object types without listing them explicitly
This avoids the need to know all existing object types in advance, by the
cost of loosing the assert that unknown object type is handled in a sane
manner.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30168
Lutz Donnerhacke [Tue, 12 Jan 2021 21:28:29 +0000 (22:28 +0100)]
netgraph/ng_bridge: become SMP aware
The node ng_bridge underwent a lot of changes in the last few months.
All those steps were necessary to distinguish between structure
modifying and read-only data transport paths. Now it's done, the node
can perform frame forwarding on multiple cores in parallel.
Emmanuel Vadot [Mon, 3 May 2021 08:12:26 +0000 (10:12 +0200)]
pkgbase: Move librt to clibs
librt implement the POSIX realtime extension library.
Move it to clibs instead of utilities as a number of ports uses it
so avoid a dependancy on FreeBSD-utilities.
Use the new control message to move ethernet addresses from a link to
a new link in ng_bridge(4). Send this message instead of doing the
work directly requires to move the loop detection into the control
message processing. This will delay the loop detection by a few
frames.
This decouples the read-only activity from the modification under a
more strict writer lock.
Cyril Zhang [Thu, 13 May 2021 12:55:06 +0000 (08:55 -0400)]
sort: Cache value of MB_CUR_MAX
Every usage of MB_CUR_MAX results in a call to __mb_cur_max. This is
inefficient and redundant. Caching the value of MB_CUR_MAX in a global
variable removes these calls and speeds up the runtime of sort. For
numeric sorting, runtime is almost halved in some tests.
PR: 255551
PR: 255840
Reviewed by: markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30170
Mark Johnston [Thu, 13 May 2021 12:33:37 +0000 (08:33 -0400)]
posix timers: Check for overflow when converting to ns
Disallow a time or timer period value when the conversion to nanoseconds
would overflow. Otherwise it is possible to trigger a divison by zero
in realtime_expire_l(), where we compute the number of overruns by
dividing by the timer interval.
Fixes: 7995dae9 ("posix timers: Improve the overrun calculation")
Reported by: syzbot+5ab360bd3d3e3c5a6e0e@syzkaller.appspotmail.com
Reported by: syzbot+157b74ff493140d86eac@syzkaller.appspotmail.com
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30233
Mark Johnston [Thu, 13 May 2021 12:33:23 +0000 (08:33 -0400)]
fork: Suspend other threads if both RFPROC and RFMEM are not set
Otherwise, a multithreaded parent process may trigger races in
vm_forkproc() if one thread calls rfork() with RFMEM set and another
calls rfork() without RFMEM.
Also simplify vm_forkproc() a bit, vmspace_unshare() already checks to
see if the address space is shared.
Randall Stewart [Thu, 13 May 2021 11:36:04 +0000 (07:36 -0400)]
tcp: Incorrect KASSERT causes a panic in rack
Skyzall found an interesting panic in rack. When a SYN and FIN are
both sent together a KASSERT gets tripped where it is validating that
a mbuf pointer is in the sendmap. But a SYN and FIN often will not
have a mbuf pointer. So the fix is two fold a) make sure that the
SYN and FIN split the right way when cloning an RSM SYN on left
edge and FIN on right. And also make sure the KASSERT properly
accounts for the case that we have a SYN or FIN so we don't
panic.
Reviewed by: mtuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D30241
Robert Wing [Wed, 3 Mar 2021 06:05:47 +0000 (21:05 -0900)]
bhyve/snapshot: provide a way to send other messages/data to bhyve
This is a step towards sending messages (other than suspend/checkpoint)
from bhyvectl to bhyve.
Introduce a new struct, ipc_message - this struct stores the type of
message and a union containing message specific structures for the type
of message being sent.
Kristof Provost [Wed, 12 May 2021 17:13:40 +0000 (19:13 +0200)]
tests: Only log critical errors from scapy
Since 2.4.5 scapy started issuing warnings about a few different
configurations during our tests. These are harmless, but they generate
stderr output, which upsets atf_check.
Configure scapy to only log critical errors (and thus not warnings) to
fix these tests.
Mark Johnston [Wed, 12 May 2021 13:39:36 +0000 (09:39 -0400)]
Fix mbuf leaks in various pru_send implementations
The various protocol implementations are not very consistent about
freeing mbufs in error paths. In general, all protocols must free both
"m" and "control" upon an error, except if PRUS_NOTREADY is specified
(this is only implemented by TCP and unix(4) and requires further work
not handled in this diff), in which case "control" still must be freed.
This diff plugs various leaks in the pru_send implementations.
Reviewed by: tuexen
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30151
Mark Johnston [Wed, 12 May 2021 15:49:24 +0000 (11:49 -0400)]
nd6: Avoid using an uninitialized sockaddr in nd6_prefix_offlink()
Commit 81728a538 ("Split rtinit() into multiple functions.") removed
the initialization of sa6, but not one of its uses. This meant that we
were passing an uninitialized sockaddr as the address to
lltable_prefix_free(). Remove the variable outright to fix the problem.
The caller is expected to hold a reference on pr.
Fixes: 81728a538 ("Split rtinit() into multiple functions.")
Reported by: KMSAN
Reviewed by: donner
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30166
Mark Johnston [Wed, 12 May 2021 14:05:37 +0000 (10:05 -0400)]
if: Remove unnecessary validation in the SIOCSIFNAME handler
A successful copyinstr() call guarantees that the returned string is
nul-terminated. Furthermore, the removed check would harmlessly compare
an uninitialized byte with '\0' if the new name is shorter than
IFNAMESIZ - 1.
Reported by: KMSAN
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
This update fixes the initialization of "scale" to 20 if started with
-l and the initial statement leads to an error (e.g. contains a syntax
error). Scale was initialized to 0 in that case.
Another change is the support of job control in interactive mode with
line editing enabled. The control characters have been interpreted as
editing commands only, prior to this version.
Andrew Fengler [Wed, 12 May 2021 01:59:10 +0000 (01:59 +0000)]
Add support for adding default routes for other FIBs
Make rc.d/routing read defaultrouter_fibN and ipv6_defaultrouter_fibN, and
set it as the default gateway for FIB N, where N is from 1 to (net.fibs - 1)
This allows adding gateways for multiple FIBs in the same format as the main
gateway. (FIB 0)
Robert Wing [Thu, 11 Mar 2021 19:27:43 +0000 (10:27 -0900)]
libvmm: explicitly save and restore errno in vm_open()
In commit 6bb140e3ca895a14, vm_destroy() was replaced with free() to
preserve errno. However, it's possible that free() may change the errno
as well. Keep the free() call, but explicitly save and restore errno.
Kirk McKusick [Tue, 11 May 2021 21:51:06 +0000 (14:51 -0700)]
Ensure that files with no allocated blocks are trimmed to zero length.
UFS does not allow files to end with a hole; it requires that the
last block of a file be allocated. As fsck_ffs(8) initially scans
each allocated inode, it tracks the last allocated block in the
inode. It then checks that the inode's size falls in the last
allocated block. If the last allocated block falls before the size,
a `file size beyond end of allocated file' warning is issued and
the file is shortened to reference the last allocated block (to avoid
having it reference a hole at its end). If the last allocated block
falls after the size, a `partially truncated file' warning is issued
and all blocks following the block referenced by the size are freed.
Because of an incorrect unsigned comparison, this test was failing
to handle files with no allocated blocks but non-zero size (which
should have had their size reset to zero). Once that was fixed the
test started incorrectly complaining about short symbolic links
that place the link path in the inode rather than in a disk block.
Because these symbolic links have a non-zero size, but no allocated
blocks, fsck_ffs wanted to zero out their size. This patch has to
detect and avoid changing the size of such symbolic links.
Mark Johnston [Tue, 11 May 2021 21:36:12 +0000 (17:36 -0400)]
cryptodev: Fix some input validation bugs
- When we do not have a separate IV, make sure that the IV length
specified by the session is not larger than the payload size.
- Disallow AEAD requests without a separate IV. crp_sanity() asserts
that CRYPTO_F_IV_SEPARATE is set for AEAD requests, and some (but not
all) drivers require it.
- Return EINVAL for AEAD requests if an IV is specified but the
transform does not expect one.
Roger Pau Monné [Tue, 11 May 2021 10:19:29 +0000 (12:19 +0200)]
xen/blkback: fix reconnection of backend
The hotplug script will be executed only once for each backend,
regardless of the frontend triggering reconnections. Fix blkback to
deal with the hotplug script being executed only once, so that
reconnections don't stall waiting for a hotplug script execution
that will never happen.
As a result of the fix move the initialization of dev_mode, dev_type
and dev_name to the watch callback, as they should be set only once
the first time the backend connects.
This fix is specially relevant for guests wanting to use UEFI OVMF
firmware, because OVMF will use Xen PV block devices and disconnect
afterwards, thus allowing them to be used by the guest OS. Without
this change the guest OS will stall waiting for the block backed to
attach.
Fixes: de0bad00010c ('blkback: add support for hotplug scripts')
MFC after: 1 week
Sponsored by: Citrix Systems R&D
Randall Stewart [Tue, 11 May 2021 12:15:05 +0000 (08:15 -0400)]
tcp: In rack, we must only convert restored rtt when the hostcache does restore them.
Rack now after the previous commit is very careful to translate any
value in the hostcache for srtt/rttvar into its proper format. However
there is a snafu here in that if tp->srtt is 0 is the only time that
the HC will actually restore the srtt. We need to then only convert
the srtt restored when it is actually restored. We do this by making
sure it was zero before the call to cc_conn_init and it is non-zero
afterwards.
Wojciech Macek [Fri, 23 Apr 2021 03:57:03 +0000 (05:57 +0200)]
mroute: fix race condition during mrouter shutting down
There is a race condition between V_ip_mrouter de-init
and ip_mforward handling. It might happen that mrouted
is cleaned up after V_ip_mrouter check and before
processing packet in ip_mforward.
Use epoch call aproach, similar to IPSec which also handles
such case.
There are two module declarations in the nfscl.ko module for "nfscl"
and "nfs". Both of these declarations had MODULE_DEPEND() calls.
This patch deletes the MODULE_DEPEND() calls for "nfs" to avoid
confusion with respect to what modules this module is dependent upon.
The patch also adds comments explaining why there are two module
declarations within the module.
Mark Johnston [Tue, 11 May 2021 00:18:00 +0000 (20:18 -0400)]
vfs: Fix error handling in vn_fullpath_hardlink()
vn_fullpath_any_smr() will return a positive error number if the
caller-supplied buffer isn't big enough. In this case the error must be
propagated up, otherwise we may copy out uninitialized bytes.
Reported by: syzkaller+KMSAN
Reviewed by: mjg, kib
MFC aftr: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30198
rtld: preserve the 'seen' state of the dlerror message in errmsg_save()
rtld preserves its current error message around calls to user init/fini
lists, to not override original error with potential secondary errors
caused by user code recursing into rtld. After 4d9128da54f8f8e2a29190,
the preservation of the string itself is not enough, the 'seen'
indicator must be preserved as well. Otherwise, since new code does not
clear string (it cannot), call to _rtld_error() from errmsg_restore()
revived whatever message was consumed last.
Change errmsg_save() to return structure recording both 'seen' indicator
and the message, if any.
PR: 255698
Reported by: Eugene M. Kim <astralblue@gmail.com>
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
It reopens the passed file descriptor, checking the file backing vnode'
current access rights against open mode. In particular, this flag allows
to convert file descriptor opened with O_PATH, into operable file
descriptor, assuming permissions allow that.
Reviewed by: markj
Tested by: Andrew Walker <awalker@ixsystems.com>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30148
Randall Stewart [Mon, 10 May 2021 15:25:51 +0000 (11:25 -0400)]
tcp:Host cache and rack ending up with incorrect values.
The hostcache up to now as been updated in the discard callback
but without checking if we are all done (the race where there are
more than one calls and the counter has not yet reached zero). This
means that when the race occurs, we end up calling the hc_upate
more than once. Also alternate stacks can keep there srtt/rttvar
in different formats (example rack keeps its values in microseconds).
Since we call the hc_update *before* the stack fini() then the
values will be in the wrong format.
Rack on the other hand, needs to convert items pulled from the
hostcache into its internal format else it may end up with
very much incorrect values from the hostcache. In the process
lets commonize the update mechanism for srtt/rttvar since we
now have more than one place that needs to call it.
Kristof Provost [Tue, 4 May 2021 17:23:15 +0000 (19:23 +0200)]
in6_mcast: Return EADDRINUSE when we've already joined the group
Distinguish between truly invalid requests and those that fail because
we've already joined the group. Both cases fail, but differentiating
them allows userspace to make more informed decisions about what the
error means.
For example. radvd tries to join the all-routers group on every SIGHUP.
This fails, because it's already joined it, but this failure should be
ignored (rather than treated as a sign that the interface's multicast is
broken).
This puts us in line with OpenBSD, NetBSD and Linux.
sbin/ipfw: Fix parsing error in table based forward
The argument parser does not recognise the optional port for an
"tablearg" argument. Fix simplifies the code by make the internal
representation expicit for the parser.
Rick Macklem [Sat, 8 May 2021 00:30:56 +0000 (17:30 -0700)]
nfscl: Add support for va_birthtime to NFSv4
There is a NFSv4 file attribute called TimeCreate
that can be used for va_birthtime.
r362175 added some support for use of TimeCreate.
This patch completes support of va_birthtime by adding
support for setting this attribute to the server.
It also eanbles the client to
acquire and set the attribute for a NFSv4
server that supports the attribute.
Randall Stewart [Fri, 7 May 2021 21:32:32 +0000 (17:32 -0400)]
This takes Warners suggested approach to making it so that
platforms that for whatever reason cannot include the RATELIMIT option
can still work with rack. It adds two dummy functions that rack will
call and find out that the highest hw supported b/w is 0 (which
kinda makes sense and rack is already prepared to handle).
Reviewed by: Michael Tuexen, Warner Losh
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D30163
Fix panic when trying to delete non-existent gateway in multipath route.
IF non-existend gateway was specified, the code responsible for calculating
an updated nexthop group, returned the same already-used nexthop group.
After the route table update, the operation result contained the same
old & new nexthop groups. Thus, the code responsible for decomposing
the notification to the list of simple nexthop-level notifications,
was not able to find any differences. As a result, it hasn't updated any
of the "simple" notification fields, resulting in empty rtentry pointer.
This empty pointer was the direct reason of a panic.
Fix the problem by returning ESRCH when the new nexthop group is the same
as the old one after applying gateway filter.
Reported by: Michael <michael.adm at gmail.com>
PR: 255665
MFC after: 3 days