pstef [Thu, 7 May 2020 16:56:18 +0000 (16:56 +0000)]
ps: extend the non-standard option -d (tree view) to work with -p
Initially it seemed that there were multiple possible ways to do it.
Processing option -p could conditionally add selected processes and
their descendants to the list for further work, but it is not guaranteed
to know whether the -d option has been used or not, and it also doesn't
have access to the process list just yet.
There is also descendant_sort() which has access to all possibly needed
information, but serves the purely post-processing purpose of sorting
output.
Then there is the loop that uses invocation information and full process
list to create a list of processes for final display. It seems the most
natural place to implement this, but indeterminate state of the process
list and volatility of the final list that is being created obstruct
adding an elegant search for all elements of process descendancy trees.
So I opted for adding another loop, just before the one I mentioned
above. For all selected processes it conditionally adds direct
descendants to the end of this list of selected processes.
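A minimal sketch of the extra loop described above, assuming a simplified process record. The names struct proc_info, pid_selected() and add_descendants() are hypothetical and are not taken from ps(1). Because new entries are appended to the same list the loop walks, adding only direct descendants is enough to pull in whole subtrees.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, simplified process record; not the structure ps(1) uses. */
struct proc_info {
	pid_t pid;
	pid_t ppid;
};

/* Return 1 if pid is already present in the selected list. */
static int
pid_selected(const pid_t *sel, size_t nsel, pid_t pid)
{
	for (size_t i = 0; i < nsel; i++)
		if (sel[i] == pid)
			return (1);
	return (0);
}

/*
 * For every selected pid, append its direct children to the end of the
 * selected list.  Appended entries are visited by the same outer loop,
 * so their children are added in turn.  Returns the new list length;
 * stops early if the list is full.
 */
static size_t
add_descendants(const struct proc_info *all, size_t nall,
    pid_t *sel, size_t nsel, size_t cap)
{
	assert(nsel <= cap);
	for (size_t i = 0; i < nsel; i++) {
		for (size_t j = 0; j < nall; j++) {
			if (all[j].ppid != sel[i] ||
			    pid_selected(sel, nsel, all[j].pid))
				continue;
			if (nsel == cap)
				return (nsel);
			sel[nsel++] = all[j].pid;
		}
	}
	return (nsel);
}
```

With processes 1 -> 10 -> 11 -> 12 and 1 -> 20, selecting pid 10 yields the list 10, 11, 12; pid 20 is left out because it is not a descendant of 10.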
avg [Thu, 7 May 2020 13:11:32 +0000 (13:11 +0000)]
gpioiic_attach: fix a NULL pointer crash on hints-based systems
The attach method uses GPIO_GET_BUS() to get a "newbus" device
that provides a pin. But on hints-based systems a GPIO controller
driver might not be fully initialized yet and does not yet know about the
gpiobus hanging off it. Thus, GPIO_GET_BUS() cannot be called yet.
The reason is that controller drivers typically create a child gpiobus
using gpiobus_attach_bus() and that leads to the following call chain:
gpiobus_attach_bus() -> gpiobus_attach() ->
bus_generic_attach(gpiobus) -> gpioiic_attach().
So, gpioiic_attach() is called before gpiobus_attach_bus() returns.
I observed this bug with nctgpio driver on amd64.
I think that the problem was introduced in r355276.
The fix is to avoid calling GPIO_GET_BUS() from the attach method.
Instead, we know that on hints-based systems only the parent gpiobus can
provide the pins.
Nothing is changed for FDT-based systems.
When the system is low on memory, allocating mbuf jumbo clusters (such as
9KB or 16KB) can take a long time and is not guaranteed to succeed. The
fallback will work in that situation, but if many descriptors have to be
refilled at once, the time spent in m_getjcl looking for memory can make
the system unresponsive because of the high priority of the Rx task. This
can also lead to a driver reset, because the Tx cleanup routine is
blocked and the timer service can detect that Tx packets aren't being
cleaned up. The reset routine can cause further unresponsiveness: the Rx
rings are refilled there, so m_getjcl will again burn the CPU.
This was causing NVMe driver timeouts and resets, because the network
driver has higher priority.
Instead of 16KB jumbo clusters for the Rx buffers, 9KB clusters are
enough - ENA MTU is being set to 9K anyway, so it's very unlikely that
more space than 9KB will be needed.
However, 9KB jumbo clusters can still cause issues, so by default
page-size mbuf clusters will be used for the Rx descriptors. This can have a
small (~2%) impact on the throughput of the device, so to restore the
original behavior, one must set sysctl "hw.ena.enable_9k_mbufs" to
"1" in the "/boot/loader.conf" file.
As a part of this patch (important fix), the version of the driver
was updated to v2.1.2.
rrs [Thu, 7 May 2020 10:46:02 +0000 (10:46 +0000)]
NF has an internal option that changes the tcp_mcopy_m routine slightly (has
a few extra arguments). Recently that changed to have only one extra arg, so
the two ifdefs around the call are no longer needed. Let's take out the
extra ifdef and arg.
jrtc27 [Wed, 6 May 2020 23:31:30 +0000 (23:31 +0000)]
virtio: Support MMIO bus for all devices
The bus is independent of the device, so all devices can be attached to
either a PCI bus or an MMIO bus. For example, QEMU's virtio-rng-device
gives the MMIO variant of virtio-rng-pci, and is now detected.
jrtc27 [Wed, 6 May 2020 23:28:51 +0000 (23:28 +0000)]
virtio_mmio: Support non-transitional version 2 devices
The non-legacy virtio MMIO specification drops the use of PFNs and
replaces them with physical addresses. Whilst many implementations are
so-called transitional devices, also implementing the legacy
specification, TinyEMU[1] does not. Device-specific configuration
registers have also changed to being little-endian, and must be accessed
using a single aligned access for registers up to 32 bits, and two
32-bit aligned accesses for 64-bit registers.
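The access rule for 64-bit configuration registers can be sketched as follows, assuming the register space is modeled as a plain byte buffer. The helpers cfg_read32() and cfg_read64() are hypothetical stand-ins for real bus accesses and are not the driver's functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one little-endian 32-bit value at a 4-byte aligned offset.
 * Stands in for a single aligned 32-bit register access.
 */
static uint32_t
cfg_read32(const uint8_t *cfg, size_t off)
{
	assert(off % 4 == 0);
	return ((uint32_t)cfg[off] |
	    (uint32_t)cfg[off + 1] << 8 |
	    (uint32_t)cfg[off + 2] << 16 |
	    (uint32_t)cfg[off + 3] << 24);
}

/* Read a 64-bit register as two 32-bit aligned accesses, low word first. */
static uint64_t
cfg_read64(const uint8_t *cfg, size_t off)
{
	uint32_t lo = cfg_read32(cfg, off);
	uint32_t hi = cfg_read32(cfg, off + 4);

	return ((uint64_t)hi << 32 | lo);
}
```

Decoding byte by byte keeps the result independent of host endianness, which is the point of the registers being defined as little-endian.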
jhb [Wed, 6 May 2020 22:15:09 +0000 (22:15 +0000)]
Deprecate ubsec(4) for FreeBSD 13.0.
With the removal of in-tree consumers of DES, Triple DES, and
MD5-HMAC, the only algorithm this driver still supports is SHA1-HMAC.
This is not very useful as a standalone algorithm (IPsec AH-only with
SHA1 would be the only user).
This driver has also not been kept up to date with the original driver
in OpenBSD which supports a few more cards and AES-CBC on newer cards.
The newest card currently supported by this driver was released in
2005.
dim [Wed, 6 May 2020 19:10:39 +0000 (19:10 +0000)]
Merge commit 4ca2cad94 from llvm git (by Justin Hibbits):
[PowerPC] Add clang -msvr4-struct-return for 32-bit ELF
Summary:
Change the default ABI to be compatible with GCC. For 32-bit ELF
targets other than Linux, Clang now returns small structs in
registers r3/r4. This affects FreeBSD, NetBSD, OpenBSD. There is no
change for 32-bit Linux, where Clang continues to return all structs
in memory.
Add clang options -maix-struct-return (to return structs in memory)
and -msvr4-struct-return (to return structs in registers) to be
compatible with gcc. These options are only for PPC32; reject them on
PPC64 and other targets. The options are like -fpcc-struct-return and
-freg-struct-return for X86_32, and use similar code.
To actually return a struct in registers, coerce it to an integer of
the same size. LLVM may optimize the code to remove unnecessary
accesses to memory, and will return i32 in r3 or i64 in r3:r4.
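As a source-level analogy only (the compiler performs this on its intermediate representation, not in C), coercing a small struct to an integer of the same size might look like the sketch below. struct pair and both helpers are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte struct, small enough for the r3:r4 register pair. */
struct pair {
	int32_t first;
	int32_t second;
};

/* Coerce the struct to an integer of the same size (the i64 case). */
static uint64_t
pair_to_bits(struct pair p)
{
	uint64_t bits;

	assert(sizeof(p) == sizeof(bits));
	memcpy(&bits, &p, sizeof(bits));
	return (bits);
}

/* Recover the struct from the integer on the receiving side. */
static struct pair
bits_to_pair(uint64_t bits)
{
	struct pair p;

	memcpy(&p, &bits, sizeof(p));
	return (p);
}
```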
emaste [Wed, 6 May 2020 18:38:40 +0000 (18:38 +0000)]
binutils: disconnect objdump from the build
The in-tree binutils is old and will not be updated. It does not support
all archs supported by FreeBSD, and for the archs it does support not all
CPU features are supported.
Other tools have migrated to copyfree alternatives. Although llvm-objdump
is nearly a drop-in replacement for GNU objdump it is missing a few options
and has some differences in output format. For now just remove GNU objdump;
ports and developers can use a contemporary, maintained version from ports
or packages. We can revisit installing llvm-objdump as objdump in the
future.
PR: 212319 [exp-run]
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D7338
dim [Wed, 6 May 2020 18:13:00 +0000 (18:13 +0000)]
In r358396 I merged llvm upstream commit 2e24219d3, which fixed "error:
unsupported relocation on symbol" when assembling arm 'adr' pseudo
instructions. However, the upstream commit did not take big-endian arm
into account.
Applying the same changes to the big-endian handling is straightforward,
thanks to Andrew Turner and Peter Smith for the hint. This will also be
submitted upstream.
MFC after: immediately, since this fix is meant for stable/11
markj [Wed, 6 May 2020 15:01:06 +0000 (15:01 +0000)]
Simplify arm64's pmap_bootstrap() a bit.
locore constructs an L2 page mapping the kernel and preloaded data
starting at KERNBASE (the same as VM_MIN_KERNEL_ADDRESS on arm64).
initarm() and pmap_bootstrap() use the preloaded metadata to
tell it where it can start allocating from.
pmap_bootstrap() currently iterates over the L2 page to find the last
valid entry, but doesn't do anything with the result. Remove the loop
and zap some now-unused local variables.
rmacklem [Wed, 6 May 2020 00:44:03 +0000 (00:44 +0000)]
Delete unused function newnfs_trimleading.
The NFS function called newnfs_trimleading() has not been used by the
code in a long time. To give you a clue, it still had a K&R style function
declaration.
Delete it, since it is just cruft, as a part of the NFS mbuf handling
cleanup in preparation for adding ext_pgs mbuf support.
The ext_pgs mbuf support for the build/send side is needed by
nfs-over-tls.
cem [Tue, 5 May 2020 17:55:45 +0000 (17:55 +0000)]
pwcache.3: Explicitly document OOM condition
The pwcache functions allocate memory, and may return NULL pointers if that
allocation fails and the corresponding uid or gid was not found in the local
password database. Document this behavior.
tuexen [Tue, 5 May 2020 17:52:44 +0000 (17:52 +0000)]
Fix the computation of the number of entries of the mapping array to
look at when generating a SACK. This was wrong when sequence
numbers wrap around.
Thanks to Gwenael FOURRE for reporting the issue for the userland stack:
https://github.com/sctplab/usrsctp/issues/462
MFC after: 3 days
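For illustration only, and not the code changed by this commit: counting entries between two 32-bit sequence numbers has to use modular arithmetic, otherwise the result is wrong once the numbers wrap. A minimal sketch, with tsn_gt() and tsn_span() as hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Serial-number comparison: true if a comes after b modulo 2^32.
 * Relies on the usual two's complement conversion to int32_t.
 */
static int
tsn_gt(uint32_t a, uint32_t b)
{
	return ((int32_t)(a - b) > 0);
}

/*
 * Number of sequence numbers in [base, highest], also when the range
 * crosses the 2^32 wrap.  Returns 0 if highest is before base.
 */
static uint32_t
tsn_span(uint32_t base, uint32_t highest)
{
	uint32_t diff;

	if (tsn_gt(base, highest))
		return (0);
	diff = (uint32_t)(highest - base);
	assert(diff != UINT32_MAX);	/* diff + 1 would overflow to 0 */
	return (diff + 1);
}
```

A naive "highest - base + 1" on wider integers would go negative for base 0xfffffffe and highest 1, whereas the modular form gives the four entries that are actually in the range.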
avg [Tue, 5 May 2020 12:14:11 +0000 (12:14 +0000)]
acpi_video: try our best to work on systems without non-essential methods
Only _BCL and _BCM methods seem to be essential to the driver's
operation. If _BQC is missing then we can assume that the current
brightness is whatever we set by the last _BCM invocation. If _DCS or
_DGS is missing then we can make assumptions as well.
The change is based on a patch suggested by Anthony Jenkins
<Scoobi_doo@yahoo.com> in PR 207086.
PR: 207086
Submitted by: Anthony Jenkins <Scoobi_doo@yahoo.com> (earlier version)
Reviewed by: manu
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D24653
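The _BQC fallback described above can be sketched roughly as follows. struct output and its fields are a hypothetical model, not the driver's data structures; fw_level stands in for whatever the firmware would report through _BQC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one display output. */
struct output {
	bool has_bqc;	/* firmware provides the optional _BQC method */
	int fw_level;	/* level the simulated firmware would report */
	int last_set;	/* level passed to the last _BCM call, -1 if none */
};

/* Stand-in for invoking _BCM: apply the level and remember it. */
static void
set_brightness(struct output *o, int level)
{
	assert(o != NULL);
	o->fw_level = level;
	o->last_set = level;
}

/* Ask the firmware if possible, else assume the last value we set. */
static int
get_brightness(const struct output *o)
{
	assert(o != NULL);
	if (o->has_bqc)
		return (o->fw_level);
	return (o->last_set);
}
```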
rmacklem [Tue, 5 May 2020 00:58:03 +0000 (00:58 +0000)]
Revert r360514, to avoid unnecessary churn of the sources.
r360514 prepared the NFS code for changes to handle ext_pgs mbufs on
the receive side. However, at this time, KERN_TLS does not pass
ext_pgs mbufs up through soreceive(). As such, at this time, only
the send/build side of the NFS mbuf code needs to handle ext_pgs mbufs.
Revert r360514 since the rather extensive changes required for receive
side ext_pgs mbufs are not yet needed.
This avoids unnecessary churn of the sources.
jhb [Tue, 5 May 2020 00:02:04 +0000 (00:02 +0000)]
Initial support for bhyve save and restore.
Save and restore (also known as suspend and resume) permits a snapshot
to be taken of a guest's state that can later be resumed. In the
current implementation, bhyve(8) creates a UNIX domain socket that is
used by bhyvectl(8) to send a request to save a snapshot (and
optionally exit after the snapshot has been taken). A snapshot
currently consists of two files: the first holds a copy of guest RAM,
and the second file holds other guest state such as vCPU register
values and device model state.
To resume a guest, bhyve(8) must be started with a matching pair of
command line arguments to instantiate the same set of device models as
well as a pointer to the saved snapshot.
While the current implementation is useful for several use cases, it
has a few limitations. The file format for saving the guest state is
tied to the ABI of internal bhyve structures and is not
self-describing (in that it does not communicate the set of device
models present in the system). In addition, the state saved for some
device models closely matches the internal data structures which might
prove a challenge for compatibility of snapshot files across a range
of bhyve versions. The file format also does not currently support
versioning of individual chunks of state. As a result, the current
file format is not a fixed binary format and future revisions to save
and restore will break binary compatibility of snapshot files. The
goal is to move to a more flexible format that adds versioning,
etc. and at that point to commit to providing a reasonable level of
compatibility. As a result, the current implementation is not enabled
by default. It can be enabled via the WITH_BHYVE_SNAPSHOT=yes option
for userland builds, and the kernel option BHYVE_SNAPSHOT.
Submitted by: Mihai Tiganus, Flavius Anton, Darius Mihai
Submitted by: Elena Mihailescu, Mihai Carabas, Sergiu Weisz
Relnotes: yes
Sponsored by: University Politehnica of Bucharest
Sponsored by: Matthew Grooms (student scholarships)
Sponsored by: iXsystems
Differential Revision: https://reviews.freebsd.org/D19495
rrs [Mon, 4 May 2020 23:02:58 +0000 (23:02 +0000)]
This fixes two issues found by ankitraheja09@gmail.com
1) When BBR retransmits the syn it was messing up the snd_max
2) When we need to send a RST we might not send it when we should
tuexen [Mon, 4 May 2020 22:02:49 +0000 (22:02 +0000)]
Enter the net epoch before calling the output routine in TCP BBR.
This was only triggered when setting the IPPROTO_TCP level socket
option TCP_DELACK.
This issue was found by running an instance of SYZKALLER.
Reviewed by: rrs
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D24690
rrs [Mon, 4 May 2020 20:28:53 +0000 (20:28 +0000)]
This commit brings things into sync with the advancements that
have been made in rack and adds a few fixes in BBR. This also
removes any possibility of incorrectly handling OOB data, since the
stacks do not support it. Should fix the syzkaller crashes seen in the
past. Still to fix is the BBR issue just reported this weekend
with the SYN and on sending a RST. Note that this version of
rack can now do pacing as well.
rrs [Mon, 4 May 2020 20:19:57 +0000 (20:19 +0000)]
Adjust the fb to have a way to ask the underlying stack
if it can support the PRUS option (OOB). And then have
the new function call that to validate and give the
correct error response if needed to the user (rack
and bbr do not support obsoleted OOB data).
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D24574
brooks [Mon, 4 May 2020 17:16:30 +0000 (17:16 +0000)]
Set LG_VADDR to 48 on RISC-V.
The Sv48 PTE format is the largest currently defined address space for
RISC-V. It makes no sense to define a larger size and doing so (at
least for 64-bits) forces rtrees down a slow path.
After converting routing subsystem customers to use nexthop objects
defined in r359823, some fields in struct rtentry became unused.
This commit removes rt_ifp, rt_ifa, rt_gateway and rt_mtu from struct rtentry
along with the code initializing and updating these fields.
Cleanup of the remaining fields will be addressed by D24669.
This commit also changes the implementation of the RTM_CHANGE handling.
The old implementation tried to perform the whole operation under the radix
WLOCK, resulting in slow performance and hacks like using the
RTF_RNH_LOCKED flag. The new implementation looks up the route nexthop under
the radix RLOCK, creates a new nexthop and tries to update the rte nhop
pointer. Only the last part is done under WLOCK.
In the hypothetical scenarios where multiple rtsock clients
repeatedly issue RTM_CHANGE requests for the same route, the route may get
updated between the read and update operations. This is addressed by retrying
the operation multiple (3) times before returning failure back to the
caller.
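The read, build, then conditionally update pattern with a bounded retry can be sketched as below. This is an analogy, not the routing code: a C11 compare-and-swap stands in for the pointer update done under WLOCK, struct nhop and struct route are hypothetical, and the EAGAIN return value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define	CHANGE_RETRIES	3	/* retry count mentioned in the text above */

/* Hypothetical, heavily simplified nexthop and route entry. */
struct nhop {
	int mtu;
};

struct route {
	_Atomic(struct nhop *) nh;
};

/*
 * Read the current nexthop, build a modified copy in scratch, then swap
 * the pointer only if the route still refers to the nexthop that was
 * read.  If someone else changed the route in between, start over.
 */
static int
route_change_mtu(struct route *rt, struct nhop *scratch, int mtu)
{
	assert(rt != NULL && scratch != NULL);
	for (int i = 0; i < CHANGE_RETRIES; i++) {
		struct nhop *old = atomic_load(&rt->nh);	/* read phase */

		*scratch = *old;				/* new nexthop */
		scratch->mtu = mtu;
		if (atomic_compare_exchange_strong(&rt->nh, &old, scratch))
			return (0);				/* update phase */
	}
	return (EAGAIN);	/* assumed error code for "gave up" */
}
```

Keeping the expensive work (building the new nexthop) outside the exclusive section is what makes the short final update cheap; the retry bound keeps a busy route from looping forever.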
wulf [Mon, 4 May 2020 10:59:17 +0000 (10:59 +0000)]
[evdev] Add AT translated set1 scancodes for F-unlocked F1-12 keys.
"F lock" is a switch between two sets of scancodes for function keys F1-F12
found on some Logitech and Microsoft PS/2 keyboards [1]. When "F lock" is
pressed, F1-F12 act as function keys and produce the usual keyscans for
these keys. When "F lock" is released, F1-F12 produce the same keyscans
but prefixed with E0.
Some laptops use [2] E0-prefixed F1-F12 scancodes for non-standard keys.
mav [Sun, 3 May 2020 16:14:55 +0000 (16:14 +0000)]
Add session locking in cfiscsi_ioctl_handoff().
While there, remove ifdef around cs_target check in cfiscsi_ioctl_list().
I am not sure why this ifdef was added, but without this check the code will
crash below on a NULL dereference.
bcr [Sun, 3 May 2020 10:15:58 +0000 (10:15 +0000)]
Fix various, mostly minor errors in man pages like:
- Abbreviated month name in .Dd
- position of HISTORY section
- alphabetical ordering within SEE ALSO section
- adding .Ed before .Sh DESCRIPTION
- remove trailing whitespaces
- Line break after a sentence stop
- Use BSD OS macros instead of hardcoded strings
No .Dd bumps as there was no actual content change made
in any of these pages.
imp [Sun, 3 May 2020 04:22:27 +0000 (04:22 +0000)]
We need to hold the periph lock when we release the ccb (and when we
run it). Make sure that we do. Simplify the flow a bit, and fix a
comment since we do need to do these things.
glebius [Sun, 3 May 2020 00:37:16 +0000 (00:37 +0000)]
Step 4.2: start divorce of M_EXT and M_EXTPG
They have more differences than similarities. For now there is lots
of code that would check for M_EXT only and work correctly on M_EXTPG
buffers, so still carry M_EXT bit together with M_EXTPG. However,
prepare some code for explicit check for M_EXTPG.
glebius [Sun, 3 May 2020 00:12:56 +0000 (00:12 +0000)]
Step 3: anonymize struct mbuf_ext_pgs and move all its fields into mbuf
within m_epg namespace.
All edits except the 'struct mbuf' declaration and mb_dupcl() were done
mechanically with sed:
glebius [Sat, 2 May 2020 23:46:29 +0000 (23:46 +0000)]
Step 2.2:
o Shrink sglist(9) functions to work with multipage mbufs down from
four functions to two.
o Don't use 'struct mbuf_ext_pgs *' as argument, use struct mbuf.
o Rename to something matching _epg.
glebius [Sat, 2 May 2020 22:44:23 +0000 (22:44 +0000)]
In mb_unmapped_compress() we don't need an mbuf structure to keep data,
but we need a buffer of MLEN bytes. This isn't just a simplification,
but an important fixup, because the previous commit shrank sizeof(struct
mbuf) down below MSIZE, and instantiating an mbuf on the stack no longer
provides enough space.
glebius [Sat, 2 May 2020 22:39:26 +0000 (22:39 +0000)]
Continuation of multi page mbuf redesign from r359919.
The following series of patches addresses three things:
Now that the array of pages is embedded into the mbuf, we no longer need a
separate structure to pass around, so struct mbuf_ext_pgs is an
artifact of the first implementation. And struct mbuf_ext_pgs_data
is a crutch to accommodate the main idea of r359919 with minimal churn.
Also, M_EXT of type EXT_PGS is just a synonym of M_NOMAP.
The namespace for the new feature is somewhat inconsistent and
sometimes has lengthy prefixes. In these patches we will
gradually bring the namespace to the "m_epg" prefix for all mbuf
fields and most functions.
Step 1 of 4:
o Anonymize mbuf_ext_pgs_data, embed in m_ext
o Embed mbuf_ext_pgs
o Start documenting all this entanglement
asomers [Sat, 2 May 2020 20:14:59 +0000 (20:14 +0000)]
Resolve conflict between the fusefs(5) and mac_bsdextended(4) tests
mac_bsdextended(4), when enabled, causes ordinary operations to send many
more VOP_GETATTRs to the file system. The fusefs tests' expectations aren't
written with those in mind. Optionally expecting them would greatly
obfuscate the fusefs tests. Worse, certain fusefs functionality (like
attribute caching) would be impossible to test if the tests couldn't expect
an exact number of GETATTR operations.
This commit resolves that conflict by making two changes:
1. The fusefs tests will now check for mac_bsdextended, and skip if it's
enabled.
2. The mac_bsdextended tests will now check whether the module is enabled, not
merely loaded. If it's loaded but disabled, the tests will automatically
enable it for the duration of the tests.
With these changes, a CI system can achieve best coverage by loading both
fusefs and mac_bsdextended at boot, and setting
security.mac.bsdextended.enabled=0
mav [Sat, 2 May 2020 16:54:59 +0000 (16:54 +0000)]
Cleanup LUN addition/removal.
- Make ctl_add_lun() synchronous. Asynchronous addition was used by
Copan's proprietary code long ago and never for upstream FreeBSD.
- Move LUN enable/disable calls from backends to CTL core.
- Serialize LUN modification and, partially, removal to avoid double frees.
- Slightly unify backends code.
jhb [Sat, 2 May 2020 01:00:29 +0000 (01:00 +0000)]
Don't pass bogus keys down for NULL algorithms.
The changes in r359374 added various sanity checks in sessions and
requests created by crypto consumers in part to permit backend drivers
to make assumptions instead of duplicating checks for various edge
cases. One of the new checks was to reject sessions which provide a
pointer to a key while claiming the key is zero bits long.
IPsec ESP tripped over this as it passes along whatever key is
provided for NULL, including a pointer to a zero-length key when an
empty string ("") is used with setkey(8). One option would be to
teach the IPsec key layer to not allocate keys of zero length, but I
went with a simpler fix of just not passing any keys down and always
using a key length of zero for NULL algorithms.
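A rough sketch of the two sides described above, under the assumption of a simplified parameter block. struct session_params, set_key() and params_valid() are hypothetical names and do not reflect the crypto framework's real structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical session parameters. */
struct session_params {
	bool null_alg;		/* the NULL algorithm was requested */
	const uint8_t *key;
	size_t klen_bits;
};

/* Consumer side: never pass a key down for the NULL algorithm. */
static void
set_key(struct session_params *sp, bool null_alg, const uint8_t *key,
    size_t klen_bits)
{
	assert(sp != NULL);
	sp->null_alg = null_alg;
	if (null_alg) {
		sp->key = NULL;
		sp->klen_bits = 0;
	} else {
		sp->key = key;
		sp->klen_bits = klen_bits;
	}
}

/* Framework side: reject a key pointer paired with a zero-bit length. */
static bool
params_valid(const struct session_params *sp)
{
	assert(sp != NULL);
	return (!(sp->key != NULL && sp->klen_bits == 0));
}
```

An empty-string key produces a non-NULL pointer with zero length, which is exactly the combination the check rejects; clearing both fields for NULL algorithms sidesteps it.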
jhb [Sat, 2 May 2020 00:06:58 +0000 (00:06 +0000)]
Remove support for IPsec algorithms deprecated in r348205 and r360202.
Examples of deprecated algorithms in manual pages and sample configs
are updated where relevant. I removed the one example of combining
ESP and AH (vs using a cipher and auth in ESP) as RFC 8221 says this
combination is NOT RECOMMENDED.
Specifically, this removes support for the following ciphers:
- des-cbc
- 3des-cbc
- blowfish-cbc
- cast128-cbc
- des-deriv
- des-32iv
- camellia-cbc
This also removes support for the following authentication algorithms:
- hmac-md5
- keyed-md5
- keyed-sha1
- hmac-ripemd160
mhorne [Fri, 1 May 2020 21:58:19 +0000 (21:58 +0000)]
Use the HSM SBI extension to start APs
The addition of the HSM SBI extension to OpenSBI introduces a new
breaking change: secondary harts will remain parked in the firmware,
until they are brought up explicitly via sbi_hsm_hart_start(). Add
the call to do this, sending the secondary harts to mpentry.
If the HSM extension is not present, secondary harts are assumed to be
released by the firmware, as is the case for OpenSBI <= v0.6 and BBL.
In the case that the HSM call fails we exclude the CPU, notify the
user, and allow the system to proceed with booting.
mhorne [Fri, 1 May 2020 21:52:29 +0000 (21:52 +0000)]
Make mpentry independent of _start
APs enter the kernel at the same point as the BSP, the _start routine.
They then jump to mpentry, but not before storing the kernel's physical
load address in the s9 register. Extract this calculation into its own
routine, so that APs can be instructed to enter directly from mpentry.
imp [Fri, 1 May 2020 21:24:19 +0000 (21:24 +0000)]
Add KASSERT to ensure sane nsid.
All callers are currently filtering bad nsid to this function,
however, we'll have undefined behavior if that's not true. Add the
KASSERT to prevent that.
imp [Fri, 1 May 2020 20:29:46 +0000 (20:29 +0000)]
Various improvements to this man page:
o Be consistent about device-id and namespace-id
o Use consistent arg markup for these
o document you can use disk names too
o document nsid command better
o document the identify command
o add a couple of examples.
imp [Fri, 1 May 2020 17:50:21 +0000 (17:50 +0000)]
Remove more stray sparc64 ifdefs.
Also, dmabuf appears to only be set for sparc64 case, but there was a
comment at its only use that says it was broken for some apple
adapters. #ifdef all of that out now that nothing sets it.
imp [Fri, 1 May 2020 17:16:57 +0000 (17:16 +0000)]
When we have an invalid build option, don't rm -rf the current
directory.
Add a quick sanity check to objdir before using it. It must start
with /. If there was a make error getting it, report that and continue
with the next target. If there was anything else, bail out.
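The core of the sanity check, transliterated into C purely for illustration (the actual change lives in the build scripts, and objdir_is_sane() is a hypothetical name): only an absolute path is accepted before anything is removed. The distinction between a make error and any other bad value is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Accept the object directory only if it is an absolute path. */
static bool
objdir_is_sane(const char *objdir)
{
	return (objdir != NULL && objdir[0] == '/');
}
```

An empty or relative value, which is what a failed lookup tends to produce, is rejected, so a recursive remove can never be pointed at the current directory by accident.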
bdragon [Fri, 1 May 2020 16:56:36 +0000 (16:56 +0000)]
[PowerPC] Set fixed boot1.elf load address
Due to the way claiming works, we need to ensure on AIM OFW machines that
we don't have overlapping ranges on any step of the load.
Load boot1.elf at 0x38000 so it will not overlap with anything even if the
entire PReP partition gets loaded by OFW.
Tested on an iBook G4, a PowerBook G4, a PowerMac G5, and qemu pseries.
(qemu pseries is broken without this patch due to the high address used
by lld10.)