Mark Johnston [Tue, 1 Mar 2022 16:53:42 +0000 (11:53 -0500)]
fasttrap: Avoid creating WX mappings
fasttrap instruments certain instructions by overwriting them and
copying the original instruction to some per-thread scratch space which
is executed after the probe fires. This trampoline jumps back to the
tracepoint after executing the original instruction.
The created mapping has both write and execute permissions, and so this
mechanism doesn't work when allow_wx is disabled. Work around the
restriction by using proc_rwmem() to write to the trampoline.
Reviewed by: vangyzen
Tested by: Amit <akamit91@hotmail.com>
Sponsored by: The FreeBSD Foundation
Mark Johnston [Tue, 1 Mar 2022 16:48:39 +0000 (11:48 -0500)]
proc: Relax proc_rwmem()'s assertion on the process hold count
This reference ensures that the process and its associated vmspace will
not be destroyed while proc_rwmem() is executing. If, however, the
calling thread belongs to the target process, then it is unnecessary to
hold the process. In particular, fasttrap - a module which enables
userspace dtrace - may frequently call proc_rwmem(), and we'd prefer to
avoid the overhead of locking and bumping the hold count when possible.
Thus, make the assertion conditional on "p != curproc". Also assert
that the process is not already exiting. No functional change intended.
Andrew Turner [Tue, 8 Mar 2022 11:38:51 +0000 (11:38 +0000)]
Make the arm64 get_pcpu a function again
We assume the pointer returned from get_pcpu will be consistent even
if the thread is moved to a new CPU. Fix this by partially reverting 63c858a04d565 to make get_pcpu a function again.
Mark Johnston [Tue, 1 Mar 2022 14:07:14 +0000 (09:07 -0500)]
riscv: Add support for enabling SV48 mode
This increases the size of the user map from 256GB to 128TB. The kernel
map is left unchanged for now.
For now SV48 mode is left disabled by default, but can be enabled with a
tunable. Note that extant hardware does not implement SV48, but QEMU
does.
- In pmap_bootstrap(), allocate a L0 page and attempt to enable SV48
mode. If the write to SATP doesn't take, the kernel continues to run
in SV39 mode.
- Define VM_MAX_USER_ADDRESS to refer to the SV48 limit. In SV39 mode,
the region [VM_MAX_USER_ADDRESS_SV39, VM_MAX_USER_ADDRESS_SV48] is not
mappable.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
libc __sfvwrite(): roll back FILE buffer pointer on fflush error
__sfvwrite() advances the pointer before calling fflush. If fflush()
fails, it is not enough to roll back inside it, because we cannot know
how much was advanced by the caller.
buf_alloc(): Stop using LK_NOWAIT, use LK_NOWITNESS
Despite the buffer taken from cache or free list, it still can be
locked, due to 'lockless lookup' in getblkx() potentially operating on
the freed buffers. The lock is transient, but prevents the use of
LK_NOWAIT there for the goal of neutralizing WITNESS.
Just use LK_NOWITNESS.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
Dimitry Andric [Wed, 9 Mar 2022 21:23:35 +0000 (22:23 +0100)]
Build compiler-rt against libunwind, not libcxxrt
Parts of compiler-rt are also built for libgcc_eh and libgcc_s, and
these were already pointing to the libunwind unwind.h. For the sake of
consistency, also build compiler-rt itself against the libunwind
unwind.h, not the libcxxrt one.
Dimitry Andric [Tue, 8 Mar 2022 21:53:16 +0000 (22:53 +0100)]
Remove compat hacks from libcxxrt's _Unwind_Exception
This reverts 9097e3cbcac4, which was in itself a revert of upstream
libcxxrt commits 88bdf6b290da ("Specify double-word alignment for ARM
unwind") and b96169641f79 ("Updated Itanium unwind"), and a
reapplication of our commit 3c4fd2463bb2 ("libcxxrt: add padding in
__cxa_allocate_* to fix alignment").
The editors/libreoffice port will be patched to be able to cope with the
standards-compliant alignment of _Unwind_Exception and consequently,
that of __cxa_exception. The layouts and sizes of these structures
should then be completely the same for libcxxrt, libunwind and
libc++abi.
Dimitry Andric [Mon, 28 Feb 2022 20:06:19 +0000 (21:06 +0100)]
Fix indentation in usr.bin/diff/pr.c
In commit 6fa5bf0832ef the pr(1) related code in diff was moved around,
but some part of the indentation was messed up, and one line was
duplicated. Remove the duplicated line, and fix up the indentation.
Reviewed by: bapt
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D34398
Martin Matuska [Fri, 11 Mar 2022 07:11:42 +0000 (08:11 +0100)]
zfs: merge openzfs/zfs@ef83e07db (zfs-2.1-release) into stable/13
OpenZFS release 2.1.3
Notable upstream pull request merges:
#12569 FreeBSD: Really zero the zero page
#12828 FreeBSD: Add vop_standard_writecount_nomsyn
#12828 zfs: Fix a deadlock between page busy and the teardown lock
#12828 FreeBSD: Catch up with more VFS changes
#12851 FreeBSD: Provide correct file generation number
#12857 Verify dRAID empty sectors
#12874 FreeBSD: Update argument types for VOP_READDIR
#12896 Reduce number of arc_prune threads
#12934 FreeBSD: Fix zvol_*_open() locking
#12961 FreeBSD: Fix leaked strings in libspl mnttab
#12964 Fix handling of errors from dmu_write_uio_dbuf() on FreeBSD
#12981 Introduce a flag to skip comparing the local mac when
raw sending
#12985 Avoid memory allocations in the ARC eviction thread
#13014 Report dnodes with faulty bonuslen
#13016 FreeBSD: Fix zvol_cdev_open locking
#13027 Fix clearing set-uid and set-gid bits on a file when
replying a write
#13031 Add enumerated vdev names to 'zpool iostat -v' and
'zpool list -v'
#13074 Enable encrypted raw sending to pools with greater ashift
#13076 Receive checks should allow unencrypted child datasets
#13098 Avoid dirtying the final TXGs when exporting a pool
#13172 Fix ENOSPC when unlinking multiple files from full pool
Colin Percival [Thu, 17 Feb 2022 21:01:11 +0000 (13:01 -0800)]
Add support for getting early entropy from UEFI
UEFI provides a protocol for accessing randomness. This is a good way
to gather early entropy, especially when there's no driver for the RNG
on the platform (as is the case on the Marvell Armada8k (MACCHIATObin)
for now).
If the entropy_efi_seed option is enabled in loader.conf (default: YES)
obtain 2048 bytes of entropy from UEFI and pass is to the kernel as a
"module" of name "efi_rng_seed" and type "boot_entropy_platform"; if
present, ingest it into the kernel RNG.
Make sure the XHCI driver obeys the isochronous scheduling threshold value
as given by the XHCI hardware parameters to avoid scheduling isochronous
transfers too early.
Cy Schubert [Thu, 3 Mar 2022 06:43:48 +0000 (22:43 -0800)]
ipfilter: Reliably print the interface name
When printing the interface name from the ipstate_t struct the interface
name in is_ifp may not always be avaiable when reading it from kmem
(tested on FreeBSD and NetBSD). However the is_ifname (the interface
name character string) is almost always available -- it is not available
when the source of the packet is a process running on the firewall
itself. Rather than print both interface name strings, print only the
one.
Cy Schubert [Thu, 3 Mar 2022 06:40:18 +0000 (22:40 -0800)]
ipfilter: Obtain the interface name more efficiently
Rather than use a kmem read to determine the interface name used by a
nat_t structure through a pointer, nat_ipfs->netif->if_xname, obtain it
directly from nat_ifnames in the nat_t structure itself using the new
FORMAT_IF macro.
Cy Schubert [Thu, 3 Mar 2022 06:21:59 +0000 (22:21 -0800)]
ipfilter: Introduce the new FORMAT_IF macro
Interface names stored in the ipstate_t and ipnat_t structures can be
NULL. This occurs when an application, such as named, is running on the
firewall machine itself. For example an application, i.e. named, running
on the firewall itself will cause a state table display and NAT mapping
display to show a null ingress interface and its egress interface. This
is perfectly valid but confusing to human eyes. Rather than print
nothing, print "(null)".
Kyle Evans [Fri, 4 Mar 2022 03:48:21 +0000 (21:48 -0600)]
tests: readlink: fix atf_test_case call [NFC]
This was meant to read `basic`, rather than a duplicate of `f_flag`. It
is largely irrelevant, though, as atf_test_case mostly just makes
sure that the proper functions are defined.
Kyle Evans [Thu, 27 Jan 2022 18:02:17 +0000 (12:02 -0600)]
cp: fix -R with links
The traversal was previously not properly honoring -H/-L/-P. Notably,
we should not have been resolving symlinks encountered during traversal
when either -H or -P are specified.
lualoader: fix the autoboot_delay countdown message
When the timer drops from double to single digits, a spare 'e' is left
on the end of the line as we don't overwrite it. Include an extra space
at the end to account for this and overwrite the leftover character.
Using /etc/jail.{jailname}.conf is nice, however it makes /etc/ very
messy if you have many jails. This patch allows one to move these
config files out of the way into /etc/jail.conf.d/{jailname}.conf.
Note that the same caveat as /etc/jail.*.conf applies: the jail service
will not autodiscover all of these for starting 'all' jails. This is
considered future work, since the behavior matches.
Jessica Clarke [Tue, 15 Feb 2022 16:10:34 +0000 (16:10 +0000)]
libpmc: Allow specifying explicit EVENT_xxH events on armv7 and arm64
This is useful for processors where we don't have an event table; in
those cases we default to a Cortex A8 (armv7) or Cortex A53 (arm64) in
order to attempt to provide something useful, but you're then limited to
the counters in those tables, some of which may also not be implemented
(e.g. LD/ST_RETIRED are no longer implemented in more recent cores,
replaced by LD/ST_SPEC).
Adding the raw EVENT_xxH event lists to each table ensures that you can
always request the exact events you want, regardless of what has been
detected or is known.
Warner Losh [Wed, 9 Mar 2022 00:01:04 +0000 (17:01 -0700)]
Update smartqpi driver to vendor's latest submission
Newly added features & bug fixes
o Fixed an issue smartpqi debug log messages are flooding kernel logs.
o Fixed an issue where devices are shown as RAID 0 in display info.
o Feature: Changed 32 bit dma address to 64 bit address
o Added new controlller ids.
Submitted by: Microsemi
Reviewed by: Scott Benesh (Microsemi), imp
Differential Revision: https://reviews.freebsd.org/D34469
MFC After: 3 days
Scott Long [Sat, 26 Feb 2022 17:25:43 +0000 (10:25 -0700)]
Fix "set but not used" in smartpqi. The PCI_MEM macros don't require a
physical/absolute address in FreeBSD, but it looks like the calling
code might be somewhat portable to other OS's that do require this.
Therefore, set the variables to __unused instead of removing the code
entirely.
Warner Losh [Tue, 22 Feb 2022 21:25:32 +0000 (14:25 -0700)]
camcontrol fwdownload minor improvements
Minor improvements to the fwdownload code suggested by chs@:
o Print the path_id/target we're rescanning so it's not invisible
o No need for XPT_GDEVLIST, all the info is filled in. Remove sending it
as well as a comment related to it from a mistaken observation. libcam
always fills these in properly, so use those for the ccb path/target.
o Don't leak /dev/xpt fd in success cases.
o Rename fw_rescan_lun to fw_rescan_target and pass sim_mode to
only print path_id and target_id info.
Warner Losh [Thu, 3 Jun 2021 23:44:27 +0000 (17:44 -0600)]
smartpqi: Remove stray declaration
pqisrc_is_firmware_feature_enabled shouldn't be declared inline in a
header, and then static inline in the .c function. Remove this stray
declartion from the header. gcc6 complains, but clang does not.
Warner Losh [Wed, 23 Feb 2022 21:28:16 +0000 (14:28 -0700)]
bio: make _bio_cflags match bio_cflags
While none of the in-tree geoms that modify the bio_cflags use the top
half of the word, were they to do this in the future, we'd hit false
positives for DIAGNOSTIC kernels. Make both uint16_t.
Jose Luis Duran [Tue, 22 Feb 2022 11:39:03 +0000 (08:39 -0300)]
libefivar: Correct the string expression of UTF8 vendor device path
According to UEFI spec, the string expression of UTF8 vendor
device node should be displayed as: VenUtf8(). Current code
display it as: VenUft8() by mistake when convert device
path node to text.
Warner Losh [Tue, 22 Feb 2022 17:34:36 +0000 (10:34 -0700)]
camcontrol: Force a rescan of the lun after firmware download.
After downloading the firmware to a device, it's inquiry data likely
will change. Force a rescan of the target with the CAM_EXPECT_INQ_CHANGE
flag to get it to record the new inqury data as being expected. This
avoids the need for a 'camcontrol rescan' on the device which detaches
and re-attaches the disk (da, ada) device. This brings fwdownload up to
nvmecontrol's ability to do the same thing w/o changing the exposed
nvme/nvd/nda device. We scan the target and not the LUN because dual
actuator drives have multiple LUNs, but the firmware is global across
many vendors' drives (and the so far theoretical ones that aren't won't
be harmed by the rescan).
Since the underlying struct disk is now preserved accross this
operation, it's now possible to upgrade firmware of a root device w/o
crashing the system. On systems that are quite busy, the worst that
happens is that certain operaions are reported cancelled when the new
firmware is activated. These operations are retried with the normal CAM
recovery mechanisms and will work on the retry. The only visible hiccup
is the time that new firmware is flashing / initializing. One should not
consider this operation completely risk free, however, since not all
drives are well behaved after a firmware download.
usb(4): Don't skip calling uhub_explore_sub() even on HUB port errors.
This should fix an issue where the "udev->re_enumerate_wait" field never gets
processed and reset. In this case usbconfig will wait forever and never return.
Alan Somers [Tue, 22 Feb 2022 05:00:42 +0000 (22:00 -0700)]
fusefs: fix a cached attributes bug during directory rename
When renaming a directory into a different parent directory, invalidate
the cached attributes of the new parent. Otherwise, stat will show the
wrong st_nlink value.
Mike Karels [Wed, 23 Feb 2022 20:42:30 +0000 (14:42 -0600)]
Add serial-number to hw.fdt sysctl area if found in fdt.
Add serial-number sysctl if that fdt property exists and is a printable
string. While here, ensure that the hw.fdt sysctl values fit in the
buffers provided so that they will be NUL-terminated. Tested on
Raspberry Pi 3B+ and 4.
Reviewed by: manu imp
Differential Revision: https://reviews.freebsd.org/D34356
fdt: Expose the model, compatible and freebsd dts brandind as sysctl
This make it easier for script to get the hardware on which they are running.
Sponsored by: Diablotin Systems
Differential Revision: https://reviews.freebsd.org/D31205
Reviewed by: imp
Should be ok on powerpc: jhibbits (over irc)
Fix ENOSPC when unlinking multiple files from full pool
When unlinking multiple files from a pool at 100% capacity, it was
possible for ENOSPC to be returned after the first unlink. e.g.
rm -f /mnt/fs/test1.0.0 /mnt/fs/test1.1.0 /mnt/fs/test1.2.0
rm: cannot remove '/mnt/fs/test1.1.0': No space left on device
rm: cannot remove '/mnt/fs/test1.2.0': No space left on device
After waiting for the pending deferred frees from the first unlink to
be processed the remaining files can then be unlinked. This is caused
by the quota limit in dsl_dir_tempreserve_impl() being temporarily
decreased to the allocatable pool capacity less any deferred free
space.
This is resolved using the existing mechanism of returning ERESTART
when over quota as long as we know enough space will shortly be
available after processing the pending deferred frees.
Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13172
Ed Maste [Mon, 28 Feb 2022 01:11:20 +0000 (20:11 -0500)]
zfs: Update test format strings to match variable typtes
And drop stray 'd' from the end of some printed numbers. I assume this
was the result of someone thinking u is a printf length modifier for d,
not a format specifier itself.
Reviewed by: kevans, rew
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34387
Mark Johnston [Tue, 1 Mar 2022 13:54:55 +0000 (08:54 -0500)]
release: Remove references to ChallengeResponseAuthentication
This sshd_config keyword was replaced by KbdInteractiveAuthentication in
openssh 8.7, though ChallengeResponseAuthentication is silently accepted
as an alias. However, this means that the code in ec2.conf which
modifies a commented-out line no longer does anything. Apply a minimal
fix.
Reviewed by: cperciva, emaste
Sponsored by: The FreeBSD Foundation
ix(4): Add control of 2.5/5G autonegotiation speeds
This change enables the user to control 2.5G and 5G autonegotiation
speeds via advertise_speed sysctl for X550T devices. Due to reported
interoperability issues with switches, 2.5G and 5G speeds will not be
advertised by default.
Signed-off-by: Piotr Pietruszewski <piotr.pietruszewski@intel.com> Co-authored-by: Krzysztof Galazka <krzysztof.galazka@intel.com>
Tested by: gowtham.kumar.ks@intel.com
MFC after: 3 days
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D26245
Eric Joyner [Fri, 4 Mar 2022 18:25:25 +0000 (10:25 -0800)]
ice(4): Update to 1.34.2-k
- Adds FW logging support
- Once enabled, this lets the firmware print event and error messages
to the log, increasing the visibility into what the hardware is
doing; this is useful for debugging
- General bug fixes
- Adds inital DCB support to the driver
- Notably, this adds support for DCBX to the driver; now with the
fw_lldp sysctl set to 1, the driver and adapter will adopt a DCBX
configuration sent from a link partner
- Adds statistcs sysctls for priority flow control frames
- Adds new configuration sysctls for DCB-related features: (VLAN) user
priority to TC mapping; ETS bandwidth allocation; priority flow
control
- Remove unused SR-IOV files (until support gets added)
Eric Joyner [Thu, 29 Jul 2021 23:24:14 +0000 (16:24 -0700)]
iflib: Allow drivers to determine which queue to TX on
Adds a new function pointer to struct if_txrx in order to allow
drivers to set their own function that will determine which queue
a packet should be sent on.
Since this includes a kernel ABI change, bump the __FreeBSD_version
as well.
(This motivation behind this is to allow the driver to examine the
UP in the VLAN tag and determine which queue to TX on based on
that, in support of HW TX traffic shaping.)
Eric Joyner [Sat, 13 Feb 2021 00:04:54 +0000 (16:04 -0800)]
ixl(4): Remove iavf(4) source files
Since iavf(4) no longer shares code with ixl(4) as of commit f2fbd56a8d07665bc0a5e8b7e40026b50a591e2a and now has its own directory,
remove these now-unused iavf(4)-only files.
Eric Joyner [Fri, 12 Feb 2021 21:28:18 +0000 (13:28 -0800)]
iavf(4): Split source and update to 3.0.26-k
The iavf(4) driver now uses a different source base from ixl(4), since
it will be the standard VF driver for new Intel Ethernet products going
forward, including ice(4). It continues to use the iflib framework
for network drivers.
Since it now uses a different source code base, this commit adds a new
sys/dev/iavf entry, but it re-uses the existing module name so no
configuration changes are necessary.
Eric Joyner [Wed, 23 Jun 2021 20:41:54 +0000 (13:41 -0700)]
ice(4): Update to version 0.29.4-k
Includes various feature improvements and bug fixes.
Notable changes include:
- Firmware logging support
- Link management flow changes
- New sysctl to report aggregated error counts
- Health Status Event reporting from firmware (Use the new read-only
tunables hw.ice.enable_health_events / dev.ice.#.enable_health_events
to turn this off)
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
Sponsored by: Intel Corporation
Brian Behlendorf [Mon, 11 Oct 2021 17:49:13 +0000 (10:49 -0700)]
ZTS: deadman_sync fix
In the CI environment it's possible for events to be slightly
delayed resulting in 4, instead of 5, events appearing in the
log file. This isn't a problem and should be considered a
success to avoid false positive test results.
Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12625
Ryan Stone [Thu, 11 Feb 2021 16:17:58 +0000 (11:17 -0500)]
Fix ifa refcount leak in ifa_ifwithnet()
In 4f6c66cc9c75c8, ifa_ifwithnet() was changed to no longer
ifa_ref() the returned ifaddr, and instead the caller was required
to stay in the net_epoch for as long as they wanted the ifaddr
to remain valid. However, this missed the case where an AF_LINK
lookup would call ifaddr_byindex(), which still does ifa_ref()
the ifaddr. This would cause a refcount leak.
Fix this by inlining the relevant parts of ifaddr_byindex() here,
with the ifa_ref() call removed. This also avoids an unnecessary
entry and exit from the net_epoch for this case.
I've audited all in-tree consumers of ifa_ifwithnet() that could
possibly perform an AF_LINK lookup and confirmed that none of them
will expect the ifaddr to have a reference that they need to
release.
Ed Maste [Wed, 2 Mar 2022 16:40:00 +0000 (11:40 -0500)]
vt_vga: fix colour in pixel blocks with more than 4 colours
VGA hardware provides many different graphics and data access modes,
each with different capabilities and limitations.
VGA vt(4) graphics mode operates on blocks of pixels at a time. When a
given pixel block contains only two colours the vt_vga driver uses write
mode 3. When the block contains more than two colours it uses write
mode 0. This is done because two-colour write mode 3 is much more
efficient.
In practice write mode 3 is used most of the time, as there is often a
single foreground colour and single background colour across the entire
console. One common exception requiring the use of mode 0 is when the
mouse cursor is drawn over a background other than black, as we need
black and white for the cursor in addition to the background colour.
VGA's default 16-colour palette provides the same set of colours as the
system console, but in a different order. Previously we configured a
non-default VGA palette that had the same colours at the same indexes.
However, this caused anything drawn before the kernel started (drawn by
the loader, for instance) to change colours once the kernel configured
the new, non-default palette.
In 5e251aec8636 we switched to leaving the default VGA palette in place,
translating console colour indexes to VGA colour indexes as necessary.
This translation was missed for the write mode 0 case for pixel blocks
with more than two colours.
PR: 261751
Reviewed by: adrian
MFC after: 1 week
Fixes: 5e251aec8636 ("vt(4): Use default VGA palette")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34412
Marcin Wojtas [Tue, 4 May 2021 23:47:37 +0000 (01:47 +0200)]
sdhci_xenon: add UHS support
This patch adds the necessary methods resolution to the sdhci_xenon
driver which are required to configure UHS modes for SD/MMC devices.
Apart from the two generic routines, the custom sdhci_xenon_set_uhs_timing
function is responsible for setting the SDHCI_HOST_CONTROL2 register
with appropriate mode select values - in case of HS200 and HS400
they are non-standard.
Marcin Wojtas [Thu, 27 May 2021 18:39:12 +0000 (20:39 +0200)]
sdhci_xenon: improve the VCCQ voltage switch sequence
Improve the VCCQ voltage switch, so that to properly
handle the SDHCI_HOST_CONTROL2 register signaling
flags and along with manipulating the regulator.
Marcin Wojtas [Thu, 27 May 2021 17:48:17 +0000 (19:48 +0200)]
sdhci_xenon: allow to properly disable the UHS signaling
Until now the "no-1-8-v" DT flag wrongly disabled the SDHCI_CAN_VDD_180
- slot 1.8V power supply capability, whereas it refers to the signaling
voltage. Fix the sdhci_xenon_read_4 and allow to disable the UHS modes
depending on the DT property or PHY slow mode. While at it - make sure
the unsupported 1.2V signaling is always disabled and not reported
in the bootverbose log.
Marcin Wojtas [Sat, 1 May 2021 07:55:06 +0000 (09:55 +0200)]
sdhci_xenon: enable MMC FDT parsing
The mmc_fdt_parse allows to parse more MMC-related
FDT properties. Start using it. "wp-inverted" property,
VQMMC and newly added VMMC power supply parsing
is now done in a generic code.