manu [Sat, 10 Oct 2020 07:18:51 +0000 (07:18 +0000)]
Brand our DTS with the Linux version it was imported from
DTS must be synced with the kernel, add a freebsd,dts-version string in
the root node of each DTS that we compile so we can later in the kernel
check that it contain a correct value.
rmacklem [Sat, 10 Oct 2020 00:01:40 +0000 (00:01 +0000)]
Modify mountd.c so that it does not always malloc 4K for the map credentials.
r362163 upgraded mountd so that it could handle MAX_NGROUPS
groups for the anonymous user credentials (the ones provided by
-maproot and -mapall exports options).
The problem is that this resulted in every export structure growing by
about 4Kbytes, because the cr_groups field went from 16->MAX_NGROUPS.
This patch fixes this by only including a small 32 element cr_groups in the
structure and then malloc()'ng cr_groups when a larger one is needed.
The value of SMALLNGROUPS is arbitrarily set to 32, assuming most users
used by -maproot or -mapall will be in <= 32 groups.
cxgbe(4): More fixes for the T6 FCS error counter.
r365732 was the first attempt to get an accurate count but it was
writing to some read-only registers to clear them and that obviously
didn't work. Instead, note the counter's value when it is supposed to
be cleared and subtract it from future readings.
dev.<port>.stats.rx_fcs_error should not be serviced from the MPS
register for T6.
The stats.* sysctls should all use T5_PORT_REG for T5 and above. This
must have been missed in the initial T5 support years ago. Fix it while
here.
MFC after: 3 days
Sponsored by: Chelsio Communications
jhb [Fri, 9 Oct 2020 20:20:42 +0000 (20:20 +0000)]
Don't invoke semunload() if seminit() fails during MOD_LOAD.
The module handler code invokes a MOD_UNLOAD event immediately if
MOD_LOAD fails. The result was that if seminit() failed, semunload()
was invoked twice. semunload() is not idempotent however and would
try to remove it's process_exit eventhandler twice resulting in a
panic.
mjg [Fri, 9 Oct 2020 19:10:00 +0000 (19:10 +0000)]
cache: fix vexec panic when racing against vgone
Use of dead_vnodeops would result in a panic instead of returning the intended
EOPNOTSUPP error.
While here make sure to abort, not just try to return a partial result.
The former allows the regular lookup to restart from scratch, while the latter
makes it stuck with an unusable vnode.
imp [Fri, 9 Oct 2020 15:29:05 +0000 (15:29 +0000)]
Avoid using single quotes in arguments to logger.
Single quotes interfere with the workaround put in with r335753 and
aren't necessary in this case. I believe that all the underling issues
with r335753 have been corrected, but need to do more extensive
followup before reverting it as a bad idea.
PR: 240411
MFC After: 2 days (to give it time to get into 12.2)
markj [Fri, 9 Oct 2020 15:27:37 +0000 (15:27 +0000)]
col(1): Fix a couple of bugs
- When flushing extra lines after all input has been processed, make
sure that local state is reinitialized correctly.
- When -f is specified, make sure to end output with a full newline.
- Fix some style issues and update comments.
- Add some regression tests.
PR: 249308
Submitted by: Yang Zhong <yzhong@freebsdfoundation.org>
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26536
gbe [Fri, 9 Oct 2020 15:14:19 +0000 (15:14 +0000)]
Fix a few mandoc issues
- whitespace at end of input line
- skipping paragraph macro: Pp at the end of Sh
- new sentence, new line
- consider using OS macro: Fx
- AUTHORS section without An macro
- skipping paragraph macro: Pp before Ss
rscheff [Fri, 9 Oct 2020 14:33:09 +0000 (14:33 +0000)]
Add DSCP support for network QoS to iscsi initiator.
Allow the DSCP codepoint also to be configurable
for the traffic in the direction from the initiator
to the target, such that writes and any requests
are also treated in the appropriate QoS class.
gbe [Fri, 9 Oct 2020 14:03:45 +0000 (14:03 +0000)]
Fix a few mandoc issues
- no blank before trailing delimiter
- whitespace at end of input line
- sections out of conventional order
- normalizing date format
- AUTHORS section without An macro
rscheff [Fri, 9 Oct 2020 12:44:56 +0000 (12:44 +0000)]
Stop sending tiny new data segments during SACK recovery
Consider the currently in-use TCP options when
calculating the amount of new data to be injected during
SACK loss recovery. That addresses the effect that very small
(new) segments could be injected on partial ACKs while
still performing a SACK loss recovery.
rscheff [Fri, 9 Oct 2020 12:06:43 +0000 (12:06 +0000)]
Add IP(V6)_VLAN_PCP to set 802.1 priority per-flow.
This adds a new IP_PROTO / IPV6_PROTO setsockopt (getsockopt)
option IP(V6)_VLAN_PCP, which can be set to -1 (interface
default), or explicitly to any priority between 0 and 7.
Note that for untagged traffic, explicitly adding a
priority will insert a special 801.1Q vlan header with
vlan ID = 0 to carry the priority setting
Fix EINVAL message when CPU binding information is requested for IRQ.
`cpuset -g -x N` along with requested information always prints
message `cpuset: getdomain: Invalid argument'. The EINVAL is returned
from kern_cpuset_getdomain(), since it doesn't expect CPU_LEVEL_WHICH
and CPU_WHICH_IRQ parameters.
To fix the error, do not call cpuset_getdomain() when `-x' is specified.
imp [Fri, 9 Oct 2020 01:48:14 +0000 (01:48 +0000)]
Create in-tree LINT files
Now that config(8) has supported include for 19 years, transition to
including the NOTES files. include support didn't exist at the time,
nor did the envvar stuff recently added. Now that it does, eliminate
the building of LINT files by just including everything you need.
Note: This may cause conflicts with updating in some cases.
find sys -name LINT\* -rm
is suggested across this commit to remove the generated LINT
files.
rmacklem [Fri, 9 Oct 2020 01:04:28 +0000 (01:04 +0000)]
Make vn_generic_copy_file_range() interruptible via a signal.
Without this patch, when vn_generic_copy_file_range() is
doing a large copy, it will remain in the function for a
considerable amount of time, delaying handling of any
outstanding signals until the copy completes.
This patch adds checks for signals that need to be
processed after each successful data copy cycle.
When sig_intr() returns non-zero, vn_generic_copy_file_range()
will return.
The check "if (len < savlen)" ensures that some data
has been copied, so that progress will be made.
Note that, since copy_file_range(2) is allowed to
return fewer bytes copied than requested, it
will never return EINTR/ERESTART when sig_intr()
returns non-zero.
imp [Fri, 9 Oct 2020 00:27:45 +0000 (00:27 +0000)]
Stop ignoring makeLINT generated files
We're going to check these files in shortly since we don't need to
generate them anymore. Generated files cause issues for different work
flows anyway.
imp [Fri, 9 Oct 2020 00:16:26 +0000 (00:16 +0000)]
Initial support for implementing the bootXXX.efi workaround
Too many version of UEFI firmware (so far only confirmed on amd64)
don't really support efibootmgr selection of boot. That's the most
reliable, when it works, since there's no guesswork. However, many do
not save, unmolested, the variables that efibootmgr sets, so as a
fallback we also install loader.efi as bootXXX.efi (where XXX is
either aa64 or x64) if it doesn't already exist in /efi/boot on the
ESP. The standard only defines this for removable devices, but it's
almost ubiquitously used as a fallback. Many BIOSes implement a drive
selection feature that takes over the efibootmgr protocol, rendinering
it useless (either generally, or for those vendors not on the short
list). bootxxx.efi works around this. However, we don't install it
unconditionally there, as that breaks some popular multi-boot setups.
kib [Thu, 8 Oct 2020 22:46:15 +0000 (22:46 +0000)]
vm_page_dump_index_to_pa(): Add braces to the expression involving + and &.
The precedence of the '&' operator is less than of '+'. Added braces
do change the order of evaluation into the natural one, in my opinion.
On the other hand, the value of the expression should not change since
all elements should have page-aligned values.
This fixes a gcc warning reported.
Reported by: adrian
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kib [Thu, 8 Oct 2020 22:41:02 +0000 (22:41 +0000)]
Do not leak B_BARRIER.
Normally when a buffer with B_BARRIER is written, the flag is cleared
by g_vfs_strategy() when creating bio. But in some cases FFS buffer
might not reach g_vfs_strategy(), for instance when copy-on-write
reports an error like ENOSPC. In this case buffer is returned to
dirty queue and might be written later by other means. Among then
bdwrite() reasonably asserts that B_BARRIER is not set.
In fact, the only current use of B_BARRIER is for lazy inode block
initialization, where write of the new inode block is fenced against
cylinder group write to mark inode as used. The situation could be
seen that we break dependency by updating cg without written out
inode. Practically since CoW was not able to find space for a copy of
inode block, for the same reason cg group block write should fail.
Reported by: pho
Discussed with: chs, imp, mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D26511
kib [Thu, 8 Oct 2020 22:34:34 +0000 (22:34 +0000)]
sig_intr(9): return early if AST is not scheduled.
Check td_flags for relevant AST requests lock-less. This opens the
race slightly wider where sig_intr() returns false negative, but might
be it is worth it.
Requested by: mjg
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kib [Thu, 8 Oct 2020 22:31:11 +0000 (22:31 +0000)]
Do not allow to use O_BENEATH as an oracle.
Specifically, if lookup() returned any error and the topping directory
was not latched, which means that (non-existent) path did not returned
to the topping location, give ENOTCAPABLE a priority over the lookup()
error.
PR: 249960
Reviewed by: emaste, ngie
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D26695
imp [Thu, 8 Oct 2020 20:56:06 +0000 (20:56 +0000)]
Remove APM BIOS support
APM BIOS was relevant only to early laptops (approximately P166 or
P200 and slower). These have not been relevant for a long time, and
this code has been untested for a long time (as far as I can
tell). The APM compat code in ACPI and the apm(8) command is not being
retired. Both of these items are still in use (apm(8) is more
scriptable than the replacement acpiconf, for the most part). This has
been commented out of i386 GENERIC since 2002. This code is not
relevant to any other port.
mhorne [Thu, 8 Oct 2020 18:29:17 +0000 (18:29 +0000)]
Fix a loop condition
The correct way to identify the end of the metadata is two adjacent
entries set to zero/MODINFO_END. I made a typo and this was checking the
first entry twice.
Reported by: rpokala
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
mhorne [Thu, 8 Oct 2020 18:02:05 +0000 (18:02 +0000)]
Add a routine to dump boot metadata
The boot metadata (also referred to as modinfo, or preload metadata)
provides information about the size and location of the kernel,
pre-loaded modules, and other metadata (e.g. the EFI framebuffer) to be
consumed during by the kernel during early boot. It is encoded as a
series of type-length-value entries and is usually constructed by
loader(8) and passed to the kernel. It is also faked on some
architectures when booted by other means.
Although much of the module information is available via kldstat(8),
there is no easy way to debug the metadata in its entirety. Add some
routines to parse this data and allow it to be printed to the console
during early boot or output via a sysctl.
Since the output can be lengthly, printing to the console is gated
behind the debug.dump_modinfo_at_boot kenv variable as well as the
BOOTVERBOSE flag. The sysctl to print the metadata is named
debug.dump_modinfo.
Reviewed by: tsoome
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26687
kaktus [Thu, 8 Oct 2020 11:45:10 +0000 (11:45 +0000)]
[pf] /etc/rc.d/pf should REQUIRE routing
When a system with pf_enable="YES" in /etc/rc.conf uses hostnames in
/etc/pf.conf, these hostnames cannot be resolved via external nameservers
because the default route is not yet set. This results in an empty
(all open) ruleset.
Since r195026 already put netif back to REQUIRE, this change does not affect
the issue that the firewall should rather have been setup before any
network traffic can occur.
PR: 211928
Submitted by: Robert Schulze
Reported by: Robert Schulze
Tested by: Mateusz Kwiatkowski
No objections from: kp
MFC after: 3 days
cxgbe(4): knobs to drop various kinds of undesirable frames on ingress.
These kind of drops come for free in the sense that they do not use the
filter TCAM or any other resource that wouldn't normally be used during
rx. Frames dropped by the hardware get counted in the MAC's rx stats
but are not delivered to the driver.
hw.cxgbe.attack_filter
Set to 1 to enable the "attack filter". Default is 0. The attack
filter will drop an incoming frame if any of these conditions is true:
src ip/ip6 == dst ip/ip6; tcp and src/dst ip is not unicast; src/dst ip
is loopback (127.x.y.z); src ip6 is not unicast; src/dst ip6 is loopback
(::1/128) or unspecified (::/128); tcp and src/dst ip6 is mcast
(ff00::/8).
hw.cxgbe.drop_ip_fragments
Set to 1 to drop all incoming IP fragments. Default is 0. Note that
this drops valid frames.
hw.cxgbe.drop_pkts_with_l2_errors
Set to 1 to drop incoming frames with Layer 2 length or checksum errors.
Default is 1.
hw.cxgbe.drop_pkts_with_l3_errors
Set to 1 to drop incoming frames with IP version, length, or checksum
errors. Default is 0.
hw.cxgbe.drop_pkts_with_l4_errors
Set to 1 to drop incoming frames with Layer 4 length, checksum, or other
errors. Default is 0.
mhorne [Wed, 7 Oct 2020 23:14:49 +0000 (23:14 +0000)]
Handle kmod local relocation failures gracefully
It is possible for elf_reloc_local() to fail in the unlikely case of
an unsupported relocation type. If this occurs, do not continue to
process the file.
80211: ifconfig replace MS() with _IEEE80211_MASKSHIFT()
As we did in the kernel in r366112 replace the MS() macro with the version(s)
added to the kernel: _IEEE80211_MASKSHIFT(). Also provide its counter part.
This will later allow use to use other macros defined in net80211 headers
here in ifconfig.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
This code was iteratively implemented during the work on various WiFi
drivers -- from individual functions to a macro-created implementations
for the various bit sized needed (and then extended to more for
comepleteness). Some of the bit combinations do not seem to make sense
so are left out.
The __bf_shf(x) was obtained from D26681 [1].
Requested by: manu [1]
Reviewed by: hselasky, manu
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26708
The method was optional prior to r365938, which made it mandatory but did add
any test that an implementation provides the method nor implement it for
bhyveload. The code path might not be hit unless the user's loader was
configured to write to a file on disk, such as with nextboot(8).
mhorne [Wed, 7 Oct 2020 18:48:10 +0000 (18:48 +0000)]
Print symbol index for unsupported relocation types
It is unlikely, but possible, that an unrecognized or unsupported
relocation type is encountered while trying to load a kernel module. If
this occurs we should offer the symbol index as a hint to the user.
While here, fix some small style issues.
Reviewed by: markj, kib (amd64 part, in D26701)
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
hselasky [Wed, 7 Oct 2020 17:46:49 +0000 (17:46 +0000)]
Properly cleanup driver during remove_one() in mlx5core.
Cleanup all host resources, SYSCTLs, MSIX vectors and memory used
by the host and only leave the device allocated memory behind, if any,
because it may still be in use, when the PCI remove function is called.
Else future probe calls may fail due to SYSCTLs already existing.
imp [Wed, 7 Oct 2020 06:16:37 +0000 (06:16 +0000)]
Move kernel env global variables, etc to sys/kenv.h
The kernel globals for kenv are confined to 2 files that need them and
a few that likely shouldn't (but as written the code does). Move them
from sys/systm.h to sys/kenv.h. This removed a XXX from systm.h and
cleans it up a little bit...
imp [Wed, 7 Oct 2020 05:44:35 +0000 (05:44 +0000)]
cam: Add quirk for Samsung MZ7* behind a SATA-to-SAS interposer
Sometimes, this drive will be present in the system such that the the
firmware identification string doesn't start with ATA, such as when
it's behind a SATA-to-SAS interposer. Add another quirk for that.
bridge: call member interface ioctl() without NET_EPOCH
We're not allowed to hold NET_EPOCH while sleeping, so when we call ioctl()
handlers for member interfaces we cannot be in NET_EPOCH. We still need some
protection of our CK_LISTs, so hold BRIDGE_LOCK instead.
That requires changing BRIDGE_LOCK into a sleepable lock, and separating the
BRIDGE_RT_LOCK, to protect bridge_rtnode lists. That lock is taken in the data
path (while in NET_EPOCH), so it cannot be a sleepable lock.
jhb [Tue, 6 Oct 2020 17:58:56 +0000 (17:58 +0000)]
Store the send tag type in the common send tag header.
Both cxgbe(4) and mlx5(4) wrapped the existing send tag header with
their own identical headers that stored the type that the
type-specific tag structures inherited from, so in practice it seems
drivers need this in the tag anyway. This permits removing these
extra header indirections (struct cxgbe_snd_tag and struct
mlx5e_snd_tag).
In addition, this permits driver-independent code to query the type of
a tag, e.g. to know what type of tag is being queried via
if_snd_query.
jrtc27 [Tue, 6 Oct 2020 13:02:20 +0000 (13:02 +0000)]
riscv: Handle supervisor instruction page faults
We should never take instruction page faults when in the kernel, but by
using the standard page fault code we should get a more-informative
message about faulting on a NOFAULT page rather than branching to the
default case here and printing an "Unknown kernel exception ..."
message.