Rick Macklem [Sun, 28 Aug 2022 01:31:20 +0000 (18:31 -0700)]
nfsd: Update console message for no session found
The NFSv4.1/4.2 server generates a console message that indicates
that there is no session. I was until recently perplexed w.r.t. how
this could occur. It turns out that the common cause is multiple NFS
clients with the same /etc/hostid.
The host uuid is used by the FreeBSD NFSv4.1/4.2 client as a unique
identifier for the client. If multiple clients use the same host uuid,
this indicates to the NFSv4.1/4.2 server that they are the same client
and confusion occurs.
This trivial patch modifies the console message to suggest that the
client's /etc/hostid needs to be checked for uniqueness.
Rick Macklem [Sat, 27 Aug 2022 23:03:18 +0000 (16:03 -0700)]
nfscl: Fix handling of nd_slotid while handling NFSERR_BADSESSION
When the NFSv4.1/4.2 client is handling a server error
of NFSERR_BADSESSION, it retries RPCs with a new session.
Without this patch, the nd_slotid was not being updated
for the new session.
This would result in a bogus console message like
"Wrong session srvslot=X slot=Y" and then it would
free the incorrect slot, often generating a
"freeing free slot!!" console message as well.
This patch fixes the problem.
Note that FreeBSD NFSv4.1/4.2 servers only
generate a NFSERR_BADSESSION error after a reboot
or after a client does a DestroySession operation.
Bjoern A. Zeeb [Sat, 27 Aug 2022 14:48:09 +0000 (14:48 +0000)]
LinuxKPI 802.11: change type of bssid in struct ieee80211_bss_conf
Enabling other driver code found that the bssid in
struct ieee80211_bss_conf is not an array but expected to be
a const pointer (const, != NULL checks).
Adjust accordingly in the header and in the LinuxKPI compat code.
There initialization now needs to be a static array always present
as we need a value before we will have a BSS (node in scan_to_auth)
as the mac80211 driver (*handlers) are expecting the pointer to be
not NULL (copying without checks).
This is a pre-req to enable d3 (CONFIG_PM[_SLEEP]) in the future.
Tested by: Tomoaki AOKI (junchoon dec.sakura.ne.jp)
Tested by: Berislav Purgar (bpurgar gmail.com)
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Some non-compliant USB devices do not implement the
clear endpoint halt feature. Silently ignore such
failures, when they at least responded correctly
passing up a valid STALL PID packet.
Warner Losh [Fri, 26 Aug 2022 21:47:21 +0000 (15:47 -0600)]
stand: Document that boot0 uses BIOS
And thus has a limited range of supported baud rates. Also add that
setting BOOT_BOOT0_COMCONSOLE_SPEED=0 will leave it unchanged which
sometimes can give you 115200 if the BIOS initialized things outside of
the normal BIOS baud rates (which many x86 enbedded-targetted boards
do).
Warner Losh [Fri, 26 Aug 2022 21:46:33 +0000 (15:46 -0600)]
stand: More sensible defaults when ConOut is missing
When ConOut is missing, we used to default to serial. Except we did it
in the worst way possible by just setting the howto bits and not
updating the console setting, which lead to weird behavior where we'd
get some things on the video port, others on serial.
Instead, set console to "efi,comconsole" for this case. Also set
RB_MULTIPLE always (so we get dual consoles from the kernel) and or in
RB_SERIAL when we can't find GOPs that suggest the precense of a video
console. This will put output in the most places and have a sensible
default for 'primary' console.
Sponsored by: Netflix
Reviewed by: emaste, manu
Differential Revision: https://reviews.freebsd.org/D36299
Warner Losh [Fri, 26 Aug 2022 17:39:37 +0000 (11:39 -0600)]
efi: Create a define for memory descriptor version
For true EFI platforms, the EFI BIOS will return version 1 (since no
other version is defined as of this commit). However, for environments
that wish to create an EFI memory mapping table that aren't actually
EFI, we need to know this. Add EFI_MEMORY_DESCRIPTOR_VERSION for this
constant.
firk [Fri, 26 Aug 2022 08:05:56 +0000 (11:05 +0300)]
Fix compat10 semaphore interface race
Wrong has-waiters and missing unconditional _count==0 check may cause
infinite waiting with already non-zero count.
1) properly clear _has_waiters flag when waiting failed to start
2) always check _count before start waiting
Gleb Smirnoff [Fri, 26 Aug 2022 15:16:15 +0000 (08:16 -0700)]
socket(2): bring documentation up tp date
o Undocument sockets that are no longer supported, or never were.
o Add AF_HYPERV. Note: PF_HYPERV isn't defined, no typo here.
o Point at ip(4) and ip6(4) instead of unwelcoming "not described here".
Rick Macklem [Fri, 26 Aug 2022 03:48:04 +0000 (20:48 -0700)]
nfscl: Fix handling of nd_slotid while handling NFSERR_BADSESSION
When the NFSv4.1/4.2 client is handling a server error
of NFSERR_BADSESSION, it retries RPCs with a new session.
Without this patch, the nd_slotid was not being updated
for the new session.
This would result in a bogus console message like
"Wrong session srvslot=X slot=Y" and then it would
free the incorrect slot, often generating a
"freeing free slot!!" console message as well.
This patch fixes the problem.
Note that FreeBSD NFSv4.1/4.2 servers only
generate a NFSERR_BADSESSION error after a reboot
or after a client does a DestroySession operation.
Rick Macklem [Fri, 26 Aug 2022 03:33:31 +0000 (20:33 -0700)]
nfscl: Fix handling of a bad session slot (NFSv4.1/4.2)
When a session has been marked defunct by the server
sending a NFSERR_BADSESSION reply to the NFSv4.1/4.2
client, nfsv4_sequencelookup() returns NFSERR_BADSESSION
without actually assigning a session slot.
Without this patch, newnfs_request() would erroneously
free slot 0.
This could result in the slot being reused prematurely,
but most likely just generated a "freeing free slot!!"
console message.
This patch fixes the code to not do the erroneous
freeing of the slot for this case.
Notable upstream pull request merges:
#13717 Fix zpool status in case of unloaded keys
#13753 Prevent zevent list from consuming all of kernel memory
#13767 arcstat: fix -p option
#13785 Updates for snapshots_changed property
The target modifiers (-g, -p, -u) may occur in any position except
between -n and its argument; furthermore, we support both the old
absolute form (without -n) and the modern relative form (with -n).
Andrew Turner [Mon, 22 Aug 2022 17:02:13 +0000 (18:02 +0100)]
Add an IDC only arm64 icache sync function
When the IDC flag is set in the cache type register we don't need to
clean the data cache to the point of unification. Previously we
supported this flag being set only when the DIC flags was also set.
Add a new handler for when this is not the case.
Andrew Turner [Mon, 27 Jun 2022 12:37:40 +0000 (12:37 +0000)]
Allow the kernel to emulate the physical counter on arm64
When running under a VM we don't have access to the physical counter.
Add support to emulate this instruction by handling the trap in the
kernel. As it is slow only enable when the hw.emulate_phys_counter
tunable is set on boot.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35613
Doug Moore [Thu, 25 Aug 2022 07:26:04 +0000 (02:26 -0500)]
rb_tree: optimize rb_insert
In searching for where to insert a new node, RB_INSERT discards the
address of the pointer that will have to be modified, so that it must
find it again from the values of 'parent' and 'comp'. Stop discarding
that address, and so avoid having to recompute it.
Aymeric Wibo [Wed, 24 Aug 2022 23:20:13 +0000 (02:20 +0300)]
libc: Add strverscmp(3) and versionsort(3)
Add a strverscmp(3) function to libc, a GNU extension I implemented by
reading its glibc manual page. It orders strings following a much more
natural ordering (e.g. "ent1 < ent2 < ent10" as opposed to
"ent1 < ent10 < ent2" with strcmp(3)'s lexicographic ordering).
Also add versionsort(3) for use as scandir(3)'s compar argument.
Update manual page for scandir(3) and add one for strverscmp(3).
Umer Saleem [Wed, 24 Aug 2022 21:20:43 +0000 (02:20 +0500)]
Updates for snapshots_changed property
Currently, snapshots_changed property is stored in dd_props_zapobj, due
to which the property is assumed to be local. This causes a difference
in behavior with respect to other readonly properties.
This commit stores the snapshots_changed property in dd_object. Source
is not set to local in this case, which makes it consistent with other
readonly properties.
This commit also updates the date string format to include seconds.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #13785
i386: do not allow userspace to set tf_trapno on sigreturn(2)
tf_trapno is checked on return from interrupt/exception to determine if
special handling is needed for switching address space. This is due to
the possibility of NMI/MCHK/DBG to occur at arbitrary place in kernel,
where both address space and stack used could be transient. Kernel
saves current %cr3 in tf_err for such events, to restore on return.
If user is able to set tf_trapno, it can trigger that special handling,
and since tf_err is also user-controlled by sigreturn(2), the result is
undefined.
irettraps: i386 does not push %ss/%esp when exception does not switch rings
Which means that we must not copy top 8 bytes from the trampoline stack
for the exception frame to the regular thread kstack. As consequence,
this stops corruption of the pcb. The visible effect was often a broken
fork(2) on the CPU where corruption occured.
Account for the detail by substracting 8 from the copy byte count when
moving exception frames from trampoline to the regular stack.
[irettraps handles segmentation/stack/protection faults which could
occur on the doreti path, where we might already switched stack and
address space]
Reported and tested by: pho
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D36302
i386 copyout_fast: improve detection of a fault on accessing userspace
Do not blindly account a page fault occuring on the trampoline area,
as the userspace access fault. Check that it occured exactly in the
instruction that does that.
This avoids unneeded switches of address space on faults not needing the
switch, effectively converting machine resets due to tripple faults,
into regular panics.
Reviewed by: jhb
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D36302
Brooks Davis [Wed, 24 Aug 2022 17:34:39 +0000 (18:34 +0100)]
freebsd32_sendmsg: fix control message ABI
When a freebsd32 caller uses all or most allowed space for control
messages (MCLBYTES == 2K) then the message may no longer fit when
the messages are padded for 64-bit alignment. Historically we've just
shrugged and said there is no ABI guarantee. We ran into this on
CheriBSD where a capsicumized 64-bit nm would fail when called with more
than 64 files.
Fix this by not gratutiously capping size of mbuf data we'll allocate
to MCLBYTES and let m_get2 allocate up to MJUMPAGESIZE (4K or larger).
Instead of hard-coding a length check, let m_get2 do it and check for a
NULL return.
Brooks Davis [Wed, 24 Aug 2022 17:34:07 +0000 (18:34 +0100)]
mbuf: Don't support PAGE_SIZE < 4K
The Vax supported such things, but FreeBSD does not. This further
implies that MJUMPAGESIZE > MCLBYTES so assert this and remove code
handling them being equal.
Brooks Davis [Wed, 24 Aug 2022 17:32:09 +0000 (18:32 +0100)]
pkg: Add limited --debug/-d support
Add an internal debug level global:
- Level 1 (-d) currently does nothing.
- Level 2 (-d -d) enables libfetch debugging (quite verbose) so it's
possible to see what pkg is attempting to download without having
to sniff traffic.
netinet6: fix SIOCSPFXFLUSH_IN6 by skipping manually-configured prefixes
Summary:
Currently netinet6/ code allocates IPv6 prefixes (nd_prefix) for
both manually-assigned addresses and advertised prefixes. As a result,
prefixes from manually-assigned prefixes can be seen in `ndp -p` list
and be cleared via `ndp -P`. The latter relies on the SIOCSPFXFLUSH_IN6
ioctl to clear to prefix list.
The original intent of the SIOCSPFXFLUSH_IN6 was to clear prefixes
originated from the advertising routers:
```
1998-09-02 JINMEI, Tatuya <jinmei@isl.rdc.toshiba.co.jp>
* nd6.c (nd6_ioctl): added 2 new ioctls; SIOCSRTRFLUSH_IN6 and
SIOCSPFXFLUSH_IN6. The former is to flush all default routers
in the default router list, and the latter is to flush all the
prefixes and the addresses derived from them in the prefix list.
```
Restore the intent by marking prefixes derived from the RA messages
with newly-added ndpr_flags.ra_derived flag and skip prefixes not marked
with such flag during deletion and listing.
Emmanuel Vadot [Thu, 18 Aug 2022 15:34:35 +0000 (17:34 +0200)]
linuxkpi: unbreak linux_i2cbb
This is a joint work with manu.
- fixed conditions in do_i2c_transfer and i2c_transfer as linux_i2cbb
does not set adapter->algo->master_xfer but does set
adapter->algo_data;
- fixed parent bus specification for linux_i2cbb driver module;
- actually implemented iicbb_transfer method;
- added iicbb_pre_xfer and iicbb_post_xfer methods;
- removed unnecessary and harmful delays (and other extra logic) from
iicbb methods as iicbb driver already has them;
- added setting of iicbb speed based on algo_data->udelay, so that iicbb
uses correct delays;
Warner Losh [Wed, 24 Aug 2022 12:27:01 +0000 (06:27 -0600)]
arm64: Remove unused typedef
We don't use EFI_MEMORY_DESCRIPTOR that's typedef'd here. We use the one
from sys/efi.h instead. Remove the clutter here as these two are subtly
different (though wind up with the same layout due to alignment rules).
Kirk McKusick [Wed, 24 Aug 2022 06:48:40 +0000 (23:48 -0700)]
Provide cache coherency between getnextinode() and ginode()
The fsck_ffs(8) utility has two subsystems for reading and writing
inodes. The getnextinode() interface is used in Pass 1 (and Pass
1b if needed) to sequentially walk through all the inodes in the
filesystem. The ginode() interface is used to read and write
individual inodes. Pass 1 uses a mix of both interfaces. This
change ensures that ginode() returns a pointer to the inode in the
cache maintained by getnextinode() when that interface holds the
requested inode so that all modifications to the inode are made in
a single place and are all written to the disk together.
Reported by: Peter Holm
Tested by: Peter Holm
Sponsored by: The FreeBSD Foundation
Kirk McKusick [Wed, 24 Aug 2022 06:28:30 +0000 (23:28 -0700)]
Update standard superblock when successful using an alternate superblock.
Historically fsck_ffs(8) would only use alternate superblocks when
running in manual mode. When the standard superblock fails, it now
tries to find and use a backup superblocks even when running in `preen'
mode. If an alternate superblock is found and the filesystem is
successfully cleaned up using it, write the alternate superblock
back to the standard superblock so that the filesystem can be
subsequently mounted and used.
Reported by: Peter Holm
Tested by: Peter Holm
Sponsored by: The FreeBSD Foundation
arcanist: use FreeBSD/git project repository instead of FreeBSD/svn
Current `.arcconfig` specifies explicit mapping to the FreeBSD/SVN
project in Phabricator, which is inactive. This mapping only gets
updated to the current "production" (FreeBSD/git) project when the
underlying change is committed.
Update arcanist configuration to create all new diffs in the
FreeBSD/git project.
Randall Stewart [Tue, 23 Aug 2022 13:17:05 +0000 (09:17 -0400)]
tcp: Rack rwnd collapse.
Currently when the peer collapses its rwnd, we mark packets to be retransmitted
and use the must_retran flags like we do when a PMTU collapses to retransmit the
collapsed packets. However this causes a problem with some middle boxes that
play with the rwnd to control flow. As soon as the rwnd increases we start resending
which may be not even a rtt.. and in fact the peer may have gotten the packets. Which
means we gratuitously retransmit packets we should not.
The fix here is to make sure that a rack time has passed before retransmitting the packets.
This makes sure that the rwnd collapse was real and the packets do need retransmission.
Randall Stewart [Tue, 23 Aug 2022 13:12:31 +0000 (09:12 -0400)]
TCP Lro has a loss of timestamp precision and reorders packets.
A while back Hans optimized the LRO code. This is great but one
optimization he did degrades the timestamp precision so that
all flushed LRO entries end up with the same LRO timestamp
if there is not a hardware timestamp. The intent of the LRO timestamp
is to get as close to the time that the packet arrived as possible. Without
the LRO queuing this works out fine since a binuptime is taken and then
the rx_common code is called. But when you go through the queue path
you end up *not* updating the M_LRO_TSTMP fields.
Another issue in the LRO code is several places that cause packet reordering. In
general TCP can handle reordering but it can cause extra un-needed retransmission
as well as other oddities. We will fix all of the reordering problems.
Lets fix this so that we restore the precision to the timestamp.
Luiz Amaral [Mon, 22 Aug 2022 18:54:36 +0000 (20:54 +0200)]
pfsync: replace struct pfsync_pkt with int flags
Get rid of struct pfsync_pkt. It was used to store data on the stack to
pass to all the submessage handlers, but only the flags part of it was
ever used. Just pass the flags directly instead.
Jessica Clarke [Mon, 22 Aug 2022 21:02:53 +0000 (22:02 +0100)]
tools/build: Unbreak bmake bootstrap on Linux
Currently make.py has a hack to add the cross-build headers to the
include search path when bootstrapping bmake on Linux (but not macOS).
This is a bit of an abuse of these headers, and e9ba1fd5eda2 was not
prepared for this, since sys/bitcount.h won't exist in that instance (it
gets copied into WORLDTMP during the legacy build). Work around this
until we can wean the bmake bootstrap off using these headers by not
including sys/bitcount.h when it doesn't exist.
Fixes: e9ba1fd5eda2 ("tools/build: Provide FreeBSD's bitstring API when cross-building")
Rick Macklem [Mon, 22 Aug 2022 20:54:24 +0000 (13:54 -0700)]
nfsd: Allow multiple instances of rpc.tlsservd
During a discussion with someone working on NFS-over-TLS
for a non-FreeBSD platform, we agreed that a single server
daemon for TLS handshakes could become a bottleneck when
an NFS server first boots, if many concurrent NFS-over-TLS
connections are attempted.
This patch modifies the kernel RPC code so that it can
handle multiple rpc.tlsservd daemons. A separate commit
currently under review as D35886 for the rpc.tlsservd
daemon.
Paul Dagnelie [Mon, 22 Aug 2022 19:36:22 +0000 (12:36 -0700)]
Prevent zevent list from consuming all of kernel memory
There are a couple changes included here. The first is to introduce
a cap on the size the ZED will grow the zevent list to. One million
entries is more than enough for most use cases, and if you are
overflowing that value, the problem needs to be addressed another
way. The value is also tunable, for those who want the limit to be
higher or lower.
The other change is to add a kernel module parameter that allows
snapshot creation/deletion to be exempted from the history logging;
for most workloads, having these things logged is valuable, but for
some workloads it produces large quantities of log spam and isn't
especially helpful.
Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Issue #13374
Closes #13753
Mateusz Kozyra [Mon, 22 Aug 2022 07:07:15 +0000 (09:07 +0200)]
uart: Add ACPI entry for LS1046A UART
NXP defines unique name for LS1046A UART - "NXP0018".
It is ns8250 compatible, adding a new uart compat data entry is enough
to make it work.
Tested on LS1046ARDB.