Jessica Clarke [Thu, 14 Dec 2023 20:17:20 +0000 (20:17 +0000)]
kldxref: Reduce divergence between per-architecture files
Note that relbase is always 0 for DSOs so its omission for __KLD_SHARED
architectures was not a bug in practice.
Whilst here, also parenthesise the dest offset for 'where' to avoid
transiently creating an out-of-bounds pointer, which is UB (though even
on CHERI architectures, where capability bounds compression can turn
such a pointer into an invalid capability that traps on dereference,
optimisation will reassociate to the correct form in practice and thus
work just fine).
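A minimal sketch of the parenthesisation point, assuming a buffer whose first byte corresponds to a known target address. The names reloc_target, dest, r_offset and dataoff are hypothetical and are not taken from the kldxref sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the kldxref source: return the location inside
 * the buffer 'dest' that corresponds to target address 'r_offset', given
 * that dest[0] holds the byte at target address 'dataoff'.
 */
static unsigned char *
reloc_target(unsigned char *dest, size_t len, uint64_t r_offset,
    uint64_t dataoff)
{
	/* The caller must pass an address that lies inside the buffer. */
	assert(r_offset >= dataoff && r_offset - dataoff < len);

	/*
	 * The offsets are combined as integers first, so the only pointer
	 * ever formed lies within [dest, dest + len).
	 *
	 * The unparenthesised form, (dest - dataoff) + r_offset, would first
	 * create dest - dataoff, which points before the buffer and is
	 * undefined behaviour even though the final value is the same.
	 */
	return (dest + (r_offset - dataoff));
}
```

Both forms compute the same address on conventional hardware; the difference is only in which intermediate values exist.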
John Baldwin [Tue, 12 Dec 2023 23:43:00 +0000 (15:43 -0800)]
kldxref: Make use of libelf to be a portable cross tool
This allows kldxref to operate on kernel objects from any
architecture, not just the native architecture. In particular, this
will permit generating linker.hints files as part of a cross-arch
release build.
- elf.c is a new file that includes various wrappers around libelf
including routines to read ELF data structures such as program and
section headers and ELF relocations into the "generic" forms
described in <gelf.h>. This file also provides routines for
converting a linker set into an array of addresses (GElf_Addr)
as well as reading architecture-specific mod_* structures and
converting them into "generic" Gmod_* forms where pointers are
replaced with addresses.
- The various architecture-specific reloc handlers now use GElf_*
types for most values (including GElf_Rel and GElf_Rela for
relocation structures) and use routines from <sys/endian.h> to read
and write target values. A new linker set matches reloc handlers
to specific ELF (class, encoding, machine) tuples.
- The bits of kldxref.c that write out linker.hints now use the
encoding (ELFDATA2[LM]SB) of the first file encountered in a
directory to set the endianness of the output file. Input files
with a different architecture in the same directory are skipped with
a warning. In addition, the initial version record for the file
must be deferred until the first record is finished since the
architecture of the output file is not known until then.
- Various places that used 'sizeof(void *)' throughout now use
'elf_pointer_size()' to determine the size of a pointer in the
target architecture.
Tested by: amd64 binary on both amd64 and i386 /boot/kernel
Reviewed by: imp
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D42966
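As a rough illustration of the cross-tool idea in the commit above, the sketch below decodes a target value according to the target's encoding and picks the pointer size from the target's ELF class instead of using sizeof(void *). The enum and function names are hypothetical; FreeBSD's <sys/endian.h> offers helpers such as le32dec() and be32dec() for the decoding step, and the manual byte shuffling here is only a portable stand-in for them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for ELFDATA2LSB/ELFDATA2MSB and ELFCLASS32/64. */
enum target_encoding { TARGET_LSB, TARGET_MSB };
enum target_class { TARGET_CLASS32, TARGET_CLASS64 };

/* Decode a 32-bit target value byte by byte, independent of the host. */
static uint32_t
target_read32(const unsigned char *p, enum target_encoding enc)
{
	if (enc == TARGET_LSB)
		return ((uint32_t)p[0] | (uint32_t)p[1] << 8 |
		    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
	return ((uint32_t)p[3] | (uint32_t)p[2] << 8 |
	    (uint32_t)p[1] << 16 | (uint32_t)p[0] << 24);
}

/* Size of a pointer on the target, rather than sizeof(void *) on the host. */
static size_t
target_pointer_size(enum target_class cls)
{
	return (cls == TARGET_CLASS64 ? 8 : 4);
}
```

The same bytes decode to different values depending on the encoding, which is why the output file's endianness has to be fixed by the first input file seen.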
John Baldwin [Tue, 12 Dec 2023 23:30:16 +0000 (15:30 -0800)]
kldxref: Refactor PNP entry parsing, no functional change
- Add a free_pnp_list to complement parse_pnp_list. Add freeing
of 'new_desc' which was previously leaked.
- Move body of loop that checked a single pnp list element against a
table entry into a parse_pnp_entry function to reduce indentation
and split parse_entry into a smaller function.
- Similarly, split out a record_pnp_info function from parse_entry
which builds the pnp_list and walks a table.
Olivier Certner [Fri, 24 Nov 2023 21:21:16 +0000 (22:21 +0100)]
libthr: thr_attr.c: EINVAL, not ENOTSUP, on invalid arguments
On first read, POSIX may seem ambiguous about the return code for some
scheduling-related pthread functions on invalid arguments. But a more
thorough reading and a bit of standards archeology strongly suggest
that this case should be handled by EINVAL and that ENOTSUP is reserved
for implementations providing only part of the functionality required by
the POSIX option POSIX_PRIORITY_SCHEDULING (e.g., if an implementation
doesn't support SCHED_FIFO, it should return ENOTSUP on a call to, e.g.,
sched_setscheduler() with 'policy' SCHED_FIFO).
This reading is supported by the second sentence of the very definition
of ENOTSUP, as worded in CAE/XSI Issue 5 and POSIX Issue 6: "The
implementation does not support this feature of the Realtime Feature
Group.", and the fact that an additional ENOTSUP case was added to
pthread_setschedparam() in Issue 6, which introduces SCHED_SPORADIC,
saying that pthread_setschedparam() may return it when attempting to
dynamically switch to SCHED_SPORADIC on systems that don't support
that.
glibc, illumos and NetBSD also support that reading by always returning
EINVAL. So does OpenBSD: it always returns EINVAL, although the
corresponding code has a comment suggesting returning ENOTSUP for
SCHED_FIFO and SCHED_RR, which it effectively doesn't support.
Additionally, always returning EINVAL fixes inconsistencies where EINVAL
would be returned on some out-of-range values and ENOTSUP on others.
Reviewed by: markj
Approved by: markj (mentor)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D43006
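The error-code policy argued for in the commit above can be sketched as follows. The function check_sched_policy is a hypothetical validator and is not the libthr code; it only shows an out-of-range policy being reported as EINVAL rather than ENOTSUP.

```c
#include <errno.h>
#include <sched.h>

/* Hypothetical validator, not the libthr source. */
static int
check_sched_policy(int policy)
{
	switch (policy) {
	case SCHED_FIFO:
	case SCHED_RR:
	case SCHED_OTHER:
		return (0);
	default:
		/* An out-of-range argument is invalid, not "unsupported". */
		return (EINVAL);
	}
}
```

Under this reading ENOTSUP would only appear if one of the listed policies were recognised but not implemented.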
Marius Strobl [Fri, 12 Jan 2024 22:27:07 +0000 (23:27 +0100)]
uart(4): Honor hardware state of NS8250-class for tsw_busy
In 9750d9e5, I brought the equivalent of the TS_BUSY flag back in a
mostly hardware-agnostic way in order to fix tty_drain() and, thus,
TIOCDRAIN for UARTs with TX FIFOs. This proved to be sufficient for
fixing the regression reported. So in light of the release cycle of
FreeBSD 10.3, I decided that this change was good enough for the
time being and opted to go with the smallest possible yet generic
(for all UARTs driven by uart(4)) solution addressing the problem at
hand.
However, at least for the NS8250-class the above isn't a complete
fix as these UARTs only trigger an interrupt when the TX FIFO becomes
empty. At this point, there still can be an outstanding character
left in the transmit shift register as indicated via the LSR. Thus,
this change adds the 3rd (besides the tty(4) and generic uart(4) bits)
part I had in my tree ever since, adding a uart_txbusy method to be
queried in addition for tsw_busy and hooking it up as appropriate
for the NS8250-class.
As it turns out, the exact equivalent of this 3rd part later on was
implemented for uftdi(4) in 9ad221a5.
While at it, explain the rationale behind the deliberately missing
locking in uart_tty_busy() (also applying to the generic sc_txbusy
testing already present).
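The hardware check described above can be sketched with the standard 8250/16550 line status register bits. The helper name ns8250_tx_busy is hypothetical and this is not the uart(4) code; it only shows why "FIFO empty" and "transmitter idle" are different conditions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard 8250/16550 line status register bits. */
#define	LSR_THRE	0x20	/* transmit holding register (or FIFO) empty */
#define	LSR_TEMT	0x40	/* holding and shift registers both empty */

/* Hypothetical helper: is the transmitter still sending, per this LSR? */
static bool
ns8250_tx_busy(uint8_t lsr)
{
	/* Only TEMT says that the last character has left the wire. */
	return ((lsr & LSR_TEMT) == 0);
}
```

With THRE set and TEMT clear, the FIFO has drained but one character is still being shifted out, so a drain operation should keep waiting.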
Marius Strobl [Tue, 9 Jan 2024 22:01:46 +0000 (23:01 +0100)]
igb(4): Remove disconnected SYSCTL
The global hw.igb.rx_process_limit knob never was adhered to by the
in-tree version of this driver but similar functionality is available
via the device-specific dev.igb.N.iflib.rx_budget.
While at it, remove the - besides initialization of tx_process_limit -
unused {r,t}x_process_limit members.
Stefan Eßer [Sat, 29 Jul 2023 18:52:53 +0000 (20:52 +0200)]
usr.bin/gh-bc: fix Makefile for WITHOUT_NLS_CATALOGS case
Some macro definitions had been moved into a Makefile section
that depends on MK_NLS_CATALOGS != "no", leading to LTO and the
installation of tests being disabled in the WITHOUT_NLS_CATALOGS
case.
This update fixes a bug where line breaks in printed numbers may not
match the line length set by the user. The value is printed correctly,
just not split as specified in some situations.
Output a line as soon as it is possible to determine that it will have
to be output. For the basic case, this means output each line as it is
read unless it is identical to the previous one. For the -d case, it
means output the first instance as soon as the second is read, unless
the -c option was also given. The -D and -u cases were already fine.
Add test cases for interactive use with no options and with -d.
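The "output as soon as possible" rule above can be sketched as a small state machine that is fed one line at a time and answers whether something should be printed immediately. The struct and function names are hypothetical, the -c, -D and -u options are ignored, and lines are truncated to a fixed buffer purely to keep the sketch short.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct uniq_state {
	char prev[256];		/* previous line, truncated in this sketch */
	bool have_prev;
	unsigned repeats;	/* times prev has been seen again so far */
};

/*
 * Feed one line and return true if a line should be printed right now.
 * Default mode prints each new distinct line as it is read; -d mode prints
 * a line the moment its second instance is read.
 */
static bool
uniq_feed(struct uniq_state *st, const char *line, bool dflag)
{
	if (st->have_prev && strcmp(st->prev, line) == 0) {
		st->repeats++;
		/* -d: report on the first repeat only. */
		return (dflag && st->repeats == 1);
	}
	/* A new distinct line starts a new run. */
	snprintf(st->prev, sizeof(st->prev), "%s", line);
	st->have_prev = true;
	st->repeats = 0;
	return (!dflag);
}
```

Because the decision never depends on lines that have not been read yet, interactive use sees output without waiting for end of input.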
Daniel Tameling [Sat, 25 Feb 2023 17:25:51 +0000 (10:25 -0700)]
uniq(1): use strtonum to parse options
Previously strtol was used and the result was directly cast to an int
without checking for an overflow. Use strtonum instead since it is
safer and tells us what went wrong.
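For readers without strtonum(3) at hand, the sketch below shows the kind of checking it performs, built on strtol as a portable stand-in. The function parse_int_option and its error strings are hypothetical and only loosely mirror the strtonum interface.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse 's' as a decimal int in [minval, maxval].
 * Returns 0 and stores the value on success; otherwise returns -1 and
 * points *errstr at a short reason.
 */
static int
parse_int_option(const char *s, int minval, int maxval, int *out,
    const char **errstr)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(s, &end, 10);
	if (end == s || *end != '\0') {
		*errstr = "invalid";
		return (-1);
	}
	if ((errno == ERANGE && val == LONG_MIN) || val < minval) {
		*errstr = "too small";
		return (-1);
	}
	if ((errno == ERANGE && val == LONG_MAX) || val > maxval) {
		*errstr = "too large";
		return (-1);
	}
	/* Safe: val is known to fit between two int bounds. */
	*out = (int)val;
	*errstr = NULL;
	return (0);
}
```

Casting the strtol result straight to int, as the old code did, would silently truncate a value such as 99999999999999999999 instead of rejecting it.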
The standard is somewhat unclear, but on balance, I believe that the
phrase “the rest of the input line” should be interpreted to mean the
rest of the input line including the terminating newline if and only if
there is one. This means the current implementation is incorrect on two
points:
- First, it suppresses the previous line's newline in the '1' case.
- Second, it unconditionally emits a newline at the end of the output
for non-empty input, even if the input did not end with a newline.
Resolve this by rewriting the main loop. Instead of special-casing the
first line and then assuming that every line ends with a newline, we
remember how each line ends and emit that either at the beginning of
the next line or at the end of the file except in the one case ('+')
where the standard explicitly says not to.
While here, try to reduce diff to upstream a little and update their
RCS tag to reflect the fact that while we've diverged significantly
from them, we've incorporated all their changes. Remove the useless
second RCS tag.
We also update the tests to account for the change in interpretation
of the '1' case and add a test case for unterminated input.
Rewrite `copy_file()` so the lflag and sflag are handled as early as
possible instead of constantly checking that they're not set and then
handling them at the end. This also opens the door to changing the
failure logic at some future point (for instance, we might decide to
fall back to copying if `errno` indicates that the file system does not
support links).
This test case tests two different things: first, that copying a symlink
results in a file with the same contents as the target of the symlink,
rather than a second symlink, and second, that cp will refuse to copy a
file to itself, or to a link to itself, or a link to its target. Leave
the first part in basic_symlink, move the second part to a new test case
named samefile, and slightly expand both cases.
- The HLPR flags are grouped together at the beginning because they are
the standard flags for programs using FTS. Move the N flag out from
among them to its correct place in the sequence.
- The Pflag variable isn't used outside main(), but moving it out lets
us skip initialization and keeps it with its friends H, L and R.
If the destination file exists but we decide to unlink it, set the dne
flag. This means we don't need to re-check the conditions that would
have caused us to delete the file when we later need to decide whether
to create or replace it.
Warner Losh [Thu, 7 Dec 2023 19:32:27 +0000 (12:32 -0700)]
cp: Add -N flag, inspired by NetBSD's similar flag
Add -N to suppress copying of file flags when -p is specified (explicitly
or implicitly). Oftentimes we don't care about the flags or wish to be
able to copy to NFS, and this comes in handy for that. FreeBSD's and
NetBSD's cp are somewhat different, so I had to reimplement all but one
of the patch hunks...
cp: Don't warn for chflags() failing with EOPNOTSUPP if flags == 0
From NetBSD's utils.c 1.5 importing a BSDI change, with light
formatting changes:
Author: cgd <cgd@NetBSD.org>
Date: Wed Feb 26 14:40:51 1997 +0000
Patch from BSDI (via Keith Bostic):
>NFS doesn't support chflags; ignore errors unless there's reason
>to believe we're losing bits. (Note, this still won't be right
>if the server supports flags and we were trying to *remove* flags
>on a file that we copied, i.e., that we didn't create.)
Tom Jones [Thu, 27 Jan 2022 17:24:45 +0000 (17:24 +0000)]
Remove SMALL conditionals from gzip
gzip has SMALL conditionals which enable building a reduced size version
of the binary. These exist as part of the introduction of BSD licensed
gzip in 2004 in NetBSD and appear to have been required to reach a size
for inclusion in their install media. For more information see commits
to gzip in the NetBSD tree on the 28th of March 2004.
SMALL doesn't appear to be hooked up to our build system and
complicates gzip quite a bit.
Kyle Evans [Wed, 3 Jan 2024 22:17:59 +0000 (16:17 -0600)]
bhyveload: use a dirfd to support -h
Don't allow lookups from the loader scripts, which in rare cases may be
in guest control depending on the setup, to leave the specified host
root. Open the root dir and strictly do RESOLVE_BENEATH lookups from
there.
cb_open() has been restructured a bit to work nicely with this, using
fdopendir() in the directory case and just using the fd we already
opened in the regular file case.
hostbase_open() was split out to provide an obvious place to apply
rights(4) if that's something we care to do.
man.sh on stable/13 is missing some new features.
Unfortunately this means that this fix is not working as on stable/14.
Be patient and wait until the following 2 commits are ready to merge.
Mike Karels [Sun, 14 Jan 2024 17:01:19 +0000 (11:01 -0600)]
Increase the size of riscv GENERICSD images to 6 GB
The stable/13 snapshot this week failed to build the riscv GENERICSD
image because it ran out of space. Checking main and stable/14
snapshots, they are also low on space, around 100% or more of
capacity. Increase them all from 5 GB to 6 GB. Note, this is the
only riscv image configuration.
Zhenlei Huang [Tue, 7 Nov 2023 04:45:25 +0000 (12:45 +0800)]
kern linker: Do not retry loading modules on EEXIST
LINKER_LOAD_FILE() calls linker_load_dependencies() which will return
EEXIST in case the module to be loaded has already been compiled into
the kernel. Since the format of the module has been recognized at that
point, there is no need to retry loading with a different linker;
otherwise userland will get the misleading error number ENOEXEC.
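The retry behaviour described above can be sketched as a loop over candidate loaders. This is a simplified model with hypothetical names (load_fn, try_loaders and the two stub loaders) and is not the kernel linker code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical loader callback standing in for one linker class. */
typedef int (*load_fn)(const char *filename);

/* Stub loaders for illustration only. */
static int
stub_already_present(const char *filename)
{
	(void)filename;
	return (EEXIST);	/* format recognised, module already in kernel */
}

static int
stub_wrong_format(const char *filename)
{
	(void)filename;
	return (ENOEXEC);	/* this loader does not understand the file */
}

/*
 * Try each loader in turn.  A "not my format" result moves on to the next
 * loader; EEXIST means the format was recognised, so it is returned at once
 * instead of being overwritten by a later ENOEXEC.
 */
static int
try_loaders(const load_fn *loaders, size_t n, const char *filename)
{
	int error = ENOEXEC;

	for (size_t i = 0; i < n; i++) {
		error = loaders[i](filename);
		if (error == 0 || error == EEXIST)
			return (error);
	}
	return (error);
}
```

Without the early return, the second loader's ENOEXEC would mask the more informative EEXIST from the first.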
tcp: prevent spurious empty segments and fix uncommon panic
Only try sending more data on pure ACKs when there is
more data available in the send buffer.
In the case of a retransmitted SYN not being sent due to
an internal error, the snd_una/snd_nxt accounting could
be off, leading to a panic. Pulling snd_nxt up to snd_una
prevents this from happening.
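The snd_nxt adjustment mentioned above can be sketched with a wraparound-safe sequence comparison. The struct below is a hypothetical two-field subset of the TCP control block, and pull_up_snd_nxt is an illustrative name rather than a function from the stack.

```c
#include <stdint.h>

/* Sequence-space comparison that stays correct across 32-bit wraparound. */
#define	SEQ_LT(a, b)	((int32_t)((a) - (b)) < 0)

/* Hypothetical subset of the TCP control block. */
struct tcp_seq_state {
	uint32_t snd_una;	/* oldest unacknowledged sequence number */
	uint32_t snd_nxt;	/* next sequence number to send */
};

/* Never let snd_nxt fall behind snd_una. */
static void
pull_up_snd_nxt(struct tcp_seq_state *tp)
{
	if (SEQ_LT(tp->snd_nxt, tp->snd_una))
		tp->snd_nxt = tp->snd_una;
}
```

A plain "<" on the raw integers would give the wrong answer once the sequence space wraps, which is why the signed-difference form is used.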
Xin LI [Fri, 12 Jan 2024 05:38:04 +0000 (21:38 -0800)]
releng-gce: Advertise the availability of gVNIC support in GCE images.
This marks FreeBSD GCE images as gVNIC capable by adding the
--guest-os-features=GVNIC flag at creation time as suggested in GCE
documentation[1]. This allows Generation 3 and newer GCE instances
to leverage advanced networking capabilities and performance
enhancements provided by gVNIC. Users will benefit from these
improvements without needing to create custom images.
This commit introduces SRD metrics through sysctl.
The metrics can be queried using the following sysctl node:
sysctl dev.ena.<device index>.ena_srd_info
This commit adds sysctl support for customer metrics.
Different customer metrics can be found in the following sysctl node:
sysctl dev.ena.<device index>.customer_metrics
ena: Introduce shared sample interval for all stats
Rename sample_interval node to stats_sample_interval and move
it up in the sysctl tree to make it clear that it's relevant for
all the stats and not only ENI metrics (Currently, sample interval node
is found under eni_metrics node).
Path to node:
dev.ena.<device_index>.stats_sample_interval
Once this parameter is set it will set the sample interval for all the
stats nodes including SRD/customer metrics.
Osama Abboud [Mon, 30 Oct 2023 11:27:03 +0000 (11:27 +0000)]
ena: Add sysctl support for spreading IRQs
This commit allows spreading IO IRQs over different CPUs through sysctl.
Two sysctl nodes are introduced:
1- base_cpu: serves as the first CPU to which the first IO IRQ
will be bound.
2- cpu_stride: sets the distance between every two CPUs to which every
two consecutive IO IRQs are bound.
For example for doing the following IO IRQs / CPU binding:
Run the following commands:
sysctl dev.ena.<device index>.irq_affinity.base_cpu=0
sysctl dev.ena.<device_index>.irq_affinity.cpu_stride=2
Also introduced rss_enabled field, which is intended to replace
'#ifdef RSS' in multiple places, in order to prevent code duplication.
We want to bind interrupts to CPUs in case of rss set OR in case
the newly defined sysctl parameter is set. This requires removing a
couple of '#ifdef RSS' in the structs as well, since we'll be using the
relevant parameters in the CPU binding code.
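The base_cpu and cpu_stride knobs described above suggest a simple mapping from IO IRQ index to CPU. The helper below is a hypothetical sketch, not the ena(4) code, and wrapping modulo the CPU count is an assumption of this sketch rather than documented driver behaviour.

```c
#include <assert.h>

/*
 * Hypothetical helper, not the ena(4) source: CPU for the i-th IO IRQ.
 * Wrapping modulo 'ncpus' is an assumption made for this sketch.
 */
static int
irq_to_cpu(int irq_index, int base_cpu, int cpu_stride, int ncpus)
{
	assert(ncpus > 0 && irq_index >= 0 && base_cpu >= 0 &&
	    cpu_stride >= 0);
	return ((base_cpu + irq_index * cpu_stride) % ncpus);
}
```

With base_cpu=0 and cpu_stride=2, consecutive IO IRQs land on CPUs 0, 2, 4 and so on under this model.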
Gleb Smirnoff [Mon, 4 Dec 2023 18:18:56 +0000 (10:18 -0800)]
if_tuntap: fix NOIP build
Note: this removes one TUNDEBUG() for the sake of not having one more
ifdefed variable declaration and for the overall code brevity. The call
from tuntap into LRO can be so easily traced with dtrace(1) that an
80-ish printf(9)-based debugging can be omitted.
Michael Tuexen [Thu, 9 Nov 2023 10:37:27 +0000 (11:37 +0100)]
if_tuntap: support receive checksum offloading for tap interfaces
When enabled, pretend that the IPv4 and transport layer checksum
is correct for packets injected via the character device.
This is a prerequisite for adding support for LRO, which will
be added next. Then packetdrill can be used to test the LRO
code in local mode.
Michael Tuexen [Sun, 5 Nov 2023 19:32:46 +0000 (20:32 +0100)]
if_tuntap: trigger the bpf hook on transmitting for the tap interface
The tun interface triggers the bpf hook when a packet is transmitted,
the tap interface triggers it when the packet is read from the
character device. This is inconsistent.
So fix the tap device such that it behaves like the tun device.
This is needed for adding support for the tap device to packetdrill.
Michael Tuexen [Sun, 5 Nov 2023 14:28:54 +0000 (15:28 +0100)]
udplite: make socketoption available on IPv6 sockets
This patch allows the IPPROTO_UDPLITE-level socket options
UDPLITE_SEND_CSCOV and UDPLITE_RECV_CSCOV to be used on
AF_INET6 sockets in addition to AF_INET sockets.
Michael Tuexen [Tue, 12 Sep 2023 23:33:54 +0000 (01:33 +0200)]
sctp: improve shutting down the read side of a socket
When shutdown(..., SHUT_RD) or shutdown(..., SHUT_RDWR) is called,
really clean up the read queue and issue an ungraceful shutdown if
user messages are affected.
Michael Tuexen [Fri, 8 Sep 2023 14:20:51 +0000 (16:20 +0200)]
sctp: cleanup locking for notifications
All notifications are now queued via sctp_ulp_notify(). Do
the locking of the inp read lock there and validate this in all
functions being used.
This is one step in avoiding race conditions when closing the
read end of an SCTP socket.
Michael Tuexen [Thu, 24 Aug 2023 13:52:55 +0000 (15:52 +0200)]
sctp: improve handling of socket shutdown for reading
If a socket is marked as unable to read any more, drop chunks which
should be added to a control element in the receive queue.
This is consistent with dropping control elements instead of
adding them in the same situation.
Michael Tuexen [Wed, 23 Aug 2023 06:36:15 +0000 (08:36 +0200)]
sctp: improve handling of SHUTDOWN and SHUTDOWN ACK chunks
When handling a SHUTDOWN or SHUTDOWN ACK chunk detect if the peer
is violating the protocol by not having made sure all user messages
are received by the peer. If this situation is detected, abort the
association.
Michael Tuexen [Sat, 19 Aug 2023 10:35:49 +0000 (12:35 +0200)]
sctp: cleanup handling of graceful shutdown of the peer
Don't handle a graceful shutdown of the peer as an implicit signal
that all partial messages are complete. First, this is not implemented
correctly and second this should not be done by the peer. It is more
appropriate to handle this as a protocol violation.
Remove the incorrect code and leave detecting the protocol violation
and its handling in a followup commit.
Michael Tuexen [Fri, 4 Aug 2023 06:32:25 +0000 (08:32 +0200)]
sctp: improve consistency of acc and ccc handling in snd buffer
Don't clear the counters for the socket snd buffer when
shutdown(..., SHUT_WR) or shutdown(..., SHUT_RDWR) is called.
This was causing the system to panic() when SCTP pf tests were
running.