The pidx argument of isc_rxd_flush() indicates which is the last valid
receive descriptor to be used by the NIC. However, current code has
multiple issues:
- Intel drivers write pidx to their RDT register, which means that
NICs will only use the descriptors up to pidx-1 (modulo ring size N),
and won't actually use the one pointed by pidx. This does not break
reception, but it is anyway confusing and suboptimal (the NIC will
actually see only N-2 descriptors as available, rather than N-1).
Other drivers (if_vmx, if_bnxt, if_mgb) adhere to this semantic).
- The semantic used by Intel (RDT is one descriptor past the last
valid one) is used by most (if not all) NICs, and it is also used
on the TX side (also in iflib). Since iflib is not currently
using this semantic for RX, it must decrement fl->ifl_pidx
(modulo N) before calling isc_rxd_flush(), and then the
per-driver callback implementation must increment the index
again (to match the real semantic). This is confusing and suboptimal.
- The iflib refill function is also called at initialization.
However, in case the ring size is smaller than 128 (e.g. if_mgb),
the refill function will actually prepare all the receive
descriptors (N), without leaving one unused, as most of NICs assume
(e.g. to avoid RDT to overrun RDH). I can speculate that the code
looks like this right now because this issue showed up during
testing (e.g. with if_mgb), and it was easy to workaround by
decrementing pidx before isc_rxd_flush().
The goal of this change is to simplify the code (removing a bunch
of instructions from the RX fast path), and to make the semantic of
isc_rxd_flush() consistent across drivers. To achieve this, we:
- change the semantics of the pidx argument to the usual one (that
is the index one past the last valid one), so that both iflib and
drivers avoid the decrement/increment dance.
- fix the initialization code to prepare at most N-1 descriptors.
qlxgb: Initialize if_mtu before setting max_frame_size.
Previously we were relying on ether_ifattach() to set if_mtu, but
max_frame_size is initialized earlier. This fixes a regression
introduced by r250375.
PR: 249050
Submitted by: Christian Vallières <novacrash_@hotmail.com>
MFC after: 3 days
Move all of the error prints in readsb() from stderr to stdout.
The only output from fsck that should go to stderr is the usage message.
if setup() fails then exit with EEXIT rather than 0.
Introduce the SDHCI driver for NXP QorIQ Layerscape SoCs
Implement support for an eSDHC controller found in NXP QorIQ Layerscape SoCs.
This driver has been tested with NXP LS1046A and LX2160A (Honeycomb board),
which is incompatible with the existing sdhci_fsl driver (aiming at older
chips from this family). As such, it is not intended as replacement for
the old driver, but rather serves as an improved alternative for SoCs that
support it.
It comes with support for both PIO and Single DMA modes and samples the
clock from the extres clk API.
Submitted by: Artur Rojek <ar@semihalf.com>
Reviewed by: manu, mmel, kibab
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D26153
In the util-linux version of script, it will always exit with succes.
Except when run with -e, in which case it will have the exit value of
the child. BSD Script already uses the child's exit value for its exit
value. Some config and other helper scripts depend on being able to
specify -e. Accept it for compatibility since we'll already to the
right thing, but otherwise we ignore it.
Add Cannon Point PCH Thermal Controller Device ID.
PR: 249047
Reported by: Dries Michiels <driesm.michiels at gmail.com>
--This line, and those below, will be ignored--
> Description of fields to fill in above: 76 columns --|
> PR: If and which Problem Report is related.
> Submitted by: If someone else sent in the change.
> Reported by: If someone else reported the issue.
> Reviewed by: If someone else reviewed your modification.
> Approved by: If you needed approval for this commit.
> Obtained from: If the change is from a third party.
> MFC after: N [day[s]|week[s]|month[s]]. Request a reminder email.
> MFH: Ports tree branch name. Request approval for merge.
> Relnotes: Set to 'yes' for mention in release notes.
> Security: Vulnerability reference (one per line) or description.
> Sponsored by: If the change was sponsored by an organization (each collaborator).
> Differential Revision: https://reviews.freebsd.org/D### (*full* phabric URL needed).
> Empty fields above will be automatically removed.
It's no longer unusual to be able to build a release with a single
command, so drop "actually" that hints at a surprise. Also just use
"network install directory" instead of referencing FTP; it's more
likely to be HTTP now.
Reviewed by: gjb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26260
The caller-supplied pointer is unconditionally dereferenced at the
beginning of the function, so there is no point in comparing it with
NULL thereafter.
Reported by: Coverity
MFC after: 1 week
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Fix string overflow that could occur during redirection due to passing
the wrong length to strlcpy(3). It looks like it could overflow into
the next field, isc_user, which is properly long to accomodate for it;
I don't think it could cause any harm other than breaking the connection.
Reviewed by: mav
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26247
POSIX is pretty clear that command -v, command -V and type shall write
absolute pathnames. Therefore, we need to prepend the current directory's
name to relative pathnames.
This can happen either when PATH contains a relative pathname or when the
operand contains a slash but is not an absolute pathname.
Make hardware TLS send tag allocation synchronous in mlx5en(4).
Previously the send tag was setup in the background, and all packets for
the given send tag were dropped until ready. Change this to be blocking
behaviour so that once the setsocketopt() for enabling TLS completes,
the socket is ready to send packets. Do this by simply flushing the
work request which does the needed firmware programming during send
tag allocation.
Calling pmc_hook inside a critical section may result in a panic.
This happens when the user callchain is fetched, because it uses
pmap_map_user_ptr, that tries to get the (sleepable) pmap lock when the
needed vsid is not found.
Judging by the implementation in other platforms, intr_irq_handler in
kern/subr_intr.c and what pmc_hook do, it seems safe to move pmc_hook
outside the critical section.
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D26111
andrew [Tue, 1 Sep 2020 11:02:43 +0000 (11:02 +0000)]
Support stage 2 arm64 pmap in more places
Add support for stage 2 pmap to pmap_pte_dirty, pmap_release, and more
of pmap_enter. This adds support in all placess I have hit while testing
bhyve ehile faulting pages in as needed.
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D26065
Change printf format string to include the extra blank
This is a follow up change to r364321 after a discussion about the style.
All near by places use extra blanks in format strings, and while use of the
format string to provide the extra blank may need more cycles than adding 1
to twidth, it generates shorter code and is clearer in the opinion of some
reviewers of the previous change.
The "Intel Sunrise Point-LP USB 3.0 controller" doesn't update the wMaxPacket
field in the control endpoint context automatically causing a BABBLE error code
on the initial first USB device descriptor read, when the bMaxPacketSize is not
8 bytes.
OZFS is the top of the OpenZFS tree (aka src/sys/contrib/openzfs).
ZFSOSSRC is the path to the OepnZFS sources
ZFSOSINC is the path to the OepnZFS includes
Save 7k of text space by using simpler crc32 for standalone case. we
don't need all that fancy optimization in the boot loader, so use a
simplified version of the CRC function. We could save more by doing it
one bit at a time rather than 32, but this is the biggest savings at
the smallest performance hit.
With LUA and verfied exec, gptboot, gptzfsboot and friends are pushing
the ~530k limit and every little bit helps.
When SMP support for powerpc was added in r178628, the last callers of this
function were removed. All code that needs to manipulate the task priority
just does it directly instead.
Add a check to test for the case of the "tls" option being used with "udp".
The KERN_TLS only supports TCP, so use of the "tls" option with "udp" will
not work. This patch adds a test for this case, so that the mount is not
attempted when both "tls" and "udp" are specified.
imp [Mon, 31 Aug 2020 23:31:16 +0000 (23:31 +0000)]
Warn for the non pccard attachments
These devices have non-pccard attachments. Warn for those as well. Both an and
wi don't do the modern cyrpto needed to use these cards on secure wifi networks.
an needs firmware from Cisco, which I don't think was ever produced. wi could
in theory do it with raw frames and on-host encryption, but nobody has written
that in the 15 years since WEP was cracked.
MFC After: 3 days
Noticed by: rgrimes
Differential Revision: https://reviews.freebsd.org/D26138
imp [Mon, 31 Aug 2020 21:04:00 +0000 (21:04 +0000)]
Add deprecation notice for apm BIOS
Add deprecation notice for apm bios, aka the apm(4) device. The apm(8)
command will remain, at least for a while, since ACPI emulates the apm
ioctl interface.
Discussed on: arch@
Relnotes: yes
MFC After: 3 days
imp [Mon, 31 Aug 2020 19:47:30 +0000 (19:47 +0000)]
gc pmtimer and apm
pmtimer was removed from base some time ago. apm hasn't been relevant
for these devices in a long time (and was commented out). Remove them
both from these config files.
kib [Mon, 31 Aug 2020 16:23:51 +0000 (16:23 +0000)]
mlx5_core: Import PDDR register definitions
PDDR (Port Diagnostics Database Register) is used to read the physical
layer debug database, which contains helpful troubleshooting information
regarding the state of the link.
PDDR register can only be queried when PCAM register reports it as
supported in its register mask. A new helper macro was added to
the MLX5_CAP_* infrastructure in order to access this mask.
vangyzen [Mon, 31 Aug 2020 16:17:28 +0000 (16:17 +0000)]
infiniband: Appease Coverty
Coverity claims the call to rdma_gid2ip in cma_igmp_send overwrites addr.
Use a consistent definition of sockaddr to prevent detections and code
changes in the future.
markj [Mon, 31 Aug 2020 15:59:17 +0000 (15:59 +0000)]
ggated(8): Avoid doubly opening the requested disk device.
- Initialize the disk device fd field in connection_new().
- Close the disk device after handing the connection over
to a child worker.
- Avoid re-opening a disk device for each connection from
the same client, avoiding an fd leak.
kevans [Mon, 31 Aug 2020 15:07:15 +0000 (15:07 +0000)]
posixshm: fix setting of shm_flags
Noted in D24652, we currently set shmfd->shm_flags on every
shm_open()/shm_open2(). This wasn't properly thought out; one shouldn't be
able to specify incompatible flags on subsequent opens of non-anon shm.
Move setting of shm_flags explicitly to the two places shmfd are created, as
we do with seals, and validate when we're opening a pre-existing mapping
that we've either passed no flags or we've passed the exact same flags as
the first time.
gallatin [Mon, 31 Aug 2020 13:53:14 +0000 (13:53 +0000)]
make m_getm2() resilient to zone_jumbop exhaustion
When the zone_jumbop is exhausted, most things using
using sosend* (like sshd) will eventually
fail or hang if allocations are limited to the
depleted jumbop zone. This makes it imossible to
communicate with a box which is under an attach which
exhausts the jumbop zone.
Rather than depending on the page size zone, also try cluster
allocations to satisfy larger requests. This allows me
to ssh to, and serve 100Gb/s of traffic from a server which
under attack and has had its page-sized zone exhausted.
whu [Mon, 31 Aug 2020 09:05:45 +0000 (09:05 +0000)]
Hyper-V: storvsc: Enhance srb_status code handling.
In hv_storvsc_io_request() when coring, prevent changing of the send channel
from the base channel to another one. storvsc_poll always probes on the base
channel.
Based upon conversations with Microsoft, changed the handling of srb_status
codes. Most we should never get, others yes. All are treated as retry-able
except for two. We should not get these statuses, but if we ever do, the I/O
state is not known.
kevans [Mon, 31 Aug 2020 01:45:48 +0000 (01:45 +0000)]
ipv6: quit dropping packets looping back on p2p interfaces
To paraphrase the below-referenced PR:
This logic originated in the KAME project, and was even controversial when
it was enabled there by default in 2001. No such equivalent logic exists in
the IPv4 stack, and it turns out that this leads to us dropping valid
traffic when the "point to point" interface is actually a 1:many tun
interface, e.g. with the wireguard userland stack.
Even in the case of true point-to-point links, this logic only avoids
transient looping of packets sent by misconfigured applications or
attackers, which can be subverted by proper route configuration rather than
hardcoded logic in the kernel to drop packets.
In the review, melifaro goes on to note that the kernel can't fix it, so it
perhaps shouldn't try to be 'smart' about it. Additionally, that TTL will
still kick in even with incorrect route configuration.
rmacklem [Sun, 30 Aug 2020 21:21:58 +0000 (21:21 +0000)]
Add support for the NFS over TLS exports to mountd.
Three new export flags are added to mountd that will restrict exported
file system mounts to use TLS. Without these flags, TLS is allowed, but not
required.
The exports(5) man page will be updated in a future commit.
glebius [Sun, 30 Aug 2020 17:13:04 +0000 (17:13 +0000)]
Followup on r364922. Old comment said that the only reason to put
the hook at queue mode was that mn_rx_intr() doesn't run at splnet
level. In today's netgraph the only legitimate reason for queue mode
is recursion avoidance. So I see no reason for queue mode here.
sjg [Sat, 29 Aug 2020 21:05:43 +0000 (21:05 +0000)]
zalloc_malloc:Free hexdump preceeding buffer when we detect overflow
Move hexdump from stand/common/misc.c to stand/libsa/hexdump.c
(svn cp)
Disable use of pager - causes linking issue for boot1
can be re-enabled by defining HEXDUMP_PAGER.
wulf [Sat, 29 Aug 2020 19:26:31 +0000 (19:26 +0000)]
LinuxKPI: Implement ksize() function.
In Linux, ksize() gets the actual amount of memory allocated for a given
object. This commit adds malloc_usable_size() to FreeBSD KPI which does
the same. It also maps LinuxKPI ksize() to newly created function.
gjb [Sat, 29 Aug 2020 15:30:21 +0000 (15:30 +0000)]
Avoid the build from falling over if devel/git is not installed
on the system. Set a null branch/hash in this case, to avoid
undefined GITREV/GITBRANCH variables from falling over in other
areas.
Reported by: many
Sponsored by: Rubicon Communications, LLC (netgate.com)
imp [Sat, 29 Aug 2020 04:30:12 +0000 (04:30 +0000)]
Move to using sbuf for some sysctl in newbus
Convert two different sysctl to using sbuf. First, for all the default
sysctls we implement for each device driver that's attached. This is a
pure sbuf conversion.
Second, convert sysctl_devices to fill its buffer with sbuf rather
than a hand-rolled crappy thing I wrote years ago.
imp [Sat, 29 Aug 2020 04:30:06 +0000 (04:30 +0000)]
Retire devctl_notify_f()
devctl_notify_f isn't needed, so retire it. The flags argument is now
unused, so rather than keep it around, retire it. Convert all old
users of it to devctl_notify(). This path no longer sleeps, so is safe
to call from any context. Since it doesn't sleep, it doesn't need to
know if it is OK to sleep or not.
imp [Sat, 29 Aug 2020 04:29:53 +0000 (04:29 +0000)]
devctl: move to using a uma zone
Convert the memory management of devctl. Rewrite if to make better
use of memory. This eliminates several mallocs (5? worse case) needed
to send a message. It's now possible to always send a message, though
if things are really backed up the oldest message will be dropped to
free up space for the newest.
Add a static bus_child_{location,pnpinfo}_sb to start migrating to
sbuf instead of buffer + length. Use it in the new code. Other code
will be converted later (bus_child_*_str is only used inside of
subr_bus.c, though implemented in ~100 places in the tree).
melifaro [Fri, 28 Aug 2020 23:01:56 +0000 (23:01 +0000)]
Move fib_rte_to_nh_flags() from net/route_var.h to net/route/nhop_ctl.c.
No functional changes.
Initially this function was created to perform runtime flag conversions
for the previous incarnation of fib lookup functions. As these functions
got deprecated, move the function to the file with the only remaining
caller. Lastly, rename it to convert_rt_to_nh_flags() to follow the
naming notation.
melifaro [Fri, 28 Aug 2020 22:50:20 +0000 (22:50 +0000)]
Move net/route/shared.h definitions to net/route/route_var.h.
No functional changes.
net/route/shared.h was created in the inital phases of nexthop conversion.
It was intended to serve the same purpose as route_var.h - share definitions
of functions and structures between the routing subsystem components. At
that time route_var.h was included by many files external to the routing
subsystem, which largerly defeats its purpose.
As currently this is not the case anymore and amount of route_var.h includes
is roughly the same as shared.h, retire the latter in favour of the former.
melifaro [Fri, 28 Aug 2020 21:59:10 +0000 (21:59 +0000)]
Further split nhop creation and rtable operations.
As nexthops are immutable, some operations such as route attribute changes
require nexthop fetching, forking, modification and route switching.
These operations are not atomic, so they may need to be retried multiple
times in presence of multiple speakers changing the same route.
This change introduces "synchronisation" primitive: route_update_conditional(),
simplifying logic for route changes and upcoming multipath operations.
vmaffione [Fri, 28 Aug 2020 20:03:54 +0000 (20:03 +0000)]
lib: add libnetmap
This changeset introduces the new libnetmap library for writing
netmap applications.
Before libnetmap, applications could either use the kernel API
directly (e.g. NIOCREGIF/NIOCCTRL) or the simple header-only-library
netmap_user.h (e.g. nm_open(), nm_close(), nm_mmap() etc.)
The new library offers more functionalities than netmap_user.h:
- Support for complex netmap options, such as external memory
allocators or per-buffer offsets. This opens the way to future
extensions.
- More flexibility in the netmap port bind options, such as
non-numeric names for pipes, or the ability to specify the netmap
allocator that must be used for a given port.
- Automatic tracking of the netmap memory regions in use across the
open ports.
At the moment there is no man page, but the libnetmap.h header file
has in-depth documentation.