Emmanuel Vadot [Fri, 21 May 2021 11:33:34 +0000 (13:33 +0200)]
sdio: Always use increment address for read/write_4
SDIO CMD53 (RW Extented) can either write to the same address (useful for FIFO)
or auto increment the destination address (to write to multiple registers).
It is more logical to have read/write_4 to use incremental mode and make other
helper function for writing to a FIFO destination especially since most FIFO
write/read will be 8bits based and not 32bits based.
Emmanuel Vadot [Fri, 21 May 2021 09:56:39 +0000 (11:56 +0200)]
sdio: Change the sdio helper name and arg order
Do not use b/l but _1/_4 also address comes first and then data.
This makes them closer to something like bus_space_{read,write}
We have no users in the tree.
devd: move all devd notification logic to a separate file.
Currently, subr_bus.c shares logic for (a) maintaining all HW devices
(e.g. discovery/attach/detach logic) and (b) generic devctl notification
layer for devices/PMU/GEOM/interfaces/etc).
These two subsystems share really tiny interaction interface, composed of 3
notification functions. With that in mind, move devctl layer to a
separate file, establishing a clear notification interface between the
sub.c bus layer and the provider (devctl).
The primary driver of this change is netlink implementation (D36002).
The idea is to propagate device-level events to netlink as well, so all
netlink customers can subscribe to these changes.
The long-term goal is to deprecate devctl and to use netlink as the
kernel<> userland transport provided netlink gets enough traction.
routing: move rtentry and subscription code out of route_ctl.c
route_ctl.c size has grown considerably since initial introduction.
Factor out non-relevant parts:
* all rtentry logic, such as creation/destruction and accessors
goes to net/route/route_rtentry.c
* all rtable subscription logic goes to net/route/route_subscription.c
routing: add rib_<add|del>_route_px() functions operating with nexthops.
This change adds public KPI to work with routes using pre-created
nexthops, instead of using data from addrinfo structures. These
functions will be later used for adding/deleting kernel-originated
routes and upcoming netlink protocol.
As a part of providing this KPI, low-level route addition code has been
reworked to provide more control over route creation or change.
Specifically, a number of operation flags
(RTM_F_<CREATE|EXCL|REPLACE|APPEND>) have been added, defining the
desired behaviour the the route already exists (or not exists). This
change required some changes in the multipath addition code, resulting
in moving this code to route_ctl.c, rendering mpath_ctl.c empty.
routing: split nexthop creation and rtentry creation.
This change is required for the upcoming introduction of the next
nexhop-based operations KPI, as it will create rtentry and nexthops
at different stages of route table modification.
* Use same filter func (rib_filter_f_t) for nexhtop groups to
simplify callbacks.
* simplify conditional route deletion & remove the need to pass
rt_addrinfo to the low-level deletion functions
* speedup rib_walk_del() by removing an additional per-prefix lookup
This and the follow-up routing-related changes target to remove or
reduce `struct rt_addrinfo` usage and use recently-landed nhop(9)
KPI instead.
Traditionally `rt_addrinfo` structure has been used to propagate all necessary
information between the protocol/rtsock and a routing layer. Many
functions inside routing subsystem uses it internally. However, using
this structure became somewhat complicated, as there are too many ways
of specifying a single state and verifying data consistency is hard.
For example, arerouting flgs consistent with mask/gateway sockaddr pointers?
Is mask really a host mask? Are sockaddr "valid" (e.g. properly zeroed, masked,
have proper length)? Are they mutable? Is the suggested interface specified
by the interface index embedded into the sockadd_dl gateway, or passed
as RTAX_IFP parameter, or directly provided by rti_ifp or it needs to
be derived from the ifa?
These (and other similar) questions have to be considered every time when
a function has `rt_addrinfo` pointer as an argument.
The new approach is to bring more control back to the protocols and
construct the desired routing objects themselves - in the end, it's the
protocol/subsystem who knows the desired outcome.
This specific diff changes the following:
* add explicit basic low-level radix operations:
add_route() (renamed from add_route_nhop())
delete_route() (factored from change_route_nhop())
change_route() (renamed from change_route_nhop)
* remove "info" parameter from change_route_conditional() as a part
of reducing rt_addrinfo usage in the internal KPIs
* add lookup_prefix_rt() wrapper for doing re-lookups after
RIB lock/unlock
Gleb Smirnoff [Wed, 10 Aug 2022 18:09:34 +0000 (11:09 -0700)]
tcp: utilize new solisten_clone() and solisten_enqueue()
This streamlines cloning of a socket from a listener. Now we do not
drop the inpcb lock during creation of a new socket, do not do useless
state transitions, and put a fully initialized socket+inpcb+tcpcb into
the listen queue.
Before this change, first we would allocate the socket and inpcb+tcpcb via
tcp_usr_attach() as TCPS_CLOSED, link them into global list of pcbs, unlock
pcb and put this onto incomplete queue (see 6f3caa6d815). Then, after
sonewconn() we would lock it again, transition into TCPS_SYN_RECEIVED,
insert into inpcb hash, finalize initialization of tcpcb. And then, in
call into tcp_do_segment() and upon transition to TCPS_ESTABLISHED call
soisconnected(). This call would lock the listening socket once again
with a LOR protection sequence and then we would relocate the socket onto
the complete queue and only now it is ready for accept(2).
Gleb Smirnoff [Wed, 10 Aug 2022 18:09:34 +0000 (11:09 -0700)]
sockets: provide solisten_clone(), solisten_enqueue()
as alternative KPI to sonewconn(). The latter has three stages:
- check the listening socket queue limits
- allocate a new socket
- call into protocol attach method
- link the new socket into the listen queue of the listening socket
The attach method, originally designed for a creation of socket by the
socket(2) syscall has slightly different semantics than attach of a socket
cloned by listener. Make it possible for protocols to call into the
first stage, then perform a different attach, and then call into the
final stage. The first stage, that checks limits and clones a socket
is called solisten_clone(), and the function that enqueues the socket
is solisten_enqueue().
Changing mode on a pin (input/output/pullup/pulldown) is a bit slow.
Improve this by caching what we can.
We need to check if the pin is in gpio mode, do that the first time
that we have a request for this pin and cache the result. We can't do
that at attach as we are a child of rk_pinctrl and it didn't finished
its attach then.
Cache also the flags specific to the pinctrl (pullup or pulldown) if the
pin is in input mode.
Cache the registers that deals with input/output mode and output value. Also
remove some register reads when we change the direction of a pin or when we
change the output value since the bit changed in the registers only affect output
pins.
Gleb Smirnoff [Wed, 10 Aug 2022 14:32:37 +0000 (07:32 -0700)]
tcp: address a wire level race with 2 ACKs at the end of TCP handshake
Imagine we are in SYN-RCVD state and two ACKs arrive at the same time,
both valid, e.g. coming from the same host and with valid sequence.
First packet would locate the listening socket in the inpcb database,
write-lock it and start expanding the syncache entry into a socket.
Meanwhile second packet would wait on the write lock of the listening
socket. First packet will create a new ESTABLISHED socket, free the
syncache entry and unlock the listening socket. Second packet would
call into syncache_expand(), but this time it will fail as there
is no syncache entry. Second packet would generate RST, effectively
resetting the remote connection.
It seems to me, that it is impossible to solve this problem with
just rearranging locks, as the race happens at a wire level.
To solve the problem, for an ACK packet arrived on a listening socket,
that failed syncache lookup, perform a second non-wildcard lookup right
away. That lookup may find the new born socket. Otherwise, we indeed
send RST.
netinet6: allow ND entries creation for all directly-reachable
destinations.
The current assumption is that kernel-handled rtadv prefixes along with
the interface address prefixes are the only prefixes considered in
the ND neighbor eligibility code.
Change this by allowing any non-gatewaye routes to be eligible. This
will allow DHCPv6-controlled routes to be correctly handled by
the ND code.
Refactor nd6_is_new_addr_neighbor() to enable more deterministic
performance in "found" case and remove non-needed
V_rt_add_addr_allfibs handling logic.
Michael Tuexen [Mon, 8 Aug 2022 11:07:10 +0000 (13:07 +0200)]
tcp: improve BBLog for output events when using the FreeBSD stack
Put the return value of ip_output()/ip6_output in the output event
instead of adding another one in case of an error. This improves
consistency with other similar places.
Reviewed by: rscheff
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D36085
Xin LI [Wed, 10 Aug 2022 00:27:54 +0000 (17:27 -0700)]
arc4random(3): Reduce diff with OpenBSD.
The main change was v1.57 by djm@:
Randomise the rekey interval a little. Previously, the chacha20
instance would be rekeyed every 1.6MB. This makes it happen at a
random point somewhere in the 1-2MB range.
Mark Johnston [Tue, 9 Aug 2022 20:08:13 +0000 (16:08 -0400)]
dtrace/amd64: Implement emulation of call instructions
Here, the provider is responsible for updating the trapframe to redirect
control flow and for computing the return address. Once software-saved
registers are restored, the emulation shifts the remaining context down
on the stack to make space for the return address, then copies the
address provided by the invop handler. dtrace_invop() is modified to
allocate temporary storage space on the stack for use by the provider to
return the return address.
This is to support a new provider for amd64 which can instrument
arbitrary instructions, not just function entry and exit instructions as
FBT does.
In collaboration with: christos
Sponsored by: Google, Inc. (GSoC 2022)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Mark Johnston [Tue, 9 Aug 2022 20:08:09 +0000 (16:08 -0400)]
fbt/x86: Extract arg1 for return probes from the trapframe
dtrace invop handlers have access to the whole trapframe, just use that
to extract %rax/%eax for return probes instead of relying on an
additional parameter to the handler. No functional change intended.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Jessica Clarke [Tue, 9 Aug 2022 21:57:47 +0000 (22:57 +0100)]
etcupdate: Add a -N flag to perform a NO_ROOT build
This is in preparation for including an etcupdate tree when performing a
-DNO_ROOT release image build. Although -DNO_ROOT can be passed via -M,
to be useful we need to mangle the resulting METALOG to mirror the
various cleanups to the tree that are done after the build (removing
generated files, empty files and empty directories), so etcupdate needs
its own flag.
Jessica Clarke [Tue, 9 Aug 2022 21:57:22 +0000 (22:57 +0100)]
etcupdate: Prefer POSIX -depth to BSD -d
This is in preparation for building an etcupdate tree on non-FreeBSD
when building release images. The -d option is documented as a
BSD-specific equivalent to the POSIX -depth primary. Whilst GNU find
sort of accepts it in an attempt to be compatible, it still doesn't
permit it coming before the paths, unlike BSD find, and prints a
deprecation warning either way. Thus, use the equivalent POSIX -depth to
ensure it works correctly and without warning everywhere.
Jessica Clarke [Tue, 9 Aug 2022 21:57:01 +0000 (22:57 +0100)]
release: Forward ${MAKE} to etcupdate via the new -m flag
This is in preparation for non-FreeBSD builds where make is GNU make and
so etcupdate needs to know the name of or path to the bmake binary to
use for its own builds.
Jessica Clarke [Tue, 9 Aug 2022 21:56:19 +0000 (22:56 +0100)]
etcupdate: Add a -m flag to change the make binary that's run
This will allow release/Makefile to forward on ${MAKE} to allow building
on non-FreeBSD systems where ${MAKE} is something other than make, as
make is typically GNU make in such situations.
Jessica Clarke [Tue, 9 Aug 2022 21:52:47 +0000 (22:52 +0100)]
release: Use in-tree etcupdate for build
This is in preparation for non-FreeBSD and -DNO_ROOT builds. On
non-FreeBSD there is no host etcupdate to use, and -DNO_ROOT will
require additional flags that may not be supported by the host's
etcupdate when building on FreeBSD. Moreover, there's no guarantee
anyway that the host's etcupdate is quite right for the current tree;
upgrading from source only requires that the host's is good enough for
-p which just manually copies master.passwd and group, the rest of the
upgrade is done post-installworld. For example, should a new set of
autogenerated files be added that etcupdate is taught about, the host
won't know about them and so the bootstrapped current tree will
incorrectly contain them, leading to spurious diffs on the installed
system.
Mark Johnston [Mon, 25 Jul 2022 20:53:21 +0000 (16:53 -0400)]
vm_fault: Shoot down shared mappings in vm_fault_copy_entry()
As in vm_fault_cow(), it's possible, albeit rare, for multiple vm_maps
to share a shadow object. When copying a page from a backing object
into the shadow, all mappings of the source page must therefore be
removed. Otherwise, future operations on the object tree may detect
that the source page is fully shadowed and thus can be freed.
Approved by: so
Security: FreeBSD-SA-22:11.vm
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35635
elf_note_prpsinfo: handle more failures from proc_getargv()
Resulting sbuf_len() from proc_getargv() might return 0 if user mangled
ps_strings enough. Also, sbuf_len() API contract is to return -1 if the
buffer overflowed. The later should not occur because get_ps_strings()
checks for catenated length, but check for this subtle detail explicitly
as well to be more resilent.
The end result is that p_comm is used in this situations.
Approved by: so
Security: FreeBSD-SA-22:09.elf
Reported by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Reviewed by: delphij, markj
admbugs: 988
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35391
This driver supports the auto negotiation mode between the copper and fiber
ports.
This PHY has two independent PHYs (one for copper and other for fiber) but in
this case the functionality is presented as a single PHY for easy management.
Mike Karels [Tue, 9 Aug 2022 12:08:09 +0000 (07:08 -0500)]
netinet tests: Add test for IPv6 mapped-v4 bind problem
Test fix in 637f317c6d9c, verifying that when ports run out, we get
an EADDRNOTAVAIL error from bind() rather than an EADDRINUSE error
from connect(). Use small port range to exhaust ports and see which
error happens.
Eugene Grosbein [Mon, 8 Aug 2022 22:21:02 +0000 (05:21 +0700)]
syslog(3): unbreak log generation using fabricated PID
Recover application ability to supply fabricated PID
embedded into ident that was lost when libc switched
to generation of RFC 5424 log messages, for example:
logger -t "ident[$$]" -p user.notice "test"
It is essential for long running scripts.
Also, this change unbreaks matching resulted entries
by ident in syslog.conf:
!ident
*.* /var/log/ident.log
Without the fix, the log (and matching) was broken:
Aug 1 07:36:58 hostname ident[123][86483]: test
Now it works as expected and worked before breakage:
Emmanuel Vadot [Mon, 8 Aug 2022 18:21:08 +0000 (20:21 +0200)]
linuxkpi: io.h: Only exclude armv6 and armv7 for asm/set_memory.h
Other arches like powerpc* needs it.
Fixes: d387a1b4b1996 ("linuxkpi: io.h: Do not include asm/set_memory.h for armv6 and armv7") Fixes: 789dbdbb48574 ("linuxkpi: Add arch_io_{reserve,free}_memtype_wc")
Sponsored by: Beckhoff Automation GmbH & Co. KG
John Baldwin [Mon, 8 Aug 2022 18:21:54 +0000 (11:21 -0700)]
cxgbe: Handle requests for TLS key allocations with no TLS key storage.
If an adapter advertises support for TLS keys but an empty TLS key
storage area in on-board memory, fail the request rather than invoking
vmem_alloc on an uninitialized vmem.
Per the EHABI32 specification, __cxa_end_cleanup must take care to
preserve registers before calling _Unwind_Resume(). So, libcxxrt uses
an assembly stub which preserves caller-saved registers around the call
to __cxa_get_cleanup(). But:
- it failed to restore them properly,
- it did not preserve the link register.
Fix both of these problems. This is needed to fix exception unwinding
on FreeBSD with LLVM 14. Note that r4 is callee-saved but is pushed
onto the stack to preserve stack pointer alignment.
Sponsored-by: The FreeBSD Foundation
MFC after: 1 week
thread_create(): call cpu_copy_thread() after td_pflags is zeroed
By calling the function too early we might still have the td_pflags
value cached from the previous struct thread use. cpu_copy_thread()
depends on correct value for TDP_KTHREAD at least on x86.
Reported, bisected, and tested by: pho
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D36069
In collaboration with: dougm
Reviewed by: alc
Sponsored by: The FreeBSD Foundation (kib)
MFC after: 3 days
Differential revision: https://reviews.freebsd.org/D36001
Gordon Bergling [Sun, 7 Aug 2022 14:30:24 +0000 (16:30 +0200)]
libutil: Fix mandoc warnings
- missing comma before name
- possible typo in section name: Sh CAVEAT instead of CAVEATS
- useless macro: Tn
- blank line in fill mode, using .sp
- no blank before trailing delimiter: Dv NULL?