Otherwise the same checks are duplicated across four different system
call implementations, cpuset_(get|set)(affinity|domain)(). No
functional change intended.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
- Get rid of the ifl_vm_addrs array. It is not used by any existing
consumer, so we are just dirtying a couple of cache lines for no
reason.
- Use uma_zalloc(fl->ifl_zone) instead of m_cljget(). Otherwise
m_cljget() is doing unnecessary work to look up the correct zone, when
iflib already knows what that zone is.
- ifl_gen is only used when INVARIANTS is on, so make that more clear.
- Fix some style nits and inconsistencies.
Reviewed by: gallatin
Tested by: pho
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25490
iflib: Fix handling of mbuf cluster allocation failures.
When refilling an rx freelist, make sure we only update the hardware
producer index if at least one cluster was allocated. Otherwise the
NIC is programmed to write a previously used cluster, typically
resulting in a use-after-free when packet data is written by the
hardware.
Also make sure that we don't update the fragment index cursor if the
last allocation attempt didn't succeed. For at least Intel drivers,
iflib assumes that the consumer index and fragment index cursor stay in
lockstep, but this assumption was violated in the face of cluster
allocation failures.
Reported and tested by: pho
Reviewed by: gallatin, hselasky
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25489
In the last IFUNC related changes to rtld, the code that handled non-PLT
GNU IFUNC relocations ended up getting lost. This could leave some
relocations unhandled, causing crashes or misbehavior. This change restores
the handling of these relocations, but now together with the other IFUNC
relocations, allowing resolvers to reference external symbols.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D25550
andrew [Mon, 6 Jul 2020 08:51:55 +0000 (08:51 +0000)]
Add a driver for bcm2838 PCI express controller
This adds support for the Broadcom bcm2711 PCI express controller, found
on the Raspberry Pi 4 (aka the bcm2838 SoC). The driver has only been
developed against the soldered-on VIA XHCI controller and not tested
with other end points.
Submitted by: Robert Crowston <crowston_protonmail.com>
Differential Revision: https://reviews.freebsd.org/D25068
Infiniband clients must be attached and detached in a specific order in ibcore.
Currently the linking order of the infiniband, IB, modules decide in which
order the clients are attached and detached. For example one IB client may
use resources from another IB client. This can lead to a potential deadlock
at shutdown. For example if the ipoib is unregistered after the ib_multicast
client is detached, then if ipoib is using multicast addresses a deadlock may
happen, because ib_multicast will wait for all its resources to be freed before
returning from the remove method.
Fix this by using module_xxx_order() instead of module_xxx().
Also, add a new function nfsm_add_ext_pgs() which will either add a page
or add a new ext_pgs mbuf with a page to the mbuf list. Used by nfsm_strtom().
This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.
Since ND_EXTPG is never set yet, there is no semantic change at this time.
andrew [Sun, 5 Jul 2020 14:38:22 +0000 (14:38 +0000)]
Rerun kernel ifunc resolvers after all CPUs have started
On architectures that use RELA relocations it is safe to rerun the ifunc
resolvers on after all CPUs have started, but while they are sill parked.
On arm64 with big.LITTLE this is needed as some SoCs have shipped with
different ID register values the big and little clusters meaning we were
unable to rely on the register values from the boot CPU.
Add support for rerunning the resolvers on arm64 and amd64 as these are
both RELA using architectures.
Reviewed by: kib
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D25455
Gather writes to larger chunks (MAXPHYS) instead of issuing them in
sectors.
On my SanDisk Cruzer Blade 16GB USB stick this made formatting much faster:
x before
+ after
+--------------------------------------------------------------------------+
|+ |
|+ x |
|+ x x|
|A MA||
+--------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 3 15.89 16.38 16 16.09 0.2570992
+ 3 0.32 0.37 0.35 0.34666667 0.025166115
Difference at 95.0% confidence
-15.7433 +/- 0.414029
-97.8455% +/- 0.25668%
(Student's t, pooled s = 0.182665)
Make linprocfs(5) create /proc/bus/pci/devices/, and linsysfs(5)
create /sys/class/power_supply/. This silences some warnings
from biology/linux-foldingathome.
Reported by: 0mp
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25557
Add support for ext_pgs mbufs to nfscl_reqstart() and nfsm_set().
This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.
Since ND_EXTPG is never set yet, there is no semantic change at this time.
cem [Fri, 3 Jul 2020 14:54:46 +0000 (14:54 +0000)]
Add domain policy allocation for amd64 fpu_kern_ctx
Like other types of allocation, fpu_kern_ctx are frequently allocated per-cpu.
Provide the API and sketch some example consumers.
fpu_kern_alloc_ctx_domain() preferentially allocates memory from the
provided domain, and falls back to other domains if that one is empty
(DOMAINSET_PREF(domain) policy).
Maybe it makes more sense to just shove one of these in the DPCPU area
sooner or later -- left for future work.
They trigger for some people, the bug is not obvious, there are no takers
for fixing it, the issue already had to be there for years beforehand and
is low priority.
cxgbe(4): changes in the Tx path to help increase tx coalescing.
- Ask the firmware for the number of frames that can be stuffed in one
work request.
- Modify mp_ring to increase the likelihood of tx coalescing when there
are just one or two threads that are doing most of the tx. Add teeth
to the abdication mechanism by pushing the consumer lock into mp_ring.
This reduces the likelihood that a consumer will get stuck with all
the work even though it is above its budget.
- Add support for coalesced tx WR to the VF driver. This, with the
changes above, results in a 7x improvement in the tx pps of the VF
driver for some common cases. The firmware vets the L2 headers
submitted by the VF driver and it's a big win if the checks are
performed for a batch of packets and not each one individually.
This is the first of a series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.
Since ND_EXTPG is never set yet, there is no semantic change at this time.
Must acquire the z_teardown_lock before accessing the zfsvfs_t object. I
can't reproduce this panic on demand, but this looks like the correct
solution.
The top 10 bits of a pte are reserved by specification[1] and are not part of
the PPN.
[1] 'Volume II: RISC-V Privileged Architectures V20190608-Priv-MSU-Ratified',
'4.4.1 Addressing and Memory Protection', page 72: "The PTE format for Sv39 is
shown in Figure 4.18. ... Bits 63–54 are reserved for future use and must be
zeroed by software for forward compatibility."
andrew [Wed, 1 Jul 2020 16:17:51 +0000 (16:17 +0000)]
Move ID reading signatures to a better header
The functions to read the common user and kernel ID registers should be
in cpu.h rather than undefined.h as they are related to CPU details and
used by undefined instruction handlers.
Update with the members of the 11th core team, core.xi
- Update the core-secretary role.
- Update the comment to mention that the sorting is done based on FreeBSD
login name
andrew [Wed, 1 Jul 2020 15:17:45 +0000 (15:17 +0000)]
Read the arm64 ID registers earlier in the boot process.
Also move parsing the registers to just after the secondary CPUs have
started. This means the kernel register view from all CPUs is available
after the CPU SYSINITs have finished, e.g. for use by ifunc resolvers.
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D25505
andrew [Wed, 1 Jul 2020 12:07:28 +0000 (12:07 +0000)]
Simplify the flow when getting/setting an isrc
Rather than unlocking and returning we can just perform the needed action
only when the interrupt source is valid and reuse the unlock in both the
valid irq and invalid irq cases.
Rework linux accept(2). This makes the code flow easier to follow,
and fixes a bug where calling accept(2) could result in closing fd 0.
Note that the code still contains a number of problems: it makes
assumptions about l_sockaddr_in being the same as sockaddr_in,
the EFAULT-related code looks like it doesn't work at all, and the
socket type check is racy. Those will be addressed later on;
I'm trying to work in small steps to avoid breaking one thing while
fixing another.
It fixes Redis, among other things.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25461
The "pid" field in the LinuxKPI task struct is typically set to the thread ID
and not the process ID. Make sure the linux_task_exiting() function uses tdfind()
to lookup the BSD procedure structure pointer by the "pid" field, and only
fallback to pfind() when no match is found! This makes linux_task_exiting()
in line with the rest of the code.
The new function operates similarly to ifconfig_lagg_get_lagg_status and
likewise is accompanied by a function to free the bridge status data structure.
I have included in this patch the relocation of some strings describing STP
parameters and the PV2ID macro from ifconfig into net/if_bridgevar.h as they
are useful for consumers of libifconfig.
cem [Wed, 1 Jul 2020 02:16:36 +0000 (02:16 +0000)]
geom(4): Kill GEOM_PART_EBR_COMPAT option
Take advantage of Warner's nice new real GEOM aliasing system and use it for
aliased partition names that actually work.
Our canonical EBR partition name is the weird, not-default-on-x86-prior-to-
this-revision "da1p4+00001234." However, if compatibility mode (tunable
kern.geom.part.ebr.compat_aliases) is enabled (1, default), we continue to
provide the alias names like "da1p5" in addition to the weird canonical
names.
Naming partition providers was just one aspect of the COMPAT knob; in
addition it limited mutability, in part because it did not preserve existing
EBR header content aside from that of LBA 0. This change saves the EBR
header for LBA 0, as well as for every EBR partition encountered. That way,
when we write out the EBR partition table on modification, we can restore
any bootloader or other metadata in both LBA0 (the first data-containing EBR
may start after 0) as well as every logical EBR we read from the disk, and
only update the geometry metadata and linked list pointers that describe the
actual partitioning.
(This change does not add support for the 'bootcode' verb to EBR.)
PR: 232463
Reported by: Manish Jain <bourne.identity AT hotmail.com>
Discussed with: ae (no objection)
Relnotes: maybe
Differential Revision: https://reviews.freebsd.org/D24939
cem [Wed, 1 Jul 2020 00:59:28 +0000 (00:59 +0000)]
Replace OPENSSL_NO_SSL3_METHODs with dummies
SSLv3 has been deprecated since 2015 (and broken since 2014: "POODLE"); it
should not have shipped in FreeBSD 11 (2016) or 12 (2018). No one should use
it, and if they must, they can use some implementation outside of base.
There are three symbols removed with OPENSSL_NO_SSL3_METHOD:
These symbols exist to request an explicit SSLv3 connection to a server.
There is no good reason for an application to link or invoke these symbols
instead of TLS_method(), et al (née SSLv23_method, et al). Applications
that do so have broken cryptography.
Define these symbols for some pedantic definition of ABI stability, but
remove the functionality again (r361392) after r362620.
- Add CCM driver and clocks implementations for i.MX 8M
- Add GPC driver for iMX8
- Add clock tree for i.MX 8M Quad
- Add clocks support and new compat strings (where required) for existing i.MX 6 UART, I2C, and GPIO drivers
- Enable aarch64-compatible drivers form i.MX 6 in arm64 GENERIC kernel config
- Add dtb/imx8 kernel module with DTBs for Nitrogen8M and iMX8MQ EVK
With this patch both Nitrogen8M and iMX8MQ EVK boot with NFS root up to multiuser login prompt
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D25274
adrian [Wed, 1 Jul 2020 00:23:49 +0000 (00:23 +0000)]
[net80211] Migrate HT/legacy protection mode and preamble calculation to per-VAP flags
The later firmware devices (including iwn!) support multiple configuration
contexts for a lot of things, leaving it up to the firmware to decide
which channel and vap is active. This allows for things like off-channel
p2p sta/ap operation and other weird things.
However, net80211 is still focused on a "net80211 drives all" when it comes to driving
the NIC, and as part of this history a lot of these options are global and not per-VAP.
This is fine when net80211 drives things and all VAPs share a single channel - these
parameters importantly really reflect the state of the channel! - but it will increasingly
be not fine when we start supporting more weird configurations and more recent NICs.
Yeah, recent like iwn/iwm.
Anyway - so, migrate all of the HT protection, legacy protection and preamble
stuff to be per-VAP. The global flags are still there; they're now calculated
in a deferred taskqueue that mirrors the old behaviour. Firmware based drivers
which have per-VAP configuration of these parameters can now just listen to the
per-VAP options.
What do I mean by per-channel? Well, the above configuration parameters really
are about interoperation with other devices on the same channel. Eg, HT protection
mode will flip to legacy/mixed if it hears ANY BSS that supports non-HT stations or
indicates it has non-HT stations associated. So, these flags really should be
per-channel rather than per-VAP, and then for things like "do i need short preamble
or long preamble?" turn into a "do I need it for this current operating channel".
Then any VAP using it can query the channel that it's on, reflecting the real
required state.
This patch does none of the above paragraph just yet.
I'm also cheating a bit - I'm currently not using separate taskqueues for
the beacon updates and the per-VAP configuration updates. I can always further
split it later if I need to but I didn't think it was SUPER important here.
So:
* Create vap taskqueue entries for ERP/protection, HT protection and short/long
preamble;
* Migrate the HT station count, short/long slot station count, etc - into per-VAP
variables rather than global;
* Fix a bug with my WME work from a while ago which made it per-VAP - do the WME
beacon update /after/ the WME update taskqueue runs, not before;
* Any time the HT protmode configuration changes or the ERP protection mode
config changes - schedule the task, which will call the driver without the
net80211 lock held and all correctly serialised;
* Use the global flags for beacon IEs and VAP flags for probe responses and
other IE situations.
The primary consumer of this is ath10k. iwn could use it when sending RXON,
but we don't support IBSS or AP modes on it yet, and I'm not yet sure whether
it's required in STA mode (ie whether the firmware parses beacons to change
protection mode or whether we need to.)
Tested:
* AR9280, STA/AP
* AR9380, DWDS STA+STA/AP
* ath10k work, STA/AP
* Intel 6235, STA
* Various rtwn / run NICs, DWDS STA and STA configurations
tsoome [Tue, 30 Jun 2020 21:48:58 +0000 (21:48 +0000)]
boot1.efi: use malloc family from libsa
The zfs reader development did reach to the point where linking boot1,
we will get errors about duplicate symbols Malloc, Free, Calloc.
We can just use libsa version, just as loader.efi does. The only concern is,
libsa zalloc is using fixed size heap region, I did pick 64MB as other
stage instances are using, but this size is likely not optimal. In any case,
with limited memory setups, we should boot loader.efi directly.
trasz [Tue, 30 Jun 2020 16:24:28 +0000 (16:24 +0000)]
Make linprocfs(5) create the /proc/<PID>/task/ directores.
This is to silence down some Chromium assertions.
PR: kern/240991
Analyzed by: Alex S <iwtcex@gmail.com>
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25256
andrew [Tue, 30 Jun 2020 15:58:29 +0000 (15:58 +0000)]
Add dwc_otg_acpi
Create an acpi attachment for the DWC USB OTG device. This is present in
the Raspberry Pi 4 in the USB-C port normally used to power the board. Some
firmware presents the kernel with ACPI tables rather than FDT so we need
an ACPI attachment.
Submitted by: Greg V <greg_unrelenting.technology>
Approved by: hselasky (removal of All rights reserved)
Differential Revision: https://reviews.freebsd.org/D25203
markj [Tue, 30 Jun 2020 15:56:54 +0000 (15:56 +0000)]
Remove CRYPTO_TIMING.
It was added a very long time ago. It is single-threaded, so only
really useful for basic measurements, and in the meantime we've gotten
some more sophisticated profiling tools.
asomers [Mon, 29 Jun 2020 22:12:23 +0000 (22:12 +0000)]
savecore: accept device names without the /dev/ prefix
dumpon has accepted device names without the prefix ever since r291207.
Since dumpon and savecore are always paired, they ought to accept the same
arguments. Prior to this change, specifying 'dumpdev="da3"' in
/etc/rc.conf, for example, would result in dumpon working just fine but
savecore complaining that "Dump device does not exist".
mhorne [Mon, 29 Jun 2020 19:30:35 +0000 (19:30 +0000)]
Fix printf(3) output of long doubles on RISC-V
When the RISC-V port was initially committed to FreeBSD, GCC would
generate 64-bit long doubles, and the definitions in _fpmath.h reflected
that. This was changed to 128-bit in GCC later that year [1], but the
definitions were never updated, despite the documented workaround. This
causes printf(3) and friends to interpret only the low 64-bits of a long
double in ldtoa, thereby printing incorrect values.
Update the definitions now that both clang and GCC generate 128-bit long
doubles.
kevans [Mon, 29 Jun 2020 17:47:00 +0000 (17:47 +0000)]
linux: reposition the comment for bsd_to_linux_bits/linux_to_bsd_bits
rpokala notes that splitting the definitions like this is kind of silly,
since the comment applies to both. Move the comment up (or the definition
down, depending on your perspective on life) accordingly.
jhb [Mon, 29 Jun 2020 17:19:08 +0000 (17:19 +0000)]
Stop using STATIC_CFLAGS.
This was added in r293648 to pass -mlong-calls for crt1.o and gcrt1.o.
The use of -mlong-calls was removed in r358851 for LLVM 10.0, leaving
STATIC_CFLAGS empty.
cem [Mon, 29 Jun 2020 16:54:00 +0000 (16:54 +0000)]
vm: Add missing WITNESS warnings for M_WAITOK allocation
vm_map_clip_{end,start} and lookup_clip_start allocate memory M_WAITOK
for !system_map vm_maps. Add WITNESS warning annotation for !system_map
callers who may be holding non-sleepable locks.
fernape [Mon, 29 Jun 2020 15:15:14 +0000 (15:15 +0000)]
hexdump(1): Add EXAMPLES section
* Add examples showing the use of -f, -C, -s, -n
* Rework the two already present examples that were *format* examples
* Remove .Tn suggested by mandoc(1)
* Remove reference to gdb(1) since it is not present in current
emaste [Mon, 29 Jun 2020 13:30:48 +0000 (13:30 +0000)]
Revert r362261, "Re-apply r333944 to unbreak ports"
A file update in 2018 broke many ports as it misidentified shared
libraries as PIE binaries. r333944 reverted part of the change,
restoring ports builds but misidentifying objects in the opposite
direction.
Earlier this month file 5.39 was imported, and then the change
originally from r333944 was recommitted as r362261. However, the
issue was fixed upstream, so r362261 serves no purpose.
PR: 246960, 247461 [exp-run]
Sponsored by: The FreeBSD Foundation