Kyle Evans [Mon, 26 Aug 2019 17:34:07 +0000 (17:34 +0000)]
MFC r351119, r351135-r351136, r351412: stand xtoolchain-llvm90 fixes
r351119:
stand: push LIBC_SRC up into defs.mk
Other parts of stand/ that don't use libsa will need to grab bits from libc
shortly. Push LIBC_SRC up to defs.mk in advance of this so that they can use
it, and rename it to LIBCSRC to match the convention of the rest of the *SRC
variables in this file.
r351135:
stand: boot2: fix build with xtoolchain-llvm90
ufsread.c grows a dependency on __ashldi3 with llvm90. Grab ashldi3.c out of
compiler-rt rather than trying to link against libsa (for now).
-Wno-missing-prototypes is necessary to compile ashldi3.c standalone.
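For context, a rough sketch of what a helper like __ashldi3 has to do on a target without native 64-bit shifts: build the result from two 32-bit halves. This illustrates the operation only and is not the compiler-rt source; the name shl64 is hypothetical and the shift count is assumed to be below 64.

```c
#include <stdint.h>

/* Hypothetical illustration: 64-bit left shift built from 32-bit halves. */
static uint64_t
shl64(uint64_t a, unsigned b)
{
	uint32_t lo = (uint32_t)a;
	uint32_t hi = (uint32_t)(a >> 32);

	if (b == 0)
		return (a);
	if (b >= 32) {
		/* Everything that survives comes from the low half. */
		hi = lo << (b - 32);
		lo = 0;
	} else {
		/* Bits shifted out of the low half carry into the high half. */
		hi = (hi << b) | (lo >> (32 - b));
		lo <<= b;
	}
	return (((uint64_t)hi << 32) | lo);
}
```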
r351136:
stand: gptboot: fix build with xtoolchain-llvm90
ufsread.c grows a dependency on __ashldi3 with llvm90. For gptboot, just
start pulling in ashldi3.c ashrdi3.c lshrdi3.c into libsa for all archs as
the number of archs requiring one or more of them keeps growing. qdivrem.c
and quad.h can be trivially kicked out of libsa if we start pulling these
from compiler-rt as qdivrem was only used to implement umoddi3, divdi3,
moddi3 (also in qdivrem.c).
Cy Schubert [Sun, 25 Aug 2019 13:36:20 +0000 (13:36 +0000)]
MFC r350881:
Calculate the number of interface array elements using the new FR_NUM macro
instead of the hard-coded value of 4. This is a precursor to increasing
the number of interfaces specified in "on {interface, ..., interface}".
Note that though this feature is coded in ipf_y.y, it is only partially
supported in the ipfilter kld, meaning it does not work yet (and is yet
to be documented in ipf.5 too).
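A minimal sketch of the element-counting idiom this refers to. The macro body shown here and the example_rule struct are assumptions for illustration and are not copied from the ipfilter headers.

```c
#include <stddef.h>

/* Assumed definition: number of elements in a fixed-size array. */
#define FR_NUM(_a) (sizeof(_a) / sizeof((_a)[0]))

/* Hypothetical stand-in for a rule; only the array matters here. */
struct example_rule {
	int ifnames[4];		/* -1 marks an unused slot */
};

/* Count used slots without hard-coding the array size. */
static size_t
count_used_slots(const struct example_rule *r)
{
	size_t i, used = 0;

	for (i = 0; i < FR_NUM(r->ifnames); i++)
		if (r->ifnames[i] >= 0)
			used++;
	return (used);
}
```

With this idiom, growing the array later only requires changing its declaration; loops over it follow automatically.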
Cy Schubert [Sun, 25 Aug 2019 04:56:33 +0000 (04:56 +0000)]
MFC r350880:
r272552 applied the patch from ipfilter upstream fil.c r1.129 to fix
broken ipfilter rule matches (upstream bug #554). The upstream patch
was incomplete: it resolved all but one rule compare issue. The issue
fixed here is when "{to, reply-to, dup-to} interface" are used in
conjunction with "on interface". The match was only made if the on keyword
was specified in the same order in each case referencing the same rule.
This commit fixes this.
The reason for this is that interface name strings and comment keyword
comments are stored in a variable length field starting at fr_names
in the frentry struct. These strings are placed into this variable length
field in the order they are encountered by ipf_y.y and indexed through index
pointers in fr_ifnames, fr_comment or one of the frdest struct fd_name
fields. (Three frdest structs are within frentry.) Order matters and
this patch takes this into account.
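To illustrate why order matters, here is a simplified sketch of two rules whose names sit in a shared buffer at different offsets. The struct and function names are hypothetical and far smaller than the real frentry; the point is only that comparing the strings each offset refers to, rather than the raw offsets or the raw buffer, makes the match independent of parse order. This is one way to get that property, not necessarily how the committed fix does it.

```c
#include <string.h>

/* Hypothetical, simplified rule: names live in one buffer, referenced by offset. */
struct example_rule {
	int  on_ifname;		/* offset of the "on" interface name, -1 if unset */
	int  to_ifname;		/* offset of the "to" interface name, -1 if unset */
	char names[64];		/* strings stored in the order the parser saw them */
};

/* Resolve an offset to the string it refers to. */
static const char *
rule_name(const struct example_rule *r, int off)
{
	return (off < 0 ? "" : r->names + off);
}

/* Compare resolved names, so the storage order inside names[] is irrelevant. */
static int
rules_match(const struct example_rule *a, const struct example_rule *b)
{
	return (strcmp(rule_name(a, a->on_ifname),
	    rule_name(b, b->on_ifname)) == 0 &&
	    strcmp(rule_name(a, a->to_ifname),
	    rule_name(b, b->to_ifname)) == 0);
}
```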
While in here it was discovered that though ipfilter is designed to
support multiple interface specifications per rule (up to four), this
undocumented (the man page makes no mention of it) feature does not work.
A todo is to fix the multiple interfaces feature at a later date. To
understand the design decision as to why only four were intended, it is
suspected that the decision was made because Sun workstations and PCs
rarely if ever exceeded four NICs at the time. This is not true in 2019.
John Baldwin [Sat, 24 Aug 2019 00:35:59 +0000 (00:35 +0000)]
MFC 350551:
Don't reset memory attributes when mapping physical addresses for ACPI.
Previously, AcpiOsMemory was using pmap_mapbios which would always map
the requested address Write-Back (WB). For several AMD Ryzen laptops,
the BIOS uses AcpiOsMemory to directly access the PCI MCFG region in
order to access PCI config registers. This has the side effect of
remapping the MCFG region in the direct map as WB instead of UC,
hanging the laptops during boot.
On the one laptop I examined in detail, the _PIC global method used to
switch from 8259A PICs to I/O APICs uses a pair of PCI config space
registers at offset 0x84 in the device at 0:0:0 as a pair of
address/data registers to access an indirect register in the chipset
and clear a single bit to switch modes.
To fix, alter the semantics of pmap_mapbios() such that it does not
modify the attributes of any existing mappings and instead uses the
existing attributes. If a new mapping is created, this new mapping
uses WB (the default memory attribute).
Special thanks to the gentleman whose name I don't have who brought
two affected laptops to the hacker lounge at BSDCan. Direct access to
the affected systems permitted finding the root cause within an hour
or so.
Introduce VFNT_MAXDIMENSION to replace bare 128, add set_height, and
consistently use set_height and set_width.
Submitted by: Dmitry Wagin
MFC r348662: vtfontcvt: include width and height in verbose info
Submitted by: Dmitry Wagin
MFC r348668: vtfontcvt: zero memory allocated by xmalloc
Submitted by: Dmitry Wagin
MFC r348692: vtfontcvt: exit on error if the input font has too many glyphs
The kernel has a limit of 131072 glyphs in a font; add the same check to
vtfontcvt so that we won't create a font file that the kernel will not
load.
MFC r348796: vtfontcvt: allow out-of-order glyphs
Reported by: mi
MFC r349049: vtfontcvt: add comments in add_glyph
MFC r349100: vtfontcvt: improve BDF and hex font parsing
Support larger font sizes.
Submitted by: Dmitry Wagin (original version)
MFC r349101: vtfontcvt: initialize bbwbytes to avoid GCC 4.2.1 warning
MFC r349105: vtfontcvt: initialize another variable to quiet GCC warning
I believe this case could be triggered by a broken .bdf font.
MFC r349107: vtfontcvt: improve .bdf verification
Previously we would crash if the BBX y-offset was outside of the font
bounding box.
Reported by: afl fuzzer
MFC r349108: vtfontcvt: improve .bdf validation
Previously if we had a BBX entry that had invalid values (e.g. bounding
box outside of font bounding box) and failed sscanf (e.g., because it
had fewer than four values) we skipped the BBX value validation and then
triggered an assertion failure.
Reported by: afl fuzzer
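A minimal sketch of the validation pattern described here: check the sscanf return value first, then range-check the parsed values, and reject the line if either step fails. The helper name parse_bbx and its parameters are hypothetical and are not taken from vtfontcvt.

```c
#include <stdio.h>

/*
 * Hypothetical helper: parse "BBX w h xoff yoff" against a font bounding
 * box of fbbw x fbbh. Returns 0 on success, -1 on malformed or
 * out-of-range input.
 */
static int
parse_bbx(const char *line, int fbbw, int fbbh, int *w, int *h,
    int *xoff, int *yoff)
{
	/* Reject the line unless all four fields converted. */
	if (sscanf(line, "BBX %d %d %d %d", w, h, xoff, yoff) != 4)
		return (-1);
	/* Reject a glyph box that does not fit in the font bounding box. */
	if (*w < 0 || *h < 0 || *w > fbbw || *h > fbbh)
		return (-1);
	return (0);
}
```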
MFC r349111: vtfontcvt: correct typo in hex parsing update
Submitted by: Dmitry Wagin
MFC r349333: vtfontcvt: improve .bdf validation
Previously if we had a FONTBOUNDINGBOX or DWIDTH entry that had missing
or invalid values and failed sscanf, we would proceed with
partially initialized bounding box / device width variables.
MFC r350974:
Since ipovly is used for checksum calculation, part of the original IP
header is zeroed. This part includes the ip_ttl field, which can be used
later in IP_MINTTL socket option handling.
This snapshot among other things includes a fix for a crash of mandoc with empty
tbl reported by rea@ (his regression test has been incorporated upstream)
Toomas Soome [Thu, 22 Aug 2019 07:37:34 +0000 (07:37 +0000)]
loader: support com.delphix:removing
MFC r348353: boot1.efi should also provide Calloc
MFC r350772: loader: support com.delphix:removing
MFC r350825: loader: add error check for vdev_indirect calls
As a prerequisite, we need Calloc in boot1 so we would not conflict with libsa.
We should support removing a vdev from the boot pool. Update the loader zfs reader
to support com.delphix:removing.
Andriy Gapon [Thu, 22 Aug 2019 07:17:49 +0000 (07:17 +0000)]
MFC r350894: a stop gap fix for a race between dnode_hold and dnode_sync_free
The race was introduced in r337669, the large dnode feature import from
ZoL. The problem was debugged by ZoL developers and then,
independently, on FreeBSD.
The fix is an early proposal by Brian Behlendorf:
https://github.com/behlendorf/zfs/commit/50f32ed74e42aa28522e9681fb8ae55239fa33a7
This fix never went into ZoL. A larger change that was committed later
included a different solution because of the re-worked code.
Ideally, we want to revert this fix and re-synchronize FreeBSD large
dnode code with that in illumos (or newer ZoL). illumos has a later
import of the feature from ZoL that does not have the bug.
John Baldwin [Wed, 21 Aug 2019 23:44:46 +0000 (23:44 +0000)]
MFC 350666:
Tidy up the list of auth and encryption algorithms for IPsec stats.
- Use keyed-md5 and keyed-sha1 instead of md5 and sha1 to match
the names accepted by setkey and to also avoid confusion since
these are not "plain" MD5 or SHA1.
- Remove always-true #ifdef's to make the source a bit easier to
read.
- Add missing mappings for tcp-md5, camellia-cbc, and aes-gmac.
John Baldwin [Wed, 21 Aug 2019 22:42:08 +0000 (22:42 +0000)]
MFC 348970,348974:
Make the warning intervals for deprecated crypto algorithms tunable.
348970:
Make the warning intervals for deprecated crypto algorithms tunable.
New sysctl/tunables can now set the interval (in seconds) between
rate-limited crypto warnings. The new sysctls are:
- kern.cryptodev_warn_interval for /dev/crypto
- net.inet.ipsec.crypto_warn_interval for IPsec
- kern.kgssapi_warn_interval for KGSSAPI
348974:
Move declaration of warninterval out from under COMPAT_FREEBSD32.
This fixes builds of kernels without COMPAT_FREEBSD32.
Eric Joyner [Tue, 20 Aug 2019 20:15:32 +0000 (20:15 +0000)]
MFC various iflib fixes from head
Included revisions:
r347418 - iflib: use default ntxd and rxd when user value is not power of 2
r348372 - iflib: provide probe wrapper for vendor drivers
r350306 - iflib: fix dangling device softc pointer
r350507 - iflib: remove kobject class reference increment
r350509 - iflib: Prevent kernel panic caused by loading driver with a specific interrupt configuration
r351152 - iflib: add iflib_deregister to help cleanup on exit
John Baldwin [Tue, 20 Aug 2019 01:30:35 +0000 (01:30 +0000)]
MFC 348876: Add warnings to /dev/crypto for deprecated algorithms.
These are deprecated algorithms that will have no in-kernel
consumers in FreeBSD 13. Specifically, deprecate the following
algorithms:
- ARC4
- Blowfish
- CAST128
- DES
- 3DES
- MD5-HMAC
- Skipjack
John Baldwin [Tue, 20 Aug 2019 00:50:17 +0000 (00:50 +0000)]
MFC 348875:
Add warnings for Kerberos GSS algorithms deprecated in RFCs 6649 and 8429.
All of these algorithms are explicitly marked SHOULD NOT in one of these
RFCs.
Specifically, RFC 6649 deprecates all algorithms using DES as well as
the "export-friendly" variant of RC4. RFC 8429 deprecates Triple DES
and the remaining RC4 algorithms.
John Baldwin [Mon, 19 Aug 2019 22:31:04 +0000 (22:31 +0000)]
MFC 349467: Hold an explicit reference on the socket for the aiotx task.
Previously, the aiotx task relied on the aio jobs in the queue to hold
a reference on the socket. However, when the last job is completed,
there is nothing left to hold a reference to the socket buffer lock
used to check if the queue is empty. In addition, if the last job on
the queue is cancelled, the task can run with no queued jobs holding a
reference to the socket buffer lock the task uses to notice the queue
is empty.
Fix these races by holding an explicit reference on the socket when
the task is queued and dropping that reference when the task
completes.
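The ownership rule can be sketched with a toy, single-threaded model: the task takes its own reference when it is queued and drops it only after it has finished looking at the queue, so the object stays valid even if every job is gone by then. All names below are hypothetical; the real code uses the kernel's socket reference counting and locking, which this sketch does not model.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a socket with a reference count. */
struct example_sock {
	int  refs;		/* outstanding references */
	bool task_queued;	/* true while the task is pending */
	int  jobs;		/* queued jobs; may reach 0 before the task runs */
};

static void
queue_task(struct example_sock *so)
{
	if (so->task_queued)
		return;
	so->refs++;		/* the task takes its own reference */
	so->task_queued = true;
}

static void
run_task(struct example_sock *so)
{
	/* Safe even when jobs == 0: the task's reference keeps so alive. */
	while (so->jobs > 0)
		so->jobs--;	/* stand-in for servicing one job */
	so->task_queued = false;
	so->refs--;		/* drop the task's reference only when done */
}
```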
John Baldwin [Mon, 19 Aug 2019 21:59:02 +0000 (21:59 +0000)]
MFC 348874: Remove an overly-aggressive assertion.
While it is true that the new vmspace passed to vmspace_switch_aio
will always have a valid reference due to the AIO job or the extra
reference on the original vmspace in the worker thread, it is not true
that the old vmspace being switched away from will have more than one
reference.
Specifically, when a process with queued AIO jobs exits, the exit hook
in aio_proc_rundown will only ensure that all of the AIO jobs have
completed or been cancelled. However, the last AIO job might have
completed and woken up the exiting process before the worker thread
servicing that job has switched back to its original vmspace. In that
case, the process might finish exiting, dropping its reference to the
vmspace before the worker thread, resulting in the worker thread
dropping the last reference.
Emmanuel Vadot [Fri, 16 Aug 2019 21:40:39 +0000 (21:40 +0000)]
MFC r343952, r344003, r344219, r344343, r344456
r343952 by ganbold:
Enable necessary bits when activating interrupts. This allows
reading some events from the interrupt status registers. These events
are reported to devd via system "PMU" and subsystem "Battery", "AC"
and "USB" such as plugged/unplugged, absent, charged and charging.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D19116
r344003 by ganbold:
Add sensors support for AXP803/AXP813. Sensor values such as
battery charging, charge state, voltage, charging current, discharging current,
battery capacity etc. can be obtained via sysctl.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D19145
r344219 by ganbold:
Add sysctl for setting battery charging current.
The charging current can be set using steps
from 0: 200mA to 13: 2800mA (200mA/step).
While there, fix battery charging current related
sensor descriptions.
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D19212
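The step-to-current mapping stated above (step 0 is 200mA, step 13 is 2800mA, 200mA per step) works out to 200 + 200 * step. A small sketch, with a hypothetical function name:

```c
/* Hypothetical helper: charging current in mA for step 0..13, or -1 if out of range. */
static int
charge_step_to_ma(int step)
{
	if (step < 0 || step > 13)
		return (-1);
	return (200 + 200 * step);
}
```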
r344343 by ganbold:
Clarify notifications when battery capacity ratio
reaches warning and shutdown thresholds.
r344456 by ganbold:
Add base to the warning threshold.
Kyle Evans [Fri, 16 Aug 2019 21:14:27 +0000 (21:14 +0000)]
MFC r351078, r351085, r351088: mostly a nop (two commits + revert of two)
This commit is mostly a nop, but ends up renumbering #4 clause to #3 in one
copy of quad.h... this is OK; stand/ situation in stable/11 is pretty murky
and the commit that renumbered the clause got lost somewhere. quad.h will be
disappearing in a not-so-distant future MFC.
r351078:
stand: kick out quad.h
Use quad.h from libc instead for the time being. This reduces the number of
nearly-identical-quad.h we have in tree to two with only minor changes.
Prototypes for some *sh*di3 have been added to match the copy in libkern.
The differences between the two are likely few enough that they can perhaps
be merged with little additional effort to bring us down to 1.
r351085:
libc quad.h: one last _STANDALONE correction
Kyle Evans [Fri, 16 Aug 2019 21:01:35 +0000 (21:01 +0000)]
MFC r350464: kern_shm_open: push O_CLOEXEC into caller control
The motivation for this change is to allow wrappers around shm to be written
that don't set CLOEXEC. kern_shm_open currently accepts O_CLOEXEC but sets
it unconditionally. kern_shm_open is used by the shm_open(2) syscall, which
is mandated by POSIX to set CLOEXEC, and CloudABI's sys_fd_create1().
Presumably O_CLOEXEC is intended in the latter caller, but it's unclear from
the context.
sys_shm_open() now unconditionally sets O_CLOEXEC to meet POSIX
requirements, and a comment has been dropped in to kern_fd_open() to explain
the situation and add a pointer to where O_CLOEXEC setting is maintained for
shm_open(2) correctness. CloudABI's sys_fd_create1() also unconditionally
sets O_CLOEXEC to match previous behavior.
This also has the side-effect of making flags correctly reflect the
O_CLOEXEC status on this fd for the rest of kern_shm_open(), but a
glance-over leads me to believe that it didn't really matter.
Emmanuel Vadot [Fri, 16 Aug 2019 20:56:35 +0000 (20:56 +0000)]
MFC r349596 by ganbold:
Extend simple_mfd driver to expose a syscon interface if
that node is also compatible with syscon. For instance,
Rockchip RK3399's GRF (General Register Files) is compatible
with simple-mfd as well as syscon and has devices like
usb2-phy, emmc-phy and pcie-phy etc. under it.
Pedro F. Giffuni [Fri, 16 Aug 2019 04:51:04 +0000 (04:51 +0000)]
MFC r350969:
Add deprecation notice to snd_ds1(4).
As suggested in:
https://wiki.freebsd.org/WhatsGoing/FreeBSD13
We will be dropping the snd_ds1 driver. The driver is known to be buggy
and no one has been working on it for years now.
Users of old Yamaha cards may have luck with the OSS drivers instead.
Kyle Evans [Thu, 15 Aug 2019 17:40:48 +0000 (17:40 +0000)]
MFC r350576: ipfw: fix jail option after r348215
r348215 changed jail_getid(3) to validate passed-in jids as active jails
(as the function is documented to return -1 if the jail does not exist).
This broke the jail option (in some cases?) as the jail historically hasn't
needed to exist at the time of rule parsing; jids will get stored and later
applied.
Fix this caller to attempt to parse *av as a number first and just use it
as-is to match historical behavior. jail_getid(3) must still be used in
order for name arguments to work, but it's strictly a fallback in case we
weren't given a number.
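A sketch of the "number first, name lookup as fallback" parsing described here. The function lookup_jail_by_name is a hypothetical stub standing in for jail_getid(3) so the example is self-contained; the real caller in ipfw may differ in detail.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stub in place of jail_getid(3): knows one jail, "www". */
static int
lookup_jail_by_name(const char *name)
{
	return (strcmp(name, "www") == 0 ? 7 : -1);
}

/* Parse a jail argument: accept a plain number as-is, else look up the name. */
static int
parse_jail_arg(const char *arg)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(arg, &end, 10);
	/* Fully numeric and in range: use it without checking the jail exists. */
	if (errno == 0 && end != arg && *end == '\0' && v >= 0 && v <= INT_MAX)
		return ((int)v);
	/* Not a number: fall back to the name lookup. */
	return (lookup_jail_by_name(arg));
}
```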
John Baldwin [Wed, 14 Aug 2019 23:31:53 +0000 (23:31 +0000)]
MFC 348695: Support MSI-X for passthrough devices with a separate PBA BAR.
pci_alloc_msix() requires both the table and PBA BARs to be allocated
by the driver. ppt was only allocating the table BAR so would fail
for devices with the PBA in a separate BAR. Fix this by allocating
the PBA BAR before pci_alloc_msix() if it is stored in a separate BAR.
While here, release BARs after calling pci_release_msi() instead of
before. Also, don't call bus_teardown_intr() in error handling code
if bus_setup_intr() has just failed.
John Baldwin [Wed, 14 Aug 2019 23:28:43 +0000 (23:28 +0000)]
MFC 348694: Don't simulate PBA access if the PBA is in a separate BAR.
bhyve has to virtualize the MSI-X table to trap reads and writes to
that table and map those to virtual interrupts that it maps real host
interrupts on to. For the pending-bit-array (PBA), bhyve passes
accesses from the guest directly to the hardware.
bhyve's virtualization of the MSI-X table is done by intercepting all
reads and writes to the BAR holding the MSI-X table. However, if the
PBA is stored in the same BAR as the MSI-X table, accesses to the PBA
portion of this BAR have to be forwarded to the real BAR.
However, in the case that the PBA was stored in a separate BAR and
its offset in that separate BAR overlapped with the portion of the
MSI-X table BAR that the table used, the handlers for the table BAR
would incorrectly think that some accesses were PBA reads and writes.
This caused a crash in bhyve when it dereferenced a NULL pointer. Fix
this case by never trying to handle PBA access if the PBA lives in a
separate BAR.
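The decision can be sketched as a predicate: an access to the table BAR is treated as a PBA access only if the PBA actually shares that BAR and the offset falls inside the PBA range. The struct and field names are hypothetical and do not match bhyve's passthrough code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of a device's MSI-X layout. */
struct example_msix {
	int      table_bar;	/* BAR holding the MSI-X table */
	int      pba_bar;	/* BAR holding the PBA */
	uint64_t pba_offset;	/* PBA offset within pba_bar */
	uint64_t pba_size;	/* PBA size in bytes */
};

/* True only for table-BAR accesses that land in a PBA sharing that BAR. */
static bool
access_is_pba(const struct example_msix *m, int bar, uint64_t off)
{
	if (m->pba_bar != m->table_bar || bar != m->table_bar)
		return (false);
	return (off >= m->pba_offset && off < m->pba_offset + m->pba_size);
}
```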
John Baldwin [Wed, 14 Aug 2019 23:25:58 +0000 (23:25 +0000)]
MFC 347465: Apply r280991 to ip6_fragment.
This uses m_dup_pkthdr() to copy all of the metadata about a packet to
each of its fragments including VLAN tags, mbuf tags, etc. instead of
hand-copying a few fields.
John Baldwin [Wed, 14 Aug 2019 23:05:57 +0000 (23:05 +0000)]
MFC 346360: Push down INP_WLOCK slightly in tcp_ctloutput.
The inp lock is not needed for testing the V6 flag as that flag is set
once when the inp is created and never changes. For non-TCP socket
options the lock is immediately dropped after checking that flag.
This just pushes the lock down to only be acquired for TCP socket
options.
This isn't a hot-path, more a cosmetic cleanup I noticed while reading
the code.
Dimitry Andric [Wed, 14 Aug 2019 19:21:26 +0000 (19:21 +0000)]
MFC r350697:
Fix a possible segfault in wcsxfrm(3) and wcsxfrm_l(3).
If the length of the source wide character string, passed in via the
"size_t n" parameter, is set to zero, the function should only return
the required length for the destination wide character string. In this
case, it should *not* attempt to write to the destination, so the "dst"
parameter is permitted to be NULL.
However, when the internally called _collate_wxfrm() function returns an
error, such as when using the "C" locale, as a fallback wcscpy(3) or
wcsncpy(3) are used. But if the input length is zero, wcsncpy(3) will
be called with a length of -1! If the "dst" parameter is NULL, this
will immediately result in a segfault, or if "dst" is a valid pointer,
it will most likely result in unexpectedly overwritten memory.
Fix this by explicitly checking for an input length greater than zero,
before calling wcsncpy(3).
Note that a similar situation does not occur in strxfrm(3), the plain
character version of this function, as it uses strlcpy(3) for the error
case. The strlcpy(3) function does not write to the destination if the
input length is zero.
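A minimal sketch of the guarded fallback path, assuming the fallback simply copies the source string. The function name fallback_xfrm is hypothetical; it returns the source length and touches dst only when n is greater than zero, so a NULL dst with n == 0 is safe.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical fallback: copy at most n - 1 characters, then terminate. */
static size_t
fallback_xfrm(wchar_t *dst, const wchar_t *src, size_t n)
{
	size_t slen = wcslen(src);

	if (n > 0) {
		/* n - 1 cannot wrap here because n is at least 1. */
		wcsncpy(dst, src, n - 1);
		dst[n - 1] = L'\0';
	}
	return (slen);
}
```

Without the n > 0 check, n - 1 would wrap to the largest size_t value, which is the failure mode the commit message describes.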
Andrew Turner [Wed, 14 Aug 2019 17:02:36 +0000 (17:02 +0000)]
MFC r350112, r350241, and r350242:
r350166:
arm64: Implement HWCAP
Add HWCAP support for arm64.
The defines are the same as in Linux, and a userland program can use
elf_aux_info to get the data.
We only save the common denominator for all cores in case the
big and little cluster have different support (this is known to
exist even if we don't support those SoCs in FreeBSD)
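As a simplified illustration of "common denominator", treat each core's capabilities as a bitmask and keep only the bits every core reports. The function name is hypothetical, and the kernel's real logic works on ID register fields rather than a flat mask, so this is a sketch of the idea only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: capability bits supported by every core. */
static uint64_t
common_hwcap(const uint64_t *per_core, size_t ncores)
{
	uint64_t caps = ncores > 0 ? ~UINT64_C(0) : 0;
	size_t i;

	for (i = 0; i < ncores; i++)
		caps &= per_core[i];	/* drop bits any core lacks */
	return (caps);
}
```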
r350241:
Ensure the arm64 ID register fields are 64 bit types.
Previously only some of the ID register fields were 64 bit. To allow
for a script to generate these, mark them all 64 bit. To allow for their
use in assembly we need to use the UINT64_C macro via a new UL macro
to stop the lines from being too long.
Andrew Turner [Wed, 14 Aug 2019 16:54:51 +0000 (16:54 +0000)]
MFC r345510:
Sort printing of the ID registers on arm64 to be identical to the
documentation. This will simplify checking new fields when they are added.
Alan Somers [Mon, 12 Aug 2019 20:31:12 +0000 (20:31 +0000)]
MFC r349248, r349391, r350088
r349248:
fcntl: fix overflow when setting F_READAHEAD
VOP_READ and VOP_WRITE take the seqcount in blocks in a 16-bit field.
However, fcntl allows you to set the seqcount in bytes to any nonnegative
31-bit value. The result can be a 16-bit overflow, which will be
sign-extended in functions like ffs_read. Fix this by sanitizing the
argument in kern_fcntl. As a matter of policy, limit to IO_SEQMAX rather
than INT16_MAX.
Also, fifos have overloaded the f_seqcount field for a completely different
purpose ever since r238936. Formalize that by using a union type.
Reviewed by: cem
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20710
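A sketch of the sanitizing step: convert the byte count to blocks, rounding up, without overflowing int, then clamp to a small maximum. EX_SEQMAX is an assumed stand-in for IO_SEQMAX, and readahead_blocks is a hypothetical name; arg is assumed nonnegative and bsize at least 1.

```c
#include <limits.h>

#define EX_SEQMAX 0x7F	/* assumed cap, standing in for IO_SEQMAX */

/* Hypothetical helper: bytes to blocks, rounded up and clamped. */
static int
readahead_blocks(int arg, int bsize)
{
	int blocks;

	/* Keep arg + bsize - 1 from overflowing int. */
	if (arg > INT_MAX - bsize + 1)
		arg = INT_MAX - bsize + 1;
	blocks = (arg + bsize - 1) / bsize;
	return (blocks > EX_SEQMAX ? EX_SEQMAX : blocks);
}
```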
r349391:
fcntl: style changes to r349248
Reported by: bde
MFC-With: 349248
Sponsored by: The FreeBSD Foundation
r350088:
F_READAHEAD: Fix r349248's overflow protection, broken by r349391
I accidentally broke the main point of r349248 when making stylistic changes
in r349391. Restore the original behavior, and also fix an additional
overflow that was possible when uio->uio_resid was nearly SSIZE_MAX.
Reported by: cem
Reviewed by: bde
MFC-With: 349248
Sponsored by: The FreeBSD Foundation
Alan Somers [Mon, 12 Aug 2019 20:21:36 +0000 (20:21 +0000)]
MFC r349231, r349233, r349280, r349478
r349231:
Add FIOBMAP2 ioctl
This ioctl exposes VOP_BMAP information to userland. It can be used by
programs like fragmentation analyzers and optimized cp implementations. But
I'm using it to test fusefs's VOP_BMAP implementation. The "2" in the name
distinguishes it from the similar but incompatible FIBMAP ioctls in NetBSD
and Linux. FIOBMAP2 differs from FIBMAP in that it uses a 64-bit block
number instead of 32-bit, and it also returns runp and runb.
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20705
r349233:
#include <sys/types.h> from sys/filio.h
This fixes world build after r349231
Reported by: Jenkins
MFC-With: 349231
Sponsored by: The FreeBSD Foundation
r349280:
Reduce namespace pollution from r349233
Define __daddr_t in _types.h and use it in filio.h
Reported by: ian, bde
Reviewed by: ian, imp, cem
MFC-With: 349233
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20715
r349478:
FIOBMAP2: inline vn_ioc_bmap2
Reported by: kib
Reviewed by: kib
MFC-With: 349238
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20783
Alexander Motin [Mon, 12 Aug 2019 19:44:57 +0000 (19:44 +0000)]
MFC r350652 (by imp): Fix mismerge.
I merged passthru.c from the wrong branch (it was a branch that went further in
a direction I wound up not taking). Fix the mismerge and turn passthru on.
Alexander Motin [Mon, 12 Aug 2019 19:43:25 +0000 (19:43 +0000)]
MFC r350563: Add `nvmecontrol sanitize` command.
It allows deleting all user data from an NVM subsystem using one of 3 methods.
It is a close equivalent of the SCSI SANITIZE command issued by `camcontrol sanitize`,
so I tried to keep arguments as close as possible.
While there, fix supported sanitize methods reporting in `identify`.
Alexander Motin [Mon, 12 Aug 2019 19:39:31 +0000 (19:39 +0000)]
MFC r350523, r350524: Add IOCTL to translate nvdX into nvmeY and NSID.
While very useful by itself, it also frees `nvmecontrol` from parsing
hardcoded device names, which in turn makes it simple to take
nvdX (and potentially any other) device names as arguments.
The added IOCTL bypass from nvdX to the respective nvmeYnsZ also makes them
interchangeable for management purposes.
This adds several previously missed but important subcommands to list
namespaces and controllers. It also fixes a few previously added
subcommands that real testing showed to be broken.
Also while there, add the ability to explicitly specify the nsid for the
`nvmecontrol identify` subcommand. It may be useful for specifying nsids
that have no devices of their own, for example 0xffffffff, or newly created
ones.