FreeBSD/FreeBSD.git
4 years agoMFC r350779:
Mark Johnston [Wed, 21 Aug 2019 16:15:40 +0000 (16:15 +0000)]
MFC r350779:
Fix formatting.

PR: 239726

4 years agoMFC r350458:
Mark Johnston [Wed, 21 Aug 2019 16:15:07 +0000 (16:15 +0000)]
MFC r350458:
Use VNASSERT() in checked VOP wrappers.

4 years agoMFC various iflib fixes from head
Eric Joyner [Tue, 20 Aug 2019 20:15:32 +0000 (20:15 +0000)]
MFC various iflib fixes from head

Included revisions:
r347418 - iflib: use default ntxd and rxd when user value is not power of 2
r348372 - iflib: provide probe wrapper for vendor drivers
r350306 - iflib: fix dangling device softc pointer
r350507 - iflib: remove kobject class reference increment
r350509 - iflib: Prevent kernel panic caused by loading driver with a specific interrupt configuration
r351152 - iflib: add iflib_deregister to help cleanup on exit

Sponsored by: Intel Corporation
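
The descriptor-count rule from r347418 can be illustrated with a minimal
sketch. The function names below are hypothetical and are not taken from
iflib; the sketch only assumes that a user-supplied count is accepted when
it is a power of two and is replaced by the driver default otherwise.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when v has exactly one bit set, i.e. v is a power of two. */
bool
ex_is_power_of_2(uint32_t v)
{
    return (v != 0 && (v & (v - 1)) == 0);
}

/*
 * Hypothetical helper: keep the requested descriptor count only if it
 * is a power of two; otherwise fall back to the default.
 */
uint32_t
ex_pick_desc_count(uint32_t requested, uint32_t dflt)
{
    return (ex_is_power_of_2(requested) ? requested : dflt);
}
```

With a default of 512, a request for 1024 is kept, while a request for
1000 falls back to 512.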

4 years agoMFC r351262:
Mark Johnston [Tue, 20 Aug 2019 17:53:16 +0000 (17:53 +0000)]
MFC r351262:
Use a sleepable lock for midistat functions.

Security: CVE-2019-5612

4 years agoMFC r351254: mqueuefs: fix compat32 struct file leak
Ed Maste [Tue, 20 Aug 2019 17:45:22 +0000 (17:45 +0000)]
MFC r351254: mqueuefs: fix compat32 struct file leak

In a compat32 error case we previously leaked a struct file.

Submitted by: Karsten König, Secfault Security
Security: CVE-2019-5603

4 years agoMFC 348876: Add warnings to /dev/crypto for deprecated algorithms.
John Baldwin [Tue, 20 Aug 2019 01:30:35 +0000 (01:30 +0000)]
MFC 348876: Add warnings to /dev/crypto for deprecated algorithms.

These are deprecated algorithms that will have no in-kernel
consumers in FreeBSD 13.  Specifically, deprecate the following
algorithms:
- ARC4
- Blowfish
- CAST128
- DES
- 3DES
- MD5-HMAC
- Skipjack

Relnotes: yes

4 years agoMFC 348875:
John Baldwin [Tue, 20 Aug 2019 00:50:17 +0000 (00:50 +0000)]
MFC 348875:
Add warnings for Kerberos GSS algorithms deprecated in RFCs 6649 and 8429.

All of these algorithms are explicitly marked SHOULD NOT in one of these
RFCs.

Specifically, RFC 6649 deprecates all algorithms using DES as well as
the "export-friendly" variant of RC4.  RFC 8429 deprecates Triple DES
and the remaining RC4 algorithms.

Relnotes: yes

4 years agoMFC 349616: Fix description of debug.obsolete_panic.
John Baldwin [Mon, 19 Aug 2019 23:57:37 +0000 (23:57 +0000)]
MFC 349616: Fix description of debug.obsolete_panic.

4 years agoMFC 349467: Hold an explicit reference on the socket for the aiotx task.
John Baldwin [Mon, 19 Aug 2019 22:31:04 +0000 (22:31 +0000)]
MFC 349467: Hold an explicit reference on the socket for the aiotx task.

Previously, the aiotx task relied on the aio jobs in the queue to hold
a reference on the socket.  However, when the last job is completed,
there is nothing left to hold a reference to the socket buffer lock
used to check if the queue is empty.  In addition, if the last job on
the queue is cancelled, the task can run with no queued jobs holding a
reference to the socket buffer lock the task uses to notice the queue
is empty.

Fix these races by holding an explicit reference on the socket when
the task is queued and dropping that reference when the task
completes.
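
The reference-holding pattern described above can be sketched in
isolation. This is not the kernel code: the structure and function names
are hypothetical, and a plain integer stands in for the real socket
reference count. The point is only that the task owns its own reference
from enqueue to completion, so it no longer depends on queued jobs.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted socket. */
struct ex_sock {
    int  refs;          /* outstanding references */
    bool task_queued;   /* is the deferred task pending? */
};

void
ex_sock_hold(struct ex_sock *so)
{
    so->refs++;
}

/* Drop one reference; returns true when it was the last one. */
bool
ex_sock_rele(struct ex_sock *so)
{
    return (--so->refs == 0);
}

/* Queue the task, taking a reference owned by the task itself. */
void
ex_task_enqueue(struct ex_sock *so)
{
    if (!so->task_queued) {
        ex_sock_hold(so);
        so->task_queued = true;
    }
}

/*
 * Run the task. The object stays valid for the whole run even if every
 * job has completed or been cancelled, because the task's own reference
 * is dropped only at the end.
 */
bool
ex_task_run(struct ex_sock *so)
{
    so->task_queued = false;
    return (ex_sock_rele(so));
}
```

In this sketch, the last job dropping its reference leaves the count at
one while the task is pending, and the final release happens in
ex_task_run().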

4 years agoMFC 348874: Remove an overly-aggressive assertion.
John Baldwin [Mon, 19 Aug 2019 21:59:02 +0000 (21:59 +0000)]
MFC 348874: Remove an overly-aggressive assertion.

While it is true that the new vmspace passed to vmspace_switch_aio
will always have a valid reference due to the AIO job or the extra
reference on the original vmspace in the worker thread, it is not true
that the old vmspace being switched away from will have more than one
reference.

Specifically, when a process with queued AIO jobs exits, the exit hook
in aio_proc_rundown will only ensure that all of the AIO jobs have
completed or been cancelled.  However, the last AIO job might have
completed and woken up the exiting process before the worker thread
servicing that job has switched back to its original vmspace.  In that
case, the process might finish exiting, dropping its reference to the
vmspace before the worker thread does, resulting in the worker thread
dropping the last reference.

4 years agoMFC 348791: Fix debug trace after removal of pdu_overhead.
John Baldwin [Mon, 19 Aug 2019 18:50:56 +0000 (18:50 +0000)]
MFC 348791: Fix debug trace after removal of pdu_overhead.

4 years agoMFC 348969: Document sysctl nodes that translate their values.
John Baldwin [Mon, 19 Aug 2019 17:27:06 +0000 (17:27 +0000)]
MFC 348969: Document sysctl nodes that translate their values.

This documents the behavior of sysctl_msec_to_ticks.

The MFC does not document SYSCTL_{ADD,}_SBINTIME_[UM]SEC since those
are only present in head.

4 years agoMFC r350893: Allow ZVOL bookmarks to be listed recursively
Andriy Gapon [Mon, 19 Aug 2019 07:45:39 +0000 (07:45 +0000)]
MFC r350893: Allow ZVOL bookmarks to be listed recursively

PR: 197821

4 years agoMFC r350701,r350702: rc.8: add a reference to service(8)
Andriy Gapon [Mon, 19 Aug 2019 07:40:42 +0000 (07:40 +0000)]
MFC r350701,r350702: rc.8: add a reference to service(8)

4 years agoMFC of 351002
Kirk McKusick [Sat, 17 Aug 2019 06:06:50 +0000 (06:06 +0000)]
MFC of 351002

Clarify how FS_METACKHASH flag is managed.

4 years agoMFC r350108: Remove support for FreeBSD 10.x.
Xin LI [Sat, 17 Aug 2019 01:49:57 +0000 (01:49 +0000)]
MFC r350108: Remove support for FreeBSD 10.x.

4 years agoMFC r348968:
Doug Moore [Fri, 16 Aug 2019 21:54:12 +0000 (21:54 +0000)]
MFC r348968:
Avoid overflow in computing gap sizes.

Approved by: markj (mentor)

4 years agoMFC r343952, r344003, r344219, r344343, r344456
Emmanuel Vadot [Fri, 16 Aug 2019 21:40:39 +0000 (21:40 +0000)]
MFC r343952, r344003, r344219, r344343, r344456

r343952 by ganbold:
Enable necessary bits when activating interrupts. This allows
reading some events from the interrupt status registers. These events
are reported to devd via system "PMU" and subsystem "Battery", "AC"
and "USB" such as plugged/unplugged, absent, charged and charging.

Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D19116

r344003 by ganbold:
Add sensors support for AXP803/AXP813. Sensor values such as
battery charging, charge state, voltage, charging current, discharging current,
battery capacity etc. can be obtained via sysctl.

Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D19145

r344219 by ganbold:
Add sysctl for setting battery charging current.
The charging current can be set using steps
from 0: 200mA to 13: 2800mA (200mA/step).
While there, fix battery charging current related
sensor descriptions.

Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D19212
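
The step encoding described in r344219 (step 0 is 200mA, step 13 is
2800mA, 200mA per step) works out to mA = 200 * (step + 1). A small
sketch of the conversion follows; the function names are hypothetical and
do not come from the driver.

```c
/* Convert a charging-current step (0..13) to milliamps; -1 if invalid. */
int
ex_charge_step_to_ma(int step)
{
    if (step < 0 || step > 13)
        return (-1);
    return (200 + step * 200);
}

/* Convert milliamps to a step; -1 if not exactly representable. */
int
ex_charge_ma_to_step(int ma)
{
    if (ma < 200 || ma > 2800 || ma % 200 != 0)
        return (-1);
    return (ma / 200 - 1);
}
```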

r344343 by ganbold:
Clarify notifications when battery capacity ratio
reaches warning and shutdown thresholds.

r344456 by ganbold:
Add base to the warning threshold.

4 years agoMFC r348881:
Doug Moore [Fri, 16 Aug 2019 21:36:13 +0000 (21:36 +0000)]
MFC r348881:
Touch fewer entries when altering vm_map.

Approved by: markj (mentor)

4 years agoMFC r349637 by ganbold:
Emmanuel Vadot [Fri, 16 Aug 2019 21:28:28 +0000 (21:28 +0000)]
MFC r349637 by ganbold:

Fix build error introduced by r349596.

4 years agoMFC r351078, r351085, r351088: mostly a nop (two commits + revert of two)
Kyle Evans [Fri, 16 Aug 2019 21:14:27 +0000 (21:14 +0000)]
MFC r351078, r351085, r351088: mostly a nop (two commits + revert of two)

This commit is mostly a nop, but ends up renumbering #4 clause to #3 in one
copy of quad.h... this is OK; stand/ situation in stable/11 is pretty murky
and the commit that renumbered the clause got lost somewhere. quad.h will be
disappearing in a not-so-distant future MFC.

r351078:
stand: kick out quad.h

Use quad.h from libc instead for the time being. This reduces the number of
nearly-identical-quad.h we have in tree to two with only minor changes.

Prototypes for some *sh*di3 have been added to match the copy in libkern.
The differences between the two are likely few enough that they can perhaps
be merged with little additional effort to bring us down to 1.

r351085:
libc quad.h: one last _STANDALONE correction

r351088:
Revert r351078, r351085: stand/quad.h eviction

It did not go well; further examination is required...

4 years agoMFC r350630, r350657: static analysis fixes from Haiku
Kyle Evans [Fri, 16 Aug 2019 21:03:55 +0000 (21:03 +0000)]
MFC r350630, r350657: static analysis fixes from Haiku

r350630:
oce(4): potential out of bounds access before vector validation

r350657:
ral: rt2860: fix wcid2ni access/size issue

RT2860_WCID_MAX is supposed to describe the max STA index for wcid2ni, and
was instead being used as the size -- off-by-one.

rt2860_drain_stats_fifo was range-checking wcid only after a potential
out-of-bounds access.
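
Both problems in r350657 follow a common pattern, sketched here with
hypothetical names and an illustrative limit (the real RT2860_WCID_MAX
value is not reproduced). When a constant names the largest valid index,
the array needs one more element than that constant, and the index must
be validated before the array is touched.

```c
#include <stddef.h>

#define EX_WCID_MAX 7   /* illustrative: largest valid index */

struct ex_softc {
    void *wcid2ni[EX_WCID_MAX + 1];   /* indices 0..EX_WCID_MAX */
};

/* Range-check first, then access. */
void *
ex_lookup_wcid(struct ex_softc *sc, unsigned int wcid)
{
    if (wcid > EX_WCID_MAX)
        return (NULL);
    return (sc->wcid2ni[wcid]);
}
```

Sizing the array as EX_WCID_MAX (without the + 1) would make index
EX_WCID_MAX itself an out-of-bounds access, which is the off-by-one the
commit describes.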

4 years agoMFC r350464: kern_shm_open: push O_CLOEXEC into caller control
Kyle Evans [Fri, 16 Aug 2019 21:01:35 +0000 (21:01 +0000)]
MFC r350464: kern_shm_open: push O_CLOEXEC into caller control

The motivation for this change is to allow wrappers around shm to be written
that don't set CLOEXEC. kern_shm_open currently accepts O_CLOEXEC but sets
it unconditionally. kern_shm_open is used by the shm_open(2) syscall, which
is mandated by POSIX to set CLOEXEC, and CloudABI's sys_fd_create1().
Presumably O_CLOEXEC is intended in the latter caller, but it's unclear from
the context.

sys_shm_open() now unconditionally sets O_CLOEXEC to meet POSIX
requirements, and a comment has been dropped in to kern_fd_open() to explain
the situation and add a pointer to where O_CLOEXEC setting is maintained for
shm_open(2) correctness. CloudABI's sys_fd_create1() also unconditionally
sets O_CLOEXEC to match previous behavior.

This also has the side-effect of making flags correctly reflect the
O_CLOEXEC status on this fd for the rest of kern_shm_open(), but a
glance-over leads me to believe that it didn't really matter.

4 years agoMFC r349596 by ganbold:
Emmanuel Vadot [Fri, 16 Aug 2019 20:56:35 +0000 (20:56 +0000)]
MFC r349596 by ganbold:

Extend simple_mfd driver to expose a syscon interface if
that node is also compatible with syscon. For instance,
Rockchip RK3399's GRF (General Register Files) is compatible
with simple-mfd as well as syscon and has devices like
usb2-phy, emmc-phy and pcie-phy etc. under it.

Reviewed by: manu

4 years agoMFC r348880, r348882
Emmanuel Vadot [Fri, 16 Aug 2019 20:49:10 +0000 (20:49 +0000)]
MFC r348880, r348882

r348880 by loos:
Add the GPIO driver for the North/South bridge in Marvell Armada 37x0.

The A3700 has a different GPIO controller and thus does not use the old (and
shared) code for Marvell.

The pinctrl driver, also part of the controller, is not supported yet (but
the implementation should be straightforward).

Sponsored by: Rubicon Communications, LLC (Netgate)

r348882 by loos:
Add support for the GPIO SD Card VCC regulator/switch and the GPIO SD Card
detection pins to the Marvell Xenon SDHCI controller.

These features are enabled by the 'vqmmc-supply' and 'cd-gpios' properties in the
DTS.

This fixes the SD Card detection on espressobin.

Sponsored by: Rubicon Communications, LLC (Netgate)

4 years agoMFC r350450, r350540:
Mark Johnston [Fri, 16 Aug 2019 15:31:46 +0000 (15:31 +0000)]
MFC r350450, r350540:
Enable witness(4) blessings.

4 years agoMFC r350671:
Mark Johnston [Fri, 16 Aug 2019 15:25:53 +0000 (15:25 +0000)]
MFC r350671:
readelf: Close input files when done with them.

4 years agoMFC r350679:
Mark Johnston [Fri, 16 Aug 2019 15:24:04 +0000 (15:24 +0000)]
MFC r350679:
Merge r3780 from elftoolchain.

4 years agoMFC r350696:
Mark Johnston [Fri, 16 Aug 2019 15:23:43 +0000 (15:23 +0000)]
MFC r350696:
Use designated initializers for vmm_ops.

4 years agoMFC r350816:
Andrey V. Elsukov [Fri, 16 Aug 2019 12:27:19 +0000 (12:27 +0000)]
MFC r350816:
  Add missing new line in several log messages.

  PR: 239694

4 years agoMFC r350970:
Pedro F. Giffuni [Fri, 16 Aug 2019 04:53:02 +0000 (04:53 +0000)]
MFC r350970:
Add deprecation notice to snd_maestro(4).

As suggested in:
https://wiki.freebsd.org/WhatsGoing/FreeBSD13

this old driver is buggy and no one is working on it so we should deprecate
it for the next release.

4 years agoMFC r350969:
Pedro F. Giffuni [Fri, 16 Aug 2019 04:51:04 +0000 (04:51 +0000)]
MFC r350969:
Add deprecation notice to snd_ds1(4).

As suggested in:
https://wiki.freebsd.org/WhatsGoing/FreeBSD13

We will be dropping the snd_ds1 driver. The driver is known to be buggy
and no one has been working on it for years now.
Users of old Yamaha cards may have luck with the OSS drivers instead.

4 years agopmc: restore "unhalted-cycles" alias
Matt Macy [Thu, 15 Aug 2019 21:39:21 +0000 (21:39 +0000)]
pmc: restore "unhalted-cycles" alias

Reported by: mav@

4 years agoMFC r350576: ipfw: fix jail option after r348215
Kyle Evans [Thu, 15 Aug 2019 17:40:48 +0000 (17:40 +0000)]
MFC r350576: ipfw: fix jail option after r348215

r348215 changed jail_getid(3) to validate passed-in jids as active jails
(as the function is documented to return -1 if the jail does not exist).
This broke the jail option (in some cases?) as the jail historically hasn't
needed to exist at the time of rule parsing; jids will get stored and later
applied.

Fix this caller to attempt to parse *av as a number first and just use it
as-is to match historical behavior. jail_getid(3) must still be used in
order for name arguments to work, but it's strictly a fallback in case we
weren't given a number.
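
The number-first, name-as-fallback parsing described above can be
sketched as follows. This is not the ipfw source: ex_parse_jail_arg() and
ex_stub_lookup() are hypothetical, and the stub merely stands in for
jail_getid(3) so the example is self-contained.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical resolver type standing in for jail_getid(3). */
typedef int (*ex_lookup_fn)(const char *name);

/* Stub resolver for illustration: knows one jail name only. */
int
ex_stub_lookup(const char *name)
{
    return (strcmp(name, "www") == 0 ? 7 : -1);
}

/*
 * Try to parse the argument as a plain number and use it as-is, so the
 * jail need not exist yet. Fall back to a name lookup only when the
 * argument is not entirely numeric.
 */
int
ex_parse_jail_arg(const char *arg, ex_lookup_fn lookup)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(arg, &end, 10);
    if (errno == 0 && end != arg && *end == '\0' &&
        v >= 0 && v <= INT_MAX)
        return ((int)v);
    return (lookup(arg));
}
```

Under these assumptions "42" yields 42 without consulting the resolver,
"www" resolves through the stub to 7, and an unknown name yields -1.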

4 years agoMFC 348695: Support MSI-X for passthrough devices with a separate PBA BAR.
John Baldwin [Wed, 14 Aug 2019 23:31:53 +0000 (23:31 +0000)]
MFC 348695: Support MSI-X for passthrough devices with a separate PBA BAR.

pci_alloc_msix() requires both the table and PBA BARs to be allocated
by the driver.  ppt was only allocating the table BAR so would fail
for devices with the PBA in a separate BAR.  Fix this by allocating
the PBA BAR before pci_alloc_msix() if it is stored in a separate BAR.

While here, release BARs after calling pci_release_msi() instead of
before.  Also, don't call bus_teardown_intr() in error handling code
if bus_setup_intr() has just failed.

4 years agoMFC 348694: Don't simulate PBA access if the PBA is in a separate BAR.
John Baldwin [Wed, 14 Aug 2019 23:28:43 +0000 (23:28 +0000)]
MFC 348694: Don't simulate PBA access if the PBA is in a separate BAR.

bhyve has to virtualize the MSI-X table to trap reads and writes to
that table and map those to virtual interrupts that it maps real host
interrupts on to.  For the pending-bit-array (PBA), bhyve passes
accesses from the guest directly to the hardware.

bhyve's virtualization of the MSI-X table is done by intercepting all
reads and writes to the BAR holding the MSI-X table.  However, if the
PBA is stored in the same BAR as the MSI-X table, accesses to the PBA
portion of this BAR have to be forwarded to the real BAR.

However, in the case that the PBA was stored in a separate BAR and
its offset in that separate BAR overlapped with the portion of the
MSI-X table BAR that the table used, the handlers for the table BAR
would incorrectly think that some accesses were PBA reads and writes.
This caused a crash in bhyve when it dereferenced a NULL pointer.  Fix
this case by never trying to handle PBA access if the PBA lives in a
separate BAR.
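
The decision the fix makes can be sketched as a small predicate. The
structure and names are hypothetical and do not mirror bhyve's data
structures; the sketch only assumes that an access to the table BAR may
be treated as a PBA access when the PBA actually shares that BAR and the
offset falls inside the PBA range.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of where the MSI-X table and PBA live. */
struct ex_msix_layout {
    int      table_bar;    /* BAR index holding the MSI-X table */
    int      pba_bar;      /* BAR index holding the PBA */
    uint64_t pba_offset;   /* PBA offset within pba_bar */
    uint64_t pba_size;     /* PBA size in bytes */
};

/*
 * For an access at 'offset' within the table BAR, report whether it
 * should be forwarded as a PBA access. A PBA in a separate BAR never
 * matches, even if its offset numerically overlaps the table BAR.
 */
bool
ex_access_hits_pba(const struct ex_msix_layout *m, uint64_t offset)
{
    if (m->pba_bar != m->table_bar)
        return (false);
    return (offset >= m->pba_offset &&
        offset - m->pba_offset < m->pba_size);
}
```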

4 years agoMFC 347465: Apply r280991 to ip6_fragment.
John Baldwin [Wed, 14 Aug 2019 23:25:58 +0000 (23:25 +0000)]
MFC 347465: Apply r280991 to ip6_fragment.

This uses m_dup_pkthdr() to copy all of the metadata about a packet to
each of its fragments including VLAN tags, mbuf tags, etc. instead of
hand-copying a few fields.

4 years agoMFC 346360: Push down INP_WLOCK slightly in tcp_ctloutput.
John Baldwin [Wed, 14 Aug 2019 23:05:57 +0000 (23:05 +0000)]
MFC 346360: Push down INP_WLOCK slightly in tcp_ctloutput.

The inp lock is not needed for testing the V6 flag as that flag is set
once when the inp is created and never changes.  For non-TCP socket
options the lock is immediately dropped after checking that flag.
This just pushes the lock down to only be acquired for TCP socket
options.

This isn't a hot-path, more a cosmetic cleanup I noticed while reading
the code.

4 years agoMFC r350697:
Dimitry Andric [Wed, 14 Aug 2019 19:21:26 +0000 (19:21 +0000)]
MFC r350697:

Fix a possible segfault in wcsxfrm(3) and wcsxfrm_l(3).

If the length of the source wide character string, passed in via the
"size_t n" parameter, is set to zero, the function should only return
the required length for the destination wide character string.  In this
case, it should *not* attempt to write to the destination, so the "dst"
parameter is permitted to be NULL.

However, when the internally called _collate_wxfrm() function returns an
error, such as when using the "C" locale, as a fallback wcscpy(3) or
wcsncpy(3) are used.  But if the input length is zero, wcsncpy(3) will
be called with a length of -1!  If the "dst" parameter is NULL, this
will immediately result in a segfault, or if "dst" is a valid pointer,
it will most likely result in unexpectedly overwritten memory.

Fix this by explicitly checking for an input length greater than zero,
before calling wcsncpy(3).

Note that a similar situation does not occur in strxfrm(3), the plain
character version of this function, as it uses strlcpy(3) for the error
case.  The strlcpy(3) function does not write to the destination if the
input length is zero.
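
The guarded fallback copy can be sketched as below. This is an
illustration modelled on the description above, not the libc source;
ex_fallback_xfrm() is a hypothetical name. It returns the length the
destination would need and writes to dst only when n is greater than
zero, so a NULL dst with n == 0 is safe.

```c
#include <stddef.h>
#include <wchar.h>

/*
 * Copy src to dst as an untransformed fallback, writing at most n wide
 * characters including the terminator. Returns wcslen(src).
 */
size_t
ex_fallback_xfrm(wchar_t *dst, const wchar_t *src, size_t n)
{
    size_t slen = wcslen(src);

    if (slen < n) {
        /* Everything fits, terminator included. */
        wcscpy(dst, src);
    } else if (n > 0) {
        /* Truncate; without this guard n - 1 would wrap for n == 0. */
        wcsncpy(dst, src, n - 1);
        dst[n - 1] = L'\0';
    }
    return (slen);
}
```

Because n is a size_t, evaluating n - 1 with n == 0 wraps around to the
largest representable value, which is the "length of -1" the commit
message refers to.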

4 years agoMFC r350112, r350241, and r350242:
Andrew Turner [Wed, 14 Aug 2019 17:02:36 +0000 (17:02 +0000)]
MFC r350112, r350241, and r350242:

r350166:
arm64: Implement HWCAP

Add HWCAP support for arm64.
defines are the same as in Linux and a userland program can use
elf_aux_info to get the data.
We only save the common denominator for all cores in case the
big and little clusters have different support (this is known to
exist even if we don't support those SoCs in FreeBSD).

Differential Revision: https://reviews.freebsd.org/D17137

r350241:
Ensure the arm64 ID register fields are 64 bit types.

Previously only some of the ID register fields were 64 bit. To allow
for a script to generate these mark them all 64 bit. To allow for their
use in assembly we need to use the UINT64_C macro via a new UL macro
to stop the lines from being too long.

Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D20977

r350242:
As with r350241 use the new UL macro on the main register mask.

Sponsored by: DARPA, AFRL

4 years agoMFC r345510:
Andrew Turner [Wed, 14 Aug 2019 16:54:51 +0000 (16:54 +0000)]
MFC r345510:
Sort printing of the ID registers on arm64 to be identical to the
documentation. This will simplify checking new fields when they are added.

4 years agoMFC r339593:
Andrew Turner [Wed, 14 Aug 2019 16:45:16 +0000 (16:45 +0000)]
MFC r339593:
Fix the ID_AA64ISAR0_EL1 dot product field shift.

It's 44 in the documentation, use this correct value.

4 years agoMFC r339592:
Andrew Turner [Wed, 14 Aug 2019 16:40:23 +0000 (16:40 +0000)]
MFC r339592:
Correctly set the DAIF bits in new threads

We should only unmask interrupts when creating a new thread and leave the
other exceptions in the same state as before creating the thread.

4 years agoMFC r350497: ppp: correct echo-req magic number on big endian archs
Ed Maste [Wed, 14 Aug 2019 13:14:47 +0000 (13:14 +0000)]
MFC r350497: ppp: correct echo-req magic number on big endian archs

The magic number is a 32-bit quantity; use uint32_t to match hton's
return type and avoid sending zeros (upper 32 bits) on big-endian
architectures.

PR: 184141
Sponsored by: The FreeBSD Foundation

4 years agoMFC r350484, r350607, r350608:
Konstantin Belousov [Wed, 14 Aug 2019 09:56:58 +0000 (09:56 +0000)]
MFC r350484, r350607, r350608:
Make randomized stack gap between strings and pointers to argv/envs.

4 years agoMFC r350757:
Konstantin Belousov [Wed, 14 Aug 2019 09:49:28 +0000 (09:49 +0000)]
MFC r350757:
Update comment explaining create_init().

4 years agoMFC r350758:
Konstantin Belousov [Wed, 14 Aug 2019 09:44:48 +0000 (09:44 +0000)]
MFC r350758:
Fix stack grow for init.

4 years agoMFC r350675:
Hans Petter Selasky [Wed, 14 Aug 2019 09:43:27 +0000 (09:43 +0000)]
MFC r350675:
Correct PCI device ID for XHCI USB controller.

Submitted by: Dmitry Luhtionov <dmitryluhtionov@gmail.com>
Sponsored by: Mellanox Technologies

4 years agoMFC r350396:
Hans Petter Selasky [Wed, 14 Aug 2019 09:40:12 +0000 (09:40 +0000)]
MFC r350396:
Add support for tethering with Nokia 7 plus and the like.

PR: 239495
Sponsored by: Mellanox Technologies

4 years agoMFC r350861:
Konstantin Belousov [Wed, 14 Aug 2019 09:39:39 +0000 (09:39 +0000)]
MFC r350861:
wait(2): clarify reparenting of children of the exiting process.

4 years agoMFC r350860:
Konstantin Belousov [Wed, 14 Aug 2019 09:38:55 +0000 (09:38 +0000)]
MFC r350860:
wait(2): split long line by using .Fo/.Fa instead of .Ft.

4 years agoMFC r350481, r350483:
Konstantin Belousov [Wed, 14 Aug 2019 09:37:43 +0000 (09:37 +0000)]
MFC r350481, r350483:
Avoid conflicts with libc symbols in libthr jump table.

4 years agoMFC r350855: Upgrade to Bzip2 version 1.0.8.
Xin LI [Wed, 14 Aug 2019 06:39:20 +0000 (06:39 +0000)]
MFC r350855: Upgrade to Bzip2 version 1.0.8.

4 years agoMFC r350961: Missed part of r350523.
Alexander Motin [Tue, 13 Aug 2019 19:17:44 +0000 (19:17 +0000)]
MFC r350961: Missed part of r350523.

4 years agoMFC r350639:
Konstantin Belousov [Tue, 13 Aug 2019 13:47:03 +0000 (13:47 +0000)]
MFC r350639:
amd64: prevents speculations over swapgs reload of %gs base.

4 years agoMFC r348250:
Piotr Kubaj [Mon, 12 Aug 2019 23:44:03 +0000 (23:44 +0000)]
MFC r348250:
Add snd_hda(4) to GENERIC64 used by powerpc64.

amd64 also has snd_hda(4) in GENERIC.

Approved by: jhibbits (src committer), linimon (mentor)

4 years agoMFC r350207:
Alan Somers [Mon, 12 Aug 2019 20:32:47 +0000 (20:32 +0000)]
MFC r350207:

VOP_PATHCONF.9: correct the type of the retval argument

It was changed from int to register_t in r22521 and from register_t to long
in r328099, but the man page wasn't updated either time.

4 years agoMFC r349248, r349391, r350088
Alan Somers [Mon, 12 Aug 2019 20:31:12 +0000 (20:31 +0000)]
MFC r349248, r349391, r350088

r349248:
fcntl: fix overflow when setting F_READAHEAD

VOP_READ and VOP_WRITE take the seqcount in blocks in a 16-bit field.
However, fcntl allows you to set the seqcount in bytes to any nonnegative
31-bit value. The result can be a 16-bit overflow, which will be
sign-extended in functions like ffs_read. Fix this by sanitizing the
argument in kern_fcntl. As a matter of policy, limit to IO_SEQMAX rather
than INT16_MAX.

Also, fifos have overloaded the f_seqcount field for a completely different
purpose ever since r238936.  Formalize that by using a union type.

Reviewed by: cem
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20710
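
The sanitizing step in r349248 can be sketched as a clamp. The code below
is an illustration, not the kern_fcntl source: the function name is
hypothetical, and EX_IO_SEQMAX is assumed to be 0x7F for the purposes of
the example. The byte count is converted to blocks in a 64-bit
intermediate so the rounding itself cannot overflow, and the result is
capped well below the 16-bit limit.

```c
#include <limits.h>
#include <stdint.h>

#define EX_IO_SEQMAX 0x7F   /* assumed cap, for illustration only */

/*
 * Convert a nonnegative read-ahead size in bytes to a block count,
 * rounding up and clamping to EX_IO_SEQMAX.
 */
int
ex_readahead_blocks(int bytes, int bsize)
{
    int64_t blocks = ((int64_t)bytes + bsize - 1) / bsize;

    return (blocks > EX_IO_SEQMAX ? EX_IO_SEQMAX : (int)blocks);
}
```

Without the clamp, a request of INT_MAX bytes with 4096-byte blocks would
yield 524288 blocks, which does not fit in a 16-bit field.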

r349391:
fcntl: style changes to r349248

Reported by: bde
MFC-With: 349248
Sponsored by: The FreeBSD Foundation

r350088:
F_READAHEAD: Fix r349248's overflow protection, broken by r349391

I accidentally broke the main point of r349248 when making stylistic changes
in r349391.  Restore the original behavior, and also fix an additional
overflow that was possible when uio->uio_resid was nearly SSIZE_MAX.

Reported by: cem
Reviewed by: bde
MFC-With: 349248
Sponsored by: The FreeBSD Foundation

4 years agoMFC r349231, r349233, r349280, r349478
Alan Somers [Mon, 12 Aug 2019 20:21:36 +0000 (20:21 +0000)]
MFC r349231, r349233, r349280, r349478

r349231:
Add FIOBMAP2 ioctl

This ioctl exposes VOP_BMAP information to userland. It can be used by
programs like fragmentation analyzers and optimized cp implementations. But
I'm using it to test fusefs's VOP_BMAP implementation. The "2" in the name
distinguishes it from the similar but incompatible FIBMAP ioctls in NetBSD
and Linux.  FIOBMAP2 differs from FIBMAP in that it uses a 64-bit block
number instead of 32-bit, and it also returns runp and runb.

Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20705

r349233:
#include <sys/types.h> from sys/filio.h

This fixes world build after r349231

Reported by: Jenkins
MFC-With: 349231
Sponsored by: The FreeBSD Foundation

r349280:
Reduce namespace pollution from r349233

Define __daddr_t in _types.h and use it in filio.h

Reported by: ian, bde
Reviewed by: ian, imp, cem
MFC-With: 349233
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20715

r349478:
FIOBMAP2: inline vn_ioc_bmap2

Reported by: kib
Reviewed by: kib
MFC-With: 349238
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20783

4 years agoMFC r350652 (by imp): Fix mismerge.
Alexander Motin [Mon, 12 Aug 2019 19:44:57 +0000 (19:44 +0000)]
MFC r350652 (by imp): Fix mismerge.

I merged passthru.c from the wrong branch (it was a branch that went further in
a direction I wound up not taking). Fix the mismerge and turn passthru on.

4 years agoMFC r350599, r350609: Add `nvmecontrol resv` to handle NVMe reservations.
Alexander Motin [Mon, 12 Aug 2019 19:44:28 +0000 (19:44 +0000)]
MFC r350599, r350609: Add `nvmecontrol resv` to handle NVMe reservations.

NVMe reservations are quite similar to SCSI persistent reservations and
can be used in clustered setups with shared multiport storage.

Relnotes: yes

4 years agoMFC r350563: Add `nvmecontrol sanitize` command.
Alexander Motin [Mon, 12 Aug 2019 19:43:25 +0000 (19:43 +0000)]
MFC r350563: Add `nvmecontrol sanitize` command.

It allows deleting all user data from the NVM subsystem using one of 3 methods.
It is a close equivalent of SCSI SANITIZE command of `camcontrol sanitize`,
so I tried to keep arguments as close as possible.

While there, fix supported sanitize methods reporting in `identify`.

MFC after: yes

4 years agoMFC r350555: Fix parameter check broken at r350057.
Alexander Motin [Mon, 12 Aug 2019 19:42:43 +0000 (19:42 +0000)]
MFC r350555: Fix parameter check broken at r350057.

4 years agoMFC r350553: Add more random bits from NVMe 1.4.
Alexander Motin [Mon, 12 Aug 2019 19:42:07 +0000 (19:42 +0000)]
MFC r350553: Add more random bits from NVMe 1.4.

4 years agoMFC r350541: Decode a few more NVMe log pages.
Alexander Motin [Mon, 12 Aug 2019 19:41:35 +0000 (19:41 +0000)]
MFC r350541: Decode a few more NVMe log pages.

In particular: Changed Namespace List, Commands Supported and Effects,
Reservation Notification, Sanitize Status.

Add a few new arguments to the `nvmecontrol log` subcommand.

4 years agoMFC r350529, r350530: Add more new fields and values from NVMe 1.4.
Alexander Motin [Mon, 12 Aug 2019 19:40:40 +0000 (19:40 +0000)]
MFC r350529, r350530: Add more new fields and values from NVMe 1.4.

4 years agoMFC r350523, r350524: Add IOCTL to translate nvdX into nvmeY and NSID.
Alexander Motin [Mon, 12 Aug 2019 19:39:31 +0000 (19:39 +0000)]
MFC r350523, r350524: Add IOCTL to translate nvdX into nvmeY and NSID.

While very useful by itself, it also frees `nvmecontrol` from depending on
hardcoded device name parsing, which in turn makes it simple to take
nvdX (and potentially any other) device names as arguments.

The newly added IOCTL bypass from nvdX to the respective nvmeYnsZ also
makes them interchangeable for management purposes.

4 years agoMFC r350477: Feature-complete NVMe Namespace Management.
Alexander Motin [Mon, 12 Aug 2019 19:38:10 +0000 (19:38 +0000)]
MFC r350477: Feature-complete NVMe Namespace Management.

This adds several previously missed but important subcommands to list
namespaces and controllers.  It also fixes a few previously added
subcommands that real testing has just shown to be broken.

Also while there, add the possibility to explicitly specify the nsid for
the `nvmecontrol identify` subcommand.  It may be useful to specify nsids
that have no devices of their own, for example 0xffffffff, or newly
created ones.

4 years agoMFC r350462: Tune some commands description.
Alexander Motin [Mon, 12 Aug 2019 18:59:20 +0000 (18:59 +0000)]
MFC r350462: Tune some commands description.

4 years agoMFC r350461: Fix usage printing for nested subcommands.
Alexander Motin [Mon, 12 Aug 2019 18:58:51 +0000 (18:58 +0000)]
MFC r350461: Fix usage printing for nested subcommands.

Instead of `nvmecontrol create` should be `nvmecontrol ns create`, etc.

4 years agoMFC r350399: Add some new fields and bits from NVMe 1.4.
Alexander Motin [Mon, 12 Aug 2019 18:58:20 +0000 (18:58 +0000)]
MFC r350399: Add some new fields and bits from NVMe 1.4.

4 years agoMFC r350333 (by imp): Widen the type for to.
Alexander Motin [Mon, 12 Aug 2019 18:57:46 +0000 (18:57 +0000)]
MFC r350333 (by imp): Widen the type for to.

The timeout field in the CAPS register is defined to be 8 bits, so its type was
uint8_t. We recently started adding 1 to it to cope with rogue devices that
listed 0 timeout time (which is impossible). However, in so doing, other devices
that list 0xff (for a 2 minute timeout) were broken when adding 1
overflowed. Widen the type to be uint32_t like its source register to avoid the
issue.
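
The overflow described above is easy to reproduce in isolation. The two
functions below are hypothetical and only contrast the narrow and widened
arithmetic; they are not the driver code.

```c
#include <stdint.h>

/* Narrow version: the result is truncated back to 8 bits. */
uint8_t
ex_timeout_narrow(uint8_t to)
{
    return ((uint8_t)(to + 1));
}

/* Widened version: the sum is kept in a 32-bit type. */
uint32_t
ex_timeout_wide(uint8_t to)
{
    return ((uint32_t)to + 1);
}
```

For a device reporting 0xff, the narrow version wraps to 0 while the
widened version yields 256. For a device reporting 0, both yield 1.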

4 years agoMFC r350311 (by imp):
Alexander Motin [Mon, 12 Aug 2019 18:56:46 +0000 (18:56 +0000)]
MFC r350311 (by imp):
Fix the fix to the logic bug. Upon further testing, the bug is that we shadow
opt.vendor with vendor. We shouldn't. Delete the latter and use the former
everywhere and restore the prior logic which is now correct.

4 years agoMFC r350309 (by imp): Fix several related coverity issues:
Alexander Motin [Mon, 12 Aug 2019 18:56:11 +0000 (18:56 +0000)]
MFC r350309 (by imp): Fix several related coverity issues:

Make sure to always free shortopts and lopts when returning.
Fix minor logic bug to guard against NULLs properly.

CID: 1403654, 1403656, 1403658

4 years agoMFC r350147 (by imp): Keep track of the number of commands that exhaust their retry...
Alexander Motin [Mon, 12 Aug 2019 18:55:36 +0000 (18:55 +0000)]
MFC r350147 (by imp): Keep track of the number of commands that exhaust their retry limit.

While we print failure messages on the console, sometimes logs are lost or
overwhelmed. Keeping a count of how many times we've failed retriable commands
helps get a magnitude of the problem.

4 years agoMFC r350146 (by imp): Keep track of the number of retried commands.
Alexander Motin [Mon, 12 Aug 2019 18:54:58 +0000 (18:54 +0000)]
MFC r350146 (by imp): Keep track of the number of retried commands.

Retried commands can indicate a performance degradation of an nvme drive. Keep
track of the number of retries and report it out via sysctl, just like number of
commands and interrupts.

4 years agoMFC r350120 (by imp): Use sysctl + CTLRWTUN for hw.nvme.verbose_cmd_dump.
Alexander Motin [Mon, 12 Aug 2019 18:54:24 +0000 (18:54 +0000)]
MFC r350120 (by imp): Use sysctl + CTLRWTUN for hw.nvme.verbose_cmd_dump.

Also convert it to a bool. While the rest of the driver isn't yet bool clean,
this will help.

4 years agoMFC r350118 (by imp): Provide new tunable hw.nvme.verbose_cmd_dump
Alexander Motin [Mon, 12 Aug 2019 18:53:53 +0000 (18:53 +0000)]
MFC r350118 (by imp): Provide new tunable hw.nvme.verbose_cmd_dump

The nvme driver dumps only the most relevant details about a command when it
fails. However, there are times this is not sufficient (such as debugging weird
issues for a new drive with a vendor). Setting hw.nvme.verbose_cmd_dump=1
in loader.conf will enable more complete debugging information about each
command that fails.

4 years agoMFC r350114 (by imp):
Alexander Motin [Mon, 12 Aug 2019 18:53:01 +0000 (18:53 +0000)]
MFC r350114 (by imp):
Provide macros to extract the sub-fields of the CAP_LO and CAP_HI registers.

These macros make places where we extract these easier to read. The shift and
mask stuff is also a bit tedious and error prone. Start with the CAP_LO and
CAP_HI registers since their scope is somewhat constrained. This is a style
change only, no functional changes.
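
A field-extraction macro of the kind described might look like the sketch
below. The EX_-prefixed names are hypothetical and are not the macros the
commit added. The bit positions follow the NVMe CAP register layout, in
which MQES occupies bits 15:0 and TO occupies bits 31:24 of the low
32 bits.

```c
#include <stdint.h>

/* Shift and mask for each sub-field of the low half of CAP. */
#define EX_CAP_LO_MQES_SHIFT    0
#define EX_CAP_LO_MQES_MASK     0xffffu
#define EX_CAP_LO_TO_SHIFT      24
#define EX_CAP_LO_TO_MASK       0xffu

/* Accessors: one readable name per field instead of open-coded shifts. */
#define EX_CAP_LO_MQES(x) \
    (((x) >> EX_CAP_LO_MQES_SHIFT) & EX_CAP_LO_MQES_MASK)
#define EX_CAP_LO_TO(x) \
    (((x) >> EX_CAP_LO_TO_SHIFT) & EX_CAP_LO_TO_MASK)

/* Example use: read the timeout field from a raw register value. */
uint32_t
ex_cap_timeout(uint32_t cap_lo)
{
    return (EX_CAP_LO_TO(cap_lo));
}
```

Keeping the shift and mask next to each other under one name is what
makes call sites easier to read and harder to get wrong.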

4 years agoMFC r350094 (by imp): Remove now-obsolete comment.
Alexander Motin [Mon, 12 Aug 2019 18:52:10 +0000 (18:52 +0000)]
MFC r350094 (by imp): Remove now-obsolete comment.

4 years agoMFC r350068 (by imp): Assume that the timeout value from the capacity is 1-based
Alexander Motin [Mon, 12 Aug 2019 18:51:35 +0000 (18:51 +0000)]
MFC r350068 (by imp): Assume that the timeout value from the capacity is 1-based

Neither the 1.3 nor the 1.4 standard says this number is 1-based, but adding 1
costs little and cheaply copes with those NVMe drives that report '0' in this
field. This is consistent with what the Linux driver does as well.
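A minimal sketch of the arithmetic, assuming the field in question is the TO field in bits 31:24 of the low CAP dword, which the NVMe specification expresses in 500 ms units. The function name is hypothetical and this is not the driver's code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert the timeout field to milliseconds,
 * treating the raw value as 1-based so that a drive reporting 0
 * still gets a non-zero timeout.
 */
static inline uint32_t
ex_ready_timeout_ms(uint32_t cap_lo)
{
	uint32_t to = (cap_lo >> 24) & 0xffu;	/* raw field, 0..255 */

	return ((to + 1) * 500);		/* 500 ms per unit */
}
```

With this reading a drive that reports 0 waits 500 ms rather than not at all, and the extra unit is negligible for drives that report a realistic value.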

4 years agoMFC r350058 (by imp): Implement {io,admin}-passthru commands.
Alexander Motin [Mon, 12 Aug 2019 18:50:57 +0000 (18:50 +0000)]
MFC r350058 (by imp): Implement {io,admin}-passthru commands.

These are mostly compatible with Linux, with three exceptions.
1. We don't do metadata segment stuff. Our passthrough interface
   doesn't cope. The code is there, but generates an error.
2. Linux lets you specify a namespace ID for the command. We currently
   do not: we get ours from the namespace device, or pass in a generic
   one. Generally, this will lead to the same command, but FreeBSD's
   is safer since you can't specify the wrong id.
3. --show-command outputs to stderr instead of stdout so you can both
   see your command, and capture its output with a simple redirect.

4 years agoMFC r350057 (by imp): Create generic command / arg parsing routines
Alexander Motin [Mon, 12 Aug 2019 18:50:26 +0000 (18:50 +0000)]
MFC r350057 (by imp): Create generic command / arg parsing routines

Create a set of routines and structures to hold the data for the args
for a command. Use them to generate help and to parse args. Convert
all the current commands over to the new format. "comnd" is a hat-tip
to the TOPS-20 %COMND JSYS that (very) loosely inspired much of the
subsequent command line notions in the industry, but this is far
simpler (the %COMND man page is longer than this code) and not in the
kernel... Also, it implements today's de-facto
        command [verb]+ [opts]* [args]*
format rather than the old, archaic TOPS-20 command format :)
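To make the command [verb]+ [opts]* [args]* shape concrete, here is a small sketch that splits an argument vector into those three runs. It is not the comnd code; the structure, function names and the verb table are assumptions for illustration, and options are treated as simple flags that start with '-'.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: how many words fell into each run. */
struct ex_parsed_cmd {
	int nverbs;	/* leading words that matched the verb table */
	int nopts;	/* following words that begin with '-' */
	int nargs;	/* everything that remains */
};

/* Return 1 if word appears in the NULL-terminated verb table. */
static int
ex_is_verb(const char *word, const char *const *verbs)
{
	for (; *verbs != NULL; verbs++)
		if (strcmp(word, *verbs) == 0)
			return (1);
	return (0);
}

/* Split argv[1..] into verbs, then options, then arguments. */
static struct ex_parsed_cmd
ex_split_cmdline(int argc, char *const argv[], const char *const *verbs)
{
	struct ex_parsed_cmd p = { 0, 0, 0 };
	int i = 1;	/* argv[0] is the command itself */

	while (i < argc && ex_is_verb(argv[i], verbs)) {
		p.nverbs++;
		i++;
	}
	while (i < argc && argv[i][0] == '-') {
		p.nopts++;
		i++;
	}
	p.nargs = argc - i;
	return (p);
}
```

For a hypothetical invocation such as `tool show status -v -x disk0` with `show` and `status` in the verb table, this yields two verbs, two options and one argument. A real parser would also attach values to options and generate help from the same tables.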

This is a snapshot of a work in progress to get the nvme passthru
stuff committed. In time it will become a private library and used
by some other programs in the tree that conform to the above pattern.

4 years agoMFC r348495 (by imp):
Alexander Motin [Mon, 12 Aug 2019 18:49:32 +0000 (18:49 +0000)]
MFC r348495 (by imp):
Since a fatal trap can happen at arbitrary times, don't panic when the
completions are not in a consistent state. Cope with the different
places the normal I/O completion polling thread can be interrupted and
then re-entered during a kernel panic + dump.

4 years agoMFC r347939 (by scottl): Better formatting for the logpage section
Alexander Motin [Mon, 12 Aug 2019 18:48:47 +0000 (18:48 +0000)]
MFC r347939 (by scottl): Better formatting for the logpage section

4 years agoMFC r347369 (by imp): rename nvme_ctrlr_destroy_qpair to nvme_ctrlr_destroy_qpairs
Alexander Motin [Mon, 12 Aug 2019 18:48:17 +0000 (18:48 +0000)]
MFC r347369 (by imp): rename nvme_ctrlr_destroy_qpair to nvme_ctrlr_destroy_qpairs

Maintain symmetry with nvme_ctrlr_create_qpairs, making it easier to
match init/uninit scenarios.

4 years agoMFC r344955 (by imp):
Alexander Motin [Mon, 12 Aug 2019 18:47:40 +0000 (18:47 +0000)]
MFC r344955 (by imp):
Don't print all the I/O we abort on a reset, unless we're out of
retries.

When resetting the controller, we abort I/O. Prior to this fix, we
printed a ton of abort messages for I/O that we're going to
retry. This imparts no useful information. Stop printing them unless
our retry count is exhausted. Clarify code for when we don't retry,
and remove useless arg to a routine that's always called with it
as 'true'. All the other debug is still printed (including multiple
reset messages if we have multiple timeouts before the taskqueue
runs the actual reset) so that we know when we reset.
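The policy described above can be sketched as a single decision: stay quiet while a retry is still possible and report only once the retry budget is spent. The structure, the limit and the counters below are hypothetical stand-ins, not the driver's own names.

```c
#include <stdbool.h>

#define EX_MAX_RETRIES 5	/* hypothetical retry limit */

struct ex_request {
	int retries;		/* how many times this request was retried */
};

/*
 * Handle a request aborted by a controller reset.
 * Returns true when the abort is worth printing, i.e. the request
 * is out of retries and is about to be failed.
 */
static bool
ex_abort_request(struct ex_request *req, unsigned *num_retries,
    unsigned *num_failures)
{
	if (req->retries < EX_MAX_RETRIES) {
		req->retries++;
		(*num_retries)++;
		return (false);	/* will be retried: no message */
	}
	(*num_failures)++;
	return (true);		/* retries exhausted: print it */
}
```

The two counters are included only to show where retry and failure totals could be accumulated; how the driver exports them is not shown here.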

4 years agoMFC r344469 (by imp): Rework logpage extensibility.
Alexander Motin [Mon, 12 Aug 2019 17:56:16 +0000 (17:56 +0000)]
MFC r344469 (by imp): Rework logpage extensibility.

Move from using a linker set to a constructor function that's
called. This simplifies the code and is slightly more obvious.  We now
keep a list of page decoders rather than the array we managed
before. Commands will move to something similar in the future.
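One way to picture constructor-based registration is sketched below. It assumes the GCC/Clang `__attribute__((constructor))` extension, which runs the marked function before `main`; the structure, macro and function names are hypothetical and not the ones nvmecontrol uses. Log page identifiers 0x01 (error information) and 0x02 (SMART / health information) come from the NVMe specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder record kept on a singly linked list. */
struct ex_page_decoder {
	uint8_t log_page;		/* NVMe log page identifier */
	const char *name;		/* human-readable name */
	struct ex_page_decoder *next;	/* next registered decoder */
};

static struct ex_page_decoder *ex_decoders;	/* list head */

/* Push a decoder onto the front of the list. */
static void
ex_decoder_register(struct ex_page_decoder *d)
{
	d->next = ex_decoders;
	ex_decoders = d;
}

/* Find the decoder for a log page, or NULL if none registered. */
static const struct ex_page_decoder *
ex_decoder_find(uint8_t log_page)
{
	const struct ex_page_decoder *d;

	for (d = ex_decoders; d != NULL; d = d->next)
		if (d->log_page == log_page)
			return (d);
	return (NULL);
}

/* Define a decoder plus a constructor that registers it at startup. */
#define EX_DECODER(sym, page, nm)					\
	static struct ex_page_decoder sym = { (page), (nm), NULL };	\
	static void __attribute__((constructor)) sym##_ctor(void)	\
	{								\
		ex_decoder_register(&sym);				\
	}

EX_DECODER(ex_error_page, 0x01, "error")
EX_DECODER(ex_health_page, 0x02, "health")
```

Compared with a linker set, nothing has to walk a special section: each decoder adds itself to the list, and code loaded later can register in the same way.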

4 years agoMFC r344191 (by imp): Remove write-only s_flag.
Alexander Motin [Mon, 12 Aug 2019 17:55:25 +0000 (17:55 +0000)]
MFC r344191 (by imp): Remove write-only s_flag.

4 years agoMFC r342358 (by imp): Try the first 256 units with nvmecontrol devlist.
Alexander Motin [Mon, 12 Aug 2019 17:54:28 +0000 (17:54 +0000)]
MFC r342358 (by imp): Try the first 256 units with nvmecontrol devlist.

The nvmecontrol code that did the devlist assumed that we had a
tightly-packed allocation of units. Since PCI wiring exists, this
isn't the case. Loop over the first 256 units, which is a reasonable
number of possible units.
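A minimal sketch of scanning a fixed range of unit numbers instead of stopping at the first gap. The function name and the path prefix parameter are assumptions; the real devlist code does more with each controller it opens.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define EX_MAX_UNITS 256	/* scan units 0..255 */

/*
 * Count device nodes named <prefix><unit> that can be opened.
 * A missing unit is skipped rather than ending the scan, so
 * sparse unit numbering is handled.
 */
static int
ex_count_units(const char *prefix)
{
	char path[64];
	int found = 0;

	for (int unit = 0; unit < EX_MAX_UNITS; unit++) {
		snprintf(path, sizeof(path), "%s%d", prefix, unit);
		int fd = open(path, O_RDONLY);
		if (fd < 0)
			continue;	/* gap in numbering: keep going */
		found++;
		close(fd);
	}
	return (found);
}
```

Calling `ex_count_units("/dev/nvme")` would then find a controller at unit 7 even when units 0 through 6 do not exist.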

4 years agoMFC r341664 (by imp):
Alexander Motin [Mon, 12 Aug 2019 17:53:23 +0000 (17:53 +0000)]
MFC r341664 (by imp):
Update paths based on last-minute changes from libexec to lib.

4 years agoMFC r341663 (by imp):
Alexander Motin [Mon, 12 Aug 2019 17:52:45 +0000 (17:52 +0000)]
MFC r341663 (by imp):
Declare global function print_intel_add_smart in header

4 years agoMFC r341662 (by imp): Use proper prototypes.
Alexander Motin [Mon, 12 Aug 2019 17:52:03 +0000 (17:52 +0000)]
MFC r341662 (by imp): Use proper prototypes.

4 years agoMFC r341661 (by imp): It's useful to have this be a global function.
Alexander Motin [Mon, 12 Aug 2019 17:51:28 +0000 (17:51 +0000)]
MFC r341661 (by imp): It's useful to have this be a global function.

Other vendors base their additional smart info pages on what Intel did
plus some other bits. So it's convenient to have this be global.

4 years agoMFC r341660 (by imp): This is not a samsung standard, so remove that alias.
Alexander Motin [Mon, 12 Aug 2019 17:50:18 +0000 (17:50 +0000)]
MFC r341660 (by imp): This is not a samsung standard, so remove that alias.

This was never documented, and isn't needed, so it's best removed to
avoid confusion.

4 years agoMFC r341659 (by imp): Move intel and wdc files to their own modules
Alexander Motin [Mon, 12 Aug 2019 17:49:44 +0000 (17:49 +0000)]
MFC r341659 (by imp): Move intel and wdc files to their own modules

Move the intel and wdc vendor specific stuff to their own modules.

4 years agoMFC r341658 (by imp): Const poison the command interface
Alexander Motin [Mon, 12 Aug 2019 17:49:06 +0000 (17:49 +0000)]
MFC r341658 (by imp): Const poison the command interface

Make the pointers we pass into the commands const, also make the
linker set mirrors const.

4 years agoMFC r341657 (by imp): Dynamically load .so modules to expand functionality
Alexander Motin [Mon, 12 Aug 2019 17:48:14 +0000 (17:48 +0000)]
MFC r341657 (by imp): Dynamically load .so modules to expand functionality

o Dynamically load all the .so files found in /libexec/nvmecontrol and
  /usr/local/libexec/nvmecontrol.
o Link nvmecontrol -rdynamic so that its symbols are visible to the
  libraries we load.
o Create concatenated linker sets that we dynamically expand.
o Add the linked-in top and logpage linker sets to the mirrors for them
  and add those sets to the mirrors when we load a new .so.
o Add some macros to help hide the names of the linker sets.
o Update the man page.
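The directory scan in the first item can be sketched with the standard `opendir`/`readdir` and `dlopen` calls. This is only an illustration under assumptions: the function names are hypothetical, the directory is passed in by the caller, and none of the linker-set bookkeeping from the other items is shown.

```c
#include <dirent.h>
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if name ends in ".so" and has a non-empty stem. */
static int
ex_has_so_suffix(const char *name)
{
	size_t len = strlen(name);

	return (len > 3 && strcmp(name + len - 3, ".so") == 0);
}

/*
 * Try to dlopen() every .so file in dir.
 * Returns the number of modules loaded; a missing directory
 * is not treated as an error.
 */
static int
ex_load_dir(const char *dir)
{
	DIR *d = opendir(dir);
	struct dirent *de;
	char path[1024];
	int loaded = 0;

	if (d == NULL)
		return (0);
	while ((de = readdir(d)) != NULL) {
		if (!ex_has_so_suffix(de->d_name))
			continue;
		snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
		if (dlopen(path, RTLD_NOW | RTLD_GLOBAL) != NULL)
			loaded++;
		else
			fprintf(stderr, "%s: %s\n", path, dlerror());
	}
	closedir(d);
	return (loaded);
}
```

The handles are deliberately kept open for the life of the process. For a loaded module to call back into the main program, the executable has to export its symbols, which is what linking with -rdynamic provides. On some systems the program must also be linked with -ldl.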

4 years agoMFC r341416 (by imp): Fix typo in comment
Alexander Motin [Mon, 12 Aug 2019 17:42:44 +0000 (17:42 +0000)]
MFC r341416 (by imp): Fix typo in comment

4 years agoMFC r341415 (by imp): Delete the undocumented alias 'wds'.
Alexander Motin [Mon, 12 Aug 2019 17:41:57 +0000 (17:41 +0000)]
MFC r341415 (by imp): Delete the undocumented alias 'wds'.

This was a typo for wdc. Eliminate it since it was in error. People
should use either 'wdc' or 'hgst' for the vendor from now on. 'hgst'
works for all versions this functionality is present for.