Ian Lepore [Sun, 23 Jun 2019 16:00:29 +0000 (16:00 +0000)]
MFC r348141, r348143
r348141:
Handle the driftfile option correctly when ntpd_flags is empty.
The logic I originally wrote to detect whether a driftfile option was in the
set of flags was based on the result of removing the pattern *flag* being an
empty string. That didn't handle the case where the string was empty to
begin with. Doh! So now it also specifically checks for an empty string.
The result of the bad check was that ntpd would run without a driftfile, but
it would do so only if it was running as root instead of the non-priveleged
ntpd user, which isn't a typical case. Ntpd runs fine without a driftfile,
although it does take it longer to stabilize the clock frequency at startup.
Reported by: avg@
Pointy hat: ian@
r348143:
Remove accidentally-added blank line; the style throughout this file
is to use no whitespace between a comment block and the code it describes.
Ian Lepore [Sun, 23 Jun 2019 15:58:46 +0000 (15:58 +0000)]
MFC r348123, r348164, r348166
r348123:
Add pnp info to the imx_i2c driver.
r348164:
Mark i2c slave devices busy while they own the bus.
Many i2c slave drivers are in modules that can be unloaded. If they detach
while IO is in progress the bus would be hung forever. Conversely,
lower-layer drivers (iicbus and the hardware driver) also live in modules
and other kinds of bad things happen if they get detached while IO is in
progress. Because device_busy() propagates up to parents, marking the slave
device busy while it owns the bus solves both kinds of problems that come
with detaching i2c devices while IO is in progress.
r348166:
Release the bus-recovery gpio pins in detach(), so that unload then
reload of the module works without "pin already allocated" errors.
Ian Lepore [Sun, 23 Jun 2019 15:55:41 +0000 (15:55 +0000)]
MFC r348120:
Add a new 'tr' (transfer) mode to i2c(8) to support more i2c controllers.
Some i2c controller hardware does not provide a way to do individual START,
REPEAT-START and STOP actions on the i2c bus. Instead, they can only do
a complete transfer as a single operation. Typically they can do either
START-data-STOP or START-data-REPEATSTART-data-STOP. In the i2c driver
framework, this corresponds to the iicbus_transfer method. In the userland
interface they are initiated with the I2CRDWR ioctl command.
These changes add a new 'tr' mode which can be specified with the '-m'
command line option. This mode should work on all hardware; when an i2c
controller driver doesn't directly support the iicbus_transfer method,
code in the i2c driver framework uses the lower-level START/REPEAT/STOP
methods to implement the transfer. After this new mode has gotten some
testing on various hardware, the 'tr' mode should probably become the
new default mode.
Alan Somers [Sun, 23 Jun 2019 13:44:06 +0000 (13:44 +0000)]
MFC r348737:
Add a testing facility to manually reclaim a vnode
Add the debug.try_reclaim_vnode sysctl. When a pathname is written to it, it
will be reclaimed, as long as it isn't already or doomed. The purpose is to
gain test coverage for vnode reclamation, which is otherwise hard to
achieve.
Add the debug.ftry_reclaim_vnode sysctl. It does the same thing, except
that its argument is a file descriptor instead of a pathname.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20519
Alan Somers [Sun, 23 Jun 2019 13:35:01 +0000 (13:35 +0000)]
MFC r348251:
Remove "struct ucred*" argument from vtruncbuf
vtruncbuf takes a "struct ucred*" argument. AFAICT, it's been unused ever
since that function was first added in r34611. Remove it. Also, remove some
"struct ucred" arguments from fuse and nfs functions that were only used by
vtruncbuf.
Reviewed by: cem
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20377
Glen Barber [Thu, 20 Jun 2019 14:34:45 +0000 (14:34 +0000)]
MFC r349160:
Fix passing ${CONF_FILES} (which contains MAKE_CONF and
SRC_CONF, __MAKE_CONF and SRCCONF, respectively) through
to arm_install_base() and chroot_arm_build_release().
This prevents failures when the target image is intended
to be build with make.conf(5) and src.conf(5) overrides,
which are correctly handled for non-embedded image builds.
Michael Tuexen [Thu, 20 Jun 2019 07:50:38 +0000 (07:50 +0000)]
MFC r348728:
r347382 added receiver side DSACK support for the TCP base stack.
The corresponding changes for the RACK stack where missed and are added
by this commit.
Andriy Gapon [Thu, 20 Jun 2019 06:53:59 +0000 (06:53 +0000)]
drm2/intel_iic: stop using iicbus_set_nostop
The desired mode of transmitting messages is implemented by subclassing
iicbb driver and overriding its iicbus_transfer method with an almost
identical copy that issues the stop condition only at the very end.
iicbus_set_nostop is very broken and is set to be removed from the KPI.
This is a direct commit as in head the drm drivers have been moved out
of the tree.
The same change has been committed to FreeBSDDesktop/drm-legacy.
Cy Schubert [Thu, 20 Jun 2019 05:01:35 +0000 (05:01 +0000)]
MFC r349152:
Make ipf_objbytes a constant. ipf_objbytes is a table of internal data
structures that are saved across reboots by ipfs(8). The table is not
changed at runtime.
Alexander Motin [Thu, 20 Jun 2019 01:18:15 +0000 (01:18 +0000)]
MFC r348764: Allow UMA hash tables to expand faster then 2x in 20 seconds.
ZFS ABD allocates tons of 4KB chunks via UMA, requiring huge hash tables.
With initial hash table size of only 32 elements it takes ~20 expansions
or ~400 seconds to adapt to handling 220GB ZFS ARC. During that time not
only the hash table is highly inefficient, but also each of those expan-
sions takes significant time with the lock held, blocking operation.
On my test system with 256GB of RAM and ZFS pool of 28 HDDs this change
reduces time needed to first time read 240GB from ~300-400s, during which
system is quite busy and unresponsive, to only ~150s with light CPU load
and just 5 sub-second CPU spikes to expand the hash table.
MFC r349192:
Add the ability to limit how much the code will fragment the RACK send map
in response to SACKs. The default behavior is unchanged; however, the
limit can be activated by changing the new net.inet.tcp.rack.split_limit
sysctl.
Ed Maste [Wed, 19 Jun 2019 14:57:51 +0000 (14:57 +0000)]
MFC r347228: makesyscalls: use @generated tag in generated files
Multiple tools use @generated to identify generated files (for example,
in a review Phabricator will by default hide diffs in generated files).
Use the @generated tag in makesyscalls.sh as we've done for other
generated files.
Jamie Gritton [Tue, 18 Jun 2019 23:49:13 +0000 (23:49 +0000)]
Unmount filesystems on jail removal with "-f", to get around a situation
where the jail root vnode reference is stopping the filesystem from
unmounting, when the jail is removed by still exists in a dying state.
Rick Macklem [Mon, 17 Jun 2019 00:37:55 +0000 (00:37 +0000)]
MFC: r347583
Replace global list for grouplist with list(s) for each exportlist element.
In mountd.c, the grouplist structures are linked into a single global
linked list headed by "grphead". The only use of this linked list is
to free all list elements when the exportlist elements are also all being
free'd at the time the exports are being reloaded.
This patch replaces this one global linked list head with a list head in
each exportlist structure, where the grouplist elements for that exported
file system are linked.
The only change is that now the grouplist elements are free'd with the
associated exportlist element as they are free'd instead of all grouplist
elements being free'd after the exportlist elements are free'd. This
change should have no effect in practice.
This is being done, since a future patch that will add a "-I" option for
incrementally updating the exports in the kernel needs to know which
grouplist elements are associated with each exported file system and
having them linked into a list headed by the exportlist element does that.
Rick Macklem [Mon, 17 Jun 2019 00:20:39 +0000 (00:20 +0000)]
MFC: r347498
Factor code into two new functions in preparation for a future commit.
Factor code into two functions.
read_exportfile() a functon which reads the exports file(s) and calls
get_exportlist_one() to process each of them.
delete_export() a function which deletes the exports in the kernel for a file
system.
The contents of these functions is just the same code as was used to do the
operations, moved into separate functions. As such, there is no semantic change.
This is being done in preparation for a future commit that will add an
option to do incremental changes of kernel exports upon receiving SIGHUP.
Rick Macklem [Mon, 17 Jun 2019 00:00:12 +0000 (00:00 +0000)]
MFC: r347476
Factor out some exportlist list operations into separate functions.
This patch moves the code that removes and frees all exportlist elements
out into a separate function called free_exports().
It does the same for the insertion of a new exportlist entry into a list.
It also adds a second argument to ex_search() for the list to use.
None of these changes have any semantic effect. They are being done to
prepare the code for future patches that convert the single linked list
for the exportlist to a hash table of lists and a patch that will do
incremental changes of exports in the kernel.
And it fixes the argument for SLIST_HEAD_INITIALIZER() to a pointer,
which doesn't really matter, since SLIST_HEAD_INITIALIZER() doesn't use
the argument.
Marius Strobl [Sun, 16 Jun 2019 15:30:07 +0000 (15:30 +0000)]
MFC: r347222
o Avoid determining the MAC class (LEM/EM or IGB) - possibly even multiple
times - on every interrupt by using an own set of device methods for the
IGB class. This translates to introducing igb_if_intr_{disable,enable}()
and igb_if_{rx,tx}_queue_intr_enable() with that IGB-specific code moved
out of their EM counterparts and otherwise continuing to use the EM IFDI
methods also for IGB.
Note that igb_if_intr_{disable,enable}() also issue E1000_WRITE_FLUSH as
lost with the conversion of igb(4) to iflib(4).
Also note, that the em_if_{disable,enable}_intr() methods are renamed to
em_if_intr_{disable,enable}() for consistency with the names used in the
interface declaration.
o In em_intr():
- Don't bother to bail out if the interrupt type is "legacy", i. e. INTx
or MSI, as iflib(4) doesn't use ift_legacy_intr methods for MSI-X. All
other iflib(4)-based drivers avoid this check, too.
- Given that only the MSI-X interrupts have one-shot behavior (by taking
advantage of the EIAC register), explicitly disable interrupts. Hence,
em_intr() now matches what {em,igb}_irq_fast() previously did (in case
of igb(4) supposedly also to work around MSI message reordering errata
on certain systems).
o In em_if_intr_disable():
- Clear the EIAC register unconditionally for 82574 and not just in case
of MSI-X, matching em_if_intr_enable() and bringing back the last hunk
of r206437 lost with the iflib(4) conversion.
- Write to EM_EIAC for clearing said register instead of to the IGB-only
E1000_EIAC used ever since the iflib(4) conversion.
Marius Strobl [Sun, 16 Jun 2019 15:25:46 +0000 (15:25 +0000)]
MFC: r347221, r347245
o Use iflib_fast_intr_rxtx() also for "legacy" interrupts, i. e. INTx and
MSI. Unlike as with iflib_fast_intr_ctx(), the former will also enqueue
_task_fn_tx() in addition to _task_fn_rx() if appropriate, bringing TCP
TX throughput of EM-class devices on par with the MSI-X case and, thus,
close to wirespeed/pre-iflib(4) times again. [1]
Note that independently of the interrupt type, the UDP performance with
these MACs still is abysmal and nowhere near to where it was before the
conversion of em(4) to iflib(4).
o In iflib_init_locked(), announce which free list failed to set up.
o In _task_fn_tx() when running netmap(4), issue ifdi_intr_enable instead
of the ifdi_tx_queue_intr_enable method in case of a "legacy" interrupt
as the latter is valid with MSI-X only.
o Instead of adding the missing - and apparently convoluted enough that a
DBG_COUNTER_INC was put into a wrong spot in _task_fn_rx() - checks for
ifdi_{r,t}x_queue_intr_enable being available in the MSI-X case also to
iflib_fast_intr_rxtx(), factor these out to iflib_device_register() and
make the checks fail gracefully rather than panic. This avoids invoking
the checks at runtime over and over again in iflib_fast_intr_rxtx() and
_task_fn_{r,t}x() - even if it's just in case of INVARIANTS - and makes
these functions more readable.
o In iflib_rx_structures_setup(), only initialize LRO resources if device
and driver have LRO capability in order to not waste memory. Also, free
the LRO resources again if setting them up fails for one of the queues.
However, don't bother invoking iflib_rx_sds_free() in that case because
iflib_rx_structures_setup() doesn't call iflib_rxsd_alloc() either (and
iflib_{device,pseudo}_register() will issue iflib_rx_sds_free() in case
of failure via iflib_rx_structures_free(), but there definitely is some
asymmetry left to be fixed, though).
o Similarly, free LRO resources again in iflib_rx_structures_free().
o In iflib_irq_set_affinity(), handle get_core_offset() errors gracefully
instead of panicing (but only in case of INVARIANTS). This is a follow-
up to r344132 (MFCed to stable/12 in r344163) as such bugs shouldn't be
fatal.
o Likewise, handle unknown iflib_intr_type_t in iflib_irq_alloc_generic()
gracefully, too.
o Bring yet more sanity to iflib_msix_init():
- If the device doesn't provide enough MSI-X vectors or not all vectors
can be allocate so the expected number of queues in addition to admin
interrupts can't be supported, try MSI next (and then INTx) as proper
MSI-X vector distribution can't be assured in such cases. In essence,
this change brings r254008 forward to iflib(4). Also, this is the fix
alluded to in the commit message of r343934.
- If the MSI-X allocation has failed, don't prematurely announce MSI is
going to be used as the latter in fact may not be available either.
- When falling back to MSI, only release the MSI-X table resource again
if it was allocated in iflib_msix_init(), i. e. isn't supplied by the
driver, in the first place.
o In mp_ndesc_handler(), handle unknown type arguments gracefully, too.
Marius Strobl [Sun, 16 Jun 2019 15:11:52 +0000 (15:11 +0000)]
MFC: r347211
- Remove the unused ifc_link_irq and ifc_mtx_name members of struct iflib_ctx.
- Remove the only ever written to ift_db_mtx_name member of struct iflib_txq.
- Remove the unused or only ever written to ifr_size, ifr_cq_pidx, ifr_cq_gen
and ifr_lro_enabled members of struct iflib_rxq.
- Consistently spell DMA, RX and TX uppercase in comments, messages etc.
instead of mixing with some lowercase variants.
- Consistently use if_t instead of a mix of if_t and struct ifnet pointers.
- Bring the function comments of _iflib_fl_refill(), iflib_rx_sds_free() and
iflib_fl_setup() in line with reality.
- Judging problem reports, people are wondering what on earth messages like:
"TX(0) desc avail = 1024, pidx = 0"
are trying to indicate. Thus, extend this string to be more like that of
non-iflib(4) Ethernet MAC drivers, notifying about a watchdog timeout due
to which the interface will be reset.
- Take advantage of the M_HAS_VLANTAG macro.
- Use false/true rather than FALSE/TRUE for variables of type bool.
- Use FALLTHROUGH as advocated by style(9).
Marius Strobl [Sun, 16 Jun 2019 14:53:53 +0000 (14:53 +0000)]
MFC: r344062 (partial)
- For diff reduction, bring in as much of the taskqgroup_attach{,_cpu}(9)
fix from r344062 as possible without breaking KBI, which in turn means
that these functions still don't properly work across architectures in
stable/12 (in theory, compat shims should be possible but result in a
PITA to maintain). However, e. g. the static iflib_irq_set_affinity()
now also takes the irq as an if_irq_t instead of an int.
- Move the gtaskqueue_enqueue_fn typedef from <sys/gtaskqueue.h> to
the gtaskqueue implementation as it's only used and needed there.
- Change the GTASK_INIT macro to use "gtask" rather than "task" as
argument given that it actually operates on a struct gtask rather
than a struct task.
- Let subr_gtaskqueue.c consistently use __func__ to print functions
names.
Marius Strobl [Sun, 16 Jun 2019 11:34:56 +0000 (11:34 +0000)]
MFC: r344060, r344064
Further correct and optimize the bus_dma(9) usage of iflib(4):
o Correct the obvious bugs in the netmap(4) parts:
- No longer check for the existence of DMA maps as bus_dma(9)
is used unconditionally in iflib(4) since r341095 (MFCed to
stable/12 in r343304).
- Supply the correct DMA tag and map pairs to bus_dma(9)
functions (see also the commit message of r343753, MFCed to
stable/12 in r343933).
- In iflib_netmap_timer_adjust(), add synchronization of the
TX descriptors before calling the ift_txd_credits_update
method as the latter evaluates the TX descriptors possibly
updated by the MAC.
- In _task_fn_tx(), wrap the netmap(4)-specific bits in
#ifdef DEV_NETMAP just as done in _task_fn_admin() and
_task_fn_rx() respectively.
o In iflib_fast_intr_rxtx(), synchronize the TX rather than
the RX descriptors before calling the ift_txd_credits_update
method (see also above).
o There's no need to synchronize an RX buffer that is going to
be recycled in iflib_rxd_pkt_get(), yet; it's sufficient to
do that as late as passing RX buffers to the MAC via the
ift_rxd_refill method. Hence, combine that synchronization
with the synchronization of new buffers into a common spot
in _iflib_fl_refill().
o There's no need to synchronize the RX descriptors of a free
list in preparation of the MAC updating their statuses with
every invocation of rxd_frag_to_sd(); it's enough to do this
once before handing control over to the MAC, i. e. before
calling ift_rxd_flush method in _iflib_fl_refill(), which
already performs the necessary synchronization.
o Given that the ift_rxd_available method evaluates the RX
descriptors which possibly have been altered by the MAC,
synchronize as appropriate beforehand. Most notably this
is now done in iflib_rxd_avail(), which in turn means that
we don't need to issue the same synchronization yet again
before calling the ift_rxd_pkt_get method in iflib_rxeof().
o In iflib_txd_db_check(), synchronize the TX descriptors
before handing them over to the MAC for transmission via
the ift_txd_flush method.
o In iflib_encap(), move the TX buffer synchronization after
the invocation of the ift_txd_encap() method. If the MAC
driver fails to encapsulate the packet and we retry with
a defragmented mbuf chain or finally fail, the cycles for
TX buffer synchronization have been wasted. Synchronizing
afterwards matches what non-iflib(4) drivers typically do
and is sufficient as the MAC will not actually start with
the transmission before - in this case - the ift_txd_flush
method is called.
Moreover, for the latter reason the synchronization of the
TX descriptors in iflib_encap() can go as it's enough to
synchronize them before passing control over to the MAC by
issuing the ift_txd_flush() method (see above).
o In iflib_txq_can_drain(), only synchronize TX descriptors
if the ift_txd_credits_update method accessing these is
actually called.
Ed Maste [Sat, 15 Jun 2019 09:28:48 +0000 (09:28 +0000)]
MFC r348498: libatf: remove workaround not required after atf >= 0.18 update
lib/atf/libatf-c/tests/Makefile added the -Wno-duplicate-decl-specifier
due to an issue with an old version of ATF. ATF has long since been
updated to a version with the fix so the workaround is no longer
necessary.
Allan Jude [Fri, 14 Jun 2019 15:09:08 +0000 (15:09 +0000)]
MFC r348714:
zpool.8: the comment property is not read-only
The comment property was listed in the man page twice, once under the list
of read-only properties, and again (correctly), under the list of user
editable properties.
PR: 238355
Reported by: Michael Zuo <muh.muhten@gmail.com>
Sponsored by: Klara Systems
Alexander Motin [Thu, 13 Jun 2019 01:23:03 +0000 (01:23 +0000)]
MFC r348422: Pass data pointers to the driver in way in expects.
Probably due to historical reasons the driver uses In/Out arguments in
odd way. While this tool still never uses Out arguments to see that,
make the code to not trigger EINVAL in possible future uses.
Alexander Motin [Tue, 11 Jun 2019 14:32:32 +0000 (14:32 +0000)]
MFC r348790: Fix comparison signedness in arc_is_overflowing().
When ARC size is very small, aggsum_lower_bound(&arc_size) may return
negative values, that due to unsigned comparison caused delays, waiting
for arc_adjust() to "fix" it by calling aggsum_value(&arc_size). Use
of signed comparison there fixes the problem.
Reviewed by: Sara Hartse <sara.hartse@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
This is irrelevant to FreeBSD, just to reduce divergence.
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Allan Jude <allanjude@freebsd.org>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andrew Stormont <astormont@racktopsystems.com>
Alexander Motin [Tue, 11 Jun 2019 13:49:48 +0000 (13:49 +0000)]
MFC r345728 (by pjd):
If the autoexpand pool property is turned on and vdev is healthy try to
expand the pool automatically when we detect underlying GEOM provider
size change.
Alexander Motin [Tue, 11 Jun 2019 13:36:15 +0000 (13:36 +0000)]
MFC r344316 (by pjd):
The way ZFS searches for its vdevs is the following: first it looks for
a vdev that has the same name as the one stored in metadata and that has
all VDEV labels in place. If it cannot find a GEOM provider with the given
name and all VDEV labels it will scan all GEOM providers for the best match
(the most VDEV labels available), but here the name is ignored.
In case the ZFS pool is created, eg. using GPT partition label:
# zpool create tank /dev/gpt/tank
everything works, and on every import ZFS will pick /dev/gpt/tank and
not /dev/da0p4.
The problem occurs when da0p4 is extended and ZFS is unable to find all
VDEV labels in /dev/gpt/tank anymore (the VDEV labels stored at the end
of the partition are now somewhere else). In this case it will scan all
GEOM providers and will pick the first one with the best match, ie. da0p4.
Fix this problem by checking the VDEV/provider name even if we get the same
match. If the name is the same as the one we have in pool's metadata, prefer
this GEOM provider.
MFC r348797:
Fix for reading the configuration descriptor in libusb. Catch invalid
configuration descriptor reads early on to avoid issues with devices
that don't check for a valid USB configuration read request.
Alexander Motin [Tue, 11 Jun 2019 01:09:54 +0000 (01:09 +0000)]
MFC r348332: Fix array out of bound panic introduced in r306219.
As I see, different NICs in different configurations may have different
numbers of TX and RX queues. The code was assuming 1:1 mapping between
event queues (interrupts) and TX/RX queues. Since number of interrupts
is set to maximum of TX and RX queues, when those two are different, the
system is doomed.
I have no documentation or deep knowledge about this hardware, so this
change is based on general observations and code reading. If some of my
guesses are wrong, please do better. I just confirmed HP NC550SFP NICs
are working now.