Eric van Gyzen [Wed, 3 Jul 2019 19:52:24 +0000 (19:52 +0000)]
MFC r349285
VirtIO SCSI: validate seg_max on attach
Until head r349278 (stable/12 r349690), bhyve presented a seg_max
to the guest that was too large. Detect this case and clamp it to
the virtqueue size. Otherwise, we would fail the "too many segments
to enqueue" assertion in virtqueue_enqueue().
I hit this by running a guest with a MAXPHYS of 256 KB.
Eric van Gyzen [Wed, 3 Jul 2019 19:50:22 +0000 (19:50 +0000)]
MFC r349278
bhyve: Fix vtscsi maximum segment config
The seg_max value reported to the guest should be two less than the
host's maximum, in order to leave room for the request and the
response. This is analogous to r347033 for virtio_block.
We hit the "too many segments to enqueue" assertion on OneFS because
we increase MAXPHYS to 256 KB.
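The arithmetic behind the two seg_max changes above can be sketched as
follows. This is a simplified Python model, not the driver or bhyve code:
the function names and the constant are illustrative assumptions, and the
exact clamp bound used by the guest driver is not stated in this log.

```python
# Two descriptors are reserved: one for the request, one for the response.
RESERVED_SEGMENTS = 2  # assumption: name chosen for this sketch only


def host_seg_max(host_max_segments: int) -> int:
    """seg_max the host should advertise: two less than its own maximum."""
    return host_max_segments - RESERVED_SEGMENTS


def guest_seg_max(advertised: int, virtqueue_size: int) -> int:
    """Guest-side validation: never trust a seg_max larger than the queue."""
    return min(advertised, virtqueue_size)
```

With a 64-entry queue, a host following the first rule advertises 62, and a
guest following the second rule reduces an oversized 128 to 64.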
The POWER8NVL (POWER8 NVLink) architecturally behaves identically to the
POWER8, with a different PVR identifier. Mark it as such, so it shows up
appropriately to the user.
The second statements on the lines are not guarded by the `if' condition.
This triggers a warning with newer gcc. It's relatively harmless given the
usage, but incorrect. Instead, wrap the statements so they're properly
guarded.
r345829: powerpc: Apply r178139 from sparc64 to powerpc's fpu_sqrt
r345831: powerpc: Allow emulating optional FPU instructions on CPUs with an FPU
r349402: powerpc/booke: Handle misaligned floating point loads/stores as on AIM
MFC r349522:
Need to apply the PCIM_BAR_MEM_BASE mask to the physical memory
address before returning it to the user. Some of the least significant
bits have special meaning and should be masked away.
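A minimal sketch of the masking step, in Python for brevity. The mask value
is an assumption taken from the PCI convention that the low four bits of a
memory BAR are flag bits (space indicator, type, prefetchable); the helper
name is hypothetical.

```python
# Assumed value: keeps the address bits, clears the four low flag bits.
PCIM_BAR_MEM_BASE = 0xFFFFFFFFFFFFFFF0


def bar_mem_base(raw_bar: int) -> int:
    """Strip the flag bits from a raw memory BAR value."""
    return raw_bar & PCIM_BAR_MEM_BASE
```

For example, a raw value of 0xFEBF000C (64-bit, prefetchable) yields the
physical base 0xFEBF0000.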
MFC r349409 and r349410:
Fix support for LIBUSB_HOTPLUG_ENUMERATE in libusb. Currently all
devices are enumerated regardless of the LIBUSB_HOTPLUG_ENUMERATE
flag. Make sure when the flag is not specified no arrival events are
generated for currently enumerated devices.
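The intended behavior can be modeled as below. This is a toy Python model,
not libusb code: the function name, the event tuples, and the flag value
are assumptions made for illustration.

```python
LIBUSB_HOTPLUG_ENUMERATE = 1  # flag value assumed for this model


def register_callback(present_devices, flags):
    """Return the arrival events generated at registration time."""
    if flags & LIBUSB_HOTPLUG_ENUMERATE:
        # Only with the flag set are already-attached devices reported.
        return [("arrived", dev) for dev in present_devices]
    return []
```

Before the fix, the second branch was effectively never taken: arrival
events were generated whether or not the flag was given.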
MFC r349368:
Free all allocated unit IDs in cuse(3) after the client character
devices have been destroyed to avoid creating character devices with
identical names.
Ed Maste [Wed, 3 Jul 2019 17:34:26 +0000 (17:34 +0000)]
MFC r349268: nandsim: correct test to avoid out-of-bounds access
Previously nandsim_chip_status returned EINVAL iff both of user-provided
chip->ctrl_num and chip->num were out of bounds. If only one failed the
bounds check arbitrary memory would be read and returned.
The NAND framework is not built by default, nandsim is not intended for
production use (it is a simulator), and the nandsim device has root-only
permissions.
admbugs: 827
Reported by: Daniel Hodson of elttam
Security: kernel information leak or DoS
Sponsored by: The FreeBSD Foundation
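The logic error in the nandsim bounds test can be illustrated with a small
Python sketch. The limits and function names are hypothetical and are not
the nandsim constants; only the "and" versus "or" distinction comes from
the commit message.

```python
import errno

MAX_CTRL = 4   # hypothetical limit, for illustration only
MAX_CHIPS = 4  # hypothetical limit, for illustration only


def check_old(ctrl_num: int, num: int) -> int:
    # Old test: rejects only when BOTH indices are out of bounds.
    if ctrl_num >= MAX_CTRL and num >= MAX_CHIPS:
        return errno.EINVAL
    return 0


def check_fixed(ctrl_num: int, num: int) -> int:
    # Corrected test: rejects when EITHER index is out of bounds.
    if ctrl_num >= MAX_CTRL or num >= MAX_CHIPS:
        return errno.EINVAL
    return 0
```

With ctrl_num=0 and num=99 the old test passes the request through, which
is the out-of-bounds read described above; the corrected test returns
EINVAL.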
While working on PR/238796 I discovered an unused variable in frdest,
the next hop structure. It is likely this contributes to PR/238796
though other factors remain to be investigated.
Add missing words after PCI in the description of the PCIOCWRITE and
PCIOCATTACHED ioctls.
Use singular in PCIOCREAD, we only read one register at a time.
Reviewed by: bcr, bjk, rgrimes, cem
Differential Revision: https://reviews.freebsd.org/D20671
r349150:
pci.4: Use plural configuration registers
It is customary to use plural when talking about PCI configuration registers.
MFC r349267:
Add "tcpmss" opcode to match the TCP MSS value.
With this opcode it is possible to match TCP packets whose MSS option
value corresponds to the value configured in the opcode.
It is allowed to specify single value, range of values, or array of
specific values or ranges. E.g.
# ipfw add deny log tcp from any to any tcpmss 0-500
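The matching semantics (single value, range, or list of values and ranges)
can be sketched in Python as follows. This is an illustrative model only:
the comma-separated list syntax and the inclusive range bounds are
assumptions, and the function names are hypothetical.

```python
def parse_mss_spec(spec: str):
    """Turn '0-500' or '536,1400-1460' into a list of (low, high) pairs."""
    ranges = []
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        # A bare value N is treated as the range N-N.
        ranges.append((int(lo), int(hi) if hi else int(lo)))
    return ranges


def mss_matches(mss: int, spec: str) -> bool:
    """True if the packet's MSS falls inside any configured range."""
    return any(lo <= mss <= hi for lo, hi in parse_mss_spec(spec))
```

Under these assumptions the rule above would match a SYN carrying an MSS
of 400 but not one carrying 1460.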
r346455:
psm(4): Add support for 4 and 5 finger touches in synaptics driver
While 4-th and 5-th finger positions are not exported through PS/2
interface, total number of touches is reported by MT trackpads.
r346456:
psm(4): do not process gestures when palm is present
Skipping gesture processing when a palm is detected helps to reduce
some of the erratic pointer behavior.
This fixes a regression introduced in r317814
Reported by: Ben LeMasurier <ben@crypt.ly>
r346457:
psm(4): respect tap_disabled configuration with enabled Extended support
This fixes a bug where, even when hw.psm.tap_enabled=0, touchpad taps
were processed.
tap_enabled has three states: unconfigured, disabled, and enabled (-1, 0, 1).
To respect PR kern/139272, taps are ignored only when explicitly disabled.
Submitted by: Ben LeMasurier <ben@crypt.ly> (initial version)
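The tri-state rule from r346457 reduces to a single comparison. A Python
sketch, with a hypothetical function name:

```python
def taps_processed(tap_enabled: int) -> bool:
    """tap_enabled: -1 = unconfigured, 0 = disabled, 1 = enabled."""
    # Only an explicit 0 suppresses taps; unconfigured behaves as enabled.
    return tap_enabled != 0
```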
r346458:
psm(4): give names to synaptics commands
Submitted by: Ben LeMasurier <ben@crypt.ly>
r348520:
psm(4): Add Elantech touchpad IC type 15 found on Thinkpad L480 laptops
r348529:
psm(4): Add natural scrolling support to sysmouse protocol
This change enables natural scrolling when two-finger scroll is enabled
and the user is using a trackpad (mouse and trackpoint are not affected).
Depending on trackpad model it can be activated by setting the
hw.psm.synaptics.natural_scroll or hw.psm.elantech.natural_scroll sysctl
value to 1.
Evdev protocol is not affected by this change either. Tune the userland
client, e.g. libinput, to enable natural scrolling in that case.
r348818:
psm(4): Add extra sanity checks to Elantech trackpoint packet parser.
Add strict checks for unused bit states in Elantech trackpoint packet
parser to filter out spurious events produced by some hardware which
are detected as trackpoint packets. See comment on r328191 for example.
Martin Matuska [Fri, 28 Jun 2019 22:31:53 +0000 (22:31 +0000)]
MFC r348993,349135:
Sync libarchive with vendor including security fixes
r348993:
- version bumped to 3.4.0
- check_symlinks_fsobj() without chdir() and fchdir()
- bsdtar.1 manpage fixes
- patches from OpenBSD to libarchive_fe/passphrase.c
r349135:
PR #1212: RAR5 reader - window_mask was not updated correctly
(OSS-Fuzz 15278)
OSS-Fuzz 15120: RAR reader - extend use after free bugfix
In certain scenarios, it is possible for PCPU data to be
accessed before it has been initialized (e.g. during printf
if the kernel was built with the TSLOG option).
Initialize the PCPU pointer for hart 0 at the beginning of
initriscv() rather than near the end.
Alexander Motin [Thu, 27 Jun 2019 14:10:58 +0000 (14:10 +0000)]
MFC r349376: Fix strsep_quote() on strings without quotes.
For strings without quotes and escapes dstptr and srcptr are equal, so
zeroing *dstptr before checking *srcptr is not a good idea. In practice
it means that in -maproot=65534:65533 everything after the colon is lost.
The problem was there since r293305, but before r346976 it was covered by
improper strsep_quote() usage.
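The ordering problem can be reproduced with a simplified Python model of an
in-place tokenizer. This is not the mountd source: backslash escapes are
omitted, and the buggy flag, the index-based interface, and the return
shape are assumptions made to keep the sketch short.

```python
def strsep_quote(buf: bytearray, start: int, delims: bytes, buggy: bool = False):
    """Split one token out of a NUL-terminated buffer, unquoting in place.

    Returns (token, next_start); next_start is None at end of string.
    """
    src = dst = start
    quote = 0
    while buf[src] != 0:
        ch = buf[src]
        if quote == 0 and ch in b"'\"":
            quote = ch          # opening quote: drop it, enter quote state
            src += 1
            continue
        if quote and ch == quote:
            quote = 0           # closing quote: drop it, leave quote state
            src += 1
            continue
        if not quote and ch in delims:
            break
        buf[dst] = ch           # copy; a no-op while dst == src
        dst += 1
        src += 1
    if buggy:
        buf[dst] = 0            # clobbers the delimiter when dst == src
        nxt = None if buf[src] == 0 else src + 1
    else:
        nxt = None if buf[src] == 0 else src + 1
        buf[dst] = 0            # terminate only after inspecting buf[src]
    return bytes(buf[start:dst]), nxt
```

For the buffer "65534:65533" the corrected ordering returns "65534" and a
continuation point for "65533". With buggy=True the delimiter is zeroed
first, the end-of-string test then succeeds, and everything after the
colon is lost.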
r343826 by yuripv:
pwm.8: fix markup in synopsis, add -f description
r346698 by manu:
arm: allwinner: aw_pwm: compile it as module too
r349057:
Allow pwm(9) components to be selected individually, while 'device pwm'
still includes it all.
r349058:
In detach(), check for failure of bus_generic_detach(), only release
resources if they got allocated (because detach() gets called from attach()
to handle various failures), and delete the pwmbus child if it got created.
r349059:
Don't call pwmbus_attach_bus(), because it may not be present if this
driver is compiled into the kernel but pwmbus will be loaded as a module
when needed (and because of that, pwmbus_attach_bus() is going away in
the near future). Instead, just directly do what that function did:
register the fdt xref handle, and attach the pwmbus.
r349060:
Handle failure to enable the clock or obtain its frequency.
r349073:
Do not include pwm.h here, it is purely a userland interface file containing
ioctl definitions for the pwmc driver. It is not part of the pwmbus interface.
r349074:
Move/rename the sys/pwm.h header file to dev/pwm/pwmc.h. The file contains
ioctl definitions and related datatypes that allow userland control of pwm
hardware via the pwmc device. The new name and location better reflects its
association with a single device driver.
r349075:
Remove pwmbus_attach_bus(), it no longer has any callers. Also remove a
couple prototypes for functions that never existed (and never will).
r349076:
Use device_delete_children() instead of a locally-rolled copy of it that
leaks the device-list memory.
r349077:
Add a missing #include. I suspect this used to get included via some header
pollution that was cleaned up recently, and this file got missed in the
cleanup because it's not attached to the build unless you specifically
request this device in a custom kernel config.
r349080:
Make pwmbus driver and devclass vars static; they're not mentioned in any
header file, so they can't be used outside this file anyway.
r349081:
Unwrap prototype lines so that return type and function name are on the
same line. No functional changes.
r349082:
Spell unsigned int as u_int and channel as chan; eliminates the need to wrap
some long lines.
r349083:
Give the aw_pwm driver a module version.
r349084:
Rename the channel_max method to channel_count, because that's what it's
returning. (If the channel count is 2, then the max channel number is 1.)
r349085:
Destroy the cdev on device detach. Also, make the driver and devclass
static, because nothing outside this file needs them.
r349086:
Restructure the pwm device hierarchy and interfaces.
The pwm and pwmbus interfaces were nearly identical, this merges them into a
single pwmbus interface. The pwmbus driver now implements the pwmbus
interface by simply passing all calls through to its parent (the hardware
driver). The channel_count method moves from pwm to pwmbus, and the
get_bus method is deleted (just no longer needed).
The net effect is that the interface for doing pwm stuff is now the same
regardless of whether you're a child of pwmbus, or some random driver
elsewhere in the hierarchy that is bypassing the pwmbus layer and is talking
directly to the hardware driver via cross-hierarchy connections established
using fdt data.
The pwmc driver is now a child of pwmbus, instead of being its sibling
(that's why the get_bus method is no longer needed; pwmc now gets the
device_t of the bus using device_get_parent()).
r349088:
Make pwm channel numbers unsigned.
r349091:
The pwm interface was replaced with pwmbus, include the right header file.
r349092:
Make channel number unsigned, and spell unsigned int u_int. This should
have been part of r349088.
r349093:
This code no longer uses fdt/ofw stuff, no need to include ofw headers.
r349094:
Add module makefiles for pwm.
r349095:
Split the dtb MODULES_EXTRA line to a series of += lines, making it easier
to maintain and keep in alphabetical order, and paving the way for adding
some other modules that aren't dtb-related.
r349096:
Add module makefiles for Texas Instruments ARM SoCs.
The natural place to look for them based on how other SoCs are organized
would be sys/modules/ti, but that's already taken. Drop a clue into
modules/ti/Makefile directing people to modules/arm_ti if they're looking
for ARM modules.
r349097:
Build SoC-specific modules with GENERIC for the SoCs that have them.
r349115:
Rename pwmbus.h to ofw_pwm.h, because after all the recent changes, there
is nothing left in the file that relates to pwmbus at all. It just contains
prototypes for the functions implemented in dev/pwm/ofw_pwm.c, so name it
accordingly and fix the include protect wrappers to match.
A new pwmbus.h will be coming along in a future commit.
r349119:
Rework pwmbus and pwmc so that each child will handle a single PWM channel.
Previously, there was a pwmc instance for each instance of pwm hardware
regardless of how many pwm channels that hardware supported. Now there
will be a pwmc instance for each channel when the hardware supports
multiple channels. With a separate instance for each channel, we can have
"named channels" in userland by making devfs alias entries in /dev/pwm.
These changes add support for ivars to pwmbus, and use an ivar to track the
channel number for each child. It also adds support for hinted children.
In pwmc, the driver checks for a label hint, and if present, it's used to
create an alias for the cdev in /dev/pwm. It's not anticipated that hints
will be heavily used, but it's easy to do and allows quick ad-hoc creation
of named channels from userland by using kenv to create hint.pwmc.N.label=
hints. Upcoming changes will add FDT support, and most labels will
probably be specified that way.
r349130:
Add ofw_pwmbus to enumerate pwmbus devices on systems configured with fdt
data. Also, add fdt support to pwmc.
r349131:
Implement the ofw_bus_get_node method in aw_pwm(4) so that ofw_pwmbus can
find its metadata for instantiating children.
r349132:
Add back a const qualifier I somehow fumbled away between test-building
and committing recent changes.
r349143:
Put the pwmc cdev filenames under the pwm directory along with any label
names. I.e., everything related to pwm now goes in /dev/pwm. This will
make it easier for userland tools to turn an unqualified name into a fully
qualified pathname, whether it's the base pwmcX.Y name or a label name.
r349144:
Follow changes in the pwmc(4) driver in relation to device filenames.
The driver now names its cdev nodes pwmcX.Y where X is unit number and
Y is the channel within that unit. Change the default device name from
pwmc0 to pwmc0.0. The driver now puts cdev files and label aliases in
the /dev/pwm directory, so allow the user to provide unqualified names
with -f and automatically prepend the /dev/pwm part for them.
Update the examples in the manpage to show the new device name format
and location within /dev/pwm.
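The name handling described in r349144 can be sketched as below. This is a
Python illustration rather than the pwm(8) source; the function name is
hypothetical, and treating a leading "/" as the mark of a fully qualified
name is an assumption.

```python
PWM_DIR = "/dev/pwm"


def resolve_device(name: str = "pwmc0.0") -> str:
    """Map a -f argument to a device path, defaulting to unit 0 channel 0."""
    if name.startswith("/"):
        return name               # already fully qualified
    return f"{PWM_DIR}/{name}"    # pwmcX.Y name or label alias
```

Under these assumptions both a base name such as pwmc1.2 and a label alias
resolve to entries under /dev/pwm.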
r349145:
Put periods at the ends of argument descriptions. Explain the relationship
between the period and duty arguments.
r349164:
Remove everything related to channels from the pwmc public interface, now
that there is a pwmc(4) instance per channel and the channel number is
maintained as a driver ivar rather than being passed in from userland.
r349165:
Explain the relationship between PWM hardware channels being controlled and
pwmc(4) device filenames. Also, use uppercase PWM when the term is being
used as an acronym, and expand the acronym where it's first used.
r349166:
Rearrange the argument checking and processing so that enable and disable
can be combined with configuring the period and duty cycle (the same ioctl
sets all 3 values at once, so there's no reason to require the user to run
the program twice to get all 3 things set).
r349167:
Oops, it seems I left out the word 'cycle', fix it.
r349168:
Add a pwmc(4) manpage.
r349174:
Handle labels specified with hints even on FDT systems. Hints are the
easiest thing for a user to control (via loader.conf or kenv+kldload), so
handle them in addition to any label specified via the FDT data.
r349269:
Some mundane tweaks and cleanups to help de-clutter the diffs of some
upcoming functional changes.
Add an ofw_compat_data table for probing compat strings, and use it to add
PNP data. Remove some stray semicolons at the end of macro definitions,
and add a PWM_LOCK_ASSERT macro to round out the usual suite. Move the
device_t and driver_methods structs to the end of the file. Tweak comments.
r349270:
Add support for the PWM(9) API. This allows configuring the pwm output using
pwm(9), but also maintains the historical sysctl config interface for
compatibility with existing apps. The two config systems are not compatible
with each other; if you use both interfaces to change configurations you're
likely to end up with incorrect output or none at all.
r349271:
Catch up with recent changes in pwmbus(9). The pwm(9) and pwmbus(9)
interfaces were unified into pwmbus(9), and the PWMBUS_CHANNEL_MAX method
was renamed PWMBUS_CHANNEL_COUNT. The pwmbus_attach_bus() function just
went away completely. Also, fix a few typos such as s/is/if/.
r349272:
Do some general cleanup and light wordsmithing.
Sort methods alphabetically. Wrap long lines. Start sentences on a new
line. Remove contractions (not because it's a good idea, just to silence
igor). Add some explanation of the units for the period and duty arguments
and the convention for channel numbers.
r349273:
Add pwm to the armv7 GENERIC kernel, it's now used by TI and Allwinner.
The idea behind those functions is not to force consumers to remember that there
is a need to check errno on failure. We already have a caph_enter(3) function
which does the same for cap_enter(2).
r348995:
Don't attempt to include hwpmc support for armv6, we're missing some of the
necessary support functions in cpu-v6.h, and it may be that the only armv6
platform we support (RPi, the bcm2835 SOC) is incapable of supporting hwpmc.
r348169:
Define macros making it easier to define bus-specific pnpinfo for FDT systems.
Pnpinfo is bus-specific and requires the bus name. The FDTCOMPAT_PNP_INFO()
macro makes it easier to define new FDT-based pnpinfo for busses other than
simplebus.
r348172:
Use the new FDTCOMPAT_PNP_INFO() macro to define SPIBUS_FDT_PNP_INFO().
Also rename SPIBUS_PNP_INFO -> SPIBUS_FDT_PNP_INFO because there could be
other kinds of pnpinfo for other (non-fdt) bus attachments.
r348173:
Rename IICBUS_FDT_PNPINFO -> IICBUS_FDT_PNP_INFO because all the other
existing pnpinfo-related macros right now use PNP_INFO, not PNPINFO.
r348183:
Add pnpinfo.
r348184:
Add pnpinfo to all i2c drivers that have FDT compat data.
Ian Lepore [Sun, 23 Jun 2019 16:00:29 +0000 (16:00 +0000)]
MFC r348141, r348143
r348141:
Handle the driftfile option correctly when ntpd_flags is empty.
The logic I originally wrote to detect whether a driftfile option was in the
set of flags was based on the result of removing the pattern *flag* being an
empty string. That didn't handle the case where the string was empty to
begin with. Doh! So now it also specifically checks for an empty string.
The result of the bad check was that ntpd would run without a driftfile, but
it would do so only if it was running as root instead of the non-privileged
ntpd user, which isn't a typical case. Ntpd runs fine without a driftfile,
although it does take longer to stabilize the clock frequency at startup.
Reported by: avg@
Pointy hat: ian@
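The flawed test in r348141 can be modeled in Python. The real script is
shell; here fnmatch stands in for shell pattern removal, and the "-f" flag
pattern and function names are assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def strip_if_matches(value: str, pattern: str) -> str:
    """Model of removing a '*flag*' pattern: empty if the string matches."""
    return "" if fnmatchcase(value, pattern) else value


def has_driftfile_old(flags: str) -> bool:
    # Original logic: an empty result is taken to mean "flag was present".
    return strip_if_matches(flags, "*-f*") == ""


def has_driftfile_fixed(flags: str) -> bool:
    # Corrected logic: an empty flags string cannot contain the flag.
    return flags != "" and strip_if_matches(flags, "*-f*") == ""
```

With empty flags the old check reports that a driftfile option is already
present, so none is added; the corrected check reports that it is absent.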
r348143:
Remove accidentally-added blank line; the style throughout this file
is to use no whitespace between a comment block and the code it describes.
Ian Lepore [Sun, 23 Jun 2019 15:58:46 +0000 (15:58 +0000)]
MFC r348123, r348164, r348166
r348123:
Add pnp info to the imx_i2c driver.
r348164:
Mark i2c slave devices busy while they own the bus.
Many i2c slave drivers are in modules that can be unloaded. If they detach
while IO is in progress the bus would be hung forever. Conversely,
lower-layer drivers (iicbus and the hardware driver) also live in modules
and other kinds of bad things happen if they get detached while IO is in
progress. Because device_busy() propagates up to parents, marking the slave
device busy while it owns the bus solves both kinds of problems that come
with detaching i2c devices while IO is in progress.
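Why marking only the slave device busy protects the whole stack can be
shown with a toy Python model of busy-count propagation. The class and
method names are hypothetical; this is not the newbus implementation.

```python
class Device:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.busy_count = 0

    def busy(self):
        # The first busy reference on a device also pins its parent.
        if self.busy_count == 0 and self.parent is not None:
            self.parent.busy()
        self.busy_count += 1

    def unbusy(self):
        self.busy_count -= 1
        # Dropping the last reference releases the parent again.
        if self.busy_count == 0 and self.parent is not None:
            self.parent.unbusy()

    def can_detach(self) -> bool:
        return self.busy_count == 0
```

In this model, a slave that owns the bus keeps iicbus and the hardware
driver undetachable until it releases the bus.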
r348166:
Release the bus-recovery gpio pins in detach(), so that unload then
reload of the module works without "pin already allocated" errors.
Ian Lepore [Sun, 23 Jun 2019 15:55:41 +0000 (15:55 +0000)]
MFC r348120:
Add a new 'tr' (transfer) mode to i2c(8) to support more i2c controllers.
Some i2c controller hardware does not provide a way to do individual START,
REPEAT-START and STOP actions on the i2c bus. Instead, they can only do
a complete transfer as a single operation. Typically they can do either
START-data-STOP or START-data-REPEATSTART-data-STOP. In the i2c driver
framework, this corresponds to the iicbus_transfer method. In the userland
interface they are initiated with the I2CRDWR ioctl command.
These changes add a new 'tr' mode which can be specified with the '-m'
command line option. This mode should work on all hardware; when an i2c
controller driver doesn't directly support the iicbus_transfer method,
code in the i2c driver framework uses the lower-level START/REPEAT/STOP
methods to implement the transfer. After this new mode has gotten some
testing on various hardware, the 'tr' mode should probably become the
new default mode.
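How a complete transfer decomposes into the lower-level operations can be
sketched as follows. This Python model is illustrative only: the message
tuples and operation names are assumptions, not the iicbus data
structures.

```python
def emulate_transfer(messages):
    """Expand a list of (direction, data) messages into bus operations."""
    ops = []
    for i, (direction, data) in enumerate(messages):
        # The first message opens with START, later ones with REPEAT-START.
        ops.append(("START" if i == 0 else "REPEAT-START", direction))
        ops.append(("DATA", data))
    if messages:
        ops.append(("STOP", None))  # a single STOP ends the whole transfer
    return ops
```

A register read (write the register address, then read two bytes) becomes
START-data-REPEATSTART-data-STOP, the second pattern mentioned above.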
Alan Somers [Sun, 23 Jun 2019 13:44:06 +0000 (13:44 +0000)]
MFC r348737:
Add a testing facility to manually reclaim a vnode
Add the debug.try_reclaim_vnode sysctl. When a pathname is written to it,
the corresponding vnode will be reclaimed, as long as it isn't already
reclaimed or doomed. The purpose is to gain test coverage for vnode
reclamation, which is otherwise hard to achieve.
Add the debug.ftry_reclaim_vnode sysctl. It does the same thing, except
that its argument is a file descriptor instead of a pathname.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20519
Alan Somers [Sun, 23 Jun 2019 13:35:01 +0000 (13:35 +0000)]
MFC r348251:
Remove "struct ucred*" argument from vtruncbuf
vtruncbuf takes a "struct ucred*" argument. AFAICT, it's been unused ever
since that function was first added in r34611. Remove it. Also, remove some
"struct ucred" arguments from fuse and nfs functions that were only used by
vtruncbuf.
Reviewed by: cem
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20377
Glen Barber [Thu, 20 Jun 2019 14:34:45 +0000 (14:34 +0000)]
MFC r349160:
Fix passing ${CONF_FILES} (which contains MAKE_CONF and
SRC_CONF, __MAKE_CONF and SRCCONF, respectively) through
to arm_install_base() and chroot_arm_build_release().
This prevents failures when the target image is intended
to be built with make.conf(5) and src.conf(5) overrides,
which are correctly handled for non-embedded image builds.
Michael Tuexen [Thu, 20 Jun 2019 07:50:38 +0000 (07:50 +0000)]
MFC r348728:
r347382 added receiver side DSACK support for the TCP base stack.
The corresponding changes for the RACK stack were missed and are added
by this commit.
Andriy Gapon [Thu, 20 Jun 2019 06:53:59 +0000 (06:53 +0000)]
drm2/intel_iic: stop using iicbus_set_nostop
The desired mode of transmitting messages is implemented by subclassing
iicbb driver and overriding its iicbus_transfer method with an almost
identical copy that issues the stop condition only at the very end.
iicbus_set_nostop is very broken and is set to be removed from the KPI.
This is a direct commit as in head the drm drivers have been moved out
of the tree.
The same change has been committed to FreeBSDDesktop/drm-legacy.
Cy Schubert [Thu, 20 Jun 2019 05:01:35 +0000 (05:01 +0000)]
MFC r349152:
Make ipf_objbytes a constant. ipf_objbytes is a table of internal data
structures that are saved across reboots by ipfs(8). The table is not
changed at runtime.
Alexander Motin [Thu, 20 Jun 2019 01:18:15 +0000 (01:18 +0000)]
MFC r348764: Allow UMA hash tables to expand faster than 2x in 20 seconds.
ZFS ABD allocates tons of 4KB chunks via UMA, requiring huge hash tables.
With initial hash table size of only 32 elements it takes ~20 expansions
or ~400 seconds to adapt to handling 220GB ZFS ARC. During that time not
only is the hash table highly inefficient, but each of those expansions
also takes significant time with the lock held, blocking operation.
On my test system with 256GB of RAM and ZFS pool of 28 HDDs this change
reduces time needed to first time read 240GB from ~300-400s, during which
system is quite busy and unresponsive, to only ~150s with light CPU load
and just 5 sub-second CPU spikes to expand the hash table.
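The "~20 expansions or ~400 seconds" estimate can be checked with a short
calculation. The Python sketch below assumes one hash slot per 4 KB chunk,
binary units, and one doubling per 20-second interval; the function name
is hypothetical.

```python
def doublings_needed(initial_slots: int, items: int) -> int:
    """Count 2x expansions until the table has at least one slot per item."""
    n, slots = 0, initial_slots
    while slots < items:
        slots *= 2
        n += 1
    return n


chunks = (220 * 2**30) // (4 * 2**10)      # 4 KB chunks in a 220 GB ARC
expansions = doublings_needed(32, chunks)  # starting from 32 elements
seconds = expansions * 20                  # one 2x step per 20 seconds
```

Under these assumptions the table needs 21 doublings, about 420 seconds,
which is in line with the figures quoted above.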
MFC r349192:
Add the ability to limit how much the code will fragment the RACK send map
in response to SACKs. The default behavior is unchanged; however, the
limit can be activated by changing the new net.inet.tcp.rack.split_limit
sysctl.
Ed Maste [Wed, 19 Jun 2019 14:57:51 +0000 (14:57 +0000)]
MFC r347228: makesyscalls: use @generated tag in generated files
Multiple tools use @generated to identify generated files (for example,
in a review Phabricator will by default hide diffs in generated files).
Use the @generated tag in makesyscalls.sh as we've done for other
generated files.
Jamie Gritton [Tue, 18 Jun 2019 23:49:13 +0000 (23:49 +0000)]
Unmount filesystems on jail removal with "-f", to get around a situation
where the jail root vnode reference is stopping the filesystem from
unmounting, when the jail is removed but still exists in a dying state.
Rick Macklem [Mon, 17 Jun 2019 00:37:55 +0000 (00:37 +0000)]
MFC: r347583
Replace global list for grouplist with list(s) for each exportlist element.
In mountd.c, the grouplist structures are linked into a single global
linked list headed by "grphead". The only use of this linked list is
to free all list elements when the exportlist elements are also all being
free'd at the time the exports are being reloaded.
This patch replaces this one global linked list head with a list head in
each exportlist structure, where the grouplist elements for that exported
file system are linked.
The only change is that now the grouplist elements are free'd with the
associated exportlist element as they are free'd instead of all grouplist
elements being free'd after the exportlist elements are free'd. This
change should have no effect in practice.
This is being done, since a future patch that will add a "-I" option for
incrementally updating the exports in the kernel needs to know which
grouplist elements are associated with each exported file system and
having them linked into a list headed by the exportlist element does that.
Rick Macklem [Mon, 17 Jun 2019 00:20:39 +0000 (00:20 +0000)]
MFC: r347498
Factor code into two new functions in preparation for a future commit.
Factor code into two functions.
read_exportfile() a function which reads the exports file(s) and calls
get_exportlist_one() to process each of them.
delete_export() a function which deletes the exports in the kernel for a file
system.
The contents of these functions are just the same code as was used to do the
operations, moved into separate functions. As such, there is no semantic change.
This is being done in preparation for a future commit that will add an
option to do incremental changes of kernel exports upon receiving SIGHUP.
Rick Macklem [Mon, 17 Jun 2019 00:00:12 +0000 (00:00 +0000)]
MFC: r347476
Factor out some exportlist list operations into separate functions.
This patch moves the code that removes and frees all exportlist elements
out into a separate function called free_exports().
It does the same for the insertion of a new exportlist entry into a list.
It also adds a second argument to ex_search() for the list to use.
None of these changes have any semantic effect. They are being done to
prepare the code for future patches that convert the single linked list
for the exportlist to a hash table of lists and a patch that will do
incremental changes of exports in the kernel.
And it fixes the argument for SLIST_HEAD_INITIALIZER() to a pointer,
which doesn't really matter, since SLIST_HEAD_INITIALIZER() doesn't use
the argument.
Marius Strobl [Sun, 16 Jun 2019 15:30:07 +0000 (15:30 +0000)]
MFC: r347222
o Avoid determining the MAC class (LEM/EM or IGB) - possibly even multiple
times - on every interrupt by using an own set of device methods for the
IGB class. This translates to introducing igb_if_intr_{disable,enable}()
and igb_if_{rx,tx}_queue_intr_enable() with that IGB-specific code moved
out of their EM counterparts and otherwise continuing to use the EM IFDI
methods also for IGB.
Note that igb_if_intr_{disable,enable}() also issue E1000_WRITE_FLUSH as
lost with the conversion of igb(4) to iflib(4).
Also note, that the em_if_{disable,enable}_intr() methods are renamed to
em_if_intr_{disable,enable}() for consistency with the names used in the
interface declaration.
o In em_intr():
- Don't bother to bail out if the interrupt type is "legacy", i. e. INTx
or MSI, as iflib(4) doesn't use ift_legacy_intr methods for MSI-X. All
other iflib(4)-based drivers avoid this check, too.
- Given that only the MSI-X interrupts have one-shot behavior (by taking
advantage of the EIAC register), explicitly disable interrupts. Hence,
em_intr() now matches what {em,igb}_irq_fast() previously did (in case
of igb(4) supposedly also to work around MSI message reordering errata
on certain systems).
o In em_if_intr_disable():
- Clear the EIAC register unconditionally for 82574 and not just in case
of MSI-X, matching em_if_intr_enable() and bringing back the last hunk
of r206437 lost with the iflib(4) conversion.
- Write to EM_EIAC for clearing said register instead of to the IGB-only
E1000_EIAC used ever since the iflib(4) conversion.
Marius Strobl [Sun, 16 Jun 2019 15:25:46 +0000 (15:25 +0000)]
MFC: r347221, r347245
o Use iflib_fast_intr_rxtx() also for "legacy" interrupts, i. e. INTx and
MSI. Unlike iflib_fast_intr_ctx(), the former will also enqueue
_task_fn_tx() in addition to _task_fn_rx() if appropriate, bringing TCP
TX throughput of EM-class devices on par with the MSI-X case and, thus,
close to wirespeed/pre-iflib(4) times again. [1]
Note that independently of the interrupt type, the UDP performance with
these MACs still is abysmal and nowhere near to where it was before the
conversion of em(4) to iflib(4).
o In iflib_init_locked(), announce which free list failed to set up.
o In _task_fn_tx() when running netmap(4), issue ifdi_intr_enable instead
of the ifdi_tx_queue_intr_enable method in case of a "legacy" interrupt
as the latter is valid with MSI-X only.
o Instead of adding the missing - and apparently convoluted enough that a
DBG_COUNTER_INC was put into a wrong spot in _task_fn_rx() - checks for
ifdi_{r,t}x_queue_intr_enable being available in the MSI-X case also to
iflib_fast_intr_rxtx(), factor these out to iflib_device_register() and
make the checks fail gracefully rather than panic. This avoids invoking
the checks at runtime over and over again in iflib_fast_intr_rxtx() and
_task_fn_{r,t}x() - even if it's just in case of INVARIANTS - and makes
these functions more readable.
o In iflib_rx_structures_setup(), only initialize LRO resources if device
and driver have LRO capability in order to not waste memory. Also, free
the LRO resources again if setting them up fails for one of the queues.
However, don't bother invoking iflib_rx_sds_free() in that case because
iflib_rx_structures_setup() doesn't call iflib_rxsd_alloc() either (and
iflib_{device,pseudo}_register() will issue iflib_rx_sds_free() in case
of failure via iflib_rx_structures_free(), but there definitely is some
asymmetry left to be fixed, though).
o Similarly, free LRO resources again in iflib_rx_structures_free().
o In iflib_irq_set_affinity(), handle get_core_offset() errors gracefully
instead of panicking (but only in case of INVARIANTS). This is a follow-
up to r344132 (MFCed to stable/12 in r344163) as such bugs shouldn't be
fatal.
o Likewise, handle unknown iflib_intr_type_t in iflib_irq_alloc_generic()
gracefully, too.
o Bring yet more sanity to iflib_msix_init():
- If the device doesn't provide enough MSI-X vectors or not all vectors
can be allocated so the expected number of queues in addition to admin
interrupts can't be supported, try MSI next (and then INTx) as proper
MSI-X vector distribution can't be assured in such cases. In essence,
this change brings r254008 forward to iflib(4). Also, this is the fix
alluded to in the commit message of r343934.
- If the MSI-X allocation has failed, don't prematurely announce MSI is
going to be used as the latter in fact may not be available either.
- When falling back to MSI, only release the MSI-X table resource again
if it was allocated in iflib_msix_init(), i. e. isn't supplied by the
driver, in the first place.
o In mp_ndesc_handler(), handle unknown type arguments gracefully, too.
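The interrupt-type fallback order described for iflib_msix_init() above
can be summarized in a small Python sketch. The function name and
parameters are hypothetical, and the real code deals with resource
allocation details that are omitted here.

```python
def choose_interrupt_type(vectors_needed: int, msix_allocated: int,
                          msi_available: bool) -> str:
    """Pick MSI-X only if every queue and admin vector can be provided."""
    if msix_allocated >= vectors_needed:
        return "MSI-X"
    # Too few MSI-X vectors: proper distribution can't be assured, so
    # try MSI next, and only announce it if it is actually available.
    if msi_available:
        return "MSI"
    return "INTx"
```

For example, a device that needs 5 vectors but is granted only 3 falls
back to MSI, or to INTx when MSI is unavailable as well.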