Marius Strobl [Tue, 20 Nov 2018 00:08:33 +0000 (00:08 +0000)]
Given that the idea of D15374 was to "make memmove a first class citizen",
provide a _MEMMOVE extension of _MEMCPY that deals with overlap based on
the previous bcopy(9) implementation and use the former for bcopy(9) and
memmove(9). This addresses my D15374 review comment, avoiding extra MOVs
in case of memmove(9) and trashing the stack pointer.
Thomas Munro [Tue, 20 Nov 2018 00:06:53 +0000 (00:06 +0000)]
pom: Fix fencepost bugs.
Under some conditions pom would report "waning" and then "full", show
higher percentages than it should, and get confused by DST. Fix.
Before:
2018.01.30: The Moon is Waxing Gibbous (97% of Full)
2018.01.31: The Moon is Waning Gibbous (100% of Full)
2018.02.01: The Moon is Full
2018.02.02: The Moon is Waning Gibbous (98% of Full)
After:
2018.01.30: The Moon is Waxing Gibbous (96% of Full)
2018.01.31: The Moon is Waxing Gibbous (99% of Full)
2018.02.01: The Moon is Full
2018.02.02: The Moon is Waning Gibbous (97% of Full)
Marius Strobl [Mon, 19 Nov 2018 23:56:33 +0000 (23:56 +0000)]
For consistency within the front-end, prefer SDHCI_{READ,WRITE}_{2,4}()
to sdhci_acpi_{read,write}_{2,4}() in the sdhci_acpi_set_uhs_timing()
added in r340543.
Justin Hibbits [Mon, 19 Nov 2018 23:54:49 +0000 (23:54 +0000)]
powerpc: Sync icache on SIGILL, in case of cache issues
The update of jemalloc to 5.1.0 exposed a cache syncing issue on a Freescale
e500 base system. There was already code in the FPU emulator to address
this, but it was limited to a single static variable, and did not attempt to
sync the cache. This pulls that out to the higher level program exception
handler, and syncs the cache.
If a SIGILL is hit a second time at the same address, it will be treated as
a real illegal instruction, and handled accordingly.
Ed Maste [Mon, 19 Nov 2018 22:18:18 +0000 (22:18 +0000)]
rescue: set NO_SHARED in Makefile
The rescue binary is built statically via the Makefile generated by
crunchgen, but that does not trigger other shared/static logic in
bsd.prog.mk - in particular disabling retpolineplt with static linking.
PR: 233336
Reported by: Charlie Li
Sponsored by: The FreeBSD Foundation
Add a fortune describing how to upload a machine's dmesg information to the
NYCBUG database.
We want to encourage our users to upload their dmesgs so that the project can
get a better insight into what kind of hardware is run on. This helps in making
data-driven decisions about i.e., platform and driver support.
Note that dmesgs may contain sensitive information like hardware serial numbers,
hence uploading them without review is discouraged.
Ben Widawsky [Mon, 19 Nov 2018 18:29:03 +0000 (18:29 +0000)]
acpi: fix acpi_ec_probe to only check EC devices
This patch utilizes the fixed_devclass attribute in order to make sure
other acpi devices with params don't get confused for an EC device.
The existing code assumes that acpi_ec_probe is only ever called with a
dereferencable acpi param. Aside from being incorrect because other
devices of ACPI_TYPE_DEVICE may be probed here which aren't ec devices,
(and they may have set acpi private data), it is even more nefarious if
another ACPI driver uses private data which is not dereferancable. This
will result in a pointer deref during boot and therefore boot failure.
On X86, as it stands today, no other devices actually do this (acpi_cpu
checks for PROCESSOR type devices) and so there is no issue. I ran into
this because I am adding such a device which gets probed before
acpi_ec_probe and sets private data. If ARM ever has an EC, I think
they'd run into this issue as well.
There have been several iterations of this patch. Earlier
iterations had ECDT enumerated ECs not call into the probe/attach
functions of this driver. This change was Suggested by: jhb@.
Kyle Evans [Mon, 19 Nov 2018 17:09:57 +0000 (17:09 +0000)]
bectl(8) tests: attempt to load the ZFS module
Observed in a CI test image, bectl_create test will run and be marked as
skipped because the module is not loaded. The first zpool invocation will
automagically load the module, but bectl_create is still skipped. Subsequent
tests all pass as expected because the module is now loaded and everything
is OK.
Kyle Evans [Mon, 19 Nov 2018 16:47:21 +0000 (16:47 +0000)]
libbe(3): Handle non-ZFS rootfs better
If rootfs isn't ZFS, current version will emit an error claiming so and fail
to initialize libbe. As a consumer, bectl -r (undocumented) can be specified
to operate on a BE independently of whether on a UFS or ZFS root.
Unbreak this for the UFS case by only erroring out the init if we can't
determine a ZFS dataset for rootfs and no BE root was specified. Consumers
of libbe should take care to ensure that rootfs is non-empty if they're
trying to use it, because this could certainly be the case.
Some check is needed before zfs_path_to_zhandle because it will
unconditionally emit to stderr if the path isn't a ZFS filesystem, which is
unhelpful for our purposes.
This should also unbreak the bectl(8) tests on a UFS root, as is the case in
Jenkins' -test runs.
Tijl Coosemans [Mon, 19 Nov 2018 15:31:54 +0000 (15:31 +0000)]
Do proper copyin of control message data in the Linux sendmsg syscall.
Instead of calling m_append with a user address, allocate an mbuf cluster
and copy data into it using copyin. For the SCM_CREDS case, instead of
zeroing a stack variable and appending that to the mbuf, zero part of the
mbuf cluster directly. One mbuf cluster is also the size limit used by
the FreeBSD sendmsg syscall (uipc_syscalls.c:sockargs()).
Jayachandran C. [Mon, 19 Nov 2018 03:52:56 +0000 (03:52 +0000)]
gitv3_its: fixes for multiple GIC ITS blocks
First pass of support for multiple GIC ITS blocks with ACPI.
Changes are to:
* register the correct subset of interrupts with pic_register
in case of ACPI.
* initialize just the cpu interface for the first ITS, when
domain information is not avialable. This has to be done
until we split the per-CPU init to do LPI setup just once.
* remove duplicate check for the GIC ITS domain, the sc_cpus
are setup from domain, so the check again in per-CPU init
seems unnecessary.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D17841
Jayachandran C. [Mon, 19 Nov 2018 03:43:10 +0000 (03:43 +0000)]
pci_host_generic : move activate/release to generic code
Now that the ACPI and FDT implementations for activating and
deactivating resources are the same, we can move it to
pci_host_generic.c. No functional changes.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D17793
This is a major update for pci_host_generic_acpi.c, the current
implementation has some gaps that are better fixed up in one go.
The changes are to:
* Follow x86 method of not adding PCI resources to PCI host bridge in
ACPI code. This has been moved to pci_host_generic_acpi.c, where we
walk thru its resources of the host bridge and add them.
* Fixup code in pci_host_generic_acpi.c to read all decoded ranges
and update the 'ranges' property. This allows us to share most of
the code with generic implementation (and the FDT one).
* Parse and setup IO ranges and bus ranges when walking the resources
above. Drop most of the changes related to this from acpica code.
* Add the ECAM memory area as mem resource 0. Implement the logic to
get the ECAM area from MCFG (using bus range which we now decode),
or from _CBA (using _BBN/bus range). Drop aarch64 ifdefs from acpica
code which did part of this.
* Switch resource activation to similar code as FDT implementation,
this can be moved into generic implementation in a later pass.
* Drop the mechanism of using the 7th bit of bus number as the domain,
this is not correct and will work only in very specific cases. Use
_SEG as PCI domain and use the bus ranges of the host bridge to
provide start bus number.
This commit should not make any functional change to dev/acpica/acpi.c
for other architectures, almost all the changes there are to revert
earlier additions in this file done for aarch64.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D17791
Jayachandran C. [Mon, 19 Nov 2018 03:02:47 +0000 (03:02 +0000)]
acpica: rework INTRNG interrupts
On arm64 (where INTRNG is enabled), the interrupts have to be mapped
with ACPI_BUS_MAP_INTR() before adding them as resources to devices.
The earlier code did the mapping before calling acpi_set_resource(),
which bypassed code that checked for PCI link interrupts.
To fix this, move the call to map interrupts into acpi_set_resource()
and that requires additional work to lookup interrupt properties.
The changes here are to:
* extend acpi_lookup_irq_handler() to lookup an irq in the ACPI
resources
* create a helper function acpi_map_intr() which uses the updated
acpi_lookup_irq_handler() to look up an irq, and then map it
with ACPI_BUS_MAP_INTR()
* use acpi_map_intr() in acpi_pcib_route_interrupt() to map
pci link interrupts.
With these changes, we can drop the ifdefs in acpi_resource.c, and
we can also drop the call for mapping interrupts in generic_timer.c
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D17790
Jayachandran C. [Mon, 19 Nov 2018 02:55:18 +0000 (02:55 +0000)]
pci_host_generic*: basic implementation of bus range
Both ACPI and FDT support bus ranges for pci host bridges. Update
pci_host_generic*.[ch] with a default implementation to support this.
This will be used in the next set of changes for ACPI based host
bridge. No functional changes in this commit.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D17657
Jayachandran C. [Mon, 19 Nov 2018 02:43:34 +0000 (02:43 +0000)]
pci_host_generic: allocate resources against devices
Fix up pci_host_generic.c and pci_host_generic_fdt.c to allocate
resources against devices that requested them. Currently the
allocation happens against the pcib, which is incorrect.
This is needed for the upcoming changes for fixing up
pci_host_generic_acpi.c
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D17656
Jayachandran C. [Mon, 19 Nov 2018 02:38:02 +0000 (02:38 +0000)]
pci_host_generic: remove unneeded ThunderX2 quirk
The current quirk implementation writes a fixed address to the PCI BAR
to fix a firmware bug. The PCI BARs are allocated by firmware and will
change depending on PCI devices present. So using a fixed address here
is not correct.
This quirk worked around a firmware bug that programmed the MSI-X bar
of the SATA controller incorrectly. The newer firmware does not have
this issue, so it is better to drop this quirk altogether.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D17655
Kyle Evans [Mon, 19 Nov 2018 02:30:12 +0000 (02:30 +0000)]
bectl(8): Add some regression tests
These tests operate on a file-backed zpool that gets created in the kyua
temp dir. root and ZFS support are both required for these tests. Current
tests cover create, destroy, export/import, jail, list (kind of), mount,
rename, and jail.
List tests should later be extended to cover formatting and the different
list flags, but for now only covers basic "are create/destroy actually
reflected properly"
Kyle Evans [Mon, 19 Nov 2018 02:16:20 +0000 (02:16 +0000)]
libbe(3): Properly account for altroot when creating new BEs
Previously we would blindly copy the 'mountpoint' property, which includes
the altroot. The altroot needs to be snipped off prior to setting it on the
new BE, though, or you'll end up with a new BE and a mountpoint of /mnt with
altroot=/mnt
Kyle Evans [Mon, 19 Nov 2018 02:12:08 +0000 (02:12 +0000)]
bectl(3)/libbe(3): Allow BE root to be specified
Add an undocumented -r option preceding the bectl subcommand to specify a BE
root to operate out of. This will remain undocumented for now, as some
caveats apply:
- BEs cannot be activated in the pool that doesn't contain the rootfs
- bectl create cannot work out of the box without the -e option right now,
since it defaults to the rootfs and cross-pool cloning doesn't work like
that (IIRC)
Plumb the BE root through to libbe(3) so that some things -can- be done to
it, e.g.
this aides in some upgrade setups where rootfs is not necessarily ZFS, and
also makes it easier/possible to regression-test bectl when combined with a
file-backed zpool.
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D18029
Mark Johnston [Sun, 18 Nov 2018 01:58:48 +0000 (01:58 +0000)]
Change dumpon(8)'s handling of -g.
Rather than using a special value to denote "use the default router",
treat the absence of the -g option to mean the same thing. The
in-kernel netdump client will always attempt to reach the server
directly before falling back to the configured gateway anyway. This
change makes it cleaner to support a hostname value for -g.
Reviewed by: cem
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D18025
Alan Cox [Sun, 18 Nov 2018 01:27:17 +0000 (01:27 +0000)]
Tidy up vm_map_simplify_entry() and its recently introduced helper
functions. Notably, reflow the text of some comments so that they
occupy fewer lines, and introduce an assertion in one of the new
helper functions so that it is not misused by a future caller.
In collaboration with: Doug Moore <dougm@rice.edu>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D17635
Kyle Evans [Sat, 17 Nov 2018 19:19:37 +0000 (19:19 +0000)]
libbe(3): Rewrite be_unmount to stop mucking with getmntinfo(2)
Go through the ZFS layer instead; given a BE, we can derive the dataset,
zfs_open it, then zfs_unmount. ZFS takes care of the dirty details and
likely gets it more correct than we did for more interesting setups.
Kyle Evans [Sat, 17 Nov 2018 19:15:29 +0000 (19:15 +0000)]
libbe(3): rewrite init to support chroot usage
libbe(3) currently uses zfs_be_root and locates which of its children is
currently mounted at "/". This is reasonable, but not correct in the case of
a chroot, for two reasons:
- chroot root may be of a different zpool than zfs_be_root
- chroot root will not show up as mounted at "/"
Fix both of these by rewriting libbe_init to work from the rootfs down.
zfs_path_to_zhandle on / will resolve to the dataset mounted at the new
root, rather than the real root. From there, we can derive the BE root/pool
and grab the bootfs off of the new pool. This does no harm in the average
case, and opens up bectl to operating on different pools for scenarios where
one may be, for instance, updating a pool that generally gets re-rooted into
from a separate UFS root or zfs bootpool.
While here, I've also:
- Eliminated the check for /boot and / to be on the same partition. This
leaves one open to a setup where /boot (and consequently, kernel/modules)
are not included in the boot environment. This may very well be an
intentional setup done by someone that knows what they're doing, we should
not kill BE usage because of it.
- Eliminated the validation bits of BEs and snapshots that enforced
'mountpoint' to be "/" -- this broke when trying to operate on an imported
pool with an altroot, but we need not be this picky.
Reported by: philip
Reviewed by: philip, allanjude (previous version)
Tested by: philip
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D18012
Marius Strobl [Sat, 17 Nov 2018 17:21:36 +0000 (17:21 +0000)]
- Restore setting the clock for devices which support the default/legacy
transfer mode only (lost with r321385). [1]
- Similarly, don't try to set the power class on MMC devices that comply
to version 4.0 of the system specification but are operated in default/
legacy transfer or 1-bit bus mode as no power class is specified for
these cases. Trying to set a power class nevertheless resulted in an -
albeit harmless - error message.
John Baldwin [Fri, 16 Nov 2018 23:39:39 +0000 (23:39 +0000)]
Axe MINIMUM_MSI_INT.
Just allow MSI interrupts to always start at the end of the I/O APIC
pins. Since existing machines already have more than 255 I/O APIC
pins, IRQ 255 is no longer reliably invalid, so just remove the
minimum starting value for MSI.
Align IA32_ARCH_CAP MSR definitions and use with SDM rev. 068.
SDM rev. 068 was released yesterday and it contains the description of
the MSR 0x10a IA32_ARCH_CAP. This change adds symbolic definitions for
all bits present in the document, and decode them in the CPU
identification lines printed on boot.
But also, the document defines SSB_NO as bit 4, while FreeBSD used but
2 to detect the need to work-around Speculative Store Bypass
issue. Change code to use the bit from SDM.
Similarly, the document describes bit 3 as an indicator that L1TF
issue is not present, in particular, no L1D flush is needed on
VMENTRY. We used RDCL_NO to avoid flushing, and again I changed the
code to follow new spec from SDM.
In fact my Apollo Lake machine with latest ucode shows this:
IA32_ARCH_CAPS=0x19<RDCL_NO,SKIP_L1DFL_VME,SSB_NO>
Reviewed by: bwidawsk
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential revision: https://reviews.freebsd.org/D18006
Add some additional length checks to the IPv4 fragmentation code.
Specifically, block 0-length fragments, even when the MF bit is clear.
Also, ensure that every fragment with the MF bit clear ends at the same
offset and that no subsequently-received fragments exceed that offset.
development(7): Replace "reboot" with "shutdown -r now"
We generally document shutdown(8) instead of reboot(8) as it's better for
interactive use.
In modern FreeBSD is matters a lot less, it's mostly just convention. One
minor thing is that shutdown(8) produces a global message, while reboot(8)
does not. It is believed that historically, some versions of reboot did not
do appropriate safe shutdown checks and just rebooted.
It's also just consistency: for example the handbook[1] documents shutdown.
There is actually another important difference between reboot and shutdown
-r now: reboot does not run /etc/rc.shutdown. This is because reboot has
its own shutdown procedure and does not signal init like init 6 and
shutdown -r now do (except in the case of rerooting via reboot -r).
A few years ago jilles@ proposed changing reboot's default to signalling
init (preserving reboot -q which just invokes the reboot system call), but
this was not accepted. Perhaps this can be tried again for 13.0.
Implement support for sysctl hw.model for Mediatek/Ralink SoCs
These SoCs have CHIPID registers, which store the Chip model, according
to the manufacturer; make use of those in order to better identify
the chip we're actually running on.
If we're unable to read the CHIPID registers for some reason we will
use the string "unknown " as a value for hw.model.
Reported by: yamori813@yahoo.co.jp
Sponsored by: Smartcom - Bulgaria AD
Mike Karels [Fri, 16 Nov 2018 03:42:29 +0000 (03:42 +0000)]
Fix flags collision causing inability to enable CBQ in ALTQ
The CBQ BORROW flag conflicts with the RMCF_CODEL flag; the
two sets of definitions actually define the same things. The symptom
is that a kernel with CBQ support and not CODEL fails to load a QoS
policy with the obscure error "pfctl: DIOCADDALTQ: Cannot allocate memory."
If ALTQ_DEBUG is enabled, the error becomes a little clearer:
"rmc_newclass: CODEL not configured for CBQ!" is printed by the kernel.
There really shouldn't be two sets of macros that have to be defined
consistently, but the include structure isn't right for exporting
CBQ flags to altq_rmclass.h. Re-align the definitions, and add
CTASSERTs in the kernel to ensure that the definitions are consistent.
John Baldwin [Fri, 16 Nov 2018 01:27:24 +0000 (01:27 +0000)]
Restore the <sys/vmem.h> header to fix build of cxgbe(4) TOM.
vmem's are not just used for TLS memory in TOM and the #include actually
predates the TLS code so should not have been removed when the TLS vmem
moved in r340466.
Pointy hat to: jhb
Sponsored by: Chelsio Communications
Mateusz Guzik [Fri, 16 Nov 2018 00:44:22 +0000 (00:44 +0000)]
amd64: handle small memset buffers with overlapping stores
Instead of jumping to locations which store the exact number of bytes,
use displacement to move the destination.
In particular the following clears an area between 8-16 (inclusive)
branch-free:
movq %r10,(%rdi)
movq %r10,-8(%rdi,%rcx)
For instance for rcx of 10 the second line is rdi + 10 - 8 = rdi + 2.
Writing 8 bytes starting at that offset overlaps with 6 bytes written
previously and writes 2 new, giving 10 in total.
Provides a nice win for smaller stores. Other ones are erratic depending
on the microarchitecture.
General idea taken from NetBSD (restricted use of the trick) and bionic
string functions (use for various ranges like in this patch).
Reviewed by: kib (previous version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17660
John Baldwin [Thu, 15 Nov 2018 23:10:46 +0000 (23:10 +0000)]
Change the quantum for TLS key addresses to 32 bytes.
The addresses passed when reading and writing keys are always shifted
right by 5 as the memory locations are addressed in 32-byte chunks, so
the quantum needs to be 32, not 8.
Mark Johnston [Thu, 15 Nov 2018 23:02:59 +0000 (23:02 +0000)]
Remove mostly-useless proc provider probes.
For some reason the proc UMA zone's ctor, dtor and init functions are
instrumented, but these functions are always available through FBT.
Moreover, the probes are not part of the original Solaris proc
provider, aren't documented, have no uses (e.g., in dwatch(8)) and
have no clear use to begin with. Therefore, remove them.
John Baldwin [Thu, 15 Nov 2018 22:47:47 +0000 (22:47 +0000)]
Use sbsndptr_adv() instead of sbsndptr() for TOE TLS.
For TOE TLS, we just want to advance the send pointer to skip over the
record just sent to the TOE. The recently added sbsndptr_adv() is
sufficient for that and is cheaper.
Mateusz Guzik [Thu, 15 Nov 2018 20:28:35 +0000 (20:28 +0000)]
amd64: sync up libc memset with the kernel version
- tidy up memset to have rax set earlier for small sizes
- finish the tail in memset with an overlapping store
- align memset buffers to 16 bytes before using rep stos
Set the SPI clock speed and polarity on each transfer to catch up with
recent changes in spibus and allow the use of different SPI modes on
the same bus.
Reported by: ian
Sponsored by: Rubicon Communications, LLC (Netgate)
Warner Losh [Thu, 15 Nov 2018 16:02:45 +0000 (16:02 +0000)]
Add cam_iosched_set_latfcn to set a latency callback for high latency.
It's often useful to have a callback when an I/O takes more than a
threshold amount of time. This adds the infrastructure for periph
devices to register one.
One use-case is as a debugging aide when you need a semi-realtime
indication of an I/O outlier so you can trigger bus capture gear for
vendor analysis.
Warner Losh [Thu, 15 Nov 2018 16:02:13 +0000 (16:02 +0000)]
When converting ns,us,ms to sbt, return the ceil() of the result
rather than the floor(). Returning the floor means that
sbttoX(Xtosbt(y)) != y for almost all values of y. In practice, this
results in a difference of at most 1 in the lsb of the sbintime_t.
This difference is meaningless for all current users of these
functions, but is important for the newly introduced sysctl conversion
routines which implicitly rely on the transformation being idempotent.
Stephen Hurd [Wed, 14 Nov 2018 20:36:18 +0000 (20:36 +0000)]
Clear RX completion queue state veriables in iflib_stop()
iflib_stop() was not resetting the rxq completion queue state variables.
This meant that for any driver that has receive completion queues, after a
reinit, iflib would start asking what's available on the rx side starting at
whatever the completion queue index was prior to the stop, instead of at 0.
Sean Eric Fagan [Wed, 14 Nov 2018 19:06:43 +0000 (19:06 +0000)]
mountd has no way to configure the listen queue depth; rather than add a new
option, we pass -1 down to listen, which causes it to use the
kern.ipc.soacceptqueue sysctl.
John Baldwin [Wed, 14 Nov 2018 18:45:33 +0000 (18:45 +0000)]
Revert r332735 and fix MSI-X to properly fail allocations when full.
The off-by-one errors in 332735 weren't actual errors and were
preventing the last MSI interrupt source from being used. Instead,
the issue is that when all MSI interrupt sources were allocated, the
loop in msix_alloc() would terminate with 'msi' still set to non-null.
The only check for 'i' overflowing was in the 'msi' == NULL case, so
msix_alloc() would try to reuse the last MSI interrupt source instead
of failing.
Fix by moving the check for all sources being in use to just after the
loop.
netmap(4) support for vtnet(4) was incomplete and had multiple bugs.
This commit fixes those bugs to bring netmap on vtnet in a functional state.
Changelist:
- handle errors returned by virtqueue_enqueue() properly (they were
previously ignored)
- make sure netmap XOR rest of the kernel access each virtqueue.
- compute the number of netmap slots for TX and RX separately, according to
whether indirect descriptors are used or not for a given virtqueue.
- make sure sglist are freed according to their type (mbufs or netmap
buffers)
- add support for mulitiqueue and netmap host (aka sw) rings.
- intercept VQ interrupts directly instead of intercepting them in txq_eof
and rxq_eof. This simplifies the code and makes it easier to make sure
taskqueues are not running for a VQ while it is in netmap mode.
- implement vntet_netmap_config() to cope with changes in the number of queues.
Reviewed by: bryanv
Approved by: gnn (mentor)
MFC after: 3 days
Sponsored by: Sunny Valley Networks
Differential Revision: https://reviews.freebsd.org/D17916
Stephen Hurd [Wed, 14 Nov 2018 15:16:45 +0000 (15:16 +0000)]
Fix leaks caused by ifc_nhwtxqs never being initialized
r333502 removed initialization of ifc_nhwtxqs, and it's not clear
there's a need to copy it into the struct iflib_ctx at all. Use
ctx->ifc_sctx->isc_ntxqs instead.
Further, iflib_stop() did not clear the last ring in the case where
isc_nfl != isc_nrxqs (such as when IFLIB_HAS_RXCQ is set). Use
ctx->ifc_sctx->isc_nrxqs here instead of isc_nfl.
The d_off field has been added to the dirent structure recently.
Currently filesystems don't support this feature. Support has been
added and tested for zfs, ufs, ext2fs, fdescfs, msdosfs and unionfs.
A stub implementation is available for cd9660, nandfs, udf and
pseudofs but hasn't been tested.
Motivation for this feature: our usecase is for a userspace nfs server
(nfs-ganesha) with zfs. At the moment we cache direntry offsets by
calling lseek once per entry, with this patch we can get the offset
directly from getdirentries(2) calls which provides a significant
speedup.