kevans [Sat, 4 Aug 2018 21:41:10 +0000 (21:41 +0000)]
efirt: Don't enter EFI context early, convert addrs to KVA instead
efi_enter here was needed because efi_runtime dereference causes a fault
outside of EFI context, due to runtime table living in runtime service
space. This may cause problems early in boot, though, so instead access it
by converting paddr to KVA for access.
While here, remove the other direct PHYS_TO_DMAP calls and the explicit DMAP
requirement from efidev.
kib [Sat, 4 Aug 2018 20:45:43 +0000 (20:45 +0000)]
Swap in WKILLED processes.
Swapped-out process that is WKILLED must be swapped in as soon as
possible. The reason is that such process can be killed by OOM and
its pages can be only freed if the process exits. To exit, the kernel
stack of the process must be mapped.
When allocating pages for the stack of the WKILLED process on swap in,
use VM_ALLOC_SYSTEM requests to increase the chance of the allocation
to succeed.
Add counter of the swapped out processes to avoid unneeded iteration
over the allprocs list when there is no work to do, reducing the
allproc_lock ownership.
markj [Sat, 4 Aug 2018 20:29:58 +0000 (20:29 +0000)]
Fix the regression test for PR 181741.
With r337328, the test hangs becase the sendmsg() call will block until
the receive buffer is at least partially drained. Fix the problem by
using a non-blocking socket and allowing short writes. Also assert
that a SCM_CREDS message was received if one was expected.
PR: 181741
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D16516
markj [Sat, 4 Aug 2018 20:26:54 +0000 (20:26 +0000)]
Don't check rcv sockbuf limits when sending on a unix stream socket.
sosend_generic() performs an initial comparison of the amount of data
(including control messages) to be transmitted with the send buffer
size. When transmitting on a unix socket, we then compare the amount
of data being sent with the amount of space in the receive buffer size;
if insufficient space is available, sbappendcontrol() returns an error
and the data is lost. This is easily triggered by sending control
messages together with an amount of data roughly equal to the send
buffer size, since the control message size may change in uipc_send()
as file descriptors are internalized.
Fix the problem by removing the space check in sbappendcontrol(),
whose only consumer is the unix sockets code. The stream sockets code
uses the SB_STOP mechanism to ensure that senders will block if the
receive buffer fills up.
PR: 181741
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D16515
dim [Sat, 4 Aug 2018 14:57:23 +0000 (14:57 +0000)]
Fix build of hyperv with base gcc on i386
Summary:
Base gcc fails to compile `sys/dev/hyperv/pcib/vmbus_pcib.c` for i386,
with the following -Werror warnings:
cc1: warnings being treated as errors
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'new_pcichild_device':
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:567: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'vmbus_pcib_on_channel_callback':
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:940: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'hv_pci_protocol_negotiation':
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:1012: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'hv_pci_enter_d0':
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:1073: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'hv_send_resources_allocated':
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:1125: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c: In function 'vmbus_pcib_map_msi':
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:1730: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
This is because on i386, several casts from `uint64_t` to a pointer
reduce the value from 64 bit to 32 bit.
For gcc, this can be fixed by an intermediate cast to uintptr_t. Note
that I am assuming the incoming values will always fit into 32 bit!
Differential Revision: https://reviews.freebsd.org/D15753
MFC after: 3 days
delphij [Sat, 4 Aug 2018 14:13:09 +0000 (14:13 +0000)]
In r337271, we limited the sector number to the lower of calculated
number and CHS based number. However, on some systems, BIOS would
report 0 in CHS fields, making the system to think there is 0 sectors.
Add a check before comparing the calculated total with bd_sectors.
wulf [Sat, 4 Aug 2018 12:31:19 +0000 (12:31 +0000)]
wmt(4): Use internal function to calculate input report size
Usbhid's hid_report_size() calculates integral size of all reports of given
kind found in the HID descriptor rather then exact size of report with given
ID as its userland counterpart does. As all input data processed by the
driver is located within the same report, calculate required driver's buffer
size with userland version, imported in one of the previous commits.
This allows us to skip zeroing of buffer on processing of each report.
if present to enable some devices like WaveShare touchscreens. Unlike
Windows we discard content of the blob. We try mimic Windows driver
behaviour from the USB device point of view.
kevans [Sat, 4 Aug 2018 06:40:18 +0000 (06:40 +0000)]
efi-autoresizecons: Don't fail the boot w/o GOP or UGA
efi-autoresizecons is currently executed for every boot. If it fails, we
risk failing the boot, and we really shouldn't do that unless we absolutely
must.
Not being able to locate GOP or UGA is not a significant enough failure to
kill the boot. We always have the option to fall back to resizing ConOut to
a higher text mode resolution (if available), so do that.
This was detected by Doug [1] while attempting a bhyve + UEFI + PXE boot.
This patch was effectively also submitted by Doug, but I expanded the
comment he had originally sent me a little bit to indicate why this is an OK
idea.
manu [Fri, 3 Aug 2018 22:04:00 +0000 (22:04 +0000)]
dtb: am335x: Remove links and add more dts
The links were to cope with the switch to upstream dts.
We don't need them anymore.
While here add the rest of the beaglebone family dts as u-boot is common
on all those boards and load the dtb based on the product name.
This just miss the pocketbeagle variant as it's not yet in sys/gnu/dts but
will be with the Linux 4.18 dts import.
jhibbits [Fri, 3 Aug 2018 20:04:06 +0000 (20:04 +0000)]
nvme(4): Add bus_dmamap_sync() at the end of the request path
Summary:
Some architectures, in this case powerpc64, need explicit synchronization
barriers vs device accesses.
Prior to this change, when running 'make buildworld -j72' on a 18-core
(72-thread) POWER9, I would see controller resets often. With this change, I
don't see these resets messages, though another tester still does, for yet to be
determined reasons, so this may not be a complete fix. Additionally, I see a
~5-10% speed up in buildworld times, likely due to not needing to reset the
controller.
bdrewery [Fri, 3 Aug 2018 19:24:04 +0000 (19:24 +0000)]
Fix some filemon path logging issues.
- Properly handle snprintf return value for truncation and avoid
overflowing the later write with the bogus length.
- Increase the msgbufr size to handle a rename of 2 full files.
The larger allocation causes a slight performance hit which will be mitigated
in the future. A rewrite with sbufs will likely be done as well.
Some drives report a geometry that is inconsisetent with the total
number of sectors reported through the BIOS. Cylinders * heads *
sectors may not necessarily be equal to the total number of sectors
reported through int13h function 48h.
An example of this is when a Mediasonic HD3-U2B PATA to USB enclosure
with a 80 GB disk is attached. Loader hangs at line 506 of
stand/i386/libi386/biosdisk.c while attempting to read sectors beyond
the end of the disk, sector 156906855. I discovered that the Mediasonic
enclosure was reporting the disk with 9767 cylinders, 255 heads, 63
sectors/track. That's 156906855 sectors. However camcontrol and
Windows 10 both report report the disk having 156301488 sectors, not
the calculated value. At line 280 biosdisk.c sets the sectors to the
higher of either bd->bd_sectors or the total calculated at line 276
(156906855) instead of the lower and correct value of 156301488 reported
by int 13h 48h.
This was tested on all three of my Mediasonic HD3-U2B PATA to USB
enclosures.
Instead of using the higher of bd_sectors (returned by int13h) or the
calculated value, this patch uses the lower and safer of the values.
jhb [Fri, 3 Aug 2018 18:52:51 +0000 (18:52 +0000)]
Install the 32-bit compat sanitizer libraries.
The lib32 build was already building the i386 version of
the clang sanitizers (libclang_rt) but they were not being
installed. This enables the installation.
MK_TOOLCHAIN=no was originally added to the install make
environment to disable includes so that NO_INCS could be
removed. The MK_TOOLCHAIN in bsd.incs.mk was subsequently
renamed to MK_INCLUDES, but bsd.lib.mk doesn't even include
bsd.incs.mk when LIBRARIES_ONLY is defined which the install
make environment for compat libs now defines. However,
setting MK_TOOLCHAIN=no forced MK_CLANG=no which disabled
libclang_rt during the install32 phase. Remove MK_TOOLCHAIN=no
since LIBRARIES_ONLY is now sufficient.
Since the libcompat environment overrides both LIBDIR and
SHLIBDIR, libclang_rt/Makefile.inc has to set both variables
to force the libraries to be installed to the location
expected by the compiler.
kib [Fri, 3 Aug 2018 18:35:20 +0000 (18:35 +0000)]
Require write access when mmapping BAR.
This actually makes the rights requirements for accessing PCI config
space and BARs using /dev/pci same. Since unchanged /dev/pci mode
only allows write open for root, default configuration de-facto limits
the BAR read to root only. In particular, state-changing reads of the
registers are limited to root.
Discussed with: se
Suggested and reviewed by: jhb (kernel part)
Sponsored by: The FreeBSD Foundation
MFC after: 12 days
Differential revision: https://reviews.freebsd.org/D16580
avg [Fri, 3 Aug 2018 14:27:28 +0000 (14:27 +0000)]
safer wait-free iteration of shared interrupt handlers
The code that iterates a list of interrupt handlers for a (shared)
interrupt, whether in the ISR context or in the context of an interrupt
thread, does so in a lock-free fashion. Thus, the routines that modify
the list need to take special steps to ensure that the iterating code
has a consistent view of the list. Previously, those routines tried to
play nice only with the code running in the ithread context. The
iteration in the ISR context was left to a chance.
After commit r336635 atomic operations and memory fences are used to
ensure that ie_handlers list is always safe to navigate with respect to
inserting and removal of list elements.
There is still a question of when it is safe to actually free a removed
element.
The idea of this change is somewhat similar to the idea of the epoch
based reclamation. There are some simplifications comparing to the
general epoch based reclamation. All writers are serialized using a
mutex, so we do not need to worry about concurrent modifications. Also,
all read accesses from the open context are serialized too.
So, we can get away just two epochs / phases. When a thread removes an
element it switches the global phase from the current phase to the other
and then drains the previous phase. Only after the draining the removed
element gets actually freed. The code that iterates the list in the ISR
context takes a snapshot of the global phase and then increments the use
count of that phase before iterating the list. The use count (in the
same phase) is decremented after the iteration. This should ensure that
there should be no iteration over the removed element when its gets
freed.
This commit also simplifies the coordination with the interrupt thread
context. Now we always schedule the interrupt thread when removing one
of handlers for its interrupt. This makes the code both simpler and
safer as the interrupt thread masks the interrupt thus ensuring that
there is no interaction with the ISR context.
P.S. This change matters only for shared interrupts and I realize that
those are becoming a thing of the past (and quickly). I also understand
that the problem that I am trying to solve is extremely rare.
PR: 229106
Reviewed by: cem
Discussed with: Samy Al Bahra
MFC after: 5 weeks
Differential Revision: https://reviews.freebsd.org/D15905
mav [Fri, 3 Aug 2018 02:16:45 +0000 (02:16 +0000)]
Reduce taskq and context-switch cost of zio pipe
When doing a read from disk, ZFS creates 3 ZIO's: a zio_null(), the
logical zio_read(), and then a physical zio. Currently, each of these
results in a separate taskq_dispatch(zio_execute).
On high-read-iops workloads, this causes a significant performance
impact. By processing all 3 ZIO's in a single taskq entry, we reduce the
overhead on taskq locking and context switching. We accomplish this by
allowing zio_done() to return a "next zio to execute" to zio_execute().
This results in a ~12% performance increase for random reads, from
96,000 iops to 108,000 iops (with recordsize=8k, on SSD's).
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Wilson <george.wilson@delphix.com> Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-59292
Closes #7736
asomers [Fri, 3 Aug 2018 01:37:00 +0000 (01:37 +0000)]
Fix LOCAL_PEERCRED with socketpair(2)
Enable the LOCAL_PEERCRED socket option for unix domain stream sockets
created with socketpair(2). Previously, it only worked with unix domain
stream sockets created with socket(2)/listen(2)/connect(2)/accept(2).
PR: 176419
Reported by: Nicholas Wilson <nicholas@nicholaswilson.me.uk>
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D16350
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>
mav [Fri, 3 Aug 2018 00:01:48 +0000 (00:01 +0000)]
MFV r337210: 9577 remove zfs_dbuf_evict_key tsd
The zfs_dbuf_evict_key TSD (thread-specific data) is not necessary - we can
instead pass a flag down in a few places to prevent recursive dbuf eviction.
Making this change has 3 benefits:
1. The code semantics are easier to understand.
2. On Linux, performance is improved, because creating/removing TSD values
(by setting to NULL vs non-NULL) is expensive, and we do it very often.
3. According to Nexenta, the current semantics can cause a deadlock when
concurrently calling dmu_objset_evict_dbufs() (which is rare today, but they
are working on a "parallel unmount" change that triggers this more easily)
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
cem [Thu, 2 Aug 2018 23:45:14 +0000 (23:45 +0000)]
wc(1): Fix 'wc -L'
I inadvertently broke 'wc -L' in r326736. We must skip the fast path if -L
was specified, in addition to the existing check for the -l option.
Document long-standing -L behavior (count varies depending on whether wc(1)
is run with the -m option or not) in wc.1. That behavior dates back to the
introduction of the -L option, but was not documented.
mav [Thu, 2 Aug 2018 23:43:01 +0000 (23:43 +0000)]
MFV r337200:
9438 Holes can lose birth time info if a block has a mix of birth times
Ultimately, the problem here is that when you truncate and write a file in
the same transaction group, the dbuf for the indirect block will be zeroed
out to deal with the truncation, and then written for the write. During
this process, we will lose hole birth time information for any holes in the
range. In the case where a dnode is being freed, we need to determine
whether the block should be converted to a higher-level hole in the zio
pipeline, and if so do it when the dnode is being synced out.
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Don Brady <don.brady@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
- Ignore any type of TID where the start/end values are not in the
correct order. There are situations where the firmware isn't able to
reserve room for the number requested in the config file but doesn't
report a failure during configuration and instead sets end <= start.
- Track start/end in tid_tab and remove some redundant copies from
adapter->params.
- Move all the start/end and other read-only parameters to a quiet part
of tid_tab, away from the tid locks.
mav [Thu, 2 Aug 2018 21:59:46 +0000 (21:59 +0000)]
MFV r337190: 9486 reduce memory used by device removal on fragmented pools
In the most fragmented real-world cases, this reduces memory used by the
mapping from ~1GB to ~50MB of RAM per 1TB of storage removed. Less
fragmented cases will typically also see around 50-100MB of RAM per 1TB
of storage.
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Tim Chase <tim@chase2k.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
mav [Thu, 2 Aug 2018 21:25:32 +0000 (21:25 +0000)]
MFV r337184: 9457 libzfs_import.c:add_config() has a memory leak
A memory leak occurs on lines 209 and 213 because the config is not freed
in the error case. The interface to add_config() seems less than ideal -
it would be better if it copied any data necessary from the config and the
caller freed it.
mav [Thu, 2 Aug 2018 21:19:35 +0000 (21:19 +0000)]
MFV r337182: 9330 stack overflow when creating a deeply nested dataset
Datasets that are deeply nested (~100 levels) are impractical. We just put
a limit of 50 levels to newly created datasets. Existing datasets should
work without a problem.
mav [Thu, 2 Aug 2018 21:07:04 +0000 (21:07 +0000)]
9539 Make zvol operations use _by_dnode routines
Continues what was started in 7801 add more by-dnode routines by fully
converting zvols to avoid unnecessary dnode_hold() calls. This saves a
small amount of CPU time and slightly improves latencies of operations
on zvols.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Rick McNeal <rick.mcneal@nexenta.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Richard Yao <richard.yao@prophetstor.com>
mav [Thu, 2 Aug 2018 20:33:13 +0000 (20:33 +0000)]
MFV r337175: 9487 Free objects when receiving full stream as clone
All objects after the last written or freed object are not supposed to
exist after receiving the stream. We should free them accordingly, as if
a freeobjects record for them had been included in the stream.
mav [Thu, 2 Aug 2018 20:18:49 +0000 (20:18 +0000)]
MFV r337171:
9464 txg_kick() fails to see that we are quiescing, forcing transactions
to their next stages without leaving them accumulate changes
Ideally we would like txg_kick() to get triggered only when we are sure
that we are not syncing AND not quiescing any txg. This way we can kick
an open TXG to the quiescing state when we are sure that there is nothing
going on and we would benefit from the different states running
concurrently.
rmacklem [Thu, 2 Aug 2018 20:10:59 +0000 (20:10 +0000)]
Silence newer gcc warnings.
Newer versions of gcc generate "might not be initialized" warnings for
several variables in nfsrpc_doiods(). I have checked and all of these
variables are assigned values before they are used.
In the one case of "tdrpc", it could have passed garbage as an argument
to nfscl_dofflayoutio() when mirrorcnt is one. However nfscl_dofflayoutio() only
uses the argument when mirrorcnt > 1, so it wasn't actually broken.
This patch initializes "tdrpc" to avoid confusion and initializes the rest
to make the compiler happy.
mav [Thu, 2 Aug 2018 20:06:46 +0000 (20:06 +0000)]
MFV r337167: 9442 decrease indirect block size of spacemaps
Updates to indirect blocks of spacemaps can contribute significantly to
write inflation. Therefore we want to reduce the indirect block size of
spacemaps from 128K to 16K.
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Albert Lee <trisk@forkgnu.org>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
cxgbe(4): Use the tx credit limit for ethofld rather than TOE when
initializing the softc for a per-flow rate limiter. The limit happens
to be the same for both and the existing code worked by accident for
common configurations.
cem [Thu, 2 Aug 2018 19:25:43 +0000 (19:25 +0000)]
FUSE: Bump maximum IO size to enable more performant operation
Various components restrict size of IO passed up to the userspace filesystem
based on the mount's f_iosize value. The previous default of PAGE_SIZE
is anemic, even for normal filesystems, but especially considering every
FUSE operation involves a kernel <-> userspace IPC upcall.
Bump to DFLTPHYS (currently 64kB) to match other FUSE implementations.
Anecdotally, Jakub reports IO read performance increased from 600 MB/s ->
2700 MB/s with a basic RAM-backed FUSE filesystem.
PR: 230260
Reported by: Peter (MooseFS) <freebsd AT moosefs.com>
Tested by: Jakub Kruszona-Zawadzki <acid AT moosefs.com>
MFC after: 3 days
Only filesystems and volumes are valid "zfs remap" parameters: when passed
a snapshot name zfs_remap_indirects() does not handle the EINVAL returned
from libzfs_core, which results in failing an assertion and consequently
crashing.
mav [Thu, 2 Aug 2018 18:55:55 +0000 (18:55 +0000)]
Do not blindly include illumos kernel headers instead of user-space.
It is not needed now, and I doubt it much helped at all, creating more
confusions then good.
gjb [Thu, 2 Aug 2018 18:51:44 +0000 (18:51 +0000)]
Fix the ftp-stage target for arm embedded builds.
The images were renamed from KERNCONF to BOARDNAME when
specified, which would result in an image name of:
12.0-CURRENT-arm-armv7-GENERIC.img
which would then be renamed to use the BOARDNAME for the
SoC the image is targeted to use. BOARDNAME was specified
for all images as of r336994, which now causes the ftp-stage
target to fail, as the rename is no longer necessary.
trasz [Thu, 2 Aug 2018 07:43:28 +0000 (07:43 +0000)]
Make sure the rtld(1) error messages go to stderr, not stdout.
While here fix capitalization of a few nearby strings, add the
rtld's file name prefix so it's obvious where the message come
from, and return zero when "-h" is used.
https://www.illumos.org/issues/7955
Libshare currently initializes all available filesystems when doing any
libshare operation. This requires iterating through all the filesystem
multiple times, which is a huge performance problem for sharing and
unsharing operations.
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@gmail.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Author: Daniel Hoffman <dj.hoffman@delphix.com>
For FreeBSD this is practically a NOP, just a diff reduction.
kevans [Wed, 1 Aug 2018 20:08:20 +0000 (20:08 +0000)]
ubldr: Bump heap size, 1MB -> 2MB
1MB was leaving very little margin in some of the worse-case scenarios with
lualoader. 2MB is still low enough that we shouldn't have any problems with
UBoot-supported boards.
markj [Wed, 1 Aug 2018 19:45:04 +0000 (19:45 +0000)]
Fix some nits in the unix_passfd tests.
- Remove return statements in functions with a void return type.
- Allocate enough space for the SCM_CREDS and SCM_RIGHTS messages
received in the rights_creds_payload test.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
kib [Wed, 1 Aug 2018 18:58:24 +0000 (18:58 +0000)]
Add ioctl to conveniently mmap a PCI device BAR into userspace.
Add the ioctl PCIOCBARMMAP on /dev/pci to conveniently create
userspace mapping of a PCI device BAR. This is enormously superior to
read the BAR value with PCIOCREAD and then try to mmap /dev/mem, and
should allow to automatically activate the mapped BARs when needed in
future.
Current implementation creates new sg pager for each user mmap
request. If the pointer (and reference) to a managed device pager is
stored in pci_map, we would be able to revoke all mappings on the BAR
deactivation or relocation. This is related to the unimplemented BAR
activation on mmap, and is postponed for the future.