FreeBSD/FreeBSD.git
5 years agoDirect commit to stable, file not present in current
imp [Wed, 17 Oct 2018 02:45:15 +0000 (02:45 +0000)]
Direct commit to stable, file not present in current

Catch up to r332154: Fix d_dev removal of d_type.

5 years agoMFC r336159:
np [Wed, 17 Oct 2018 02:25:15 +0000 (02:25 +0000)]
MFC r336159:

cxgbe(4): Add a sysctl to report the chip's microprocessor's load
averages.  This works with debug or custom firmwares only.

sysctl dev.<nexus>.<instance>.loadavg
sysctl dev.t6nex.0.loadavg

5 years agoMFC r335352:
np [Wed, 17 Oct 2018 02:05:31 +0000 (02:05 +0000)]
MFC r335352:

cxgbe(4): Some mailbox commands require access to the Tx pipeline and
can time out if it's backed up due to a non-stop deluge of PAUSE frames
from a misbehaving peer.  Detect this situation and toggle MPS TxEn
to allow forward progress.

5 years agoMFC r334987:
np [Wed, 17 Oct 2018 01:59:45 +0000 (01:59 +0000)]
MFC r334987:

cxgbe(4): Remove homemade version of htobe32 from the driver.

It was needed only for ia64 where it was implemented as a call to
bswapXX, which was always a real function.  htobeXX with a constant
argument is calculated at compile-time everywhere else.

5 years agoMFC r320426:
np [Wed, 17 Oct 2018 01:49:43 +0000 (01:49 +0000)]
MFC r320426:

cxgbe/t4_tom: Do not include space taken by the TCP timestamp option in
the "effective MSS" for the connection.  The chip expects it this way.

5 years agoMFC r338254:
np [Wed, 17 Oct 2018 01:30:51 +0000 (01:30 +0000)]
MFC r338254:

cxgbe(4): Use fcmpset instead of cmpset when appropriate.
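As a rough illustration of the difference, and not the driver's actual code: FreeBSD's atomic_fcmpset_*() stores the value it observed back into the caller's "old" variable when the swap fails, so a retry loop does not need a separate reload as it would with cmpset. C11's atomic_compare_exchange_weak() has a similar contract and stands in for it in this sketch. The function name add_capped is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Add 'delta' to '*counter' unless the result would exceed 'cap'.
 * Returns true if the counter was updated.
 */
bool
add_capped(atomic_uint *counter, unsigned int delta, unsigned int cap)
{
    unsigned int old = atomic_load(counter);

    do {
        if (old > cap || delta > cap - old)
            return (false);
        /*
         * On failure the call below stores the current value in 'old',
         * so the loop retries without reloading the counter itself.
         */
    } while (!atomic_compare_exchange_weak(counter, &old, old + delta));

    return (true);
}
```

With a cmpset-style primitive the loop body would have to begin with another load of the counter on every iteration.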

5 years agoMFC r338924:
np [Wed, 17 Oct 2018 01:20:18 +0000 (01:20 +0000)]
MFC r338924:

cxgbe(4): Link related changes.

- Switch to using 32b port/link capabilities in the driver.  The 32b
  format is used internally by firmwares > 1.16.45.0 and the driver will
  now interact with the firmware in its native format, whether it's 16b
  or 32b.  Note that the 16b format doesn't have room for 50G, 200G, or
  400G speeds.

- Add a bit in the pause_settings knobs to allow negotiated PAUSE
  settings to override manual settings.

- Ensure that manual link settings persist across an administrative
  down/up as well as transceiver unplug/replug.

- Remove unused is_*G_port() functions.

Sponsored by: Chelsio Communications

5 years agoMFC r336042:
np [Wed, 17 Oct 2018 01:05:52 +0000 (01:05 +0000)]
MFC r336042:

cxgbe(4): Assume that any unknown flash on the card is 4MB and has 64KB
sectors, instead of refusing to attach to the card.

Submitted by: Casey Leedom @ Chelsio
Sponsored by: Chelsio Communications

5 years agoMFC r333139:
np [Wed, 17 Oct 2018 00:57:28 +0000 (00:57 +0000)]
MFC r333139:

cxgbe(4): Destroy the cdev before disabling interrupts in driver detach.

Filter work requests are submitted in the nexus cdev's ioctl which then
blocks waiting for a reply.  If driver detach runs in this state and
disables interrupts the ioctl will never complete and detach will hang
in destroy_cdev.

5 years agoMFC r325840, r327811, and r329701.
np [Wed, 17 Oct 2018 00:45:01 +0000 (00:45 +0000)]
MFC r325840, r327811, and r329701.

r325840:
CXGBE: fix big-endian behaviour

The setbit/clearbit pair casts the bitfield pointer
to uint8_t* which effectively treats its contents as
little-endian variable. The ffs() function accepts int as
the parameter, which is big-endian. Use uint8_t here to
avoid mismatch, as we have only 4 doorbells.
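The mismatch described above can be sketched as follows. This is an assumption-laden illustration rather than the driver's code: sketch_setbit and lowest_doorbell are hypothetical names, and the byte-wise bit set imitates what a setbit-style macro does. Keeping the bitfield in a single uint8_t means the byte that is written is the same byte that ffs() later inspects, whatever the host byte order.

```c
#include <stdint.h>
#include <strings.h>    /* ffs() */

/* Set bit 'i' in a byte array, the way a setbit-style macro would. */
static void
sketch_setbit(uint8_t *a, unsigned int i)
{
    a[i / 8] |= (uint8_t)(1u << (i % 8));
}

/* Mark doorbell 'i' (0..7) as available in a one-byte bitfield. */
void
doorbell_enable(uint8_t *doorbells, unsigned int i)
{
    sketch_setbit(doorbells, i);
}

/* Return the lowest enabled doorbell index, or -1 if none is set. */
int
lowest_doorbell(const uint8_t *doorbells)
{
    /* The uint8_t is promoted by value, so byte order does not matter. */
    return (ffs(*doorbells) - 1);
}
```

Had the bitfield been an int, the byte-wise write would land in the most significant byte on a big-endian host and ffs() would report a different bit.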

Submitted by:          Wojciech Macek <wma@freebsd.org>
Reviewed by:           np
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
Differential revision: https://reviews.freebsd.org/D13084

r327811:
CXGBE: fix get_filt to be endianness-aware

Unconditional 32-bit shift is not endianness-safe.
Modify the logic to work both on LE and BE.

Submitted by:          Wojciech Macek <wma@freebsd.org>
Reviewed by:           np
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D13102

r329701:
CXGBE: implement prefetch on non-Intel architectures

Submitted by:          Michal Stanek <mst@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           np, pdk@semihalf.com
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D14452

5 years agoMFC r320419, r337679, r338366, and r338652.
np [Wed, 17 Oct 2018 00:27:21 +0000 (00:27 +0000)]
MFC r320419, r337679, r338366, and r338652.

r320419:
cxgbe/iw_cxgbe: Disable debug output by default.  The help text for the sysctl
already says that the default is 0.

r337679:
Remove unused stuff from iw_cxgbe.h

r338366:
cxgbe/iw_cxgbe: Fix iWARP RDMA + VIMAGE operation by setting the VNET
properly in a couple of places in the driver.

r338652:
cxgbe/iw_cxgbe: Fix reported build breakage when the kernel
configuration has "device cxgbe" but no VIMAGE.

Sponsored by: Chelsio Communications

5 years agoMFC r332515:
np [Tue, 16 Oct 2018 22:13:05 +0000 (22:13 +0000)]
MFC r332515:
Fix typo in cxgbetool.8.

5 years agoMFC r330887:
np [Tue, 16 Oct 2018 22:09:33 +0000 (22:09 +0000)]
MFC r330887:
cxgbetool(8): Add the ability to decode hardware TCBs.

Sponsored by: Chelsio Communications

5 years agoMFC 326138,326436,326852: Style fixes to kdump.
jhb [Tue, 16 Oct 2018 20:53:16 +0000 (20:53 +0000)]
MFC 326138,326436,326852: Style fixes to kdump.

326138:
Use C standard spelling uint64_t for u_int64_t.

326436:
vmstat: fix style(9) violations and bump WARNS.

326852:
Re-add spaces lost in r326436.

5 years agoMFC r327254, r327904, and r328994.
np [Tue, 16 Oct 2018 19:26:04 +0000 (19:26 +0000)]
MFC r327254, r327904, and r328994.

r327254:
cxgbe/iw_cxgbe: Fix iWARP over VLANs (catch up with r326169).

r327904:
cxgbe/iw_cxgbe: Remove duplicates to fix compilation with recent gcc.

r328994:
iw_cxgbe: Remove declaration of a function that no longer exists.

Sponsored by: Chelsio Communications

5 years agoMFC r339241:
kib [Mon, 15 Oct 2018 10:50:04 +0000 (10:50 +0000)]
MFC r339241:
Disallow zero day of month from strptime("%d").

PR: 232072

5 years agoMFC r336027 (andrew): Teach binutils that arm64 is a 64bit architecture.
emaste [Sun, 14 Oct 2018 01:16:48 +0000 (01:16 +0000)]
MFC r336027 (andrew): Teach binutils that arm64 is a 64bit architecture.

This is needed to cross build from arm64 to other architectures that
use binutils.

5 years agoMFC r339288: Remove extra thread_exit() call left after r329802.
mav [Sat, 13 Oct 2018 03:12:57 +0000 (03:12 +0000)]
MFC r339288: Remove extra thread_exit() call left after r329802.

spa_condense_indirect_thread() is no longer a thread function, but just
a callback for new zthr KPI.

5 years agoMFC r339076
ken [Fri, 12 Oct 2018 19:44:19 +0000 (19:44 +0000)]
MFC r339076

This has been edited slightly from the version in head.  In head, the probe
sections of dadone() were split out into separate functions.  In stable/11,
dadone() is still a single function.

So, for stable/11, this describes the change:

sys/cam/scsi/scsi_da.c:
In the DA_CCB_PROBE_DONE case in dadone(), free the data pointer
before returning.

  ------------------------------------------------------------------------
  r339076 | ken | 2018-10-01 13:00:46 -0600 (Mon, 01 Oct 2018) | 12 lines

  Fix a da(4) driver memory leak for SCSI SMR devices.

  In the probe case for SCSI SMR Host Aware or Host Managed drives, be sure
  to free allocated memory.

  sys/cam/scsi/scsi_da.c:
   In dadone_probezone(), free the data pointer before returning.

  Sponsored by: Spectra Logic

  ------------------------------------------------------------------------

Sponsored by: Spectra Logic

5 years agoMFC r339197: Add sysctls for dbuf metadata cache variables added in r336959.
mav [Fri, 12 Oct 2018 01:11:20 +0000 (01:11 +0000)]
MFC r339197: Add sysctls for dbuf metadata cache variables added in r336959.

5 years agoMFC 338055: Remove some vestiges of IPI_LAZYPMAP on i386.
jhb [Thu, 11 Oct 2018 19:06:54 +0000 (19:06 +0000)]
MFC 338055: Remove some vestiges of IPI_LAZYPMAP on i386.

The support for lazy pmap invalidations on i386 was removed in r281707.
This removes the constant for the IPI and stops accounting for it when
sizing the interrupt count arrays.

5 years agoMFC r339237: Fix r336951 mismerge -- use of uninitialized variable.
mav [Thu, 11 Oct 2018 15:12:10 +0000 (15:12 +0000)]
MFC r339237: Fix r336951 mismerge -- use of uninitialized variable.

5 years agoMFC r339235:
hselasky [Thu, 11 Oct 2018 07:34:56 +0000 (07:34 +0000)]
MFC r339235:
Add missing steering rules for virtual function, VF, in mlx4en(4) driver.

When acting as a VF it is required to add steering rules for all unicast
addresses, even if promiscuous mode is selected.  Otherwise incoming data
packets will be dropped.

Sponsored by: Mellanox Technologies

5 years agoMFC r339181: crt: switch to standard note type definitions from elf_common.h
emaste [Thu, 11 Oct 2018 00:26:15 +0000 (00:26 +0000)]
MFC r339181: crt: switch to standard note type definitions from elf_common.h

This makes it easier to grep the source tree for these notes, and
ensures that they will remain in sync.

Sponsored by: The FreeBSD Foundation

5 years agoMFC r338200: Adding device ID for Terratec SiXPack 5.1+.
avatar [Wed, 10 Oct 2018 22:49:52 +0000 (22:49 +0000)]
MFC r338200: Adding device ID for Terratec SiXPack 5.1+.

5 years agoDisable the KASSERT for curcpu == 0 in netisr for EARLY_AP_STARTUP.
jhb [Wed, 10 Oct 2018 21:28:04 +0000 (21:28 +0000)]
Disable the KASSERT for curcpu == 0 in netisr for EARLY_AP_STARTUP.

In the EARLY_AP_STARTUP case, thread0 can migrate to another CPU
before this SYSINIT is run.  However, the only part of this SYSINIT
that assumes it runs on CPU 0 is in the !EARLY_AP_STARTUP case when it
creates the netisr for the boot CPU.  In the EARLY_AP_STARTUP case we
start up the netisr's for the first N CPUs during the SYSINIT itself
and don't depend on running on the boot CPU for correct operation.

This is a direct commit to stable/11 as the assertion was removed as part
of a different change in r302595.

Reported by: rwatson, truckman, jkim, FreeNAS bug 45611

5 years agoMFC r333569: cpucontrol: improve Intel microcode revision check
emaste [Wed, 10 Oct 2018 15:54:01 +0000 (15:54 +0000)]
MFC r333569: cpucontrol: improve Intel microcode revision check

According to the Intel SDM (Volume 3, 9.11.7) the BIOS signature MSR
should be zeroed before executing cpuid (although in practice it does
not seem to matter).

PR: 192487
Submitted by: Dan Lukes
Reported by: Henrique de Moraes Holschuh

5 years agoMFC r333233: gpart: add fat32lba MBR partition type
emaste [Wed, 10 Oct 2018 15:44:14 +0000 (15:44 +0000)]
MFC r333233: gpart: add fat32lba MBR partition type

FAT32 partition with LBA addressing.

Sponsored by: The FreeBSD Foundation

5 years agoMFC r338810: openssh: rename local macro to avoid OpenSSL 1.1.1 conflict
emaste [Wed, 10 Oct 2018 15:38:33 +0000 (15:38 +0000)]
MFC r338810: openssh: rename local macro to avoid OpenSSL 1.1.1 conflict

Local changes introduced an OPENSSH_VERSION macro, but this conflicts
with a macro of the same name introduced with OpenSSL 1.1.1.

Sponsored by: The FreeBSD Foundation

5 years agoMFC r339019: clang: allow ifunc resolvers to accept arguments
emaste [Wed, 10 Oct 2018 15:37:10 +0000 (15:37 +0000)]
MFC r339019: clang: allow ifunc resolvers to accept arguments

Previously Clang required ifunc resolution functions to take no
arguments, presumably because GCC documented ifunc resolvers as taking
no arguments.  However, GCC accepts resolvers accepting arguments, and
our rtld passes CPU ID information (cpuid, hwcap, etc.) to ifunc
resolvers.  Just remove the check from our in-tree compiler.

Sponsored by: The FreeBSD Foundation

5 years agoregenerate src.conf.5 to remove duplicate entries
emaste [Wed, 10 Oct 2018 13:19:54 +0000 (13:19 +0000)]
regenerate src.conf.5 to remove duplicate entries

Also correct arch lists - armv7, mips*hf, powerpcspe, riscv64 are not
in stable/11.

PR: 226908, 229514
Sponsored by: The FreeBSD Foundation

5 years agoMFC r334072, r334247 (eadler): Add the text '@generated' to src.conf.5
emaste [Wed, 10 Oct 2018 13:12:52 +0000 (13:12 +0000)]
MFC r334072, r334247 (eadler): Add the text '@generated' to src.conf.5

This is a cross-tool approach to identifying generated code. Some tools,
notably phabricator, handle this marker specially.  See
https://reviews.freebsd.org/differential/diff/42870/ for such an
example.

5 years agoMFC r306729: makeman: avoid bogus output with duplicated options
emaste [Wed, 10 Oct 2018 13:06:31 +0000 (13:06 +0000)]
MFC r306729: makeman: avoid bogus output with duplicated options

On some targets 'make showconfig' currently reports both 'no' and 'yes'
for some options. For example:

% make TARGET=mips showconfig | grep SSP
MK_SSP           = no
MK_SSP           = yes

Emit a warning on encountering a duplicated variable, and skip the
second entry.
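The warn-and-skip behaviour can be sketched in C as below. The real tool is not reproduced here; skip_duplicates is a hypothetical name, and the sketch assumes duplicated option names appear on adjacent lines, as in the showconfig output above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Compact 'names' in place, dropping any entry whose name equals the one
 * kept just before it.  Returns the number of entries kept.
 */
size_t
skip_duplicates(const char **names, size_t n)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (kept > 0 && strcmp(names[kept - 1], names[i]) == 0) {
            /* Duplicate: warn and keep only the first occurrence. */
            fprintf(stderr, "warning: duplicate %s\n", names[i]);
            continue;
        }
        names[kept++] = names[i];
    }
    return (kept);
}
```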

PR: 226908, 229514
Sponsored by: The FreeBSD Foundation

5 years agoMFC 338976: Don't clear DR6 for debug exceptions from userland.
jhb [Tue, 9 Oct 2018 22:35:43 +0000 (22:35 +0000)]
MFC 338976: Don't clear DR6 for debug exceptions from userland.

This reverts part of r333368.  The attempt to clear DR6 was occurring
too soon as trapsignal() does not pause to let the debugger notice the
SIGTRAP and query DR6.  The signal exchange does not occur until much
later during ast().  As a result, GDB was no longer recognizing
hardware breakpoints and watchpoints on x86.

In addition, any userland programs that want to inspect DR6 in a
SIGTRAP handler don't have a way to do this if we clear DR6 in the
exception handler.

Instead of relying on the kernel to clear DR6, debuggers will have to
explicitly clear it after a trace trap (which they needed to do on
older kernels anyway).

5 years agoMFH (r333574): fully support acting as a recursing resolver.
des [Tue, 9 Oct 2018 20:29:04 +0000 (20:29 +0000)]
MFH (r333574): fully support acting as a recursing resolver.

PR: 222902

5 years agoMFH (r314778): use reallocarray(3) for extra bounds checks
des [Tue, 9 Oct 2018 10:49:19 +0000 (10:49 +0000)]
MFH (r314778): use reallocarray(3) for extra bounds checks
MFH (r333306): fix typo in man page
MFH (r333571, r333572): preserve if-modified-since across redirects
MFH (r334317): simplify the DEBUG macro
MFH (r334319): style bug roundup
MFH (r334326): fix netrc file location logic, improve netrcfd handling
MFH (r338572): fix end-of-transfer statistics, improve no-tty display

PR: 202424, 224426, 228017
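The "extra bounds checks" that reallocarray(3) provides amount to refusing a request whose element count times element size would overflow. A minimal sketch of that check follows; checked_reallocarray is a hypothetical stand-in written for illustration, not the libc implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Resize 'ptr' to hold 'nmemb' elements of 'size' bytes each.
 * Fails with ENOMEM instead of letting nmemb * size wrap around.
 */
void *
checked_reallocarray(void *ptr, size_t nmemb, size_t size)
{
    if (size != 0 && nmemb > SIZE_MAX / size) {
        errno = ENOMEM;
        return (NULL);
    }
    return (realloc(ptr, nmemb * size));
}
```

A plain realloc(ptr, nmemb * size) would silently allocate a too-small buffer when the multiplication wraps, which is the case the check rules out.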

5 years agoMFC r338925:
brooks [Mon, 8 Oct 2018 22:38:28 +0000 (22:38 +0000)]
MFC r338925:

Don't override LDFLAGS set in bsd.cpu.mk.

This is a direct commit to a generated file.  Simon plans to fix this
upstream before the next import.

PR: 231557
Approved by: re (gjb)
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL

5 years agoMFC 338021: Use 'bool' instead of 'int' for various boolean flags.
jhb [Mon, 8 Oct 2018 17:22:27 +0000 (17:22 +0000)]
MFC 338021: Use 'bool' instead of 'int' for various boolean flags.

5 years agoMFC 337400: Remove spurious ABI tags from kdump output.
jhb [Mon, 8 Oct 2018 17:18:55 +0000 (17:18 +0000)]
MFC 337400: Remove spurious ABI tags from kdump output.

The abidump routine output an ABI tag when -A was specified for records
that were not displayed due to type or pid filtering.  To fix, split
the code to lookup the ABI from the code to display the ABI, move the
code to display the ABI into dumpheader(), and move dumpheader() later
in the main loop as a simplification.  Previously dumpheader() was
called under a condition that repeated conditions made later in the
main loop.

5 years agoMFC r339025:
kib [Sun, 7 Oct 2018 00:40:56 +0000 (00:40 +0000)]
MFC r339025:
Update x86/ifunc.h.

5 years agoMFC 338022: Fix casts between 64-bit physical addresses and pointers in EFI.
jhb [Fri, 5 Oct 2018 21:10:03 +0000 (21:10 +0000)]
MFC 338022: Fix casts between 64-bit physical addresses and pointers in EFI.

Compiling FreeBSD/i386 with modern GCC triggers warnings for various
places that convert 64-bit EFI_ADDRs to pointers and vice versa.
- Cast pointers to uintptr_t rather than to uint64_t when assigning
  to a 64-bit integer.
- Cast 64-bit integers to uintptr_t before a cast to a pointer.
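The two rules above can be sketched as a pair of helpers. The type name sketch_efi_addr and both function names are hypothetical; the point is only that uintptr_t sits between the pointer and the 64-bit integer in each direction, so no cast ever changes both size and kind at once.

```c
#include <stdint.h>

typedef uint64_t sketch_efi_addr;   /* hypothetical 64-bit address type */

/* Pointer -> 64-bit integer: go through uintptr_t, then widen. */
sketch_efi_addr
ptr_to_addr(void *p)
{
    return ((sketch_efi_addr)(uintptr_t)p);
}

/* 64-bit integer -> pointer: narrow to uintptr_t first, then cast. */
void *
addr_to_ptr(sketch_efi_addr a)
{
    return ((void *)(uintptr_t)a);
}
```

On a 32-bit target a direct cast between a pointer and uint64_t is what draws the size-mismatch warning; the intermediate uintptr_t makes the width change explicit.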

5 years agoMFC r338999:
kib [Fri, 5 Oct 2018 18:15:44 +0000 (18:15 +0000)]
MFC r338999:
Correct vm_fault_copy_entry() handling of backing file truncation
after the file mapping was wired.

5 years agoMFC r338998:
kib [Fri, 5 Oct 2018 18:14:18 +0000 (18:14 +0000)]
MFC r338998:
In vm_fault_copy_entry(), we should not assert that entry is charged
if the dst_object is not of swap type.

5 years agoMFC r338997:
kib [Fri, 5 Oct 2018 18:12:49 +0000 (18:12 +0000)]
MFC r338997:
In vm_fault_copy_entry(), collect the code to initialize a newly
allocated dst_object in a single place.

5 years agoMFC r338993:
hselasky [Fri, 5 Oct 2018 07:49:01 +0000 (07:49 +0000)]
MFC r338993:
When multiple threads receive completion events in LibUSB, make sure there
is always a master polling thread by setting the "ctx_handler" field in the
context.  Otherwise the reception of completion events can stop.
This happens if event threads are created and destroyed at runtime.

Found by: Ludovic Rousseau <ludovic.rousseau+freebsd@gmail.com>
PR: 231742
Sponsored by: Mellanox Technologies

5 years agoMFC r338964:
kib [Thu, 4 Oct 2018 11:47:53 +0000 (11:47 +0000)]
MFC r338964:
Remove -m (update) from ldconfig -32 & -soft invocation on startup.

5 years agoMFC r338956:
kib [Wed, 3 Oct 2018 17:32:02 +0000 (17:32 +0000)]
MFC r338956:
Provide refobj context when doing libmap substitution inside
search_library_path().

5 years agoMFC r324953 (by trasz):
kib [Wed, 3 Oct 2018 17:30:59 +0000 (17:30 +0000)]
MFC r324953 (by trasz):
Remove unneeded calls to access(2) from rtld(1); just call open(2) instead.

5 years agoMFC r324952 (by trasz):
kib [Wed, 3 Oct 2018 17:29:30 +0000 (17:29 +0000)]
MFC r324952 (by trasz):
Replace lseek(2)/read(2) pair with pread(2).
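For readers unfamiliar with the substitution, a minimal sketch follows. It is not the rtld code; read_at is a hypothetical wrapper that shows what the single pread(2) call replaces.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to 'len' bytes at absolute offset 'off' without moving the
 * descriptor's file position.  Returns bytes read, or -1 on error.
 */
ssize_t
read_at(int fd, void *buf, size_t len, off_t off)
{
    /*
     * Replaces the two-step lseek(fd, off, SEEK_SET) followed by
     * read(fd, buf, len) with one system call.
     */
    return (pread(fd, buf, len, off));
}
```

Besides saving a system call, pread(2) leaves the file offset untouched, so callers sharing the descriptor are not affected.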

5 years agoMFC r324951 (by trasz):
kib [Wed, 3 Oct 2018 17:28:27 +0000 (17:28 +0000)]
MFC r324951 (by trasz):
Make find_library() conform to style(9).

5 years agoMFC r324950 (by trasz):
kib [Wed, 3 Oct 2018 17:27:20 +0000 (17:27 +0000)]
MFC r324950 (by trasz):
Reword the conditional.

5 years agoMFC r338040: diff(1): Refactor -B a little bit
kevans [Wed, 3 Oct 2018 17:21:45 +0000 (17:21 +0000)]
MFC r338040: diff(1): Refactor -B a little bit

Instead of doing a second pass to skip empty lines if we've specified -I, go
ahead and check both at once. The ignore criteria have been split out into
their own function to try and keep the logic cleaner.

5 years agoMFC r338646: dd(1): Correct padding in status=progress
kevans [Wed, 3 Oct 2018 17:20:30 +0000 (17:20 +0000)]
MFC r338646: dd(1): Correct padding in status=progress

Output padding is specified via outlen, which is set using the return value
of fprintf. Because it's printing that padding plus a trailing byte, it
grows by one each iteration rather than reflecting actual length.

Additionally, iec was sized improperly for scaling up similarly to si.
Fixing this revealed that the humanize_number(3) call to populate persec
was using the wrong width.

5 years agoMFC r338223, r338263: Missing bits from OptionalObsoleteFiles
kevans [Wed, 3 Oct 2018 17:19:04 +0000 (17:19 +0000)]
MFC r338223, r338263: Missing bits from OptionalObsoleteFiles

r338223:
Remove ZFS leftovers when WITHOUT_ZFS is set

Submitted by: Oliver Pinter
Differential Revision: https://reviews.freebsd.org/D16810

r338263:
Remove hyper-v leftovers when WITHOUT_HYPERV is set

hv_vss_daemon was missed.

5 years agoMFC r338219, r338250: FDT in Loader fixes
kevans [Wed, 3 Oct 2018 17:17:38 +0000 (17:17 +0000)]
MFC r338219, r338250: FDT in Loader fixes

r338219:
fdt_fixups: relocate the /chosen node after applying fixups

As indicated by the comment, any fixups applied (which might include
overlays) can invalidate the previously located node by adding nodes or
setting/adding properties. The later fdt_setprop of fixup-applied property
would then fail because of the bad/wrong node offset.

This would have generally been harmless, but potentially caused multiple
applications of fixups and caused a little bit of bloat.

r338250:
efiloader: Setup FDT in autoload to fix overlays clobbering kenv

manu found in the noted PR that overlays seemed to be clobbering the kenv
and killing the boot. Further inspection revealed that one can `fdt ls` at
the loader prompt for a successful boot, but autoboot breaks it.

In the autoboot case, first setup of FDT is happening in the middle of
bi_load, which triggers loading of the DTBO from /boot.

This is bad, bad, bad. Files in the loader are loaded somewhere in the
middle of the address space one after another. bi_load starts building the
needed kernel bootinfo immediately after the highest-addr loaded file. File
loads in the middle of bi_load suddenly clobber bootinfo and everything goes
off the rails.

The solution to this is to take advantage of arch_autoload to setup FDT
in efiloader compiled with LOADER_FDT_SUPPORT. This matches how it works in
ubldr land, and is how it should have worked when overlay support was added
to efiloader since fdt_setup_fdtp now has the potential to load files
(courtesy of fdt_platform_load_dtb).

5 years agoMFC r338039: diff(1): Implement -B/--ignore-blank-lines
kevans [Wed, 3 Oct 2018 17:16:18 +0000 (17:16 +0000)]
MFC r338039: diff(1): Implement -B/--ignore-blank-lines

As noted by cem in r338035, coccinelle invokes diff(1) with the -B flag.
This was not previously implemented here, so one was forced to create a link
for GNU diff to /usr/local/bin/diff

Implement the -B flag and add some primitive tests for it. It is implemented
in the same fashion that -I is implemented; each chunk's lines are scanned,
and if a non-blank line is encountered then the chunk will be output.
Otherwise, it's skipped.
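The per-chunk scan described above might look roughly like this. It is a sketch, not the diff(1) source: both function names are hypothetical, and "blank" is assumed here to mean a line that is empty apart from its newline.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if 'line' is empty apart from an optional trailing newline. */
static bool
line_is_blank(const char *line)
{
    return (line[0] == '\0' || (line[0] == '\n' && line[1] == '\0'));
}

/* A chunk may be skipped only if every one of its lines is blank. */
bool
chunk_is_ignorable(const char *const *lines, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!line_is_blank(lines[i]))
            return (false);     /* non-blank line: output the chunk */
    }
    return (true);
}
```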

5 years agoMFC r337964, r338232: dtc(1) updates
kevans [Wed, 3 Oct 2018 17:14:40 +0000 (17:14 +0000)]
MFC r337964, r338232: dtc(1) updates

r337964:
dtc(1): Update to 97d2d5715eeb45108cc60367fdf6bd5b2046b050

Notable fixes:
- Overlays may now be generated properly without -@
- /__local_fixups__ were not including unit address in their structure
- The error reporting a magic token was misleading, reporting
  "Bad magic token in header.  Got d00dfeed expected 0xd00dfeed"
  if the token was missing. This has been split out into a separate message.

r338232:
dtc(1): Update to 0892ec7; HACKING and implicit header fixes

Fixes courtesy of arichardson and jmg:
- HACKING was pointing to the wrong place
- Added headers were being relied on implicitly, but libstdc++ did not
  comply with the unspoken wishes of dtc.

5 years agoMFC r337567 (by mmacy):
mav [Wed, 3 Oct 2018 17:10:32 +0000 (17:10 +0000)]
MFC r337567 (by mmacy):
Performance optimization of AVL tree comparator functions

MFV:
commit ee36c709c3d5f7040e1bd11f5c75318aa03e789f
Author: Gvozden Neskovic <neskovic@gmail.com>
Date:   Sat Aug 27 20:12:53 2016 +0200

    perf: 2.75x faster ddt_entry_compare()
        The first 256 bits of ddt_key_t are a block checksum, which is expected
    to be close to random data. Hence, on average, comparison only needs to
    look at first few bytes of the keys. To reduce number of conditional
    jump instructions, the result is computed as: sign(memcmp(k1, k2)).

    Sign of an integer 'a' can be obtained as: `(0 < a) - (a < 0)` := {-1, 0, 1} ,
    which is computed efficiently.  Synthetic performance evaluation of
    original and new algorithm over 1G random keys on 2.6GHz Intel(R) Xeon(R)
    CPU E5-2660 v3:

    old     6.85789 s
    new     2.49089 s

    perf: 2.8x faster vdev_queue_offset_compare() and vdev_queue_timestamp_compare()
        Compute the result directly instead of using conditionals

    perf: zfs_range_compare()
        Speedup between 1.1x - 2.5x, depending on compiler version and
    optimization level.

    perf: spa_error_entry_compare()
        `bcmp()` is not suitable for comparator use. Use `memcmp()` instead.

    perf: 2.8x faster metaslab_compare() and metaslab_rangesize_compare()
    perf: 2.8x faster zil_bp_compare()
    perf: 2.8x faster mze_compare()
    perf: faster dbuf_compare()
    perf: faster compares in spa_misc
    perf: 2.8x faster layout_hash_compare()
    perf: 2.8x faster space_reftree_compare()
    perf: libzfs: faster avl tree comparators
    perf: guid_compare()
    perf: dsl_deadlist_compare()
    perf: perm_set_compare()
    perf: 2x faster range_tree_seg_compare()
    perf: faster unique_compare()
    perf: faster vdev_cache _compare()
    perf: faster vdev_uberblock_compare()
    perf: faster fuid _compare()
    perf: faster zfs_znode_hold_compare()

Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Signed-off-by: Richard Elling <richard.elling@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Closes #5033
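The sign expression quoted in the message above can be tried in isolation. The sketch below uses hypothetical function names and is not the ZFS code; it shows the same branch-free form applied to two integers and to a memcmp() result.

```c
#include <stdint.h>
#include <string.h>

/* Branch-free three-way comparison of two integers: -1, 0, or 1. */
int
cmp_u64(uint64_t a, uint64_t b)
{
    return ((a > b) - (a < b));
}

/* sign(memcmp(k1, k2)) normalized to -1, 0, or 1. */
int
cmp_key(const void *k1, const void *k2, size_t len)
{
    int r = memcmp(k1, k2, len);

    return ((0 < r) - (r < 0));
}
```

Each relational operator yields 0 or 1, so the subtraction can only produce -1, 0, or 1, which is the contract a tree comparator needs. memcmp() alone guarantees only the sign of its result, and bcmp() reports only equality, which is why neither is returned directly.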

5 years agoMFC r338869: MFV r338866: 9700 ZFS resilvered mirror does not balance reads
mav [Wed, 3 Oct 2018 15:36:36 +0000 (15:36 +0000)]
MFC r338869: MFV r338866: 9700 ZFS resilvered mirror does not balance reads

illumos/illumos-gate@82f63c3c2bf5e4378706e8dcfccf717d67371be9

Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author:     Jerry Jelinek <jerry.jelinek@joyent.com>

5 years agoMFC r337972: 9751 Allocation throttling misplacing ditto blocks
mav [Wed, 3 Oct 2018 15:35:27 +0000 (15:35 +0000)]
MFC r337972: 9751 Allocation throttling misplacing ditto blocks

Relax allocation throttling for ditto blocks.  Due to random imbalances
in allocation, throttling tends to push block copies to whichever vdev
looks slightly better at the moment.  A slightly less strict policy
improves both data security and, surprisingly, write performance, since
we don't need to touch extra metaslabs on each vdev to respect the min
distance.

Sponsored by:   iXsystems, Inc.

5 years agoMFC r337970: 9738 Fix third block copy allocations, broken at 9112.
mav [Wed, 3 Oct 2018 15:34:49 +0000 (15:34 +0000)]
MFC r337970: 9738 Fix third block copy allocations, broken at 9112.

Use METASLAB_WEIGHT_CLAIM weight to allocate tertiary blocks.
Previous use of METASLAB_WEIGHT_SECONDARY for that caused errors
later on metaslab_activate_allocator() call, leading to massive
load of unneeded metaslabs and write freezes.

Reviewed by:    Paul Dagnelie <pcd@delphix.com>

5 years agoMFC r337923: Make vfs.zfs.zio.dva_throttle_enabled sysctl writable.
mav [Wed, 3 Oct 2018 15:33:20 +0000 (15:33 +0000)]
MFC r337923: Make vfs.zfs.zio.dva_throttle_enabled sysctl writable.

I am not sure what I thought originally, but as far as I can see runtime
changes work fine, and the code even seems to be designed for this.

5 years agoMFC r337883: Add a couple of tunables/sysctls missed in r336949.
mav [Wed, 3 Oct 2018 15:32:42 +0000 (15:32 +0000)]
MFC r337883: Add a couple of tunables/sysctls missed in r336949.

5 years agoMFC r337870: Fix mismerge in r337196.
mav [Wed, 3 Oct 2018 15:31:44 +0000 (15:31 +0000)]
MFC r337870: Fix mismerge in r337196.

ZoL made the same mistake and fixed it with a separate commit, 863522b1f9:

dsl_scan_scrub_cb: don't double-account non-embedded blocks

We were doing count_block() twice inside this function, once
unconditionally at the beginning (intended to catch the embedded block
case) and once near the end after processing the block.

The double-accounting caused the "zpool scrub" progress statistics in
"zpool status" to climb from 0% to 200% instead of 0% to 100%, and
showed double the I/O rate it was actually seeing.

This was apparently a regression introduced in commit 00c405b4b5e8,
which was an incorrect port of this OpenZFS commit:

    https://github.com/openzfs/openzfs/commit/d8a447a7

Reviewed by: Thomas Caputi <tcaputi@datto.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Steven Noonan <steven@uplinklabs.net>
Closes #7720
Closes #7738

5 years agoMFC r337229: Reduce taskq and context-switch cost of zio pipe
mav [Wed, 3 Oct 2018 14:59:39 +0000 (14:59 +0000)]
MFC r337229: Reduce taskq and context-switch cost of zio pipe

When doing a read from disk, ZFS creates 3 ZIO's: a zio_null(), the
logical zio_read(), and then a physical zio. Currently, each of these
results in a separate taskq_dispatch(zio_execute).

On high-read-iops workloads, this causes a significant performance
impact. By processing all 3 ZIO's in a single taskq entry, we reduce the
overhead on taskq locking and context switching.  We accomplish this by
allowing zio_done() to return a "next zio to execute" to zio_execute().

This results in a ~12% performance increase for random reads, from
96,000 iops to 108,000 iops (with recordsize=8k, on SSD's).
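The "next zio to execute" idea can be shown with a toy model. Everything below is hypothetical (struct sketch_zio, the function names, the dispatch counter) and far simpler than the real zio pipeline; it only illustrates how returning the next item lets one dispatch drain a whole chain.

```c
#include <stddef.h>

struct sketch_zio {
    struct sketch_zio *parent;  /* item to run once this one is done */
    int executed;
};

static int dispatches;          /* counts simulated taskq dispatches */

/* Complete 'zio' and hand back the next item to run, if any. */
static struct sketch_zio *
sketch_done(struct sketch_zio *zio)
{
    zio->executed = 1;
    return (zio->parent);
}

/* One dispatch runs the whole chain instead of re-dispatching per item. */
void
sketch_execute(struct sketch_zio *zio)
{
    dispatches++;
    while (zio != NULL)
        zio = sketch_done(zio);
}

int
sketch_dispatch_count(void)
{
    return (dispatches);
}
```

In this model a physical item chained to a logical item chained to a null item completes in a single call to sketch_execute(), where a dispatch-per-item design would have needed three.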

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-59292
Closes #7736

zfsonlinux/zfs@62840030a7dceaee013ddbcc1eebcfc7922edf7c

5 years agoMFC r337227: MFV r337223:
mav [Wed, 3 Oct 2018 14:59:03 +0000 (14:59 +0000)]
MFC r337227: MFV r337223:
9580 Add a hash-table on top of nvlist to speed-up operations

illumos/illumos-gate@2ec7644aab2a726a64681fa66c6db8731b160de1

Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>

5 years agoMFC r337221: MFV r337220: 8375 Kernel memory leak in nvpair code
mav [Wed, 3 Oct 2018 14:58:28 +0000 (14:58 +0000)]
MFC r337221: MFV r337220: 8375 Kernel memory leak in nvpair code

illumos/illumos-gate@843c2111b160463f014d325560ad4b051711928e

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

5 years agoMFC r337219: MFV r337218: 7261 nvlist code should enforce name length limit
mav [Wed, 3 Oct 2018 14:57:54 +0000 (14:57 +0000)]
MFC r337219: MFV r337218: 7261 nvlist code should enforce name length limit

illumos/illumos-gate@48dd5e630c9b1773b7b10d08a3b90b6c9062d713

Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>

5 years agoMFC r337217: MFV r337216: 7263 deeply nested nvlist can overflow stack
mav [Wed, 3 Oct 2018 14:57:19 +0000 (14:57 +0000)]
MFC r337217: MFV r337216: 7263 deeply nested nvlist can overflow stack

illumos/illumos-gate@9ca527c3d3dfa7c8f304b34a9e03b5eddace838f

Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>

5 years agoMFC r337215: MFV 337214:
mav [Wed, 3 Oct 2018 14:56:38 +0000 (14:56 +0000)]
MFC r337215: MFV 337214:
9621 Make createtxg and guid properties public

illumos/illumos-gate@e8d4a73c868afb740396041be80ed2b141065e76

Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Yuri Pankov <yuripv@yuripv.net>
Approved by: Robert Mustacchi <rm@joyent.com>
Author:     Josh Paetzel <josh@tcbug.org>

5 years agoMFC r337213: MFV r337212:
mav [Wed, 3 Oct 2018 14:55:36 +0000 (14:55 +0000)]
MFC r337213: MFV r337212:
9465 ARC check for 'anon_size > arc_c/2' can stall the system

illumos/illumos-gate@abe1fd01ce5a83718c5a840daeab4abdaec1c104

Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author:     Don Brady <don.brady@delphix.com>

5 years agoMFC r337211: MFV r337210: 9577 remove zfs_dbuf_evict_key tsd
mav [Wed, 3 Oct 2018 14:54:48 +0000 (14:54 +0000)]
MFC r337211: MFV r337210: 9577 remove zfs_dbuf_evict_key tsd

The zfs_dbuf_evict_key TSD (thread-specific data) is not necessary - we can
instead pass a flag down in a few places to prevent recursive dbuf eviction.
Making this change has 3 benefits:

1. The code semantics are easier to understand.
2. On Linux, performance is improved, because creating/removing TSD values
(by setting to NULL vs non-NULL) is expensive, and we do it very often.
3. According to Nexenta, the current semantics can cause a deadlock when
concurrently calling dmu_objset_evict_dbufs() (which is rare today, but they
are working on a "parallel unmount" change that triggers this more easily)

illumos/illumos-gate@c2919acbea007fa95c709b60d073db9a24526e01

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author:     Matthew Ahrens <mahrens@delphix.com>

5 years agoMFC r337209:
mav [Wed, 3 Oct 2018 14:53:51 +0000 (14:53 +0000)]
MFC r337209:
MFV r337208: 9591 ms_shift can be incorrectly changed in MOS config for
indirect vdevs that have been historically expanded

illumos/illumos-gate@11f6a9680e013a7c9c57dc0b64d3e91e2eee1a6b

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <gwilson@zfsmail.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed by: Tim Chase <tim@chase2k.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author:     Serapheim Dimitropoulos <serapheim@delphix.com>

5 years agoMFC r337207: MFV r337206: 9338 moved dnode has incorrect dn_next_type
mav [Wed, 3 Oct 2018 14:53:07 +0000 (14:53 +0000)]
MFC r337207: MFV r337206: 9338 moved dnode has incorrect dn_next_type

illumos/illumos-gate@c7fbe46df966ea665df63b6e6071808987e839d1

Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author:     Matthew Ahrens <mahrens@delphix.com>

5 years agoMFC r337205:
mav [Wed, 3 Oct 2018 14:52:35 +0000 (14:52 +0000)]
MFC r337205:
MFV r337204: 9439 ZFS double-free due to failure to dirty indirect block

illumos/illumos-gate@99a19144e82244f3426f055cc73af8a937c0135c

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author:     Matthew Ahrens <mahrens@delphix.com>

5 years agoMFC r337202: MFV r337200:
mav [Wed, 3 Oct 2018 14:51:49 +0000 (14:51 +0000)]
MFC r337202: MFV r337200:
9438 Holes can lose birth time info if a block has a mix of birth times

Ultimately, the problem here is that when you truncate and write a file in
the same transaction group, the dbuf for the indirect block will be zeroed
out to deal with the truncation, and then written for the write. During
this process, we will lose hole birth time information for any holes in the
range. In the case where a dnode is being freed, we need to determine
whether the block should be converted to a higher-level hole in the zio
pipeline, and if so do it when the dnode is being synced out.

illumos/illumos-gate@738e2a3ce3b2579222d6855e7fe75b5bcfcddf8d

Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author:     Paul Dagnelie <pcd@delphix.com>

5 years agoMFC r337201: Fix build after r337196 mismerge.
mav [Wed, 3 Oct 2018 14:51:16 +0000 (14:51 +0000)]
MFC r337201: Fix build after r337196 mismerge.

5 years agoMFC r337198: MFV r337197: 9456 ztest failure in zil_commit_waiter_timeout
mav [Wed, 3 Oct 2018 14:50:40 +0000 (14:50 +0000)]
MFC r337198: MFV r337197: 9456 ztest failure in zil_commit_waiter_timeout

illumos/illumos-gate@b6031810da58df96413bf76e068638fcab1f228a

Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Approved by: Matt Ahrens <mahrens@delphix.com>
Author:     Prakash Surya <prakash.surya@delphix.com>

5 years agoMFC r337196: MFV r337195: 9454 ::zfs_blkstats should count embedded blocks
mav [Wed, 3 Oct 2018 14:50:06 +0000 (14:50 +0000)]
MFC r337196: MFV r337195: 9454 ::zfs_blkstats should count embedded blocks

illumos/illumos-gate@dec267e7ea9828898b1c64462daa6636c4ef5e29

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author:     Matthew Ahrens <mahrens@delphix.com>

5 years agoMFC r337194: MFV r337193:
mav [Wed, 3 Oct 2018 14:49:32 +0000 (14:49 +0000)]
MFC r337194: MFV r337193:
9424 ztest failure: "unprotected error in call to Lua API (Invalid value type 'function' for key 'error')"

illumos/illumos-gate@fe3ba4d1227d8746116ece7240682b13595c3142

Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Don Brady <don.brady@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author:     Matthew Ahrens <mahrens@delphix.com>

5 years agoMFC r337191:
mav [Wed, 3 Oct 2018 14:48:55 +0000 (14:48 +0000)]
MFC r337191:
MFV r337190: 9486 reduce memory used by device removal on fragmented pools

In the most fragmented real-world cases, this reduces memory used by the
mapping from ~1GB to ~50MB of RAM per 1TB of storage removed. Less
fragmented cases will typically also see around 50-100MB of RAM per 1TB
of storage.

illumos/illumos-gate@cfd63e1b1bcf7ba4bf72f55ddbd87ce008d2986d

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Tim Chase <tim@chase2k.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author:     Matthew Ahrens <mahrens@delphix.com>

5 years agoMFC r337185:
mav [Wed, 3 Oct 2018 14:48:17 +0000 (14:48 +0000)]
MFC r337185:
MFV r337184: 9457 libzfs_import.c:add_config() has a memory leak

A memory leak occurs on lines 209 and 213 because the config is not freed
in the error case.  The interface to add_config() seems less than ideal -
it would be better if it copied any data necessary from the config and the
caller freed it.

illumos/illumos-gate@ddfe901b12348d31c500fb57f9174e88860a4061

Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author:     sara hartse <sara.hartse@delphix.com>

5 years agoMFC r337183:
mav [Wed, 3 Oct 2018 14:47:29 +0000 (14:47 +0000)]
MFC r337183:
MFV r337182: 9330 stack overflow when creating a deeply nested dataset

Datasets that are deeply nested (~100 levels) are impractical, so newly
created datasets are now limited to 50 levels. Existing datasets should
work without a problem.
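
The limit described above can be pictured with a minimal sketch. It is
not the code from the commit: the function names are hypothetical, and
counting every '/'-separated component (pool included) as one level is
an assumption made here for illustration.

```python
# Illustrative only: names and the depth-counting rule are assumptions.
MAX_NESTING = 50  # limit stated in the commit message


def dataset_depth(name: str) -> int:
    """Count the '/'-separated components of a dataset name.

    Anything after '@' (snapshot) or '#' (bookmark) is ignored.
    """
    for sep in ("@", "#"):
        name = name.split(sep, 1)[0]
    return len(name.split("/"))


def may_create(name: str) -> bool:
    """Return True if a new dataset with this name stays within the limit."""
    return dataset_depth(name) <= MAX_NESTING
```

Only creation is checked in this sketch, which mirrors the statement that
existing datasets keep working.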

illumos/illumos-gate@5ac95da7d61660aa299c287a39277cb0372be959

Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author:     Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>

5 years agoMFC r337181: 9539 Make zvol operations use _by_dnode routines
mav [Wed, 3 Oct 2018 14:46:25 +0000 (14:46 +0000)]
MFC r337181: 9539 Make zvol operations use _by_dnode routines

Continues what was started in "7801 add more by-dnode routines" by fully
converting zvols to avoid unnecessary dnode_hold() calls. This saves a
small amount of CPU time and slightly improves latencies of operations
on zvols.

illumos/illumos-gate@8dfe5547fbf0979fc1065a8b6fddc1e940a7cf4f

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Rick McNeal <rick.mcneal@nexenta.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author:     Richard Yao <richard.yao@prophetstor.com>

5 years agoMFC r337179: 9523 Large alloc in zdb can cause trouble
mav [Wed, 3 Oct 2018 14:45:48 +0000 (14:45 +0000)]
MFC r337179: 9523 Large alloc in zdb can cause trouble

16MB alloc in zdb_embedded_block() can cause cores in certain situations
(clang, gcc55).

OsX commit: https://github.com/openzfsonosx/zfs/commit/ced236a5da6e72ea7bf6d2919fe14e17cffe10f1
FreeBSD commit: https://svnweb.freebsd.org/base?view=revision&revision=326150
illumos/illumos-gate@03a4c2f4bfaca30115963b76445279b36468a614

Reviewed by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author:     Jorgen Lundman <lundman@lundman.net>

This is an update for r326150 (by avg), where this change comes from.

5 years agoMFC r337177:
mav [Wed, 3 Oct 2018 14:44:16 +0000 (14:44 +0000)]
MFC r337177:
MFV r337175: 9487 Free objects when receiving full stream as clone

All objects after the last written or freed object are not supposed to
exist after receiving the stream. We should free them accordingly, as if
a freeobjects record for them had been included in the stream.

zfsonlinux/zfs@48fbb9ddbf2281911560dfbc2821aa8b74127315
illumos/illumos-gate@7864b8192b8d30471fa2240466d516292e5765b8

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author:     Paul Dagnelie <pcd@delphix.com>

5 years agoMFC r337172, MFV r337171:
mav [Wed, 3 Oct 2018 14:43:17 +0000 (14:43 +0000)]
MFC r337172, MFV r337171:
9464 txg_kick() fails to see that we are quiescing, forcing transactions
to their next stages without letting them accumulate changes

Ideally we would like txg_kick() to get triggered only when we are sure
that we are not syncing AND not quiescing any txg. This way we can kick
an open TXG to the quiescing state when we are sure that there is nothing
going on and we would benefit from the different states running
concurrently.

illumos/illumos-gate@fa41d87de9ec9000964c605eb01d6dc19e4a1abe

Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Approved by: Dan McDonald <danmcd@joyent.com>
Author:     Serapheim Dimitropoulos <serapheim@delphix.com>

5 years agoMFC r338947:
ae [Wed, 3 Oct 2018 12:47:54 +0000 (12:47 +0000)]
MFC r338947:
  Add "src-ip" or "dst-ip" keyword to the output, when we are printing the
  rest of rule options.

  Reported by: lev

5 years agoMFC r338955:
kib [Wed, 3 Oct 2018 11:34:28 +0000 (11:34 +0000)]
MFC r338955:
When doing lm_add(), check for duplicates.

5 years agoMFC r337169: MFV r337167: 9442 decrease indirect block size of spacemaps
mav [Wed, 3 Oct 2018 03:14:40 +0000 (03:14 +0000)]
MFC r337169: MFV r337167: 9442 decrease indirect block size of spacemaps

Updates to indirect blocks of spacemaps can contribute significantly to
write inflation.  Therefore we want to reduce the indirect block size of
spacemaps from 128K to 16K.
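
As rough arithmetic for the size change above: a ZFS block pointer is
128 bytes, so the indirect block size fixes how many pointers share one
block. The sketch below assumes, for illustration only, that changing a
single pointer rewrites the whole uncompressed indirect block; the
function names are hypothetical.

```python
BLKPTR_SIZE = 128  # bytes per ZFS block pointer (blkptr_t)


def pointers_per_indirect(block_size: int) -> int:
    """Number of block pointers that fit in one indirect block."""
    return block_size // BLKPTR_SIZE


def rewrite_ratio(old_size: int, new_size: int) -> float:
    """How many times less data is rewritten per dirtied indirect block,
    assuming the whole (uncompressed) block is written out each time."""
    return old_size / new_size


OLD_SIZE = 128 * 1024  # 128K, the previous indirect block size
NEW_SIZE = 16 * 1024   # 16K, the size chosen by this change
```

Under those assumptions, `rewrite_ratio(OLD_SIZE, NEW_SIZE)` gives the
per-update reduction in indirect-block bytes written.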

illumos/illumos-gate@221813c13b43ef48330b03725e00edee85108cf1

Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Albert Lee <trisk@forkgnu.org>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Dan McDonald <danmcd@joyent.com>
Author:     Matthew Ahrens <mahrens@delphix.com>

5 years agoMFC r337163: MFV r337161: 9512 zfs remap poolname@snapname coredumps
mav [Wed, 3 Oct 2018 03:13:53 +0000 (03:13 +0000)]
MFC r337163: MFV r337161: 9512 zfs remap poolname@snapname coredumps

Only filesystems and volumes are valid "zfs remap" parameters: when passed
a snapshot name zfs_remap_indirects() does not handle the EINVAL returned
from libzfs_core, which results in failing an assertion and consequently
crashing.

illumos/illumos-gate@0b2e8253986c5c761129b58cfdac46d204903de1

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Wren Kennedy <john.kennedy@delphix.com>
Reviewed by: Sara Hartse <sara.hartse@delphix.com>
Approved by: Matt Ahrens <mahrens@delphix.com>
Author:     loli10K <ezomori.nozomu@gmail.com>

5 years agoMFC r337160:
mav [Wed, 3 Oct 2018 02:52:47 +0000 (02:52 +0000)]
MFC r337160:
Do not blindly include illumos kernel headers instead of user-space ones.
This is not needed now, and I doubt it ever helped much; it created more
confusion than good.

5 years agoMFC r337063: MFV r316926:
mav [Wed, 3 Oct 2018 02:51:13 +0000 (02:51 +0000)]
MFC r337063: MFV r316926:
7955 libshare needs to initialize only those datasets being modified by the consumer

illumos/illumos-gate@8a981c3356b194b3b5c0ae9276a9cc31cd2f93a3
https://github.com/illumos/illumos-gate/commit/8a981c3356b194b3b5c0ae9276a9cc31cd2f93a3

https://www.illumos.org/issues/7955
  Libshare currently initializes all available filesystems when doing any
  libshare operation. This requires iterating through all the filesystems
  multiple times, which is a huge performance problem for sharing and
  unsharing operations.

Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@gmail.com>
Approved by: Gordon Ross <gordon.w.ross@gmail.com>
Author: Daniel Hoffman <dj.hoffman@delphix.com>

For FreeBSD this is practically a NOP, just a diff reduction.

5 years agoMFC r337030: MFV r337029:
mav [Wed, 3 Oct 2018 02:50:07 +0000 (02:50 +0000)]
MFC r337030: MFV r337029:
9426 metaslab size can exceed offset addressable by spacemap

Metaslab size can exceed the offset addressable by a spacemap. The vdev can
address up to 2^63 * SPA_MINBLOCKSIZE (512). A metaslab can address up to
2^47 * 2^vdev_ashift. Therefore we may need to increase the number of
metaslabs so that the maximum metaslab size is capped at the amount that
can be addressed by the spacemap. This should happen in
vdev_metaslab_set_size().
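
The cap described above can be worked through with a short sketch. It
uses only the 2^47 * 2^vdev_ashift figure from the message; the function
names are hypothetical, and this is not how vdev_metaslab_set_size()
itself is written.

```python
SPACEMAP_OFFSET_BITS = 47  # from the message: 2^47 * 2^vdev_ashift


def max_metaslab_size(ashift: int) -> int:
    """Largest metaslab, in bytes, that a spacemap can address."""
    return (1 << SPACEMAP_OFFSET_BITS) << ashift


def min_metaslab_count(asize: int, ashift: int) -> int:
    """Fewest metaslabs needed so that none exceeds the cap.

    asize is the allocatable size of the vdev in bytes.
    """
    cap = max_metaslab_size(ashift)
    return (asize + cap - 1) // cap  # ceiling division
```

For example, with ashift 9 the cap is 2^56 bytes, so a hypothetical
2^60-byte vdev would need at least 16 metaslabs.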

illumos/illumos-gate@b4bf0cf0458759c67920a031021a9d96cd683cfe

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author:     Don Brady <don.brady@delphix.com>

5 years agoMFC r337028: MFV r337027:
mav [Wed, 3 Oct 2018 02:49:24 +0000 (02:49 +0000)]
MFC r337028: MFV r337027:
9328 zap code can take advantage of c99
9329 panic in zap_leaf_lookup() due to concurrent zapification

illumos/illumos-gate@bf26014c5541b6119f34e0d95294b7f2eb105ac2

Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author:     Matthew Ahrens <mahrens@delphix.com>

5 years agoMFC r337025: MFV r337022:
mav [Wed, 3 Oct 2018 02:48:31 +0000 (02:48 +0000)]
MFC r337025: MFV r337022:
9403 assertion failed in arc_buf_destroy() when concurrently reading block with checksum error

This assertion (VERIFY) failure was reported when reading a block. Turns out
the problem is that if we get an i/o error (ECKSUM in this case), and there
are multiple concurrent ARC reads of the same block (from different clones),
then the ARC will put multiple buf's on the same ANON hdr, which isn't
supposed to happen, and then causes a panic when we try to arc_buf_destroy()
the buf.

illumos/illumos-gate@fa98e487a9619b7902f218663be219e787a57dad

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Matt Ahrens <mahrens@delphix.com>
Author:     Matthew Ahrens <mahrens@delphix.com>

5 years agoMFC r337021: MFV r337020: 9443 panic when scrub a v10 pool
mav [Wed, 3 Oct 2018 02:19:17 +0000 (02:19 +0000)]
MFC r337021: MFV r337020: 9443 panic when scrub a v10 pool

illumos/illumos-gate@bb1f424574ac8e08069d0ba993c2a41ffe796794

Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Dan McDonald <danmcd@joyent.com>
Author:     Matthew Ahrens <mahrens@delphix.com>

5 years agoMFC r337017: MFV r337014:
mav [Wed, 3 Oct 2018 02:18:16 +0000 (02:18 +0000)]
MFC r337017: MFV r337014:
9421 zdb should detect and print out the number of "leaked" objects
9422 zfs diff and zdb should explicitly mark objects that are on the deleted queue

illumos/illumos-gate@20b5dafb425396adaebd0267d29e1026fc4dc413

Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Matt Ahrens <mahrens@delphix.com>
Author:     Paul Dagnelie <pcd@delphix.com>

5 years agoMFC r337007: MFV r336991, r337001:
mav [Wed, 3 Oct 2018 02:16:22 +0000 (02:16 +0000)]
MFC r337007: MFV r336991, r337001:
9102 zfs should be able to initialize storage devices

The first access to a disk block can incur a performance penalty on some
platforms (e.g. AWS's EBS, VMware VMDKs). Therefore it is recommended that
volumes be "thick provisioned", where supported by the platform (VMware).
Thick provisioning is time consuming and often is ignored. If the thick
provision step is omitted, customers will see suboptimal performance until
we have written to all parts of the LUN. ZFS should be able to initialize
any unused storage to remove any first-write penalty that exists.

illumos/illumos-gate@094e47e980b0796b94b1b8f51f462a64d246e516

Reviewed by: John Wren Kennedy <john.kennedy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author:     George Wilson <george.wilson@delphix.com>

5 years agoMFC r336961:
mav [Wed, 3 Oct 2018 02:14:38 +0000 (02:14 +0000)]
MFC r336961:
MFV r336960: 9256 zfs send space estimation off by > 10% on some datasets

illumos/illumos-gate@df477c0afa111b5205c872dab36dbfde391656de

Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author:     Paul Dagnelie <pcd@delphix.com>