https://www.illumos.org/issues/10406
The large dnode changes from 8423 caused problems in zfs recv for a legacy
stream. This manifests when attempting to mount the received stream, but the
problem is in the receive code. We missed the following commit from ZoL which
fixes this.
commit da2feb42fb5c7a8c1e1cc67f7a880da9d8e97bc2
Author: Tom Caputi <tcaputi@datto.com>
Date: Thu Jun 28 17:55:11 2018 -0400
Fix 'zfs recv' of non large_dnode send streams
Currently, there is a bug where older send streams without the
DMU_BACKUP_FEATURE_LARGE_DNODE flag are not handled correctly.
The code in receive_object() fails to handle cases where
drro->drr_dn_slots is set to 0, which is always the case when the
sending code does not support this feature flag. This patch fixes
the issue by ensuring that that a value of 0 is treated as
DNODE_MIN_SLOTS.
https://www.illumos.org/issues/10406
The large dnode changes from 8423 caused problems in zfs recv for a legacy
stream. This manifests when attempting to mount the received stream, but the
problem is in the receive code. We missed the following commit from ZoL which
fixes this.
commit da2feb42fb5c7a8c1e1cc67f7a880da9d8e97bc2
Author: Tom Caputi <tcaputi@datto.com>
Date: Thu Jun 28 17:55:11 2018 -0400
Fix 'zfs recv' of non large_dnode send streams
Currently, there is a bug where older send streams without the
DMU_BACKUP_FEATURE_LARGE_DNODE flag are not handled correctly.
The code in receive_object() fails to handle cases where
drro->drr_dn_slots is set to 0, which is always the case when the
sending code does not support this feature flag. This patch fixes
the issue by ensuring that that a value of 0 is treated as
DNODE_MIN_SLOTS.
ZoL issues:
Improved dnode allocation #6564
Clean up large dnode code #6262
Fix dnode_hold() freeing dnode behavior #8172
Fix dnode allocation race #6414, #6439
Partial: Raw sends must be able to decrease nlevels #6821, #6864
Remove unnecessary txg syncs from receive_object() Closes #7197
This updates FreeBSD large_dnode code (that was imported from ZoL) to a version
that was committed to illumos. It has some cleanups, improvements and fixes
comparing to what we have in FreeBSD now. I think that the most significant
update is 8199 multi-threaded dmu_object_alloc().
Ed Maste [Thu, 15 Aug 2019 14:54:18 +0000 (14:54 +0000)]
stand: remove CLANG_NO_IAS from boot2
Many components under stand/ had CLANG_NO_IAS added when Clang's
Integrated Assembler (IAS) did not handle .codeNN directives. Clang
gained support quite some time ago, and we can now build stand/ with
IAS. In most cases IAS- and GNU as-assembled boot components were
identical, and CLANG_NO_IAS was already removed from other components.
Clang IAS produces different output for some components, including
boot2, so CLANG_NO_IAS was not previously removed for those.
In the case of boot2 the difference is that IAS produces a larger
encoding for one instruction (the testb at the beginning of read).
GNU as produces:
2e f6 06 b0 08 80
while IAS includes an address size override prefix (67) and produces:
2e 67 f6 05 b3 08 00 00 80
This results in three fewer NOPs elsewhere in boot2 but no functional
change, so switch to IAS for boot2.
(We can separately pursue improved 16-bit IAS support with the LLVM
developers.)
Alexander Motin [Thu, 15 Aug 2019 14:11:11 +0000 (14:11 +0000)]
Implement new methods for Intel and PLX NTB.
This restores parity with AMD NTB driver. Though without any drivers
supporting more then one peer and respective KPI modification to pass
peer index to most of the calls this addition is pretty useless now.
Some files got their contented duplicated in r345409. Some mistakes where
fixed in r345430. The only file that was left with a duplicated content was
CVE-2019-5598.py.
Justin Hibbits [Thu, 15 Aug 2019 03:42:15 +0000 (03:42 +0000)]
powerpc/pmap: Enable UMA_MD_SMALL_ALLOC for 64-bit booke
The only thing blocking UMA_MD_SMALL_ALLOC from working on 64-bit booke
powerpc was a missing check in pmap_kextract(). Adding DMAP handling into
pmap_kextract(), we can now use UMA_MD_SMALL_ALLOC. This should improve
performance and stability a bit, since DMAP is always mapped in TLB1, so
this relieves pressure on TLB0.
Doug Moore [Thu, 15 Aug 2019 02:30:44 +0000 (02:30 +0000)]
swap_pager.c reserves 2 blocks for a bsd label. Change that 2 to the
expression howmany(BBSIZE, PAGE_SIZE), where BBSIZE is the size of the
boot block area. That can be less than 2 if PAGE_SIZE is big.
swapon(8) has an option to trim (delete) all the blocks of a device at
startup. However, if the first of those blocks is a bsd label, then
trimming those blocks is destructive. Change swapon to leave the
first BBSIZE bytes untrimmed.
Update manual pages to reflect changes in how swapon and how it may be
used, espeically in association with savecore.
Conrad Meyer [Thu, 15 Aug 2019 00:39:53 +0000 (00:39 +0000)]
random(4): Remove "EXPERIMENTAL" verbiage from concurrent operation
No functional change.
Add a verbose comment giving an example side-by-side comparison between the
prior and Concurrent modes of Fortuna, and why one should believe they
produce the same result.
The intent is to flip this on by default prior to 13.0, so testing is
encouraged. To enable, add the following to loader.conf:
kern.random.fortuna.concurrent_read="1"
The intent is also to flip the default blockcipher to the faster Chacha-20
prior to 13.0, so testing of that mode of operation is also appreciated.
To enable, add the following to loader.conf:
Warner Losh [Wed, 14 Aug 2019 20:58:23 +0000 (20:58 +0000)]
The bxe driver, QLogic NetXtreme II Ethernet 10Gb PCIe adapter driver, is x86
specific, and only builds there. Likewise the module is built there. Move it to
the x86-only files.x86.
Reviewed by: jhb (verbal OK on irc)
Differential Revision: https://reviews.freebsd.org/D21248
Warner Losh [Wed, 14 Aug 2019 20:58:17 +0000 (20:58 +0000)]
The ACPI parts are identical between i386 and amd64
Apart from one MD file, ACPI is a x86 implementation, not specific to either
i386 or amd64, so put it into files.x86. Other architectures include fewer
files for the same options, so it can't move into the MI files file.
Reviewed by: jhb (verbal OK on irc)
Differential Revision: https://reviews.freebsd.org/D21248
Warner Losh [Wed, 14 Aug 2019 20:58:01 +0000 (20:58 +0000)]
Move all the hp* drivers too files.x86
The HPT drivers are all x86 only. Move them to files.x86. Because of the way we
run uudecode, we can use $M instead of needing entries for them in separate
files.
Reviewed by: jhb (verbal OK on irc)
Differential Revision: https://reviews.freebsd.org/D21248
Warner Losh [Wed, 14 Aug 2019 20:57:54 +0000 (20:57 +0000)]
Move the identical x86 lines to files.x86
Move all the identical x86 lines to files.x86. The non-identical ones should be
unified and moved as well, but that would require additional changes that would
need a more careful review and may not be MFCable, so I'll do them
separately. I'll delete the mildly snarky comment when things are unified.
Reviewed by: jhb (verbal OK on irc)
Differential Revision: https://reviews.freebsd.org/D21248
Alan Somers [Wed, 14 Aug 2019 20:45:00 +0000 (20:45 +0000)]
fusefs: Fix the size of fuse_getattr_in
In FUSE protocol 7.9, the size of the FUSE_GETATTR request has increased.
However, the fusefs driver is currently not sending the additional fields.
In our implementation, the additional fields are always zero, so I there
haven't been any test failures until now. But fusefs-lkl requires the
request's length to be correct.
Fix this bug, and also enhance the test suite to catch similar bugs.
PR: 239830
MFC after: 2 weeks
MFC-With: 350665
Sponsored by: The FreeBSD Foundation
Alan Somers [Wed, 14 Aug 2019 18:04:04 +0000 (18:04 +0000)]
fusefs: fix intermittency in the default_permissions.Unlink.ok test
The test needs to expect a FUSE_FORGET operation. Most of the time the test
would pass anyway, because by chance FUSE_FORGET would arrive after the
unmount.
MFC after: 2 weeks
MFC-With: 350665
Sponsored by: The FreeBSD Foundation
Alexander Motin [Wed, 14 Aug 2019 16:12:03 +0000 (16:12 +0000)]
Report NOIOB and NPWG fields as stripe size.
Namespace Optimal I/O Boundary field added in NVMe 1.3 and Namespace
Preferred Write Granularity added in 1.4 allow upper layers to align
I/Os for improved SSD performance and endurance.
I don't have hardware reportig those yet, but NPWG could probably be
reported by bhyve.
Conrad Meyer [Tue, 13 Aug 2019 23:32:56 +0000 (23:32 +0000)]
geom_uzip(4), mkuzip(8): Add Zstd image mode
The Zstd format bumps the CLOOP major number to 4 to avoid incompatibility
with older systems. Support in geom_uzip(4) is conditional on the ZSTDIO
kernel option, which is enabled in amd64 GENERIC, but not all in-tree
configurations.
mkuzip(8) was modified slightly to always initialize the nblocks + 1'th
offset in the CLOOP file format. Previously, it was only initialized in the
case where the final compressed block happened to be unaligned w.r.t.
DEV_BSIZE. The "Fake" last+1 block change in r298619 means that the final
compressed block's 'blen' was never correct unless the compressed uzip image
happened to be BSIZE-aligned. This happened in about 1 out of every 512
cases. The zlib and lzma decompressors are probably tolerant of extra trash
following the frame they were told to decode, but Zstd complains that the
input size is incorrect.
Correspondingly, geom_uzip(4) was modified slightly to avoid trashing the
nblocks + 1'th offset when it is known to be initialized to a good value.
This corrects the calculated final real cluster compressed length to match
that printed by mkuzip(8).
mkuzip(8) was refactored somewhat to reduce code duplication and increase
ease of adding other compression formats.
* Input block size validation was pulled out of individual compression
init routines into main().
* Init routines now validate a user-provided compression level or select
an algorithm-specific default, if none was provided.
* A new interface for calculating the maximal compressed size of an
incompressible input block was added for each driver. The generic code
uses it to validate against MAXPHYS as well as to allocate compression
result buffers in the generic code.
* Algorithm selection is now driven by a table lookup, to increase ease of
adding other formats in the future.
mkuzip(8) gained the ability to explicitly specify a compression level with
'-C'. The prior defaults -- 9 for zlib and 6 for lzma -- are maintained.
The new zstd default is 9, to match zlib.
Rather than select lzma or zlib with '-L' or its absense, respectively, a
new argument '-A <algorithm>' is provided to select 'zlib', 'lzma', or
'zstd'. '-L' is considered deprecated, but will probably never be removed.
All of the new features were documented in mkuzip.8; the page was also
cleaned up slightly.
John Baldwin [Tue, 13 Aug 2019 21:15:59 +0000 (21:15 +0000)]
Fix build with DRM and INVARIANTS enabled.
The DRM drivers use the lockdep assertion macros with spinlock_t locks
which are backed by mutexes, not sx locks. This causes compile
failures since you can't use sx_assert with a mutex. Instead, change
the lockdep macros to use lock_class methods. This works by assuming
that each LinuxKPI locking primitive embeds a FreeBSD lock as its
first structure and uses a cast to get to the underlying 'struct
lock_object'.
Ed Maste [Tue, 13 Aug 2019 15:41:36 +0000 (15:41 +0000)]
Remove some more leftover rlogin man page xrefs
rcmds were removed in r32435 and these three man pages can trivially
drop the references.
There's still a reference in pts.4 because it describes a mode
(TIOCPKT_NOSTOP), and only lists rlogin/rlogind as examples of programs
that use that mode. To update later.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Save ip_ttl value and restore it after checksum calculation.
Since ipvoly is used for checksum calculation, part of original IP
header is zeroed. This part includes ip_ttl field, that can be used
later in IP_MINTTL socket option handling.
Randall Stewart [Tue, 13 Aug 2019 12:41:15 +0000 (12:41 +0000)]
Place back in the dependency on HPTS via module depends versus
a fatal error in compiling. This was taken out by mistake
when I mis-merged from the 18q22p2 sources of rack in NF. Opps.
Ian Lepore [Tue, 13 Aug 2019 03:49:53 +0000 (03:49 +0000)]
Fix the driver name in ads111x.4, and hook the manpage up to the build.
The driver was originally written with the name ads1115, but at the last
minute it got renamed to ads111x to reflect its support for many related
chips, but I forgot to update the manpage to match the renaming before
committing it all.
Pedro F. Giffuni [Tue, 13 Aug 2019 01:24:43 +0000 (01:24 +0000)]
Add deprecation notice to snd_ds1(4).
As suggested in:
https://wiki.freebsd.org/WhatsGoing/FreeBSD13
We will be dropping the snd_ds1 driver. The driver is known to be buggy
and no one has been working on it for years now.
Users of old Yamaha cards may have luck with the OSS drivers instead.
MFC after: 3 days
Diferential Revision: https://reviews.freebsd.org/D21138
Warner Losh [Mon, 12 Aug 2019 23:25:21 +0000 (23:25 +0000)]
Fix powerpc LINT build
tcpratelimit isn't supported as there's now atomic_add_64, so add it to the exclusion list
Add comment for why PPC_PROBE_CHIPSET is on the list
Remove UKBD_DFLT_KEYMAP now that ukbd works on all platforms.
Warner Losh [Mon, 12 Aug 2019 22:58:56 +0000 (22:58 +0000)]
Create files.x86
files.x86 is for the parts of the system that are common to both i386 and amd64
due too their nature. First up, to get the ball rolling, is fdc, the floppy disk
support. It works only on amd64 and i386 these days, and that's unlikely to
change.
Reviewed by: jhb, cem (earlier versrions)
Differential Revision: https://reviews.freebsd.org/D21210
Warner Losh [Mon, 12 Aug 2019 22:58:44 +0000 (22:58 +0000)]
Move sc out of the global file
x86 needs sc, as does sparc64. powerpc doesn't use it by default, but some old
powermac notebooks do not work with vt yet for reasons unknonw. Even so, I've
removed it from powerpc LINT. It's not in daily use there, and the intent is to
100% switch to vt now that it works for that platform to limit support burden.
All the other architectures omit some or all of the screen savers from their
lint config. Move them to the x86 NOTES files and remove the exclusions. This
reduces slightly the number of savers sparc64 compiles, but since they are in
GENERIC, the overage is adequate and if someone reaelly wants to sort them out
in sparc64 they can sweat the details and the testing.
Increase YPMAXRECORD to 16M to be compatible with Linux.
Since YP protocol definition uses the constant to declare
variable-size opaque byte strings, the change should be binary
compatible with existing installations which do not expose keys or
values larger than 1024 bytes.
All uses of local variables with YPMAXRECORD sizes were removed to
avoid insane stack use. On the other hand, variables with static
lifetime should be fine and only result in increased VA use.
Glibc made same change, increasing the allowed length for keys and
values in YP to 16M, in 2013.
Leandro Lupori [Mon, 12 Aug 2019 12:51:47 +0000 (12:51 +0000)]
[PPC64] Save FPU registers before enabling VSX
Fixed trap handler logic, in order to make it save FPU registers,
if FPU is enabled, before enabling VSX. Without this change, FPU
register contents were being lost when set before VSX was enabled.
https://www.illumos.org/issues/6585
In any pool without the extensible dataset feature flag already enabled,
creating a dataset with dedup set to use one of the new checksums would result
in the following panic as soon as any data was added:
panic[cpu0]/thread=ffffff0006761c40: feature_get_refcount(spa, feature,
&refcount) != 48 (0x30 != 0x30), file: ../../common/fs/zfs/zfeature.c line 390
ffffff0006761830fffffffffba8fbdd () ffffff0006761890 zfs:feature_do_action+11a () ffffff00067618c0 zfs:spa_feature_incr+1e () ffffff0006761920 zfs:dmu_object_zapify+b7 () ffffff00067619b0 zfs:dsl_dataset_activate_feature+97 () ffffff0006761a20 zfs:dsl_dataset_sync+ba () ffffff0006761ab0 zfs:dsl_pool_sync+153 () ffffff0006761b70 zfs:spa_sync+26e () ffffff0006761c20 zfs:txg_sync_thread+227 () ffffff0006761c30 unix:thread_start+8 ()
Inspection showed that feature->fi_feature was 7, which is the value of
SPA_FEATURE_EXTENSIBLE_DATASET in the spa_feature enum.
Testing shows that the panic can be prevented by explicitly setting extensible
dataset as a dependency for the sha512, edonr, and skein feature flags.
Alternatively, the new checksums code could possibly be changed to obviate the
need for the dependency.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Richard Laager <rlaager@wiktel.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: ilovezfs <ilovezfs@icloud.com>
https://www.illumos.org/issues/6585
In any pool without the extensible dataset feature flag already enabled,
creating a dataset with dedup set to use one of the new checksums would result
in the following panic as soon as any data was added:
panic[cpu0]/thread=ffffff0006761c40: feature_get_refcount(spa, feature,
&refcount) != 48 (0x30 != 0x30), file: ../../common/fs/zfs/zfeature.c line 390
ffffff0006761830fffffffffba8fbdd () ffffff0006761890 zfs:feature_do_action+11a () ffffff00067618c0 zfs:spa_feature_incr+1e () ffffff0006761920 zfs:dmu_object_zapify+b7 () ffffff00067619b0 zfs:dsl_dataset_activate_feature+97 () ffffff0006761a20 zfs:dsl_dataset_sync+ba () ffffff0006761ab0 zfs:dsl_pool_sync+153 () ffffff0006761b70 zfs:spa_sync+26e () ffffff0006761c20 zfs:txg_sync_thread+227 () ffffff0006761c30 unix:thread_start+8 ()
Inspection showed that feature->fi_feature was 7, which is the value of
SPA_FEATURE_EXTENSIBLE_DATASET in the spa_feature enum.
Testing shows that the panic can be prevented by explicitly setting extensible
dataset as a dependency for the sha512, edonr, and skein feature flags.
Alternatively, the new checksums code could possibly be changed to obviate the
need for the dependency.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Richard Laager <rlaager@wiktel.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: ilovezfs <ilovezfs@icloud.com>
Andriy Gapon [Mon, 12 Aug 2019 10:30:00 +0000 (10:30 +0000)]
a stop gap fix for a race between dnode_hold and dnode_sync_free
The race was introduced in r337669, the large dnode feature import from
ZoL. The problem was debugged by ZoL developers and then,
independently, on FreeBSD.
The fix is an early proposal by Brian Behlendorf:
https://github.com/behlendorf/zfs/commit/50f32ed74e42aa28522e9681fb8ae55239fa33a7
This fix never went into ZoL. A larger change that was committed later
included a different solution because of the re-worked code.
Ideally, we want to revert this fix and re-synchronize FreeBSD large
dnode code with that in illumos (or newer ZoL). illumos has a later
import of the feature from ZoL that does not have the bug.
PR: 236480
Obtained from: Brian Behlendorf <behlendorf1@llnl.gov>
Submitted by: ncrogers@gmail.com (patch adaptation)
Reported by: ncrogers@gmail.com
Tested by: ncrogers@gmail.com,
Dennis Noordsij <dennis.noordsij@alumni.helsinki.fi>,
Julien Cigar <julien@perdition.city>
MFC after: 10 days
Andriy Gapon [Mon, 12 Aug 2019 10:00:32 +0000 (10:00 +0000)]
Allow ZVOL bookmarks to be listed recursively
Many thanks to cryx-freebsd@h3q.com for reporting the problem and
submitting a fix. I have chosen to take an equivalent but textually
different patch from ZoL just to avoid increasing divergence between
OpenZFS flavours.
Justin Hibbits [Mon, 12 Aug 2019 03:03:56 +0000 (03:03 +0000)]
powerpc: Unify pmap definitions between AIM and Book-E
This is part 2 of r347078, pulling the page directory out of the Book-E
pmap. This breaks KBI for anything that uses struct pmap (such as vm_map)
so any modules that access this must be rebuilt.
Cy Schubert [Mon, 12 Aug 2019 02:42:47 +0000 (02:42 +0000)]
Initialize the frentry (the control block that defines a rule) checksum
to zero. Matching checksums save time and effort by mitigating the need
for full rule compare.
Cy Schubert [Sun, 11 Aug 2019 23:54:52 +0000 (23:54 +0000)]
Calculate the number interface array elements using the new FR_NUM macro
instead of the hard-coded value of 4. This is a precursor to increasing
the number of interfaces speficied in "on {interface, ..., interface}".
Note that though this feature is coded in ipf_y.y, it is partially
supported in the ipfilter kld, meaning it does not work yet (and is yet
to be documented in ipf.5 too).
Cy Schubert [Sun, 11 Aug 2019 23:54:49 +0000 (23:54 +0000)]
r272552 applied the patch from ipfilter upstream fil.c r1.129 to fix
broken ipfilter rule matches (upstream bug #554). The upstream patch
was incomplete, it resolved all but one rule compare issue. The issue
fixed here is when "{to, reply-to, dup-to} interface" are used in
conjuncion with "on interface". The match was only made if the on keyword
was specified in the same order in each case referencing the same rule.
This commit fixes this.
The reason for this is that interface name strings and comment keyword
comments are stored in a a variable length field starting at fr_names
in the frentry struct. These strings are placed into this variable length
in the order they are encountered by ipf_y.y and indexed through index
pointers in fr_ifnames, fr_comment or one of the frdest struct fd_name
fields. (Three frdest structs are within frentry.) Order matters and
this patch takes this into account.
While in here it was discovered that though ipfilter is designed to
support multiple interface specifiations per rule (up to four), this
undocumented (the man page makes no mention of it) feature does not work.
A todo is to fix the multiple interfaces feature at a later date. To
understand the design decision as to why only four were intended, it is
suspected that the decision was made because Sun workstations and PCs
rarely if ever exceeded four NICs at the time, this is not true in 2019.