Alexander Motin [Sat, 3 Sep 2016 10:48:48 +0000 (10:48 +0000)]
7003 zap_lockdir() should tag hold
zap_lockdir() / zap_unlockdir() should take a "void *tag" argument which
tags the hold on the zap. This will help diagnose programming errors
which misuse the hold on the ZAP.
Sponsored by: Intel Corp.
Closes #108
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Author: Matthew Ahrens <mahrens@delphix.com>
https://www.illumos.org/issues/7277
ztest always prints the debug messages (zfs_dbgmsg()) by calling
zfs_dbgmsg_print(). We should add a flag to zdb to make it do this as well
before exiting.
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
https://www.illumos.org/issues/7136
6922 added ESC_ZFS_VDEV_REMOVE_AUX and ESC_ZFS_VDEV_REMOVE_DEV sysevents
whenever an aux device gets removed from a pool. However, those sysevents will
be created without the vdev_guid and vdev_path fields. It would be better to
always populate those fields.
https://www.illumos.org/issues/7115
The addition of spa_event_notify in vdev removal code (see #6922) causes events
to be generated even if the spare failed to be removed with EBUSY.
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Alan Somers <asomers@gmail.com>
https://www.illumos.org/issues/7230
A test failure occurred where a send stream had only a BEGIN record. This
should not be possible if the send returns without error. Prevented this from
happening in the future by adding an assertion to dmu_send_impl() to verify
that if the function returns 0 (success) both a BEGIN and END record are
present. Did this by adding flags to dmu_sendarg_t (indicating whether BEGIN or
END records sent), having dump_record() set flags appropriately, adding VERIFY
statement to dmu_send_impl().
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matt Krantz <matt.krantz@delphix.com>
https://www.illumos.org/issues/7235
The function dsl_dataset_set_blkptr() is unused. We should remove it.
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
https://www.illumos.org/issues/7090
When write I/Os are issued, they are issued in block order but the ZIO pipeline
will drive them asynchronously through the allocation stage which can result in
blocks being allocated out-of-order. It would be nice to preserve as much of
the logical order as possible.
In addition, the allocations are equally scattered across all top-level VDEVs
but not all top-level VDEVs are created equally. The pipeline should be able to
detect devices that are more capable of handling allocations and should
allocate more blocks to those devices. This allows for dynamic allocation
distribution when devices are imbalanced as fuller devices will tend to be
slower than empty devices.
The change includes a new pool-wide allocation queue which would throttle and
order allocations in the ZIO pipeline. The queue would be ordered by issued
time and offset and would provide an initial amount of allocation of work to
each top-level vdev. The allocation logic utilizes a reservation system to
reserve allocations that will be performed by the allocator. Once an allocation
is successfully completed it's scheduled on a given top-level vdev. Each top-
level vdev maintains a maximum number of allocations that it can handle
(mg_alloc_queue_depth). The pool-wide reserved allocations (top-levels *
mg_alloc_queue_depth) are distributed across the top-level vdevs metaslab
groups and round robin across all eligible metaslab groups to distribute the
work. As top-levels complete their work, they receive additional work from the
pool-wide allocation queue until the allocation queue is emptied.
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: George Wilson <george.wilson@delphix.com>
Add syntactic sugar to dtrace: "if" and "else" statements. The sugar is
baked down to standard dtrace features by adding additional clauses with
the appropriate predicates.
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Bryan Cantrill <bryan@joyent.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Gary Mills <gary_mills@fastmail.fm>
https://www.illumos.org/issues/7164
If the pool/dataset command-line argument is specified with a trailing
slash, for example, "tank/", we should interpret it as the topmost
dataset (rather than the whole pool)
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Tim Chase <tim@chase2k.com>
https://www.illumos.org/issues/6391
When using zdb with non-default SPA config file it is not convenient
to add -U <non-default-config-file-path> all the time. This commit
introduces support for setting/overriding SPA config location via
environment variable 'SPA_CONFIG_PATH'.
If -U flag is specified in the command line it will override any other
value as usual.
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Richard Yao <ryao@gentoo.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Will Andrews <will@freebsd.org>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Cyril Plisko <cyril.plisko@mountall.com>
https://www.illumos.org/issues/7163
Running zloop from zfs-precommit hit this assertion:
*panicstr/s
0xfffffd7fd7419370: assertion failed for thread 0xfffffd7fe29ed240,
thread-id 577: parent != NULL, file ../../../uts/common/fs/zfs/dbuf.c, line
1827
$c
libc.so.1`_lwp_kill+0xa()
libc.so.1`_assfail+0x182(fffffd7ffb1c29fa, fffffd7ffb1cc028, 723)
libc.so.1`assfail+0x19(fffffd7ffb1c29fa, fffffd7ffb1cc028, 723)
libzpool.so.1`dbuf_dirty+0xc69(10e3bc10, 3601700)
libzpool.so.1`dbuf_dirty+0x61e(10c73640, 3601700)
libzpool.so.1`dbuf_dirty+0x61e(10e28280, 3601700)
libzpool.so.1`dmu_buf_will_fill+0x64(10e28280, 3601700)
libzpool.so.1`dmu_write+0x1b6(2c7e640, d, 400000002e000000, 200, 3717b40, 3601700)
ztest_replay_write+0x568(4950d0, 3717a80, 0)
ztest_write+0x125(4950d0, d, 400000002e000000, 200, 413f000)
ztest_io+0x1bb(4950d0, d, 400000002e000000)
ztest_dmu_write_parallel+0xaa(4950d0, 6)
ztest_execute+0x83(1, 420c98, 6)
ztest_thread+0xf4(6)
libc.so.1`_thrp_setup+0x8a(fffffd7fe29ed240)
libc.so.1`_lwp_start()
This is another manifestation of ECKSUM in ztest:
The lowest level ancestor that’s in memory is the L8 (topmost). The L7
ancestor is blkid 0x10:
::dbufs -O 0x2c7e640 -o d -l 7 |::dbuf
addr object lvl blkid holds os 600be50 d 7 4 1 ztest/ds_6 719d880 d 7 0 4 ztest/ds_6
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
https://www.illumos.org/issues/6451
Sometimes ztest fails because zdb detects checksum errors. e.g.:
Traversing all blocks to verify checksums and verify nothing leaked ...
zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 8000160> DVA0=<0:1cc2000:
180000> [L0 other uint64[]] sha256 uncompressed LE contiguou
s unique single size=100000L/100000P birth=271L/271P fill=1
cksum=c5a3e27d1ed0f894:843bca3a5473c4bf:f76a19b6830a2e4:91292591613a12bf --
skipping
zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 800000180> DVA0=<0:ce16800:
180000> [L0 other uint64[]] sha256 uncompressed LE contigu
ous unique single size=100000L/100000P birth=840L/840P fill=1
cksum=5d018f3d061e17f3:6d1584784587bf63:2805a74a0ce37369:ba68a214806c7e75
-- skipping
zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 1000000360> DVA0=<0:10d37400:
180000> [L0 other uint64[]] sha256 uncompressed LE conti
guous unique single size=100000L/100000P birth=904L/904P fill=1
cksum=fa1e11d4138bd14b:86c9488c444473e3:f31e43c72e72e46b:e3446472d1174d
ba -- skipping
zdb_blkptr_cb: Got error 50 reading <71, 47, 0, 400000002c0> DVA0=<0:127ef400:
180000> [L0 other uint64[]] sha256 uncompressed LE cont
iguous dedup single size=100000L/100000P birth=549L/549P fill=1
cksum=30e14955ebf13522:66dc2ff8067e6810:4607e750abb9d3b3:6582b8af909fcb
58 -- skipping
zdb_blkptr_cb: Got error 50 reading <657, 5, 0, 1c0> DVA0=<0:1a180400:180000>
[L0 other uint64[]] fletcher4 uncompressed LE contiguou
s unique single size=100000L/100000P birth=1091L/1091P fill=1 cksum=a6cf1e50: 29b3bd01c57e5:36779b914035db9a:db61cdcf6bec56f0 -- skippin
g
The problem is that ztest_fault_inject() can inject multiple faults into the
same block. It is designed such that it can inject errors on all leafs of a
RAID-Z or mirror, but for a given range of offsets, it will only inject errors
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Jorgen Lundman <lundman@lundman.net>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: George Wilson <george.wilson@delphix.com>
https://www.illumos.org/issues/7086
In dbuf_dirty(), we need to grab the dn_struct_rwlock before looking at the
db_blkptr, to prevent it from being changed by syncing context.
Otherwise we may see that ztest got a segfault from this stack:
libzpool.so.1`dva_get_dsize_sync+0x98(872f000, b32b240, fed7811b, 0, b4cda20, 0)
libzpool.so.1`bp_get_dsize+0x60(872f000, b32b240, 0, 97cb780, 9d4c1a8, 0)
libzpool.so.1`dbuf_dirty+0x9b3(ce0a100, 97cb780, 9, fecd2530)
libzpool.so.1`dmu_buf_will_dirty+0xc3(ce0a100, 97cb780, ea293d6c, 1)
libzpool.so.1`zap_lockdir+0x1a0(8aaa3c0, 1, 0, 97cb780, 1, 1)
libzpool.so.1`zap_remove_norm+0x30(8aaa3c0, 1, 0, 8728b10, 0, 97cb780)
libzpool.so.1`zap_remove+0x29(8aaa3c0, 1, 0, 8728b10, 97cb780, a)
ztest_replay_remove+0x225(ea294588, 8728ae8, 0, 38010000, 0, 0)
ztest_remove+0x9f(ea294588, ea293f50, 4, 3)
ztest_object_init+0x78(ea294588, ea293f50, 4e0, 1)
ztest_dmu_object_alloc_free+0x71(ea294588, 13)
ztest_dmu_objset_create_destroy+0x224(80cef08, 13, 0, 805d36c, 9017ad44, 0)
ztest_execute+0x89(a, 807c720, 13, 0)
ztest_thread+0xea(13, 0, 0, 0)
libc.so.1`_thrp_setup+0x88(f0983240)
libc.so.1`_lwp_start(f0983240, 0, 0, 0, 0, 0)
Looking into it a bit, we see that this is an embedded blockpointer, so
BP_GET_NDVAS should have returned 0: b32b240::blkptr
EMBEDDED [L0 ZAP_OTHER] et=0 LZ4 size=200L/4aP birth=80L
Instead, it looks like another thread is modifying this blockpointer: b32b240::ugrep | ::whatis f47a0e0c is in [ stack tid=0x19f ] ebd6ec40 is in [ stack tid=0x226 ] ea293bd0 is in [ stack tid=0x244 ] ea293be4 is in [ stack tid=0x244 ]
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
https://www.illumos.org/issues/7072
upstream:
38733 zfs fails to expand if lun added when os is in shutdown state
DLPX-36910 spares and caches should not display expandable space
DLPX-39262 vdev_disk_open spam zfs_dbgmsg buffer
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: George Wilson <george.wilson@delphix.com>
https://www.illumos.org/issues/7104
The current default indirect block size is 16KB. We can improve
performance by increasing it to 128KB. This is especially helpful for
any workload that needs to read most of the metadata, e.g.
scrub/resilver, file deletion, filesystem deletion, and zfs send.
We also need to fix a few space estimation errors to make the tests
pass.
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
https://www.illumos.org/issues/7071
upstream
DLPX-40482 lzc_snapshot does not fill in errlist on ENOENT
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
https://www.illumos.org/issues/6950
When reading compressed data from disk, the ARC should keep the compressed
block cached and only decompress it when consumers access the block. The
uncompressed data should be short-lived allowing the ARC to cache a much larger
amount of data. The DMU would also maintain a smaller cache of uncompressed
blocks to minimize the impact of decompressing frequently accessed blocks.
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: George Wilson <george.wilson@delphix.com>
https://www.illumos.org/issues/6447
I got a patch from someone who uses nvpair code outside of illumos. It fixes a
couple of gcc warnings/bugs for him.
1. silence uninitialized use warnings
2. add parentheses around assignment used as truth value
3. fix printf format specifier (ll is for integers only)
4. strstr, strspn, strcspn, and strcmp are declared in string.h, not
strings.h.
5. avoid scanning integer into boolean variable
Reviewed by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Reviewed by: Garrett D'Amore <garrett@damore.org>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Steve Dougherty <sdougherty@barracuda.com>
https://www.illumos.org/issues/7082
upstream
DLPX-40542 bptree_iterate() passes wrong args to zfs_dbgmsg()
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>
https://www.illumos.org/issues/6314
Callers of dsl_dataset_name pass a buffer of size ZFS_MAXNAMELEN, but
dsl_dataset_name copies the datasets' name PLUS the snapshot name to it,
resulting in a max of 2 * ZFS_MAXNAMELEN + '@'.
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>
https://www.illumos.org/issues/6872
We compile the zfs libraries with -Wno-uninitialized. We should remove
this. Change makefiles, fix new warnings, fix pbchk errors.
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Paul Dagnelie <pcd@delphix.com>
https://www.illumos.org/issues/4521
zfstest is trying to execute evil "zfs unmount -a", which fails (fortunately,
as it would otherwise leave me with my ~ missing):
03:44:11.86 cannot unmount '/export/home/yuri': Device busy cannot unmount '/
export/home': Device busy
03:44:11.86 ERROR: /usr/sbin/zfs unmount -a exited 1
This affects, at least, zfs_mount_009_neg and zfs_mount_all_001_pos, both
failing on that step. The pool containing the /export/home hierarchy is
included in KEEP variable, but it doesn't seem to affect anything here.
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Yuri Pankov <yuri.pankov@nexenta.com>
https://www.illumos.org/issues/5813
Lets pull in this patch from freebsd:
http://svnweb.freebsd.org/base?view=revision&revision=271764
zfs_setprop_error(): Handle errno value E2BIG.
This errno value is emitted by dsl_props_set_check() in
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_prop.c, and
is used to mean that the property value is too long. For the record,
the maximum length is ZAP_MAXVALUELEN, which is 8*1024 bytes.
Instead of claiming an unknown error (and abort()ing), provide
something more specific to the scenario involved. As far as I
can tell, E2BIG is not emitted for any other scenario.
MFC after: 1 week
Sponsored by: Spectra Logic
Affects: All ZFS versions starting 27 Feb 2009 (illumos ccba0801)
This change modified the value returned by
dsl_props_set_check(), so that it can distinguish between
a name that's too long and a value that's too long, but
libzfs was not updated accordingly.
MFSpectraBSD: r1051499 on 2014/03/28 11:07:59
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Richard Elling <richard.elling@richardelling.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: Will Andrews <will@freebsd.org>
https://www.illumos.org/issues/6873
lzc_destroy_snaps() returns an nvlist in errlist.
zfs_destroy_snaps_nvl() should nvlist_free() it before returning.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Chris Williamson <chris.williamson@delphix.com>
https://www.illumos.org/issues/6879
In libzfs_sendrecv, there's a typo:
case DRR_SPILL:
if (byteswap) {
drr->drr_u.drr_write.drr_length =
BSWAP_64(drr->drr_u.drr_spill.drr_length);
}
Instead of drr_write.drr_length, we should be assigning the result of the
byteswap to drr_spill.drr_length.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Dan Kimmel <dan.kimmel@delphix.com>
https://www.illumos.org/issues/6111
If you create a zfs child folder, zfs send returns an error when a recursive
incremental send is done between two snapshots made prior to the folder
creation.
The problem can be reproduced with the following steps.
root@zfs:/# zfs create pool/test
root@zfs:/# zfs snapshot pool/test@snap1
root@zfs:/# zfs snapshot pool/test@snap2
root@zfs:/# zfs create pool/test/child
root@zfs:/# zfs send -R -I pool/test@snap1 pool/test@snap2 > /dev/null
WARNING: could not send pool/test/child@snap2: does not exist
WARNING: could not send pool/test/child@snap2: does not exist
root@zfs:/# echo $?
1
root@zfs:/# zfs snapshot -r pool/test@snap3
root@zfs:/# zfs send -R -I pool/test@snap1 pool/test@snap3 > /dev/null
root@zfs:/# echo $?
0
root@zfs:/# zfs send -R -I pool/test@snap2 pool/test@snap3 > /dev/null
root@zfs:/# echo $?
0
Since pool/test/child was created after snap2, zfs send should not expect snap2
to be in pool/test/child when doing a recursive send. It should examine the
compare the creation time of the snapshot and each child folder to decide if
the folder will be sent. The next incremental send between snap2 and snap3
would properly create the child folder and snap3 which first appears in the
child folder.
The problem is identical if '-i' is used instead of '-I'.
Reviewed by: Alex Aizman alex.aizman@nexenta.com
Reviewed by: Alek Pinchuk alek.pinchuk@nexenta.com
Reviewed by: Roman Strashkin roman.strashkin@nexenta.com
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: Alex Deiter <alex.deiter@nexenta.com>
https://www.illumos.org/issues/5768
zfsctl_snapshot_inactive() leaks a hold on the dvp (directory vnode) if v_count > 1.
reproduce by:
create a fs with 100 snapshots.
have a thread do:
while true; do ls -l /test/snaps/.zfs/snapshot >/dev/null; done
have another thread do:
while true; do zfs promote test/clone; zfs promote test/snaps; done
use dtrace to delay & observe:
dtrace -w -xd \\
-n 'vn_rele:entry/args0 == (void*)0xffffff01dd42ce80ULL/{[stack()]=count();
chill(100000);}' \\
-n 'zfsctl_snapshot_inactive:entry{ if (args[0]->v_count > 1) trace(args[0]-
>v_count); self->vp=args[0];}' \\
-n 'gfs_vop_inactive:entry/callers["zfsctl_snapshot_inactive"]/{self->good=1;
[stack()]=count()}' \\
-n 'zfsctl_snapshot_inactive:return{if (self->good) self->good=0; else printf
("bad return");}' \\
-n 'gfs_dir_lookup:return/callers["zfsctl_snapshot_inactive"] && self->vp-
>v_count > 1/{trace(self->vp->v_count)}'
the address is found by selecting one of the output of this at random:
dtrace -n 'zfsctl_snapshot_inactive:entry{print(args[0]);'
when you see "bad return", we have hit the bug. Then doing "zfs umount test/
snaps" will fail with EBUSY.
When we hit this case, we also leak the hold on the target vnode (vn). When the
inactive callback is called on a vnode with v_count > 1, it needs to be
decremented.
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Adam Leventhal <adam.leventhal@delphix.com>
Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com>
Approved by: Rich Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>
https://www.illumos.org/issues/6940
Similar to #6334, but this time with empty directories:
$ zfs create tank/quota
$ zfs set quota=10M tank/quota
$ zfs snapshot tank/quota@snap1
$ zfs set mountpoint=/mnt/tank/quota tank/quota
$ mkdir /mnt/tank/quota/dir # create an empty directory
$ mkfile 11M /mnt/tank/quota/11M
/mnt/tank/quota/11M: initialized 9830400 of 11534336 bytes: Disc quota exceeded
$ rmdir /mnt/tank/quota/dir # now unlink the empty directory
rmdir: directory "/mnt/tank/quota/dir": Disc quota exceeded
From user perspective, I would expect that ZFS is always able to remove files
and directories even when the quota is exceeded.
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Simon Klinkert <simon.klinkert@gmail.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Gordon Ross <gordon.ross@nexenta.com>
Author: Prakash Surya <prakash.surya@delphix.com>
https://www.illumos.org/issues/7019
Currently zfsdev_ioctl, when confronted by a request with the FKIOCTL flag set,
skips all processing of secpolicy functions. This means that ZFS is not doing
any kind of verification of the credentials or access rights of the caller and
assuming that (as it is an in-kernel client) all such checks have already been
done.
This turns out to be quite a dangerous assumption, especially with respect to
sdev. In general I don't think it's particularly reasonable to offload this
enforcement of access rights onto other kernel subsystems when ZFS has some
particular local semantics in this area (delegated datasets etc) and does not
provide any kind of API to allow other subsystems to avoid code duplication
when doing it. ZFS should apply its normal access policy to requests from
within the kernel, and callers should take care to give it the correct
credentials and call it from the correct context in order to get the results
they need.
You can observe the currently unfortunate consequences of this bug in any non-
global zone that has access to /dev/zvol or any subset of it via sdev profiles.
In particular, a zone used to contain a KVM or similar which has a single zvol
passed through to it using a <device match= block in its zone XML.
Even though sdev makes something of an attempt to control for whether the
caller should have access to nodes in /dev/zvol, it doesn't do this correctly,
or really at all in the lookup call path. So, if we have a zone that's been
given access to any part of /dev/zvol, it can simply look up the full path to
any other zvol on the entire system, and the node will appear and be able to be
used.
Reviewed by: Robert Mustacchi <rm@joyent.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Alex Wilson <alex.wilson@joyent.com>
https://www.illumos.org/issues/6922
ZFS does not do a config_sync after removing an aux (spare, log, or cache)
device. AFAICT this isn't being done because it is slow and was deemed
unnecessary. However, it should be such a rare operation that speed doesn't
matter, and not doing it results in two problems:
1) It is theoretically possible to remove an aux device from one pool and
attach it to another, then lose power. When power is restored, both pools would
think that they own the aux device.
2) Removal of the aux device doesn't send any useful sysevents to userland.
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Alan Somers <asomers@gmail.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
https://www.illumos.org/issues/6878
Summary of changes:
* Replace generic "scan done" message with "scan aborted, restarting",
"scan cancelled", or "scan done"
* Log number of errors using spa_get_errlog_size
* Refactor scan restarting check into static function
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Nav Ravindranath <nav@delphix.com>
https://www.illumos.org/issues/6513
If a ZFS object contains a hole at level one, and then a data block is created
at level 0 underneath that l1 block, l0 holes will be created. However, these
l0 holes do not have the birth time property set; as a result, incremental
sends will not send those holes.
Fix is to modify the dbuf_read code to fill in birth time data.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Boris Protopopov <bprotopopov@hotmail.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Paul Dagnelie <pcd@delphix.com>
https://www.illumos.org/issues/6902
pjd has authored and commited a patch in Jan 21, 2012 that substanially speeds
up zfs snapshot listing if requesting only the name property and sorting by
name.
In this special case, the snapshot properties do not need to be loaded. This
code has been adopted by zfsonlinux on May 29, 2012.
Commit message from pjd:
Dramatically optimize listing snapshots when user requests only
snapshot
names and wants to sort them by name, ie. when executes:
1. zfs list -t snapshot -o name -s name
Because only name is needed we don't have to read all snapshot
properties.
Below you can find how long does it take to list 34509 snapshots from
a single
disk pool before and after this change with cold and warm cache:
before:
1. time zfs list -t snapshot -o name -s name > /dev/null
cold cache: 525s
warm cache: 218s
after:
1. time zfs list -t snapshot -o name -s name > /dev/null
cold cache: 1.7s
warm cache: 1.1s
References:
http://svnweb.freebsd.org/base?view=revision&revision=230438
https://github.com/freebsd/freebsd/commit/8e3e9863
https://github.com/zfsonlinux/zfs/commit/0cee2406
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Pawel Dawidek <pjd@freebsd.org>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: Martin Matuska <martin@matuska.org>
https://www.illumos.org/issues/6876
Calling dsl_dataset_name on a dataset with a 256 byte buffer is asking for
trouble. We should check every dataset on import, using a 1024 byte buffer and
checking each time to see if the dataset's new name is longer than 256 bytes.
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Paul Dagnelie <pcd@delphix.com>
https://www.illumos.org/issues/6844
dnode_next_offset is used in a variety of places to iterate over the holes or
allocated blocks in a dnode. It operates under the premise that it can iterate
over the blockpointers of a dnode in open context while holding only the
dn_struct_rwlock as reader. Unfortunately, this premise does not hold.
When we create the zio for a dbuf, we pass in the actual block pointer in the
indirect block above that dbuf. When we later zero the bp in
zio_write_compress, we are directly modifying the bp. The state of the bp is
now inconsistent from the perspective of dnode_next_offset: the bp will appear
to be a hole until zio_dva_allocate finally finishes filling it in. In the
meantime, dnode_next_offset can detect a hole in the dnode when none exists.
I was able to experimentally demonstrate this behavior with the following
setup:
1. Create a file with 1 million dbufs.
2. Create a thread that randomly dirties L2 blocks by writing to the first L0
block under them.
3. Observe dnode_next_offset, waiting for it to skip over a hole in the middle
of a file.
4. Do dnode_next_offset in a loop until we skip over such a non-existent hole.
The fix is to ensure that it is valid to iterate over the indirect blocks in a
dnode while holding the dn_struct_rwlock by passing the zio a copy of the BP
and updating the actual BP in dbuf_write_ready while holding the lock.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Boris Protopopov <bprotopopov@hotmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Alex Reece <alex@delphix.com>
https://www.illumos.org/issues/6874
When we do a clone swap (caused by "zfs rollback" or "zfs receive"), the ZPL
doesn't completely reload the state from the DMU; some values remain cached in
the zfsvfs_t.
steps to reproduce:
```
#!/bin/bash -x
zfs destroy -R test/fs
zfs destroy -R test/recvd
zfs create test/fs
zfs snapshot test/fs@a
zfs set userquota@$USER=1m test/fs
zfs snapshot test/fs@b
zfs send test/fs@a | zfs recv test/recvd
zfs send -i @a test/fs@b | zfs recv test/recvd
zfs userspace test/recvd
1. should show 1m quota
dd if=/dev/urandom of=/test/recvd/file bs=1k count=1024
sync
dd if=/dev/urandom of=/test/recvd/file2 bs=1k count=1024
2. should fail with ENOSPC
sync
zfs unmount test/recvd
zfs mount test/recvd
zfs userspace test/recvd
3. if bug above, now shows 1m quota
dd if=/dev/urandom of=/test/recvd/file3 bs=1k count=1024
4. if bug above, now fails with ENOSPC
```
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: Matthew Ahrens <mahrens@delphix.com>
Mark Johnston [Mon, 6 Jun 2016 22:09:22 +0000 (22:09 +0000)]
7035 string-related subroutines should validate input earlier
Reviewed by: Alex Wilson <alex.wilson@joyent.com>
Reviewed by: Bryan Cantrill <bryan@joyent.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Patrick Mooney <pmooney@pfmooney.com>
Mark Johnston [Mon, 6 Jun 2016 22:07:55 +0000 (22:07 +0000)]
7033 ustack helper should fault on bad return values
Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
Reviewed by: Bryan Cantrill <bryan@joyent.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Alex Wilson <alex.wilson@joyent.com>
Mark Johnston [Mon, 6 Jun 2016 22:06:45 +0000 (22:06 +0000)]
7034 negative record sizes should be rejected
Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
Reviewed by: Bryan Cantrill <bryan@joyent.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Alex Wilson <alex.wilson@joyent.com>
Alexander Motin [Wed, 11 May 2016 12:50:58 +0000 (12:50 +0000)]
6736 ZFS per-vdev ZAPs
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Don Brady <don.brady@intel.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Joe Stein <joe.stein@delphix.com>
Alexander Motin [Wed, 11 May 2016 12:36:19 +0000 (12:36 +0000)]
6841 Undirty freed spill blocks
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Tim Chase <tim@chase2k.com>
https://www.illumos.org/issues/6052
At the moment type parameter of lzc_create() is of dmu_objset_type_t type.
That exposes an implementation detail and requires sys/fs/zfs.h to be included
in libzfs_core.h creating unnecessary coupling between libzfs_core interface
and ZFS internals.
I think that dmu_objset_type_t should be replaced with a libzfs_core
enumeration of supported dataset types.
For ABI reasons the new enumeration could be bit-compatible with
dmu_objset_type_t.
For example:
typedef enum {
LZC_DST_ZFS = 2,
LZC_DST_ZVOL
} lzc_dataset_type_t;
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Andriy Gapon <andriy.gapon@clusterhq.com>
Alexander Motin [Mon, 11 Apr 2016 21:07:18 +0000 (21:07 +0000)]
6322 ZFS indirect block predictive prefetch
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Author: Alexander Motin <mav@FreeBSD.org>
Improve speculative prefetch of indirect blocks.
Scalability of many operations on wide ZFS pool can be limited by
requirement to prefetch indirect blocks first. Recently added
asynchronous indirect block read partially helped, but did not
solve the problem completely. This patch extends existing prefetcher
functionality to explicitly work with indirect blocks.
Before this change prefetcher issued reads for up to 8MB of data in
advance. With this change it also issues indirect block reads
for up to 64MB of data in advance, so that when it will be time to
actually read those data, it can be done immediately. Alike effect
can be achieved by just increasing maximal data prefetch distance,
but at higher memory cost.
Also this change introduces indirect block prefetch for rewrite
operations, that was never done before. Previously ARC miss for
Indirect blocks regularly blocked rewrites, converting perfectly
aligned asynchronous operations into synchronous read-write pairs,
significantly reducing maximal rewrite speed.
While being there this issue was also fixed:
- prefetch was done always, even if caching for the dataset was
completely disabled.
Testing on FreeBSD with zvol on top of 6x striped 2x mirrored pool
of 12 assorted HDDs shown me such performance numbers:
------- BEFORE --------
Write 491363677 bytes/sec
Read 312430631 bytes/sec
Rewrite 97680464 bytes/sec
-------- AFTER --------
Write 493524146 bytes/sec
Read 438598079 bytes/sec
Rewrite 277506044 bytes/sec
Alexander Motin [Sat, 9 Apr 2016 19:49:40 +0000 (19:49 +0000)]
6418 zpool should have a label clearing command
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Author: Will Andrews <will@firepipe.net>
Alexander Motin [Sat, 2 Apr 2016 08:25:41 +0000 (08:25 +0000)]
6738 zfs send stream padding needs documentation
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Eli Rosenthal <eli.rosenthal@delphix.com>
Alexander Motin [Sat, 2 Apr 2016 08:24:23 +0000 (08:24 +0000)]
6739 userland version of cv_timedwait_hires() always assumes absolute time
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: George Wilson <george.wilson@delphix.com>
Alexander Motin [Sat, 2 Apr 2016 08:19:41 +0000 (08:19 +0000)]
6681 zfs list burning lots of time in dodefault() via dsl_prop_*
Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Alex Wilson <alex.wilson@joyent.com>
Alexander Motin [Thu, 10 Mar 2016 08:56:18 +0000 (08:56 +0000)]
6370 ZFS send fails to transmit some holes
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: Stefan Ring <stefanrin@gmail.com>
Reviewed by: Steven Burgess <sburgess@datto.com>
Reviewed by: Arne Jansen <sensille@gmx.net>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Paul Dagnelie <pcd@delphix.com>
In certain circumstances, "zfs send -i" (incremental send) can produce a
stream which will result in incorrect sparse file contents on the
target.
The problem manifests as regions of the received file that should be
sparse (and read a zero-filled) actually contain data from a file that
was deleted (and which happened to share this file's object ID).
Note: this can happen only with filesystems (not zvols, because they do
not free (and thus can not reuse) object IDs).
Note: This can happen only if, since the incremental source (FromSnap),
a file was deleted and then another file was created, and the new file
is sparse (i.e. has areas that were never written to and should be
implicitly zero-filled).
We suspect that this was introduced by 4370 (applies only if hole_birth
feature is enabled), and made worse by 5243 (applies if hole_birth
feature is disabled, and we never send any holes).
The bug is caused by the hole birth feature. When an object is deleted
and replaced, all the holes in the object have birth time zero. However,
zfs send cannot tell that the holes are new since the file was replaced,
so it doesn't send them in an incremental. As a result, you can end up
with invalid data when you receive incremental send streams. As a
short-term fix, we can always send holes with birth time 0 (unless it's
a zvol or a dataset where we can guarantee that no objects have been
reused).
Alexander Motin [Tue, 8 Mar 2016 18:51:12 +0000 (18:51 +0000)]
4448 zfs diff misprints unicode characters
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Toomas Soome <tsoome@me.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Joshua M. Clulow <jmc@joyent.com>
Alexander Motin [Tue, 8 Mar 2016 18:37:34 +0000 (18:37 +0000)]
6551 cmd/zpool: cleanup gcc warnings
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Alexander Motin [Tue, 8 Mar 2016 18:35:07 +0000 (18:35 +0000)]
6550 cmd/zfs: cleanup gcc warnings
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Igor Kozhukhov <ikozhukhov@gmail.com>
Alexander Motin [Tue, 8 Mar 2016 18:08:33 +0000 (18:08 +0000)]
6659 nvlist_free(NULL) is a no-op
Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: Marcel Telka <marcel@telka.sk>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Alexander Motin [Tue, 8 Mar 2016 17:52:43 +0000 (17:52 +0000)]
6562 Refquota on receive doesn't account for overage
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Toomas Soome <tsoome@me.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Dan McDonald <danmcd@omniti.com>
Alexander Motin [Tue, 8 Mar 2016 17:34:07 +0000 (17:34 +0000)]
6450 scrub/resilver unnecessarily traverses snapshots created
after the scrub started
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>
Alexander Motin [Tue, 8 Mar 2016 17:29:57 +0000 (17:29 +0000)]
6537 Panic on zpool scrub with DEBUG kernel
Reviewed by: Steve Gonczi <gonczi@comcast.net>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Gary Mills <gary_mills@fastmail.fm>
Alexander Motin [Tue, 8 Mar 2016 16:11:59 +0000 (16:11 +0000)]
6531 Provide mechanism to artificially limit disk performance
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Prakash Surya <prakash.surya@delphix.com>
Mark Johnston [Wed, 2 Mar 2016 05:43:16 +0000 (05:43 +0000)]
6604 harden DIF bounds checking
Author: Bryan Cantrill <bryan@joyent.com>
Reviewed by: Alex Wilson <alex.wilson@joyent.com>
Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Alexander Motin [Tue, 26 Jan 2016 13:49:46 +0000 (13:49 +0000)]
6529 Properly handle updates of variably-sized SA entries.
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Ned Bass <bass6@llnl.gov>
Reviewed by: Tim Chase <tim@chase2k.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Andriy Gapon <avg@icyb.net.ua>
During the update process in sa_modify_attrs(), the sizes of existing
variably-sized SA entries are obtained from sa_lengths[]. The case where
a variably-sized SA was being replaced neglected to increment the index
into sa_lengths[], so subsequent variable-length SAs would be rewritten
with the wrong length. This patch adds the missing increment operation
so all variably-sized SA entries are stored with their correct lengths.
Another problem was that index into attr_desc[] was increased even when
an attribute was removed. If that attribute was not the last attribute,
then the last attribute was lost.
Alexander Motin [Tue, 26 Jan 2016 13:44:47 +0000 (13:44 +0000)]
6495 Fix mutex leak in dmu_objset_find_dp
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Albert Lee <trisk@omniti.com>
Author: Steven Hartland <steven.hartland@multiplay.co.uk>
Alexander Motin [Tue, 26 Jan 2016 13:40:22 +0000 (13:40 +0000)]
6494 ASSERT supported zio_types for file and disk vdevs
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Albert Lee <trisk@omniti.com>
Author: Steven Hartland <steven.hartland@multiplay.co.uk>
Alexander Motin [Tue, 26 Jan 2016 13:20:31 +0000 (13:20 +0000)]
4986 receiving replication stream fails if any snapshot exceeds refquota
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Gordon Ross <gordon.ross@nexenta.com>
Author: Dan McDonald <danmcd@omniti.com>
Alexander Motin [Tue, 26 Jan 2016 13:02:16 +0000 (13:02 +0000)]
6434 sa_find_sizes() may compute wrong SA header size
Reviewed-by: Ned Bass <bass6@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Andriy Gapon <avg@freebsd.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: James Pan <jiaming.pan@yahoo.com>
Alexander Motin [Tue, 26 Jan 2016 12:55:43 +0000 (12:55 +0000)]
6414 vdev_config_sync could be simpler
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Will Andrews <will@firepipe.net>
Alexander Motin [Tue, 26 Jan 2016 12:51:41 +0000 (12:51 +0000)]
6388 Failure of userland copy should return EFAULT
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Richard Yao <ryao@gentoo.org>
Alexander Motin [Tue, 26 Jan 2016 12:49:31 +0000 (12:49 +0000)]
6386 Fix function call with uninitialized value in vdev_inuse
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Richard Yao <ryao@gentoo.org>
Alexander Motin [Tue, 26 Jan 2016 12:47:33 +0000 (12:47 +0000)]
6334 Cannot unlink files when over quota
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Toomas Soome <tsoome@me.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Simon Klinkert <simon.klinkert@gmail.com>
Alexander Motin [Tue, 26 Jan 2016 12:39:07 +0000 (12:39 +0000)]
6385 Fix unlocking order in zfs_zget
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Andriy Gapon <avg@freebsd.org>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Richard Yao <ryao@gentoo.org>
Dimitry Andric [Fri, 15 Jan 2016 21:41:45 +0000 (21:41 +0000)]
6527 Possible access beyond end of string in zpool comment
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Alexander Motin [Sun, 18 Oct 2015 18:56:48 +0000 (18:56 +0000)]
5767 fix several problems with zfs test suite
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: John Wren Kennedy <john.kennedy@delphix.com>
Alexander Motin [Sun, 18 Oct 2015 18:30:47 +0000 (18:30 +0000)]
5847 libzfs_diff should check zfs_prop_get() return
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Albert Lee <trisk@omniti.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Alexander Eremin <a.eremin@nexenta.com>
Alexander Motin [Sun, 18 Oct 2015 17:57:42 +0000 (17:57 +0000)]
5561 support root pools on EFI/GPT partitioned disks
5125 update zpool/libzfs to manage bootable whole disk pools (EFI/GPT labeled disks)
Reviewed by: Jean McCormack <jean.mccormack@nexenta.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
Alexander Motin [Sun, 18 Oct 2015 11:23:58 +0000 (11:23 +0000)]
6298 zfs_create_008_neg and zpool_create_023_neg need to be updated for large block support
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Joe Stein <joe.stein@delphix.com>
Alexander Motin [Sun, 18 Oct 2015 08:12:05 +0000 (08:12 +0000)]
5745 zfs set allows only one dataset property to be set at a time
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com>
Reviewed by: Richard PALO <richard@NetBSD.org>
Reviewed by: Steven Hartland <killing@multiplay.co.uk>
Approved by: Rich Lowe <richlowe@richlowe.net>
Author: Chris Williamson <chris.williamson@delphix.com>