]> CyberLeo.Net >> Repos - FreeBSD/FreeBSD.git/commit - lib/libz/inffast.c
MFV r309249: 3821 Race in rollback, zil close, and zil flush
authorAndriy Gapon <avg@FreeBSD.org>
Mon, 28 Nov 2016 15:14:31 +0000 (15:14 +0000)
committerAndriy Gapon <avg@FreeBSD.org>
Mon, 28 Nov 2016 15:14:31 +0000 (15:14 +0000)
commit0451d4e97b27e599592a4ffb157ee98cf4a96dc3
tree2fdd512228db95fc0d50673cc632245e6300bea1
parenta70475ca42f59ae02212781949e9e3b3ab721771
parentaed3e94d2f6e64cb5ba8b22faabc36005b466a46
MFV r309249: 3821 Race in rollback, zil close, and zil flush

Note: there was a merge conflict resolved by me.

illumos/illumos-gate@43297f973a3543e7403ac27076490ab958a94b15
https://github.com/illumos/illumos-gate/commit/43297f973a3543e7403ac27076490ab958a94b15

https://www.illumos.org/issues/3821
  We recently had nodes with some of the latest zfs bits panic on us in a
  rollback-heavy environment. The following is from my preliminary analysis:
  Let's look at where we died:
  > $C
  ffffff01ea6b9a10 taskq_dispatch+0x3a(0, fffffffff7d20450ffffff5551dea920, 1)
  ffffff01ea6b9a60 zil_clean+0xce(ffffff4b7106c080, 7e0f1)
  ffffff01ea6b9aa0 dsl_pool_sync_done+0x47(ffffff4313065680, 7e0f1)
  ffffff01ea6b9b70 spa_sync+0x55f(ffffff4310c1d040, 7e0f1)
  ffffff01ea6b9c20 txg_sync_thread+0x20f(ffffff4313065680)
  ffffff01ea6b9c30 thread_start+8()
  If we dig in we can find that this dataset corresponds to a zone:
  > ffffff4b7106c080::print zilog_t zl_os->os_dsl_dataset->ds_dir->dd_myname
  zl_os->os_dsl_dataset->ds_dir->dd_myname = [ "8ffce16a-13c2-4efa-a233-
  9e378e89877b" ]
  Okay so we have a null taskq pointer. That only happens during the calls to
  zil_open and zil_close. If we poke around we can see that we're actually in
  midst of a rollback:
  > ::pgrep zfs | ::printf "0x%x %s\\n" proc_t . p_user.u_psargs
  0xffffff43262800a0 zfs rollback zones/15714eb6-f5ea-469f-ac6d-
  4b8ab06213c2@marlin_init
  0xffffff54e22a1028 zfs rollback zones/8ffce16a-13c2-4efa-a233-
  9e378e89877b@marlin_init
  0xffffff4362f3a058 zfs rollback zones/0ddb8e49-ca7e-42e1-8fdc-
  4ac4ba8fe9f8@marlin_init
  0xffffff5748e8d020 zfs rollback zones/426357b5-832d-4430-953e-
  10cd45ff8e9f@marlin_init
  0xffffff436b867008 zfs rollback zones/8f36bf37-8a9c-4a44-995c-
  6d1b2751e6f5@marlin_init
  0xffffff4381ad4090 zfs rollback zones/6c8eca18-fbd6-46dd-ac24-
  2ed45cd0da70@marlin_init

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: George Wilson <george.wilson@delphix.com>

MFC after: 3 weeks
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c