MFC r305331: MFV r304155:
author mav <mav@ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f>
Fri, 14 Oct 2016 07:27:40 +0000
committer mav <mav@ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f>
Fri, 14 Oct 2016 07:27:40 +0000
commit c5b7be29b2b9b733680c1e24b4d17f64ad8f25af
tree 8b5ca4a582f31a1393d5e009fb3ddcbf90fed2e3
parent 4b964bbc42473f47926d0fb6855b66a63d56106b
MFC r305331: MFV r304155:
7090 zfs should improve allocation order and throttle allocations

illumos/illumos-gate@0f7643c7376dd69a08acbfc9d1d7d548b10c846a
https://github.com/illumos/illumos-gate/commit/0f7643c7376dd69a08acbfc9d1d7d548b10c846a

https://www.illumos.org/issues/7090
  When write I/Os are issued, they are issued in block order, but the ZIO
  pipeline drives them asynchronously through the allocation stage, which can
  result in blocks being allocated out of order. It would be nice to preserve
  as much of the logical order as possible.

  In addition, allocations are scattered equally across all top-level VDEVs,
  but not all top-level VDEVs are created equal. The pipeline should be able
  to detect devices that are more capable of handling allocations and
  allocate more blocks to those devices. This allows for dynamic allocation
  distribution when devices are imbalanced, as fuller devices tend to be
  slower than empty devices.

  The change includes a new pool-wide allocation queue that throttles and
  orders allocations in the ZIO pipeline. The queue is ordered by issue time
  and offset and provides an initial amount of allocation work to each
  top-level vdev. The allocation logic uses a reservation system to reserve
  allocations that will be performed by the allocator. Once an allocation is
  successfully completed, it is scheduled on a given top-level vdev. Each
  top-level vdev maintains a maximum number of allocations that it can handle
  (mg_alloc_queue_depth). The pool-wide reserved allocations (top-levels *
  mg_alloc_queue_depth) are distributed across the top-level vdevs' metaslab
  groups, round-robin across all eligible metaslab groups, to distribute the
  work. As top-levels complete their work, they receive additional work from
  the pool-wide allocation queue until the allocation queue is emptied.
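
The throttling scheme described above can be illustrated with a small
stand-alone sketch. This is not the code from this commit: apart from the
role played by mg_alloc_queue_depth, every type and function name below
(alloc_req_t, group_t, alloc_req_compare, pick_group, dispatch, complete) is
hypothetical, and a sorted array stands in for whatever structure the real
pool-wide queue uses. It only shows the idea of ordering by issue time and
offset, capping each group at its queue depth, and handing out work
round-robin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queued allocating write. */
typedef struct alloc_req {
	uint64_t issued;	/* issue timestamp; lower means earlier */
	uint64_t offset;	/* logical offset of the block */
} alloc_req_t;

/* Hypothetical stand-in for one top-level vdev's metaslab group. */
typedef struct group {
	int queue_depth;	/* plays the role of mg_alloc_queue_depth */
	int inflight;		/* allocations currently scheduled here */
} group_t;

/* Order requests by issue time first, then by offset. */
static int
alloc_req_compare(const void *a, const void *b)
{
	const alloc_req_t *x = a, *y = b;

	if (x->issued != y->issued)
		return (x->issued < y->issued ? -1 : 1);
	if (x->offset != y->offset)
		return (x->offset < y->offset ? -1 : 1);
	return (0);
}

/*
 * Reserve a slot on the next eligible group, starting at *rotor.
 * Returns the group index, or -1 if every group is at its depth limit.
 */
static int
pick_group(group_t *groups, int ngroups, int *rotor)
{
	for (int i = 0; i < ngroups; i++) {
		int idx = (*rotor + i) % ngroups;

		if (groups[idx].inflight < groups[idx].queue_depth) {
			groups[idx].inflight++;
			*rotor = (idx + 1) % ngroups;
			return (idx);
		}
	}
	return (-1);
}

/*
 * Sort the pending queue and hand out work until the groups are full.
 * assigned[i] receives the group chosen for the i-th sorted request.
 * Returns how many requests were dispatched; the rest stay queued.
 */
static size_t
dispatch(alloc_req_t *queue, size_t nqueued, group_t *groups, int ngroups,
    int *rotor, int *assigned)
{
	size_t done = 0;

	qsort(queue, nqueued, sizeof (queue[0]), alloc_req_compare);
	while (done < nqueued) {
		int idx = pick_group(groups, ngroups, rotor);

		if (idx < 0)
			break;	/* throttled: all groups are busy */
		assigned[done++] = idx;
	}
	return (done);
}

/* A group finished one allocation and can accept more work. */
static void
complete(group_t *g)
{
	assert(g->inflight > 0);
	g->inflight--;
}
```

With two groups of depth 2 and five queued requests, a first dispatch() call
in this sketch places four requests (alternating between the groups) and
leaves the latest-issued one queued. After complete() is called on one
group, a second dispatch() call on the remaining request places it on that
group.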

Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: George Wilson <george.wilson@delphix.com>

git-svn-id: svn://svn.freebsd.org/base/stable/10@307279 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
17 files changed:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/refcount.c
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab.h
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/refcount.h
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio_impl.h
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c
sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h