FreeBSD/FreeBSD.git
11 years agoDo not call malloc(M_WAITOK) while bodev->fence_lock mutex is
kib [Sat, 23 Mar 2013 22:23:15 +0000 (22:23 +0000)]
Do not call malloc(M_WAITOK) while bodev->fence_lock mutex is
held. The ttm_buffer_object_transfer() does not need the mutex locked
at all, except for the call to the driver sync_obj_ref() method.

Reported and tested by: dumbbell
MFC after:   2 weeks

11 years agoMerge bugfix from vendor master branch:
mm [Sat, 23 Mar 2013 21:34:10 +0000 (21:34 +0000)]
Merge bugfix from vendor master branch:

Limit write requests to at most INT_MAX.
This prevents a certain common programming error (passing -1 to write)
from leading to other problems deeper in the library.

References:
https://github.com/libarchive/libarchive/commit/22531545514043e0

Reported by: Xin Li <delphij@FreeBSD.org>
Obtained from:  libarchive (master branch)
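
The clamp described in this commit can be sketched as below. This is a minimal illustration, not libarchive's actual code; the helper name clamp_write_size is hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper (not taken from libarchive): clamp a requested
 * write length so it never exceeds INT_MAX.  A caller that passes -1
 * where a size_t is expected ends up requesting SIZE_MAX bytes; clamping
 * keeps that mistake from propagating into later size arithmetic.
 */
size_t
clamp_write_size(size_t requested)
{
	const size_t max_write = (size_t)INT_MAX;

	return (requested > max_write ? max_write : requested);
}
```

A caller would then loop, writing at most the clamped amount per call until the whole buffer has been consumed.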

11 years agodrm/ttm: Fix a typo: s/pTTM]/[TTM]/
dumbbell [Sat, 23 Mar 2013 20:46:47 +0000 (20:46 +0000)]
drm/ttm: Fix a typo: s/pTTM]/[TTM]/

11 years agodrm/ttm: Explain why we don't need to acquire a ref in ttm_bo_vm_ctor()
dumbbell [Sat, 23 Mar 2013 20:43:26 +0000 (20:43 +0000)]
drm/ttm: Explain why we don't need to acquire a ref in ttm_bo_vm_ctor()

11 years agoFix kernel build with options ZFS after r24571 (libzfs_core).
mm [Sat, 23 Mar 2013 20:01:45 +0000 (20:01 +0000)]
Fix kernel build with options ZFS after r24571 (libzfs_core).

Submitted by: Bjoern A. Zeeb <bz@FreeBSD.org>

11 years agoRevert 248634 and 248643 (i.e., restoring 248625 and 248639).
mckusick [Sat, 23 Mar 2013 20:00:02 +0000 (20:00 +0000)]
Revert 248634 and 248643 (i.e., restoring 248625 and 248639).

Build verified by: Glen Barber (gjb@)

11 years agodrm/ttm: Fix TTM buffer object refcount
dumbbell [Sat, 23 Mar 2013 19:19:19 +0000 (19:19 +0000)]
drm/ttm: Fix TTM buffer object refcount

This fixes memory leaks in the radeonkms driver.

Reviewed by: Konstantin Belousov (kib@)
Tested by: J.R. Oldroyd <jr@opal.com>

11 years agoFix compiling ed w/ WITHOUT_ED_CRYPTO... These variables aren't
jmg [Sat, 23 Mar 2013 19:04:57 +0000 (19:04 +0000)]
Fix compiling ed w/ WITHOUT_ED_CRYPTO.  These variables aren't
used.

Submitted by:   deeptech71 at gmail dot com

11 years agoDon't check and warn about pmap mismatch on every call to busdma sync.
ian [Sat, 23 Mar 2013 17:17:06 +0000 (17:17 +0000)]
Don't check and warn about pmap mismatch on every call to busdma sync.
With some recent busdma refactoring, sometimes it happens that a sync
op gets called when bus_dmamap_load() never got called, which results
in a spurious warning about a map mismatch when no sync operations will
actually happen anyway.  Now the check is done only if a sync operation
is actually performed, and the result of the check is a panic, not just
a printf.

Reviewed by: cognet (who prevented me from donning a pointy hat)

11 years agoBe more explicit about what each bio_cmd & bio_flags value means.
will [Sat, 23 Mar 2013 16:55:07 +0000 (16:55 +0000)]
Be more explicit about what each bio_cmd & bio_flags value means.

Reviewed by: ken (mentor)

11 years agoZFS: Fix a panic while unmounting a busy filesystem.
will [Sat, 23 Mar 2013 16:34:56 +0000 (16:34 +0000)]
ZFS: Fix a panic while unmounting a busy filesystem.

This particular scenario was easily reproduced using an NFS export.  When the
first 'zfs unmount' occurred, it returned EBUSY via this path, while
vflush() had flushed references on the filesystem's root vnode, which in
turn caused its v_interlock to be destroyed.  The next time 'zfs unmount'
was called, vflush() tried to obtain this lock, which caused this panic.

Since vflush() on FreeBSD is a definitive call, there is no need to check
vfsp->vfs_count after it completes.  Simply #ifdef sun this check.

Submitted by: avg
Reviewed by: avg
Approved by: ken (mentor)
MFC after: 1 month

11 years agoExtend taskqueue(9) to enable per-taskqueue callbacks.
will [Sat, 23 Mar 2013 15:11:53 +0000 (15:11 +0000)]
Extend taskqueue(9) to enable per-taskqueue callbacks.

The scope of these callbacks is primarily to support actions that affect the
taskqueue's thread environments.  They are entirely optional, and
consequently are introduced as a new API: taskqueue_set_callback().

This interface allows the caller to specify that a taskqueue requires a
callback and optional context pointer for a given callback type.

The callback types included in this commit can be used to register a
constructor and destructor for thread-local storage using osd(9).  This
allows a particular taskqueue to define that its threads require a specific
type of TLS, without the need for a specially-orchestrated task-based
mechanism for startup and shutdown in order to accomplish it.

Two callback types are supported at this point:

- TASKQUEUE_CALLBACK_TYPE_INIT, called by every thread when it starts, prior
  to processing any tasks.
- TASKQUEUE_CALLBACK_TYPE_SHUTDOWN, called by every thread when it exits,
  after it has processed its last task but before the taskqueue is
  reclaimed.

While I'm here:

- Add two new macros, TQ_ASSERT_LOCKED and TQ_ASSERT_UNLOCKED, and use them
  in appropriate locations.
- Fix taskqueue.9 to mention taskqueue_start_threads(), which is a required
  interface for all consumers of taskqueue(9).

Reviewed by: kib (all), eadler (taskqueue.9), brd (taskqueue.9)
Approved by: ken (mentor)
Sponsored by: Spectra Logic
MFC after: 1 month

11 years agoRevert r247892 now that this has been fixed upstream.
des [Sat, 23 Mar 2013 14:52:31 +0000 (14:52 +0000)]
Revert r247892 now that this has been fixed upstream.

11 years agoMake `systat -vmstat` to use suffixes to display big floating point numbers
mav [Sat, 23 Mar 2013 13:11:54 +0000 (13:11 +0000)]
Make `systat -vmstat` use suffixes to display big floating point numbers
that do not fit into the specified field width, the same as is done for ints.
In particular, this allows disk tps above 100k, which is reachable with
modern SSDs, to be displayed properly.
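
The idea of falling back to a suffix when a number is too wide can be sketched as follows. This is an illustration only and not the systat source; the function name format_scaled and the k/M/G suffix set are assumptions.

```c
#include <stdio.h>

/*
 * Illustrative only; this is not the systat source.  Print `value` with
 * no decimals into `buf`; if the result is wider than `width`, divide by
 * 1000 and append the next suffix (k, M, G) until it fits.
 * Returns 0 on success, -1 if the value does not fit even with 'G'.
 */
int
format_scaled(double value, int width, char *buf, size_t buflen)
{
	static const char suffixes[] = { '\0', 'k', 'M', 'G' };
	size_t i;
	int n;

	for (i = 0; i < sizeof(suffixes); i++) {
		if (suffixes[i] == '\0')
			n = snprintf(buf, buflen, "%.0f", value);
		else
			n = snprintf(buf, buflen, "%.0f%c", value,
			    suffixes[i]);
		if (n >= 0 && n <= width && (size_t)n < buflen)
			return (0);
		value /= 1000.0;
	}
	return (-1);
}
```

With a field width of 5, a value that prints as six digits no longer fits and is shown in thousands with a "k" suffix instead.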

11 years agopost mountroot event after a real/final root is mounted
avg [Sat, 23 Mar 2013 08:59:34 +0000 (08:59 +0000)]
post mountroot event after a real/final root is mounted

not every time an intermediate root (including the first devfs) is
mounted.
This is also consistent with waking up via root_mount_complete.

Reviewed by: jhb
MFC after: 13 days

11 years agodtrace: ensure that we can always catch a process (e.g. when -c is used)
avg [Sat, 23 Mar 2013 08:57:54 +0000 (08:57 +0000)]
dtrace: ensure that we can always catch a process (e.g. when -c is used)

It is not guaranteed that a program has a symbol table entry for main
and thus that it would be possible to set a breakpoint on it.

Reviewed by: rpaulo
Discussed with: rpaulo
MFC after: 13 days

11 years agoRevert r248639 to fix build failure on head/
gjb [Sat, 23 Mar 2013 08:57:14 +0000 (08:57 +0000)]
Revert r248639 to fix build failure on head/

11 years agofbt_getargdesc: correctly handle types for return probes
avg [Sat, 23 Mar 2013 08:52:50 +0000 (08:52 +0000)]
fbt_getargdesc: correctly handle types for return probes

MFC after: 6 days

11 years agolibdwarf: anonymous types are expected to have empty type names...
avg [Sat, 23 Mar 2013 08:50:56 +0000 (08:50 +0000)]
libdwarf: anonymous types are expected to have empty type names...

or no type attributes at all.
This is according to DWARF specification.

MFC after: 13 days

11 years agofbt_typoff_init: fix an off by one in determining required memory size
avg [Sat, 23 Mar 2013 08:48:44 +0000 (08:48 +0000)]
fbt_typoff_init: fix an off by one in determining required memory size

This issue would be silent most of the time, but if the requested memory
is a multiple of a page size, then accessing one element beyond the end
would lead to a kernel page fault.
Otherwise, the unlucky last type would just be inaccessible.

Reported by: glebius
Tested by: glebius
MFC after: 6 days
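
The class of bug fixed here can be sketched as below. This is a hypothetical model rather than the fbt code: it assumes type IDs run from 1 to max_id inclusive and are used directly as array indices, and the names alloc_offset_table and set_offset are made up for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical model: IDs run from 1 to max_id inclusive and index the
 * table directly, so the table needs max_id + 1 slots.  Allocating only
 * max_id slots would put the last ID one element past the end.
 */
uint32_t *
alloc_offset_table(uint32_t max_id)
{
	return (calloc((size_t)max_id + 1, sizeof(uint32_t)));
}

/* Record an offset for an ID; returns -1 if the ID is out of range. */
int
set_offset(uint32_t *table, uint32_t max_id, uint32_t id, uint32_t off)
{
	if (table == NULL || id == 0 || id > max_id)
		return (-1);
	table[id] = off;
	return (0);
}
```

With the extra slot, the last ID is a valid index; without it, the access lands one element beyond the allocation, which only faults when that element happens to fall on an unmapped page.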

11 years agoFix the build after addition of cylinder group caching (r248625)
mckusick [Sat, 23 Mar 2013 07:57:30 +0000 (07:57 +0000)]
Fix the build after addition of cylinder group caching (r248625)

Reported by:   Glen Barber (gjb@)
Pointy hat to: Kirk McKusick (mckusick@)

11 years agoRevert svn r248625
sbruno [Sat, 23 Mar 2013 04:26:13 +0000 (04:26 +0000)]
Revert svn r248625

Clang errors around printf could be trivially fixed, but the breakage in
sbin/fsdb was too significant for this type of change.

Submitter of this changeset has been notified and hopefully this can be
restored soon.

11 years agoAdd AR9300 descriptor decoding.
adrian [Sat, 23 Mar 2013 01:25:11 +0000 (01:25 +0000)]
Add AR9300 descriptor decoding.

11 years agoDon't attempt to reference sc before testing whether it's NULL.
delphij [Fri, 22 Mar 2013 22:46:19 +0000 (22:46 +0000)]
Don't attempt to reference sc before testing whether it's NULL.

Submitted by: Sascha Wildner
Obtained from: DragonFly
MFC after: 2 weeks

11 years agoSpeed up fsck by caching the cylinder group maps in pass1 so
mckusick [Fri, 22 Mar 2013 21:50:43 +0000 (21:50 +0000)]
Speed up fsck by caching the cylinder group maps in pass1 so
that they do not need to be read again in pass5. As this nearly
doubles the memory requirement for fsck, the cache is thrown away
if other memory needs in fsck would otherwise fail. Thus, the
memory footprint of fsck remains unchanged in memory constrained
environments.

This work was inspired by a paper presented at Usenix's FAST '13:
www.usenix.org/conference/fast13/ffsck-fast-file-system-checker

Details of this implementation appear in the April 2013 issue of ;login:
www.usenix.org/publications/login/april-2013-volume-38-number-2.
A copy of the April 2013 ;login: paper can also be downloaded
from: www.mckusick.com/publications/faster_fsck.pdf.

Reviewed by: kib
Tested by:   Peter Holm
MFC after:   4 weeks
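
The "throw the cache away if another allocation would otherwise fail" policy can be sketched as follows. This is a minimal model, not the fsck_ffs code; struct cg_cache, cg_cache_drop and alloc_or_drop_cache are hypothetical names, and the allocator is passed in as a function pointer only to keep the sketch easy to exercise.

```c
#include <stdlib.h>

/* Hypothetical cache of cylinder group maps; names are illustrative. */
struct cg_cache {
	void	**maps;		/* one cached map per cylinder group */
	size_t	  ncg;		/* number of cylinder groups */
};

/* Release every cached map so the memory can be reused elsewhere. */
void
cg_cache_drop(struct cg_cache *cache)
{
	size_t i;

	for (i = 0; i < cache->ncg; i++) {
		free(cache->maps[i]);
		cache->maps[i] = NULL;
	}
}

/*
 * Allocate memory for some other need.  If the first attempt fails,
 * discard the cache and retry once, so that the cache never causes a
 * failure that would not have happened without it.
 */
void *
alloc_or_drop_cache(struct cg_cache *cache, size_t size,
    void *(*allocfn)(size_t))
{
	void *p;

	p = allocfn(size);
	if (p == NULL) {
		cg_cache_drop(cache);
		p = allocfn(size);
	}
	return (p);
}
```

After the cache has been dropped, a later pass simply finds NULL entries and has to read the maps again, which is the pre-change behavior.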

11 years agoAs it's done for libstdc++, use SJLJ-based exceptions on arm when we're not
cognet [Fri, 22 Mar 2013 21:50:32 +0000 (21:50 +0000)]
As it's done for libstdc++, use SJLJ-based exceptions on arm when we're not
using EABI, and use unwind-arm.h instead of unwind-generic.h when using EABI.

11 years agoThe purpose of this change to the FFS layout policy is to reduce the
mckusick [Fri, 22 Mar 2013 21:45:28 +0000 (21:45 +0000)]
The purpose of this change to the FFS layout policy is to reduce the
running time for a full fsck. It also reduces the random access time
for large files and speeds the traversal time for directory tree walks.

The key idea is to reserve a small area in each cylinder group
immediately following the inode blocks for the use of metadata,
specifically indirect blocks and directory contents. The new policy
is to preferentially place metadata in the metadata area and
everything else in the blocks that follow the metadata area.

The size of this area can be set when creating a filesystem using
newfs(8) or changed in an existing filesystem using tunefs(8).
Both utilities use the `-k held-for-metadata-blocks' option to
specify the amount of space to be held for metadata blocks in each
cylinder group. By default, newfs(8) sets this area to half of
minfree (typically 4% of the data area).

This work was inspired by a paper presented at Usenix's FAST '13:
www.usenix.org/conference/fast13/ffsck-fast-file-system-checker

Details of this implementation appear in the April 2013 issue of ;login:
www.usenix.org/publications/login/april-2013-volume-38-number-2.
A copy of the April 2013 ;login: paper can also be downloaded
from: www.mckusick.com/publications/faster_fsck.pdf.

Reviewed by: kib
Tested by:   Peter Holm
MFC after:   4 weeks

11 years agoRemove __FreeBSD_version ifdefs.
glebius [Fri, 22 Mar 2013 20:44:16 +0000 (20:44 +0000)]
Remove __FreeBSD_version ifdefs.

11 years agorc.d/sysctl: Fix error messages about unknown OIDs.
jilles [Fri, 22 Mar 2013 20:12:25 +0000 (20:12 +0000)]
rc.d/sysctl: Fix error messages about unknown OIDs.

There are three situations where the sysctl script is called:
1. "start", very early
2. "lastload", near the end of rc
3. "reload", at admin request while the system is booted

Ignore unknown OIDs in situation 1, because kernel modules may not be loaded
yet, and complain about them in situations 2 and 3.

PR: conf/174595
Submitted by: Olivier Smedts

11 years agoUpgrade to OpenSSH 6.2p1. The most important new features are support
des [Fri, 22 Mar 2013 17:55:38 +0000 (17:55 +0000)]
Upgrade to OpenSSH 6.2p1.  The most important new features are support
for a key revocation list and more fine-grained authentication control.

11 years agoRetire the mislabeled ENABLE_SUID_SSH knob.
des [Fri, 22 Mar 2013 14:10:15 +0000 (14:10 +0000)]
Retire the mislabeled ENABLE_SUID_SSH knob.

11 years agoMFV r248590,248594:
mm [Fri, 22 Mar 2013 13:36:03 +0000 (13:36 +0000)]
MFV r248590,248594:
Update libarchive to 3.1.2

Some of the new features:
  - support for lrzip and grzip compression
  - support for writing tar v7 format
  - b64encode and uuencode filters
  - support for __MACOSX directory in Zip archives
  - support for lzop compression (external utility)

11 years agoVendor import of OpenSSH 6.2p1.
des [Fri, 22 Mar 2013 11:19:48 +0000 (11:19 +0000)]
Vendor import of OpenSSH 6.2p1.

11 years agoReplace deprecated (or remove obsolete) libarchive 2.8 functions
mm [Fri, 22 Mar 2013 10:17:42 +0000 (10:17 +0000)]
Replace deprecated (or remove obsolete) libarchive 2.8 functions
with libarchive 3.0 counterparts

11 years ago- Constify local path variable for chflagsat().
pjd [Fri, 22 Mar 2013 07:40:34 +0000 (07:40 +0000)]
- Constify local path variable for chflagsat().
- Use correct format characters (%lx) for u_long.

This fixes the build broken in r248599.

11 years agoClean up some unused leftover code.
kevlo [Fri, 22 Mar 2013 01:45:54 +0000 (01:45 +0000)]
Clean up some unused leftover code.

Pointed out by: ae

11 years agoRemove unused global variables.
kevlo [Fri, 22 Mar 2013 01:40:17 +0000 (01:40 +0000)]
Remove unused global variables.

Reviewed by: ae, glebius

11 years agoUpdate regression tests after adding chflagsat(2).
pjd [Thu, 21 Mar 2013 23:07:04 +0000 (23:07 +0000)]
Update regression tests after adding chflagsat(2).

Sponsored by: The FreeBSD Foundation

11 years agoFix for building libzpool under i386.
smh [Thu, 21 Mar 2013 23:06:11 +0000 (23:06 +0000)]
Fix for building libzpool under i386.

Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
MFC after: 2 weeks

11 years agoDocument chflagsat(2).
pjd [Thu, 21 Mar 2013 23:05:44 +0000 (23:05 +0000)]
Document chflagsat(2).

Obtained from: jilles

11 years agoRegenerate after r248599.
pjd [Thu, 21 Mar 2013 23:02:19 +0000 (23:02 +0000)]
Regenerate after r248599.

Sponsored by: The FreeBSD Foundation

11 years agoImplement chflagsat(2) system call, similar to fchmodat(2), but operates on
pjd [Thu, 21 Mar 2013 22:59:01 +0000 (22:59 +0000)]
Implement chflagsat(2) system call, similar to fchmodat(2), but operates on
file flags.

Reviewed by: kib, jilles
Sponsored by: The FreeBSD Foundation

11 years agoRegenerate after r248597.
pjd [Thu, 21 Mar 2013 22:47:03 +0000 (22:47 +0000)]
Regenerate after r248597.

Sponsored by: The FreeBSD Foundation

11 years ago- Make 'flags' argument to chflags(2), fchflags(2) and lchflags(2) of type
pjd [Thu, 21 Mar 2013 22:44:33 +0000 (22:44 +0000)]
- Make 'flags' argument to chflags(2), fchflags(2) and lchflags(2) of type
  u_long. Before this change it was of type int for syscalls, but prototypes
  in sys/stat.h and documentation for chflags(2) and fchflags(2) (but not
  for lchflags(2)) stated that it was u_long. Now some related functions
  use u_long type for flags (strtofflags(3), fflagstostr(3)).
- Make path argument of type 'const char *' for consistency.

Discussed on: arch
Sponsored by: The FreeBSD Foundation

11 years agoCorrect the page count when excess length is trimmed from the bio.
kib [Thu, 21 Mar 2013 22:36:43 +0000 (22:36 +0000)]
Correct the page count when excess length is trimmed from the bio.

Reported and tested by: Ivan Klymenko <fidaj@ukr.net>

11 years agoAllow O_CLOEXEC in posix_openpt() flags.
jilles [Thu, 21 Mar 2013 21:39:15 +0000 (21:39 +0000)]
Allow O_CLOEXEC in posix_openpt() flags.

PR: kern/162374
Reviewed by: ed

11 years agoFix a bug in UMTX_PROFILING:
attilio [Thu, 21 Mar 2013 19:58:25 +0000 (19:58 +0000)]
Fix a bug in UMTX_PROFILING:
UMTX_PROFILING should really analyze the distribution of locks as they
index entries in the umtxq_chains hash-table.
However, the current implementation increments/decrements the length
counters for *every* thread insert/removal, so it really measures userland
contention and not the hash distribution.

Fix this by incrementing/decrementing the length counters only at the
points where it is really needed.

Please note that in the past this bug led us to question the quality
of the umtx hash table distribution.
To date, with all the benchmarks I could try, I was not able to reproduce
any issue with the hash distribution on umtx.

Sponsored by: EMC / Isilon storage division
Reviewed by: jeff, davide
MFC after: 2 weeks
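
The distinction between counting threads and counting hash-chain entries can be sketched with a small user-space model. This is not the kernel's umtx code; struct chain, chain_insert and chain_remove are hypothetical, and the fixed-size arrays are an assumption made to keep the sketch short.

```c
#define MAX_KEYS 8

/* Hypothetical model of one hash chain; not the kernel's structure. */
struct chain {
	int keys[MAX_KEYS];	/* distinct lock keys hashed to this chain */
	int waiters[MAX_KEYS];	/* threads sleeping on each key */
	int nkeys;
	int length;		/* profiling counter: keys, not threads */
	int max_length;		/* high-water mark of length */
};

static int
chain_find(const struct chain *c, int key)
{
	int i;

	for (i = 0; i < c->nkeys; i++)
		if (c->keys[i] == key)
			return (i);
	return (-1);
}

/* A thread starts waiting on `key`.  Returns -1 if the model is full. */
int
chain_insert(struct chain *c, int key)
{
	int i = chain_find(c, key);

	if (i >= 0) {
		/* Another waiter on an existing key: counter unchanged. */
		c->waiters[i]++;
		return (0);
	}
	if (c->nkeys == MAX_KEYS)
		return (-1);
	c->keys[c->nkeys] = key;
	c->waiters[c->nkeys] = 1;
	c->nkeys++;
	/* A new key entered the chain: this is what profiling counts. */
	c->length++;
	if (c->length > c->max_length)
		c->max_length = c->length;
	return (0);
}

/* A thread stops waiting on `key`. */
void
chain_remove(struct chain *c, int key)
{
	int i = chain_find(c, key);

	if (i < 0)
		return;
	if (--c->waiters[i] > 0)
		return;		/* key still has waiters */
	c->nkeys--;
	c->keys[i] = c->keys[c->nkeys];
	c->waiters[i] = c->waiters[c->nkeys];
	c->length--;
}
```

Three threads contending on the same lock leave the counter at 1 in this model, whereas counting every insert would report 3 and make the hash look worse than it is.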

11 years agoUpdate libarchive's vendor dist to version 3.1.2 from release branch.
mm [Thu, 21 Mar 2013 18:59:02 +0000 (18:59 +0000)]
Update libarchive's vendor dist to version 3.1.2 from release branch.

Git branch: release
Git commit: 19f23e191f9d3e1dd2a518735046100419965804

Obtained from: https://github.com/libarchive/libarchive.git

11 years agoDocument some flags to the uma_zcreate(). Not all flags are documented,
glebius [Thu, 21 Mar 2013 16:19:46 +0000 (16:19 +0000)]
Document some flags to the uma_zcreate(). Not all flags are documented,
only those that at least are used in the kernel, or that definitely
work.

11 years agoDocument uma_find_refcnt().
glebius [Thu, 21 Mar 2013 16:04:34 +0000 (16:04 +0000)]
Document uma_find_refcnt().

11 years agoMinimal timer period of 100us introduced in r244758 is overkill. While
mav [Thu, 21 Mar 2013 15:42:41 +0000 (15:42 +0000)]
Minimal timer period of 100us introduced in r244758 is overkill.  While
the original 2us is indeed not enough, 3us works quite well in my tests.
To be safer, set the minimal period to 5us, and to be safer still, replicate
here the HPET mechanism of rereading the counter after programming the
comparator.

This change allows handling 30K short nanosleep() calls per second on
Raspberry Pi instead of just 8K before.

Discussed with: gonzo

11 years agoAnother NFS SIGSTOP related fix: Ignore thread suspend requests due to
jhb [Thu, 21 Mar 2013 14:06:27 +0000 (14:06 +0000)]
Another NFS SIGSTOP related fix: Ignore thread suspend requests due to
SIGSTOP if stop signals are currently deferred.  This can occur if a
process is stopped via SIGSTOP while a thread is running or runnable
but before it has set TDF_SBDRY.

Tested by: pho
Reviewed by: kib
MFC after: 1 week

11 years agoFix twa(4) after the r246713. The driver copies data around to
kib [Thu, 21 Mar 2013 13:06:28 +0000 (13:06 +0000)]
Fix twa(4) after the r246713.  The driver copies data around to
satisfy some alignment restrictions.  Do not set TW_OSLI_REQ_FLAGS_CCB
flag for mapped data, pass the csio->data_ptr in the req->data.

Do not put the ccb pointer into req->data ever, ccb is stored in
req->orig_req already.

Submitted by: Shuichi KITAGUCHI <ki@hh.iij4u.or.jp>
PR: kern/177020

11 years agoDocument NGM_NAT_LIBALIAS_INFO.
glebius [Thu, 21 Mar 2013 13:02:43 +0000 (13:02 +0000)]
Document NGM_NAT_LIBALIAS_INFO.

Submitted by: Dmitry Luhtionov <dmitryluhtionov gmail.com>

11 years agoInitialize the variable to avoid (false) compiler warning about
kib [Thu, 21 Mar 2013 12:59:24 +0000 (12:59 +0000)]
Initialize the variable to avoid (false) compiler warning about
use of an uninitialized local.

Reported by: Ivan Klymenko <fidaj@ukr.net>
MFC after: 2 weeks

11 years agoRemove a reference to instant-server which has been removed from the
eadler [Thu, 21 Mar 2013 12:42:25 +0000 (12:42 +0000)]
Remove a reference to instant-server which has been removed from the
ports tree in r313427.

PR: 177012
Submitted by: Kevin Zheng <kevinz5000@gmail.com>
Approved by: bcr (mentor)

11 years agoAdd missing descriptions for ZFS sysctls
smh [Thu, 21 Mar 2013 11:25:21 +0000 (11:25 +0000)]
Add missing descriptions for ZFS sysctls

Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
MFC after: 2 weeks

11 years agoRemove EOL whitespace.
joel [Thu, 21 Mar 2013 11:22:13 +0000 (11:22 +0000)]
Remove EOL whitespace.

11 years agoOptimisation of TRIM processing.
smh [Thu, 21 Mar 2013 11:02:08 +0000 (11:02 +0000)]
Optimisation of TRIM processing.

Previously TRIM processing was very bursty. This was made worse by the fact
that TRIM requests on SSDs are typically much slower than reads or writes.
This often resulted in stalls while large numbers of TRIMs were processed.

In addition, because the TRIM thread was only woken by writes, deletes
could stall in the queue for extensive periods of time.

This patch adds a number of controls to how often the TRIM thread for each
SPA processes its outstanding delete requests.
vfs.zfs.trim.timeout: Delay TRIMs by up to this many seconds
vfs.zfs.trim.txg_delay: Delay TRIMs by up to this many TXGs (reduced to 32)
vfs.zfs.vdev.trim_max_bytes: Maximum pending TRIM bytes for a vdev
vfs.zfs.vdev.trim_max_pending: Maximum pending TRIM segments for a vdev
vfs.zfs.trim.max_interval: Maximum interval between TRIM queue processing
(seconds)

Given the most common TRIM implementation is ATA TRIM the current defaults
are targeted at that.

Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
MFC after: 2 weeks

11 years agoNames the ZFS TRIM thread
smh [Thu, 21 Mar 2013 10:41:30 +0000 (10:41 +0000)]
Names the ZFS TRIM thread

Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
MFC after: 2 weeks

11 years agoTRIM cache devices based on time instead of TXGs.
smh [Thu, 21 Mar 2013 10:29:05 +0000 (10:29 +0000)]
TRIM cache devices based on time instead of TXGs.
Currently, the trim module uses the same algorithm for data and cache
devices when deciding to issue TRIM requests, based on how far in the
past the TXG is.

Unfortunately, this is not ideal for cache devices, because the L2ARC
doesn't use the concept of TXGs at all. In fact, when using a pool for
reading only, the L2ARC is written but the TXG counter doesn't
increase, and so no new TRIM requests are issued to the cache device.

This patch fixes the issue by using time instead of the TXG number as
the criteria for trimming on cache devices. The basic delay principle
stays the same, but parameters are expressed in seconds instead of
TXGs. The new parameters are named trim_l2arc_limit and
trim_l2arc_batch, and both default to 30 seconds.

Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
Obtained from: https://github.com/dechamps/zfs/commit/17122c31ac7f82875e837019205c21651c05f8cd
MFC after: 2 weeks

11 years agoImprove TXG handling in the TRIM module.
smh [Thu, 21 Mar 2013 10:16:10 +0000 (10:16 +0000)]
Improve TXG handling in the TRIM module.
This patch adds some improvements to the way the trim module considers
TXGs:

 - Free ZIOs are registered with the TXG from the ZIO itself, not the
   current SPA syncing TXG (which may be out of date);
 - L2ARC are registered with a zero TXG number, as L2ARC has no concept
   of TXGs;
 - The TXG limit for issuing TRIMs is now computed from the last synced
   TXG, not the currently syncing TXG. Indeed, under extremely unlikely
   race conditions, there is a risk we could trim blocks which have been
   freed in a TXG that has not finished syncing, resulting in potential
   data corruption in case of a crash.

Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
Obtained from: https://github.com/dechamps/zfs/commit/5b46ad40d9081d75505d6f3bf04ac652445df366
MFC after: 2 weeks
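
The third point above can be sketched as a simple eligibility check. This is an assumption-laden illustration, not the trim_map code: the function name trim_eligible and the exact comparison are hypothetical, and `delay` stands for whatever TXG delay is configured.

```c
#include <stdint.h>

/*
 * Illustrative only: a range freed in TXG `free_txg` may be trimmed once
 * that TXG is at least `delay` TXGs older than the last TXG known to be
 * fully synced.  Basing the limit on the last synced TXG (rather than the
 * one currently syncing) avoids trimming blocks whose free has not yet
 * reached stable storage.
 */
int
trim_eligible(uint64_t free_txg, uint64_t last_synced_txg, uint64_t delay)
{
	if (last_synced_txg < delay)
		return (0);	/* not enough history yet */
	return (free_txg <= last_synced_txg - delay);
}
```

An entry registered with a zero TXG number, as described for L2ARC, becomes eligible in this model as soon as the last synced TXG reaches the delay.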

11 years agoDon't register repair writes in the trim map.
smh [Thu, 21 Mar 2013 10:02:32 +0000 (10:02 +0000)]
Don't register repair writes in the trim map.

The trim map inflight writes tree assumes non-conflicting writes, i.e.
that there will never be two simultaneous write I/Os to the same range
on the same vdev. This seemed like a sane assumption; however, in
actual testing, it appears that repair I/Os can very well conflict
with "normal" writes.

I'm not quite sure if these conflicting writes are supposed to happen
or not, but in the mean time, let's ignore repair writes for now. This
should be safe considering that, by definition, we never repair blocks
that are freed.

Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
Obtained from: https://github.com/dechamps/zfs/commit/6a3cebaf7c5fcc92007280b5d403c15d0e61dfe3

11 years agoAdd TRIM support for L2ARC.
smh [Thu, 21 Mar 2013 09:34:41 +0000 (09:34 +0000)]
Add TRIM support for L2ARC.

This adds TRIM support to cache vdevs. When ARC buffers are removed
from the L2ARC in arc_hdr_destroy(), arc_release() or l2arc_evict(),
the size previously occupied by the buffer gets scheduled for TRIMming.
As always, actual TRIMs are only issued to the L2ARC after
txg_trim_limit.

Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
Obtained from: https://github.com/dechamps/zfs/commit/31aae373994fd112256607edba7de2359da3e9dc
MFC after: 2 weeks

11 years agoMerge libzfs_core branch:
mm [Thu, 21 Mar 2013 08:38:03 +0000 (08:38 +0000)]
Merge libzfs_core branch:
  includes MFV 238590, 238592, 247580

MFV 238590, 238592:
  In the first zfs ioctl restructuring phase, the libzfs_core library was
  introduced. It is a new thin library that wraps around kernel ioctl's.
  The idea is to provide a forward-compatible way of dealing with new
  features. Arguments are passed in nvlists and not random zfs_cmd fields,
  new-style ioctls are logged to pool history using a new method of
  history logging.

  http://blog.delphix.com/matt/2012/01/17/the-future-of-libzfs/

MFV 247580 [1]:
  To address issues of several deadlocks and race conditions the locking
  code around dsl_dataset was rewritten and the interface to synctasks
  was changed.

User-Visible Changes:
  "zfs snapshot" can create multiple arbitrary snapshots at once (atomically)
  "zfs destroy" destroys multiple snapshots at once
  "zfs recv" has improved performance

Backward Compatibility:
  I have extended the compatibility layer to support full backward
  compatibility by remapping or rewriting the responsible ioctl arguments.
  Old utilities are fully supported by the new kernel module.

Forward Compatibility:
  New utilities work with old kernels with the following restrictions:
    - creating, destroying, holding and releasing of multiple snapshots
      at once is not supported, this includes recursive (-r) commands

Illumos ZFS issues:
  2882 implement libzfs_core
  2900 "zfs snapshot" should be able to create multiple,
       arbitrary snapshots at once
  3464 zfs synctask code needs restructuring

References:
  https://www.illumos.org/issues/2882
  https://www.illumos.org/issues/2900
  https://www.illumos.org/issues/3464 [1]

MFC after: 1 month
Sponsored by: Hybrid Logic Inc. [1]

11 years agoAdd NGM_NAT_LIBALIAS_INFO command, that reports internal stats
glebius [Thu, 21 Mar 2013 08:36:15 +0000 (08:36 +0000)]
Add NGM_NAT_LIBALIAS_INFO command, that reports internal stats
of libalias instance. To be used in the mpd5 daemon.

Submitted by: Dmitry Luhtionov <dmitryluhtionov gmail.com>

11 years agoOnly size and create the bio_transient_map when unmapped buffers are
kib [Thu, 21 Mar 2013 07:28:15 +0000 (07:28 +0000)]
Only size and create the bio_transient_map when unmapped buffers are
enabled.  Now, disabling the unmapped buffers should result in the
kernel memory map identical to pre-r248550.

Sponsored by: The FreeBSD Foundation

11 years agoAssert that transient mapping of the bio is only done when unmapped
kib [Thu, 21 Mar 2013 07:26:33 +0000 (07:26 +0000)]
Assert that transient mapping of the bio is only done when unmapped
buffers are allowed.

Sponsored by: The FreeBSD Foundation

11 years agoDo not call vnode_pager_setsize() while a NFS node mutex is
kib [Thu, 21 Mar 2013 07:25:08 +0000 (07:25 +0000)]
Do not call vnode_pager_setsize() while a NFS node mutex is
locked. vnode_pager_setsize() might sleep waiting for the page after
EOF to be unbusied.

Call vnode_pager_setsize() both for the regular and directory vnodes.

Reported by: mich
Reviewed by: rmacklem
Discussed with: avg, jhb
MFC after: 2 weeks

11 years agoAdd new USB ID.
hselasky [Thu, 21 Mar 2013 07:04:17 +0000 (07:04 +0000)]
Add new USB ID.

PR: usb/177173
MFC after: 1 week

11 years agoSet WARNS=3 so this actually compiles.
neel [Wed, 20 Mar 2013 21:47:05 +0000 (21:47 +0000)]
Set WARNS=3 so this actually compiles.

11 years agoIn bufwrite(), a dirty buffer is moved to the clean queue before the
kib [Wed, 20 Mar 2013 21:08:00 +0000 (21:08 +0000)]
In bufwrite(), a dirty buffer is moved to the clean queue before the
bufobj counter of the writes in progress is incremented.  Another thread
inspecting the bufobj would consider it clean.

For the regular vnodes, the vnode lock is typically held both by the
thread performing the bufwrite() and another thread doing syncing,
which prevents the situation.  On the other hand, writes to the VCHR
vnodes are done without holding the vnode lock.

Increment the write ref counter for the buffer object before calling
bundirty().

Sponsored by: The FreeBSD Foundation
Tested by: pho
MFC after: 2 weeks

11 years agoWhen the journaled FFS volume is suspended due to the journal space
kib [Wed, 20 Mar 2013 21:07:49 +0000 (21:07 +0000)]
When the journaled FFS volume is suspended due to the journal space
becoming too low, the softdep flush thread processes the workitems,
which frees the space in journal, and then unsuspends the fs.  The
softdep_flush() and other workitem processing functions busy the
filesystem before iterating over the worklist, to prevent the parallel
unmount from freeing the mount data. The vfs_busy() is called with
MBF_NOWAIT flag.

Now, if the unmount is already started and the filesystem is suspended
due to low journal space, the journal is never flushed and filesystem
is never unsuspended, because vfs_busy(MBF_NOWAIT) call cannot succeed
for the unmounting fs, and softdep_flush() does not process the
workitems. Unmount needs to write metadata, where it hangs in the
"suspfs" state.

Move the vn_start_write() call in dounmount() before setting the
MNTK_UNMOUNT flag. This practically ensures that softdep_flush() has
processed the pending journal writes, by making dounmount() wait for
the suspension to be lifted.

Sponsored by: The FreeBSD Foundation
Reported and tested by: pho
MFC after: 2 weeks

11 years agoWhen renaming a directory from one parent directory to another,
mckusick [Wed, 20 Mar 2013 17:57:00 +0000 (17:57 +0000)]
When renaming a directory from one parent directory to another,
we need to call ufs_checkpath() to walk from our new location to
the root of the filesystem to ensure that we do not encounter
ourselves along the way. Until now, we accomplished this by reading
the ".." entries of each directory in our path until we reached
the root (or encountered an error). This change tries to avoid the
I/O of reading the ".." entries by first looking them up in the
name cache and only doing the I/O when the name cache lookup fails.

Reviewed by: kib
Tested by:   Peter Holm
MFC after:   4 weeks
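
The walk described in this commit can be sketched with a small in-memory model. This is not the UFS implementation: struct dirnode, lookup_dotdot, checkpath_ok and the io_reads counter are hypothetical names, and a pointer stands in for both the name cache entry and the on-disk ".." entry.

```c
#include <stddef.h>

/* Illustrative directory node; not the UFS on-disk format. */
struct dirnode {
	struct dirnode *cached_parent;	/* name cache entry for "..", or NULL */
	struct dirnode *disk_parent;	/* what reading ".." from disk yields */
};

int io_reads;	/* counts simulated directory reads */

/* Resolve "..": consult the cache first, fall back to a simulated read. */
static const struct dirnode *
lookup_dotdot(const struct dirnode *dp)
{
	if (dp->cached_parent != NULL)
		return (dp->cached_parent);
	io_reads++;
	return (dp->disk_parent);
}

/*
 * Return 1 if moving `source` under `target` is safe, or 0 if `source`
 * is `target` itself or one of its ancestors, in which case the rename
 * would make the directory its own descendant.
 */
int
checkpath_ok(const struct dirnode *source, const struct dirnode *target)
{
	const struct dirnode *dp = target;
	const struct dirnode *parent;

	for (;;) {
		if (dp == source)
			return (0);
		parent = lookup_dotdot(dp);
		if (parent == dp)	/* root: ".." refers to itself */
			return (1);
		dp = parent;
	}
}
```

In this model, only the steps whose ".." is missing from the cache increment io_reads, which is the saving the change is after.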

11 years agoIntegrate Efika MX project back to home.
ray [Wed, 20 Mar 2013 15:39:27 +0000 (15:39 +0000)]
Integrate Efika MX project back to home.

Sponsored by: The FreeBSD Foundation

11 years agoFix spelling.
hselasky [Wed, 20 Mar 2013 11:51:26 +0000 (11:51 +0000)]
Fix spelling.

11 years agoRemove unused variable.
melifaro [Wed, 20 Mar 2013 10:36:38 +0000 (10:36 +0000)]
Remove unused variable.

11 years agoAdd ipfw support for setting/matching DiffServ codepoints (DSCP).
melifaro [Wed, 20 Mar 2013 10:35:33 +0000 (10:35 +0000)]
Add ipfw support for setting/matching DiffServ codepoints (DSCP).

Setting DSCP is done via O_SETDSCP, which works for both IPv4 and
IPv6 packets. Fast checksum recalculation (RFC 1624) is done for IPv4.
DSCP can be specified by name (AFXY, CSX, BE, EF), by value
(0..63) or via tablearg.

Matching DSCP is done via another opcode (O_DSCP), which accepts several
classes at once (af11,af22,be). Classes are stored in a bitmask (2 u32 words).

Many people made their own variants of this patch; the ones I'm aware of
are (in alphabetical order):

Dmitrii Tejblum
Marcelo Araujo
Roman Bogorodskiy (novel)
Sergey Matveichuk (sem)
Sergey Ryabin

PR: kern/102471, kern/121122
MFC after: 2 weeks
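
The two mechanisms named above can be illustrated in user-space C: a
64-bit class set kept in two 32-bit words, and an RFC 1624 incremental
checksum update when the DSCP bits of an IPv4 header are rewritten. This
is a sketch only; struct dscp_mask, dscp_mask_add(), dscp_mask_match(),
ip_cksum() and ip4_set_dscp() are hypothetical names and not the ipfw
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One bit per DSCP value (0..63), split over two 32-bit words. */
struct dscp_mask {
	uint32_t w[2];
};

static void
dscp_mask_add(struct dscp_mask *m, unsigned dscp)
{
	assert(dscp < 64);
	m->w[dscp / 32] |= (uint32_t)1 << (dscp % 32);
}

static int
dscp_mask_match(const struct dscp_mask *m, unsigned dscp)
{
	return (dscp < 64 && ((m->w[dscp / 32] >> (dscp % 32)) & 1) != 0);
}

/* Full one's-complement checksum over big-endian 16-bit words. */
static uint16_t
ip_cksum(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;

	for (size_t i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}

/*
 * Rewrite the DSCP bits of an IPv4 header in place and patch the header
 * checksum incrementally: HC' = ~(~HC + ~m + m')  (RFC 1624, eqn. 3),
 * where m and m' are the old and new first 16-bit word of the header.
 */
static void
ip4_set_dscp(uint8_t *hdr, unsigned dscp)
{
	uint16_t old_word = (uint16_t)((hdr[0] << 8) | hdr[1]);
	uint16_t old_sum = (uint16_t)((hdr[10] << 8) | hdr[11]);
	uint16_t new_word, new_sum;
	uint32_t sum;

	/* DSCP is the upper six bits of the TOS byte; keep the ECN bits. */
	hdr[1] = (uint8_t)(((dscp & 0x3f) << 2) | (hdr[1] & 0x03));
	new_word = (uint16_t)((hdr[0] << 8) | hdr[1]);

	sum = (uint32_t)(uint16_t)~old_sum + (uint16_t)~old_word + new_word;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	new_sum = (uint16_t)~sum;
	hdr[10] = (uint8_t)(new_sum >> 8);
	hdr[11] = (uint8_t)(new_sum & 0xff);
}
```

A header patched this way should still verify: running ip_cksum() over
the whole header, checksum field included, yields 0 when it is valid.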

11 years agoRelease hold on pool before calling zvol_create_minor()
mm [Wed, 20 Mar 2013 09:56:20 +0000 (09:56 +0000)]
Release hold on pool before calling zvol_create_minor()

11 years agoFix the logic inversion in the r248512.
kib [Wed, 20 Mar 2013 09:44:23 +0000 (09:44 +0000)]
Fix the logic inversion in the r248512.

Noted by: mckay

11 years agoPull in r177252 from upstream clang trunk:
andrew [Wed, 20 Mar 2013 08:34:30 +0000 (08:34 +0000)]
Pull in r177252 from upstream clang trunk:

 Make sure to use same EABI version for external assembler as for
 integrated as.

This allows us to use gcc on a world built with clang on ARM.

11 years agoFix the EDMA CABQ handling - for now, the CABQ takes a descriptor chain
adrian [Wed, 20 Mar 2013 05:44:03 +0000 (05:44 +0000)]
Fix the EDMA CABQ handling - for now, the CABQ takes a descriptor chain
like the legacy chips expect.

11 years agoFor RTL8211B or later PHYs, enable crossover detection and
yongari [Wed, 20 Mar 2013 05:31:34 +0000 (05:31 +0000)]
For RTL8211B or later PHYs, enable crossover detection and
auto-correction. This change makes re(4) establish a link with
a system using non-crossover UTP cable.

Tested by: Michael BlackHeart < amdmiek <> gmail dot com >

11 years agoAdd VNET wrappers around the rest of the ieee80211 rtsock messages.
adrian [Wed, 20 Mar 2013 02:42:52 +0000 (02:42 +0000)]
Add VNET wrappers around the rest of the ieee80211 rtsock messages.

I triggered the cac/radar messages when doing testing in DFS channels.

11 years agoRun zvol_create_minors() only if in non-error case
mm [Tue, 19 Mar 2013 22:27:15 +0000 (22:27 +0000)]
Run zvol_create_minors() only if in non-error case

11 years agoRun zvol_create_minors() on snapshot creation
mm [Tue, 19 Mar 2013 22:14:50 +0000 (22:14 +0000)]
Run zvol_create_minors() on snapshot creation

11 years agoAdd simple example.
joel [Tue, 19 Mar 2013 21:40:14 +0000 (21:40 +0000)]
Add simple example.

11 years agoImplement SOCK_CLOEXEC, SOCK_NONBLOCK and MSG_CMSG_CLOEXEC.
jilles [Tue, 19 Mar 2013 20:58:17 +0000 (20:58 +0000)]
Implement SOCK_CLOEXEC, SOCK_NONBLOCK and MSG_CMSG_CLOEXEC.

This change allows creating file descriptors with close-on-exec set in some
situations. SOCK_CLOEXEC and SOCK_NONBLOCK can be OR'ed into the type
parameter of socket() and socketpair(), and passing MSG_CMSG_CLOEXEC to
recvmsg() makes received file descriptors (SCM_RIGHTS) atomically
close-on-exec.

The numerical values for SOCK_CLOEXEC and SOCK_NONBLOCK are as in NetBSD.
MSG_CMSG_CLOEXEC is the first free bit for MSG_*.

The SOCK_* flags are not passed to MAC because this may cause incorrect
failures and can be done later via fcntl() anyway. On the other hand, audit
is expected to cope with the new flags.

For MSG_CMSG_CLOEXEC, unp_externalize() is extended to take a flags
argument.

Reviewed by: kib
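
A minimal user-space sketch of the new flags, assuming a system that
defines SOCK_CLOEXEC and SOCK_NONBLOCK. The helper names
(make_private_pair(), fd_has_cloexec(), fd_is_nonblocking()) are made up
for illustration; only socketpair() and fcntl() are real interfaces.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Create a connected AF_UNIX pair whose descriptors are close-on-exec
 * and non-blocking from the moment they exist, with no separate
 * fcntl() step.  Returns 0 on success, -1 with errno set on failure.
 */
static int
make_private_pair(int sv[2])
{
	return (socketpair(AF_UNIX,
	    SOCK_STREAM | SOCK_CLOEXEC | SOCK_NONBLOCK, 0, sv));
}

/* Non-zero if the descriptor has FD_CLOEXEC set. */
static int
fd_has_cloexec(int fd)
{
	int fl = fcntl(fd, F_GETFD);

	return (fl != -1 && (fl & FD_CLOEXEC) != 0);
}

/* Non-zero if the open file description has O_NONBLOCK set. */
static int
fd_is_nonblocking(int fd)
{
	int fl = fcntl(fd, F_GETFL);

	return (fl != -1 && (fl & O_NONBLOCK) != 0);
}
```

The point of passing the flags at creation time is that no window exists
in which another thread could fork() and exec() while the descriptor is
still inheritable, which a socket() call followed by
fcntl(F_SETFD, FD_CLOEXEC) cannot guarantee.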

11 years agoBreak out the RX completion path into "FIFO check / refill" and
adrian [Tue, 19 Mar 2013 19:32:28 +0000 (19:32 +0000)]
Break out the RX completion path into "FIFO check / refill" and
"complete RX frames."

The 128 entry RX FIFO is really easy to fill up and miss refilling
when it's done in the ath taskq - as that gets blocked up doing
RX completion, TX completion and other random things.

So the 128 entry RX FIFO now gets emptied and refilled in the ath_intr()
task. It grabs and releases locks, so ath_intr() can't be a FAST handler
yet, but the locks aren't held for very long. The completion part is
done in the ath taskqueue context.

Details:

* Create a new completed frame list - sc->sc_rx_rxlist;
* Split the EDMA RX process queue into two halves - one that
  processes the RX FIFO and refills it with new frames; another
  that completes the completed frame list;
* When tearing down the driver, flush whatever is in the deferred
  queue as well as what's in the FIFO;
* Create two new RX methods - one that processes all RX queues,
  one that processes the given RX queue.  When MSI is implemented,
  we get told which RX queue the interrupt came in on so we can
  specifically schedule that.  (And I can do that with the non-MSI
  path too; I'll figure that out later.)
* Convert the legacy code over to use these new RX methods;
* Replace all the instances of the RX taskqueue enqueue with a call
  to a relevant RX method to enqueue one or all RX queues.

Tested:

* AR9380, STA
* AR9580, STA
* AR5413, STA
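
The split described above can be modelled in plain C: one routine that
only harvests finished FIFO slots onto a completed list and refills the
FIFO, and a second that does the per-frame work later. This is a toy
single-threaded model with no locking; struct rx_state, rx_fifo_service()
and rx_complete() are hypothetical names and do not correspond to the
ath(4) functions.

```c
#include <assert.h>
#include <stddef.h>

#define RX_FIFO_DEPTH 128

struct rx_buf {
	struct rx_buf *next;
	int done;			/* set by the "hardware" */
};

struct rx_state {
	struct rx_buf *fifo[RX_FIFO_DEPTH];	/* ring of posted buffers */
	unsigned head, count;
	struct rx_buf *completed;		/* deferred completion list */
	struct rx_buf **completed_tail;
	struct rx_buf *free_list;
};

static void
rx_init(struct rx_state *st)
{
	st->head = st->count = 0;
	st->completed = NULL;
	st->completed_tail = &st->completed;
	st->free_list = NULL;
}

static void
rx_free(struct rx_state *st, struct rx_buf *b)
{
	b->done = 0;
	b->next = st->free_list;
	st->free_list = b;
}

/* Stage 1 (interrupt path): harvest finished slots, refill at once. */
static unsigned
rx_fifo_service(struct rx_state *st)
{
	unsigned harvested = 0;

	while (st->count > 0 && st->fifo[st->head]->done) {
		struct rx_buf *b = st->fifo[st->head];

		st->head = (st->head + 1) % RX_FIFO_DEPTH;
		st->count--;
		b->next = NULL;
		*st->completed_tail = b;	/* append, keep arrival order */
		st->completed_tail = &b->next;
		harvested++;
	}
	while (st->count < RX_FIFO_DEPTH && st->free_list != NULL) {
		struct rx_buf *b = st->free_list;

		st->free_list = b->next;
		st->fifo[(st->head + st->count) % RX_FIFO_DEPTH] = b;
		st->count++;
	}
	return (harvested);
}

/* Stage 2 (deferred task): do the per-frame work, recycle buffers. */
static unsigned
rx_complete(struct rx_state *st)
{
	struct rx_buf *b = st->completed;
	unsigned n = 0;

	st->completed = NULL;
	st->completed_tail = &st->completed;
	while (b != NULL) {
		struct rx_buf *next = b->next;

		rx_free(st, b);		/* real code would pass the frame up */
		n++;
		b = next;
	}
	return (n);
}
```

Stage 1 does a bounded amount of cheap work, so the FIFO is topped up
even when stage 2 is delayed behind other taskqueue work.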

11 years agoAdd more TODO items.
adrian [Tue, 19 Mar 2013 17:55:36 +0000 (17:55 +0000)]
Add more TODO items.

11 years agoNow that the tx map field is correctly populated for both edma and
adrian [Tue, 19 Mar 2013 17:54:37 +0000 (17:54 +0000)]
Now that the tx map field is correctly populated for both edma and
legacy chips, just use that.

11 years agoAdd a comment about why aout support is still here: We need it for
imp [Tue, 19 Mar 2013 16:57:04 +0000 (16:57 +0000)]
Add a comment about why aout support is still here: We need it for
compat2x, which is still in use, as evidenced by recent bug reports.

11 years agoahci(4) and siis(4) are ready to process the unmapped i/o requests
kib [Tue, 19 Mar 2013 15:09:32 +0000 (15:09 +0000)]
ahci(4) and siis(4) are ready to process the unmapped i/o requests

Sponsored by: The FreeBSD Foundation
Tested by: pho
Submitted by: bf (siis patch)

11 years agoUFS support of the unmapped i/o for the user data buffers.
kib [Tue, 19 Mar 2013 15:08:15 +0000 (15:08 +0000)]
UFS support of the unmapped i/o for the user data buffers.

Sponsored by: The FreeBSD Foundation
Tested by: pho, scottl, jhb, bf

11 years agoCommit the removal of a whitespace to record the proper commit message
kib [Tue, 19 Mar 2013 15:05:21 +0000 (15:05 +0000)]
Commit the removal of a whitespace to record the proper commit message
for the r248519:

For CAM-attached HBAs, allow the driver to indicate via the
PIM_UNMAPPED flag that it accepts unmapped bios.  CAM passes the
CAM_DATA_BIO data transfer type request for an unmapped bio, and the
driver can use bus_dmamap_load_ccb() as a helper to handle the ccb
transparently.

Sponsored by: The FreeBSD Foundation
Reviewed by: scottl
Tested by: pho, scottl

11 years agoSupport unmapped i/o for the md(4).
kib [Tue, 19 Mar 2013 15:01:50 +0000 (15:01 +0000)]
Support unmapped i/o for the md(4).

The vnode-backed md(4) has to map the unmapped bio because the VOP_READ()
and VOP_WRITE() interfaces do not allow passing unmapped requests to
the filesystem. Vnode-backed md(4) uses pbufs instead of relying on
the bio_transient_map, to avoid the usual md deadlock.

Sponsored by: The FreeBSD Foundation
Tested by: pho, scottl

11 years agoSupport unmapped i/o for the md(4).
kib [Tue, 19 Mar 2013 14:53:23 +0000 (14:53 +0000)]
Support unmapped i/o for the md(4).

The vnode-backed md(4) has to map the unmapped bio because the VOP_READ()
and VOP_WRITE() interfaces do not allow passing unmapped requests to
the filesystem. Vnode-backed md(4) uses pbufs instead of relying on
the bio_transient_map, to avoid the usual md deadlock.

Sponsored by: The FreeBSD Foundation
Tested by: pho, scottl

11 years agoThe geom_part provider supports unmapped bio iff the underlying
kib [Tue, 19 Mar 2013 14:50:24 +0000 (14:50 +0000)]
The geom_part provider supports unmapped bio iff the underlying
provider does so, since geom_part never inspects the bio_data.

Sponsored by: The FreeBSD Foundation
Tested by: pho

11 years agoA flag for the geom disk driver to indicate that it accepts the
kib [Tue, 19 Mar 2013 14:49:15 +0000 (14:49 +0000)]
A flag for the geom disk driver to indicate that it accepts the
unmapped i/o requests.

Sponsored by: The FreeBSD Foundation
Tested by: pho

11 years agoDo not remap usermode pages into KVA for physio.
kib [Tue, 19 Mar 2013 14:43:57 +0000 (14:43 +0000)]
Do not remap usermode pages into KVA for physio.

Sponsored by: The FreeBSD Foundation
Tested by: pho