mav [Wed, 24 Dec 2014 13:49:40 +0000 (13:49 +0000)]
MFC r275865:
Add configuration options to override physical and UNMAP block geometry.
While in most cases CTL should correctly fetch those values from the backing
storage, some initiators (like MS SQL) may not like large physical block
sizes, even if they are true. For such cases, allow overriding the fetched
values with supported ones (like 4K).
ae [Tue, 23 Dec 2014 16:33:44 +0000 (16:33 +0000)]
MFC r273087 (with modifications):
Overhaul if_gif(4):
o convert to if_transmit;
o use rmlock to protect access to gif_softc;
o use sx lock to protect from concurrent ioctls;
o remove a lot of unneeded and duplicated code;
o remove cached route support (it won't work with concurrent io);
o style fixes.
MFC r273090:
Move memset under ifdef INET6.
MFC r273091:
Add more ifdefs. SIOC*_IN6 are defined only with INET6.
MFC r273121:
Add inet/inet6 to the dependency list. Without them if_gif is useless.
MFC r273209 by bz:
After r273087,r273090,r273091,r273121 changes to gif(4) try to fix
NOIP builds for real.
MFC r273587:
Remove redundant check and m_pullup() call.
pfg [Tue, 23 Dec 2014 03:24:16 +0000 (03:24 +0000)]
MFC r274437;
ifdef ext2_print_inode which is not really used.
ext2_print_inode was nice to have for initial development work but
is not really used anymore. #ifdef it under a new EXT2FS_DEBUG knob
so that we don't spend time compiling it.
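A minimal sketch of hiding a debug-only helper behind a compile-time knob, in
the spirit of the change above. The struct and the demo_* functions are
hypothetical stand-ins, not the real ext2fs code; only the EXT2FS_DEBUG name
comes from the commit message.

```c
#include <stdio.h>

/* Hypothetical stand-in for an in-memory inode; not the ext2fs structure. */
struct demo_inode {
	unsigned int	di_mode;
	unsigned int	di_size;
};

#ifdef EXT2FS_DEBUG
/* Compiled only when the build defines EXT2FS_DEBUG. */
static void
demo_print_inode(const struct demo_inode *ip)
{
	printf("mode %o size %u\n", ip->di_mode, ip->di_size);
}
#endif

/*
 * Dump the inode if the debug helper was compiled in.
 * Returns 1 when something was printed, 0 when the knob is off.
 */
int
demo_dump_inode(const struct demo_inode *ip)
{
#ifdef EXT2FS_DEBUG
	demo_print_inode(ip);
	return (1);
#else
	(void)ip;	/* unused without the debug knob */
	return (0);
#endif
}
```

Building with -DEXT2FS_DEBUG would compile the printing helper in; without
it the helper costs no compile time and no object code.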
delphij [Mon, 22 Dec 2014 20:58:51 +0000 (20:58 +0000)]
MFC r274337,r274673,r274681,r275515:
ZFS large block support. The default recordsize remains at 128KB.
A new tunable/sysctl variable, vfs.zfs.max_recordsize is added to
allow adjusting the permitted maximum record size, or
zfs_max_recordsize, with a default of 1MB. ZFS will not allow
setting recordsize greater than zfs_max_recordsize as a safety
belt, because larger recordsize means greater read and write
latency and more memory usage.
Please note that booting from datasets that have recordsize greater
than 128KB is not supported (but it's Okay to enable the feature on
the pool).
Limited safety belt is provided for mounted root filesystem but use
caution when using a larger value.
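The limits described above can be sketched as a small validity check. This
is an illustration only, not the ZFS code: the demo_* names are hypothetical,
and the 512-byte lower bound and the power-of-two requirement are assumptions
not stated in the commit message. The 1MB default maximum and the 128KB boot
limit are taken from the text.

```c
#include <stdint.h>

#define DEMO_MIN_RECORDSIZE	512ULL			/* assumed lower bound */
#define DEMO_DEFAULT_MAX	(1024ULL * 1024)	/* 1MB default maximum */
#define DEMO_BOOT_LIMIT		(128ULL * 1024)		/* 128KB boot limit */

/* Returns 1 if size is a power of two within [minimum, max], else 0. */
int
demo_recordsize_ok(uint64_t size, uint64_t max)
{
	if (size < DEMO_MIN_RECORDSIZE || size > max)
		return (0);
	return ((size & (size - 1)) == 0);
}

/* Returns 1 if a dataset with this recordsize stays within the boot limit. */
int
demo_recordsize_bootable(uint64_t size)
{
	return (size <= DEMO_BOOT_LIMIT);
}
```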
jhb [Mon, 22 Dec 2014 19:53:55 +0000 (19:53 +0000)]
MFC 271405,271408,271409,272658:
MFamd64: Use initializecpu() to set various model-specific registers on
AP startup and AP resume (it was already used for BSP startup and BSP
resume).
jhb [Mon, 22 Dec 2014 18:40:59 +0000 (18:40 +0000)]
MFC 260557,271076,271077,271082,271083,271098:
- Remove spaces from boot messages when we print the CPU ID/Family/Stepping
- Move prototypes for various functions out of C files and into
<machine/md_var.h>.
- Reduce diffs between i386 and amd64 initcpu.c and identcpu.c files.
- Move blacklists of broken TSCs out of the printcpuinfo() function
and into the TSC probe routine.
- Merge the amd64 and i386 identcpu.c into a single x86 implementation.
ngie [Mon, 22 Dec 2014 02:22:01 +0000 (02:22 +0000)]
MFC r275622:
Add makewhatis to ITOOLS if MK_MAN != no
This will fix installation with differing host targets in installworld, so
one can build i386/i386 on an amd64 host, then install to an i386/i386 target
ngie [Mon, 22 Dec 2014 00:50:08 +0000 (00:50 +0000)]
MFC r273803,r273810:
r273803:
Filter out TESTS_SUBDIRS already added to SUBDIR instead of blindly
appending the TESTS_SUBDIRS variable to SUBDIR
Duplicate directory entries can cause unexpected side effects, like
installing the same files multiple times. This can be easily
reproduced via the following testcase prior to this commit:
SUBDIR= dir
TESTS_SUBDIRS+= dir
.include <bsd.test.mk>
Sponsored by: EMC / Isilon Storage Division
r273810:
Fix the logic inversion in the previous commit by ensuring that the matched
expression (:M) is empty, not that the non-matched expression (:N) is empty.
The former case means we have not found the TEST_SUBDIR value in SUBDIR
Reported by: rodrigc
Pointyhat to: me (did not use a clean install root)
Sponsored by: EMC / Isilon Storage Division
bsd.progs.mk generates a separate depend file for every program being
built, but then it does not properly tell each submake to use those
individual files. Properly propagate the depend file to use.
Discovered while preparing the update of atf to 0.21 and noticing that
the test programs were not being relinked to the new library.
trasz [Sun, 21 Dec 2014 11:33:18 +0000 (11:33 +0000)]
MFC r274784:
Fix smbfs to not zero out statfs f_flags field. Previously, this
made getmntinfo() return empty flags for smbfs filesystems when
called with MNT_WAIT. It's not visible with mount(8), since it uses
MNT_NOWAIT, but broke autounmount(8) operation.
ngie [Sun, 21 Dec 2014 11:11:17 +0000 (11:11 +0000)]
MFC r273482,r274078:
r273482:
The NetBSD libc tests use several definitions/macros that aren't available in
FreeBSD
Add the missing compat definitions/macros to lib/libnetbsd so the testcases
can be compiled with libnetbsd without having to invent ad hoc #define's, or
having to convert things over to FreeBSD idioms
ae [Fri, 19 Dec 2014 13:22:02 +0000 (13:22 +0000)]
MFC r275729:
Increase the buffer size to keep the list of program names when
parsing a program specification. It is safe not to check for out-of-bounds
access, because the !isprint(p[i]) check stops reading when the '\0'
character is read from the input string.
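The termination argument above can be illustrated with a short loop. This is
a sketch, not the patched code; demo_printable_prefix() is a hypothetical
helper. It relies only on the fact that isprint('\0') is false, so a scan
over a NUL-terminated string cannot run past the terminator.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Count the leading printable characters of a NUL-terminated string.
 * The loop needs no separate length check: '\0' is not printable,
 * so isprint() fails at the terminator at the latest.
 */
size_t
demo_printable_prefix(const char *p)
{
	size_t i = 0;

	while (isprint((unsigned char)p[i]))
		i++;
	return (i);
}
```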
kib [Fri, 19 Dec 2014 09:36:59 +0000 (09:36 +0000)]
MFC r275833:
The iret instruction may generate #np and #ss faults, besides #gp.
When returning to usermode, the handlers for those exceptions are also
executed with the wrong gs base. Handle all three possible faults in the
same way, checking for an iret fault and performing a full iret.
mav [Thu, 18 Dec 2014 08:46:53 +0000 (08:46 +0000)]
MFC r275568:
Count consecutive read requests as blocking in CTL for files and ZVOLs.
Technically read requests can be executed in any order or simultaneously
since they are not changing any data. But the ZFS prefetcher goes crazy when
it receives consecutive requests from different threads. Since the prefetcher
works at the level of separate blocks, instead of two consecutive 128K
requests it may receive 32 8K requests in mixed order.
This patch is more of a workaround than a real fix, and it does not fix all
of the prefetcher's problems, but it improves sequential read speed by 3-4
times in some configurations. On the other hand, it may hurt performance if
some backing store has no prefetch, which is why it is disabled by default
for raw devices.
mav [Thu, 18 Dec 2014 08:43:36 +0000 (08:43 +0000)]
MFC r275481:
Add threshold notification support to CTL for file-backed LUNs.
Previously it was supported only for ZVOL-backed LUNs, but now it should work
for file-backed LUNs too. The used value in this case is the space occupied
by the backing file, while the available value is the space available on the
file system. Pool thresholds are still not implemented in this case.
mav [Thu, 18 Dec 2014 08:38:07 +0000 (08:38 +0000)]
MFC r275474: Add GET LBA STATUS command support to CTL.
It is implemented for LUNs backed by ZVOLs in "dev" mode and files.
GEOM has no such API, so for LUNs backed by raw devices all LBAs will
be reported as mapped/unknown.
mav [Thu, 18 Dec 2014 08:32:06 +0000 (08:32 +0000)]
MFC r275458:
Do not pre-allocate UNIT ATTENTION storage for every possible initiator.
Abusing the ability of major UAs to cover minor ones, we may skip accounting
UAs for inactive ports. Allocate UA storage for a port and start accounting
only after some initiator from that port has fetched its first POWER ON
OCCURRED. This reduces per-LUN CTL memory usage from >1MB to less than 100K.
mav [Thu, 18 Dec 2014 08:30:28 +0000 (08:30 +0000)]
MFC r275447:
Do not pre-allocate reservation keys memory for every possible initiator.
In configurations with many ports, like iSCSI, each LUN is typically
accessed only by a limited subset of ports. Allocating that memory on
demand reduces CTL memory usage from 5.3MB/LUN to 1.3MB/LUN.
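A minimal sketch of the on-demand idea described above: keep one pointer per
port and allocate the per-port key array only when a port first stores a key.
Everything here (struct demo_lun, the demo_* functions, the table sizes) is
hypothetical and much simpler than the CTL data structures.

```c
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_PORTS		8	/* hypothetical table sizes */
#define DEMO_INIT_PER_PORT	4

struct demo_lun {
	/* One slot per port; NULL until the port registers its first key. */
	uint64_t	*keys[DEMO_MAX_PORTS];
};

/* Store a key, allocating the port's array on first use. 0 on success. */
int
demo_set_key(struct demo_lun *lun, unsigned port, unsigned init, uint64_t key)
{
	if (port >= DEMO_MAX_PORTS || init >= DEMO_INIT_PER_PORT)
		return (-1);
	if (lun->keys[port] == NULL) {
		lun->keys[port] = calloc(DEMO_INIT_PER_PORT, sizeof(uint64_t));
		if (lun->keys[port] == NULL)
			return (-1);
	}
	lun->keys[port][init] = key;
	return (0);
}

/* Read a key; ports that never allocated report 0 (no key registered). */
uint64_t
demo_get_key(const struct demo_lun *lun, unsigned port, unsigned init)
{
	if (port >= DEMO_MAX_PORTS || init >= DEMO_INIT_PER_PORT ||
	    lun->keys[port] == NULL)
		return (0);
	return (lun->keys[port][init]);
}

/* Release every per-port array that was allocated. */
void
demo_free_keys(struct demo_lun *lun)
{
	unsigned i;

	for (i = 0; i < DEMO_MAX_PORTS; i++) {
		free(lun->keys[i]);
		lun->keys[i] = NULL;
	}
}
```

Ports that never touch the LUN cost only one pointer each, which is where
the memory saving comes from in this simplified model.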
mav [Thu, 18 Dec 2014 08:25:00 +0000 (08:25 +0000)]
MFC r275058: Coalesce last data move and command status for read commands.
Make CTL core and block backend set success status before initiating the last
data move for read commands. Make CAM target and iSCSI frontends detect
such a condition and send command status together with data. A new I/O flag
allows skipping duplicate status sending on a later fe_done() call.
For Fibre Channel this change saves one of three interrupts per read command,
increasing performance from 126K to 160K IOPS. For iSCSI this change saves
one of three PDUs per read command, increasing performance from 1M to 1.2M
IOPS.
mav [Thu, 18 Dec 2014 08:22:16 +0000 (08:22 +0000)]
MFC r274962: Replace home-grown CTL IO allocator with UMA.
The old allocator created significant lock contention protecting its lists
of preallocated I/Os, while UMA provides much better SMP scalability.
The downside of UMA is the lack of reliable preallocation that could
guarantee successful allocation in non-sleepable environments. But careful
code review showed that only the CAM target frontend really has that
requirement. Fix that by making that frontend preallocate and statically bind
a CTL I/O for every ATIO/INOT it preallocates anyway. That avoids allocations
in the hot I/O path. Other frontends either may sleep in allocation context
or can properly handle allocation errors.
On a 40-core server with 6 ZVOL-backed LUNs and 7 iSCSI client connections
this change increases peak performance from ~700K to >1M IOPS! Yay! :)
pfg [Tue, 16 Dec 2014 21:13:55 +0000 (21:13 +0000)]
MFC r275553, r275612;
patch(1): Bring fixes from OpenBSD
Check fstat return value. Use off_t for file size and offsets.
Avoid iterating over end of string.
Introduce strtolinenum to properly check line numbers while parsing:
no signs, no spaces, just digits, 0 <= x <= LONG_MAX
Properly validate line ranges supplied in diff file to prevent overflows.
Also fixes an out-of-bounds memory access, because the resulting values
are used as array indices.
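The parsing rule above (no signs, no spaces, just digits, 0 <= x <= LONG_MAX)
can be sketched as follows. demo_parse_linenum() is a hypothetical helper
written for illustration; it is not the strtolinenum() from patch(1) and its
interface is an assumption.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a line number made of decimal digits only.
 * Rejects empty input, signs, spaces, and values above LONG_MAX.
 * Returns 0 and stores the value in *out on success, -1 on error.
 */
int
demo_parse_linenum(const char *s, long *out)
{
	char *end;
	long v;
	size_t i;

	if (s[0] == '\0')
		return (-1);
	/* strtol() would accept leading blanks and a sign; screen them out. */
	for (i = 0; s[i] != '\0'; i++) {
		if (s[i] < '0' || s[i] > '9')
			return (-1);
	}
	errno = 0;
	v = strtol(s, &end, 10);
	if (errno == ERANGE || *end != '\0')
		return (-1);
	*out = v;
	return (0);
}
```

Validating before the value is used as an array index is what closes the
out-of-bounds access mentioned above.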
jhb [Tue, 16 Dec 2014 20:05:10 +0000 (20:05 +0000)]
MFC 271635,271722:
- Only the manpage updates from 271635 are merged to give additional
heads up for the stricter checks in 11, but the kernel in 10 remains
permissive.
- Fail with EINVAL if an invalid protection mask is passed to mmap().
- Fail with EINVAL if an unknown flag is passed to mmap().
- Fail with EINVAL if both MAP_PRIVATE and MAP_SHARED are passed to
mmap().
- Require one of either MAP_PRIVATE or MAP_SHARED for non-anonymous
mappings.
- Remove mention of MAP_INHERIT. It hasn't been implemented for thirteen
years.
- Remove mention of unimplemented MAP_SWAP. There are no future plans to
implement it.
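The stricter argument checks listed above can be sketched in userland C.
This is an illustration, not the kernel code: demo_check_mmap_args() is a
hypothetical function, the anonymous-mapping case is passed as a separate
parameter to keep the example portable, and the unknown-flag check is left
out because the set of valid flags is platform specific.

```c
#include <errno.h>
#include <sys/mman.h>

/*
 * Validate mmap()-style arguments.
 * Returns 0 if they pass the checks, EINVAL otherwise.
 */
int
demo_check_mmap_args(int prot, int flags, int anonymous)
{
	/* PROT_NONE is 0, so it needs no bit in the mask. */
	const int known_prot = PROT_READ | PROT_WRITE | PROT_EXEC;
	int sharing = flags & (MAP_PRIVATE | MAP_SHARED);

	/* Invalid protection mask. */
	if ((prot & ~known_prot) != 0)
		return (EINVAL);
	/* MAP_PRIVATE and MAP_SHARED are mutually exclusive. */
	if (sharing == (MAP_PRIVATE | MAP_SHARED))
		return (EINVAL);
	/* Non-anonymous mappings must pick one of the two. */
	if (!anonymous && sharing == 0)
		return (EINVAL);
	return (0);
}
```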
jhb [Tue, 16 Dec 2014 19:45:56 +0000 (19:45 +0000)]
MFC 272897:
Various fixes to stats:
- Read the counts of received, dropped, and transmitted management
packets and add sysctl nodes for them.
- Fix the total octets received/transmitted to read all 64 bits of
the counters.
- Add missing sysctl nodes for rlec, tncrs, fcruc, tor, and tot.
- Remove spurious spaces.
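Reading all 64 bits of a counter that hardware exposes as two 32-bit halves
can be sketched as below. The function names and the idea of passing the two
halves as arguments are assumptions for illustration; the commit message does
not show the driver's register access code.

```c
#include <stdint.h>

/* Combine the low and high 32-bit halves into one 64-bit octet count. */
uint64_t
demo_counter64(uint32_t lo, uint32_t hi)
{
	return (((uint64_t)hi << 32) | lo);
}

/* What a reader gets when only the low half is used: the value wraps. */
uint64_t
demo_counter_low_only(uint32_t lo, uint32_t hi)
{
	(void)hi;	/* high half ignored, as in the kind of bug being fixed */
	return (lo);
}
```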
pfg [Tue, 16 Dec 2014 18:45:31 +0000 (18:45 +0000)]
MFC r275645;
ext2fs: Fix old out-of-bounds access.
Overrunning buffer pointed to by (caddr_t)&oip->i_db[0] of 48 bytes by
passing it to a function which accesses it at byte offset 59 using
argument 60UL.
The issue was inherited from an older FFS implementation and
fixed there by merging UFS2 in r98542. We follow the
FFS fix.
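The sizes quoted above can be reproduced with a small layout sketch. It
assumes the classic split of 12 direct and 3 indirect 32-bit block pointers,
which matches the 48 and 60 bytes in the report; struct demo_dinode and
demo_clear_blocks() are hypothetical and not the ext2fs code.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_NDADDR	12	/* direct pointers: 12 * 4 = 48 bytes */
#define DEMO_NIADDR	3	/* indirect pointers: 3 * 4 = 12 bytes */

struct demo_dinode {
	uint32_t	i_db[DEMO_NDADDR];
	uint32_t	i_ib[DEMO_NIADDR];
};

/*
 * Clear both pointer arrays, sizing each call by its own array.
 * Passing &i_db[0] with a 60-byte length would cover the same bytes
 * here, but formally overruns the 48-byte i_db array.
 */
void
demo_clear_blocks(struct demo_dinode *ip)
{
	memset(ip->i_db, 0, sizeof(ip->i_db));
	memset(ip->i_ib, 0, sizeof(ip->i_ib));
}
```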