asomers [Mon, 11 Jan 2016 20:25:41 +0000 (20:25 +0000)]
MFC r292019
When iostat(8) receives SIGINT while running with "-w" or "-c", it will now
print statistics one more time before exiting. Also, it now implements the
wait using setitimer instead of sleep, so the waits will be more consistent
when the system is heavily loaded.
asomers [Mon, 11 Jan 2016 20:24:56 +0000 (20:24 +0000)]
MFC r292020
Increase devd's client socket buffer size to 256KB. This is not as large as
it looks, because we'll hit the sockbuf's mbuf limit long before hitting its
data limit. A 256KB data limit allows creating a ZFS pool on about 450
drives without overflowing the client socket buffers.
trasz [Mon, 11 Jan 2016 20:10:14 +0000 (20:10 +0000)]
MFC r287396:
It's 2015, and some people are still trying to use fdisk and then
go asking what debug flags to set for GEOM to make it work. Advice
them to use gpart(8) instead.
Something similar should probably done with disklabel,
but I need to rewrite the disklabel examples first.
jimharris [Mon, 11 Jan 2016 17:32:56 +0000 (17:32 +0000)]
MFC r293352:
nvme: add hw.nvme.min_cpus_per_ioq tunable
Due to FreeBSD system-wide limits on number of MSI-X vectors
(https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199321),
it may be desirable to allocate fewer than the maximum number
of vectors for an NVMe device, in order to save vectors for
other devices (usually Ethernet) that can take better
advantage of them and may be probed after NVMe.
This tunable is expressed in terms of minimum number of CPUs
per I/O queue instead of max number of queues per controller,
to allow for a more even distribution of CPUs per queue. This
avoids cases where some number of CPUs have a dedicated queue,
but other CPUs need to share queues. Ideally the PR referenced
above will eventually be fixed and the mechanism implemented
here becomes obsolete anyways.
While here, fix a bug in the CPUs per I/O queue calculation to
properly account for the admin queue's MSI-X vector.
jimharris [Mon, 11 Jan 2016 17:31:18 +0000 (17:31 +0000)]
MFC r293328:
nvme: do not revert to single I/O queue when per-CPU queues not available
Previously nvme(4) would revert to a single I/O queue if it could not
allocate enought interrupt vectors or NVMe submission/completion queues
to have one I/O queue per core. This patch determines how to utilize a
smaller number of available interrupt vectors, and assigns (as closely
as possible) an equal number of cores to each associated I/O queue.
ngie [Sun, 10 Jan 2016 17:39:49 +0000 (17:39 +0000)]
Unbreak stable/10 buildworlds on arm/arm, mips/mips, mips/mips64, mips/mipsel,
mips/mipsn32, powerpc/powerpc, powerpc/powerpc64, sparc64/sparc64 with gcc
after r293307 (some of the BURN_BRIDGES code)
MFC after: 3 days
Pointyhat to: markj
Sponsored by: EMC / Isilon Storage Division
ae [Sun, 10 Jan 2016 13:53:57 +0000 (13:53 +0000)]
MFC r292057:
Make detection of GPT a bit more reliable.
When we are detecting a partition table and didn't find PMBR, try to
read backup GPT header from the last sector and if it is correct,
assume that we have GPT.
dchagin [Sat, 9 Jan 2016 18:28:15 +0000 (18:28 +0000)]
MFC r288994 (by bdrewery):
Remove redundant RFFPWAIT/vfork(2) handling in Linux fork(2) and clone(2) wrappers.
r161611 added some of the code from sys_vfork() directly into the Linux
module wrappers since they use RFSTOPPED. In r232240, the RFFPWAIT handling
was moved to syscallret(), thus this code in the Linux module is no longer
needed as it will be called later.
This also allows the Linux wrappers to benefit from the fix in r275616 for
threads not getting suspended if their vforked child is stopped while they
wait on them.
dchagin [Sat, 9 Jan 2016 18:07:48 +0000 (18:07 +0000)]
MFC r283544:
When I merged the lemul branch I missied kib@'s r282708 commit.
This is not the final fix as I need properly cleanup thread resources
before other threads suicide.
dchagin [Sat, 9 Jan 2016 18:05:04 +0000 (18:05 +0000)]
MFC r283498:
Linux nanosleep() and clock_nanosleep() system calls always
writes the remaining time into the structure pointed to by rmtp
unless rmtp is NULL. The value of *rmtp can then be used to call
nanosleep() again and complete the specified pause if the previous
call was interrupted.
Note. clock_nanosleep() with an absolute time value does not write
the remaining time.
dchagin [Sat, 9 Jan 2016 18:03:09 +0000 (18:03 +0000)]
MFC r283496:
The latest cp tool is trying to use the btrfs clone operation that is
implemented via ioctl interface. First of all return ENOTSUP for this
operation as a cp fallback to usual method in that case. Secondly, do
not print out the message about unimplemented operation.
dchagin [Sat, 9 Jan 2016 17:44:08 +0000 (17:44 +0000)]
MFC r283483:
Convert signal number to native for VT_SETMODE ioctl and remove
strange and invalid ISSIGVALID macro.
The code has not been tested right way but it was originally broken.
dchagin [Sat, 9 Jan 2016 17:39:41 +0000 (17:39 +0000)]
MFC r283479:
The kernel sends signals to the processes via ABI specific sv_sendsig method.
Native ABI do not need signal conversion, only emulators may want this. Usually
emulators implements its own sv_sendsig method. For now only ibcs2 emulator does
not have own sv_sendsig implementation and depends on native sendsig() method.
So, remove any extra attempts to convert signal numbers from native sendsig()
methods except from i386 where ibsc2 is living.
dchagin [Sat, 9 Jan 2016 17:29:08 +0000 (17:29 +0000)]
MFC r283474:
Rework signal code to allow using it by other modules, like linprocfs:
1. Linux sigset always 64 bit on all platforms. In order to move Linux
sigset code to the linux_common module define it as 64 bit int. Move
Linux sigset manipulation routines to the MI path.
2. Move Linux signal number definitions to the MI path. In general, they
are the same on all platforms except for a few signals.
3. Map Linux RT signals to the FreeBSD RT signals and hide signal conversion
tables to avoid conversion errors.
4. Emulate Linux SIGPWR signal via FreeBSD SIGRTMIN signal which is outside
of allowed on Linux signal numbers.
dchagin [Sat, 9 Jan 2016 17:22:51 +0000 (17:22 +0000)]
MFC r283471:
According to Linux man sigaltstack(3) shall return EINVAL if the ss
argument is not a null pointer, and the ss_flags member pointed to by ss
contains flags other than SS_DISABLE. However, in fact, Linux also
allows SS_ONSTACK flag which is simply ignored.
For buggy apps (at least mono) ignore other than SS_DISABLE
flags as a Linux do.
While here move MI part of sigaltstack code to the appropriate place.
dchagin [Sat, 9 Jan 2016 17:08:33 +0000 (17:08 +0000)]
MFC r283461:
As for now our tmpfs is no longer being considered
"highly experimental" remove /dev/shm magic commited
in r218497 and convert tmpfs type to an expected magic number.
dchagin [Sat, 9 Jan 2016 16:44:17 +0000 (16:44 +0000)]
MFC r283441:
Implement epoll family system calls. This is a tiny wrapper
around kqueue() to implement epoll subset of functionality.
The kqueue user data are 32bit on i386 which is not enough for
epoll user data, so we keep user data in the proc emuldata.
Initial patch developed by rdivacky@ in 2007, then extended
by Yuri Victorovich @ r255672 and finished by me
in collaboration with mjg@ and jillies@.
dchagin [Sat, 9 Jan 2016 16:39:15 +0000 (16:39 +0000)]
MFC r283440:
For future use in the Linuxulator:
1. Add a kern_kqueue() counterpart for kqueue() with flags parameter.
2. Be a bit secure. To avoid a double fp lookup add a kern_kevent_fp()
counterpart for kern_kevent() with file pointer parameter instead
of file descriptor an pass the buck to it.