Enji Cooper [Fri, 8 Jan 2016 21:47:41 +0000 (21:47 +0000)]
- Move functions that might be used in class-specific cleanup functions
(geom_test_cleanup, etc) down so the testcases don't emit noise when
bailing
- Conform to the TAP protocol better when dealing with classes that can't
be loaded and with temporary files that can't be allocated for tracking
md(4) devices.
Enji Cooper [Fri, 8 Jan 2016 21:38:26 +0000 (21:38 +0000)]
- Make test-1.sh into a TAP testable testcase
- Delete test-2.sh as it was an incomplete testcase, and the contents were
basically a subset of test-1.sh
- Add a conf.sh file for executing common functions with geom_uzip
- Use attach_md for attaching md(4) devices
- Don't hardcode /tmp for temporary files, which violates the kyua sandbox
Enji Cooper [Fri, 8 Jan 2016 21:28:09 +0000 (21:28 +0000)]
- Add a geom_stripe specific cleanup function and trap on that function at
exit so things are cleaned up properly
- Use attach_md for attaching md(4) devices
- Don't hardcode /tmp for temporary files, which violates the kyua sandbox
Enji Cooper [Fri, 8 Jan 2016 21:25:27 +0000 (21:25 +0000)]
- Add a geom_shsec specific cleanup function and trap on that function at
exit so things are cleaned up properly
- Use attach_md for attaching md(4) devices
- Don't hardcode /tmp for temporary files, which violates the kyua sandbox
Gleb Smirnoff [Fri, 8 Jan 2016 20:34:57 +0000 (20:34 +0000)]
New sendfile(2) syscall. A joint effort of NGINX and Netflix from 2013 and
up to now.
The new sendfile is the code that Netflix uses to send their multiple tens
of gigabits of data per second. The new implementation features asynchronous
I/O, when I/O operations are launched, but not awaited to be complete. An
explanation of why such behavior is beneficial compared to old one is
going to be too long for a commit message, so we will skip it here.
Additional features of new syscall are extra flags, which provide an
application more control over data sent. The SF_NOCACHE flag tells
kernel that data shouldn't be cached after it was sent. The SF_READAHEAD()
macro allows to specify readahead size in pages.
The new syscalls is a drop in replacement. No modifications are required
to applications. One can take nginx binary for stable/10 and run it
successfully on head. Although SF_NODISKIO lost its original sense, as now
sendfile doesn't block, and now means something completely different (tm),
using the new sendfile the old way is absolutely safe.
Celebrates: Netflix global launch!
Sponsored by: Nginx, Inc.
Sponsored by: Netflix
Relnotes: yes
Enji Cooper [Fri, 8 Jan 2016 19:47:49 +0000 (19:47 +0000)]
- Add a geom_raid3 specific cleanup function and trap on that function at
exit so things are cleaned up properly
- Use attach_md for attaching md(4) devices
- Don't hardcode /tmp for temporary files, which violates the kyua sandbox
Enji Cooper [Fri, 8 Jan 2016 19:43:18 +0000 (19:43 +0000)]
- Add a conf.sh file for executing common functions with gnop
- Use attach_md for attaching md(4) devices
- Don't hardcode /tmp for temporary files, which violates the kyua sandbox
Enji Cooper [Fri, 8 Jan 2016 19:38:59 +0000 (19:38 +0000)]
- Add a conf.sh file for executing common functions with geli
-- Use linear probing to find the first unique md(4) device, unlike the other
code which uses attach_md, as geli(8) allocates the md(4) devices itself
- Don't hardcode /tmp for temporary files, which violates the kyua sandbox
Ed Maste [Fri, 8 Jan 2016 19:12:26 +0000 (19:12 +0000)]
Reduce libstand Makefile duplication
Userboot's copy of the libstand Makefile had more extensive changes
compared to the one in sys/boot/libstand32, but it turns out these are
not intentional and we can just include lib/libstand/Makefile as done
for libstand32 in r293040.
Reviewed by: imp, jhb
Tested by: allanjude
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D4793
Enji Cooper [Fri, 8 Jan 2016 19:10:52 +0000 (19:10 +0000)]
- Use attach_md for memory disks so they can be tracked.
- Add a geom_concat specific cleanup function and trap on that function at
exit so things are cleaned up properly
- Don't hardcode /tmp for temporary files, which violates the kyua sandbox
Gleb Smirnoff [Fri, 8 Jan 2016 19:03:20 +0000 (19:03 +0000)]
Make it possible for sbappend() to preserve M_NOTREADY on mbufs, just like
sbappendstream() does. Although, M_NOTREADY may appear only on SOCK_STREAM
sockets, due to sendfile(2) supporting only the latter, there is a corner
case of AF_UNIX/SOCK_STREAM socket, that still uses records for the sake
of control data, albeit being stream socket.
Provide private version of m_clrprotoflags(), which understands PRUS_NOTREADY,
similar to m_demote().
Last consumer using RTF_RNH_LOCKED flag was eliminated in r291643.
Restrict passing RTF_RNH_LOCKED to rtrequest1_fib() and do better
locking for RTM_ADD / RTM_DELETE cases.
LinuxKPI style changes:
- Properly prefix internal functions with "linux_" instead of only a
single underscore to avoid future namespace collisions.
- Make some functions global instead of inline to ease debugging and
to avoid unnecessary code duplication.
- Remove no longer existing kthread_create() function's prototype.
Warner Losh [Fri, 8 Jan 2016 00:05:28 +0000 (00:05 +0000)]
Setup /pkg as a spot for pkg to operate. This is for testing purposes
only. You need to remount / rw and export TMPDIR=/pkg/tmp. pkg will
then work. It's slow though: 15 minutes to pkg install git on an RPi 2
with a decently fast SD card. Since this is for testing, we set
DEFAULT_ALWAYS_YES and ASSUME_ALWAYS_YES to YES.
Conrad Meyer [Thu, 7 Jan 2016 23:02:15 +0000 (23:02 +0000)]
ioat(4): Add ioat_acquire_reserve() KPI
ioat_acquire_reserve() is an extended version of ioat_acquire(). It
allows users to reserve space in the channel for some number of
descriptors. If this succeeds, it guarantees that at least submission
of N valid descriptors will succeed.
Jilles Tjoelker [Thu, 7 Jan 2016 21:46:07 +0000 (21:46 +0000)]
sh: Add a test for 'cd -'.
Redirect 'cd -' output to /dev/null since POSIX requires it to write the new
directory name even if not interactive, but we currently only write it if
interactive.
Bryan Drewery [Thu, 7 Jan 2016 20:52:35 +0000 (20:52 +0000)]
Allow libnv to be built externally using GCC.
GCC does not define _VA_LIST_DECLARED. It defines _VA_LIST_ and others.
This was causing the prototype to not be defined and leading to an error
later due to using nvlist_add_stringv(3) without a prototype in
nvlist_add_stringf(3).
This uses the same check as other va_list prototypes in the original
change in r279438.
Jilles Tjoelker [Thu, 7 Jan 2016 20:48:24 +0000 (20:48 +0000)]
sh: Ensure OPTIND=1 in subshell without forking does not affect outer env.
Command substitutions containing a single simple command and here-document
expansion are performed in a subshell environment, but may not fork. Any
modified state of the shell environment should be restored afterward.
The state that OPTIND=1 had been done was not saved and restored here.
Note that the other parts of shellparam need not be saved and restored,
since they are not modified in these situations (a fork is done before such
modifications).
Jim Harris [Thu, 7 Jan 2016 20:32:04 +0000 (20:32 +0000)]
nvme: add hw.nvme.min_cpus_per_ioq tunable
Due to FreeBSD system-wide limits on number of MSI-X vectors
(https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199321),
it may be desirable to allocate fewer than the maximum number
of vectors for an NVMe device, in order to save vectors for
other devices (usually Ethernet) that can take better
advantage of them and may be probed after NVMe.
This tunable is expressed in terms of minimum number of CPUs
per I/O queue instead of max number of queues per controller,
to allow for a more even distribution of CPUs per queue. This
avoids cases where some number of CPUs have a dedicated queue,
but other CPUs need to share queues. Ideally the PR referenced
above will eventually be fixed and the mechanism implemented
here becomes obsolete anyways.
While here, fix a bug in the CPUs per I/O queue calculation to
properly account for the admin queue's MSI-X vector.
Reviewed by: gallatin
MFC after: 3 days
Sponsored by: Intel
Tty.c was untypical in that it handled the si_drv1 issue consistently
and correctly, by always checking for si_drv1 being non-NULL and
sleeping if NULL. The removed code also illustrated unneeded
complications in drivers which are eliminated by the use of new KPI.
Reviewed by: hps, jhb
Discussed with: bde
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D4746
Provide yet another KPI for cdev creation, make_dev_s(9).
Immediate problem fixed by the new KPI is the long-standing race
between device creation and assignments to cdev->si_drv1 and
cdev->si_drv2, which allows the window where cdevsw methods might be
called with si_drv1,2 fields not yet set. Devices typically checked
for NULL and returned spurious errors to usermode, and often left some
methods unchecked.
The new function interface is designed to be extensible, which should
allow to add more features to make_dev_s(9) without inventing yet
another name for function to create devices, while maintaining KPI and
even KBI backward-compatibility.
Reviewed by: hps, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D4746
Sean Bruno [Thu, 7 Jan 2016 16:42:48 +0000 (16:42 +0000)]
Switch em(4) to the extended RX descriptor format. This matches the
e1000/e1000e split in linux.
Split rxbuffer and txbuffer apart to support the new RX descriptor format
structures. Move rxbuffer manipulation to em_setup_rxdesc() to unify the
new behavior changes.
Add a RSSKEYLEN macro for help in generating the RSSKEY data structures
in the card.
Change em_receive_checksum() to process the new rxdescriptor format
status bit.
Jim Harris [Thu, 7 Jan 2016 16:18:32 +0000 (16:18 +0000)]
nvme: do not revert o single I/O queue when per-CPU queues not possible
Previously nvme(4) would revert to a signle I/O queue if it could not
allocate enought interrupt vectors or NVMe submission/completion queues
to have one I/O queue per core. This patch determines how to utilize a
smaller number of available interrupt vectors, and assigns (as closely
as possible) an equal number of cores to each associated I/O queue.
Do not use 'struct route_in6' inside hash6_insert().
rin6 was used only as sockaddr_in6 storage. Make rtalloc1_fib()
use on-stack sin6 and return rtenry directly, instead of doing
useless work with 'struct route_in6'.
Renato Botelho [Thu, 7 Jan 2016 10:39:13 +0000 (10:39 +0000)]
Make cap_mkdb and services_mkdb file operations sync
Similar fix was done for passwd and group operations in r285050. When a
temporary file is created and then renamed to replace official file there
are no checks to make sure data was written to disk and if a power cycle
happens at this time, system can end up with a 0 length file
Allan Jude [Thu, 7 Jan 2016 05:47:34 +0000 (05:47 +0000)]
Make additional parts of sys/geom/eli more usable in userspace
The upcoming GELI support in the loader reuses parts of this code
Some ifdefs are added, and some code is moved outside of existing ifdefs
The HMAC parts of GELI are broken out into their own file, to separate
them from the kernel crypto/openssl dependant parts that are replaced
in the boot code.
Passed the GELI regression suite (tools/regression/geom/eli)
Files=20 Tests=14996
Result: PASS
Bryan Drewery [Thu, 7 Jan 2016 00:19:03 +0000 (00:19 +0000)]
Move the MAKEOBJDIRPREFIX value guard to sys.mk and expand to MAKEOBJDIR.
This will ensure that the variable was not set as a make override, in
make.conf, src.conf or src-env.conf. It allows setting the value in
src-env.conf when using WITH_AUTO_OBJ since that case properly handles
changing .OBJDIR (except if MAKEOBJDIRPREFIX does not yet exist which is
being discussed to be changed).
This change allows setting a default MAKEOBJDIRPREFIX via local.sys.env.mk.
Ed Maste [Thu, 7 Jan 2016 00:15:02 +0000 (00:15 +0000)]
Switch GNU ld to be installed as ld.bfd and linked as ld
We intend to replace GNU ld with LLVM's lld, and on the path to there
we'll experiment with having lld installed or linked as /usr/bin/ld.
Thus, make ld.bfd the primary install target for GNU ld, to later
facilitate making the ld link optional.
Reviewed by: davide, dim
Differential Revision: https://reviews.freebsd.org/D4790
Gleb Smirnoff [Thu, 7 Jan 2016 00:14:42 +0000 (00:14 +0000)]
Historically we have two fields in tcpcb to describe sender MSS: t_maxopd,
and t_maxseg. This dualism emerged with T/TCP, but was not properly cleaned
up after T/TCP removal. After all permutations over the years the result is
that t_maxopd stores a minimum of peer offered MSS and MTU reduced by minimum
protocol header. And t_maxseg stores (t_maxopd - TCPOLEN_TSTAMP_APPA) if
timestamps are in action, or is equal to t_maxopd otherwise. That's a very
rough estimate of MSS reduced by options length. Throughout the code it
was used in places, where preciseness was not important, like cwnd or
ssthresh calculations.
With this change:
- t_maxopd goes away.
- t_maxseg now stores MSS not adjusted by options.
- new function tcp_maxseg() is provided, that calculates MSS reduced by
options length. The functions gives a better estimate, since it takes
into account SACK state as well.
Ed Maste [Wed, 6 Jan 2016 17:33:32 +0000 (17:33 +0000)]
Add fls() to libstand
Although we don't use it in tree yet libstand is installed as user-
facing /usr/liblibstand.a, and some work in progress makes use of it.
Instead of conflicting with ongoing libstand Makefile deduplication,
just add it now.
Warner Losh [Wed, 6 Jan 2016 17:13:40 +0000 (17:13 +0000)]
Try a little harder to remove firstboot and firstboot-reboot files in
case they accidentally get created as directories or with flags that
prevent their removal. While I wouldn't normally go the extra mile
here and let the normal unix rules prevail, the effects of failure are
large enough that extra care is warranted.
Glen Barber [Wed, 6 Jan 2016 05:23:25 +0000 (05:23 +0000)]
Add a new target to touch the ${.OBJDIR}/release file, which
indicates the 'release' target has run (in order to prevent
subsequent invocations that may clobber original build output).
As is, the 'release' target is a dummy target that does nothing
more than depend on subsequent targets. Unless 'make obj' is
invoked prior to 'make release', .OBJDIR and .CURDIR will always
be '/usr/src/release' (or wherever /usr/src is located).
When 'make release' invokes 'make real-release' (and subsequent
targets), .OBJDIR is not updated, which still leads to src/ tree
pollution.
While arguably a hack, 'make release' will invoke the original
dummy targets as originally intended, but instead of touching an
empty file (or returing @true), will call a 'release-done' target
that will trigger the behavior that was intended to prevent
a subsequent invocation.
Discussed with: hrs
MFC after: 3 days
X-MFC-With: r293173
Sponsored by: The FreeBSD Foundation
Alan Somers [Wed, 6 Jan 2016 00:00:11 +0000 (00:00 +0000)]
"source routing" in rpcbind
Fix a bug in rpcbind for multihomed hosts. If the server had interfaces on
two separate subnets, and a client on the first subnet contacted rpcbind at
the address on the second subnet, rpcbind would advertise addresses on the
first subnet. This is a bug, because it should prefer to advertise the
address where it was contacted. The requested service might be firewalled
off from the address on the first subnet, for example.
usr.sbin/rpcbind/check_bound.c
If the address on which a request was received is known, pass that
to addrmerge as the clnt_uaddr parameter. That is what addrmerge's
comment indicates the parameter is supposed to mean. The previous
behavior is that clnt_uaddr would contain the address from which the
client sent the request.
usr.sbin/rpcbind/util.c
Modify addrmerge to prefer to use an IP that is equal to clnt_uaddr,
if one is found. Refactor the relevant portion of the function for
clarity, and to reduce the number of ifdefs.
etc/mtree/BSD.tests.dist
usr.sbin/rpcbind/tests/Makefile
usr.sbin/rpcbind/tests/addrmerge_test.c
Add unit tests for usr.sbin/rpcbind/util.c:addrmerge.
usr.sbin/rpcbind/check_bound.c
usr.sbin/rpcbind/rpcbind.h
usr.sbin/rpcbind/util.c
Constify some function arguments
Glen Barber [Tue, 5 Jan 2016 21:05:17 +0000 (21:05 +0000)]
Merge ^/projects/release-install-debug:
- Rework MANIFEST generation and parsing via bsdinstall(8).
- Allow selecting debugging distribution sets during install.
- Rework bsdinstall(8) to fetch remote debug distribution sets
when they are not available on the local install medium.
- Allow selecting additional non-GENERIC kernels during install.
At present, GENERIC is still required, and installed by default.
Tested with: head@r293203
Sponsored by: The FreeBSD Foundation
Jilles Tjoelker [Tue, 5 Jan 2016 16:21:20 +0000 (16:21 +0000)]
Add sbin and /usr/local directories to _PATH_DEFPATH.
Set _PATH_DEFPATH to
/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin. This is the
path in the default class in the default /etc/login.conf,
excluding ~/bin which would not be expanded properly in a string
constant.
For normal logins, _PATH_DEFPATH is overridden by /etc/login.conf,
~/.login_conf or shell startup files. _PATH_DEFPATH is still used as a
default by execlp(), execvp(), posix_spawnp() and sh if PATH is not set, and
by cron. Especially the latter is a common trap (most recently in PR
204813).