nwhitehorn [Sun, 10 Jan 2016 18:00:01 +0000 (18:00 +0000)]
Remove dead code and dead comments, most notably the implemenation of the
now-obsolete setfault(). No NetBSD code exists in the AIM locore files, so
update the copyrights there.
nwhitehorn [Sun, 10 Jan 2016 16:42:14 +0000 (16:42 +0000)]
Use setjmp() instead of the identical-except-for-having-a-wrong-prototype
setfault() when testing for faults. This should also help the compiler
do the right thing with this complicated-to-optimize function.
melifaro [Sun, 10 Jan 2016 13:40:29 +0000 (13:40 +0000)]
Split in6_selectsrc() into in6_selectsrc_addr() and in6_selectsrc_socket().
in6_selectsrc() has 2 class of users: socket-based one (raw/udp/pcb/etc) and
socket-less (ND code). The main reason for that change is inability to
specify non-default FIB for callers w/o socket since (internally) inpcb
is used to determine fib.
As as result, add 2 wrappers for in6_selectsrc() (making in6_selectsrc()
static):
1) in6_selectsrc_socket() for the former class. Embed scope_ambiguous check
along with returning hop limit when needed.
2) in6_selectsrc_addr() for the latter case. Add 'fibnum' argument and
pass IPv6 address w/ explicitly specified scope as separate argument.
bz [Sun, 10 Jan 2016 08:14:25 +0000 (08:14 +0000)]
Initialize error after r293626 in case neither INET nor INET6 is
compiled into the kernel. Ideally lots more code would just not
be called (or compiled in) in that case but that requires a lot
more surgery. For now try to make IP-less kernels compile again.
ngie [Sat, 9 Jan 2016 23:45:25 +0000 (23:45 +0000)]
- Delete non-TAP testcases
- Add a conf.sh file for executing common functions with geom_gate
- Use attach_md for attaching md(4) devices
- Don't hardcode /tmp for temporary files, which violates the kyua sandbox
- Add/increase sleeps to try and improve synchronization
- Add debug output for when checksums fail
nwhitehorn [Sat, 9 Jan 2016 21:28:56 +0000 (21:28 +0000)]
Make graphical consoles work under PowerKVM. Without using hypercalls, it is
not possible to write the framebuffer before pmap is up. Solve this by
deferring initialization until that happens, like on PS3.
melifaro [Sat, 9 Jan 2016 16:34:37 +0000 (16:34 +0000)]
Finish r275196: do not dereference rtentry in if_output() routines.
The only piece of information that is required is rt_flags subset.
In particular, if_loop() requires RTF_REJECT and RTF_BLACKHOLE flags
to check if this particular mbuf needs to be dropped (and what
error should be returned).
Note that if_loop() will always return EHOSTUNREACH for "reject" routes
regardless of RTF_HOST flag existence. This is due to upcoming routing
changes where RTF_HOST value won't be available as lookup result.
All other functions require RTF_GATEWAY flag to check if they need
to return EHOSTUNREACH instead of EHOSTDOWN error.
There are 11 places where non-zero 'struct route' is passed to if_output().
For most of the callers (forwarding, bpf, arp) does not care about exact
error value. In fact, the only place where this result is propagated
is ip_output(). (ip6_output() passes NULL route to nd6_output_ifp()).
Given that, add 3 new 'struct route' flags (RT_REJECT, RT_BLACKHOLE and
RT_IS_GW) and inline function (rt_update_ro_flags()) to copy necessary
rte flags to ro_flags. Call this function in ip_output() after looking up/
verifying rte.
kevlo [Sat, 9 Jan 2016 14:53:23 +0000 (14:53 +0000)]
- Add the definition of CHARCLASS_NAME_MAX, as per POSIX.1-2001.
- Avoid namespace pollution and move definitions of _POSIX2_CHARCLASS_NAME_MAX
and _POSIX2_COLL_WEIGHTS_MAX into the .2001 section.
With input from bde.
melifaro [Sat, 9 Jan 2016 11:41:37 +0000 (11:41 +0000)]
Remove prefix check from in6_addroute().
This check was added in initial? netinet6/ import
back in 1999 (r53541).
It effectively became unnecessary after 'address/prefix clean-ups'
KAME commit 90ff8792e676132096a440dd787f99a5a5860ee8 (github) in 2001
(merged to FreeBSD in r78064) where prefix check was added to
nd6_prefix_onlink(). Similar IPv4 check (in_addroute() was added
in r137628).
Additionally, the right plance for this (or similar) check is the prefix
addition code (nd6_prefix_onlink(), nd6_prefix_onlink_rtrequest(),
in_addprefix() or rtinit()), but not the generic radix insert routine.
Such handler should pass different set of variables, instead
of directly providing 2 locked route entries.
Given that it hasn't been really used since at least 2012, remove
current code.
Will re-add it after finishing most major routing-related changes.
markj [Sat, 9 Jan 2016 01:56:46 +0000 (01:56 +0000)]
Prevent cv_waiters wraparound.
r282971 attempted to fix this problem by decrementing cv_waiters after
waking up from sleeping on a condition variable, but this can result in
a use-after-free if the CV is freed before all woken threads have had a
chance to run. Instead, avoid incrementing cv_waiters past INT_MAX, and
have cv_signal() explicitly check for sleeping threads once cv_waiters has
reached this bound.
gjb [Sat, 9 Jan 2016 00:45:38 +0000 (00:45 +0000)]
Set FORCE_PKG_REGISTER=1 when installing packages to avoid failures
when re-using build chroot(8) environments.
This is based on the patch in the PR referenced below, but instead
of using 'reinstall' in two locations (one of which already uses
FORCE_PKG_REGISTER=1), changes the non-embedded behavior.
PR: 205998
Submitted by: ngie
MFC after: 5 days
Sponsored by: The FreeBSD Foundation
emaste [Sat, 9 Jan 2016 00:42:07 +0000 (00:42 +0000)]
Support use of LLVM's libunwind for exception unwinding
It is built in libgcc_s.so and libgcc_eh.a to simplify transition.
It is enabled by default on arm64 (where we previously had no other
unwinder) and may be enabled for testing on other platforms by setting
WITH_LLVM_LIBUNWIND in src.conf(5).
Also add compiler-rt's __gcc_personality_v0 implementation for use with
the LLVM unwinder.
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D4787
ngie [Fri, 8 Jan 2016 21:47:41 +0000 (21:47 +0000)]
- Move functions that might be used in class-specific cleanup functions
(geom_test_cleanup, etc) down so the testcases don't emit noise when
bailing
- Conform to the TAP protocol better when dealing with classes that can't
be loaded and with temporary files that can't be allocated for tracking
md(4) devices.
ngie [Fri, 8 Jan 2016 21:38:26 +0000 (21:38 +0000)]
- Make test-1.sh into a TAP testable testcase
- Delete test-2.sh as it was an incomplete testcase, and the contents were
basically a subset of test-1.sh
- Add a conf.sh file for executing common functions with geom_uzip
- Use attach_md for attaching md(4) devices
- Don't hardcode /tmp for temporary files, which violates the kyua sandbox
ngie [Fri, 8 Jan 2016 21:28:09 +0000 (21:28 +0000)]
- Add a geom_stripe specific cleanup function and trap on that function at
exit so things are cleaned up properly
- Use attach_md for attaching md(4) devices
- Don't hardcode /tmp for temporary files, which violates the kyua sandbox
ngie [Fri, 8 Jan 2016 21:25:27 +0000 (21:25 +0000)]
- Add a geom_shsec specific cleanup function and trap on that function at
exit so things are cleaned up properly
- Use attach_md for attaching md(4) devices
- Don't hardcode /tmp for temporary files, which violates the kyua sandbox
glebius [Fri, 8 Jan 2016 20:34:57 +0000 (20:34 +0000)]
New sendfile(2) syscall. A joint effort of NGINX and Netflix from 2013 and
up to now.
The new sendfile is the code that Netflix uses to send their multiple tens
of gigabits of data per second. The new implementation features asynchronous
I/O, when I/O operations are launched, but not awaited to be complete. An
explanation of why such behavior is beneficial compared to old one is
going to be too long for a commit message, so we will skip it here.
Additional features of new syscall are extra flags, which provide an
application more control over data sent. The SF_NOCACHE flag tells
kernel that data shouldn't be cached after it was sent. The SF_READAHEAD()
macro allows to specify readahead size in pages.
The new syscalls is a drop in replacement. No modifications are required
to applications. One can take nginx binary for stable/10 and run it
successfully on head. Although SF_NODISKIO lost its original sense, as now
sendfile doesn't block, and now means something completely different (tm),
using the new sendfile the old way is absolutely safe.
Celebrates: Netflix global launch!
Sponsored by: Nginx, Inc.
Sponsored by: Netflix
Relnotes: yes
ngie [Fri, 8 Jan 2016 19:47:49 +0000 (19:47 +0000)]
- Add a geom_raid3 specific cleanup function and trap on that function at
exit so things are cleaned up properly
- Use attach_md for attaching md(4) devices
- Don't hardcode /tmp for temporary files, which violates the kyua sandbox
ngie [Fri, 8 Jan 2016 19:43:18 +0000 (19:43 +0000)]
- Add a conf.sh file for executing common functions with gnop
- Use attach_md for attaching md(4) devices
- Don't hardcode /tmp for temporary files, which violates the kyua sandbox
ngie [Fri, 8 Jan 2016 19:38:59 +0000 (19:38 +0000)]
- Add a conf.sh file for executing common functions with geli
-- Use linear probing to find the first unique md(4) device, unlike the other
code which uses attach_md, as geli(8) allocates the md(4) devices itself
- Don't hardcode /tmp for temporary files, which violates the kyua sandbox
emaste [Fri, 8 Jan 2016 19:12:26 +0000 (19:12 +0000)]
Reduce libstand Makefile duplication
Userboot's copy of the libstand Makefile had more extensive changes
compared to the one in sys/boot/libstand32, but it turns out these are
not intentional and we can just include lib/libstand/Makefile as done
for libstand32 in r293040.
Reviewed by: imp, jhb
Tested by: allanjude
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D4793
ngie [Fri, 8 Jan 2016 19:10:52 +0000 (19:10 +0000)]
- Use attach_md for memory disks so they can be tracked.
- Add a geom_concat specific cleanup function and trap on that function at
exit so things are cleaned up properly
- Don't hardcode /tmp for temporary files, which violates the kyua sandbox
glebius [Fri, 8 Jan 2016 19:03:20 +0000 (19:03 +0000)]
Make it possible for sbappend() to preserve M_NOTREADY on mbufs, just like
sbappendstream() does. Although, M_NOTREADY may appear only on SOCK_STREAM
sockets, due to sendfile(2) supporting only the latter, there is a corner
case of AF_UNIX/SOCK_STREAM socket, that still uses records for the sake
of control data, albeit being stream socket.
Provide private version of m_clrprotoflags(), which understands PRUS_NOTREADY,
similar to m_demote().
melifaro [Fri, 8 Jan 2016 16:25:11 +0000 (16:25 +0000)]
Do more fine-grained locking in rtrequest1_fib().
Last consumer using RTF_RNH_LOCKED flag was eliminated in r291643.
Restrict passing RTF_RNH_LOCKED to rtrequest1_fib() and do better
locking for RTM_ADD / RTM_DELETE cases.
hselasky [Fri, 8 Jan 2016 10:04:19 +0000 (10:04 +0000)]
LinuxKPI style changes:
- Properly prefix internal functions with "linux_" instead of only a
single underscore to avoid future namespace collisions.
- Make some functions global instead of inline to ease debugging and
to avoid unnecessary code duplication.
- Remove no longer existing kthread_create() function's prototype.
imp [Fri, 8 Jan 2016 00:05:28 +0000 (00:05 +0000)]
Setup /pkg as a spot for pkg to operate. This is for testing purposes
only. You need to remount / rw and export TMPDIR=/pkg/tmp. pkg will
then work. It's slow though: 15 minutes to pkg install git on an RPi 2
with a decently fast SD card. Since this is for testing, we set
DEFAULT_ALWAYS_YES and ASSUME_ALWAYS_YES to YES.
cem [Thu, 7 Jan 2016 23:02:15 +0000 (23:02 +0000)]
ioat(4): Add ioat_acquire_reserve() KPI
ioat_acquire_reserve() is an extended version of ioat_acquire(). It
allows users to reserve space in the channel for some number of
descriptors. If this succeeds, it guarantees that at least submission
of N valid descriptors will succeed.
jilles [Thu, 7 Jan 2016 21:46:07 +0000 (21:46 +0000)]
sh: Add a test for 'cd -'.
Redirect 'cd -' output to /dev/null since POSIX requires it to write the new
directory name even if not interactive, but we currently only write it if
interactive.
bdrewery [Thu, 7 Jan 2016 20:52:35 +0000 (20:52 +0000)]
Allow libnv to be built externally using GCC.
GCC does not define _VA_LIST_DECLARED. It defines _VA_LIST_ and others.
This was causing the prototype to not be defined and leading to an error
later due to using nvlist_add_stringv(3) without a prototype in
nvlist_add_stringf(3).
This uses the same check as other va_list prototypes in the original
change in r279438.
jilles [Thu, 7 Jan 2016 20:48:24 +0000 (20:48 +0000)]
sh: Ensure OPTIND=1 in subshell without forking does not affect outer env.
Command substitutions containing a single simple command and here-document
expansion are performed in a subshell environment, but may not fork. Any
modified state of the shell environment should be restored afterward.
The state that OPTIND=1 had been done was not saved and restored here.
Note that the other parts of shellparam need not be saved and restored,
since they are not modified in these situations (a fork is done before such
modifications).
jimharris [Thu, 7 Jan 2016 20:32:04 +0000 (20:32 +0000)]
nvme: add hw.nvme.min_cpus_per_ioq tunable
Due to FreeBSD system-wide limits on number of MSI-X vectors
(https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199321),
it may be desirable to allocate fewer than the maximum number
of vectors for an NVMe device, in order to save vectors for
other devices (usually Ethernet) that can take better
advantage of them and may be probed after NVMe.
This tunable is expressed in terms of minimum number of CPUs
per I/O queue instead of max number of queues per controller,
to allow for a more even distribution of CPUs per queue. This
avoids cases where some number of CPUs have a dedicated queue,
but other CPUs need to share queues. Ideally the PR referenced
above will eventually be fixed and the mechanism implemented
here becomes obsolete anyways.
While here, fix a bug in the CPUs per I/O queue calculation to
properly account for the admin queue's MSI-X vector.
Reviewed by: gallatin
MFC after: 3 days
Sponsored by: Intel
kib [Thu, 7 Jan 2016 20:15:09 +0000 (20:15 +0000)]
Convert tty common code to use make_dev_s().
Tty.c was untypical in that it handled the si_drv1 issue consistently
and correctly, by always checking for si_drv1 being non-NULL and
sleeping if NULL. The removed code also illustrated unneeded
complications in drivers which are eliminated by the use of new KPI.
Reviewed by: hps, jhb
Discussed with: bde
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D4746
kib [Thu, 7 Jan 2016 20:08:02 +0000 (20:08 +0000)]
Provide yet another KPI for cdev creation, make_dev_s(9).
Immediate problem fixed by the new KPI is the long-standing race
between device creation and assignments to cdev->si_drv1 and
cdev->si_drv2, which allows the window where cdevsw methods might be
called with si_drv1,2 fields not yet set. Devices typically checked
for NULL and returned spurious errors to usermode, and often left some
methods unchecked.
The new function interface is designed to be extensible, which should
allow to add more features to make_dev_s(9) without inventing yet
another name for function to create devices, while maintaining KPI and
even KBI backward-compatibility.
Reviewed by: hps, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D4746
sbruno [Thu, 7 Jan 2016 16:42:48 +0000 (16:42 +0000)]
Switch em(4) to the extended RX descriptor format. This matches the
e1000/e1000e split in linux.
Split rxbuffer and txbuffer apart to support the new RX descriptor format
structures. Move rxbuffer manipulation to em_setup_rxdesc() to unify the
new behavior changes.
Add a RSSKEYLEN macro for help in generating the RSSKEY data structures
in the card.
Change em_receive_checksum() to process the new rxdescriptor format
status bit.
jimharris [Thu, 7 Jan 2016 16:18:32 +0000 (16:18 +0000)]
nvme: do not revert o single I/O queue when per-CPU queues not possible
Previously nvme(4) would revert to a signle I/O queue if it could not
allocate enought interrupt vectors or NVMe submission/completion queues
to have one I/O queue per core. This patch determines how to utilize a
smaller number of available interrupt vectors, and assigns (as closely
as possible) an equal number of cores to each associated I/O queue.