trasz [Sun, 19 Mar 2017 10:35:56 +0000 (10:35 +0000)]
MFC r313351:
Make root_mount_hold() work after boot. This is important for two
reasons. First is rerooting into USB-mounted device that happens
to be not yet enumerated. The second is when mounting with (non-root)
filesystem on USB device on a hub that's enumerated later than the root
mount: the rc scripts explicitly mount for the root mount holds to be
released, but each USB bus takes the hold asynchronously, and if that
happens after root mount, it would just get ignored.
trasz [Sun, 19 Mar 2017 10:34:26 +0000 (10:34 +0000)]
MFC r313350:
In r290196 the root mount hold mechanism was changed to make it not wait
for mount hold release if the root device already exists. So, unless your
rootdev is not on USB - ie in the usual case - the root mount won't wait
for USB. However, the old behaviour was sometimes used as "wait until USB
is fully enumerated", and r290196 broke that.
This commit adds vfs.root_mount_always_wait tunable, to force the kernel
to always wait for root mount holds, even if the root is already there.
ae [Sun, 19 Mar 2017 07:34:19 +0000 (07:34 +0000)]
MFC r314716:
Add IPv6 support to O_IP_DST_LOOKUP opcode.
o check the size of O_IP_SRC_LOOKUP opcode, it can not exceed the size of
ipfw_insn_u32;
o rename ipfw_lookup_table_extended() function into ipfw_lookup_table() and
remove old ipfw_lookup_table();
o use args->f_id.flow_id6 that is in host byte order to get DSCP value;
o add SCTP ports support to 'lookup src/dst-port' opcode;
o add IPv6 support to 'lookup src/dst-ip' opcode.
trasz [Sat, 18 Mar 2017 23:57:47 +0000 (23:57 +0000)]
MFC r311283:
Don't release the cfiscsi session refcount too early. It wasn't
observed to fix any actual error, but it's the right thing to do
from the correctness point of view.
trasz [Sat, 18 Mar 2017 23:55:50 +0000 (23:55 +0000)]
MFC r311284:
Fix bug that would result in a kernel crash in some cases involving
a symlink and an autofs mount request. The crash was caused by namei()
calling bcopy() with a negative length, caused by numeric underflow:
in lookup(), in the relookup path, the ni_pathlen was decremented too
many times. The bug was introduced in r296715.
Big thanks to Alex Deiter for his help with debugging this.
ae [Sat, 18 Mar 2017 22:04:20 +0000 (22:04 +0000)]
MFC r304572 (by bz):
Remove the kernel optoion for IPSEC_FILTERTUNNEL, which was deprecated
more than 7 years ago in favour of a sysctl in r192648.
MFC r305122:
Remove redundant sanity checks from ipsec[46]_common_input_cb().
This check already has been done in the each protocol callback.
MFC r309144,309174,309201 (by fabient):
IPsec RFC6479 support for replay window sizes up to 2^32 - 32 packets.
Since the previous algorithm, based on bit shifting, does not scale
with large replay windows, the algorithm used here is based on
RFC 6479: IPsec Anti-Replay Algorithm without Bit Shifting.
The replay window will be fast to be updated, but will cost as many bits
in RAM as its size.
The previous implementation did not provide a lock on the replay window,
which may lead to replay issues.
MFC r309143,309146 (by fabient):
In a dual processor system (2*6 cores) during IPSec throughput tests,
we see a lot of contention on the arc4 lock, used to generate the IV
of the ESP output packets.
The idea of this patch is to split this mutex in order to reduce the
contention on this lock.
o Almost all IPsec releated code was moved into sys/netipsec.
o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel
option IPSEC_SUPPORT added. It enables support for loading
and unloading of ipsec.ko and tcpmd5.ko kernel modules.
o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by
default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type
support was removed. Added TCP/UDP checksum handling for
inbound packets that were decapsulated by transport mode SAs.
setkey(8) modified to show run-time NAT-T configuration of SA.
o New network pseudo interface if_ipsec(4) added. For now it is
build as part of ipsec.ko module (or with IPSEC kernel).
It implements IPsec virtual tunnels to create route-based VPNs.
o The network stack now invokes IPsec functions using special
methods. The only one header file <netipsec/ipsec_support.h>
should be included to declare all the needed things to work
with IPsec.
o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed.
Now these protocols are handled directly via IPsec methods.
o TCP_SIGNATURE support was reworked to be more close to RFC.
o PF_KEY SADB was reworked:
- now all security associations stored in the single SPI namespace,
and all SAs MUST have unique SPI.
- several hash tables added to speed up lookups in SADB.
- SADB now uses rmlock to protect access, and concurrent threads
can do SA lookups in the same time.
- many PF_KEY message handlers were reworked to reflect changes
in SADB.
- SADB_UPDATE message was extended to support new PF_KEY headers:
SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They
can be used by IKE daemon to change SA addresses.
o ipsecrequest and secpolicy structures were cardinally changed to
avoid locking protection for ipsecrequest. Now we support
only limited number (4) of bundled SAs, but they are supported
for both INET and INET6.
o INPCB security policy cache was introduced. Each PCB now caches
used security policies to avoid SP lookup for each packet.
o For inbound security policies added the mode, when the kernel does
check for full history of applied IPsec transforms.
o References counting rules for security policies and security
associations were changed. The proper SA locking added into xform
code.
o xform code was also changed. Now it is possible to unregister xforms.
tdb_xxx structures were changed and renamed to reflect changes in
SADB/SPDB, and changed rules for locking and refcounting.
MFC r313331:
Add removed headers into the ObsoleteFiles.inc.
MFC r313561 (by glebius):
Move tcp_fields_to_net() static inline into tcp_var.h, just below its
friend tcp_fields_to_host(). There is third party code that also uses
this inline.
MFC r313697:
Remove IPsec related PCB code from SCTP.
The inpcb structure has inp_sp pointer that is initialized by
ipsec_init_pcbpolicy() function. This pointer keeps strorage for IPsec
security policies associated with a specific socket.
An application can use IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket
options to configure these security policies. Then ip[6]_output()
uses inpcb pointer to specify that an outgoing packet is associated
with some socket. And IPSEC_OUTPUT() method can use a security policy
stored in the inp_sp. For inbound packet the protocol-specific input
routine uses IPSEC_CHECK_POLICY() method to check that a packet conforms
to inbound security policy configured in the inpcb.
SCTP protocol doesn't specify inpcb for ip[6]_output() when it sends
packets. Thus IPSEC_OUTPUT() method does not consider such packets as
associated with some socket and can not apply security policies
from inpcb, even if they are configured. Since IPSEC_CHECK_POLICY()
method is called from protocol-specific input routine, it can specify
inpcb pointer and associated with socket inbound policy will be
checked. But there are two problems:
1. Such check is asymmetric, becasue we can not apply security policy
from inpcb for outgoing packet.
2. IPSEC_CHECK_POLICY() expects that caller holds INPCB lock and
access to inp_sp is protected. But for SCTP this is not correct,
becasue SCTP uses own locks to protect inpcb.
To fix these problems remove IPsec related PCB code from SCTP.
This imply that IP_IPSEC_POLICY and IPV6_IPSEC_POLICY socket options
will be not applicable to SCTP sockets. To be able correctly check
inbound security policies for SCTP, mark its protocol header with
the PR_LASTHDR flag.
MFC r313746:
Add missing check to fix the build with IPSEC_SUPPORT and without MAC.
MFC r313805:
Fix LINT build for powerpc.
Build kernel modules support only when both IPSEC and TCP_SIGNATURE
are not defined.
MFC r313922:
For translated packets do not adjust UDP checksum if it is zero.
In case when decrypted and decapsulated packet is an UDP datagram,
check that its checksum is not zero before doing incremental checksum
adjustment.
MFC r314339:
Document that the size of AH ICV for HMAC-SHA2-NNN should be half of
NNN bits as described in RFC4868.
PR: 215978
MFC r314812:
Introduce the concept of IPsec security policies scope.
Currently are defined three scopes: global, ifnet, and pcb.
Generic security policies that IKE daemon can add via PF_KEY interface
or an administrator creates with setkey(8) utility have GLOBAL scope.
Such policies can be applied by the kernel to outgoing packets and checked
agains inbound packets after IPsec processing.
Security policies created by if_ipsec(4) interfaces have IFNET scope.
Such policies are applied to packets that are passed through if_ipsec(4)
interface.
And security policies created by application using setsockopt()
IP_IPSEC_POLICY option have PCB scope. Such policies are applied to
packets related to specific socket. Currently there is no way to list
PCB policies via setkey(8) utility.
Modify setkey(8) and libipsec(3) to be able distinguish the scope of
security policies in the `setkey -DP` listing. Add two optional flags:
'-t' to list only policies related to virtual *tunneling* interfaces,
i.e. policies with IFNET scope, and '-g' to list only policies with GLOBAL
scope. By default policies from all scopes are listed.
To implement this PF_KEY's sadb_x_policy structure was modified.
sadb_x_policy_reserved field is used to pass the policy scope from the
kernel to userland. SADB_SPDDUMP message extended to support filtering
by scope: sadb_msg_satype field is used to specify bit mask of requested
scopes.
For IFNET policies the sadb_x_policy_priority field of struct sadb_x_policy
is used to pass if_ipsec's interface if_index to the userland. For GLOBAL
policies sadb_x_policy_priority is used only to manage order of security
policies in the SPDB. For IFNET policies it is not used, so it can be used
to keep if_index.
After this change the output of `setkey -DP` now looks like:
# setkey -DPt
0.0.0.0/0[any] 0.0.0.0/0[any] any
in ipsec
esp/tunnel/87.250.242.144-87.250.242.145/unique:145
spid=7 seq=3 pid=58025 scope=ifnet ifname=ipsec0
refcnt=1
# setkey -DPg
::/0 ::/0 icmp6 135,0
out none
spid=5 seq=1 pid=872 scope=global
refcnt=1
jilles [Sat, 18 Mar 2017 16:07:28 +0000 (16:07 +0000)]
MFC r315005: sh: Fix executing wrong command with ${x#$(y)}$(z).
The parsed internal representation of words consists of a byte string with a
list of nodes (commands in command substitution). Each unescaped CTLBACKQ or
CTLBACKQ | CTLQUOTE byte corresponds to an entry in the list.
If param in ${param#%##%%word} is not set, the word is not expanded (in a
deviation of POSIX shared with other ash variants and ksh93). Erroneously,
the pointer in the list of commands (argbackq) was not advanced. This caused
the wrong command to be executed later if the outer word contained another
command substitution.
Example:
echo "${unsetvar#$(echo a)}$(echo b)"
wrote "a" but should write "b".
alc [Sat, 18 Mar 2017 05:53:09 +0000 (05:53 +0000)]
MFC r315318
Relax the locking requirements for vm_object_page_noreuse(). While
reviewing all uses of OFF_TO_IDX(), I observed that
vm_object_page_noreuse() is requiring an exclusive lock on the object
when, in fact, a shared lock suffices.
alc [Sat, 18 Mar 2017 05:48:26 +0000 (05:48 +0000)]
MFC r313186
Over the years, the code and comments in vm_page_startup() have diverged
in one respect. When determining how many page structures to allocate,
contrary to what the comments say, the code does not account for the
overhead of a page structure per page of physical memory. This revision
changes the code to match the comments.
alc [Sat, 18 Mar 2017 05:38:10 +0000 (05:38 +0000)]
MFC r310083
Tidy up. Mostly, remove or replace stale comments. Most of the comments
in this file actually described the operation of the swap pager, not the
default pager. Given that this is the wrong place to discuss the
implementation of the swap pager, it shouldn't come as a surprise that as
the swap pager evolved these comments became increasingly stale. In
addition, apply some style fixes, like modernizing a few remaining old-
style function definitions.
vangyzen [Fri, 17 Mar 2017 14:54:10 +0000 (14:54 +0000)]
MFC r313821 r315277 r315286
Use inet_ntoa_r() instead of inet_ntoa() throughout the kernel.
inet_ntoa() cannot be used safely in a multithreaded environment
because it uses a static local buffer. Instead, use inet_ntoa_r()
with a buffer on the caller's stack, except for KTR messages.
KTR can correctly log the immediate integral values passed to it,
as well as constant strings, but not non-constant strings,
since they might change by the time ktrdump retrieves them.
Therefore, use hex notation in KTR messages.
mav [Fri, 17 Mar 2017 11:45:16 +0000 (11:45 +0000)]
MFC r309856: Postpone ZVOL media/block size caching till first open.
At least on FreeBSD there are no legal way to access media or get its
size without opening device/provider first. Postponing this caching
allows to skip several disk seeks per ZVOL/snapshot during import.
For HDD pool with 1 ZVOL in dev mode with 1000 snapshots this reduces
pool import time from 40 to 10 seconds.
mav [Fri, 17 Mar 2017 07:56:20 +0000 (07:56 +0000)]
MFC r308782:
After some ZIL changes 6 years ago zil_slog_limit got partially broken
due to zl_itx_list_sz not updated when async itx'es upgraded to sync.
Actually because of other changes about that time zl_itx_list_sz is not
really required to implement the functionality, so this patch removes
some unneeded broken code and variables.
Original idea of zil_slog_limit was to reduce chance of SLOG abuse by
single heavy logger, that increased latency for other (more latency critical)
loggers, by pushing heavy log out into the main pool instead of SLOG. Beside
huge latency increase for heavy writers, this implementation caused double
write of all data, since the log records were explicitly prepared for SLOG.
Since we now have I/O scheduler, I've found it can be much more efficient
to reduce priority of heavy logger SLOG writes from ZIO_PRIORITY_SYNC_WRITE
to ZIO_PRIORITY_ASYNC_WRITE, while still leave them on SLOG.
Existing ZIL implementation had problem with space efficiency when it
has to write large chunks of data into log blocks of limited size. In some
cases efficiency stopped to almost as low as 50%. In case of ZIL stored on
spinning rust, that also reduced log write speed in half, since head had to
uselessly fly over allocated but not written areas. This change improves
the situation by offloading problematic operations from z*_log_write() to
zil_lwb_commit(), which knows real situation of log blocks allocation and
can split large requests into pieces much more efficiently. Also as side
effect it removes one of two data copy operations done by ZIL code WR_COPIED
case.
While there, untangle and unify code of z*_log_write() functions.
Also zfs_log_write() alike to zvol_log_write() can now handle writes crossing
block boundary, that may also improve efficiency if ZPL is made to do that.
mm [Thu, 16 Mar 2017 23:07:35 +0000 (23:07 +0000)]
MFC r314571:
Update libarchive to version 3.3.1 (and sync with latest vendor dist)
Notable vendor changes:
PR #501: improvements in ACL path handling
PR #724: fix hang when reading malformed cpio files
PR #864: fix out of bounds read with malformed GNU tar archives
Documentation, style, test suite improvements and typo fixes.
New options to bsdtar that enable or disable reading and/or writing of:
Access Control Lists (--acls, --no-acls)
Extended file flags (--fflags, --no-fflags)
Extended attributes (--xattrs, --no-xattrs)
Mac OS X metadata (Mac OS X only) (--mac-metadata, --no-mac-metadata)
locks: let primitives for modules unlock without always goging to the slsow path
It is only needed if the LOCK_PROFILING is enabled. It has to always check if
the lock is about to be released which requires an avoidable read if the option
is not specified..
==
sx: fix compilation on UP kernels after r313855
sx primitives use inlines as opposed to macros. Change the tested condition
to LOCK_DEBUG which covers the case, but is slightly overzelaous.
==
mtx: microoptimize lockstat handling in __mtx_lock_sleep
This saves a function call and multiple branches after the lock is acquired.
==
mtx: restrict r313875 to kernels without LOCK_PROFILING
==
mtx: get rid of file/line args from slow paths if they are unused
This denotes changes which went in by accident in r313877.
On most production kernels both said parameters are zeroed and have nothing
reading them in either __mtx_lock_sleep or __mtx_unlock_sleep. Thus this change
stops passing them by internal consumers which this is the case.
Kernel modules use _flags variants which are not affected kbi-wise.
==
sx: fix mips builld after r313855
The namespace in this file really needs cleaning up. In the meantime
let inline primitives be defined as long as LOCK_DEBUG is not enabled.
In particular thius reduces accesses of the lock itself.
==
locks: make trylock routines check for 'unowned' value
Since fcmpset can fail without lock contention e.g. on arm, it was possible
to get spurious failures when the caller was expecting the primitive to succeed.
==
mtx: microoptimize lockstat handling in spin mutexes and thread lock
While here make the code compilablle on kernels with LOCK_PROFILING but without
KDTRACE_HOOKS.
==
KDTRACE_HOOKS isn't guaranteed to be defined. Change to check to see
if it is defined or not rather than if it is non-zero.
locks: let primitives for modules unlock without always goging to the slsow path
It is only needed if the LOCK_PROFILING is enabled. It has to always check if
the lock is about to be released which requires an avoidable read if the option
is not specified..
==
sx: fix compilation on UP kernels after r313855
sx primitives use inlines as opposed to macros. Change the tested condition
to LOCK_DEBUG which covers the case, but is slightly overzelaous.
mtx: get rid of file/line args from slow paths if they are unused
This denotes changes which went in by accident in r313877.
On most production kernels both said parameters are zeroed and have nothing
reading them in either __mtx_lock_sleep or __mtx_unlock_sleep. Thus this change
stops passing them by internal consumers which this is the case.
Kernel modules use _flags variants which are not affected kbi-wise.
mav [Thu, 16 Mar 2017 07:11:05 +0000 (07:11 +0000)]
MFC r314549: Execute last ZIO of log commit synchronously.
For short transactions overhead of context switch can be too large.
Skipping it gives significant latency reduction. For large ones,
including multiple ZIOs, latency is less critical, while throughput
there may become limited by checksumming speed of single CPU core.
To get best of both cases, execute last ZIO directly from calling
thread context to save latency, while all others (if there are any)
enqueue to taskqueues in traditional way.
mjg [Thu, 16 Mar 2017 07:10:08 +0000 (07:10 +0000)]
MFC r313853,r313859:
locks: remove SCHEDULER_STOPPED checks from primitives for modules
They all fallback to the slow path if necessary and the check is there.
This means a panicked kernel executing code from modules will be able to
succeed doing actual lock/unlock, but this was already the case for core code
which has said primitives inlined.
==
Introduce SCHEDULER_STOPPED_TD for use when the thread pointer was already read
mjg [Thu, 16 Mar 2017 06:51:00 +0000 (06:51 +0000)]
MFC r313392,r313784:
rwlock: implement RW_LOCK_WRITER_RECURSED bit
This moves recursion handling out of the inlined wunlock path and in
particular saves a read and a branch.
==
rwlock: tidy up r313392
While a new bit was added and thread alignment got shifted to accomodate it,
RW_READERS_SHIFT was not modified accordingly and clashed with the new flag.
This was surprisingly harmless. If the lock was taken for writing, other flags
were tested. If the lock was taken for reading, it would correctly work for
readers > 1 and this was the only relevant test performed.
mjg [Thu, 16 Mar 2017 06:45:36 +0000 (06:45 +0000)]
MFC r313275,r313280,r313282,r313335:
mtx: move lockstat handling out of inline primitives
Lockstat requires checking if it is enabled and if so, calling a 6 argument
function. Further, determining whether to call it on unlock requires
pre-reading the lock value.
This is problematic in at least 3 ways:
- more branches in the hot path than necessary
- additional cacheline ping pong under contention
- bigger code
Instead, check first if lockstat handling is necessary and if so, just fall
back to regular locking routines. For this purpose a new macro is introduced
(LOCKSTAT_PROFILE_ENABLED).
LOCK_PROFILING uninlines all primitives. Fold in the current inline lock
variant into the _mtx_lock_flags to retain the support. With this change
the inline variants are not used when LOCK_PROFILING is defined and thus
can ignore its existence.
A remaining action is to remove spurious arguments for internal kernel
consumers.
==
sx: move lockstat handling out of inline primitives
See r313275 for details.
==
rwlock: move lockstat handling out of inline primitives
See r313275 for details.
One difference here is that recursion handling was removed from the fallback
routine. As it is it was never supposed to see a recursed lock in the first
place. Future changes will move it out of inline variants, but right now
there is no easy to way to test if the lock is recursed without reading
additional words.
==
locks: fix recursion support after recent changes
When a relevant lockstat probe is enabled the fallback primitive is called with
a constant signifying a free lock. This works fine for typical cases but breaks
with recursion, since it checks if the passed value is that of the executing
thread.
The found value is passed to locking routines in order to reduce cacheline
accesses.
mtx_unlock grows an explicit check for regular unlock. On ll/sc architectures
the routine can fail even if the lock could have been handled by the inline
primitive.
==
rwlock: switch to fcmpset
==
sx: switch to fcmpset
==
sx: uninline slock/sunlock
Shared locking routines explicitly read the value and test it. If the
change attempt fails, they fall back to a regular function which would
retry in a loop.
The problem is that with many concurrent readers the risk of failure is pretty
high and even the value returned by fcmpset is very likely going to be stale
by the time the loop in the fallback routine is reached.
Uninline said primitives. It gives a throughput increase when doing concurrent
slocks/sunlocks with 80 hardware threads from ~50 mln/s to ~56 mln/s.
Interestingly, rwlock primitives are already not inlined.
==
sx: add witness support missed in r313272
==
mtx: fix up _mtx_obtain_lock_fetch usage in thread lock
Since _mtx_obtain_lock_fetch no longer sets the argument to MTX_UNOWNED,
callers have to do it on their own.
==
mtx: fixup r313278, the assignemnt was supposed to go inside the loop
==
mtx: fix spin mutexes interaction with failed fcmpset
While doing so move recursion support down to the fallback routine.
==
locks: ensure proper barriers are used with atomic ops when necessary
Unclear how, but the locking routine for mutexes was using the *release*
barrier instead of acquire. This must have been either a copy-pasto or bad
completion.
Going through other uses of atomics shows no barriers in:
- upgrade routines (addressed in this patch)
- sections protected with turnstile locks - this should be fine as necessary
barriers are in the worst case provided by turnstile unlock
I would like to thank Mark Millard and andreast@ for reporting the problem and
testing previous patches before the issue got identified.
dchagin [Thu, 16 Mar 2017 06:32:58 +0000 (06:32 +0000)]
MFC r313740:
Replace Linuxulator implementation of readdir(), getdents() and
getdents64() with wrapper over kern_getdirentries().
The patch was originally written by emaste@ and then adapted by trasz@
and me.
Note:
1. I divided linux_getdents() and linux_readdir() as in case when the
getdents() called with count = 1 (readdir() case) it can overwrite
user stack (by writing to user buffer pointer more than 1 byte).
2. Linux returns EINVAL in case when user supplied buffer is not enough
to contain fetched dirent.
3. Linux returns ENOTDIR in case when fd points to not a directory.
Summary:
atomic_fcmpset_*() is analogous to atomic_cmpset(), but saves off the read value
from the target memory location into the 'old' pointer in the case of failure.
==
i386: add atomic_fcmpset
==
Don't retry a lost reservation in atomic_fcmpset()
The desired behavior of atomic_fcmpset_() is to always exit on error. Instead
of retrying on lost reservation, leave the retry to the caller, and return
==
Add atomic_fcmpset_*() inlines for MIPS
atomic_fcmpset_*() is analogous to atomic_cmpset(), but saves off the
read value from the target memory location into the 'old' pointer.
==
i386: fixup fcmpset
An incorrect output specifier was used which worked with clang by accident,
but breaks with the in-tree gcc version.
ngie [Thu, 16 Mar 2017 01:59:41 +0000 (01:59 +0000)]
MFC r314924:
sbin/devfs: clarify usage
- Note existence of -m option.
- Note that -s applies to rule keyword, only, by adding usage text
specifically for the `rule` and `ruleset` keywords.
Don't go into any further detail in usage(..) -- it's best that one
reads the manpage to get a better idea of how things work as there are
a number of different option-specific keywords and arguments, as well
as some rule grammar.
ngie [Thu, 16 Mar 2017 01:53:21 +0000 (01:53 +0000)]
MFC r314831:
Don't rely on dependency in Makefile.inc1 for strfile; make datfiles depend on strfile
In most cases strfile is built as part of build-tools, but in the event that someone
cd'ed to the directory, tried to build from scratch, and had MK_GAMES=no previously,
the build would fail in .../datfiles , trying to find strfile .
Mark this directory tree "SUBDIR_PARALLEL" safe to help facilitate this, instead of
shuffling around the SUBDIR entries (all of the other Makefiles will build standalone).
ngie [Thu, 16 Mar 2017 01:38:07 +0000 (01:38 +0000)]
MFC r314830:
mergemaster: fix description of -p
-p only handles updating /etc/master.passwd and /etc/group . No more,
no less.
Also, mergemaster (and no other portions of the vanilla FreeBSD build
process) should be messing with __MAKECONF or SRCCONF as part of the
installworld or distribution process. Don't insinuate that mergemaster
does that as it's a false claim.
ngie [Thu, 16 Mar 2017 01:36:09 +0000 (01:36 +0000)]
MFC r314869,r314871,r314872:
r314869:
Alphabetically sort variables
The only content change is minor rewording around CLEANDIRS/CLEANFILES to
accomodate sorting order.
r314871:
Fix LINKS example in bsd.prog.mk
LINKS appends DESTDIR -- don't suggest double-append in example.
r314872:
Add bsd.man.mk references for MAN under bsd.lib.mk and bsd.prog.mk
The latter set of manpages directly consume bsd.man.mk, so the bsd.man.mk
behavior should be the source of truth for underlying behavior, whereas
the other manpage fragment descriptions should document how they tweak
the variable behavior, if at all (bsd.prog.mk does tweak the default
value, as noted in its description)
mjg [Thu, 16 Mar 2017 01:32:56 +0000 (01:32 +0000)]
MFC r311172,r311194,r311226,r312389,r312390:
mtx: reduce lock accesses
Instead of spuriously re-reading the lock value, read it once.
This change also has a side effect of fixing a performance bug:
on failed _mtx_obtain_lock, it was possible that re-read would find
the lock is unowned, but in this case the primitive would make a trip
through turnstile code.
This is diff reduction to a variant which uses atomic_fcmpset.
==
Reduce lock accesses in thread lock similarly to r311172
==
mtx: plug open-coded mtx_lock access missed in r311172
mjg [Thu, 16 Mar 2017 00:51:24 +0000 (00:51 +0000)]
MFC r312890,r313386,r313390:
Sprinkle __read_mostly on backoff and lock profiling code.
==
locks: change backoff to exponential
Previous implementation would use a random factor to spread readers and
reduce chances of starvation. This visibly reduces effectiveness of the
mechanism.
Switch to the more traditional exponential variant. Try to limit starvation
by imposing an upper limit of spins after which spinning is half of what
other threads get. Note the mechanism is turned off by default.
==
locks: follow up r313386
Unfinished diff was committed by accident. The loop in lock_delay
was changed to decrement, but the loop iterator was still incrementing.
mizhka [Wed, 15 Mar 2017 21:01:03 +0000 (21:01 +0000)]
MFC r310017-r310018
r310017:
[spi] reformat message and ar5315_spi minor fix
This commit corrects print of nomatch (newline was too early) and fix
unit number for new child in ar5315_spi (was 0, now is -1 to calculate it
according to actual system state)
dim [Wed, 15 Mar 2017 19:50:58 +0000 (19:50 +0000)]
MFC r310232:
After r310171, the kernel version of sscanf() has format string checking
enabled. This results in a -Werror warning in mlx4ib:
sys/dev/mlx4/mlx4_ib/mlx4_ib_sysfs.c:90:22: error: format specifies type 'unsigned long long *' but the argument has type 'u64 *' (aka 'unsigned long *') [-Werror,-Wformat]
sscanf(buf, "%llx", &sysadmin_ag_val);
~~~~ ^~~~~~~~~~~~~~~~
Change sysadmin_ag_val to unsigned long long to avoid the warning.
dchagin [Wed, 15 Mar 2017 16:38:39 +0000 (16:38 +0000)]
MFC r305093 (by mjg@):
fd: add fdeget_locked and use in kern_descrip
MFC r305756 (by oshogbo@):
fd: add fget_cap and fget_cap_locked primitives.
They can be used to obtain capabilities along with a referenced fp.
MFC r306174 (by oshogbo@):
capsicum: propagate rights on accept(2)
Descriptor returned by accept(2) should inherits capabilities rights from
the listening socket.
PR: 201052
MFC r306184 (by oshogbo@):
fd: simplify fgetvp_rights by using fget_cap_locked.
MFC r306225 (by mjg@):
fd: fix up fgetvp_rights after r306184
fget_cap_locked returns a referenced file, but the fgetvp_rights does
not need it. Instead, due to the filedesc lock being held, it can
ref the vnode after the file was looked up.
Fix up fget_cap_locked to be consistent with other _locked helpers and not
ref the file.
This plugs a leak introduced in r306184.
MFC r306232 (by oshogbo@):
fd: fix up fget_cap
If the kernel is not compiled with the CAPABILITIES kernel options
fget_unlocked doesn't return the sequence number so fd_modify will
always report modification, in that case we got infinity loop.
MFC r311474 (by glebius@):
Use getsock_cap() instead of fgetsock().
MFC r312079 (by glebius@):
Use getsock_cap() instead of deprecated fgetsock().
MFC r312081 (by glebius@):
Use getsock_cap() instead of deprecated fgetsock().
MFC r312087 (by glebius@):
Remove deprecated fgetsock() and fputsock().
Bump __FreeBSD_version as getsock_cap changed and
fgetsock/fputsock pair removed.
pfg [Wed, 15 Mar 2017 15:33:32 +0000 (15:33 +0000)]
MFC r315095, r315096, r315097, r315187:
libc: small cleanups.
Rename nitems to numitems: it shares the anme with an existing macro in
sys/params.h. Also initialize the value later which avoids asigning the
value if we exit early.
Unsign setlen: it is local and will never be negative. Having one more bit
for growth is beneficial and it avoids a cast when it's going to be used
for allocation.
Remove unused initialization: "num" is properly defined before use.
mjg [Tue, 14 Mar 2017 20:43:04 +0000 (20:43 +0000)]
MFC r312724,r312901,r312902:
hwpmc: partially depessimize munmap handling if the module is not loaded
HWPMC_HOOKS is enabled in GENERIC and triggers some work avoidable in the
common (module not loaded) case.
In particular this avoids permission checks + lock downgrade
singlethreaded and in cases were an executable mapping is found the pmc
sx lock is no longer bounced.
Note this is a band aid.
==
hwpmc: partially depessimize mmap handling if the module is not loaded
In particular this means the pmc sx lock is no longer taken when an
executable mapping succeeds.
==
hwpmc: annotate pmc_hook and pmc_intr as __read_mostly
mjg [Tue, 14 Mar 2017 20:39:06 +0000 (20:39 +0000)]
MFC r312888:
Introduce __read_mostly and __exclusive_cache_line macros.
The intended use is to annotate frequently used globals which either rarely
change (and thus can be grouped in the same cacheline) or are an atomic counter
(which means it may benefit from being the only variable in the cacheline).
Linker script support is provided only for amd64. Architectures without it risk
having other variables put in, i.e. as if they were not annotated. This is
harmless from correctness point of view.
pfg [Tue, 14 Mar 2017 20:14:57 +0000 (20:14 +0000)]
MFC r312934:
Make use of clang nullability attributes in C headers.
Replace uses of the GCC __nonnull__ attribute with the clang nullability
qualifiers. These are starting to get use in clang's static analyzer.
Replacement should be transparent for developers using clang. GCC ports
from older FreeBSD versions may need updating if the compiler was built
before r312860 (Jan-27-2017).