Enji Cooper [Fri, 20 Jan 2017 03:23:24 +0000 (03:23 +0000)]
Replace dot-dot relative pathing with SRCTOP-relative paths where possible
This reduces build output, need for recalculating paths, and makes it clearer
which paths are relative to what areas in the source tree. The change in
performance over a locally mounted UFS filesystem was negligible in my testing,
but this may more positively impact other filesystems like NFS.
LIBC_SRCTOP was left alone so Juniper (and other users) can continue to
manipulate lib/libc/Makefile (and other Makefile.inc's under lib/libc) as
include Makefiles with custom options.
Ed Maste [Fri, 20 Jan 2017 02:09:59 +0000 (02:09 +0000)]
libc: remove reference to nonexistent lib/locale directory
As far as I can tell this was introduced in r72406 and updated in several
subsequent revisions, but the lib/locale directory it referenced never
existed.
Reviewed by: ngie
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D9252
Pedro F. Giffuni [Fri, 20 Jan 2017 00:02:11 +0000 (00:02 +0000)]
mppc - Finish pluging NETGRAPH_MPPC_COMPRESSION.
There were several places where reference to compression were left
unfinished. Furthermore, KASSERTs contained references to MPPC_INVALID
which is not defined in the tree and therefore were sure to break with
INVARIANTS: comment them out.
Reported by: Eugene Grosbein
PR: 216265
MFC after: 3 days
Scott Long [Thu, 19 Jan 2017 21:47:50 +0000 (21:47 +0000)]
Rework the debug print API. Event printing no longer gets special handling.
All of the printing from the tables file now has wrappers so that the
handling is cleaner and it's possible to print something out (say, during
development) without having to fight the global debug flags. This re-org
will also make it easier to have the tables be compiled out at build time
if desired.
Other than fixing some minor bugs, there are no user-visible changes from
this change
Sponsored by: Netflix, Inc.
Differential Revision: D9238
Add mount option for tmpfs(5) to not use namecache.
The option "nonc" disables using of namecache for the created mount,
by default namecache is used. The rationale for the option is that
namecache duplicates the information which is already kept in memory
by tmpfs. Since it believed that namecache scales better than tmpfs,
or will scale better, do not enable the option by default. On the
other hand, smaller machines may benefit from lesser namecache
pressure.
Discussed with: mjg
Tested by: pho (as part of larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
For directories, node->tn_spec.tn_dir.tn_parent pointer to the parent
is used. For non-directories, the implementation is naive, all
directory nodes are scanned to find a dirent linking the specified
node. This can be significantly improved by maintaining tn_parent for
all nodes, later.
Tested by: pho (as part of larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
On dotdot lookup and fhtovp operations, it is possible for the file
represented by tmpfs node to be removed after the thread calculated
the pointer. In this case, tmpfs_alloc_vp() accesses freed memory.
Introduce the reference count on the nodes. The allnodes list from
tmpfs mount owns 1 reference, and threads performing unlocked
operations on the node, add one transient reference. Similarly, since
struct tmpfs_mount maintains the list where nodes are enlisted,
refcount it by one reference from struct mount and one reference from
each node on the list. Both nodes and tmpfs_mounts are removed when
refcount goes to zero.
Note that this means that nodes and tmpfs_mounts might survive some
time after the node is deleted or tmpfs_unmount() finished. The
tmpfs_alloc_vp() in these cases returns error either due to node
removal (tn_nlinks == 0) or because of insmntque1(9) error.
Tested by: pho (as part of larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Andriy Gapon [Thu, 19 Jan 2017 18:46:41 +0000 (18:46 +0000)]
fix a thread preemption regression in schedulers introduced in r270423
Commit r270423 fixed a regression in sched_yield() that was introduced
in earlier changes. Unfortunately, at the same time it introduced an
new regression. The problem is that SWT_RELINQUISH (6), like all other
SWT_* constants and unlike SW_* flags, is not a bit flag. So, (flags &
SWT_RELINQUISH) is true in cases where that was not really indended,
for example, with SWT_OWEPREEMPT (2) and SWT_REMOTEPREEMPT (11).
A straight forward fix would be to use (flags & SW_TYPE_MASK) ==
SWT_RELINQUISH, but my impression is that the switch types are designed
mostly for gathering statistics, not for influencing scheduling
decisions.
So, I decided that it would be better to check for SW_PREEMPT flag
instead. That's also the same flag that was checked before r239157.
I double-checked how that flag is used and I am confident that the flag
is set only in the places where we really have the preemption:
- critical_exit + td_owepreempt
- sched_preempt in the ULE scheduler
- sched_preempt in the 4BSD scheduler
Fix problem with suspend and resume when using Skylake chipsets. Make
sure the XHCI controller is reset after halting it. The problem is
clearly a BIOS bug as the suspend and resume is failing without
loading the XHCI driver. The same happens when using Linux and the
XHCI driver is not loaded.
Conrad Meyer [Thu, 19 Jan 2017 16:46:05 +0000 (16:46 +0000)]
ffs_vnops: Simplify extattr access
As suggested in r167010, use the structure type and macros to access and
modify UFS2 extended attributes. Add assertions that pointers are
aligned in places where we now access the data through a structure
pointer, instead of character-by-character.
Remove TMPFS_ASSERT_ELOCKED(). Its claims are already stated by other
asserts nearby and by VFS guarantees.
Change TMPFS_ASSERT_LOCKED() and one inlined place to use
ASSERT_VOP_(E)LOCKED() instead of hand-rolled imprecise asserts.
Tested by: pho (as part of the larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Alan Somers [Wed, 18 Jan 2017 20:24:37 +0000 (20:24 +0000)]
Fix several Coverity CIDs in devd
CID 1362055, 1362054: File descriptor leaks during shutdown
CID 1362013: Potential null-termination fail with long network device names
CID 1362097: Uncaught exception during memory pressure
CID 1362017, 1362016: Unchecked errors, possibly resulting in weird behavior
if two devd instances start at the same time.
CID 1362015: Unchecked error that will probably never fail
Conrad Meyer [Wed, 18 Jan 2017 17:55:49 +0000 (17:55 +0000)]
ufs/extattr.h: Fix documentation of ea_name termination
The ea_name string is not nul-terminated. Correct the documentation.
Because the subsequent field is padded to 8 bytes, and the padding is
zeroed, the ea_name string will appear to be nul-terminated whenever the
length isn't exactly one (mod eight).
This was introduced in r167010 (2007).
Additionally, mark the length fields as unsigned. This particularly
matters for the single byte ea_namelength field, which can represent
extended attribute names up to 255 bytes long.
Implement kernel support for hardware rate limited sockets.
- Add RATELIMIT kernel configuration keyword which must be set to
enable the new functionality.
- Add support for hardware driven, Receive Side Scaling, RSS aware, rate
limited sendqueues and expose the functionality through the already
established SO_MAX_PACING_RATE setsockopt(). The API support rates in
the range from 1 to 4Gbytes/s which are suitable for regular TCP and
UDP streams. The setsockopt(2) manual page has been updated.
- Add rate limit function callback API to "struct ifnet" which supports
the following operations: if_snd_tag_alloc(), if_snd_tag_modify(),
if_snd_tag_query() and if_snd_tag_free().
- Add support to ifconfig to view, set and clear the IFCAP_TXRTLMT
flag, which tells if a network driver supports rate limiting or not.
- This patch also adds support for rate limiting through VLAN and LAGG
intermediate network devices.
- How rate limiting works:
1) The userspace application calls setsockopt() after accepting or
making a new connection to set the rate which is then stored in the
socket structure in the kernel. Later on when packets are transmitted
a check is made in the transmit path for rate changes. A rate change
implies a non-blocking ifp->if_snd_tag_alloc() call will be made to the
destination network interface, which then sets up a custom sendqueue
with the given rate limitation parameter. A "struct m_snd_tag" pointer is
returned which serves as a "snd_tag" hint in the m_pkthdr for the
subsequently transmitted mbufs.
2) When the network driver sees the "m->m_pkthdr.snd_tag" different
from NULL, it will move the packets into a designated rate limited sendqueue
given by the snd_tag pointer. It is up to the individual drivers how the rate
limited traffic will be rate limited.
3) Route changes are detected by the NIC drivers in the ifp->if_transmit()
routine when the ifnet pointer in the incoming snd_tag mismatches the
one of the network interface. The network adapter frees the mbuf and
returns EAGAIN which causes the ip_output() to release and clear the send
tag. Upon next ip_output() a new "snd_tag" will be tried allocated.
4) When the PCB is detached the custom sendqueue will be released by a
non-blocking ifp->if_snd_tag_free() call to the currently bound network
interface.
Andrew Turner [Wed, 18 Jan 2017 13:27:24 +0000 (13:27 +0000)]
Use the kernel stack in the ARM FBT DTrace provider. This is used to find
the fifth argument to functions being traced, however there was an error
where the userspace stack was being used. This may be invalid leading to
a kernel panic if this address is unmapped.
Justin Hibbits [Wed, 18 Jan 2017 03:42:21 +0000 (03:42 +0000)]
Use the explicit expanded form of cmp.
Clang apparently requires the explicit form of this instruction, and rejects
uses which ignore the optional cmpD register. This was the only use of the
shorthand form of the instruction, so just fix it up to match the others.
PR: kern/215681
Submitted by: Mark Millard
Reported by: Mark Millard <markmi _AT_ dsl-only.net>
MFC after: 2 weeks
Ed Schouten [Tue, 17 Jan 2017 22:03:08 +0000 (22:03 +0000)]
Sync in the latest CloudABI generated source files.
Languages like C++17 and Go provide direct support for slice types:
pointer/length pairs. The CloudABI generator now has more complete for
this, meaning that for the C binding, pointer/length pairs now use an
automatic naming scheme of ${name} and ${name}_len.
Apart from this change and some reformatting, the ABI definitions are
identical. Binary compatibility is preserved entirely.
Alexander Motin [Tue, 17 Jan 2017 18:32:47 +0000 (18:32 +0000)]
Remove writing 'residual' field of struct ctl_scsiio.
This field has no practical use and never readed. Initiators already
receive respective residual size from frontends. Removed field had
different semantics, which looks useless, and was never passed through
by any frontend.
While there, fix kern_data_resid field support in case of HA, missed in
r312291.
Initialize IPFW static rules rmlock with RM_RECURSE flag.
This lock was replaced from rwlock in r272840. But unlike rwlock, rmlock
doesn't allow recursion on rm_rlock(), so at this time fix this with
RM_RECURSE flag. Later we need to change ipfw to avoid such recursions.
Gleb Smirnoff [Tue, 17 Jan 2017 03:52:57 +0000 (03:52 +0000)]
Fix regression from r310655, which broke operation of bsnmpd if it is bound
to a non-wildcard address. As documented in ip(4), doing sendmsg(2) with
IP_SENDSRCADDR on a socket that is bound to non-wildcard address is
completely different to using this control message on a wildcard one.
A fix is to add a bool to mark whether we did setsockopt(IP_RECVDSTADDR)
on the socket, and use IP_SENDSRCADDR control message only if we did.
While here, garbage collect absolutely useless udp_recv() function that
establishes some structures on stack to never use them later.
[zynq] Fix panic on USB PHY initialization failure
The Zedboard has a hardware bug where initialization of the USB PHY
occasionally fails on boot-up. Fix regression in -CURRENT when
kernel panics on such occasion. 11-RELEASE branch works fine
PR: 215862
Submitted by: Thomas Skibo <thoma555-bsd@yahoo.com>
Ed Maste [Mon, 16 Jan 2017 20:34:42 +0000 (20:34 +0000)]
disambiguate msleep KASSERT diagnostics
Previously "panic: msleep" could happen for a few different reasons.
Break the KASSERTs out into individual cases to identify the failing
condition. Found during the investigation that resulted in r308288.
Reviewed by: kib, jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D8604
Maxim Sobolev [Mon, 16 Jan 2017 17:46:38 +0000 (17:46 +0000)]
Add a new socket option SO_TS_CLOCK to pick from several different clock
sources to return timestamps when SO_TIMESTAMP is enabled. Two additional
clock sources are:
o nanosecond resolution realtime clock (equivalent of CLOCK_REALTIME);
o nanosecond resolution monotonic clock (equivalent of CLOCK_MONOTONIC).
In addition to this, this option provides unified interface to get bintime
(equivalent of using SO_BINTIME), except it also supported with IPv6 where
SO_BINTIME has never been supported. The long term plan is to depreciate
SO_BINTIME and move everything to using SO_TS_CLOCK.
Idea for this enhancement has been briefly discussed on the Net session
during dev summit in Ottawa last June and the general input was positive.
This change is believed to benefit network benchmarks/profiling as well
as other scenarios where precise time of arrival measurement is necessary.
There are two regression test cases as part of this commit: one extends unix
domain test code (unix_cmsg) to test new SCM_XXX types and another one
implementis totally new test case which exchanges UDP packets between two
processes using both conventional methods (i.e. calling clock_gettime(2)
before recv(2) and after send(2)), as well as using setsockopt()+recv() in
receive path. The resulting delays are checked for sanity for all supported
clock types.
Sean Bruno [Mon, 16 Jan 2017 16:58:12 +0000 (16:58 +0000)]
Change startup order for the no EARLY_AP_STARTUP case to initialize
gtaskqueue bits at SI_SUB_INIT_IF instead of waiting until SI_SUB_SMP
which is far too late.
Add an assertion in taskqgroup_attach() to catch startup initialization
failures in the future.
Ian Lepore [Mon, 16 Jan 2017 16:44:13 +0000 (16:44 +0000)]
Remove arm's cpuconf.h, and references to it, after moving a few lines from
it into pmap-v4.h where they are used. Other than those few lines of
support for different MMU types, nothing in cpuconf.h has been used in our
code for quite a while.
The file existed to set up a variety of symbols to describe the
architecture. Over the past few years we have converted all of our source
to use the new architecture symbols standardized by ARM Inc, and predefined
by both clang and gcc.
Alexander Motin [Mon, 16 Jan 2017 16:19:55 +0000 (16:19 +0000)]
Make CTL frontends report kern_data_resid for under-/overruns.
It seems like kern_data_resid was never really implemented. This change
finally does it. Now frontends update this field while transferring data,
while CTL/backends getting it can more flexibly handle the result.
At this point behavior should not change significantly, still reporting
errors on write overrun, but that may be changed later, if we decide so.
CAM target frontend still does not properly handle overruns due to CAM API
limitations. We may need to add some fields to struct ccb_accept_tio to
pass information about initiator requested transfer size(s).
Michael Zhilin [Mon, 16 Jan 2017 15:36:36 +0000 (15:36 +0000)]
[gpioths] new driver for temperature/humidity sensor DHT11
This patch adds driver for temperature/humidity sensor connected via GPIO.
To compile it into kernel add "device gpioths". To activate driver, use
hints (.at and .pins) for gpiobus. As result it will provide temperature &
humidity values via sysctl.
DHT11 is cheap & popular temperature/humidity sensor used via GPIO on ARM
or MIPS devices like Raspberry Pi or Onion Omega.
Reviewed by: adrian
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D9185
Ed Maste [Mon, 16 Jan 2017 14:49:29 +0000 (14:49 +0000)]
rtld: do not rely on a populated GOT on amd64
On rela architectures GNU BFD ld and gold store the relocation addend
in GOT entries (in addition to the relocation's r_addend field).
rtld previously relied on this to access its own _DYNAMIC symbol in
order to apply its own relocations.
However, recording addends in the GOT is not specified by the ABI,
and some versions of LLVM's LLD linker leave the GOT uninitialized on
rela architectures.
BFD ld does not populate the GOT on sparc64, and sparc64 rtld has a
machine-dependent rtld_dynamic_addr() function that returns the
_DYNAMIC address. Use the same approach on amd64, obtaining the %rip-
relative _DYNAMIC address following a suggestion from Rafael EspĂndola.
Architectures other than amd64 should be addressed in future work.
PR: 214972
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D9180
Toomas Soome [Sun, 15 Jan 2017 20:03:13 +0000 (20:03 +0000)]
loader.efi: find_currdev() can leak memory
The find_currdev() is using variable "copy" to store the reference to trimmed
devpath pointer, if for some reason the efi_devpath_handle() fails, we will
leak this copy.
Jilles Tjoelker [Sun, 15 Jan 2017 13:40:14 +0000 (13:40 +0000)]
skel: Do not set -o emacs in .shrc.
sh has defaulted to 'set -o emacs' since FreeBSD 9.0. Therefore, do not set
this again in .shrc, since that only serves to prevent invocations like
'sh -o vi' and 'sh +o emacs' to have the intended effect.
PR: 215958
Submitted by: Andras Farkas
MFC after: 1 week
Kristof Provost [Sun, 15 Jan 2017 10:21:25 +0000 (10:21 +0000)]
arswitch: Ensure the lock is always held when calling arswitch_modifyreg()
arswitch_setled() and a number of _global_setup functions did not acquire the
lock before calling arswitch_modifyreg(). With WITNESS enabled this would
instantly panic.
Discovered on a TPLink-3600:
("panic: mutex arswitch not owned at sys/dev/etherswitch/arswitch/arswitch_reg.c:236")
Reviewed by: adrian, kan
Differential Revision: https://reviews.freebsd.org/D9187
Enji Cooper [Sun, 15 Jan 2017 09:13:41 +0000 (09:13 +0000)]
Mark testcases which use cap_enter as expected failures until the
PR is resolved so those of us that run the tests don't have the
bogus failures counted against our overall results
Enji Cooper [Sun, 15 Jan 2017 09:05:26 +0000 (09:05 +0000)]
Turn COMPILER_VERSION/COMPILER_TYPE make check into a compile-time check
of the clang version
This works around breakage on ^/stable/10 when running installworld from
a ^/stable/10 host where the test wouldn't be compiled on the first
go-around and would be missing when make installworld is run.