MFC r328770:
Merge r1.120 from NetBSD:
Fix a pretty simple, yet pretty tragic typo: we should return IPPROTO_DONE,
not IPPROTO_NONE. With IPPROTO_NONE we will keep parsing the header chain
on an mbuf that was already freed.
Reported by: Maxime Villard <max at m00nbsd dot net>
ae [Wed, 31 Jan 2018 09:26:28 +0000 (09:26 +0000)]
MFC r328350:
Merge revision 1.35 from NetBSD:
fix pointer/offset mistakes in handling of IPv4 options
Reported by: Maxime Villard <maxv at NetBSD.org>
MFC r328352:
Adopt revision 1.76 and 1.77 from NetBSD:
Fix a vulnerability in IPsec-IPv6-AH, that allows an attacker to remotely
crash the kernel with a single packet.
In this loop we need to increment 'ad' by two, because the length field
of the option header does not count the size of the option header itself.
If the length is zero, then 'count' is incremented by zero, and there's
an infinite loop. Beyond that, this code was written with the assumption
that since the IPv6 packet already went through the generic IPv6 option
parser, several fields are guaranteed to be valid; but this assumption
does not hold because of the missing '+2', and there's as a result a
triggerable buffer overflow (write zeros after the end of the mbuf,
potentially to the next mbuf in memory since it's a pool).
Add the missing '+2', this place will be reinforced in separate commits.
jhb [Mon, 29 Jan 2018 23:43:04 +0000 (23:43 +0000)]
MFC 327561:
Report offset relative to the backing object for kinfo_vmentry structures.
For the pathname reported in kinfo_vmentry structures (kve_path), the
sysctl handlers walk the object chain to find the bottom-most VM object.
This permits a COW mapping of a file with dirty pages to report the
pathname of the originally mapped file. Do the same for the object
offset (kve_offset) computing a cumulative offset during the same object
walk so that the reported offset is relative to the reported pathname.
Note that ptrace(PT_VM_ENTRY) already returns a cumulative offset
rather than the raw offset of the VM map entry.
Note also that this does not affect procstat -v output (even structured
output) since that output does not include the kve_offset field.
dim [Mon, 29 Jan 2018 18:11:27 +0000 (18:11 +0000)]
Pull in r217197 from upstream clang trunk (by Richard Smith):
PR20844: If we fail to list-initialize a reference, map to the
referenced type before retrying the initialization to produce
diagnostics. Otherwise, we may fail to produce any diagnostics, and
silently produce invalid AST in a -Asserts build. Also add a note to
this codepath to make it more clear why we were trying to create a
temporary.
This should fix assertions when parsing some forms of incomplete list
intializers.
Direct commit to stable/9 and stable/10, since stable/11 and head
already have this upstream fix.
gjb [Fri, 26 Jan 2018 04:32:31 +0000 (04:32 +0000)]
MFC r328283, r328284:
r328283:
When CHROOTBUILD_SKIP is set, evaluate the existence of /bin/sh
within the CHROOTDIR. If it does not exist, unset CHROOTBUILD_SKIP
to prevent build failures.
jhb [Wed, 24 Jan 2018 21:48:39 +0000 (21:48 +0000)]
MFC 325028,328344: Discard the correct thread event reported for a ptrace stop.
325028:
Discard the correct thread event reported for a ptrace stop.
When multiple threads wish to report a tracing event to a debugger,
both threads call ptracestop() and one thread will win the race to be
the reporting thread (p->p_xthread). The debugger uses PT_LWPINFO
with the process ID to determine which thread / LWP is reporting an
event and the details of that event. This event is cleared as a side
effect of the subsequent ptrace event that resumed the process
(PT_CONTINUE, PT_STEP, etc.). However, ptrace() was clearing the
event identified by the LWP ID passed to the resume request even if
that wasn't the 'p_xthread'. This could result in clearing an event
that had not yet been observed by the debugger and leaving the
existing event for 'p_thread' pending so that it was reported a second
time.
Specifically, if the debugger stopped due to a software breakpoint in
one thread, but then switched to another thread that was used to
resume (e.g. if the user switched to a different thread and issued a
step), the resume request (PT_STEP) cleared a pending event (if any)
for the thread being stepped. However, the process immediately
stopped and the first thread reported it's breakpoint event a second
time. The debugger decremented the PC for "both" breakpoint events
which resulted in the PC now pointing into the middle of an
instruction (on x86) and a SIGILL fault when the process was resumed a
second time.
To fix, always clear the pending event for 'p_xthread' when resuming a
process. ptrace() still honors the requested LWP ID when enabling
single-stepping (PT_STEP) or setting a different PC (PT_CONTINUE).
328344:
Mark the unused argument to continue_thread() as such.
clang in HEAD and 11 does not warn about this, but clang in 10 does.
jhb [Wed, 24 Jan 2018 00:32:02 +0000 (00:32 +0000)]
MFC 326953:
Catch up to r325719 which makes the kern.proc.pid sysctl "work" for zombies.
Some of the ptrace tests need to wait for a child process to become a
zombie before preceding. The parent process polls the child process
via the kern.proc.pid sysctl to wait for it to become a zombie.
Previously the code polled until the sysctl failed with ESRCH. Now it
will poll until either the sysctl fails with ESRCH (for compatiblity
with older kernels) or returns a kinfo_proc structure with the ki_stat
field set to SZOMB.
kp [Tue, 23 Jan 2018 05:03:26 +0000 (05:03 +0000)]
MFC r327675
pf: Avoid integer overflow issues by using mallocarray() iso. malloc()
pfioctl() handles several ioctl that takes variable length input, these
include:
- DIOCRADDTABLES
- DIOCRDELTABLES
- DIOCRGETTABLES
- DIOCRGETTSTATS
- DIOCRCLRTSTATS
- DIOCRSETTFLAGS
All of them take a pfioc_table struct as input from userland. One of
its elements (pfrio_size) is used in a buffer length calculation.
The calculation contains an integer overflow which if triggered can lead
to out of bound reads and writes later on.
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
kp [Tue, 23 Jan 2018 04:37:31 +0000 (04:37 +0000)]
MFC r327674, r327796
Introduce mallocarray() in the kernel
Similar to calloc() the mallocarray() function checks for integer
overflows before allocating memory.
It does not zero memory, unless the M_ZERO flag is set.
Additionally, move the overflow check logic out to WOULD_OVERFLOW() for
consumers to have a common means of testing for overflowing allocations.
WOULD_OVERFLOW() should be a secondary check -- on 64-bit platforms, just
because an allocation won't overflow size_t does not mean it is a sane size
to request. Callers should be imposing reasonable allocation limits far,
far, below overflow.
cy [Tue, 23 Jan 2018 04:01:48 +0000 (04:01 +0000)]
MFC r327718:
When growing the state, also grow the seed array. Otherwise memory
that was not allocated will be accessed.
This necessitated refactoring state seed allocation from
ipf_state_soft_init() into a new common ipf_state_seed_alloc() function
as it is now also used by ipf_state_rehash() when changing the size of
the state hash table in addition to by ipf_state_soft_init() during
initialization.
According to Christos Zoulas <christos@NetBSD.org>:
The bug was encountered by a NetBSD vendor who's customer machines had
large ipfilter states. The bug was reliably triggered by resizing the
state variables using "ipf -T".
emaste [Tue, 23 Jan 2018 02:16:06 +0000 (02:16 +0000)]
MFC r317806 by glebius:
The nandsim(4) simulator driver doesn't have any protection against
races at least in its ioctl handler, and at the same time it creates
device entry with 0666 permissions.
To plug possible issues in it:
- Mark it as needing Giant.
- Switch device mode to 0600.
Submitted by: C Turt
Reviewed by: imp
Security: Possible double free in ioctl handler
cy [Sat, 20 Jan 2018 18:00:42 +0000 (18:00 +0000)]
MFC r327912:
Though this block of code is not used by FreeBSD, correct a call to
sprintf() with a macro call to SNPRINTF similar to other calls to
SNPRINTF within this same block.
cy [Fri, 12 Jan 2018 02:49:18 +0000 (02:49 +0000)]
MFC 327737:
USNO and possibly others have misinterpreted the maining of the
leapseconds last-update field and incorrectly increment it when changing
the file even though the leapsecond data has not changed. For instance,
if a leapsecond file is obtained from USNO, when it expires it will not
be replaced by a newer file from other sources because it has an
incorrect later last-update (version).
pfg [Wed, 10 Jan 2018 21:24:03 +0000 (21:24 +0000)]
MFC r327289:
rpc.sprayd: Bring some changes from NetBSD.
Most notable, other than some style issues:
CVS 1.11:
do not use LOG_CONS.
CVS 1.13:
consistently use exit instead of return in main().
use LOG_WARNING instead of LOG_ERR for non critical errors.
pfg [Thu, 4 Jan 2018 15:57:49 +0000 (15:57 +0000)]
MFC r327295:
Start syncing changes from OpenBSD's ip6_id.c instead of ip_id.c.
correct non-repetitive ID code, based on comments from niels provos.
- seed2 is necessary, but use it as "seed2 + x" not "seed2 ^ x".
- skipping number is not needed, so disable it for 16bit generator (makes
the repetition period to 30000)
dim [Mon, 1 Jan 2018 20:39:12 +0000 (20:39 +0000)]
MFC r327164:
Fix clang 6.0.0 compiler warnings in binutils
Latest clang git has a warning -Wnull-pointer-arithmetic which will
trigger a -Werror failure. Addition and subtraction from a null pointer
is undefined behaviour and could be optimized into anything.
Furthermore, using the difference between two pointers and casting the
result back to a pointer is not portable since the size of ptrdiff_t
does not necessary have to be the same as size of void* (this happens
e.g. on CHERI). Using intptr_t instead fixes this portability issue and
the compiler warning.
Submitted by; Alexander Richardson
Obtained from: CheriBSD
Differential Revision: https://reviews.freebsd.org/D12928
dim [Mon, 1 Jan 2018 20:24:16 +0000 (20:24 +0000)]
MFC r327167:
Remove obsolete register keyword from opensolaris's sysmacros.h. When
compiling zfsd with recent clang, it leads to a warning about the
register storage class being incompatible with C++17.
ae [Sun, 24 Dec 2017 02:06:16 +0000 (02:06 +0000)]
MFC r326898:
Fix possible memory leak.
vxlan_ftable entries are sorted in descending order, due to wrong arguments
order it is possible to stop search before existing element will be found.
Then new element will be allocated in vxlan_ftable_update_locked() and can
be inserted in the list second time or trigger MPASS() assertion with
enabled INVARIANTS.
rmacklem [Tue, 19 Dec 2017 23:00:08 +0000 (23:00 +0000)]
MFC: r326544
Avoid the overhead of acquiring a lock in nfsrv_checkgetattr() when
there are no write delegations issued.
manu@ reported on the freebsd-current@ mailing list that there was
a significant performance hit in nfsrv_checkgetattr() caused by
the acquisition/release of a state lock, even when there were no
write delegations issued.
This patch add a count of outstanding issued write delegations to the
NFSv4 server. This count allows nfsrv_checkgetattr() to return without
acquiring any lock when the count is 0, avoiding the performance hit
for the case where no write delegations are issued.
dim [Tue, 19 Dec 2017 11:44:24 +0000 (11:44 +0000)]
MFC r326880:
Pull in r320755 from upstream clang trunk (by me):
Don't trigger -Wuser-defined-literals for system headers
Summary:
In D41064, I proposed adding `#pragma clang diagnostic ignored
"-Wuser-defined-literals"` to some of libc++'s headers, since these
warnings are now triggered by clang's new `-std=gnu++14` default:
$ cat test.cpp
#include <string>
$ clang -std=c++14 -Wsystem-headers -Wall -Wextra -c test.cpp
In file included from test.cpp:1:
In file included from /usr/include/c++/v1/string:470:
/usr/include/c++/v1/string_view:763:29: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string_view<char> operator "" sv(const char *__str, size_t __len)
^
/usr/include/c++/v1/string_view:769:32: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string_view<wchar_t> operator "" sv(const wchar_t *__str, size_t __len)
^
/usr/include/c++/v1/string_view:775:33: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string_view<char16_t> operator "" sv(const char16_t *__str, size_t __len)
^
/usr/include/c++/v1/string_view:781:33: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string_view<char32_t> operator "" sv(const char32_t *__str, size_t __len)
^
In file included from test.cpp:1:
/usr/include/c++/v1/string:4012:24: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string<char> operator "" s( const char *__str, size_t __len )
^
/usr/include/c++/v1/string:4018:27: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string<wchar_t> operator "" s( const wchar_t *__str, size_t __len )
^
/usr/include/c++/v1/string:4024:28: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string<char16_t> operator "" s( const char16_t *__str, size_t __len )
^
/usr/include/c++/v1/string:4030:28: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string<char32_t> operator "" s( const char32_t *__str, size_t __len )
^
8 warnings generated.
Both @aaron.ballman and @mclow.lists felt that adding this workaround
to the libc++ headers was the wrong way, and it should be fixed in
clang instead.
Here is a proposal to do just that. I verified that this suppresses
the warning, even when -Wsystem-headers is used, and that the warning
is still emitted for a declaration outside of system headers.
This will allow to compile some of the libc++ headers in C++14 mode
(which is the default for gcc 6 and higher, and will be the default for
clang 6.0.0 and higher), with -Wsystem-headers and -Werror enabled.
cy [Wed, 13 Dec 2017 20:15:23 +0000 (20:15 +0000)]
MFC r324248:
hen building multiple kernels using KERNCONF, non-existent KERNCONF
files will produce an error and buildkernel will fail. Previously missing
KERNCONF files silently failed giving no indication as to why, only to
subsequently discover during installkernel that the desired kernel was
never built in the first place.
dim [Wed, 13 Dec 2017 18:38:02 +0000 (18:38 +0000)]
MFC r326748:
Document the existence and precision of the remaining long double
functions for which an imprecise stub implementation was added in
r255294, namely powl(3) and tgammal(3).
Submitted by: Steve Kargl
MFC r326753:
Correct r326748, indicating that tgammal(3) is mapped to tgamma(3), not
to itself.
pfg [Tue, 12 Dec 2017 22:10:12 +0000 (22:10 +0000)]
MFC r326282: (by fsu)
Remap ENOATTR to ENODATA in the linuxulator.
In the linux ENOADATA is frequently #defined as ENOATTR. The change is
required for an xattrs support implementation.
hselasky [Fri, 8 Dec 2017 19:21:46 +0000 (19:21 +0000)]
Add support for IPv6 based addresses as part of the TCP unify portspace feature
in ibcore. This resolves an interopability issue when using both iWarp(T6) and
RDMA(CX-4 and CX-5) devices at the same time.
The problem is IPv4 based sockets cannot be bound to an IPv6 based address
causing sobind() to fail preventing all use of IPv6 based addresses with RDMA
when an iWarp device is present.
hselasky [Fri, 8 Dec 2017 15:26:57 +0000 (15:26 +0000)]
MFC r326362:
Disallow TUN and TAP character device IOCTLs to modify the network device
type to any value. This can cause page faults and panics due to accessing
uninitialized fields in the "struct ifnet" which are specific to the network
device type.
Found by: jau@iki.fi
PR: 223767
Sponsored by: Mellanox Technologies
bapt [Fri, 8 Dec 2017 10:44:44 +0000 (10:44 +0000)]
MFC r326526:
In case man(1) found a catpage to display skip looking ".so" which is manpage
only.
In case we are trying to read a catpage, the manpage variable is not defined.
It results in the "cattool" having no arguments.
In case the catpage is compressed, the cattool used is "zcat" which dies if the
standard input is a terminal, meaning the function calling it is exiting as if
there were no ".so"
In case the catpage is uncompressed, the cattool used is "zcat -f" which waits
reading standard input, making the man(1) command hang.
- Mention mismatching numbers in MSR vs. ACPI _PSS count warning.
- Rephrase unsupported AMD CPUs message and wrap as an overly long line.
- Improve readability when reporting resulted P-state transition (debug).
r322710, r323286 (cem):
- Add support for family 17h pstate info from MSRs.
- Yield CPU awaiting frequency change.
r326378, r326383, r326407:
- Fix some style(9) nits.
- Add a tunable "debug.hwpstate_verify" to check P-state after changing it
and turn it off by default.
gjb [Mon, 4 Dec 2017 15:28:07 +0000 (15:28 +0000)]
MFC r326315, r326330, r326331, r326412:
r326315:
Set DISTDIR and WRKDIRPREFIX when building ports within the
chroot(8) to avoid mtime changes within the ports checkout,
which can cause checksum differences.
r326330:
Add a comment to release/release.conf.sample documenting
EMBEDDEDPORTS. [1]
Remove and update stale documentation from release(7) while here.
r326331:
Correct a comment.
r326412:
Fix port build flags passed to make(1) after r326315, where
it was missed for embedded image builds.
PR: 206344 [1]
Sponsored by: The FreeBSD Foundation
hselasky [Mon, 4 Dec 2017 09:54:03 +0000 (09:54 +0000)]
MFC r325897:
Improve the library dependencies helper script in src/tools.
Implement double pass of the relevant Makefiles. First make a list of
library names and directories and then scan for all the dependencies.
Spaces in directories in the source tree are not supported.
This avoids using hardcoded mappings between the library name
and the directory containing the library Makefile.
avg [Fri, 1 Dec 2017 11:14:13 +0000 (11:14 +0000)]
MFC r326070: zfs_write: fix problem with writes appearing to succeed when over quota
The problem happens when the writes have offsets and sizes aligned with
a filesystem's recordsize (maximum block size). In this scenario
dmu_tx_assign() would fail because of being over the quota, but the uio
would already be modified in the code path where we copy data from the
uio into a borrowed ARC buffer. That makes an appearance of a partial
write, so zfs_write() would return success and the uio would be modified
consistently with writing a single block.
That bug can result in a data loss because the writes over the quota
would appear to succeed while the actual data is being discarded.
This commit fixes the bug by ensuring that the uio is not changed until
after all error checks are done. To achieve that the code now uses
uiocopy() + uioskip() as in the original illumos design. We can do that
now that uiocopy() has been updated in r326067 to use
vn_io_fault_uiomove().
kp [Thu, 30 Nov 2017 21:32:28 +0000 (21:32 +0000)]
MFC r325850: pfctl: teach route-to to deal with interfaces with multiple addresses
The route_host parsing code set the interface name, but only for the first
node_host in the list. If that one happened to be the inet6 address and the
rule wanted an inet address it'd get removed by remove_invalid_hosts() later
on, and we'd have no interface name.
We must set the interface name for all node_host entries in the list, not just
the first one.
asomers [Tue, 28 Nov 2017 19:57:16 +0000 (19:57 +0000)]
MFC r325363:
Fix mpr(4) panics caused by bad drive mapping tables
sys/dev/mpr/mpr_mapping.c
If _mapping_process_dpm_pg0 detects inconsistencies in the drive
mapping table (stored in the HBA's NVRAM), abort reading it and
continue to boot as if the mapping table were blank. I observed
such inconsistencies in several HBAs after upgrading firmware from
14.0.0.0 to 15.0.0.0.
https://www.illumos.org/issues/6866
Need this for #6865.
To be generally more scripting-friendly, overload this issue with adding '-q'
option which should skip printing any label information.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Yuri Pankov <yuri.pankov@nexenta.com>
https://www.illumos.org/issues/7280
zdb is very handy for diagnosing problems with a pool in a safe and
quick way. When a pool is in a bad shape, we often want to disable some
fail-safes, or adjust some tunables in order to open them. In the
kernel, this is done by changing public variables in mdb. The goal of
this feature is to add the same capability to zdb and ztest, so that
they can change libzpool tuneables from the command line.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
https://www.illumos.org/issues/8473
Scrubbing is supposed to detect and repair all errors in the pool. However,
it wrongly ignores active spare devices. The problem can easily be
reproduced in OpenZFS at git rev 0ef125d with these commands:
truncate -s 64m /tmp/a /tmp/b /tmp/c
sudo zpool create testpool mirror /tmp/a /tmp/b spare /tmp/c
sudo zpool replace testpool /tmp/a /tmp/c
/bin/dd if=/dev/zero bs=1024k count=63 oseek=1 conv=notrunc of=/tmp/c
sync
sudo zpool scrub testpool
zpool status testpool # Will show 0 errors, which is wrong
sudo zpool offline testpool /tmp/a
sudo zpool scrub testpool
zpool status testpool # Will show errors on /tmp/c,
# which should've already been fixed
FreeBSD head is partially affected: the first scrub will detect some errors, but the second scrub will detect more.
Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>