cy [Fri, 30 Nov 2018 06:45:53 +0000 (06:45 +0000)]
This is a direct commit to the stable/11 branch. This would have been
MFC r340754 except that etc/rc.d has been moved in HEAD which would
have resulted in a tree conflict if merged.
Allow forced start of ipmon in special cases where testing is desired
(or other special cases) and when ipfilter is disabled in rc.conf but
started by other means.
dab [Fri, 30 Nov 2018 02:06:30 +0000 (02:06 +0000)]
MFC r337812,r337814,r337820,r341068:
Fix several memory leaks (r337812 & r337814).
The libkqueue tests have several places that leak memory by using an
idiom like:
puts(kevent_to_str(kevp));
Rework to save the pointer returned from kevent_to_str() and then
free() it after it has been used.
r337812 also fixed a bug in the netmap kevent code. The inclusion of
that fix was an oversight that I didn't notice until this
MFC. Reference the code review and PR here in the MFC for
completeness.
r337820 & r341068 were white-space only changes as a follow-up to
r337812 & r337814:
After r337820, which "corrected" some spaces-instead-of-tab whitespace
issues in the libkqueue tests, jmg@ pointed out that these files were
originally space-based, not tab-spaced, and so the correction should
have been to get rid of the tabs that had been introduced in previous
changes, not the spaces. This change does that. This is a whitespace
only change; no functional change is intended.
sef [Thu, 29 Nov 2018 01:05:21 +0000 (01:05 +0000)]
MFC r340442
mountd has no way to configure the listen queue depth; rather than add a new
option, we pass -1 down to listen, which causes it to use the
kern.ipc.soacceptqueue sysctl.
vangyzen [Wed, 28 Nov 2018 21:20:51 +0000 (21:20 +0000)]
MFC r340995
Prevent kernel stack disclosure in signal delivery
On arm64 and riscv platforms, sendsig() failed to zero the signal
frame before copying it out to userspace. Zero it.
On arm, I believe all the contents of the frame were initialized,
so there was no disclosure. However, explicitly zero the whole frame
because that fact could inadvertently change in the future,
it's more clear to the reader, and I could be wrong in the first place.
Security: similar to FreeBSD-EN-18:12.mem and CVE-2018-17155
Sponsored by: Dell EMC Isilon
vangyzen [Tue, 27 Nov 2018 19:40:18 +0000 (19:40 +0000)]
MFC r340257
in6_ifattach_linklocal: handle immediate removal of the new LLA
If another thread immediately removes the link-local address
added by in6_update_ifa(), in6ifa_ifpforlinklocal() can return NULL,
so the following assertion (or dereference) is wrong.
Remove the assertion, and handle NULL somewhat better than panicking.
This matches all of the other callers of in6_update_ifa().
PR: 219250
Reviewed by: bz, dab (both an earlier version)
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D17898
eugen [Mon, 26 Nov 2018 11:32:22 +0000 (11:32 +0000)]
MFC r339810: ipfw: implement ngtee/netgraph actions for layer-2 frames.
Kernel part of ipfw does not support and ignores rules other than
"pass", "deny" and dummynet-related for layer-2 (ethernet frames).
Others are processed as "pass".
Make it support ngtee/netgraph rules just like they are supported
for IP packets. For example, this allows us to mirror some frames
selectively to another interface for delivery to remote network analyzer
over RSPAN vlan. Assuming ng_ipfw(4) netgraph node has a hook named "900"
attached to "lower" hook of vlan900's ng_ether(4) node, that would be
as simple as:
ipfw add ngtee 900 ip from any to 8.8.8.8 layer2 out xmit igb0
eugen [Mon, 26 Nov 2018 11:22:04 +0000 (11:22 +0000)]
MFC r339816: mount_msdosfs
mount_msdosfs: do not fail mounts requiring locale name conversion table
that is already present in a kernel statically.
For example, the command "mount_msdosfs -L ru_RU.KOI8-R" fails with error
"mount_msdosfs: msdosfs_iconv: File exists" for a kernel having
options LIBICONV and MSDOSFS_ICONV. After this change, it mounts
successfully.
emaste [Sun, 25 Nov 2018 00:34:00 +0000 (00:34 +0000)]
MFC r340771: proto: change device permissions to 0600
C Turt reports that the driver is not thread safe and may have
exploitable races.
Note that the proto device is intended for prototyping and development,
and is not for use on production systems. From the man page:
SECURITY CONSIDERATIONS
Because programs have direct access to the hardware, the proto
driver is inherently insecure. It is not advisable to use this
driver on a production machine.
The proto device is not included in any of FreeBSD's kernel config files
(although the module is built).
The issues in the proto device still need to be fixed, and the device is
inherently (and intentionally) insecure, but it might as well be limited
to root only.
admbugs: 782
Reported by: C Turt <ecturt@gmail.com>
Sponsored by: The FreeBSD Foundation
emaste [Fri, 23 Nov 2018 20:41:54 +0000 (20:41 +0000)]
MFC r340663 (rmacklem):
Improve sanity checking for the dircount hint argument to
NFSv3's ReaddirPlus and NFSv4's Readdir operations. The code
checked for a zero argument, but did not check for a very large value.
This patch clips dircount at the server's maximum data size.
emaste [Fri, 23 Nov 2018 20:39:37 +0000 (20:39 +0000)]
MFC r340662 (rmacklem):
nfsm_advance() would panic() when the offs argument was negative.
The code assumed that this would indicate a corrupted mbuf chain, but
it could simply be caused by bogus RPC message data.
This patch replaces the panic() with a printf() plus error return.
emaste [Fri, 23 Nov 2018 20:38:50 +0000 (20:38 +0000)]
MFC r340661 (rmacklem):
r304026 added code that started statistics gathering for an operation
before the operation number (the variable called "op") was sanity checked.
This patch moves the code down to below the range sanity check for "op".
tijl [Thu, 22 Nov 2018 09:47:51 +0000 (09:47 +0000)]
MFC r340674:
Fix another user address dereference in linux_sendmsg syscall.
This was hidden behind the LINUX_CMSG_NXTHDR macro which dereferences its
second argument. Stop using the macro as well as LINUX_CMSG_FIRSTHDR. Use
the size field of the kernel copy of the control message header to obtain
the next control message.
tijl [Thu, 22 Nov 2018 09:41:54 +0000 (09:41 +0000)]
MFC r340631:
Do proper copyin of control message data in the Linux sendmsg syscall.
Instead of calling m_append with a user address, allocate an mbuf cluster
and copy data into it using copyin. For the SCM_CREDS case, instead of
zeroing a stack variable and appending that to the mbuf, zero part of the
mbuf cluster directly. One mbuf cluster is also the size limit used by
the FreeBSD sendmsg syscall (uipc_syscalls.c:sockargs()).
marius [Wed, 21 Nov 2018 18:53:30 +0000 (18:53 +0000)]
MFC: r340495
- Restore setting the clock for devices which support the default/legacy
transfer mode only (lost with r321385). [1]
- Similarly, don't try to set the power class on MMC devices that comply
to version 4.0 of the system specification but are operated in default/
legacy transfer or 1-bit bus mode as no power class is specified for
these cases. Trying to set a power class nevertheless resulted in an -
albeit harmless - error message.
eugen [Tue, 20 Nov 2018 10:44:49 +0000 (10:44 +0000)]
MFC r339558: New sysctl: net.inet.icmp.error_keeptags
Currently, icmp_error() function copies FIB number from original packet
into generated ICMP response but not mbuf_tags(9) chain.
This prevents us from easily matching ICMP responses corresponding
to tagged original packets by means of packet filter such as ipfw(8).
For example, ICMP "time-exceeded in-transit" packets usually generated
in response to traceroute probes lose tags attached to original packets.
This change adds new sysctl net.inet.icmp.error_keeptags
that defaults to 0 to avoid extra overhead when this feature not needed.
Set net.inet.icmp.error_keeptags=1 to make icmp_error() copy mbuf_tags
from original packet to generated ICMP response.
kevans [Mon, 19 Nov 2018 19:05:07 +0000 (19:05 +0000)]
MFC r340392: Add dynamic_kenv assertion to init_static_kenv
Both to formally document the requirement that this not be called after the
dynamic kenv is setup, and to perhaps help static analyzers figure out
what's going on. While calling init_static_kenv this late isn't fatal, there
are some caveats that the caller should be aware of:
- Late calls are effectively a no-op, as far as default FreeBSD is
concerned, as everything will switch to searching the dynamic kenv once it's
available.
- Each of the kern_getenv calls will leak memory, as it's assumed that
these are searching static environment and allocations will not be made.
As such, this usage is not sensible and should be detected.
eugen [Mon, 19 Nov 2018 06:37:38 +0000 (06:37 +0000)]
MFC r339465: rc.initdiskless: add support for auxiliary NVRAM.
Currently, rc.inidiskless assumes that local system configuration
changes are kept in some mountable file system. For example,
nanobsd uses dedicated partition mounted as /cfg for this.
However, small embedded devices like MIPS routers may have no enough flash
space to keep full-blown file system but have only one or couple
small flash blocks to keep persistent local configuration overrides.
This change extends rc.initdiskless and introduces ability to run auxiliary
command /conf/T/M/extract that is supposed to extract configuration overrides
from such local storage.
For example, the command /conf/default/etc/extract may contain something like:
cd "$1" && bsdcpio --quiet -idu < /dev/map/cfg
bsdcpio command extracts compressed archive from the storage to /etc
assuming the storage is exposed by the kernel as /dev/map/cfg to userland.
rmacklem [Sun, 18 Nov 2018 22:59:54 +0000 (22:59 +0000)]
MFC: r339999
Fix NFS client vnode locking to avoid a crash during forced dismount.
A crash was reported where the crash occurred in nfs_advlock() when the
NFS_ISV4(vp) macro was being executed. This was caused by the vnode
being VI_DOOMED due to a forced dismount in progress.
This patch fixes the problem by locking the vnode before executing the
NFS_ISV4() macro.
emaste [Sun, 18 Nov 2018 14:53:29 +0000 (14:53 +0000)]
MFC r340329: build(7): clarify buildenv target can be used for non-cross builds
make buildenv can be used for building for the same architecture as
the host (perhaps this is a degenerate case of cross-building).
TARGET and TARGET_ARCH do not need to be set in this case.
kp [Sun, 18 Nov 2018 12:09:27 +0000 (12:09 +0000)]
MFC r340067:
pfsync: Ensure uninit is done before pf
pfsync touches pf memory (for pf_state and the pfsync callback
pointers), not the other way around. We need to ensure that pfsync is
torn down before pf.
kp [Sun, 18 Nov 2018 12:04:25 +0000 (12:04 +0000)]
MFC r340066:
Notify that the ifnet will go away, even on vnet shutdown
pf subscribes to ifnet_departure_event events, so it can clean up the
ifg_pf_kif and if_pf_kif pointers in the ifnet.
During vnet shutdown interfaces could go away without sending the event,
so pf ends up cleaning these up as part of its shutdown sequence, which
happens after the ifnet has already been freed.
Send the ifnet_departure_event during vnet shutdown, allowing pf to
clean up correctly.
kp [Sun, 18 Nov 2018 10:57:39 +0000 (10:57 +0000)]
MFC r339676:
pf: Fix copy/paste error in IPv6 address rewriting
We checked the destination address, but replaced the source address. This was
fixed in OpenBSD as part of their NAT rework, which we don't want to import
right now.
kp [Sun, 18 Nov 2018 10:47:50 +0000 (10:47 +0000)]
MFC r339470:
pf synproxy will do the 3WHS on behalf of the target machine, and once
the 3WHS is completed, establish the backend connection. The trigger
for "3WHS completed" is the reception of the first ACK. However, we
should not proceed if that ACK also has RST or FIN set.
jhb [Sun, 18 Nov 2018 01:07:36 +0000 (01:07 +0000)]
MFC 339312,339364: Restore more descriptors during VM exits.
339312:
Fully restore the GDTR, IDTR, and LDTR after VT-x VM exits.
The VT-x VMCS only stores the base address of the GDTR and IDTR. As a
result, VM exits use a fixed limit of 0xffff for the host GDTR and
IDTR losing the smaller limits set in when the initial GDT is loaded
on each CPU during boot. Explicitly save and restore the full GDTR
and IDTR contents around VM entries and exits to restore the correct
limit.
Similarly, explicitly save and restore the LDT selector. VM exits
always clear the host LDTR as if the LDT was loaded with a NULL
selector and a userspace hypervisor is probably using a NULL selector
anyway, but save and restore the LDT explicitly just to be safe.
339364:
Reload the LDT selector after an AMD-v #VMEXIT.
cpu_switch() always reloads the LDT, so this can only affect the
hypervisor process itself. Fix this by explicitly reloading the host
LDT selector after each #VMEXIT. The stock bhyve process on FreeBSD
never uses a custom LDT, so this change is cosmetic.
ae [Sun, 18 Nov 2018 00:34:24 +0000 (00:34 +0000)]
MFC r339542:
Retire IPFIREWALL_NAT64_DIRECT_OUTPUT kernel option. And add ability
to switch the output method in run-time. Also document some sysctl
variables that can by changed for NAT64 module.
NAT64 had compile time option IPFIREWALL_NAT64_DIRECT_OUTPUT to use
if_output directly from nat64 module. By default is used netisr based
output method. Now both methods can be used, but they require different
handling by rules.
ae [Sun, 18 Nov 2018 00:31:09 +0000 (00:31 +0000)]
MFC r339533:
Add sadb_x_sa2 extension to SADB_ACQUIRE requests.
SADB_ACQUIRE requests are send by kernel, when security policy doesn't
have corresponding security association for outbound packet. IKE daemon
usually registers its handler for such messages and when the kernel asks
for SA it can handle this request. Now such requests will contain
additional fields that can help IKE daemon to create SA. And IKE now
can create SAs using only information from SADB_ACQUIRE request, this
is useful when many if_ipsec(4) interfaces are in use and IKE doesn track
security policies that was installed by kernel.
ae [Sun, 18 Nov 2018 00:28:56 +0000 (00:28 +0000)]
MFC r339539:
Add IPFW_RULE_JUSTOPTS flag, that is used by ipfw(8) to mark rule,
that was added using "new rule format". And then, when the kernel
returns rule with this flag, ipfw(8) can correctly show it.
Reported by: lev
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D17373
ae [Sun, 18 Nov 2018 00:26:09 +0000 (00:26 +0000)]
MFC r339535:
Do not allow use `create` keyword as hostname when ifconfig(8) is invoked
for already existing interface.
It appeared, that ifconfig(8) assumes `create` keyword as hostname and
tries to resolve it, when `ifconfig ifname create` invoked for already
existing interface. This can produce some unexpected results, when hostname
resolving has successfully happened. This patch adds check for such case.
When an interface is already exists, and create is only one argument,
return error message. But when there are some other arguments, just remove
create keyword from the arguments list.
jhb [Sun, 18 Nov 2018 00:11:19 +0000 (00:11 +0000)]
MFC 338511: bhyve: Use MAP_GUARD when mapping guest memory ranges.
Instead of relying on PROT_NONE mappings with MAP_ANON, use MAP_GUARD
to reserve address space around guest memory ranges including the
guard ranges of address space around mappings.
ygy [Thu, 15 Nov 2018 08:43:17 +0000 (08:43 +0000)]
MFC r338977:
Add description, parameters, options, sysctl and examples of using AQMs to ipfw man page. CoDel, PIE, FQ-CoDel and FQ-PIE AQM for Dummynet exist in FreeBSD 11 and 10.3.
scottl [Tue, 13 Nov 2018 18:49:43 +0000 (18:49 +0000)]
Fix a regression from prior to 11.2 that caused MSI (not MSI-X) interrupt
allocation to fail. While here, refactor the code so that it's more clear
and less likely to break in the future. This is not an MFC due to the code
in 12/head being very different, but it follows the latter's structure
more closely than before.
hselasky [Sat, 10 Nov 2018 10:31:35 +0000 (10:31 +0000)]
MFC r340212:
Sometimes the complete split packet may be queued too early and the
transaction translator will return a NAK. Ignore this message and
retry the complete split instead.
emaste [Fri, 9 Nov 2018 21:45:42 +0000 (21:45 +0000)]
Fix objcopy for little-endian MIPS64 objects.
MFC r338478 (jhb): Fix objcopy for little-endian MIPS64 objects.
MIPS64 does not store the 'r_info' field of a relocation table entry as
a 64-bit value consisting of a 32-bit symbol index in the high 32 bits
and a 32-bit type in the low 32 bits as on other architectures. Instead,
the 64-bit 'r_info' field is really a 32-bit symbol index followed by four
individual byte type fields. For big-endian MIPS64, treating this as a
64-bit integer happens to be compatible with the layout expected by other
architectures (symbol index in upper 32-bits of resulting "native" 64-bit
integer). However, for little-endian MIPS64 the parsed 64-bit integer
contains the symbol index in the low 32 bits and the 4 individual byte
type fields in the upper 32-bits (but as if the upper 32-bits were
byte-swapped).
To cope, add two helper routines in gelf_getrel.c to translate between the
correct native 'r_info' value and the value obtained after the normal
byte-swap translation. Use these routines in gelf_getrel(), gelf_getrela(),
gelf_update_rel(), and gelf_update_rela(). This fixes 'readelf -r' on
little-endian MIPS64 objects which was previously decoding incorrect
relocations as well as 'objcopy: invalid symbox index' warnings from
objcopy when extracting debug symbols from kernel modules.
Even with this fixed, objcopy was still crashing when trying to extract
debug symbols from little-endian MIPS64 modules. The workaround in
gelf_*rel*() depends on the current ELF object having a valid ELF header
so that the 'e_machine' field can be compared against EM_MIPS. objcopy
was parsing the relocation entries to possibly rewrite the 'r_info' fields
in the update_relocs() function before writing the initial ELF header to
the destination object file. Move the initial write of the ELF header
earlier before copy_contents() so that update_relocs() uses the correct
symbol index values.
Note that this change should really go upstream. The binutils readelf
source has a similar hack for MIPS64EL though I implemented this version
from scratch using the MIPS64 ABI PDF as a reference.
MFC r339083 (emaste): libelf: correct mips64el test to use ELF header
libelf maintains two views of endianness: e_byteorder, and
e_ident[EI_DATA] in the ELF header itself. e_byteorder is not always
kept in sync, so use the ELF header endianness to test for mips64el.
MFC r339473 (emaste): libelf: also test for 64-bit ELF in _libelf_is_mips64el
Although _libelf_is_mips64el is only called in contexts where we've
already checked that e_class is ELFCLASS64 but this may change in the
future. Add a safety belt so that we don't access an invalid e_ehdr64
union member if it does.
emaste [Fri, 9 Nov 2018 21:38:53 +0000 (21:38 +0000)]
MFC r331078 (cem): nm: Initialize allocated memory before use
In out of memory scenarios (where one of these allocations failed but
other(s) did not), nm(1) could reference the uninitialized value of these
allocations (undefined behavior).
Always initialize any successful allocations as the most expedient
resolution of the issue. However, I would encourage upstream elftoolchain
contributors to clean up the error path to just abort immediately, rather
than proceeding sloppily when one allocation fails.
if present to enable some devices like WaveShare touchscreens. Unlike
Windows we discard content of the blob. We try mimic Windows driver
behaviour from the USB device point of view.
Submitted by: glebius (initial version)
MFC r337289:
wmt(4): Use internal function to calculate input report size
Usbhid's hid_report_size() calculates integral size of all reports of given
kind found in the HID descriptor rather then exact size of report with given
ID as its userland counterpart does. As all input data processed by the
driver is located within the same report, calculate required driver's buffer
size with userland version, imported in one of the previous commits.
This allows us to skip zeroing of buffer on processing of each report.
While here do some minor refactoring.
MFC r338458:
wmt(4): Fix regression introduced in r337289
r337289 has a side effect of reducing usb frame 0 buffer size down to
touch report size. That broke some devices e.g. "Raydium Touch System"
which are capable of generating non-touch frames of bigger length.
Fix it with enlarging frame 0 buffer up to internal wmt(4) buffer size.