gjb [Thu, 16 Aug 2018 15:17:22 +0000 (15:17 +0000)]
MFC r337717, r337718:
r337717:
Add lang/python2, lang/python3, and lang/python to GCE images
to help avoid hard-coding 'python<MAJOR>.<MINOR>' in several
client-side scripts. [1]
r337718:
Add a space between a variable and escaped new line.
PR: 230248 [1]
Sponsored by: The FreeBSD Foundation
loos [Wed, 15 Aug 2018 16:12:13 +0000 (16:12 +0000)]
MFC r312953:
The stf(4) interface name does not conform with the default naming
convention for interfaces, because only one stf(4) interface can exist
in the system.
This disallows the use of unit numbers other than 0; however, it is
possible to create the clone without specifying the unit number (wildcard).
In the wildcard case we must update the interface name before returning.
This fixes an infinite recursion in pf code that keeps track of network
interfaces and groups:
1 - a group for the cloned type of the interface is added (stf in this
case);
2 - the system will now try to add an interface named stf (instead of
stf0) to stf group;
3 - when pfi_kif_attach() tries to search for an already existing 'stf'
interface, the 'stf' group is returned and thus the group is added
as an interface of itself;
This causes a crash at the first attempt to traverse the groups
to which the stf interface belongs (the traversal loops over itself).
kevans [Wed, 15 Aug 2018 01:24:43 +0000 (01:24 +0000)]
MFC r337504: apply(1): Fix magic number substitution with a magic space
Using a space as the magic character would result in problems if the command
started with a number:
- For a 'valid' number n, n < size of argv, it would erroneously get
replaced with that argument; e.g. `apply -a ' ' -d 1rm x` => `execxrm x`
- For an 'invalid' number n, n >= size of argv, it would segfault.
e.g. `apply -a ' ' 2to3 test.py` would try to access argv[2]
This problem occurred because apply(1) would prepend "exec " to the command
string before doing the actual magic number replacements, so it would come
across "exec 2to3 1" and assume that the " 2" is also a magic number to be
replaced.
Re-work this to instead just append "exec " to the command sbuf and
workaround the ugliness. This also simplifies stuff in the process.
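The ordering problem described above can be illustrated with a small Python model. This is only a sketch, not the apply(1) source: the names `substitute`, `buggy`, and `fixed` are hypothetical, and the model leaves out apply(1)'s handling of commands that contain no magic numbers at all.

```python
DIGITS = set("123456789")

def substitute(command, magic, args):
    """Replace every <magic><digit> pair with the argument that digit selects."""
    out = []
    i = 0
    while i < len(command):
        nxt = command[i + 1:i + 2]  # next character, or "" at end of string
        if command[i] == magic and nxt in DIGITS:
            # An out-of-range digit raises IndexError here; the C program
            # would instead read past the end of argv.
            out.append(args[int(nxt) - 1])
            i += 2
        else:
            out.append(command[i])
            i += 1
    return "".join(out)

def buggy(command, magic, args):
    # Old order: prepend "exec " first, then scan the whole string.
    return substitute("exec " + command, magic, args)

def fixed(command, magic, args):
    # New order: scan only the user's command, then add "exec ".
    return "exec " + substitute(command, magic, args)

# With a space as the magic character, the space after "exec" pairs with
# the leading digit of the command in the old ordering only.
print(buggy("1rm", " ", ["x"]))
print(fixed("1rm", " ", ["x"]))
```

In the model, an out-of-range digit such as the one in `2to3` surfaces as an IndexError in `buggy`, standing in for the out-of-bounds argv access.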
This commit is simply a props change, as r324696 already plugged
this vulnerability. To maintain consistency with the vendor branch,
props will be changed.
Another props change. The real work was done by r324696. We're simply
syncing up with the vendor branch again.
Import upline security patch: WNM: Ignore WNM-Sleep Mode Request in
wnm_sleep_mode=0 case. This is also upline git commit 114f2830d2c2aee6db23d48240e93415a256a37c.
kevans [Tue, 14 Aug 2018 19:44:36 +0000 (19:44 +0000)]
MFC r337520: Fix WITHOUT_LOADER_GELI (gptboot) and isoboot in general
gptboot was broken when r316078 added the LOADER_GELI_SUPPORT #ifdef to
not pass geliargs via __exec. KARGS_FLAGS_EXTARG must not be used if we're
not going to pass an additional argument to __exec.
kevans [Tue, 14 Aug 2018 19:42:18 +0000 (19:42 +0000)]
ubldr: Bump heap size, 1MB -> 2MB
1MB was leaving very little margin in some of the worst-case scenarios with
lualoader. 2MB is still low enough that we shouldn't have any problems with
UBoot-supported boards.
jtl [Tue, 14 Aug 2018 18:17:05 +0000 (18:17 +0000)]
MFC r337788:
Update the inet(4) and inet6(4) man pages to reflect the changes made
to the reassembly code in r337778, r337780, r337781, r337782, and
r337783.
Approved by: so
Security: FreeBSD-SA-18:10.ip
Security: CVE-2018-6923
jtl [Tue, 14 Aug 2018 18:15:10 +0000 (18:15 +0000)]
MFC r337787:
Lower the default limits on the IPv6 reassembly queue.
Currently, the limits are quite high. On machines with millions of
mbuf clusters, the reassembly queue limits can also run into
the millions. Lower these values.
Also, try to ensure that no bucket will have a reassembly
queue larger than approximately 100 items. This limits the cost to
find the correct reassembly queue when processing an incoming
fragment.
Due to the low limits on each bucket's length, increase the size of
the hash table from 64 to 1024.
Approved by: so
Security: FreeBSD-SA-18:10.ip
Security: CVE-2018-6923
jtl [Tue, 14 Aug 2018 18:13:36 +0000 (18:13 +0000)]
MFC r337786:
Lower the default limits on the IPv4 reassembly queue.
In particular, try to ensure that no bucket will have a reassembly
queue larger than approximately 100 items. This limits the cost to
find the correct reassembly queue when processing an incoming
fragment.
Due to the low limits on each bucket's length, increase the size of
the hash table from 64 to 1024.
Approved by: so
Security: FreeBSD-SA-18:10.ip
Security: CVE-2018-6923
jtl [Tue, 14 Aug 2018 18:12:02 +0000 (18:12 +0000)]
MFC r337784:
Drop 0-byte IPv6 fragments.
Currently, we process IPv6 fragments with 0 bytes of payload, add them
to the reassembly queue, and do not recognize them as duplicating or
overlapping with adjacent 0-byte fragments. An attacker can exploit this
to create long fragment queues.
There is no legitimate reason for a fragment with no payload. However,
because IPv6 packets with an empty payload are acceptable, allow an
"atomic" fragment with no payload.
Approved by: so
Security: FreeBSD-SA-18:10.ip
Security: CVE-2018-6923
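Why 0-byte fragments escape overlap detection (r337784 above) can be seen with a generic interval test. The sketch below is a Python model, not the kernel's reassembly code: `overlaps` and `accept_fragment` are hypothetical names, and the overlap check is the textbook half-open interval comparison rather than the exact test the kernel performs.

```python
def overlaps(a_off, a_len, b_off, b_len):
    """Half-open interval test: do [a_off, a_off + a_len) and
    [b_off, b_off + b_len) share at least one byte?"""
    return a_off < b_off + b_len and b_off < a_off + a_len

def accept_fragment(offset, more_fragments, payload_len):
    """Model of the new rule: drop empty fragments unless the fragment is
    'atomic' (offset 0 and no more fragments to follow)."""
    if payload_len > 0:
        return True
    return offset == 0 and not more_fragments

# Two ordinary fragments covering some of the same bytes are detected.
print(overlaps(0, 16, 8, 16))
# Identical 0-byte fragments cover no bytes, so they never overlap and
# each one could be queued separately.
print(overlaps(8, 0, 8, 0))
# The atomic empty fragment is still accepted; any other empty one is not.
print(accept_fragment(0, False, 0), accept_fragment(8, True, 0))
```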
jtl [Tue, 14 Aug 2018 18:10:25 +0000 (18:10 +0000)]
MFC r337783:
Implement a limit on the number of IPv6 reassembly queues per bucket.
There is a hashing algorithm which should distribute IPv6 reassembly
queues across the available buckets in a relatively even way. However,
if there is a flaw in the hashing algorithm which allows a large number
of IPv6 fragment reassembly queues to end up in a single bucket, a per-
bucket limit could help mitigate the performance impact of this flaw.
Implement such a limit, with a default of twice the maximum number of
reassembly queues divided by the number of buckets. Recalculate the
limit any time the maximum number of reassembly queues changes.
However, allow the user to override the value using a sysctl
(net.inet6.ip6.maxfragbucketsize).
Approved by: so
Security: FreeBSD-SA-18:10.ip
Security: CVE-2018-6923
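The default per-bucket limit described in r337783 above is simple arithmetic. The Python sketch below assumes integer division and uses made-up queue maxima; `default_bucket_limit` is a hypothetical name, and 64 is the hash table size this log mentions before it is raised to 1024.

```python
def default_bucket_limit(max_queues, n_buckets):
    """Twice the even share of queues per bucket (integer division assumed)."""
    return 2 * max_queues // n_buckets

# Illustrative maxima only; the real value depends on system configuration.
for max_queues in (4096, 8192):
    # The limit is recomputed whenever the maximum changes.
    print(max_queues, default_bucket_limit(max_queues, 64))
```

A factor of two leaves headroom for an uneven but benign distribution while still bounding the damage if many queues land in one bucket.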
jtl [Tue, 14 Aug 2018 18:06:59 +0000 (18:06 +0000)]
MFC r337782:
Add a limit on the number of fragments per IPv6 packet.
The IPv4 fragment reassembly code supports a limit on the number of
fragments per packet. The default limit is currently 17 fragments.
Among other things, this limit serves to limit the number of fragments
the code must parse when trying to reassemble a packet.
Add a limit to the IPv6 reassembly code. By default, limit a packet
to 65 fragments (64 on the queue, plus one final fragment to complete
the packet). This allows an average fragment size of 1,008 bytes, which
should be sufficient to hold a fragment. (Recall that the IPv6 minimum
MTU is 1280 bytes. Therefore, this configuration allows a full-size
IPv6 packet to be fragmented on a link with the minimum MTU and still
carry approximately 272 bytes of headers before the fragmented portion
of the packet.)
Users can adjust this limit using the net.inet6.ip6.maxfragsperpacket
sysctl.
Approved by: so
Security: FreeBSD-SA-18:10.ip
Security: CVE-2018-6923
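The figures quoted in r337782 above can be checked with a few lines of Python. The only value not taken from the message is 65535, the largest payload a non-jumbo IPv6 packet can describe with its 16-bit payload length field.

```python
IPV6_MAX_PAYLOAD = 65535   # 16-bit payload length field
IPV6_MIN_MTU = 1280        # minimum link MTU, as stated in the message
MAX_FRAGS_PER_PACKET = 65  # default limit from the message

# Average fragment size needed to carry a full-size packet in 65 fragments.
avg_fragment = IPV6_MAX_PAYLOAD // MAX_FRAGS_PER_PACKET
# Room left for headers on a minimum-MTU link.
header_room = IPV6_MIN_MTU - avg_fragment

print(avg_fragment)  # 1008
print(header_room)   # 272
```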
jtl [Tue, 14 Aug 2018 17:59:42 +0000 (17:59 +0000)]
MFC r337781:
Make the IPv6 fragment limits be global, rather than per-VNET, limits.
The IPv6 reassembly fragment limit is based on the number of mbuf clusters,
which are a global resource. However, the limit is currently applied
on a per-VNET basis. Given enough VNETs (or given sufficient customization
on enough VNETs), it is possible that the sum of all the VNET fragment
limits will exceed the number of mbuf clusters available in the system.
Given the fact that the fragment limits are intended (at least in part) to
regulate access to a global resource, the IPv6 fragment limit should
be applied on a global basis.
Note that it is still possible to disable fragmentation for a particular
VNET by setting the net.inet6.ip6.maxfragpackets sysctl to 0 for that
VNET. In addition, it is now possible to disable fragmentation globally
by setting the net.inet6.ip6.maxfrags sysctl to 0.
Approved by: so
Security: FreeBSD-SA-18:10.ip
Security: CVE-2018-6923
jtl [Tue, 14 Aug 2018 17:54:39 +0000 (17:54 +0000)]
MFC r337780:
Implement a limit on the number of IPv4 reassembly queues per bucket.
There is a hashing algorithm which should distribute IPv4 reassembly
queues across the available buckets in a relatively even way. However,
if there is a flaw in the hashing algorithm which allows a large number
of IPv4 fragment reassembly queues to end up in a single bucket, a per-
bucket limit could help mitigate the performance impact of this flaw.
Implement such a limit, with a default of twice the maximum number of
reassembly queues divided by the number of buckets. Recalculate the
limit any time the maximum number of reassembly queues changes.
However, allow the user to override the value using a sysctl
(net.inet.ip.maxfragbucketsize).
Approved by: so
Security: FreeBSD-SA-18:10.ip
Security: CVE-2018-6923
jtl [Tue, 14 Aug 2018 17:52:06 +0000 (17:52 +0000)]
MFC r337778:
Add a global limit on the number of IPv4 fragments.
The IP reassembly fragment limit is based on the number of mbuf clusters,
which are a global resource. However, the limit is currently applied
on a per-VNET basis. Given enough VNETs (or given sufficient customization
of enough VNETs), it is possible that the sum of all the VNET limits
will exceed the number of mbuf clusters available in the system.
Given the fact that the fragment limit is intended (at least in part) to
regulate access to a global resource, the fragment limit should
be applied on a global basis.
VNET-specific limits can be adjusted by modifying the
net.inet.ip.maxfragpackets and net.inet.ip.maxfragsperpacket
sysctls.
To disable fragment reassembly globally, set net.inet.ip.maxfrags to 0.
To disable fragment reassembly for a particular VNET, set
net.inet.ip.maxfragpackets to 0.
Approved by: so
Security: FreeBSD-SA-18:10.ip
Security: CVE-2018-6923
jtl [Tue, 14 Aug 2018 17:46:54 +0000 (17:46 +0000)]
MFC r337776:
Improve IPv6 reassembly performance by hashing fragments into buckets.
Currently, all IPv6 fragment reassembly queues are kept in a flat
linked list. This has a number of implications. Two significant
implications are: all reassembly operations share a common lock,
and it is possible for the linked list to grow quite large.
Improve IPv6 reassembly performance by hashing fragments into buckets,
each of which has its own lock. Calculate the hash key using a Jenkins
hash with a random seed.
Approved by: so
Security: FreeBSD-SA-18:10.ip
Security: CVE-2018-6923
jtl [Tue, 14 Aug 2018 17:43:11 +0000 (17:43 +0000)]
MFC r337775:
Improve hashing of IPv4 fragments.
Currently, IPv4 fragments are hashed into buckets based on a 32-bit
key which is calculated by (src_ip ^ ip_id) and combined with a random
seed. However, because an attacker can control the values of src_ip
and ip_id, it is possible to construct an attack which causes very
deep chains to form in a given bucket.
To ensure more uniform distribution (and lower predictability for
an attacker), calculate the hash based on a key which includes all
the fields we use to identify a reassembly queue (dst_ip, src_ip,
ip_id, and the ip protocol) as well as a random seed.
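The weakness of the XOR-based key can be demonstrated with a short Python model. This is a sketch under stated assumptions: keyed BLAKE2b stands in for the kernel's seeded hash, the 64-bucket table size is taken from elsewhere in this log, and `keyed_hash`, `old_bucket`, and `new_bucket` are hypothetical names.

```python
import hashlib
import os
import struct

NBUCKETS = 64  # table size assumed for illustration

def keyed_hash(data, seed):
    # Stand-in for a seeded hash; not the function the kernel uses.
    digest = hashlib.blake2b(data, key=seed, digest_size=4).digest()
    return int.from_bytes(digest, "big")

def old_bucket(src_ip, ip_id, seed):
    # Old scheme: the key is src_ip XOR ip_id, so distinct inputs with the
    # same XOR collide no matter what the seed is.
    key = (src_ip ^ ip_id) & 0xFFFFFFFF
    return keyed_hash(struct.pack("!I", key), seed) % NBUCKETS

def new_bucket(dst_ip, src_ip, ip_id, proto, seed):
    # New scheme: every identifying field feeds the hash separately.
    data = struct.pack("!IIHB", dst_ip, src_ip, ip_id, proto)
    return keyed_hash(data, seed) % NBUCKETS

seed = os.urandom(16)
dst = 0xC0000201          # 192.0.2.1, a documentation address
target = 0x12345678       # XOR value the attacker aims for
# The attacker picks (src_ip, ip_id) pairs whose XOR is always `target`.
pairs = [(target ^ ip_id, ip_id) for ip_id in range(200)]

old_used = {old_bucket(s, i, seed) for s, i in pairs}
new_used = {new_bucket(dst, s, i, 17, seed) for s, i in pairs}
print(len(old_used), len(new_used))
```

All crafted pairs fall into a single bucket under the old key, while the new key spreads them across the table.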
brooks [Tue, 14 Aug 2018 16:22:04 +0000 (16:22 +0000)]
MFC r337508:
Terminate filter_create_ext() args with NULL, not 0.
filter_create_ext() is documented to take a NULL terminated set of
arguments. 0 is promoted to an int so this would fail on 64-bit
systems if the value was not passed in a register. On all currently
supported 64-bit architectures it is.
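The size mismatch behind this fix can be observed from Python with ctypes. The snippet only reports the widths of the C `int` and pointer types on the machine it runs on; it does not call filter_create_ext() or reproduce the failure.

```python
import ctypes

int_size = ctypes.sizeof(ctypes.c_int)
ptr_size = ctypes.sizeof(ctypes.c_void_p)

# When int is narrower than a pointer, a variadic callee that reads a
# pointer-sized terminator from memory also reads bytes that the caller
# never set when it passed a plain 0.
print(int_size, ptr_size, ptr_size - int_size)
```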
kib [Sun, 12 Aug 2018 08:45:23 +0000 (08:45 +0000)]
MFC r336569:
Move mostly useless example binaries from OFED, as well as the Subnet
Manager, under the new option WITH_OFED_EXTRA, disabled by default.
truckman [Sun, 12 Aug 2018 03:22:28 +0000 (03:22 +0000)]
MFC r336855
Fix the long term ULE load balancer so that it actually works. The
initial call to sched_balance() during startup is meant to initialize
balance_ticks, but does not actually do that since smp_started is
still zero at that time. Since balance_ticks does not get set,
there are no further calls to sched_balance(). Fix this by setting
balance_ticks in sched_initticks() since we know the value of
balance_interval at that time, and eliminate the useless startup
call to sched_balance(). We don't need to randomize the initial
value of balance_ticks.
Since there is now only one call to sched_balance(), we can hoist
the tests at the top of this function out to the caller and avoid
the overhead of the function call when running an SMP kernel on UP
hardware.
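The failure mode described above, a countdown that is never armed, can be reproduced in a toy Python model. The names mirror those in the message, but the logic is a deliberate simplification: the interval is fixed rather than randomized, and the shape of the tick handler is an assumption, not a copy of the scheduler code.

```python
class Balancer:
    """Toy model of the periodic balancer countdown."""

    def __init__(self, fixed, balance_interval=128):
        self.smp_started = False  # still false during early startup
        self.balance_interval = balance_interval
        self.balance_ticks = 0
        self.runs = 0
        if fixed:
            # After the fix: arm the countdown directly, as the message
            # describes for sched_initticks().
            self.balance_ticks = balance_interval
        else:
            # Before the fix: rely on a startup call to arm the countdown.
            self.sched_balance()

    def sched_balance(self):
        if not self.smp_started:
            return  # early exit leaves balance_ticks at 0
        self.balance_ticks = self.balance_interval
        self.runs += 1

    def tick(self):
        # Assumed clock hook: count down only when armed and SMP is up.
        if self.smp_started and self.balance_ticks:
            self.balance_ticks -= 1
            if self.balance_ticks == 0:
                self.sched_balance()

def simulate(fixed, ticks=1000):
    b = Balancer(fixed)
    b.smp_started = True  # the other CPUs come up after initialization
    for _ in range(ticks):
        b.tick()
    return b.runs

print(simulate(fixed=False), simulate(fixed=True))
```

In the unfixed model the balancer never runs, because nothing ever gives `balance_ticks` a non-zero value.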
kevans [Sun, 12 Aug 2018 00:33:24 +0000 (00:33 +0000)]
MFC r337331: efirt: Don't enter EFI context early, convert addrs to KVA
efi_enter here was needed because dereferencing efi_runtime causes a fault
outside of EFI context, since the runtime table lives in runtime service
space. This may cause problems early in boot, though, so instead access it
by converting paddr to KVA.
While here, remove the other direct PHYS_TO_DMAP calls and the explicit DMAP
requirement from efidev.
kevans [Fri, 10 Aug 2018 01:43:05 +0000 (01:43 +0000)]
MFC r337549: libnv: Remove -I${SRCTOP}/sys
This should have been done as part of r336019 -- including ${SRCTOP}/sys is
not a good business model for something that's built in legacy/bootstrap
stages.
Beyond that, libnv seems to build quite alright as legacy, part of
buildworld, and standalone without. Axe it.
davidcs [Thu, 9 Aug 2018 01:17:35 +0000 (01:17 +0000)]
MFC r336695
Remove support for QLNX_RCV_IN_TASKQ - i.e., Rx only in TaskQ.
Added support for LLDP passthru
Upgrade ECORE to version 8.33.5.0
Upgrade STORMFW to version 8.33.7.0
Added support for SRIOV
davidcs [Thu, 9 Aug 2018 00:39:39 +0000 (00:39 +0000)]
MFC r336438
Fixes for the following issues:
1. Fix taskqueues drain/free to fix panic seen when interface is being
brought down while asynchronous link events are happening in parallel.
2. Fix bxe_ifmedia_status()
Submitted by: Vaishali.Kulkarni@cavium.com and Anand.Khoje@cavium.com
r320280:
packages: Allow stageworld/stagekernel to run with make jobs.
r320281:
packages: Allow staging world/kernel in parallel.
r320282:
packages: Allow creating kernel/world packages in parallel.
r320283:
packages: Allow actually building individual world packages in parallel.
r320284:
packages: Parallelize individual kernel packaging.
r320285:
Expose only the create-packages-* targets since they set needed
DEST/DIRDIR.
r320692:
Fix create-kernel-packages with multiple BUILDKERNELS after r320284
r322362:
Indent nested conditionals for readability.
r322401:
Avoid creating kernel-dbg.txz distribution sets and kernel-debug packages
when MK_DEBUG_FILES is 'no'.
r322402:
Fix indentation from r322401.
r336181:
Fix parsing of create-kernel-packages
MFC r331098 (by melifaro):
Fix outgoing TCP/UDP packet drop on arp/ndp entry expiration.
Current arp/nd code relies on the feedback from the datapath indicating
that the entry is still used. This mechanism is incorporated into the
arpresolve()/nd6_resolve() routines. After the inpcb route cache
introduction, the packet path for the locally-originated packets changed,
passing cached lle pointer to the ether_output() directly. This resulted
in the arp/ndp entry expiring each time exactly after the configured max_age
interval. During the small window between the ARP/NDP request and reply
from the router, most of the packets got lost.
Fix this behaviour by plugging datapath notification code to the packet
path used by route cache. Unify the notification code by using a single
inlined function with the per-AF callbacks.
MFC r336132:
Add "record-state", "set-limit" and "defer-action" rule options to ipfw.
"record-state" is similar to "keep-state", but it doesn't produce an implicit
O_PROBE_STATE opcode in a rule. "set-limit" is like "limit", but it has the
same feature as "record-state": it is a single opcode without an implicit
O_PROBE_STATE opcode. "defer-action" is intended to be used with dynamic
states. When a rule with this opcode is matched, the rule's action is
not executed; instead, a dynamic state is created. When this
state is later matched by "check-state", the rule's action is executed.
This allows the creation of more complicated rulesets.
dab [Tue, 7 Aug 2018 14:39:00 +0000 (14:39 +0000)]
MFC r336761 & r336781:
Allow an EVFILT_TIMER kevent to be updated.
If a timer is updated (re-added) with a different time period
(specified in the .data field of the kevent), the new time period has
no effect; the timer will not expire until the original time has
elapsed. This violates the documented behavior as the kqueue(2) man
page says (in part) "Re-adding an existing event will modify the
parameters of the original event, and not result in a duplicate
entry."
This modification, adapted from a patch submitted by cem@ to PR214987,
fixes the kqueue system to allow updating a timer entry. The kevent
timer behavior is changed to:
* When a timer is re-added, update the timer parameters and
re-start the timer using the new parameters.
* Allow updating both active and already expired timers.
* When the timer has already expired, dequeue any undelivered events
and clear the count of expirations.
All of these changes address the original PR and also bring the
FreeBSD and macOS kevent timer behaviors into agreement.
A few other changes were made along the way:
* Update the kqueue(2) man page to reflect the new timer behavior.
* Fix man page style issues in kqueue(2) diagnosed by igor.
* Update the timer libkqueue system test to test for the updated
timer behavior.
* Fix the (test) libkqueue common.h file so that it includes
config.h which defines various HAVE_* feature defines, before the
#if tests for such variables in common.h. This enables the use of
the actual err(3) family of functions.
* Fix the usages of the err(3) functions in the tests for incorrect
type of variables. Those were formerly undiagnosed due to the
disablement of the err(3) functions (see previous bullet point).
jtl [Mon, 6 Aug 2018 17:41:53 +0000 (17:41 +0000)]
MFC r337384:
Address concerns about CPU usage while doing TCP reassembly.
Currently, the per-queue limit is a function of the receive buffer
size and the MSS. In certain cases (such as connections with large
receive buffers), the per-queue segment limit can be quite large.
Because we process segments as a linked list, large queues may not
perform acceptably.
The better long-term solution is to make the queue more efficient.
But, in the short-term, we can provide a way for a system
administrator to set the maximum queue size.
We set the default queue limit to 100. This is an effort to balance
performance with a sane resource limit. Depending on their
environment, goals, etc., an administrator may choose to modify this
limit in either direction.
Reviewed by: jhb
Approved by: so
Security: FreeBSD-SA-18:08.tcp
Security: CVE-2018-6922
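The effect of the new cap in r337384 above can be sketched in Python. The message only says the old limit is "a function of the receive buffer size and the MSS"; the formula below (buffer size divided by MSS, plus one) is an assumption made for illustration, and `segment_limit` is a hypothetical name.

```python
def segment_limit(rcvbuf_bytes, mss, max_queue_len=100):
    """Buffer-derived per-queue limit, clamped by an administrator-set cap."""
    derived = rcvbuf_bytes // mss + 1  # assumed form of the old limit
    return min(derived, max_queue_len)

big_buffer = 2 * 1024 * 1024  # a connection with a large receive buffer

# With an effectively unlimited cap the queue may hold well over a
# thousand segments; with the default cap it stops at 100.
print(segment_limit(big_buffer, 1448, max_queue_len=10**9))
print(segment_limit(big_buffer, 1448))
# Connections with small buffers are unaffected by the cap.
print(segment_limit(65536, 1448))
```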
kevans [Mon, 6 Aug 2018 03:58:56 +0000 (03:58 +0000)]
MFC r336919, r336924
r336919:
efirt: Add tunable to allow disabling EFI Runtime Services
Leading up to enabling EFIRT in GENERIC, allow runtime services to be
disabled with a new tunable: efi.rt_disabled. This makes it so that EFIRT
can be disabled easily in case we run into some buggy UEFI implementation
and fail to boot.
r336924:
Follow up to r336919 and r336921: s/efi.rt_disabled/efi.rt.disabled/
The latter matches the rest of the tree better [0]. The UPDATING entry has
been updated to reflect this, and the new tunable is now documented in
loader(8) [1].