Stefan Eßer [Sat, 9 Feb 2019 14:21:29 +0000 (14:21 +0000)]
MFC r343479: Fix potential buffer overflow and undefined behavior.
The buffer allocated in read_chat() could be 1 element too short, if the
chatstr parameter passed in is 1 or 3 characters long (e.g. "a" or "a b").
The allocation of the pointer array does not account for the terminating
NULL pointer in that case.
Overlapping source and destination strings are undefined in strcpy().
Instead of moving a string to the left by one character just increment the
char pointer before it is assigned to the results array.
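A minimal sketch of the two fixes described above, not the actual read_chat()
code. split_words() is a hypothetical helper: it sizes the pointer array for
the worst case plus the terminating NULL, and it skips a leading character by
advancing the pointer instead of calling strcpy() on overlapping strings.
Splitting on spaces and treating '"' as the skipped character are assumptions
made for illustration only.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not the real read_chat()): split "buf" in place
 * on spaces and return a NULL-terminated array of pointers into it.
 * The caller frees the returned array.
 */
char **
split_words(char *buf)
{
	size_t len = strlen(buf);
	size_t n = 0;
	char **res;
	char *p;

	/* At most (len + 1) / 2 words, plus one slot for the final NULL. */
	res = malloc(((len + 1) / 2 + 1) * sizeof(*res));
	if (res == NULL)
		return (NULL);
	for (p = strtok(buf, " "); p != NULL; p = strtok(NULL, " ")) {
		/*
		 * Drop an assumed leading '"' by advancing the pointer.
		 * strcpy(p, p + 1) would copy between overlapping
		 * strings, which is undefined behavior.
		 */
		if (*p == '"')
			p++;
		res[n++] = p;
	}
	res[n] = NULL;
	return (res);
}
```

With a 3-character input such as "a b" the array needs three slots: two words
and the NULL terminator.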
Stefan Eßer [Sat, 9 Feb 2019 14:07:04 +0000 (14:07 +0000)]
MFC r343303: Silence a CI warning regarding the use of strcpy().
While this is a false positive (a sufficiently large buffer has been
allocated in the line above), the use of strdup() simplifies and clarifies
the code.
This fixes 'Assertion failed: ((VT.getVectorNumElements() +
N2C->getZExtValue() <= N1.getValueType().getVectorNumElements()) &&
"Extract subvector overflow!"), function getNode' when building the
multimedia/aom port (with AVX2 enabled).
Marius Strobl [Sat, 9 Feb 2019 11:51:59 +0000 (11:51 +0000)]
MFC: r343753
o As illustrated by e. g. figure 7-14 of the Intel 82599 10 GbE
controller datasheet revision 3.3, in the context of Ethernet
MACs the control data describing the packet buffers typically
are named "descriptors". Each of these descriptors references
one buffer, multiple of which a packet can be composed of.
By contrast, in comments, messages and the names of structure
members, iflib(4) refers to DMA resources employed for RX and
TX buffers (rather than control data) as "desc(riptors)".
This odd naming convention of iflib(4) made reviewing r343085
and identifying wrong and missing bus_dmamap_sync(9) calls in
particular way harder than it already is. This convention may
also explain why the netmap(4) part of iflib(4) pairs the DMA
tags for control data with DMA maps of buffers and vice versa
in calls to bus_dma(9) functions.
Therefore, change iflib(4) to refer to buf(fers) when buffers
and not the usual understanding of descriptors is meant. This
change does not include corrections to the DMA resources used
in the netmap(4) parts. However, it revises error messages to
state which kind of allocation/creation failed. Specifically,
the "Unable to allocate tx_buffer (map) memory" copy & pasted
inappropriately on several occasions was replaced with proper
messages.
o Enhance some other error messages to indicate which half - RX
or TX - they apply to instead of using identical text in both
cases and generally canonicalize them.
o Correct the descriptions of iflib_{r,t}xsd_alloc() to reflect
reality; current code doesn't use {r,t}x_buffer structures.
o In iflib_queues_alloc():
- Remove redundant BUS_DMA_NOWAIT of iflib_dma_alloc() calls,
- change the M_WAITOK from malloc(9) calls into M_NOWAIT. The
return values are already checked, deferred DMA allocations
not being an option at this point, BUS_DMA_NOWAIT has to be
used anyway and prior malloc(9) calls in this function also
specify M_NOWAIT.
MFC r342908:
Reduce the size of struct ip_fw_args from 240 to 128 bytes on amd64.
And refactor the code to avoid unneeded initialization to reduce overhead
of per-packet processing.
ipfw(4) can be invoked by the pfil(9) framework several times for each
packet. Each call uses an on-stack variable of type struct ip_fw_args to
keep the state of ipfw(4) processing. Currently this variable is 240 bytes
in size on amd64. Each time, ipfw(4) does bzero() on it and then
initializes some fields.
glebius@ has reported that they at Netflix discovered that initialization
of this variable produces significant overhead on packet processing.
After patching I managed to increase performance of packet processing on
simple routing with ipfw(4) firewalling by about 11%, from 9.8Mpps to
11Mpps (Xeon E5-2660 v4@ + Mellanox 100G card).
Introduced a new field, flags, that is used to keep track of which fields
were initialized. Some fields were moved into an anonymous union to reduce
the size; they are all mutually exclusive. The dummypar field was unused
and is therefore removed. The hopstore6 field type was changed from
sockaddr_in6 to the slightly smaller struct ip_fw_nh6. The size of struct
ip_fw_args is now 128 bytes.
ipfw_chk() was modified to properly handle ip_fw_args.flags instead of
relying on checks for NULL pointers.
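The flags-plus-union idea above can be sketched as follows. This is an
illustration under assumptions, not struct ip_fw_args: the structure, flag
and function names are all hypothetical. It shows per-call setup clearing a
single word, and consumers testing a flag bit rather than comparing a member
against NULL.

```c
#include <stdint.h>

/* All names below are hypothetical; this is not struct ip_fw_args. */
#define ARGS_F_HOP4	0x01	/* hop4 member is valid */
#define ARGS_F_HOP6	0x02	/* hop6 member is valid */

struct pkt_args {
	uint32_t flags;		/* which optional members are initialized */
	union {			/* mutually exclusive members share storage */
		uint32_t hop4;
		uint8_t  hop6[16];
	};
};

/* Per-packet setup touches one word instead of zeroing the whole struct. */
void
pkt_args_init(struct pkt_args *a)
{
	a->flags = 0;
}

void
pkt_args_set_hop4(struct pkt_args *a, uint32_t hop)
{
	a->hop4 = hop;
	/* hop4 and hop6 overlap, so only one of them can be valid. */
	a->flags = (a->flags & ~(uint32_t)ARGS_F_HOP6) | ARGS_F_HOP4;
}

/* Consumers test the flag rather than inspecting the member itself. */
int
pkt_args_get_hop4(const struct pkt_args *a, uint32_t *hop)
{
	if ((a->flags & ARGS_F_HOP4) == 0)
		return (0);
	*hop = a->hop4;
	return (1);
}
```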
MFC r343551:
Fix the bug introduced in r342908 that causes problems with dynamic
handling for protocols without port numbers.
Since port numbers were uninitialized for protocols like ICMP/ICMPv6,
ipfw_chk() used some non-zero values to create dynamic states, and due
to this it failed to match replies with created states.
Alexander Motin [Sat, 9 Feb 2019 02:09:29 +0000 (02:09 +0000)]
MFC r343673: Fix integer math overflow in UMA hash_alloc().
512GB of ZFS ABD ARC means an abd_chunk zone of 128M 4KB items. To manage
them UMA tries to allocate a 2GB hash table, whose size does not fit into
an int variable, causing a later allocation failure, which makes ARC shrink
back below 512GB, not letting it use more RAM. With this change I
easily reached >700GB ARC size on a 768GB RAM machine.
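The arithmetic above can be checked with a small sketch. The helper names are
hypothetical, and the bucket count and bucket size used to arrive at 2GB are
assumptions (the commit message only states the resulting figure). A 32-bit
int is also assumed.

```c
#include <limits.h>
#include <stdint.h>

/* Number of fixed-size items needed to cover "bytes" of cache. */
uint64_t
item_count(uint64_t bytes, uint64_t item_size)
{
	return (bytes / item_size);
}

/*
 * Byte size of a hash table, computed in 64 bits.  The bucket count and
 * bucket size passed in are assumptions, not values from the UMA code.
 */
uint64_t
hash_bytes(uint64_t buckets, uint64_t bucket_size)
{
	return (buckets * bucket_size);
}

/* Nonzero if the size can be stored in an int without overflow. */
int
fits_in_int(uint64_t size)
{
	return (size <= (uint64_t)INT_MAX);
}
```

For example, item_count(512ULL << 30, 4096) gives 128M items, and a table of
2^28 buckets of 8 bytes each is exactly 2GB, one more than a 32-bit INT_MAX.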
Add SYNC_KLOOP_MODE option, and add support for direct mode, where application
executes the TXSYNC and RXSYNC in the context of the ioeventfd wake up callback.
Marius Strobl [Thu, 7 Feb 2019 10:30:11 +0000 (10:30 +0000)]
MFC: r343578 (partial)
- Stop iflib(4) from leaking MSI messages on detachment by calling
bus_teardown_intr(9) before pci_release_msi(9).
- Ensure that iflib(4) and associated drivers pass correct RIDs to
bus_release_resource(9) by obtaining the RIDs via rman_get_rid(9)
on the corresponding resources instead of using the RIDs initially
passed to bus_alloc_resource_any(9) as the latter function may
change those RIDs. Solely em(4) for the ioport resource (but not
others) and bnxt(4) were using the correct RIDs by caching the ones
returned by bus_alloc_resource_any(9).
- Change the logic of iflib_msix_init() around to only map the MSI-X
BAR if MSI-X is actually supported, i. e. pci_msix_count(9) returns
> 0. Otherwise the "Unable to map MSIX table " message triggers for
devices that simply don't support MSI-X and the user may think that
something is wrong while in fact everything works as expected.
- Put some (mostly redundant) debug messages emitted by iflib(4)
and em(4) during attachment under bootverbose. The non-verbose
output of em(4) seen during attachment now is close to the one
prior to the conversion to iflib(4).
- Replace various variants of spelling "MSI-X" (several in messages)
with "MSI-X" as used in the PCI specifications.
- Remove some trailing whitespace from messages emitted by iflib(4)
and change them to consistently start with uppercase.
- Remove some obsolete comments about releasing interrupts from
drivers and correct a few others.
Reviewed by: erj, Jacob Keller, shurd
Differential Revision: https://reviews.freebsd.org/D18980
Dimitry Andric [Thu, 7 Feb 2019 06:55:26 +0000 (06:55 +0000)]
MFC r343748:
Use NLDT to get number of LDTs on i386
Compiling a GENERIC kernel for i386 with clang 8.0 results in the
following warning:
/usr/src/sys/i386/i386/sys_machdep.c:542:40: error: 'sizeof ((ldt))' will return the size of the pointer, not the array itself [-Werror,-Wsizeof-pointer-div]
nldt = pldt != NULL ? pldt->ldt_len : nitems(ldt);
^~~~~~~~~~~
/usr/src/sys/sys/param.h:299:32: note: expanded from macro 'nitems'
#define nitems(x) (sizeof((x)) / sizeof((x)[0]))
~~~~~~~~~~~ ^
Indeed, 'ldt' is declared as 'union descriptor *', so nitems() is not
the right way to determine the number of LDTs. Instead, the NLDT define
from sys/x86/include/segments.h should be used.
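A small sketch of why nitems() only works on true arrays. The names below are
hypothetical; NSLOTS stands in for a constant like NLDT and its value is
chosen arbitrarily.

```c
#include <stddef.h>

#define nitems(x)	(sizeof((x)) / sizeof((x)[0]))
#define NSLOTS		17	/* hypothetical stand-in for NLDT */

union slot {
	unsigned char raw[8];
};

static union slot table[NSLOTS];

/* sizeof() on the array covers all elements, so nitems() works. */
size_t
count_from_array(void)
{
	return (nitems(table));
}

/*
 * Through a pointer, sizeof() only sees the pointer itself; the element
 * count has to come from elsewhere, such as the NSLOTS constant.
 */
size_t
size_seen_through_pointer(void)
{
	union slot *p = table;

	return (sizeof(p));
}
```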
netmap: improvements to the netmap kloop (CSB mode)
Changelist:
- Add the proper memory barriers in the kloop ring processing
functions.
- Fix memory barriers usage in the user helpers (nm_sync_kloop_appl_write,
nm_sync_kloop_appl_read).
- Fix nm_kr_txempty() helper to look at rhead rather than rcur. This
is important since the kloop can read a value of rcur which is ahead
of the value of rhead (see explanation in nm_sync_kloop_appl_write)
- Remove obsolete ptnetmap_guest_write_kring_csb() and
ptnet_guest_read_kring_csb(), and update if_ptnet(4) to use those.
- Prepare in advance the arguments for netmap_sync_kloop_[tr]x_ring(),
to make the kloop faster.
- Provide kernel and user implementation for nm_ldld_barrier() and
nm_ldst_barrier()
netmap: fix knote() argument to match the mutex state
The nm_os_selwakeup function needs to call knote() to wake up kqueue(9)
users. However, this function can be called from different code paths,
with different lock requirements.
This patch fixes the knote() call argument to match the relevant lock state.
Also, comments have been updated to reflect current code.
MFC r343697:
net80211(4): fix rate check when 'roaming' ifconfig(8) option is set to 'auto'
Do not try to clear 'basic rate' bit from roamRate; it cannot be here and,
actually, this operation clears 'MCS rate' bit instead, breaking comparison
for 11n / 11ac modes.
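A sketch of the failure mode described above, under assumptions: the macro
names are hypothetical, the two flags are taken to share one bit as the text
implies, and the roaming check is assumed to be a plain "current rate below
roaming rate" comparison.

```c
#include <stdint.h>

/* Hypothetical names; the text implies both flags share one bit. */
#define RATE_BASIC_FLAG	0x80	/* legacy rates: "basic rate" marker */
#define RATE_MCS_FLAG	0x80	/* 11n rates: value is an MCS index */

/* Old behaviour: strip the flag bit from the roaming rate first. */
int
should_roam_old(uint8_t cur, uint8_t roam)
{
	return (cur < (roam & ~RATE_BASIC_FLAG));
}

/* New behaviour: compare the two rates as they are. */
int
should_roam_new(uint8_t cur, uint8_t roam)
{
	return (cur < roam);
}
```

With cur = RATE_MCS_FLAG | 1 and roam = RATE_MCS_FLAG | 3, the old check
compares 0x81 against 3 and never triggers, while the new one compares 0x81
against 0x83.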
MFC r343532:
A few corrections and clarifications to r343406.
- Use "in" instead of "on" when referring to directory and UFS partition.
- Switch from hw.physmem to hw.realmem and add a description to
distinguish the two.
- Explain why the "df" command is having trouble displaying ZFS sizes
correctly. Add a bit more descriptive text to explain why the output of
"zfs list -o space" should be used.
- Switch to vmstat instead of iostat display for systat(1) as it shows
more information on one screen. Describe what is displayed based on the
text of the man page. Change the list of the other values accordingly.
- Sort the flags to "zfs destroy" alphabetically.
Cy Schubert [Tue, 5 Feb 2019 02:33:57 +0000 (02:33 +0000)]
MFC r342815:
Remove ipsd (IP Scan Detector). It is unused and to my knowledge has
never been used on any platform that ipfilter has been on. However
it looks like it could be a useful utility, therefore there are plans
to make it a port one day. It lacks a man page as well.
Brooks Davis [Mon, 4 Feb 2019 22:38:34 +0000 (22:38 +0000)]
MFC r343587:
Add a simple port filter to SIFTR.
SIFTR does not allow any kind of filtering, but captures every packet
processed by the TCP stack.
Often, only a specific session or service is of interest, and doing the
filtering in post-processing of the log adds to the overhead of SIFTR.
This adds a new sysctl net.inet.siftr.port_filter. When set to zero, all
packets get captured as previously. If set to any other value, only
packets where either the source or the destination ports match, are
captured in the log file.
Submitted by: Richard Scheffenegger
Reviewed by: Cheng Cui
Differential Revision: https://reviews.freebsd.org/D18897
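The matching rule for the SIFTR port filter described above can be sketched
as a single predicate. port_filter_match() is a hypothetical name, not a
function from the SIFTR source, and byte-order handling is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical predicate: "filter" mirrors the value of the
 * net.inet.siftr.port_filter sysctl.
 */
bool
port_filter_match(uint16_t filter, uint16_t sport, uint16_t dport)
{
	/* Zero disables filtering: every packet is captured. */
	if (filter == 0)
		return (true);
	/* Otherwise either end of the connection may match. */
	return (sport == filter || dport == filter);
}
```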
MFC r343524:
rsu(4): do not ignore mgmtrate / mcastrate / ucastrate.
Enforce net80211 rates for control / management / multicast / EAPOL frames
and allow overriding the rate for unicast frames via ifconfig(8) 'ucastrate'
option; by default it still uses f/w rate adaptation for unicast frames.
[rpi] Reorganize spigen(4) overlays for Raspberry Pi
- Remove CS=2 entry from spigen-rpi2 since it didn't work
- Add spigen-rpi3 overlay for Raspberry Pi 3
- Enable rpi overlay modules for GENERIC kernel on aarch64
[led] propagate error from set_led() to the caller
Do not lose error condition by always returning 0 from set_led.
None of the calls to set_led checks for return value at the moment so
none of API consumers in base is affected.
PR: 231567
Submitted by: Bertrand Petit <bsdpr@phoe.frmug.org>
r343222:
Fix crash in systat(1) when certain commands are called without arguments
Add check for missing arguments to dsmatchselect and dsselect
PR: 219689
Submitted by: Marko Turk <mt@markoturk.info>
r343223:
Fix inconsistency in return values introduced by r343222
Consistently return 1 for the case of missing arguments in both functions
PR: 219689
X-MFC-With: 343222
r343338:
Fix systat's :only command parser for the multiple arguments case
According to systat(1), the :only option is supposed to accept multiple
drives, but the parser for its arguments stops after the first entry. Fix
the parser logic to accept multiple drives.
PR: 59220
Reported by: Andy Farkas <andyf@speednet.com.au>
r343009:
Add four kerberos CLI utilities to OptionalObsoleteFiles.inc
Add asn1_compile, make-roken, kcc, and slc to the OptionalObsoleteFiles.inc
so they would be removed during delete-old stage if the new world is built
without Kerberos support.
r343109:
Add optional obsolete files for the installworld without sendmail
Add two more entries for WITHOUT_SENDMAIL install. The /var/spool/clientmqueue
entry would be deleted only if there are no files/dirs in it, so the
content generated during previous lifecycle of the system is safe
r343110:
Fix conditional obsolete files entry for WITHOUT_EXAMPLES
Add all the files under /usr/share/examples to the MK_EXAMPLES
section. OLD_DIRS entries are not removed if they're not empty so
prior to this change WITHOUT_EXAMPLES didn't have significant effect
on the updated system.
r343028:
[mv_pci] Increase default PCI space size for mv_pci
mv_pci driver reads PCI memory window layout from DTB data and if the
data is incomplete falls back to a default value. That value is too small
to fit two PCI spaces for mwlwifi devices on WRT3200ACM so the resource
allocation for them fails. Increase the default to 4Mb from 1Mb so
the devices can be properly attached.
r343104:
[mv] Fix invalid condition in fdt_fixup_ranges
Add parentheses to perform assignment before comparison. The prior
condition worked because fdt_parent_addr_cells returns 1 for the DTB
on which fdt_fixup_ranges is called and par_addr_cells accidentally
ends up being set to the same value.
PR: 210705
Submitted by: David Binderman <dcb314@hotmail.com>
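A sketch of the precedence problem fixed in r343104. The function names and
the "<= 2" comparison are hypothetical and chosen only to illustrate how an
unparenthesized assignment can store the comparison result and still, by
accident, yield the expected value when the callee returns 1.

```c
/* Stand-in for a lookup such as fdt_parent_addr_cells(); hypothetical. */
typedef int (*cells_fn)(void);

int one_cell(void) { return (1); }
int two_cells(void) { return (2); }

/*
 * How the compiler parses "*out = f() <= 2": the comparison runs first
 * and its 0/1 result is what gets stored.
 */
int
check_unparenthesized(cells_fn f, int *out)
{
	return ((*out = (f() <= 2)) != 0);
}

/* Intended form: store the return value first, then compare it. */
int
check_parenthesized(cells_fn f, int *out)
{
	return ((*out = f()) <= 2);
}
```

With one_cell() both variants store 1, which hides the bug; with two_cells()
the first variant stores 1 while the second stores 2.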
MFC r341472:
Add ability to request listing and deleting only for dynamic states.
This can be useful, when net.inet.ip.fw.dyn_keep_states is enabled, but
after rules reloading some state must be deleted. Added new flag '-D'
for such purpose.
Retire '-e' flag, since there can not be expired states in the meaning
that this flag historically had.
Also add "verbose" mode for listing of dynamic states, it can be enabled
with '-v' flag and adds additional information to states list. This can
be useful for debugging.
MFC r341471:
Reimplement how net.inet.ip.fw.dyn_keep_states works.
Enabling this feature allows dynamic states to be kept when the parent
rule is deleted. But it worked only when the default rule is
"allow from any to any".
Now when a rule with a dynamic opcode is going to be deleted, and
net.inet.ip.fw.dyn_keep_states is enabled, existing states will reference
named objects corresponding to this rule, and also reference the rule.
And when ipfw_dyn_lookup_state() finds a state for a deleted parent rule,
it will return the pointer to the deleted rule, which is still valid.
This implementation doesn't support O_LIMIT_PARENT rules.
The refcnt field was added to struct ip_fw to keep the reference; a next
pointer was also added to be able to iterate rules without damaging the
content when deleted rules are chained.
Named objects are referenced only when states are going to be deleted, to
be able to reuse the kidx of named objects when new parent rules are
installed.
ipfw_dyn_get_count() function was modified and now it also looks into
dynamic states and constructs maps of existing named objects. This is
needed to correctly export orphaned states into userland.
ipfw_free_rule() was changed to be global, since a dynamic state can now
free a rule when it is expired and the reference counter becomes 1.
The external actions subsystem was also modified, since external actions
can be deregistered and instances can be destroyed. In these cases deleted
rules that are referenced by orphaned states must be modified to prevent
access to freed memory. The ipfw_dyn_reset_eaction() and
ipfw_reset_eaction_instance() functions were added for these purposes.
Kristof Provost [Fri, 1 Feb 2019 10:04:53 +0000 (10:04 +0000)]
MFC r343418:
pf: Fix use-after-free of counters
When cleaning up a vnet we free the counters in V_pf_default_rule and
V_pf_status from shutdown_pf(), but we can still use them later, for example
through pf_purge_expired_src_nodes().
Free them as the very last operation, as they rely on nothing else themselves.
Build fix for missing NET_EPOCH_XXX() dependencies after r343650.
This patch is to be reverted when the relevant changes are MFC'ed.
This is a direct commit.
MFC r343395:
Fix refcounting leaks in IPv6 MLD code leading to loss of IPv6
connectivity.
Looking at past changes in this area like r337866, some refcounting
bugs have been introduced, one by one. One example is calling
in6m_disconnect() and in6m_rele_locked() in mld_v1_process_group_timer()
where previously no disconnect nor refcount decrement was done.
Calling in6m_disconnect() when it shouldn't be called causes IPv6
solicitation to no longer work, because all the multicast addresses
receiving the solicitation messages are now deleted from the network
interface.
This patch reverts some recent changes while improving the MLD
refcounting and concurrency model after the MLD code was converted
to using EPOCH(9).
List changes:
- All CK_STAILQ_FOREACH() macros are now properly enclosed into
EPOCH(9) sections. This simplifies assertion of locking inside
in6m_ifmultiaddr_get_inm().
- Corrected bad use of in6m_disconnect() leading to loss of IPv6
connectivity for MLD v1.
- Factored out checks for valid inm structure into
in6m_ifmultiaddr_get_inm().
MFC r343394:
When detaching a network interface drain the workqueue freeing the inm's
because the destructor will access the if_ioctl() callback in the ifnet
pointer which is about to be freed. This prevents use-after-free.
MFC r343392:
Fix duplicate acquiring of refcount when joining IPv6 multicast groups.
This was observed by starting and stopping rpcbind(8) multiple times.
Brooks Davis [Wed, 30 Jan 2019 23:36:02 +0000 (23:36 +0000)]
MFC r340242:
Add a top-level make target to rebuild all sysent files.
The sysent target is useful when changing makesyscalls.sh, when
making paired changes to syscalls.master files, or in a future where
freebsd32 sysent entries are built from the default syscalls.master.
Marius Strobl [Wed, 30 Jan 2019 11:56:10 +0000 (11:56 +0000)]
MFC: r343481
- In _iflib_fl_refill(), don't mark an RX buffer as available in the
corresponding bitmap before adding an mbuf has actually succeeded.
Previously, m_gethdr(M_NOWAIT, ...) failing caused a "hole" in the
RX ring but not in its bitmap. One implication of such a hole was
that in a subsequent call to _iflib_fl_refill() with the RX buffer
accounting still indicating another reclaimable buffer, bit_ffc(3)
nevertheless returned -1 in frag_idx which in turn caused havoc
when used as an index. Thus, additionally assert that frag_idx is
0 or greater.
Another possible consequence of a hole in the RX ring was a NULL-
dereference when trying to use the unallocated mbuf, for example
in iflib_rxd_pkt_get().
This bug was introduced with r341095, MFCed to stable/12 in r343304.
While at it, make the variable declarations in _iflib_fl_refill()
conform to style(9) and remove redundant checks already performed
by bit_ffc{,_at}(3).
- In iflib_queues_alloc(), don't pass redundant M_ZERO to bit_alloc(3).
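The ordering fix in _iflib_fl_refill() described above can be sketched with a
toy ring. Everything here is hypothetical (structure, sizes, allocator
callbacks) and it does not use bitstring(3) or mbufs. It only shows the
pattern of marking a slot in the bitmap after the allocation has succeeded,
so a failed allocation cannot leave a hole.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 8	/* hypothetical ring size */

struct ring {
	uint8_t	used;			/* bit i set: slot i holds a buffer */
	void	*buf[RING_SLOTS];
};

/* Index of the first clear bit, or -1 if every slot is in use. */
static int
first_free(const struct ring *r)
{
	int i;

	for (i = 0; i < RING_SLOTS; i++)
		if ((r->used & (1u << i)) == 0)
			return (i);
	return (-1);
}

/* Hypothetical allocators for exercising both paths. */
static char backing[16];
void *alloc_ok(void) { return (backing); }
void *alloc_fail(void) { return (NULL); }

/* Refill one slot; returns its index, or -1 if nothing was added. */
int
ring_refill_one(struct ring *r, void *(*alloc)(void))
{
	int idx = first_free(r);
	void *b;

	if (idx < 0)
		return (-1);
	b = alloc();
	if (b == NULL)
		return (-1);		/* bitmap untouched, so no hole */
	r->buf[idx] = b;
	r->used |= (uint8_t)(1u << idx);	/* mark only after success */
	return (idx);
}
```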