CyberLeo.Net >> Repos - FreeBSD/FreeBSD.git/log

MFV r362082:

Update sqlite3 3.31.1 --> 3.32.0.

PR: 247149
Reported by: spam123@bitbert.com
Reminded by: emaste
MFC after: 3 days
Security: CVE-2020-11655, CVE-2020-13434, CVE-2020-13435,
CVE-2020-13630, CVE-2020-13631, CVE-2020-13632

Teach the arm64 vfp.h about struct thread.

Ensure struct thread is defined in vfp.h. In some cases it is not and stops
the kernel from building.

Sponsored by: Innovate UK

Small cleanup due to upstream ifdef cleanups.

MFC after: 1 week

Add myself (gbe) to committers-doc.dot and calendar.freebsd

Reviewed by: bcr (mentor)
Approved by: bcr (mentor)
Differential Revision: https://reviews.freebsd.org/D25241

[wlanstats] Add the per-node amsdu hardware decap'ed receive stats.

This is useful for tracking hardware provided AMSDU frames to see
when we're (a) seeing them, and (b) seeing the split between
intermediary and final frames.

Tested:

* QCA9880 (athp) - AP mode

[net80211] First part of A-MSDU offload handling - don't bump A-MPDU reordering seqno

When doing A-MSDU offload handling the driver is required to mark
A-MSDUs from the same MPDU with the same sequence number.
It then tags them as AMSDU (if it's a decap'ed A-MSDU) and AMSDU_MORE
(saying there's more AMSDUs decapped in the same MSDU.)
This allows encryption and sequence number offload to work right.

In the A-MSDU path the sequence number check looks at the A-MSDU flags
in the frame to see whether it's part of the same seqno and will pass them
(ie, not increment rx_seq until the last A-MSDU is seen from the driver,
or a new seqno shows up.0

However, I did this work in the A-MSDU path but not the A-MSDU in A-MPDU path.
For the non A-MDSU offload case the A-MPDU receive reordering will do its
thing and then pass up the MPDU up for decap - which then will see it's
an A-MSDU and decap each sub-frame.  But this isn't done for offloaded
A-MSDU frames.

This requires two parts:

* Don't bump the RX sequence number, same as above; and
* If frames go into the reordering buffer, they need to be added into the slot
  as a set of frames rather than a single frame, so once a new seqno shows up
  this slot can be marked as "full" and we can move on.

This patch does the first.  The latter requires that I find and commit
work to change rxa_m from an mbuf to an mbufq and the nhandle A-MSDU
there.  But, the first is enough to allow the normal case (ie, no or not
a lot of A-MPDU RX reordering) to work.

This allows the athp driver (QCA9880) throughput to go from VERY low
(like 5mbit TCP, 1/3-1/4 expected UDP throughput) to ~ 250mbit TCP
and > 300mbit UDP on a VHT/40 channel.  TCP sucks because, well, it
shows up as MASSIVE packet loss when all but one frame in a decap'ed
A-MSDU stream is dropped. Le whoops.

Now, where'd I put that laptop with the patch for rxa_m mbufq that
I wrote like in 2017...

Tested:

* AR9380, STA/AP mode (a big no-op, no A-MSDU hardware decap);
* if_run (RT3593), STA DWDS mode (A-MPDU / A-MSDU receive, but again
  no A-MSDU hardware decap);
* QCA9880, STA/AP mode (which is doing hardware A-MPDU/A-MSDU decap,
  but no A-MPDU reordering in the firmware.)

Import sqlite3-3.32.0 (3320000)

Decode the "LACP Fast Timeout" LAGG option flag

r286700 added the "lacp_fast_timeout" option to `ifconfig', but we forgot to
include the new option in the string used to decode the option bits. Add
"LACP_FAST_TIMO" to LAGG_OPT_BITS.

Also, s/LAGG_OPT_LACP_TIMEOUT/LAGG_OPT_LACP_FAST_TIMO/g , to be clearer that
the flag indicates "Fast Timeout" mode.

Reported by: Greg Foster <gfoster at panasas dot com>
Reviewed by: jpaetzel
MFC after: 1 week
Sponsored by: Panasas
Differential Revision: https://reviews.freebsd.org/D25239

Shorten the filename of the coresight replicator driver.

Sponsored by: DARPA, AFRL

netmap: introduce netmap_kring_on()

This function returns NULL if the ring identified by
queue id and direction is in netmap mode. Otherwise
return the corresponding kring.
Use this function to replace vtnet_netmap_queue_on().

MFC after: 1 week

Correct comment (this should have been committed with r362065).

Sponsored by: The FreeBSD Foundation
MFC after: 13 days

Skip sys.net.if_lagg_test.lacp_linkstate_destroy_stress in CI because of panic

PR: 244168
Sponsored by: The FreeBSD Foundation

Remove some more duplicate test cases I accidentally committed

Reported by: markj, yuripv
MFC after: 2 weeks
X-MFC-With: 362017

Restore TLB invalidations done before smp started.

In particular, invalidation of the preloaded modules text to allow
execution from it was broken after D25188/r362031.

Reviewed by: markj
Reported by: delphij, dhw
Sponsored by: The FreeBSD Foundation
MFC after: 13 days

em(4): Always reinit interface when adding/removing VLAN

This partially reverts r361053 since there have been reports
by users that this breaks some functionality for em(4)
devices; it seems at first glance that some sort of interface
restart is required for those cards.

This isn't a proper fix; this unbreaks those users until a proper
fix is found for their issues.

PR: 240818
Reported by: Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
MFC after: 3 days

xargs(1): Add EXAMPLES to man page

Add EXAMPLES covering options I, J, L, n, P.

While here, fix warning (STYLE: no blank before trailing delimiter: Fl P,)
Bumping .Dd

Approved by: bcr@
Differential Revision: https://reviews.freebsd.org/D25214

Don't use newlines with linux_msg(). No functional changes.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation

Add missing range checks when receiving USB ethernet packets.

Found by: Ilja Van Sprundel, IOActive
MFC after: 3 days
Sponsored by: Mellanox Technologies

Replace LINUX_FASYNC with LINUX_O_ASYNC; no functional changes.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25218

Non-functional changes due to upstream cleanup.

MFC after: 1 week

Fix grabbing of tegra uart.
An attempt to write to FCR register may corrupt transmit FIFO,
so we should wait for the FIFO to be empty before we can modify it.

MFC after: 1 week

Improve the warnings.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation

Make linux(4) handle SO_REUSEPORT.

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25216

fix up r362047: a call to zvol_*_minors() was not hidden from userland

Reported by: CI/FreeBSD-head-powerpc64-build
MFC after: 5 weeks
X-MFC with: r362047

rework how ZVOLs are updated in response to DSL operations

With this change all ZVOL updates are initiated from the SPA sync
context instead of a mix of the sync and open contexts.  The updates are
queued to be applied by a dedicated thread in the original order.  This
should ensure that ZVOLs always accurately reflect the corresponding
datasets.  ZFS ioctl operations wait on the mentioned thread to complete
its work.  Thus, the illusion of the synchronous ZVOL update is
preserved.  At the same time, the SPA sync thread never blocks on ZVOL
related operations avoiding problems like reported in bug 203864.

This change is based on earlier work in the same direction: D7179 and
D14669 by Anthoine Bourgeois.  D7179 tried to perform ZVOL operations
in the open context and that opened races between them.  D14669 uses a
design very similar to this change but with different implementation
details.

This change also heavily borrows from similar code in ZoL, but there are
many differences too.  See:
- https://github.com/zfsonlinux/zfs/commit/a0bd735adb1b1eb81fef10b4db102ee051c4d4ff
- https://github.com/zfsonlinux/zfs/issues/3681
- https://github.com/zfsonlinux/zfs/issues/2217

PR: 203864
MFC after: 5 weeks
Sponsored by: CyberSecure
Differential Revision: https://reviews.freebsd.org/D23478

Make sure packets generated by raw IP code is let through by mlx5en(4).

Allow the TCP header to reside in the mbuf following the IP header.
Else such packets will get dropped.

Backtrace:
mlx5e_sq_xmit()
mlx5e_xmit()
ether_output_frame()
ether_output()
ip_output_send()
ip_output()
rip_output()
sosend_generic()
sosend()
kern_sendit()
sendit()
sys_sendto()
amd64_syscall()
fast_syscall_common()

MFC after: 1 week
Sponsored by: Mellanox Technologies

Extend use of unlikely() in the fast path, in mlx5en(4).

Typically the TCP/IP headers fit within the first mbuf and should not
trigger any of the error cases. Use unlikely() for these cases.

No functional change.

MFC after: 1 week
Sponsored by: Mellanox Technologies

Use const keyword when parsing the TCP/IP header in the fast path in mlx5en(4).

When parsing the TCP/IP header in the fast path, make it clear by using
the const keyword, no fields are to be modified inside the transmitted
packet.

No functional change.

MFC after: 1 week
Sponsored by: Mellanox Technologies

iicbb: rebuild the bit-banging algorithms using different primitives

I2C_SET was quite inflexible, it used too long delays as well as some
unnecessary delays.  The new building blocks are iicbb_clockin and
iicbb_clockout.  The former sets SDA and starts the high period of SCL,
the latter executes the low period of SCL.  What happens during the high
phase depends on the operation.  For writes we just hold both lines, for
reads we poll SDA.  S, Sr and P change SDA in the middle of the high
period.

Also, the calculation of udelay has been updated, so that the resulting
period more closely corresponds the requested bus frequency.  There is a
new knob, io_delay, that allows to further adjust udelay based on the
estimated latency of pin toggling operations.

Finally, I slightly changed debug tracing and added error indicators to
it.  The debug prints are compiled in but disabled by default.  This can
be of use if there is any fallout from this change.

Some ideas for further improvements:
- add a function for sub-microsecond delays (e.g., in units of 1/10th of
  a microsecond) and use it for more precise timing of short delays;
- account for the actual time spent in the pin I/O.

Some sample debug output with the new code follows.

Reading temperature and humidity from HTU21 in the bus hold mode:
  <<w80+ we3+ <w81+ .....r6d+ rac+ r94- >>
  <<w80+ we5+ <w81+ .............r47+ re2+ r84- >>
where '<<' is S, '<' is Sr, '>>' is P, '.' is one millisecond of clock
stretching by the slave.

Reading temperature and humidity in the no-hold mode:
  <<w80+ wf3+ >>
  <<w81- >>
  <<w81+ r6d+ r54+ raf- >>
  <<w80+ wf5+ >>
  <<w81- >>
  <<w81+ r48+ r4e+ r9c- >>
where '+' is Ack and '-' is NoAck.
We see that first read attempts are not acknowledged.

MFC after: 4 weeks
Differential Revision: https://reviews.freebsd.org/D22206

Remove duplicate lines from sed tests

Reported by: yuripv
Approved by: pfg (src)
MFC after: 2 weeks
X-MFC-With: 362017

Hard-code the ice_ddp firmware version.

Like every other firmware image in the tree, the makefile will need to
be updated to point to the newest import.

Reviewed by: erj, imp (previous version)
Differential Revision: https://reviews.freebsd.org/D25222

Fix a couple of nits in Linux sysinfo(2) emulation.

- Use the same definition of free memory as Linux.
- Rename the totalbig and freebig fields to match the corresponding
names on Linux.

Discussed with: alc
MFC after: 1 week

Add a comment reflecting the commit log for r361945.

Suggested by: alc
Reviewed by: alc
MFC with: r361945

Remove the FIRMWARE_MAX limit.

The firmware module arbitrarily limits us to at most 50 images. It is
possible to hit this limit on platforms that preload many firmware
images, or link all of the firmware images for a set of devices into the
kernel.

Convert the table into a linked list, removing the limit.

Reported by: Steve Wheeler
Reviewed by: rpokala
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC (Netgate)
Differential Revision: https://reviews.freebsd.org/D25161

powerpc/pmap: Fix pte_find_next() iterators for booke64 pmap

After r361988 fixed the reference count leak on booke64, it became possible
for an iteration somewhere in the middle of a page to become stale, with the
page vanishing (correctly) due to all PTEs on that page going away.
pte_find_next() would start at that iterator, and move along 'higher' order
directory pages until it finds a valid one, without zeroing out the lower
order pages. For instance:

/* Find next pte at or above 0x10002000. */
pte = pte_find_next(pmap, &(0x10002000));
pte_remove(pmap, pte);
/* This pte was the last reference in the page table page, page is
* gone.
*/
pte = pte_find_next(pmap, 0x10002000);
/* pte_find_next will see 0x10002000's page is gone, and jump to the
* next one, but starting iteration at the '0x2000' slot, skipping
* 0x0000 and 0x1000.
*/

This caused some processes, like git, to trip the KASSERT() in
pmap_release().

Fix this by zeroing all lower order iterators at each level.

Remove double-calls to tc_get_timecount() to warm timecounters.

It seems that second call does not add any useful state change for all
implemented timecounters.

Discussed with: bde
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks

Add pthread_getname_np() and pthread_setname_np() aliases for
pthread_get_name_np() and pthread_set_name_np().

This re-applies r361770 after compatibility fixes.

Reviewed by: antoine, jkim, markj
Tested by: antoine (exp-run)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25117

amd64 pmap: reorder IPI send and local TLB flush in TLB invalidations.

Right now code first flushes all local TLB entries that needs to be
flushed, then signals IPI to remote cores, and then waits for
acknowledgements while spinning idle. In the VMWare article 'Don’t
shoot down TLB shootdowns!' it was noted that the time spent spinning
is lost, and can be more usefully used doing local TLB invalidation.

We could use the same invalidation handler for local TLB as for
remote, but typically for pmap == curpmap we can use INVLPG for locals
instead of INVPCID on remotes, since we cannot control context
switches on them. Due to that, keep the local code and provide the
callbacks to be called from smp_targeted_tlb_shootdown() after IPIs
are fired but before spin wait starts.

Reviewed by: alc, cem, markj, Anton Rang <rang at acm.org>
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D25188

Add mode selection to iMX6 IPU driver

- Configure ipu1_di0 tob e sourced from the VIDEO_PLL(PLL5) and hardcode
  frequency to (455000000/3)Mhz. This value, further divided, can yield
  frequencies close enough to support 1080p, 720p, 1024x768, and 640x480
  modes. This is not ideal but it's an improvement comparing to the only
  hardcoded 1024x768 mode.

- Fix memory leaks if attach method failed
- Print EDID when -v passed to the kernel

Fix reading EDID on TVs/monitors without E-DCC support

Writing segment id to I2C device 0x30 only required if the segment is
non-zero. On the devices without E-DCC support writing to that address
fails and whole transaction then fails too. To avoid this do
not attempt write to the segment selection device unless required.

MFC after: 2 weeks

Adjust crypto_apply function callbacks for OCF.

- crypto_apply() is only used for reading a buffer to compute a
  digest, so change the data pointer to a const pointer.

- To better match m_apply(), change the data pointer type to void *
  and the length from uint16_t to u_int.  The length field in
  particular matters as none of the apply logic was splitting requests
  larger than UINT16_MAX.

- Adjust the auth_xform Update callback to match the function
  prototype passed to crypto_apply() and crypto_apply_buf().  This
  removes the needs for casts when using the Update callback.

- Change the Reinit and Setkey callbacks to also use a u_int length
  instead of uint16_t.

- Update auth transforms for the changes.  While here, use C99
  initializers for auth_hash structures and avoid casts on callbacks.

Reviewed by: cem
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25171

pci: loosen PCIe hot-plug requirements

The original PCIe hot-plug code required a couple of things which cause
PCI probing errors on the QEMU Q35 system and possibly physical systems
(Dell R6515).

Allocate the hot-plug interrupt as shared to support INTx interrupts.
The hot-plug interrupt mechanism should normally be MSI as PCIe mandates
MSI support, but QEMU's Q35 bridge only provides INTx interrupts.

Second, the code required the Electromechanical Interlock (Slot Status
EIS) to be engaged if present (Slot Capability EIP). Some platforms
including QEMU Q35 set EIP but not EIS. Fix by deleting the check.

Reviewed by: imp, mav, jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24877

Read commands from stdin when -f - is passed to sed(1)

This patch teaches sed to interpret a "-" in a special way when given
as an argument to the -f flag.

This behavior is also present in GNU sed.

PR: 244872
Tested by: antoine (exp-run)
Reviewed by: pfg, tobik (older version)
Approved by: pfg (src)
Relnotes: yes
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24079

[net80211] ok ok if_xname won't ever be NULL.

Somewhere in net80211 if_xname is checked against NULL but it doesn't trigger
a compiler warning, but this does. So DTRT for FreeBSD and the other if_xname
derefences can be converted to this function at a later time.

Make linux(4) set the openfiles soft resource limit to 1024 for Linux
applications, which often depend on this being the case. There's a new
sysctl, compat.linux.default_openfiles, to control this behaviour.

Reviewed by: kevans, emaste, bcr (manpages)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25177

Support SO_SNDBUFFORCE/SO_RCVBUFFORCE by aliasing them to the
standard SO_SNDBUF/SO_RCVBUF. Mostly cosmetics, to get rid
of the warning during 'apt upgrade'.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25173

Fix arm64 kernel build with DEBUG on

Submitted by: Greg V <greg@unrelenting.technology>, andrew
Differential Revision: https://reviews.freebsd.org/D24986

All the ARM Coresight interconnect devices set ResourceProducer on memory
resources, ignore it.

The devices found in the ARM Neoverse N1 System Development Platform
(N1SDP).

Sponsored by: DARPA, AFRL

ARM Coresight Funnel device:
o Split-out FDT attachment to a separate file;
o Add ACPI attachment;
o Add support for the Static Funnel device.

Sponsored by: DARPA, AFRL

release: Fix arm GPT image

msdosfs labels are capitalized, use EFI instead of efi.

MFC after: 3 days

Fix the efi serial console in the Arm models.

On some UEFI implementations the ConsOut EFI variable is not a device
path end type so we never move to the next node. Fix this by always
incrementing the device path node pointer, with a sanity check that
the node length is large enough so no two nodes overlap.

While here return failure on malloc failure rather than a NULL pointer
dereference.

Reviewed by: tsoome, imp (previous version)
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D25202

Switch rtsock code to using newly-create rib_action() KPI call.

This simplifies the code and allows to further split rtentry and nexthop,
removing one of the blockers for multipath code introduction, described in
D24141.

Reviewed by: ae
Differential Revision: https://reviews.freebsd.org/D25192

Prevent TCP Cubic to abruptly increase cwnd after app-limited

Cubic calculates the new cwnd based on absolute time
elapsed since the start of an epoch. A cubic epoch is
started on congestion events, or once the congestion
avoidance phase is started, after slow-start has
completed.

When a sender is application limited for an extended
amount of time and subsequently a larger volume of data
becomes ready for sending, Cubic recalculates cwnd
with a lingering cubic epoch. This recalculation
of the cwnd can induce a massive increase in cwnd,
causing a burst of data to be sent at line rate by
the sender.

This adds a flag to reset the cubic epoch once a
session transitions from app-limited to cwnd-limited
to prevent the above effect.

Reviewed by: chengc_netapp.com, tuexen (mentor)
Approved by: tuexen (mentor), rgrimes (mentor)
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D25065

Add le_read_channel_map and le_read_remote_features command

PR: 247051
Submitted by: Marc Veldman marc at bumblingdork.com

Add LE events:
READ_REMOTE_FEATURES_COMPL
LONG_TERM_KEY_REQUEST
REMOTE_CONN_PARAM_REQUEST
DATA_LENGTH_CHANGE
READ_LOCAL_P256_PK_COMPL
GEN_DHKEY_COMPL
ENH_CONN_COMPL

PR: 247050
Submitted by: Marc Veldman marc at bumblingdork.com

powerpc/powernv: Don't use the vmem quantum cache for OPAL PCI MSI allocations

vmem quantum cache is only needed when doing a lot of concurrent allocations,
which doesn't happen when allocating MSIs. This wastes memory for the cache
zones. Avoid this waste and don't use the quantum cache.

Reported by: markj

powerpc/mpc85xx: Don't use the quantum cache in vmem for MPIC MSIs

The qcache is unnecessary for this purpose, it's only needed when there are
lots of concurrent allocations.

Reported by: markj

Fixup r361997 by balancing parens. Duh.

Add missing shell script from r361995

Pointy hat: kevans
Reported by: rpokala
X-MFC-With: r361995

Add two functions that create M_EXTPG mbufs with anonymous pages.

These two functions are needed by nfs-over-tls, but could also be
useful for other purposes.
mb_alloc_ext_plus_pages() - Allocates a M_EXTPG mbuf and enough anonymous
      pages to store "len" data bytes.
mb_mapped_to_unmapped() - Copies the data from a list of mapped (non-M_EXTPG)
      mbufs into a list of M_EXTPG mbufs allocated with anonymous pages.
      This is roughly the inverse of mb_unmapped_to_ext().

Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D25182

Restore an RB_COLOR macro, for the benefit of a bit of DIAGNOSTIC code
that depends on it.

Reported by: rpokala, mjguzik
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D25204

execvPe: obviate the need for potentially large stack allocations

Some environments in which execvPe may be called have a limited amount of
stack available. Currently, it avoidably allocates a segment on the stack
large enough to hold PATH so that it may be mutated and use strsep() for
easy parsing. This logic is now rewritten to just operate on the immutable
string passed in and do the necessary math to extract individual paths,
since it will be copying out those segments to another buffer anyways and
piecing them together with the name for a full path.

Additional size is also needed for the stack in posix_spawnp(), because it
may need to push all of argv to the stack and rebuild the command with sh in
front of it. We'll make sure it's properly aligned for the new thread, but
future work should likely make rfork_thread a little easier to use by
ensuring proper alignment.

Some trivial cleanup has been done with a couple of error writes, moving
strings into char arrays for use with the less fragile sizeof().

Reported by: Andrew Gierth <andrew_tao173.riddles.org.uk>
Reviewed by: jilles, kib, Andrew Gierth
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25038

execvp: fix up the ENOEXEC fallback

If execve fails with ENOEXEC, execvp is expected to rebuild the command
with /bin/sh instead and try again.

The previous version did this, but overlooked two details:

argv[0] can conceivably be NULL, in which case memp would never get
terminated. We must allocate no less than three * sizeof(char *) so we can
properly terminate at all times. For the non-NULL argv standard case, we
count all the non-NULL elements and actually skip the first argument, so we
end up capturing the NULL terminator in our bcopy().

The second detail is that the spec is actually worded such that we should
have been preserving argv[0] as passed to execvp:

"[...] executed command shall be as if the process invoked the sh utility
using execl() as follows:

execl(<shell path>, arg0, file, arg1, ..., (char *)0);

where <shell path> is an unspecified pathname for the sh utility, file is
the process image file, and for execvp(), where arg0, arg1, and so on
correspond to the values passed to execvp() in argv[0], argv[1], and so on."

So we make this change at this time as well, while we're already touching
it. We decidedly can't preserve a NULL argv[0] as this would be incredibly,
incredibly fragile, so we retain our legacy behavior of using "sh" for
argv[] in this specific instance.

Some light tests are added to try and detect some components of handling the
ENOEXEC fallback; posix_spawnp_enoexec_fallback_null_argv0 is likely not
100% reliable, but it at least won't raise false-alarms and it did result in
useful failures with pre-change libc on my machine.

This is a secondary change in D25038.

Reported by: Andrew Gierth <andrew_tao173.riddles.org.uk>
Reviewed by: jilles, kib, Andrew Gierth
MFC after: 1 week

Add some default cases for unreachable code to silence compiler warnings.

This was caused by r361481 when the buffer type was changed from an
int to an enum.

Reported by: mjg, rpokala
Sponsored by: Chelsio Communications

cred: distribute reference count per thread

This avoids dirtying creds in the common case, see the comment in kern_prot.c
for details.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D24007

ixl(4): Add FW recovery mode support and other things

Update the iflib version of ixl driver based on the OOT version ixl-1.11.29.

Major changes:

- Extract iflib specific functions from ixl_pf_main.c to ixl_pf_iflib.c
  to simplify code sharing between legacy and iflib version of driver

- Add support for most recent FW API version (1.10), which extends FW
  LLDP Agent control by user to X722 devices

- Improve handling of device global reset

- Add support for the FW recovery mode

- Use virtchnl function to validate virtual channel messages instead of
  using separate checks

- Fix MAC/VLAN filters accounting

Submitted by: Krzysztof Galazka <krzysztof.galazka@intel.com>
Reviewed by: erj@
Tested by: Jeffrey Pieper <jeffrey.e.pieper@intel.com>
MFC after: 1 week
Relnotes: yes
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D24564

Add a crypto capability flag for accelerated software drivers.

Use this in GELI to print out a different message when accelerated
software such as AESNI is used vs plain software crypto.

While here, simplify the logic in GELI a bit for determing which type
of crypto driver was chosen the first time by examining the
capabilities of the matched driver after a single call to
crypto_newsession rather than making separate calls with different
flags.

Reviewed by: delphij
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25126

Mark padlock(4) and cryptocteon(4) as software drivers.

Both already return the accelerated software priority from
cryptodev_probesession.

Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25125

powerpc/pmap: Fix wired memory leak in booke64 page directories

Properly handle reference counts in the 64-bit pmap page directories.
Otherwise all page table pages would leak due to over-referencing. This
would cause a quick enter to swap on a desktop system (AmigaOne X5000) when
quitting and rerunning applications, or just building world.

Add an INVARIANTS check to validate no leakage at pmap release time.

Prevent TCP Cubic to abruptly increase cwnd after slow-start

Introducing flags to track the initial Wmax dragging and exit
from slow-start in TCP Cubic. This prevents sudden jumps in the
caluclated cwnd by cubic, especially when the flow is application
limited during slow start (cwnd can not grow as fast as expected).
The downside is that cubic may remain slightly longer in the
concave region before starting the convex region beyond Wmax again.

Reviewed by: chengc_netapp.com, tuexen (mentor)
Approved by: tuexen (mentor), rgrimes (mentor, blanket)
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D23655

Merge bmake-20200606

Relevant items from ChangeLog:

o dir.c: cached_stats - don't confuse stat and lstat results.
o var.c: add :Or for reverse sort.

Fix boot of wandquad after DTS update

In the recent dts sync the name of the aips-bus@ changed to bus@. Reflect
this change and add an additional OF_finddevice in fix_fdt_interrupt_data()
and in fix_fdt_iomuxc_data() with bus@ only. Iow, keep the old naming for
compatibility.

Discussed with: ian@

To reduce the size of an rb_node, drop the color field. Set the least
significant bit in the pointer to the node from its parent to indicate
that the node is red. Have the tree rotation macros leave the
old-parent/new-child node red and the new-parent/old-child node black.

This change makes RB_LEFT and RB_RIGHT no longer assignable, and
RB_COLOR no longer defined. Any code that modifies the tree or
examines a node color would have to be modified after this change.

Reviewed by: markj
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D25105

iflib: netmap: honor netmap_irx_irq return values

In the receive interrupt routine, always call netmap_rx_irq().
The latter function will return != NM_IRQ_PASS if netmap is not
active on that specific receive queue, so that the driver can go
on with iflib_rxeof(). Note that netmap supports partial opening,
where only a subset of the RX or TX rings can be open in netmap mode.
Checking the IFCAP_NETMAP flag is not enough to make sure that the
queue is indeed in netmap mode.
Moreover, in case netmap_rx_irq() returns NM_IRQ_RESCHED, it means
that netmap expects the driver to call netmap_rx_irq() again as soon
as possible. Currently, this may happen when the device is attached
to a VALE switch.

Reviewed by: gallatin
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D25167

release: amd64 efi boot name is bootx64

efi_boot_name is just used for arm image so no harm done.

Reported by: gonzo
MFC after: 3 days

libusb: improve compatibility

Specifically, add LIBUSB_CLASS_PHYSICAL and the libusb_has_capability API.
Descriptions and functionality for these derived from the
documentation at [0]. The current set of capabilities are all supported by
libusb.

These were detected as missing after updating net/freerdp to 2.1.1, which
attempted to use both.

[0] http://libusb.sourceforge.net/api-1.0/group__libusb__misc.html

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25194

Similar to UART on ThunderX2, the ARM Coresight (ETM component)
set ResourceProducer on memory resources: ignore it.

Tested on ARM N1SDP board.

Sponsored by: DARPA, AFRL

Refactor ptrace() ABI compatibility.

Add a freebsd32_ptrace() and move as many freebsd32 shims as possible
to freebsd32_ptrace(). Aside from register sets, freebsd32 passes
pointers to native structures to kern_ptrace() and converts to/from
native/32-bit structure formats in freebsd32_ptrace() outside of
kern_ptrace().

Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25195

ARM Embedded Trace Macrocell v4.x driver:
o Split-out FDT attachment to a separate file;
o Add ACPI attachment.

Sponsored by: DARPA, AFRL

Fix style: wrap long lines.

Sponsored by: DARPA, AFRL

Rename coresight drivers: use underscores in filenames.

Sponsored by: DARPA, AFRL

Assert on pg_jobc state.

Stolen from NetBSD.

vm: rework swap_pager_status to execute in constant time

The lock-protected iteration is trivially avoidable.

This removes a serialisation point from Linux binaries (which end up calling
here from the sysinfo syscall).

coufreq_dt: Rename DEBUG to DPRINTF

DEBUG is a kernel configuration flag and if used cpufreq_dt.c will fail the
build of kernel.

PR: 246867
Submitted by: Oskar Holmund (oskar.holmlund@ohdata.se)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25080

ps: remove xo_no_setlocale() call

Apparently libxo was fixed to do the right thing on FreeBSD,
and calling xo_no_setlocale() is no longer needed.

Reported by: phil

Post CVE-2020-12695 cleanup patch:

Resolve a Linuxism to fix the build.

MFC after: 3 days
X-MFC with: r361957, r361958, r361959

MFV r361938:

Upstream commit message:

[PATCH 3/3] WPS UPnP: Handle HTTP initiation failures for events more
properly

While it is appropriate to try to retransmit the event to another
callback URL on a failure to initiate the HTTP client connection, there
is no point in trying the exact same operation multiple times in a row.
Replve the event_retry() calls with event_addr_failure() for these cases
to avoid busy loops trying to repeat the same failing operation.

These potential busy loops would go through eloop callbacks, so the
process is not completely stuck on handling them, but unnecessary CPU
would be used to process the continues retries that will keep failing
for the same reason.

Obtained from: https://w1.fi/security/2020-1/\
0003-WPS-UPnP-Handle-HTTP-initiation-failures-for-events-.patch
MFC after: 3 days
Security: VU#339275 and CVE-2020-12695

MFV r361937:

Upstream commit message:

[PATCH 2/3] WPS UPnP: Fix event message generation using a long URL path

More than about 700 character URL ended up overflowing the wpabuf used
for building the event notification and this resulted in the wpabuf
buffer overflow checks terminating the hostapd process. Fix this by
allocating the buffer to be large enough to contain the full URL path.
However, since that around 700 character limit has been the practical
limit for more than ten years, start explicitly enforcing that as the
limit or the callback URLs since any longer ones had not worked before
and there is no need to enable them now either.

Obtained from: https://w1.fi/security/2020-1/\
0002-WPS-UPnP-Fix-event-message-generation-using-a-long-U.patch
MFC after: 3 days
Security: VU#339275 and CVE-2020-12695

MFV r361936:

Upstream commit message:

[PATCH 1/3] WPS UPnP: Do not allow event subscriptions with URLs to
other networks

The UPnP Device Architecture 2.0 specification errata ("UDA errata
16-04-2020.docx") addresses a problem with notifications being allowed
to go out to other domains by disallowing such cases. Do such filtering
for the notification callback URLs to avoid undesired connections to
external networks based on subscriptions that any device in the local
network could request when WPS support for external registrars is
enabled (the upnp_iface parameter in hostapd configuration).

Obtained from: https://w1.fi/security/2020-1/\
0001-WPS-UPnP-Do-not-allow-event-subscriptions-with-URLs-.patch
MFC after: 3 days
Security: VU#339275 and CVE-2020-12695

Fix a bug where XU_NGROUPS + 1 groups might be copied.

r361780 fixed the code so that it would only remove the duplicate when
it actually existed. However, that might have resulted in XU_NGROUPS + 1
groups being copied, running off the end of the array. This patch fixes
the problem.

Spotted during code inspection for other mountd changes.

MFC after: 2 weeks

Import bmake-20200606

Relevant items from ChangeLog:

o dir.c: cached_stats - don't confuse stat and lstat results.
o var.c: add :Or for reverse sort.

Stop computing a "sharedram" value when emulating Linux sysinfo(2).

The previous code was computing an incorrect value in a very expensive
manner.  "sharedram" is supposed to be the amount of memory used by
named swap objects, which on FreeBSD basically corresponds to memory
usage by shared memory objects (including, for example, GEM objects) and
tmpfs.  We currently have no cheap way to count such pages.  The
previous code tried to determine the number of copy-on-write pages
shared between processes.

Just replace the computed value with 0.  illumos reportedly does the
same thing.  Linux itself did not populate this field until a 2014
commit, "mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces".

Reported by: mjg
MFC after: 1 week

virtio: Support non-legacy network device and queue

The non-legacy interface always defines num_buffers in the header,
regardless of whether VIRTIO_NET_F_MRG_RXBUF, just leaving it unused. We
also need to ensure our virtqueue doesn't filter out VIRTIO_F_VERSION_1
during negotiation, as it supports non-legacy transports just fine. This
fixes network packet transmission on TinyEMU.

Reviewed by: br, brooks (mentor), jhb (mentor)
Approved by: br, brooks (mentor), jhb (mentor)
Differential Revision: https://reviews.freebsd.org/D25132

virtio_mmio: Negotiate the upper half of the feature bits too

The feature bits are exposed as a 32-bit register with 2 banks, so we
should negotiate both halves. Notably, VIRTIO_F_VERSION_1 is in the
upper half, and will be used in an upcoming commit.

The PCI bus driver also has this bug, but the legacy BAR layout did not
include selector registers and is rather different from the modern
layout, so it remains solely as legacy.

Reviewed by: br, brooks (mentor), jhb (mentor)
Approved by: br, brooks (mentor), jhb (mentor)
Differential Revision: https://reviews.freebsd.org/D25131

Use Fl instead of Ar for long flags

Also, bump date after r361935.

MFC after: 1 week

Implement zero-copy iSCSI target transmission/read.

Add ICL_NOCOPY flag to icl_pdu_append_data(), specifying that the method
can just reference the data buffer instead of immediately copying it.

Extend the offload KPI with optional PDU queue method, allowing to specify
completion callback, called when all the data referenced by above has been
transferred and won't be accessed any more (the buffers can be freed).

Implement the above functionality in software iSCSI driver using mbufs
with external storage and reference counter.  Note that some NICs (ixl(4))
may keep the mbuf in TX queue for a long time, so CTL has to be ready.

Add optional method to struct ctl_scsiio for buffer reference counting.
Implement it for CTL block backend, allowing to delay free of the struct
ctl_be_block_io and memory it references as needed.  In first reincarnation
of the patch I tried to delay whole I/O as it is done for FibreChannel,
that was cleaner, but due to the above callback delays I had to rewrite
it this way to not leave LUN referenced potentially for hours or more.

All together on sequential read from ZFS ARC this saves about 30% of CPU
time and memory bandwidth by avoiding one of 3 memory copies (the other
two are from ZFS ARC to DMU cache and then from DMU cache to CTL buffers).
On tests with 2x Xeon Silver 4114 this allows to reach full line rate of
100GigE NIC.  Tests with Gold CPUs and two 100GigE NICs are stil TBD,
but expectations to saturate them are pretty high. ;)

Discussed with: Chelsio
Sponsored by: iXsystems, Inc.

Upstream commit message:

[PATCH 3/3] WPS UPnP: Handle HTTP initiation failures for events more
properly

While it is appropriate to try to retransmit the event to another
callback URL on a failure to initiate the HTTP client connection, there
is no point in trying the exact same operation multiple times in a row.
Replve the event_retry() calls with event_addr_failure() for these cases
to avoid busy loops trying to repeat the same failing operation.

These potential busy loops would go through eloop callbacks, so the
process is not completely stuck on handling them, but unnecessary CPU
would be used to process the continues retries that will keep failing
for the same reason.

Obtained from: https://w1.fi/security/2020-1/\
0003-WPS-UPnP-Handle-HTTP-initiation-failures-for-events-.patch
Security: VU#339275 and CVE-2020-12695

Upstream commit message:

[PATCH 2/3] WPS UPnP: Fix event message generation using a long URL path

More than about 700 character URL ended up overflowing the wpabuf used
for building the event notification and this resulted in the wpabuf
buffer overflow checks terminating the hostapd process. Fix this by
allocating the buffer to be large enough to contain the full URL path.
However, since that around 700 character limit has been the practical
limit for more than ten years, start explicitly enforcing that as the
limit or the callback URLs since any longer ones had not worked before
and there is no need to enable them now either.

Obtained from: https://w1.fi/security/2020-1/\
0002-WPS-UPnP-Fix-event-message-generation-using-a-long-U.patch
Security: VU#339275 and CVE-2020-12695

Upstream commit message:

[PATCH 1/3] WPS UPnP: Do not allow event subscriptions with URLs to
other networks

The UPnP Device Architecture 2.0 specification errata ("UDA errata
16-04-2020.docx") addresses a problem with notifications being allowed
to go out to other domains by disallowing such cases. Do such filtering
for the notification callback URLs to avoid undesired connections to
external networks based on subscriptions that any device in the local
network could request when WPS support for external registrars is
enabled (the upnp_iface parameter in hostapd configuration).

Obtained from: https://w1.fi/security/2020-1/\
0001-WPS-UPnP-Do-not-allow-event-subscriptions-with-URLs-.patch
MFC after: 1 week
Security: VU#339275 and CVE-2020-12695

Add VHDX support to mkimg(1)

VHDX is the successor of Microsoft's VHD file format. It increases
maximum capacity of the virtual drive to 64TB and introduces features
to better handle power/system failures.

VHDX is the required format for 2nd generation Hyper-V VMs.

Reviewed by: marcel
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D25184