FreeBSD/FreeBSD.git
3 years agofhlink(2): the syscalls do not take flag
Konstantin Belousov [Sun, 28 Feb 2021 00:38:11 +0000 (02:38 +0200)]
fhlink(2): the syscalls do not take flag

(cherry picked from commit 600756afb532a86a39fb488f5c4fc7e248921655)

3 years agoMake kern.timecounter.hardware tunable
Konstantin Belousov [Sun, 7 Mar 2021 23:50:12 +0000 (01:50 +0200)]
Make kern.timecounter.hardware tunable

(cherry picked from commit 56b9bee63a42dbac712acf540f23a4c3dbd099a9)

3 years agoHyper-V: hn: Relinquish cpu in HN_LOCK to avoid deadlock
Wei Hu [Thu, 15 Oct 2020 11:44:28 +0000 (11:44 +0000)]
Hyper-V: hn: Relinquish cpu in HN_LOCK to avoid deadlock

The try lock loop in HN_LOCK keeps the thread spinning on the cpu if the lock
is not available. This can cause a deadlock if the thread holding
the lock is sleeping. Relinquish the cpu to work around this problem, even
though it doesn't completely solve the issue. Priority inversion could still
cause a livelock, however unlikely that is. A more complete
solution may be needed in the future.

Reported by: Microsoft, Netapp
MFC after: 2 weeks
Sponsored by: Microsoft

(cherry picked from commit b3460f44524b145c6c8a760ebe65052560a810bf)
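
The workaround described above can be sketched in user space. This is only an analogue under stated assumptions: it uses POSIX pthread_mutex_trylock() and sched_yield() rather than the kernel primitives the driver uses, and the name lock_with_yield() is hypothetical.

```c
#include <pthread.h>
#include <sched.h>

/*
 * Hypothetical user-space analogue of a try-lock loop that gives up the
 * CPU between attempts instead of spinning. Returns the number of failed
 * attempts before the lock was acquired.
 */
static int
lock_with_yield(pthread_mutex_t *m)
{
	int failed = 0;

	while (pthread_mutex_trylock(m) != 0) {
		failed++;
		sched_yield();	/* let the current lock holder run */
	}
	return (failed);
}
```

As the commit message notes, yielding only makes the problem less likely; a higher-priority spinner can still starve the lock holder.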

3 years agoHyper-V: pcib: Check revoke status during device attach
Wei Hu [Thu, 15 Oct 2020 05:57:20 +0000 (05:57 +0000)]
Hyper-V: pcib: Check revoke status during device attach

It is possible that the vmbus pcib channel is revoked during the attach path.
The attach path could be waiting for a response from the host, and this response will never
arrive since the channel has already been revoked from the host's point of view. Check
for this situation while waiting for completion and return failure if it happens.

Reported by: Netapp
MFC after: 2 weeks
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D26486

(cherry picked from commit 75c2786c25fef9a6f8239c9fc1631cd17756579b)

3 years agoHyper-V: storvsc: Enhance srb_status code handling.
Wei Hu [Mon, 31 Aug 2020 09:05:45 +0000 (09:05 +0000)]
Hyper-V: storvsc: Enhance srb_status code handling.

In hv_storvsc_io_request() when coring, prevent changing of the send channel
from the base channel to another one. storvsc_poll always probes on the base
channel.

Based upon conversations with Microsoft, changed the handling of srb_status
codes. Most of them we should never get, though some we can. All are treated
as retry-able except for two. We should not get these statuses, but if we ever
do, the I/O state is not known.

Submitted by: Alexander Sideropoulos <Alexander.Sideropoulos@netapp.com>
Reviewed by: trasz, allanjude, whu
MFC after: 1 week
Sponsored by: Netapp Inc
Differential Revision: https://reviews.freebsd.org/D25756

(cherry picked from commit 2a0ce39d086ffe13782c9dc1e24bb240abbe790a)

3 years agohyperv/vmbus: Fix the wrong size in ndis_offload structure
Wei Hu [Tue, 9 Jul 2019 08:21:14 +0000 (08:21 +0000)]
hyperv/vmbus: Fix the wrong size in ndis_offload structure

Submitted by: whu
MFC after: 2 weeks
Sponsored by: Microsoft

(cherry picked from commit 23a499203c5b3fd74f781d1383269a6e168588cf)

3 years agohyperv/vmbus: Update VMBus version 4.0 and 5.0 support.
Wei Hu [Tue, 9 Jul 2019 07:24:18 +0000 (07:24 +0000)]
hyperv/vmbus: Update VMBus version 4.0 and 5.0 support.

Add VMBus protocol versions 4.0 and 5.0 to support Windows 10 and newer HyperV hosts.

For VMBus 4.0 and newer HyperV, the netvsc gpadl teardown must be done after vmbus close.

Submitted by: whu
MFC after: 2 weeks
Sponsored by: Microsoft

(cherry picked from commit ace5ce7e701e5a98c23f820d4f126e5c265aa667)

3 years agoPrevent framebuffer mmio space from being allocated to other devices on HyperV.
Wei Hu [Thu, 30 Jul 2020 07:26:11 +0000 (07:26 +0000)]
Prevent framebuffer mmio space from being allocated to other devices on HyperV.

On Gen2 VMs, Hyper-V provides mmio space for framebuffer.
This mmio address range is not useable for other PCI devices.
Currently only efifb driver is using this range without reserving
it from system.
Therefore, vmbus driver reserves it before any other PCI device
drivers start to request mmio addresses.

PR: 222996
Submitted by: weh@microsoft.com
Reported by: dmitry_kuleshov@ukr.net
Reviewed by: decui@microsoft.com
Sponsored by: Microsoft

(cherry picked from commit c565776195f2f2b62427af07f6b1a9b7670cbc1f)

3 years agoMake software iSCSI more configurable.
Alexander Motin [Thu, 28 Jan 2021 20:53:49 +0000 (15:53 -0500)]
Make software iSCSI more configurable.

Move software iSCSI tunables/sysctls into kern.icl.soft subtree.
Replace several hardcoded length constants there with variables.

While there, stretch the limits to better match Linux' open-iscsi
and our own initiator with new MAXPHYS of 1MB.  Our CTL target is
also optimized for up to 1MB I/Os, so there is also a match now.
For Windows 10 and VMware 6.7 initiators at default settings it
should make no change, since previous limits were sufficient there.

Tests of QD1 1MB writes from FreeBSD over 10GigE link show throughput
increase by 29% on idle connection and 132% with concurrent QD8 reads.

MFC after: 3 days
Sponsored by: iXsystems, Inc.

(cherry picked from commit b75168ed24ca74f65929e5c57d4fed5f0ab08f2a)

3 years agoMove ic_check_send_space clear to the actual check.
Alexander Motin [Wed, 3 Mar 2021 20:21:26 +0000 (15:21 -0500)]
Move ic_check_send_space clear to the actual check.

This closes a tiny race in which the flag could be set between being cleared
and the space being checked, which would create some extra work for us.  The
flag setting is protected by both locks, so we can clear it in either
place, but in between both locks are dropped.

MFC after: 1 week

(cherry picked from commit afc3e54eeee635a525c88e4678cc38e3219302c3)

3 years agoRestore condition removed in df3747c6607b.
Alexander Motin [Wed, 3 Mar 2021 16:58:04 +0000 (11:58 -0500)]
Restore condition removed in df3747c6607b.

I think it allowed us to avoid some TX thread wakeups while the socket
buffer is full.  But add another option there: if ic_check_send_space
is set, which means the socket just reported that new space appeared,
it may make sense to pull more data from ic_to_send for better TX
coalescing.

MFC after: 1 week

(cherry picked from commit aff9b9ee894e3e6b6d8c7e4182d6b973804df853)

3 years agoReplace STAILQ_SWAP() with simpler STAILQ_CONCAT().
Alexander Motin [Tue, 2 Mar 2021 23:39:44 +0000 (18:39 -0500)]
Replace STAILQ_SWAP() with simpler STAILQ_CONCAT().

Also remove a stray STAILQ_REMOVE_AFTER(), which caused no problems only
because STAILQ_SWAP() fixed the corrupted stqh_last.

MFC after: 1 week

(cherry picked from commit df3747c6607be12d48db825653e6adfc3041e97f)
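
For readers unfamiliar with the macro, here is a minimal sketch of STAILQ_CONCAT() from <sys/queue.h>. The struct pdu element type and the helper names are hypothetical and are not taken from the iSCSI code.

```c
#include <stddef.h>
#include <sys/queue.h>

struct pdu {			/* hypothetical queue element */
	int id;
	STAILQ_ENTRY(pdu) next;
};
STAILQ_HEAD(pdu_queue, pdu);

/*
 * Move every element of 'src' to the tail of 'dst' in constant time.
 * STAILQ_CONCAT() leaves 'src' empty and reinitialized.
 */
static void
drain_into(struct pdu_queue *dst, struct pdu_queue *src)
{
	STAILQ_CONCAT(dst, src);
}

/* Count the elements of a queue; used only to inspect the result. */
static int
queue_len(struct pdu_queue *q)
{
	struct pdu *p;
	int n = 0;

	STAILQ_FOREACH(p, q, next)
		n++;
	return (n);
}
```

When the destination queue is known to be empty, concatenating into it moves the whole list just as a swap would, which is presumably why the simpler macro suffices here.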

3 years agoFix initiator panic after 6895f89fe54e.
Alexander Motin [Tue, 2 Mar 2021 21:07:22 +0000 (16:07 -0500)]
Fix initiator panic after 6895f89fe54e.

There are sessions without socket that are not disconnecting yet.

MFC after: 3 weeks

(cherry picked from commit 06e9c710998b83a3be21f7f264187fff5d590bc3)

3 years agoOptimize TX coalescing by keeping pointer to last mbuf.
Alexander Motin [Tue, 2 Mar 2021 04:31:34 +0000 (23:31 -0500)]
Optimize TX coalescing by keeping pointer to last mbuf.

Previously, m_cat() traversed the entire coalesced chain each time.

MFC after: 1 week

(cherry picked from commit b85a67f54a40053e75658a17c620b89bafaba67f)
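
The idea of caching the last element can be illustrated with a plain linked list. This is a simplified sketch: struct buf stands in for an mbuf, and all names are hypothetical rather than taken from the iSCSI code.

```c
#include <stddef.h>

struct buf {			/* hypothetical stand-in for an mbuf */
	struct buf *next;
	int len;
};

struct chain {
	struct buf *head;
	struct buf *tail;	/* cached pointer to the last element */
	int total;		/* total bytes in the chain */
};

/*
 * Append in O(1) by following the cached tail pointer instead of walking
 * the whole chain from the head on every call.
 */
static void
chain_append(struct chain *c, struct buf *b)
{
	b->next = NULL;
	if (c->tail == NULL)
		c->head = b;
	else
		c->tail->next = b;
	c->tail = b;
	c->total += b->len;
}
```

Without the cached tail, appending n buffers costs O(n^2) link traversals in total; with it, the cost is O(n).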

3 years agoOptimize out few extra memory accesses.
Alexander Motin [Mon, 1 Mar 2021 23:35:45 +0000 (18:35 -0500)]
Optimize out few extra memory accesses.

MFC after: 1 week

(cherry picked from commit a59e2982fe3e6339629cc77fe9d349d60e03a05e)

3 years agoMicro-optimize OOA queue processing.
Alexander Motin [Sat, 27 Feb 2021 15:14:05 +0000 (10:14 -0500)]
Micro-optimize OOA queue processing.

- Move ctl_get_cmd_entry() calls from every OOA traversal to when
  the requests are first inserted, storing seridx in struct ctl_scsiio.
- Move some checks out of the loop in ctl_check_ooa().
- Replace checks for errors that can not happen with asserts.
- Transpose ctl_serialize_table, so that any OOA traversal accesses
  only one row (cache line).  Compact it from enum to uint8_t.
- Optimize static branch predictions in hottest places.

Due to O(n) nature on deep LUN queues this can be the hottest code
path in CTL, and additional 20% of IOPS I see in some 4KB I/O tests
are good to have in reserve.  About 50% of CPU time here according
to the profiles is now spent in two memory accesses per traversed
request in OOA.

Sponsored by: iXsystems, Inc.
MFC after: 2 weeks

(cherry picked from commit 9d9fd8b79f0ebe59f791c8225fa01ab59858b7b5)

3 years agoCoalesce socket reads in software iSCSI.
Alexander Motin [Mon, 22 Feb 2021 17:23:35 +0000 (12:23 -0500)]
Coalesce socket reads in software iSCSI.

Instead of 2-4 socket reads per PDU this can do as low as one read
per megabyte, dramatically reducing TCP overhead and lock contention.

With this on iSCSI target I can write more than 4GB/s through a
single connection.

MFC after: 1 month

(cherry picked from commit 6895f89fe54e0858aea70d2bd2a9651f45d7998e)

3 years agoFix build after 2c7dc6bae9fd.
Alexander Motin [Sun, 21 Feb 2021 22:21:14 +0000 (17:21 -0500)]
Fix build after 2c7dc6bae9fd.

MFC after: 1 month

(cherry picked from commit c02a28754bc229c05e8baf9b6632cbd59bc73e48)

3 years agoRefactor CTL datamove KPI.
Alexander Motin [Sun, 21 Feb 2021 21:45:14 +0000 (16:45 -0500)]
Refactor CTL datamove KPI.

 - Make frontends call unified CTL core method ctl_datamove_done()
to report move completion.  This reduces code duplication
in different backends by accounting DMA time in common code.
 - Add to ctl_datamove_done() and be_move_done() callback samethr
argument, reporting whether the callback is called in the same
context as ctl_datamove().  This lets some cases, such as iSCSI
write with immediate data or camsim frontend write, save one context
switch, since we know that the context is sleepable.
 - Remove data_move_done() methods from struct ctl_backend_driver,
unused since forever.

MFC after:  1 month

(cherry picked from commit 2c7dc6bae9fd5c2fa0a65768df8e4e99c2f159f1)

3 years agoMicrooptimize CTL I/O queues.
Alexander Motin [Fri, 19 Feb 2021 20:42:57 +0000 (15:42 -0500)]
Microoptimize CTL I/O queues.

Switch OOA queue from TAILQ to LIST and change its direction, so that
we traverse it forward, not backward.  There is only one place where
we really need other direction, and it is not critical.

Use STAILQ_REMOVE_HEAD() instead of STAILQ_REMOVE() in backends.

Replace few impossible conditions with assertions.

MFC after: 1 month

(cherry picked from commit 05d882b780f5be2da6f3d3bfef9160aacc4888d6)

3 years agoSave context switch per I/O for iSCSI and IOCTL frontends.
Alexander Motin [Fri, 19 Feb 2021 03:07:32 +0000 (22:07 -0500)]
Save context switch per I/O for iSCSI and IOCTL frontends.

Introduce new CTL core KPI ctl_run(), preprocessing I/Os in the caller
context instead of scheduling another thread just for that.  This call
may sleep, which is not acceptable for some frontends like the original
CAM/FC one, but iSCSI already has separate sleepable per-connection RX
threads, and another thread scheduling is mostly just a waste of time.
IOCTL frontend actually waits for the I/O completion in the caller
thread, so the use of another thread for this makes even less sense.

With this change I can measure ~5% IOPS improvement on 4KB iSCSI I/Os
to ZFS.

MFC after: 1 month

(cherry picked from commit 812c9f48a2b7bccc31b2a6077b299822357832e4)

3 years agoMove XPT_IMMEDIATE_NOTIFY handling out of periph lock.
Alexander Motin [Thu, 18 Feb 2021 21:22:01 +0000 (16:22 -0500)]
Move XPT_IMMEDIATE_NOTIFY handling out of periph lock.

It is rare, but it is still better not to have lock dependencies.

MFC after: 1 month

(cherry picked from commit c67a2909a629db138227993e1093e66bb6c00af5)

3 years agoqat.4: Fix some firmware module names
Mark Johnston [Wed, 3 Mar 2021 14:07:53 +0000 (09:07 -0500)]
qat.4: Fix some firmware module names

PR: 252984

(cherry picked from commit 3adf72a36b9b151eef57e3d83f71a3a9fbacb78d)

3 years agoPartially revert libcxxrt changes to avoid _Unwind_Exception change
Dimitry Andric [Wed, 10 Mar 2021 21:31:40 +0000 (22:31 +0100)]
Partially revert libcxxrt changes to avoid _Unwind_Exception change

After the recent cherry-picking of libcxxrt commits 0ee0dbfb0d26 and
d2b3fadf2db5, users reported that editors/libreoffice packages from the
official package builders did not start anymore. It turns out that the
combination of these commits subtly changes the ABI, requiring all
applications that depend on internal details of struct _Unwind_Exception
(available via unwind-arm.h and unwind-itanium.h) to be recompiled.

However, the FreeBSD package builders always use -RELEASE jails, so
these still use the old declaration of struct _Unwind_Exception, which
is not entirely compatible. In particular, LibreOffice uses this struct
in its internal "uno bridge" component, where it attempts to set up its
own exception handling mechanism.

To fix this incompatibility, go back to the old declarations of struct
_Unwind_Exception, and restore the __LP64__ specific workaround we had
in place before (which was to cope with yet another, older ABI bug).

Effectively, this reverts upstream libcxxrt commits 88bdf6b290da
("Specify double-word alignment for ARM unwind") and b96169641f79
("Updated Itanium unwind"), and reapplies our commit 3c4fd2463bb2
("libcxxrt: add padding in __cxa_allocate_* to fix alignment").

PR: 253840

3 years agoThe list of ports in configuration path shall be protected by locks,
Gleb Smirnoff [Tue, 8 Dec 2020 16:46:00 +0000 (16:46 +0000)]
The list of ports in configuration path shall be protected by locks,
epoch shall be used only for fast path.  Thus use LAGG_XLOCK() in
lagg_[un]register_vlan.  This fixes sleeping in epoch panic.

PR: 240609
(cherry picked from commit e1074ed6a08033ee571b4bedb3ffe6049a4a7361)

3 years agoBuild lib/msun tests with compiler builtins disabled
Dimitry Andric [Tue, 23 Feb 2021 20:03:32 +0000 (21:03 +0100)]
Build lib/msun tests with compiler builtins disabled

This forces the compiler to emit calls to libm functions, instead of
possibly substituting pre-calculated results at compile time, which
should help to actually test those functions.

Reviewed by: emaste, arichardson, ngie
Differential Revision: https://reviews.freebsd.org/D28577

(cherry picked from commit cf97d2a1dab8f2cddc4466fe64d37818339c73be)

riscv: Add a soft-float implementation of fabs()

We could just use a C implementation using __builtin_fabs(), but using
this assembly version guarantees that there is no additional prolog/epilog
code. Additionally, clang generates worse code for masking off the top bit
than GCC: https://bugs.llvm.org/show_bug.cgi?id=49377.

This fixes the RISCV64 softfloat world build after cf97d2a1dab8. That commit
added -fno-builtin to the msun tests which resulted in the first references to
fabs (previously the compiler inlined all calls).

Reviewed By: dim
Reported by: mjg
Differential Revision: https://reviews.freebsd.org/D28994

(cherry picked from commit 524b018d200408bed5eb0d2b892db5b9fb46808b)

riscv: Fix whitespace issues in fabs added in 524b018d2004

(cherry picked from commit 066dab17e7a4a78d43dbcef8119960ddc8090a73)

3 years agoipfw: add IPv6 support for sockarg opcode.
Andrey V. Elsukov [Tue, 2 Mar 2021 09:45:59 +0000 (12:45 +0300)]
ipfw: add IPv6 support for sockarg opcode.

Sponsored by: Yandex LLC

(cherry picked from commit a9f7eba9597189c0e438f6986067d31dca1c53b0)

3 years agoDo not exit ctl_be_block_worker() prematurely.
Alexander Motin [Sat, 6 Mar 2021 03:39:52 +0000 (22:39 -0500)]
Do not exit ctl_be_block_worker() prematurely.

Returning while there are any I/Os in a queue may leave them stuck
indefinitely, since there is only one taskqueue task for all of them.
I think I've reproduced this by switching ha_role to secondary under
heavy load.

MFC after: 3 days

(cherry picked from commit 6ed39db2573bb808ac2c206cd6c831f0be86219c)

3 years agoMove back the isa non-PNP driver deadline to FreeBSD 14.
Warner Losh [Mon, 8 Mar 2021 22:59:48 +0000 (15:59 -0700)]
Move back the isa non-PNP driver deadline to FreeBSD 14.

(cherry picked from commit 6ffdaa5f2d4f0881557f64dabf61fb57541e0fba)

3 years ago[PowerPC] Allow traversal of oversize OF properties.
Brandon Bergren [Fri, 13 Nov 2020 16:49:41 +0000 (16:49 +0000)]
[PowerPC] Allow traversal of oversize OF properties.

In standards such as LoPAPR, property names in excess of the usual 31
characters exist.

This breaks property traversal.

While in IEEE 1275-1994, nextprop is defined explicitly to work with a
32-byte region of memory, using a larger buffer should be fine. There is
actually no way to pass a buffer length to the nextprop call in the OF
client interface, so SLOF actually just blindly overflows the buffer.

So we have to defensively make the buffer larger, to avoid memory
corruption when reading out long properties on live OF systems.

Note also that on real-mode OF, things are pretty tight because we are
allocating against a static bounce buffer in low memory, so we can't just
use a huge buffer to work around this without it being wasteful of our
limited amount of 32-bit physical memory.

This allows a patched ofwdump to operate properly on SLOF (i.e. pseries)
systems, as well as any other PowerPC systems with overlength properties.

Reviewed by: jhibbits
Sponsored by: Tag1 Consulting, Inc.
Differential Revision: https://reviews.freebsd.org/D26669

(cherry picked from commit 26869ad14c70306313405029229a1e2fd94510cd)

3 years ago[PowerPC64] Fix multiple issues in fpsetmask().
Brandon Bergren [Mon, 1 Mar 2021 02:35:53 +0000 (20:35 -0600)]
[PowerPC64] Fix multiple issues in fpsetmask().

Building R exposed a problem in fpsetmask() whereby we were not properly
clamping the provided mask to the valid range.

R initializes the mask by calling fpsetmask(~0) on FreeBSD. Since we
recently enabled precise exceptions, this was causing an immediate
SIGFPE because we were attempting to set invalid bits in the fpscr.

Properly limit the range of bits that can be set via fpsetmask().

While here, use the correct fp_except_t type instead of fp_rnd_t.

Reported by: pkubaj (in IRC)
Sponsored by: Tag1 Consulting, Inc.

(cherry picked from commit dd95b39235dd81c890aa3cce02a5bb7f91f23803)
(cherry picked from commit a79735386c46298274d71577ab6b4dd00be261cc)
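
The clamping step can be sketched as follows. The field position, the constants and the names set_mask() and ex_mask_t are assumptions made for illustration; the real code operates on the fpscr and uses fp_except_t.

```c
#include <stdint.h>

/* Assumed layout for this sketch: five enable bits held in bits 3..7. */
#define	EX_ENABLE_SHIFT	3
#define	EX_ENABLE_MASK	0x1fu

typedef uint32_t ex_mask_t;	/* hypothetical stand-in for fp_except_t */

/*
 * Install 'mask' into the enable field of 'reg', clamping it to the valid
 * bits first so that a caller passing ~0 cannot touch unrelated bits.
 * Returns the previous mask; the new register value is stored in *out.
 */
static ex_mask_t
set_mask(uint32_t reg, ex_mask_t mask, uint32_t *out)
{
	ex_mask_t old = (reg >> EX_ENABLE_SHIFT) & EX_ENABLE_MASK;

	mask &= EX_ENABLE_MASK;		/* drop bits outside the field */
	reg &= ~((uint32_t)EX_ENABLE_MASK << EX_ENABLE_SHIFT);
	reg |= (uint32_t)mask << EX_ENABLE_SHIFT;
	*out = reg;
	return (old);
}
```

With the clamp in place, a call equivalent to fpsetmask(~0) enables every exception in the field and leaves the remaining register bits unchanged.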

3 years ago[PowerPC] [PowerPCSPE] Fix multiple issues in fpsetmask().
Brandon Bergren [Mon, 1 Mar 2021 03:06:59 +0000 (21:06 -0600)]
[PowerPC] [PowerPCSPE] Fix multiple issues in fpsetmask().

Building R on powerpc64 exposed a problem in fpsetmask() whereby we
were not properly clamping the provided mask to the valid range.

This same issue affects powerpc and powerpcspe.

Properly limit the range of bits that can be set via fpsetmask().

While here, use the correct fp_except_t type instead of fp_rnd_t.

Reported by: pkubaj, jhibbits (in IRC)
Sponsored by: Tag1 Consulting, Inc.

(cherry picked from commit 384ee7cc6e9e4ddc91a6e9e623fcbbe5826bce38)
(cherry picked from commit 8b96d6ac04e7e761ec6b9eff47c801a2b89fbd6d)

3 years ago[PowerPC] Fix SPE floating point environment manipulation
Brandon Bergren [Thu, 12 Dec 2019 17:12:18 +0000 (17:12 +0000)]
[PowerPC] Fix SPE floating point environment manipulation

Fix multiple problems in the powerpcspe floating point code.

* Endianness handling of the SPEFSCR in fenv.h was completely broken.
* Ensure SPEFSCR synchronization requirements are being met.

The __r.__d -> __r transformations were written by jhibbits.

Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D22526

(cherry picked from commit 4f9ed3156c3aff08629d37c8a89ed5ba525b01c9)

3 years agoFix diroffdiroff, probably copy/paste bug.
Alexander Motin [Sun, 28 Feb 2021 14:07:13 +0000 (09:07 -0500)]
Fix diroffdiroff, probably copy/paste bug.

Too long name looks bad in `vmstat -m`.

MFC after: 1 week

(cherry picked from commit d01032736cf067d63e66d6428ffc08e47652600f)

3 years agolibkvm: Plug couple of memory leaks and check possible calloc(3) failure
Jung-uk Kim [Wed, 3 Mar 2021 23:10:00 +0000 (18:10 -0500)]
libkvm: Plug couple of memory leaks and check possible calloc(3) failure

First, r204494 introduced dpcpu_off in struct __kvm and it was allocated
from _kvm_dpcpu_init() but it was not free(3)'ed from kvm_close(3).
Second, r291406 introduced kvm_nlist2(3) and converted kvm_nlist(3) to
use the new function but it did not free the temporary buffer.
Also, check possible calloc(3) failure while I am in the neighborhood.

Differential Revision: https://reviews.freebsd.org/D29019

(cherry picked from commit 645eaa2ccaed6eea801d07d6a092974fc1713896)
(cherry picked from commit 483c6da3a20b2064cd655f7cb19e6b98dee677ff)

3 years agoarcmsr(4): Fixed no action of hot plugging device on type_F adapter.
Xin LI [Wed, 3 Mar 2021 06:57:20 +0000 (22:57 -0800)]
arcmsr(4): Fixed no action of hot plugging device on type_F adapter.

Many thanks to Areca for continuing to support FreeBSD.

Submitted by: 黃清隆 <ching2048 areca com tw>
MFC after: 3 days

(cherry picked from commit 5842073a9b7471831e0da48d29dd984d575f4e9e)

3 years agogrowfs: allow operation on RW-mounted filesystems
Ed Maste [Tue, 2 Mar 2021 22:35:48 +0000 (17:35 -0500)]
growfs: allow operation on RW-mounted filesystems

growfs supports growing mounted filesystems (writes are temporarily
suspended while the grow happens).  Drop the check for fs_clean == 0
to restore this case.  Leave fs_flags check for FS_UNCLEAN or
FS_NEEDSFSCK which represent the state of the filesystem when it was
mounted, and fsck should be run first if they are set.

PR: 253754
Reviewed by: mckusick
Fixes: 6eb925f8450f ("Filesystem utilities that modify the...")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29021

(cherry picked from commit 0dcde5cc12744e5188300711a8829e5e6a9cd0de)

3 years agoRemove pointless lun->be_lun checks.
Alexander Motin [Fri, 26 Feb 2021 00:45:59 +0000 (19:45 -0500)]
Remove pointless lun->be_lun checks.

There is no such thing as LUN without backend, at least for years.

MFC after: 1 week

(cherry picked from commit a9bd22814f680ce87259d56155204f9294e684ce)

3 years agoRevert "MFC kern: cpuset: properly rebase when attaching to a jail"
Kyle Evans [Thu, 4 Mar 2021 20:13:55 +0000 (14:13 -0600)]
Revert "MFC kern: cpuset: properly rebase when attaching to a jail"

This behavior change is too invasive to be made between minor versions,
back it out in stable/12 -- it will be first introduced in 13.0.

The cpuset test has been adjusted to account for the legacy behavior,
with a note added as to why it's different and doesn't work if run
as-is on 13.0.

This reverts commit 7bb960ce6447bd535e0fbb648e4d9edbb1dc067f.

3 years agoamdtemp(4): Add missing Family 17h models
Conrad Meyer [Sat, 12 Dec 2020 19:43:38 +0000 (19:43 +0000)]
amdtemp(4): Add missing Family 17h models

Add missing model numbers M20h (Dali, Zen1), M60H (Renoir, Zen2), and
M90H (Van Gogh, Zen2).

Submitted by: Greg V <greg AT unrelenting.technology>

(cherry picked from commit b499ab877f3d6bf5e2c894edfcfdcf89ce7c79d3)

3 years agoamdsmn(4), amdtemp(4): add support for Family 19h (Zen 3)
Conrad Meyer [Sat, 12 Dec 2020 19:34:12 +0000 (19:34 +0000)]
amdsmn(4), amdtemp(4): add support for Family 19h (Zen 3)

Zen 3 "Vermeer" support, tested on Ryzen 9 5950X.

Model numbers from https://en.wikichip.org/wiki/amd/cpuid "Extended
Model" column.

Submitted by: Greg V <greg AT unrelenting.technology>
Differential Revision: https://reviews.freebsd.org/D27552

(cherry picked from commit ea6189d3a470ce9ffb19335f915eab6af0cfef57)

3 years agoamdtemp(4), amdsmn(4): Attach to Ryzen 4000 APU (Zen 2, "Renoir")
Conrad Meyer [Fri, 25 Sep 2020 04:16:28 +0000 (04:16 +0000)]
amdtemp(4), amdsmn(4): Attach to Ryzen 4000 APU (Zen 2, "Renoir")

PR: 249864
Reported by: Florian Millet <florian.millet AT laposte.net>
Tested by: Florian Millet

(cherry picked from commit 5b505170794dfaae633294aaf178bd797b7a1b11)

3 years agoamdtemp(4): Remove dead code that snuck in with r357190
Conrad Meyer [Tue, 28 Jan 2020 03:27:06 +0000 (03:27 +0000)]
amdtemp(4): Remove dead code that snuck in with r357190

I intended to remove this before committing, but neglected to.

(cherry picked from commit cc3b01385bfd7e7f67866c4ac0a1b43370d7e6b7)

3 years agoamdtemp(4): Add support for Family 17h CCD sensors
Conrad Meyer [Tue, 28 Jan 2020 01:39:50 +0000 (01:39 +0000)]
amdtemp(4): Add support for Family 17h CCD sensors

Probe Family 17h CPUs for up to 4 (Zen, Zen+) or 8 (Zen2) CCD temperature
sensors.  These were discovered by Ondrej Čerman
(https://github.com/ocerman) and collaborators experimentally, and are not
currently documented in any datasheet I have access to.

(cherry picked from commit c59b9a4f8d2c7a34782a3885f1c76fb1decea174)

3 years agoamdtemp(4): Refactor shared temperature calculation logic
Conrad Meyer [Tue, 28 Jan 2020 01:38:51 +0000 (01:38 +0000)]
amdtemp(4): Refactor shared temperature calculation logic

No functional change intended.

(cherry picked from commit 02f700029357ddf31b538bbb5a23785d4ca4c7a8)

3 years agoMake DataSN counter of solicited Data-Out local.
Alexander Motin [Tue, 2 Feb 2021 18:37:13 +0000 (13:37 -0500)]
Make DataSN counter of solicited Data-Out local.

DataSN for solicited Data-Out is per-R2T.  Since we handle whole R2T
in one go, we don't need to store it anywhere, especially in global
per-command structure.  This may allow us to handle multiple R2T per
command at once, if we decide, or maybe relax locking.

Rename the second use of that field to io_referenced_task_tag.

MFC after: 1 month

(cherry picked from commit 3dd2a7a5ea2f1641c7525f692eed416fa02c28e6)

3 years agobuf: Fix the dirtybufthresh check
Mark Johnston [Thu, 25 Feb 2021 15:04:44 +0000 (10:04 -0500)]
buf: Fix the dirtybufthresh check

dirtybufthresh is a watermark, slightly below the high watermark for
dirty buffers.  When a delayed write is issued, the dirtying thread will
start flushing buffers if the dirtybufthresh watermark is reached.  This
helps ensure that the high watermark is not reached, otherwise
performance will degrade as clustering and other optimizations are
disabled (see buf_dirty_count_severe()).

When the buffer cache was partitioned into "domains", the dirtybufthresh
threshold checks were not updated.  Fix this.

Reported by: Shrikanth R Kamath <kshrikanth@juniper.net>
Reviewed by: rlibby, mckusick, kib, bdrewery
Sponsored by: Juniper Networks, Inc., Klara, Inc.
Fixes: 3cec5c77d6
Differential Revision: https://reviews.freebsd.org/D28901

(cherry picked from commit 369706a6f887f8ffe1037d78bc31565ec701d72b)

3 years agoRACK: fix an issue triggered by using the CDG CC module
Michael Tuexen [Tue, 2 Mar 2021 11:32:16 +0000 (12:32 +0100)]
RACK: fix an issue triggered by using the CDG CC module

Manually resolved merge conflicts.

Obtained from: rrs@
PR: 238741
Sponsored by: Netflix, Inc.

(cherry picked from commit 99adf230061268175a36061130e6adb0882270e8)

3 years agobridge tests: Test that we also forward on some interfaces
Kristof Provost [Wed, 24 Feb 2021 15:40:37 +0000 (16:40 +0100)]
bridge tests: Test that we also forward on some interfaces

Ensure that we not only block on some interfaces, but also forward on
some. Without the previous commit we wound up discarding on all ports,
rather than only on the ports needed to break the loop.

MFC after: 1 week
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D28917

(cherry picked from commit 7a4dbffa4205fc274b4884a6332d4831c5791320)

3 years agobridgestp: Ensure we send STP on VLAN interfaces
Kristof Provost [Wed, 24 Feb 2021 15:38:53 +0000 (16:38 +0100)]
bridgestp: Ensure we send STP on VLAN interfaces

Reviewed by: donner@
MFC after: 1 week
X-MFC-with: 711ed156b94562c3dcb2ee9c1b3f240f960a75d2
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D28916

(cherry picked from commit f5537cd0693c85efdb2180a0a107c51eae15ba39)

3 years agoipfw: make algo name argument optional for some table types
Andrey V. Elsukov [Thu, 25 Feb 2021 13:57:47 +0000 (16:57 +0300)]
ipfw: make algo name argument optional for some table types

Most of the table types currently supported by ipfw have only one
algorithm implementation. When a user creates such tables, allow
the algo name to be omitted from the arguments. E.g. the following is now possible:
ipfw table T1 create type number
ipfw table T2 create type iface
ipfw table T3 create type flow

PR: 233072
Sponsored by: Yandex LLC

(cherry picked from commit 13ad237a19b7368124483d9d1dc3258c27880fef)

3 years agogetdirentries.2: fix for NFS mounts
Rick Macklem [Mon, 15 Feb 2021 02:16:58 +0000 (18:16 -0800)]
getdirentries.2: fix for NFS mounts

It was reported that getdirentries(2) was
returning dirents with d_off set to 0 for an NFS
mount.

This is believed to be correct behaviour at
this time (it may change for some NFS mounts
in the future), but is inconsistent with what the
getdirentries(2) man page says.

This patch fixes the man page.

This is a content change.

PR: 253428

(cherry picked from commit a0698341cd894ba4a640e9a9bb0f72c2133d1228)

3 years agoClarify kld_list format
Chris Rees [Mon, 24 Dec 2018 10:47:48 +0000 (10:47 +0000)]
Clarify kld_list format

PR: docs/234248
Submitted by: David Fiander
Submitted by: Miroslav Lachman

(cherry picked from commit 261e62db4c62ecab7c8d8055b1a548acb80c16dc)

3 years agoMFC 9febbc454190:
Hans Petter Selasky [Mon, 22 Feb 2021 10:58:46 +0000 (11:58 +0100)]
MFC 9febbc454190:
Fix for natd(8) sending wrong sequence number after TCP retransmission,
terminating a TCP connection.

If a TCP packet must be retransmitted and the data length has changed in the
retransmitted packet, due to the internal workings of TCP, typically when ACK
packets are lost, then there is a 30% chance that the logic in GetDeltaSeqOut()
will find the correct length, which is the last length received.

This can be explained as follows:

If a "227 Entering Passive Mode" packet must be retransmitted and the length
changes from 51 to 50 bytes, for example, then we have three cases for the
list scan in GetDeltaSeqOut(), depending on how many prior packets were
received modulo N_LINK_TCP_DATA=3:

  case 1:  index 0:   original packet        51
           index 1:   retransmitted packet   50
           index 2:   not relevant

  case 2:  index 0:   not relevant
           index 1:   original packet        51
           index 2:   retransmitted packet   50

  case 3:  index 0:   retransmitted packet   50
           index 1:   not relevant
           index 2:   original packet        51

This patch simply changes the searching order for TCP packets, always starting
at the last received packet instead of any received packet, in
GetDeltaAckIn() and GetDeltaSeqOut().

Else no functional changes.

Discussed with: rscheff@
Submitted by: Andreas Longwitz <longwitz@incore.de>
PR: 230755
Sponsored by: Mellanox Technologies // NVIDIA Networking

(cherry picked from commit 9febbc4541903bb8e6b0f1c84988c98b2f7c96ef)
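
The change in search order can be illustrated with a small ring buffer. This is a simplified sketch, not the libalias code: the record layout and the names history_add() and history_lookup() are hypothetical, and matching is reduced to an exact sequence number comparison.

```c
#include <stddef.h>

#define	N_RECORDS	3	/* same size as N_LINK_TCP_DATA in the text */

struct record {			/* hypothetical per-packet record */
	int active;
	unsigned int seq;	/* sequence number the record applies to */
	int len;		/* payload length seen for that packet */
};

struct history {
	struct record rec[N_RECORDS];
	int next;		/* slot the next packet will overwrite */
};

/* Store a packet in the ring, overwriting the oldest slot. */
static void
history_add(struct history *h, unsigned int seq, int len)
{
	h->rec[h->next] = (struct record){ 1, seq, len };
	h->next = (h->next + 1) % N_RECORDS;
}

/*
 * Return the length of the newest record matching 'seq', or -1 if none.
 * The scan starts at the last stored slot and walks backwards, so a
 * retransmission always wins over the original, whatever slots they use.
 */
static int
history_lookup(const struct history *h, unsigned int seq)
{
	int i, j;

	for (i = 1; i <= N_RECORDS; i++) {
		j = (h->next - i + N_RECORDS) % N_RECORDS;
		if (h->rec[j].active && h->rec[j].seq == seq)
			return (h->rec[j].len);
	}
	return (-1);
}
```

Adding zero, one or two unrelated packets before the original and the retransmission reproduces the three cases listed above; in each of them the backward scan reaches the 50-byte retransmission first.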

3 years agoatomic: add atomic_interrupt_fence()
Konstantin Belousov [Tue, 23 Feb 2021 22:12:29 +0000 (00:12 +0200)]
atomic: add atomic_interrupt_fence()

(cherry picked from commit e2494f7561c852951d8ac567314f5e12f19ee7af)

3 years agormlock: Add a required compiler membar to the rlock slow path
Mark Johnston [Wed, 24 Feb 2021 02:15:50 +0000 (21:15 -0500)]
rmlock: Add a required compiler membar to the rlock slow path

The tracker flags need to be loaded only after the tracker is removed
from its per-CPU queue.  Otherwise, readers may fail to synchronize with
pending writers attempting to propagate priority to active readers, and
readers and writers deadlock on each other.  This was observed in a
stable/12-based armv7 kernel where the compiler had reordered the load
of rmp_flags to before the stores updating the queue.

Reviewed by: rlibby, scottl
Discussed with: kib
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D28821

(cherry picked from commit 1d44514fcd68809cfd493a7352ace29ddad443d6)

3 years agobridge tests: Test STP on top of VLAN devices
Kristof Provost [Sat, 20 Feb 2021 09:13:33 +0000 (10:13 +0100)]
bridge tests: Test STP on top of VLAN devices

This is basically the same test as the existing STP test, but now on top
of VLAN interfaces instead of directly using the epair devices.

MFC after: 1 week
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D28861

(cherry picked from commit 26492ba2716f8b839f743bb663ce47405990fdf0)

3 years agobridge tests: Avoid building a switching loop
Kristof Provost [Mon, 1 Jun 2020 19:26:16 +0000 (19:26 +0000)]
bridge tests: Avoid building a switching loop

Enable STP before bringing the bridges up. This avoids a switching loop,
which has a tendency to drown out progress in userspace processes,
especially on single-core systems.

Only check that we have indeed shut down one of the looped interfaces

PR: 246448
Reviewed by: melifaro
Differential Revision: https://reviews.freebsd.org/D25084

(cherry picked from commit e07e002e950aa673266e3d4b30c43e1198af65e0)

3 years agobridge tests: Test for #216510
Kristof Provost [Sun, 26 Apr 2020 16:27:03 +0000 (16:27 +0000)]
bridge tests: Test for #216510

We used to have an issue with recursive locking with
net.link.bridge.inherit_mac. This causes us to send an ARP request while
we hold the BRIDGE_LOCK, which used to cause us to acquire the
BRIDGE_LOCK again. We can't re-acquire it, so this caused a panic.

Now that we no longer need to acquire the BRIDGE_LOCK for
bridge_transmit() this should no longer panic. Test this.

PR: 216510
Reviewed by: emaste, philip
MFC after: 2 months
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24251

(cherry picked from commit 5377560783d95b92fce3bea3caac37d2860b1d48)

3 years agobridge tests: Ensure that bridges in different jails get different MAC addresses
Kristof Provost [Sun, 19 Apr 2020 16:30:49 +0000 (16:30 +0000)]
bridge tests: Ensure that bridges in different jails get different MAC addresses

We used to have a problem where bridges created in different vnet jails
would end up having the same mac address. This is now fixed by
including the jail name as a seed for the mac address generation, but we
should verify that it doesn't regress.

(cherry picked from commit 2885ae0c3ca3ea93e1f227ecb3003db2e94f4129)

3 years agobridge tests: Test deleting a bridge with members
Kristof Provost [Fri, 17 Apr 2020 14:57:15 +0000 (14:57 +0000)]
bridge tests: Test deleting a bridge with members

Reviewed by: philip, emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24337

(cherry picked from commit 3f359bfd47430183f69b9c03f34458217e7c7970)

3 years agobridge tests: Basic span test
Kristof Provost [Mon, 16 Mar 2020 08:44:46 +0000 (08:44 +0000)]
bridge tests: Basic span test

Reviewed by: philip, emaste (previous version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23961

(cherry picked from commit bb490fcf195450d9cbbac00e6338b352aac32c5c)

3 years agobridge test: adding and removing static addresses
Kristof Provost [Tue, 10 Mar 2020 06:29:59 +0000 (06:29 +0000)]
bridge test: adding and removing static addresses

Reviewed by: philip
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23960

(cherry picked from commit d99bb677c1cf43b22e91d54c49a8b7f0592e6fce)

3 years agobridge test: spanning tree
Kristof Provost [Tue, 10 Mar 2020 06:28:45 +0000 (06:28 +0000)]
bridge test: spanning tree

Basic test case where we create a bridge loop, verify that we really are
looping and then enable spanning tree to resolve the loop.

Reviewed by: philip
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23959

(cherry picked from commit 6f0a65b080aac1b3144c7489b020b26b345d1a1b)

3 years agobridge tests: Remove unneeded 'All rights reserved.'
Kristof Provost [Wed, 19 Feb 2020 16:44:16 +0000 (16:44 +0000)]
bridge tests: Remove unneeded 'All rights reserved.'

The FreeBSD foundation no longer requires this, as per
https://lists.freebsd.org/pipermail/svn-src-all/2019-February/177215.html and
private communications.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit e3c73f3d74c77b2c168519b10bdb6910a84287ef)

3 years agobridge: Basic test case
Kristof Provost [Sun, 16 Feb 2020 13:16:40 +0000 (13:16 +0000)]
bridge: Basic test case

Very basic bridge test: Set up two jails and test that they can pass IPv4
traffic over the bridge.

Reviewed by: melifaro, philip
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23697

(cherry picked from commit 095aabf7dc814ae96d83bc5327a4b1f2e23be419)

3 years agobridge/stp: Ensure we enter NET_EPOCH whenever we can send traffic
Kristof Provost [Sun, 21 Feb 2021 20:18:46 +0000 (21:18 +0100)]
bridge/stp: Ensure we enter NET_EPOCH whenever we can send traffic

Reviewed by: donner@
MFC after: 1 week
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D28858

(cherry picked from commit 89fa9c34d76bbf85cd7cda60c1868f5e3dba4ec7)

3 years agoarp/nd: Cope with late calls to iflladdr_event
Kristof Provost [Mon, 22 Feb 2021 07:19:43 +0000 (08:19 +0100)]
arp/nd: Cope with late calls to iflladdr_event

When tearing down vnet jails we can move an if_bridge out (as
part of the normal vnet_if_return()). This can, when it's clearing out
its list of member interfaces, change its link layer address.
That sends an iflladdr_event, but at that point we've already freed the
AF_INET/AF_INET6 if_afdata pointers.

In other words: when the iflladdr_event callbacks fire we can't assume
that ifp->if_afdata[AF_INET] will be set.

Reviewed by: donner@, melifaro@
MFC after: 1 week
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D28860

(cherry picked from commit c139b3c19b52abe3b5ba23a8175e58e70c7a528d)

3 years agobridge: Remove members when assigned to a new vnet
Kristof Provost [Sun, 21 Feb 2021 20:20:32 +0000 (21:20 +0100)]
bridge: Remove members when assigned to a new vnet

When the bridge is moved to a different vnet we must remove all of its
member interfaces (and span interfaces), because we don't know if those
will be moved along with it. We don't want to hold references to
interfaces not in our vnet.

Reviewed by: donner@
MFC after: 1 week
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D28859

(cherry picked from commit 38c0951386d82f4c51cf4e245253cdef18d2254a)

3 years agobridge: Support STP on VLAN devices
Kristof Provost [Sat, 20 Feb 2021 09:11:30 +0000 (10:11 +0100)]
bridge: Support STP on VLAN devices

VLAN devices have type IFT_L2VLAN, so the STP code mistakenly believed
they couldn't be used for STP. That's not the case, so add
IFT_L2VLAN to the check.
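
A minimal sketch of such a type check. The function name stp_capable()
is hypothetical, and the numeric interface type values are written from
memory of net/if_types.h only to keep the example self-contained; treat
them as assumptions:

```c
#include <stdbool.h>

/* Assumed values; the authoritative definitions are in net/if_types.h. */
#define IFT_ETHER	0x06
#define IFT_L2VLAN	0x87
#define IFT_BRIDGE	0xd1

/* Return true if an interface of this type may take part in STP. */
static bool
stp_capable(int if_type)
{
	switch (if_type) {
	case IFT_ETHER:
	case IFT_L2VLAN:	/* the case added by this change */
		return (true);
	default:
		return (false);
	}
}
```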

Reviewed by: donner@
MFC after: 1 week
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D28857

(cherry picked from commit 711ed156b94562c3dcb2ee9c1b3f240f960a75d2)

3 years agoBump CTL block backend threads from 14 to 32 per LUN.
Alexander Motin [Tue, 23 Feb 2021 15:58:56 +0000 (10:58 -0500)]
Bump CTL block backend threads from 14 to 32 per LUN.

This makes random read benchmarks look better on wide ZFS pools.
I am not sure where the original value comes from, but it has been
there for too long now.

MFC after: 1 week

(cherry picked from commit 7d4c444374d53e54ce197138df64bf40c1fb05a3)

3 years agopmap: Fix largemap restart checks in the kernel_maps sysctl handler
Mark Johnston [Thu, 25 Feb 2021 23:49:47 +0000 (18:49 -0500)]
pmap: Fix largemap restart checks in the kernel_maps sysctl handler

The purpose of these checks is to ensure that the address of the
next-level page table page is valid, since nothing is synchronizing with
a concurrent update of the large map and large map PTPs are freed to the
system.  However, if PG_PS is set, there is no next level.

Reported by: rpokala
Reviewed by: kib
Tested by: rpokala
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28922

(cherry picked from commit aac25e222525780db8939d07a594d3e090c0a148)

3 years agopf: Fix incorrect fragment handling
Kristof Provost [Thu, 25 Feb 2021 07:07:36 +0000 (08:07 +0100)]
pf: Fix incorrect fragment handling

A sequence of overlapping IPv4 fragments could crash the kernel in
pf due to an assertion.

Reported by: Alexander Bluhm
Obtained from: OpenBSD
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC ("Netgate")

(cherry picked from commit 5f1b1f184b7f12330cf4a027e3db7c6700c67640)

3 years agopf: Fix build if INVARIANTS is not set
Kristof Provost [Fri, 2 Nov 2018 19:23:50 +0000 (19:23 +0000)]
pf: Fix build if INVARIANTS is not set

r340061 included a number of assertions in pf_frent_remove(), but these assertions
were the only use of the 'prev' variable. As a result builds without
INVARIANTS had an unused variable, and failed.

Reported by: vangyzen@

(cherry picked from commit 58ef854f8b05508f41aff3bdaf1564c8dd4c1d4f)

3 years agopf: Limit the fragment entry queue length to 64 per bucket.
Kristof Provost [Fri, 2 Nov 2018 15:32:04 +0000 (15:32 +0000)]
pf: Limit the fragment entry queue length to 64 per bucket.

So we have a global limit of 1024 fragments, but it is fine grained to
the region of the packet.  Smaller packets may have fewer fragments.
This costs another 16 bytes of memory per reassembly and divides the
worst case for searching by 8.

Obtained from: OpenBSD
Differential Revision: https://reviews.freebsd.org/D17734

(cherry picked from commit 790194cd472b1d17e08940e9f839322abcf14ec9)

3 years agopf: Split the fragment reassembly queue into smaller parts
Kristof Provost [Fri, 2 Nov 2018 15:26:51 +0000 (15:26 +0000)]
pf: Split the fragment reassembly queue into smaller parts

Remember 16 entry points based on the fragment offset.  Instead of
a worst case of 8196 list traversals we now check a maximum of 512
list entries or 16 array elements.
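
A minimal sketch of how a fragment offset could select one of the 16
entry points, assuming the entry points divide the 64 KiB offset space
evenly. The names are hypothetical. Under that assumption each region
spans 4096 bytes, which with 8-byte minimum fragments is where a bound
of 512 list entries per region would come from:

```c
#define FRAG_ENTRY_POINTS	16
#define FRAG_OFFSET_SPACE	0x10000	/* 64 KiB of possible offsets */

/* Map a fragment's byte offset to the entry point covering it. */
static unsigned
frag_entry_index(unsigned off)
{
	return (off / (FRAG_OFFSET_SPACE / FRAG_ENTRY_POINTS));
}
```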

Obtained from: OpenBSD
Differential Revision: https://reviews.freebsd.org/D17733

(cherry picked from commit fd2ea405e601bd5e240153c5de0f7c264946ce6f)

3 years agopf: Count holes rather than fragments for reassembly
Kristof Provost [Fri, 2 Nov 2018 15:23:57 +0000 (15:23 +0000)]
pf: Count holes rather than fragments for reassembly

Avoid traversing the list of fragment entries to check whether the
pf(4) reassembly is complete.  Instead count the holes that are
created when inserting a fragment.  If there are no holes left, the
fragments are continuous.
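
A minimal sketch of hole counting, assuming non-overlapping fragments
kept in a small sorted array. The types and functions are hypothetical
and much simpler than pf's queue. The count starts at 1 (the whole
packet is missing); each insert adds one potential hole and removes one
for every side on which the new fragment touches its neighbour, the
start of the packet, or the end of the packet:

```c
#include <stddef.h>

#define MAX_FRAGS 64

struct frag {
	unsigned off;	/* byte offset of the fragment */
	unsigned len;	/* payload length */
	int mff;	/* "more fragments" flag */
};

struct reass {
	struct frag q[MAX_FRAGS];	/* kept sorted by offset */
	size_t n;
	int holes;
};

static void
reass_init(struct reass *r)
{
	r->n = 0;
	r->holes = 1;	/* nothing received yet: one big hole */
}

/*
 * Insert a non-overlapping fragment and update the hole count.
 * Returns 1 when no holes are left, 0 otherwise, -1 if the queue is full.
 */
static int
reass_insert(struct reass *r, struct frag f)
{
	size_t pos, i;
	int delta = 1;	/* a fragment may split one hole into two */

	if (r->n == MAX_FRAGS)
		return (-1);
	for (pos = 0; pos < r->n && r->q[pos].off < f.off; pos++)
		;
	/* Left side: adjacent to the packet start or to the predecessor. */
	if (pos == 0) {
		if (f.off == 0)
			delta--;
	} else if (r->q[pos - 1].off + r->q[pos - 1].len == f.off)
		delta--;
	/* Right side: last fragment of the packet or adjacent successor. */
	if (pos == r->n) {
		if (!f.mff)
			delta--;
	} else if (f.off + f.len == r->q[pos].off)
		delta--;
	for (i = r->n; i > pos; i--)
		r->q[i] = r->q[i - 1];
	r->q[pos] = f;
	r->n++;
	r->holes += delta;
	return (r->holes == 0);
}
```

Completion is then a single comparison against zero instead of a walk
over the fragment list.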

Obtained from: OpenBSD
Differential Revision: https://reviews.freebsd.org/D17732

(cherry picked from commit 2b1c354ee6fb075953d2c3e81c8221f4115ce981)

3 years agoRevert "pf: Limit the maximum number of fragments per packet"
Kristof Provost [Fri, 2 Nov 2018 15:01:59 +0000 (15:01 +0000)]
Revert "pf: Limit the maximum number of fragments per packet"

This reverts commit r337969.
We'll handle this the OpenBSD way, in upcoming commits.

(cherry picked from commit 19a22ae31328d9a960732a0904116c1b5566351b)

3 years agopwrite(2): add a BUGS section
Guangyuan Yang [Sat, 20 Feb 2021 08:03:15 +0000 (08:03 +0000)]
pwrite(2): add a BUGS section

Add a BUGS section about using pwrite(2) when O_APPEND is set on the fd.

Submitted by: Ka Ho Ng <khng300@gmail.com>
Reviewed by: gbe, yuripv
Differential Revision: https://reviews.freebsd.org/D28372

(cherry picked from commit 504e64af32ba6c62fdcc894a3b1da76061c64796)

3 years agonetgraph/ng_car: Add color marking code
Lutz Donnerhacke [Wed, 27 Jan 2021 20:19:14 +0000 (21:19 +0100)]
netgraph/ng_car: Add color marking code

Chained policing should be able to reuse the classification of
traffic.  A new mbuf_tag type is defined to handle general QoS
marking.  A new subtype is defined to track the color marking.

Reviewed by: manpages (bcr), melifaro, kp
Sponsored by: IKS Service GmbH
Differential Revision: https://reviews.freebsd.org/D22110

(cherry picked from commit d0d2e523bafb74180f8bebb90788790f0d2f0290)

3 years agoautomount(8): fix absolute path when creating a mountpoint
Robert Wing [Wed, 17 Feb 2021 09:22:23 +0000 (00:22 -0900)]
automount(8): fix absolute path when creating a mountpoint

When executing automount(8), it will attempt to create the directory where an
autofs filesystem is to be mounted. Explicitly set the root path for this
directory to "/".

This fixes the issue where the directory being created was being treated as a
relative path instead of an absolute path (as expected).

PR:     224601
Reported by:    kusumi.tomohiro@gmail.com
Reviewed by:    trasz
Differential Revision:  https://reviews.freebsd.org/D27832

(cherry picked from commit 63640b2f552c0476f50484635eb9888eafcd22dc)

3 years agoSkip the vm.pmap.kernel_maps sysctl by default.
John Baldwin [Fri, 18 Dec 2020 20:41:23 +0000 (20:41 +0000)]
Skip the vm.pmap.kernel_maps sysctl by default.

This sysctl node can generate very verbose output, so don't trigger it
for sysctl -a or sysctl vm.pmap.

Reviewed by: markj, kib
Differential Revision: https://reviews.freebsd.org/D27504

(cherry picked from commit 1dce7d9e7eefead038610df6a8d6c86a0fdbebb8)

3 years agonetgraph/ng_nat: Add RFC 6598/Carrier Grade NAT support
Neel Chauhan [Sun, 24 Jan 2021 19:23:39 +0000 (20:23 +0100)]
netgraph/ng_nat: Add RFC 6598/Carrier Grade NAT support

This extends upon the RFC 6598 support to libalias/ipfw in r357092.

Reviewed By: manpages (bcr), donner, adrian, kp
Differential Revision: https://reviews.freebsd.org/D23461

(cherry picked from commit 5fe433a6e4d8cab6b64284698301afc0c55a9db2)

3 years agonetgraph/ng_bridge: Add counters for the first link, too
Lutz Donnerhacke [Wed, 10 Feb 2021 10:47:38 +0000 (11:47 +0100)]
netgraph/ng_bridge: Add counters for the first link, too

For broadcast, multicast and unknown unicast, the replication loop
sends a copy of the packet to each link, except the first one. This
special path is handled later, but the counters are not updated.
Factor out the common send and count actions as a function.

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D28537

(cherry picked from commit 3c958f5fdfc01b7579ea0fbfc3f15f8a85bebee9)

3 years agonetgraph/ng_bridge: Document staleness in multithreaded operation
Lutz Donnerhacke [Tue, 9 Feb 2021 11:32:46 +0000 (12:32 +0100)]
netgraph/ng_bridge: Document staleness in multithreaded operation

In the data path of ng_bridge(4), the only value of the host struct,
which needs to be modified, is the staleness, which is reset every
time a frame is received.  It's safe to leave the code as it is.

This patch is part of a series to make ng_bridge(4) multithreaded.

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D28546

(cherry picked from commit 011b7317dbb5038a95b9b4fca050325a62f3991e)

3 years agonetgraph/ng_bridge: Merge internal structures
Lutz Donnerhacke [Thu, 25 Feb 2021 09:59:45 +0000 (10:59 +0100)]
netgraph/ng_bridge: Merge internal structures

In an earlier version of ng_bridge(4) the externally visible host entry
structure was a strict subset of the internal one.  So the internal view
was a direct annotation of the external structure.  This strict
inheritance was lost many versions ago.  There is no need to
encapsulate a part of the internal representation as a separate
structure.

This patch is a preparation to make the internal structure read only
in the data path in order to make ng_bridge(4) multithreaded.

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D28545

(cherry picked from commit ccf4cd2e7830394467d5f6cf546ab453f9657b69)

3 years agonetgraph/ng_bridge: Make simple internal functions read-only
Lutz Donnerhacke [Wed, 13 Jan 2021 22:18:55 +0000 (23:18 +0100)]
netgraph/ng_bridge: Make simple internal functions read-only

The data path in netgraph is designed to work on a read-only state of
the whole netgraph network.  Currently this is achieved by convention;
there is no technical enforcement.  In the case of NETGRAPH_DEBUG all
nodes can be annotated for debugging purposes, so the strict
enforcement needs to be lifted for this purpose.

This patch is part of a series to make ng_bridge multithreaded, which
is done by rewriting the data path to operate on const.

Reviewed By: kp
Differential Revision: https://reviews.freebsd.org/D28141

(cherry picked from commit 6117aa58fa4f5891badf58b13c759976983f4f04)

3 years agonetgraph/ng_bridge: switch stats to counter framework
Lutz Donnerhacke [Wed, 13 Jan 2021 06:16:34 +0000 (07:16 +0100)]
netgraph/ng_bridge: switch stats to counter framework

This is the first patch of a series of necessary steps
to make ng_bridge(4) multithreaded.

Reviewed by: melifaro (network), afedorov
Differential Revision: https://reviews.freebsd.org/D28125

(cherry picked from commit 66c72859f66dc6c852234589f3508ce5d36d0336)

3 years agonetgraph/ng_bridge: Derive forwarding mode from first attached hook
Lutz Donnerhacke [Sat, 6 Feb 2021 10:25:04 +0000 (11:25 +0100)]
netgraph/ng_bridge: Derive forwarding mode from first attached hook

Handling of unknown MACs on a bridge with incomplete learning
capabilities (aka uplink ports) can be defined in different ways.

The classical approach is to broadcast unicast frames sent to an
unknown MAC, because the unknown devices can be everywhere. This mode
is default for ng_bridge(4).

In the case of dedicated uplink ports, which prohibit learning of MAC
addresses in order to save memory and CPU cycles, the broadcast
approach is dangerous. All traffic to the uplink port is broadcasted
to every downlink port, too. In this case, it's better to restrict the
distribution of frames to unknown MAC to the uplink ports only.

In order to keep the change small and the handling as natural as
possible, the first attached link is used to determine the behaviour
of the bridge: If it is an "uplink" port, then the bridge switches from
classical mode to restricted mode.

Reviewed By: kp
Differential Revision: https://reviews.freebsd.org/D28487

(cherry picked from commit c869d905baa4e329dfd6793e7487b5985248ddb6)

3 years agonetgraph/ng_bridge: Introduce "uplink" ports without MAC learning
Lutz Donnerhacke [Sat, 6 Feb 2021 10:08:24 +0000 (11:08 +0100)]
netgraph/ng_bridge: Introduce "uplink" ports without MAC learning

The ng_bridge(4) node is designed to work in moderately small
environments. Connecting such a node to a larger network rapidly fills
the MAC table for no reason. It even becomes complicated to obtain data
from the gettable message, because the result is too large to
transmit.

This patch introduces two new functionality bits on the hooks:
  - Allow or disallow MAC address learning for incoming packets.
  - Allow or disallow sending unknown MACs through this hook.

Uplinks are characterized by denied learning while sending out
unknowns. Normal links are characterized by allowed learning and
sending out unknowns.
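
A minimal sketch of how the two per-hook bits combine, assuming a plain
struct with two booleans. The field and function names are hypothetical
and are not taken from ng_bridge(4):

```c
#include <stdbool.h>

/* Hypothetical per-hook settings corresponding to the two new bits. */
struct link_flags {
	bool learn_mac;		/* learn source MACs from incoming packets */
	bool send_unknown;	/* forward frames for unknown MACs here */
};

enum link_kind { LINK_NORMAL, LINK_UPLINK, LINK_OTHER };

/* Classify a hook according to the description above. */
static enum link_kind
classify_link(struct link_flags f)
{
	if (f.send_unknown)
		return (f.learn_mac ? LINK_NORMAL : LINK_UPLINK);
	return (LINK_OTHER);	/* combinations not named in the text */
}
```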

Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D23963

(cherry picked from commit f961caf2184c94d6f59c8d522207156b3533d977)

3 years agonetgraph/ng_vlan_rotate: IEEE 802.1ad VLAN manipulation netgraph type
Lutz Donnerhacke [Tue, 26 Jan 2021 15:50:04 +0000 (16:50 +0100)]
netgraph/ng_vlan_rotate: IEEE 802.1ad VLAN manipulation netgraph type

This node is part of an A10-NSP (L2-BSA) development.

Carrier networks tend to stack three or more tags for internal
purposes and therefore hiding the service tags deep inside of the
stack. When decomposing such an access network frame, the processing
order is typically reversed: First distinguish by service, then by
other means.

This new netgraph node allows bringing the relevant VLAN to the front (to
the outermost position). This way other netgraph nodes (like ng_vlan)
can operate on this specific type.

Reviewed by: manpages (gbe), brueffer (manpages), kp
Relnotes: yes
Sponsored by: IKS Service GmbH
Differential Revision: https://reviews.freebsd.org/D22076

(cherry picked from commit cfd6422a5217410fbd66f7a7a8a64d9d85e61229)

3 years agoether: add older ethertype definitions for QinQ
Philip Paeps [Thu, 17 Oct 2019 00:34:53 +0000 (00:34 +0000)]
ether: add older ethertype definitions for QinQ

Older network equipment used the ethertypes 0x9100, 0x9200, and 0x9300 for
outer VLANs, before standardisation introduced 0x88a8.
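
A minimal sketch of a check against the four outer-tag ethertypes named
above. The function name is hypothetical and the values are used as
plain constants rather than the header's macro names:

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true for the outer VLAN ethertypes listed in the commit message. */
static bool
is_listed_outer_vlan_ethertype(uint16_t ethertype)
{
	switch (ethertype) {
	case 0x88a8:	/* standardised value */
	case 0x9100:	/* older, pre-standard values */
	case 0x9200:
	case 0x9300:
		return (true);
	default:
		return (false);
	}
}
```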

Submitted by:  Lutz Donnerhacke <lutz_donnerhacke.de>
Differential Revision: https://reviews.freebsd.org/D21846

(cherry picked from commit 579b70db8922b1debf3bd99bb2b822d60b95575d)

3 years agocxgb(4): Rework my commit 9dc7c250.
Alexander Motin [Mon, 22 Feb 2021 22:21:05 +0000 (17:21 -0500)]
cxgb(4): Rework my commit 9dc7c250.

The previous implementation was reported to try to coalesce packets
in situations when it should not, which resulted in an assertion failure later.
This implementation better checks the first packet of the chain for
coalescing eligibility.

(cherry picked from commit d510bf133d045d6c83742aeda6949bec150f6cbf)

3 years agoFix possibly uninitialized variables in __cxa_demangle_gnu3()
Dimitry Andric [Mon, 22 Feb 2021 20:01:09 +0000 (21:01 +0100)]
Fix possibly uninitialized variables in __cxa_demangle_gnu3()

After 0ee0dbfb0d26cf4bc37f24f12e76c7f532b0f368 where I imported a more
recent libcxxrt snapshot, the variables 'rtn' and 'has_ret' could in
some cases be used while still uninitialized. Most obviously this would
lead to a jemalloc complaint about a bad free(), aborting the program.

Fix this by initializing a bunch of variables in their declarations. This
change has also been sent upstream, with some additional changes to be
used in their testing framework.

PR: 253226

(cherry picked from commit d149877758f162f0c777e7760164bf2c1f7a1bc1)

3 years ago504ebd612ec: kern: sonewconn: set so_options before pru_attach()
Kyle Evans [Wed, 20 Jan 2021 17:53:05 +0000 (11:53 -0600)]
504ebd612ec: kern: sonewconn: set so_options before pru_attach()

Protocol attachment has historically been able to observe and modify
so->so_options as needed, and it still can for newly created sockets.
779f106aa169 moved this to after pru_attach() when we re-acquire the
lock on the listening socket.

Restore the historical behavior so that pru_attach implementations can
consistently use it. Note that some pru_attach() do currently rely on
this, though that may change in the future. D28265 contains a change to
remove the use in TCP and IB/SDP bits, as resetting the requested linger
time on incoming connections seems questionable at best.

This does move the assignment out from under the head's listen lock, but
glebius notes that head won't be going away and applications cannot
assume any specific ordering with a race between a connection coming in
and the application changing socket options anyways.

4c0bef07be0: kern: net: remove TCP_LINGERTIME

TCP_LINGERTIME can be traced back to BSD 4.4 Lite and perhaps beyond, in
exactly the same form that it appears here modulo slightly different
context.  It used to be the case that there was a single pr_usrreq
method with requests dispatched to it; these exact two lines appeared in
tcp_usrreq's PRU_ATTACH handling.

The only purpose of this that I can find is to cause surprising behavior
on accepted connections. Newly-created sockets will never hit these
paths as one cannot set SO_LINGER prior to socket(2). If SO_LINGER is
set on a listening socket and inherited, one would expect the timeout to
be inherited rather than changed arbitrarily like this -- noting that
SO_LINGER is nonsense on a listening socket beyond inheritance, since
they cannot be 'connected' by definition.

Neither Illumos nor Linux reset the timer like this based on testing and
inspection of Illumos, and testing of Linux.

(cherry picked from commit 504ebd612ec61165bb949cfce3a348b0d6f37008)
(cherry picked from commit 4c0bef07be071a1633ebc86a653f9bd59d40796e)

3 years agopam_login_access: Fix negative entry matching logic
Mark Johnston [Tue, 23 Feb 2021 22:01:29 +0000 (17:01 -0500)]
pam_login_access: Fix negative entry matching logic

PR: 252194
Approved by: so
Security: CVE-2020-25580
Security: FreeBSD-SA-21:03.pam_login_access

(cherry picked from commit 6ab923cbca8759503a08683a5978b9ebf5efd607)

3 years agoFix divide-by-zero panic when ASLR is enabled and superpages disabled
Jason A. Harmening [Mon, 15 Feb 2021 02:47:22 +0000 (18:47 -0800)]
Fix divide-by-zero panic when ASLR is enabled and superpages disabled

When locating the anonymous memory region for a vm_map with ASLR
enabled, we try to keep the slid base address aligned on a superpage
boundary to minimize pagetable fragmentation and maximize the potential
usage of superpage mappings.  We can't (portably) do this if superpages
have been disabled by loader tunable and pagesizes[1] is 0, and it
would be less beneficial in that case anyway.
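
A minimal sketch of the guard, assuming the alignment is done with a
modulo on the superpage size. The function name is hypothetical and the
actual fix in the VM code may be structured differently; the point is
only that a size of 0 must never reach the division:

```c
#include <stdint.h>

/*
 * Round addr down to a multiple of superpage_size.  A size of 0 means
 * superpages are disabled, so the address is returned unchanged instead
 * of dividing by zero.
 */
static uint64_t
align_slid_base(uint64_t addr, uint64_t superpage_size)
{
	if (superpage_size == 0)
		return (addr);
	return (addr - addr % superpage_size);
}
```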

PR: 253511

(cherry picked from commit 41032835dc2d489ec7841d7529f74f6389329cd3)

3 years agoukbd: Fix handling of keyboard ErrorRollOver reports
Vladimir Kondratyev [Sat, 13 Feb 2021 18:12:56 +0000 (21:12 +0300)]
ukbd: Fix handling of keyboard ErrorRollOver reports

Ignore phantom keyboard state reports entirely rather than ignore
RollOver states for each key separately.  The latter results in spurious
release/push pairs of events on each phantom keyboard state report.

Reported by: Jan Martin Mikkelsen <janm_AT_transactionware_DOT_com>
Submitted by: Jan Martin Mikkelsen (initial version)
PR: 253249
MFC after: 1 week

(cherry picked from commit 032d3153877ef1767c121bbdf8e00f4f93b30a5d)

3 years agoxen-blkback: fix leak of grant maps on ring setup failure
Roger Pau Monné [Wed, 20 Jan 2021 18:40:51 +0000 (19:40 +0100)]
xen-blkback: fix leak of grant maps on ring setup failure

Multi page rings are mapped using a single hypercall that gets passed
an array of grants to map. One of the grants in the array failing to
map would lead to the failure of the whole ring setup operation, but
there was no cleanup of the rest of the grant maps in the array that
could have likely been created as a result of the hypercall.

Add proper cleanup on the failure path during ring setup to unmap any
grants that could have been created.

This is part of XSA-361.

Sponsored by: Citrix Systems R&D

(cherry picked from commit 808d4aad1022a2a33d222663b0c9badde30b9d45)

3 years agoExclude reserved iSCSI Initiator Task Tag.
Alexander Motin [Sun, 24 Jan 2021 19:23:04 +0000 (14:23 -0500)]
Exclude reserved iSCSI Initiator Task Tag.

RFC 7143 (11.2.1.8):
   An ITT value of 0xffffffff is reserved and MUST NOT be assigned for a
   task by the initiator.  The only instance in which it may be seen on
   the wire is in a target-initiated NOP-In PDU (Section 11.19) and in
   the initiator response to that PDU, if necessary.
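
A minimal sketch of a tag allocator that skips the reserved value,
assuming a simple 32-bit counter. The names are hypothetical and this is
not the initiator's code:

```c
#include <stdint.h>

#define ITT_RESERVED 0xffffffffu	/* reserved by RFC 7143, 11.2.1.8 */

/* Return the next Initiator Task Tag, never handing out ITT_RESERVED. */
static uint32_t
next_itt(uint32_t *counter)
{
	uint32_t itt = (*counter)++;

	if (itt == ITT_RESERVED)
		itt = (*counter)++;	/* counter has wrapped to 0 */
	return (itt);
}
```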

MFC after: 1 month