FreeBSD/FreeBSD.git
Cy Schubert [Thu, 28 Oct 2021 23:55:48 +0000 (16:55 -0700)]
wpa: Fix WITHOUT_CRYPT build

PASN requires CRYPT, so buildworld fails when built WITHOUT_CRYPT.
Only enable PASN when MK_CRYPT is enabled (the default).

PR: 259517
Reported by: emaste
Fixes: c1d255d3ffdbe447de3ab875bf4e7d7accc5bfc5

(cherry picked from commit a30e8044aa4753858c189f3384dae2b2f25a150b)

Cy Schubert [Thu, 9 Sep 2021 00:20:52 +0000 (17:20 -0700)]
wpa: Address CTRL-EVENT-SCAN-FAILED

5fcdc19a8111 didn't fully resolve the issue. There remains a report
that "ifconfig wlan0 up" by itself is insufficient; "ifconfig wlan0
down" must precede it.

Reported by: Filipe da Silva Santos <contact _ shiori_com_br>
Fixes: 5fcdc19a8111

(cherry picked from commit d06d7eb09131edea666bf049d6c0c55672726f76)

Cy Schubert [Tue, 7 Sep 2021 01:48:39 +0000 (18:48 -0700)]
wpa: Address CTRL-EVENT-SCAN-FAILED

Some installations may experience CTRL-EVENT-SCAN-FAILED when
associating to an AP. Installations that specify
ifconfig_wlan0="WPA ... up" in rc.conf do not experience
the problem whereas those which specify ifconfig_wlan0="WPA" without
the "up" will experience CTRL-EVENT-SCAN-FAILED.

However, those that specify "up" in ifconfig_wlan0 will be able to
reproduce this problem by running service netif stop wlan0;
service netif start wlan0. Interestingly, the service netif stop/start
problem is reproducible on the older wpa 2.9 as well.

Reported by: dhw
Reported by: "Oleg V. Nauman" <oleg _ theweb_org_ua>
Reported by: Filipe da Silva Santos <contact _ shiori_com_br>
Reported by: Jakob Alvermark <jakob _ alvermark_net>

(cherry picked from commit 5fcdc19a81115d975e238270754e28557a2fcfc5)

Cy Schubert [Fri, 3 Sep 2021 13:14:59 +0000 (06:14 -0700)]
wpa: Enable RSN Preauthentication

RSN Preauthentication allows a station to authenticate to an AP that
it is not yet associated with while associated with a different AP.
This allows authentication to multiple APs simultaneously.

Tested by: philip

(cherry picked from commit bd452dcbede69b1862c769f244948f94b86448b5)

Cy Schubert [Fri, 3 Sep 2021 13:14:01 +0000 (06:14 -0700)]
wpa: Enable MBO

Enable WiFi 6 MBO (Multi Band Operation). MBO is a prereq to 802.11ax.

MBO allows the efficient use of multiple frequency bands (channels).

WNM (Wireless Network Management) is a prerequisite for MBO; it is
required to build.

Tested by: philip

(cherry picked from commit 3968b47cd974e503df303265f3be9ba5865499ab)

Cy Schubert [Fri, 3 Sep 2021 13:07:19 +0000 (06:07 -0700)]
wpa: Import wpa_supplicant/hostapd commits up to b4f7506ff

Merge vendor commits 40c7ff83e74eabba5a7e2caefeea12372b2d3f9a,
efec8223892b3e677acb46eae84ec3534989971f, and
2f6c3ea9600b494d24cac5a38c1cea0ac192245e.

Tested by: philip

(cherry picked from commit c1d255d3ffdbe447de3ab875bf4e7d7accc5bfc5)

Konstantin Belousov [Wed, 20 Oct 2021 20:32:59 +0000 (23:32 +0300)]
Unmap shared page manually before doing vm_map_remove() on exit or exec

(cherry picked from commit 1c69690319c5bb7deae6ce1add6ea25bb40b3b91)

Konstantin Belousov [Wed, 20 Oct 2021 20:30:34 +0000 (23:30 +0300)]
amd64 pmap: adjust the empty pmap optimization in pmap_remove()

(cherry picked from commit 0b3bc7288984c17da00d9f8c29f116d56bf44d35)

Konstantin Belousov [Wed, 20 Oct 2021 01:03:43 +0000 (04:03 +0300)]
amd64 pmap: account for the top-level pages

(cherry picked from commit e93b5adb6bb83d487eaa4211ac26e116db748c63)

Sebastian Huber [Thu, 28 Oct 2021 08:22:58 +0000 (10:22 +0200)]
kern_tc.c: Scaling/large delta recalculation

(cherry picked from commit ae750fbac72387c05c8e13623c2465b20497b4be)

Ed Maste [Thu, 28 Oct 2021 21:07:34 +0000 (17:07 -0400)]
Force WITHOUT_OPENSSL_KTLS off when WITHOUT_OPENSSL

Discussed with: jhb
MFC after: 1 week
Reported by: Michael Dexter, Build Option Survey
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 6940d0e4703e72b8ea445541567d0ef64c2bb94b)

Ed Maste [Thu, 21 Oct 2021 15:09:58 +0000 (11:09 -0400)]
iscsid: set max_recv_data_segment_length to what we advertise

Previously we updated the connection's conn_max_recv_data_segment_length
only when we received a response containing MaxRecvDataSegmentLength
from the target.  If the target did not send MaxRecvDataSegmentLength
then we left conn_max_recv_data_segment_length at the default (i.e.,
8192).  A target could then send more data than that default (up to our
advertised maximum), and we would drop the connection.

RFC 7143 specifies that MaxRecvDataSegmentLength is Declarative, not
negotiated.  Just set conn_max_recv_data_segment_length to our
advertised value in login_negotiate().
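
The effect of the fix can be sketched in userspace C. This is a toy model, not the iscsid source: `struct toy_conn`, `toy_conn_init()`, `toy_login_negotiate()` and `toy_pdu_acceptable()` are hypothetical names, and only the 8192 default comes from the text above.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_DEFAULT_MRDSL 8192		/* default named in the commit message */

/* Hypothetical stand-in for the per-connection state. */
struct toy_conn {
	size_t max_recv_data_segment_length;	/* limit enforced on receive */
};

static void
toy_conn_init(struct toy_conn *c)
{
	c->max_recv_data_segment_length = TOY_DEFAULT_MRDSL;
}

/*
 * The fix: MaxRecvDataSegmentLength is declarative, so adopt the value we
 * advertised without waiting for the target to send the key back.
 */
static void
toy_login_negotiate(struct toy_conn *c, size_t advertised)
{
	c->max_recv_data_segment_length = advertised;
}

/* A data segment longer than the limit would cause the connection drop. */
static bool
toy_pdu_acceptable(const struct toy_conn *c, size_t data_len)
{
	return (data_len <= c->max_recv_data_segment_length);
}
```

With the limit left at the default, a 16384-byte segment is rejected even though a larger maximum was advertised; after `toy_login_negotiate()` it is accepted.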

PR: 259355
Reviewed by: mav
MFC after: 1 week
Fixes: a15fbc904a4d ("Alike to r312190 decouple iSCSI...")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32605

(cherry picked from commit fc79cf4fea7221fa62d480648f05e30f7df73f92)

Mark Johnston [Thu, 21 Oct 2021 15:46:25 +0000 (11:46 -0400)]
vm_page: Break reservations to handle noobj allocations

vm_reserv_reclaim_*() will release pages to the default freepool, not
the direct freepool from which noobj allocations are drawn.  But if both
pools are empty, the noobj allocator variants must break reservations to
make progress.

Reported by: cy
Reviewed by: kib (previous version)
Fixes: b498f71bc56a ("vm_page: Add a new page allocator interface for unnamed pages")
Sponsored by: The FreeBSD Foundation

(cherry picked from commit d7acbe481d17ccb81c2b879b9731c83b018f3094)

Mark Johnston [Wed, 20 Oct 2021 00:29:05 +0000 (20:29 -0400)]
Use the vm_radix_init() helper when initializing pmaps

No functional change intended.

Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit ff93447d8ed61081adfe00a23a1e4c7bee479e53)

Mark Johnston [Wed, 20 Oct 2021 00:29:18 +0000 (20:29 -0400)]
amd64: Add comments to pmap_pinit_type()

... explaining why we don't pass the pmap pointer to
pmap_alloc_pt_page().

Reported by: alc
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 34fac29e98313fb0bfba0503e2e19e352b452516)

Mark Johnston [Wed, 20 Oct 2021 00:25:04 +0000 (20:25 -0400)]
Convert consumers to vm_page_alloc_noobj_contig()

Remove now-unneeded page zeroing.  No functional change intended.

Reviewed by: alc, hselasky, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 84c3922243a7b7fd510dcfb100aec59c878c57d0)

Mark Johnston [Wed, 20 Oct 2021 00:24:21 +0000 (20:24 -0400)]
Introduce vm_page_alloc_noobj_contig()

This is the same as vm_page_alloc_noobj(), but allocates physically
contiguous runs of memory.  For now it is implemented in terms of
vm_page_alloc_contig(), with the difference that
vm_page_alloc_noobj_contig() implements VM_ALLOC_ZERO by zeroing the
page.

Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 92db9f3bb7623883231214e74ec38788c3dffc6a)

Mark Johnston [Wed, 20 Oct 2021 00:23:39 +0000 (20:23 -0400)]
Convert vm_page_alloc() callers to use vm_page_alloc_noobj().

Remove page zeroing code from consumers and stop specifying
VM_ALLOC_NOOBJ.  In a few places, also convert an allocation loop to
simply use VM_ALLOC_WAITOK.

Similarly, convert vm_page_alloc_domain() callers.

Note that callers are now responsible for assigning the pindex.

Reviewed by: alc, hselasky, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit a4667e09e6520dc2c4b0b988051f060fed695a91)

Mark Johnston [Wed, 20 Oct 2021 00:22:12 +0000 (20:22 -0400)]
vm_page: Add a new page allocator interface for unnamed pages

The diff adds vm_page_alloc_noobj() and vm_page_alloc_noobj_domain().
These mostly correspond to vm_page_alloc() and vm_page_alloc_domain()
when no VM object is specified, with the exception that they handle
VM_ALLOC_ZERO by zeroing the page, rather than by preserving PG_ZERO.

This simplifies callers and will permit simplification of the
vm_page_alloc_domain() definition.
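
A rough userspace illustration of why this simplifies callers, assuming a toy page structure. `struct toy_page`, `TOY_ALLOC_ZERO`, `TOY_PG_ZERO` and `toy_alloc_noobj()` are hypothetical stand-ins and do not reproduce the kernel's vm_page(9) definitions.

```c
#include <string.h>

#define TOY_PAGE_SIZE	4096
#define TOY_ALLOC_ZERO	0x1	/* request flag: caller wants zeroed memory */
#define TOY_PG_ZERO	0x1	/* page flag: contents already known to be zero */

struct toy_page {
	int flags;
	unsigned char data[TOY_PAGE_SIZE];
};

/*
 * Hand out page m.  If the caller asked for zeroed memory and the page is
 * not already known to be zero, zero it here, so the caller never has to
 * inspect TOY_PG_ZERO itself.
 */
static struct toy_page *
toy_alloc_noobj(struct toy_page *m, int req)
{
	if ((req & TOY_ALLOC_ZERO) != 0 && (m->flags & TOY_PG_ZERO) == 0)
		memset(m->data, 0, sizeof(m->data));
	m->flags &= ~TOY_PG_ZERO;
	return (m);
}
```

In this model the "check the flag, then zero the page" step lives in one place instead of being repeated in every consumer.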

Since the new allocator variant is similar to vm_page_alloc_freelist(),
implement both of them using a common backend allocator function.  No
functional change intended.

Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit b498f71bc56af0069d9a4685b8385ee613a00727)

Ryan Stone [Fri, 29 Jan 2021 21:13:57 +0000 (16:13 -0500)]
Add a VM flag to prevent reclaim on a failed contig allocation

If an M_WAITOK contig alloc fails, the VM subsystem will try to
reclaim contiguous memory twice before actually failing the
request.  On a system with 64GB of RAM I've observed this take
400-500ms before it finally gives up, and I believe that this
will only be worse on systems with even more memory.

In certain contexts this delay is extremely harmful, so add a flag
that will skip reclaim for allocation requests to allow those
paths to opt-out of doing an expensive reclaim.
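
The control flow described above can be modeled with a small userspace sketch. Everything here is hypothetical: `TOY_NORECLAIM` is a placeholder flag name (the message does not name the real flag), and `struct toy_vm` only counts how often the expensive reclaim step runs.

```c
#include <stdbool.h>

#define TOY_NORECLAIM	0x1	/* hypothetical "skip reclaim" request flag */

struct toy_vm {
	int free_runs;		/* contiguous runs currently available */
	int reclaim_calls;	/* how many expensive reclaim passes ran */
};

static bool
toy_try_alloc(struct toy_vm *vm)
{
	if (vm->free_runs > 0) {
		vm->free_runs--;
		return (true);
	}
	return (false);
}

/* Stand-in for the slow reclaim pass; in this toy it never frees anything. */
static void
toy_reclaim(struct toy_vm *vm)
{
	vm->reclaim_calls++;
}

/* Try to allocate; reclaim at most twice unless the caller opted out. */
static bool
toy_alloc_contig(struct toy_vm *vm, int flags)
{
	for (int tries = 0; ; tries++) {
		if (toy_try_alloc(vm))
			return (true);
		if ((flags & TOY_NORECLAIM) != 0 || tries >= 2)
			return (false);
		toy_reclaim(vm);
	}
}
```

With no memory available, a default request pays for two reclaim passes before failing, while a request carrying the opt-out flag fails immediately.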

Sponsored by: Dell Inc
Differential Revision: https://reviews.freebsd.org/D28422
Reviewed by: markj, kib

(cherry picked from commit 660344ca44c63bfe4a16c3e57d0f6dbcbb5e083e)

Mark Johnston [Wed, 20 Oct 2021 00:50:06 +0000 (20:50 -0400)]
vlapic: Schedule callouts on the local CPU

The virtual LAPIC driver uses callouts to implement the LAPIC timer.
Callouts are armed using callout_reset_sbt(), which currently puts
everything on CPU 0.  On systems running many bhyve VMs this results in
a large amount of contention for CPU 0's callout lock.

Modify vlapic to schedule callouts on the local CPU instead.  This
allows timer interrupts to be scheduled more evenly among CPUs where
bhyve is running.

Reviewed by: grehan, jhb
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 4c812fe61b7ce2f297a381950ff7bd87fd51f698)

Mark Johnston [Wed, 27 Oct 2021 15:18:13 +0000 (11:18 -0400)]
rmslock: Update td_locks during lock and unlock operations

Reviewed by: mjg
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 71f31d784e1816a155cafbccf4b28291200097aa)

Mark Johnston [Tue, 10 Aug 2021 20:25:39 +0000 (16:25 -0400)]
amd64: Define KVA regions for KMSAN shadow maps

KMSAN requires two shadow maps, each one-to-one with the kernel map.
Allocate regions of the kernel's PML4 page for them.  Add functions to
create mappings in the shadow map regions; these will be used by the
KMSAN runtime.

Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit f95f780ea4e163ce9a0295a699f41f0a7e1591d4)

Mark Johnston [Tue, 10 Aug 2021 19:51:03 +0000 (15:51 -0400)]
conf: Add a KMSAN kernel option

Sponsored by: The FreeBSD Foundation

(cherry picked from commit 30d00832d7733e60f5e030d335c129bfa77dd77a)

Mark Johnston [Thu, 29 Apr 2021 15:39:02 +0000 (11:39 -0400)]
kasan: Use vm_offset_t for the first parameter to kasan_shadow_map()

No functional change intended.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit 20e3b9d8bd778445bb80b2be28d2fdedf7bae37e)

Mark Johnston [Tue, 10 Aug 2021 20:23:42 +0000 (16:23 -0400)]
amd64 pmap: Pre-set PG_M on 2MB KASAN shadow map entries

Also remove a redundant assertion in pmap_kasan_enter().

Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 4fd450a87df015fe85cadfac0e22c73e3c878d24)

Hans Petter Selasky [Sun, 24 Oct 2021 11:38:04 +0000 (13:38 +0200)]
usb(4): Fix for use after free in combination with EVDEV_SUPPORT.

Since EVDEV_SUPPORT was introduced, USB transfers may still be running
after the main FIFO is closed. This opens a race which can lead to
use-after-free scenarios. Fix this for all FIFO consumers by
initializing and resetting the FIFO queues under the lock used by the
client. The client driver will then see an empty queue in all cases
where a race may occur.

Found by: pho@
Sponsored by: NVIDIA Networking

(cherry picked from commit aad0c65d6b37364d8ba92ecb8c85e004398a5194)

Steve Kargl [Sun, 31 Oct 2021 22:26:20 +0000 (00:26 +0200)]
sinpi[fl] etc: Fix the ld128 implementations

PR: 218514

(cherry picked from commit 4f889260c33c163ab28e0e082b4d7e7562d9c647)

Steve Kargl [Thu, 28 Oct 2021 22:53:13 +0000 (01:53 +0300)]
sinpi,cospi,tanpi: float.h needed for weak reference

PR: 218514

(cherry picked from commit 3bfc837685b8128067b946b31dfe2120dae0d003)

Steve Kargl [Tue, 26 Oct 2021 20:53:51 +0000 (23:53 +0300)]
lib/msun: Move the files to appropriate locations in the Makefile

(cherry picked from commit ca3d8cb087cd5b40369478b1693f3e4038b5fa23)

Konstantin Belousov [Tue, 26 Oct 2021 21:14:35 +0000 (00:14 +0300)]
lib/msun/ld128/s_tanpil.c: make it compile.

(cherry picked from commit 6312d144613f97bf59703c442ee4871be1450c46)

Steve Kargl [Mon, 25 Oct 2021 13:13:52 +0000 (16:13 +0300)]
[LIBM] implementations of sinpi[fl], cospi[fl], and tanpi[fl]

PR: 218514

(cherry picked from commit dce5f3abed7181cc533ca5ed3de44517775e78dd)

Alexander Motin [Sun, 3 Oct 2021 00:57:55 +0000 (20:57 -0400)]
sleepqueue(9): Remove sbinuptime() from sleepq_timeout().

Callout c_time is always greater than or equal to the scheduled time.
It is also smaller than sbinuptime() and can't change while the callback
is running.  So we can reliably use it instead of sbinuptime() here.
In case there was a race and the callout was rescheduled to a later
time, the callback will be called again.

According to profiles it saves ~5% of the timer interrupt time even
with a fast TSC timecounter.

MFC after: 1 month

(cherry picked from commit 6df1359e5542f69179c142be1ea099d447e273d1)

Mark Johnston [Wed, 24 Mar 2021 23:43:05 +0000 (19:43 -0400)]
Generalize sanitizer interceptors for memory and string routines

Similar to commit 3ead60236f ("Generalize bus_space(9) and atomic(9)
sanitizer interceptors"), use a more generic scheme for interposing
sanitizer implementations of routines like memcpy().

No functional change intended.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit ec8f1ea8d536e91ad37e03e45a688c4e255b9cb0)

Mark Johnston [Tue, 23 Mar 2021 01:44:55 +0000 (21:44 -0400)]
Generalize bus_space(9) and atomic(9) sanitizer interceptors

Make it easy to define interceptors for new sanitizer runtimes, rather
than assuming KCSAN.  Lay a bit of groundwork for KASAN and KMSAN.

When a sanitizer is compiled in, atomic(9) and bus_space(9) definitions
in atomic_san.h are used by default instead of the inline
implementations in the platform's atomic.h.  These definitions are
implemented in the sanitizer runtime, which includes
machine/{atomic,bus}.h with SAN_RUNTIME defined to pull in the actual
implementations.

No functional change intended.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit 3ead60236fd25ce64fece7ae4a453318ca18c119)

Mark Johnston [Fri, 23 Jul 2021 14:41:00 +0000 (10:41 -0400)]
KASAN: Disable checking before triggering a panic

KASAN hooks will not generate reports if panicstr != NULL, but then
there is a window after the initial panic() call where another report
may be raised.  This can happen if a false positive occurs; to simplify
debugging of such problems, avoid recursing.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit ea3fbe0707f9a02a29875966668b6f15284f335a)

Mark Johnston [Fri, 23 Jul 2021 14:30:29 +0000 (10:30 -0400)]
redzone: Raise a compile error if KASAN is configured

redzone(9) does some munging of the allocation to insert redzones before
and after a valid memory buffer, but KASAN does not know about this and
will raise false positives if both are configured.  Until this is fixed,
do not allow both to be configured.  Note that KASAN provides similar
checking on its own but currently does not force the creation of
redzones for all UMA allocations; this should be addressed as well.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit 4e8e26a00471f1a5e7a2af322265c45b1529c5b8)

Mark Johnston [Sat, 10 Jul 2021 00:38:28 +0000 (20:38 -0400)]
KASAN: Implement __asan_unregister_globals()

It will be called during KLD unload to unpoison the redzones following
global variables.  Otherwise, virtual address ranges previously used for
a KLD may be left tainted, triggering false positives when they are
recycled.

Reported by: pho
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 588c7a06dffbc74b281dacbdd854437b0815e501)

Mark Johnston [Sat, 10 Jul 2021 00:38:21 +0000 (20:38 -0400)]
uma: Fix a few problems with KASAN integration

- Ensure that all items returned by UMA are aligned to
  KASAN_SHADOW_SCALE (8).  This was true in practice since smaller
  alignments are not used by any consumers, but we should enforce it
  anyway.
- Use a non-zero code for marking redzones that appear naturally in
  items that are not a multiple of the scale factor in size.  Currently
  we do not modify keg layouts to force the creation of redzones.
- Use a non-zero code for marking freed per-CPU items, otherwise
  accesses of freed per-CPU items are not detected by the runtime.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit b0dfc48684780024a3d736c5a5449284dad97f4e)

Mark Johnston [Sat, 10 Jul 2021 00:38:18 +0000 (20:38 -0400)]
x86: Mark the trapframe as initialized in ipi_bitmap_handler()

Otherwise KASAN may generate false positives if the trapframe was
written into a poisoned region of the stack.

Reported by: pho
Reported by: syzbot+ee60455cd58e6eed20c9@syzkaller.appspotmail.com
Reported by: syzbot+be5f9df26426ace3a00c@syzkaller.appspotmail.com
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 36226163fa48ee2c5f73bd2e870ce2e5a057f42e)

Mark Johnston [Sat, 10 Jul 2021 00:38:11 +0000 (20:38 -0400)]
hwpmc: Disable KASAN in pmc_save_kernel_callchain()

As in commit 831850d8b087, this routine can trigger false positives, so
exclude it from instrumentation.

Reported by: pho
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 5d243d41b1206044cb5eddd5d48c1c711b731478)

Mark Johnston [Sat, 10 Jul 2021 00:38:03 +0000 (20:38 -0400)]
amd64: Mark the trapframe as initialized in trap()

Otherwise KASAN may generate false positives if the trapframe was
written into a poisoned region of the stack.

Reported by: pho
Sponsored by: The FreeBSD Foundation

(cherry picked from commit f08f0ae5247ab31de58bda0817e74ccc1a3a5e95)

Mark Johnston [Fri, 7 May 2021 18:20:53 +0000 (14:20 -0400)]
stack(9): Disable KASAN in stack_capture()

When unwinding the stack, we may encounter a stack frame in a poisoned
region of the stack, triggering a false positive.

Reviewed by: andrew, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 831850d8b0870c75c21d2e01527af1e55fe2fec8)

Mark Johnston [Fri, 7 May 2021 18:26:28 +0000 (14:26 -0400)]
cdefs: Make __nosanitizeaddress work for KASAN as well

Add __nosanitizememory while I'm here.

Reviewed by: andrew, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit cfad8bd24f038e4779e937f48b05511f2dd4a5a8)

Mark Johnston [Fri, 7 May 2021 18:24:37 +0000 (14:24 -0400)]
linker_set: Disable ASAN only in userspace

KASAN does not insert redzones around global variables and so is not
susceptible to the problem that led to us disabling ASAN for linker set
elements in the first place (see commit fe3d8086fb6f).

Reviewed by: andrew, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30126

(cherry picked from commit 2d499d505262c9c965fc5f4fd36afdd2bb7cad3d)

Mark Johnston [Wed, 5 May 2021 21:05:46 +0000 (17:05 -0400)]
realloc: Fix KASAN(9) shadow map updates

When copying from the old buffer to the new buffer, we don't know the
requested size of the old allocation, but only the size of the
allocation provided by UMA.  This value is "alloc".  Because the copy
may access bytes in the old allocation's red zone, we must mark the full
allocation valid in the shadow map.  Do so using the correct size.

Reported by: kp
Tested by: kp
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 9a7c2de36460cdb916734a6969aac666707a639b)

Mark Johnston [Tue, 13 Apr 2021 21:40:27 +0000 (17:40 -0400)]
malloc: Add state transitions for KASAN

- Reuse some REDZONE bits to keep track of the requested and allocated
  sizes, and use that to provide red zones.
- As in UMA, disable memory trashing to avoid unnecessary CPU overhead.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit 06a53ecf24005b3a74b85ecc4b504a401ac26cd0)

Mark Johnston [Tue, 13 Apr 2021 21:40:19 +0000 (17:40 -0400)]
execve: Mark exec argument buffers

We cache mapped execve argument buffers to avoid the overhead of TLB
shootdowns.  Mark them invalid when they are freed to the cache.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit f1c3adefd95d35115bd4597293e0b904ae401245)

Mark Johnston [Tue, 13 Apr 2021 21:40:11 +0000 (17:40 -0400)]
vfs: Add KASAN state transitions for vnodes

vnodes are a bit special in that they may exist on per-CPU lists even
while free.  Add a KASAN-only destructor that poisons regions of each
vnode that are not expected to be accessed after a free.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit b261bb4057f4abbc1366e4af8e9e4081d039be4a)

Mark Johnston [Tue, 13 Apr 2021 21:40:01 +0000 (17:40 -0400)]
kmem: Add KASAN state transitions

Memory allocated with kmem_* is unmapped upon free, so KASAN doesn't
provide a lot of benefit, but since allocations are always a multiple of
the page size we can create a redzone when the allocation request size
is not a multiple of the page size.
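
The size of that redzone is simple rounding arithmetic. A minimal sketch, assuming 4 KB pages; `TOY_PAGE_SIZE`, `toy_round_page()` and `toy_kmem_redzone()` are illustrative names rather than kernel interfaces.

```c
#include <stddef.h>

#define TOY_PAGE_SIZE	4096	/* assumption: 4 KB pages */

/* Round size up to the next multiple of the page size. */
static size_t
toy_round_page(size_t size)
{
	return ((size + TOY_PAGE_SIZE - 1) & ~(size_t)(TOY_PAGE_SIZE - 1));
}

/*
 * Bytes between the end of the requested size and the end of the
 * page-rounded allocation.  This tail is what can be marked invalid.
 */
static size_t
toy_kmem_redzone(size_t req)
{
	return (toy_round_page(req) - req);
}
```

A request that is already a multiple of the page size leaves no tail, so no redzone is available for it in this model.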

Sponsored by: The FreeBSD Foundation

(cherry picked from commit 2b914b85ddf4c25d112b2639bbbb7618641872b4)

Mark Johnston [Tue, 13 Apr 2021 21:39:55 +0000 (17:39 -0400)]
kstack: Add KASAN state transitions

We allocate kernel stacks using a UMA cache zone.  Cache zones have
KASAN disabled by default, but in this case it makes sense to enable it.

Reviewed by: andrew

(cherry picked from commit 244f3ec642ed99a371c97b946b93b877d8be1756)

Mark Johnston [Tue, 13 Apr 2021 21:39:50 +0000 (17:39 -0400)]
uma: Add KASAN state transitions

- Add a UMA_ZONE_NOKASAN flag to indicate that items from a particular
  zone should not be sanitized.  This is applied implicitly for NOFREE
  and cache zones.
- Add KASAN callbacks which get invoked:
  1) when a slab is imported into a keg
  2) when an item is allocated from a zone
  3) when an item is freed to a zone
  4) when a slab is freed back to the VM

  In state transitions 1 and 3, memory is poisoned so that accesses will
  trigger a panic.  In state transitions 2 and 4, memory is marked
  valid.
- Disable trashing if KASAN is enabled.  It just adds extra CPU overhead
  to catch problems that are detected by KASAN.
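
The four state transitions listed above can be written down as a small lookup. This is only a restatement of the list in C; the enum and function names are hypothetical and do not appear in UMA.

```c
#include <stdbool.h>

/* The four points at which the callbacks are invoked, in list order. */
enum toy_uma_event {
	TOY_SLAB_IMPORT,	/* 1) slab imported into a keg */
	TOY_ITEM_ALLOC,		/* 2) item allocated from a zone */
	TOY_ITEM_FREE,		/* 3) item freed to a zone */
	TOY_SLAB_FREE		/* 4) slab freed back to the VM */
};

/* True if the memory is poisoned after the event, false if marked valid. */
static bool
toy_poisoned_after(enum toy_uma_event ev)
{
	switch (ev) {
	case TOY_SLAB_IMPORT:
	case TOY_ITEM_FREE:
		return (true);	/* accesses should now be reported */
	case TOY_ITEM_ALLOC:
	case TOY_SLAB_FREE:
		return (false);	/* memory is legitimately usable again */
	}
	return (false);
}
```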

Sponsored by: The FreeBSD Foundation

(cherry picked from commit 09c8cb717d214d03e51b3e4f8e9997b9f4e1624d)

Mark Johnston [Tue, 13 Apr 2021 21:39:35 +0000 (17:39 -0400)]
amd64: Add MD bits for KASAN

- Initialize KASAN before executing SYSINITs.
- Add a GENERIC-KASAN kernel config, akin to GENERIC-KCSAN.
- Increase the kernel stack size if KASAN is enabled.  Some of the
  ASAN instrumentation increases stack usage and it's enough to
  trigger stack overflows in ZFS.
- Mark the trapframe as valid in interrupt handlers if it is
  assigned to td_intr_frame.  Otherwise, an interrupt in a function
  which creates a poisoned alloca region can trigger false positives.

Sponsored by: The FreeBSD Foundation

(cherry picked from commit f115c0612131d8f939f6f357f57bdd85bd6a59de)

Mark Johnston [Tue, 13 Apr 2021 20:30:05 +0000 (16:30 -0400)]
amd64: Implement a KASAN shadow map

The idea behind KASAN is to use a region of memory to track the validity
of buffers in the kernel map.  This region is the shadow map.  The
compiler inserts calls to the KASAN runtime for every emitted load
and store, and the runtime uses the shadow map to decide whether the
access is valid.  Various kernel allocators call kasan_mark() to update
the shadow map.

Since the shadow map tracks only accesses to the kernel map, accesses to
other kernel maps are not validated by KASAN.  UMA_MD_SMALL_ALLOC is
disabled when KASAN is configured to reduce usage of the direct map.
Currently we have no mechanism to completely eliminate uses of the
direct map, so KASAN's coverage is not comprehensive.

The shadow map uses one byte per eight bytes in the kernel map.  In
pmap_bootstrap() we create an initial set of page tables for the kernel
and preloaded data.
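
The one-to-eight ratio implies a simple address translation. The sketch below assumes a linear mapping and uses made-up base addresses; `TOY_KMAP_BASE`, `TOY_SHADOW_BASE` and both functions are hypothetical and are not the amd64 layout constants.

```c
#include <stdint.h>

#define TOY_SHADOW_SCALE_SHIFT	3	/* one shadow byte per 8 bytes */

/* Hypothetical layout constants, chosen only for illustration. */
#define TOY_KMAP_BASE	UINT64_C(0xfffffe0000000000)
#define TOY_SHADOW_BASE	UINT64_C(0xfffff78000000000)

/* Address of the shadow byte describing the 8-byte granule holding addr. */
static uint64_t
toy_shadow_addr(uint64_t addr)
{
	return (TOY_SHADOW_BASE +
	    ((addr - TOY_KMAP_BASE) >> TOY_SHADOW_SCALE_SHIFT));
}

/* Bytes of shadow needed to cover len bytes of kernel map, rounded up. */
static uint64_t
toy_shadow_size(uint64_t len)
{
	return ((len + 7) >> TOY_SHADOW_SCALE_SHIFT);
}
```

Under these assumptions, growing the kernel map by 2 MB requires 256 KB of additional shadow memory.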

When pmap_growkernel() is called, we call kasan_shadow_map() to extend
the shadow map.  kasan_shadow_map() uses pmap_kasan_enter() to allocate
memory for the shadow region and map it.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29417

(cherry picked from commit 6faf45b34b14da5f138774b43ec14fb5567ac584)

Mark Johnston [Tue, 13 Apr 2021 21:39:19 +0000 (17:39 -0400)]
Add the KASAN runtime

KASAN enables the use of LLVM's AddressSanitizer in the kernel.  This
feature makes use of compiler instrumentation to validate memory
accesses in the kernel and detect several types of bugs, including
use-after-frees and out-of-bounds accesses.  It is particularly
effective when combined with test suites or syzkaller.  KASAN has high
CPU and memory usage overhead and so is not suited for production
environments.

The runtime and pmap maintain a shadow of the kernel map to store
information about the validity of memory mapped at a given kernel
address.

The runtime implements a number of functions defined by the compiler
ABI.  These are prefixed by __asan.  The compiler emits calls to
__asan_load*() and __asan_store*() around memory accesses, and the
runtime consults the shadow map to determine whether a given access is
valid.

kasan_mark() is called by various kernel allocators to update state in
the shadow map.  Updates to those allocators will come in subsequent
commits.
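
To make the mark-then-check cycle concrete, here is a toy userspace model over a 256-byte arena. It uses the usual AddressSanitizer shadow encoding (0 means the whole 8-byte granule is valid, 1 to 7 means only that many leading bytes are valid). `TOY_POISON`, `toy_mark()` and `toy_access_ok()` are hypothetical and are not the kernel runtime's functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SCALE	8
#define TOY_ARENA_SIZE	256
#define TOY_POISON	0xFF	/* hypothetical "invalid" shadow code */

static uint8_t toy_shadow[TOY_ARENA_SIZE / TOY_SCALE];

/* Start with every byte of the arena marked invalid. */
static void
toy_poison_all(void)
{
	memset(toy_shadow, TOY_POISON, sizeof(toy_shadow));
}

/*
 * Mark [off, off + size) valid and the rest of [off, off + redzsize)
 * invalid.  off and redzsize must be multiples of TOY_SCALE.
 */
static void
toy_mark(size_t off, size_t size, size_t redzsize)
{
	uint8_t *s = &toy_shadow[off / TOY_SCALE];
	size_t n = redzsize / TOY_SCALE;

	for (size_t i = 0; i < n; i++) {
		if (size >= (i + 1) * TOY_SCALE)
			s[i] = 0;		/* whole granule valid */
		else if (size > i * TOY_SCALE)
			s[i] = (uint8_t)(size - i * TOY_SCALE);	/* partial */
		else
			s[i] = TOY_POISON;	/* redzone */
	}
}

/* True if every byte of [off, off + len) is valid; stays inside the arena. */
static bool
toy_access_ok(size_t off, size_t len)
{
	for (size_t a = off; a < off + len; a++) {
		uint8_t code = toy_shadow[a / TOY_SCALE];

		if (code == 0)
			continue;
		if (code >= TOY_SCALE || (a % TOY_SCALE) >= code)
			return (false);
	}
	return (true);
}
```

An allocator in this model would call `toy_mark(off, requested, allocated)` on allocation, after which an access one byte past the requested size is rejected even though it still lies inside the allocated buffer.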

The runtime also defines various interceptors.  Some low-level routines
are implemented in assembly and are thus not amenable to compiler
instrumentation.  To handle this, the runtime implements these routines
on behalf of the rest of the kernel.  The sanitizer implementation
validates memory accesses manually before handing off to the real
implementation.

The sanitizer in a KASAN-configured kernel can be disabled by setting
the loader tunable debug.kasan.disable=1.

Obtained from: NetBSD
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 38da497a4dfcf1979c8c2b0e9f3fa0564035c147)

Mark Johnston [Tue, 13 Apr 2021 20:29:47 +0000 (16:29 -0400)]
Add a KASAN option to the kernel build

LLVM support for enabling KASAN has not yet landed so the option is not
yet usable, but hopefully this will change soon.

Reviewed by: imp, andrew
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 01028c736cbcdba079967c787bee1551fc8439aa)

Mark Johnston [Sat, 16 Oct 2021 13:46:55 +0000 (09:46 -0400)]
timecounter: Lock the timecounter list

Timecounter registration is dynamic, i.e., there is no requirement that
timecounters must be registered during single-threaded boot.  Loadable
drivers may in principle register timecounters (which can be switched to
automatically).  Timecounters cannot be unregistered, though this could
be implemented.

Registered timecounters belong to a global linked list.  Add a mutex to
synchronize insertions and the traversals done by (mpsafe) sysctl
handlers.  No functional change intended.

Reviewed by: imp, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 621fd9dcb2d83daab477c130bc99b905f6fc27dc)

Mark Johnston [Tue, 21 Sep 2021 15:36:55 +0000 (11:36 -0400)]
cpuset(9): Add CPU_FOREACH_IS(SET|CLR) and modify consumers to use it

This implementation is faster and doesn't modify the cpuset, so it lets
us avoid some unnecessary copying as well.  No functional change
intended.

This is a re-application of commit
9068f6ea697b1b28ad1326a4c7a9ba86f08b985e.

Reviewed by: cem, kib, jhb
Sponsored by: The FreeBSD Foundation

(cherry picked from commit de8554295b47475e758a573ab7418265f21fee7e)

Mark Johnston [Sat, 16 Oct 2021 13:38:26 +0000 (09:38 -0400)]
bitset: Reimplement BIT_FOREACH_IS(SET|CLR)

Eliminate the nested loops and re-implement following a suggestion from
rlibby.

Add some simple regression tests.

Reviewed by: rlibby, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 51425cb2107c07ff379639edfbad65c77b55c3b8)

Mark Johnston [Tue, 21 Sep 2021 15:39:49 +0000 (11:39 -0400)]
clang-format: Add bitset loop macros

Sponsored by: The FreeBSD Foundation

(cherry picked from commit a3e3d90863f3af81bca485468814a787206a235d)

2 years agobitset(9): Introduce BIT_FOREACH_ISSET and BIT_FOREACH_ISCLR
Mark Johnston [Tue, 21 Sep 2021 15:32:23 +0000 (11:32 -0400)]
bitset(9): Introduce BIT_FOREACH_ISSET and BIT_FOREACH_ISCLR

These allow one to non-destructively iterate over the set or clear bits
in a bitset.  The motivation is that we have several code fragments
which iterate over a CPU set like this:

	while ((cpu = CPU_FFS(&cpus)) != 0) {
		cpu--;
		CPU_CLR(cpu, &cpus);
		<do something>;
	}

This is slow since CPU_FFS begins the search at the beginning of the
bitset each time.  On amd64 and arm64, CPU sets have size 256, so there
are four limbs in the bitset and we do a lot of unnecessary scanning.

A second problem is that this is destructive, so code which needs to
preserve the original set has to make a copy.  In particular, we have
quite a few functions which take a cpuset_t parameter by value, meaning
that each call has to copy the 32-byte cpuset_t.

The new macros address both problems.
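
As a rough user-space illustration of the idea (not the bitset(9) macros themselves), the sketch below walks a 256-bit set limb by limb and works on a local copy of each limb, so the set is never modified and no bit is scanned twice. The names toy_bitset, toy_set() and toy_collect_set() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NLIMBS	4	/* 4 x 64 = 256 bits, matching the size quoted above */
#define TOY_LIMB_BITS	64

struct toy_bitset {
	uint64_t limb[TOY_NLIMBS];
};

static void
toy_set(struct toy_bitset *s, size_t bit)
{
	assert(bit < TOY_NLIMBS * TOY_LIMB_BITS);
	s->limb[bit / TOY_LIMB_BITS] |= UINT64_C(1) << (bit % TOY_LIMB_BITS);
}

/* Index of the lowest set bit; w must be non-zero. */
static unsigned
toy_ctz64(uint64_t w)
{
	unsigned n = 0;

	assert(w != 0);
	while ((w & 1) == 0) {
		w >>= 1;
		n++;
	}
	return (n);
}

/*
 * Store the indices of all set bits in out[] (at most max of them)
 * and return how many were stored.  The set itself is left untouched.
 */
static size_t
toy_collect_set(const struct toy_bitset *s, size_t *out, size_t max)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_NLIMBS; i++) {
		uint64_t w = s->limb[i];	/* local copy of this limb */

		while (w != 0 && n < max) {
			out[n++] = i * TOY_LIMB_BITS + toy_ctz64(w);
			w &= w - 1;		/* clear the lowest set bit */
		}
	}
	return (n);
}
```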

Reviewed by: cem, kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit dfd3bde5775ecf88851d5dffd6a8ed6076b53566)

2 years agosignal: Add SIG_FOREACH and refactor issignal()
Mark Johnston [Sat, 16 Oct 2021 13:44:40 +0000 (09:44 -0400)]
signal: Add SIG_FOREACH and refactor issignal()

Add a SIG_FOREACH macro that can be used to iterate over a signal set.
This is a bit cleaner and more efficient than calling sig_ffs() in a
loop.  The implementation is based on BIT_FOREACH_ISSET(), except
that the bitset limbs are always 32 bits wide, and signal sets are
1-indexed rather than 0-indexed like bitset(9) sets.
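
The two differences named above (32-bit limbs and 1-based numbering) can be illustrated with a small iterator. This is a sketch under assumptions, not the SIG_FOREACH macro: the set size of four words and the names toy_sigset, toy_sigadd(), toy_sigiter_init() and toy_sigiter_next() are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SIG_WORDS	4	/* assumed size: 4 x 32 = 128 signals */

struct toy_sigset {
	uint32_t bits[TOY_SIG_WORDS];
};

struct toy_sigiter {
	const struct toy_sigset *set;
	uint32_t pending;	/* unvisited bits of the current word */
	int word;		/* index of the current word */
};

/* Signal numbers start at 1, so signal N lives at bit N - 1. */
static void
toy_sigadd(struct toy_sigset *s, int sig)
{
	assert(sig >= 1 && sig <= TOY_SIG_WORDS * 32);
	s->bits[(sig - 1) / 32] |= UINT32_C(1) << ((sig - 1) % 32);
}

static void
toy_sigiter_init(struct toy_sigiter *it, const struct toy_sigset *set)
{
	it->set = set;
	it->word = 0;
	it->pending = set->bits[0];
}

/* Return the next pending signal number, or 0 once the set is exhausted. */
static int
toy_sigiter_next(struct toy_sigiter *it)
{
	for (;;) {
		if (it->pending != 0) {
			int bit = 0;

			while (((it->pending >> bit) & 1) == 0)
				bit++;
			it->pending &= it->pending - 1;	/* drop that bit */
			return (it->word * 32 + bit + 1);
		}
		if (it->word + 1 >= TOY_SIG_WORDS)
			return (0);
		it->word++;
		it->pending = it->set->bits[it->word];
	}
}
```

Because the iterator keeps its own copy of the current word, the signal set is not modified and each word is read only once.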

issignal() cannot really be modified to use SIG_FOREACH() directly.
Take this opportunity to split the function into two explicit loops.
I've always found this function hard to read and think that this change
is an improvement.

Remove sig_ffs(), since nothing uses it now.

Reviewed by: kib
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 81f2e9063d64cc976b47e7ee1e9c35692cda7cb4)

2 years agosort: Fix random sort
Mark Johnston [Fri, 29 Oct 2021 18:25:42 +0000 (14:25 -0400)]
sort: Fix random sort

bwsrawdata() is supposed to return the string buffer.

PR: 259451
Reported by: sigsys@gmail.com
Fixes: d053fb22f6d3 ("usr.bin/sort: Avoid UBSan errors")
Sponsored by: The FreeBSD Foundation

(cherry picked from commit e9bfb50d5e7aa5d673a5a35318820320c4190d33)

2 years agohyperv: Register hyperv_timecounter later during boot
Mark Johnston [Mon, 25 Oct 2021 17:08:38 +0000 (13:08 -0400)]
hyperv: Register hyperv_timecounter later during boot

Previously the MSR-based timecounter was registered during
SI_SUB_HYPERVISOR, i.e., very early during boot, and before SI_SUB_LOCK.
After commit 621fd9dcb2d8 this triggers a panic since the timecounter
list lock is not yet initialized.

The hyperv timecounter does not need to be registered so early, so defer
that to SI_SUB_DRIVERS, at the same time the hyperv TSC timecounter is
registered.

Reported by: whu
Approved by: whu
Fixes: 621fd9dcb2d8 ("timecounter: Lock the timecounter list")
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 9ef7df022a467776aa616b92fe5783e4261e84c6)

2 years agonfscl: Handle NFSv4.1/4.2 Close RPC NFSERR_DELAY replies better
Rick Macklem [Mon, 18 Oct 2021 22:02:21 +0000 (15:02 -0700)]
nfscl: Handle NFSv4.1/4.2 Close RPC NFSERR_DELAY replies better

Without this patch, if an NFSv4.1/4.2 server replies NFSERR_DELAY to
a Close operation, the client loops retrying the Close while holding
a shared lock on the clientID.  This shared lock blocks returns of
delegations, even though the server has issued a CB_RECALL to request
the delegation return.

This patch delays doing a retry of a Close that received a reply of
NFSERR_DELAY until after the shared lock on the clientID is released,
for NFSv4.1/4.2.  To fix this for NFSv4.0 would be very difficult and
since the only known NFSv4 server to reply NFSERR_DELAY to Close only
does NFSv4.1/4.2, this fix is hoped to be sufficient.

This problem was detected during a recent IETF working group NFSv4
testing event.

(cherry picked from commit 52dee2bc035545f7ae2b838d8a0449f65043cd8a)

2 years agonfscl: Modify Close RPC so that it does not use "owner" for NFSv4.1/4.2
Rick Macklem [Mon, 18 Oct 2021 00:50:56 +0000 (17:50 -0700)]
nfscl: Modify Close RPC so that it does not use "owner" for NFSv4.1/4.2

This patch modifies the function that does the Close RPC (nfsrpc_closerpc)
so that it does not use the open_owner (nfso_own) for NFSv4.1/4.2.
Use of the seqid in the open_owner structure is only needed for NFSv4.0.
The same applies to an NFSERR_STALESTATEID reply, which should only happen
for NFSv4.0.  This allows nfsrpc_closerpc() to be called when nfso_own
is no longer valid.  This, in turn, allows nfsrpc_closerpc() to be called
after the shared lock on the clientID is released, for NFSv4.1/4.2.

This is being done to prepare the code for a future patch that fixes
the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close
operation.

(cherry picked from commit d95c0a12a2dd58b4b13cbc2d1a9fccd848f8ac5e)

2 years agosystat: Handle SIGWINCH to properly support window resizing and adjust -swap disk stat based...
Michael Reifenberger [Wed, 21 Apr 2021 18:31:58 +0000 (20:31 +0200)]
systat: Handle SIGWINCH to properly support window resizing and adjust -swap disk stat based on new size.

(cherry picked from commit 66483838039b21a20d748448f8916a73ec419691)

2 years agoAugment systat(1) -swap to display large swap space processes
Konstantin Belousov [Tue, 26 Oct 2021 08:43:08 +0000 (11:43 +0300)]
Augment systat(1) -swap to display large swap space processes

(cherry picked from commit 57e5da2c98003e5ab77a337e9fbe22ab7e512ba7)

2 years agolibutil: add kinfo_getswapvmobject(3)
Konstantin Belousov [Tue, 26 Oct 2021 08:40:10 +0000 (11:40 +0300)]
libutil: add kinfo_getswapvmobject(3)

(cherry picked from commit f2069331e5821f4c2b65d82af2809946a34158d2)

2 years agosysctl vm.objects: yield if hog
Konstantin Belousov [Fri, 7 May 2021 22:13:29 +0000 (01:13 +0300)]
sysctl vm.objects: yield if hog

(cherry picked from commit 350fc36b4cf896cbfce657a6dab600b26367a34a)

2 years agovm.objects_swap: disable reporting some information
Konstantin Belousov [Tue, 13 Jul 2021 10:34:31 +0000 (13:34 +0300)]
vm.objects_swap: disable reporting some information

(cherry picked from commit 7738118e9a298a205b37c256245fd8449acccb0c)

2 years agoAdd vm.swap_objects sysctl
Konstantin Belousov [Tue, 13 Jul 2021 10:27:36 +0000 (13:27 +0300)]
Add vm.swap_objects sysctl

(cherry picked from commit 42812ccc969f174b3e5827c1c320b1738a1e0985)

2 years agovm_object_list: split sysctl handler in separate function
Konstantin Belousov [Tue, 13 Jul 2021 10:23:25 +0000 (13:23 +0300)]
vm_object_list: split sysctl handler in separate function

(cherry picked from commit 1b610624fdc851f54871f7ee4d67642f5879096f)

2 years agoMakefile.inc1: Remove mentions of removed target "update"
Mateusz Piotrowski [Sun, 24 Oct 2021 19:03:22 +0000 (21:03 +0200)]
Makefile.inc1: Remove mentions of removed target "update"

This is a follow-up to commits e290182bcf38 and 1f7d11e636ab.

(cherry picked from commit eab5358b90804669681b639f76ff7e5707e27138)

2 years agoconfig(5): Update upper limit for maxusers on 64-bit systems
Felix Johnson [Thu, 28 Oct 2021 18:15:08 +0000 (14:15 -0400)]
config(5): Update upper limit for maxusers on 64-bit systems

The limit of 384 maxusers for auto configuration was only imposed on
32-bit systems. Document that maxusers scales above 384 based on memory
for 64-bit systems.

PR: 204938
Reported by: David Höppner <0xffea@gmail.com>

(cherry picked from commit 191c624d9519a2767801de390b192ee7a96b41cd)

2 years agoRevert "bhyve: Map the MSI-X table unconditionally for passthrough"
Mark Johnston [Sun, 31 Oct 2021 13:59:59 +0000 (09:59 -0400)]
Revert "bhyve: Map the MSI-X table unconditionally for passthrough"

This reverts commit 382eec24c0284bd7dc5997b85abc9ee70ea704a1.

This change causes a regression where a VM using passthrough no longer
starts.  Until this is resolved, revert the commit.

Reported by: Raúl Muñoz <raul.munoz@custos.es>

2 years agoRevert "bhyve: Fix the WITH_BHYVE_SNAPSHOT build"
Mark Johnston [Sun, 31 Oct 2021 13:59:50 +0000 (09:59 -0400)]
Revert "bhyve: Fix the WITH_BHYVE_SNAPSHOT build"

This reverts commit 000b70f038f4fd6893d69bd3dce75a416cd13dfe.

2 years agosh: Set PATH envvar after setting HOME in dotfile
Ka Ho Ng [Tue, 26 Oct 2021 14:48:57 +0000 (22:48 +0800)]
sh: Set PATH envvar after setting HOME in dotfile

In single-user mode, all env vars are absent, so exptilde() would not be
able to expand ~ correctly.
Place the lines setting PATH below HOME, so exptilde() would work as
expected.

Sponsored by: The FreeBSD Foundation
Reviewed by: jilles, emaste
Differential Revision: https://reviews.freebsd.org/D27003

(cherry picked from commit fcfa64801a4fe836ff481465ea068e791aa4ce6a)

2 years agobhyve: Fix the WITH_BHYVE_SNAPSHOT build
Mark Johnston [Sat, 16 Oct 2021 17:13:26 +0000 (13:13 -0400)]
bhyve: Fix the WITH_BHYVE_SNAPSHOT build

Note, this breaks compatibility with snapshots generated by older builds
of bhyve(8).

Fixes: 7fa233534736 ("bhyve: Map the MSI-X table unconditionally for passthrough")
Reported by: Greg V <greg@unrelenting.technology>
Reviewed by: grehan, bz
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 77bc75c7abd29de69d3ef35b66c23c7baba95094)

2 years agobhyve: Map the MSI-X table unconditionally for passthrough
Mark Johnston [Sat, 9 Oct 2021 15:36:19 +0000 (11:36 -0400)]
bhyve: Map the MSI-X table unconditionally for passthrough

It is possible for the PBA to reside in the same page as the MSI-X
table.  And, while devices are not supposed to do this, at least some
Intel wifi devices place registers in a page shared with the MSI-X
table.  To handle the first case we currently map the PBA page using
/dev/mem, and the second case is not handled.

Kill two birds with one stone: map the MSI-X table BAR using the
PCIOCBARMMAP ioctl instead of /dev/mem, and map the entire table so that
accesses beyond the bounds of the table can be emulated.  Regions of the
BAR not containing the table are left unmapped.

Reviewed by: bz, grehan, jhb
Sponsored by: The FreeBSD Foundation

(cherry picked from commit 7fa2335347362378322a4d27cb40f6e6cd5dd0fb)

2 years agobxe(4): Fix a few common typos in source code comments
Gordon Bergling [Wed, 27 Oct 2021 04:15:06 +0000 (06:15 +0200)]
bxe(4): Fix a few common typos in source code comments

- s/controled/controlled/
- s/allignment/alignment/

(cherry picked from commit 80abcfbdfe1af72318c2c0b1690013f43e875267)

2 years agojail(8): Fix a few common typos in source code comments
Gordon Bergling [Wed, 27 Oct 2021 04:16:06 +0000 (06:16 +0200)]
jail(8): Fix a few common typos in source code comments

- s/phyiscal/physical/

(cherry picked from commit 70de1003da6f6e78e32f92bd98c9f18f965e6663)

2 years agonfscl: Move release of the clientID lock into nfscl_doclose()
Rick Macklem [Sat, 16 Oct 2021 22:49:38 +0000 (15:49 -0700)]
nfscl: Move release of the clientID lock into nfscl_doclose()

This patch moves release of the shared clientID lock from nfsrpc_close(),
just after the nfscl_doclose() call, to the end of nfscl_doclose().
This does make the code cleaner, since the shared lock is acquired at
the beginning of nfscl_doclose().  The only semantic change is that
the code no longer drops and reacquires the NFSCLSTATELOCK() mutex,
which I do not believe will have a negative effect on the NFSv4 client.

This is being done to prepare the code for a future patch that fixes
the case where an NFSv4.1/4.2 server replies NFSERR_DELAY to a Close
operation.

(cherry picked from commit e2aab5e2d73486aa76bb861d583bbce021661601)

2 years agoiscsi: Abort data-out tasks queued on a terminating session.
John Baldwin [Wed, 15 Sep 2021 20:25:30 +0000 (13:25 -0700)]
iscsi: Abort data-out tasks queued on a terminating session.

cfiscsi_datamove_out() can race with cfiscsi_session_terminate_tasks()
and enqueue a new task after the latter function has aborted existing
tasks.  This could result in a deadlock as
cfiscsi_session_terminate_tasks() waited forever for this task to
complete.

Reviewed by: mav
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31892

(cherry picked from commit 0cd6e85e242bb07a33df9a6314e90bcb0ba99576)

2 years agoiscsi: Add a helper routine to abort a data-out task.
John Baldwin [Wed, 15 Sep 2021 20:25:04 +0000 (13:25 -0700)]
iscsi: Add a helper routine to abort a data-out task.

Reviewed by: mav
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31891

(cherry picked from commit 529364b032d774bff4dc818ff23d20be482f9d99)

2 years agoctld: Disable TCP DDP for connection sockets.
John Baldwin [Mon, 13 Sep 2021 16:57:54 +0000 (09:57 -0700)]
ctld: Disable TCP DDP for connection sockets.

cxgbei is not able to offload PDU processing for a socket using TCP
DDP offload.

Sponsored by: Chelsio Communications

(cherry picked from commit 3b5f95d7bd20e366d720a47a79c451ae037a3ae1)

2 years agoiscsid: Disable TCP DDP for connection sockets.
John Baldwin [Mon, 13 Sep 2021 16:57:54 +0000 (09:57 -0700)]
iscsid: Disable TCP DDP for connection sockets.

cxgbei is not able to offload PDU processing for a socket using TCP
DDP offload.

Sponsored by: Chelsio Communications

(cherry picked from commit 91c62d626d0e9995da9dc424120a4f1b0b987eea)

2 years agocxgbei: Only convert "plain" TCP connections to ISCSI.
John Baldwin [Mon, 13 Sep 2021 16:57:54 +0000 (09:57 -0700)]
cxgbei: Only convert "plain" TCP connections to ISCSI.

Reject attempts to convert a connection using a different ULP
mode (e.g., DDP or TLS) to ISCSI.

Reported by: Jithesh Arakkan @ Chelsio
Sponsored by: Chelsio Communications

(cherry picked from commit f63ddf465fe09d3547deaf80fbdb91bc7b816dfb)

2 years agocxgbei: Return early for EBUSY error in icl_cxgbei_conn_handoff.
John Baldwin [Mon, 13 Sep 2021 16:57:54 +0000 (09:57 -0700)]
cxgbei: Return early for EBUSY error in icl_cxgbei_conn_handoff.

This permits unindenting almost half of the function.

Sponsored by: Chelsio Communications

(cherry picked from commit b7caa8157602f4eb9acd2729b48ba3a0c0cdc045)

2 years agocxgbei: Disable ISO for -SO cards without external memory.
John Baldwin [Mon, 13 Sep 2021 16:57:54 +0000 (09:57 -0700)]
cxgbei: Disable ISO for -SO cards without external memory.

Reported by: Jithesh Arakkan @ Chelsio
Sponsored by: Chelsio Communications

(cherry picked from commit 9b1bb0aee697352b39b3efa1843f581ca29068ba)

2 years agocxgbei: Handle errors in PDUs.
John Baldwin [Fri, 10 Sep 2021 22:10:00 +0000 (15:10 -0700)]
cxgbei: Handle errors in PDUs.

When a PDU with an error (bad padding, header digest, or data digest)
is received, log the error via ICL_WARN() and then reset the
connection via the ic_error callback.

While here, add per-rxq counters for errors.

Sponsored by: Chelsio Communications

(cherry picked from commit 4d4cf62e29b06a763dfa8b218de38c8d2cf051bb)

2 years agocxgbei: Add sysctls to report the maximum data segment lengths.
John Baldwin [Mon, 30 Aug 2021 22:55:40 +0000 (15:55 -0700)]
cxgbei: Add sysctls to report the maximum data segment lengths.

These sysctls report the maximum data segment lengths supported by an
adapter.  These are the values advertised to the remote end during the
login phase.

Sponsored by: Chelsio Communications

(cherry picked from commit d39e65b5bdc04cac4521ad8e071015cd751c2302)

2 years agocxgbei: Limit T5 transmit data segments to 15k.
John Baldwin [Mon, 30 Aug 2021 22:27:08 +0000 (15:27 -0700)]
cxgbei: Limit T5 transmit data segments to 15k.

This avoids exceeding a limit in the firmware when using ISO with
jumbo frames.

Reported by: Jithesh Arakkan @ Chelsio
Sponsored by: Chelsio Communications

(cherry picked from commit 64f09f2346650f02b6deccbe05bb02b88fce4a5e)

2 years agoiscsi: Teach the iSCSI stack about "large" received PDUs.
John Baldwin [Wed, 18 Aug 2021 17:56:28 +0000 (10:56 -0700)]
iscsi: Teach the iSCSI stack about "large" received PDUs.

When using iSCSI PDU offload (cxgbei) on T6 adapters, a burst of
received PDUs can be reported via a single message to the driver.

Previously the driver passed these multi-PDU bursts up to the iSCSI
stack as a single "large" PDU by rewriting the buffer offset, data
segment length, and DataSN fields in the iSCSI header.  The DataSN
field in particular was rewritten so that each of the "large" PDUs
used consecutively increasing values.  While this worked, the forged
DataSN values did not match the ExpDataSN value in the subsequent SCSI
Response PDU.  The initiator does not currently verify this value, but
the forged DataSN values prevent adding a check.

To avoid this, allow a logical iSCSI PDU (struct icl_pdu) to describe
a burst of PDUs via a new 'ip_additional_pdus' field.  Normally this
field is set to zero when 'struct icl_pdu' represents a single PDU.
If a logical PDU represents a burst of on-the-wire PDUs, then
'ip_additional_pdus' contains the count of additional on-the-wire PDUs.
The header of this
"large" PDU is still modified, but the DataSN field now contains the
DataSN value of the first on-the-wire PDU in the burst.
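
To show why carrying the first DataSN plus a count makes a later check possible, here is a minimal sketch of DataSN accounting. It is hypothetical: the commit states that the initiator does not currently verify this value, and toy_icl_pdu and toy_check_datasn() are invented for the example; only the 'ip_additional_pdus' field name comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a logical PDU. */
struct toy_icl_pdu {
	uint32_t datasn;		/* DataSN of the first on-the-wire PDU */
	uint32_t ip_additional_pdus;	/* 0 for a single PDU */
};

/*
 * Check that a logical PDU starts at the expected DataSN.  On success,
 * advance the expected value by the number of on-the-wire PDUs the
 * logical PDU covers.  On mismatch, leave *expected unchanged.
 */
static bool
toy_check_datasn(uint32_t *expected, const struct toy_icl_pdu *pdu)
{
	assert(expected != NULL && pdu != NULL);
	if (pdu->datasn != *expected)
		return (false);
	*expected += 1 + pdu->ip_additional_pdus;
	return (true);
}
```

After the last data PDU of a transfer, the running value would be the one to compare against ExpDataSN in the SCSI Response PDU.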

Reviewed by: mav
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31577

(cherry picked from commit c261b6ea4e2ef1fc6a446443ee594ad76f392350)

2 years agocxgbei: Restrict received PDUs to 4 DDP pages in length.
John Baldwin [Tue, 17 Aug 2021 18:14:37 +0000 (11:14 -0700)]
cxgbei: Restrict received PDUs to 4 DDP pages in length.

Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31576

(cherry picked from commit d75b0870e542613e63d9f4ac8ec9fb22817e34fa)

2 years agocxgbei: Only round PDU data segment lengths down by 512 on T5.
John Baldwin [Tue, 17 Aug 2021 18:14:29 +0000 (11:14 -0700)]
cxgbei: Only round PDU data segment lengths down by 512 on T5.

Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31575

(cherry picked from commit f28715fdc1f7e801b260369787e7bcd633a481bb)

2 years agocxgbei: Restructure how PDU limits are managed.
John Baldwin [Tue, 17 Aug 2021 18:14:11 +0000 (11:14 -0700)]
cxgbei: Restructure how PDU limits are managed.

- Compute data segment limits in read_pdu_limits() rather than PDU
  length limits.

- Add back connection-specific PDU overhead lengths to compute PDU
  length limits in icl_cxgbei_conn_handoff().

Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31574

(cherry picked from commit cbc186360c658eda884ed97f37cdc2d1b6512b91)

2 years agocxgbei: Wait for the final CPL to be received in icl_cxgbei_conn_close.
John Baldwin [Thu, 12 Aug 2021 15:48:14 +0000 (08:48 -0700)]
cxgbei: Wait for the final CPL to be received in icl_cxgbei_conn_close.

A socket in the FIN_WAIT_1 state is marked disconnected by
do_close_con_rpl() even though there might still be receive data pending.
This is because the socket at that point has set SBS_CANTRCVMORE which
causes the protocol layer to discard any data received before the FIN.
However, icl_cxgbei_conn_close needs to wait until all the data has
been discarded.  Replace the wait for SS_ISDISCONNECTED with instead
waiting for final_cpl_received() to be called.

Reported by: Jithesh Arakkan @ Chelsio
Sponsored by: Chelsio Communications

(cherry picked from commit 2eb0e53a6b5ec1a72be70e966d4e562e1a8d4e88)

2 years agocxgbei: Support for ISO (iSCSI segmentation offload).
John Baldwin [Fri, 6 Aug 2021 21:21:37 +0000 (14:21 -0700)]
cxgbei: Support for ISO (iSCSI segmentation offload).

ISO can be disabled before establishing a connection by setting
dev.tNnex.N.toe.iso to 0.

Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31223

(cherry picked from commit 5b27e4b27caae840bd79ccc5cb7811a0c9acc656)

2 years agoiSCSI: Add support for segmentation offload for hardware offloads.
John Baldwin [Fri, 6 Aug 2021 21:03:00 +0000 (14:03 -0700)]
iSCSI: Add support for segmentation offload for hardware offloads.

Similar to TSO, iSCSI segmentation offload permits the upper layers to
submit a "large" virtual PDU which is split up into multiple segments
(PDUs) on the wire.  Similar to how the TCP/IP headers are used as
templates for TSO, the BHS at the start of a large PDU is used as a
template to construct the specific BHS at the start of each PDU.  In
particular, the DataSN is incremented for each subsequent PDU, and the
'F' flag is only set on the last PDU.

struct icl_conn has a new 'ic_hw_isomax' field which defaults to 0,
but can be set to the largest virtual PDU a backend supports.  If this
value is non-zero, the iSCSI target and initiator use this size
instead of 'ic_max_send_data_segment_length' to determine the maximum
size for SCSI Data-In and SCSI Data-Out PDUs.  Note that since PDUs
can be constructed from multiple buffers before being dispatched, the
target and initiator must wait for the PDU to be fully constructed
before determining how many DataSN values were consumed (and thus
updating the per-transfer DataSN value used for the start of the next
PDU).

The target generates large PDUs for SCSI Data-In PDUs in
cfiscsi_datamove_in().  The initiator generates large PDUs for SCSI
Data-Out PDUs generated in response to an R2T.
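
The template expansion described above can be sketched in software as follows. This only illustrates the arithmetic that an offload backend would perform in hardware: toy_bhs is a made-up, reduced header rather than the real BHS layout, toy_iso_split() is hypothetical, and advancing the buffer offset per segment is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced header; not the real BHS layout. */
struct toy_bhs {
	uint32_t datasn;
	uint32_t buffer_offset;
	uint32_t data_len;
	bool final;			/* the 'F' flag */
};

/*
 * Expand a "large" PDU header into per-segment headers of at most
 * max_seg bytes each.  Returns the number of headers written, which is
 * also the number of DataSN values consumed.
 */
static size_t
toy_iso_split(const struct toy_bhs *tmpl, uint32_t max_seg,
    struct toy_bhs *out, size_t max_out)
{
	uint32_t remaining = tmpl->data_len;
	uint32_t offset = 0;
	size_t n = 0;

	assert(max_seg > 0);
	while (remaining > 0 && n < max_out) {
		uint32_t len = remaining < max_seg ? remaining : max_seg;

		out[n] = *tmpl;		/* start from the template */
		out[n].datasn = tmpl->datasn + (uint32_t)n;
		out[n].buffer_offset = tmpl->buffer_offset + offset;
		out[n].data_len = len;
		remaining -= len;
		offset += len;
		/* Only the last segment may carry the 'F' flag. */
		out[n].final = tmpl->final && remaining == 0;
		n++;
	}
	return (n);
}
```

The return value is what the caller would add to its per-transfer DataSN once the large PDU has been fully constructed.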

Reviewed by: mav
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D31222

(cherry picked from commit f0594f52f6fdabecee134dd5700bf936283959ad)