Ulrich Spörlein [Wed, 30 Dec 2020 10:29:55 +0000 (11:29 +0100)]
THIS BRANCH IS OBSOLETE, PLEASE READ:
Dear all,
the FreeBSD project has switched their source of truth repository from
Subversion to Git on 2020-12-23. The previously published commit hashes
were missing required merge commits and were thus unsuitable to be the
basis of the source of truth repository.
They had to be changed (aka "force pushed"). We're sorry for the
inconvenience this causes.
The `master` branch will no longer see any updates and we will push to
`main` going forward. The `stable/X` and `releng/X` branches will see a
force-push directly.
To transplant your work, it is recommended to head over to the wiki at
https://github.com/freebsd/git_conv/wiki for instructions on how to
rebase or remerge your in-flight work. A short outline will be provided
below.
Please note that the actual git "trees" are of course identical between
the old and new conversions, so it's relatively easy to craft git
commits that merge or rebase the work w/o any merge conflicts. Please
reach out to git@FreeBSD.org or file an issue under
https://github.com/freebsd/git_conv/issues if you need assistance.
The mapping of the old to new commit hashes (for the same tree) are
given below. We have archived a copy of the legacy repo under
https://github.com/freebsd/freebsd-legacy which you can add as a remote
to always have a reference to the old `master` or `stable/X` branches
and names.
This is a merge commit that brings both histories together, giving you
common history ancestors, which should help with later merging.
You should be able to cleanly merge into this "legacy" master, and then
merge into "main" following from that. All the following commands assume
you've checked out your own workbranch.
$ git remote add freebsd-legacy https://github.com/freebsd/freebsd-legacy.git
$ git fetch --all
$ git merge freebsd-legacy/stable/X (to get to the legacy hashes from below)
$ git merge -s ours --allow-unrelated-histories <new-hash-from-below> (this is guaranteed conflict free)
$ git merge origin/stable/X
PLEASE NOTE: You'll end up with twice the history and git log output
will show old history twice and will likely confuse you. Please make an
effort to migrate your work over to a fresh branch based off of main.
A git replace --graft can be used to at least patch up git log output.
-- Rebase your work --
Only for folks that always rebase their local work on top of an origin
branch.
Again, we're sorry for the inconvenience, please reach out via GitHub
issues and the mailing list to get further help in transplanting your
work onto the one true, canonical, source of truth.
Good luck in 2021!
Ulrich Spörlein, on behalf of the FreeBSD Git working group.
gallatin [Sat, 19 Dec 2020 22:04:46 +0000 (22:04 +0000)]
Filter TCP connections to SO_REUSEPORT_LB listen sockets by NUMA domain
In order to efficiently serve web traffic on a NUMA
machine, one must avoid as many NUMA domain crossings as
possible. With SO_REUSEPORT_LB, a number of workers can share a
listen socket. However, even if a worker sets affinity to a core
or set of cores on a NUMA domain, it will receive connections
associated with all NUMA domains in the system. This will lead to
cross-domain traffic when the server writes to the socket or
calls sendfile(), and memory is allocated on the server's local
NUMA node, but transmitted on the NUMA node associated with the
TCP connection. Similarly, when the server reads from the socket,
he will likely be reading memory allocated on the NUMA domain
associated with the TCP connection.
This change provides a new socket ioctl, TCP_REUSPORT_LB_NUMA. A
server can now tell the kernel to filter traffic so that only
incoming connections associated with the desired NUMA domain are
given to the server. (Of course, in the case where there are no
servers sharing the listen socket on some domain, then as a
fallback, traffic will be hashed as normal to all servers sharing
the listen socket regardless of domain). This allows a server to
deal only with traffic that is local to its NUMA domain, and
avoids cross-domain traffic in most cases.
This patch, and a corresponding small patch to nginx to use
TCP_REUSPORT_LB_NUMA allows us to serve 190Gb/s of kTLS encrypted
https media content from dual-socket Xeons with only 13% (as
measured by pcm.x) cross domain traffic on the memory controller.
Andrew Gallatin [Sat, 19 Dec 2020 22:04:46 +0000 (22:04 +0000)]
Filter TCP connections to SO_REUSEPORT_LB listen sockets by NUMA domain
In order to efficiently serve web traffic on a NUMA
machine, one must avoid as many NUMA domain crossings as
possible. With SO_REUSEPORT_LB, a number of workers can share a
listen socket. However, even if a worker sets affinity to a core
or set of cores on a NUMA domain, it will receive connections
associated with all NUMA domains in the system. This will lead to
cross-domain traffic when the server writes to the socket or
calls sendfile(), and memory is allocated on the server's local
NUMA node, but transmitted on the NUMA node associated with the
TCP connection. Similarly, when the server reads from the socket,
he will likely be reading memory allocated on the NUMA domain
associated with the TCP connection.
This change provides a new socket ioctl, TCP_REUSPORT_LB_NUMA. A
server can now tell the kernel to filter traffic so that only
incoming connections associated with the desired NUMA domain are
given to the server. (Of course, in the case where there are no
servers sharing the listen socket on some domain, then as a
fallback, traffic will be hashed as normal to all servers sharing
the listen socket regardless of domain). This allows a server to
deal only with traffic that is local to its NUMA domain, and
avoids cross-domain traffic in most cases.
This patch, and a corresponding small patch to nginx to use
TCP_REUSPORT_LB_NUMA allows us to serve 190Gb/s of kTLS encrypted
https media content from dual-socket Xeons with only 13% (as
measured by pcm.x) cross domain traffic on the memory controller.
gallatin [Sat, 19 Dec 2020 21:46:09 +0000 (21:46 +0000)]
Optionally bind ktls threads to NUMA domains
When ktls_bind_thread is 2, we pick a ktls worker thread that is
bound to the same domain as the TCP connection associated with
the socket. We use roughly the same code as netinet/tcp_hpts.c to
do this. This allows crypto to run on the same domain as the TCP
connection is associated with. Assuming TCP_REUSPORT_LB_NUMA
(D21636) is in place & in use, this ensures that the crypto source
and destination buffers are local to the same NUMA domain as we're
running crypto on.
This change (when TCP_REUSPORT_LB_NUMA, D21636, is used) reduces
cross-domain traffic from over 37% down to about 13% as measured
by pcm.x on a dual-socket Xeon using nginx and a Netflix workload.
Andrew Gallatin [Sat, 19 Dec 2020 21:46:09 +0000 (21:46 +0000)]
Optionally bind ktls threads to NUMA domains
When ktls_bind_thread is 2, we pick a ktls worker thread that is
bound to the same domain as the TCP connection associated with
the socket. We use roughly the same code as netinet/tcp_hpts.c to
do this. This allows crypto to run on the same domain as the TCP
connection is associated with. Assuming TCP_REUSPORT_LB_NUMA
(D21636) is in place & in use, this ensures that the crypto source
and destination buffers are local to the same NUMA domain as we're
running crypto on.
This change (when TCP_REUSPORT_LB_NUMA, D21636, is used) reduces
cross-domain traffic from over 37% down to about 13% as measured
by pcm.x on a dual-socket Xeon using nginx and a Netflix workload.
gbe [Sat, 19 Dec 2020 14:54:28 +0000 (14:54 +0000)]
libc: Fix most issues reported by mandoc
- varios "new sentence, new line" warnings
- varios "sections out of conventional order" warnings
- varios "unusual Xr order" warnings
- varios "missing section argument" warnings
- varios "no blank before trailing delimiter" warnings
- varios "normalizing date format" warnings
Gordon Bergling [Sat, 19 Dec 2020 14:54:28 +0000 (14:54 +0000)]
libc: Fix most issues reported by mandoc
- varios "new sentence, new line" warnings
- varios "sections out of conventional order" warnings
- varios "unusual Xr order" warnings
- varios "missing section argument" warnings
- varios "no blank before trailing delimiter" warnings
- varios "normalizing date format" warnings
gbe [Sat, 19 Dec 2020 13:51:46 +0000 (13:51 +0000)]
zonectl(8): Fix a few issues reported by mandoc
- Add missing quotation mark for a comment above the .Dd
- inserting missing end of block: Sh breaks Bd
- skipping paragraph macro: Pp before Bl
- skipping paragraph macro: Pp before Bd
- empty block: Bd
Gordon Bergling [Sat, 19 Dec 2020 13:51:46 +0000 (13:51 +0000)]
zonectl(8): Fix a few issues reported by mandoc
- Add missing quotation mark for a comment above the .Dd
- inserting missing end of block: Sh breaks Bd
- skipping paragraph macro: Pp before Bl
- skipping paragraph macro: Pp before Bd
- empty block: Bd
gbe [Sat, 19 Dec 2020 13:36:59 +0000 (13:36 +0000)]
bluetooth: Fix a mandoc related issues
- new sentence, new line
- sections out of conventional order: Sh FILES
- unusual Xr order: bthost(1) after bthidd(8)
- no blank before trailing delimiter
- whitespace at end of input line
- sections out of conventional order: Sh EXIT STATUS
Gordon Bergling [Sat, 19 Dec 2020 13:36:59 +0000 (13:36 +0000)]
bluetooth: Fix a mandoc related issues
- new sentence, new line
- sections out of conventional order: Sh FILES
- unusual Xr order: bthost(1) after bthidd(8)
- no blank before trailing delimiter
- whitespace at end of input line
- sections out of conventional order: Sh EXIT STATUS
gbe [Sat, 19 Dec 2020 12:47:40 +0000 (12:47 +0000)]
ipfw(8): Fix a few mandoc related issues
- no blank before trailing delimiter
- missing section argument: Xr inet_pton
- skipping paragraph macro: Pp before Ss
- unusual Xr order: syslogd after sysrc
- tab in filled text
There were a few multiline NAT examples which used the .Dl macro with
tabs. I converted them to .Bd, which is a more suitable macro for that case.
Gordon Bergling [Sat, 19 Dec 2020 12:47:40 +0000 (12:47 +0000)]
ipfw(8): Fix a few mandoc related issues
- no blank before trailing delimiter
- missing section argument: Xr inet_pton
- skipping paragraph macro: Pp before Ss
- unusual Xr order: syslogd after sysrc
- tab in filled text
There were a few multiline NAT examples which used the .Dl macro with
tabs. I converted them to .Bd, which is a more suitable macro for that case.
gbe [Sat, 19 Dec 2020 11:47:38 +0000 (11:47 +0000)]
nvmecontrol(8): Fix a few mandoc related issues and add a SEE ALSO section
- inserting missing end of block: Ss breaks Bl
- skipping paragraph macro: Pp before Ss
- referenced manual not found: Xr nvme 4 (2 times)
- unknown standard specifier: St The
The macro .St can only be used for standards known by mdoc(7). So add a
SEE ALSO section and add a reference to the NVM Express Base Specification.
Gordon Bergling [Sat, 19 Dec 2020 11:47:38 +0000 (11:47 +0000)]
nvmecontrol(8): Fix a few mandoc related issues and add a SEE ALSO section
- inserting missing end of block: Ss breaks Bl
- skipping paragraph macro: Pp before Ss
- referenced manual not found: Xr nvme 4 (2 times)
- unknown standard specifier: St The
The macro .St can only be used for standards known by mdoc(7). So add a
SEE ALSO section and add a reference to the NVM Express Base Specification.
gbe [Sat, 19 Dec 2020 10:15:58 +0000 (10:15 +0000)]
bhnd_erom(9): Fix a few mandoc related issues
- skipping paragraph macro: Pp before Bl
- skipping paragraph macro: Pp after Ss
- skipping paragraph macro: Pp at the end of Ss
- unusual Xr punctuation: none before bhnd_driver_get_erom_class(9)
- unusual Xr punctuation: none before bus_space(9)
Gordon Bergling [Sat, 19 Dec 2020 10:15:58 +0000 (10:15 +0000)]
bhnd_erom(9): Fix a few mandoc related issues
- skipping paragraph macro: Pp before Bl
- skipping paragraph macro: Pp after Ss
- skipping paragraph macro: Pp at the end of Ss
- unusual Xr punctuation: none before bhnd_driver_get_erom_class(9)
- unusual Xr punctuation: none before bus_space(9)
gbe [Sat, 19 Dec 2020 10:11:37 +0000 (10:11 +0000)]
bhnd(9): Fix a few mandoc related issues
- skipping paragraph macro: Pp before Bl
- skipping paragraph macro: Pp at the end of Ss
- missing section argument: Xr device_set_desc
- unusual Xr punctuation: none before bhnd_erom(9)
Gordon Bergling [Sat, 19 Dec 2020 10:11:37 +0000 (10:11 +0000)]
bhnd(9): Fix a few mandoc related issues
- skipping paragraph macro: Pp before Bl
- skipping paragraph macro: Pp at the end of Ss
- missing section argument: Xr device_set_desc
- unusual Xr punctuation: none before bhnd_erom(9)
gbe [Sat, 19 Dec 2020 09:55:02 +0000 (09:55 +0000)]
disk(9): Fix a few mandoc related errors
- function name without markup: g_io_deliver()
- function name without markup: disk_gone()
- sections out of conventional order: Sh SEE ALSO
- referenced manual not found: Xr MAKE_DEV 9
Actually the man page of MAKE_DEV has never existed.
Gordon Bergling [Sat, 19 Dec 2020 09:55:02 +0000 (09:55 +0000)]
disk(9): Fix a few mandoc related errors
- function name without markup: g_io_deliver()
- function name without markup: disk_gone()
- sections out of conventional order: Sh SEE ALSO
- referenced manual not found: Xr MAKE_DEV 9
Actually the man page of MAKE_DEV has never existed.
rlibby [Sat, 19 Dec 2020 08:38:31 +0000 (08:38 +0000)]
rtld-elf: link udivmoddi4 from compiler_rt
This fixes the gcc9 build of rtld-elf32 on amd64, which needed an
implementation of udivmoddi4.
rtld-elf uses certain functions normally found in libc, and so it
includes certain files from libc in its own build. It has two
mechanisms to include files from libc: one that rebuilds source files in
the rtld-elf environment, and one that extracts object files from a
purpose-built no-SSP PIC archive.
In addition to libc functions, rtld-elf may need to link functions
normally found in libcompiler_rt (formerly libgcc). Now, add an ability
to rebuild libcompiler_rt source files in the rtld-elf environment. We
don't yet have a need for an object file extraction mechanism.
libcompiler_rt could also supply udivdi3 and umoddi3, but leave them
alone for now.
Ryan Libby [Sat, 19 Dec 2020 08:38:31 +0000 (08:38 +0000)]
rtld-elf: link udivmoddi4 from compiler_rt
This fixes the gcc9 build of rtld-elf32 on amd64, which needed an
implementation of udivmoddi4.
rtld-elf uses certain functions normally found in libc, and so it
includes certain files from libc in its own build. It has two
mechanisms to include files from libc: one that rebuilds source files in
the rtld-elf environment, and one that extracts object files from a
purpose-built no-SSP PIC archive.
In addition to libc functions, rtld-elf may need to link functions
normally found in libcompiler_rt (formerly libgcc). Now, add an ability
to rebuild libcompiler_rt source files in the rtld-elf environment. We
don't yet have a need for an object file extraction mechanism.
libcompiler_rt could also supply udivdi3 and umoddi3, but leave them
alone for now.
rlibby [Sat, 19 Dec 2020 08:38:27 +0000 (08:38 +0000)]
rtld-libc: fix incremental build
ar cr is an update of an archive, not a creation of a new one. During
incremental builds (e.g. with meta mode) the archive was not getting
cleaned, and so could retain now-deleted objects from previous builds.
Now, delete the archive before creating/updating it.
Ryan Libby [Sat, 19 Dec 2020 08:38:27 +0000 (08:38 +0000)]
rtld-libc: fix incremental build
ar cr is an update of an archive, not a creation of a new one. During
incremental builds (e.g. with meta mode) the archive was not getting
cleaned, and so could retain now-deleted objects from previous builds.
Now, delete the archive before creating/updating it.
kevans [Sat, 19 Dec 2020 03:30:06 +0000 (03:30 +0000)]
kern: cpuset: allow jails to modify child jails' roots
This partially lifts a restriction imposed by r191639 ("Prevent a superuser
inside a jail from modifying the dedicated root cpuset of that jail") that's
perhaps beneficial after r192895 ("Add hierarchical jails."). Jails still
cannot modify their own cpuset, but they can modify child jails' roots to
further restrict them or widen them back to the modifying jails' own mask.
As a side effect of this, the system root may once again widen the mask of
jails as long as they're still using a subset of the parent jails' mask.
This was previously prevented by the fact that cpuset_getroot of a root set
will return that root, rather than the root's parent -- cpuset_modify uses
cpuset_getroot since it was introduced in r327895, previously it was just
validating against set->cs_parent which allowed the system root to widen
jail masks.
Kyle Evans [Sat, 19 Dec 2020 03:30:06 +0000 (03:30 +0000)]
kern: cpuset: allow jails to modify child jails' roots
This partially lifts a restriction imposed by r191639 ("Prevent a superuser
inside a jail from modifying the dedicated root cpuset of that jail") that's
perhaps beneficial after r192895 ("Add hierarchical jails."). Jails still
cannot modify their own cpuset, but they can modify child jails' roots to
further restrict them or widen them back to the modifying jails' own mask.
As a side effect of this, the system root may once again widen the mask of
jails as long as they're still using a subset of the parent jails' mask.
This was previously prevented by the fact that cpuset_getroot of a root set
will return that root, rather than the root's parent -- cpuset_modify uses
cpuset_getroot since it was introduced in r327895, previously it was just
validating against set->cs_parent which allowed the system root to widen
jail masks.
pfg [Sat, 19 Dec 2020 03:07:38 +0000 (03:07 +0000)]
login(1): when exporting variables check the result of setenv(3)
When exporting a variable we correctly check all the preconditions that
could make setenv(3) fail. Checking the setenv(3) return value seems
redundant, but given that login(1) is critical, it doesn't hurt to have
a post-check.
This change is based on the "Principles of Secure Coding" course by
Matthew Bishop, PhD., which specifically discusses this code in FreeBSD.
(This change redoes r368776 due to a silly mistake)
Pedro F. Giffuni [Sat, 19 Dec 2020 03:07:38 +0000 (03:07 +0000)]
login(1): when exporting variables check the result of setenv(3)
When exporting a variable we correctly check all the preconditions that
could make setenv(3) fail. Checking the setenv(3) return value seems
redundant, but given that login(1) is critical, it doesn't hurt to have
a post-check.
This change is based on the "Principles of Secure Coding" course by
Matthew Bishop, PhD., which specifically discusses this code in FreeBSD.
(This change redoes r368776 due to a silly mistake)
pfg [Sat, 19 Dec 2020 02:23:53 +0000 (02:23 +0000)]
login(1): when exporting variables check the result of setenv(3)
When exporting a variable we correctly check all the preconditions that
could make setenv(3) fail. Checking the setenv(3) return value seems
redundant, but given that login(1) is critical, it doesn't hurt to have
a post-check.
This change is based on the "Principles of Secure Coding" course by
Matthew Bishop, PhD., which specifically discusses this code in FreeBSD.
Pedro F. Giffuni [Sat, 19 Dec 2020 02:23:53 +0000 (02:23 +0000)]
login(1): when exporting variables check the result of setenv(3)
When exporting a variable we correctly check all the preconditions that
could make setenv(3) fail. Checking the setenv(3) return value seems
redundant, but given that login(1) is critical, it doesn't hurt to have
a post-check.
This change is based on the "Principles of Secure Coding" course by
Matthew Bishop, PhD., which specifically discusses this code in FreeBSD.
jrtc27 [Fri, 18 Dec 2020 23:31:36 +0000 (23:31 +0000)]
usb: Replace ITUNERNET vendor with MICROCHIP and improve product names
These Mini-Box LCDs are using Microchip components and sub-licensed product
IDs. Whilst here, update the constant names and descriptions for the products
to use the names listed on the manufacturer's website rather than vague ones.
The picoLCD 4x20 is named that on the manufacturer's website so prefer that
name, even though linux-usb.org lists it with the numbers reversed as one might
expect.
Jessica Clarke [Fri, 18 Dec 2020 23:31:36 +0000 (23:31 +0000)]
usb: Replace ITUNERNET vendor with MICROCHIP and improve product names
These Mini-Box LCDs are using Microchip components and sub-licensed product
IDs. Whilst here, update the constant names and descriptions for the products
to use the names listed on the manufacturer's website rather than vague ones.
The picoLCD 4x20 is named that on the manufacturer's website so prefer that
name, even though linux-usb.org lists it with the numbers reversed as one might
expect.
mckusick [Fri, 18 Dec 2020 23:28:27 +0000 (23:28 +0000)]
Rename pass4check() to freeblock() and move from pass4.c to inode.c.
The new name more accurately describes what it does and the file move
puts it with other similar functions. Done in preparation for future
cleanups. No functional differences intended.
Sponsored by: Netflix
Historic Footnote: my last FreeBSD svn commit
Kirk McKusick [Fri, 18 Dec 2020 23:28:27 +0000 (23:28 +0000)]
Rename pass4check() to freeblock() and move from pass4.c to inode.c.
The new name more accurately describes what it does and the file move
puts it with other similar functions. Done in preparation for future
cleanups. No functional differences intended.
Sponsored by: Netflix
Historic Footnote: my last FreeBSD svn commit
melifaro [Fri, 18 Dec 2020 22:00:57 +0000 (22:00 +0000)]
Switch direct rt fields access in rtsock.c to newly-create field acessors.
rtsock code was build around the assumption that each rtentry record
in the system radix tree is a ready-to-use sockaddr. This assumptions
turned out to be not quite true:
* masks have their length tweaked, so we have rtsock_fix_netmask() hack
* IPv6 addresses have their scope embedded, so we have another explicit
deembedding hack.
Change the code to decouple rtentry internals from rtsock code using
newly-created rtentry accessors. This will allow to eventually eliminate
both of the hacks and change rtentry dst/mask format.
Switch direct rt fields access in rtsock.c to newly-create field acessors.
rtsock code was build around the assumption that each rtentry record
in the system radix tree is a ready-to-use sockaddr. This assumptions
turned out to be not quite true:
* masks have their length tweaked, so we have rtsock_fix_netmask() hack
* IPv6 addresses have their scope embedded, so we have another explicit
deembedding hack.
Change the code to decouple rtentry internals from rtsock code using
newly-created rtentry accessors. This will allow to eventually eliminate
both of the hacks and change rtentry dst/mask format.
markj [Fri, 18 Dec 2020 16:04:48 +0000 (16:04 +0000)]
acpi: Ensure that adjacent memory affinity table entries are coalesced
The SRAT may contain multiple distinct entries that together describe a
contiguous region of physical memory. In this case we were not
coalescing the corresponding entries in the memory affinity table, which
led to fragmented phys_avail[] entries. Since r338431 the vm_phys_segs[]
entries derived from phys_avail[] will be coalesced, resulting in a
situation where vm_phys_segs[] entries do not have a covering
phys_avail[] entry. vm_page_startup() will not add such segments to the
physical memory allocator, leaving them unused.
Reported by: Don Morris <dgmorris@earthlink.net>
Reviewed by: kib, vangyzen
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27620