emaste [Sat, 29 Feb 2020 03:25:51 +0000 (03:25 +0000)]
remove GCC 4.2.1 build infrastructure
As described in Warner's email message[1] to the FreeBSD-arch mailing
list we have reached GCC 4.2.1's retirement date. At this time all
supported architectures either use in-tree Clang, or rely on an external
toolchain (i.e., a contemporary GCC version from ports).
GCC 4.2.1 was released July 18, 2007 and was imported into FreeBSD later
that year, in r171825. GCC has served us well, but version 4.2.1 is
obsolete and not used by default on any architecture in FreeBSD. It
does not support modern C and does not support arm64 or RISC-V.
Thanks to everyone responsible for maintaining, updating, and testing
GCC in the FreeBSD base system over the years.
brooks [Fri, 28 Feb 2020 21:13:15 +0000 (21:13 +0000)]
Define SCTL_MASK32 when COMPAT_FREEBSD32 is defined.
Remove the list of architectures and depend on COMPAT_FREEBSD32 which is
defined (if relevant) in opt_global.h and thus defined everywhere in
the kernel.
This is a minor change in behavior in that 32-bit compat for sysctls now
depends on COMPAT_FREEBSD32 rather than on the potential for 32-bit
compat support. The prior arrangement may have been part of an attempt
to allow 32-bit compat to be loadable, but such attempts are doomed to
failure (due to the fact that ioctls have no meaning without the
associated file descriptor) without vastly more refactoring and some
sort of COMPAT_FREEBSD32_SUPPORT option.
pfg [Fri, 28 Feb 2020 20:43:35 +0000 (20:43 +0000)]
/etc/services: attempt to bring the database to this century 1/2.
This is the result of splitting r358153 in two, in order to avoid a build
system bug and to be able to merge the change to previous releases.
Document this file better, updating the URL to the IANA registry, and match
the official services more closely.
For system ports (0 to 1023) we now try to follow the registry closely, noting
some historical differences where applicable.
As a side effect: drop references to unofficial Kerberos IV which was EOL'ed
in Oct 2006[1]. While it is conceivable some people may still use it in some
very old FreeBSD machines that can't be replaced easily, the use of it is
considered a security risk. Also drop the unofficial netatalk, which we
supported long ago in the kernel but was dropped long ago.
Leave for now smtps, even though it conflicts with IANA's submissions.
The change should have very little visibility, if any, but should be a
step closer to the current IANA database.
rlibby [Fri, 28 Feb 2020 18:32:36 +0000 (18:32 +0000)]
amd64 atomic.h: minor codegen optimization in flag access
Previously the pattern to extract status flags from inline assembly
blocks was to use setcc in the block to write the flag to a register.
This was suboptimal in a few ways:
- It would lead to code like: sete %cl; test %cl; jne, i.e. a flag
would just be loaded into a register and then reloaded to a flag.
- The setcc would force the block to use an additional register.
- If the client code didn't care for the flag value then the setcc
would be entirely pointless but could not be eliminated by the
optimizer.
A more modern inline asm construct (since gcc 6 and clang 9) allows for
"flag output operands", where a C variable can be written directly from
a flag. The optimizer can then use this to produce direct code where
the flag does not take a trip through a register.
In practice this makes each affected operation sequence shorter by five
bytes of instructions. It's unlikely this has a measurable performance
impact.
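A minimal sketch of the flag output operand construct described above. This is not the actual atomic.h code: the helper name cmpset_32 and the example function are hypothetical, and the non-amd64 fallback to a compiler builtin is an assumption added only so the sketch builds elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical compare-and-set helper.  With flag output operands the
 * "=@cce" constraint binds res directly to the "equal" (ZF) condition
 * left by cmpxchg, so no sete into a scratch register is needed.
 */
static inline int
cmpset_32(volatile uint32_t *dst, uint32_t expect, uint32_t src)
{
#if defined(__x86_64__) && defined(__GCC_ASM_FLAG_OUTPUTS__)
	int res;

	__asm__ __volatile__(
	    "lock; cmpxchgl %3, %1"
	    : "=@cce" (res),		/* 0: ZF after cmpxchg */
	      "+m" (*dst),		/* 1: memory operand */
	      "+a" (expect)		/* 2: comparand in %eax */
	    : "r" (src)			/* 3: new value */
	    : "memory");
	return (res);
#else
	/* Assumed fallback for other targets or older compilers. */
	return (__atomic_compare_exchange_n(dst, &expect, src, 0,
	    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST));
#endif
}

/* Usage: the result can feed a branch without a register round trip. */
static void
cmpset_32_example(void)
{
	volatile uint32_t v = 5;

	assert(cmpset_32(&v, 5, 7) != 0);	/* matches, v becomes 7 */
	assert(cmpset_32(&v, 5, 9) == 0);	/* stale comparand, v unchanged */
	assert(v == 7);
}
```

When the caller immediately branches on the result, the optimizer can emit the conditional jump straight after the cmpxchg.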
markj [Fri, 28 Feb 2020 16:05:18 +0000 (16:05 +0000)]
Add a blocking counter KPI.
refcount(9) was recently extended to support waiting on a refcount to
drop to zero, as this was needed for a lockless VM object
paging-in-progress counter. However, this adds overhead to all uses of
refcount(9) and doesn't really match traditional refcounting semantics:
once a counter has dropped to zero, the protected object may be freed at
any point and it is not safe to dereference the counter.
This change removes that extension and instead adds a new set of KPIs,
blockcount_*, for use by VM object PIP and busy.
Reviewed by: jeff, kib, mjg
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23723
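A rough userland analogue of the blocking counter semantics described above, sketched with pthreads. The ucount_* names are hypothetical and this is not the kernel blockcount_* implementation; it only illustrates that the counter lives inside an object that outlives the wait, so a waiter may safely keep touching it after it reaches zero.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userland stand-in for a blocking counter. */
struct ucount {
	pthread_mutex_t	lock;
	pthread_cond_t	zero;
	unsigned	count;
};

#define	UCOUNT_INITIALIZER \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

static void
ucount_acquire(struct ucount *uc, unsigned n)
{
	pthread_mutex_lock(&uc->lock);
	uc->count += n;
	pthread_mutex_unlock(&uc->lock);
}

static void
ucount_release(struct ucount *uc, unsigned n)
{
	pthread_mutex_lock(&uc->lock);
	assert(uc->count >= n);
	uc->count -= n;
	if (uc->count == 0)
		pthread_cond_broadcast(&uc->zero);	/* wake all waiters */
	pthread_mutex_unlock(&uc->lock);
}

/* Block until the count drains to zero; the object stays valid. */
static void
ucount_wait(struct ucount *uc)
{
	pthread_mutex_lock(&uc->lock);
	while (uc->count != 0)
		pthread_cond_wait(&uc->zero, &uc->lock);
	pthread_mutex_unlock(&uc->lock);
}

static unsigned
ucount_read(struct ucount *uc)
{
	unsigned n;

	pthread_mutex_lock(&uc->lock);
	n = uc->count;
	pthread_mutex_unlock(&uc->lock);
	return (n);
}
```

Unlike a reference count, reaching zero here does not imply that the containing object may be freed, which is why waiting on it is safe.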
bz [Fri, 28 Feb 2020 11:16:41 +0000 (11:16 +0000)]
mld6: initialize oifp to avoid bogus results/panics in edge cases
In certain cases (probably not during normal operation but observed in
the lab during development) ip6_output() could return without error
and without updating ifpp (&oifp).
Given oifp was never initialized we would take the later branch
as oifp was not NULL, and when calling icmp6_ifstat_inc() we would
panic dereferencing a garbage pointer.
For code stability, initialize oifp to NULL before first use so that it
always has a deterministic value, rather than relying on the called
function to always set it for us.
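The failure mode above can be sketched in isolation. All names below are hypothetical stand-ins (this is not the mld6 code): a callee that may return success without writing its out parameter, and a caller that is only safe because it initializes the pointer to NULL first.

```c
#include <assert.h>
#include <stddef.h>

struct ifnet_stub {
	int	if_index;
};

/* Hypothetical callee: writes *ifpp only when it selected an interface. */
static int
output_stub(int selected, struct ifnet_stub *ifp, struct ifnet_stub **ifpp)
{
	if (selected)
		*ifpp = ifp;
	return (0);		/* success either way */
}

/* Returns the interface index that was used, or 0 if none was reported. */
static int
send_report_stub(int selected, struct ifnet_stub *ifp)
{
	struct ifnet_stub *oifp = NULL;	/* deterministic before first use */

	if (output_stub(selected, ifp, &oifp) != 0)
		return (-1);
	if (oifp != NULL)
		return (oifp->if_index);	/* safe: pointer was set */
	return (0);
}

static void
send_report_example(void)
{
	struct ifnet_stub ifp = { 3 };

	assert(send_report_stub(1, &ifp) == 3);
	assert(send_report_stub(0, &ifp) == 0);	/* no garbage dereference */
}
```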
jkim [Thu, 27 Feb 2020 22:36:16 +0000 (22:36 +0000)]
Do not free p and g parameters after calling DH_set0_pqg(3).
It is specifically mentioned in the manual page. Note that this is not a
functional change in practice because DH_set0_pqg() cannot fail when both
p and g are not NULL.
dim [Thu, 27 Feb 2020 19:59:17 +0000 (19:59 +0000)]
Merge r358406 from the clang1000-import branch:
Fix the following -Werror warning from clang 10.0.0:
sys/arm/arm/identcpu-v6.c:227:5: error: misleading indentation; statement is not part of the previous 'if' [-Werror,-Wmisleading-indentation]
if (val & CPUV7_CT_CTYPE_RA)
^
sys/arm/arm/identcpu-v6.c:225:4: note: previous statement is here
if (val & CPUV7_CT_CTYPE_WB)
^
This was due to an accidentally inserted tab before the if statement.
hrs [Thu, 27 Feb 2020 19:49:59 +0000 (19:49 +0000)]
Fix poor performance of ftp(1) due to small SO_SNDBUF and SO_RCVBUF.
ftp(1) from vendor/tnftp always tried the following for
every TCP connection:
1. Get the current buffer length of SO_SNDBUF and SO_RCVBUF
by getsockopt(2).
2. Invoke setsockopt(2) to set them to the same values
after checking if they are in a range between 8 KiB to 8 MiB.
This behavior broke dynamic buffer sizing enabled by
default (net.inet.tcp.{recv,send}buf_auto sysctls) and
led to a very poor transfer rate. The fetch(1) utility
does not have this problem.
This change prevents SO_SNDBUF and SO_RCVBUF from being configured
when buffer auto-sizing is enabled, unless the buffer sizes are
explicitly specified.
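The decision described above can be sketched as follows. The helper set_buf and its parameters are hypothetical, not the tnftp code: a size of 0 stands for "not specified by the user", and the caller is assumed to have already read the relevant net.inet.tcp.{recv,send}buf_auto sysctl into autosize. The 8 KiB to 8 MiB range check is omitted for brevity.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper.  Returns 0 when the option was set or
 * deliberately left alone, -1 on a getsockopt/setsockopt error.
 */
static int
set_buf(int s, int opt, int size, int autosize)
{
	socklen_t len = sizeof(size);

	assert(opt == SO_SNDBUF || opt == SO_RCVBUF);
	if (size <= 0) {
		if (autosize)
			return (0);	/* leave dynamic sizing in control */
		/* Old behavior: read the current size and set it back. */
		if (getsockopt(s, SOL_SOCKET, opt, &size, &len) == -1)
			return (-1);
	}
	return (setsockopt(s, SOL_SOCKET, opt, &size, sizeof(size)));
}
```

Calling setsockopt(2) on these options, even with the value just read, pins the buffer size, which is why the auto-sizing case has to skip the call entirely.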
hrs [Thu, 27 Feb 2020 19:40:29 +0000 (19:40 +0000)]
Fix broken STARTTLS when SharedMemoryKey is enabled.
OpenSSL 1.1 API patch for sendmail had a bug which
prevented sm_RSA_generate_key() function from working.
This function is used to generate a temporary RSA key
for a shared memory region used for TLS processing.
Note that 12.0 and 12.1-RELEASE include this bug.
This only has an effect if both the SM_CONF_SHM compile-time
option (enabled by default) and the SharedMemoryKey
run-time option (not enabled by default) in a .cf file are
specified. The latter corresponds to confSHARED_MEMORY_KEY in
a .mc file.
PR: 242861
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D23734
jeff [Thu, 27 Feb 2020 19:05:26 +0000 (19:05 +0000)]
Simplify lazy advance with a 64bit atomic cmpset.
This provides the potential to force a lazy (tick based) SMR to advance
when there are blocking waiters by decoupling the wr_seq value from the
ticks value.
imp [Thu, 27 Feb 2020 15:34:30 +0000 (15:34 +0000)]
Better check for floating point type.
Use __riscv_flen instead of __riscv_float_abi_soft. While the latter works for
userland (and one could argue it's more correct), it fails for the kernel. We
compile the kernel with -mabi=lp64 (eg soft float abi) to avoid floating point
instructions in the kernel. We also compile the kernel -march=rv64imafdc for
hard float kernels (eg those with options FPE), but with -march=rv64imac for
softfloat kernels (eg those with FPE). Since we do this, in the kernel (as in
userland) __riscv_flen will be defined for 'riscv64' and not for 'riscv64sf'.
This also removes the -DMACHINE_ARCH hack now that it's no longer needed.
Longer term, we should return the ABI from the sysctl hw.machine_arch like on
amd64 for i386 binaries.
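A small sketch of the preprocessor test described above. The function names are hypothetical and this is not the code from the commit; it assumes a 64-bit RISC-V target and returns a placeholder string on other architectures so that it still builds there.

```c
#include <assert.h>
#include <string.h>

/*
 * __riscv_flen reflects -march (are F/D registers available), while
 * __riscv_float_abi_soft reflects -mabi.  A kernel built with
 * -mabi=lp64 -march=rv64imafdc has both defined, so only the former
 * distinguishes 'riscv64' from 'riscv64sf' there.
 */
static const char *
machine_arch_name(void)
{
#if !defined(__riscv)
	return ("not-riscv");		/* placeholder for other targets */
#elif defined(__riscv_flen)
	return ("riscv64");
#else
	return ("riscv64sf");
#endif
}

static void
machine_arch_example(void)
{
	const char *arch = machine_arch_name();

#if defined(__riscv) && defined(__riscv_flen) && defined(__riscv_float_abi_soft)
	/* Kernel-like build: soft-float ABI, hardware FP present. */
	assert(strcmp(arch, "riscv64") == 0);
#endif
	assert(arch[0] != '\0');
}
```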
avg [Thu, 27 Feb 2020 14:12:43 +0000 (14:12 +0000)]
dsl_dataset_promote_sync: populate 'oldname' before using it
It's very unlikely that zfsvfs_update_fromname() and
zvol_rename_minors() ever did anything during the promote operation as
the old name was not initialized.
kaktus [Thu, 27 Feb 2020 13:12:14 +0000 (13:12 +0000)]
Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (18 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT.
jeff [Thu, 27 Feb 2020 08:23:10 +0000 (08:23 +0000)]
A pair of performance improvements.
Swap buckets on free as well as alloc so that alloc is always the most
cache-hot data.
When selecting a zone domain for the round-robin bucket cache use the
local domain unless there is a severe imbalance. This does not affinitize
memory, only locks and queues.
jeff [Thu, 27 Feb 2020 02:37:27 +0000 (02:37 +0000)]
Add unlocked grab* function variants that use lockless radix code to
lookup pages. These variants will fall back to their locked counterparts
if the page is not present.
mav [Wed, 26 Feb 2020 20:38:48 +0000 (20:38 +0000)]
MFZoL: Relax restriction on zfs_ioc_next_obj() iteration
Per the documentation for dnode_next_offset in dnode.c, the "txg"
parameter specifies a lower bound on which transaction the dnode can
be found in. We are interested in all dnodes that are removed between
the first and last transaction in the snapshot. It doesn't need to be
created in that snapshot to correspond to a removed file.
In fact, the behavior of zfs diff in the test case exactly matches
this: the transaction that created the data that was deleted in snapshot
"2" was produced before, in snapshot "1", definitely predating the first
transaction in snapshot "2".
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@onlight.com>
Closes #2081
zfsonlinux/zfs@7290cd3c4ed19fb3f75b8133db2e36afcdd24beb
imp [Wed, 26 Feb 2020 19:15:08 +0000 (19:15 +0000)]
Remove support for all pre-FreeBSD 11.0 versions from mpr and mps.
Remove a number of workarounds for older versions of FreeBSD. FreeBSD stable/10
was branched over 6 years ago. All of these changes date from about that time or
earlier. These workarounds are extensive and get in the way of understanding
the current flow in the driver.
emaste [Wed, 26 Feb 2020 19:08:23 +0000 (19:08 +0000)]
src.opts.mk: drop MIPS special case for disabling BINUTILS_BOOTSTRAP
Binutils has already been reduced to installing ld only on powerpc32
and as only on amd64. (Also objdump on every arch supported by binutils
2.17.50.) Although BINUTILS_BOOTSTRAP serves no purpose on MIPS there
is no reason to have a special case for it.
imp [Wed, 26 Feb 2020 18:55:09 +0000 (18:55 +0000)]
Remove sparc64 specific parts of libc.
Also update comments for which architectures use 128 bit long doubles,
as appropriate.
The softfloat specialization routines weren't updated since they
appear to be from an upstream source which we may want to update in
the future to get a more favorable license.
imp [Wed, 26 Feb 2020 18:55:03 +0000 (18:55 +0000)]
Remove sparc64 specific parts of libm and fix comments
Once upon a time, sparc64 was the only ld128 architecture. However,
both aarch64 and riscv are now such architectures. Many of the
comments about how slow multiplication was on old sparc64 processors
are now no longer true. However, since no evaluation has been done for
aarch64 yet, it's unclear if they are still relevant or not. If not,
the code should be changed. If so, the comments should remove the
uncertainty.
mav [Wed, 26 Feb 2020 16:51:45 +0000 (16:51 +0000)]
MFZoL: Fix resilver writes in vdev_indirect_io_start
This patch addresses an issue found in ztest where resilver
write zios that were passed to an indirect vdev would end up
being handled as though they were resilver read zios. This
caused issues where the zio->io_abd would be both read to
and written from at the same time, causing asserts to fail.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8193
zfsonlinux/zfs@5aa95ba0d3502779695341b5f55fa5ba1d3330ff
mav [Wed, 26 Feb 2020 15:59:46 +0000 (15:59 +0000)]
MFZoL: Fix issue with scanning dedup blocks as scan ends
This patch fixes an issue discovered by ztest where
dsl_scan_ddt_entry() could add I/Os to the dsl scan queues
between when the scan had finished all required work and
when the scan was marked as complete. This caused the scan
to spin indefinitely without ending.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8010
zfsonlinux/zfs@5e0bd0ae056e26de36dee3c199c6fcff8f14ee15
mav [Wed, 26 Feb 2020 15:47:40 +0000 (15:47 +0000)]
MFZoL: Fix 2 small bugs with cached dsl_scan_phys_t
This patch corrects 2 small bugs where scn->scn_phys_cached was
not properly updated to match the primary copy when it needed to
be. The first resulted in the pause state not being properly
updated and the second resulted in the cached version being
completely zeroed even if the primary was not.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Closes #8010
zfsonlinux/zfs@8cb119e3dc0ac6c90b1517fbadc021b7e9741fc6
mav [Wed, 26 Feb 2020 15:45:04 +0000 (15:45 +0000)]
MFZoL: Fix txg_sync_thread hang in scan_exec_io()
When scn->scn_maxinflight_bytes has not been initialized it's
possible to hang on the condition variable in scan_exec_io().
This issue was uncovered by ztest and is only possible when
deduplication is enabled through the following call path.
Resolve the issue by always initializing scn_maxinflight_bytes
to a reasonable minimum value. This value will be recalculated
in dsl_scan_sync() to pick up changes to zfs_scan_vdev_limit
and the addition/removal of vdevs.
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #7098
zfsonlinux/zfs@f90a30ad1b32a971f62a540f8944e42f99b254ce
kaktus [Wed, 26 Feb 2020 14:26:36 +0000 (14:26 +0000)]
Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT.
rrs [Wed, 26 Feb 2020 13:48:33 +0000 (13:48 +0000)]
This commit expands tcp_ratelimit to be able to handle cards
like the mlx-c5 and c6 that require a "setup" routine before
the tcp_ratelimit code can declare and use a rate. I add the
setup routine to if_var as well as fix tcp_ratelimit to call it.
I also revisit the rates so that in the case of a mlx card
of type c5/6 we will use about 100 rates concentrated in the range
where the most gain can be had (1-200Mbps). Note that I have
tested these on a c5 and they work and perform well. In fact
in an unloaded system they pace right to the correct rate (great
job mlx!). There will be a further commit here from Hans that
will add the respective changes to the mlx driver to support this
work (which I was testing with).
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D23647
andrew [Wed, 26 Feb 2020 11:50:24 +0000 (11:50 +0000)]
Generalise the arm64 ASID allocator.
The requirements of an Address Space ID allocator and a Virtual Machine ID
allocator are similar. Generalise the former code so it can be used with
the latter.
andrew [Wed, 26 Feb 2020 11:47:24 +0000 (11:47 +0000)]
Start to support multiple stages in the arm64 pmap.
On arm64 the stage 1 and stage 2 pte formats are similar enough we can
reuse the pmap code for both. As they are only similar and not identical
we need to know if we are managing stage 1 or stage 2 tables.
Add an enum to store this information and a check to make sure it is
set to stage 1 when we manage stage 1 pte fields.
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D23830
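A minimal sketch of the enum-plus-check idea described above. Every name here is a hypothetical stand-in rather than the actual pmap code, and the bit position used for the stage 1-only field is an assumption chosen purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stage tag stored in each pmap. */
enum pmap_stage_stub {
	PM_STUB_INVALID,
	PM_STUB_STAGE1,
	PM_STUB_STAGE2,
};

struct pmap_stub {
	enum pmap_stage_stub	pm_stage;
};

/* Assumed bit for a field that only exists in stage 1 entries. */
#define	STUB_S1_USER_BIT	(UINT64_C(1) << 6)

/*
 * Shared code can serve both stages, but any path that touches a
 * stage 1-only field first checks that the pmap really is stage 1.
 */
static uint64_t
stub_make_user_pte(const struct pmap_stub *pmap, uint64_t pte)
{
	assert(pmap->pm_stage == PM_STUB_STAGE1);
	return (pte | STUB_S1_USER_BIT);
}
```

Leaving an explicit invalid value at zero means a pmap whose stage was never set fails the check instead of silently being treated as stage 1.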
kp [Wed, 26 Feb 2020 08:47:18 +0000 (08:47 +0000)]
bridge: Move locking defines into if_bridge.c
The locking defines for if_bridge used to live in if_bridgevar.h, but
they're only ever used by the bridge implementation itself (in
if_bridge.c). Move them into the .c file.
Reported by: philip, emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23808
Remove an old workaround that is no longer necessary since rS343824.
There used to be a problem with FMan interrupts firing on multiple CPUs
at the same time.
This ended up being due to multicast interrupts being unsupported in the
Freescale PIC (so instead of using a selection algorithm, it would do some
unspecified action, such as interrupting multiple cpus at random.)
glebius [Tue, 25 Feb 2020 19:29:05 +0000 (19:29 +0000)]
Generalize resource freeing in sendfile with different scenarios.
Now we execute sendfile_iodone() in all possible cases, which
guarantees that vm_object_pip_wakeup() is called and sfio structure
is freed.
At the beginning of sendfile, initialize sfio->m to NULL, which
indicates that the mbuf chain either doesn't exist or belongs to the
syscall (not to I/O completion). Fill sfio->m only at the point when
we are positive that there are I/Os ongoing, and before releasing the
syscall's reference on sfio.
In sendfile_iodone() perform vm_object_pip_wakeup() once last
reference is released, then check for sfio->m. NULL pointer
indicates that we need only to free the memory.
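The completion ordering described above can be sketched in userland. All names are hypothetical stand-ins, not the kernel sendfile code: plain integers replace the atomic reference count, the VM object paging-in-progress counter, and the socket hand-off.

```c
#include <assert.h>
#include <stdlib.h>

struct sfio_stub {
	int	 refs;		/* syscall reference plus in-flight I/Os */
	void	*m;		/* NULL: no chain, or the syscall owns it */
	int	*pip;		/* stand-in for the object's PIP count */
	int	*queued;	/* stand-in for chains handed to the socket */
};

/* Runs for every completion path, including the syscall's own release. */
static void
sfio_iodone_stub(struct sfio_stub *sfio)
{
	if (--sfio->refs > 0)
		return;			/* other references remain */
	(*sfio->pip)--;			/* always drop paging-in-progress */
	if (sfio->m != NULL)
		(*sfio->queued)++;	/* completion owns the chain */
	free(sfio);			/* always free the structure */
}

static void
sfio_example(void)
{
	int pip = 1, queued = 0;
	struct sfio_stub *sfio = malloc(sizeof(*sfio));

	assert(sfio != NULL);
	sfio->refs = 2;
	sfio->m = NULL;			/* no I/O was actually started */
	sfio->pip = &pip;
	sfio->queued = &queued;
	sfio_iodone_stub(sfio);		/* one reference still held */
	assert(pip == 1);
	sfio_iodone_stub(sfio);		/* last reference: wake up, free */
	assert(pip == 0 && queued == 0);
}
```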
kaktus [Tue, 25 Feb 2020 19:04:39 +0000 (19:04 +0000)]
Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (16 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.