Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>
Alexander Motin [Wed, 1 Aug 2018 02:39:44 +0000 (02:39 +0000)]
MFV r337022:
9403 assertion failed in arc_buf_destroy() when concurrently reading block with checksum error
This assertion (VERIFY) failure was reported when reading a block. Turns out
the problem is that if we get an i/o error (ECKSUM in this case), and there
are multiple concurrent ARC reads of the same block (from different clones),
then the ARC will put multiple buf's on the same ANON hdr, which isn't
supposed to happen, and then causes a panic when we try to arc_buf_destroy()
the buf.
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Matt Ahrens <mahrens@delphix.com>
Author: Matthew Ahrens <mahrens@delphix.com>
Alexander Motin [Tue, 31 Jul 2018 23:44:13 +0000 (23:44 +0000)]
9403 assertion failed in arc_buf_destroy() when concurrently reading block with checksum error
This assertion (VERIFY) failure was reported when reading a block. Turns out
the problem is that if we get an i/o error (ECKSUM in this case), and there
are multiple concurrent ARC reads of the same block (from different clones),
then the ARC will put multiple buf's on the same ANON hdr, which isn't
supposed to happen, and then causes a panic when we try to arc_buf_destroy()
the buf.
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Matt Ahrens <mahrens@delphix.com>
Author: Matthew Ahrens <mahrens@delphix.com>
Michael Tuexen [Tue, 31 Jul 2018 22:56:03 +0000 (22:56 +0000)]
Add a dtrace provider for UDP-Lite.
The dtrace provider for UDP-Lite is modeled after the UDP provider.
This fixes the bug that UDP-Lite packets were triggering the UDP
provider.
Thanks to dteske@ for providing the dwatch module.
Alexander Motin [Tue, 31 Jul 2018 22:50:50 +0000 (22:50 +0000)]
MFV r337014:
9421 zdb should detect and print out the number of "leaked" objects
9422 zfs diff and zdb should explicitly mark objects that are on the deleted queue
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Matt Ahrens <mahrens@delphix.com>
Author: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Matt Ahrens <mahrens@delphix.com>
Author: Paul Dagnelie <pcd@delphix.com>
Alexander Motin [Tue, 31 Jul 2018 21:06:04 +0000 (21:06 +0000)]
MFV r336991, r337001:
9102 zfs should be able to initialize storage devices
The first access to a disk block can incur a performance penalty on some
platforms (e.g. AWS's EBS, VMware VMDKs). Therefore it is recommended that
volumes be "thick provisioned", where supported by the platform (VMware).
Thick provisioning is time consuming and often is ignored. If the thick
provision step is omitted, customers will see suboptimal performance until
we have written to all parts of the LUN. ZFS should be able to initialize
any unused storage to remove any first-write penalty that exists.
Reviewed by: John Wren Kennedy <john.kennedy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: George Wilson <george.wilson@delphix.com>
ofw_cpu: Add support for getting cpu clock via clock property
Nominal Mhz is either expressed via the clock-frequency property
or can be get via the clock property that holds the cpu clock.
Add support for the later.
Since we have now EFI framebuffer enabled for ARM64 if we boot on a board
with an screen, u-boot will set up a EFI GOP framebuffer and we won't boot
using the serial console.
Also on RPI3 the firmware always setup the framebuffer area resulting in u-boot
always setup the EFI GOP and FreeBSD never using the serial console.
This is not a problem for 12-CURRENT as EFI boot works but it doesn't
for 11.
While here some board arm_install_uboot also copy ubldr.bin et create
firstboot files but it's already done in arm_install_boot
Reviewed by: gjb
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D16481
The nvmem interface helps provider of nvmem data to expose themselves to consumer.
NVMEM is generally present on some embedded board in a form of eeprom or fuses.
The nvmem api are helpers for consumer to read/write the cell data from a provider.
FORCE_PKG_REGISTER is broken so multiple invocation of release.sh for the
same board will fails if /scratch isn't cleaned.
Leave it but deinstall the package first.
Alexander Motin [Tue, 31 Jul 2018 18:49:07 +0000 (18:49 +0000)]
9102 zfs should be able to initialize storage devices
The first access to a disk block can incur a performance penalty on some
platforms (e.g. AWS's EBS, VMware VMDKs). Therefore it is recommended that
volumes be "thick provisioned", where supported by the platform (VMware).
Thick provisioning is time consuming and often is ignored. If the thick
provision step is omitted, customers will see suboptimal performance until
we have written to all parts of the LUN. ZFS should be able to initialize
any unused storage to remove any first-write penalty that exists.
For compat32, emulate the same wraparound check as occurs on the real
ILP32 system.
Reported by and discussed with: asomers
PR: 230162
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D16525
Alan Cox [Tue, 31 Jul 2018 17:41:48 +0000 (17:41 +0000)]
Allow vm object coalescing to occur in the midst of a vm object when the
OBJ_ONEMAPPING flag is set. In other words, allow recycling of existing
but unused subranges of a vm object when the OBJ_ONEMAPPING flag is set.
Such situations are increasingly common with jemalloc >= 5.0. This
change has the expected effect of reducing the number of vm map entry and
object allocations and increasing the number of superpage promotions.
snd_hda: Byteswap the buffer descriptor entries as needed
The buffer descriptor list entries should be in little endian format. Byte swap
them on BE. This is the last piece of the puzzle for snd_hda(4) to work on
PowerPC.
Ed Maste [Tue, 31 Jul 2018 15:25:03 +0000 (15:25 +0000)]
lld: [ELF][ARM] Implement support for Tag_ABI_VFP_args
The Tag_ABI_VFP_args build attribute controls the procedure call
standard used for floating point parameters on ARM. The values are:
0 - Base AAPCS (FP Parameters passed in Core (Integer) registers
1 - VFP AAPCS (FP Parameters passed in FP registers)
2 - Toolchain specific (Neither Base or VFP)
3 - Compatible with all (No use of floating point parameters)
If the Tag_ABI_VFP_args build attribute is missing it has an implicit
value of 0.
We use the attribute in two ways:
* Detect a clash in calling convention between Base, VFP and Toolchain.
we follow ld.bfd's lead and do not error if there is a clash between an
implicit Base AAPCS caused by a missing attribute. Many projects
including the hard-float (VFP AAPCS) version of glibc contain assembler
files that do not use floating point but do not have Tag_ABI_VFP_args.
* Set the EF_ARM_ABI_FLOAT_SOFT or EF_ARM_ABI_FLOAT_HARD ELF header flag
for Base or VFP AAPCS respectively. This flag is used by some ELF
loaders.
References:
* Addenda to, and Errata in, the ABI for the ARM Architecture for
Tag_ABI_VFP_args
* Elf for the ARM Architecture for ELF header flags
Fixes LLVM PR36009
PR: 229050
Obtained from: llvm r338377 by Peter Smith
Ed Maste [Tue, 31 Jul 2018 14:14:41 +0000 (14:14 +0000)]
llvm: [ARM] Complete enumeration values for Tag_ABI_VFP_args
The LLD implementation of Tag_ABI_VFP_args needs to check the rarely
seen values of 3 (toolchain specific) and 4 compatible with both Base
and VFP. Add the missing enumeration values so that LLD can refer to
them without having to use the raw numbers.
Ed Maste [Tue, 31 Jul 2018 14:12:09 +0000 (14:12 +0000)]
llvm: [ELF][ARM] Add Arm ABI names for float ABI ELF Header flags
The ELF for the Arm architecture document defines, for EF_ARM_EABI_VER5
and above, the flags EF_ARM_ABI_FLOAT_HARD and EF_ARM_ABI_FLOAT_SOFT.
These have been defined to be compatible with the existing
EF_ARM_VFP_FLOAT and EF_ARM_SOFT_FLOAT used by gcc for
EF_ARM_EABI_UNKNOWN.
This patch adds the flags in addition to the existing ones so that any
code depending on the old names will still work.
Andrew Turner [Tue, 31 Jul 2018 12:53:27 +0000 (12:53 +0000)]
Implement the SSBD (CVE-2018-3639) workaround on arm64
This calls into the Arm Trusted Firmware to enable and disable the
workaround for the Speculative Store Bypass Disable (SSBD) issue, also
known as Spectre Variant 4.
As this may have a large performance overhead, and how exploitable SSBD is
is unknown we follow the Linux lead of allowing the administrator to select
between always on, always off, or only enabled in the kernel, with the
latter being the default.
Only NULL check the VNET pointer when VIMAGE is enabled in ibcore.
Else a NULL VNET pointer should be ignored. This fixes address resolving
when VIMAGE is disabled.
MFC after: 3 days
Sponsored by: Mellanox Technologies
Michael Tuexen [Tue, 31 Jul 2018 06:27:05 +0000 (06:27 +0000)]
Fix INET only builds.
r336940 introduced an "unused variable" warning on platforms which
support INET, but not INET6, like MALTA and MALTA64 as reported
by Mark Millard. Improve the #ifdefs to address this issue.
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Paul Dagnelie <pcd@delphix.com>
Alexander Motin [Tue, 31 Jul 2018 00:58:21 +0000 (00:58 +0000)]
MFV r336958: 9337 zfs get all is slow due to uncached metadata
This project's goal is to make read-heavy channel programs and zfs(1m)
administrative commands faster by caching all the metadata that they will
need in the dbuf layer. This will prevent the data from being evicted, so
that any future call to i.e. zfs get all won't have to go to disk (very
much).
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Thomas Caputi <tcaputi@datto.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>
Alexander Motin [Tue, 31 Jul 2018 00:56:41 +0000 (00:56 +0000)]
9337 zfs get all is slow due to uncached metadata
This project's goal is to make read-heavy channel programs and zfs(1m)
administrative commands faster by caching all the metadata that they will
need in the dbuf layer. This will prevent the data from being evicted, so
that any future call to i.e. zfs get all won't have to go to disk (very
much).
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Thomas Caputi <tcaputi@datto.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>
Alexander Motin [Tue, 31 Jul 2018 00:47:27 +0000 (00:47 +0000)]
MFV r336955: 9236 nuke spa_dbgmsg
We should use zfs_dbgmsg instead of spa_dbgmsg. Or at least,
metaslab_condense() should call zfs_dbgmsg because it's important and rare
enough to always log. It's possible that the message in zio_dva_allocate()
would be too high-frequency for zfs_dbgmsg.
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>
Alexander Motin [Tue, 31 Jul 2018 00:42:31 +0000 (00:42 +0000)]
9236 nuke spa_dbgmsg
We should use zfs_dbgmsg instead of spa_dbgmsg. Or at least,
metaslab_condense() should call zfs_dbgmsg because it's important and rare
enough to always log. It's possible that the message in zio_dva_allocate()
would be too high-frequency for zfs_dbgmsg.
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Richard Elling <Richard.Elling@RichardElling.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>
Alexander Motin [Tue, 31 Jul 2018 00:37:45 +0000 (00:37 +0000)]
MFV r336952: 9192 explicitly pass good_writes to vdev_uberblock/label_sync
Currently vdev_label_sync and vdev_uberblock_sync take a zio_t and assume
that its io_private is a pointer to the good_writes count. They should
instead accept this argument explicitly.
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>
Alexander Motin [Tue, 31 Jul 2018 00:34:39 +0000 (00:34 +0000)]
9192 explicitly pass good_writes to vdev_uberblock/label_sync
Currently vdev_label_sync and vdev_uberblock_sync take a zio_t and assume
that its io_private is a pointer to the good_writes count. They should
instead accept this argument explicitly.
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>
Alexander Motin [Tue, 31 Jul 2018 00:25:39 +0000 (00:25 +0000)]
MFV r336950: 9290 device removal reduces redundancy of mirrors
Mirrors are supposed to provide redundancy in the face of whole-disk failure
and silent damage (e.g. some data on disk is not right, but ZFS hasn't
detected the whole device as being broken). However, the current device
removal implementation bypasses some of the mirror's redundancy.
Alexander Motin [Tue, 31 Jul 2018 00:13:04 +0000 (00:13 +0000)]
9290 device removal reduces redundancy of mirrors
Mirrors are supposed to provide redundancy in the face of whole-disk failure
and silent damage (e.g. some data on disk is not right, but ZFS hasn't
detected the whole device as being broken). However, the current device
removal implementation bypasses some of the mirror's redundancy.
Alexander Motin [Tue, 31 Jul 2018 00:02:42 +0000 (00:02 +0000)]
MFV r336948: 9112 Improve allocation performance on high-end systems
On high-end systems running async sequential write workloads, especially
NUMA systems with flash or NVMe storage, one significant performance
bottleneck is selecting a metaslab to do allocations from. This process
can be parallelized, providing significant performance increases for
these workloads.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Alexander Motin <mav@FreeBSD.org>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Paul Dagnelie <pcd@delphix.com>
Alexander Motin [Mon, 30 Jul 2018 23:53:25 +0000 (23:53 +0000)]
9112 Improve allocation performance on high-end systems
On high-end systems running async sequential write workloads, especially
NUMA systems with flash or NVMe storage, one significant performance
bottleneck is selecting a metaslab to do allocations from. This process
can be parallelized, providing significant performance increases for
these workloads.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Reviewed by: Alexander Motin <mav@FreeBSD.org>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Paul Dagnelie <pcd@delphix.com>
Alexander Motin [Mon, 30 Jul 2018 23:47:38 +0000 (23:47 +0000)]
MFV r336946: 9238 ZFS Spacemap Encoding V2
The current space map encoding has the following disadvantages:
[1] Assuming 512 sector size each entry can represent at most 16MB for a segment.
This makes the encoding very inefficient for large regions of space.
[2] As vdev-wide space maps have started to be used by new features (i.e.
device removal, zpool checkpoint) we've started imposing limits in the
vdevs that can be used with them based on the maximum addressable offset
(currently 64PB for a top-level vdev).
The new remains backwards compatible with the old one. The introduced
two-word entry format, besides extending the limits imposed by the single-entry
layout, also includes a vdev field and some extra padding after its prefix.
The extra padding after the prefix should is reserved for future usage (e.g.
new prefixes for future encodings or new fields for flags). The new vdev field
not only makes the space maps more self-descriptive, but also opens the doors
for pool-wide space maps.
One final important note is that the number of bits used for vdevs is reduced
to 24 bits for blkptrs. That was decided as we don't know of any setups that
use more than 16M vdevs for the time being and
we wanted to fit the vdev field in the space map. In addition that gives us
some extra bits in dva_t.
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <gwilson@zfsmail.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>
Alexander Motin [Mon, 30 Jul 2018 22:56:24 +0000 (22:56 +0000)]
9238 ZFS Spacemap Encoding V2
The current space map encoding has the following disadvantages:
[1] Assuming 512 sector size each entry can represent at most 16MB for a segment.
This makes the encoding very inefficient for large regions of space.
[2] As vdev-wide space maps have started to be used by new features (i.e.
device removal, zpool checkpoint) we've started imposing limits in the
vdevs that can be used with them based on the maximum addressable offset
(currently 64PB for a top-level vdev).
The new remains backwards compatible with the old one. The introduced
two-word entry format, besides extending the limits imposed by the single-entry
layout, also includes a vdev field and some extra padding after its prefix.
The extra padding after the prefix should is reserved for future usage (e.g.
new prefixes for future encodings or new fields for flags). The new vdev field
not only makes the space maps more self-descriptive, but also opens the doors
for pool-wide space maps.
One final important note is that the number of bits used for vdevs is reduced
to 24 bits for blkptrs. That was decided as we don't know of any setups that
use more than 16M vdevs for the time being and
we wanted to fit the vdev field in the space map. In addition that gives us
some extra bits in dva_t.
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <gwilson@zfsmail.com>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>
Alexander Motin [Mon, 30 Jul 2018 22:39:30 +0000 (22:39 +0000)]
MFV r336944: 9286 want refreservation=auto
When a ZFS volume is created with zfs create -V (but without -s), the
refreservation property is set to a value that is volsize plus the maximum
size of metadata. If refreservation is ever set to another value, it is
impossible to set it back to the automatically determined value. There are
other cases where refreservation may be wrong. These include receiving a
volume that was sent without properties and zfs clone.
We need:
zfs set refreservation=auto <volume>
zfs clone -o refreservation=auto <volume>
Each one would use the same function used by zfs create -V to determine the
proper value for refreservation.
Reviewed by: Allan Jude <allanjude@freebsd.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Mike Gerdts <mike.gerdts@joyent.com>
Alexander Motin [Mon, 30 Jul 2018 22:10:15 +0000 (22:10 +0000)]
9286 want refreservation=auto
When a ZFS volume is created with zfs create -V (but without -s), the
refreservation property is set to a value that is volsize plus the maximum
size of metadata. If refreservation is ever set to another value, it is
impossible to set it back to the automatically determined value. There are
other cases where refreservation may be wrong. These include receiving a
volume that was sent without properties and zfs clone.
We need:
zfs set refreservation=auto <volume>
zfs clone -o refreservation=auto <volume>
Each one would use the same function used by zfs create -V to determine the
proper value for refreservation.
Reviewed by: Allan Jude <allanjude@freebsd.org>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Mike Gerdts <mike.gerdts@joyent.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Approved by: Matt Ahrens <mahrens@delphix.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Approved by: Matt Ahrens <mahrens@delphix.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
Michael Tuexen [Mon, 30 Jul 2018 21:27:26 +0000 (21:27 +0000)]
Allow implicit TCP connection setup for TCP/IPv6.
TCP/IPv4 allows an implicit connection setup using sendto(), which
is used for TTCP and TCP fast open. This patch adds support for
TCP/IPv6.
While there, improve some tests for detecting multicast addresses,
which are mapped.
Michael Tuexen [Mon, 30 Jul 2018 21:13:42 +0000 (21:13 +0000)]
Send consistent SEG.WIN when using timewait codepath for TCP.
When sending TCP segments from the timewait code path, a stored
value of the last sent window is used. Use the same code for
computing this in the timewait code path as in the main code
path used in tcp_output() to avoiv inconsistencies.
Michael Tuexen [Mon, 30 Jul 2018 20:35:50 +0000 (20:35 +0000)]
Fix some TCP fast open issues.
The following issues are fixed:
* Whenever a TCP server with TCP fast open enabled, calls accept(),
recv(), send(), and close() before the TCP-ACK segment has been received,
the TCP connection is just dropped and the reception of the TCP-ACK
segment triggers the sending of a TCP-RST segment.
* Whenever a TCP server with TCP fast open enabled, calls accept(), recv(),
send(), send(), and close() before the TCP-ACK segment has been received,
the first byte provided in the second send call is not transferred.
* Whenever a TCP client with TCP fast open enabled calls sendto() followed
by close() the TCP connection is just dropped.
Rick Macklem [Mon, 30 Jul 2018 20:25:32 +0000 (20:25 +0000)]
Silence newer gcc warnings.
Newer versions of gcc generate "set, but not used" warnings.
Add __unused macros to silence these warnings.
Although the variables are not being used, they are values parsed from
arguments to callback RPCs that might be needed in the future.
The CORB and RIRB buffers exist in DMA memory, but the device reads them as
little-endian only. Read and write as LE into the DMA memory block, to work on
BE platforms.
Alexander Motin [Mon, 30 Jul 2018 19:44:14 +0000 (19:44 +0000)]
9284 arc_reclaim_thread has 2 jobs
`arc_reclaim_thread()` calls `arc_adjust()` after calling
`arc_kmem_reap_now()`; `arc_adjust()` signals `arc_get_data_buf()` to
indicate that we may no longer be `arc_is_overflowing()`.
The problem is, `arc_kmem_reap_now()` can take several seconds to
complete, has no impact on `arc_is_overflowing()`, but due to how the
code is structured, can impact how long the ARC will remain in the
`arc_is_overflowing()` state.
The fix is to use seperate threads to:
1. keep `arc_size` under `arc_c`, by calling `arc_adjust()`, which
improves `arc_is_overflowing()`
2. keep enough free memory in the system, by calling
`arc_kmem_reap_now()` plus `arc_shrink()`, which improves
`arc_available_memory()`.
Follow up to r336919 and r336921: s/efi.rt_disabled/efi.rt.disabled/
The latter matches the rest of the tree better [0]. The UPDATING entry has
been updated to reflect this, and the new tunable is now documented in
loader(8) [1].
As noted in UDPATING, the new loader tunable efi.rt_disabled may be used to
disable EFIRT at runtime. It should have no effect if you are not booted via
UEFI boot.
efirt: Add tunable to allow disabling EFI Runtime Services
Leading up to enabling EFIRT in GENERIC, allow runtime services to be
disabled with a new tunable: efi.rt_disabled. This makes it so that EFIRT
can be disabled easily in case we run into some buggy UEFI implementation
and fail to boot.
Alan Somers [Mon, 30 Jul 2018 15:46:40 +0000 (15:46 +0000)]
Make timespecadd(3) and friends public
The timespecadd(3) family of macros were imported from NetBSD back in
r35029. However, they were initially guarded by #ifdef _KERNEL. In the
meantime, we have grown at least 28 syscalls that use timespecs in some
way, leading many programs both inside and outside of the base system to
redefine those macros. It's better just to make the definitions public.
Our kernel currently defines two-argument versions of timespecadd and
timespecsub. NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define
three-argument versions. Solaris also defines a three-argument version, but
only in its kernel. This revision changes our definition to match the
common three-argument version.
Bump _FreeBSD_version due to the breaking KPI change.
Reuse of the index variable in two nested loops resulted in only the first
argument in the list being used (fine for gzip, not fine for zstd). Also
add tests for xz and zstd, and fix the COMPRESS_SUFFIX_MAXLEN macro.
snd_hda: Print error codes in decimal, rather than hex
It's easy to confuse the error code as naked it looks decimal (EINVAL is
reported as error 16, instead of error 22, so first reading looks like EBUSY).
Ed Maste [Mon, 30 Jul 2018 15:10:06 +0000 (15:10 +0000)]
Revert accidental change from r336908
By default ld.lld should be the bootstrap linker (only) on i386 right
now. Once the i386 exp-run with LLD_IS_LD has a good result this will
also be enabled by default.
Andrew Turner [Mon, 30 Jul 2018 15:05:07 +0000 (15:05 +0000)]
As with DPCPU_DEFINE_STATIC make VNET_DEFINE_STATIC non-static on arm64 in
modules. It also fails in the same way, we are unable to relocate static
variables as the compiler uses PC-relative loads with nothing for the
kernel linker to relocate.
Andrew Turner [Mon, 30 Jul 2018 14:25:17 +0000 (14:25 +0000)]
Ensure the DPCPU and VNET module spaces are aligned to hold a pointer.
Previously they may have been aligned to a char, leading to misaligned
DPCPU and VNET variables.
David Bright [Mon, 30 Jul 2018 14:21:49 +0000 (14:21 +0000)]
Correct possible misleading error message in kqtest.
ian@ pointed out that in the test_abstime() function time(NULL) is
used twice; once in an "if" test and again in the enclosed error
message. If the true branch was taken and the process got preempted
before the second time(NULL) call, by the time the error message was
generated enough time could have elapsed that the message could claim
that the event came "too early" but print an event time that was after
the expected timeout. Correct by making the time(NULL) call only once
and using that returned time in both the "if" test and the error
message.
Ed Maste [Mon, 30 Jul 2018 12:38:08 +0000 (12:38 +0000)]
Enable ld.lld as bootstrap linker by default on i386
Akin to r327783 for amd64. lld has been usable for amd64 for quite some
time, but a couple of issues remained that affected i386. These were
recently addressed upstream in lld and merged into FreeBSD or addressed
directly in FreeBSD (r326831, r326879, r326897, r326957, r333401,
r334626, r336664).
Similarly to the intial amd64 commit this change enables lld only as the
bootstrap linker (used to link the kernel and userland libraries and
executables), while GNU ld.bfd is still installed as /usr/bin/ld and
used for ports builds. That will be changed shortly, after an exp-run.
This is a recommit of r327823 after additional lld fixes.
PR: 225128 (exp-run)
Relnotes: Yes
Sponsored by: The FreeBSD Foundation