FreeBSD/FreeBSD.git
Jeff Roberson [Thu, 16 Jan 2020 05:01:21 +0000 (05:01 +0000)]
Simplify VM and UMA startup by eliminating boot pages.  Instead use careful
ordering to allocate early pages in the same way boot pages were but only
as needed.  After the KVA allocator has started up we allocate the KVA that
we consumed during boot.  This also makes the boot pages freeable since they
have vm_page structures allocated with the rest of memory.

Parts of this patch were written and tested by markj.

Reviewed by: glebius, markj
Differential Revision: https://reviews.freebsd.org/D23102

Leandro Lupori [Wed, 15 Jan 2020 20:25:52 +0000 (20:25 +0000)]
[PPC64] memcpy/memmove/bcopy optimization

For copies shorter than 512 bytes, the data is copied using plain
ld/std instructions.
For 512 bytes or more, the copy is done in 3 phases:

Phase 1: copy from the src buffer until it's aligned at a 16-byte boundary
Phase 2: copy as many aligned 64-byte blocks from the src buffer as possible
Phase 3: copy the remaining data, if any

In phase 2, this code uses VSX instructions when available. Otherwise,
it uses ldx/stdx.
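
The three phases can be modelled in portable C as a rough sketch. The real
routine is PPC64 assembly; the name phased_copy() and the byte loops standing
in for ld/std, ldx/stdx and VSX are assumptions made only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C model of the copy strategy; not the committed assembly. */
void *
phased_copy(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	if (len < 512) {
		/* Short copy: stand-in for the plain ld/std path. */
		while (len-- > 0)
			*d++ = *s++;
		return (dst);
	}

	/* Phase 1: advance until src sits on a 16-byte boundary. */
	while (((uintptr_t)s & 15) != 0) {
		*d++ = *s++;
		len--;
	}

	/* Phase 2: copy whole 64-byte blocks (VSX or ldx/stdx in the real code). */
	while (len >= 64) {
		memcpy(d, s, 64);
		d += 64;
		s += 64;
		len -= 64;
	}

	/* Phase 3: copy whatever is left. */
	while (len-- > 0)
		*d++ = *s++;
	return (dst);
}
```

Phase 1 consumes at most 15 bytes, so a copy of 512 bytes or more always
reaches phase 2 with several full blocks left.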

Submitted by: Luis Pires <lffpires_ruabrasil.org> (original version)
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D15118

Leandro Lupori [Wed, 15 Jan 2020 19:53:03 +0000 (19:53 +0000)]
[PPC64] strncpy optimization

Assembly optimization of strncpy for PowerPC64, using double words
instead of bytes to copy strings.

Submitted by: Leonardo Bianconi <leonardo.bianconi_eldorado.org.br> (original version)
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D15369

Leandro Lupori [Wed, 15 Jan 2020 19:46:01 +0000 (19:46 +0000)]
[PPC64] strcpy optimization

Assembly optimization of strcpy for PowerPC64, using double words
instead of bytes to copy strings.
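
As a rough C sketch of the doubleword idea (the committed code is assembly and
may differ): copy eight bytes at a time until a doubleword contains a NUL, then
finish byte by byte. The names has_zero_byte() and dword_strcpy() are
hypothetical, and the zero-byte test is the well-known bit trick, not
necessarily what the patch uses. The aligned 8-byte read may touch bytes after
the terminator inside the same doubleword, which is harmless in assembly but
formally out of bounds in ISO C, so the sketch assumes a padded source buffer.

```c
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the eight bytes in w is zero. */
static int
has_zero_byte(uint64_t w)
{
	return (((w - 0x0101010101010101ULL) & ~w &
	    0x8080808080808080ULL) != 0);
}

char *
dword_strcpy(char *dst, const char *src)
{
	char *d = dst;
	uint64_t w;

	/* Byte-copy until src is 8-byte aligned. */
	while (((uintptr_t)src & 7) != 0) {
		if ((*d++ = *src++) == '\0')
			return (dst);
	}
	/* Copy 8 bytes at a time while the doubleword holds no NUL. */
	for (;;) {
		memcpy(&w, src, sizeof(w));
		if (has_zero_byte(w))
			break;
		memcpy(d, &w, sizeof(w));
		d += 8;
		src += 8;
	}
	/* Finish the last partial doubleword byte by byte. */
	while ((*d++ = *src++) != '\0')
		;
	return (dst);
}
```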

Submitted by: Leonardo Bianconi <leonardo.bianconi_eldorado.org.br> (original version)
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D15368

Ed Maste [Wed, 15 Jan 2020 19:43:45 +0000 (19:43 +0000)]
acpi_ibm: reference ThinkPad instead of IBM

These are now Lenovo ThinkPads, not IBM ThinkPads.

PR: 234403
Submitted by: Kevin Zheng <kevinz5000@gmail.com> (original)

Kirk McKusick [Wed, 15 Jan 2020 18:53:32 +0000 (18:53 +0000)]
Peter Holm reports that his test that does an umount(8) on an active
mount point while numerous tests are running that are writing to
files on that mount point causes the umount(8) to hang forever.

The unmount(2) system call is handled in the kernel by the dounmount()
function. The cause of the hang is that prior to dounmount() calling
VFS_UNMOUNT() it is calling VFS_SYNC(mp, MNT_WAIT). The MNT_WAIT
flag indicates that VFS_SYNC() should not return until all the dirty
buffers associated with the mount point have been written to disk.
Because user processes are allowed to continue writing and can do
so faster than the data can be written to disk, the call to VFS_SYNC()
can never finish.

Unlike VFS_SYNC(), the VFS_UNMOUNT() routine can suspend all processes
when they request to do a write, leaving a finite number of dirty
buffers to write that cannot grow. There is no need to call
VFS_SYNC() before calling VFS_UNMOUNT(), because VFS_UNMOUNT() needs
to flush everything again anyway after suspending writes, to catch
anything that was dirtied between the VFS_SYNC() and writes being
suspended.

The fix is to simply remove the unnecessary call to VFS_SYNC() from
dounmount().

Reported by:  Peter Holm
Analysis by:  Chuck Silvers
Tested by:    Peter Holm
MFC after:    7 days
Sponsored by: Netflix

Scott Long [Wed, 15 Jan 2020 16:47:44 +0000 (16:47 +0000)]
Fix a spacing error from the previous commit for -ll mode.  Add a little
more space padding to that mode to give the columns a consistent offset.

Kyle Evans [Wed, 15 Jan 2020 15:59:32 +0000 (15:59 +0000)]
mips trampoline: don't bother with unwind tables

The utility here seems somewhat limited, but clang will attempt to generate
.eh_frame and actively fail in doing so. It is perhaps worth investigating
why it's being generated in the first place (GCC doesn't do so), but this
isn't a high priority.

Mark Johnston [Wed, 15 Jan 2020 15:31:35 +0000 (15:31 +0000)]
Handle a NULL thread pointer in linux_close_file().

This can happen if a file is closed during unix socket GC.  The same bug
was fixed for devfs descriptors in r228361.

PR: 242913
Reported and tested by: iz-rpi03@hs-karlsruhe.de
Reviewed by: hselasky, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23178

Ed Maste [Wed, 15 Jan 2020 13:52:13 +0000 (13:52 +0000)]
Update WITHOUT_BINUTILS* descriptions

In the WITHOUT_ descriptions we don't need to mention that ld.bfd is
limited to powerpc. When WITHOUT_BINUTILS is specified ld.bfd is not
installed on any CPU architecture.

Ben Woods [Wed, 15 Jan 2020 07:47:52 +0000 (07:47 +0000)]
bsdinstall: Change "default" (first) Partitioning method to ZFS

Reported by: Ruben Schade (during his talk at linux.conf.au)
Approved by: philip
Differential Revision: https://reviews.freebsd.org/D23173

Gleb Smirnoff [Wed, 15 Jan 2020 06:18:32 +0000 (06:18 +0000)]
gif_transmit() must always be called in the network epoch.

Gleb Smirnoff [Wed, 15 Jan 2020 06:12:39 +0000 (06:12 +0000)]
A miss from r356754.

Gleb Smirnoff [Wed, 15 Jan 2020 06:05:20 +0000 (06:05 +0000)]
Introduce NET_EPOCH_CALL() macro and use it everywhere where we free
data based on the network epoch.   The macro reverses the argument
order of epoch_call(9) - first function, then its argument. NFC
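
A self-contained mock of the argument reordering, for illustration only. Every
name below (mock_epoch_call(), MOCK_NET_EPOCH_CALL(), struct mock_ctx) is a
hypothetical stand-in; the kernel's real types and prototypes live in
sys/epoch.h and are not reproduced here.

```c
/* Hypothetical stand-ins for the kernel's epoch context and callback types. */
struct mock_ctx {
	int freed;
};
typedef void (*mock_cb_t)(struct mock_ctx *);

static int mock_epoch;	/* plays the role of the global network epoch */

/* Mock scheduling call that takes the context before the function. */
static void
mock_epoch_call(int *epoch, struct mock_ctx *ctx, mock_cb_t cb)
{
	(void)epoch;
	cb(ctx);	/* a real epoch defers this until readers have drained */
}

/* Wrapper in the spirit of NET_EPOCH_CALL(): function first, then argument. */
#define MOCK_NET_EPOCH_CALL(f, c) mock_epoch_call(&mock_epoch, (c), (f))

static void
mock_free_cb(struct mock_ctx *ctx)
{
	ctx->freed = 1;
}

/* Returns 1 once the callback has run through the wrapper. */
int
demo_epoch_call(void)
{
	struct mock_ctx ctx = { 0 };

	MOCK_NET_EPOCH_CALL(mock_free_cb, &ctx);
	return (ctx.freed);
}
```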

Gleb Smirnoff [Wed, 15 Jan 2020 05:48:36 +0000 (05:48 +0000)]
Use official macro to enter/exit the network epoch. NFC

Gleb Smirnoff [Wed, 15 Jan 2020 05:45:27 +0000 (05:45 +0000)]
Mechanically substitute assertion of in_epoch(net_epoch_preempt) to
NET_EPOCH_ASSERT(). NFC

Gleb Smirnoff [Wed, 15 Jan 2020 03:41:15 +0000 (03:41 +0000)]
Stop header pollution and don't include if_var.h via in_pcb.h.

Gleb Smirnoff [Wed, 15 Jan 2020 03:40:32 +0000 (03:40 +0000)]
Since this code dereferences struct ifnet, it must include if_var.h
explicitly, not via header pollution.  While here move TCPSTATES
declaration right above the include that is going to make use of it.

Gleb Smirnoff [Wed, 15 Jan 2020 03:39:11 +0000 (03:39 +0000)]
Since this code uses if_ref()/if_rele() it must include if_var.h
explicitly, not via header pollution.

Gleb Smirnoff [Wed, 15 Jan 2020 03:35:57 +0000 (03:35 +0000)]
Netgraph queue processing thread must process all its items
in the network epoch.

Reported by: Michael Zhilin <mizhka@ >

Gleb Smirnoff [Wed, 15 Jan 2020 03:34:21 +0000 (03:34 +0000)]
- Move global network epoch definition to epoch.h, as more different
  subsystems tend to need to know about it, and including if_var.h is
  huge header pollution for them.  Polluting possible non-network
  users with single symbol seems much lesser evil.
- Remove non-preemptible network epoch.  Not used yet, and unlikely
  to get used in close future.

Gleb Smirnoff [Wed, 15 Jan 2020 03:30:33 +0000 (03:30 +0000)]
The non-preemptible network epoch identified by net_epoch isn't used.
This code definitely meant net_epoch_preempt.

Mateusz Guzik [Wed, 15 Jan 2020 01:34:05 +0000 (01:34 +0000)]
vfs: in vop_stdadd_writecount only vlazy vnodes on mounts using msync

The only reason to vlazy there is to (overzealously) ensure all vnodes
which need to be visited by msync scan can be found there.

In particular this is of no use to zfs and tmpfs.

While here depessimize the check.

Mateusz Guzik [Wed, 15 Jan 2020 01:32:11 +0000 (01:32 +0000)]
tmpfs: add missing CTLFLAG_MPSAFE annotation

Mateusz Guzik [Wed, 15 Jan 2020 01:31:57 +0000 (01:31 +0000)]
nfs: add missing CTLFLAG_MPSAFE annotations

Mateusz Guzik [Wed, 15 Jan 2020 01:31:28 +0000 (01:31 +0000)]
fusefs: add missing CTLFLAG_MPSAFE annotation

Mateusz Guzik [Wed, 15 Jan 2020 01:30:32 +0000 (01:30 +0000)]
rtld: remove hand rolled memset and bzero

They were introduced to take care of ifunc, but right now no architecture
provides ifunc'ed variants. Since rtld uses memset extensively this results in
a pessimization. Should someone want to use ifunc here they should provide a
mandatory symbol (e.g., rtld_memset).

See the review for profiling data.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D23176

Rebecca Cran [Wed, 15 Jan 2020 00:45:05 +0000 (00:45 +0000)]
bsdinstall: Use TMPDIR if set

Submitted by: Ryan Moeller <ryan@freqlabs.com>
Reviewed by: bcran, Nick Wolff <darkfiberiru@gmail.com>
Differential Revision: https://reviews.freebsd.org/D22979/

Kirk McKusick [Tue, 14 Jan 2020 22:27:46 +0000 (22:27 +0000)]
When sync'ing a mount point, the mount point's vnodes were scanned
twice. Once to update the changed inodes, and a second time to update
changed quota information. This change merges these two scans into a
single scan which does both inode and quota updates.

MFC after: 7 days

Ed Maste [Tue, 14 Jan 2020 18:06:09 +0000 (18:06 +0000)]
src.conf.5: regen after r356736, limiting ld.bfd to powerpc

John Baldwin [Tue, 14 Jan 2020 18:00:04 +0000 (18:00 +0000)]
Preserve the inherited value of the status register in cpu_set_upcall().

Instead of re-deriving the value of SR using logic similar to
exec_set_regs(), just inherit the value from the existing thread
similar to fork().

Reviewed by: brooks
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D23059

Ed Maste [Tue, 14 Jan 2020 17:56:54 +0000 (17:56 +0000)]
limit ld.bfd to powerpc

All archs except powerpc either use lld or require external toolchain.
powerpc still needs binutils ld to link 32-bit binaries.

Reviewed by: jhibbits
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23107

Kyle Evans [Tue, 14 Jan 2020 17:50:13 +0000 (17:50 +0000)]
Revert r353140: Re-add ALLOW_MIPS_SHARED_TEXTREL, sprinkle it around

arichardson has an actual fix for the same issue that this was working
around; given that we don't build with llvm today, go ahead and revert the
workaround in advance.

Ed Maste [Tue, 14 Jan 2020 17:38:34 +0000 (17:38 +0000)]
src.conf.5: regen after option description updates

Ed Maste [Tue, 14 Jan 2020 17:35:34 +0000 (17:35 +0000)]
Update WITH_/WITHOUT_CLANG_IS_CC descriptions

Describe /usr/bin/cc etc. as links to the compiler, and don't conflate
WITHOUT_CLANG_IS_CC with installing GCC.  Leave a reference to WITH_GCC
and WITHOUT_CLANG_IS_CC installing links to GCC, although this will be
removed in ~1.5 months when GCC 4.2.1 is removed from the tree.

Sponsored by: The FreeBSD Foundation

Ed Maste [Tue, 14 Jan 2020 16:59:21 +0000 (16:59 +0000)]
Update WITH_AMD description reflecting upcoming removal

In-tree amd(8) is deprecated; update WITH_AMD's description to make
this more clear.

Sponsored by: The FreeBSD Foundation

Mark Johnston [Tue, 14 Jan 2020 15:35:03 +0000 (15:35 +0000)]
Do not skip line-by-line comparison if -q and -I are specified.

This fixes a regression from r356695.

Submitted by: kevans
Reported by: Jenkins via lwhsu
MFC after: 6 days

Andriy Gapon [Tue, 14 Jan 2020 13:20:16 +0000 (13:20 +0000)]
storvsc: port a Linux patch, properly set residual data length on errors

This change is based on Linux commit 40630f462824ee.  csio.resid should
account for transfer_len only for success and SRB_STATUS_DATA_OVERRUN
condition.

I am not sure how exactly this change works, but I have a report from a
user that they see lots of checksum errors when running a pool scrub
concurrently with iozone -l 1 -s 100G.  After applying this patch the
problem cannot be reproduced.

Reviewed by: nobody
Sponsored by: CyberSecure
Differential Revision: https://reviews.freebsd.org/D22312

Edward Tomasz Napierala [Tue, 14 Jan 2020 11:33:07 +0000 (11:33 +0000)]
Make linux(4) use kern_setsockopt(9) instead of going through
sys_setsockopt.  Just a cleanup; no functional changes.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22812

Edward Tomasz Napierala [Tue, 14 Jan 2020 11:30:30 +0000 (11:30 +0000)]
Make linux(4) use kern_getsockopt(9) instead of going through
sys_getsockopt().  It's a cleanup; no functional changes.

Reviewed by: kib (earlier version)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22813

Edward Tomasz Napierala [Tue, 14 Jan 2020 11:24:06 +0000 (11:24 +0000)]
Make linux getcpu(2) report the domain.

Submitted by: markj
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23144

Baptiste Daroussin [Tue, 14 Jan 2020 08:22:28 +0000 (08:22 +0000)]
When system calls indicate an error they return -1, not some arbitrary
value < 0.  errno is only updated in this case.
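
The convention can be illustrated with open(2). This is a minimal sketch; the
helper name try_open() is made up for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 on success, or the errno value reported by open(2). */
int
try_open(const char *path)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)		/* compare against -1, not "< 0" */
		return (errno);	/* errno is meaningful only on this path */
	close(fd);
	return (0);
}
```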

Obtained from: OpenBSD
MFC after: 3 days

Baptiste Daroussin [Tue, 14 Jan 2020 08:18:04 +0000 (08:18 +0000)]
asprintf returns -1, not an arbitrary value < 0.  Also, upon error the
(very sloppy) specification leaves an undefined value in *ret, so it is
wrong to inspect it; the error condition is enough.
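
A short sketch of the resulting pattern, with a made-up helper name
make_pair(). The _GNU_SOURCE define is only there so the example also builds
against glibc; it is not needed on FreeBSD.

```c
#define _GNU_SOURCE	/* for asprintf() on glibc; not required on FreeBSD */
#include <stdio.h>
#include <stdlib.h>

/* Returns a heap-allocated "name=value" string, or NULL on failure. */
char *
make_pair(const char *name, int value)
{
	char *out;

	if (asprintf(&out, "%s=%d", name, value) == -1)
		return (NULL);	/* do not look at out: its value is unspecified */
	return (out);
}
```

The caller owns the returned string and releases it with free(3).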

Obtained from: OpenBSD
MFC after: 3 days

Baptiste Daroussin [Tue, 14 Jan 2020 08:16:15 +0000 (08:16 +0000)]
mkstemp returns -1

Obtained from: OpenBSD
MFC after: 3 days

Alexander Motin [Tue, 14 Jan 2020 03:27:57 +0000 (03:27 +0000)]
Restore loop break in vm_pageout_lowmem().

r355004 removed the return statement from this loop with the intention of
also calling uma_reclaim_wakeup().  But with vm.lowmem_period=0 this causes
an infinite loop.

Reviewed by: markj
Sponsored by: iXsystems, Inc.

Ryan Libby [Tue, 14 Jan 2020 02:14:15 +0000 (02:14 +0000)]
uma: split slabzone into two sizes

By allowing more items per slab, we can improve memory efficiency for
small allocs.  If we were just to increase the bitmap size of the
slabzone, we would then waste slabzone memory.  So, split slabzone into
two zones, one especially for 8-byte allocs (512 per slab).  The
practical effect should be reduced memory usage for counter(9).
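
The arithmetic behind "512 per slab" can be sketched as follows. The 4096-byte
slab size is an assumption inferred from 512 * 8, and both helper names are
hypothetical.

```c
#include <stddef.h>

/* Assumed slab size; the message's "512 per slab" implies 4096 / 8. */
#define SLAB_BYTES	4096

/* Number of items of the given size that fit in one slab. */
size_t
items_per_slab(size_t item_size)
{
	return (SLAB_BYTES / item_size);
}

/* Bytes of free-item bitmap needed to track that many items, one bit each. */
size_t
bitmap_bytes(size_t item_size)
{
	return ((items_per_slab(item_size) + 7) / 8);
}
```

Under that assumption, a slab of 8-byte items needs a 64-byte bitmap while a
slab of 16-byte items needs only 32, which is why a single slabzone sized for
the largest bitmap would waste memory on every other slab.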

Reviewed by: jeff, markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D23149

Ryan Libby [Tue, 14 Jan 2020 02:14:02 +0000 (02:14 +0000)]
malloc: remove assumptions about MINALLOCSIZE

Remove assumptions about the minimum MINALLOCSIZE, in order to allow
testing of smaller MINALLOCSIZE.  A following patch will lower the
MINALLOCSIZE, but not so much that the present patch is required for
correctness at these sites.

Reviewed by: jeff, markj
Sponsored by: Dell EMC Isilon

Ryan Libby [Tue, 14 Jan 2020 02:13:46 +0000 (02:13 +0000)]
uma: fixup some ktr messages

Reviewed by: markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D23148

Jeff Roberson [Tue, 14 Jan 2020 02:00:24 +0000 (02:00 +0000)]
Fix a long standing bug in journaled soft-updates.  The dirrem structure
needs to handle file removal, directory removal, file move, directory move,
etc.  The code in handle_workitem_remove() needs to propagate any completed
journal entries to the write that will render the change stable.  In the
case of a moved directory this means the new parent.  However, for an
overwrite that frees a directory (DIRCHG) we must move the jsegdep to the
removed inode to be released when it is stable in the cg bitmap or the
unlinked inode list.  This case was previously unhandled and caused a
panic.

Reported by: mckusick, pho
Reviewed by: mckusick
Tested by: pho

Navdeep Parhar [Tue, 14 Jan 2020 01:43:04 +0000 (01:43 +0000)]
cxgbe/iw_cxgbe: Do not allow memory registrations with page size greater
than 128MB, which is the maximum supported by the hardware in RDMA mode.

Obtained from: Chelsio Communications
MFC after: 3 days
Sponsored by: Chelsio Communications

Justin Hibbits [Mon, 13 Jan 2020 23:09:00 +0000 (23:09 +0000)]
powerpc/mpc85xx: Partially revert r356640

The count block was correct before.  r356640 caused a read past the end of
the tuple.

Eric van Gyzen [Mon, 13 Jan 2020 22:36:29 +0000 (22:36 +0000)]
fstyp hammer2: remove dead code

best_i will always be >= 0, so remove code to test otherwise.

Reported by: Coverity
CID: 1412244
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D23159

Eric van Gyzen [Mon, 13 Jan 2020 22:33:48 +0000 (22:33 +0000)]
fstyp hammer: use strlcpy

Use strlcpy to guarantee NUL termination.  Due to this, there is
no need for strncmp; simply use strcmp.
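
A minimal sketch of why the terminated copy makes strcmp safe. Because
strlcpy(3) is not available in every libc, demo_strlcpy() below is a portable
stand-in with the same semantics; it and label_matches() are hypothetical
names, not code from fstyp.

```c
#include <stddef.h>
#include <string.h>

/* Portable stand-in with strlcpy(3) semantics. */
size_t
demo_strlcpy(char *dst, const char *src, size_t dstsize)
{
	size_t srclen = strlen(src);

	if (dstsize != 0) {
		size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';	/* always terminated, even when truncated */
	}
	return (srclen);	/* a result >= dstsize signals truncation */
}

/* With a terminated copy, a plain strcmp() cannot run off the buffer. */
int
label_matches(const char *raw, const char *want)
{
	char label[8];

	demo_strlcpy(label, raw, sizeof(label));
	return (strcmp(label, want) == 0);
}
```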

Reported by: Coverity
CID: 1412242
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D23159

Alexander Motin [Mon, 13 Jan 2020 22:06:16 +0000 (22:06 +0000)]
Map ECKSUM and EFRAGS from ZFS onto real errnos.

Make it less confusing when, for example, stat sets errno to 122 because a
checksum failed in ZFS:

Before: getfacl: /foo/bar: stat() failed: Unknown error: 122
After: getfacl: /foo/bar: stat() failed: Integrity check failed

Submitted by: Ryan Moeller <ryan@ixsystems.com>
Reviewed by: mckusick, mav
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D22973

Eric van Gyzen [Mon, 13 Jan 2020 22:01:37 +0000 (22:01 +0000)]
savecore: include time zone in info.N file

This helps with event correlation when machines are distributed
across multiple time zones.

Format the time with relaxed ISO 8601 for all the usual reasons.
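
One way to produce such a timestamp is strftime(3) with a numeric zone offset.
The exact format string savecore uses is not given here, so the layout below
and the helper name format_dump_time() are assumptions.

```c
#include <stddef.h>
#include <time.h>

/* Formats t as local time plus UTC offset; returns bytes written, 0 on failure. */
size_t
format_dump_time(time_t t, char *buf, size_t buflen)
{
	struct tm tm;

	if (localtime_r(&t, &tm) == NULL)
		return (0);
	/* Assumed "relaxed" layout: a space instead of 'T', then the offset. */
	return (strftime(buf, buflen, "%Y-%m-%d %H:%M:%S %z", &tm));
}
```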

MFC after: 2 weeks
Sponsored by: Dell EMC Isilon

Eric van Gyzen [Mon, 13 Jan 2020 21:49:27 +0000 (21:49 +0000)]
Add missing comma in nfsv4_errstr

Reported by: Coverity
CID: 1412243
Sponsored by: Dell EMC Isilon

Vincenzo Maffione [Mon, 13 Jan 2020 21:47:23 +0000 (21:47 +0000)]
netmap: disable passthrough with no hypervisor support

The netmap passthrough subsystem requires proper support in the
hypervisor. In particular, two PCI device ids (from the Red Hat
PCI vendor id 0x1b36) need to be assigned to the two netmap
virtual devices. We therefore disable these devices until the ids have
been assigned, in order to avoid conflicts with other
virtual devices emulated by upstream QEMU.

PR: 241774
MFC after: 3 days

Vincenzo Maffione [Mon, 13 Jan 2020 21:26:17 +0000 (21:26 +0000)]
vmx: fix initialization of TSO related descriptor fields

Fix a mistake introduced by r343291, which ported the vmx(4)
driver to iflib.
In case of TSO, the hlen field of the (first) tx descriptor must
be initialized to the cumulative length of Ethernet, IP and TCP
headers. The length of the TCP header was missing.
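
The corrected computation amounts to a three-term sum. The struct and field
names below are hypothetical, chosen only to make the sketch self-contained.

```c
/* Hypothetical header-length fields for one outgoing packet. */
struct demo_pkt_info {
	int ehdrlen;	/* Ethernet header, e.g. 14 */
	int ip_hlen;	/* IP header, e.g. 20 */
	int tcp_hlen;	/* TCP header, e.g. 20 */
};

/* Header length a TSO descriptor needs: all three headers, TCP included. */
int
tso_hlen(const struct demo_pkt_info *pi)
{
	return (pi->ehdrlen + pi->ip_hlen + pi->tcp_hlen);
}
```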

PR: 236999
Reported by: pkelsey
Reviewed by: avg
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22967

Gleb Smirnoff [Mon, 13 Jan 2020 21:12:10 +0000 (21:12 +0000)]
Fix yet another regression from r354484. Error code from cr_cansee()
aliases with hard error from other operations.

Reported by: flo

Dimitry Andric [Mon, 13 Jan 2020 20:31:10 +0000 (20:31 +0000)]
Merge commit f46ba4f07 from llvm git (by Simon Atanasyan):

  [mips] Use less registers to load address of TargetExternalSymbol

  There is no pattern matched `add hi, (MipsLo texternalsym)`. As a
  result, loading an address of 32-bit symbol requires two registers
  and one more additional instruction:
  ```
  addiu $1, $zero, %lo(foo)
  lui   $2, %hi(foo)
  addu  $25, $2, $1
  ```

  This patch adds the missing pattern and enables generation of a more
  effective set of instructions:
  ```
  lui   $1, %hi(foo)
  addiu $25, $1, %lo(foo)
  ```

  Differential Revision: https://reviews.llvm.org/D66771

  llvm-svn: 370196

Merge commit 59bb3609f from llvm git (by Simon Atanasyan):

  [mips] Fix 64-bit address loading in case of applying 32-bit mask to
  the result

  If result of 64-bit address loading combines with 32-bit mask, LLVM
  tries to optimize the code and remove "redundant" loading of upper
  32-bits of the address. It leads to incorrect code on MIPS64 targets.

  MIPS backend creates the following chain of commands to load 64-bit
  address in the `MipsTargetLowering::getAddrNonPICSym64` method:
  ```
  (add (shl (add (shl (add %highest(sym), %higher(sym)),
                      16),
                 %hi(sym)),
            16),
       %lo(%sym))
  ```

  If the mask is present, LLVM decides to optimize the chain of commands.
  It really does not make sense to load upper 32-bits because the
  0x0fffffff mask anyway clears them. After removing redundant commands
  we get this chain:
  ```
  (add (shl (%hi(sym), 16), %lo(%sym))
  ```

  There are no patterns matching `(MipsHi (i64 symbol))`. Due to a bug in
  the `SYM_32` predicate definition, the backend incorrectly selects a
  pattern for 32-bit symbols and uses the `lui` instruction for loading
  `%hi(sym)`.

  As a result we get incorrect set of instructions with unnecessary
  16-bit left shifting:
  ```
  lui     at,0x0
      R_MIPS_HI16     foo
  dsll    at,at,0x10
  daddiu  at,at,0
      R_MIPS_LO16     foo
  ```

  This patch resolves two problems:
  - Fix `SYM_32/SYM_64` predicates to prevent selection of patterns
    dedicated to 32-bit symbols in case of using N64 ABI.
  - Add missed patterns for 64-bit symbols for `%hi/%lo`.

  Fix PR42736.

  Differential Revision: https://reviews.llvm.org/D66228

  llvm-svn: 370268

These two commits fix a miscompilation of the kernel for mips64, and
should allow clang to be used as the default compiler for mips64.

Requested by: arichards
MFC after: 3 days

Toomas Soome [Mon, 13 Jan 2020 20:02:27 +0000 (20:02 +0000)]
Backout 356693. The libsa malloc does provide necessary alignment and
memalign by 4 will reduce alignment for some platforms. Thanks to Ian for
pointing this out.

Mark Johnston [Mon, 13 Jan 2020 18:29:47 +0000 (18:29 +0000)]
Optimize diff -q.

Once we know whether the files differ, we don't need to do any further
work.
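
The early exit can be sketched at byte level as below. diff(1) itself works on
its own data structures, so files_differ() is only a hypothetical illustration
of stopping at the first mismatch.

```c
#include <stdio.h>

/* Returns 1 if the streams differ, 0 if identical; stops at first mismatch. */
int
files_differ(FILE *a, FILE *b)
{
	int ca, cb;

	do {
		ca = getc(a);
		cb = getc(b);
		if (ca != cb)
			return (1);	/* answer known: skip the rest */
	} while (ca != EOF);
	return (0);
}
```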

PR: 242828
Submitted by: fehmi noyan isi <fnoyanisi@yahoo.com> (original version)
Reviewed by: bapt, kevans
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23152

Kyle Evans [Mon, 13 Jan 2020 18:26:27 +0000 (18:26 +0000)]
tap(4): also note that we drop configured addresses

This provides a specific pointer for users of tap(4) to understand why their
interfaces are losing their addresses, and specifically how to work around
this if they need different behavior.

This manpage received a .Dd bump earlier today in r35688, so no bump occurs
this time.

Submitted by: sigsys@gmail.com (via IRC)

Toomas Soome [Mon, 13 Jan 2020 18:22:54 +0000 (18:22 +0000)]
loader: allocate properly aligned buffer for network packet

Use memalign(4, size) to ensure we have a properly aligned buffer.

MFC after: 2 weeks

Kyle Evans [Mon, 13 Jan 2020 17:02:42 +0000 (17:02 +0000)]
Install tap(4) manpage as vmnet(4) as well

If one comes across a vmnet interface, this is a useful pointer to have
towards what it actually is if they're otherwise unfamiliar.

MFC after: 3 days

Kristof Provost [Mon, 13 Jan 2020 16:52:26 +0000 (16:52 +0000)]
gprof: Enable riscv

Add a missing riscv.h header file, and fix the check for riscv (must test
MACHINE_CPUARCH, not MACHINE_ARCH, if we want to use 'riscv').

Sponsored by: Axiado

Glen Barber [Mon, 13 Jan 2020 16:31:58 +0000 (16:31 +0000)]
Fix a typo.

MFC after: 3 days
Sponsored by: Rubicon Communications, LLC (netgate.com)

Glen Barber [Mon, 13 Jan 2020 16:31:00 +0000 (16:31 +0000)]
Ensure the TYPE, BRANCH, and REVISION variables are set in
cloudware targets when OSRELEASE is overridden.

Submitted by: Trond Endrestol
PR: 243287
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC (netgate.com)

Ed Maste [Mon, 13 Jan 2020 14:50:22 +0000 (14:50 +0000)]
src.conf.5: regen after r356615, KERBEROS_SUPPORT dep on KERBEROS

Mateusz Guzik [Mon, 13 Jan 2020 14:33:51 +0000 (14:33 +0000)]
ufs: relax an overzealous assert added in r356671

Part of i_flag can persist across a drop to hold count of 0, at which
point the vnode is taken off the lazy list. Then whoever locks and unlocks
the vnode can trip on the assert.

This trips over kyua running a test untarring character devices to ufs.

Reported by: lwhsu

Konstantin Belousov [Mon, 13 Jan 2020 14:30:19 +0000 (14:30 +0000)]
Code must not unlock a mutex while owning the thread lock.

Reviewed by: hselasky, markj
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D23150

Cy Schubert [Mon, 13 Jan 2020 06:55:38 +0000 (06:55 +0000)]
Sync with r356645. desiredvnodes is now maxvnodes.

Cy Schubert [Mon, 13 Jan 2020 06:55:35 +0000 (06:55 +0000)]
As of r356642 desiredvnodes is u_long.

Cy Schubert [Mon, 13 Jan 2020 06:55:31 +0000 (06:55 +0000)]
Unbound's config.h is manually maintained, using a ./configure produced
config.h as a guide. In practice contributed software maintains a copy
of config.h within its build directory tree containing its Makefile.
usr.sbin/unbound is the home for its config.h.

MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22983

Mitchell Horne [Mon, 13 Jan 2020 03:39:02 +0000 (03:39 +0000)]
RISC-V: fix global symbol lookups for mpentry with lld

This is a follow up to r356481. In locore.S, before virtual memory is
set up, we should avoid using indirect address lookups through the GOT.
Therefore we need to convert uses of the la instruction to lla, which
always generates an auipc/addi pair of instructions. This conversion was
done for the BSP case, but not the AP case, resulting in a fault
somewhere before mpva and a failure to bring APs online.

Reported by: lwhsu
Reviewed by: lwhsu, jrtc27 (accepted in a comment)
Differential Revision: https://reviews.freebsd.org/D23138

Mateusz Guzik [Mon, 13 Jan 2020 02:39:41 +0000 (02:39 +0000)]
vfs: per-cpu batched requeuing of free vnodes

Constant requeuing adds significant lock contention in certain
workloads. Lessen the problem by batching it.

Per-cpu areas are locked in order to synchronize against UMA freeing
memory.

vnode's v_mflag is converted to short to prevent the struct from
growing.
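
The batching idea, reduced to a single-threaded sketch: items collect in a
local batch and the contended lock is taken once per batch instead of once per
item. The batch size, the names, and the counter standing in for a real lock
are all assumptions; the committed code uses per-CPU areas inside the VFS.

```c
#define BATCH_SIZE 8	/* assumed batch size, for illustration only */

struct demo_batch {
	int items[BATCH_SIZE];
	int count;
};

int global_lock_acquisitions;	/* counts trips to the contended lock */
int global_requeued;		/* items moved on the shared list */

/* Drain the local batch under one acquisition of the simulated lock. */
void
batch_flush(struct demo_batch *b)
{
	if (b->count == 0)
		return;
	global_lock_acquisitions++;	/* lock ... unlock in real code */
	global_requeued += b->count;
	b->count = 0;
}

/* Queue an item locally; touch the global lock only when the batch fills. */
void
batch_requeue(struct demo_batch *b, int item)
{
	b->items[b->count++] = item;
	if (b->count == BATCH_SIZE)
		batch_flush(b);
}
```

With these numbers, requeuing 20 items followed by a final flush takes the
simulated lock three times rather than twenty.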

Sample result from an incremental make -s -j 104 bzImage on tmpfs:
stock:   122.38s user 1780.45s system 6242% cpu 30.480 total
patched: 144.84s user 985.90s system 4856% cpu 23.282 total

Reviewed by: jeff
Tested by: pho (in a larger patch, previous version)
Differential Revision: https://reviews.freebsd.org/D22998

Mateusz Guzik [Mon, 13 Jan 2020 02:37:25 +0000 (02:37 +0000)]
vfs: rework vnode list management

The current notion of an active vnode is eliminated.

Vnodes transition between 0<->1 hold counts all the time and the
associated traversal between different lists induces significant
scalability problems in certain workloads.

Introduce a global list containing all allocated vnodes. They get
unlinked only when UMA reclaims memory and are only requeued when
hold count reaches 0.

Sample result from an incremental make -s -j 104 bzImage on tmpfs:
stock:   118.55s user 3649.73s system 7479% cpu 50.382 total
patched: 122.38s user 1780.45s system 6242% cpu 30.480 total

Reviewed by: jeff
Tested by: pho (in a larger patch, previous version)
Differential Revision: https://reviews.freebsd.org/D22997

Mateusz Guzik [Mon, 13 Jan 2020 02:35:15 +0000 (02:35 +0000)]
ufs: use lazy list instead of active list for syncer

Quota code is temporarily regressed to do a full vnode scan.

Reviewed by: jeff
Tested by: pho (in a larger patch, previous version)
Differential Revision: https://reviews.freebsd.org/D22996

4 years agovfs: add per-mount vnode lazy list and use it for deferred inactive + msync
Mateusz Guzik [Mon, 13 Jan 2020 02:34:02 +0000 (02:34 +0000)]
vfs: add per-mount vnode lazy list and use it for deferred inactive + msync

This obviates the need to scan the entire active list looking for vnodes
of interest.

msync is handled by adding all vnodes with write count to the lazy list.

deferred inactive directly adds vnodes as it sets the VI_DEFINACT flag.

Vnodes get dequeued from the list when their hold count reaches 0.

Newly added MNT_VNODE_FOREACH_LAZY* macros support filtering so that
spurious locking is avoided in the common case.
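
A rough sketch of filtered iteration, assuming a plain singly linked list in
place of the per-mount lazy list. struct node, lazy_filter_t, foreach_lazy()
and is_dirty() are hypothetical names and do not mirror the real
MNT_VNODE_FOREACH_LAZY* macros; locking is simulated by a per-node counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode on a per-mount lazy list. */
struct node {
    bool dirty;             /* cheap state readable without the lock */
    unsigned lock_count;    /* how many times this node was "locked" */
    struct node *next;
};

typedef bool (*lazy_filter_t)(const struct node *, void *);

/* Walk the list and lock only the nodes the filter reports as interesting. */
static size_t
foreach_lazy(struct node *head, lazy_filter_t filter, void *arg)
{
    size_t visited = 0;

    assert(filter != NULL);
    for (struct node *n = head; n != NULL; n = n->next) {
        if (!filter(n, arg))
            continue;       /* skipped without touching the lock */
        n->lock_count++;    /* lock(n); do the work; unlock(n); */
        visited++;
    }
    return (visited);
}

/* Example filter: only dirty nodes need to be visited. */
static bool
is_dirty(const struct node *n, void *arg)
{
    (void)arg;
    return (n->dirty);
}
```

Nodes rejected by the filter are never locked, which is the saving the
message describes for the common case.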

Reviewed by: jeff
Tested by: pho (in a larger patch, previous version)
Differential Revision: https://reviews.freebsd.org/D22995

4 years agoufs: add a setter for inode i_flag field
Mateusz Guzik [Mon, 13 Jan 2020 02:31:51 +0000 (02:31 +0000)]
ufs: add a setter for inode i_flag field

This will be used later to add vnodes to the lazy list.

Reviewed by: kib (previous version), jeff
Tested by: pho (in a larger patch)
Differential Revision: https://reviews.freebsd.org/D22994

4 years agoFix a typo in r356667 comment
Conrad Meyer [Sun, 12 Jan 2020 23:52:16 +0000 (23:52 +0000)]
Fix a typo in r356667 comment

No functional change.

Reported by: bdragon
Approved by: csprng(markm), earlier version
X-MFC-With: r356667

4 years agogetrandom(2): Add Linux GRND_INSECURE API flag
Conrad Meyer [Sun, 12 Jan 2020 20:47:38 +0000 (20:47 +0000)]
getrandom(2): Add Linux GRND_INSECURE API flag

Treat it as a synonym for GRND_NONBLOCK.  The reasoning is this:

We have two choices for handling Linux's GRND_INSECURE API flag.

1. We could ignore it completely (like GRND_RANDOM).  However, this might
produce the surprising result of GRND_INSECURE requests blocking, when the
Linux API does not block.

2. Alternatively, we could treat GRND_INSECURE requests as requests for
GRND_NONBLOCK.  Here, the surprising result for Linux programs is that
invocations with unseeded random(4) will produce EAGAIN, rather than
garbage.

Honoring the flag in the way Linux does seems fraught.  If we actually use
the output of a random(4) implementation prior to seeding, we leak some
entropy (in an information theory and also practical sense) from what will
be the initial seed to attackers (or allow attackers to arbitrarily DoS
initial seeding, if we don't leak).  This seems unacceptable -- it defeats
the purpose of blocking on initial seeding.

Secondary to that concern, before seeding we may have arbitrarily little
entropy collected; producing output from zero or a handful of entropy bits
does not seem particularly useful to userspace.

If userspace can accept garbage, insecure, non-random bytes, they can create
their own insecure garbage with srandom(time(NULL)) or similar.  Any program
which would be satisfied with a 3-bit key CTR stream has no need for CSPRNG
bytes.  So asking the kernel to produce such an output from the secure
getrandom(2) API seems inane.

For now, we've elected to emulate GRND_INSECURE as an alternative spelling
of GRND_NONBLOCK (2).  Consider this API not-quite stable for now.  We
guarantee it will never block.  But we will attempt to monitor actual port
uptake of this bizarre API and may revise our plans for the unseeded
behavior (prior to stable/13 branching).
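
For callers, the practical consequence of option 2 is that a non-blocking
request may fail with EAGAIN before the generator is seeded. A minimal
sketch of handling that case is shown below; try_random_bytes() is a
hypothetical helper, and it passes GRND_NONBLOCK directly, which the message
above says GRND_INSECURE is treated as on FreeBSD.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/random.h>

/*
 * Hypothetical helper: fill buf with len random bytes without ever blocking.
 * Returns 0 on success, 1 if the generator is not yet seeded (EAGAIN),
 * and -1 on any other error.
 */
static int
try_random_bytes(void *buf, size_t len)
{
    unsigned char *p = buf;

    while (len > 0) {
        ssize_t n = getrandom(p, len, GRND_NONBLOCK);

        if (n < 0) {
            if (errno == EINTR)
                continue;    /* interrupted by a signal: retry */
            return (errno == EAGAIN ? 1 : -1);
        }
        p += n;              /* short reads are possible for large requests */
        len -= (size_t)n;
    }
    return (0);
}
```

A caller that receives 1 can retry later or fall back to a blocking call,
rather than consuming output from an unseeded generator.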

Approved by: csprng(markm), manpages(bcr)
See also: https://lwn.net/ml/linux-kernel/cover.1577088521.git.luto@kernel.org/
See also: https://lwn.net/ml/linux-kernel/20200107204400.GH3619@mit.edu/
Differential Revision: https://reviews.freebsd.org/D23130

4 years agoFix the way 'factor' behaves when using OpenSSL to match the description
Garance A Drosehn [Sun, 12 Jan 2020 20:25:11 +0000 (20:25 +0000)]
Fix the way 'factor' behaves when using OpenSSL to match the description
of how it works when not compiled with OpenSSL.

Also, allow users to specify a hexadecimal number by using a prefix of
'0x'.  Before this, users could only specify a hexadecimal value if that
value included a hex digit ('a'-'f') in the value.
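
The input rule described above could look roughly like this. It is a sketch
under assumptions, not the code in factor: parse_number() is a hypothetical
name, and it uses strtoul() on an unsigned long, whereas factor handles much
larger values.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser: accept decimal input, or hexadecimal input when it
 * carries a "0x"/"0X" prefix or contains a hex letter. Returns true and
 * stores the value in *out on success.
 */
static bool
parse_number(const char *s, unsigned long *out)
{
    int base = 10;
    char *end;

    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')) {
        base = 16;          /* explicit prefix: always hexadecimal */
        s += 2;
    } else if (strpbrk(s, "abcdefABCDEF") != NULL) {
        base = 16;          /* older rule: a hex letter implies hexadecimal */
    }
    if (!isxdigit((unsigned char)s[0]))
        return (false);     /* rejects empty input, signs and whitespace */

    errno = 0;
    *out = strtoul(s, &end, base);
    return (errno == 0 && *end == '\0');
}
```

Without the prefix rule, an input such as "10" can only be read as decimal
ten; with it, "0x10" is unambiguously sixteen.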

PR: 243136
Submitted by: Steve Kargl
Reviewed by: gad
MFC after: 3 weeks

4 years agoFix race when accepting TCP connections.
Michael Tuexen [Sun, 12 Jan 2020 17:52:32 +0000 (17:52 +0000)]
Fix race when accepting TCP connections.

When expanding a SYN-cache entry to a socket/inp a two step approach was
taken:
1) The local address was filled in, then the inp was added to the hash
   table.
2) The remote address was filled in and the inp was relocated in the
   hash table.
Before the epoch changes, a write lock was held while this happened and
the code looking up entries held a corresponding read lock.
Since the read lock went away with the introduction of epochs, a
half-populated inp could be found during lookup.
This resulted in processing TCP segments in the context of the wrong
TCP connection.
This patch changes the above procedure so that the inp is fully
populated before it is inserted into the hash table.
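
The "populate fully, then publish" ordering can be illustrated with a toy
hash table. Everything here is a simplified assumption: struct conn stands
in for an inp, the hash and table are hypothetical, and the sketch is single
threaded, so it omits the memory ordering that real lockless readers need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 16

/* Hypothetical, much simplified stand-in for an inp. */
struct conn {
    uint32_t laddr;         /* local address */
    uint32_t faddr;         /* foreign (remote) address */
    struct conn *next;
};

static struct conn *table[NBUCKETS];

static size_t
conn_hash(uint32_t laddr, uint32_t faddr)
{
    return ((laddr ^ faddr) % NBUCKETS);
}

/* Fill in both addresses first; only then make the entry visible. */
static void
conn_insert(struct conn *c, uint32_t laddr, uint32_t faddr)
{
    size_t h;

    assert(c != NULL);
    c->laddr = laddr;
    c->faddr = faddr;
    h = conn_hash(laddr, faddr);
    c->next = table[h];
    table[h] = c;           /* publication point: the entry is complete */
}

/* Readers match on the full address pair, so they never see a half entry. */
static struct conn *
conn_lookup(uint32_t laddr, uint32_t faddr)
{
    struct conn *c;

    for (c = table[conn_hash(laddr, faddr)]; c != NULL; c = c->next)
        if (c->laddr == laddr && c->faddr == faddr)
            return (c);
    return (NULL);
}
```

Because the entry is hashed once on the complete address pair, there is no
window in which a lookup can match an entry that only has its local address
set.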

Thanks to Paul <devgs@ukr.net> for reporting the issue on the net@
mailing list and for testing the patch!

Reviewed by: rrs@
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D22971

4 years agond6_rtr: consistently use __func__ for nd6log()
Bjoern A. Zeeb [Sun, 12 Jan 2020 17:41:09 +0000 (17:41 +0000)]
nd6_rtr: consistently use __func__ for nd6log()

Over time one or two hard coded function names did not match the
actual function anymore.  Consistently use __func__ for nd6log() calls
and re-wrap/re-format some messages for consistency.
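
The benefit of __func__ over a hard-coded name can be shown with a small
sketch. LOG_FUNC() and prefix_check() are hypothetical; nd6log() itself
takes different arguments and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical logging macro: prefixes the message with the name of the
 * enclosing function. fmt must be a string literal so it can be joined
 * with the "%s: " prefix at compile time.
 */
#define LOG_FUNC(buf, size, fmt, ...) \
    snprintf((buf), (size), "%s: " fmt, __func__, __VA_ARGS__)

/* If this function is renamed, the logged prefix follows automatically. */
static int
prefix_check(char *buf, size_t size, int plen)
{
    assert(buf != NULL && size > 0);
    return (LOG_FUNC(buf, size, "invalid prefix length %d", plen));
}
```

Calling prefix_check() with a length of 129 writes
"prefix_check: invalid prefix length 129" into the buffer, with no function
name spelled out in the message itself.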

MFC after: 2 weeks

4 years agond6_rtr: make nd6_prefix_onlink() static
Bjoern A. Zeeb [Sun, 12 Jan 2020 16:58:21 +0000 (16:58 +0000)]
nd6_rtr: make nd6_prefix_onlink() static

nd6_prefix_onlink() is not used anywhere outside nd6_rtr.c.  Stop
exporting it and make it file local static.

4 years agoFix division by zero issue.
Michael Tuexen [Sun, 12 Jan 2020 15:45:27 +0000 (15:45 +0000)]
Fix division by zero issue.

Thanks to Stas Denisov for reporting the issue for the userland stack
and providing a fix.

MFC after: 3 days

4 years agoAdd kern_getpriority(), make Linuxulator use it.
Edward Tomasz Napierala [Sun, 12 Jan 2020 14:25:44 +0000 (14:25 +0000)]
Add kern_getpriority(), make Linuxulator use it.

Reviewed by: kib, emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22842

4 years agoAdd kern_setpriority(), use it in Linuxulator.
Edward Tomasz Napierala [Sun, 12 Jan 2020 13:38:51 +0000 (13:38 +0000)]
Add kern_setpriority(), use it in Linuxulator.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22841

4 years agoTighten FAT checks and fix off-by-one error in corner case.
Xin LI [Sun, 12 Jan 2020 06:13:52 +0000 (06:13 +0000)]
Tighten FAT checks and fix off-by-one error in corner case.

sbin/fsck_msdosfs/fat.c:
 - readfat:
    * Only truncate out-of-range cluster pointers (1, or greater than
      NumClusters but smaller than CLUST_RSRVD), as the current cluster
      may contain some data. We can't fix reserved cluster pointers at
      this pass, because we do not know the potential cluster preceding
      it.
    * Accept valid cluster for head bitmap. This is a no-op, and mainly
      to improve code readability, because the 1 is already handled in
      the previous else if block.
 - truncate_at: absorbed into checkchain.
 - checkchain: save the previous node we have traversed in case that we
   have a chain that ends with a special (>= CLUST_RSRVD) cluster, or is
   free. In these cases, we need to truncate at the cluster preceding the
   current cluster, as the current cluster contains a marker instead of
   a next pointer and can not be changed to CLUST_EOF (the else case can
   happen if the user answered "no" at some point in readfat()).
 - clearchain: correct the iterator for next cluster so that we don't
   stop after clearing the first cluster.
 - checklost: If checkchain() thinks the chain has no clusters, it
   doesn't make sense to reconnect it, so don't bother asking.
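
The checkchain idea of remembering the previously traversed cluster can be
sketched as below. This is a simplified illustration, not the code in
sbin/fsck_msdosfs/fat.c: check_chain() is a hypothetical function, the FAT
is a flat array of 32-bit entries, the marker values are assumptions
modelled on FAT conventions, and the length bound exists only so that the
walk terminates on a cyclic chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Marker values modelled on FAT conventions; treat them as assumptions. */
#define CLUST_FREE   0u
#define CLUST_FIRST  2u
#define CLUST_RSRVD  0xfffffff6u
#define CLUST_EOFS   0xfffffff8u
#define CLUST_EOF    0xffffffffu

/*
 * Walk the chain that starts at head in fat[0..nclusters-1]. If the chain
 * runs into a free or reserved marker, or into an out-of-range pointer,
 * terminate it at the previous cluster. Returns the number of clusters kept.
 */
static size_t
check_chain(uint32_t *fat, size_t nclusters, uint32_t head)
{
    uint32_t prev = 0;      /* 0 means "no previous cluster yet" */
    uint32_t cur = head;
    size_t len = 0;

    assert(fat != NULL);
    while (cur >= CLUST_FIRST && cur < nclusters && len < nclusters) {
        uint32_t next = fat[cur];

        if (next >= CLUST_EOFS)
            return (len + 1);           /* proper end of chain */
        if (next == CLUST_FREE || next >= CLUST_RSRVD) {
            /* cur holds a marker, not a pointer: cut before it. */
            if (prev != 0)
                fat[prev] = CLUST_EOF;
            return (len);
        }
        len++;
        prev = cur;
        cur = next;
    }
    /* The pointer stored in prev was out of range: end the chain there. */
    if (prev != 0)
        fat[prev] = CLUST_EOF;
    return (len);
}
```

The key point is that the end-of-chain marker is written to the cluster
before the offending one, because the offending entry holds a marker rather
than a next pointer.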

Reviewed by: kevlo
MFC after: 24 days
X-MFC-With: r356313
Differential Revision: https://reviews.freebsd.org/D23065

4 years agoAdd "panicked" boolean which can be tested instead of panicstr
Mateusz Guzik [Sun, 12 Jan 2020 06:09:10 +0000 (06:09 +0000)]
Add "panicked" boolean which can be tested instead of panicstr

The test is performed all the time, and reading the entire panicstr pointer
to do it wastes space.
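
A user-space sketch of the pattern, assuming the boolean is set alongside
the message when a panic is first recorded. record_panic() is a
hypothetical helper, and the macro body shown is an assumption rather than
the definition used in the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static const char *panicstr;    /* message of the first panic, or NULL */
static bool panicked;           /* one-byte flag that hot paths can test */

/* Assumed definition for this sketch only. */
#define KERNEL_PANICKED()   (panicked)

/* Hypothetical helper: remember the first panic and raise the flag. */
static void
record_panic(const char *msg)
{
    assert(msg != NULL);
    if (!panicked) {
        panicstr = msg;
        panicked = true;
    }
}
```

Frequent checks then read a single byte instead of comparing a full
pointer against NULL, while panicstr remains available for reporting.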

4 years agoAdd KERNEL_PANICKED macro for use in place of direct panicstr tests
Mateusz Guzik [Sun, 12 Jan 2020 06:07:54 +0000 (06:07 +0000)]
Add KERNEL_PANICKED macro for use in place of direct panicstr tests

4 years agosysctl: add missing CTLFLAG_MPSAFE annotation to CONST_STRING
Mateusz Guzik [Sun, 12 Jan 2020 05:25:06 +0000 (05:25 +0000)]
sysctl: add missing CTLFLAG_MPSAFE annotation to CONST_STRING

4 years agovm: add missing CTLFLAG_MPSAFE annotations
Mateusz Guzik [Sun, 12 Jan 2020 05:08:57 +0000 (05:08 +0000)]
vm: add missing CTLFLAG_MPSAFE annotations

This covers all vm/* files.

4 years agodtrace: add missing CTLFLAG_MPSAFE annotations
Mateusz Guzik [Sun, 12 Jan 2020 04:53:22 +0000 (04:53 +0000)]
dtrace: add missing CTLFLAG_MPSAFE annotations

4 years agozfs: add missing CTLFLAG_MPSAFE annotations
Mateusz Guzik [Sun, 12 Jan 2020 04:53:01 +0000 (04:53 +0000)]
zfs: add missing CTLFLAG_MPSAFE annotations

4 years agoMakefile.inc1: push /usr/libexec into the BPATH/TMPPATH
Kyle Evans [Sun, 12 Jan 2020 04:18:36 +0000 (04:18 +0000)]
Makefile.inc1: push /usr/libexec into the BPATH/TMPPATH

${WORLDTMP}/legacy/usr/libexec will only have libexec/ bits that we've
pushed as bootstrap tools, so this is generally safe to include prior to
PATH. The following are the ramifications of this change:

- BPATH addition gets us at least bootstrap flua in WMAKEENV path for
  buildenv, for those earlier systems where it's bootstrapped still

- Reworked the sysent target to just set PATH and let it get worked out in
  src.lua.mk or individual sysent makefiles -- this gives us back the
  ability to overwrite LUA_CMD and use a different/external lua for these
  targets.  sysent can also now work cleanly in buildenv.

- tools/build/Makefile will now symlink the host flua into build's host
  tools so that the above can work without needing to add the host's
  /usr/libexec explicitly into TMPPATH.

Reviewed by: arichardson, brooks, imp (all slightly earlier version)
Differential Revision: https://reviews.freebsd.org/D22464

4 years agoregulator: small enhancements to regulator_shutdown
Kyle Evans [Sun, 12 Jan 2020 04:07:03 +0000 (04:07 +0000)]
regulator: small enhancements to regulator_shutdown

Highlights:

- Exit early if we're not disabling unused regulators; there's no need to
  take the regulator topology lock and re-evaluate this every iteration, as
  it's not going to change.
- Don't emit a notice that we're shutting down a regulator if it's not
  enabled, to reduce noise.
- Mention the outcome of the shutdown, to aid debugging and to let a
  developer or user easily collect the list of regulators we actually shut
  down and determine the problematic one.

Reviewed by: manu
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D22213

4 years agovfs: only recalculate watermarks when limits are changing
Mateusz Guzik [Sat, 11 Jan 2020 23:00:57 +0000 (23:00 +0000)]
vfs: only recalculate watermarks when limits are changing

Previously they would get recalculated all the time, in particular in:
getnewvnode -> vcheckspace -> vspace
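
The change amounts to caching derived values and recomputing them only in
the setter. A minimal sketch follows; struct vnode_limits, the three
functions and the 90%/50% thresholds are all invented for illustration and
do not reflect the kernel's actual watermark formulas.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limits structure with cached, derived watermarks. */
struct vnode_limits {
    unsigned long desired;    /* tunable limit */
    unsigned long hi_wm;      /* cached high watermark */
    unsigned long lo_wm;      /* cached low watermark */
    unsigned recalcs;         /* how often the watermarks were recomputed */
};

/* Recompute the derived values (thresholds invented for this sketch). */
static void
limits_recalc(struct vnode_limits *l)
{
    l->hi_wm = l->desired - l->desired / 10;
    l->lo_wm = l->desired / 2;
    l->recalcs++;
}

/* The only caller of limits_recalc(): runs when the tunable changes. */
static void
limits_set(struct vnode_limits *l, unsigned long desired)
{
    assert(l != NULL);
    if (l->desired == desired)
        return;
    l->desired = desired;
    limits_recalc(l);
}

/* Hot path: reads the cached watermark and never recomputes it. */
static bool
limits_has_space(const struct vnode_limits *l, unsigned long in_use)
{
    return (in_use < l->hi_wm);
}
```

However often the allocation path asks whether there is space, the
watermarks are recomputed only when limits_set() actually changes the
limit.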

4 years agovfs: deduplicate vnode allocation logic
Mateusz Guzik [Sat, 11 Jan 2020 22:59:44 +0000 (22:59 +0000)]
vfs: deduplicate vnode allocation logic

This creates a dedicated routine (vn_alloc) to allocate vnodes.

As a side effect, code duplication with getnewvnode_reserve is eliminated.

Add vn_free for symmetry.