FreeBSD/FreeBSD.git
4 years agoAdd t4_keyctx.c to sys/conf/files for the non-module build.
John Baldwin [Wed, 13 Nov 2019 17:06:10 +0000 (17:06 +0000)]
Add t4_keyctx.c to sys/conf/files for the non-module build.

Missed in r354667.

Pointy hat to: jhb
MFC after: 1 month
Sponsored by: Chelsio Communications

4 years agoIn if_siocaddmulti() enter VNET.
Gleb Smirnoff [Wed, 13 Nov 2019 16:28:53 +0000 (16:28 +0000)]
In if_siocaddmulti() enter VNET.

Reported & tested by: garga

4 years agoDefine wrapper functions vm_map_entry_{succ,pred} to act as wrappers
Doug Moore [Wed, 13 Nov 2019 15:56:07 +0000 (15:56 +0000)]
Define wrapper functions vm_map_entry_{succ,pred} to act as wrappers
around entry->{next,prev} when those are used for ordered list
traversal, and use those wrapper functions everywhere. Where the next
field is used for maintaining a stack of deferred operations, #define
defer_next to make that different usage clearer, and then use the
'right' pointer instead of 'next' for that purpose.

Approved by: markj
Tested by: pho (as part of a larger patch)
Differential Revision: https://reviews.freebsd.org/D22347
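
The accessor idea described in this commit can be sketched roughly as follows. This is a minimal illustration, not the kernel code: struct map_entry, map_entry_succ(), map_entry_pred() and map_count_from() are hypothetical stand-ins for struct vm_map_entry and vm_map_entry_{succ,pred}.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's map entry (assumption). */
struct map_entry {
	struct map_entry *prev;	/* previous entry in address order */
	struct map_entry *next;	/* next entry in address order */
	unsigned long start;
};

/* Successor in the ordered list; callers no longer touch 'next'. */
static inline struct map_entry *
map_entry_succ(struct map_entry *entry)
{
	return (entry->next);
}

/* Predecessor in the ordered list; callers no longer touch 'prev'. */
static inline struct map_entry *
map_entry_pred(struct map_entry *entry)
{
	return (entry->prev);
}

/* Count entries by walking forward through the accessor only. */
static size_t
map_count_from(struct map_entry *first)
{
	size_t n = 0;

	for (struct map_entry *e = first; e != NULL; e = map_entry_succ(e))
		n++;
	return (n);
}
```

Routing every ordered traversal through the accessors means the underlying fields can later be renamed or reused without touching each caller.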

4 years agoStop the VESA driver from whining loudly in the dmesg during boot on
Scott Long [Wed, 13 Nov 2019 15:31:31 +0000 (15:31 +0000)]
Stop the VESA driver from whining loudly in the dmesg during boot on
systems that use EFI instead of BIOS.

4 years agond6: remove unused structs and defines
Bjoern A. Zeeb [Wed, 13 Nov 2019 14:28:07 +0000 (14:28 +0000)]
nd6: remove unused structs and defines

Remove a collection of unused structs and #defines to make it easier
to understand what is actually in use.

Sponsored by: Netflix

4 years agond6: make nd6_alloc() file static
Bjoern A. Zeeb [Wed, 13 Nov 2019 13:53:17 +0000 (13:53 +0000)]
nd6: make nd6_alloc() file static

nd6_alloc() is a function used only locally.  Make it static and no
longer export it.  Keeps the KPI smaller.

Sponsored by: Netflix

4 years agond6 defrouter: consolidate nd_defrouter manipulations in nd6_rtr.c
Bjoern A. Zeeb [Wed, 13 Nov 2019 12:05:48 +0000 (12:05 +0000)]
nd6 defrouter: consolidate nd_defrouter manipulations in nd6_rtr.c

Move the nd_defrouter along with the sysctl handler from nd6.c to
nd6_rtr.c and make the variable file static.  Provide (temporary)
new accessor functions for code manipulating nd_defrouter from nd6.c,
and stop exporting functions no longer needed outside nd6_rtr.c.
This also shuffles a few functions around in nd6_rtr.c without
functional changes.

Given all nd_defrouter logic is now in one place we can tidy up the
code, locking, and other open items.

MFC after: 3 weeks
X-MFC: keep exporting the functions
Sponsored by: Netflix

4 years agolltabl: remove dead code
Bjoern A. Zeeb [Wed, 13 Nov 2019 11:21:02 +0000 (11:21 +0000)]
lltabl: remove dead code

Remove the function lltable_drain(), which has long (8? years) been marked
#if 0, and while here also remove the unused function llentry_alloc(), whose
call paths tools keep finding even though they are never used.

Sponsored by: Netflix

4 years agoLogging improvements to loader::nfs
Ravi Pokala [Wed, 13 Nov 2019 03:56:51 +0000 (03:56 +0000)]
Logging improvements to loader::nfs

Include the server IP address when logging nfs_open(), add a few missing
"\n"s, and correct a typo.

Reviewed by: kevans
MFC after: 2 weeks
Sponsored by: Panasas
Differential Revision: https://reviews.freebsd.org/D22346

4 years agossp: rework the logic to use priority=200 on clang builds
Kyle Evans [Wed, 13 Nov 2019 03:00:32 +0000 (03:00 +0000)]
ssp: rework the logic to use priority=200 on clang builds

The preproc logic was added at the last minute to appease GCC 4.2, and
kevans@ clearly did not go back and double-check that the logic worked out
for clang builds to use the new variant.

It turns out that clang defines __GNUC__ == 4. Flip it around and check
__clang__ as well, leaving a note to remove it later.

Reported by: cem
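
A rough sketch of the kind of preprocessor check described above, assuming the goal is to use the prioritized constructor on clang and on GCC newer than 4.x. The macro names GUARD_CTOR and GUARD_CTOR_HAS_PRIORITY, and the function guard_setup(), are hypothetical and not taken from the tree.

```c
/*
 * Hypothetical macro names. Clang reports __GNUC__ == 4 for
 * compatibility, so a bare "__GNUC__ > 4" test would wrongly exclude it.
 */
#if defined(__clang__) || (defined(__GNUC__) && __GNUC__ > 4)
#define	GUARD_CTOR_HAS_PRIORITY	1
#define	GUARD_CTOR	__attribute__((__constructor__(200)))
#else
#define	GUARD_CTOR_HAS_PRIORITY	0
#define	GUARD_CTOR	__attribute__((__constructor__))
#endif

static int guard_ready;

/* Stand-in for the canary setup routine; runs before main(). */
static void GUARD_CTOR
guard_setup(void)
{
	guard_ready = 1;
}
```

Checking __clang__ first keeps the decision independent of whatever GCC version clang happens to advertise.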

4 years agopowerpc64: Don't guard ISA 3.0 partition table setup with hw_direct_map
Justin Hibbits [Wed, 13 Nov 2019 02:22:00 +0000 (02:22 +0000)]
powerpc64: Don't guard ISA 3.0 partition table setup with hw_direct_map

PowerISA 3.0 eliminated the 64-bit bridge mode which allowed 32-bit kernels
to run on 64-bit AIM/Book-S hardware.  Since therefore only a 64-bit kernel
can run on this hardware, and 64-bit native always has the direct map, there
is no need to guard it.

4 years agopowerpc: Don't savectx() twice in IPI_STOP handler
Justin Hibbits [Wed, 13 Nov 2019 02:16:24 +0000 (02:16 +0000)]
powerpc: Don't savectx() twice in IPI_STOP handler

We already save context in the stoppcbs[] array, so there's no need to also
save it in the PCB; it won't be used.

4 years agossp: add a priority to the __stack_chk_guard constructor
Kyle Evans [Wed, 13 Nov 2019 02:14:17 +0000 (02:14 +0000)]
ssp: add a priority to the __stack_chk_guard constructor

First, this commit is a NOP on GCC <= 4.x; this decidedly doesn't work
cleanly on GCC 4.2, and it will be gone soon anyways so I chose not to dump
time into figuring out if there's a way to make it work. xtoolchain-gcc,
clocking in as GCC6, can cope with it just fine and later versions are also
generally ok with the syntax. I suspect very few users are running GCC4.2
built worlds and also experiencing potential fallout from the status quo.

For dynamically linked applications, this change also means very little.
rtld will run libc ctors before most others, so the situation is
approximately a NOP for these as well.

The real cause for this change is statically linked applications doing
almost questionable things in their constructors. qemu-user-static, for
instance, creates a thread in a global constructor for their async rcu
callbacks. In general, this works in other places-

- On OpenBSD, __stack_chk_guard is stored in an .openbsd.randomdata section
  that's initialized by the kernel in the static case, or ld.so in the
  dynamic case
- On Linux, __stack_chk_guard is apparently stored in TLS and such a problem
  is circumvented there because the value is presumed stable in the new
  thread.

On FreeBSD, the rcu thread creation ctor and __guard_setup are both unmarked
priority. qemu-user-static spins up the rcu thread prior to __guard_setup
which starts making function calls- some of these are sprinkled with the
canary. In the middle of one of these functions, __guard_setup is invoked in
the main thread and __stack_chk_guard changes- qemu-user-static is promptly
terminated for an SSP violation that didn't actually happen.

This is not an all-too-common problem. We circumvent it here by giving the
__stack_chk_guard constructor a solid priority. 200 was chosen because that
gives static applications ample range (down to 101) for working around it
if they really need to. I suspect most applications will "just work" as
expected- the default/non-prioritized flavor of __constructor__ functions
run last, and the canary is generally not expected to change as of this
point at the very least.

This took approximately three weeks of spare time debugging to pin down.

PR: 241905
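
The ordering argument above can be illustrated with a small sketch, assuming a GCC- or clang-compatible toolchain on an ELF platform. early_guard_setup() and app_ctor() are hypothetical stand-ins for __guard_setup and an application constructor; neither is the libc code.

```c
static int ctor_order[2];
static int ctor_count;

/* Stand-in for __guard_setup (assumption): priority 200 runs early. */
static void __attribute__((__constructor__(200)))
early_guard_setup(void)
{
	ctor_order[ctor_count++] = 200;
}

/* Stand-in for an application ctor with no priority; these run last. */
static void __attribute__((__constructor__))
app_ctor(void)
{
	ctor_order[ctor_count++] = 0;
}
```

An application that really needs to run before the guard setup could still pick a priority between 101 and 199, which is the range the commit message leaves open.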

4 years agoFix a race between daopen and damediapoll
Warner Losh [Wed, 13 Nov 2019 01:58:43 +0000 (01:58 +0000)]
Fix a race between daopen and damediapoll

When we do a daopen, we call dareprobe and wait for the results. The reprobe runs
the da state machine up through the DA_STATE_RC* and then exits.

For removable media, we poll the device every 3 seconds with a TUR to see if it
has disappeared. This introduces a race. If the removable device has lots of
partitions, and if it's a little slow (like say a USB2 connected USB stick),
then we can have a fair amount of time that this reprobe is going on for. If,
during that time, damediapoll fires, it calls daschedule which changes the
scheduling priority from NONE to NORMAL. When that happens, the careful single
stepping in the da state machine is disrupted and we wind up scheduling multiple
read capacity calls. The first one succeeds and releases the reference. The
second one succeeds and releases the reference (and panics if the right code is
compiled into the da driver).

To avoid the race, only do the TUR calls while in state normal, otherwise just
reschedule damediapoll. This prevents the race from happening.
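
The fix described above might look roughly like the sketch below. It is a simplified model under stated assumptions: the state names, struct da_softc_sketch and media_poll() are hypothetical and do not reflect the real da(4) softc or CAM calls.

```c
#include <stdbool.h>

/* Hypothetical, simplified driver state; not the real da(4) softc. */
enum da_state { DA_STATE_PROBE_RC, DA_STATE_NORMAL };

struct da_softc_sketch {
	enum da_state state;
	int tur_sent;		/* TEST UNIT READY commands issued */
	int polls_rescheduled;	/* times the poll timer was re-armed */
};

/*
 * Periodic media poll: issue a TUR only when the state machine is idle
 * in the normal state; in every case re-arm the timer.
 * Returns true if a TUR was issued.
 */
static bool
media_poll(struct da_softc_sketch *sc)
{
	bool issued = false;

	if (sc->state == DA_STATE_NORMAL) {
		sc->tur_sent++;
		issued = true;
	}
	sc->polls_rescheduled++;
	return (issued);
}
```

Skipping the TUR while a probe is in flight leaves the probe's single-stepping undisturbed, and the next timer tick retries once the state is back to normal.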

4 years agoCreate a file to hold shared routines for dealing with T6 key contexts.
John Baldwin [Wed, 13 Nov 2019 00:53:45 +0000 (00:53 +0000)]
Create a file to hold shared routines for dealing with T6 key contexts.

ccr(4) and TLS support in cxgbe(4) construct key contexts used by the
crypto engine in the T6.  This consolidates some duplicated code for
helper functions used to build key contexts.

Reviewed by: np
MFC after: 1 month
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D22156

4 years agosesutil: fix another memory leak
Alan Somers [Tue, 12 Nov 2019 23:57:57 +0000 (23:57 +0000)]
sesutil: fix another memory leak

Instead of calloc()ing (and forgetting to free) in a tight loop, just put
this small array on the stack.

Reported by: Coverity
Coverity CID: 1331665
MFC after: 2 weeks
Sponsored by: Axcient
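
As an illustration of the pattern, here is a minimal sketch that replaces a per-iteration calloc() with a small stack buffer. The buffer size DESC_BUF_LEN and the function sum_first_bytes() are hypothetical; the actual sesutil array and its contents are not shown in the commit message.

```c
#include <stddef.h>
#include <string.h>

#define	DESC_BUF_LEN	32	/* hypothetical small, fixed size */

/*
 * Sum the first byte of a per-element scratch buffer for n elements.
 * The scratch buffer lives on the stack and is reused every iteration,
 * so there is nothing to free and nothing to leak.
 */
static unsigned
sum_first_bytes(const unsigned char *vals, size_t n)
{
	unsigned char buf[DESC_BUF_LEN];
	unsigned total = 0;

	for (size_t i = 0; i < n; i++) {
		memset(buf, 0, sizeof(buf));	/* same zeroing calloc() gave */
		buf[0] = vals[i];
		total += buf[0];
	}
	return (total);
}
```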

4 years agosesutil: fix some memory leaks
Alan Somers [Tue, 12 Nov 2019 23:09:55 +0000 (23:09 +0000)]
sesutil: fix some memory leaks

Reported by: Coverity
Coverity CID: 1331665
MFC after: 2 weeks
Sponsored by: Axcient

4 years agosesutil: fix an out-of-bounds array access
Alan Somers [Tue, 12 Nov 2019 23:03:52 +0000 (23:03 +0000)]
sesutil: fix an out-of-bounds array access

sesutil would allow the user to toggle an LED that was one past the maximum
element.  If he tried, ENCIOC_GETELMSTAT would return EINVAL.

Reported by: Coverity
Coverity CID: 1398940
MFC after: 2 weeks
Sponsored by: Axcient
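
An off-by-one of this kind is commonly a ">" where ">=" was needed. The sketch below shows the corrected form of such a check; whether the sesutil bug had exactly this shape is an assumption, and check_elm_index() is a hypothetical helper.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Validate a zero-based element index against the element count.
 * Valid indices are 0 .. nelm - 1, so the comparison must be ">=";
 * using ">" lets index == nelm through, one past the last element.
 */
static int
check_elm_index(size_t idx, size_t nelm)
{
	if (idx >= nelm)
		return (EINVAL);
	return (0);
}
```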

4 years agolibcompat: Correct rtld MLINKS
Brooks Davis [Tue, 12 Nov 2019 22:31:59 +0000 (22:31 +0000)]
libcompat: Correct rtld MLINKS

Don't install duplicate ld-elf.so.1.1 and ld.so.1 links in rtld-elf32.
Do install lib-elf32.so.1.1 and ldd32.1 links.

Reported by: madpilot

4 years agoSync target triple generation with the version in Makefile.inc1.
John Baldwin [Tue, 12 Nov 2019 21:35:05 +0000 (21:35 +0000)]
Sync target triple generation with the version in Makefile.inc1.

Reviewed by: dim
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22333

4 years agoForce MK_CLANG_IS_CC on in XMAKE.
John Baldwin [Tue, 12 Nov 2019 21:29:52 +0000 (21:29 +0000)]
Force MK_CLANG_IS_CC on in XMAKE.

This ensures that a bootstrap clang compiler is always installed as cc
in WORLDTMP.  If it is only installed as 'clang' then /usr/bin/cc is
used during the build instead of the bootstrap compiler.

Reviewed by: imp
MFC after: 1 month
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22332

4 years agoEnable the RISC-V LLVM backend by default.
John Baldwin [Tue, 12 Nov 2019 21:26:50 +0000 (21:26 +0000)]
Enable the RISC-V LLVM backend by default.

Reviewed by: dim, mhorne, emaste
MFC after: 1 month
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22284

4 years agobhyve: rework mevent processing to fix a race condition
Vincenzo Maffione [Tue, 12 Nov 2019 21:07:51 +0000 (21:07 +0000)]
bhyve: rework mevent processing to fix a race condition

At the end of both mevent_add() and mevent_update(), mevent_notify()
is called to wake up the I/O thread, which will call kevent(changelist)
to update the kernel.
A race condition is possible where the client calls mevent_add() and
mevent_update(EV_ENABLE) before the I/O thread has the chance to wake
up and call mevent_build()+kevent(changelist) in response to mevent_add().
The mevent_add() is therefore ignored by the I/O thread, and
kevent(fd, EV_ENABLE) is called before kevent(fd, EV_ADD), resulting
in a failure of the kevent(fd, EV_ENABLE) call.

PR: 241808
Reviewed by: jhb, markj
MFC with: r354288
Differential Revision: https://reviews.freebsd.org/D22286

4 years agoAdd new bit definitions for TSX, related to the TAA issue. The actual
Scott Long [Tue, 12 Nov 2019 19:15:16 +0000 (19:15 +0000)]
Add new bit definitions for TSX, related to the TAA issue.  The actual
mitigation will follow in a future commit.

Sponsored by: Intel

4 years agoWorkaround for Intel SKL002/SKL012S errata.
Konstantin Belousov [Tue, 12 Nov 2019 18:01:33 +0000 (18:01 +0000)]
Workaround for Intel SKL002/SKL012S errata.

Disable the use of executable 2M page mappings in EPT-format page
tables on affected CPUs.  For bhyve virtual machines, this effectively
disables all use of superpage mappings on affected CPUs.  The
vm.pmap.allow_2m_x_ept sysctl can be set to override the default and
enable mappings on affected CPUs.

Alternate approaches have been suggested, but at present we do not
believe the complexity is warranted for typical bhyve use cases.

Reviewed by: alc, emaste, markj, scottl
Security: CVE-2018-12207
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D21884

4 years agonvdimm(4): Fix various problems when using the second label index block
D Scott Phillips [Tue, 12 Nov 2019 16:24:37 +0000 (16:24 +0000)]
nvdimm(4): Fix various problems when using the second label index block

struct nvdimm_label_index is dynamically sized, with the `free`
bitfield expanding to hold `slot_cnt` entries. Fix a few places
where we were treating the struct as though it had a fixed size.

Reviewed by: cem
Approved by: scottl (mentor)
MFC after: 1 week
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D22253
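
To make the sizing point concrete, here is a hedged sketch of computing the real size of a structure that ends in a variable-length bitfield. struct label_index_sketch and both helper functions are hypothetical; the real struct nvdimm_label_index has more fields and its own layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in; field names and layout are assumptions. */
struct label_index_sketch {
	uint32_t slot_cnt;	/* number of label slots */
	uint8_t free[];		/* one bit per slot, length depends on slot_cnt */
};

/* Bytes needed for a bitfield of slot_cnt bits, rounded up. */
static size_t
free_bitmap_bytes(uint32_t slot_cnt)
{
	return (((size_t)slot_cnt + 7) / 8);
}

/* Real size of an index block: header plus the variable-length bitmap. */
static size_t
label_index_size(uint32_t slot_cnt)
{
	return (offsetof(struct label_index_sketch, free) +
	    free_bitmap_bytes(slot_cnt));
}
```

Using sizeof() on such a struct only covers the fixed header, which is why code that locates the second index block needs the computed size instead.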

4 years agoi386: stop guessing the address of the trap frame in ddb backtrace.
Konstantin Belousov [Tue, 12 Nov 2019 15:56:27 +0000 (15:56 +0000)]
i386: stop guessing the address of the trap frame in ddb backtrace.

Save the address of the trap frame in %ebp on kernel entry.  This
automatically provides it in struct i386_frame.f_frame to the unwinder.

While there, more accurately handle the terminating frames.

Reviewed by: avg, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D22321

4 years agoamd64: move GDT into PCPU area.
Konstantin Belousov [Tue, 12 Nov 2019 15:51:47 +0000 (15:51 +0000)]
amd64: move GDT into PCPU area.

Reviewed by: jhb, markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D22302

4 years agonvdimm(4): Only expose namespaces for accessible data SPAs
D Scott Phillips [Tue, 12 Nov 2019 15:50:30 +0000 (15:50 +0000)]
nvdimm(4): Only expose namespaces for accessible data SPAs

Apply the same user accessible filter to namespaces as is applied
to full-SPA devices. Also, explicitly filter out control region
SPAs which don't expose the nvdimm data area.

Reviewed by: cem
Approved by: scottl (mentor)
MFC after: 1 week
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D21987

4 years agoamd64: assert that size of the software prototype table for gdt is equal
Konstantin Belousov [Tue, 12 Nov 2019 15:47:46 +0000 (15:47 +0000)]
amd64: assert that size of the software prototype table for gdt is equal
to the size of hardware gdt.

Reviewed by: jhb, markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D22302

4 years agonetinet*: update *mp to pass the proper value back
Bjoern A. Zeeb [Tue, 12 Nov 2019 15:46:28 +0000 (15:46 +0000)]
netinet*: update *mp to pass the proper value back

In ip6_[direct_]input() we are looping over the extension headers
to deal with the next header.  We pass a pointer to an mbuf pointer
to the handling functions.  In certain cases the mbuf can be updated
there and we need to pass the new one back.  That was missing in
dest6_input() and route6_input().  In tcp6_input() we should also
update it before we call tcp_input().

In addition to that, set the mbuf pointer to NULL whenever we return
that we are done with handling the packet and no next header should
be checked (IPPROTO_DONE).  This will eventually allow us to assert
proper behaviour and catch the above kind of errors more easily,
expecting *mp to always be set.

This change is extracted from a larger patch and not an exhaustive
change across the entire stack yet.

PR: 240135
Reported by: prabhakar.lakhera gmail.com
MFC after: 3 weeks
Sponsored by: Netflix
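
The pointer-to-pointer convention described above can be sketched as follows. This is an illustration only: struct buf_sketch stands in for struct mbuf, PROTO_DONE and PROTO_NEXT are placeholder values, and ext_hdr_input() is a hypothetical handler rather than any of the functions named in the commit.

```c
#include <stddef.h>

#define	PROTO_DONE	257	/* placeholder for the "done" return value */
#define	PROTO_NEXT	6	/* arbitrary "next header" value for the sketch */

struct buf_sketch { int id; };	/* stand-in for struct mbuf */

/*
 * Handler that may replace the buffer. On success it stores the
 * (possibly new) buffer in *mp; when the packet is finished it stores
 * NULL so the caller cannot touch a stale pointer.
 */
static int
ext_hdr_input(struct buf_sketch **mp, struct buf_sketch *replacement,
    int drop)
{
	if (drop) {
		*mp = NULL;
		return (PROTO_DONE);
	}
	if (replacement != NULL)
		*mp = replacement;	/* pass the new buffer back */
	return (PROTO_NEXT);
}
```

With this convention a caller can assert that *mp is non-NULL whenever the return value is not the "done" marker.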

4 years agonetstat: igmp stats, error on unexpected information, not only warn
Bjoern A. Zeeb [Tue, 12 Nov 2019 13:57:17 +0000 (13:57 +0000)]
netstat: igmp stats, error on unexpected information, not only warn

The igmp stats tend to print two lines of warning for an unexpected
version and length.  Despite an invalid version and struct size it
continues to try to do something with the data.  Do not try to parse
the remainder of the struct; error out instead of only warning.

Note the underlying issue of the data not being available properly
is still there and needs to be fixed separately.

Reported by: test cases, lwhsu
MFC after: 3 weeks

4 years agoteach db_nextframe/x86 about [X]xen_intr_upcall interrupt handler
Andriy Gapon [Tue, 12 Nov 2019 11:00:01 +0000 (11:00 +0000)]
teach db_nextframe/x86 about [X]xen_intr_upcall interrupt handler

Discussed with: kib, royger
MFC after: 3 weeks
Sponsored by: Panzura

4 years agoxen: fix dispatching of NMIs
Roger Pau Monné [Tue, 12 Nov 2019 10:31:28 +0000 (10:31 +0000)]
xen: fix dispatching of NMIs

Currently NMIs are sent over event channels, but that defeats the
purpose of NMIs since event channels can be masked. Fix this by
issuing NMIs using a hypercall, which injects a NMI (vector #2) to the
desired vCPU.

Note that NMIs could also be triggered using the emulated local APIC,
but using a hypercall is better from a performance point of view
since it doesn't involve instruction decoding when not using x2APIC
mode.

Reported and Tested by: avg
Sponsored by: Citrix Systems R&D

4 years agoreverting r354594
Toomas Soome [Tue, 12 Nov 2019 10:02:39 +0000 (10:02 +0000)]
reverting r354594

In our case the structure is more complex and a simple static initializer
will upset compiler diagnostics; using memset is still better than building
a more complex initializer.

4 years agoFix netstat -gs with ip_mroute module and/or vnet
Mike Karels [Tue, 12 Nov 2019 01:03:08 +0000 (01:03 +0000)]
Fix netstat -gs with ip_mroute module and/or vnet

The code for "netstat -gs -f inet" failed if the kernel namelist did not
include the _mrtstat symbol. However, that symbol is not in a standard
kernel even with the ip_mroute module loaded, where the functionality is
available. It is also not in a kernel with MROUTING but also VIMAGE, as
there can be multiple sets of stats. However, when running the command
on a live system, the symbol is not used; a sysctl is used. Go ahead
and try the sysctl in any case, and complain that IPv4 MROUTING is not
present only if the sysctl fails with ENOENT. Also fail if _mrtstat is
not defined when running on a core file; netstat doesn't know about vnets,
so can only work if MROUTING was included, and VIMAGE was not.

Reviewed by: bz
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22311

4 years agoIn ufs_dir_dd_ino(), always initialize *dd_vp since the caller expects it.
Chuck Silvers [Tue, 12 Nov 2019 00:32:33 +0000 (00:32 +0000)]
In ufs_dir_dd_ino(), always initialize *dd_vp since the caller expects it.

Reviewed by: kib, mckusick
Approved by: imp (mentor)
Sponsored by: Netflix

4 years agoAdd the text attribute for MDS_NO in the IA32_ARCH_CAP MSR.
Scott Long [Mon, 11 Nov 2019 22:18:05 +0000 (22:18 +0000)]
Add the text attribute for MDS_NO in the IA32_ARCH_CAP MSR.

4 years agoamd64: Issue MFENCE on context switch on AMD CPUs when reusing address space.
Konstantin Belousov [Mon, 11 Nov 2019 21:59:20 +0000 (21:59 +0000)]
amd64: Issue MFENCE on context switch on AMD CPUs when reusing address space.

On some AMD CPUs, in particular, machines that do not implement
CLFLUSHOPT but do provide CLFLUSH, the CLFLUSH instruction is only
synchronized with MFENCE.

Code using CLFLUSH typically needs to brace it with MFENCE both before
and after the flush; see for instance pmap_invalidate_cache_range().  If
a context switch occurs while inside the protected region, we need to
ensure visibility of flushes done on the old CPU to the new CPU.

For all other machines, the locked operation done to lock the switched
thread should be enough.  In the case of different address spaces, the
reload of %cr3 is serializing.

Reviewed by: cem, jhb, scottph
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D22007

4 years agoFix handling of PIPE_EOF in the direct write path.
Mark Johnston [Mon, 11 Nov 2019 20:44:30 +0000 (20:44 +0000)]
Fix handling of PIPE_EOF in the direct write path.

Suppose a writing thread has pinned its pages and gone to sleep with
pipe_map.cnt > 0.  Suppose that the thread is woken up by a signal (so
error != 0) and the other end of the pipe has simultaneously been
closed.  In this case, to satisfy the assertion about pipe_map.cnt in
pipe_destroy_write_buffer(), we must mark the buffer as empty.

Reported by: syzbot+5cce271bf2cb1b1e1876@syzkaller.appspotmail.com
Reviewed by: kib
Tested by: pho
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22261

4 years agodb_nextframe/i386: reduce the number of special frame types
Andriy Gapon [Mon, 11 Nov 2019 19:06:04 +0000 (19:06 +0000)]
db_nextframe/i386: reduce the number of special frame types

This change removes TRAP_INTERRUPT and TRAP_TIMERINT frame types.

Their names are a bit confusing: trap + interrupt, what is that?
The TRAP_TIMERINT name is too specific -- can it only be used for timer
"trap-interrupts"?  What is so special about them?

My understanding of the code is that INTERRUPT, TRAP_INTERRUPT and
TRAP_TIMERINT differ only in how an offset from callee's frame pointer to a
trap frame on the stack is calculated.  And that depends on a number of
arguments that a special handler passes to a callee (a function with a
normal C calling convention).

So, this change makes that logic explicit and collapses all interrupt frame
types into the INTERRUPT type.

Reviewed by: markj
Discussed with: kib, jhb
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D22303

4 years agoMerge commit 371ea70bb from llvm git (by Louis Dionne):
Dimitry Andric [Mon, 11 Nov 2019 17:41:56 +0000 (17:41 +0000)]
Merge commit 371ea70bb from llvm git (by Louis Dionne):

  [libc++] Harden usage of static_assert against C++03

  In C++03, we emulate static_assert with a macro, and we must
  parenthesize multiple arguments.

  llvm-svn: 373328

This is a follow-up to r354460, which causes errors for pre-C++11
programs using <cmath>, similar to:

/usr/include/c++/v1/cmath:622:68: error: too many arguments provided to
function-like macro invocation

Reported by: antoine
MFC after: immediately (because of ports breakage)

4 years agotip/cu: check for EOF on input on the local side
Eric van Gyzen [Mon, 11 Nov 2019 17:41:52 +0000 (17:41 +0000)]
tip/cu: check for EOF on input on the local side

If cu reads an EOF on the input side, it goes into a tight loop
sending a garbage byte to the remote.  With this change, it exits
gracefully, along with its child.

MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
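
The behaviour described above comes down to treating a zero-byte read() as end of input instead of forwarding whatever is left in the buffer. A minimal sketch, assuming plain POSIX file descriptors; forward_until_eof() is a hypothetical helper and not the tip/cu code.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy bytes from 'in' to 'out' one at a time until EOF.
 * read() returning 0 means EOF: stop instead of looping and sending
 * a stale byte. Returns the number of bytes forwarded, or -1 on error.
 */
static long
forward_until_eof(int in, int out)
{
	char c;
	ssize_t n;
	long total = 0;

	while ((n = read(in, &c, 1)) > 0) {
		if (write(out, &c, 1) != 1)
			return (-1);
		total++;
	}
	return (n == 0 ? total : -1);
}
```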

4 years agoAdd asserts for some state transitions
Warner Losh [Mon, 11 Nov 2019 17:36:57 +0000 (17:36 +0000)]
Add asserts for some state transitions

For the PROBEWP and PROBERC* states, add assertions that both the da device
state and the ccb state are the right ones when we enter
dadone_probe{wp,rc}. This will ensure that we don't sneak through when
we're re-probing the size and write protection status of the device and thereby
leak a reference which can later lead to an invalidated peripheral going away
before all references are released (and resulting panic).

Reviewed by: scottl, ken
Differential Revision: https://reviews.freebsd.org/D22295

4 years agoUpdate the softc state of the da driver before releasing the CCB.
Warner Losh [Mon, 11 Nov 2019 17:36:52 +0000 (17:36 +0000)]
Update the softc state of the da driver before releasing the CCB.

There are contexts where releasing the ccb triggers dastart() to be run
inline. When da was written, there was always a deferral, so it didn't matter
much. Now, with direct dispatch, we can call dastart from the dadone*
routines. If the probe state isn't updated, then dastart will redo things with
stale information. This normally isn't a problem, because we run the probe state
machine once at boot... Except that we also run it for each open of the device,
which means we can have multiple threads racing each other to try to kick off
the probe. However, if we update the state before we release the CCB, we can
avoid the race. While it's needed only for the probewp and proberc* states, do
it everywhere because it won't hurt the other places.

The race here happens because we reprobe dozens of times on boot when drives
have lots of partitions.  We should consider caching this info for 1-2 seconds
to avoid this thundering herd.

Reviewed by: scottl, ken
Differential Revision: https://reviews.freebsd.org/D22295

4 years agoRequire and enforce that dareprobe() has to be called with the periph lock held.
Warner Losh [Mon, 11 Nov 2019 17:36:47 +0000 (17:36 +0000)]
Require and enforce that dareprobe() has to be called with the periph lock held.

Reviewed by: scottl, ken
Differential Revision: https://reviews.freebsd.org/D22295

4 years agoFix panic message to indicate right action that was improper.
Warner Losh [Mon, 11 Nov 2019 17:36:42 +0000 (17:36 +0000)]
Fix panic message to indicate right action that was improper.

Reviewed by: scottl, ken
Differential Revision: https://reviews.freebsd.org/D22295

4 years agodb_nextframe/amd64: remove TRAP_INTERRUPT frame type
Andriy Gapon [Mon, 11 Nov 2019 17:11:49 +0000 (17:11 +0000)]
db_nextframe/amd64: remove TRAP_INTERRUPT frame type

Besides the confusing name, this type is effectively unused.
In all cases where it could be set, the INTERRUPT type is set by the
earlier code.  The conditions for TRAP_INTERRUPT are a subset of the
conditions for INTERRUPT.

Reviewed by: kib, markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D22305

4 years agoswap_pager_meta_free() frees allocated blocks in a way that
Doug Moore [Mon, 11 Nov 2019 16:59:49 +0000 (16:59 +0000)]
swap_pager_meta_free() frees allocated blocks in a way that
exploits the sparsity of allocated blocks in a range, without
issuing an "are you there?" query for every block in the range.
swap_pager_copy() is not so smart.  Modify the implementation
of swap_pager_meta_free() slightly so that swap_pager_copy()
can use that smarter implementation too.

Based on an observation of: Yoshihiro Ota (ota_j.email.ne.jp)
Reviewed by: kib,alc
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22280

4 years agoIt is unclear why in6_pcblookup_local() would require write access
Gleb Smirnoff [Mon, 11 Nov 2019 06:28:25 +0000 (06:28 +0000)]
It is unclear why in6_pcblookup_local() would require write access
to the PCB hash.  The function doesn't modify the hash. It always
asserted write lock historically, but with epoch conversion this
fails in some special cases.

Reviewed by: rwatson, bz
Reported-by: syzbot+0b0488ca537e20cb2429@syzkaller.appspotmail.com

4 years agoRevert r354605: Update jemalloc to version 5.2.1.
Jason Evans [Mon, 11 Nov 2019 05:06:49 +0000 (05:06 +0000)]
Revert r354605: Update jemalloc to version 5.2.1.

Compilation fails for non-llvm-based platforms.

4 years agoUpdate jemalloc to version 5.2.1.
Jason Evans [Mon, 11 Nov 2019 03:27:14 +0000 (03:27 +0000)]
Update jemalloc to version 5.2.1.

4 years agoplic: check for sifive compatible string
Mitchell Horne [Mon, 11 Nov 2019 01:39:06 +0000 (01:39 +0000)]
plic: check for sifive compatible string

The Linux dts for the HiFive Unleashed does not contain the usual
"riscv,plic0" compat string, but our PLIC driver is compatible.

MFC after: 1 week

4 years agoplic: fix PLIC_MAX_IRQS
Mitchell Horne [Mon, 11 Nov 2019 01:35:50 +0000 (01:35 +0000)]
plic: fix PLIC_MAX_IRQS

The maximum number of PLIC interrupts is defined in the PLIC spec[1]
as 1024.

[1] https://github.com/riscv/riscv-plic-spec/blob/master/riscv-plic.adoc

MFC after: 1 week

4 years agolinprocfs: Make sure to report -1 as tty when we have no controlling tty.
Olivier Houchard [Mon, 11 Nov 2019 00:21:05 +0000 (00:21 +0000)]
linprocfs: Make sure to report -1 as tty when we have no controlling tty.

When reporting a process' stats, we can't just provide the tty as an
unsigned long, as if we have no controlling tty, the tty would be NODEV, or
-1. Instead, just special-case NODEV.

Submitted by: Juraj Lutter <otis@sk.FreeBSD.org>
MFC after: 1 week
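
The special case can be sketched as below, assuming a 64-bit device number for illustration. NODEV_SKETCH and format_tty() are hypothetical names; the real linprocfs code formats the whole stat line, not just this field.

```c
#include <stdint.h>
#include <stdio.h>

#define	NODEV_SKETCH	((uint64_t)-1)	/* stand-in for NODEV */

/*
 * Format the tty field. Printed as an unsigned value, NODEV would come
 * out as a huge positive number; special-case it and emit -1 instead.
 */
static int
format_tty(char *out, size_t len, uint64_t tty)
{
	if (tty == NODEV_SKETCH)
		return (snprintf(out, len, "%d", -1));
	return (snprintf(out, len, "%llu", (unsigned long long)tty));
}
```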

4 years agoConsolidate powerpcspe CFLAGS
Justin Hibbits [Sun, 10 Nov 2019 22:08:07 +0000 (22:08 +0000)]
Consolidate powerpcspe CFLAGS

Don't depend on CPUTYPE to define powerpcspe CFLAGS; they should be set
unconditionally.  This reduces duplication.  Also, set some CFLAGS as
gcc-only, because clang's SPE support always uses the SPE ABI; it's not an
optional feature.

4 years agopowerpcspe: use -mspe instead of -mspe=yes to enable SPE
Justin Hibbits [Sun, 10 Nov 2019 20:36:38 +0000 (20:36 +0000)]
powerpcspe: use -mspe instead of -mspe=yes to enable SPE

-mspe=yes/no was deprecated even before GCC 4.2.1 in favor of
-mspe/-mno-spe.  Clang only supports -mspe/-mno-spe.

4 years agoSome language fixes.
Alexander Motin [Sun, 10 Nov 2019 18:07:02 +0000 (18:07 +0000)]
Some language fixes.

Submitted by: rpokala@
MFC after: 2 weeks

4 years agoMFV r354582: file 5.37.
Xin LI [Sun, 10 Nov 2019 17:00:23 +0000 (17:00 +0000)]
MFV r354582: file 5.37.

MFC after: 3 days

4 years agoloader: use struct initializer in vdev_probe().
Toomas Soome [Sun, 10 Nov 2019 15:07:36 +0000 (15:07 +0000)]
loader: use struct initializer in vdev_probe().

Hopefully it is a bit more clear this way.

4 years agoloader: memory leak in vdev_label_read_config()
Toomas Soome [Sun, 10 Nov 2019 15:03:59 +0000 (15:03 +0000)]
loader: memory leak in vdev_label_read_config()

We need to free the allocated buffer for label.

4 years agoamd64: change r_gdt to the local variable in hammer_time().
Konstantin Belousov [Sun, 10 Nov 2019 10:03:22 +0000 (10:03 +0000)]
amd64: change r_gdt to the local variable in hammer_time().

Sponsored by: The FreeBSD Foundation
MFC after: 1 week

4 years agoamd64: Change SFENCE to locked op for synchronizing with CLFLUSHOPT on Intel.
Konstantin Belousov [Sun, 10 Nov 2019 09:41:29 +0000 (09:41 +0000)]
amd64: Change SFENCE to locked op for synchronizing with CLFLUSHOPT on Intel.

Reviewed by: cem, jhb
Discussed with: alc, scottph
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D22007

4 years agoamd64: move common_tss into pcpu.
Konstantin Belousov [Sun, 10 Nov 2019 09:28:18 +0000 (09:28 +0000)]
amd64: move common_tss into pcpu.

This saves some memory, around 256K I think.  It removes some code,
e.g. KPTI does not need to specially map common_tss anymore.  Also,
common_tss becomes domain-local.

Reviewed by: jhb
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D22231

4 years agoInclude cache zones into zone_foreach() where appropriate.
Konstantin Belousov [Sun, 10 Nov 2019 09:25:19 +0000 (09:25 +0000)]
Include cache zones into zone_foreach() where appropriate.

The r354367 is reverted since it is subsumed by this, more complete, approach.

Suggested by: markj
Reviewed by: alc, glebius, markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D22242

4 years agoEliminate a redundant pmap_load() from pmap_remove_pages().
Alan Cox [Sun, 10 Nov 2019 05:22:01 +0000 (05:22 +0000)]
Eliminate a redundant pmap_load() from pmap_remove_pages().

There is no reason why the pmap_invalidate_all() in pmap_remove_pages()
must be performed before the final PV list lock release.  Move it past
the lock release.

Eliminate a stale comment from pmap_page_test_mappings().  We implemented
a modified bit in r350004.

MFC after: 1 week

4 years agopowerpc64/powernv: Use OPAL call for non-POWER8 PCI TCE reset
Justin Hibbits [Sun, 10 Nov 2019 04:24:36 +0000 (04:24 +0000)]
powerpc64/powernv: Use OPAL call for non-POWER8 PCI TCE reset

According to the OPAL documentation, only the POWER8 (PHB3) should use
the register write TCE reset method.  All others should use the OPAL
call.

On POWER9 the call is semantically identical to the register write, with
a wait for completion.

4 years agoVendor import of file 5.37
Xin LI [Sun, 10 Nov 2019 03:44:32 +0000 (03:44 +0000)]
Vendor import of file 5.37

4 years agoAdd compact scratchpad protocol for ntb_transport(4).
Alexander Motin [Sun, 10 Nov 2019 03:37:45 +0000 (03:37 +0000)]
Add compact scratchpad protocol for ntb_transport(4).

Previously ntb_transport(4) required at least 6 scratchpad registers,
plus 2 more for each additional memory window.  That is too much for some
configurations, where several drivers have to share resources of the same
NTB hardware.  This patch introduces a new compact version of the protocol,
requiring only 3 scratchpad registers, plus one more for each additional
memory window.  The optimization is based on the fact that none of the version,
number of windows, or number of queue pairs really needs more than one byte
each, and window sizes of 4GB are not very useful now.  The new protocol
is activated automatically when the configuration is low on scratchpad
registers, or it can be activated explicitly with a loader tunable.
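
The register arithmetic above can be sketched in C. This is only an
illustration: the bit layout used by compact_pack() and all function names
are hypothetical and are not taken from the ntb_transport(4) source.  Only
the register counts (6 + 2 per extra window versus 3 + 1 per extra window)
and the one-byte-per-field observation come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout of one 32-bit scratchpad word:
 *   [23:16] protocol version, [15:8] memory windows, [7:0] queue pairs.
 * The real driver may order or size these fields differently.
 */
static uint32_t
compact_pack(uint8_t version, uint8_t nwindows, uint8_t nqpairs)
{
	return ((uint32_t)version << 16 | (uint32_t)nwindows << 8 |
	    (uint32_t)nqpairs);
}

/* Extract the version byte from a word built by compact_pack(). */
static uint8_t
compact_version(uint32_t word)
{
	return ((word >> 16) & 0xff);
}

/* Scratchpad registers needed, using the counts from the commit message. */
static unsigned
spad_needed(int compact, unsigned nwindows)
{
	assert(nwindows >= 1);
	if (compact)
		return (3 + (nwindows - 1));
	return (6 + 2 * (nwindows - 1));
}
```

With three memory windows, for example, the original protocol needs 10
scratchpad registers under these counts while the compact one needs 5.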

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.

4 years agoAllow splitting PLX NTB BAR2 into several memory windows.
Alexander Motin [Sun, 10 Nov 2019 03:24:53 +0000 (03:24 +0000)]
Allow splitting PLX NTB BAR2 into several memory windows.

When enabled, the Address Lookup Table (A-LUT) allows a separate
translation to be specified for each 1/128th or 1/256th of the BAR2.
Previously it was used only to limit the effective window size by blocking
access through some of the A-LUT elements.  This change allows A-LUT
elements to also point to different memory locations, providing upper layers
with several (up to 128) independent memory windows.  A-LUT hardware allows
even more flexible configurations than this, but the NTB KPI has no way to
manage that now.
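
As a rough illustration of the 1/128th or 1/256th granularity, the sketch
below maps an offset within the BAR to the A-LUT element that would
translate it.  The function name is hypothetical, and it assumes the BAR
size is an exact multiple of the element count; it does not reflect the
driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the index of the A-LUT element covering 'offset', assuming the
 * BAR is divided evenly into 'nelems' (128 or 256) equally sized slices.
 */
static unsigned
alut_index(uint64_t bar_size, unsigned nelems, uint64_t offset)
{
	uint64_t slice;

	assert(nelems == 128 || nelems == 256);
	assert(bar_size % nelems == 0 && offset < bar_size);
	slice = bar_size / nelems;
	return ((unsigned)(offset / slice));
}
```

For a 1GB BAR split 128 ways, each element covers 8MB, so an offset of
exactly 8MB falls into element 1.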

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.

4 years agobcm2835_sdhci: don't panic in DMA interrupt if curcmd went away
Kyle Evans [Sun, 10 Nov 2019 03:06:03 +0000 (03:06 +0000)]
bcm2835_sdhci: don't panic in DMA interrupt if curcmd went away

This is an exceptional case; generally found during controller errors.
A panic when we attempt to access slot->curcmd->data is less ideal than
warning, and other verbiage will be emitted to indicate the exact error.

4 years agoRevert premature part of r354577
Kyle Evans [Sun, 10 Nov 2019 02:31:29 +0000 (02:31 +0000)]
Revert premature part of r354577

bcm2835_vcbus.c will be the future home to some I/O address mapping
routines, but it has neither been committed nor reviewed.

4 years agoarm64: add SOC_BRCM_BCM2838, build it in GENERIC
Kyle Evans [Sun, 10 Nov 2019 01:43:51 +0000 (01:43 +0000)]
arm64: add SOC_BRCM_BCM2838, build it in GENERIC

BCM2838/BCM2711 is the Raspberry Pi 4, which we will soon be able to boot
on once some ports bits are worked out.

4 years agoUpdate the VOP_COPY_FILE_RANGE man page to reflect the semantic change
Rick Macklem [Sun, 10 Nov 2019 01:21:10 +0000 (01:21 +0000)]
Update the VOP_COPY_FILE_RANGE man page to reflect the semantic change
made by r354574.

This is a content change.

4 years agoUpdate the copy_file_range man page to reflect the semantic change
Rick Macklem [Sun, 10 Nov 2019 01:13:41 +0000 (01:13 +0000)]
Update the copy_file_range man page to reflect the semantic change
done by r354574.

This is a content change.

4 years agoUpdate copy_file_range(2) to be Linux5 compatible.
Rick Macklem [Sun, 10 Nov 2019 01:08:14 +0000 (01:08 +0000)]
Update copy_file_range(2) to be Linux5 compatible.

The current linux man page and testing done on a fairly recent linux5.n
kernel have identified two changes to the semantics of the linux
copy_file_range system call.
Since the copy_file_range(2) system call is intended to be linux compatible
and is only currently in head/current and not used by any commands,
it seems appropriate to update the system call to be compatible with
the current linux one.
The first of these semantic changes was made compatible with
linux5.n by r354564.
For the second semantic change, the old linux man page stated that, if
infd and outfd referred to the same file, EBADF should be returned.
Now, the semantics are to allow infd and outfd to refer to the same file
so long as the byte ranges defined by the input file offset, output file offset
and len do not overlap. If the byte ranges do overlap, EINVAL should be
returned.
This patch modifies copy_file_range(2) to be linux5.n compatible for this
semantic change.
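
The overlap rule can be sketched as a small C check.  This is a minimal
sketch, not the kernel code: the function names are hypothetical, offsets
are assumed non-negative, and treating a zero-length request as
non-overlapping is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>

/* True if [inoff, inoff + len) and [outoff, outoff + len) share a byte. */
static bool
ranges_overlap(off_t inoff, off_t outoff, size_t len)
{
	off_t lo, hi;

	assert(inoff >= 0 && outoff >= 0);
	if (len == 0)
		return (false);
	lo = inoff < outoff ? inoff : outoff;
	hi = inoff < outoff ? outoff : inoff;
	/* Equal-length ranges overlap when their starts are < len apart. */
	return ((uint64_t)(hi - lo) < (uint64_t)len);
}

/* Return EINVAL for an overlapping copy within one file, else 0. */
static int
check_same_file_copy(bool same_file, off_t inoff, off_t outoff, size_t len)
{
	if (same_file && ranges_overlap(inoff, outoff, len))
		return (EINVAL);
	return (0);
}
```

Copying 100 bytes from offset 0 to offset 100 of the same file passes this
check, while copying them to offset 99 yields EINVAL.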

4 years agogeneric_ehci_fdt: Fix compile when EXT_RESOURCES isn't present
Emmanuel Vadot [Sat, 9 Nov 2019 22:25:45 +0000 (22:25 +0000)]
generic_ehci_fdt: Fix compile when EXT_RESOURCES isn't present

4 years agolibipsec: correct a typo
Bjoern A. Zeeb [Sat, 9 Nov 2019 21:59:29 +0000 (21:59 +0000)]
libipsec: correct a typo

Correct a typo in the ipsec_errlist and replicated in a comment.
No functional changes.

MFC after: 3 weeks

4 years agoAdd GEOM attribute to report physical device name, and report it
Edward Tomasz Napierala [Sat, 9 Nov 2019 17:30:19 +0000 (17:30 +0000)]
Add GEOM attribute to report physical device name, and report it
via 'diskinfo -v'.  This avoids the need to track it down via CAM,
and should also work for disks that don't use CAM.  And since it's
inherited through the GEOM hierarchy, in most cases one doesn't need
to walk the GEOM graph either, e.g. you can use it on a partition
instead of the disk itself.

Reviewed by: allanjude, imp
Sponsored by: Klara Inc
Differential Revision: https://reviews.freebsd.org/D22249

4 years agoFor vm_map, #defining DIAGNOSTIC to turn on full assertion-based
Doug Moore [Sat, 9 Nov 2019 17:08:27 +0000 (17:08 +0000)]
For vm_map, #defining DIAGNOSTIC to turn on full assertion-based
consistency checking slows performance dramatically. This change
reduces the number of assertions checked by completely walking the
vm_map tree only when the write-lock is released, and only then if the
number of modifications to the tree since the last walk exceeds the
number of tree nodes.
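
The amortization idea can be sketched as follows.  The structure and
function names are hypothetical stand-ins, not the vm_map code; the sketch
only shows the policy of walking the tree at unlock time once the
modifications since the last walk exceed the node count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for a tree with deferred consistency checks. */
struct checked_tree {
	unsigned nnodes;	/* current number of tree nodes */
	unsigned nmods;		/* modifications since the last full walk */
	unsigned nwalks;	/* full walks performed so far */
};

/* Record one modification made while the write lock is held. */
static void
tree_modified(struct checked_tree *t)
{
	assert(t != NULL);
	t->nmods++;
}

/*
 * Called when the write lock is released.  Returns true if a full
 * consistency walk was (notionally) performed.
 */
static bool
tree_unlock_check(struct checked_tree *t)
{
	assert(t != NULL);
	if (t->nmods <= t->nnodes)
		return (false);
	/* A real implementation would walk and assert on every node here. */
	t->nwalks++;
	t->nmods = 0;
	return (true);
}
```

Because a walk costs time proportional to the node count and happens only
after more than that many modifications, its cost is amortized to a
constant per modification in this sketch.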

Reviewed by: alc, kib
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22163

4 years agoUpdate the VOP_COPY_FILE_RANGE.9 man page to reflect the semantic change
Rick Macklem [Fri, 8 Nov 2019 23:58:33 +0000 (23:58 +0000)]
Update the VOP_COPY_FILE_RANGE.9 man page to reflect the semantic change
implemented by r354564.

This is a content change.

4 years agoUpdate the copy_file_range.2 man page to reflect the semantic change
Rick Macklem [Fri, 8 Nov 2019 23:49:27 +0000 (23:49 +0000)]
Update the copy_file_range.2 man page to reflect the semantic change
implemented by r354564.

This is a content change.

4 years agoUpdate copy_file_range(2) to be Linux5 compatible.
Rick Macklem [Fri, 8 Nov 2019 23:39:17 +0000 (23:39 +0000)]
Update copy_file_range(2) to be Linux5 compatible.

The current linux man page and testing done on a fairly recent linux5.n
kernel have identified two changes to the semantics of the linux
copy_file_range system call.
Since the copy_file_range(2) system call is intended to be linux compatible
and is only currently in head/current and not used by any commands,
it seems appropriate to update the system call to be compatible with
the current linux one.
The old linux man page stated that, if the
offset + len exceeded file_size for the input file, EINVAL should be returned.
Now, the semantics are to copy up to at most file_size bytes and return that
number of bytes copied. If the offset is at or beyond file_size, 0 bytes
are returned.
This patch modifies copy_file_range(2) to be linux compatible for this
semantic change.
A separate patch will change copy_file_range(2) for the other semantic
change, which allows the infd and outfd to refer to the same file, so
long as the byte ranges do not overlap.
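
The new length semantics amount to clamping the request to the bytes
available in the input file.  A minimal sketch, with a hypothetical
function name and assuming non-negative offsets and sizes:

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>

/*
 * Number of bytes a copy starting at 'inoff' may transfer from a file of
 * 'file_size' bytes when 'len' bytes were requested.
 */
static size_t
clamp_copy_len(off_t file_size, off_t inoff, size_t len)
{
	uint64_t avail;

	assert(file_size >= 0 && inoff >= 0);
	if (inoff >= file_size)
		return (0);	/* at or beyond EOF: nothing to copy */
	avail = (uint64_t)(file_size - inoff);
	return ((uint64_t)len < avail ? len : (size_t)avail);
}
```

Requesting 50 bytes at offset 90 of a 100-byte file therefore copies 10
bytes instead of failing with EINVAL as under the old semantics.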

4 years agobcm2835: commit missing constant from r354560
Kyle Evans [Fri, 8 Nov 2019 20:53:56 +0000 (20:53 +0000)]
bcm2835: commit missing constant from r354560

Surgically pulling the patch from my debugging work led to this sloppiness;
my apologies.

4 years agoAdd new iwm(4) files to sys/conf/files.
Mark Johnston [Fri, 8 Nov 2019 20:47:59 +0000 (20:47 +0000)]
Add new iwm(4) files to sys/conf/files.

Submitted by: rea
MFC with: r354504

4 years agobcm2835_sdhci: remove unused power_id field
Kyle Evans [Fri, 8 Nov 2019 20:14:36 +0000 (20:14 +0000)]
bcm2835_sdhci: remove unused power_id field

This was once set, but I removed it by the time I committed it because both
configurations use the same POWER_ID. This can be separated back out if the
situation changes.

4 years agobcm2835_sdhci: add some very basic support for rpi4
Kyle Evans [Fri, 8 Nov 2019 20:12:57 +0000 (20:12 +0000)]
bcm2835_sdhci: add some very basic support for rpi4

DMA is currently disabled while I work out why it's broken, but this is
enough for upstream U-Boot + rpi-firmware + our rpi3-psci-monitor to boot
with the right config.

The RPi 4 is still not in a good "supported" state, as we have no
USB/PCI-E/Ethernet drivers, but if air-gapped pies only able to operate over
cereal is your thing, here's your guy.

Submitted by: Robert Crowston (with modifications)

4 years agoloader.efi: Default to serial if we don't have a ConOut variable
Emmanuel Vadot [Fri, 8 Nov 2019 20:08:44 +0000 (20:08 +0000)]
loader.efi: Default to serial if we don't have a ConOut variable

In the EFI implementation in U-Boot no ConOut EFI variable is created;
this causes the loader to fall back to the TERM_EMU implementation, which is
very, very slow (and uses the ConOut device in the system table anyway).
The UEFI spec isn't clear as to whether this variable needs to exist or not.

Reviewed by: imp, kevans

4 years agoRemove explicit declaration of rk_clk_fract_set_freq() function
Michal Meloun [Fri, 8 Nov 2019 19:29:14 +0000 (19:29 +0000)]
Remove explicit declaration of rk_clk_fract_set_freq() function
forgotten in r354556.

MFC after: 3 weeks
MFC with: r354556
Noticed by: manu

4 years agoTidy up Rockchip composite clock.
Michal Meloun [Fri, 8 Nov 2019 19:15:50 +0000 (19:15 +0000)]
Tidy up Rockchip composite clock.
- add support for log2 based dividers
- use proper write mask when writing to divider register
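
For illustration, a log2-based divider stores an exponent n in the register
field and divides the parent clock by 2^n.  The helpers below are a sketch
with hypothetical names; they do not model the Rockchip register layout or
its write mask.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the field value n such that divisor == 1 << n, or -1 if the
 * divisor is zero or not a power of two.
 */
static int
log2_div_field(uint32_t divisor)
{
	int n = 0;

	if (divisor == 0 || (divisor & (divisor - 1)) != 0)
		return (-1);
	while ((divisor >>= 1) != 0)
		n++;
	return (n);
}

/* Output frequency for a parent clock and a log2 divider field value. */
static uint64_t
log2_div_freq(uint64_t parent_hz, unsigned field)
{
	assert(field < 64);
	return (parent_hz >> field);
}
```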

MFC after: 3 weeks
Reviewed by: manu
Differential Revision:  https://reviews.freebsd.org/D22283

4 years agoEnhance Rockchip clocks implementation.
Michal Meloun [Fri, 8 Nov 2019 19:13:11 +0000 (19:13 +0000)]
Enhance Rockchip clocks implementation.
- add support for fractional dividers
- allow declaring fixed and linked clocks

MFC after: 3 weeks
Reviewed by: manu
Differential Revision:  https://reviews.freebsd.org/D22282

4 years agoCleanup Rockchip clocks implementation.
Michal Meloun [Fri, 8 Nov 2019 19:03:34 +0000 (19:03 +0000)]
Cleanup Rockchip clocks implementation.
- style
- unify dprintf defines
- make dprintf's 32-bit compatible
Not a functional change.

MFC after: 3 weeks
Reviewed by: manu, imp
Differential Revision:  https://reviews.freebsd.org/D22281

4 years agoImplement support for (soft)linked clocks.
Michal Meloun [Fri, 8 Nov 2019 18:57:41 +0000 (18:57 +0000)]
Implement support for (soft)linked clocks.
This kind of clock node represents a temporary placeholder for clocks
defined later in the boot process. Also, these are necessary to break
circular dependencies occasionally occurring in complex clock graphs.

MFC after: 3 weeks

4 years agoReenable netinet6 and netpfil tests on i386, net/scapy 2.4.3_2 contains the fix
Li-Wen Hsu [Fri, 8 Nov 2019 18:56:02 +0000 (18:56 +0000)]
Reenable netinet6 and netpfil tests on i386, net/scapy 2.4.3_2 contains the fix

PR: 239380
Sponsored by: The FreeBSD Foundation

4 years agobhyve: add support for virtio-net mergeable rx buffers
Vincenzo Maffione [Fri, 8 Nov 2019 17:57:03 +0000 (17:57 +0000)]
bhyve: add support for virtio-net mergeable rx buffers

Mergeable rx buffers is a virtio-net feature that allows the hypervisor
to use multiple RX descriptor chains to receive a single packet.
Without this feature, a TSO-enabled guest is compelled to publish only
64K (or 32K) long chains, and each of these large buffers is consumed
to receive a single packet, even a very short one. This is a waste of
memory, as an RX queue has room for 256 chains, which means up to 16MB
of buffer memory for each (single-queue) vtnet device.
With the feature on, the guest can publish 2K long chains, and the
hypervisor will merge them as needed.

This change also enables the feature in the netmap backend, which
supports virtio-net offloads. We plan to add support for the
tap backend too.
Note that, unlike QEMU/KVM, here we implement one-copy receive,
while QEMU uses two copies.
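
The memory figures above follow from simple arithmetic, sketched here in C.
The function names are hypothetical, and the sketch ignores the virtio-net
header that precedes each packet.

```c
#include <assert.h>
#include <stdint.h>

/* Total buffer memory published on one RX queue. */
static uint64_t
rxq_buffer_bytes(unsigned nchains, unsigned chain_len)
{
	return ((uint64_t)nchains * chain_len);
}

/* Chains the hypervisor must merge to hold a packet of 'pkt_len' bytes. */
static unsigned
chains_for_packet(unsigned pkt_len, unsigned chain_len)
{
	assert(chain_len > 0);
	return ((pkt_len + chain_len - 1) / chain_len);
}
```

With 256 chains of 64K each, the queue pins 16MB, while 256 chains of 2K
pin 512K; a short packet then consumes one 2K chain and a full 64K packet
consumes 32 of them.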

Reviewed by:    jhb
MFC after:      3 weeks
Differential Revision: https://reviews.freebsd.org/D21007

4 years agoDereference lem(4), no longer in 13-CURRENT.
Glen Barber [Fri, 8 Nov 2019 17:33:42 +0000 (17:33 +0000)]
Dereference lem(4), no longer in 13-CURRENT.
While here, fix formatting of inline parentheses and Xrs.

Sponsored by: Rubicon Communications, LLC (netgate.com)

4 years agolibpmc: Forgot regex.h
Emmanuel Vadot [Fri, 8 Nov 2019 17:27:20 +0000 (17:27 +0000)]
libpmc: Forgot regex.h

Reported by: ci
MFC after: 1 week
X-MFC-With: r354549

4 years agolibpmc: Match on the cpuid with a regex
Emmanuel Vadot [Fri, 8 Nov 2019 16:56:48 +0000 (16:56 +0000)]
libpmc: Match on the cpuid with a regex

The CPUID is, or can be, a regex to be matched.
Use regex from libc instead of strcmp
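
A minimal sketch of matching a CPUID string against a pattern with the libc
regex API, as opposed to strcmp().  The function name and the CPUID strings
in the usage note are hypothetical, and the choice of REG_EXTENDED is an
assumption of this sketch rather than something stated in the commit.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if 'cpuid' matches the POSIX extended regex 'pattern'. */
static bool
cpuid_matches(const char *pattern, const char *cpuid)
{
	regex_t re;
	bool match;

	assert(pattern != NULL && cpuid != NULL);
	/* REG_NOSUB: only a yes/no answer is needed, not match offsets. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return (false);
	match = (regexec(&re, cpuid, 0, NULL, 0) == 0);
	regfree(&re);
	return (match);
}
```

For example, the pattern "GenuineIntel-6-(4E|5E)" would match both
"GenuineIntel-6-4E" and "GenuineIntel-6-5E", which a plain strcmp() cannot
express.  Note that an unanchored pattern also matches substrings.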

Tested-by: gallatin
MFC after: 1 week

4 years agovmm: pass M_WAITOK to uma_zalloc when allocating FPU save area
Eric van Gyzen [Fri, 8 Nov 2019 16:30:55 +0000 (16:30 +0000)]
vmm: pass M_WAITOK to uma_zalloc when allocating FPU save area

Submitted by: patrick.sullivan3@dell.com
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22276

4 years agomark LLVM_LIBUNWIND as broken on sparc64, with PR reference
Ed Maste [Fri, 8 Nov 2019 15:20:19 +0000 (15:20 +0000)]
mark LLVM_LIBUNWIND as broken on sparc64, with PR reference

PR: 233405