FreeBSD/FreeBSD.git
4 years agoloader: remove unused variable from efipart.c
tsoome [Sat, 16 Nov 2019 08:16:50 +0000 (08:16 +0000)]
loader: remove unused variable from efipart.c

4 years agoRISC-V: busdma_bounce: fix BUS_DMA_ALLOCNOW for non-page-aligned sizes
mhorne [Sat, 16 Nov 2019 01:25:51 +0000 (01:25 +0000)]
RISC-V: busdma_bounce: fix BUS_DMA_ALLOCNOW for non-page-aligned sizes

RISC-V inherited this code from arm64, so implement the fix from r354712.
See the revision for the full description.

Submitted by: kevans (arm64 version)

4 years agoTSX Asynchronous Abort mitigation for Intel CVE-2019-11135.
scottl [Sat, 16 Nov 2019 00:26:42 +0000 (00:26 +0000)]
TSX Asynchronous Abort mitigation for Intel CVE-2019-11135.
This CVE has already been announced in FreeBSD SA-19:26.mcu.

Mitigation for TAA involves either turning off TSX or turning on the
VERW mitigation used for MDS. Some CPUs will also be self-mitigating
for TAA and require no software workaround.

Control knobs are:
machdep.mitigations.taa.enable:
        0 - no software mitigation is enabled
        1 - attempt to disable TSX
        2 - use the VERW mitigation
        3 - automatically select the mitigation based on processor
            features

machdep.mitigations.taa.state:
        inactive        - no mitigation is active/enabled
        TSX disable     - TSX is disabled in the bare metal CPU as well as
                          any virtualized CPUs
        VERW            - VERW instruction clears CPU buffers
        not vulnerable  - the CPU has identified itself as not being
                          vulnerable
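
For illustration, the enable values listed above can be modelled as a small lookup table. This is only a sketch, not the kernel's code: the enum names and taa_enable_describe() are hypothetical, and only the numeric values and their meanings are taken from the list.

```c
#include <stddef.h>

/* Hypothetical names; the values mirror machdep.mitigations.taa.enable. */
enum taa_enable {
	TAA_NONE = 0,		/* no software mitigation */
	TAA_TSX_DISABLE = 1,	/* attempt to disable TSX */
	TAA_VERW = 2,		/* use the VERW mitigation */
	TAA_AUTO = 3		/* select based on processor features */
};

/* Return a description for a knob value, or NULL if it is out of range. */
const char *
taa_enable_describe(int value)
{
	switch (value) {
	case TAA_NONE:
		return ("no software mitigation is enabled");
	case TAA_TSX_DISABLE:
		return ("attempt to disable TSX");
	case TAA_VERW:
		return ("use the VERW mitigation");
	case TAA_AUTO:
		return ("automatically select the mitigation");
	default:
		return (NULL);
	}
}
```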

Nothing in the base FreeBSD system uses TSX.  However, the instructions
are straightforward to add to custom applications and require no kernel
support, so the mitigation is provided for users with untrusted
applications and tenants.

Reviewed by: emaste, imp, kib, scottph
Sponsored by: Intel
Differential Revision: 22374

4 years agond6: retire defrouter_select(), use _fib() variant.
bz [Sat, 16 Nov 2019 00:17:35 +0000 (00:17 +0000)]
nd6: retire defrouter_select(), use _fib() variant.

Burn bridges and replace the last two calls of defrouter_select() with
defrouter_select_fib().  That allows us to retire defrouter_select()
and make it more clear in the calling code that it applies to all FIBs.

Sponsored by: Netflix

4 years agond6_rtr:
bz [Sat, 16 Nov 2019 00:02:36 +0000 (00:02 +0000)]
nd6_rtr:

Pull in the TAILQ_HEAD() as it is not needed outside nd6_rtr.c.
Rename the TAILQ_HEAD() struct and the nd_defrouter variable from
"nd_" to "nd6_" as they are not part of the RFC 3542 API which uses "ND_".

Ideally I'd like to also rename the struct nd_defrouter {} to "nd6_*"
but given that is used externally there is more work to do.

No functional changes.

MFC after: 3 weeks
Sponsored by: Netflix

4 years agoCreate a new sysctl subtree, machdep.mitigations. Its purpose is to organize
scottl [Fri, 15 Nov 2019 23:27:17 +0000 (23:27 +0000)]
Create a new sysctl subtree, machdep.mitigations.  Its purpose is to organize
knobs and indicators for code that mitigates functional and security issues
in the architecture/platform.  Controls for regular operational policy should
still go into places like security, hw, kern, etc.

The machdep root node is inherently architecture dependent, but mitigations
tend to be architecture dependent as well.  Some cases like Spectre do cross
architectural boundaries, but the mitigation code for them tends to be
architecture dependent anyways, and multiple architectures won't be active
in the same image of the kernel.

Many mitigation knobs already exist in the system, and they will be moved
with compat naming in the future.  Going forward, mitigations should collect
in machdep.mitigations.

Reviewed by: imp, brooks, rwatson, emaste, jhb
Sponsored by: Intel

4 years agoif_llatbl: change htable_unlink_entry() to early exit if no work to do
bz [Fri, 15 Nov 2019 23:12:19 +0000 (23:12 +0000)]
if_llatbl: change htable_unlink_entry() to early exit if no work to do

Adjust the logic in htable_unlink_entry() to the one in
htable_link_entry() saving a block indent and making it more clear
in which case we do not do any work.

No functional change.

MFC after: 3 weeks
Sponsored by: Netflix

4 years agoUse a sv_copyout_auxargs hook in the Linux ELF ABIs.
jhb [Fri, 15 Nov 2019 23:01:43 +0000 (23:01 +0000)]
Use a sv_copyout_auxargs hook in the Linux ELF ABIs.

Reviewed by: emaste
Tested on: amd64 (linux64 only), i386
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22356

4 years agoInitialize *comp_update with valid value.
mav [Fri, 15 Nov 2019 23:01:09 +0000 (23:01 +0000)]
Initialize *comp_update with valid value.

I've noticed that sometimes, with DMAR enabled, the initial write from the
device to this address is somehow delayed, triggering an assertion because
the zero default is invalid.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.

4 years agoCleanup address range checks in ioat(4).
mav [Fri, 15 Nov 2019 22:47:59 +0000 (22:47 +0000)]
Cleanup address range checks in ioat(4).

 - Deduce allowed address range for bus_dma(9) from the hardware version.
Different versions (CPU generations) have different documented limits.
 - Remove the difference between address ranges for src/dst and crc.  At
least the docs for the last few generations of CPUs do not mention anything
like that, while older ones are already limited by the limits above.
 - Remove address assertions from arguments.  While I do not think the
addresses out of allowed ranges should realistically happen there due to
the platforms physical address limitations, there is now bus_dma(9) to
make sure of that, preferably via IOMMU.
 - Since crc now has the same address range as src/dst, remove crc_dmamap,
reusing dst2_dmamap instead.

Discussed with: cem
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.

4 years agoRemove now unused IPv6 macros and update docs.
bz [Fri, 15 Nov 2019 21:55:41 +0000 (21:55 +0000)]
Remove now unused IPv6 macros and update docs.

After r354748-354750 all uses of the IP6_EXTHDR_CHECK() and
IP6_EXTHDR_GET() macros are gone from the kernel.  IP6_EXTHDR_GET0()
was unused.  Remove the macros and update the documentation.

Sponsored by: Netflix

4 years agoIP6_EXTHDR_CHECK(): remove the last instances
bz [Fri, 15 Nov 2019 21:51:43 +0000 (21:51 +0000)]
IP6_EXTHDR_CHECK(): remove the last instances

While r354748 removed almost all IP6_EXTHDR_CHECK() calls, these
are not part of the PULLDOWN_TESTS.
Equally convert these IP6_EXTHDR_CHECK()s here to m_pullup() and remove
the extra check and m_pullup() in tcp_input() under isipv6 given
tcp6_input() has done exactly that pullup already.

MFC after: 8 weeks
Sponsored by: Netflix

4 years agonetinet*: replace IP6_EXTHDR_GET()
bz [Fri, 15 Nov 2019 21:44:17 +0000 (21:44 +0000)]
netinet*: replace IP6_EXTHDR_GET()

In a few places we have IP6_EXTHDR_GET() left in upper layer protocols.
The IP6_EXTHDR_GET() macro might perform an m_pulldown() in case the data
fragment is not contiguous.

Convert these last remaining instances into m_pullup()s instead.
In CARP, for example, we call m_pullup() a few lines later anyway; the
IPsec code coming from OpenBSD would otherwise have done the m_pullup()
and copies the data a bit later anyway, so pulling it in seems no
better or worse.

Note: this leaves very few m_pulldown() cases behind in the tree and we
might want to consider removing them as well to make mbuf management
easier again on a path to variable size mbufs, especially given
m_pulldown() still has an issue not re-checking M_WRITEABLE().

Reviewed by: gallatin
MFC after: 8 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D22335

4 years agonetinet6: Remove PULLDOWN_TESTs.
bz [Fri, 15 Nov 2019 21:40:40 +0000 (21:40 +0000)]
netinet6: Remove PULLDOWN_TESTs.

Remove the KAME introduced PULLDOWN_TESTs which did not even
have a compile-time option in sys/conf to turn them on for a
custom kernel build. They made the code a lot harder to read
or more complicated in a few cases.

Convert the IP6_EXTHDR_CHECK() calls into FreeBSD looking code.
Rather than throwing the packet away if it would not fit the
KAME mbuf expectations, convert the macros to m_pullup() calls.
Do not do any extra manual conditional checks upfront as to
whether the m_len would suffice (*), simply let m_pullup() do
its work (incl. an early check).

Remove extra m_pullup() calls where earlier in the function or
the only caller has already done the pullup.

Discussed with: rwatson (*)
Reviewed by: ae
MFC after: 8 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D22334

4 years agoAllow per-file lex and yacc options.
bz [Fri, 15 Nov 2019 21:19:06 +0000 (21:19 +0000)]
Allow per-file lex and yacc options.

In order to allow software with multiple (different) options
for lex and yacc add extra per-file options to the calls.
This is especially useful when one .l file needs -Pprefix.

Reviewed by: imp
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D22337

4 years agoloader: add support for hybrid PMBR for GPT partition table
tsoome [Fri, 15 Nov 2019 20:43:39 +0000 (20:43 +0000)]
loader: add support for hybrid PMBR for GPT partition table

Note that a hybrid table is not really UEFI specification compliant.

Sample hybrid partition table:
> ::mbr
Format: unknown
Signature: 0xaa55 (valid)
UniqueMBRDiskSignature: 0

PART TYPE                  ACTIVE  STARTCHS    ENDCHS      SECTOR     NUMSECT
0    EFI_PMBR:0xee         0       1023/254/63 1023/254/63 1          409639
1    0xff                  0       1023/254/63 1023/254/63 409640     978508408
2    FDISK_EXT_WIN:0xc     0       1023/254/63 1023/254/63 978918048  31250000
3    0xff                  0       1023/254/63 1023/254/63 1010168048 32
>

4 years agoCombine ELF sysvecs for MIPS to reduce code duplication.
jhb [Fri, 15 Nov 2019 19:00:20 +0000 (19:00 +0000)]
Combine ELF sysvecs for MIPS to reduce code duplication.

Reviewed by: brooks, kevans
Tested on: mips, mips64
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22357

4 years agoloader: r354415 did miss to sort subpaths below the partitions
tsoome [Fri, 15 Nov 2019 18:57:00 +0000 (18:57 +0000)]
loader: r354415 did miss to sort subpaths below the partitions

Tested on actual system (MBP with UEFI 1.10).

4 years agocxgbev(4): Catch up with the pciids in the PF driver.
np [Fri, 15 Nov 2019 18:48:14 +0000 (18:48 +0000)]
cxgbev(4): Catch up with the pciids in the PF driver.

MFC after: 3 days
Sponsored by: Chelsio Communications

4 years agoAdd a sv_copyout_auxargs() hook in sysentvec.
jhb [Fri, 15 Nov 2019 18:42:13 +0000 (18:42 +0000)]
Add a sv_copyout_auxargs() hook in sysentvec.

Change the FreeBSD ELF ABIs to use this new hook to copyout ELF auxv
instead of doing it in the sv_fixup hook.  In particular, this new
hook allows the stack space to be allocated at the same time the auxv
values are copied out to userland.  This allows us to avoid wasting
space for unused auxv entries as well as not having to recalculate
where the auxv vector is by walking back up over the argv and
environment vectors.

Reviewed by: brooks, emaste
Tested on: amd64 (amd64 and i386 binaries), i386, mips, mips64
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22355

4 years agoFix build race in bsd.files.mk
arichardson [Fri, 15 Nov 2019 18:34:36 +0000 (18:34 +0000)]
Fix build race in bsd.files.mk

We need to ensure that installdirs-FOO runs before installfiles-FOO since
otherwise the directory may not exist when we attempt to install the target.
This was randomly causing failures in our Jenkins instance when installing
drti.o in cddl/lib/drti.

Reviewed By: brooks
Differential Revision: https://reviews.freebsd.org/D22382

4 years agomakefs: Also set UFS di_birthtime when building on Linux
arichardson [Fri, 15 Nov 2019 18:34:30 +0000 (18:34 +0000)]
makefs: Also set UFS di_birthtime when building on Linux

Since st_birthtime doesn't exist on Linux (unless you use statx(2)), we
instead populate it with the st_ctime value.
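
The fallback described above might look roughly like the sketch below. The macro HAVE_STRUCT_STAT_BIRTHTIME and the helper file_birthtime() are assumptions made for this example and are not taken from the makefs sources.

```c
#include <sys/stat.h>
#include <time.h>

/*
 * HAVE_STRUCT_STAT_BIRTHTIME is an assumed configure-style macro; it would
 * be defined only where struct stat provides st_birthtime.
 */
time_t
file_birthtime(const struct stat *sb)
{
#ifdef HAVE_STRUCT_STAT_BIRTHTIME
	return (sb->st_birthtime);
#else
	/* No birth time available: fall back to the inode change time. */
	return (sb->st_ctime);
#endif
}
```

On a system without the macro defined, the helper simply returns st_ctime, which matches the behaviour the commit describes for Linux builds.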

Reviewed By: emaste
Differential Revision: https://reviews.freebsd.org/D22386

4 years agoFix contents= being ignored in msdosfs makefs mtree
arichardson [Fri, 15 Nov 2019 18:34:23 +0000 (18:34 +0000)]
Fix contents= being ignored in msdosfs makefs mtree

I noticed this while trying to build an EFI boot image

Reviewed By: emaste
Differential Revision: https://reviews.freebsd.org/D22387

4 years agoFix regression from r353841: ctx.rc needs to be initialized,
glebius [Fri, 15 Nov 2019 18:02:37 +0000 (18:02 +0000)]
Fix regression from r353841: ctx.rc needs to be initialized,
otherwise driver might silently fail to initialize.

Pointy hat to: glebius

4 years agoUse __ as the separator for the exported vars in bsd.compiler/linker.mk
arichardson [Fri, 15 Nov 2019 16:43:36 +0000 (16:43 +0000)]
Use __ as the separator for the exported vars in bsd.compiler/linker.mk

By using '__' instead of '.' as the separator we can also support systems
that use dash as /bin/sh (it's the default shell on Ubuntu/Debian). Dash
will unset any environment variables that use a non alphanumeric+underscore
character and therefore submakes will fail to import the COMPILER_*
variables if we use '.' as the separator.
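
To make the naming constraint concrete, here is a small sketch of the check a strict shell effectively applies to environment variable names. The function is_portable_env_name() and the sample names in the usage note are hypothetical; they are not part of bsd.compiler.mk or dash.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* True if name matches [A-Za-z_][A-Za-z0-9_]*, the portable identifier form. */
bool
is_portable_env_name(const char *name)
{
	if (name == NULL)
		return (false);
	if (!(isalpha((unsigned char)name[0]) || name[0] == '_'))
		return (false);
	for (const char *p = name + 1; *p != '\0'; p++) {
		if (!(isalnum((unsigned char)*p) || *p == '_'))
			return (false);
	}
	return (true);
}
```

With this check, a made-up name such as "FOO__bar" passes while "FOO.bar" does not, which is why '__' is the safer separator.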

Reviewed By: emaste
Differential Revision: https://reviews.freebsd.org/D22381

4 years agoDisable ntpd stack gap. When ASLR with STACK GAP != 0 ntpd suffers SIGSEGV.
cy [Fri, 15 Nov 2019 16:34:35 +0000 (16:34 +0000)]
Disable ntpd stack gap. When ASLR with STACK GAP != 0 ntpd suffers SIGSEGV.

PR: 241421, 241960
Reported by: Vladimir Zakharov <zakharov.vv@gmail.com>,
dewayne@heuristicsystems.com.au
Reviewed by: kib, imp (previous version), ian (suggestion)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D22358

4 years agoSupport O_CLOEXEC in linux(4) open(2) and openat(2).
trasz [Fri, 15 Nov 2019 16:21:46 +0000 (16:21 +0000)]
Support O_CLOEXEC in linux(4) open(2) and openat(2).

Reviewed by: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21966

4 years agond6: simplify code
bz [Fri, 15 Nov 2019 13:45:38 +0000 (13:45 +0000)]
nd6: simplify code

We are taking the same actions in both cases of the branch inside the block.
Simplify that code as the extra branch is not needed.

MFC after: 3 weeks
Sponsored by: Netflix

4 years agoRevert a patch that accidentally was committed with r354729
scottl [Fri, 15 Nov 2019 11:54:51 +0000 (11:54 +0000)]
Revert a patch that accidentally was committed with r354729

4 years agoFix a typo in how the AVX512DQ feature bit is checked.
scottl [Fri, 15 Nov 2019 11:53:06 +0000 (11:53 +0000)]
Fix a typo in how the AVX512DQ feature bit is checked.

Reviewed by: kib
Sponsored by: Intel

4 years agoPrevent potential underflow in ibcore.
hselasky [Fri, 15 Nov 2019 11:46:53 +0000 (11:46 +0000)]
Prevent potential underflow in ibcore.

Linux commit:
a9018adfde809d44e71189b984fa61cc89682b5e

MFC after: 1 week
Sponsored by: Mellanox Technologies

4 years agoCorrect MR length field to be 64-bit in ibcore.
hselasky [Fri, 15 Nov 2019 11:45:14 +0000 (11:45 +0000)]
Correct MR length field to be 64-bit in ibcore.

Linux commit:
edd31551148c09608feee6b8756ad148d550ee3b

MFC after: 1 week
Sponsored by: Mellanox Technologies

4 years agoif_llatbl: cleanup
bz [Fri, 15 Nov 2019 11:00:03 +0000 (11:00 +0000)]
if_llatbl: cleanup

Remove function prototypes which are not needed (no use before function
definition for these file static functions).

MFC after: 3 weeks
Sponsored by: Netflix

4 years agoMerge commit 5bbb604bb from llvm git (by Craig Topper):
dim [Fri, 15 Nov 2019 06:56:25 +0000 (06:56 +0000)]
Merge commit 5bbb604bb from llvm git (by Craig Topper):

  [InstCombine] Disable some portions of foldGEPICmp for GEPs that
  return a vector of pointers. Fix other portions.

  llvm-svn: 370114

This should fix instances of 'Assertion failed: (isa<X>(Val) &&
"cast<Ty>() argument of incompatible type!"), function cast, file
/usr/src/contrib/llvm/include/llvm/Support/Casting.h, line 255', when
building openjdk8 for aarch64 and armv7.

Reported by: jbeich
PR: 236566
MFC after: 3 days

4 years agoatomic: Add atomic_cmpset_masked to powerpc and use it
jhibbits [Fri, 15 Nov 2019 04:33:07 +0000 (04:33 +0000)]
atomic: Add atomic_cmpset_masked to powerpc and use it

Summary:
This is a more optimal way of doing atomic_cmpset_masked() than the
fallback in sys/_atomic_subword.h.  There's also an override for
_atomic_fcmpset_masked_word(), which may or may not be necessary, and is
unused for powerpc.

Reviewed by: kevans, kib
Differential Revision: https://reviews.freebsd.org/D22359

4 years agoRISC-V: Print SBI info at startup
mhorne [Fri, 15 Nov 2019 03:40:02 +0000 (03:40 +0000)]
RISC-V: Print SBI info at startup

SBI version 0.2 introduces functions for obtaining the details of the
SBI implementation, such as version and implementation ID. Print this
info at startup when it is available.

Reviewed by: jhb, kp
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D22327

4 years agoAdd missing files from r354720
mhorne [Fri, 15 Nov 2019 03:37:49 +0000 (03:37 +0000)]
Add missing files from r354720

MFC with: r354720
Differential Revision: https://reviews.freebsd.org/D22326

4 years agoRISC-V: add support for SBI spec v0.2
mhorne [Fri, 15 Nov 2019 03:34:27 +0000 (03:34 +0000)]
RISC-V: add support for SBI spec v0.2

The Supervisor Binary Interface (SBI) specification v0.2 is a backwards
incompatible update to the SBI call interface for kernels running in
supervisor mode. The goal of this update was to make it easier for new
and optional functionality to be added to the SBI.

SBI functions are now called by passing an "extension ID" and a
"function ID" which are passed in a7 and a6 respectively. SBI calls
will also return an error and value in the following struct:

struct sbi_ret {
    long error;
    long value;
};

This version introduces several new functions under the "base"
extension. It is expected that all SBI implementations >= 0.2 will
support this base set of functions, as they implement some essential
services such as obtaining the SBI version, CPU implementation info, and
extension probing.

Existing SBI functions have been designated as "legacy". For the time
being they will remain implemented, but it is expected that in the
future their functionality will be duplicated or replaced by new SBI
extensions. Each legacy function has been assigned its own extension ID,
and for now we simply probe and assert for their existence.

Compatibility with legacy SBI implementations (such as BBL) is
maintained by checking the output of sbi_get_spec_version(). This
function is guaranteed to succeed by the new spec, but will return an
error in legacy implementations. We use this as an indicator of whether
or not we can rely on the new SBI base extensions.
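
The legacy check described above can be sketched in plain C. The helper sbi_has_base_extension() is hypothetical, and the version field layout (major number in bits 30:24, minor number in bits 23:0) is stated here as an assumption that should be verified against the SBI specification linked below.

```c
#include <stdbool.h>

/* Return pair described in the commit message. */
struct sbi_ret {
	long error;
	long value;
};

/* Assumed layout of the spec version: major in bits 30:24, minor in 23:0. */
#define SBI_SPEC_MAJOR(v)	(((unsigned long)(v) >> 24) & 0x7f)
#define SBI_SPEC_MINOR(v)	((unsigned long)(v) & 0xffffff)

/*
 * Hypothetical helper: given the result of the spec-version call, decide
 * whether the base extension can be relied on.  A nonzero error is treated
 * as a legacy implementation.
 */
bool
sbi_has_base_extension(struct sbi_ret ret)
{
	if (ret.error != 0)
		return (false);
	return (SBI_SPEC_MAJOR(ret.value) > 0 ||
	    SBI_SPEC_MINOR(ret.value) >= 2);
}
```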

For further info on the Supervisor Binary Interface, see:
https://github.com/riscv/riscv-sbi-doc/blob/master/riscv-sbi.adoc

Reviewed by: kp, jhb
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D22326

4 years agoRISC-V: pass arg6 in sbi_call
mhorne [Fri, 15 Nov 2019 03:22:08 +0000 (03:22 +0000)]
RISC-V: pass arg6 in sbi_call

Allow for an additional argument to sbi_call which will be passed in a6.
This is required for SBI spec 0.2 support, as a6 will indicate the SBI
function ID.

While here, introduce some macros to clean up the calls.

Reviewed by: kp, jhb
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D22325

4 years agoplic: support irq distribution
mhorne [Fri, 15 Nov 2019 03:18:11 +0000 (03:18 +0000)]
plic: support irq distribution

Our PLIC implementation only enables interrupts on the boot cpu.
Implement plic_bind_intr() so that they can be redistributed near the
end of boot during intr_irq_shuffle().

This also slightly modifies how enable bits are handled in an attempt to
better fit the PIC interface. plic_enable_intr()/plic_disable_intr() are
converted to manage an interrupt source's threshold value, since this
value can be used to globally enable/disable an irq. All handling of the
per-context enable bits is moved to the new methods plic_setup_intr()
and plic_bind_intr().

Reviewed by: br
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D21928

4 years agoplic: fix context calculation
mhorne [Fri, 15 Nov 2019 03:15:14 +0000 (03:15 +0000)]
plic: fix context calculation

The RISC-V PLIC (platform level interrupt controller) registers are divided up
by "context", which is purposefully left ambiguous in the PLIC spec. Currently
we assume each CPU number corresponds 1-to-1 with a context number, but that is
not correct. Most existing PLIC implementations (such as SiFive's) have
multiple contexts per-cpu. For example, a single CPU might have a context for
machine mode interrupts and a context for supervisor mode interrupts. To
complicate things further, FreeBSD renumbers the CPUs during boot, but the PLIC
driver still assumes that CPU ID equals the RISC-V hart number, meaning
interrupt enables/claims might be performed for the wrong context registers.

To fix this, we must calculate each CPU's context number during
attachment. This is done by reading the interrupt properties from the
device tree, from which a mapping from context to RISC-V hart to CPU
number can be created.
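
As a rough model of that mapping, the sketch below takes a per-context description (target hart and interrupt cause) and a hart-to-CPU table, and derives each CPU's supervisor-mode context. All names here are hypothetical and the data layout is a simplification of what the device tree provides; the value 9 is assumed to be the RISC-V supervisor external interrupt cause.

```c
#include <stddef.h>

#define IRQ_EXTERNAL_SUPERVISOR	9	/* assumed S-mode external irq cause */

/* One entry per PLIC context, in context order (simplified model). */
struct plic_ctx_desc {
	int hart;	/* hart number this context targets */
	int irq;	/* interrupt cause the context is wired to */
};

/*
 * Fill cpu_ctx[cpu] with the supervisor-mode context number of each CPU.
 * hart_to_cpu[hart] is the renumbered CPU ID, or -1 if the hart is unused.
 * CPUs with no matching context are left at -1.
 */
void
plic_map_contexts(const struct plic_ctx_desc *ctxs, size_t nctx,
    const int *hart_to_cpu, size_t nhart, int *cpu_ctx, size_t ncpu)
{
	for (size_t i = 0; i < ncpu; i++)
		cpu_ctx[i] = -1;
	for (size_t c = 0; c < nctx; c++) {
		int cpu;

		if (ctxs[c].irq != IRQ_EXTERNAL_SUPERVISOR)
			continue;
		if (ctxs[c].hart < 0 || (size_t)ctxs[c].hart >= nhart)
			continue;
		cpu = hart_to_cpu[ctxs[c].hart];
		if (cpu >= 0 && (size_t)cpu < ncpu)
			cpu_ctx[cpu] = (int)c;
	}
}
```

The point of the indirection is that neither "context == CPU ID" nor "CPU ID == hart number" is assumed anywhere.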

Reviewed by: br
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D21927

4 years agoFix build with GCC
jpaetzel [Fri, 15 Nov 2019 01:07:39 +0000 (01:07 +0000)]
Fix build with GCC

Fix suggested by: jhb, scottl
Sponsored by: Panzura

4 years agoAdd the pvscsi driver to the tree.
jpaetzel [Thu, 14 Nov 2019 23:31:20 +0000 (23:31 +0000)]
Add the pvscsi driver to the tree.

This driver allows the use of the paravirt SCSI controller
in VMware products like ESXi.  The pvscsi driver provides a
substantial performance improvement in block devices versus
the emulated mpt and mps SCSI/SAS controllers.

Error handling in this driver has not been extensively tested
yet.

Submitted by: vbhakta@vmware.com
Relnotes: yes
Sponsored by: VMware, Panzura
Differential Revision: D18613

4 years agoBoot arm64 kernel using booti command from U-boot.
jhibbits [Thu, 14 Nov 2019 21:58:40 +0000 (21:58 +0000)]
Boot arm64 kernel using booti command from U-boot.

Summary:
Boot arm64 kernel using booti command from U-boot. booti can relocate initrd
image into higher ram addresses, therefore align the initrd load address to 1GiB
and create VA = PA map for it. Create L2 pagetable entries to copy the initrd
image into KVA.
(parts of the code in https://reviews.freebsd.org/D13861 were referenced and
used as appropriate)

Submitted by: Siddharth Tuli <siddharthtuli_gmail.com>
Reviewed by: manu
Sponsored by: Juniper Networks, Inc
Differential Revision: https://reviews.freebsd.org/D22255

4 years ago[PowerPC64] Fix broken kernel modules due to LLD 9+ TOC optimization
bdragon [Thu, 14 Nov 2019 19:56:42 +0000 (19:56 +0000)]
[PowerPC64] Fix broken kernel modules due to LLD 9+ TOC optimization

LLD 9 introduced a TOC optimization that isn't compatible with the kernel
dynamic linker, causing a panic when loading kernel modules (pf, linuxkpi, etc.).

This patch disables TOC optimization when building kernel modules.

Submitted by: Alfredo Dal'Ava Junior <alfredo.junior@eldorado.org.br>
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D22317

4 years agoarm64: busdma_bounce: fix BUS_DMA_ALLOCNOW for non-page-aligned sizes
kevans [Thu, 14 Nov 2019 18:38:56 +0000 (18:38 +0000)]
arm64: busdma_bounce: fix BUS_DMA_ALLOCNOW for non-page-aligned sizes

For any size that isn't page-aligned, we end up not pre-allocating enough
for a single mapping because we truncate the size instead of rounding up to
make sure the last bit is accounted for, leaving us one page shy of what we
need to fulfill a request.
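
The off-by-one can be illustrated with two page-count helpers. This is a sketch only: the function names and the 4096-byte page size are assumptions for the example, not the busdma code itself.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE	4096u	/* assumed page size for the example */

/* Truncating division: drops a partial last page (the bug described). */
size_t
pages_truncated(size_t size)
{
	return (size / SKETCH_PAGE_SIZE);
}

/* Rounding up: a partial last page counts as a full page. */
size_t
pages_rounded_up(size_t size)
{
	return ((size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE);
}
```

For a page-aligned size both helpers agree; for any other size the truncating version comes up one page short.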

Differential Revision: https://reviews.freebsd.org/D22288

4 years agoTidy syscall declarations.
brooks [Thu, 14 Nov 2019 17:11:52 +0000 (17:11 +0000)]
Tidy syscall declarations.

Pointer arguments should be of the form "<type> *..." and not "<type>* ...".

No functional change.

Reviewed by: kevans
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D22373

4 years agoCompile in arm/unwind.c if options STACK is in effect; the new arm stack(9)
ian [Thu, 14 Nov 2019 17:04:19 +0000 (17:04 +0000)]
Compile in arm/unwind.c if options STACK is in effect; the new arm stack(9)
code now uses unwind.c.

4 years agoRewrite arm/stack_machdep.c for EABI; add stack(9) support to arm kernels.
ian [Thu, 14 Nov 2019 16:46:27 +0000 (16:46 +0000)]
Rewrite arm/stack_machdep.c for EABI; add stack(9) support to arm kernels.

The old stack_machdep.c code was written for the APCS ABI (aka "oldabi").
When we switched to ARM EABI (back in freebsd 10) this file never got
updated, and apparently nobody noticed that until now.

The new implementation uses the same stack unwinder code used by the
arm implementation of the db_trace stuff.

4 years agoFor idle TCP sessions using the CUBIC congestion control, reset ssthresh
tuexen [Thu, 14 Nov 2019 16:28:02 +0000 (16:28 +0000)]
For idle TCP sessions using the CUBIC congestion control, reset ssthresh
to the higher of the previous ssthresh or 3/4 of the prior cwnd.
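
The rule stated above amounts to a single max() computation. The sketch below uses a hypothetical function name and plain integer arithmetic; it is not the code from the CUBIC module.

```c
#include <stdint.h>

/*
 * Hypothetical helper: new ssthresh after an idle period is the larger of
 * the previous ssthresh and three quarters of the prior cwnd.
 */
uint32_t
cubic_after_idle_ssthresh(uint32_t prev_ssthresh, uint32_t prior_cwnd)
{
	/* Widen before multiplying so cwnd * 3 cannot overflow. */
	uint32_t three_quarters = (uint32_t)(((uint64_t)prior_cwnd * 3) / 4);

	return (prev_ssthresh > three_quarters ? prev_ssthresh : three_quarters);
}
```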

Submitted by: Richard Scheffenegger
Reviewed by: Cheng Cui
Differential Revision: https://reviews.freebsd.org/D18982

4 years agollvm: use elf_aux_info to get executable's path, if available
emaste [Thu, 14 Nov 2019 15:10:01 +0000 (15:10 +0000)]
llvm: use elf_aux_info to get executable's path, if available

Obtained from: LLVM a0a38b81ea
MFC with: r354692
Sponsored by: The FreeBSD Foundation

4 years agoPass more reasonable WAIT flags to bus_dma(9) calls.
mav [Thu, 14 Nov 2019 04:39:48 +0000 (04:39 +0000)]
Pass more reasonable WAIT flags to bus_dma(9) calls.

MFC after: 2 weeks

4 years agoMake ntb(4) send bus_get_dma_tag() requests to parent buses passing real
mav [Thu, 14 Nov 2019 04:34:58 +0000 (04:34 +0000)]
Make ntb(4) send bus_get_dma_tag() requests to parent buses passing real
bus' child pointers instead of grandchildren.

DMAR does not like requests from devices not parented directly by PCI.

MFC after: 2 weeks

4 years agopowerpc: Kernel fixes for ppc32 and powerpcspe w/ lld
bdragon [Thu, 14 Nov 2019 04:34:17 +0000 (04:34 +0000)]
powerpc: Kernel fixes for ppc32 and powerpcspe w/ lld

Fix wrong section ordering that was causing a ".got is not contiguous with
other relro sections" lld error. This also brings ldscript.powerpc and
ldscript.powerpcspe closer to ldscript.powerpc64.

Also, remove unnecessary text relocs from the ppc32 AIM trap code.

Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D22349

4 years agoarmv6 soft float build fixed
imp [Thu, 14 Nov 2019 01:38:48 +0000 (01:38 +0000)]
armv6 soft float build fixed

Add ifdefs in the assembler for soft-float compile case.

Submitted by: Hiroki Mori
Reviewed by: ray@
Differential Review: https://reviews.freebsd.org/D22352

4 years agoImprove the description of AT_EXECPATH availability.
brooks [Wed, 13 Nov 2019 23:31:23 +0000 (23:31 +0000)]
Improve the description of AT_EXECPATH availability.

Reported by: kib
Sponsored by: DARPA, AFRL

4 years agocpucontrol: print more useful information when MSR access fails.
kib [Wed, 13 Nov 2019 22:43:11 +0000 (22:43 +0000)]
cpucontrol: print more useful information when MSR access fails.

Instead of providing ioctl cmd value, which has no meaning to user,
print MSR number.  The latter is what the user expects in this place
anyway.

Reported by: pstef
Sponsored by: The FreeBSD Foundation
MFC after: 3 days

4 years agoamd64: only set PCB_FULL_IRET pcb flag when #gp or similar exception comes
kib [Wed, 13 Nov 2019 22:39:46 +0000 (22:39 +0000)]
amd64: only set PCB_FULL_IRET pcb flag when #gp or similar exception comes
from usermode.

If CPU supports RDFSBASE, the flag also means that userspace fsbase
and gsbase are already written into pcb, which might not be true when
we handle #gp from kernel.

The offender is rdmsr_safe(), and the visible result is corrupted
userspace TLS base.

Reported by: pstef
Sponsored by: The FreeBSD Foundation
MFC after: 3 days

4 years agoelf_aux_info: Add support for AT_EXECPATH.
brooks [Wed, 13 Nov 2019 21:51:55 +0000 (21:51 +0000)]
elf_aux_info: Add support for AT_EXECPATH.

Reviewed by: emaste, sef
MFC after: 3 days
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D22353

4 years agoRefine r354661 to unbreak the GCC_BOOTSTRAP case.
jhb [Wed, 13 Nov 2019 21:49:46 +0000 (21:49 +0000)]
Refine r354661 to unbreak the GCC_BOOTSTRAP case.

MK_CLANG_IS_CC controls installing links for GCC, not just clang.  Set
MK_CLANG_IS_CC to the value of MK_CLANG_BOOTSTRAP.  This will leave it
as "no" if no bootstrap compiler is being built or GCC 4.2.1 is being
used as the bootstrap compiler, and "yes" if clang is being used as
the bootstrap compiler.

Submitted by: bdrewery (kind of, he suggested this on IRC while I was
     testing the original patch)
Reviewed by: kevans, imp
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22350

4 years agollvm: use AT_EXECPATH from ELF auxiliary vectors for getExecutablePath
emaste [Wed, 13 Nov 2019 21:02:18 +0000 (21:02 +0000)]
llvm: use AT_EXECPATH from ELF auxiliary vectors for getExecutablePath

/proc/curproc/file and the KERN_PROC_PATHNAME sysctl may not return the
desired path if there are multiple hardlinks to the file.

PR: 241932
Tested by: ler
Sponsored by: The FreeBSD Foundation

4 years agoImprove Linuxulator man pages to better reflect the current state,
trasz [Wed, 13 Nov 2019 20:32:23 +0000 (20:32 +0000)]
Improve Linuxulator man pages to better reflect the current state,
and add some missing Xrs.

Reviewed by: brueffer, emaste (earlier version)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22277

4 years agoAdd 'linux_mounts_enable' rc.conf(5) variable, to make it possible
trasz [Wed, 13 Nov 2019 20:27:38 +0000 (20:27 +0000)]
Add 'linux_mounts_enable' rc.conf(5) variable, to make it possible
to disable mounting Linux-specific filesystems under /compat/linux
when 'linux_enable' is set to YES.

Reviewed by: netchild, ian (earlier version)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22320

4 years agossp: further refine the conditional used for constructor priority
kevans [Wed, 13 Nov 2019 18:21:06 +0000 (18:21 +0000)]
ssp: further refine the conditional used for constructor priority

__has_attribute(__constructor__) is a better test for clang than
defined(__clang__). Switch to it instead.

While we're already here and touching it, pfg@ nailed down when GCC actually
introduced the priority argument -- 4.3. Use that instead of our
hammer-guess of GCC >= 5 for the sake of correctness.
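
A conditional of the kind described might be shaped like the sketch below. The macro SKETCH_CTOR, the priority value 200, and the guard functions are all made up for this example; only the use of __has_attribute(__constructor__) and the GCC 4.3 cutoff come from the message above.

```c
/* Compilers without __has_attribute fall through to the version check. */
#ifndef __has_attribute
#define __has_attribute(x)	0
#endif

/* GCC gained the priority argument in 4.3, per the commit message. */
#if __has_attribute(__constructor__) || \
    (defined(__GNUC__) && (__GNUC__ > 4 || \
    (__GNUC__ == 4 && __GNUC_MINOR__ >= 3)))
#define SKETCH_CTOR	__attribute__((__constructor__(200)))
#else
#define SKETCH_CTOR	__attribute__((__constructor__))
#endif

static int sketch_guard_ready;

/* Declared with the attribute so it runs before main(). */
static void sketch_guard_setup(void) SKETCH_CTOR;

static void
sketch_guard_setup(void)
{
	sketch_guard_ready = 1;
}

/* Report whether the constructor has already run. */
int
sketch_guard_is_ready(void)
{
	return (sketch_guard_ready);
}
```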

4 years agoFix a typo in the PMAP_PTE_SET_CACHE_BITS macro.
brooks [Wed, 13 Nov 2019 18:10:42 +0000 (18:10 +0000)]
Fix a typo in the PMAP_PTE_SET_CACHE_BITS macro.

The second argument should have been "pa" not "ps".  It worked by
accident because the argument was always "pa" which was an in-scope
local variable.
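
This class of bug is easy to reproduce in miniature. The macros and functions below are hypothetical stand-ins, not the real PMAP_PTE_SET_CACHE_BITS definition; they only show how a misspelled macro parameter can silently bind to a caller's local variable.

```c
#include <stdint.h>

/* Buggy: the parameter is spelled "ps", so "pa" in the body is NOT the argument. */
#define SKETCH_SET_BUGGY(pte, ps, bits)	((pte) = (pa) | (bits))
/* Fixed: the parameter name matches the name used in the body. */
#define SKETCH_SET_FIXED(pte, pa, bits)	((pte) = (pa) | (bits))

uint64_t
sketch_buggy(uint64_t pa, uint64_t bits)
{
	uint64_t pte;

	/* Compiles only because a local named "pa" happens to be in scope. */
	SKETCH_SET_BUGGY(pte, pa, bits);
	return (pte);
}

uint64_t
sketch_fixed(uint64_t phys, uint64_t bits)
{
	uint64_t pte;

	/* Works regardless of what the caller names its variable. */
	SKETCH_SET_FIXED(pte, phys, bits);
	return (pte);
}
```

Renaming the local in sketch_buggy() to anything other than "pa" would turn the latent bug into a compile error.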

Submitted by: sson
Reviewed by: jhb, kevans
Obtained from: CheriBSD
MFC after: 3 days
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D22338

4 years agoAdd t4_keyctx.c to sys/conf/files for the non-module build.
jhb [Wed, 13 Nov 2019 17:06:10 +0000 (17:06 +0000)]
Add t4_keyctx.c to sys/conf/files for the non-module build.

Missed in r354667.

Pointy hat to: jhb
MFC after: 1 month
Sponsored by: Chelsio Communications

4 years agoIn if_siocaddmulti() enter VNET.
glebius [Wed, 13 Nov 2019 16:28:53 +0000 (16:28 +0000)]
In if_siocaddmulti() enter VNET.

Reported & tested by: garga

4 years agoDefine wrapper functions vm_map_entry_{succ,pred} to act as wrappers
dougm [Wed, 13 Nov 2019 15:56:07 +0000 (15:56 +0000)]
Define wrapper functions vm_map_entry_{succ,pred} to act as wrappers
around entry->{next,prev} when those are used for ordered list
traversal, and use those wrapper functions everywhere. Where the next
field is used for maintaining a stack of deferred operations, #define
defer_next to make that different usage clearer, and then use the
'right' pointer instead of 'next' for that purpose.
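
A minimal sketch of the accessor idea, assuming a much reduced entry
structure: struct entry_sketch and the helper names below are hypothetical
stand-ins, not the real struct vm_map_entry or vm_map_entry_{succ,pred}.
Callers that walk the ordered list go through the wrappers, so the underlying
representation can change later without touching them.

```c
/* Hypothetical, reduced stand-in for a map entry on an ordered list. */
struct entry_sketch {
	struct entry_sketch *prev;
	struct entry_sketch *next;
	unsigned long start;
};

/* Successor in address order. */
static inline struct entry_sketch *
entry_succ(struct entry_sketch *e)
{
	return (e->next);
}

/* Predecessor in address order. */
static inline struct entry_sketch *
entry_pred(struct entry_sketch *e)
{
	return (e->prev);
}

/* Count entries on a circular list whose sentinel is "header". */
static int
count_entries(struct entry_sketch *header)
{
	struct entry_sketch *e;
	int n = 0;

	for (e = entry_succ(header); e != header; e = entry_succ(e))
		n++;
	return (n);
}
```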

Approved by: markj
Tested by: pho (as part of a larger patch)
Differential Revision: https://reviews.freebsd.org/D22347

4 years agoStop the VESA driver from whining loudly in the dmesg during boot on
scottl [Wed, 13 Nov 2019 15:31:31 +0000 (15:31 +0000)]
Stop the VESA driver from whining loudly in the dmesg during boot on
systems that use EFI instead of BIOS.

4 years agond6: remove unused structs and defines
bz [Wed, 13 Nov 2019 14:28:07 +0000 (14:28 +0000)]
nd6: remove unused structs and defines

Remove a collection of unused structs and #defines to make it easier
to understand what is actually in use.

Sponsored by: Netflix

4 years agond6: make nd6_alloc() file static
bz [Wed, 13 Nov 2019 13:53:17 +0000 (13:53 +0000)]
nd6: make nd6_alloc() file static

nd6_alloc() is a function used only locally.  Make it static and no
longer export it.  Keeps the KPI smaller.

Sponsored by: Netflix

4 years agond6 defrouter: consolidate nd_defrouter manipulations in nd6_rtr.c
bz [Wed, 13 Nov 2019 12:05:48 +0000 (12:05 +0000)]
nd6 defrouter: consolidate nd_defrouter manipulations in nd6_rtr.c

Move the nd_defrouter along with the sysctl handler from nd6.c to
nd6_rtr.c and make the variable file static.  Provide (temporary)
new accessor functions for code manipulating nd_defrouter from nd6.c,
and stop exporting functions no longer needed outside nd6_rtr.c.
This also shuffles a few functions around in nd6_rtr.c without
functional changes.

Given all nd_defrouter logic is now in one place we can tidy up the
code, locking, and other open items.

MFC after: 3 weeks
X-MFC: keep exporting the functions
Sponsored by: Netflix

4 years agolltabl: remove dead code
bz [Wed, 13 Nov 2019 11:21:02 +0000 (11:21 +0000)]
lltabl: remove dead code

Remove the function lltable_drain(), which has long (8? years) been marked
#if 0, and while here also remove the unused function llentry_alloc(), whose
call paths tools keep finding even though they are never used.

Sponsored by: Netflix

4 years agoLogging improvements to loader::nfs
rpokala [Wed, 13 Nov 2019 03:56:51 +0000 (03:56 +0000)]
Logging improvements to loader::nfs

Include the server IP address when logging nfs_open(), add a few missing
"\n"s, and correct a typo.

Reviewed by: kevans
MFC after: 2 weeks
Sponsored by: Panasas
Differential Revision: https://reviews.freebsd.org/D22346

4 years agossp: rework the logic to use priority=200 on clang builds
kevans [Wed, 13 Nov 2019 03:00:32 +0000 (03:00 +0000)]
ssp: rework the logic to use priority=200 on clang builds

The preproc logic was added at the last minute to appease GCC 4.2, and
kevans@ did clearly not go back and double-check that the logic worked out
for clang builds to use the new variant.

It turns out that clang defines __GNUC__ == 4. Flip it around and check
__clang__ as well, leaving a note to remove it later.

Reported by: cem

4 years agopowerpc64: Don't guard ISA 3.0 partition table setup with hw_direct_map
jhibbits [Wed, 13 Nov 2019 02:22:00 +0000 (02:22 +0000)]
powerpc64: Don't guard ISA 3.0 partition table setup with hw_direct_map

PowerISA 3.0 eliminated the 64-bit bridge mode which allowed 32-bit kernels
to run on 64-bit AIM/Book-S hardware.  Since therefore only a 64-bit kernel
can run on this hardware, and 64-bit native always has the direct map, there
is no need to guard it.

4 years agopowerpc: Don't savectx() twice in IPI_STOP handler
jhibbits [Wed, 13 Nov 2019 02:16:24 +0000 (02:16 +0000)]
powerpc: Don't savectx() twice in IPI_STOP handler

We already save context in the stoppcbs[] array, so there's no need to also
save it in the PCB, where it won't be used.

4 years agossp: add a priority to the __stack_chk_guard constructor
kevans [Wed, 13 Nov 2019 02:14:17 +0000 (02:14 +0000)]
ssp: add a priority to the __stack_chk_guard constructor

First, this commit is a NOP on GCC <= 4.x; this decidedly doesn't work
cleanly on GCC 4.2, and it will be gone soon anyways so I chose not to dump
time into figuring out if there's a way to make it work. xtoolchain-gcc,
clocking in as GCC6, can cope with it just fine and later versions are also
generally ok with the syntax. I suspect very few users are running GCC4.2
built worlds and also experiencing potential fallout from the status quo.

For dynamically linked applications, this change also means very little.
rtld will run libc ctors before most others, so the situation is
approximately a NOP for these as well.

The real cause for this change is statically linked applications doing
almost questionable things in their constructors. qemu-user-static, for
instance, creates a thread in a global constructor for their async rcu
callbacks. In general, this works in other places-

- On OpenBSD, __stack_chk_guard is stored in an .openbsd.randomdata section
  that's initialized by the kernel in the static case, or ld.so in the
  dynamic case
- On Linux, __stack_chk_guard is apparently stored in TLS and such a problem
  is circumvented there because the value is presumed stable in the new
  thread.

On FreeBSD, the rcu thread creation ctor and __guard_setup are both unmarked
priority. qemu-user-static spins up the rcu thread prior to __guard_setup
which starts making function calls- some of these are sprinkled with the
canary. In the middle of one of these functions, __guard_setup is invoked in
the main thread and __stack_chk_guard changes- qemu-user-static is promptly
terminated for an SSP violation that didn't actually happen.

This is not an all-too-common problem. We circumvent it here by giving the
__stack_chk_guard constructor a solid priority. 200 was chosen because that
gives static applications ample range (down to 101) for working around it
if they really need to. I suspect most applications will "just work" as
expected- the default/non-prioritized flavor of __constructor__ functions
run last, and the canary is generally not expected to change as of this
point at the very least.
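
The ordering this relies on can be seen in a small sketch. It assumes a GCC
or clang toolchain that supports constructor priorities; the function names
are invented for illustration and are not the libc ones. A constructor with
priority 200 runs before one declared without a priority.

```c
static int ctor_order[2];
static int ctor_count;

/*
 * Lower priority numbers run earlier; values up to 100 are reserved for
 * the implementation, which leaves 101-199 for applications that must
 * run before this one.
 */
__attribute__((constructor(200)))
static void
early_ctor_sketch(void)
{
	ctor_order[ctor_count++] = 200;
}

/* No priority given: runs after every prioritized constructor. */
__attribute__((constructor))
static void
default_ctor_sketch(void)
{
	ctor_order[ctor_count++] = 0;
}
```

By the time main() runs, both constructors have fired and the prioritized one
was recorded first. An application constructor that spawns a thread without a
priority would therefore start after the canary has been set up.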

This took approximately three weeks of spare time debugging to pin down.

PR: 241905

4 years agoFix a race between daopen and damediapoll
imp [Wed, 13 Nov 2019 01:58:43 +0000 (01:58 +0000)]
Fix a race between daopen and damediapoll

When we do a daopen, we call dareprobe and wait for the results. The reprobe runs
the da state machine up through the DA_STATE_RC* and then exits.

For removable media, we poll the device every 3 seconds with a TUR to see if it
has disappeared. This introduces a race. If the removable device has lots of
partitions, and if it's a little slow (like say a USB2 connected USB stick),
then we can have a fair amount of time that this reprobe is going on for. If,
during that time, damediapoll fires, it calls daschedule which changes the
scheduling priority from NONE to NORMAL. When that happens, the careful single
stepping in the da state machine is disrupted and we wind up scheduling multiple
read capacity calls. The first one succeeds and releases the reference. The
second one succeeds and releases the reference (and panics if the right code is
compiled into the da driver).

To avoid the race, only do the TUR calls while in state normal, otherwise just
reschedule damediapoll. This prevents the race from happening.
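
The shape of the fix can be sketched as a state check in the poll callback.
Everything below is a simplified model with hypothetical names (it is not the
da(4) driver code), and it assumes the poll is rescheduled in both cases.

```c
/* Hypothetical, reduced model of the driver state. */
enum da_state_sketch {
	ST_PROBING,	/* reprobe still stepping through its states */
	ST_NORMAL	/* probe finished, regular I/O allowed */
};

struct periph_sketch {
	enum da_state_sketch state;
	int tur_sent;		/* TEST UNIT READY requests issued */
	int polls_rescheduled;	/* times the poll timer was re-armed */
};

/* Periodic media poll: only touch the device when no probe is running. */
static void
mediapoll_sketch(struct periph_sketch *p)
{
	if (p->state == ST_NORMAL)
		p->tur_sent++;		/* safe to schedule the TUR */
	/* Assumption: the timer is re-armed regardless of the state. */
	p->polls_rescheduled++;
}
```

A poll that fires during ST_PROBING leaves the probe's single stepping alone
and simply tries again on the next tick.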

4 years agoCreate a file to hold shared routines for dealing with T6 key contexts.
jhb [Wed, 13 Nov 2019 00:53:45 +0000 (00:53 +0000)]
Create a file to hold shared routines for dealing with T6 key contexts.

ccr(4) and TLS support in cxgbe(4) construct key contexts used by the
crypto engine in the T6.  This consolidates some duplicated code for
helper functions used to build key contexts.

Reviewed by: np
MFC after: 1 month
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D22156

4 years agosesutil: fix another memory leak
asomers [Tue, 12 Nov 2019 23:57:57 +0000 (23:57 +0000)]
sesutil: fix another memory leak

Instead of calloc()ing (and forgetting to free) in a tight loop, just put
this small array on the stack.

Reported by: Coverity
Coverity CID: 1331665
MFC after: 2 weeks
Sponsored by: Axcient

4 years agosesutil: fix some memory leaks
asomers [Tue, 12 Nov 2019 23:09:55 +0000 (23:09 +0000)]
sesutil: fix some memory leaks

Reported by: Coverity
Coverity CID: 1331665
MFC after: 2 weeks
Sponsored by: Axcient

4 years agosesutil: fix an out-of-bounds array access
asomers [Tue, 12 Nov 2019 23:03:52 +0000 (23:03 +0000)]
sesutil: fix an out-of-bounds array access

sesutil would allow the user to toggle an LED that was one past the maximum
element.  If he tried, ENCIOC_GETELMSTAT would return EINVAL.
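
An off-by-one of this kind usually comes down to the comparison operator in
the bounds check. The sketch below is an assumption about the general shape
of such a bug, with hypothetical function names; it is not the sesutil code.

```c
#include <stdbool.h>

/* Buggy shape: accepts idx == nelm, which is one past the last element. */
static bool
index_ok_buggy(unsigned int idx, unsigned int nelm)
{
	return (idx <= nelm);
}

/* Fixed shape: valid indices are 0 .. nelm - 1. */
static bool
index_ok(unsigned int idx, unsigned int nelm)
{
	return (idx < nelm);
}
```

With 4 elements, index 4 passes the buggy check but is rejected by the fixed
one, so the bad request is caught before it reaches the ioctl.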

Reported by: Coverity
Coverity CID: 1398940
MFC after: 2 weeks
Sponsored by: Axcient

4 years agolibcompat: Correct rtld MLINKS
brooks [Tue, 12 Nov 2019 22:31:59 +0000 (22:31 +0000)]
libcompat: Correct rtld MLINKS

Don't install duplicate ld-elf.so.1.1 and ld.so.1 links in rtld-elf32.
Do install lib-elf32.so.1.1 and ldd32.1 links.

Reported by: madpilot

4 years agoSync target triple generation with the version in Makefile.inc1.
jhb [Tue, 12 Nov 2019 21:35:05 +0000 (21:35 +0000)]
Sync target triple generation with the version in Makefile.inc1.

Reviewed by: dim
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22333

4 years agoForce MK_CLANG_IS_CC on in XMAKE.
jhb [Tue, 12 Nov 2019 21:29:52 +0000 (21:29 +0000)]
Force MK_CLANG_IS_CC on in XMAKE.

This ensures that a bootstrap clang compiler is always installed as cc
in WORLDTMP.  If it is only installed as 'clang' then /usr/bin/cc is
used during the build instead of the bootstrap compiler.

Reviewed by: imp
MFC after: 1 month
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22332

4 years agoEnable the RISC-V LLVM backend by default.
jhb [Tue, 12 Nov 2019 21:26:50 +0000 (21:26 +0000)]
Enable the RISC-V LLVM backend by default.

Reviewed by: dim, mhorne, emaste
MFC after: 1 month
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22284

4 years agobhyve: rework mevent processing to fix a race condition
vmaffione [Tue, 12 Nov 2019 21:07:51 +0000 (21:07 +0000)]
bhyve: rework mevent processing to fix a race condition

At the end of both mevent_add() and mevent_update(), mevent_notify()
is called to wake up the I/O thread, which will call kevent(changelist)
to update the kernel.
A race condition is possible where the client calls mevent_add() and
mevent_update(EV_ENABLE) before the I/O thread has the chance to wake
up and call mevent_build()+kevent(changelist) in response to mevent_add().
The mevent_add() is therefore ignored by the I/O thread, and
kevent(fd, EV_ENABLE) is called before kevent(fd, EV_ADD), resulting
in a failure of the kevent(fd, EV_ENABLE) call.

PR: 241808
Reviewed by: jhb, markj
MFC with: r354288
Differential Revision: https://reviews.freebsd.org/D22286

4 years agoAdd new bit definitions for TSX, related to the TAA issue. The actual
scottl [Tue, 12 Nov 2019 19:15:16 +0000 (19:15 +0000)]
Add new bit definitions for TSX, related to the TAA issue.  The actual
mitigation will follow in a future commit.

Sponsored by: Intel

4 years agoWorkaround for Intel SKL002/SKL012S errata.
kib [Tue, 12 Nov 2019 18:01:33 +0000 (18:01 +0000)]
Workaround for Intel SKL002/SKL012S errata.

Disable the use of executable 2M page mappings in EPT-format page
tables on affected CPUs.  For bhyve virtual machines, this effectively
disables all use of superpage mappings on affected CPUs.  The
vm.pmap.allow_2m_x_ept sysctl can be set to override the default and
enable mappings on affected CPUs.

Alternate approaches have been suggested, but at present we do not
believe the complexity is warranted for typical bhyve use cases.

Reviewed by: alc, emaste, markj, scottl
Security: CVE-2018-12207
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D21884

4 years agonvdimm(4): Fix various problems when using the second label index block
scottph [Tue, 12 Nov 2019 16:24:37 +0000 (16:24 +0000)]
nvdimm(4): Fix various problems when using the second label index block

struct nvdimm_label_index is dynamically sized, with the `free`
bitfield expanding to hold `slot_cnt` entries. Fix a few places
where we were treating the struct as though it had a fixed size.
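
To illustrate why sizeof() alone is not enough for such a structure, here is
a sketch with a heavily reduced, hypothetical layout (not the real struct
nvdimm_label_index, and without any padding rules the on-media format may
impose). The size has to be computed from slot_cnt.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced layout: a count followed by a variable bitmap. */
struct label_index_sketch {
	uint32_t slot_cnt;	/* number of label slots */
	uint8_t free[];		/* one bit per slot */
};

/* Bytes needed for an index block describing slot_cnt slots. */
static size_t
label_index_size(uint32_t slot_cnt)
{
	size_t bitmap_bytes = ((size_t)slot_cnt + 7) / 8;

	return (offsetof(struct label_index_sketch, free) + bitmap_bytes);
}
```

Code that locates the second index block at a fixed sizeof() offset from the
first would land inside the first block's bitmap whenever slot_cnt is
non-zero; using the computed size avoids that.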

Reviewed by: cem
Approved by: scottl (mentor)
MFC after: 1 week
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D22253

4 years agoi386: stop guessing the address of the trap frame in ddb backtrace.
kib [Tue, 12 Nov 2019 15:56:27 +0000 (15:56 +0000)]
i386: stop guessing the address of the trap frame in ddb backtrace.

Save the address of the trap frame in %ebp on kernel entry.  This
automatically provides it in struct i386_frame.f_frame to unwinder.

While there, more accurately handle the terminating frames.

Reviewed by: avg, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D22321

4 years agoamd64: move GDT into PCPU area.
kib [Tue, 12 Nov 2019 15:51:47 +0000 (15:51 +0000)]
amd64: move GDT into PCPU area.

Reviewed by: jhb, markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D22302

4 years agonvdimm(4): Only expose namespaces for accessible data SPAs
scottph [Tue, 12 Nov 2019 15:50:30 +0000 (15:50 +0000)]
nvdimm(4): Only expose namespaces for accessible data SPAs

Apply the same user accessible filter to namespaces as is applied
to full-SPA devices. Also, explicitly filter out control region
SPAs which don't expose the nvdimm data area.

Reviewed by: cem
Approved by: scottl (mentor)
MFC after: 1 week
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D21987

4 years agoamd64: assert that size of the software prototype table for gdt is equal
kib [Tue, 12 Nov 2019 15:47:46 +0000 (15:47 +0000)]
amd64: assert that size of the software prototype table for gdt is equal
to the size of hardware gdt.

Reviewed by: jhb, markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D22302

4 years agonetinet*: update *mp to pass the proper value back
bz [Tue, 12 Nov 2019 15:46:28 +0000 (15:46 +0000)]
netinet*: update *mp to pass the proper value back

In ip6_[direct_]input() we are looping over the extension headers
to deal with the next header.  We pass a pointer to an mbuf pointer
to the handling functions.  In certain cases the mbuf can be updated
there and we need to pass the new one back.  That was missing in
dest6_input() and route6_input().  In tcp6_input() we should also
update it before we call tcp_input().

In addition to that, mark the mbuf NULL every time we return
that we are done with handling the packet and no next header should
be checked (IPPROTO_DONE).  This will eventually allow us to assert
proper behaviour and catch the above kind of errors more easily,
expecting *mp to always be set.
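
The contract described above can be sketched with stand-in types. struct
buf_sketch, the handler names, and the PROTO_* constants below are all
hypothetical and much simpler than struct mbuf and the real input routines;
the point is only how *mp is written back on each path.

```c
#include <stddef.h>

#define PROTO_TCP_SKETCH	6	/* "continue with this next header" */
#define PROTO_DONE_SKETCH	257	/* stand-in for IPPROTO_DONE */

struct buf_sketch {
	int id;		/* stand-in for struct mbuf */
};

static struct buf_sketch replacement_buf = { 2 };

/* Handler that had to swap the buffer and wants processing to continue. */
static int
handler_continue(struct buf_sketch **bp, int *offp)
{
	(void)offp;
	*bp = &replacement_buf;	/* pass the new buffer back to the caller */
	return (PROTO_TCP_SKETCH);
}

/* Handler that consumed the packet: no next header should be looked at. */
static int
handler_done(struct buf_sketch **bp, int *offp)
{
	(void)offp;
	*bp = NULL;		/* nothing left for the caller to use */
	return (PROTO_DONE_SKETCH);
}
```

With this convention the header loop can later assert that *mp is non-NULL
exactly when the return value is not the "done" marker.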

This change is extracted from a larger patch and not an exhaustive
change across the entire stack yet.

PR: 240135
Reported by: prabhakar.lakhera gmail.com
MFC after: 3 weeks
Sponsored by: Netflix

4 years agonetstat: igmp stats, error on unexpected information, not only warn
bz [Tue, 12 Nov 2019 13:57:17 +0000 (13:57 +0000)]
netstat: igmp stats, error on unexpected information, not only warn

The igmp stats tend to print two lines of warning for an unexpected
version and length.  Despite an invalid version and struct size it
continues to try to do something with the data.  Do not try to parse
the remainder of the struct; return an error instead of only warning.

Note the underlying issue of the data not being available properly
is still there and needs to be fixed separately.

Reported by: test cases, lwhsu
MFC after: 3 weeks

4 years agoteach db_nextframe/x86 about [X]xen_intr_upcall interrupt handler
avg [Tue, 12 Nov 2019 11:00:01 +0000 (11:00 +0000)]
teach db_nextframe/x86 about [X]xen_intr_upcall interrupt handler

Discussed with: kib, royger
MFC after: 3 weeks
Sponsored by: Panzura

4 years agoxen: fix dispatching of NMIs
royger [Tue, 12 Nov 2019 10:31:28 +0000 (10:31 +0000)]
xen: fix dispatching of NMIs

Currently NMIs are sent over event channels, but that defeats the
purpose of NMIs since event channels can be masked. Fix this by
issuing NMIs using a hypercall, which injects a NMI (vector #2) to the
desired vCPU.

Note that NMIs could also be triggered using the emulated local APIC,
but using a hypercall is better from a performance point of view
since it doesn't involve instruction decoding when not using x2APIC
mode.

Reported and Tested by: avg
Sponsored by: Citrix Systems R&D

4 years agoreverting r354594
tsoome [Tue, 12 Nov 2019 10:02:39 +0000 (10:02 +0000)]
reverting r354594

In our case the structure is more complex and a simple static initializer
will upset compiler diagnostics; using memset is still better than building
a more complex initializer.