FreeBSD/FreeBSD.git
4 years agoConvert to if_foreach_llmaddr() KPI.
Gleb Smirnoff [Mon, 21 Oct 2019 18:06:31 +0000 (18:06 +0000)]
Convert to if_foreach_llmaddr() KPI.

4 years agoConvert to if_foreach_llmaddr() KPI.
Gleb Smirnoff [Mon, 21 Oct 2019 18:06:26 +0000 (18:06 +0000)]
Convert to if_foreach_llmaddr() KPI.

4 years agoConvert to if_foreach_llmaddr() KPI.
Gleb Smirnoff [Mon, 21 Oct 2019 18:06:23 +0000 (18:06 +0000)]
Convert to if_foreach_llmaddr() KPI.

4 years agoConvert to if_foreach_llmaddr() KPI.
Gleb Smirnoff [Mon, 21 Oct 2019 18:06:19 +0000 (18:06 +0000)]
Convert to if_foreach_llmaddr() KPI.

4 years agoConvert to if_foreach_llmaddr() KPI.
Gleb Smirnoff [Mon, 21 Oct 2019 18:06:15 +0000 (18:06 +0000)]
Convert to if_foreach_llmaddr() KPI.

4 years agoConvert to if_foreach_llmaddr() KPI.
Gleb Smirnoff [Mon, 21 Oct 2019 18:06:09 +0000 (18:06 +0000)]
Convert to if_foreach_llmaddr() KPI.

4 years agoConvert to if_foreach_llmaddr() KPI.
Gleb Smirnoff [Mon, 21 Oct 2019 18:06:06 +0000 (18:06 +0000)]
Convert to if_foreach_llmaddr() KPI.

4 years agoConvert to if_foreach_llmaddr() KPI.
Gleb Smirnoff [Mon, 21 Oct 2019 18:05:43 +0000 (18:05 +0000)]
Convert to if_foreach_llmaddr() KPI.

This driver seems to have a bug, which was carefully preserved during the
conversion.  In al_eth_mac_table_unicast_add() the argument 'addr', which
is the actual address, is unused.  So the function is called as many times
as we have addresses, but with exactly the same argument list.  This
doesn't make any sense, but was preserved.
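
The pattern described above can be sketched in userspace.  This is only an
illustration under stated assumptions: every sk_* name, the table layout and
the iterator are hypothetical stand-ins, not the al_eth driver code and not
the kernel's if_foreach_llmaddr() itself.  It shows a per-address callback
that ignores its address argument next to one that uses it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_ADDR_LEN	6	/* length of an Ethernet address */
#define SK_SLOTS	4	/* hypothetical filter table size */

/* Hypothetical stand-in for a hardware unicast filter table. */
struct sk_table {
	uint8_t	primary[SK_ADDR_LEN];	/* the interface's own address */
	uint8_t	slot[SK_SLOTS][SK_ADDR_LEN];
};

/* Per-address callback; returns 1 if the address was consumed. */
typedef unsigned int (*sk_cb)(void *arg, const uint8_t *addr,
    unsigned int idx);

/*
 * Hypothetical iterator: calls cb once per address and passes the running
 * count of consumed addresses as idx.  Returns the final count.
 */
unsigned int
sk_foreach(const uint8_t (*addrs)[SK_ADDR_LEN], unsigned int n, sk_cb cb,
    void *arg)
{
	unsigned int i, count = 0;

	assert(cb != NULL);
	for (i = 0; i < n; i++)
		count += cb(arg, addrs[i], count);
	return (count);
}

/* Shape of the suspected bug: 'addr' is ignored, slot 0 is rewritten. */
unsigned int
sk_add_ignoring_addr(void *arg, const uint8_t *addr, unsigned int idx)
{
	struct sk_table *t = arg;

	(void)addr;
	(void)idx;
	memcpy(t->slot[0], t->primary, SK_ADDR_LEN);
	return (1);
}

/* What one would expect: each address lands in its own slot. */
unsigned int
sk_add_using_addr(void *arg, const uint8_t *addr, unsigned int idx)
{
	struct sk_table *t = arg;

	if (idx >= SK_SLOTS)
		return (0);
	memcpy(t->slot[idx], addr, SK_ADDR_LEN);
	return (1);
}
```

With two addresses, both callbacks are invoked twice, but only
sk_add_using_addr() leaves the second address in the table.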

4 years agoConvert to if_foreach_llmaddr() KPI.
Gleb Smirnoff [Mon, 21 Oct 2019 18:00:17 +0000 (18:00 +0000)]
Convert to if_foreach_llmaddr() KPI.

4 years agoConvert to if_foreach_llmaddr() KPI.
Gleb Smirnoff [Mon, 21 Oct 2019 17:59:53 +0000 (17:59 +0000)]
Convert to if_foreach_llmaddr() KPI.

4 years agoConvert to if_foreach_llmaddr() KPI.
Gleb Smirnoff [Mon, 21 Oct 2019 17:59:16 +0000 (17:59 +0000)]
Convert to if_foreach_llmaddr() KPI.

4 years agoConvert to if_foreach_llmaddr() KPI.
Gleb Smirnoff [Mon, 21 Oct 2019 17:59:02 +0000 (17:59 +0000)]
Convert to if_foreach_llmaddr() KPI.

4 years agotuntap(4): restrict scope of net.link.tap.user_open slightly
Kyle Evans [Mon, 21 Oct 2019 14:38:11 +0000 (14:38 +0000)]
tuntap(4): restrict scope of net.link.tap.user_open slightly

net.link.tap.user_open has historically allowed non-root users to do devfs
cloning and open /dev/tap* nodes based on permissions. Loosen this up so
that it only controls whether users may do devfs cloning -- we no longer
check it in tunopen.

This allows tap devices to be created that can actually be opened by a user,
rather than swiftly restricting them to root because the magic sysctl has
not been set.

The sysctl has not yet been completely deprecated, because more thought is
needed for how to handle the devfs cloning case. There is not an easy
suitable replacement for the sysctl there, and more care needs to be taken
in determining whether that's OK or not.

PR: 200185

4 years agodebug.kassert.warnings is a statistic, not a tunable
Andriy Gapon [Mon, 21 Oct 2019 12:21:56 +0000 (12:21 +0000)]
debug.kassert.warnings is a statistic, not a tunable

MFC after: 1 week

4 years ago[PPC64] Add minidump support to PowerNV
Leandro Lupori [Mon, 21 Oct 2019 11:56:57 +0000 (11:56 +0000)]
[PPC64] Add minidump support to PowerNV

Implementation of PowerNV specific minidump code.

Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D21643

4 years agofrag6: import a set of test cases
Bjoern A. Zeeb [Mon, 21 Oct 2019 09:33:45 +0000 (09:33 +0000)]
frag6: import a set of test cases

In order to ensure that changing the frag6 code does not change behaviour
or break code, a set of test cases was implemented.

Like some other test cases these use Scapy to generate packets and possibly
wait for expected answers.  In most cases we do check the global and
per interface (netstat) statistics output using the libxo output and grep
to validate fields and numbers.  This is a bit hackish but we currently have
no better way to match a selected number of stats only (we have to ignore
some of the ND6 variables; otherwise we could use the entire list).

Test cases include atomic fragments, single fragments, multi-fragments,
and try to cover most error cases in the code currently.
In addition vnet teardown is tested to not panic.

A separate set (not in-tree currently) of probes was used in order to
make sure that the test cases actually test what they should.

The "sniffer" code was copied and adjusted from the netpfil version
as we sometimes will not get packets or have longer timeouts to deal with.

Sponsored by: Netflix

4 years agofrag6: fix vnet teardown leak
Bjoern A. Zeeb [Mon, 21 Oct 2019 08:48:47 +0000 (08:48 +0000)]
frag6: fix vnet teardown leak

When shutting down a VNET we did not clean up the fragmentation hashes.
This has multiple problems: (1) it leaks memory and (2) it leaks on the
global counters.  On a system starting and stopping a lot of vnets and
dealing with a lot of IPv6 fragments, the latter might eventually exhaust
the counters/limits so that processing would no longer take place.

Unfortunately we do not have a useable variable to indicate when
per-VNET initialization of frag6 has happened (or when destroy happened)
so introduce a boolean to flag this. This is needed here as well as
it was in r353635 for ip_reass.c in order to avoid tripping over the
already destroyed locks if interfaces go away after the frag6 destroy.

While splitting things up convert the TRY_LOCK to a LOCK operation in
what is now frag6_drain_one().  The try-lock was derived from a manual
hand-rolled implementation and carried forward all the time.  We can no
longer afford not to get the lock as that would mean we would continue to
leak memory.

Assert that all the buckets are empty before destroying the lock to
ensure long-term stability of a clean shutdown.

Reported by: hselasky
Reviewed by: hselasky
MFC after: 3 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D22054

4 years agofrag6: add read-only sysctl for nfrags.
Bjoern A. Zeeb [Mon, 21 Oct 2019 08:36:15 +0000 (08:36 +0000)]
frag6: add read-only sysctl for nfrags.

Add a read-only sysctl exporting the global number of fragments
(base system and all vnets).  This is helpful to (a) know how many
fragments are currently being processed, (b) if there are possible
leaks, (c) if vnet teardown is not working correctly, and lastly
(d) it can be used as part of test suites to ensure (a) to (c).

MFC after: 3 weeks
Sponsored by: Netflix

4 years agotools/tools/locale: allow POSIX target to be built in parallel
Yuri Pankov [Mon, 21 Oct 2019 03:01:05 +0000 (03:01 +0000)]
tools/tools/locale: allow POSIX target to be built in parallel

While it's a rarely used target, and moreover one not used during
buildworld, it helps when it's not taking hours (literally).

4 years agopicobsd: add deprecation notices
Kyle Evans [Mon, 21 Oct 2019 00:52:21 +0000 (00:52 +0000)]
picobsd: add deprecation notices

Notices appear both in picobsd(8) (near the top for easy notice) and are
also printed to stderr on every invocation of picobsd for visibility.

The tentative date for removal is October 31st, as no volunteers have
stepped forward at all from postings to -arch@ at least.

No objection from: -arch@
MFC after: 3 days

4 years agotuntap(4): use cdevpriv w/ dtor for last close instead of d_close
Kyle Evans [Sun, 20 Oct 2019 22:55:47 +0000 (22:55 +0000)]
tuntap(4): use cdevpriv w/ dtor for last close instead of d_close

cdevpriv dtors will be called when the reference count on the associated
struct file drops to 0, while d_close can be unreliable for cleaning up
state at "last close" for a number of reasons. As far as tunclose/tundtor is
concerned the difference is minimal, so make the switch.

4 years agotuntap(4): Use make_dev_s to avoid si_drv1 race
Kyle Evans [Sun, 20 Oct 2019 22:39:40 +0000 (22:39 +0000)]
tuntap(4): Use make_dev_s to avoid si_drv1 race

This allows us to avoid some dance in tunopen for dealing with the
possibility of dev->si_drv1 being NULL as it's set prior to the devfs node
being created in all cases.

There's still the possibility that the tun device hasn't been fully
initialized, since that's done after the devfs node was created. Alleviate
this by returning ENXIO if we're not to that point of tuncreate yet.

This work is what sparked r353128, full initialization of cloned devices
w/ specified make_dev_args.

4 years agotuntap(4): break out after setting TUN_DSTADDR
Kyle Evans [Sun, 20 Oct 2019 21:06:25 +0000 (21:06 +0000)]
tuntap(4): break out after setting TUN_DSTADDR

This is now the only flag we set in this loop, terminate early.

4 years agotuntap(4): Drop TUN_IASET
Kyle Evans [Sun, 20 Oct 2019 21:03:48 +0000 (21:03 +0000)]
tuntap(4): Drop TUN_IASET

This flag appears to have been effectively unused since introduction to
if_tun(4) -- drop it now.

4 years agoAdd a manpage for ng_pipe(4).
Christian Brueffer [Sun, 20 Oct 2019 20:57:57 +0000 (20:57 +0000)]
Add a manpage for ng_pipe(4).

Submitted by: Lutz Donnerhacke <lutz_donnerhacke.de>
Reviewed by: bcr (previous version)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22067

4 years agoFix option names in the Examples section of the manual page
Alan Somers [Sun, 20 Oct 2019 20:29:17 +0000 (20:29 +0000)]
Fix option names in the Examples section of the manual page

This corrects an oversight from r351423.

Submitted by: Ján Sučan <sucanjan@gmail.com>
MFC after: Never
Differential Revision: https://reviews.freebsd.org/D22093

4 years ago- In em_intr(), just call em_handle_link() instead of duplicating it.
Marius Strobl [Sun, 20 Oct 2019 17:40:50 +0000 (17:40 +0000)]
- In em_intr(), just call em_handle_link() instead of duplicating it.
- In em_msix_link(), properly handle IGB-class devices after the iflib(4)
  conversion again by only setting EM_MSIX_LINK for the EM-class 82574
  and by re-arming link interrupts unconditionally, i. e. not only in
  case of spurious interrupts. This fixes the interface link state change
  detection for the IGB-class. [1]
- In em_if_update_admin_status(), only re-arm the link state change
  interrupt for 82574 and also only if such a device uses MSI-X, i. e.
  takes advantage of autoclearing. In case of INTx and MSI as well as
  for LEM- and IGB-class devices, re-arming isn't appropriate here and
  setting EM_MSIX_LINK isn't either.
  While at it, consistently take advantage of the hw variable.

PR: 236724 [1]
Differential Revision: https://reviews.freebsd.org/D21924

4 years agopowerpc/booke: Don't zero MAS8, it's unnecessary
Justin Hibbits [Sun, 20 Oct 2019 15:50:33 +0000 (15:50 +0000)]
powerpc/booke: Don't zero MAS8, it's unnecessary

MAS8 is hypervisor privileged, defining the logical partition (VM) to
operate on for TLB accesses.  It's already guaranteed to be cleared when
booting bare metal (bootloader needs it zeroed to work), and we can't touch
it from a guest.  Assume that if/when we eventually port bhyve to PowerPC
(and Book-E) the hypervisor module will take care of managing MAS8.  This
saves several (tens of) clocks on each TLB miss.

MFC after: 2 weeks

4 years agonetmap: minor misc improvements
Vincenzo Maffione [Sun, 20 Oct 2019 14:15:45 +0000 (14:15 +0000)]
netmap: minor misc improvements

 - use ring->head rather than ring->cur in lb(8)
 - use strlcat() rather than strncat()
 - fix bandwidth computation in pkt-gen(8)

MFC after: 1 week

4 years agoAdd driver for DesignWare PCIE core, and its Armada 8K specific attachment.
Michal Meloun [Sun, 20 Oct 2019 11:11:32 +0000 (11:11 +0000)]
Add driver for DesignWare PCIE core, and its Armada 8K specific attachment.

MFC after: 3 weeks

4 years agoUpdate Armada 8k drivers to cover newly imported DT and latest changes
Michal Meloun [Sun, 20 Oct 2019 10:48:27 +0000 (10:48 +0000)]
Update Armada 8k drivers to cover newly imported DT and latest changes
in simple multifunction driver.
- Follow interrupt changes in DT. Split the old ICU driver into
  function-oriented parts and add drivers for newly defined parts (system
  error interrupts).
- Many drivers are children of the simple multifunction driver. But after
  r349596 the simple MF driver no longer exports memory resources, and all
  children must use the syscon interface to access their registers. Adapt
  affected drivers to this fact.

MFC after: 3 weeks

4 years agoFix spelling of DPSRCS.
Bryan Drewery [Sat, 19 Oct 2019 21:44:33 +0000 (21:44 +0000)]
Fix spelling of DPSRCS.

Submitted by: vangyzen
Sponsored by: DellEMC
MFC after: 2 weeks

4 years agoFix compile issues when building a kernel without the VIMAGE option.
Michael Tuexen [Sat, 19 Oct 2019 20:48:53 +0000 (20:48 +0000)]
Fix compile issues when building a kernel without the VIMAGE option.
Thanks to cem@ for discussing the issue which resulted in this patch.

Reviewed by: cem@
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D22089

4 years agoAdd the fstat -s option to display socket information.
Jeremie Le Hen [Sat, 19 Oct 2019 19:52:19 +0000 (19:52 +0000)]
Add the fstat -s option to display socket information.

Reviewed by: jilles
MFC after: 1 week
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D21880

4 years agoRemove IS_INADDR_ANY().
Jeremie Le Hen [Sat, 19 Oct 2019 19:38:53 +0000 (19:38 +0000)]
Remove IS_INADDR_ANY().

Requested by rgrimes@ in
https://lists.freebsd.org/pipermail/svn-src-head/2019-October/129784.html

4 years agohw.intrbalance: Make sysctl tunable
Conrad Meyer [Sat, 19 Oct 2019 16:37:49 +0000 (16:37 +0000)]
hw.intrbalance: Make sysctl tunable

This allows specifying a boot-time preference in loader.conf.

4 years agopowerpc/booke pmap: Fix printf format type warnings
Justin Hibbits [Sat, 19 Oct 2019 16:09:06 +0000 (16:09 +0000)]
powerpc/booke pmap: Fix printf format type warnings

4 years agoMerge ACPICA 20191018.
Jung-uk Kim [Sat, 19 Oct 2019 14:56:44 +0000 (14:56 +0000)]
Merge ACPICA 20191018.

4 years agoloader: zfs_fmtdev can crash when pool discovery did fail and we have no spa
Toomas Soome [Sat, 19 Oct 2019 08:08:06 +0000 (08:08 +0000)]
loader: zfs_fmtdev can crash when pool discovery did fail and we have no spa

When the zfs probe fails and no spa is created, but zfs_fmtdev() is still
called, we crash while dereferencing spa (NULL pointer dereference).

MFC after: 1 week

4 years agobuildkernel: always add standard kernel configuration include path
Andriy Gapon [Sat, 19 Oct 2019 07:16:20 +0000 (07:16 +0000)]
buildkernel: always add standard kernel configuration include path

This should change nothing for kernel configurations at the standard
locations in the source tree.  However, if KERNCONFDIR is used to
specify a custom location for a kernel configuration file (e.g., out of
tree), then both the custom location and the standard location, in this
order, will be used as include paths for config(8).  This will allow the
kernel configuration to include files from both locations.

Reviewed by: bdrewery
MFC after: 16 days
Differential Revision: https://reviews.freebsd.org/D22057

4 years agoremove wmb() call from x86 cpu_reset()
Andriy Gapon [Sat, 19 Oct 2019 07:13:15 +0000 (07:13 +0000)]
remove wmb() call from x86 cpu_reset()

The rationale is pretty much the same as in r353747.
There is no subsequent dependent store.
The store is to the regular (TSO) memory anyway.

MFC after: 23 days

4 years agovmm: remove a wmb() call
Andriy Gapon [Sat, 19 Oct 2019 07:10:15 +0000 (07:10 +0000)]
vmm: remove a wmb() call

After removing wmb(), vm_set_rendezvous_func() became super trivial, so
there was no point in keeping it.

The wmb (sfence on amd64, lock nop on i386) was not needed.  This can be
explained from several points of view.

First, wmb() is used for store-store ordering (although the primitive
is undocumented).  There was no obvious subsequent store that needed the
barrier.

Second, x86 has a memory model with strong ordering including total
store order.  An explicit store barrier may be needed only when working
with special memory (device, special caching mode) or using special
instructions (non-temporal stores).  That was not the case for this
code.

Third, I believe that there is a misconception that sfence "flushes" the
store buffer in the sense that it speeds up the propagation of stores
from the store buffer to global visibility.  I think that such
propagation always happens as fast as possible.  sfence only makes
subsequent stores wait for that propagation to complete.  So, sfence is
only useful for ordering of stores and only in the situations described
above.

Reviewed by: jhb
MFC after: 23 days
Differential Revision: https://reviews.freebsd.org/D21978

4 years agopowerpc/aim: Fix comment typo
Justin Hibbits [Sat, 19 Oct 2019 02:47:32 +0000 (02:47 +0000)]
powerpc/aim: Fix comment typo

4 years agopowerpc/mpc85xx: Replace global PCI config mutex with per-controller mutex
Justin Hibbits [Sat, 19 Oct 2019 01:07:35 +0000 (01:07 +0000)]
powerpc/mpc85xx: Replace global PCI config mutex with per-controller mutex

PCI controllers need to enforce exclusive config register access on their
own bus, not between all buses.

4 years agoDo not remove /usr/share/mk/bsd.compat.mk. It was reintroduced by r353659.
Jung-uk Kim [Fri, 18 Oct 2019 22:08:04 +0000 (22:08 +0000)]
Do not remove /usr/share/mk/bsd.compat.mk.  It was reintroduced by r353659.

4 years agoFix debugnet(4) link/build fallout on some configurations
Conrad Meyer [Fri, 18 Oct 2019 22:03:36 +0000 (22:03 +0000)]
Fix debugnet(4) link/build fallout on some configurations

Introduced in r353685 (sys/conf/files), r353694 (debugnet.c db_printf).

Submitted by: kevans
Reported by: cy
X-MFC-With: r353685, r353694

4 years agotap: add support for virtio-net offloads
Vincenzo Maffione [Fri, 18 Oct 2019 21:53:27 +0000 (21:53 +0000)]
tap: add support for virtio-net offloads

This patch is part of an effort to make bhyve networking (in particular TCP)
faster. The key strategy to enhance TCP throughput is to let the whole packet
datapath work with TSO/LRO packets (up to 64KB each), so that the per-packet
overhead is amortized over a large number of bytes.
This capability is supported in the guest by means of the vtnet(4) driver,
which is able to handle TSO/LRO packets leveraging the virtio-net header
(see struct virtio_net_hdr and struct virtio_net_hdr_mrg_rxbuf).
A bhyve VM exchanges packets with the host through a network backend,
which can be vale(4) or if_tap(4).
While vale(4) supports TSO/LRO packets, if_tap(4) does not.
This patch extends if_tap(4) with the ability to understand the virtio-net
header, so that a tapX interface can process TSO/LRO packets.
A couple of ioctl commands have been added to configure and probe the
virtio-net header. Once the virtio-net header is set, the tapX interface
acquires all the IFCAP capabilities necessary for TSO/LRO.

Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D21263

4 years agonvdimm(4): Persist unit numbers in cdev
Conrad Meyer [Fri, 18 Oct 2019 21:32:45 +0000 (21:32 +0000)]
nvdimm(4): Persist unit numbers in cdev

They're formatted into the device name like unit numbers, anyway; store the
number in mda_unit => si_drv0 like dev2unit() expects.

No functional change intended.

Sponsored by: Dell EMC Isilon

4 years agoPull in r374154 from upstream clang trunk (by Simon Atanasyan):
Dimitry Andric [Fri, 18 Oct 2019 20:05:27 +0000 (20:05 +0000)]
Pull in r374154 from upstream clang trunk (by Simon Atanasyan):

  [mips] Set default float ABI to "soft" on FreeBSD

  Initial patch by Kyle Evans.

  Fix PR43596

Requested by: kevans
MFC after: 1 month
X-MFC-With: r353358

4 years agoPull in r372651 from upstream lld trunk (by Simon Atanasyan):
Dimitry Andric [Fri, 18 Oct 2019 20:02:46 +0000 (20:02 +0000)]
Pull in r372651 from upstream lld trunk (by Simon Atanasyan):

  [mips] Support elf32btsmipn32_fbsd / elf32ltsmipn32_fbsd emulations

  Patch by Kyle Evans.

Requested by: kevans
MFC after: 1 month
X-MFC-With: r353358

4 years agoProvide a src.conf(5) description for the new WITHOUT_CAROOT option, and
Dimitry Andric [Fri, 18 Oct 2019 19:30:12 +0000 (19:30 +0000)]
Provide a src.conf(5) description for the new WITHOUT_CAROOT option, and
rename the WITH_LOADER_VERIEXEC_PASS_MANFIEST description to its correct
name.  Also correct a bunch of spelling errors in that description.

MFC after: 3 days

4 years agoFurther constrain the use of per-CPU caches for free pages.
Mark Johnston [Fri, 18 Oct 2019 17:36:42 +0000 (17:36 +0000)]
Further constrain the use of per-CPU caches for free pages.

In low memory conditions a significant number of pages may end up stuck
in the caches, and currently these caches cannot be reaped, leading to
spurious memory allocation failures and OOM kills.  So:

- Take into account the fact that we may cache up to two full buckets
  of pages per CPU, not just one.
- Increase the amount of RAM required per CPU to enable the caches.

This is a temporary measure until the page cache management policy is
improved.

PR: 241048
Reported and tested by: Kevin Oberman <rkoberman@gmail.com>
Reviewed by: alc, kib
Discussed with: jeff
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22040

4 years agoAbbreviate softdep lock names.
Mark Johnston [Fri, 18 Oct 2019 17:01:27 +0000 (17:01 +0000)]
Abbreviate softdep lock names.

The softdep lock names were unusually long and tended to stick out in
lock profiling reports.  Abbreviate them and make them consistent with
our conventional style for lock names.

Reviewed by: mckusick
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22042

4 years agoMake rt_getifa_fib() static.
Gleb Smirnoff [Fri, 18 Oct 2019 15:20:24 +0000 (15:20 +0000)]
Make rt_getifa_fib() static.

4 years agoTighten mapping protections on preloaded files on amd64.
Mark Johnston [Fri, 18 Oct 2019 14:05:13 +0000 (14:05 +0000)]
Tighten mapping protections on preloaded files on amd64.

- We load the kernel at 0x200000.  Memory below that address need not
  be executable, so do not map it as such.
- Remove references to .ldata and related sections in the kernel linker
  script.  They come from ld.bfd's default linker script, but are not
  used, and we now use ld.lld to link the amd64 kernel.  lld does not
  contain a default linker script.
- Pad the .bss to a 2MB boundary, as we do between .text and .data.
  This forces the loader to load additional files starting in the
  following 2MB page, preserving the use of superpage mappings for kernel
  data.
- Map memory above the kernel image with NX.  The kernel linker now
  upgrades protections as needed, and other preloaded file types
  (e.g., entropy, microcode) need not be mapped with execute permissions
  in the first place.

Reviewed by: kib
MFC after: 1 month
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21859

4 years agoApply mapping protections to preloaded kernel modules on amd64.
Mark Johnston [Fri, 18 Oct 2019 13:56:45 +0000 (13:56 +0000)]
Apply mapping protections to preloaded kernel modules on amd64.

With an upcoming change the amd64 kernel will map preloaded files RW
instead of RWX, so the kernel linker must adjust protections
appropriately using pmap_change_prot().

Reviewed by: kib
MFC after: 1 month
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21860

4 years agoApply mapping protections to .o kernel modules.
Mark Johnston [Fri, 18 Oct 2019 13:53:14 +0000 (13:53 +0000)]
Apply mapping protections to .o kernel modules.

Use the section flags to derive mapping protections.  When multiple
sections overlap within a page, the union of their protections must be
applied.  With r353701 the .text and .rodata sections are padded to
ensure that this does not happen on amd64.
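
The union rule can be sketched as follows.  This is a simplified userspace
model, not the kernel linker code: struct sk_section, the SK_* constants and
sk_page_prot() are hypothetical, and a 4KB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE_SIZE	4096u
#define SK_PROT_R	0x1
#define SK_PROT_W	0x2
#define SK_PROT_X	0x4

/* Hypothetical description of one loaded section. */
struct sk_section {
	uint64_t	addr;	/* start address */
	uint64_t	size;	/* length in bytes */
	int		prot;	/* SK_PROT_* bits derived from section flags */
};

/* Return the union of protections of all sections touching 'page'. */
int
sk_page_prot(const struct sk_section *s, size_t n, uint64_t page)
{
	int prot = 0;
	size_t i;

	assert(page % SK_PAGE_SIZE == 0);
	for (i = 0; i < n; i++) {
		if (s[i].size != 0 && s[i].addr < page + SK_PAGE_SIZE &&
		    s[i].addr + s[i].size > page)
			prot |= s[i].prot;
	}
	return (prot);
}
```

If a read-execute section ends mid-page and a read-write section starts in
the same page, that page ends up RWX.  Padding the first section to a page
boundary keeps the two apart, so each page gets only the protections it
needs.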

Reviewed by: kib
MFC after: 1 month
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21896

4 years agogpioiic: add the detach method
Andriy Gapon [Fri, 18 Oct 2019 12:34:30 +0000 (12:34 +0000)]
gpioiic: add the detach method

bus_generic_detach was not enough, we also need to clean up the iicbus
child device.

MFC after: 1 week

4 years agoddb: use 'textdump dump' instead of 'call doadump'
Andriy Gapon [Fri, 18 Oct 2019 12:32:01 +0000 (12:32 +0000)]
ddb: use 'textdump dump' instead of 'call doadump'

The change is for the example in textdump.4 and the default ddb.conf.

First of all, doadump now requires an argument and it won't do a
textdump if the argument is not 'true'.
And 'textdump dump' is more idiomatic anyway.

For what it's worth, ddb 'dump' command seems to always request a vmcore
dump even if a textdump was requested earlier, e.g., by 'textdump set'.
Finally, ddb 'call' command is not documented.

MFC after: 2 weeks

4 years agolinux: futex_mtx should follow futex_list
Yuri Pankov [Fri, 18 Oct 2019 12:25:33 +0000 (12:25 +0000)]
linux: futex_mtx should follow futex_list

Move futex_mtx to linux_common.ko for amd64 and aarch64 along
with respective list/mutex init/destroy.

PR: 240989
Reported by: Alex S <iwtcex@gmail.com>

4 years agolinux: provide just one instance of futex_list
Yuri Pankov [Fri, 18 Oct 2019 10:28:08 +0000 (10:28 +0000)]
linux: provide just one instance of futex_list

Move futex_list definition to linux.c which is included once
in linux.ko (i386) and in linux_common.ko (amd64 and aarch64)
allowing 32/64 bit linux programs to access the same futexes
in the latter case.

PR: 240989
Reviewed by: dchagin
Differential Revision: https://reviews.freebsd.org/D22073

4 years agoImprove the way we calculate variance to reduce the rounding errors
Poul-Henning Kamp [Fri, 18 Oct 2019 07:55:01 +0000 (07:55 +0000)]
Improve the way we calculate variance to reduce the rounding errors
when variance is small relative to data points.

Now [0, 1, 2] shows the same standard deviation as [10000000000000, ...1, ...2]

Also:  Various nitpickery from my own tree.
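
To illustrate the rounding problem with variance, here is a sketch comparing
the textbook sum-of-squares formula with Welford's running update.  This is
one well-known way to avoid the cancellation; it is an assumption that it
resembles the committed change, which is not shown here.  Both function
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Textbook formula: subtracts two huge, nearly equal sums, so it loses
 * precision when the variance is small relative to the data points.
 */
double
sk_variance_naive(const double *x, size_t n)
{
	double sum = 0.0, sumsq = 0.0;
	size_t i;

	assert(x != NULL && n > 1);
	for (i = 0; i < n; i++) {
		sum += x[i];
		sumsq += x[i] * x[i];
	}
	return ((sumsq - sum * sum / (double)n) / (double)(n - 1));
}

/* Welford's update: accumulates squared deviations from a running mean. */
double
sk_variance_welford(const double *x, size_t n)
{
	double mean = 0.0, m2 = 0.0, delta;
	size_t i;

	assert(x != NULL && n > 1);
	for (i = 0; i < n; i++) {
		delta = x[i] - mean;
		mean += delta / (double)(i + 1);
		m2 += delta * (x[i] - mean);
	}
	return (m2 / (double)(n - 1));
}
```

With sk_variance_welford(), both [0, 1, 2] and the same points shifted by
10000000000000 yield a sample variance of exactly 1, while the naive formula
has to subtract sums on the order of 1e26 for the shifted data.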

4 years agopf: Must be in NET_EPOCH to call icmp_error
Kristof Provost [Fri, 18 Oct 2019 03:36:26 +0000 (03:36 +0000)]
pf: Must be in NET_EPOCH to call icmp_error

icmp_reflect(), called through icmp_error(), requires us to be in NET_EPOCH.
Failure to hold it leads to the following panic (with INVARIANTS):

  panic: Assertion in_epoch(net_epoch_preempt) failed at /usr/src/sys/netinet/ip_icmp.c:742
  cpuid = 2
  time = 1571233273
  KDB: stack backtrace:
  db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00e0977920
  vpanic() at vpanic+0x17e/frame 0xfffffe00e0977980
  panic() at panic+0x43/frame 0xfffffe00e09779e0
  icmp_reflect() at icmp_reflect+0x625/frame 0xfffffe00e0977aa0
  icmp_error() at icmp_error+0x720/frame 0xfffffe00e0977b10
  pf_intr() at pf_intr+0xd5/frame 0xfffffe00e0977b50
  ithread_loop() at ithread_loop+0x1c6/frame 0xfffffe00e0977bb0
  fork_exit() at fork_exit+0x80/frame 0xfffffe00e0977bf0
  fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00e0977bf0

Note that we now enter NET_EPOCH twice if we enter ip_output() from pf_intr(),
but ip_output() will soon be converted to a function that requires epoch, so
entering NET_EPOCH directly from pf_intr() makes more sense.

Discussed with: glebius@

4 years agonvdimm_e820: Fix braino in size=all SPA hint
Conrad Meyer [Fri, 18 Oct 2019 03:01:21 +0000 (03:01 +0000)]
nvdimm_e820: Fix braino in size=all SPA hint

The sentinel value for "use the rest of the region," -1, isn't zero modulo
PAGE_SIZE.  Relax the check to permit the intended special value.
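
The arithmetic behind this can be sketched as follows.  The names and the
4KB page size are assumptions for illustration only, not the nvdimm_e820
code: an all-ones sentinel leaves a remainder of PAGE_SIZE - 1, so a plain
alignment check rejects it unless the sentinel is tested first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_PAGE_SIZE	4096u
#define SK_SIZE_ALL	UINT64_MAX	/* "-1": use the rest of the region */

/* Strict check: rejects the sentinel, since UINT64_MAX % 4096 == 4095. */
bool
sk_size_ok_strict(uint64_t size)
{
	return (size % SK_PAGE_SIZE == 0);
}

/* Relaxed check: accept the sentinel, else require page alignment. */
bool
sk_size_ok_relaxed(uint64_t size)
{
	return (size == SK_SIZE_ALL || size % SK_PAGE_SIZE == 0);
}
```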

X-MFC-With: r353110
Sponsored by: Dell EMC Isilon

4 years agox86: Remove unused variable from r353712
Conrad Meyer [Fri, 18 Oct 2019 02:25:30 +0000 (02:25 +0000)]
x86: Remove unused variable from r353712

It was in my git tree (uncommitted) and didn't get carried over to SVN in
r353712.

X-MFC-With: r353712

4 years agox86: Fetch and save standard CPUID leaf 6 in identcpu
Conrad Meyer [Fri, 18 Oct 2019 02:18:17 +0000 (02:18 +0000)]
x86: Fetch and save standard CPUID leaf 6 in identcpu

Rather than a few scattered places in the tree.  Organize flag names in a
contiguous region of specialreg.h.

While here, delete deprecated PCOMMIT from leaf 7.

No functional change.

4 years agoFix build of LLVM RISC-V backend
Mitchell Horne [Fri, 18 Oct 2019 01:46:38 +0000 (01:46 +0000)]
Fix build of LLVM RISC-V backend

Reviewed by: dim
MFC with: r353358
Differential Revision: https://reviews.freebsd.org/D21963

4 years agoRemove obsolete, non-use of CLANG_NO_IAS.
Brooks Davis [Fri, 18 Oct 2019 00:00:17 +0000 (00:00 +0000)]
Remove obsolete, non-use of CLANG_NO_IAS.

CLANG_NO_IAS was removed in r351661.

4 years agogdb(4): Implement support for NoAckMode
Conrad Meyer [Thu, 17 Oct 2019 22:37:25 +0000 (22:37 +0000)]
gdb(4): Implement support for NoAckMode

When the underlying debugport transport is reliable, GDB's additional
checksums and acknowledgements are redundant.  NoAckMode eliminates the
acks and allows us to skip checking RX checksums.  The GDB packet
framing does not change, so unfortunately (valid) checksums are still
included as message trailers.
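
For readers unfamiliar with the framing, a GDB remote protocol packet has
the form $<payload>#<two hex digits>, where the digits are the sum of the
payload bytes modulo 256.  The sketch below builds such a packet; the
function names are hypothetical, this is not the gdb(4) stub code, and
escaping of special payload characters is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Sum of all payload bytes, modulo 256. */
unsigned char
sk_gdb_checksum(const char *payload, size_t len)
{
	unsigned char sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum += (unsigned char)payload[i];
	return (sum);
}

/*
 * Write "$<payload>#<xx>" into out.  Returns 0 on success, or -1 if the
 * buffer is too small.  The payload must not need escaping.
 */
int
sk_gdb_frame(char *out, size_t outlen, const char *payload)
{
	int n;

	assert(out != NULL && payload != NULL);
	n = snprintf(out, outlen, "$%s#%02x", payload,
	    (unsigned int)sk_gdb_checksum(payload, strlen(payload)));
	return ((n >= 0 && (size_t)n < outlen) ? 0 : -1);
}
```

For example, the payload "OK" frames as "$OK#9a".  In NoAckMode the trailer
is still sent in this form; only the per-packet acknowledgement and the
receive-side verification are dropped.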

The gdb(4) stub in FreeBSD advertises support for the feature in response to
the client's 'qSupported' request IFF the current debugport has the
gdb_dbfeatures flag GDB_DBGP_FEAT_RELIABLE set.  Currently, only netgdb(4)
supports this feature.

If the remote GDB client supports the feature and does not have it disabled
via a GDB configuration knob, it may instruct our gdb(4) stub to enter
NoAckMode.  Unless and until it issues that command, we must continue to
transmit acks as usual (and for now, we continue to wait until we receive
them as well, even if we know the debugport is on a reliable transport).

In the kernel sources, the sense of the flag representing the state of the
feature is reversed from that of the GDB command.  (I.e., it is
'gdb_ackmode', not 'gdb_noackmode.')  This is to avoid confusing double-
negative conditions.

For reference, see:
  * https://sourceware.org/gdb/onlinedocs/gdb/Packet-Acknowledgment.html
  * https://sourceware.org/gdb/onlinedocs/gdb/General-Query-Packets.html#QStartNoAckMode

Reviewed by: jhb, markj (both earlier version)
Differential Revision: https://reviews.freebsd.org/D21761

4 years agoAdd an ldscript for amd64 kernel modules.
Mark Johnston [Thu, 17 Oct 2019 21:39:23 +0000 (21:39 +0000)]
Add an ldscript for amd64 kernel modules.

Use it to pad the text and read-only data sections to a 4KB boundary.
This will be used to enforce strict memory protections for some
sections of loadable kernel modules.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21970

4 years agoImplement NetGDB(4)
Conrad Meyer [Thu, 17 Oct 2019 21:33:01 +0000 (21:33 +0000)]
Implement NetGDB(4)

NetGDB(4) is a component of a system using a panic-time network stack to
remotely debug crashed FreeBSD kernels over the network, instead of
traditional serial interfaces.

There are three pieces in the complete NetGDB system.

First, a dedicated proxy server must be running to accept connections from
both NetGDB and gdb(1), and pass bidirectional traffic between the two
protocols.

Second, the NetGDB client is activated much like ordinary 'gdb' and
similarly to 'netdump' in ddb(4) after a panic.  Like other debugnet(4)
clients (netdump(4)), the network interface on the route to the proxy server
must be online and support debugnet(4).

Finally, the remote (k)gdb(1) uses 'target remote <proxy>:<port>' (like any
other TCP remote) to connect to the proxy server.

The NetGDB v1 protocol speaks the literal GDB remote serial protocol, and
uses a 1:1 relationship between GDB packets and sequences of debugnet
packets (fragmented by MTU).  There is no encryption utilized to keep
debugging sessions private, so this is only appropriate for local
segments or trusted networks.

Submitted by: John Reimer <john.reimer AT emc.com> (earlier version)
Discussed some with: emaste, markj
Relnotes: sure
Differential Revision: https://reviews.freebsd.org/D21568

4 years ago  Clean up some nits in link_elf_(un)load_file().
Mark Johnston [Thu, 17 Oct 2019 21:25:50 +0000 (21:25 +0000)]
Clean up some nits in link_elf_(un)load_file().

- Remove a redundant assignment of ef->address.
- Don't return a Mach error number to the caller if vm_map_find() fails.
- Use ptoa() and fix style.

MFC after: 2 weeks
Sponsored by: Netflix

4 years ago  Belatedly bump __FreeBSD_version for r353537 and related commits.
Mark Johnston [Thu, 17 Oct 2019 20:46:33 +0000 (20:46 +0000)]
Belatedly bump __FreeBSD_version for r353537 and related commits.

At least one small update to the out-of-tree DRM drivers is required
now that cdev_pager_free_page() expects an xbusy page.

Discussed with: jeff, zeising

4 years ago  Allow loader.efi to identify non-standard boot setup
Simon J. Gerraty [Thu, 17 Oct 2019 20:40:06 +0000 (20:40 +0000)]
Allow loader.efi to identify non-standard boot setup

PATH_BOOTABLE_TOKEN can be set to a non-standard
path that identifies a device as bootable.

Reviewed by: kevans, bcran
Differential Revision: https://reviews.freebsd.org/D22062

4 years ago  debugnet(4): Add optional full-duplex mode
Conrad Meyer [Thu, 17 Oct 2019 20:25:15 +0000 (20:25 +0000)]
debugnet(4): Add optional full-duplex mode

It remains unattached to any client protocol.  Netdump is unaffected
(remaining half-duplex).  The intended consumer is NetGDB.

Submitted by: John Reimer <john.reimer AT emc.com> (earlier version)
Discussed with: markj
Differential Revision: https://reviews.freebsd.org/D21541

4 years ago  Revert two parts of r353292 that enter epoch when processing vlan capabilities.
Gleb Smirnoff [Thu, 17 Oct 2019 20:18:07 +0000 (20:18 +0000)]
Revert two parts of r353292 that enter epoch when processing vlan capabilities.
It could be that entering epoch isn't necessary here, but it is better to
take a conservative approach.

Submitted by: kp

4 years ago  debugnet(4): Infer non-server connection parameters
Conrad Meyer [Thu, 17 Oct 2019 20:10:32 +0000 (20:10 +0000)]
debugnet(4): Infer non-server connection parameters

Loosen requirements for connecting to debugnet-type servers.  Only require a
destination address; the rest can theoretically be inferred from the routing
table.

Relax corresponding constraints in netdump(4) and move ifp validation to
debugnet connection time.

Submitted by: John Reimer <john.reimer AT emc.com> (earlier version)
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D21482

4 years ago  acpica: Fix for the fix, unfortunately
Conrad Meyer [Thu, 17 Oct 2019 19:53:55 +0000 (19:53 +0000)]
acpica: Fix for the fix, unfortunately

Follow-up to incomplete pedantic change in r353691 by actually fixing the
default implementation to match the interface type.  Mea culpa.

X-MFC-With: r353691, r339754

4 years ago  Add ddb(4) 'netdump' command to netdump a core without preconfiguration
Conrad Meyer [Thu, 17 Oct 2019 19:49:20 +0000 (19:49 +0000)]
Add ddb(4) 'netdump' command to netdump a core without preconfiguration

Add an 'X -s <server> -c <client> [-g <gateway>] -i <interface>' subroutine
to the generic debugnet code.  The imagined use is both netdump, shown here,
and NetGDB (vaporware).  It uses the ddb(4) lexer, with some new extensions,
to parse out IPv4 addresses.

'Netdump' uses the generic debugnet routine to load a configuration and
start a dump, without any netdump configuration prior to panic.

Loosely derived from work by: John Reimer <john.reimer AT emc.com>
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D21460

4 years ago  acpica: Match ID_PROBE default implementation to interface
Conrad Meyer [Thu, 17 Oct 2019 18:45:11 +0000 (18:45 +0000)]
acpica: Match ID_PROBE default implementation to interface

After r339754, the additional interface parameter was accidentally left out
of the default acpi_generic_id_probe implementation.  Apparently this does
not cause any real problems, so this fix is mostly stylistic.

No functional change intended.

X-MFC-With: r339754

4 years ago  Add a very limited DDB dumpon(8)-alike to MI dumper code
Conrad Meyer [Thu, 17 Oct 2019 18:29:44 +0000 (18:29 +0000)]
Add a very limited DDB dumpon(8)-alike to MI dumper code

This allows ddb(4) commands to construct a static dumperinfo during
panic/debug and invoke doadump(false) using the provided dumper
configuration (always inserted first in the list).

The intended usecase is a ddb(4)-time netdump(4) command.

Reviewed by: markj (earlier version)
Differential Revision: https://reviews.freebsd.org/D21448

4 years ago  debugnet: Respond to broadcast ARP requests
Conrad Meyer [Thu, 17 Oct 2019 17:48:32 +0000 (17:48 +0000)]
debugnet: Respond to broadcast ARP requests

The in-tree netdump code has always ignored non-directed ARP requests, and
that seems to work most of the time for netdump.

In my work and testing on NetGDB, it seems like sometimes the remote FreeBSD
conversant (the non-panic system) will send broadcast-destination ARP
requests to the debugnet kernel; without this change, those are dropped and
the remote will see EHOSTDOWN "Host is down" errors from the userspace
interface of the network stack.

Discussed with: markj

4 years ago  debugnet(4): Check hardware-validated UDP checksums
Conrad Meyer [Thu, 17 Oct 2019 17:19:16 +0000 (17:19 +0000)]
debugnet(4): Check hardware-validated UDP checksums

Similar to INET checksums, lazily validate UDP checksums when the driver has
already performed the check for us.  Like debugnet(4) INET checksums,
validation in software is left as future work.

Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D21745

4 years ago  Quickly fix up r353683: enter the epoch before calling into netisr_dispatch().
Gleb Smirnoff [Thu, 17 Oct 2019 17:02:50 +0000 (17:02 +0000)]
Quickly fix up r353683: enter the epoch before calling into netisr_dispatch().

4 years ago  Update Conrad Meyer's email
Ed Maste [Thu, 17 Oct 2019 16:38:44 +0000 (16:38 +0000)]
Update Conrad Meyer's email

cem is now a committer

Approved by: cem

4 years ago  Split out a more generic debugnet(4) from netdump(4)
Conrad Meyer [Thu, 17 Oct 2019 16:23:03 +0000 (16:23 +0000)]
Split out a more generic debugnet(4) from netdump(4)

Debugnet is a simplistic and specialized panic- or debug-time reliable
datagram transport.  It can drive a single connection at a time and is
currently unidirectional (the debug/panic machine transmits to the remote
server only).

It is mostly a verbatim code lift from netdump(4).  Netdump(4) remains
the only consumer (until the rest of this patch series lands).

The INET-specific logic has been extracted somewhat more thoroughly than
previously in netdump(4), into debugnet_inet.c.  Logic at the UDP layer and
above remains in debugnet.c wherever it is protocol-independent.  The
separation is not perfect and future improvement is welcome.  Supporting
INET6 is a long-term goal.

Much of the diff is "gratuitous" renaming from 'netdump_' or 'nd_' to
'debugnet_' or 'dn_' -- sorry.  I thought keeping the netdump name on the
generic module would be more confusing than the refactoring.

The only functional change here is the mbuf allocation / tracking.  Instead
of initiating solely on netdump-configured interface(s) at dumpon(8)
configuration time, we watch for any debugnet-enabled NIC for link
activation and query it for mbuf parameters at that time.  If they exceed
the existing high-water mark allocation, we re-allocate and track the new
high-water mark.  Otherwise, we leave the pre-panic mbuf allocation alone.
In a future patch in this series, this will allow initiating netdump from
panic ddb(4) without pre-panic configuration.

No other functional change intended.

Reviewed by: markj (earlier version)
Some discussion with: emaste, jhb
Objection from: marius
Differential Revision: https://reviews.freebsd.org/D21421

4 years ago  igmp_v1v2_queue_report() doesn't require epoch.
Gleb Smirnoff [Thu, 17 Oct 2019 16:02:34 +0000 (16:02 +0000)]
igmp_v1v2_queue_report() doesn't require epoch.

4 years ago  snd_hda: style(9) whitespace fixup
Ed Maste [Thu, 17 Oct 2019 14:58:03 +0000 (14:58 +0000)]
snd_hda: style(9) whitespace fixup

PR: 241299
Submitted by: Neel Chauhan

4 years ago  swapon_check_swzone(): use already calculated static variables.
Konstantin Belousov [Thu, 17 Oct 2019 13:49:47 +0000 (13:49 +0000)]
swapon_check_swzone(): use already calculated static variables.

Submitted by: ota@j.email.ne.jp
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D22065

4 years ago  vt: remove comment that is not true since r259680
Ed Maste [Thu, 17 Oct 2019 13:08:50 +0000 (13:08 +0000)]
vt: remove comment that is not true since r259680

r259680 added support to vt(4) for printing double-width characters.
Remove the comment that claims no support.

MFC after: 3 days
Sponsored by: The FreeBSD Foundation

4 years ago  document taskqueue_start_threads_in_proc
Andriy Gapon [Thu, 17 Oct 2019 06:58:07 +0000 (06:58 +0000)]
document taskqueue_start_threads_in_proc

While here, fix the entry for taskqueue_start_threads_cpuset, which was
documented under its old name, taskqueue_start_threads_pinned.

MFC after: 4 weeks

4 years ago  provide a way to assign taskqueue threads to a kernel process
Andriy Gapon [Thu, 17 Oct 2019 06:32:34 +0000 (06:32 +0000)]
provide a way to assign taskqueue threads to a kernel process

This can be used to group all threads belonging to a single logical
entity under a common kernel process.
I am planning to use the new interface for ZFS threads.

MFC after: 4 weeks

4 years ago  wbwd: small clean-ups and improvements
Andriy Gapon [Thu, 17 Oct 2019 06:21:09 +0000 (06:21 +0000)]
wbwd: small clean-ups and improvements

This change applies some suggestions by delphij from D21979.
A write-only variable is removed.
There is a diagnostic message if the driver does not recognize the chip.
A chained if-statement is converted to a switch.

MFC after: 3 weeks

4 years ago  ether: add older ethertype definitions for QinQ
Philip Paeps [Thu, 17 Oct 2019 00:34:53 +0000 (00:34 +0000)]
ether: add older ethertype definitions for QinQ

Older network equipment used the ethertypes 0x9100, 0x9200, and 0x9300 for
outer VLANs, before standardisation introduced 0x88a8.
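
As an illustration of how such definitions might be used, the C sketch below
classifies an ethertype as an outer-VLAN tag type. The four values come from
the text above; the macro and function names are hypothetical and are not
necessarily those added by this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the values are the ones listed in the commit message. */
#define SK_ETHERTYPE_QINQ	0x88a8	/* standardised outer VLAN tag */
#define SK_ETHERTYPE_Q9100	0x9100	/* pre-standard outer VLAN tag */
#define SK_ETHERTYPE_Q9200	0x9200	/* pre-standard outer VLAN tag */
#define SK_ETHERTYPE_Q9300	0x9300	/* pre-standard outer VLAN tag */

/* Return true if 'etype' (host byte order) marks an outer VLAN tag. */
static bool
sk_is_outer_vlan_ethertype(uint16_t etype)
{
	switch (etype) {
	case SK_ETHERTYPE_QINQ:
	case SK_ETHERTYPE_Q9100:
	case SK_ETHERTYPE_Q9200:
	case SK_ETHERTYPE_Q9300:
		return (true);
	default:
		return (false);
	}
}
```

The value read from a frame is in network byte order and would need
converting before the comparison.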

Submitted by: Lutz Donnerhacke <lutz_donnerhacke.de>
Differential Revision: https://reviews.freebsd.org/D21846

4 years ago  Formalize the use of linker scripts for kernel modules.
Mark Johnston [Wed, 16 Oct 2019 22:19:56 +0000 (22:19 +0000)]
Formalize the use of linker scripts for kernel modules.

Automatically apply ldscript.kmod.${MACHINE_ARCH} if it exists.
We already have an i386-specific linker script; rename it accordingly.

Note that the linker script is applied when the object files are
partially linked.  (For amd64 this is also the final link.)

Reviewed by: imp, kib
Discussed with: jhb
MFC after: 1 week
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21887

4 years ago  Introduce pmap_change_prot() for amd64.
Mark Johnston [Wed, 16 Oct 2019 22:12:34 +0000 (22:12 +0000)]
Introduce pmap_change_prot() for amd64.

This updates the protection attributes of subranges of the kernel map.
Unlike pmap_protect(), which is typically used for user mappings,
pmap_change_prot() does not perform lazy upgrades of protections.
pmap_change_prot() also updates the aliasing range of the direct map.

Reviewed by: kib
MFC after: 1 month
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21758

4 years ago  Use KOBJMETHOD_END in the kernel linker.
Mark Johnston [Wed, 16 Oct 2019 22:06:19 +0000 (22:06 +0000)]
Use KOBJMETHOD_END in the kernel linker.

MFC after: 1 week

4 years ago  Remove page locking from pmap_mincore().
Mark Johnston [Wed, 16 Oct 2019 22:03:27 +0000 (22:03 +0000)]
Remove page locking from pmap_mincore().

After r352110 the page lock no longer protects a page's identity, so
there is no purpose in locking the page in pmap_mincore().  Instead,
if vm.mincore_mapped is set to the non-default value of 0, re-lookup
the page after acquiring its object lock, which holds the page's
identity stable.

The change removes the last callers of vm_page_pa_tryrelock(), so
remove it.

Reviewed by: kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21823

4 years ago  Make all the gnop parameters optional in the request from userland,
Chuck Silvers [Wed, 16 Oct 2019 21:49:44 +0000 (21:49 +0000)]
Make all the gnop parameters optional in the request from userland,
filling in the same defaults that the current userland module uses.
This allows an old geom_nop.so userland module to work with a new kernel.

Approved by: imp (mentor)
Reviewed by: cem
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21972

4 years ago  Add a new gctl_get_paraml_opt() interface to extract optional parameters from
Chuck Silvers [Wed, 16 Oct 2019 21:49:39 +0000 (21:49 +0000)]
Add a new gctl_get_paraml_opt() interface to extract optional parameters from
the request.  It is the same as gctl_get_paraml() except that the request
is not marked with an error if the parameter is not present.
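
The difference between the two lookups can be sketched with a small userland
mock in C. Nothing below is the GEOM API: the request struct and the
sk_-prefixed functions are hypothetical stand-ins that only model "mark the
request with an error" versus "return NULL quietly so the caller can apply a
default".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for a control request and its named parameters. */
struct sk_param {
	const char *name;
	const void *value;
};

struct sk_req {
	const struct sk_param *params;
	size_t nparams;
	const char *error;	/* NULL until the request is marked failed */
};

/* Optional lookup: a missing parameter is not an error. */
static const void *
sk_get_param_opt(struct sk_req *req, const char *name)
{
	for (size_t i = 0; i < req->nparams; i++)
		if (strcmp(req->params[i].name, name) == 0)
			return (req->params[i].value);
	return (NULL);
}

/* Mandatory lookup: a missing parameter marks the request with an error. */
static const void *
sk_get_param(struct sk_req *req, const char *name)
{
	const void *v = sk_get_param_opt(req, name);

	if (v == NULL && req->error == NULL)
		req->error = "missing parameter";
	return (v);
}

/* Caller-side pattern: fall back to a default when the parameter is absent. */
static int
sk_get_int_or(struct sk_req *req, const char *name, int dflt)
{
	const int *v = sk_get_param_opt(req, name);

	return (v != NULL ? *v : dflt);
}
```

With the optional form, a request from an older userland module that omits a
newer parameter can still succeed, because the receiver supplies the default
itself.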

Approved by: imp (mentor)
Reviewed by: cem
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21972