FreeBSD/FreeBSD.git
hselasky [Wed, 8 May 2019 10:51:07 +0000 (10:51 +0000)]
Document userspace firmware flash in mlx5tool(8) and mlx5io(4).

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:50:35 +0000 (10:50 +0000)]
Implement userspace firmware update for ConnectX-4/5/6.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:50:08 +0000 (10:50 +0000)]
Rename mlx5_fwdump_addr to more neutral mlx5_tool_addr in mlx5core.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:49:36 +0000 (10:49 +0000)]
Add mlxfw callbacks in mlx5core.

Add the mlx5 implementation of the callbacks defined by the mlxfw
shared module, to be used while flashing the device firmware.

The callbacks do their job through the MCQI, MCC and MCDA registers.

Linux commit:
62bd22cf326dc4ac5be673c11cef4602dc1f5e47

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:49:05 +0000 (10:49 +0000)]
Initial version of Mellanox in-kernel firmware upgrade support.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:44:53 +0000 (10:44 +0000)]
Convert remaining module parameters into SYSCTLs in mlx5core.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:44:27 +0000 (10:44 +0000)]
Remove redundant line of code in mlx5core.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:44:02 +0000 (10:44 +0000)]
Change implicit and probably erroneous EPERM to EIO on command status error
in mlx5core.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:43:35 +0000 (10:43 +0000)]
Fix typo.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:42:51 +0000 (10:42 +0000)]
Fix style.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:42:33 +0000 (10:42 +0000)]
Fix netstat counters mapping in mlx5en(4).

The current mapping of driver counters to netstat counters is wrong.
For example, a single jabber packet will cause the Ierrs counter to
count three times.

The work of mapping the hardware and software counters to their right
place in the netstat counters was already done in Linux; take that as is
for the FreeBSD driver.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:42:05 +0000 (10:42 +0000)]
Fix endless loop in ipoib_poll().

ib_req_notify_cq() may return a negative value, which indicates a
failure. In the case of an uncorrectable error, we end up in an
endless loop. Fix that by jumping back to poll_more for another loop
only if there is anything left to poll.
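
A minimal sketch of the loop shape this describes, not the actual ipoib
code. The fake_cq type and the fake_poll_cq()/fake_req_notify_cq() helpers
are hypothetical stand-ins, and the return convention (negative on failure,
positive when completions were missed while re-arming) is an assumption
made for illustration:

```c
/* Hypothetical stand-in for a completion queue; not the real verbs API. */
struct fake_cq {
    int pending;     /* completions still queued */
    int notify_ret;  /* value the next re-arm attempt returns */
};

/* Consume up to `budget` completions and return how many were taken. */
static int
fake_poll_cq(struct fake_cq *cq, int budget)
{
    int n = cq->pending < budget ? cq->pending : budget;

    cq->pending -= n;
    return (n);
}

/* Re-arm the queue: < 0 is an error, > 0 means completions were missed. */
static int
fake_req_notify_cq(struct fake_cq *cq)
{
    return (cq->notify_ret);
}

/*
 * Drain the queue, then re-arm it.  Only a positive re-arm result sends
 * us around the loop again; a negative result ends the loop instead of
 * restarting it.  `budget` must be positive.  Returns the number of passes.
 */
static int
poll_until_armed(struct fake_cq *cq, int budget)
{
    int passes = 0;

    for (;;) {
        passes++;
        while (fake_poll_cq(cq, budget) == budget)
            continue;           /* keep draining full batches */
        if (fake_req_notify_cq(cq) <= 0)
            break;              /* armed (0) or failed (< 0): stop */
        cq->notify_ret = 0;     /* simulate the retry arming cleanly */
    }
    return (passes);
}
```

With a re-arm that keeps failing, the sketch makes a single pass and
returns, where a loop that retried on any nonzero result would spin.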

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:41:44 +0000 (10:41 +0000)]
Avoid leaking send queue mbufs during error recovery in mlx5en(4).

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:41:21 +0000 (10:41 +0000)]
Add helper functions to set/query MCC/MCDA/MCQI registers in mlx5core.

To be used by the mlx5 callbacks exposed to the mlxfw module.

Linux commit:
d2ad488b0073bd1a2c3f5d2ea50a7eb632103e5d

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:41:00 +0000 (10:41 +0000)]
Enhance MCAM reg to allow query on access reg support in mlx5core.

Enhance MCAM to allow the driver to query which access regs are
supported. For now, expose the regs needed for FW flashing.

Linux commit:
0ab87743cc8c5bcd482daf71961ed5fc45349e01

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:40:41 +0000 (10:40 +0000)]
Add MCC (Management Component Control) register definitions in mlx5core.

MCC (Management Component Control) controls a firmware
component update.

MCDA (Management Component Data Access) reads and writes
a firmware component.

MCQI (Management Component Query Information) queries
information about firmware components.

Linux commit:
4717628938423fcba0aa8fa889e9fed4eb6a655f

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:40:13 +0000 (10:40 +0000)]
Add reading the mcam_reg in mlx5core.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:39:53 +0000 (10:39 +0000)]
Query and cache PCAM, MCAM registers on initialization in mlx5core.

On load_one, we now cache our capabilities registers internally, similar
to QUERY_HCA_CAP. Capabilities can later be queried using macros
introduced in this patch.

Linux commit:
71862561f3a62015a11de16d1c306481e8415c08

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:39:25 +0000 (10:39 +0000)]
Implement PCAM, MCAM access register commands in mlx5core.

The introduced registers expose the capabilities of new registers and
features related to port/management.
The driver queries MCAM and PCAM in order to avoid failing on old
firmware that lacks support.

Linux commit:
c835ad64683bd3e2d1b31ed2cb1ff4366932edb1

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:39:01 +0000 (10:39 +0000)]
Expose PCAM, MCAM registers infrastructure in mlx5core.

PCAM: Ports capabilities mask register.
MCAM: Management capabilities mask register.

PCAM and MCAM registers will provide information regarding firmware
support for different features, in order to avoid cases where a new driver
combined with old firmware results in syndromes (e.g. PCIe counters
before this patchset).

Linux commit:
cfdcbceaeffc669b70d904d80a2df9c86c232566

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:38:31 +0000 (10:38 +0000)]
Add sysctl(8) to control fast unload support in mlx5core.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:38:06 +0000 (10:38 +0000)]
Add Fast teardown support to mlx5core.

Today mlx5 devices support two teardown modes:
1- Regular teardown
2- Force teardown

This change introduces the enhanced version of the "Force teardown" that
allows SW to perform teardown in a faster way without the need to reclaim
all the pages.

Fast teardown provides the following advantages:
1- Fix a FW race condition that could cause command timeout
2- Avoid moving to polling mode
3- Close the vport to prevent a PCI ACK from being sent without being
   scattered to memory

Linux commit:
fcd29ad17c6ff885dfae58f557e9323941e63ba2

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:37:31 +0000 (10:37 +0000)]
Make sure the running variable is properly set for ratelimited SQs in mlx5en(4).

Else the SQs won't be properly released when closing rate-limited connections
leading to wrong state transitions on the SQ.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:37:03 +0000 (10:37 +0000)]
Implement get and set nic state as global functions in mlx5core.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:36:32 +0000 (10:36 +0000)]
Ticks are integer type in FreeBSD.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:35:55 +0000 (10:35 +0000)]
Configure firmware to use RX hash format in mini CQE in mlx5en(4).

When using CQE zipping, one can choose between RX hash and Checksum.
This will indicate the parameter on which a zipping session should be
stopped.

While porting the Linux code, Checksum was chosen. However, the value
of Checksum is not being used anywhere.
For the FreeBSD driver, we prefer to use the RX hash format which will
guarantee the RX hash value for all the mini CQEs.
While at it, make sure to initialize the Checksum value in the
decompressed CQE.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:35:35 +0000 (10:35 +0000)]
Disable CQE zipping by default in mlx5en(4).

After doing performance measurements, it seems like CQE zipping doesn't
have any significant benefit.
Moreover, we know that this feature is disabled by default on other
operating systems (Linux for example).

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:35:14 +0000 (10:35 +0000)]
Split mlx5e_update_stats_work() in mlx5en(4).

Split the function into the mlx5e_update_stats_locked() core and make
mlx5e_update_stats_work() call the _locked helper, similar to many other
places in the kernel. This improves the code structure, making the
locking clean.
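
The _locked split can be sketched as follows. This is an illustration
only: toy_lock, stats_softc and the function names are hypothetical, and
a simple flag stands in for the real kernel lock.

```c
#include <assert.h>

/* Toy lock standing in for a kernel lock; not a real locking primitive. */
struct toy_lock {
    int held;
};

static void
toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);
    l->held = 1;
}

static void
toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = 0;
}

struct stats_softc {
    struct toy_lock lock;
    unsigned long updates;
};

/* Core routine: the caller must already hold sc->lock. */
static void
update_stats_locked(struct stats_softc *sc)
{
    assert(sc->lock.held);
    sc->updates++;
}

/* Wrapper for callers that do not hold the lock yet. */
static void
update_stats_work(struct stats_softc *sc)
{
    toy_lock_acquire(&sc->lock);
    update_stats_locked(sc);
    toy_lock_release(&sc->lock);
}
```

Code paths that already hold the lock call the _locked variant directly,
so the lock is taken in exactly one place.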

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:34:42 +0000 (10:34 +0000)]
Implement fast close of RX channel in mlx5en(4).

Instead of waiting for all jobs to be cancelled, simply close the completion
queue to prevent more completion events and let mlx5e_destroy_rq() cleanup
the remaining mbufs.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:34:14 +0000 (10:34 +0000)]
Correct number of elements for priority to traffic class mappings in mlx5en(4).

The number of priorities is always 8, while the number of traffic classes
supported can vary. While at it convert the sysctl node into an array.
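
As a rough sketch of the sizing rule, assuming a hypothetical
prio_tc_map structure (these names are not taken from the driver): the
array always has 8 entries, one per priority, and only the stored
traffic class is checked against the device-dependent limit.

```c
#include <errno.h>
#include <stdint.h>

#define NUM_PRIO 8  /* the number of priorities is fixed */

struct prio_tc_map {
    uint8_t num_tc;             /* traffic classes the device supports */
    uint8_t prio_tc[NUM_PRIO];  /* one entry per priority */
};

/* Map priority `prio` to traffic class `tc`; returns 0 or EINVAL. */
static int
prio_tc_set(struct prio_tc_map *m, unsigned int prio, unsigned int tc)
{
    if (prio >= NUM_PRIO || tc >= m->num_tc)
        return (EINVAL);
    m->prio_tc[prio] = (uint8_t)tc;
    return (0);
}
```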

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:33:29 +0000 (10:33 +0000)]
Remove unused module parameter in mlx5ib.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:33:09 +0000 (10:33 +0000)]
Make sure to error out when arming the CQ fails in mlx4ib and mlx5ib.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:32:45 +0000 (10:32 +0000)]
Make sure to error out when arming the CQ fails in ibcore.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:32:22 +0000 (10:32 +0000)]
Destroy port stats debug context in correct order in mlx5en(4).
Destroy children nodes before parent nodes.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:32:03 +0000 (10:32 +0000)]
Fix tx_jumbo_packets counter in mlx5en(4).

Instead of reading the Ethernet RFC 2819 pXtoYoctets counters from
hardware, which count RX octets, accumulate tx_stat_pXtoYoctets from the
Ethernet extended counters, which count TX octets.

TX jumbo counters should be accumulated only after the PPCNT
counters were fetched from hardware with their latest value.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:31:32 +0000 (10:31 +0000)]
Update Ethernet extended counters in mlx5en(4).

Expose all Ethernet extended counters via the debug_stats
sysctl:
dev.mce.X.debug_stats

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:30:47 +0000 (10:30 +0000)]
Protect from infinite sw-reset loop in mlx5core.

Avoid an infinite software firmware reset loop that may be caused by a
hardware bug by limiting the maximum number of resets.
The counter between resets is restarted by each request for a reset, and
not by a successful reset.
The interval between two resets can be configured via sysctl:
hw.mlx5.sw_reset_timeout
which is global to all mlx5 devices in the system.
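
One way to read the rule above, sketched with hypothetical names
(reset_limiter and reset_allowed() are not the driver's API, and the
exact policy is an assumption): a reset is honoured only if enough time
has passed since the previous request, and the timestamp moves on every
request whether or not it was granted.

```c
#include <stdbool.h>

struct reset_limiter {
    long last_req;  /* time of the previous reset request, < 0 if none */
    long timeout;   /* minimum seconds between honoured resets */
};

/* Decide whether a reset requested at time `now` (seconds) may proceed. */
static bool
reset_allowed(struct reset_limiter *rl, long now)
{
    bool ok = rl->last_req < 0 || now - rl->last_req >= rl->timeout;

    rl->last_req = now;     /* updated on every request, granted or not */
    return (ok);
}
```

Under this reading, hardware that keeps asking for resets faster than
the timeout keeps pushing the window forward and never gets another one.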

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:30:18 +0000 (10:30 +0000)]
Disable all MSIX interrupts before shutdown in mlx5.

Make sure the interrupt handlers don't race with the fast unload
code in the shutdown handler.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:29:45 +0000 (10:29 +0000)]
Import Linux code to implement mlx5_ib_disassociate_ucontext() in mlx5ib.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:28:18 +0000 (10:28 +0000)]
Add temperature warning event to log in mlx5core.

Temperature warning event is sent by FW to indicate high temperature
as detected by one of the sensors on the board.
Add handling of this event by writing the numbers of the alert sensors
to the kernel log.
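
Turning a sensor bitmask into a list of sensor numbers might look like
the sketch below. The single 64-bit mask and the format_alert_sensors()
name are assumptions made for illustration; the real event layout is
not shown here.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Write the indices of the set bits in `mask` into `buf`, separated by
 * spaces, and return how many were written.  Output that does not fit
 * is dropped rather than truncated mid-number.
 */
static int
format_alert_sensors(uint64_t mask, char *buf, size_t len)
{
    size_t off = 0;
    int count = 0;

    if (len == 0)
        return (0);
    buf[0] = '\0';
    for (int bit = 0; bit < 64; bit++) {
        int n;

        if ((mask & (UINT64_C(1) << bit)) == 0)
            continue;
        n = snprintf(buf + off, len - off, "%s%d", count ? " " : "", bit);
        if (n < 0 || (size_t)n >= len - off) {
            buf[off] = '\0';    /* drop the entry that did not fit */
            break;
        }
        off += (size_t)n;
        count++;
    }
    return (count);
}
```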

Linux commit:
1865ea9adbfaf341c5cd5d8f7d384f19948b2fe9

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:27:29 +0000 (10:27 +0000)]
Correctly define the interface state bits in mlx5en(4).

While at it remove unused interface state bits. This also fixes an issue
during shutdown:

When the firmware fails during mlx5_load_one, the health_care timer
detects the failure and schedules a health_care call. Then mlx5_load_one
detects the failure, cleans up and quits. Then health_care starts and
calls mlx5_unload_one to clean up resources that no longer exist, which
causes a kernel panic.

The root cause is that the bit MLX5_INTERFACE_STATE_DOWN is not set
after mlx5_load_one fails. The solution is to remove the bit
MLX5_INTERFACE_STATE_DOWN and to quit mlx5_unload_one if the
bit MLX5_INTERFACE_STATE_UP is not set. The bit MLX5_INTERFACE_STATE_DOWN
is redundant and we can use MLX5_INTERFACE_STATE_UP instead.
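
The single-bit scheme can be sketched like this. toy_dev, toy_load_one()
and toy_unload_one() are hypothetical stand-ins for the driver
structures and functions named above, reduced to the state handling:

```c
#include <stdbool.h>

enum {
    IFACE_STATE_UP = 1u << 0    /* stand-in for MLX5_INTERFACE_STATE_UP */
};

struct toy_dev {
    unsigned int state;
    int resources;      /* nonzero while load-time resources exist */
    int unload_runs;    /* how often unload actually tore things down */
};

/* Load: the UP bit is only set once everything succeeded. */
static int
toy_load_one(struct toy_dev *d, bool fw_ok)
{
    d->resources = 1;
    if (!fw_ok) {
        d->resources = 0;   /* clean up; UP was never set */
        return (-1);
    }
    d->state |= IFACE_STATE_UP;
    return (0);
}

/* Unload: quit early unless the device is actually up. */
static void
toy_unload_one(struct toy_dev *d)
{
    if ((d->state & IFACE_STATE_UP) == 0)
        return;             /* nothing was loaded, nothing to free */
    d->resources = 0;
    d->state &= ~(unsigned int)IFACE_STATE_UP;
    d->unload_runs++;
}
```

A late unload after a failed load is then a no-op instead of a second
cleanup of resources that are already gone.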

Linux commit:
10a8d00707082955b177164d4b4e758ffcbd4017
b3cb5388499c5e219324bfe7da2e46cbad82bfcf

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:26:33 +0000 (10:26 +0000)]
Enable FPGA and FPGA QP errors for EQ and call the handler in mlx5core.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:25:14 +0000 (10:25 +0000)]
Add MLX5_FPGA_RELOAD IOCTL(2) to mlx5fpga.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:23:33 +0000 (10:23 +0000)]
Add support for Dynamic Interrupt Moderation, DIM, in mlx5en(4).

Add support for DIM based on Linux,
with some minor adaptions specific to FreeBSD.

Linux commit
f97c3dc3c0e8d23a5c4357d182afeef4c67f5c33

MFC after: 3 days
Sponsored by: Mellanox Technologies

marius [Wed, 8 May 2019 09:03:43 +0000 (09:03 +0000)]
Allow to build without INET and INET6 again after r347221.

Submitted by: cam

delphij [Wed, 8 May 2019 08:43:15 +0000 (08:43 +0000)]
Move contrib/zlib to sys/contrib/zlib so that we can use it in kernel.
This is a prerequisite of unifying kernel zlib instances.

Submitted by: Yoshihiro Ota <ota at j.email.ne.jp>
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20191

dim [Wed, 8 May 2019 05:45:00 +0000 (05:45 +0000)]
Pull in r360099 from upstream llvm trunk (by Eli Friedman):

  [ARM] Glue register copies to tail calls.

  This generally follows what other targets do. I don't completely
  understand why the special case for tail calls existed in the first
  place; even when the code was committed in r105413, call lowering
  didn't work in the way described in the comments.

  Stack protector lowering breaks if the register copies are not glued
  to a tail call: we have to insert the stack protector check before
  the tail call, and we choose the location based on the assumption
  that all physical register dependencies of a tail call are adjacent
  to the tail call. (See FindSplitPointForStackProtector.) This is sort
  of fragile, but I don't see any reason to break that assumption.

  I'm guessing nobody has seen this before just because it's hard to
  convince the scheduler to actually schedule the code in a way that
  breaks; even without the glue, the only computation that could
  actually be scheduled after the register copies is the computation of
  the call address, and the scheduler usually prefers to schedule that
  before the copies anyway.

  Fixes https://bugs.llvm.org/show_bug.cgi?id=41417

  Differential Revision: https://reviews.llvm.org/D60427

This should fix several instances of "Bad machine code: Using an
undefined physical register", when compiling ports such as
multimedia/vlc, audio/alsa-lib and devel/avro-c for armv6, with
-fstack-protector-strong.

Reported by: jbeich
PR: 237074, 237783, 237784
MFC after: 3 days

jhibbits [Wed, 8 May 2019 03:15:22 +0000 (03:15 +0000)]
powerpc: hide innocuous printf behind bootverbose

NUMA associativity and OFW node existence are completely optional, so
their absence shouldn't always produce a warning.

kevans [Wed, 8 May 2019 02:32:11 +0000 (02:32 +0000)]
tun/tap: merge and rename to `tuntap`

tun(4) and tap(4) share the same general management interface and have a lot
in common. Bugs exist in tap(4) that have been fixed in tun(4), and
vice-versa. Let's reduce the maintenance requirements by merging them
together and using flags to differentiate between the three interface types
(tun, tap, vmnet).

This fixes a couple of tap(4)/vmnet(4) issues right out of the gate:
- tap devices may no longer be destroyed while they're open [0]
- VIMAGE issues already addressed in tun by kp

[0] emaste had removed an easy-panic-button in r240938 due to devdrn
blocking. A naive glance over this leads me to believe that this isn't quite
complete -- destroy_devl will only block while executing d_* functions, but
doesn't block the device from being destroyed while a process has it open.
The latter is the intent of the condvar in tun, so this is "fixed" (for
certain definitions of the word -- it wasn't really broken in tap, it just
wasn't quite ideal).

ifconfig(8) also grew the ability to map an interface name to a kld, so
that `ifconfig {tun,tap}0` can continue to autoload the correct module, and
`ifconfig vmnet0 create` will now autoload the correct module. This is a
low overhead addition.
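
A name-to-module lookup of this kind could look like the sketch below.
The table contents, the "if_tuntap" module name and the
kld_for_ifname() helper are assumptions for illustration, not the
actual ifconfig(8) code:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping from interface name prefix to kernel module. */
static const struct {
    const char *prefix;
    const char *kld;
} ifkld_map[] = {
    { "tun",   "if_tuntap" },
    { "tap",   "if_tuntap" },
    { "vmnet", "if_tuntap" },
};

/* Return the module for an interface name such as "tap0", or NULL. */
static const char *
kld_for_ifname(const char *ifname)
{
    size_t plen = 0;

    /* The prefix is everything before the first digit (the unit number). */
    while (ifname[plen] != '\0' && !isdigit((unsigned char)ifname[plen]))
        plen++;
    for (size_t i = 0; i < sizeof(ifkld_map) / sizeof(ifkld_map[0]); i++) {
        if (strlen(ifkld_map[i].prefix) == plen &&
            strncmp(ifname, ifkld_map[i].prefix, plen) == 0)
            return (ifkld_map[i].kld);
    }
    return (NULL);
}
```

A small static table keeps the overhead low: names without an entry
return NULL and the caller can fall back to its default behaviour.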

(MFC commentary)

This may get MFC'd if many bugs in tun(4)/tap(4) are discovered after this,
depending on how critical they are. Changes after this can likely be MFC'd
without taking this merge, but taking the merge will make it easier.

I have no plans to do this MFC as of now.

Reviewed by: bcr (manpages), tuexen (testing, syzkaller/packetdrill)
Input also from: melifaro
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D20044

mav [Wed, 8 May 2019 01:35:43 +0000 (01:35 +0000)]
Fix dataset name comparison in zfs_compare().

The code never returned a match when comparing two datasets (not snapshots).
As a result, uu_avl_find(), called from zfs_callback(), never succeeded,
allowing the same dataset to be added to the list multiple times, for example:

# zfs get name pers pers pers@z pers@z
NAME    PROPERTY  VALUE   SOURCE
pers    name      pers    -
pers    name      pers    -
pers@z  name      pers@z  -

With the patch:

# zfs get name pers pers pers@z pers@z
NAME    PROPERTY  VALUE   SOURCE
pers    name      pers    -
pers@z  name      pers@z  -
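
A simplified comparator that shows the intended behaviour. This is a
sketch, not the zfs_compare() source: dataset_name_cmp() is a
hypothetical name, and snapshots of the same dataset are ordered by
name here purely to keep the example self-contained.

```c
#include <string.h>

/*
 * Compare two names of the form "dataset" or "dataset@snapshot".
 * Two plain datasets with the same name compare equal, which is the
 * case described above as never matching.
 */
static int
dataset_name_cmp(const char *l, const char *r)
{
    const char *lat = strchr(l, '@');
    const char *rat = strchr(r, '@');
    size_t llen = lat != NULL ? (size_t)(lat - l) : strlen(l);
    size_t rlen = rat != NULL ? (size_t)(rat - r) : strlen(r);
    size_t n = llen < rlen ? llen : rlen;
    int ret = strncmp(l, r, n);

    if (ret != 0)
        return (ret);
    if (llen != rlen)
        return (llen < rlen ? -1 : 1);
    /* The dataset parts are identical from here on. */
    if (lat == NULL && rat == NULL)
        return (0);     /* two plain datasets: a match */
    if (lat == NULL)
        return (-1);    /* a dataset sorts before its snapshots */
    if (rat == NULL)
        return (1);
    return (strcmp(lat + 1, rat + 1));  /* simplification: by name */
}
```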

MFC after: 1 week
Sponsored by: iXsystems, Inc.

cem [Wed, 8 May 2019 00:45:16 +0000 (00:45 +0000)]
random: x86 driver: Prefer RDSEED over RDRAND when available

Per
https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed
RDRAND is a PRNG seeded from the same source as RDSEED.  That source is
more suitable as PRNG seed material, so prefer RDSEED when the intrinsic
is available (indicated in CPU feature bits).
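
The selection logic can be sketched as a pure function over the CPUID
feature words. The bit positions are the architectural ones; the
pick_hw_rng() helper and the hw_rng enum are hypothetical names, and
the caller is assumed to have read CPUID already:

```c
#include <stdint.h>

#define CPUID1_ECX_RDRAND (UINT32_C(1) << 30) /* CPUID.01H:ECX[30] */
#define CPUID7_EBX_RDSEED (UINT32_C(1) << 18) /* CPUID.(EAX=07H,ECX=0):EBX[18] */

enum hw_rng {
    HW_RNG_NONE,
    HW_RNG_RDRAND,
    HW_RNG_RDSEED
};

/* Prefer RDSEED when advertised, fall back to RDRAND, else nothing. */
static enum hw_rng
pick_hw_rng(uint32_t cpuid1_ecx, uint32_t cpuid7_ebx)
{
    if ((cpuid7_ebx & CPUID7_EBX_RDSEED) != 0)
        return (HW_RNG_RDSEED);
    if ((cpuid1_ecx & CPUID1_ECX_RDRAND) != 0)
        return (HW_RNG_RDRAND);
    return (HW_RNG_NONE);
}
```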

Reviewed by: delphij, jhb, imp (earlier version)
Approved by: secteam(delphij)
Security: yes
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D20192

cem [Wed, 8 May 2019 00:40:08 +0000 (00:40 +0000)]
vmm(4): Pass through RDSEED feature bit to guests

Reviewed by: jhb
Approved by: #bhyve (jhb)
MFC after: 2 leapseconds
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D20194

imp [Wed, 8 May 2019 00:09:10 +0000 (00:09 +0000)]
Add missing newline to debug printf.

cem [Tue, 7 May 2019 21:15:11 +0000 (21:15 +0000)]
Fix libsbuf sbuf_printf_drain symbol version

(Introduced incorrectly in r347229 earlier today.)

As pointed out by kevans, 1.6 should be used for FreeBSD 13, like r340383.

Submitted by: kevans
Reported by: kib
Reviewed by: jilles
X-MFC-with:  r347229
Differential Revision: https://reviews.freebsd.org/D20187

cy [Tue, 7 May 2019 20:39:39 +0000 (20:39 +0000)]
Improve the legibility of the login.access.5 man page by separating
each argument into its own paragraph.

MFC after: 3 days

tuexen [Tue, 7 May 2019 20:28:12 +0000 (20:28 +0000)]
Remove non-functional SCTP checksum offload support for virtio.

Checksum offloading for SCTP is not currently specified for virtio.
If the hypervisor announces checksum offloading support, it means TCP
and UDP checksum offload. If an SCTP packet is sent and the host announced
checksum offload support, the hypervisor inserts the IP checksum (16-bit)
at the correct offset, but this is not the right checksum, which is a CRC32c.
This results in all outgoing packets having the wrong checksum and therefore
breaking SCTP based communications.
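
To make the mismatch concrete, here are standalone reference versions
of the two algorithms: the 16-bit ones'-complement Internet checksum
and a bitwise CRC32c. They are illustrative sketches, not the kernel or
hypervisor implementations, and the function names are made up for this
example.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit ones'-complement Internet checksum over `len` bytes. */
static uint16_t
inet_cksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)p[0] << 8;     /* pad the odd byte with zero */
    while ((sum >> 16) != 0)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return ((uint16_t)~sum);
}

/* Bitwise CRC32c (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t
crc32c(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    while (len-- > 0) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return (~crc);
}
```

The two differ in width as well as in algorithm, so a 16-bit sum
written into the packet cannot stand in for the 32-bit CRC that SCTP
expects.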

This patch removes SCTP checksum offloading support from the virtio
network interface.

Thanks to Felix Weinrank for making me aware of the issue.

Reviewed by: bryanv@
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D20147

trasz [Tue, 7 May 2019 19:06:41 +0000 (19:06 +0000)]
Support PTRACE_GETREGSET w/ NT_PRSTATUS in Linux ptrace(2).

While Linux strace(1) doesn't strictly require it - it has a fallback
to PTRACE_GETREGS - it's a newer interface, so we better support it
before the old one is deprecated.

Reviewed by: dchagin
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20152

emaste [Tue, 7 May 2019 18:10:21 +0000 (18:10 +0000)]
make sysent after r347228

Regenerate to add @generated tag in generated files.

cem [Tue, 7 May 2019 17:47:20 +0000 (17:47 +0000)]
device_printf: Use sbuf for more coherent prints on SMP

device_printf does multiple calls to printf, allowing other console messages to
be inserted between the device name and the rest of the message.  This change
uses sbuf to compose the two into a single buffer, and prints it all at once.

It exposes an sbuf drain function (drain-to-printf) for common use.
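
The idea can be shown in userland with a fixed buffer in place of
sbuf(9). This is only a sketch of the compose-then-print-once approach;
compose_device_msg() is a hypothetical helper, not the kernel function:

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Build "<devname>: <formatted message>" in one buffer so that it can
 * be emitted with a single output call.  Returns the message length,
 * or -1 if it does not fit.
 */
static int
compose_device_msg(char *buf, size_t len, const char *devname,
    const char *fmt, ...)
{
    va_list ap;
    int n, m;

    n = snprintf(buf, len, "%s: ", devname);
    if (n < 0 || (size_t)n >= len)
        return (-1);
    va_start(ap, fmt);
    m = vsnprintf(buf + n, len - (size_t)n, fmt, ap);
    va_end(ap);
    if (m < 0 || (size_t)m >= len - (size_t)n)
        return (-1);
    return (n + m);
}
```

A caller would then hand the finished buffer to one fputs() or printf()
call, leaving no gap between the prefix and the message for another CPU
to print into.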

Update documentation to match; some unit tests included.

Submitted by: jmg
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D16690

emaste [Tue, 7 May 2019 16:17:33 +0000 (16:17 +0000)]
makesyscalls: use @generated tag in generated files

Multiple tools use @generated to identify generated files (for example,
in a review Phabricator will by default hide diffs in generated files).
Use the @generated tag in makesyscalls.sh as we've done for other
generated files.

Reviewed by: cem
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20183

markj [Tue, 7 May 2019 15:03:26 +0000 (15:03 +0000)]
Simplify the test against maxproc in fork1().

Previously nprocs_new would be tested against maxprocs twice when
nprocs_new < maxprocs - 10.  Eliminate the unnecessary comparison.
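
A sketch of the simplified shape of the check. fork_allowed() is a
hypothetical helper, and the privilege test is reduced to a boolean
argument; it is not the fork1() source.

```c
#include <stdbool.h>

/*
 * The last 10 process slots are reserved for privileged callers, and
 * nobody may exceed maxproc.  The hard limit is only examined once the
 * count has reached the reserved range.
 */
static bool
fork_allowed(int nprocs_new, int maxproc, bool privileged)
{
    if (nprocs_new >= maxproc - 10) {
        if (!privileged || nprocs_new >= maxproc)
            return (false);
    }
    return (true);
}
```

Nesting the second comparison inside the first means the common case,
well below the limit, costs a single comparison.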

Submitted by: Wuyang Chung <wuyang.chung1@gmail.com>
GitHub PR: https://github.com/freebsd/freebsd/pull/397
MFC after: 1 week

br [Tue, 7 May 2019 14:32:17 +0000 (14:32 +0000)]
Disable interrupts first and then set spinlock_count to 1.
Otherwise interrupt can be generated just after setting spinlock_count
and before disabling interrupts.

Sponsored by: DARPA, AFRL

br [Tue, 7 May 2019 13:41:43 +0000 (13:41 +0000)]
Provide a template for busdma code for RISC-V.

RISC-V ISA specifies no cache management instructions so leave cache
operations in cpufunc.h as no-op for now.

Note some new hardware comes with its own memory-mapped cache
management controller.

Tested on HiFive Unleashed board with cgem(4).

Reviewed by: markj
Obtained from: arm64
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D20126

emaste [Tue, 7 May 2019 13:04:26 +0000 (13:04 +0000)]
Use @generated tag in generated files

Multiple tools use @generated to identify generated files (for example,
in a review Phabricator will by default hide diffs in generated files).
Use the @generated tag in makeobjops.awk and vnode_if.awk as we've done
for other generated files.

Sponsored by: The FreeBSD Foundation

tsoome [Tue, 7 May 2019 10:01:45 +0000 (10:01 +0000)]
command_bcache() does not use argv

Therefore mark argv __unused.

marius [Tue, 7 May 2019 08:31:54 +0000 (08:31 +0000)]
o Avoid determining the MAC class (LEM/EM or IGB) - possibly even multiple
  times - on every interrupt by using an own set of device methods for the
  IGB class. This translates to introducing igb_if_intr_{disable,enable}()
  and igb_if_{rx,tx}_queue_intr_enable() with that IGB-specific code moved
  out of their EM counterparts and otherwise continuing to use the EM IFDI
  methods also for IGB.
  Note that igb_if_intr_{disable,enable}() also issue E1000_WRITE_FLUSH as
  lost with the conversion of igb(4) to iflib(4).
  Also note, that the em_if_{disable,enable}_intr() methods are renamed to
  em_if_intr_{disable,enable}() for consistency with the names used in the
  interface declaration.
o In em_intr():
  - Don't bother to bail out if the interrupt type is "legacy", i. e. INTx
    or MSI, as iflib(4) doesn't use ift_legacy_intr methods for MSI-X. All
    other iflib(4)-based drivers avoid this check, too.
  - Given that only the MSI-X interrupts have one-shot behavior (by taking
    advantage of the EIAC register), explicitly disable interrupts. Hence,
    em_intr() now matches what {em,igb}_irq_fast() previously did (in case
    of igb(4) supposedly also to work around MSI message reordering errata
    on certain systems).
o In em_if_intr_disable():
  - Clear the EIAC register unconditionally for 82574 and not just in case
    of MSI-X, matching em_if_intr_enable() and bringing back the last hunk
    of r206437 lost with the iflib(4) conversion.
  - Write to EM_EIAC for clearing said register instead of to the IGB-only
    E1000_EIAC used ever since the iflib(4) conversion.

Reviewed by: shurd
Differential Revision: https://reviews.freebsd.org/D20176

marius [Tue, 7 May 2019 08:28:35 +0000 (08:28 +0000)]
o Use iflib_fast_intr_rxtx() also for "legacy" interrupts, i. e. INTx and
  MSI. Unlike iflib_fast_intr_ctx(), the former will also enqueue
  _task_fn_tx() in addition to _task_fn_rx() if appropriate, bringing TCP
  TX throughput of EM-class devices on par with the MSI-X case and, thus,
  close to wirespeed/pre-iflib(4) times again. [1]
  Note that independently of the interrupt type, the UDP performance with
  these MACs still is abysmal and nowhere near to where it was before the
  conversion of em(4) to iflib(4).
o In iflib_init_locked(), announce which free list failed to set up.
o In _task_fn_tx() when running netmap(4), issue ifdi_intr_enable instead
  of the ifdi_tx_queue_intr_enable method in case of a "legacy" interrupt
  as the latter is valid with MSI-X only.
o Instead of adding the missing - and apparently convoluted enough that a
  DBG_COUNTER_INC was put into a wrong spot in _task_fn_rx() - checks for
  ifdi_{r,t}x_queue_intr_enable being available in the MSI-X case also to
  iflib_fast_intr_rxtx(), factor these out to iflib_device_register() and
  make the checks fail gracefully rather than panic. This avoids invoking
  the checks at runtime over and over again in iflib_fast_intr_rxtx() and
  _task_fn_{r,t}x() - even if it's just in case of INVARIANTS - and makes
  these functions more readable.
o In iflib_rx_structures_setup(), only initialize LRO resources if device
  and driver have LRO capability in order to not waste memory. Also, free
  the LRO resources again if setting them up fails for one of the queues.
  However, don't bother invoking iflib_rx_sds_free() in that case because
  iflib_rx_structures_setup() doesn't call iflib_rxsd_alloc() either (and
  iflib_{device,pseudo}_register() will issue iflib_rx_sds_free() in case
  of failure via iflib_rx_structures_free(), but there definitely is some
  asymmetry left to be fixed, though).
o Similarly, free LRO resources again in iflib_rx_structures_free().
o In iflib_irq_set_affinity(), handle get_core_offset() errors gracefully
  instead of panicking (but only in case of INVARIANTS). This is a follow-
  up to r344132, as such driver bugs shouldn't be fatal.
o Likewise, handle unknown iflib_intr_type_t in iflib_irq_alloc_generic()
  gracefully, too.
o Bring yet more sanity to iflib_msix_init():
  - If the device doesn't provide enough MSI-X vectors or not all vectors
    can be allocated so the expected number of queues in addition to admin
    interrupts can't be supported, try MSI next (and then INTx) as proper
    MSI-X vector distribution can't be assured in such cases. In essence,
    this change brings r254008 forward to iflib(4). Also, this is the fix
    alluded to in the commit message of r343934.
  - If the MSI-X allocation has failed, don't prematurely announce MSI is
    going to be used as the latter in fact may not be available either.
  - When falling back to MSI, only release the MSI-X table resource again
    if it was allocated in iflib_msix_init(), i. e. isn't supplied by the
    driver, in the first place.
o In mp_ndesc_handler(), handle unknown type arguments gracefully, too.
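
The MSI-X -> MSI -> INTx fallback described for iflib_msix_init() above can
be pictured with a small user-space sketch. This is not the iflib(4) code:
the toy_* names, the struct and the vector-count rule are assumptions made
for illustration only. The point is that MSI-X is chosen only when every
queue plus the admin interrupts gets its own vector, and that MSI is not
announced unless it is actually available.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical interrupt types, in increasing order of preference. */
enum toy_intr { TOY_INTX, TOY_MSI, TOY_MSIX };

/* Hypothetical device description; not a real iflib(4) structure. */
struct toy_dev {
	int	msix_vectors;	/* MSI-X vectors that could be allocated */
	bool	msi_available;	/* whether a single MSI could be allocated */
};

/*
 * Pick MSI-X only if all queues plus the admin interrupts get a vector
 * of their own; otherwise try MSI, and only then fall back to INTx.
 */
static enum toy_intr
toy_pick_intr(const struct toy_dev *dev, int queues, int admin)
{
	assert(queues > 0 && admin >= 0);
	if (dev->msix_vectors >= queues + admin)
		return (TOY_MSIX);
	if (dev->msi_available)
		return (TOY_MSI);
	return (TOY_INTX);
}
```

With 4 queues and 1 admin interrupt, a device offering only 4 vectors ends
up on MSI (or INTx if MSI is unavailable) in this sketch rather than running
with an uneven vector distribution.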

PR: 235031 (likely) [1]
Reviewed by: shurd
Differential Revision: https://reviews.freebsd.org/D20175

5 years agoloader: bcache code does not need to check argument for free()
tsoome [Tue, 7 May 2019 08:14:30 +0000 (08:14 +0000)]
loader: bcache code does not need to check argument for free()

5 years agoloader: use safer DPRINTF body for non-debug case
tsoome [Tue, 7 May 2019 07:46:40 +0000 (07:46 +0000)]
loader: use safer DPRINTF body for non-debug case

5 years agoRemove wrong copyright line. Discussed with Carlos Neira.
dchagin [Tue, 7 May 2019 05:08:13 +0000 (05:08 +0000)]
Remove wrong copyright line. Discussed with Carlos Neira.

Reported by: Rodney W. Grimes
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D13656

5 years agoamd64: fix BUS_SPACE_MAXSIZE to 64bit max value.
kib [Tue, 7 May 2019 01:18:57 +0000 (01:18 +0000)]
amd64: fix BUS_SPACE_MAXSIZE to 64bit max value.

Reviewed by: jhb, tychon (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20154

5 years agoThe intention of the blist cursor is for the search for free blocks to
dougm [Mon, 6 May 2019 22:12:15 +0000 (22:12 +0000)]
The intention of the blist cursor is for the search for free blocks to
resume where the last search left off. Suppose that there are no free
blocks of size 32, but plenty of size 16. If we repeatedly request
size 32 blocks, fail, and retry with size 16 blocks, then the failures
all reset the cursor to the beginning of memory, making the 16 block
allocation use a first fit, rather than next fit, strategy.

This change has blist_alloc make a copy of the cursor for its own
decision making, and only updates the real blist cursor after a
successful allocation, making those 16 block searches behave like
next-fit searches.
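
A minimal sketch of the idea, assuming a flat array of blocks instead of the
real blist radix tree. All toy_* names are hypothetical and none of this is
the subr_blist.c code. The allocator searches with a private copy of the
cursor and writes the shared cursor back only after a successful allocation,
so a failed large request does not disturb later smaller ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	TOY_NBLOCKS	64
#define	TOY_NONE	((size_t)-1)

struct toy_blist {
	bool	used[TOY_NBLOCKS];
	size_t	cursor;		/* where the next search should resume */
};

/* Find `count` contiguous free blocks in [from, to); TOY_NONE if none. */
static size_t
toy_scan(const struct toy_blist *bl, size_t from, size_t to, size_t count)
{
	size_t run = 0;

	for (size_t i = from; i < to; i++) {
		run = bl->used[i] ? 0 : run + 1;
		if (run == count)
			return (i + 1 - count);
	}
	return (TOY_NONE);
}

static size_t
toy_alloc(struct toy_blist *bl, size_t count)
{
	size_t cursor, start;

	assert(count > 0 && count <= TOY_NBLOCKS);
	cursor = bl->cursor;		/* private copy for decision making */
	start = toy_scan(bl, cursor, TOY_NBLOCKS, count);
	if (start == TOY_NONE && cursor != 0) {
		/* Wrap around, but only move the private copy. */
		cursor = 0;
		start = toy_scan(bl, cursor, TOY_NBLOCKS, count);
	}
	if (start == TOY_NONE)
		return (TOY_NONE);	/* failure: bl->cursor is untouched */
	for (size_t i = start; i < start + count; i++)
		bl->used[i] = true;
	bl->cursor = start + count;	/* update only after success */
	return (start);
}
```

With free runs of 16 blocks separated by single used blocks, alternating
failed 32-block requests with 16-block requests hands out block 0, then 17,
then 34 in this sketch, instead of rescanning from the start each time.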

Approved by: markj (mentor)
Differential Revision: https://reviews.freebsd.org/D20177

5 years ago- Remove the unused ifc_link_irq and ifc_mtx_name members of struct iflib_ctx.
marius [Mon, 6 May 2019 20:56:41 +0000 (20:56 +0000)]
- Remove the unused ifc_link_irq and ifc_mtx_name members of struct iflib_ctx.
- Remove the only ever written to ift_db_mtx_name member of struct iflib_txq.
- Remove the unused or only ever written to ifr_size, ifr_cq_pidx, ifr_cq_gen
  and ifr_lro_enabled members of struct iflib_rxq.
- Consistently spell DMA, RX and TX uppercase in comments, messages etc.
  instead of mixing with some lowercase variants.
- Consistently use if_t instead of a mix of if_t and struct ifnet pointers.
- Bring the function comments of _iflib_fl_refill(), iflib_rx_sds_free() and
  iflib_fl_setup() in line with reality.
- Judging by problem reports, people are wondering what on earth messages like:
  "TX(0) desc avail = 1024, pidx = 0"
  are trying to indicate. Thus, extend this string to be more like that of
  non-iflib(4) Ethernet MAC drivers, notifying about a watchdog timeout due
  to which the interface will be reset.
- Take advantage of the M_HAS_VLANTAG macro.
- Use false/true rather than FALSE/TRUE for variables of type bool.
- Use FALLTHROUGH as advocated by style(9).

5 years agoImport libxo-1.0.4:
phil [Mon, 6 May 2019 20:20:21 +0000 (20:20 +0000)]
Import libxo-1.0.4:
- Avoid NULL deref in xo_xml_leader_len (replacing local fix in rS345967)
- update copyright dates
- update test cases
- fix uncommitted version change

Submitted by: phil
MFC after: 2 weeks

5 years agoAdds sys/class/net devices to linsysfs.
dchagin [Mon, 6 May 2019 20:01:13 +0000 (20:01 +0000)]
Adds sys/class/net devices to linsysfs.

Only two interfaces are created, eth0 and lo, and they expose
the following properties:
address, addr_len, flags, ifindex, mtu, tx_queue_len and type.

Initial patch developed by Carlos Neira in 2017 and finished by me.

PR: 223722
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D13656

5 years agoRewrite linux_ifflags() in more readable Linuxulator style.
dchagin [Mon, 6 May 2019 19:57:51 +0000 (19:57 +0000)]
Rewrite linux_ifflags() in more readable Linuxulator style.

Reviewed by: emaste
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20146

5 years agoComplete r347052 (https://reviews.freebsd.org/D20137) as it was not
dchagin [Mon, 6 May 2019 19:56:13 +0000 (19:56 +0000)]
Complete r347052 (https://reviews.freebsd.org/D20137) as it was not
a final revision.

Fix style issues and change bool-like variables from int to bool.

Reviewed by: emaste
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20141

5 years agoSimplify boot1 allocation of handles.
imp [Mon, 6 May 2019 19:35:30 +0000 (19:35 +0000)]
Simplify boot1 allocation of handles.

There's no need to pre-malloc the number of handles. Instead call
LocateHandles twice, once to get the size, and once to get the
data.
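
The "call twice" pattern can be sketched in plain C as follows. The
toy_locate() function is a hypothetical stand-in for the firmware call and
does not reproduce its real signature or status codes. It only mimics the
behaviour relied on here: a too-small buffer yields an error plus the
required size.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pretend firmware state: three handles, represented as plain ints. */
static const int toy_handles[] = { 11, 22, 33 };

/*
 * Hypothetical stand-in for the firmware query.  Returns 0 and fills buf
 * when *size is large enough; otherwise returns -1.  Either way, *size is
 * set to the number of bytes required.
 */
static int
toy_locate(size_t *size, int *buf)
{
	if (buf == NULL || *size < sizeof(toy_handles)) {
		*size = sizeof(toy_handles);
		return (-1);
	}
	*size = sizeof(toy_handles);
	memcpy(buf, toy_handles, sizeof(toy_handles));
	return (0);
}

/* First call learns the size, second call fetches the data. */
static int *
toy_get_handles(size_t *count)
{
	size_t size = 0;
	int *buf;

	assert(count != NULL);
	(void)toy_locate(&size, NULL);		/* size query only */
	if (size == 0)
		return (NULL);
	buf = malloc(size);
	if (buf == NULL)
		return (NULL);
	if (toy_locate(&size, buf) != 0) {
		free(buf);
		return (NULL);
	}
	*count = size / sizeof(buf[0]);
	return (buf);
}
```

The caller owns the returned buffer and releases it with free(); no guess
about the number of handles is needed up front.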

5 years agoDrop periph lock around cam_periph_unmapmem().
mav [Mon, 6 May 2019 19:08:03 +0000 (19:08 +0000)]
Drop periph lock around cam_periph_unmapmem().

Since r345656 it may call copyout(), which may sleep.

MFC after: 3 days
Sponsored by: iXsystems, Inc.

5 years agoThe build process generates assym.inc from genassym.o, so don't forget
dchagin [Mon, 6 May 2019 18:46:42 +0000 (18:46 +0000)]
The build process generates assym.inc from genassym.o, so don't forget
to clean genassym.o

MFC after: 2 weeks

5 years agoAbstract out efi_devpath_to_handle to search for a handle that matches
imp [Mon, 6 May 2019 18:39:27 +0000 (18:39 +0000)]
Abstract out efi_devpath_to_handle to search for a handle that matches
the desired devpath.

5 years agoWe only ever need one devinfo per handle. So allocate it outside of
imp [Mon, 6 May 2019 18:39:22 +0000 (18:39 +0000)]
We only ever need one devinfo per handle. So allocate it outside of
looping over the filesystem modules rather than doing a malloc + free
each time through the loop. In addition, nothing changes from loop to
loop, so set up the new devinfo outside the loop as well.

5 years agoReach over and pull in devpath.c from libefi
imp [Mon, 6 May 2019 18:38:46 +0000 (18:38 +0000)]
Reach over and pull in devpath.c from libefi

This allows us to remove three nearly identical functions because the
differences don't matter, and the size difference is trivial.

5 years agoList-ify kernel dump device configuration
cem [Mon, 6 May 2019 18:24:07 +0000 (18:24 +0000)]
List-ify kernel dump device configuration

Allow users to specify multiple dump configurations in a prioritized list.
This enables fallback to secondary device(s) if primary dump fails.  E.g.,
one might configure a preference for netdump, but fall back to disk dump as a
second choice if netdump is unavailable.
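
As a rough illustration of a prioritized list with fallback, the sketch
below walks an ordered array and stops at the first entry whose dump
routine succeeds. The toy_* types and functions are hypothetical. The
kernel's actual dumper structures and dumpon(8) interface are not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dump configuration entry, highest priority first. */
struct toy_dumper {
	const char	*name;
	int		(*dump)(void);	/* returns 0 on success */
};

/* Pretend the network is unreachable, so netdump fails. */
static int
toy_netdump(void)
{
	return (-1);
}

static int
toy_diskdump(void)
{
	return (0);
}

/* Try each configuration in order; return the one that worked, or NULL. */
static const struct toy_dumper *
toy_dump_first_working(const struct toy_dumper *list, size_t n)
{
	assert(list != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (list[i].dump() == 0)
			return (&list[i]);
	}
	return (NULL);
}
```

With the list { netdump, disk } and a failing toy_netdump(), the disk entry
is the one returned.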

This change does not list-ify netdump configuration, which is tracked
separately from ordinary disk dumps internally; only one netdump
configuration can be made at a time, for now.  It also does not implement
IPv6 netdump.

savecore(8) is already capable of scanning and iterating multiple devices
from /etc/fstab or passed on the command line.

This change doesn't update the rc or loader variables 'dumpdev' in any way;
it can still be set to configure a single dump device, and rc.d/savecore
still uses it as a single device.  Only dumpon(8) is updated to be able to
configure the more complicated configurations for now.

As part of revving the ABI, unify netdump and disk dump configuration ioctl
/ structure, and leave room for ipv6 netdump as a future possibility.
Backwards-compatibility ioctls are added to smooth ABI transition,
especially for developers who may not keep kernel and userspace perfectly
synced.

Reviewed by: markj, scottl (earlier version)
Relnotes: maybe
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D19996

5 years agoUse PCIV_INVALID in pci_channel_offline() in the LinuxKPI.
hselasky [Mon, 6 May 2019 16:22:45 +0000 (16:22 +0000)]
Use PCIV_INVALID in pci_channel_offline() in the LinuxKPI.

Build tested drm-current-kmod prior to commit.

MFC after: 1 week
Submitted by: slavash@
Sponsored by: Mellanox Technologies

5 years agoDisabling a PCI device should only disable busmaster in the LinuxKPI.
hselasky [Mon, 6 May 2019 16:17:38 +0000 (16:17 +0000)]
Disabling a PCI device should only disable busmaster in the LinuxKPI.

As the Linux comment for this function points out:
Signal to the system that the PCI device is not in use by the system
anymore. This only involves disabling PCI bus-mastering, if active.

Build tested drm-current-kmod prior to commit.

MFC after: 1 week
Submitted by: slavash@
Sponsored by: Mellanox Technologies

5 years agoImplement print_hex_dump_debug() function macro in the LinuxKPI.
hselasky [Mon, 6 May 2019 16:10:26 +0000 (16:10 +0000)]
Implement print_hex_dump_debug() function macro in the LinuxKPI.

Build tested drm-current-kmod prior to commit.

MFC after: 1 week
Submitted by: slavash@
Sponsored by: Mellanox Technologies

5 years agoReformat arm64 linux syscalls.master per current style
emaste [Mon, 6 May 2019 16:07:14 +0000 (16:07 +0000)]
Reformat arm64 linux syscalls.master per current style

Equivalent to r339958 for sys/kern/syscalls.master.

MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D14858

5 years agoAllow controlling pr_debug at runtime in the LinuxKPI.
hselasky [Mon, 6 May 2019 16:00:20 +0000 (16:00 +0000)]
Allow controlling pr_debug at runtime in the LinuxKPI.

Turning on pr_debug at compile time makes it non-optional at runtime.
This often means that the amount of debugging output is unbearable.
Allow the developer to turn on pr_debug output only when needed.
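
One way to picture a runtime switch is a debug macro that consults a flag
before printing. The flag and macro names below are hypothetical and are
not the LinuxKPI identifiers. In the kernel such a flag would typically be
exposed as a tunable or sysctl, which is omitted here.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical runtime switch; off by default. */
static bool toy_debug_enabled = false;

/* Counts emitted messages so the effect of the switch is observable. */
static int toy_debug_lines;

/* Print only when the switch is on; otherwise cost is one branch. */
#define	toy_pr_debug(...) do {					\
	if (toy_debug_enabled) {				\
		toy_debug_lines++;				\
		fprintf(stderr, __VA_ARGS__);			\
	}							\
} while (0)
```

The message is compiled in but stays silent until toy_debug_enabled is set,
so debugging output can be turned on for just the interval of interest.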

Build tested drm-current-kmod prior to commit.

MFC after: 1 week
Submitted by: kib@
Sponsored by: Mellanox Technologies

5 years agogeom: fix initialization order
royger [Mon, 6 May 2019 09:48:34 +0000 (09:48 +0000)]
geom: fix initialization order

There's a race between the initialization of devsoftc.mtx (by devinit)
and the creation of the geom worker thread g_run_events, which calls
devctl_queue_data_f. Both of those are initialized at SI_SUB_DRIVERS
and SI_ORDER_FIRST, which means the geom worker thread can be created
before the mutex has been initialized, leading to the panic below:

 panic: mtx_lock() of spin mutex (null) @ /usr/home/osstest/build.135317.build-amd64-freebsd/freebsd/sys/kern/subr_bus.c:620
 cpuid = 3
 time = 1
 KDB: stack backtrace:
 db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe003b968710
 vpanic() at vpanic+0x19d/frame 0xfffffe003b968760
 panic() at panic+0x43/frame 0xfffffe003b9687c0
 __mtx_lock_flags() at __mtx_lock_flags+0x145/frame 0xfffffe003b968810
 devctl_queue_data_f() at devctl_queue_data_f+0x6a/frame 0xfffffe003b968840
 g_dev_taste() at g_dev_taste+0x463/frame 0xfffffe003b968a00
 g_load_class() at g_load_class+0x1bc/frame 0xfffffe003b968a30
 g_run_events() at g_run_events+0x197/frame 0xfffffe003b968a70
 fork_exit() at fork_exit+0x84/frame 0xfffffe003b968ab0
 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe003b968ab0
 --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
 KDB: enter: panic
 [ thread pid 13 tid 100029 ]
 Stopped at      kdb_enter+0x3b: movq    $0,kdb_why

Fix this by initializing geom at SI_ORDER_SECOND instead of
SI_ORDER_FIRST.
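
The ordering argument can be modelled in user space: initializers are sorted
by (subsystem, order) and run in that sequence, so two entries with the same
pair have no guaranteed relative order, while FIRST always precedes SECOND.
The toy_* names and numeric values below are illustrative assumptions, not
the kernel's SYSINIT machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Illustrative values only; the kernel's constants differ. */
enum { TOY_SUB_DRIVERS = 1 };
enum { TOY_ORDER_FIRST = 0, TOY_ORDER_SECOND = 1 };

struct toy_sysinit {
	int	subsystem;
	int	order;
	void	(*func)(void);
};

static bool toy_mtx_ready;	/* set once the "mutex" is initialized */
static bool toy_geom_saw_mtx;	/* what the geom initializer observed */

static void
toy_devinit(void)
{
	toy_mtx_ready = true;
}

static void
toy_geom_init(void)
{
	toy_geom_saw_mtx = toy_mtx_ready;
}

/* Sort key: subsystem first, then order within the subsystem. */
static int
toy_sysinit_cmp(const void *a, const void *b)
{
	const struct toy_sysinit *x = a, *y = b;

	if (x->subsystem != y->subsystem)
		return (x->subsystem < y->subsystem ? -1 : 1);
	if (x->order != y->order)
		return (x->order < y->order ? -1 : 1);
	return (0);	/* equal keys: relative order is unspecified */
}

static void
toy_run_sysinits(struct toy_sysinit *v, size_t n)
{
	qsort(v, n, sizeof(*v), toy_sysinit_cmp);
	for (size_t i = 0; i < n; i++) {
		assert(v[i].func != NULL);
		v[i].func();
	}
}
```

Registering toy_geom_init at TOY_ORDER_SECOND guarantees that it runs after
toy_devinit regardless of the order in which the entries were listed.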

Sponsored by: Citrix Systems R&D
Reviewed by: kevans, markj
Differential revision: https://reviews.freebsd.org/D20148

5 years agoDo not flush NFS node from NFS VOP_SET_TEXT().
kib [Mon, 6 May 2019 08:49:43 +0000 (08:49 +0000)]
Do not flush NFS node from NFS VOP_SET_TEXT().

The more appropriate place to do the flushing is VOP_OPEN().  This was
uncovered because VOP_SET_TEXT() is now called with the vnode's
vm_object rlocked, which is incompatible with the flush operations.

After the move, there is no need for NFS-specific VOP_SET_TEXT
overload.

Sponsored by: The FreeBSD Foundation
MFC after: 30 days

5 years agoNoted by: alc
kib [Mon, 6 May 2019 08:46:11 +0000 (08:46 +0000)]
Noted by: alc
Reviewed by: alc, markj (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 6 days

5 years agoAdd ipsec.ko to required_modules for rc.d/ipsec script.
ae [Mon, 6 May 2019 08:30:53 +0000 (08:30 +0000)]
Add ipsec.ko to required_modules for rc.d/ipsec script.

Thus it can be automatically loaded if ipsec_enable="YES" and option IPSEC
is not in the kernel config.

MFC after: 1 week

5 years agozero inputs to vm_page_initfake() for predictable results
tychon [Mon, 6 May 2019 00:57:05 +0000 (00:57 +0000)]
zero inputs to vm_page_initfake() for predictable results

Reviewed by: kib
Submitted by: Anton Rang <rang at acm.org>
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D20162

5 years agopowerpc/booke: Use #ifdef __powerpc64__ instead of hw_direct_map in places
jhibbits [Sun, 5 May 2019 20:23:43 +0000 (20:23 +0000)]
powerpc/booke: Use #ifdef __powerpc64__ instead of hw_direct_map in places

Since the DMAP is only available on powerpc64, and is *always* available on
Book-E powerpc64, don't penalize either side (32-bit or 64-bit) by always
checking hw_direct_map to perform operations.  This saves 5-10% time on
various ports builds, and on buildworld+buildkernel on Book-E hardware.

MFC after: 3 weeks

5 years agopowerpc/booke: Fix size check for phys_avail in pmap bootstrap
jhibbits [Sun, 5 May 2019 20:05:50 +0000 (20:05 +0000)]
powerpc/booke: Fix size check for phys_avail in pmap bootstrap

Use the nitems() macro instead of the expansion, a'la r298352.  Also, fix the
location of this check to after initializing availmem_regions_sz, so that the
check isn't always against 0, thus always failing (nitems(phys_avail) is always
more than 0).
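
For reference, a sketch of an element-count macro in the style of nitems()
and a capacity check built on it. The array, its size and the function name
are hypothetical. The real phys_avail layout and the bootstrap code are not
reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Element count of a true array (not a pointer), like nitems(). */
#define	toy_nitems(x)	(sizeof(x) / sizeof((x)[0]))

/* Hypothetical fixed-size table standing in for phys_avail. */
static int toy_phys_avail[8];

static_assert(toy_nitems(toy_phys_avail) == 8, "table has 8 slots");

/*
 * The check is only meaningful once regions_sz holds its final value;
 * evaluated while the count is still 0 it says nothing about overflow.
 */
static bool
toy_regions_fit(size_t regions_sz)
{
	return (regions_sz <= toy_nitems(toy_phys_avail));
}
```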

5 years agoDecode some more ATA commands found in ACS-4.
mav [Sun, 5 May 2019 17:10:12 +0000 (17:10 +0000)]
Decode some more ATA commands found in ACS-4.

MFC after: 1 week

5 years agoEnsure that error is initialized in ufs_bmap_seekdata().
markj [Sun, 5 May 2019 16:57:03 +0000 (16:57 +0000)]
Ensure that error is initialized in ufs_bmap_seekdata().

Reported and tested by: jhibbits
MFC with: r346932
Sponsored by: The FreeBSD Foundation

5 years agoDecode Deallocate Logical Block Features.
mav [Sun, 5 May 2019 15:47:21 +0000 (15:47 +0000)]
Decode Deallocate Logical Block Features.

MFC after: 1 week

5 years agoSwitch to use shared vnode locks for text files during image activation.
kib [Sun, 5 May 2019 11:20:43 +0000 (11:20 +0000)]
Switch to use shared vnode locks for text files during image activation.

kern_execve() locks text vnode exclusive to be able to set and clear
VV_TEXT flag. VV_TEXT is mutually exclusive with the v_writecount > 0
condition.

The change removes VV_TEXT, replacing it with the condition
v_writecount <= -1, and puts v_writecount under the vnode interlock.
Each text reference decrements v_writecount.  To clear the text
reference when the segment is unmapped, it is recorded in the
vm_map_entry backed by the text file as MAP_ENTRY_VN_TEXT flag, and
v_writecount is incremented on the map entry removal.
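
The signed-counter scheme can be sketched as below: positive values count
writers, negative values count text references, and each operation refuses
to cross zero in the wrong direction. This is a single-threaded toy with
hypothetical toy_* names. The vnode interlock is omitted, and returning
ETXTBSY for both conflicts is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>

/* > 0: open for writing; < 0: mapped as text; 0: neither. */
struct toy_vnode {
	int	writecount;
};

/* Take a text reference; fails while any writer exists. */
static int
toy_set_text(struct toy_vnode *vp)
{
	if (vp->writecount > 0)
		return (ETXTBSY);
	vp->writecount--;
	return (0);
}

/* Drop a text reference, e.g. when the map entry is removed. */
static void
toy_unset_text(struct toy_vnode *vp)
{
	assert(vp->writecount < 0);
	vp->writecount++;
}

/* Take a write reference; fails while the file is in use as text. */
static int
toy_add_writer(struct toy_vnode *vp)
{
	if (vp->writecount < 0)
		return (ETXTBSY);
	vp->writecount++;
	return (0);
}

static void
toy_drop_writer(struct toy_vnode *vp)
{
	assert(vp->writecount > 0);
	vp->writecount--;
}
```

Because a single integer encodes both states, no separate flag needs to be
set or cleared in this sketch; the check and the update happen together.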

The operations like VOP_ADD_WRITECOUNT() and VOP_SET_TEXT() check that
v_writecount does not contradict the desired change.  vn_writecheck()
is now racy and its use was eliminated everywhere except access.
Atomic check for writeability and increment of v_writecount is
performed by the VOP.  vn_truncate() now increments v_writecount
around VOP_SETATTR() call, lack of which is arguably a bug on its own.

nullfs bypasses v_writecount to the lower vnode always, so nullfs
vnode has its own v_writecount correct, and lower vnode gets all
references, since object->handle is always lower vnode.

On the text vnode's vm object dealloc, the v_writecount value is reset
to zero, and deadfs vop_unset_text short-circuits the operation.
Reclamation of lowervp always reclaims all nullfs vnodes referencing
lowervp first, so no stray references are left.

Reviewed by: markj, trasz
Tested by: mjg, pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D19923