FreeBSD/FreeBSD.git
hselasky [Wed, 8 May 2019 11:05:30 +0000 (11:05 +0000)]
Ensure that only one command is specified at a time in mlx5tool(8).

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 11:05:09 +0000 (11:05 +0000)]
Implement firmware reset from userspace in mlx5tool(8).

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 11:04:40 +0000 (11:04 +0000)]
Add Firmware Reset Level, MFRL, register accessors in mlx5core.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 11:04:09 +0000 (11:04 +0000)]
Add ConnectX-6 DX HCA ID to libmlx5.

In addition, add "ConnectX family mlx5Gen Virtual Function" device ID.
Every new HCA VF will be identified with this device ID.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 11:03:29 +0000 (11:03 +0000)]
Expose per-lane counters before correction mechanism in mlx5en(4).

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 11:02:36 +0000 (11:02 +0000)]
Add support for extended PCIe counters in mlx5en(4).

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:59:16 +0000 (10:59 +0000)]
Extend the counters framework in mlx5en(4).

Allow more macro arguments and split the variable type and name into
separate arguments. This allows simple and powerful copy and extraction
of values from IFC based structures into SYSCTLs with the use of a single
macro.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:58:41 +0000 (10:58 +0000)]
Update performance counter bits in mlx5core.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:58:06 +0000 (10:58 +0000)]
Implement reading PCI power status in mlx5core.

Implement a watchdog as part of the health care subsystem which
reads the PCI power status during startup and upon the PCI
power status change event, and stores it in the core device
structure. This value is then exported to user-space via a
read-only SYSCTL. A dmesg print has been added to inform
the admin about the PCI power status.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:57:37 +0000 (10:57 +0000)]
Move workqueue from mlx5en(4) to mlx5core.

This avoids creating more workqueues in mlx5core to do
simple firmware command polling tasks.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:57:16 +0000 (10:57 +0000)]
Always return success for RoCE modify port in mlx5ib.

CM layer calls ib_modify_port() regardless of the link layer.

For the Ethernet ports, qkey violation and Port capabilities
are meaningless. Therefore, always return success for ib_modify_port
calls on the Ethernet ports.

Linux Commit:
ec2558796d25e6024071b6bcb8e11392538d57bf

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:56:51 +0000 (10:56 +0000)]
Add support for new rates to mlx5ib.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:56:22 +0000 (10:56 +0000)]
Add support for 200Gbit speeds to libibverbs.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:55:47 +0000 (10:55 +0000)]
Add new rates to ibcore.

Add the new rates that were added to the InfiniBand specification as part of
HDR and 2x support.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:55:15 +0000 (10:55 +0000)]
Do not add IFM_10G_LR and IFM_40G_ER4 to supported media types by default in
mlx5en(4).

IFM_10G_LR and IFM_40G_ER4 media should be added only if the device
has the needed capability bit set for it.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:54:54 +0000 (10:54 +0000)]
Add support for 200Gb ethernet speeds to mlx5core.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:54:24 +0000 (10:54 +0000)]
Remove unused speed enums in mlx5core.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:54:05 +0000 (10:54 +0000)]
Control automatic update of firmware on driver load with a tunable in mlx5core.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:53:47 +0000 (10:53 +0000)]
Correct check for the calibration generation in mlx5en(4).

If generation is cleared due to hardware clock failure, check for it before
the divisor is used.  Actually clear generation when failure occurs.

While there, stop doing the calculations inside the generation loop.  Since
all members of mlx5e_clbr_point are used for calculations, get the
local copy of the structure and use it after generation stabilized.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies
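
The generation check described in this commit resembles a seqlock-style reader: copy the shared record, confirm the generation did not change during the copy, and only then compute. The following is a minimal Python sketch of that idea, not the driver's code. The `ClbrPoint` class, its field names, and the convention that a generation of 0 means "calibration failed" are assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class ClbrPoint:
    # Hypothetical stand-in for mlx5e_clbr_point; field names are assumed.
    gen: int        # 0 means the hardware clock failed and the point is invalid
    base_prev: int  # system time at the previous calibration
    base_curr: int  # system time at the current calibration
    hw_prev: int    # hardware clock at the previous calibration
    hw_curr: int    # hardware clock at the current calibration


def hw_to_system_time(shared: ClbrPoint, hw_tstmp: int) -> Optional[int]:
    """Convert a hardware timestamp using a stable snapshot of `shared`."""
    while True:
        gen = shared.gen
        if gen == 0:
            # Generation was cleared: bail out before any division happens.
            return None
        snap = replace(shared)   # local copy of every member
        if shared.gen == gen:    # nobody updated the point while we copied
            break
    # All arithmetic happens outside the loop, on the local copy only.
    divisor = snap.hw_curr - snap.hw_prev
    if divisor == 0:
        return None
    span = snap.base_curr - snap.base_prev
    return snap.base_prev + (hw_tstmp - snap.hw_prev) * span // divisor
```

Keeping the loop body down to "copy and re-check" means a concurrent update can only cost a retry, never a calculation on half-updated members.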

hselasky [Wed, 8 May 2019 10:53:25 +0000 (10:53 +0000)]
Let rx_out_of_buffer be a 32-bit counter in mlx5en(4).

This fixes counting issues when the firmware resets the counter during
allocation of the counter set where the counter belongs.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:53:01 +0000 (10:53 +0000)]
Add vnic steering drop statistics in mlx5en(4).

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:52:32 +0000 (10:52 +0000)]
Use software counters for rx_packets and rx_bytes in mlx5en(4).

The physical- and virtual- port counters might not reflect the amount
of data received after address filtering. Use the software counters
instead for rx_packets and rx_bytes to know exactly how much data
was received.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:52:11 +0000 (10:52 +0000)]
Add mlx5_firmware_update() in mlx5core.
Add support for upgrading firmware on mlx5 module load.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:51:49 +0000 (10:51 +0000)]
Handle IB_EVENT_DEVICE_FATAL event in ipoib.
Perform flush if IB_EVENT_DEVICE_FATAL was received.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:51:29 +0000 (10:51 +0000)]
Fix for double bus master disable in mlx5core.

mlx5_pci_disable_device is calling pci_disable_device which disables
bus master. No need to explicitly call pci_clear_master.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:51:07 +0000 (10:51 +0000)]
Document userspace firmware flash in mlx5tool(8) and mlx5io(4).

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:50:35 +0000 (10:50 +0000)]
Implement userspace firmware update for ConnectX-4/5/6.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:50:08 +0000 (10:50 +0000)]
Rename mlx5_fwdump_addr to more neutral mlx5_tool_addr in mlx5core.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:49:36 +0000 (10:49 +0000)]
Add mlxfw callbacks in mlx5core.

Add mlx5 implementation for the ones defined by the mlxfw
shared module to be used while flashing the device firmware.

The callbacks do their job through the MCQI, MCC and MCDA registers.

Linux commit:
62bd22cf326dc4ac5be673c11cef4602dc1f5e47

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:49:05 +0000 (10:49 +0000)]
Initial version of Mellanox in-kernel firmware upgrade support.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:44:53 +0000 (10:44 +0000)]
Convert remaining module parameters into SYSCTLs in mlx5core.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:44:27 +0000 (10:44 +0000)]
Remove redundant line of code in mlx5core.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:44:02 +0000 (10:44 +0000)]
Change implicit and probably erroneous EPERM to EIO on command status error
in mlx5core.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:43:35 +0000 (10:43 +0000)]
Fix typo.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:42:51 +0000 (10:42 +0000)]
Fix style.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:42:33 +0000 (10:42 +0000)]
Fix netstat counters mapping in mlx5en(4).

The current mapping of driver counters to netstat counters is wrong.
For example, a single jabber packet will cause the Ierrs counter to
count three times.

The work of mapping the hardware and software counters to their right
place in the netstat counters was already done in Linux; take that as is
for the FreeBSD driver.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:42:05 +0000 (10:42 +0000)]
Fix endless loop in ipoib_poll().

ib_req_notify_cq may return a negative value, which indicates a
failure. In the case of an uncorrectable error, we end up in an
endless loop. Fix that by going around the loop again via poll_more
only if there is anything left to poll.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies
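
The loop-termination rule from this commit can be illustrated with a small Python sketch. It is not the ipoib code: `poll_batch` and `req_notify` are hypothetical callables, and the return contract of `req_notify` (positive when completions were missed, 0 when the queue was armed, negative on failure) is an assumption modelled on the description above.

```python
from typing import Callable, List


def poll_cq(poll_batch: Callable[[int], List[int]],
            req_notify: Callable[[], int],
            budget: int = 16) -> int:
    """Drain a completion queue; return the number of completions handled.

    poll_batch(n) returns up to n completions (hypothetical helper).
    req_notify() re-arms the queue and returns >0 if completions were
    missed, 0 on success, or a negative value on failure (assumed contract).
    """
    handled = 0
    while True:
        batch = poll_batch(budget)
        handled += len(batch)
        if len(batch) == budget:
            continue  # the queue may still hold more, keep polling
        rc = req_notify()
        if rc > 0:
            continue  # missed completions: poll again
        # rc == 0 (armed) or rc < 0 (error): stop. Treating any non-zero
        # value as "poll again" would spin forever on a persistent error.
        return handled
```

The important detail is the `rc > 0` test: a plain truthiness check would also send a negative error code back around the loop.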

hselasky [Wed, 8 May 2019 10:41:44 +0000 (10:41 +0000)]
Avoid leaking send queue mbufs during error recovery in mlx5en(4).

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:41:21 +0000 (10:41 +0000)]
Add helper functions to set/query MCC/MCDA/MCQI registers in mlx5core.

To be used by the mlx5 callbacks exposed to the mlxfw module.

Linux commit:
d2ad488b0073bd1a2c3f5d2ea50a7eb632103e5d

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:41:00 +0000 (10:41 +0000)]
Enhance MCAM reg to allow query on access reg support in mlx5core.

Enhance MCAM to allow the driver to query which access regs are
supported. For now, expose the regs needed for FW flashing.

Linux commit:
0ab87743cc8c5bcd482daf71961ed5fc45349e01

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:40:41 +0000 (10:40 +0000)]
Add MCC (Management Component Control) register definitions in mlx5core.

MCC (Management Component Control) allows controlling a firmware
component update.

MCDA (Management Component Data Access) allows reading and writing
a firmware component.

MCQI (Management Component Query Information) allows querying
information about firmware components.

Linux commit:
4717628938423fcba0aa8fa889e9fed4eb6a655f

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:40:13 +0000 (10:40 +0000)]
Add reading the mcam_reg in mlx5core.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:39:53 +0000 (10:39 +0000)]
Query and cache PCAM, MCAM registers on initialization in mlx5core.

On load_one, we now cache our capabilities registers internally, similar
to QUERY_HCA_CAP. Capabilities can later be queried using macros
introduced in this patch.

Linux commit:
71862561f3a62015a11de16d1c306481e8415c08

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:39:25 +0000 (10:39 +0000)]
Implement PCAM, MCAM access register commands in mlx5core.

The introduced registers expose capabilities of new registers and
features related to port/management.
The driver queries MCAM and PCAM in order to avoid failing on old
firmware that lacks support.

Linux commit:
c835ad64683bd3e2d1b31ed2cb1ff4366932edb1

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:39:01 +0000 (10:39 +0000)]
Expose PCAM, MCAM registers infrastructure in mlx5core.

PCAM: Ports capabilities mask register.
MCAM: Management capabilities mask register.

PCAM and MCAM registers will provide information regarding firmware
support for different features, in order to avoid cases where a new driver
combined with old firmware results in syndromes (for example, PCIe counters
before this patchset).

Linux commit:
cfdcbceaeffc669b70d904d80a2df9c86c232566

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:38:31 +0000 (10:38 +0000)]
Add sysctl(8) to control fast unload support in mlx5core.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:38:06 +0000 (10:38 +0000)]
Add Fast teardown support to mlx5core.

Today mlx5 devices support two teardown modes:
1- Regular teardown
2- Force teardown

This change introduces the enhanced version of the "Force teardown" that
allows SW to perform teardown in a faster way without the need to reclaim
all the pages.

Fast teardown provides the following advantages:
1- Fix a FW race condition that could cause command timeout
2- Avoid moving to polling mode
3- Close the vport to prevent a PCI ACK from being sent without being
   scattered to memory

Linux commit:
fcd29ad17c6ff885dfae58f557e9323941e63ba2

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:37:31 +0000 (10:37 +0000)]
Make sure the running variable is properly set for ratelimited SQs in mlx5en(4).

Otherwise the SQs won't be properly released when closing rate-limited
connections, leading to wrong state transitions on the SQ.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:37:03 +0000 (10:37 +0000)]
Implement get and set nic state as global functions in mlx5core.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:36:32 +0000 (10:36 +0000)]
Ticks are integer type in FreeBSD.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:35:55 +0000 (10:35 +0000)]
Configure firmware to use RX hash format in mini CQE in mlx5en(4).

When using CQE zipping, one can choose between RX hash and Checksum.
This will indicate the parameter on which a zipping session should be
stopped.

While porting the Linux code, Checksum was chosen. However, the value
of Checksum is not being used anywhere.
For the FreeBSD driver, we prefer to use the RX hash format which will
guarantee the RX hash value for all the mini CQEs.
While at it, make sure to initialize the Checksum value in the
decompressed CQE.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:35:35 +0000 (10:35 +0000)]
Disable CQE zipping by default in mlx5en(4).

After doing performance measurements, it seems like CQE zipping doesn't
have any significant benefit.
Moreover, we know that this feature is disabled by default on other
operating systems (Linux for example).

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:35:14 +0000 (10:35 +0000)]
Split mlx5e_update_stats_work() in mlx5en(4).

Split the function into the mlx5e_update_stats_locked() core and make
mlx5e_update_stats_work() call the _locked helper, similar to many other
places in the kernel. This improves the code structure, making the
locking clean.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies
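
The `_locked` helper convention mentioned here can be sketched in a few lines of Python. This is an illustration of the general pattern only; the `Stats` class and its members are hypothetical and do not mirror the driver's structures.

```python
import threading


class Stats:
    def __init__(self) -> None:
        self.lock = threading.Lock()
        self.updates = 0

    def update_stats_locked(self) -> None:
        """Core routine; the caller must already hold self.lock."""
        assert self.lock.locked(), "caller must hold the lock"
        self.updates += 1

    def update_stats_work(self) -> None:
        """Work-queue entry point: take the lock, then call the core."""
        with self.lock:
            self.update_stats_locked()
```

Callers that already hold the lock use the `_locked` variant directly, so the lock is acquired and released in exactly one place.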

hselasky [Wed, 8 May 2019 10:34:42 +0000 (10:34 +0000)]
Implement fast close of RX channel in mlx5en(4).

Instead of waiting for all jobs to be cancelled, simply close the completion
queue to prevent more completion events and let mlx5e_destroy_rq() clean up
the remaining mbufs.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:34:14 +0000 (10:34 +0000)]
Correct number of elements for priority to traffic class mappings in mlx5en(4).

The number of priorities is always 8, while the number of traffic classes
supported can vary. While at it convert the sysctl node into an array.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:33:29 +0000 (10:33 +0000)]
Remove unused module parameter in mlx5ib.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:33:09 +0000 (10:33 +0000)]
Make sure to error out when arming the CQ fails in mlx4ib and mlx5ib.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:32:45 +0000 (10:32 +0000)]
Make sure to error out when arming the CQ fails in ibcore.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:32:22 +0000 (10:32 +0000)]
Destroy port stats debug context in correct order in mlx5en(4).
Destroy children nodes before parent nodes.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:32:03 +0000 (10:32 +0000)]
Fix tx_jumbo_packets counter in mlx5en(4).

Instead of reading the Ethernet RFC 2819 pXtoYoctets counters from
hardware, which count RX octets, count tx_stat_pXtoYoctets from the
Ethernet extended counters, which count TX octets.

TX jumbo counters should be accumulated only after the PPCNT
counters were fetched from hardware with their latest value.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:31:32 +0000 (10:31 +0000)]
Update Ethernet extended counters in mlx5en(4).

Expose all Ethernet extended counters via the debug_stats
sysctl:
dev.mce.X.debug_stats

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:30:47 +0000 (10:30 +0000)]
Protect from infinite sw-reset loop in mlx5core.

Avoid an infinite software firmware reset loop that may be caused by a
hardware bug by limiting the maximum number of resets.
The counter between resets is reset by request for reset, and not by a
successful reset.
The interval between two resets can be configured via sysctl:
hw.mlx5.sw_reset_timeout
which is global to all mlx5 devices in the system.

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:30:18 +0000 (10:30 +0000)]
Disable all MSIX interrupts before shutdown in mlx5.

Make sure the interrupt handlers don't race with the fast unload one
code in the shutdown handler.

MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:29:45 +0000 (10:29 +0000)]
Import Linux code to implement mlx5_ib_disassociate_ucontext() in mlx5ib.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:28:18 +0000 (10:28 +0000)]
Add temperature warning event to log in mlx5core.

Temperature warning event is sent by FW to indicate high temperature
as detected by one of the sensors on the board.
Add handling of this event by writing the numbers of the alert sensors
to the kernel log.

Linux commit:
1865ea9adbfaf341c5cd5d8f7d384f19948b2fe9

Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:27:29 +0000 (10:27 +0000)]
Correctly define the interface state bits in mlx5en(4).

While at it remove unused interface state bits. This also fixes an issue
during shutdown:

There is an issue where the firmware fails during mlx5_load_one,
the health_care timer detects the issue and schedules a health_care call.
Then the mlx5_load_one detects the issue, cleans up and quits. Then
the health_care starts and calls mlx5_unload_one to clean up the resources
that no longer exist, which causes a kernel panic.

The root cause is that the bit MLX5_INTERFACE_STATE_DOWN is not set
after mlx5_load_one fails. The solution is to remove the bit
MLX5_INTERFACE_STATE_DOWN and quit mlx5_unload_one if the
bit MLX5_INTERFACE_STATE_UP is not set. The bit MLX5_INTERFACE_STATE_DOWN
is redundant and we can use MLX5_INTERFACE_STATE_UP instead.

Linux commit:
10a8d00707082955b177164d4b4e758ffcbd4017
b3cb5388499c5e219324bfe7da2e46cbad82bfcf

MFC after: 3 days
Sponsored by: Mellanox Technologies
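
The single-bit state guard described in this commit can be modelled in Python as follows. This is a simplified sketch under stated assumptions: `STATE_UP` is a made-up bit value standing in for MLX5_INTERFACE_STATE_UP, and the `Device` class with its `resources` member is hypothetical.

```python
STATE_UP = 1 << 0  # hypothetical value standing in for MLX5_INTERFACE_STATE_UP


class Device:
    def __init__(self) -> None:
        self.state = 0
        self.resources = None

    def load_one(self, firmware_ok: bool = True) -> bool:
        """Bring the device up; on failure the UP bit simply stays clear."""
        if not firmware_ok:
            return False
        self.resources = object()
        self.state |= STATE_UP
        return True

    def unload_one(self) -> bool:
        """Tear the device down only if it was actually brought up."""
        if not (self.state & STATE_UP):
            return False  # nothing was set up, so there is nothing to free
        self.resources = None
        self.state &= ~STATE_UP
        return True
```

With one bit there is no window in which a failed load leaves the device looking neither "up" nor "down", which is the ambiguity the separate DOWN bit allowed.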

hselasky [Wed, 8 May 2019 10:26:33 +0000 (10:26 +0000)]
Enable FPGA and FPGA QP errors for EQ and call the handler in mlx5core.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:25:14 +0000 (10:25 +0000)]
Add MLX5_FPGA_RELOAD IOCTL(2) to mlx5fpga.

Submitted by: kib@
MFC after: 3 days
Sponsored by: Mellanox Technologies

hselasky [Wed, 8 May 2019 10:23:33 +0000 (10:23 +0000)]
Add support for Dynamic Interrupt Moderation, DIM, in mlx5en(4).

Add support for DIM based on Linux,
with some minor adaptations specific to FreeBSD.

Linux commit:
f97c3dc3c0e8d23a5c4357d182afeef4c67f5c33

MFC after: 3 days
Sponsored by: Mellanox Technologies

marius [Wed, 8 May 2019 09:03:43 +0000 (09:03 +0000)]
Allow to build without INET and INET6 again after r347221.

Submitted by: cam

delphij [Wed, 8 May 2019 08:43:15 +0000 (08:43 +0000)]
Move contrib/zlib to sys/contrib/zlib so that we can use it in kernel.
This is a prerequisite of unifying kernel zlib instances.

Submitted by: Yoshihiro Ota <ota at j.email.ne.jp>
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20191

dim [Wed, 8 May 2019 05:45:00 +0000 (05:45 +0000)]
Pull in r360099 from upstream llvm trunk (by Eli Friedman):

  [ARM] Glue register copies to tail calls.

  This generally follows what other targets do. I don't completely
  understand why the special case for tail calls existed in the first
  place; even when the code was committed in r105413, call lowering
  didn't work in the way described in the comments.

  Stack protector lowering breaks if the register copies are not glued
  to a tail call: we have to insert the stack protector check before
  the tail call, and we choose the location based on the assumption
  that all physical register dependencies of a tail call are adjacent
  to the tail call. (See FindSplitPointForStackProtector.) This is sort
  of fragile, but I don't see any reason to break that assumption.

  I'm guessing nobody has seen this before just because it's hard to
  convince the scheduler to actually schedule the code in a way that
  breaks; even without the glue, the only computation that could
  actually be scheduled after the register copies is the computation of
  the call address, and the scheduler usually prefers to schedule that
  before the copies anyway.

  Fixes https://bugs.llvm.org/show_bug.cgi?id=41417

  Differential Revision: https://reviews.llvm.org/D60427

This should fix several instances of "Bad machine code: Using an
undefined physical register", when compiling ports such as
multimedia/vlc, audio/alsa-lib and devel/avro-c for armv6, with
-fstack-protector-strong.

Reported by: jbeich
PR: 237074, 237783, 237784
MFC after: 3 days

jhibbits [Wed, 8 May 2019 03:15:22 +0000 (03:15 +0000)]
powerpc: hide innocuous printf behind bootverbose

NUMA associativity and OFW node existence are completely optional, so
their absence shouldn't always produce a warning.

kevans [Wed, 8 May 2019 02:32:11 +0000 (02:32 +0000)]
tun/tap: merge and rename to `tuntap`

tun(4) and tap(4) share the same general management interface and have a lot
in common. Bugs exist in tap(4) that have been fixed in tun(4), and
vice-versa. Let's reduce the maintenance requirements by merging them
together and using flags to differentiate between the three interface types
(tun, tap, vmnet).

This fixes a couple of tap(4)/vmnet(4) issues right out of the gate:
- tap devices may no longer be destroyed while they're open [0]
- VIMAGE issues already addressed in tun by kp

[0] emaste had removed an easy-panic-button in r240938 due to devdrn
blocking. A naive glance over this leads me to believe that this isn't quite
complete -- destroy_devl will only block while executing d_* functions, but
doesn't block the device from being destroyed while a process has it open.
The latter is the intent of the condvar in tun, so this is "fixed" (for
certain definitions of the word -- it wasn't really broken in tap, it just
wasn't quite ideal).

ifconfig(8) also grew the ability to map an interface name to a kld, so
that `ifconfig {tun,tap}0` can continue to autoload the correct module, and
`ifconfig vmnet0 create` will now autoload the correct module. This is a
low overhead addition.

(MFC commentary)

Whether this gets MFC'd depends on how many bugs in tun(4)/tap(4) are
discovered after this, and how critical they are. Changes after this are
likely easily MFC'd without taking this merge, but the merge will be easier.

I have no plans to do this MFC as of now.

Reviewed by: bcr (manpages), tuexen (testing, syzkaller/packetdrill)
Input also from: melifaro
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D20044

mav [Wed, 8 May 2019 01:35:43 +0000 (01:35 +0000)]
Fix dataset name comparison in zfs_compare().

The code never returned a match when comparing two datasets (not snapshots).
As a result, uu_avl_find(), called from zfs_callback(), never succeeded,
allowing the same dataset to be added to the list multiple times, for example:

# zfs get name pers pers pers@z pers@z
NAME    PROPERTY  VALUE   SOURCE
pers    name      pers    -
pers    name      pers    -
pers@z  name      pers@z  -

With the patch:

# zfs get name pers pers pers@z pers@z
NAME    PROPERTY  VALUE   SOURCE
pers    name      pers    -
pers@z  name      pers@z  -

MFC after: 1 week
Sponsored by: iXsystems, Inc.
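
The effect of a comparator that never reports equality can be reproduced with a short Python sketch. It is not the zfs(8) implementation: the helper names are hypothetical, and snapshots are ordered by name only, which is an assumption made to keep the example small.

```python
from functools import cmp_to_key
from typing import List, Tuple


def split_name(name: str) -> Tuple[str, str]:
    """Split 'pool/fs@snap' into ('pool/fs', 'snap'); snap is '' if absent."""
    dataset, _, snap = name.partition("@")
    return dataset, snap


def compare_names(left: str, right: str) -> int:
    """Return <0, 0 or >0; two identical plain datasets must compare as 0."""
    lds, lsnap = split_name(left)
    rds, rsnap = split_name(right)
    if lds != rds:
        return -1 if lds < rds else 1
    if lsnap == rsnap:
        return 0  # the case that was never reported as a match
    return -1 if lsnap < rsnap else 1


def unique_sorted(names: List[str]) -> List[str]:
    """Sort names and drop entries the comparator considers equal."""
    result: List[str] = []
    for name in sorted(names, key=cmp_to_key(compare_names)):
        if not result or compare_names(result[-1], name) != 0:
            result.append(name)
    return result
```

Any lookup structure keyed on such a comparator, an AVL tree included, can only detect duplicates if the comparator returns 0 for them.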

cem [Wed, 8 May 2019 00:45:16 +0000 (00:45 +0000)]
random: x86 driver: Prefer RDSEED over RDRAND when available

Per
https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed
, RDRAND is a PRNG seeded from the same source as RDSEED.  The source is
more suitable as PRNG seed material, so prefer it when the RDSEED intrinsic
is available (indicated in CPU feature bits).

Reviewed by: delphij, jhb, imp (earlier version)
Approved by: secteam(delphij)
Security: yes
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D20192
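
The preference order described in this commit amounts to a feature-bit check. The Python sketch below shows only that selection logic; the function name and the way CPUID words are passed in are hypothetical. The bit positions are the documented ones: RDRAND is CPUID.01H:ECX bit 30 and RDSEED is CPUID.(EAX=07H,ECX=0):EBX bit 18.

```python
from typing import Optional

CPUID_01_ECX_RDRAND = 1 << 30  # CPUID.01H:ECX bit 30
CPUID_07_EBX_RDSEED = 1 << 18  # CPUID.(EAX=07H,ECX=0):EBX bit 18


def pick_entropy_instruction(cpuid_01_ecx: int,
                             cpuid_07_ebx: int) -> Optional[str]:
    """Return the preferred instruction name, or None if neither exists."""
    if cpuid_07_ebx & CPUID_07_EBX_RDSEED:
        return "rdseed"  # preferred whenever the CPU advertises it
    if cpuid_01_ecx & CPUID_01_ECX_RDRAND:
        return "rdrand"  # fallback on CPUs without RDSEED
    return None
```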

cem [Wed, 8 May 2019 00:40:08 +0000 (00:40 +0000)]
vmm(4): Pass through RDSEED feature bit to guests

Reviewed by: jhb
Approved by: #bhyve (jhb)
MFC after: 2 leapseconds
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D20194

imp [Wed, 8 May 2019 00:09:10 +0000 (00:09 +0000)]
Add missing newline to debug printf.

cem [Tue, 7 May 2019 21:15:11 +0000 (21:15 +0000)]
Fix libsbuf sbuf_printf_drain symbol version

(Introduced incorrectly in r347229 earlier today.)

As pointed out by kevans, 1.6 should be used for FreeBSD 13, like r340383.

Submitted by: kevans
Reported by: kib
Reviewed by: jilles
X-MFC-with:  r347229
Differential Revision: https://reviews.freebsd.org/D20187

5 years agoImprove the legibility of the login.access.5 man page by separating
cy [Tue, 7 May 2019 20:39:39 +0000 (20:39 +0000)]
Improve the legibility of the login.access.5 man page by separating
each argument into its own paragraph.

MFC after: 3 days

5 years agoRemove non-functional SCTP checksum offload support for virtio.
tuexen [Tue, 7 May 2019 20:28:12 +0000 (20:28 +0000)]
Remove non-functional SCTP checksum offload support for virtio.

Checksum offloading for SCTP is not currently specified for virtio.
If the hypervisor announces checksum offloading support, it means TCP
and UDP checksum offload. If an SCTP packet is sent and the host announced
checksum offload support, the hypervisor inserts the IP checksum (16-bit)
at the correct offset, but this is not the right checksum, which is a CRC32c.
This results in all outgoing packets having the wrong checksum and therefore
breaking SCTP based communications.
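
To illustrate why the two checksums are not interchangeable, here is a
small sketch that computes both over a byte buffer. It is not code from the
virtio driver; toy_inet_cksum and toy_crc32c are hypothetical names, and
the CRC32c routine is a plain bit-at-a-time version using the reflected
Castagnoli polynomial. It does not model how SCTP places the value in the
packet.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit one's-complement Internet checksum, as used by IP, TCP and UDP. */
uint16_t
toy_inet_cksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)p[0] << 8;   /* pad the odd trailing byte */
    while ((sum >> 16) != 0)
        sum = (sum & 0xffff) + (sum >> 16);   /* fold the carries */
    return ((uint16_t)~sum);
}

/* 32-bit CRC32c (Castagnoli), reflected polynomial 0x82F63B78. */
uint32_t
toy_crc32c(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xffffffffu;

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (crc >> 1) ^ 0x82f63b78u : crc >> 1;
    }
    return (~crc);
}
```

The first routine yields a 16-bit sum and the second a 32-bit CRC, so a
checksum inserted under the TCP/UDP assumption cannot be a valid SCTP
checksum.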

This patch removes SCTP checksum offloading support from the virtio
network interface.

Thanks to Felix Weinrank for making me aware of the issue.

Reviewed by: bryanv@
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D20147

5 years agoSupport PTRACE_GETREGSET w/ NT_PRSTATUS in Linux ptrace(2).
trasz [Tue, 7 May 2019 19:06:41 +0000 (19:06 +0000)]
Support PTRACE_GETREGSET w/ NT_PRSTATUS in Linux ptrace(2).

While Linux strace(1) doesn't strictly require it - it has a fallback
to PTRACE_GETREGS - it's a newer interface, so we had better support it
before the old one is deprecated.

Reviewed by: dchagin
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20152

5 years agomake sysent after r347228
emaste [Tue, 7 May 2019 18:10:21 +0000 (18:10 +0000)]
make sysent after r347228

Regenerate to add @generated tag in generated files.

5 years agodevice_printf: Use sbuf for more coherent prints on SMP
cem [Tue, 7 May 2019 17:47:20 +0000 (17:47 +0000)]
device_printf: Use sbuf for more coherent prints on SMP

device_printf makes multiple calls to printf, allowing other console messages
to be inserted between the device name and the rest of the message.  This
change uses sbuf to compose the two into a single buffer, and prints it all at once.

It exposes an sbuf drain function (drain-to-printf) for common use.
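
The idea can be sketched in userland C as follows. This is only an
illustration under stated assumptions: it uses snprintf/vsnprintf on a
caller-supplied buffer rather than the kernel sbuf API, and
toy_device_format is a hypothetical name, not a function from the commit.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: compose "<name><unit>: " and the formatted message
 * into one buffer so the caller can emit it with a single output call.
 * Returns the number of characters written, or -1 if it does not fit.
 */
int
toy_device_format(char *buf, size_t len, const char *name, int unit,
    const char *fmt, ...)
{
    va_list ap;
    int off, n;

    off = snprintf(buf, len, "%s%d: ", name, unit);
    if (off < 0 || (size_t)off >= len)
        return (-1);

    va_start(ap, fmt);
    n = vsnprintf(buf + off, len - (size_t)off, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= len - (size_t)off)
        return (-1);

    return (off + n);
}
```

A caller would then write the buffer once, for example with
fputs(buf, stdout), so no other message can land between the prefix and
the body.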

Update documentation to match; some unit tests included.

Submitted by: jmg
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D16690

5 years agomakesyscalls: use @generated tag in generated files
emaste [Tue, 7 May 2019 16:17:33 +0000 (16:17 +0000)]
makesyscalls: use @generated tag in generated files

Multiple tools use @generated to identify generated files (for example,
in a review Phabricator will by default hide diffs in generated files).
Use the @generated tag in makesyscalls.sh as we've done for other
generated files.

Reviewed by: cem
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20183

5 years agoSimplify the test against maxproc in fork1().
markj [Tue, 7 May 2019 15:03:26 +0000 (15:03 +0000)]
Simplify the test against maxproc in fork1().

Previously nprocs_new would be tested against maxproc twice when
nprocs_new < maxproc - 10.  Eliminate the unnecessary comparison.
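
A sketch of the kind of restructuring described, not the actual fork1()
code. Both function names are hypothetical, the boolean "privileged"
stands in for the kernel's privilege check, and the "old" shape is an
assumption about how the double comparison arose.

```c
#include <stdbool.h>

/*
 * Assumed old shape: when nprocs_new < maxproc - 10, the left side
 * fails and the right side compares against maxproc again.
 */
bool
toy_fork_denied_old(int nprocs_new, int maxproc, bool privileged)
{
    return ((nprocs_new >= maxproc - 10 && !privileged) ||
        nprocs_new >= maxproc);
}

/*
 * Simplified shape: the comparison against maxproc is only reached once
 * nprocs_new is already within the last ten slots.
 */
bool
toy_fork_denied_new(int nprocs_new, int maxproc, bool privileged)
{
    if (nprocs_new >= maxproc - 10) {
        if (!privileged || nprocs_new >= maxproc)
            return (true);
    }
    return (false);
}
```

The two forms agree because nprocs_new >= maxproc implies
nprocs_new >= maxproc - 10.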

Submitted by: Wuyang Chung <wuyang.chung1@gmail.com>
GitHub PR: https://github.com/freebsd/freebsd/pull/397
MFC after: 1 week

5 years agoDisable interrupts first and then set spinlock_count to 1.
br [Tue, 7 May 2019 14:32:17 +0000 (14:32 +0000)]
Disable interrupts first and then set spinlock_count to 1.
Otherwise an interrupt can be generated just after setting spinlock_count
and before disabling interrupts.

Sponsored by: DARPA, AFRL

5 years agoProvide a template for busdma code for RISC-V.
br [Tue, 7 May 2019 13:41:43 +0000 (13:41 +0000)]
Provide a template for busdma code for RISC-V.

The RISC-V ISA specifies no cache management instructions, so leave cache
operations in cpufunc.h as no-ops for now.

Note that some new hardware comes with its own memory-mapped cache
management controller.

Tested on HiFive Unleashed board with cgem(4).

Reviewed by: markj
Obtained from: arm64
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D20126

5 years agoUse @generated tag in generated files
emaste [Tue, 7 May 2019 13:04:26 +0000 (13:04 +0000)]
Use @generated tag in generated files

Multiple tools use @generated to identify generated files (for example,
in a review Phabricator will by default hide diffs in generated files).
Use the @generated tag in makeobjops.awk and vnode_if.awk as we've done
for other generated files.

Sponsored by: The FreeBSD Foundation

5 years agocommand_bcache() does not use argv
tsoome [Tue, 7 May 2019 10:01:45 +0000 (10:01 +0000)]
command_bcache() does not use argv

Therefore mark argv __unused.

5 years agoo Avoid determining the MAC class (LEM/EM or IGB) - possibly even multiple
marius [Tue, 7 May 2019 08:31:54 +0000 (08:31 +0000)]
o Avoid determining the MAC class (LEM/EM or IGB) - possibly even multiple
  times - on every interrupt by using an own set of device methods for the
  IGB class. This translates to introducing igb_if_intr_{disable,enable}()
  and igb_if_{rx,tx}_queue_intr_enable() with that IGB-specific code moved
  out of their EM counterparts and otherwise continuing to use the EM IFDI
  methods also for IGB.
  Note that igb_if_intr_{disable,enable}() also issue E1000_WRITE_FLUSH as
  lost with the conversion of igb(4) to iflib(4).
  Also note, that the em_if_{disable,enable}_intr() methods are renamed to
  em_if_intr_{disable,enable}() for consistency with the names used in the
  interface declaration.
o In em_intr():
  - Don't bother to bail out if the interrupt type is "legacy", i. e. INTx
    or MSI, as iflib(4) doesn't use ift_legacy_intr methods for MSI-X. All
    other iflib(4)-based drivers avoid this check, too.
  - Given that only the MSI-X interrupts have one-shot behavior (by taking
    advantage of the EIAC register), explicitly disable interrupts. Hence,
    em_intr() now matches what {em,igb}_irq_fast() previously did (in case
    of igb(4) supposedly also to work around MSI message reordering errata
    on certain systems).
o In em_if_intr_disable():
  - Clear the EIAC register unconditionally for 82574 and not just in case
    of MSI-X, matching em_if_intr_enable() and bringing back the last hunk
    of r206437 lost with the iflib(4) conversion.
  - Write to EM_EIAC for clearing said register instead of to the IGB-only
    E1000_EIAC used ever since the iflib(4) conversion.

Reviewed by: shurd
Differential Revision: https://reviews.freebsd.org/D20176

5 years agoo Use iflib_fast_intr_rxtx() also for "legacy" interrupts, i. e. INTx and
marius [Tue, 7 May 2019 08:28:35 +0000 (08:28 +0000)]
o Use iflib_fast_intr_rxtx() also for "legacy" interrupts, i. e. INTx and
  MSI. Unlike iflib_fast_intr_ctx(), the former will also enqueue
  _task_fn_tx() in addition to _task_fn_rx() if appropriate, bringing TCP
  TX throughput of EM-class devices on par with the MSI-X case and, thus,
  close to wirespeed/pre-iflib(4) times again. [1]
  Note that independently of the interrupt type, the UDP performance with
  these MACs still is abysmal and nowhere near to where it was before the
  conversion of em(4) to iflib(4).
o In iflib_init_locked(), announce which free list failed to set up.
o In _task_fn_tx() when running netmap(4), issue ifdi_intr_enable instead
  of the ifdi_tx_queue_intr_enable method in case of a "legacy" interrupt
  as the latter is valid with MSI-X only.
o Instead of adding the missing - and apparently convoluted enough that a
  DBG_COUNTER_INC was put into a wrong spot in _task_fn_rx() - checks for
  ifdi_{r,t}x_queue_intr_enable being available in the MSI-X case also to
  iflib_fast_intr_rxtx(), factor these out to iflib_device_register() and
  make the checks fail gracefully rather than panic. This avoids invoking
  the checks at runtime over and over again in iflib_fast_intr_rxtx() and
  _task_fn_{r,t}x() - even if it's just in case of INVARIANTS - and makes
  these functions more readable.
o In iflib_rx_structures_setup(), only initialize LRO resources if device
  and driver have LRO capability in order to not waste memory. Also, free
  the LRO resources again if setting them up fails for one of the queues.
  However, don't bother invoking iflib_rx_sds_free() in that case because
  iflib_rx_structures_setup() doesn't call iflib_rxsd_alloc() either (and
  iflib_{device,pseudo}_register() will issue iflib_rx_sds_free() in case
  of failure via iflib_rx_structures_free(), but there definitely is some
  asymmetry left to be fixed, though).
o Similarly, free LRO resources again in iflib_rx_structures_free().
o In iflib_irq_set_affinity(), handle get_core_offset() errors gracefully
  instead of panicking (but only in case of INVARIANTS). This is a follow-
  up to r344132, as such driver bugs shouldn't be fatal.
o Likewise, handle unknown iflib_intr_type_t in iflib_irq_alloc_generic()
  gracefully, too.
o Bring yet more sanity to iflib_msix_init():
  - If the device doesn't provide enough MSI-X vectors or not all vectors
    can be allocated, so the expected number of queues in addition to admin
    interrupts can't be supported, try MSI next (and then INTx) as proper
    MSI-X vector distribution can't be assured in such cases. In essence,
    this change brings r254008 forward to iflib(4). Also, this is the fix
    alluded to in the commit message of r343934.
  - If the MSI-X allocation has failed, don't prematurely announce MSI is
    going to be used as the latter in fact may not be available either.
  - When falling back to MSI, only release the MSI-X table resource again
    if it was allocated in iflib_msix_init(), i. e. isn't supplied by the
    driver, in the first place.
o In mp_ndesc_handler(), handle unknown type arguments gracefully, too.

PR: 235031 (likely) [1]
Reviewed by: shurd
Differential Revision: https://reviews.freebsd.org/D20175

5 years agoloader: bcache code does not need to check argument for free()
tsoome [Tue, 7 May 2019 08:14:30 +0000 (08:14 +0000)]
loader: bcache code does not need to check argument for free()

5 years agoloader: use safer DPRINTF body for non-debug case
tsoome [Tue, 7 May 2019 07:46:40 +0000 (07:46 +0000)]
loader: use safer DPRINTF body for non-debug case

5 years agoRemove wrong copyright line. Discussed with Carlos Neira.
dchagin [Tue, 7 May 2019 05:08:13 +0000 (05:08 +0000)]
Remove wrong copyright line. Discussed with Carlos Neira.

Reported by: Rodney W. Grimes
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D13656

5 years agoamd64: fix BUS_SPACE_MAXSIZE to 64bit max value.
kib [Tue, 7 May 2019 01:18:57 +0000 (01:18 +0000)]
amd64: fix BUS_SPACE_MAXSIZE to 64bit max value.

Reviewed by: jhb, tychon (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D20154

5 years agoThe intention of the blist cursor is for the search for free blocks to
dougm [Mon, 6 May 2019 22:12:15 +0000 (22:12 +0000)]
The intention of the blist cursor is for the search for free blocks to
resume where the last search left off. Suppose that there are no free
blocks of size 32, but plenty of size 16. If we repeatedly request
size 32 blocks, fail, and retry with size 16 blocks, then the failures
all reset the cursor to the beginning of memory, making the 16 block
allocation use a first fit, rather than next fit, strategy.

This change has blist_alloc make a copy of the cursor for its own
decision making, and only updates the real blist cursor after a
successful allocation, making those 16 block searches behave like
next-fit searches.
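
The cursor handling can be sketched with a toy allocator over a flat array
of blocks. This is not the blist radix-tree implementation; toy_blist,
toy_find_run and toy_alloc are hypothetical names, and the wrap-around
retry from block 0 is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

#define NBLOCKS 64

struct toy_blist {
    bool used[NBLOCKS];
    size_t cursor;      /* where the next search should resume */
};

/* Find `count` (>= 1) consecutive free blocks at or after `start`, or -1. */
static long
toy_find_run(const struct toy_blist *bl, size_t start, size_t count)
{
    size_t run = 0;

    for (size_t i = start; i < NBLOCKS; i++) {
        run = bl->used[i] ? 0 : run + 1;
        if (run == count)
            return ((long)(i + 1 - count));
    }
    return (-1);
}

/*
 * Allocate `count` blocks next-fit.  The search works on a private copy
 * of the cursor; the shared cursor is only updated on success, so a
 * failed large request does not disturb later small requests.
 */
long
toy_alloc(struct toy_blist *bl, size_t count)
{
    size_t cursor = bl->cursor;
    long blk;

    blk = toy_find_run(bl, cursor, count);
    if (blk < 0 && cursor != 0) {
        cursor = 0;     /* retry from the start, on the copy only */
        blk = toy_find_run(bl, cursor, count);
    }
    if (blk < 0)
        return (-1);    /* failure: bl->cursor is left untouched */

    for (size_t i = 0; i < count; i++)
        bl->used[(size_t)blk + i] = true;
    bl->cursor = (size_t)blk + count;
    return (blk);
}
```

With every third block in use, a request for 4 blocks fails while requests
for 2 blocks keep advancing from the previous allocation instead of
restarting at block 0.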

Approved by: markj (mentor)
Differential Revision: https://reviews.freebsd.org/D20177

5 years ago- Remove the unused ifc_link_irq and ifc_mtx_name members of struct iflib_ctx.
marius [Mon, 6 May 2019 20:56:41 +0000 (20:56 +0000)]
- Remove the unused ifc_link_irq and ifc_mtx_name members of struct iflib_ctx.
- Remove the only ever written to ift_db_mtx_name member of struct iflib_txq.
- Remove the unused or only ever written to ifr_size, ifr_cq_pidx, ifr_cq_gen
  and ifr_lro_enabled members of struct iflib_rxq.
- Consistently spell DMA, RX and TX uppercase in comments, messages etc.
  instead of mixing with some lowercase variants.
- Consistently use if_t instead of a mix of if_t and struct ifnet pointers.
- Bring the function comments of _iflib_fl_refill(), iflib_rx_sds_free() and
  iflib_fl_setup() in line with reality.
- Judging by problem reports, people are wondering what on earth messages like:
  "TX(0) desc avail = 1024, pidx = 0"
  are trying to indicate. Thus, extend this string to be more like that of
  non-iflib(4) Ethernet MAC drivers, notifying about a watchdog timeout due
  to which the interface will be reset.
- Take advantage of the M_HAS_VLANTAG macro.
- Use false/true rather than FALSE/TRUE for variables of type bool.
- Use FALLTHROUGH as advocated by style(9).

5 years agoImport libxo-1.0.4:
phil [Mon, 6 May 2019 20:20:21 +0000 (20:20 +0000)]
Import libxo-1.0.4:
- Avoid NULL deref in xo_xml_leader_len (replacing local fix in rS345967)
- update copyright dates
- update test cases
- fix uncommitted version change

Submitted by: phil
MFC after: 2 weeks

5 years agoAdds sys/class/net devices to linsysfs.
dchagin [Mon, 6 May 2019 20:01:13 +0000 (20:01 +0000)]
Adds sys/class/net devices to linsysfs.

Only two interfaces are created, eth0 and lo, and they expose
the following properties:
address, addr_len, flags, ifindex, mtu, tx_queue_len and type.

Initial patch developed by Carlos Neira in 2017 and finished by me.

PR: 223722
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D13656