FreeBSD/FreeBSD.git
Hans Petter Selasky [Thu, 8 Mar 2018 16:27:31 +0000 (16:27 +0000)]
Set correct SL in completion for RoCE in mlx5ib(4).

There is a difference when parsing a completion entry between Ethernet
and IB ports. When link layer is Ethernet the bits describe the type of
L3 header in the packet. In the case when link layer is Ethernet and VLAN
header is present the value of SL is equal to the 3 UP bits in the VLAN
header. If VLAN header is not present then the SL is undefined and consumer
of the completion should check if IB_WC_WITH_VLAN is set.

While at it, this patch also fills the vlan_id field in the completion if
present.

linux commit 12f8fedef2ec94c783f929126b20440a01512c14

MFC after: 1 week
Sponsored by: Mellanox Technologies
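
The relationship between the VLAN header and the SL described in this commit can be sketched as follows. This is a minimal illustration, not the driver code: the function name `sl_and_vlan_from_tci` is hypothetical, and it assumes the standard 802.1Q tag layout, in which the 3 priority (UP) bits are the top bits of the 16-bit tag control field and the VLAN ID is the low 12 bits.

```python
from typing import Optional, Tuple


def sl_and_vlan_from_tci(vlan_present: bool, tci: int) -> Tuple[Optional[int], Optional[int]]:
    """Return (sl, vlan_id) for a completion on an Ethernet link layer.

    Hypothetical helper: when no VLAN header is present the SL is
    undefined, which is modelled here by returning None for both fields.
    """
    if not vlan_present:
        return None, None
    sl = (tci >> 13) & 0x7   # the 3 user-priority (UP) bits
    vlan_id = tci & 0x0FFF   # the 12-bit VLAN identifier
    return sl, vlan_id
```

A consumer would check the "VLAN present" indication first (IB_WC_WITH_VLAN in the commit message) and only then trust the SL value.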

Hans Petter Selasky [Thu, 8 Mar 2018 16:19:01 +0000 (16:19 +0000)]
Add call to setup firmware data dump structure during device load in
mlx5core.

Do not consider the inability to create a firmware dump fatal, but
inform about the situation and allow the driver to attach. The device
might not implement the needed VSC, or we might not know the layout of
the register map. In either case, only the firmware dump functionality
is limited; network operations should be fine.

Submitted by: kib@
MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Thu, 8 Mar 2018 15:58:30 +0000 (15:58 +0000)]
Avoid more LFENCE/SFENCE on x86 in mlx5en(4)
by using the FreeBSD native fences.

Submitted by: kib@
MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Thu, 8 Mar 2018 15:53:04 +0000 (15:53 +0000)]
Fix mlx5en(4) driver to properly call m_defrag().

When the mlx5en(4) driver was converted to using BUSDMA(9) the call to
m_defrag() was moved after the part of the TX routine that strips the
header from the mbuf chain. Before it called m_defrag it first trimmed
off the now-empty mbufs from the start of the chain. This has the side
effect of also removing the head of the chain that has M_PKTHDR set.
m_defrag() will not defrag a chain that does not have M_PKTHDR set,
thus it was effectively never defragging the mbuf chains.

As it turns out, trimming the mbufs in this fashion is unnecessary since
the call to bus_dmamap_load_mbuf_sg doesn't map empty mbufs anyway, so
remove it.

Differential Revision: https://reviews.freebsd.org/D12050
Submitted by: mjoras@
MFC after: 1 week
Sponsored by: Mellanox Technologies
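
The failure mode described in this commit can be modelled with a toy mbuf chain. This is only a sketch under stated assumptions: `Mbuf`, `defrag` and `trim_empty_head` are hypothetical stand-ins, and the only behaviour borrowed from the commit message is that m_defrag() leaves a chain alone when its head lacks M_PKTHDR.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Mbuf:
    data: bytes
    pkthdr: bool = False  # stands in for the M_PKTHDR flag


def defrag(chain: List[Mbuf]) -> List[Mbuf]:
    """Toy m_defrag(): coalesce only if the head carries the packet header."""
    if not chain or not chain[0].pkthdr:
        return chain  # chain is returned untouched
    return [Mbuf(b"".join(m.data for m in chain), pkthdr=True)]


def trim_empty_head(chain: List[Mbuf]) -> List[Mbuf]:
    """The removed step: drop now-empty mbufs from the start of the chain."""
    i = 0
    while i < len(chain) and len(chain[i].data) == 0:
        i += 1
    return chain[i:]
```

With a chain whose header mbuf was emptied by header stripping, trimming first discards the only mbuf that has the flag set, so the toy `defrag` becomes a no-op; calling it on the untrimmed chain coalesces the data as intended.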

Hans Petter Selasky [Thu, 8 Mar 2018 15:47:17 +0000 (15:47 +0000)]
Use vport rather than physical-port MTU in mlx5en(4).

Set and report the vport MTU rather than the physical port MTU.
The driver will set both the vport and physical port MTU
and will rely on the query of the vport MTU.

SRIOV VFs have to report their MTU to their vport manager (PF),
and this will allow them to work with any MTU they need
without failing the request.

Also, in some cases where the PF is not a port owner, the PF can
work with an MTU smaller than the physical port MTU if setting the
physical port MTU didn't take effect.

Based on Linux upstream commit:
cd255efff9baadd654d6160e52d17ae7c568c9d3

Submitted by: Meny Yossefi <menyy@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Thu, 8 Mar 2018 15:43:41 +0000 (15:43 +0000)]
Use the device unit number for naming the ifnet interface in mlx5en(4).

Currently the ifnet interface is named mceX, where X is a monotonically
incremented value. If the device is reset due to a fatal error, then the
interface name will change.  Using the device unit number will keep the
naming consistent across the reset logic.

Submitted by: Matthew Finlay <matt@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Thu, 8 Mar 2018 15:37:09 +0000 (15:37 +0000)]
Remove duplicate prototypes.

MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Thu, 8 Mar 2018 15:28:13 +0000 (15:28 +0000)]
Check that the address is specified in mlx5tool(8).

Submitted by: kib@
MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Thu, 8 Mar 2018 15:21:56 +0000 (15:21 +0000)]
Add kernel and userspace code to dump the firmware state of supported
ConnectX-4/5 devices in mlx5core.

The dump is obtained by reading a predefined register map from the
non-destructive crspace, accessible by the vendor-specific PCIe
capability (VSC). The dump is stored in preallocated kernel memory and
managed by the mlx5tool(8), which communicates with the driver using a
character device node.

The utility can reset the dump content, manually initiate the dump,
and store the dump in a file in the format
    <address> <value>

A call to mlx5_fwdump() should be added at the places where a dump
must be fetched automatically. The most likely place is right before a
firmware reset request.

Submitted by: kib@
MFC after: 1 week
Sponsored by: Mellanox Technologies
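
A dump file in the `<address> <value>` format could be read back with a few lines of code. The sketch below is not part of mlx5tool(8); the function name `parse_fwdump` is hypothetical, and it assumes that both fields are written in hexadecimal, which the commit message does not state.

```python
from typing import List, Tuple


def parse_fwdump(text: str) -> List[Tuple[int, int]]:
    """Parse lines of the form '<address> <value>' into integer pairs.

    Assumption: both fields are hexadecimal, with or without a 0x prefix.
    Lines that do not consist of exactly two fields are skipped.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        address, value = (int(field, 16) for field in fields)
        entries.append((address, value))
    return entries
```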

Slava Shwartsman [Thu, 8 Mar 2018 14:33:59 +0000 (14:33 +0000)]
Add myself and Hans Petter Selasky

Approved by:    hselasky (mentor), kib (mentor)
Sponsored by:   Mellanox Technologies

Hans Petter Selasky [Thu, 8 Mar 2018 11:59:47 +0000 (11:59 +0000)]
Add vendor specific capability interface support in mlx5core.

Add the ability to access the vendor specific space gateway in order
to support reading and writing data into the different configuration
domains.

Submitted by: Matthew Finlay <matt@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Thu, 8 Mar 2018 11:58:27 +0000 (11:58 +0000)]
Use device_printf() instead of printf() when printing warnings and errors
to dmesg(8) in mlx5core.

Submitted by: Matthew Finlay <matt@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Thu, 8 Mar 2018 11:40:39 +0000 (11:40 +0000)]
Add support for per priority flow control, PFC, to mlx5en(4).

Add support for PFC and implement reading the per priority statistics
using the sysctl(8) interface. PFC is used together with VLAN priority
and can be enabled and disabled on a per priority basis.

Global pause frames and PFC are incompatible features and surrounding
logic has been added to warn the user about misconfiguration.

Update relevant mlx5core APIs for PFC configuration.

MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Thu, 8 Mar 2018 11:23:14 +0000 (11:23 +0000)]
Add support for explicit congestion notification, ECN, to mlx5ib(4).

ECN configuration and statistics are available through a set of sysctl(8)
nodes under sys.class.infiniband.mlx5_X.cong. The ECN configuration
nodes can also be used as loader tunables.

MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Thu, 8 Mar 2018 10:43:42 +0000 (10:43 +0000)]
Use the autogenerated interface file for all commands in mlx5core.

This patch accumulates the following Linux commits:
90b3e38d048f09b22fb50bcd460cea65fd00b2d7
  mlx5_core: Modify CQ moderation parameters
09a7d9eca1a6cf5eb4f9abfdf8914db9dbd96f08
  mlx5_core: QP/XRCD commands via mlx5 ifc
1a412fb1caa2c1b77719ccb5ed8b0c3c2bc65da7
  mlx5_core: Modify QP commands via mlx5 ifc
ec22eb53106be1472ba6573dc900943f52f8fd1e
  mlx5_core: MKey/PSV commands via mlx5 ifc
73b626c182dff06867ceba996a819e8372c9b2ce
  mlx5_core: EQ commands via mlx5 ifc
20ed51c643b6296789a48adc3bc2cc875a1612cf
  mlx5_core: Access register and MAD IFC commands via mlx5 ifc
a533ed5e179cd15512d40282617909d3482a771c
  mlx5_core: Pages management commands via mlx5 ifc
b8a4ddb2e8f44f872fb93bbda2d541b27079fd2b
  mlx5_core: Add MLX5_ARRAY_SET64 to fix BUILD_BUG_ON
af1ba291c5e498973cc325c501dd8da80b234571
  mlx5_core: Refactor internal SRQ API
b06e7de8a9d8d1d540ec122bbdf2face2a211634
  mlx5_core: Refactor device capability function
c4f287c4a6ac489c18afc4acc4353141a8c53070
  mlx5_core: Unify and improve command interface

Submitted by: Matthew Finlay <matt@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Thu, 8 Mar 2018 09:58:41 +0000 (09:58 +0000)]
Fix race between PCI error handlers and health work in mlx5core.

linux commit 05ac2c0b7438ea08c5d54b48797acf9b22cb2f6f

Submitted by: Matthew Finlay <matt@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Thu, 8 Mar 2018 09:51:33 +0000 (09:51 +0000)]
Avoid calling sleeping function from the health poll thread in mlx5core.

linux commit c1d4d2e92ad670168a17a57dfa182a5a5baa72d4

Submitted by: Matthew Finlay <matt@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Thu, 8 Mar 2018 09:47:09 +0000 (09:47 +0000)]
Updates for PCI and health monitor recovery in mlx5core.
This patch accumulates the following Linux commits:

mlx5_health.c
78ccb25861d76a8fc5c678d762180e6918834200
  mlx5_core: Fix wrong name in struct
171bb2c560f45c0427ca3776a4c8f4e26e559400
  mlx5_core: Update health syndromes
0144a95e2ad53a40c62148f44fb0c1f9d2a0d1e9
  mlx5_core: Use accessor functions to read from device memory
ac6ea6e81a80172612e0c9ef93720f371b198918
  mlx5_core: Use private health thread for each device
fd76ee4da55abb21babfc69310d321b9cb9a32e0
  mlx5_core: Fix internal error detection conditions
2241007b3d783cbdbaa78c30bdb1994278b6f9b9
  mlx5: Clear health sick bit when starting health poll
712bfef60912d91033cb25739f7444d5b8d8c59f
  mlx5: Fix version printout in case of health issue
89d44f0a6c732db23b219be708e2fe1e03ee4842
  mlx5_core: Add pci error handlers to mlx5_core driver

mlx5_cmd.c
be87544de8df2b1eb34bcb5e32691287d96f9ec4
  mlx5_core: Fix async commands return code
a31208b1e11df334d443ec8cace7636150bb8ce2
  mlx5_core: New init and exit flow for mlx5_core
020446e01eebc9dbe7eda038e570ab9c7ab13586
  mlx5_core: Prepare cmd interface to system errors handling
89d44f0a6c732db23b219be708e2fe1e03ee4842
  mlx5_core: Add pci error handlers to mlx5_core driver
0d834442cc247c7b3f3bd6019512ae03e96dd99a
  mlx5: Fix teardown errors that happen in pci error handler

mlx5_main.c
5fc7197d3a256d9c5de3134870304b24892a4908
  mlx5: Add pci shutdown callback

Submitted by: Matthew Finlay <matt@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies

Eitan Adler [Thu, 8 Mar 2018 07:15:14 +0000 (07:15 +0000)]
Chase rename of rwho script in r290252

The script and associated variable were changed in r290252. Now just
chase it.

MFC With: r290252
Reported by: Aaron LI <aly@aaronly.me>

Eitan Adler [Thu, 8 Mar 2018 05:28:43 +0000 (05:28 +0000)]
calendars: update Judaic calendar to 2018+

This was generated by

∴hebcal --years 10 -r 2018 | awk -F '[/\t]' '{print $3 "/" $1 "/" $2 "*\t" $4}'

MFC After: 1 week

Alan Somers [Thu, 8 Mar 2018 03:19:04 +0000 (03:19 +0000)]
g_bio(9): fix a documentation oversight from r163870

MFC after: 3 weeks

Kyle Evans [Wed, 7 Mar 2018 22:05:23 +0000 (22:05 +0000)]
lualoader: Return status in cli_execute_unparsed properly

cli_execute was changed to return the status, cascade that to
cli_execute_unparsed.

This fixes a lot of false "Failed to execute" errors following r330620; no
failures actually occurred, but [module]_error would've then promptly
executed (and also "failed").

Jeff Roberson [Wed, 7 Mar 2018 22:04:27 +0000 (22:04 +0000)]
Don't assert that the domain free lock is held until we're certain that
there is a valid reservation.  This can trip erroneously when memory
falls within a domain but doesn't have the reservation initialized because
it does not meet size or alignment requirements.

Reported by: pho, mjg
Sponsored by: Netflix, Dell/EMC Isilon

Kyle Evans [Wed, 7 Mar 2018 18:45:24 +0000 (18:45 +0000)]
loader.conf(5): Document some other settings

These tend to have less coverage in other places and they don't have
defaults as of yet, so mention them here:
- fdt_overlays
- kernels_autodetect (lualoader only)

Kyle Evans [Wed, 7 Mar 2018 18:41:16 +0000 (18:41 +0000)]
lua-lint: Whitelist cli_execute_unparsed as a global

Kyle Evans [Wed, 7 Mar 2018 18:37:04 +0000 (18:37 +0000)]
lualoader: Use cli_execute_unparsed for commands passed in via loader.conf

This applies to:
- exec
- [module]_before
- [module]_error
- [module]_after

Before this commit, these used loader.perform to execute them as a pure,
unsalted loader command. This means that they were not able to take
advantage of any Lua-salted loader commands, like boot and autoboot, or pure
Lua loader commands (functions attached to the 'cli' module).

They now have access to the full arsenal, just shy of being able to execute
arbitrary Lua.

Conrad Meyer [Wed, 7 Mar 2018 18:31:31 +0000 (18:31 +0000)]
fpu_kern.9: Document fpu_kern_enter API change in r329878

While here, clean up some of the language.

Reported by: delphij
Sponsored by: Dell EMC Isilon

Kyle Evans [Wed, 7 Mar 2018 18:31:01 +0000 (18:31 +0000)]
lualoader: Use cli_execute_unparsed instead of loader.interpret

loader.interpret should not be used for executing loader commands from an
untrusted source (e.g. environment vars) as it will allow execution of
arbitrary Lua. Replace it with a call to the recently introduced
cli_execute_unparsed, which parses it out as a loader command and then
dispatches it as a loader command. This effectively filters out arbitrary
Lua.
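
The idea of "parse it as a command, then dispatch it as a command" can be illustrated outside the loader. The sketch below is written in Python rather than Lua and is not the lualoader implementation: `make_cli`, the command table and the convention that a non-zero return means failure are all assumptions made for the example. The point it shows is that the untrusted string is only ever split into arguments and looked up, never evaluated as code.

```python
import shlex
from typing import Callable, Dict, List

Handler = Callable[[List[str]], int]


def make_cli(commands: Dict[str, Handler]) -> Callable[[str], int]:
    """Build an 'execute unparsed' function over a table of known commands."""

    def execute_unparsed(line: str) -> int:
        argv = shlex.split(line)         # parse into words, nothing is evaluated
        if not argv:
            return 1                     # nothing to run (assumed failure status)
        handler = commands.get(argv[0])  # dispatch by command name only
        if handler is None:
            return 1                     # unknown input is rejected, not interpreted
        return handler(argv)

    return execute_unparsed
```

A string such as `os.exit(1)` simply fails the command lookup, whereas an interpreter-style entry point would have run it.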

Kyle Evans [Wed, 7 Mar 2018 18:28:41 +0000 (18:28 +0000)]
lualoader: Fix name, cli.execute_unparsed -> cli_execute_unparsed

Kyle Evans [Wed, 7 Mar 2018 18:25:27 +0000 (18:25 +0000)]
lualoader: Expose loader.parse and add cli_execute_unparsed

This will be used for scenarios where the command to execute is coming in
via the environment (from, for example, loader.conf(5)) and is thus not
necessarily trusted.

cli_execute_unparsed will immediately be used for handling
module_{before,after,error} as well as menu_timeout_command. We still want
to offer these variables the ability to execute Lua-intercepted loader
commands, but we don't want them to be able to execute arbitrary Lua.

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D14580

Tycho Nightingale [Wed, 7 Mar 2018 18:03:22 +0000 (18:03 +0000)]
Fix a lock recursion introduced in r327065.

Reported by: kmacy
Reviewed by: grehan, jhb
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14548

Ed Maste [Wed, 7 Mar 2018 17:37:36 +0000 (17:37 +0000)]
Regen src.conf.5 after r330613 CROSS_TOOLCHAIN change

Ed Maste [Wed, 7 Mar 2018 17:33:41 +0000 (17:33 +0000)]
Disable LLD_BOOTSTRAP under WITHOUT_CROSS_COMPILER

LLD is a cross toolchain component. It shouldn't be built when
requesting a build without building a cross compiler.

(CROSS_COMPILER is somewhat unfortunately named; in any case, lld
should be treated as GNU binutils here.)

Submitted by: Dan McGregor <dan.mcgregor at usask.ca>
MFC after: 1 week

Kyle Evans [Wed, 7 Mar 2018 17:18:46 +0000 (17:18 +0000)]
stand/ficl: Fix testmain

testmain is a userland application intended to be built with standard
headers and whatnot, which we broke.

Fix it by having the testmain build clobber cflags, reducing it to just the
set of defines/includes it needs to build.

Discussed with: imp
MFC after: 3 days

Nathan Whitehorn [Wed, 7 Mar 2018 17:08:07 +0000 (17:08 +0000)]
Move the powerpc64 direct map base address from zero to high memory. This
accomplishes a few things:
- Makes NULL an invalid address in the kernel, which is useful for catching
  bugs.
- Lays groundwork for radix-tree translation on POWER9, which requires the
  direct map be at high memory.
- Similarly lays groundwork for a direct map on 64-bit Book-E.

The new base address is chosen because it is the base of the fourth radix
quadrant (the minimum kernel address in this translation mode) and because all
supported CPUs ignore at least the first two bits of addresses in real
mode, allowing direct-map addresses to be used in real-mode handlers.
This behavior is required by Linux and is part of the architecture standard
starting in POWER ISA 3, so it can be relied upon.

Reviewed by: jhibbits, Breno Leitao
Differential Revision: D14499
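
The address arithmetic behind this change can be sketched briefly. The constants and helper names below are illustrative rather than taken from the kernel sources: the sketch assumes a 64-bit address space split into four quadrants by the top two address bits, so the fourth quadrant starts at 0xC000000000000000, and it models real mode as ignoring exactly those two bits.

```python
# Assumed base: top two address bits set, i.e. the start of the fourth quadrant.
DMAP_BASE = 0xC000_0000_0000_0000
# Assumed real-mode behaviour: the top two address bits are ignored.
REAL_MODE_MASK = (1 << 62) - 1


def phys_to_dmap(pa: int) -> int:
    """Map a physical address into the direct map."""
    return DMAP_BASE | pa


def real_mode_view(addr: int) -> int:
    """Address as seen in real mode, with the top two bits dropped."""
    return addr & REAL_MODE_MASK
```

Under these assumptions physical address 0 maps to a non-zero kernel address, so NULL is no longer a valid direct-map pointer, and a direct-map address used in a real-mode handler still resolves to the original physical address.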

Hans Petter Selasky [Wed, 7 Mar 2018 15:23:07 +0000 (15:23 +0000)]
Implement priority to traffic class mapping in mlx5core.

Add support for mapping priority to traffic class via sysctl

Submitted by: Slava Shwartsman <slavash@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Wed, 7 Mar 2018 15:17:36 +0000 (15:17 +0000)]
Implement rate limit per traffic class in mlx5core.

Add support for rate limiting traffic class via sysctl.

Submitted by: Slava Shwartsman <slavash@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Wed, 7 Mar 2018 15:03:11 +0000 (15:03 +0000)]
Implement missing query for current port rate in mlx5ib(4).

- Factor out port speed definitions into a new port.h header file,
  similar to what is done in Linux upstream.
- Correct two existing port speed definitions in mlx5en according to
  Linux upstream.

MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Wed, 7 Mar 2018 14:51:50 +0000 (14:51 +0000)]
Add log message for unsupported QSFPs in mlx5core.

Submitted by: Matthew Finlay <matt@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Wed, 7 Mar 2018 14:49:27 +0000 (14:49 +0000)]
Make sure default VNET is set when adding a new interface in mlx5core.

Adding an interface might be done outside the device_attach() routine
and will then cause a panic, due to the VNET not being set.

MFC after: 1 week
Sponsored by: Mellanox Technologies

Eitan Adler [Wed, 7 Mar 2018 14:47:43 +0000 (14:47 +0000)]
sys/cloudabi: Avoid relying on GNU specific extensions

An empty initializer list is not technically valid C grammar.

MFC After: 1 week

Eitan Adler [Wed, 7 Mar 2018 14:44:32 +0000 (14:44 +0000)]
sys: Fix a few potential infoleaks in cloudabi

While there is no immediate leak, if the structure changes underneath
us, there might be in the future.

Submitted by: Domagoj Stolfa <domagoj.stolfa@gmail.com>
MFC After: 1 month
Sponsored by: DARPA/AFRL

Hans Petter Selasky [Wed, 7 Mar 2018 14:41:29 +0000 (14:41 +0000)]
Add timeout handle to commands with callback in mlx5core.

The current implementation does not handle timeouts for commands
issued with a callback, which can lead to a deadlock if the firmware
never responds. Add delayed callback timeout work
before posting the command to firmware. On a real firmware
command completion the delayed work is cancelled. On a
firmware command timeout the callback timeout handler is called
and simulates a firmware completion with a timeout error.

linux commit 65ee67084589c1783a74b4a4a5db38d7264ec8b5

MFC after: 1 week
Sponsored by: Mellanox Technologies
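
The ordering described in this commit (arm the timeout, post the command, then let whichever of completion or timeout comes first invoke the callback) can be modelled without real timers. The class below is a hypothetical, single-threaded sketch: `CallbackCommand`, `TIMEOUT_ERROR` and the method names are invented for illustration, and the locking and workqueue details of the driver are deliberately left out.

```python
TIMEOUT_ERROR = -1  # hypothetical status used to signal a simulated timeout


class CallbackCommand:
    """Toy model of a firmware command that completes through a callback."""

    def __init__(self, callback):
        self.callback = callback
        self.timeout_armed = False
        self.done = False

    def post(self):
        # The delayed timeout work is armed before the command is posted.
        self.timeout_armed = True

    def _complete(self, status):
        if self.done:
            return  # the callback runs at most once
        self.done = True
        self.callback(status)

    def firmware_completion(self, status=0):
        self.timeout_armed = False  # real completion cancels the delayed work
        self._complete(status)

    def timeout_fired(self):
        if self.timeout_armed:
            self.timeout_armed = False
            self._complete(TIMEOUT_ERROR)  # simulate completion with an error
```

Either path ends with the callback being invoked exactly once, so a caller waiting on it is not left blocked when the firmware never answers.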

Hans Petter Selasky [Wed, 7 Mar 2018 14:35:28 +0000 (14:35 +0000)]
Fix potential deadlock in command mode change in mlx5core.

Call command completion handler in case of timeout when working in
interrupts mode. Avoid flushing the commands workqueue after acquiring
the semaphores to prevent a potential deadlock.

linux commit 9cba4ebcf374c3772f6eb61f2d065294b2451b49

MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Wed, 7 Mar 2018 14:29:30 +0000 (14:29 +0000)]
Use a macro in mlx5_command_str() instead of copying OP name.

linux commit 42ca502e179d0654ef441333a9d0f35c948734f3

MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Wed, 7 Mar 2018 14:03:31 +0000 (14:03 +0000)]
Disable unsupported disassociate ucontext functionality in mlx5ib(4).

MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Wed, 7 Mar 2018 13:59:46 +0000 (13:59 +0000)]
Bump version information in mlx4ib(4).

MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Wed, 7 Mar 2018 13:58:58 +0000 (13:58 +0000)]
The mlx4ib(4) should not be loaded before the ibcore is initialized.

MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Wed, 7 Mar 2018 13:57:32 +0000 (13:57 +0000)]
Disable unsupported disassociate ucontext functionality in mlx4ib(4).

MFC after: 1 week
Sponsored by: Mellanox Technologies

Andrew Turner [Wed, 7 Mar 2018 13:54:44 +0000 (13:54 +0000)]
Bump MAXCPUS on arm64. We are starting to see hardware with more than 96
cores so increase it to the same as amd64.

Sponsored by: DARPA, AFRL
Sponsored by: Cavium (Hardware)

Andriy Gapon [Wed, 7 Mar 2018 13:49:26 +0000 (13:49 +0000)]
MFV r330591: 8984 fix for 6764 breaks ACL inheritance

illumos/illumos-gate@e9bacc6d1a71ea3f7082038b2868de8c4dd98bdc
https://github.com/illumos/illumos-gate/commit/e9bacc6d1a71ea3f7082038b2868de8c4dd98bdc

https://www.illumos.org/issues/8984
  Consider a directory configured as:
  drwx-ws---+ 2 henson cpp 3 Jan 23 12:35 dropbox/
  user:henson:rwxpdDaARWcC--:f-i----:allow
  owner@:--------------:f-i----:allow
  group@:--------------:f-i----:allow
  everyone@:--------------:f-i----:allow
  owner@:rwxpdDaARWcC--:-di----:allow
  group:cpp:-wx-----------:-------:allow
  owner@:rwxpdDaARWcC--:-------:allow
  A new file created in this directory ends up looking like:
  rw-r--r-+ 1 astudent cpp 0 Jan 23 12:39 testfile
  user:henson:rw-pdDaARWcC--:------I:allow
  owner@:--------------:------I:allow
  group@:--------------:------I:allow
  everyone@:--------------:------I:allow
  owner@:rw-p--aARWcCos:-------:allow
  group@:r-----a-R-c--s:-------:allow
  everyone@:r-----a-R-c--s:-------:allow
  with extraneous group@ and everyone@ entries allowing read access that
  shouldn't exist.
  Per Albert Lee on the zfs mailing list:
  "aclinherit=passthrough/passthrough-x should still
  ignore the requested mode when an inheritable ACE for owner@ group@,
  or everyone@ is present in the parent directory.
  It appears there was an oversight in my fix for
  https://www.illumos.org/issues/6764 which made calling zfs_acl_chmod
  from zfs_acl_inherit unconditional. I think the parent ACL check for
  aclinherit=passthrough needs to be reintroduced in zfs_acl_inherit."
  We have a large number of faculty who use dropbox directories like the example
  to have students submit projects. All of these directories are now allowing

Reviewed by: Sam Zaydel <szaydel@racktopsystems.com>
Reviewed by: Paul B. Henson <henson@acm.org>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Dominik Hassler <hadfl@omniosce.org>

PR: 216886
MFC after: 2 weeks

Andriy Gapon [Wed, 7 Mar 2018 13:47:01 +0000 (13:47 +0000)]
8984 fix for 6764 breaks ACL inheritance

illumos/illumos-gate@e9bacc6d1a71ea3f7082038b2868de8c4dd98bdc
https://github.com/illumos/illumos-gate/commit/e9bacc6d1a71ea3f7082038b2868de8c4dd98bdc

https://www.illumos.org/issues/8984
  Consider a directory configured as:
  drwx-ws---+ 2 henson cpp 3 Jan 23 12:35 dropbox/
  user:henson:rwxpdDaARWcC--:f-i----:allow
  owner@:--------------:f-i----:allow
  group@:--------------:f-i----:allow
  everyone@:--------------:f-i----:allow
  owner@:rwxpdDaARWcC--:-di----:allow
  group:cpp:-wx-----------:-------:allow
  owner@:rwxpdDaARWcC--:-------:allow
  A new file created in this directory ends up looking like:
  rw-r--r-+ 1 astudent cpp 0 Jan 23 12:39 testfile
  user:henson:rw-pdDaARWcC--:------I:allow
  owner@:--------------:------I:allow
  group@:--------------:------I:allow
  everyone@:--------------:------I:allow
  owner@:rw-p--aARWcCos:-------:allow
  group@:r-----a-R-c--s:-------:allow
  everyone@:r-----a-R-c--s:-------:allow
  with extraneous group@ and everyone@ entries allowing read access that
  shouldn't exist.
  Per Albert Lee on the zfs mailing list:
  "aclinherit=passthrough/passthrough-x should still
  ignore the requested mode when an inheritable ACE for owner@ group@,
  or everyone@ is present in the parent directory.
  It appears there was an oversight in my fix for
  https://www.illumos.org/issues/6764 which made calling zfs_acl_chmod
  from zfs_acl_inherit unconditional. I think the parent ACL check for
  aclinherit=passthrough needs to be reintroduced in zfs_acl_inherit."
  We have a large number of faculty who use dropbox directories like the example
  to have students submit projects. All of these directories are now allowing

Reviewed by: Sam Zaydel <szaydel@racktopsystems.com>
Reviewed by: Paul B. Henson <henson@acm.org>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Dominik Hassler <hadfl@omniosce.org>

Hans Petter Selasky [Wed, 7 Mar 2018 13:32:52 +0000 (13:32 +0000)]
Make sure VNET is set when calling sa6_recoverscope() in ibcore.

Otherwise a panic will occur when VIMAGE is enabled.

MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Wed, 7 Mar 2018 13:30:38 +0000 (13:30 +0000)]
Define values instead of hardcoding them.

MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Wed, 7 Mar 2018 13:28:12 +0000 (13:28 +0000)]
Recover IPv6 scope ID for multicast link-local addresses as well as
unicast link-local addresses.

MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Wed, 7 Mar 2018 13:25:40 +0000 (13:25 +0000)]
Embed the IPv6 scope ID before calling rtalloc1() in ibcore.
Otherwise rtalloc1() will resolve to the loopback interface.

MFC after: 1 week
Sponsored by: Mellanox Technologies

Andrew Turner [Wed, 7 Mar 2018 13:16:03 +0000 (13:16 +0000)]
Create macros for the ACPI interrupt cross references. This is considered a
band aid until a better solution to find the correct interrupt controller
can be found.

While here fix one place in the GICv3 ITS driver where the offset wasn't
correctly applied.

Sponsored by: DARPA, AFRL
Sponsored by: Cavium (Hardware)

Hans Petter Selasky [Wed, 7 Mar 2018 13:01:00 +0000 (13:01 +0000)]
Add IB_SPEED_HDR definition in ibcore.

MFC after: 1 week
Sponsored by: Mellanox Technologies

Hans Petter Selasky [Wed, 7 Mar 2018 12:58:51 +0000 (12:58 +0000)]
Make sure the IPv6 scope ID gets properly masked in ibcore.

When exchanging CM messages the IPv6 scope ID should be ignored
for link local addresses when doing comparisons. Make sure the
scope ID is always set to zero for link local addresses.

MFC after: 1 week
Sponsored by: Mellanox Technologies
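
Masking the scope ID before a comparison might look like the sketch below. It is an illustration only: `clear_embedded_scope` is a hypothetical name, and the sketch assumes the KAME-style convention of storing the scope ID in the second 16-bit word of a link-local address, which the commit message does not spell out. Only unicast link-local addresses (fe80::/10) are handled here.

```python
def clear_embedded_scope(addr: bytes) -> bytes:
    """Return a 16-byte IPv6 address with any embedded scope ID zeroed.

    Assumption: for link-local unicast addresses the scope ID occupies
    bytes 2 and 3. Other addresses are returned unchanged.
    """
    if len(addr) != 16:
        raise ValueError("expected a 16-byte IPv6 address")
    is_link_local = addr[0] == 0xFE and (addr[1] & 0xC0) == 0x80
    if not is_link_local:
        return addr
    return addr[:2] + b"\x00\x00" + addr[4:]
```

Two link-local addresses that differ only in the embedded scope ID then compare equal after masking, which is the property needed when matching CM messages.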

Hans Petter Selasky [Wed, 7 Mar 2018 12:56:04 +0000 (12:56 +0000)]
Fix for use-after-free when using delayed work structures in ibcore.

It is not enough to cancel delayed work structures before freeing.
Always cancel delayed work synchronously before freeing!

MFC after: 1 week
Sponsored by: Mellanox Technologies

Andrew Turner [Wed, 7 Mar 2018 10:47:27 +0000 (10:47 +0000)]
Add an acpi attachment to the pci_host_generic driver and have the ACPI
bus provide it with its needed memory resources.

This allows us to use PCIe on the ThunderX2 and, with a previous version
of the patch, on the SoftIron 3000 with ACPI.

Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Sponsored by: DARPA, AFRL
Sponsored by: Cavium (Hardware)
Differential Revision: https://reviews.freebsd.org/D8767

Andrew Turner [Wed, 7 Mar 2018 09:58:36 +0000 (09:58 +0000)]
Restrict the arm64 DMAP region to the 1G blocks where we have at least
one physical page. This is in preparation for limiting it further as this
is needed on some hardware, however testing has shown issues with further
restricting the DMAP and ACPI.

Sponsored by: DARPA, AFRL
Sponsored by: Cavium (Hardware)
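
Selecting "the 1G blocks where we have at least one physical page" amounts to a small interval computation. The following sketch is not the pmap code: `dmap_blocks` is a hypothetical helper, and physical memory is assumed to be given as a list of (start, end) byte ranges with an exclusive end.

```python
from typing import Iterable, List, Tuple

GIB = 1 << 30  # size of one 1G block


def dmap_blocks(phys_ranges: Iterable[Tuple[int, int]]) -> List[int]:
    """Return the base addresses of 1G blocks touched by any physical range."""
    blocks = set()
    for start, end in phys_ranges:
        if end <= start:
            continue  # ignore empty ranges
        first = start // GIB
        last = (end - 1) // GIB  # block holding the last byte of the range
        blocks.update(range(first, last + 1))
    return sorted(block * GIB for block in blocks)
```

Blocks that contain no physical memory at all are simply never added, so they would not be covered by the direct map.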

Eitan Adler [Wed, 7 Mar 2018 09:40:41 +0000 (09:40 +0000)]
psm.4: remove useless information

Obtained from: DragonflyBSD (f49f67c528ec63f5524da5c11e060a0e67866242)
MFC After: 1 week

Eitan Adler [Wed, 7 Mar 2018 09:31:27 +0000 (09:31 +0000)]
des_crypt.3: Fix typo.

Obtained from: DragonflyBSD (a78d083cf561cf325e8f1a151251b8901159e2ce)
MFC After: 3 days

6 years agolualoader: Only loadelf before boot/autoboot if no kernel loaded
Kyle Evans [Wed, 7 Mar 2018 04:11:14 +0000 (04:11 +0000)]
lualoader: Only loadelf before boot/autoboot if no kernel loaded

Back when I "fixed" the loading of kernel/modules to be deferred until
booting, I inadvertently broke the ability to manually load a set of kernels
and modules in case of something bad having happened. lualoader would
instead happily load whatever is specified in loader.conf(5) and go about
the boot, leading to a panic loop as you try to rediscover a way to stop the
panicky efirt module from loading and fail miserably.

Reported by: me, sadly

6 years agog_part_gpt: Fix memory leak in error path
Conrad Meyer [Wed, 7 Mar 2018 01:55:50 +0000 (01:55 +0000)]
g_part_gpt: Fix memory leak in error path

If g_part_gpt_read() encountered a disk with bad primary and secondary
tables, it could leak memory.

Reported by: Coverity
Sponsored by: Dell EMC Isilon

6 years agochflags: Add SIGINFO support.
Bryan Drewery [Wed, 7 Mar 2018 01:55:38 +0000 (01:55 +0000)]
chflags: Add SIGINFO support.

This is copied from chmod r311668.

MFC after: 2 weeks

6 years agoBump dwatch(1) internal version from 1.0-beta-91 to 1.0
Devin Teske [Tue, 6 Mar 2018 23:58:53 +0000 (23:58 +0000)]
Bump dwatch(1) internal version from 1.0-beta-91 to 1.0

6 years agoIntroduce dwatch(1) as a tool for making DTrace more useful
Devin Teske [Tue, 6 Mar 2018 23:44:19 +0000 (23:44 +0000)]
Introduce dwatch(1) as a tool for making DTrace more useful

Reviewed by: markj, gnn, bdrewery (earlier version)
Relnotes: yes
Sponsored by: Smule, Inc.
Differential Revision: https://reviews.freebsd.org/D10006

6 years ago[ig4] Add support for i2c controllers on Skylake and Kaby Lake
Oleksandr Tymoshenko [Tue, 6 Mar 2018 23:39:43 +0000 (23:39 +0000)]
[ig4] Add support for i2c controllers on Skylake and Kaby Lake

This was tested by Ben on an HP Chromebook 13 G1 with a
Skylake CPU and Sunrise Point-LP I2C controller and by me on a
Minnowboard Turbot with Atom E3826 (formerly Bay Trail).

Submitted by: Ben Pye <ben@curlybracket.co.uk>
Reviewed by: gonzo
Obtained from: DragonflyBSD (a4549657 by Imre Vadász)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D13654

6 years agoaw_usbphy: Move later to SUPPORTDEV pass
Kyle Evans [Tue, 6 Mar 2018 22:45:45 +0000 (22:45 +0000)]
aw_usbphy: Move later to SUPPORTDEV pass

vbus-supply properties may be specified for each PHY. These properties
reference a regulator that we must turn on/off as we turn the PHY on/off.
However, if the usbphy comes up before the regulator in question (as is the
case with GPIO-controlled regulators), then we will fail to grab a handle to
the regulator and control it as the PHY power state changes.

Fix it by just attaching the usbphy driver later. We don't really need it at
RESOURCE, we just need it to be before DEFAULT when ehci/ohci attach. In
particular, this fixes the USB NIC on a board that we don't yet support;
without this, it will not power on and if_ure cannot attach.

Tested on: various boards [manu]
Tested on: OrangePi R1 [Rap2 (irc)]
Reported by: Rap2 (irc, "Cannot find USB NIC")

6 years agoAdd example devd.conf(5) entry for notifying init(8) about new USB ttys.
Edward Tomasz Napierala [Tue, 6 Mar 2018 21:05:34 +0000 (21:05 +0000)]
Add example devd.conf(5) entry for notifying init(8) about new USB ttys.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation

6 years agopsm(4): Initialize variables before use
Conrad Meyer [Tue, 6 Mar 2018 20:31:14 +0000 (20:31 +0000)]
psm(4): Initialize variables before use

dxp/dyp could have been used uninitialized in the subsequent debugging log
invocation.

Reported by: Coverity
Sponsored by: Dell EMC Isilon

6 years agoRemove reference to unimplemented fuiword, etc.
Brooks Davis [Tue, 6 Mar 2018 18:28:55 +0000 (18:28 +0000)]
Remove reference to unimplemented fuiword, etc.

We don't support Harvard architectures.

6 years agoFix use of uninitialized variables.
Nathan Whitehorn [Tue, 6 Mar 2018 15:52:43 +0000 (15:52 +0000)]
Fix use of uninitialized variables.

6 years agoUnbreak amd64 FBT after r330539.
Mark Johnston [Tue, 6 Mar 2018 15:51:59 +0000 (15:51 +0000)]
Unbreak amd64 FBT after r330539.

X-MFC with: r330539

6 years agoUpdate the diskless manpage
Rodrigo Osorio [Tue, 6 Mar 2018 14:31:15 +0000 (14:31 +0000)]
Update the diskless manpage

According to /etc/rc.initdiskless, the default mfs allocation
is now 5MB (10240 x 512-byte sectors).
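
The arithmetic behind that figure can be checked with a short sketch.
The helper names below are hypothetical and do not come from
/etc/rc.initdiskless; only the sector count and sector size are taken
from the message above.

```c
#include <assert.h>
#include <stdint.h>

/* Sector size stated in the commit message: 512 bytes. */
#define SKETCH_SECTOR_BYTES	UINT64_C(512)
#define SKETCH_MIB		(UINT64_C(1024) * 1024)

/* Convert a sector count into bytes (hypothetical helper). */
static uint64_t
sectors_to_bytes(uint64_t sectors)
{
	return (sectors * SKETCH_SECTOR_BYTES);
}

/* Convert a sector count into whole MiB; the size must be an exact multiple. */
static uint64_t
sectors_to_mib(uint64_t sectors)
{
	uint64_t bytes = sectors_to_bytes(sectors);

	assert(bytes % SKETCH_MIB == 0);
	return (bytes / SKETCH_MIB);
}
```

With these helpers, sectors_to_mib(10240) evaluates to 5, which matches
the size quoted in the message.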

Submitted by: rodrigo
Reviewed by: bcr
Approved by: manpages (bcr)
Differential Revision: https://reviews.freebsd.org/D14592

6 years agoamd64: Protect the kernel text, data, and BSS by setting the RW/NX bits
Jonathan T. Looney [Tue, 6 Mar 2018 14:28:37 +0000 (14:28 +0000)]
amd64: Protect the kernel text, data, and BSS by setting the RW/NX bits
correctly for the data contained on each memory page.

There are several components to this change:
 * Add a variable to indicate the start of the R/W portion of the
   initial memory.
 * Stop detecting NX bit support for each AP.  Instead, use the value
   from the BSP and, if supported, activate the feature on the other
   APs just before loading the correct page table.  (Functionally, we
   already assume that the BSP and all APs had the same support or
   lack of support for the NX bit.)
 * Set the RW and NX bits correctly for the kernel text, data, and
   BSS (subject to some caveats below).
 * Ensure DDB can write to memory when necessary (such as to set a
   breakpoint).
 * Ensure GDB can write to memory when necessary (such as to set a
   breakpoint).  For this purpose, add new MD functions gdb_begin_write()
   and gdb_end_write() which the GDB support code can call before and
   after writing to memory.

This change is not comprehensive:
 * It doesn't do anything to protect modules.
 * It doesn't do anything for kernel memory allocated after the kernel
   starts running.
 * In order to avoid excessive memory inefficiency, it may let multiple
   types of data share a 2M page, and assigns the most permissions
   needed for data on that page.
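
The last caveat can be illustrated with a small sketch. It is not the
pmap code from this change; the structure, constants and function names
are assumptions made up for illustration. The idea shown is only that a
2M page shared by several kinds of data ends up with the union of the
permissions each of them needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_2M		(UINT64_C(2) * 1024 * 1024)
#define SKETCH_PROT_READ	0x1u
#define SKETCH_PROT_WRITE	0x2u
#define SKETCH_PROT_EXEC	0x4u

/* Hypothetical description of one kernel section: [start, end) plus needs. */
struct sketch_section {
	uint64_t start;
	uint64_t end;
	unsigned prot;
};

/*
 * Return the union of the permissions of every section that overlaps
 * the 2M page starting at page_base.
 */
static unsigned
sketch_page_prot(const struct sketch_section *s, size_t n, uint64_t page_base)
{
	uint64_t page_end = page_base + SKETCH_2M;
	unsigned prot = 0;
	size_t i;

	assert(page_base % SKETCH_2M == 0);
	for (i = 0; i < n; i++) {
		if (s[i].start < page_end && s[i].end > page_base)
			prot |= s[i].prot;
	}
	return (prot);
}
```

In this sketch, a page that holds both the end of a read/execute text
section and the start of a read/write data section comes out as
read/write/execute, which is the inefficiency trade-off described above.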

Reviewed by: jhb, kib
Discussed with: emaste
MFC after: 2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14282

6 years agoNudge lld to break the kernel read-only and read-write sections into
Jonathan T. Looney [Tue, 6 Mar 2018 14:18:45 +0000 (14:18 +0000)]
Nudge lld to break the kernel read-only and read-write sections into
separate 2M pages.  The binutils default for max-page-size and
common-page-size used to produce this result.  By setting these
values, we can nudge lld to also separate these sections into separate
2M pages.

Reviewed by: jhb, kib
Discussed with: emaste
Sponsored by: Netflix
Differential Revision: D14282

6 years agoAdd mapping for several ethernet types used by Linux to FreeBSD
Andrey V. Elsukov [Tue, 6 Mar 2018 12:58:00 +0000 (12:58 +0000)]
Add mapping for several ethernet types used by Linux to FreeBSD
ethernet types.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D14594

6 years agoDefine ethernet type 0x88A8 as ETHERTYPE_QINQ.
Andrey V. Elsukov [Tue, 6 Mar 2018 12:01:31 +0000 (12:01 +0000)]
Define ethernet type 0x88A8 as ETHERTYPE_QINQ.
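
As a rough illustration of how such a constant is typically used, the
sketch below reads the EtherType field of a raw Ethernet frame and
compares it against 0x88A8. The macro and function names are
hypothetical and defined locally so the example is self-contained; it
does not use the kernel header touched by this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value from the commit title, redefined locally for this sketch. */
#define SKETCH_ETHERTYPE_QINQ	0x88A8u

/*
 * Read the 16-bit EtherType stored in network byte order at bytes 12-13
 * of an Ethernet frame (after the two 6-byte MAC addresses).
 */
static uint16_t
sketch_frame_ethertype(const uint8_t *frame, size_t len)
{
	assert(frame != NULL && len >= 14);
	return ((uint16_t)((frame[12] << 8) | frame[13]));
}

/* Return non-zero when the outer tag of the frame is a QinQ tag. */
static int
sketch_frame_is_qinq(const uint8_t *frame, size_t len)
{
	return (sketch_frame_ethertype(frame, len) == SKETCH_ETHERTYPE_QINQ);
}
```

A frame whose bytes 12 and 13 are 0x88 and 0xA8 is reported as QinQ; a
plain 802.1Q frame (0x81, 0x00) is not.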

Reviewed by: kp
Obtained from: OpenBSD
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D14593

6 years agoBuild the ds1672 driver as a module. Add a detach() to unregister the rtc.
Ian Lepore [Tue, 6 Mar 2018 02:30:34 +0000 (02:30 +0000)]
Build the ds1672 driver as a module.  Add a detach() to unregister the rtc.

6 years agoFix a paste-o that broke the build. There is no softc pointer here, just
Ian Lepore [Tue, 6 Mar 2018 02:21:41 +0000 (02:21 +0000)]
Fix a paste-o that broke the build.  There is no softc pointer here, just
use the dev arg.

Reported by: Jonathan Looney <jonlooney@gmail.com>
Pointy hat: ian@

6 years agoUse umtx_copyin_umtx_time32() in __umtx_op_lock_umutex_compat32().
Brooks Davis [Tue, 6 Mar 2018 01:52:04 +0000 (01:52 +0000)]
Use umtx_copyin_umtx_time32() in __umtx_op_lock_umutex_compat32().

Non-NULL timeouts were copied in improperly and could produce failures
due to incompatible data structures.

Reviewed by: kib
MFC after: 3 days
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14587

6 years agoMove softfloat symbol map entries to softfloat/Symbol.map.
John Baldwin [Mon, 5 Mar 2018 20:51:23 +0000 (20:51 +0000)]
Move softfloat symbol map entries to softfloat/Symbol.map.

The arm, mips, and riscv MD Symbol.map files listed some (but not all)
of the softfloat symbols that were actually defined in softfloat.c.

While here, also remove entries for __fixuns[sd]fsi which are provided
by libcompiler_rt and not by libc.

Sponsored by: DARPA / AFRL

6 years agoMFV: zstd: FIO_addFInfo: Fully initialize output 'total' struct
Conrad Meyer [Mon, 5 Mar 2018 20:03:45 +0000 (20:03 +0000)]
MFV: zstd: FIO_addFInfo: Fully initialize output 'total' struct

Silence a Coverity warning about 'windowSize' being uninitialized.
(Yes, nothing that calls this routine actually uses the windowSize
value.  Still, appeasing Coverity is pretty harmless in this case.)

Reported by: Coverity
Reviewed by: Yann Collet
Obtained from: zstd 606374269cf3485972c90b993fbb84dc20da032f
Sponsored by: Dell EMC Isilon

6 years agoRegen after r330517.
Brooks Davis [Mon, 5 Mar 2018 17:02:50 +0000 (17:02 +0000)]
Regen after r330517.

6 years agoRemove remnants of 1990s efforts to let us run Net/OpenBSD binaries.
Brooks Davis [Mon, 5 Mar 2018 17:02:16 +0000 (17:02 +0000)]
Remove remnants of 1990s efforts to let us run Net/OpenBSD binaries.

No functional change (comments change in some generated files.)

Reviewed by: kib
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14571

6 years agospray: fix the spelling in an output string
Alan Somers [Mon, 5 Mar 2018 16:13:29 +0000 (16:13 +0000)]
spray: fix the spelling in an output string

MFC after: 3 weeks

6 years agorpc.sprayd: raise WARNS to 6
Alan Somers [Mon, 5 Mar 2018 16:11:07 +0000 (16:11 +0000)]
rpc.sprayd: raise WARNS to 6

MFC after: 3 weeks

6 years agoWe shouldn't need to execute code in the recursive page table mappings;
Jonathan T. Looney [Mon, 5 Mar 2018 15:12:35 +0000 (15:12 +0000)]
We shouldn't need to execute code in the recursive page table mappings;
therefore, it should be safe to set the NX bit on the PML4E for the
recursive page table mappings.  According to the Intel docs, the effect
of the NX bit should propagate to any page reached through a PML4E which
has the NX bit set.
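
A minimal sketch of what setting that bit looks like at the level of a
raw 64-bit entry follows. The macro and function names are hypothetical
and are not the identifiers used in the amd64 pmap; the bit positions
are the architectural ones (bit 0 present, bit 1 writable, bit 63
execute-disable, the latter honoured only when EFER.NXE is enabled).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PG_V	(UINT64_C(1) << 0)	/* present */
#define SKETCH_PG_RW	(UINT64_C(1) << 1)	/* writable */
#define SKETCH_PG_NX	(UINT64_C(1) << 63)	/* execute-disable */

/*
 * Build a PML4 entry that points back at the PML4 page itself (the
 * recursive mapping) and marks everything reached through it as
 * non-executable.
 */
static uint64_t
sketch_recursive_pml4e(uint64_t pml4_phys)
{
	/* The physical address of a page table page is 4K aligned. */
	assert((pml4_phys & 0xfff) == 0);
	return (pml4_phys | SKETCH_PG_V | SKETCH_PG_RW | SKETCH_PG_NX);
}

/* Return non-zero when instruction fetches through the entry are allowed. */
static int
sketch_entry_is_executable(uint64_t entry)
{
	return ((entry & SKETCH_PG_NX) == 0);
}
```

For a PML4 page at physical address 0x1000, the sketch produces the
entry 0x8000000000001003, and sketch_entry_is_executable() reports it
as non-executable.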

Reviewed by: kib, markj
MFC after: 2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14333

6 years agoPrior to r329071, pmap_bootstrap() used pmap_kmem_choose() to round the
Jonathan T. Looney [Mon, 5 Mar 2018 15:10:17 +0000 (15:10 +0000)]
Prior to r329071, pmap_bootstrap() used pmap_kmem_choose() to round the
first available virtual address to a 2MB boundary. After r329071,
create_pagetables() rounds firstaddr up to a 2MB boundary. This ensures
the kernel is mapped in super-pages, which is the point of the logic
in pmap_kmem_choose(). Therefore, it is no longer necessary for
pmap_bootstrap() to round up to the 2MB boundary again.
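
The reason a second rounding is redundant can be seen from a small
sketch of the operation. The helper below is a hypothetical stand-in,
not the kernel's own macro: rounding up to a 2MB boundary is
idempotent, so applying it again to an already rounded address changes
nothing.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_2M	(UINT64_C(2) * 1024 * 1024)

/* Round addr up to the next 2MB boundary (hypothetical helper). */
static uint64_t
sketch_round_up_2m(uint64_t addr)
{
	/* Guard against wrap-around near the top of the address space. */
	assert(addr <= UINT64_MAX - (SKETCH_2M - 1));
	return ((addr + SKETCH_2M - 1) & ~(SKETCH_2M - 1));
}
```

For example, 0x200001 rounds up to 0x400000, and rounding 0x400000
again still yields 0x400000.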

As pmap_bootstrap() was the only user of pmap_kmem_choose(), we can
delete pmap_kmem_choose().

Reviewed by: kib
MFC after: 2 weeks
X-MFC-with: r329071
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14355

6 years agoOptimize ibcore RoCE address handle creation from user-space.
Hans Petter Selasky [Mon, 5 Mar 2018 14:34:52 +0000 (14:34 +0000)]
Optimize ibcore RoCE address handle creation from user-space.

Creating a UD address handle from user-space or from kernel-space,
when the link layer is Ethernet, requires resolving the remote L3
address into an L2 address. Doing this from the kernel is easy because
the required ARP(IPv4) and ND6(IPv6) address resolving APIs are readily
available. In userspace such an interface does not exist and kernel
help is required.

It should be noted that in an IP-based GID environment, the GID itself
does not contain all the information needed to resolve the destination
IP address. For example, information like the VLAN ID and scope ID is not
part of the GID and must be fetched from the GID attributes. Therefore
a source GID should always be referred to as a GID index. Instead of
going through various racy steps to obtain information about the
GID attributes from user-space, this is now all done by the kernel.

This patch optimises the L3 to L2 address resolving using the existing
create address handle uverbs interface, retrieving back the L2 address
as an additional user-space information structure.

This commit combines the following Linux upstream commits:

IB/core: Let create_ah return extended response to user
IB/core: Change ib_resolve_eth_dmac to use it in create AH
IB/mlx5: Make create/destroy_ah available to userspace
IB/mlx5: Use kernel driver to help userspace create ah
IB/mlx5: Report that device has udata response in create_ah

MFC after: 1 week
Sponsored by: Mellanox Technologies

6 years agoGet correct network device when accepting incoming RDMA connections in ibcore.
Hans Petter Selasky [Mon, 5 Mar 2018 14:24:30 +0000 (14:24 +0000)]
Get correct network device when accepting incoming RDMA connections in ibcore.

This patch ensures the GID index is always used as a basis of resolving
incoming RDMA connections, as compared to the GID value itself.

Background:
On a per-InfiniBand-port basis, the GID value is not a unique identifier!
The assumption of uniqueness falls apart when the VLAN ID, IPv6 scope ID
and RoCE type, as supported by RoCE v2, are taken into account. This additional
information is stored in the so-called GID attributes and is needed to
correctly identify the destination network interface for an incoming
connection.
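
The ambiguity can be sketched with a toy GID table. This is not the
ibcore GID cache; the structure and function names are hypothetical and
the attributes are reduced to a VLAN ID and a scope ID. It only shows
that a lookup by GID value alone can match several entries, while a
lookup that includes the attributes yields a single index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical GID table entry: the 16-byte GID plus two attributes. */
struct sketch_gid_entry {
	uint8_t  gid[16];
	uint16_t vlan_id;
	uint32_t scope_id;
};

/* Count how many table entries carry the given GID value. */
static size_t
sketch_count_gid(const struct sketch_gid_entry *tbl, size_t n,
    const uint8_t *gid)
{
	size_t i, hits = 0;

	assert(tbl != NULL && gid != NULL);
	for (i = 0; i < n; i++) {
		if (memcmp(tbl[i].gid, gid, 16) == 0)
			hits++;
	}
	return (hits);
}

/* Return the index matching GID value and attributes, or -1 if none. */
static int
sketch_find_gid_index(const struct sketch_gid_entry *tbl, size_t n,
    const uint8_t *gid, uint16_t vlan_id, uint32_t scope_id)
{
	size_t i;

	assert(tbl != NULL && gid != NULL);
	for (i = 0; i < n; i++) {
		if (memcmp(tbl[i].gid, gid, 16) == 0 &&
		    tbl[i].vlan_id == vlan_id &&
		    tbl[i].scope_id == scope_id)
			return ((int)i);
	}
	return (-1);
}
```

With two entries that share one link-local GID but sit on VLANs 10 and
20, the count by value is 2, whereas the attribute-aware lookup returns
index 0 or 1 depending on the VLAN requested.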

Different VLANs are allowed to define the same IPv4 addresses. This is
especially true for the default IPv6 link-local addresses and when using
so-called containers or jails.

The VNET information for the destination network interface is needed in
order to perform the L2 address lookup in the right Virtual Network Stack
context.

Consequently old functions previously used by RoCE v1, like
rdma_addr_find_smac_by_sgid() are impossible to support, because
there can be multiple identical GIDs associated with the same
infiniband port, and the answer to such a request becomes undefined.
This function has been removed.

MFC after: 1 week
Sponsored by: Mellanox Technologies

6 years agoPass valid if_index to rdma_addr_find_l2_eth_by_grh() in ibcore when possible.
Hans Petter Selasky [Mon, 5 Mar 2018 14:22:36 +0000 (14:22 +0000)]
Pass valid if_index to rdma_addr_find_l2_eth_by_grh() in ibcore when possible.

MFC after: 1 week
Sponsored by: Mellanox Technologies

6 years agoAdd support for loopback in ibcore.
Hans Petter Selasky [Mon, 5 Mar 2018 13:57:37 +0000 (13:57 +0000)]
Add support for loopback in ibcore.

Implement the missing pieces in addr_resolve() to support loopback
addresses. IB core will test for the IFF_LOOPBACK flag in the network
interface and treat these devices in a special way.

MFC after: 1 week
Sponsored by: Mellanox Technologies

6 years agoMake sure to register the VLAN GIDs using the VLAN network interface
Hans Petter Selasky [Mon, 5 Mar 2018 12:39:34 +0000 (12:39 +0000)]
Make sure to register the VLAN GIDs using the VLAN network interface
and not the parent one in ibcore. Else looking up the VLAN GIDs will
fail for VLAN IPs.

MFC after: 1 week
Sponsored by: Mellanox Technologies

6 years agoNeed to check for IPv6 linklocal address inside rdma_resolve_addr() in ibcore.
Hans Petter Selasky [Mon, 5 Mar 2018 12:04:34 +0000 (12:04 +0000)]
Need to check for IPv6 linklocal address inside rdma_resolve_addr() in ibcore.

MFC after: 1 week
Sponsored by: Mellanox Technologies

6 years agoMap type of service, TOS, to IB or VLAN service level 1:1 in ibcore.
Hans Petter Selasky [Mon, 5 Mar 2018 11:59:54 +0000 (11:59 +0000)]
Map type of service, TOS, to IB or VLAN service level 1:1 in ibcore.

MFC after: 1 week
Sponsored by: Mellanox Technologies

6 years agoSelect RoCEv2 by default in ibcore.
Hans Petter Selasky [Mon, 5 Mar 2018 11:58:37 +0000 (11:58 +0000)]
Select RoCEv2 by default in ibcore.

MFC after: 1 week
Sponsored by: Mellanox Technologies