When several threads are trying to send datagram to the same destination,
but fragmentation is disabled and datagram size exceeds link MTU,
ip6_output() calls pfctlinput2(PRC_MSGSIZE). It does notify all
sockets wanted to know MTU to this destination. And since all threads
hold PCB lock while sending, taking the lock for each PCB in the
in6_pcbnotify() leads to deadlock.
RFC 3542 p.11.3 suggests notify all application wanted to receive
IPV6_PATHMTU ancillary data for each ICMPv6 packet too big message.
But it doesn't require this, when we don't receive ICMPv6 message.
Change ip6_notify_pmtu() function to be able use it directly from
ip6_output() to notify only one socket, and to notify all sockets
when ICMPv6 packet too big message received.
hselasky [Wed, 4 Mar 2015 09:30:03 +0000 (09:30 +0000)]
Updates for the Mellanox ethernet driver
> List of fixes:
* use correct format for GID printouts
* double array indexing
* spelling in printouts
* void pointer arithmetic
* allow more receive rings
* correct maximum number of transmit rings
* use "const" instead of "static" for constants
* check for invalid VLAN tags
* check for lack of IRQ resources
> Added more hardware specific defines
> Added more verbose printouts of firmware status codes
Sponsored by: Mellanox Technologies
MFC after: 3 days
adrian [Wed, 4 Mar 2015 03:48:11 +0000 (03:48 +0000)]
Fix both arge0 and arge1 to work correctly on the AP135.
* Force the arge0 interface to not use a PHY for speed negotiation
for now. It'd be nice to do it, but right now the RGMII interface
to the switch needs to stay at 1000/full in order to match what
the switch side of the port is programmed as.
So until that's all sorted out, disconnect arge0 from the PHY
and leave it at fixed at 1000/full.
I noticed this when I tried using a busted ethernet cable that
forced the PHY to negotiate 100/full. The switch was fine and
it negotiated to 100/full, but then arge0 saw the link update
and set the speed to 100/full when the switch side of that
hook up was set to 1000/full. Tsk.
* When using argemdio, the mdio device resets and initialises
the MAC, /not/ the arge_attach (or, as I discovered, arge_init.)
So arge1 wasn't being fully initialised and thus no traffic
would ever flow.
So until I tidy up that mess, just create an argemdio bus for
arge1. It's totally fine; it won't do anything or find anything
attached to it.
Tested:
* AP135 reference board - both arge0 and arge1 now work.
allanjude [Tue, 3 Mar 2015 23:20:18 +0000 (23:20 +0000)]
Add a new safetly belt to freebsd-update to prevent a user doing a minor update (-pX) while having an unfinished major upgrade (9.x to 9.y)
Safetly belt can be disabled with the -F flag
Additionally, add the --not-running-from-cron flag they bypasses the TTY requirement, and allows freebsd-update to be invoked by orchestration frameworks, scripts, or otherwise.
ken [Tue, 3 Mar 2015 22:49:07 +0000 (22:49 +0000)]
Add density code for DAT-72, and notes on DAT-160.
As it turns out, the density code for DAT-160 (0x48) is the same
as for SDLT220. Since the SDLT values are already in the table,
we will leave them in place.
Thanks to Harald Schmalzbauer for confirming the DAT-72 density code.
lib/libmt/mtlib.c:
Add DAT-72 density code, and commented out DAT-160 density
code. Explain why DAT-160 is commented out. Add notes
explaining where the bpi values for these formats came from.
usr.bin/mt/mt.1:
Add DAT-72 density code, and add a note explaining that
the SDLTTapeI(110) density code (0x48) is the same as
DAT-160.
Create nd6_ns_output_fib() function with extra argument fibnum. Use it
to initialize mbuf's fibnum. Uninitialized fibnum value can lead to
panic in the routing code. Currently we use only RT_DEFAULT_FIB value
for initialization.
andrew [Tue, 3 Mar 2015 09:48:19 +0000 (09:48 +0000)]
Fix the pl011 driver to work when the uart will write in zero cycles. This
is the case, depending on the options, in some of the ARM hardware
simulators. In these cases we don't get an interrupt so will need to
schedule the task to write more data to the uart.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
neel [Mon, 2 Mar 2015 20:13:49 +0000 (20:13 +0000)]
Fix warnings/errors when building vmm.ko with gcc:
- fix warning about comparison of 'uint8_t v_tpr >= 0' always being true.
- fix error triggered by an empty clobber list in the inline assembly for
"clgi" and "stgi"
- fix error when compiling "vmload %rax", "vmrun %rax" and "vmsave %rax". The
gcc assembler does not like the explicit operand "%rax" while the clang
assembler requires specifying the operand "%rax". Fix this by encoding the
instructions using the ".byte" directive.
hrs [Mon, 2 Mar 2015 20:00:03 +0000 (20:00 +0000)]
Fix group membership of cloned interfaces when one is moved by
if_vmove().
In if_vmove(), if_detach_internal() and if_attach_internal() were
called in series to detach and reattach the interface. When
detaching, if_delgroup() was called and the interface leaves all of
the group membership. And then upon attachment, if_addgroup(ifp,
IFG_ALL) was called and it joined only "all" group again.
This had a problem. Normally, a cloned interface automatically joins
a group whose name is ifc_name of the cloner in addition to "all"
upon creation. However, if_vmove() removed the membership and did
not restore upon attachment.
ken [Mon, 2 Mar 2015 18:09:49 +0000 (18:09 +0000)]
Change the sa(4) driver to check for long position support on
SCSI-2 devices.
Some older tape devices claim to be SCSI-2, but actually do support
long position information. (Long position information includes
the current file mark.) For example, the COMPAQ SuperDLT1.
So we now only disable the check on SCSI-1 and older devices.
sys/cam/scsi/scsi_sa.c:
In saregister(), only disable fetching long position
information on SCSI-1 and older drives. Update the
comment to explain why.
hrs [Mon, 2 Mar 2015 17:30:26 +0000 (17:30 +0000)]
Implement Enhanced DAD algorithm for IPv6 described in
draft-ietf-6man-enhanced-dad-13.
This basically adds a random nonce option (RFC 3971) to NS messages
for DAD probe to detect a looped back packet. This looped back packet
prevented DAD on some pseudo-interfaces which aggregates multiple L2 links
such as lagg(4).
The length of the nonce is set to 6 bytes. This algorithm can be disabled by
setting net.inet6.ip6.dad_enhanced sysctl to 0 in a per-vnet basis.
Reported by: hiren
Reviewed by: ae
Differential Revision: https://reviews.freebsd.org/D1835
adrian [Mon, 2 Mar 2015 02:24:46 +0000 (02:24 +0000)]
Bring over the initial QCA955x SoC support framework.
This is enough to bring up the basic SoC support.
What works thus far:
* The mips74k core, pll setup, and UART (or else well, stuff would
be really difficult..)
* both USB 2.0 EHCI controllers
* on-board 2GHz 3x3 wifi (the other variant has 2GHz/5GHz wifi on-chip);
* arge0 - not yet sure why arge1 isn't firing off interrupts and thus
handling traffic, but I will soon figure it out and fix it here.
Tested:
* AP135 reference design, QCA9558 SoC, pretending to be an 11n
2GHz AP.
TODO:
* There's an interrupt mux hooking up devices to IP2 and IP3 - but it's
not a read-and-clear or write-to-clear register. So, trying to use it
naively like I have been ends up with massive interrupt storms.
For now the things that share those interrupts can just take them as
shared interrupts and try to play nice.
* There's two PCIe root complexes /and/ one of them can actually be
a PCIe device endpoint. Yes, you heard right. I have to teach the
AR724x PCIe bridge code to handle multiple instances with multiple
memory/irq regions, and then there'll be RC support, but EP support
isn't on my TODO list.
* I'm not sure why arge1 isn't up and running. I'll go figure that
out soon and fix it here.
Thankyou to Qualcomm Atheros for providing me with hardware and
an abundance of documentation about these things.
adrian [Mon, 2 Mar 2015 02:14:44 +0000 (02:14 +0000)]
Lay some groundwork for having this stuff hang off of AHB rather than
the CPU nexus.
* Add ahb as a possible bus attachment
* Lay a comment down to remind me or whoever else ends up trying
to debug why the EEPROM isn't mapped in as to what's going on.
imp [Sun, 1 Mar 2015 21:41:37 +0000 (21:41 +0000)]
nandfs_meta_bread() calls bread() which can set bp to NULL in some
error cases. Calling brelse() with a NULL pointer is not allowed,
so only call brelse() when the bp is non-NULL.
Reported by: Maxime Villard (reported as uninitialized variable)
nwhitehorn [Sun, 1 Mar 2015 21:20:18 +0000 (21:20 +0000)]
Merge r278429 from ppc64:
Fix an extremely subtle concurrency bug triggered by running on 32-thread
POWER8 systems. During thread switch, there was a very small window when
the stack pointer was set to the stack pointer of the outgoing thread, but
after the lock on that thread had already been released.
If, during that window, the outgoing thread were rescheduled on another CPU
and begin execution and an exception were taken on the original CPU, the
trap handler and the outgoing thread would simultaneously execute on the same
stack, causing memory corruption. Fix this by making sure to release the
old thread only after cpu_switch() is done with its stack.
kargl [Sun, 1 Mar 2015 20:32:47 +0000 (20:32 +0000)]
Give compilers a stronger hint to inline the functions
pzero[f], qzero[f], pone[f], and qone[f]. While here
fix the function declarations in accordance with style(9).
kargl [Sun, 1 Mar 2015 20:26:03 +0000 (20:26 +0000)]
When j0() and j1() were converted to j0f() and j1f(), the threshold
values for the different invervals were not converted correctly.
Adjust the threshold values to values, which should agree with the
comments.
Now, depending upon how things are wired up, the second CPU port (MAC1)
can be wired to either the switch (port6), or through port5's PHY, bypassing
the GMAC+switch entirely. Ie, it can pretend to be a boring PHY, saving
system designers from having to include a separate PHY for a "WAN" port.
Here's the rub - the AP135 board (QCA955x SoC) hooks up arge0 to
the second CPU port on the AR8327, but it's hooked up as RGMII.
So, in order to hook it up to the rest of the switch, it isn't configured
as a separate PHY - OpenWRT has it setup as connected via RGMII to
GMAC6 and (I'm guessing) it's set to be a WAN port by configuring up
port-based VLANs or something.
Thus, with a port mask of 0x3f, GMAC6 was never allowed to receive traffic
from any other port. It could transmit fine, but not receive anything.
So, now it works enough for me to continue doing board bootstrapping.
Note, this isn't enough to make the QCA955x + AR8327 work - there's
a bunch of uncommitted work to both the platform SoC (interrupt handling,
ethernet, etc) and the ethernet switch (register access space, setup, etc)
that needs to happen. However, this particular change is also relevant to
other SoCs, like the AR934x and AR7161, both of which can be glued to
this switch.
Tested:
* AP135 development board
TODO:
* Figure out whether I can somehow abuse another port mode to have this
be a pass-through PHY, or whether I should just create some more boot
time hints to explicitly set up port-based isolation so this works
in a more useful way by default.
dumbbell [Sun, 1 Mar 2015 12:54:22 +0000 (12:54 +0000)]
vt(4): Add support to "downgrade" from eg. vt_fb to vt_vga
The main purpose of this feature is to be able to unload a KMS driver.
When going back from the current vt(4) backend to the previous backend,
the previous backend is reinitialized with the special VDF_DOWNGRADE
flag set. Then the current driver is terminated with the new "vd_fini"
callback.
In the case of vt_fb and vt_vga, this allows the former to pass the
vgapci device vt_fb used to vt_vga so the device can be rePOSTed.
andrew [Sun, 1 Mar 2015 10:04:14 +0000 (10:04 +0000)]
Fix the dtrace ARM atomic compare-and-set functions. These functions are
expected to return the data in the memory location pointed at by target
after the operation. The FreeBSD atomic functions previously used return
either 0 or 1 to indicate if the comparison succeeded or not respectively.
With this change these functions only support ARMv6 and later are supported
by these functions.
adrian [Sun, 1 Mar 2015 07:00:34 +0000 (07:00 +0000)]
Add very initial QCA955x awareness to the GPIO code.
There's a lot more to come - the QCA955x has a bunch more GPIO MUX
configuration, reminiscent of what the ARM chips let you do - but
it'll have to come later.
rstone [Sun, 1 Mar 2015 00:52:34 +0000 (00:52 +0000)]
Add functions for parsing the iovctl config file
Add two functions for parsing the iovctl config file. The config
file is parsed using libucl[1], which accepts most YAML files and
a superset of JSON. The first function is an ad-hoc parser that
searches the file for the PF.DEVICE configuration value. We need
to know that value in order to fetch the schema from the kernel,
and we need the schema in order to be able to fully parse the file.
The second function parses the config file and validates it
against a schema. This function will exit with an error message
if any validation error occurs. If it succeeds, the configuration
is returned as an nvlist suitable for passing to the kernel.
rstone [Sun, 1 Mar 2015 00:52:28 +0000 (00:52 +0000)]
Add iovctl functions for validating config
Add an function to iovctl that validates the configuration against
a schema. This function is able to assume that the parser has
done most of the validation already and it's only responsible for
applying default VF values specified in the config file, confirming
that all required parameters have been set and that no invalid VF
numbers have been specified.
rstone [Sun, 1 Mar 2015 00:40:57 +0000 (00:40 +0000)]
Pass SR-IOV configuration to kernel using an nvlist
Pass all SR-IOV configuration to the kernel using an nvlist. The
main benefit that this offers is flexibility. It allows a driver
to accept any number of parameters of any type supported by the
SR-IOV configuration infrastructure with having to make any
changes outside of the driver.
It also offers the user very fine-grained control over the
configuration of the VFs -- if they want, they can have different
configuration applied to every VF.
rstone [Sun, 1 Mar 2015 00:40:51 +0000 (00:40 +0000)]
Add function to validate the consistency of SR-IOV config
Add a function that validates that the user-provided SR-IOV
configuration is valid. This includes basic checks that the
structure of the configuration is correct (e.g. all required
configuration nodes are present) as well as validating against
a configuration schema.
The schema validation consists of:
- Ensuring that all required config parameters are present.
- If the schema defines a default value for a parameter,
adding the default value if the parameter is not set.
- Ensuring that no parameters are specified in the config
that are not defined in the schema.
- Ensuring that have the correct type defined in the schema.
- Ensuring that no configuration nodes are present for devices
that do not exist. For example, if 2 VFs are configured,
then we validate that a node called VF-5 does not exist.
rstone [Sun, 1 Mar 2015 00:40:26 +0000 (00:40 +0000)]
Allocate PCI I/O memory spaces for VFs
When creating VFs, we must size each SR-IOV BAR on the PF and
allocate a configuous I/O memory window large enough for every VF.
However, the window only needs to be aligned to a boundary equal
to the size of the window for a single VF.
When a VF attempts to allocate an I/O memory resource, we must
intercept the request in the pci driver and pass it off to the
SR-IOV code, which will allocate the correct window from the
pre-allocated memory space for the PF.
Inform the pci driver about the size and address of the BARs on
the VF when the VF is created. This is required by pciconf -b and
bhyve.
rstone [Sun, 1 Mar 2015 00:40:19 +0000 (00:40 +0000)]
Emulate the Device ID and Vendor ID registers for VFs
The SR-IOV standard requires VFs to read all-ones when the VID
and DID registers are read. The VMM (hypervisor) is required to
emulate them instead. Make pci_read_config() do this emulation.
Change pci_user.c to use pci_read_config() to read config space
registers instead of going directly to the pcib so that the
emulated VID/DID registers work correctly on VFs. This is
required both for pciconf and bhyve PCI passthrough.