Andriy Gapon [Wed, 18 May 2016 08:02:49 +0000 (08:02 +0000)]
zfsctl_freebsd_root_lookup: gfs_vop_lookup may return a doomed vnode
gfs code is (almsot) completely agnostic of FreeBSD VFS locking, so it
does not handle doomed but not yet dead vnodes and may return them.
Check for those vnodes here and retry a lookup.
Note that ZFS and gfs have additional protections that ensure that a
parent vnode of the current vnode is never doomed.
The fixed problem is an occasional failure to lookup a 'snapshot' or
'shares' directories under .zfs.
Note that for the above reason all uses of zfsctl_root_lookup() are
better be replaced with VOP_LOOKUP.
Adrian Chadd [Wed, 18 May 2016 07:01:22 +0000 (07:01 +0000)]
[siba] use the correct SPROM offsets.
I transcribed the linux ssb offsets and .. didn't pick up that our SIBA
SPROM code has an offset of 0x1000.
This fixes a bunch of odd parsing values that showed up when I tried
using a newer NIC. The NIC still doesn't yet work but now the SPROM
values are right.
Warner Losh [Wed, 18 May 2016 06:01:18 +0000 (06:01 +0000)]
Make armv6 hard float abi by default. Kill armv6hf.
Allow CPUTYPE=soft to build the current soft-float abi libraries.
Add UPDATING entry to announce this.
Warner Losh [Wed, 18 May 2016 05:59:05 +0000 (05:59 +0000)]
Fix several instances where the boot loader ignored pager_output
return value when it could return 1 (indicating we should stop).
Fix a few instances of pager_open() / pager_close() not being called.
Actually use these routines for the environment variable printing code
I just committed.
Warner Losh [Wed, 18 May 2016 05:58:58 +0000 (05:58 +0000)]
Fix build breakage on arm64 by papering over the problem. We implement
a slightly non-standard %S that's more useful in the UEFI environment,
so ignore printf errors. There's no good cast to use. We'll need to
revisit this in the future.
Adrian Chadd [Wed, 18 May 2016 05:56:25 +0000 (05:56 +0000)]
[bwn] add initial 5xx firmware API support
* Add the new TX/RX frame formats;
* Use the right TX/RX format based on the frame info;
* Disable the 5xx firmware check, since now it should
somewhat work (but note, we don't yet use it unless
you manually add ucode11/initvals11 from the 5.x driver
to bwn-kmod-firmware;
* Misc: update some comments/debugging now I know what's
actually going on.
Tested:
* BCM4321MC, STA mode, both 4xx and 666 firmware, DMA mode
TODO:
* The newer firmware ends up logging "warn: firmware state (0)";
not sure yet what's going on there. But, yes, it still works.
I'm committing this via a BCM4321MC, 11a station, firmware
rev 666.
Obtained from: Linux b43 (TX/RX descriptor format for 5xx)
Scott Long [Wed, 18 May 2016 04:35:58 +0000 (04:35 +0000)]
Import the 'iflib' API library for network drivers. From the author:
"iflib is a library to eliminate the need for frequently duplicated device
independent logic propagated (poorly) across many network drivers."
Participation is purely optional. The IFLIB kernel config option is
provided for drivers that want to transition between legacy and iflib
modes of operation. ixl and ixgbe driver conversions will be committed
shortly. We hope to see participation from the Broadcom and maybe
Chelsio drivers in the near future.
Mark Johnston [Wed, 18 May 2016 03:50:21 +0000 (03:50 +0000)]
Micro-optimize sleepq_broadcast().
- Avoid a conditional branch on the return value of sleepq_resume_thread()
by ORing its return value into the boolean wakeup_swapper. This is
consistent with other sleepqueue functions which just pass this return
value to their caller.
- sleepq_resume_thread() unconditionally removes the thread from its queue,
so there's no need to maintain a pointer to the next element in the queue.
Mark Johnston [Wed, 18 May 2016 03:34:02 +0000 (03:34 +0000)]
Remove the MUTEX_DEBUG kernel option.
It has no counterpart among the other lock primitives and has been a
no-op for years. Mutex consistency checks are generally done whenver
INVARIANTS is enabled.
Do enable io accounting for read-only mounts and mounts which are
remounted to writeable after initial read-only. Assign to
dev->si_mountpt earlier to account the accesses done at the mount
time.
Warner Losh [Tue, 17 May 2016 21:25:18 +0000 (21:25 +0000)]
Implement uuid-to-string and uuid-from-string. uuid-from-string takes
a string, interprets it as a standard UUID, and returns a binary from
of the UUID. uuid-to-string does the reverse. The binary UUID is in
allocated memory, so you'll need to free it with 'free' after you are
done using it. It won't be automatically garbage collected. Likewise
with the string...
John Baldwin [Tue, 17 May 2016 19:48:28 +0000 (19:48 +0000)]
Rework managing hotplug commands with command completions.
Previously the command completion interrupt would post any pending
command immediately before pcib_pcie_hotplug_update() had been
run to inspect the current status. Now, the command completion
interrupt merely clears the flag and stops the timer assuming that
the caller is always going to call pcib_pcie_hotplug_update() to
generate the next hotplug command if one is needed.
While here, fix a bug for systems with command completion where the
old (existing) value was written to the slot control register instead
of the new value. This fixes the complaint about a missing hotplug
interrupt on my T400.
John Baldwin [Tue, 17 May 2016 19:34:07 +0000 (19:34 +0000)]
Document the formatting requirements of location and pnpinfo strings.
devd requires location and pnpinfo strings generated by bus drivers
to be formatted as a list of name=value keypairs. Non-conforming
bus drivers cause devd to mis-parse device events for these buses.
Note that this documents the desired requirements. devctl_safe_quote()
doesn't yet escape backslash characters, and devd doesn't handle escaped
characters in quoted values.
Pedro F. Giffuni [Tue, 17 May 2016 18:20:33 +0000 (18:20 +0000)]
makefs(1): use all the random(3) range.
The generation number is uint32_t so we can fit the complete range
of random(3). We could have used arc4random() but the result would
be unpredictable and it would prohibit reproducible builds.
While here add a comment where seeding is done: this affects
reproducible builds and might have to be re-visited to use a
release dependent value.
Emmanuel Vadot [Tue, 17 May 2016 17:46:12 +0000 (17:46 +0000)]
Add driver for "generic-ohci" as defined by FDT.
If platform support EXT_RESOURCES, clocks and resets are handled out of
the box.
If not driver can be subclassed using the generic_usb interface.
generic_usb name was choosed because at one point I'll add generic-ehci
FDT driver.
Warner Losh [Tue, 17 May 2016 17:08:13 +0000 (17:08 +0000)]
Per Ravi Pokala's suggestion, rewrite the g_reset_bio description to
be clearer. It also describes it with more nuance. Add missing MLINKS
noticed by trasz@. Bump the date.
Alan Somers [Tue, 17 May 2016 15:17:23 +0000 (15:17 +0000)]
Speed up vdev_geom_open_by_guids
Speedup is hard to measure because the only time vdev_geom_open_by_guids
gets called on many drives at the same time is during boot. But with
vdev_geom_open hacked to always call vdev_geom_open_by_guids, operations
like "zpool create" speed up by 65%.
* Read all of a vdev's labels in parallel instead of sequentially.
* In vdev_geom_read_config, don't read the entire label, including
the uberblock. That's a waste of RAM. Just read the vdev config
nvlist. Reduces the IO and RAM involved with tasting from 1MB to
448KB.
Warner Losh [Tue, 17 May 2016 14:10:45 +0000 (14:10 +0000)]
It sure would be nice to use printf with wide strings. Implement %S to
do that. The C_WIDEOUT flag indicates that the console supports
it. Mark the EFI console as supporting this.
Bjoern A. Zeeb [Tue, 17 May 2016 13:12:26 +0000 (13:12 +0000)]
The GIC (v2 at least) has a bit in the TYPER register to indicate whether the GIC
supports the Security Extensions or not. This bit is not the same as the CPU one.
Currently we are not checking for either before trying to write to the special
registers. This can lead to problems on hardware or simulators that do not
provide the security extensions. Add the missing checks. Their interactions with
the CPU flag is not entirely clear to me but using a macro will make it easier
to quickly adjust the condition once the CPU bits are sorted as well.
Andrew Turner [Tue, 17 May 2016 12:46:50 +0000 (12:46 +0000)]
Clean up the GICv3 intrng code:
* In gic_v3_attach free the correct data on failure.
* Implement gic_v3_teardown_intr.
* Update the panic string when enabling/disabling an invalid interrupt.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Add implementation of robust mutexes, hopefully close enough to the
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Randall Stewart [Tue, 17 May 2016 09:53:22 +0000 (09:53 +0000)]
This small change adopts the excellent suggestion for using named
structures in the add of a new tcp-stack that came in late to me
via email after the last commit. It also makes it so that a new
stack may optionally get a callback during a retransmit
timeout. This allows the new stack to clear specific state (think
sack scoreboards or other such structures).
Sponsored by: Netflix Inc.
Differential Revision: http://reviews.freebsd.org/D6303
Andriy Gapon [Tue, 17 May 2016 07:56:05 +0000 (07:56 +0000)]
zfs_ioc_rename: fix a reversed condition
FreeBSD zfs_ioc_rename() has an option, not present upstream, that
allows to rename snapshots without unmounting them first. I am not sure
what is a rationale for that option, but its actual behavior was the
opposite of the intended behavior. That is, by default the snapshots
were not unmounted.
The option was introduced as part of a large update from upstream in
r248498.
One of the consequences was a havoc under .zfs/snapshot after the rename.
The snapshots got new names but were mounted on top of directories with
old names, so readdir would list the new names, but lookup would still
find the old mounts.
Add initial support for negotiating iSER parameters to iscsid(8). Some
rework might be needed to support asymetrical limits, but this should be
ok for now.
Obtained from: Mellanox Technologies (earlier version)
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Make named objects set-aware. Now it is possible to create named
objects with the same name in different sets.
Add optional manage_sets() callback to objects rewriting framework.
It is intended to implement handler for moving and swapping named
object's sets. Add ipfw_obj_manage_sets() function that implements
generic sets handler. Use new callback to implement sets support for
lookup tables.
External actions objects are global and they don't support sets.
Modify eaction_findbyname() to reflect this.
ipfw(8) now may fail to move rules or sets, because some named objects
in target set may have conflicting names.
Note that ipfw_obj_ntlv type was changed, but since lookup tables
actually didn't support sets, this change is harmless.
Adrian Chadd [Tue, 17 May 2016 07:15:25 +0000 (07:15 +0000)]
[bwn] add in bwn n-phy linking.
* The default kernel and options won't build the GPL PHY bits;
* bwn(4) defaults to building as a module anyway!;
* If BWN_GPL_PHY is specified in the config file, and you uncomment
the GPL PHY bits in the module Makefile, you'll get a working
N-PHY.
This is specifically designed to be obtuse for now, as I don't want
to flip it on by default. It's easy enough for people to flip on
and build, and it's a module so the default GENERIC kernel won't be
GPL tainted.
I'll have to add an actual HAL layer that allows the GPL PHY to be loaded
before if_bwn so it can be "magic", but that'll come later.
Adrian Chadd [Tue, 17 May 2016 06:52:53 +0000 (06:52 +0000)]
[bhnd] Finish bhnd(4) PCI/PCIe-G1 hostb support.
Now that we've got access to SPROM and can access board identification,
this implements all known remaining hardware work-arounds for the bhnd(4)
PCI and PCIe-G1 cores operating endpoint mode.
Additionally, this adds an initial set of skeleton PCIe-G2 hostb and pcib
drivers, required by fullmac and newer softmac devices.
PCIe PHY needs different initialization on MT7628/MT7688 SoCs than it does
on MT7620.
However, LEDE (and OpenWRT) dts files have the PCIe node for MT7628/MT7688
as compatible with mt7620-pci.
We already can handle this properly in our driver, so we just need to add
compat strings to fbsd-mt7628an.dtsi and the PCIe driver.
Approved by: adrian (mentor)
Sponsored by: Smartcom - Bulgaria AD
Differential Revision: https://reviews.freebsd.org/D6395
This is an import of the reworked LEDE dts files. Besides other things
they make it easier for us to reuse.
The only diffs left are for the following SoCs:
MT7620A (fbsd-mt7620a.dtsi)
MT7621 (fbsd-mt7621.dtsi)
MT7628 (fbsd-mt7628an.dtsi)
RT3883 (fbsd-rt3883.dtsi)
So we include the fbsd-*.dtsi files at the end of the original LEDE dtsi
files, using '#include "fbsd-xxxx.dtsi"'.
For example, for MT7621, the LEDE dtsi file is mt7621.dtsi. At the end of
it we add:
#include "fbsd-mt7621.dtsi"
Approved by: adrian (mentor)
Obtained from: LEDE project
Sponsored by: Smartcom - Bulgaria AD
Differential Revision: https://reviews.freebsd.org/D6394
Andrew Rybchenko [Tue, 17 May 2016 06:28:03 +0000 (06:28 +0000)]
sfxge(4): only raise an exception after MC assert or reboot in the common code
Fix efx_mcdi_request_poll so it only raises an exception if EIO is
reported from a detected MC assert or reboot. This prevents
an unnecessary exception being raised if an MCDI response error code
is trandlated to EIO.
Submitted by: Andy Moreton <amoreton at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6392
Andrew Rybchenko [Tue, 17 May 2016 06:26:02 +0000 (06:26 +0000)]
sfxge(4): fix Medford timer quantum calculation in common code
The event/timer block used sysclk in Huntington, but has been
moved to the dpcpu clock domain for Medford. Fix the computed
timer quantum to use the right clock.
Submitted by: Andy Moreton <amoreton at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6389
Andrew Rybchenko [Tue, 17 May 2016 06:25:00 +0000 (06:25 +0000)]
sfxge(4): query and use current MTU if setting the MTU fails
This allows the driver to fall back to the largest usable MTU if a
user attempts to configure an unprivileged function with an MTU higher
than that of the attached port.
Submitted by: Mark Spender <mspender at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D6387