mav [Thu, 18 Aug 2016 11:02:01 +0000 (11:02 +0000)]
MFC r303514: Fix NTBT_QP_LINKS negotiation.
I believe it never worked correctly for more the one queue even in Linux.
This fixes case when one of consumer drivers is not loaded on one side,
but its queues still announced as ready if something else brought link up.
mav [Thu, 18 Aug 2016 11:00:48 +0000 (11:00 +0000)]
MFC r303494: Once more refactor KPI between ntb_transport(4) and if_ntb(4).
New design allows to attach multiple consumers to ntb_transport(4) instance.
Previous design obtained from Linux theoretically allowed that, but was not
practically usable (Linux also has only one consumer driver now).
mav [Thu, 18 Aug 2016 10:59:12 +0000 (10:59 +0000)]
MFC r303429, r303437, r303551:
Once more refactor KPI between NTB hardware and consumers.
New design allows hardware resources to be split between several consumers.
For example, one BAR can be dedicated for remote memory access, while other
resources can be used for packet transport for virtual Ethernet interface.
And even without resource split, this code allows to specify which consumer
driver should attach the hardware.
From some points this makes the code even closer to Linux one, even though
Linux does not provide the described flexibility.
mav [Thu, 18 Aug 2016 10:54:21 +0000 (10:54 +0000)]
MFC r303266: Postpone ntb_get_msix_info() till we need to negotiate MSIX.
Calling it earlier increases the window when MSIX info may change.
This change does not solve the problem completely, but seems logical.
Complete solution should probably include link reset in case of MSIX
remap to trigger new negotiation, but we have no way to get notified
about that now.
mav [Thu, 18 Aug 2016 10:53:03 +0000 (10:53 +0000)]
MFC r302531: Revert odd change, setting limit registers before base.
I don't know what errata is mentioned there, I was unable to find it, but
setting limit before the base simply does not work at all. According to
specification attempt to set limit out of the present window range resets
it to zero, effectively disabling it. And that is what I see in practice.
Fixing this properly disables access for remote side to our memory until
respective xlat is negotiated and set. As I see, Linux does the same.
mav [Thu, 18 Aug 2016 10:51:53 +0000 (10:51 +0000)]
MFC r302529: Remove callout_reset(link_work) from ntb_transport_attach().
At that point link is quite likely not established yet, so messing with
scratch registers is premature there. Original commit message mentioned
code diff reduction from Linux, but this line is not present in Linux now.
mav [Thu, 18 Aug 2016 10:50:27 +0000 (10:50 +0000)]
MFC r302508: Disable SB01BASE_LOCKUP workaround when split BARs disabled.
For some reason hack with sending MSI-X interrupts by writing to remote
LAPIC memory works only for 32-bit BARs, that are available only if split
BARs mode is enabled in BIOS. If it is not, complain loudly and fall back
to less efficient workaround.
mav [Thu, 18 Aug 2016 10:46:29 +0000 (10:46 +0000)]
MFC r302499: Improve checksum "offload" support.
For compatibility reasons make driver not report any checksum offload by
default, since there is indeed none. But if administrator knows that
interface is used only for local traffic, he can enable fake checksum
offload manually on both sides to save some CPU cycles, since the data
are already protected by CRC32 of PCIe link.
mav [Thu, 18 Aug 2016 10:43:59 +0000 (10:43 +0000)]
MFC r302493: Reimplement doorbell register emulation for NTB_SB01BASE_LOCKUP.
This allows at least first three doorbells to work very close to normal
hardware, properly signaling events to upper layers without spurious or
lost events. Doorbells above the first three may still report spurious
events due to lack of reliable information, but they are rarely used.
mav [Thu, 18 Aug 2016 10:42:48 +0000 (10:42 +0000)]
MFC r302491: Switch ctx_lock from mutex to rmlock.
It is odd idea to serialize different MSI-X vectors. Use of rmlocks
here allows them to execute in parallel, but still protects ctx.
If upper layers require any additional serialization -- they can
do it by themselves.
mav [Thu, 18 Aug 2016 10:39:00 +0000 (10:39 +0000)]
MFC r302484: NewBus'ify NTB subsystem.
This follows NTB subsystem modularization in Linux, tuning it to FreeBSD
native NewBus interfaces. This change allows to support different types
of hardware with different drivers, support multiple NTB instances in a
system, ntb_transport module use for needs other then if_ntb, etc.
mav [Thu, 18 Aug 2016 10:24:31 +0000 (10:24 +0000)]
MFC r302482: Fix NTB_SDOORBELL_LOCKUP workaround.
Since SBARxSZ register can be write-once, it can be unusable for disabling
the SBAR. For such case also set SBARxBASE to zero to not intersect with
config BAR.
kp [Wed, 17 Aug 2016 15:14:21 +0000 (15:14 +0000)]
MFC r289932, r289940:
PF_ANEQ() macro will in most situations returns TRUE comparing two identical
IPv4 packets (when it should return FALSE). It happens because PF_ANEQ() doesn't
stop if first 32 bits of IPv4 packets are equal and starts to check next 3*32
bits (like for IPv6 packet). Those bits containt some garbage and in result
PF_ANEQ() wrongly returns TRUE.
Fix: Check if packet is of AF_INET type and if it is then compare only first 32
bits of data.
kp [Wed, 17 Aug 2016 09:24:46 +0000 (09:24 +0000)]
MFC r302497:
pf: Map hook returns onto the correct error values
pf returns PF_PASS, PF_DROP, ... in the netpfil hooks, but the hook callers
expect to get E<foo> error codes.
Map the returns values. A pass is 0 (everything is OK), anything else means
pf ate the packet, so return EACCES, which tells the stack not to emit an ICMP
error message.
1) All situations with glob(3) error return codes are well defined by
POSIX, so rewrite old sporadic errors processing to match those
definitions.
Including subcases:
Both C99 and POSIX directly prohibits any standard function to set errno
to 0. Breaking this rule in 2001 NetBSD hack was imported which attempts
to workaround very limited glob(3) return codes amount.
Use POSIX-compatible workaround now with E2BIG which can't comes from
other functions used instead of prohibited 0.
Process errors happpens in (*readdirfunc)() too, as POSIX requires.
Per POSIX GLOB_NOCHECK should return original pattern,
unmodified, if no matches found. But our code strips all '\'
returning it. Rewrite the code to allow to return original pattern.
GLOB_ERR and gl_errfunc are supposed to work only for real directories
per POSIX, so don't act on missing or plain files for ENOENT or ENOTDIR
(as TODO in the code suggested).
Remove the hack in the manpage describing how to skip ENOENT and ENOTDIR
in gl_errfunc, it is unneeded now.
Per POSIX GLOB_ERR must be considered even if gl_errfunc is not set,
old code skips it in that case.
2) For near MAXPATHLEN long pathes old glob(3) code can operate on
truncated results, prevent it in several places.
3) Results was not sorted according to collate as POSIX requires.
4) globtilde() forget to convert expanded user home dir from multibyte to
wide chars. Moreover, those chars are addded as not protected, so
can be treated as special chars.
5) Backward hack for EILSEQ in g_Ctoc() was not implemented, so all
pathes with illegal byte sequences are skipped as result, implement it now.
6) GLOB_BRACE was somehow broken. First it repeatedly calls glob0() in
globexp1() recursive calls, but glob0() was not supposed to be called
repeatedly in the original code. It finalize results by possible adding
original pattern for no match case, may return GLOB_NOMATCH error and
by sorting all things. Original pattern adding or GLOB_NOMATCH error
can happens each time glob0() called repeatedly, and sorting happens
for one item only, all things are never sorted. Second, f.e. "a{a"
pattern does not match "a{a" file but match "a" file instead.
Third, some errors (f.e. for limits or overflow) can be ignored
by GLOB_BRACE code because it forces return (0).
Add non-finalizing flag to glob0() and make globexp0() wrapper around
recursively called globexp1() to finalize things like glob0() does.
Reorganize braces code to work correctly.
7) Don't allow MB_CUR_MAX * strlen overallocation hits GLOB_LIMIT_STRING
(ARG_MAX) limit, use final string length, not malloced space for it.
jhb [Mon, 15 Aug 2016 21:10:41 +0000 (21:10 +0000)]
MFC 302900,302902,302921,303461,304009:
Add a mask of optional ptrace() events.
302900:
Add a test for user signal delivery.
This test verifies we get the correct ptrace event details when a signal
is posted to a traced process from userland.
302902:
Add a mask of optional ptrace() events.
ptrace() now stores a mask of optional events in p_ptevents. Currently
this mask is a single integer, but it can be expanded into an array of
integers in the future.
Two new ptrace requests can be used to manipulate the event mask:
PT_GET_EVENT_MASK fetches the current event mask and PT_SET_EVENT_MASK
sets the current event mask.
The current set of events include:
- PTRACE_EXEC: trace calls to execve().
- PTRACE_SCE: trace system call entries.
- PTRACE_SCX: trace syscam call exits.
- PTRACE_FORK: trace forks and auto-attach to new child processes.
- PTRACE_LWP: trace LWP events.
The S_PT_SCX and S_PT_SCE events in the procfs p_stops flags have
been replaced by PTRACE_SCE and PTRACE_SCX. PTRACE_FORK replaces
P_FOLLOW_FORK and PTRACE_LWP replaces P2_LWP_EVENTS.
The PT_FOLLOW_FORK and PT_LWP_EVENTS ptrace requests remain for
compatibility but now simply toggle corresponding flags in the
event mask.
While here, document that PT_SYSCALL, PT_TO_SCE, and PT_TO_SCX both
modify the event mask and continue the traced process.
302921:
Rename PTRACE_SYSCALL to LINUX_PTRACE_SYSCALL.
303461:
Note that not all optional ptrace events use SIGTRAP.
New child processes attached due to PTRACE_FORK use SIGSTOP instead of
SIGTRAP. All other ptrace events use SIGTRAP.
304009:
Remove description of P_FOLLOWFORK as this flag was removed.
hselasky [Mon, 15 Aug 2016 09:09:01 +0000 (09:09 +0000)]
MFC r303870:
Fix for use after free.
Clear the device description to avoid use after free because the
bsddev is not destroyed when the mlx5en module is unloaded. Only when
the parent mlx5 module is unloaded the bsddev is destroyed. This fixes
a panic on listing sysctls which refer strings in the bsddev after the
mlx5en module has been unloaded.
hselasky [Mon, 15 Aug 2016 09:00:46 +0000 (09:00 +0000)]
MFC r303765:
Keep a reference count on USB keyboard polling to allow recursive
cngrab() during a panic for example, similar to what the AT-keyboard
driver is doing.
jhb [Fri, 12 Aug 2016 19:43:06 +0000 (19:43 +0000)]
MFC 292894,292896: Add ptrace(2) reporting for LWP events.
292894:
Add ptrace(2) reporting for LWP events.
Add two new LWPINFO flags: PL_FLAG_BORN and PL_FLAG_EXITED for reporting
thread creation and destruction. Newly created threads will stop to report
PL_FLAG_BORN before returning to userland and exiting threads will stop to
report PL_FLAG_EXIT before exiting completely. Both of these events are
only enabled and reported if PT_LWP_EVENTS is enabled on a process.
292896:
Document the recently added support for ptrace(2) LWP events.
gjb [Fri, 12 Aug 2016 19:06:29 +0000 (19:06 +0000)]
MFC r303897:
Pass overrides to make(1) when building ports for arm/armv6
targets, similar to what is done for the run-autotools-fixup
override for non-arm targets.
hselasky [Fri, 12 Aug 2016 07:57:27 +0000 (07:57 +0000)]
MFC r301039:
Add support for simplex USB MIDI devices, which only provide BULK or
INTERRUPT endpoints for moving data in one direction, like the KeyRig
49 from M-Audio.
jhb [Wed, 10 Aug 2016 16:31:15 +0000 (16:31 +0000)]
MFC 273102:
Use '-e' to check if the virtio backing file has already been created.
The '-f' check works fine on a regular file but not if the backing file is
a device (e.g., /dev/md0). In this case it would print a misleading but
otherwise benign message about the backing file not being present.
dim [Tue, 9 Aug 2016 18:49:19 +0000 (18:49 +0000)]
MFC r270132 (by gabor):
- Do not look for more matching lines if -L is specified
Submitted by: eadler (based on)
MFC r296799 (by ian):
Fix a bug in bsdgrep that caused the program to hang in a tight loop for
some combinations of command line options and search patterns. The code was
examining regexec flags looking for a regcomp flag value. The fix is to
look in the struct field where the decoded regcomp flag was stored when the
regex was compiled.
With this fix, it's possible to build WITHOUT_GNU_GREP_COMPAT and
WITH_BSDGREP and have a usable GPL-free grep (which of course lacks gnugrep
extensions). It now passes the kyua tests except for one test that requires
the -z/--null-data gnu extension, and one test involving outputting context
lines across multiple files which appears to sometimes output an extra
delimiter line ("--") between matches (a rather obscure failure of a rather
obscure feature, so bsdgrep should be generally usable now).
MFC r303676:
Fix a segfault in bsdgrep when parsing the invalid extended regexps "?"
or "+" (these are invalid, because there is no preceding operand).
When bsdgrep attempts to emulate GNU grep in discarding and ignoring the
invalid ? or + operators, some later logic in tre_compile_fast() goes
beyond the end of the buffer, leading to a crash.
Fix this by bailing out, and reporting a bad pattern instead.
r289932 accidentally broke the rule skip calculation. The address family
argument to PF_ANEQ() is now important, and because it was set to 0 the macro
always evaluated to false.
This resulted in incorrect skip values, which in turn broke the rule
evaluations.
sephe [Mon, 8 Aug 2016 06:33:59 +0000 (06:33 +0000)]
MFC 303737
hyperv/storvsc: Claim SPC-3 conformance, thus enable UNMAP support
The Hyper-V on pre-win10 systems will only report SPC-2 conformance,
but it actually conforms to SPC-3. The INQUIRY response is adjusted
to propagate the SPC-3 version information to CAM.
Submitted by: Hongjiang Zhang <honzhan microsoft com>
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D7405
jhb [Sat, 6 Aug 2016 23:53:33 +0000 (23:53 +0000)]
MFC 303076,303225: Use MTX_SYSINIT for the VESA lock.
303076:
vesa: fix panic on suspend
Fix the following panic seen when migrating a FreeBSD guest on Xen:
panic: mtx_lock() of destroyed mutex @ /usr/src/sys/dev/fb/vesa.c:541
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001d2fa4f0
vpanic() at vpanic+0x182/frame 0xfffffe001d2fa570
kassert_panic() at kassert_panic+0x126/frame 0xfffffe001d2fa5e0
__mtx_lock_flags() at __mtx_lock_flags+0x15b/frame 0xfffffe001d2fa630
vesa_bios_save_restore() at vesa_bios_save_restore+0x78/frame 0xfffffe001d2fa680
vga_suspend() at vga_suspend+0xa3/frame 0xfffffe001d2fa6b0
isavga_suspend() at isavga_suspend+0x1d/frame 0xfffffe001d2fa6d0
bus_generic_suspend_child() at bus_generic_suspend_child+0x44/frame
[...]
This is caused because vga_sub_configure (which is called if the VGA adapter
is attached after VESA tried to initialize), points to vesa_configure, which
doesn't initialize the VESA mutex. In order to fix it, make sure
vga_sub_configure points to vesa_load, so that all the needed vesa
components are properly initialized.
303225:
Use MTX_SYSINIT for the VESA lock.
vesa_init_done isn't a reliable guard for the mutex init. If
vesa_configure() doesn't find valid VESA info it will not set
vesa_init_done, but the lock will remain initialized. Revert r303076
and use MTX_SYSINIT to deterministically init the lock.
jhb [Fri, 5 Aug 2016 17:13:25 +0000 (17:13 +0000)]
MFC 302181,302635: Disable MSI-X migration on older Xen hypervisors.
302181:
Add a tunable to disable migration of MSI-X interrupts.
The new 'machdep.disable_msix_migration' tunable can be set to 1 to
disable migration of MSI-X interrupts.
Xen versions prior to 4.6.0 do not properly handle updates to MSI-X
table entries after the initial write. In particular, the operation
to unmask a table entry after updating it during migration is not
propagated to the "real" table for passthrough devices causing the
interrupt to remain masked. At least some systems in EC2 are
affected by this bug when using SRIOV. The tunable can be set in
loader.conf as a workaround.
If the hypervisor version is smaller than 4.6.0. Xen commits 74fd00 and
70a3cb are required on the hypervisor side for this to be fixed, and those
are only included in 4.6.0, so stay on the safe side and disable MSI-X
interrupt migration on anything older than 4.6.0.
It should not cause major performance degradation unless a lot of MSI-X
interrupts are allocated.
ngie [Wed, 3 Aug 2016 01:19:10 +0000 (01:19 +0000)]
MFstable/11 r303691:
MFC r302550,r302551,r302552,r302553:
r302550:
Deobfuscate cleanup path in clnt_dg_create(..)
Similar to r300836 and r301800, cl and cu will always be non-NULL as they're
allocated using the mem_alloc routines, which always use
`malloc(..., M_WAITOK)`.
Deobfuscating the cleanup path fixes a leak where if cl was NULL and
cu was not, cu would not be free'd, and also removes a duplicate test for
cl not being NULL.
Similar to r300836, r301800, and r302550, cl and ct will always
be non-NULL as they're allocated using the mem_alloc routines,
which always use `malloc(..., M_WAITOK)`.
emaste [Mon, 1 Aug 2016 19:53:18 +0000 (19:53 +0000)]
MFC r303338: vt: lock Giant around kbd calls in CONS_GETINFO
Note that keyboards are stored in an array and are not freed (just
"unregistered" by clearing some fields) so a race would be limited to
obtaining stale information about an unregistered keyboard.
296063:
Lock the NDP default router list and count defrouter references.
This addresses a number of race conditions that can cause crashes as a
result of unsynchronized access to the list.
297397
Modify nd6_llinfo_timer() to acquire the nd6 lock before the LLE lock.
When expiring a neighbour cache entry we may need to look up the associated
default router, which requires the nd6 read lock. To avoid an LOR, the nd6
lock should be acquired first.
299213
Clean up callers of nd6_prelist_add().
nd6_prelist_add() sets *newp if and only if it is successful, so there's no
need for code that handles the case where the return value is 0 and
*newp == NULL. Fix some style bugs in nd6_prelist_add() while here.
MFC 303109: Update crashinfo to work with newer gdb from ports.
If gdb from ports is installed, use it instead of the base system gdb
to extract variables from a kernel. Note that base gdb and ports gdb
do not support the same options for invoking a single command in batch
mode, so a wrapper shell function is used. In addition, prefer kgdb
from ports when generating a backtrace if present.