andrew [Fri, 4 Aug 2017 10:33:22 +0000 (10:33 +0000)]
Read the numa-node-id property from each CPU node. This will initially be
used to support the dual package ThunderX where we need to send MSI/MSI-X
interrupts to the same package as the device the interrupt came from.
imp [Fri, 4 Aug 2017 03:40:01 +0000 (03:40 +0000)]
Make nvd vs nda choice boot-time rather than build-time
Introduce hw.nvme.use_nvd tunable. This tunable allows both nvd and
nda to be installed in the kernel, while allowing only one of them to
create devices. This is an all-or-nothing setting, and you can't
change it after boot-time. However, it will allow easier A/B testing.
markj [Thu, 3 Aug 2017 22:41:34 +0000 (22:41 +0000)]
Fix procstat --libxo -L.
- Use the title role for column headers.
- Fix a typo in a field name (lpwid -> lwpid).
- Place the fields of different threads in separate containers.
cem [Thu, 3 Aug 2017 22:28:30 +0000 (22:28 +0000)]
x86: Tag some intrinsics with __pure2
Some C wrappers for x86 instructions do not touch global memory and only act
on their arguments; they can be marked __pure2, aka __const__. Without this
annotation, Clang 3.9.1 is not intelligent enough on its own to grok that
these functions are __const__.
Submitted by: Anton Rang <anton.rang AT isilon.com>
Sponsored by: Dell EMC Isilon
markj [Thu, 3 Aug 2017 21:35:53 +0000 (21:35 +0000)]
Bump the maximum file name length in pseudofs filesystems to 48.
The previous limit of 24 was somewhat restrictive, and with this change
ceil(log2(sizeof(struct pfs_node))) is the same as before in both the ILP32
and LP64 models, so the malloc zone used for allocations of struct pfs_node
is the same as before.
ngie [Thu, 3 Aug 2017 17:53:14 +0000 (17:53 +0000)]
Remove special-case logic for running tests on host machines
I'm not sure what process sjg@ was using, but using CHECKDIR=${.OBJDIR} with
"make check" on ^/head is the correct thing to do. This unbreaks "make check"
for me (unsandboxed, not using CHECKDIR=${.OBJDIR}).
phil [Thu, 3 Aug 2017 15:47:42 +0000 (15:47 +0000)]
Update from libxo-0.8.1 to 0.8.4:
0.8.4:
- void anchor width optimization when we have a custom formatter (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221130)
- make "{[:/18}" do the right thing (also allows "{[:/%s}", wide ? 40 : 10)
- Can't skip anchor formatting in non-display styles
- add test case for {[:/18}
- add upload-xohtml-files to 'make upload'
0.8.3:
- xohtml: Add "-w" option to pull support files from gh_pages
- Add "upload-xohtml-files" target to publish support files in gh_pages/
- add HISTORY/AUTHORS section to man pages
0.8.2:
- xohtml: Add div.units as standard CSS text
- Don't treat values as format strings; they are not
- add "-p" to "mkdir -p build" in setup.sh
- add test case for {U:%%} (from df.c)
- detect end-of-string in '%' and '' escaping
- make xo_simple_field, for common simple cases
- xohtml: nuke "n" in "echo" commands
- rename "format" to "fmt" for consistency; same for "str" to "value"
ian [Thu, 3 Aug 2017 14:43:41 +0000 (14:43 +0000)]
Add an ahci driver for imx6.
This was submitted by Rogiel Sulzbach (thank you!) but has a few last-minute
changes by me, mostly where the code interfaces to my still-utterly-deficient
imx6_ccm clocks implementation. So blame me for any mistakes.
ngie [Thu, 3 Aug 2017 13:50:46 +0000 (13:50 +0000)]
Revert r321969
My change had good intentions, but the implementation was incorrect:
- printf was returning the number of characters in the format string
plus the NUL, but failed in two regards implementation wise:
-- the pathological case, printf(""), wasn't being handled properly since
the pointer is always incremented, so the value returned would be
off-by-one.
-- printf(3) reports the number of characters printed post-conversion via
vfprintf, etc.
- putchar(3) should return the character printed or EOF, not the number
of characters output to the screen.
My goal in making the change (again) was to increase parity, but as bde
pointed out these are freestanding functions, so they don't have to
conform to libc/POSIX. I argued that the functions should be named
differently since the implementation is different enough to warrant it
and to allow boot2 code to be usable when linked against sys/boot and
libstand and other libraries in base. I have no interest in pushing
this change forward more though, as the original concern I had behind
the change with zfsboottest was resolved in r321849 and r321852. The
next person that updates the toolchain gets to deal with the
inconsistency if it's flagged by a newer compiler.
hselasky [Thu, 3 Aug 2017 09:31:10 +0000 (09:31 +0000)]
Change reject message type when destroying cm_id in ibore.
This patch fixes an interopability issue between FreeBSD and non-FreeBSD
systems when the connection establishment is aborted. Refer to the
initial commit in Linux, drivers/infiniband/core/cm.c,
for a more detailed description.
Obtained from: Linux
MFC after: 3 days
Sponsored by: Mellanox Technologies
hselasky [Thu, 3 Aug 2017 09:14:43 +0000 (09:14 +0000)]
Resolve locking issue for non-sleepable context in the mlx5core.
Code inspection reveals the busdma unload and free functions
do not write to the belonging dma tag and does not need to be
serialized. This allows mlx5_fwp_free() to be called from
software interrupt context.
MFC after: 3 days
Sponsored by: Mellanox Technologies
hselasky [Thu, 3 Aug 2017 09:11:51 +0000 (09:11 +0000)]
Using GFP_ATOMIC with firmware commands is not supported after busdma was
introduced in the mlx5core, because busdma might sleep when loading memory
into DMA.
MFC after: 3 days
Sponsored by: Mellanox Technologies
ngie [Thu, 3 Aug 2017 03:43:41 +0000 (03:43 +0000)]
Chase r321920 and r321930 (dev_t being widened)
The layout of st_rdev has changed after this commit, and assumptions made
in the NetBSD tests are no longer valid. Change the hardcoded assumed
values to account for the fact that major/minor are now represented by
64 bits as opposed to the less precise legacy precision of 16 bits.
PR: 221048
Relnotes: st_rdev layout changed; warning about impact of r321920 to
downstream consumers
markj [Thu, 3 Aug 2017 00:38:13 +0000 (00:38 +0000)]
Rework and simplify the ksyms(4) implementation.
- Store the symbol table contents in an anonymous swap-backed object. Have
mmap(/dev/ksyms) map that object, and stop mapping the symbol table into
the calling process in ksyms_open(). Previously we would cache a pointer
to the pmap of the opening process, and mmap(/dev/ksyms) would create a
mapping using the physical address found by a pmap lookup at the initial
mapping address. However, this assumes that the cached pmap is valid,
which may not be the case. [1]
- Remove the ksyms ioctl interface. It appears to have been added to work
around a limitation in libelf that no longer exists; see r321842.
Moreover, the interface is difficult to support and isn't present in
illumos. Since ksyms was added specifically to support lockstat(1), it
is expected that this removal won't have any real impact.
- Simplify ksyms_read() to avoid unnecessary copying.
- Don't call the device handle destructor if we fail to capture a snapshot
of the kernel's symbol table. devfs will do that for us.
ngie [Wed, 2 Aug 2017 21:49:37 +0000 (21:49 +0000)]
Delete comment above "__DEFAULT_DEPENDENT_OPTIONS" related to "meta mode options"
src.conf(5) should document which knobs are which and the dependency between each;
remove the comment so the variable can apply to non-"meta mode options".
marius [Wed, 2 Aug 2017 21:11:51 +0000 (21:11 +0000)]
- Correct the remainder of confusing and error prone mix-ups between
"br" or "bridge" where - according to the terminology outlined in
comments of bridge.h and mmcbr_if.m around since their addition in
r163516 - the bus is meant and used instead. Some of these instances
are also rather old, while those in e. g. mmc_subr.c are as new as
r315430 and were caused by choosing mmc_wait_for_request(), i. e. the
one pre-r315430 outliner existing in mmc.c, as template for function
parameters in mmc_subr.c inadvertently. This correction translates to
renaming "brdev" to "busdev" and "mmcbr" to "mmcbus" respectively as
appropriate.
While at it, also rename "reqdev" to just "dev" in mmc_subr.[c,h]
for consistency with was already used in mmm.c pre-r315430, again
modulo mmc_wait_for_request() that is.
- Remove comment lines from bridge.h incorrectly suggesting that there
would be a MMC bridge base class driver.
- Update comments in bridge.h regarding the star topology of SD and SDIO;
since version 3.00 of the SDHCI specification, for eSD and eSDIO bus
topologies are actually possible in form of so called "shared buses"
(in some subcontext later on renamed to "embedded" buses).
ian [Wed, 2 Aug 2017 18:28:06 +0000 (18:28 +0000)]
Fix the interface to imx_iomux_gpr_get/set(). The functions were defined
as taking a register number, and that would get multiplied by 4 to make
a register address. But the header file that consumers have to reference
this stuff publishes register addresses, not numbers. So now everything
works in terms of register addresses.
Note that the HDMI init code was writing into the wrong register before
this change. Apparently whatever it wrote to was harmless, and apparently
HDMI was working because uboot had set up the right bits.
ian [Wed, 2 Aug 2017 15:15:18 +0000 (15:15 +0000)]
The imx6_snvs driver is not strictly required for the system to run, so
change it from standard to optional and add a device statement for it so
that it's included unless someone uses nodevice to eliminate it.
hselasky [Wed, 2 Aug 2017 14:27:27 +0000 (14:27 +0000)]
Fix LinuxKPI regression after r321920. The mda_unit and si_drv0 fields are not
wide enough to hold the full 64-bit dev_t. Instead use the "dev" field in
the "linux_cdev" structure to store and lookup this value.
While at it remove superfluous use of parenthesis inside the
MAJOR(), MINOR() and MKDEV() macros in the LinuxKPI.
Keep top page on CloudABI to work around AMD Ryzen stability issues.
Similar to r321899, reduce sv_maxuser by one page inside of CloudABI.
This ensures that the stack, the vDSO and any allocations cannot touch
the top page of user virtual memory.
Considering that CloudABI userspace is completely oblivious to virtual
memory layout, don't bother making this conditional based on the CPU of
the running system.
kib [Wed, 2 Aug 2017 10:12:10 +0000 (10:12 +0000)]
Do not call trapsignal() after handling usermode fault or interrupt,
when a signal is not intended to be sent.
The variable holding the signal number to send is left uninitialized,
which sometimes triggers invalid signal checks.
For NMI, a return to usermode without ast processing is done. On the
other hand, for spurious dtrace probe interrupt it is usermode which
triggered the interrupt, so handle it through userret() as any other
fault.
Reported by: Nils Beyer <nbe@renzel.net>
PR: 221151
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
truckman [Wed, 2 Aug 2017 01:43:35 +0000 (01:43 +0000)]
Lower the amd64 shared page, which contains the signal trampoline,
from the top of user memory to one page lower on machines with the
Ryzen (AMD Family 17h) CPU. This pushes ps_strings and the stack
down by one page as well. On Ryzen there is some sort of interaction
between code running at the top of user memory address space and
interrupts that can cause FreeBSD to either hang or silently reset.
This sounds similar to the problem found with DragonFly BSD that
was fixed with this commit:
https://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/b48dd28447fc8ef62fbc963accd301557fd9ac20
but our signal trampoline location was already lower than the address
that DragonFly moved their signal trampoline to. It also does not
appear to be related to SMT as described here:
https://www.phoronix.com/forums/forum/hardware/processors-memory/955368-some-ryzen-linux-users-are-facing-issues-with-heavy-compilation-loads?p=955498#post955498
"Hi, Matt Dillon here. Yes, I did find what I believe to be a
hardware issue with Ryzen related to concurrent operations. In a
nutshell, for any given hyperthread pair, if one hyperthread is
in a cpu-bound loop of any kind (can be in user mode), and the
other hyperthread is returning from an interrupt via IRETQ, the
hyperthread issuing the IRETQ can stall indefinitely until the
other hyperthread with the cpu-bound loop pauses (aka HLT until
next interrupt). After this situation occurs, the system appears
to destabilize. The situation does not occur if the cpu-bound
loop is on a different core than the core doing the IRETQ. The
%rip the IRETQ returns to (e.g. userland %rip address) matters a
*LOT*. The problem occurs more often with high %rip addresses
such as near the top of the user stack, which is where DragonFly's
signal trampoline traditionally resides. So a user program taking
a signal on one thread while another thread is cpu-bound can cause
this behavior. Changing the location of the signal trampoline
makes it more difficult to reproduce the problem. I have not
been because the able to completely mitigate it. When a cpu-thread
stalls in this manner it appears to stall INSIDE the microcode
for IRETQ. It doesn't make it to the return pc, and the cpu thread
cannot take any IPIs or other hardware interrupts while in this
state."
since the system instability has been observed on FreeBSD with SMT
disabled. Interrupts to appear to play a factor since running a
signal-intensive process on the first CPU core, which handles most
of the interrupts on my machine, is far more likely to trigger the
problem than running such a process on any other core.
Also lower sv_maxuser to prevent a malicious user from using mmap()
to load and execute code in the top page of user memory that was made
available when the shared page was moved down.
Make the same changes to the 64-bit Linux emulator.
bdrewery [Tue, 1 Aug 2017 18:26:20 +0000 (18:26 +0000)]
CCACHE_BUILD: Follow-up r321880: Fix some PATH issues with buildworld.
- bsd.compiler.mk: Must ensure that the CCACHE_WRAPPER_PATH comes first
in PATH.
- Makefile.inc1: Must prepend the CCACHE_WRAPPER_PATH into BPATH as it
overrides the PATH set in bsd.compiler.mk in sub-makes. The PATH
set in bsd.compiler.mk is not exported and doing so would cause it to
then override the BPATH set from environment. The only sane solution
is to prepend into BPATH as needed.
CCACHE_PATH could possibly be used for some of this as well.
markj [Tue, 1 Aug 2017 17:50:28 +0000 (17:50 +0000)]
Fix a witness assertion that fires when a lock type's class changes.
When all instances of a lock type are destroyed (for example, after a
module unload), the corresponding witness entry remains associated with
that lock type. In this case, we shouldn't panic if a new instance of the
lock type is created and its lock class does not match that recorded in the
witness entry.
sevan [Tue, 1 Aug 2017 16:20:33 +0000 (16:20 +0000)]
For the udp-client example, instruct user to add an entry for a udp based
service.
For tcp-client & udp-client, use the same port in configuration snippet as used
in the comment prior to remove any ambiguity on the port number which needs to
be specified.
This uses the /usr/local/libexec/ccache/<cc,c++> wrappers rather than
modifying CC to be '/usr/local/bin/ccache cc'. Some forms of compilation
do not support the 'command' type.
ian [Tue, 1 Aug 2017 14:54:25 +0000 (14:54 +0000)]
In xdev-links, when installing symlinks to the cross-compiler pieces that
includes the OS version (armv6-freebsd12.0-cc, etc), use the OS version of
the compiler/world source code, not the version of the build host machine.
royger [Tue, 1 Aug 2017 10:47:44 +0000 (10:47 +0000)]
pci: fix write order when sizing BARs
According to the PCI Local Specification rev. 3.0 in case of a 64-bit
BAR both the low and the high parts of the register should be set to
~0 before attempting to read back the size.
So far I have found no single device that has problems with the
previous approach, but I think it's better to stay on the safe size.
This commit should not introduce any functional change.