Conrad Meyer [Tue, 19 Dec 2017 02:49:11 +0000 (02:49 +0000)]
Implement ACPI CPU support when Processor object is not present
By the ACPI standard (ACPI 5 chapter 8.4 Declaring Processors) Processors
can be implemented in 2 distinct ways:
* Through a Processor object type (which provides P_BLK)
* Through a Device object type
Prior to this change, the FreeBSD driver only supported the former. AMD
Epyc / Poweredge systems we are testing both implement the latter only. Add
the missing support.
Because P_BLK is not defined in the device object case, C-states entering
must be completely controlled via _CST methods rather than P_LVL2/3.
John Baldwin points out that ACPI 6.0 formally deprecates the Processor
keyword, so eventually processors will only be enumerated as Device objects.
Submitted by: attilio
Reviewed by: jhb, markj, Anton Rang <rang AT acm.org>
Relnotes: maybe
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D13457
Warner Losh [Tue, 19 Dec 2017 00:18:17 +0000 (00:18 +0000)]
Support more images (but still no geli)
Print a qemu line to a shell script to ease testing each image
Start to support multiple architectures (still very green)
Create /etc/rc that echos success and halts the system for better
automation (also include halt)
Create /etc/fstab on a per-boot type to test loader's passing root
to kernel.
This lets me run a test, connect to it with telnet and get either a
timeout, or a report of success.
John Baldwin [Mon, 18 Dec 2017 23:35:14 +0000 (23:35 +0000)]
Catch up to r325719 which makes the kern.proc.pid sysctl "work" for zombies.
Some of the ptrace tests need to wait for a child process to become a
zombie before preceding. The parent process polls the child process
via the kern.proc.pid sysctl to wait for it to become a zombie.
Previously the code polled until the sysctl failed with ESRCH. Now it
will poll until either the sysctl fails with ESRCH (for compatiblity
with older kernels) or returns a kinfo_proc structure with the ki_stat
field set to SZOMB.
Warner Losh [Mon, 18 Dec 2017 18:38:00 +0000 (18:38 +0000)]
When we're disabling the nvme device, some drives have a controller
bug that requires 'hands off' for a period of time (2.3s) before we
check the RDY bit. Sicne this is a very odd quirk for a very limited
selection of drives, do this as a quirk. This prevented a successful
reset of the card when the card wedged.
Also, make sure that we comply with the advice from section 3.1.5 of
the 1.3 spec says that transitioning CC.EN from 0 to 1 when CSTS.RDY
is 1 or transitioning CC.EN from 1 to 0 when CSTS.RDY is 0 "has
undefined results". Short circuit when EN == RDY == desired state.
Finally, fail the reset if the disable fails. This will lead to a
failed device, which is what we want. (note: nda device needs
work for coping with a failed device).
Bruce Evans [Mon, 18 Dec 2017 14:29:48 +0000 (14:29 +0000)]
Also forgotten in the previous that removed the permanent double mapping
of low physical memory:
Update the comment about leaving the permanent mapping in place. This
also improves the wording of the comment. PTD 0 is still left alone
because it is fairly important that it was unmapped earlier, and the
comment now describes the unmapping of the other low PTDs that the code
actually does.
Bruce Evans [Mon, 18 Dec 2017 13:53:22 +0000 (13:53 +0000)]
Remove the permanent double mapping of low physical memory and replace
it by a transient double mapping for the one instruction in ACPI wakeup
where it is needed (and for many surrounding instructions in ACPI resume).
Invalidate the TLB as soon as convenient after undoing the transient
mapping. ACPI resume already has the strict ordering needed for this.
This fixes the non-trapping of null pointers and other garbage pointers
below NBPDR (except transiently). NBPDR is quite large (4MB, or 2MB for
PAE).
This fixes spurious traps at the first instruction in VM86 bioscalls.
The traps are for transiently missing read permission in the first
VM86 page (physical page 0) which was just written to at KERNBASE in
the kernel. The mechanism is unknown (it is not simply PG_G).
locore uses a similar but larger transient double mapping and needs
it for 2 instructions instead of 1. Unmap the first PDE in it after
the 2 instructions to detect most garbage pointers while bootstrapping.
pmap_bootstrap() finishes the unmapping.
Remove the avoidance of the double mapping for a recently fixed special
case. ACPI resume could use this avoidance (made non-special) to avoid
any problems with the transient double mapping, but no such problems
are known.
Update comments in locore. Many were for old versions of FreeBSD which
tried to map low memory r/o except for special cases, or might have
allowed access to low memory via physical offsets. Now all kernel
maps are r/w, and removal of of the double map disallows use of physical
offsets again.
Bruce Evans [Mon, 18 Dec 2017 11:57:05 +0000 (11:57 +0000)]
Fix the undersupported option KERNLOAD, part 2: fix crashes in locore
when KERNLOAD is smaller than NBPDR (not the default) and PG_G is
enabled (the default if the CPU supports it). This case has relatively
minor problems with coherency of the permanent double mapping, but the
fix in r167869 to improve coherency creates page tables with 3 different
errors so never worked.
The permanent double mapping is fundamentally broken and will be removed
soon. It fundamentally breaks trapping for null pointers and requires
complications to avoid cache coherency bugs. It is currently used for
only a single instruction in ACPI resume,
Many fixes VM86 and/or ACPI and/or the double map were attempted near
r1200000. r167869 attempted to fix cache coherency bugs in an unusual
case, but the bugs were unreachable because older errors in page tables
caused a crash first.
This commit just makes r167869 work as intended. Part 1 of these fixes
fixed the other errors, but also stopped mapping the PDE for KERNBASE
as a large page, so double mapping of this PDE only causes the same
problems as when KERNLOAD is the default. Except for the problem of
trapping null pointers, r167869 could be used to fix these problems,
but it is inactive in usual cases. The only known other problem is
that incoherent permissions for page 0 cause spurious traps in VM86
BIOS calls.
When building the command to execute for compression, newsyslog was modifying
the generic arguments array instead of its own copy.
Meaning on the second file to compress with the same arguments, the command line
was not the one expected.
Fix it by creating one copy of the arguments per execution and modifying that
copy.
While here, print the command line executed in verbose mode.
Bruce Evans [Mon, 18 Dec 2017 09:32:56 +0000 (09:32 +0000)]
Fix the undersupported option KERNLOAD, part 1: fix crashes in locore
when KERNLOAD is not a multiple of NBPDR (not the default) and PSE is
enabled (the default if the CPU supports it). Addresses in PDEs must
be a multiple of NBPDR in the PSE case, but were not so in the crashing
case.
KERNLOAD defaults to NBPDR. NBPDR is 4 MB for !PAE and 2 MB for PAE.
The default can be changed by editing i386/include/vmparam.h or using
makeoptions. It can be changed to less than NBPDR to save real and
virtual memory at a small cost in time, or to more than NBPDR to waste
real and virtual memory. It must be larger than 1 MB and a multiple of
PAGE_SIZE. When it is less than NBPDR, it is necessarily not a multiple
of NBPDR. This case has much larger bugs which will be fixed in part 2.
The fix is to only use PSE for physical addresses above <KERNLOAD
rounded _up_ to an NBPDR boundary>. When the rounding is non-null,
this leaves part of the kernel not using large pages. Rounding down
would avoid this pessimization, but would break setting of PAT bits
on i/o pages if it goes below 1MB. Since rounding down always goes
below 1MB when KERNLOAD < NBPDR and the KERNLOAD > NBPDR case is not
useful, never round down.
Fix related style bugs (e.g., wrong literal values for NBPDR in comments).
Warner Losh [Mon, 18 Dec 2017 04:51:34 +0000 (04:51 +0000)]
Move loader help file definitions to being 100% inside of loader.mk.
HELP_FILES is a loader only thing, so move it to loader.mk. Only
generate the help file if HELP_FILES is defined. Adjust Makefiles to
new convention. Fix a few cases where ${.CURDIR}/ was missing
resulting in missing bits from the help files.
Ian Lepore [Mon, 18 Dec 2017 02:34:37 +0000 (02:34 +0000)]
Do not attempt to refill the TX fifo if there is no data left to transfer.
A comment in bcm_bsc_fill_tx_fifo() even lists sc_totlen > 0 as a
precondition for calling the routine. I apparently forgot to make the
code do what my comment said.
Mark Johnston [Sat, 16 Dec 2017 20:19:00 +0000 (20:19 +0000)]
Fix a logic bug in makefs lazy inode initialization.
We may need to initialize multiple inode blocks before writing a given
inode. makefs(8) was only initializing a single block at a time, so
certain inode allocation patterns could lead to a situation where it
wrote an inode to an uninitialized block. That inode might be clobbered
by a later initialization, resulting in a filesystem image containing
directory entries that point to a seemingly unused inode.
Ed Schouten [Sat, 16 Dec 2017 19:40:28 +0000 (19:40 +0000)]
Make truss(8) work for i686-unknown-cloudabi binaries on FreeBSD/amd64.
This change copies the existing amd64_cloudabi64.c to amd64_cloudabi32.c
and reimplements the functions for fetching system call arguments and
return values to use the same scheme as used by the vDSO that is used
when running cloudabi32 executables.
As arguments are automatically padded to 64-bit words by the vDSO in
userspace, we can copy the arguments directly into the array used by
truss(8) internally.
Ed Schouten [Sat, 16 Dec 2017 19:37:55 +0000 (19:37 +0000)]
libsysdecode: Add a new ABI type, SYSDECODE_ABI_CLOUDABI32.
In order to let truss(8) support tracing of 32-bit CloudABI
applications, we need to add a new ABI type to libsysdecode. We can
reuse the existing errno mapping table. Also link in the cloudabi32
system call table to translate system call names.
While there, remove all of the architecture ifdefs. There are not
needed, as the CloudABI data types and system call tables build fine on
any architecture. Building this unconditionally will make it easier to
do tracing for different compat modes, emulation, etc.
vxlan_ftable entries are sorted in ascending order, due to wrong arguments
order it is possible to stop search before existing element will be found.
Then new element will be allocated in vxlan_ftable_update_locked() and can
be inserted in the list second time or trigger MPASS() assertion with
enabled INVARIANTS.
Ed Maste [Sat, 16 Dec 2017 14:26:11 +0000 (14:26 +0000)]
lld: Slightly simplify code and add comment.
Cherry-pick lld r315658 by Rui Ueyama:
This is not a mechanical transformation. Even though I believe this
patch is correct, I'm not 100% sure if lld with this patch behaves
exactly the same way as before on all edge cases. At least all tests
still pass.
I'm submitting this patch because it took almost a day to understand
this function, and I don't want to lose it.
This fixes jemalloc assertion failures observed at startup with i386
binaries and an lld-linked libc.so.
Reviewed by: dim
Obtained from: LLVM r315658
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D13503
Warner Losh [Fri, 15 Dec 2017 23:16:53 +0000 (23:16 +0000)]
Remove the 'mini libstand in libstand' that util.[ch] provided. These
weren't needed, and their existance interfered with things in subtle
ways. One of these subtle ways was that malloc could be different
based on what files were included when (even within the same .c file,
it turns out). Move to a single malloc implementation as well by
adding the calls to setheap() to gptboot.c and zfsboot.c. Once upon a
time, these boot loaders strove to not use libstand. However, with the
proliferation of features, that striving is too hard for too little
gain and lead to stupid mistakes.
This fixes the GELI-enabled (but not even using) boot environment. The
geli routines were calling libstand malloc but zfsboot.c and gptboot.c
were using the mini libstand malloc, so this failed when we tried to
probe for GELI partitions. Subtle changes in build order when moving
to self-contained stand build in r326593 toggled what it used from one
type to another due to odd nesting of the zfs implementation code that
differed subtly between zfsloader and zfsboot.
Warner Losh [Fri, 15 Dec 2017 23:16:42 +0000 (23:16 +0000)]
Use -h -D in preference to -D so that the serial port gets the
interactive console rather than the video port. qemu has issues with X
on my mac at the moment and this is the easiest path forward.
Dimitry Andric [Fri, 15 Dec 2017 18:58:21 +0000 (18:58 +0000)]
Pull in r320755 from upstream clang trunk (by me):
Don't trigger -Wuser-defined-literals for system headers
Summary:
In D41064, I proposed adding `#pragma clang diagnostic ignored
"-Wuser-defined-literals"` to some of libc++'s headers, since these
warnings are now triggered by clang's new `-std=gnu++14` default:
$ cat test.cpp
#include <string>
$ clang -std=c++14 -Wsystem-headers -Wall -Wextra -c test.cpp
In file included from test.cpp:1:
In file included from /usr/include/c++/v1/string:470:
/usr/include/c++/v1/string_view:763:29: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string_view<char> operator "" sv(const char *__str, size_t __len)
^
/usr/include/c++/v1/string_view:769:32: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string_view<wchar_t> operator "" sv(const wchar_t *__str, size_t __len)
^
/usr/include/c++/v1/string_view:775:33: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string_view<char16_t> operator "" sv(const char16_t *__str, size_t __len)
^
/usr/include/c++/v1/string_view:781:33: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string_view<char32_t> operator "" sv(const char32_t *__str, size_t __len)
^
In file included from test.cpp:1:
/usr/include/c++/v1/string:4012:24: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string<char> operator "" s( const char *__str, size_t __len )
^
/usr/include/c++/v1/string:4018:27: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string<wchar_t> operator "" s( const wchar_t *__str, size_t __len )
^
/usr/include/c++/v1/string:4024:28: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string<char16_t> operator "" s( const char16_t *__str, size_t __len )
^
/usr/include/c++/v1/string:4030:28: warning: user-defined literal suffixes not starting with '_' are reserved [-Wuser-defined-literals]
basic_string<char32_t> operator "" s( const char32_t *__str, size_t __len )
^
8 warnings generated.
Both @aaron.ballman and @mclow.lists felt that adding this workaround
to the libc++ headers was the wrong way, and it should be fixed in
clang instead.
Here is a proposal to do just that. I verified that this suppresses
the warning, even when -Wsystem-headers is used, and that the warning
is still emitted for a declaration outside of system headers.
This will allow to compile some of the libc++ headers in C++14 mode
(which is the default for gcc 6 and higher, and will be the default for
clang 6.0.0 and higher), with -Wsystem-headers and -Werror enabled.
Follow the RFC6980 and silently ignore following IPv6 NDP messages
that had the IPv6 fragmentation header:
o Neighbor Solicitation
o Neighbor Advertisement
o Router Solicitation
o Router Advertisement
o Redirect
Introduce M_FRAGMENTED mbuf flag, and set it after IPv6 fragment reassembly
is completed. Then check the presence of this flag in correspondig ND6
handling routines.
Warner Losh [Fri, 15 Dec 2017 06:34:27 +0000 (06:34 +0000)]
Script to generate minimal boot images for each of the 24 supported
boot images for x86. This will be enhanced to generate all the other
images (u-boot, powerpc CHRP, etc).
At the moment, it's only generating three of them. zfs+gpt+legacy
works with qemu:
qemu-system-x86_64 --drive file=${file},format=raw -serial telnet::4444,server
but the ufs ones still have issues I'm tracking down.
These images are the boot blocks, /boot/loader, a kernel, maybe a
couple of modules, /sbin/init, /bin/sh, /libexec/ld-elf.so, libc.so,
libedit and libncursesw. This is just enough to get to single user. At
the moment, these come from the host system, but should come from
OBJTOP.
At the moment, this requires root to build since the zfs tools require
it (and GELI will too when we add support for that).
Warner Losh [Fri, 15 Dec 2017 06:34:11 +0000 (06:34 +0000)]
Script that knows how to put boot blocks onto a device. Eventually,
this will be installed into /usr/sbin, but for now it's just used for
the boot loader regression script. It's still a bit green, and likely
will get edge cases wrong still. It's also x86 centric at the moment,
but will be enhanced shortly for u-boot, CHRP PowerPC and other
methods.
Justin Hibbits [Fri, 15 Dec 2017 04:11:20 +0000 (04:11 +0000)]
Handle the Facility Unavailable exception as a SIGILL
Currently Facility Unavailable is absent and once an application
tries to use or access a register from a feature disabled in the
CPU it causes a kernel panic.
A simple test-case is:
int main() { asm volatile ("tbegin.;"); }
which will use TM (Hardware Transactional Memory) feature which
is not supported by the kernel and so will trigger the following
kernel panic:
Ryan Stone [Thu, 14 Dec 2017 20:48:50 +0000 (20:48 +0000)]
Plug an ifaddr leak when changing a route's src
If a route is modified in a way that changes the route's source
address (i.e. the address used to access the gateway), then a
reference on the ifaddr representing the old source address will
be leaked if the address type does not have an ifa_rtrequest
method defined. Plug the leak by releasing the reference in
all cases.
Warner Losh [Thu, 14 Dec 2017 18:57:17 +0000 (18:57 +0000)]
Revert r326855: Cargo cut a fix for the regressions r326585 caused.
This was an experiment that landed in the wrong branch and was pushed
accidentally. It's best if it is ignored because the difference was
due to vers.o being different, not float.o... And it was confirmed to
not fix anything...
Warner Losh [Thu, 14 Dec 2017 17:00:24 +0000 (17:00 +0000)]
Turn loader GELI support in the boot loaders off by default as a
temporary workaround. This fixes zfs booting generally, but breaks all
GELI booting by default. Add note to UPDATING to this effect. When the
GELI issues are resolved, this will be reverted.
Warner Losh [Thu, 14 Dec 2017 16:51:26 +0000 (16:51 +0000)]
Cargo cut a fix for the regressions r326585 caused.
We need to include ficl.h after the standard includes, rather than
before them. It changes the generated code in ways that haven't been
completely analyized. This restores the old code generation (as
verified by md5 changing back for zfsloader).
This should restore GPT + ZFS and GPT + ZFS + GELI booting that was
broken in r326585 (or would have been if r326584 hadn't broken the
build).
In devfs_lookupx() dotdot lookup case, avoid dereferencing
dvp->v_mount after dvp is unlocked.
The vnode might be reclaimed after unlock, so v_mount becomes NULL.
Cache the struct mount pointer before the unlock, the struct is
type-stable.
Note that devfs_allocv() reads mp->mnt_data but does not operate on it
further when dirent is doomed. The unmount cannot proceed until all
dirents are reclaimed.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Landon J. Fuller [Thu, 14 Dec 2017 06:45:04 +0000 (06:45 +0000)]
Add basic bwn(4) support for the (BCMA-based) BCM43224 and BCM43225.
- Add the BCM4322X D11 core revision and missing BCM43224 PCI device ID to
our device tables.
- Disable the DMA engine parity check (rather than adding parity support
to the to-be-replaced bwn(4) DMA implementation).
Currently, N-PHY support in bwn(4) is GPL licensed, and is not included by
default. Until this is replaced with Broadcom's ISC-licensed N-PHY
implementation, bwn(4) must be rebuilt to enable N-PHY support.
To build bwn(4) with N-PHY support, add the following lines to your kernel
configuration file and rebuild the kernel (and modules):
options BWN_GPL_PHY
To test bwn(4) with a BCM43224/BCM43225 device, install the firmware from
the net/bwn-firmware-kmod port, and place the following lines in
loader.conf(5):
Justin Hibbits [Thu, 14 Dec 2017 04:41:07 +0000 (04:41 +0000)]
Allow bman-portals and qman-portals to attach to simple-bus
Official Linux dts's put the individual portals under a simple-bus, rather
than under a '*-portals' grouping. This adds a hack to permit that, which
gets us closer to using stock device trees for DPAA-based devices.
Landon J. Fuller [Thu, 14 Dec 2017 03:41:12 +0000 (03:41 +0000)]
bhndb(4): Fix two register window overcommit bugs introduced in r326297:
- The window target must always be updated when stealing a register window.
- Fix missing initialization of bhndb(4) region alloc_flags when
registering statically mapped port regions (caught by scan-build).
Approved by: adrian (mentor, implicit)
Sponsored by: The FreeBSD Foundation
Dimitry Andric [Wed, 13 Dec 2017 19:03:48 +0000 (19:03 +0000)]
Pull in r315334 from upstream lld trunk (by Rafael Espindola):
Don't create a dummy __tls_get_addr.
We just don't need one with the current setup.
We only error on undefined references that are used by some
relocation.
If we managed to relax all uses of __tls_get_addr, no relocation uses
it and we don't produce an error.
This is less code and fixes the case were we fail to relax. Before we
would produce a broken output, but now we produce an error.
Pull in r320390 from upstream lld trunk (by Rafael Espindola):
Create reserved symbols early so they can be versioned.
This fixes pr35570.
We were creating these symbols after parsing version scripts, so they
could not be versioned.
We cannot move the version script parsing later because we need it for
lto.
One option is to move both addReservedSymbols and
createSyntheticSections earlier. The disadvantage is that some
sections created by createSyntheticSections replace other input
sections. For example, gdb index replaces .debug_gnu_pubnames, so it
wants to run after gc sections so that it can set S->Live to false.
What this patch does instead is to move just the ElfHeader creation
early.
Pull in r320412 from upstream lld trunk (by Rafael Espindola):
Handle symbols pointing to output sections.
Now that gc sections runs after linker defined symbols are added it
can see symbols that point to an OutputSection.
Should fix a bot failure.
Pull in r320431 from upstream lld trunk (by Peter Collingbourne):
ELF: Do not follow relocation edges to output sections during GC.
This fixes an assertion error introduced by r320390.
Together these fix handling of reserved symbols, in particular _end,
which is needed to make brk(2) and sbrk(2) work correctly. This
unbreaks the emacs ports on amd64, and also appears to unbreak most of
world on i386.