dim [Sat, 16 Mar 2019 13:40:27 +0000 (13:40 +0000)]
Add LLVM openmp trunk r351319 (just before the release_80 branch point)
to contrib/llvm. This is not yet connected to the build, the glue for
that will come in a follow-up commit.
kib [Sat, 16 Mar 2019 11:44:33 +0000 (11:44 +0000)]
amd64 KPTI: add control from procctl(2).
Add the infrastructure to allow MD procctl(2) commands, and use it to
introduce amd64 PTI control and reporting. PTI mode cannot be
modified for existing pmap, the knob controls PTI of the new vmspace
created on exec.
kib [Sat, 16 Mar 2019 11:31:01 +0000 (11:31 +0000)]
amd64: Add md process flags and first P_MD_PTI flag.
PTI mode for the process pmap on exec is activated iff P_MD_PTI is set.
On exec, the existing vmspace can be reused only if pti mode of the
pmap matches the P_MD_PTI flag of the process. Add MD
cpu_exec_vmspace_reuse() callback for exec_new_vmspace() which can
veto reuse of the existing vmspace.
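The reuse rule above can be sketched as a small predicate. This is a toy model only: the `toy_*` types, the function name, and the bit value chosen for `P_MD_PTI` are assumptions made for illustration, not the kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative bit value only; the real definition lives in the kernel. */
#define P_MD_PTI 0x1u

/* Hypothetical stand-ins for the much larger kernel structures. */
struct toy_pmap { bool pti_enabled; };
struct toy_proc { unsigned md_flags; };

/*
 * Return true when the existing vmspace may be kept across exec,
 * i.e. when the pmap's PTI mode matches the process's P_MD_PTI flag.
 */
static bool
toy_exec_vmspace_reuse(const struct toy_proc *p, const struct toy_pmap *pm)
{
	bool want_pti = (p->md_flags & P_MD_PTI) != 0;

	return (pm->pti_enabled == want_pti);
}
```

A mismatch in either direction (flag set with a non-PTI pmap, or flag clear with a PTI pmap) vetoes reuse, so exec has to build a new vmspace.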
MFC note: md_flags change struct proc KBI.
Reviewed by: jhb, markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D19514
kib [Sat, 16 Mar 2019 11:16:09 +0000 (11:16 +0000)]
amd64: fix switching to the pmap with pti disabled.
When the pmap with pti disabled (i.e. pm_ucr3 == PMAP_NO_CR3) is
activated, tss.rsp0 was not updated. Any interrupt that happens before the
next context switch would use the pti trampoline stack for the hardware frame,
but fault and interrupt handlers are not prepared for this. Correctly
update tss.rsp0 for both PMAP_NO_CR3 and pti pmaps.
Note that this case, pti = 1 but pmap->pm_ucr3 == PMAP_NO_CR3, is not
used at the moment.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D19514
ngie [Fri, 15 Mar 2019 21:49:19 +0000 (21:49 +0000)]
Integrate cddl/usr.sbin/zfsd/tests into the FreeBSD test suite
This change integrates the unit tests for zfsd into the test suite using the
integration method described in r345203.
This change removes the `LOCALBASE` includes added for the port version of
googlemock/googletest, as well as unnecessary `LIBADD`/`DPADD` and `CXXFLAGS`
defines, which are included in the `GTEST_CXXFLAGS` variable, as part of
r345203.
ngie [Fri, 15 Mar 2019 21:43:52 +0000 (21:43 +0000)]
Initial googlemock/googletest integration into the build/FreeBSD test suite
This initial integration takes googlemock/googletest release 1.8.1, integrates
the library, tests, and sample unit tests into the build.
googlemock/googletest's inclusion is optionally available via `MK_GOOGLETEST`.
`MK_GOOGLETEST` is dependent on `MK_TESTS` and is enabled by default when
built with a C++11 capable toolchain.
Google tests can be specified via the `GTESTS` variable, which, in comparison
with the other test drivers, is more simplified/streamlined, as Googletest only
supports C++ tests; not raw C or shell tests (C tests can be written in C++
using the standard embedding methods).
No dependent libraries are assumed for the tests. One must specify `gmock`,
`gmock_main`, `gtest`, or `gtest_main`, via `LIBADD` for the program.
More information about googlemock and googletest can be found on the
Googletest [project page](https://github.com/google/googletest), and the
[GoogleMock](https://github.com/google/googletest/blob/v1.8.x/googlemock/docs/Documentation.md)
and
[GoogleTest](https://github.com/google/googletest/tree/v1.8.x/googletest/docs)
docs.
These tests are initially integrated into the build as plain driver tests, but
will be natively integrated into Kyua in a later version.
Known issues/Errata:
* [WhenDynamicCastToTest.AmbiguousCast fails on FreeBSD](https://github.com/google/googletest/issues/2172)
mav [Fri, 15 Mar 2019 18:59:04 +0000 (18:59 +0000)]
MFV r336930: 9284 arc_reclaim_thread has 2 jobs
`arc_reclaim_thread()` calls `arc_adjust()` after calling
`arc_kmem_reap_now()`; `arc_adjust()` signals `arc_get_data_buf()` to
indicate that we may no longer be `arc_is_overflowing()`.
The problem is, `arc_kmem_reap_now()` can take several seconds to
complete, has no impact on `arc_is_overflowing()`, but due to how the
code is structured, can impact how long the ARC will remain in the
`arc_is_overflowing()` state.
The fix is to use separate threads to:
1. keep `arc_size` under `arc_c`, by calling `arc_adjust()`, which
improves `arc_is_overflowing()`
2. keep enough free memory in the system, by calling
`arc_kmem_reap_now()` plus `arc_shrink()`, which improves
`arc_available_memory()`.
glebius [Fri, 15 Mar 2019 18:18:05 +0000 (18:18 +0000)]
Deanonymize thread and proc state enums, so that a userland app can
use them without redefining the value names. New clang no longer
allows redefining an enum value name to the same value.
Bump __FreeBSD_version, since ports depend on that.
kevans [Fri, 15 Mar 2019 17:19:36 +0000 (17:19 +0000)]
if_bridge(4): Drop pointless rtflush
At this point, all routes should've already been dropped by removing all
members from the bridge. This condition is in fact KASSERT'd in the line
immediately above where this nop flush was added.
kevans [Fri, 15 Mar 2019 17:13:05 +0000 (17:13 +0000)]
if_bridge(4): Drop pointless rtflush
At this point, all routes should've already been dropped by removing all
members from the bridge. This condition is in fact KASSERT'd in the line
immediately above where this nop flush was added.
kp [Fri, 15 Mar 2019 15:52:36 +0000 (15:52 +0000)]
bridge: Fix STP-related panic
After r345180 we need to have the appropriate vnet context set to delete an
rtnode in bridge_rtnode_destroy().
That's usually the case, but not when it's called by the STP code (through
bstp_notify_rtage()).
We have to set the vnet context in bridge_rtable_expire() just as we do in the
other STP callback bridge_state_change().
eugen [Fri, 15 Mar 2019 14:42:23 +0000 (14:42 +0000)]
trim(8): emit more user-friendly error message in verbose mode.
If underlying driver provides no TRIM/UNMAP support and operation fails
due to this reason, state it clearly in verbose mode (default)
instead of writing standard message that may be too cryptic for a user:
trim: ioctl(DIOCGDELETE) failed: nda0: Operation not supported
Now it would write:
trim: nda0: TRIM/UNMAP not supported by driver
But still use previous format including errno value for quiet mode.
Small candelete() function borrowed from diskinfo(8) code.
This function was committed by Alan Somers <asomers@FreeBSD.org>,
so give him some credit.
kevans [Fri, 15 Mar 2019 13:19:52 +0000 (13:19 +0000)]
if_bridge(4): Fix module teardown
bridge_rtnode_zone still has outstanding allocations at the time of
destruction in the current model because all of the interface teardown
happens in a VNET_SYSUNINIT, -after- the MOD_UNLOAD has already been
processed. The SYSUNINIT triggers destruction of the interfaces, which then
attempts to free the memory from the zone that's already been destroyed, and
we hit a panic.
Solve this by virtualizing the uma_zone we allocate the rtnodes from to fix
the ordering. bridge_rtable_fini should also take care to flush any
remaining routes that weren't taken care of when dynamic routes were flushed
in bridge_stop.
fsu [Fri, 15 Mar 2019 11:49:46 +0000 (11:49 +0000)]
Remove unneeded mount point unlock function calls.
The ext2_nodealloccg() function unlocks the mount point
in case of successful node allocation.
The additional unlocks are not required and should be removed.
kp [Fri, 15 Mar 2019 11:21:20 +0000 (11:21 +0000)]
bridge: Fix panic if the STP root is removed
If the spanning tree root interface is removed from the bridge we panic
on the next 'ifconfig'.
While the STP code is notified whenever a bridge member interface is
removed from the bridge it does not clear the bs_root_port. This means
bs_root_port can still point at a bridge_iflist which has been free()d.
The next access to it will panic.
Explicitly check if the interface we're removing in bstp_destroy() is
the root, and if so re-assign the roles, which clears bs_root_port.
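The shape of the fix can be sketched in a few lines. The `toy_*` structures and functions below are hypothetical stand-ins, and the role re-assignment is reduced to clearing the pointer; the real STP code does considerably more.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced models of a bridge port and STP state. */
struct toy_port { int id; };
struct toy_bstp { struct toy_port *root_port; };

/* Stand-in for role re-assignment: with the old root gone, forget it. */
static void
toy_assign_roles(struct toy_bstp *bs)
{
	bs->root_port = NULL;
}

/* Called when a member port is removed, before its memory is freed. */
static void
toy_port_destroy(struct toy_bstp *bs, struct toy_port *bp)
{
	if (bs->root_port == bp)
		toy_assign_roles(bs);
	/* The caller may free bp after this returns. */
}
```

Without the check, the root pointer would keep referring to the removed port after the caller frees it.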
kp [Fri, 15 Mar 2019 11:08:44 +0000 (11:08 +0000)]
pf: Use counter(9) in pf tables.
The counters of pf tables are updated outside the rule lock. That means state
updates might overwrite each other. Furthermore allocation and
freeing of counters happens outside the lock as well.
Use counter(9) for the counters, and always allocate the counter table
element, so that the race condition cannot happen any more.
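counter(9) counters are per-CPU, which is why concurrent updates stop overwriting each other. The following userland model only illustrates that idea: the `toy_counter` type, the fixed slot count, and the explicit `cpu` argument are assumptions, not the counter(9) KPI.

```c
#include <stdint.h>

#define TOY_NCPU 4	/* assumed slot count for the model */

/* One slot per CPU; each updater only ever touches its own slot. */
struct toy_counter {
	uint64_t percpu[TOY_NCPU];
};

static void
toy_counter_add(struct toy_counter *c, unsigned cpu, uint64_t v)
{
	c->percpu[cpu % TOY_NCPU] += v;
}

/* Reading sums all slots, so no update is lost to a competing writer. */
static uint64_t
toy_counter_fetch(const struct toy_counter *c)
{
	uint64_t sum = 0;

	for (unsigned i = 0; i < TOY_NCPU; i++)
		sum += c->percpu[i];
	return (sum);
}
```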
chuck [Fri, 15 Mar 2019 02:11:27 +0000 (02:11 +0000)]
Fix bhyve's NVMe Identify Namespace data
The NVMe Identify Namespace data structure's Number of LBA Formats
(NLBAF) field is a 0's based value (i.e. 0x0 means 1). Since the
emulation only supports a single format, set NLBAF to 0x0, not 1.
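The zero-based encoding is easy to get wrong by one, as a minimal sketch shows. The helper names are hypothetical and not taken from the bhyve sources.

```c
#include <stdint.h>

/*
 * NLBAF is zero-based: the stored value is (number of formats - 1).
 * The caller must pass at least 1.
 */
static uint8_t
toy_nlbaf_encode(unsigned nformats)
{
	return ((uint8_t)(nformats - 1));
}

/* Recover the number of LBA formats from the stored field. */
static unsigned
toy_nlbaf_decode(uint8_t nlbaf)
{
	return ((unsigned)nlbaf + 1);
}
```

Storing 1 in the field, as the emulation did before this fix, would advertise two formats.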
glebius [Thu, 14 Mar 2019 22:52:16 +0000 (22:52 +0000)]
PFIL_MEMPTR for ipfw link level hook
With new pfil(9) KPI it is possible to pass a void pointer with length
instead of mbuf pointer to a packet filter. Until this commit no filters
supported that, so pfil ran through a shim function pfil_fake_mbuf().
Now the ipfw(4) hook named "default-link", that is instantiated when
net.link.ether.ipfw sysctl is on, supports processing pointer/length
packets natively.
- ip_fw_args now has union for either mbuf or void *, and if flags have
non-zero length, then we use the void *.
- through ipfw_chk() we handle mem/mbuf cases differently.
- ether_header goes away from args. It is ipfw_chk()'s responsibility
to parse the Ethernet header.
- ipfw_log() now uses different bpf APIs to log packets.
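The first item in the list above describes a union discriminated by a length stored in the flags. A rough sketch of that pattern follows; every `toy_*` name and the mask value are assumptions for illustration, and the real ip_fw_args layout may differ.

```c
#include <stddef.h>

/* Hypothetical, reduced mbuf: just a data pointer and a length. */
struct toy_mbuf {
	const void *data;
	size_t len;
};

/* Assumed layout: the low bits of flags carry the memory length. */
#define TOY_ARGS_LENMASK 0x0000ffffu

struct toy_fw_args {
	unsigned flags;
	union {
		struct toy_mbuf *m;	/* valid when the length is zero */
		const void *mem;	/* valid when the length is non-zero */
	} u;
};

static size_t
toy_args_memlen(const struct toy_fw_args *a)
{
	return (a->flags & TOY_ARGS_LENMASK);
}

/* A non-zero length in flags selects the pointer/length form. */
static int
toy_args_is_mem(const struct toy_fw_args *a)
{
	return (toy_args_memlen(a) != 0);
}

/* Fetch the packet bytes regardless of which form was passed in. */
static const void *
toy_args_data(const struct toy_fw_args *a)
{
	return (toy_args_is_mem(a) ? a->u.mem : a->u.m->data);
}
```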
Although ipfw_chk() is now capable of processing pointer/length packets,
this commit adds support for the link level hook only, see
ipfw_check_frame(). Potentially the IP processing hook ipfw_check_packet()
can be improved too, but that requires more changes since the hook
supports more complex actions: NAT, divert, etc.
glebius [Thu, 14 Mar 2019 22:28:50 +0000 (22:28 +0000)]
- Add more flags to ip_fw_args. At this changeset only IPFW_ARGS_IN and
IPFW_ARGS_OUT are utilized. They are intended to substitute the "dir"
parameter that is often passed together with args.
- Rename ip_fw_args.oif to ifp and now it is set to either input or
output interface, depending on IPFW_ARGS_IN/OUT bit set.
kevans [Thu, 14 Mar 2019 19:48:43 +0000 (19:48 +0000)]
ether_fakeaddr: Use 'b' 's' 'd' for the prefix
This has the advantage of making the designated prefix easy to spot by eye,
and it has all the right bits set. Comment stolen from ffec.
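One reading of "the right bits" is that 'b' (0x62) has the locally administered bit set and the multicast bit clear; that interpretation is an assumption here, not something the commit spells out. A small sketch under that assumption, with hypothetical helper names and a caller-supplied tail instead of whatever the kernel uses for the last three bytes:

```c
#include <stdbool.h>
#include <stdint.h>

#define ETHER_ADDR_LEN 6

/* Fill in a fake address: fixed 'b','s','d' prefix, caller-supplied tail. */
static void
toy_fakeaddr(uint8_t addr[ETHER_ADDR_LEN], const uint8_t tail[3])
{
	addr[0] = 'b';	/* 0x62 */
	addr[1] = 's';	/* 0x73 */
	addr[2] = 'd';	/* 0x64 */
	addr[3] = tail[0];
	addr[4] = tail[1];
	addr[5] = tail[2];
}

/* Bit 0 of the first octet is the group (multicast) bit. */
static bool
toy_is_unicast(const uint8_t *addr)
{
	return ((addr[0] & 0x01) == 0);
}

/* Bit 1 of the first octet marks a locally administered address. */
static bool
toy_is_locally_administered(const uint8_t *addr)
{
	return ((addr[0] & 0x02) != 0);
}
```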
I've removed bryanv@'s pending question of using the FreeBSD OUI range --
no one has followed up on this with a definitive action, and there's no
particular reason to shoot for it and the administrative overhead that comes
with deciding exactly how to use it.
kevans [Thu, 14 Mar 2019 17:18:00 +0000 (17:18 +0000)]
ether: centralize fake hwaddr generation
We currently have two places with identical fake hwaddr generation --
if_vxlan and if_bridge. Lift it into if_ethersubr for reuse in other
interfaces that may also need a fake addr.
Reviewed by: bryanv, kp, philip
Differential Revision: https://reviews.freebsd.org/D19573
0mp [Thu, 14 Mar 2019 14:34:36 +0000 (14:34 +0000)]
chroot.8: Add examples & clean up
- Sort arguments in synopsis.
- Clarify that it is possible to specify arguments to the command (and that
they could be passed as further arguments to chroot(1)).
- Standardize the description of the flags.
- Improve formatting (e.g., do not use macros in strings specifying width).
- Add examples.
kib [Wed, 13 Mar 2019 17:30:03 +0000 (17:30 +0000)]
Some fixes for proccontrol(1) man page.
- Fix markup.
- Mention that a process can only allow tracing for itself. This is already
stated in procctl(2), but requiring knowledge of the syscall description
is too much for the tool user.
- Clearly state that query mode only works for an existing process.
Noted and reviewed by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
bz [Wed, 13 Mar 2019 17:00:15 +0000 (17:00 +0000)]
Enhance IPv6 autoconf startup.
Before this change we would only run rtsol on an interface which was
set to accept_rtadv and did not have rtsold enabled. This change
removes the latter condition and always runs rtsol (rather than the
deferred rtsold) to reduce the delay until we send the first RS.
This change will also handle the accept_rtadv before dhcp hence
starting IPv6 auto-configuration before IPV4 DHCP.
This change is intended for FreeBSD 13 and later only and will not be MFCed.
eugen [Wed, 13 Mar 2019 09:48:33 +0000 (09:48 +0000)]
mfi.4, mrsas.4: document how to get ATA TRIM support for SSD
while using LSI RAID adapters as it was completely obscure before:
mfi has no TRIM support at all and mrsas provides TRIM
if underlying adapter does it (for Non-RAID drives generally).
bcr [Tue, 12 Mar 2019 20:08:37 +0000 (20:08 +0000)]
Extend descriptions and comments about the need to create /etc/pf.conf.
FreeBSD removed the default /etc/pf.conf file in previous releases, but
the documentation kept mentioning it like any other file present in the
system. Change pf.conf(5) to mention in the description of the default
ruleset location that this file needs to be created manually. Also, the
default rc.conf file had its comment extended a bit to let people know
that this file does not exist by default.
kib [Tue, 12 Mar 2019 19:33:25 +0000 (19:33 +0000)]
hwpmc/core: Adapt to upcoming Skylake TSX errata.
The forthcoming microcode update will fix a TSX bug by clobbering PMC3
when TSX instructions are executed (even speculatively). There is an
alternate mode where CPU executes all TSX instructions by aborting
them, in which case PMC3 is still available to OS. Any code that
correctly uses TSX must be ready to handle abort anyway.
Since it is believed that FreeBSD population of hwpmc(4) users is
significantly larger than the population of TSX users, switch the
microcode into TSX abort mode whenever a pmc is allocated, and back to
bug avoidance mode when the last pmc is deallocated.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
mckusick [Tue, 12 Mar 2019 19:08:41 +0000 (19:08 +0000)]
This is an additional fix for bug report 230962. When using
extended attributes, the kernel can panic with either "ffs_truncate3"
or with "softdep_deallocate_dependencies: dangling deps".
The problem arises because the flushbuflist() function which is
called to clear out buffers is passed either the V_NORMAL flag to
indicate that it should flush the buffers associated with the contents
of the file or the V_ALT flag to indicate that it should flush the
buffers associated with the extended attribute data. The buffers
containing the extended attribute data are identified by having
their BX_ALTDATA flag set in the buffer's b_xflags field. The
BX_ALTDATA flag is set on the buffer when the extended attribute
block is first allocated or when its contents are read in from the
disk.
On a busy system, a buffer may be reused for another purpose, but
the contents of the block that it contained continue to be held
in the main page cache. Each physical page is identified as holding
the contents of a logical block within a specified file (identified
by a vnode). When a request is made to read a file, the kernel first
looks for the block in the existing buffers. If it is not found
there, it checks the page cache to see if it is still there. If
it is found in the page cache, then it is remapped into a new
buffer thus avoiding the need to read it in from the disk.
The bug is that when a buffer request made for an extended attribute
is fulfilled by reconstituting a buffer from the page cache rather
than reading it in from disk, the BX_ALTDATA flag was not being
set. Thus the flushbuflist() function would never clear it out and
the "ffs_truncate3" panic would occur because the vnode being cleared
still had buffers on its clean-buffer list. If the extended attribute
was being updated, it is first read, then updated, and finally
written. If the read is fulfilled by reconstituting the buffer
from the page cache the BX_ALTDATA flag was not set and thus the
dirty buffer would never be flushed by flushbuflist(). Eventually
the buffer would be recycled. Since it was never written it would
have an unfinished dependency which would trigger the
"softdep_deallocate_dependencies: dangling deps" panic.
The fix is to ensure that the BX_ALTDATA flag is set when a buffer
has been reconstituted from the page cache.
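The failure mode can be reproduced in a toy model. Everything prefixed `toy_`/`TOY_` is hypothetical, the `BX_ALTDATA` bit value is illustrative, and the `with_fix` switch exists only to contrast the old and new behavior.

```c
#include <stddef.h>

#define BX_ALTDATA 0x1u	/* illustrative bit value only */

enum toy_flush { TOY_V_NORMAL, TOY_V_ALT };

struct toy_buf {
	unsigned b_xflags;
	int flushed;
};

/* Flush only the buffers whose BX_ALTDATA state matches the request. */
static size_t
toy_flushbuflist(struct toy_buf *bufs, size_t n, enum toy_flush which)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		int is_alt = (bufs[i].b_xflags & BX_ALTDATA) != 0;

		if (is_alt != (which == TOY_V_ALT))
			continue;	/* belongs to the other class */
		bufs[i].flushed = 1;
		count++;
	}
	return (count);
}

/* Rebuild a buffer from cached pages instead of reading from disk. */
static void
toy_reconstitute(struct toy_buf *bp, int for_extattr, int with_fix)
{
	bp->b_xflags = 0;
	bp->flushed = 0;
	if (for_extattr && with_fix)
		bp->b_xflags |= BX_ALTDATA;	/* the fix: tag it again */
}
```

With `with_fix` set to 0, a reconstituted extended attribute buffer is skipped by a `TOY_V_ALT` flush, which mirrors the stale buffer behind both panics.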
dim [Tue, 12 Mar 2019 18:19:44 +0000 (18:19 +0000)]
Revert r308867 (which was originally committed in the clang390-import
project branch):
Work around LLVM PR30879, which is about a bad interaction between
X86 Call Frame Optimization on i386 and libunwind, by disallowing the
optimization for i386-freebsd12.
This should fix some instances of broken exception handling when
frame pointers are omitted, in particular some unittests run during
the build of editors/libreoffice.
This hack will be removed as soon as upstream has implemented a more
permanent fix for this problem.
And indeed, after r345018 and r345019, which updated LLVM libunwind to
the most recent version, the above workaround is no longer needed. The
upstream commit which fixed this is:
Specifically, 32 bit (i386-freebsd) executables optimized with omitted
frame pointers and Call Frame Optimization should now behave correctly
when a C++ exception is thrown, and the stack is unwound.
kib [Tue, 12 Mar 2019 16:49:08 +0000 (16:49 +0000)]
isci(4): Use controller->lock for busdma tags.
isci(4) uses deferred loading. Typically on amd64 and i386 non-PAE
the tag does not create any restrictions, but on i386 with PAE page tables
in a non-PAE config, callbacks might be used.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
kevans [Tue, 12 Mar 2019 16:21:39 +0000 (16:21 +0000)]
stand: Improve some debugging experience
Some of these files using <FOO>_DEBUG defined a DEBUG() macro to serve as a
debug-printf. -DDEBUG is useful to enable some debugging output across
multiple ELF/common parts, so switch the DEBUG-as-printf macros over to
something more like DPRINTF that is more commonly used for this kind of
thing and less likely to conflict.
userboot/elf64_freebsd debugging also assumed %llx for uint64; use PRIx64
instead.
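As a brief illustration of the portability point, `PRIx64` from `<inttypes.h>` expands to the correct conversion for `uint64_t` on every platform, whereas `%llx` is only right where `uint64_t` happens to be `unsigned long long`. The helper name below is hypothetical.

```c
#include <inttypes.h>
#include <stdio.h>

/* Format a 64-bit address portably; returns what snprintf() returns. */
static int
toy_format_addr(char *buf, size_t len, uint64_t addr)
{
	return (snprintf(buf, len, "0x%" PRIx64, addr));
}
```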