bz [Wed, 17 Oct 2018 16:49:11 +0000 (16:49 +0000)]
Move the rc framework out of sbin/init into libexec/rc.
The reasons for this are forward looking to pkgbase:
* /sbin/init is a special binary; try not to replace it with
every package update because an rc script was touched.
(a follow-up commit will make init its own package)
* having rc in its own place will allow more easy replacement
of the rc framework with alternatives, such as openrc.
jamie [Wed, 17 Oct 2018 16:11:43 +0000 (16:11 +0000)]
Add a new jail permission, allow.read_msgbuf. When true, jailed processes
can see the dmesg buffer (this is the current behavior). When false (the
new default), dmesg will be unavailable to jailed users, whether root or
not.
The security.bsd.unprivileged_read_msgbuf sysctl still works as before,
controlling system-wide whether non-root users can see the buffer.
yuripv [Wed, 17 Oct 2018 14:51:43 +0000 (14:51 +0000)]
strptime: fix parsing of tm_year when both %C and %y appear in the
format string in arbitrary order. This makes the related test cases in
lib/libc/tests/time (not yet connected to the build) pass.
While here, don't error on negative tm_year value based on the
APPLICATION USAGE in
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/time.h.html
(glibc does the same):
tm_year is a signed value; therefore, years before 1900 may be represented.
Approved by: re (gjb), kib (mentor)
Differential Revision: https://reviews.freebsd.org/D17550
bz [Wed, 17 Oct 2018 10:31:08 +0000 (10:31 +0000)]
The countp argument passed to linker_file_lookup_set() in
linker_load_dependencies() is unused, so no need to ask for the
value in first place. Remove the unused "count" variable.
kib [Tue, 16 Oct 2018 20:12:35 +0000 (20:12 +0000)]
Add initial driver for ACPI NFIT-enumerated NVDIMMs.
Driver enumerates NVDIMMs. Besides, for each found System Physical
Address (SPA) range, spaN geom provider is created, which allows
formatting and mounting the region as the normal volume. Also,
/dev/nvdimm_spaN node is created, which can be read/written/mapped by
userspace, the mapping is zero-copy.
No support for block access methods implemented, labels are not
parsed. No management interfaces are provided.
Tested by: Intel, NetApp
Sponsored by: The FreeBSD Foundation
Approved by: re (gjb)
MFC after: 2 weeks
hselasky [Tue, 16 Oct 2018 18:47:13 +0000 (18:47 +0000)]
Fix for reception of large full speed isochronous frames via the transaction
translator, when using the DWC OTG USB controller driver. Make sure to re-try
getting the complete split packets until a DATA0 packet is received. Larger
isochronous frames may be split into multiple MDATA packets terminated
by a single DATA0 packet.
PR: 230434
MFC after: 3 days
Approved by: re (gjb)
Sponsored by: Mellanox Technologies
kib [Tue, 16 Oct 2018 17:28:10 +0000 (17:28 +0000)]
Provide pmap_large_map() KPI on amd64.
The KPI allows to map very large contigous physical memory regions
into KVA, which are not covered by DMAP.
I see both with QEMU and with some real hardware started shipping, the
regions for NVDIMMs might be very far apart from the normal RAM, and
we expect that at least initial users of NVDIMM could install very
large amount of such memory. IMO it is not reasonable to extend DMAP
to cover that far-away regions both because it could overflow existing
4T window for DMAP in KVA, and because it costs in page table pages
allocations, for gap and for possibly unused NV RAM.
Also, KPI provides some special functionality for fast cache flushing
based on the knowledge of the NVRAM mapping use.
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
Approved by: re (gjb)
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D17070
yuripv [Tue, 16 Oct 2018 17:17:11 +0000 (17:17 +0000)]
apropos/whatis: use output of manpath(1) to set defpaths if -M is not
specified. This fixes searching the paths specified in
/usr/local/etc/man.d/*.conf, as currently apropos/whatis from mandoc
suite aren't aware about them.
kib [Tue, 16 Oct 2018 17:00:42 +0000 (17:00 +0000)]
Add clwb().
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
Approved by: re (gjb)
MFC after: 3 days
Differential revision: https://reviews.freebsd.org/D17070
jtl [Tue, 16 Oct 2018 14:41:09 +0000 (14:41 +0000)]
In r338102, the TCP reassembly code was substantially restructured. Prior
to this change, the code sometimes used a temporary stack variable to hold
details of a TCP segment. r338102 stopped using the variable to hold
segments, but did not actually remove the variable.
Because the variable is no longer used, we can safely remove it.
jtl [Tue, 16 Oct 2018 02:30:13 +0000 (02:30 +0000)]
Import CK as of commit 5221ae2f3722a78c7fc41e47069ad94983d3bccb.
This fixes two problems, one where epoch calls could occur before all
the readers had exited the epoch section, and one where the epoch calls
could be unnecessarily delayed.
mav [Mon, 15 Oct 2018 21:59:24 +0000 (21:59 +0000)]
Skip VDEV_IO_DONE stage only for ZIO_TYPE_FREE.
Device removal code uses zio_vdev_child_io() with ZIO_TYPE_NULL parent,
that never happened before. It confused FreeBSD-specific TRIM code,
which does not use VDEV_IO_DONE for logical ZIO_TYPE_FREE ZIOs. As
result of that stage being skipped device removal ZIOs leaked references
and memory that supposed to be freed by VDEV_IO_DONE, making it stuck.
It is a quick patch rather then a nice fix, but hopefully we'll be able
to drop it all together when alternative TRIM implementation finally get
landed.
PR: 228750, 229007
Discussed with: allanjude, avg, smh
Approved by: re (delphij)
MFC after: 5 days
Sponsored by: iXsystems, Inc.
yuripv [Mon, 15 Oct 2018 20:11:53 +0000 (20:11 +0000)]
pw: respect path specified using -V when writing pw.conf, and -C is not
explicitly specified. -V path is already used to determine which file
to read default values from, so it's only logical to write them to the
same file.
jhb [Mon, 15 Oct 2018 18:56:54 +0000 (18:56 +0000)]
Various fixes for TLB management on RISC-V.
- Remove the arm64-specific cpu_*cache* and cpu_tlb_flush* functions.
Instead, add RISC-V specific inline functions in cpufunc.h for the
fence.i and sfence.vma instructions.
- Catch up to changes in the arm64 pmap and remove all the cpu_dcache_*
calls, pmap_is_current, pmap_l3_valid_cacheable, and PTE_NEXT bits from
pmap.
- Remove references to the unimplemented riscv_setttb().
- Remove unused cpu_nullop.
- Add a link to the SBI doc to sbi.h.
- Add support for a 4th argument in SBI calls. It's not documented but
it seems implied for the asid argument to SBI_REMOVE_SFENCE_VMA_ASID.
- Pass the arguments from sbi_remote_sfence*() to the SEE. BBL ignores
them so this is just cosmetic.
- Flush icaches on other CPUs when they resume from kdb in case the
debugger wrote any breakpoints while the CPUs were paused in the IPI_STOP
handler.
- Add SMP vs UP versions of pmap_invalidate_* similar to amd64. The
UP versions just use simple fences. The SMP versions use the
sbi_remove_sfence*() functions to perform TLB shootdowns. Since we
don't have a valid pm_active field in the riscv pmap, just IPI all
CPUs for all invalidations for now.
- Remove an extraneous TLB flush from the end of pmap_bootstrap().
- Don't do a TLB flush when writing new mappings in pmap_enter(), only if
modifying an existing mapping. Note that for COW faults a TLB flush is
only performed after explicitly clearing the old mapping as is done in
other pmaps.
- Sync the i-cache on all harts before updating the PTE for executable
mappings in pmap_enter and pmap_enter_quick. Previously the i-cache was
only sync'd after updating the PTE in pmap_enter.
- Use sbi_remote_fence() instead of smp_rendezvous in pmap_sync_icache().
jhb [Mon, 15 Oct 2018 18:12:25 +0000 (18:12 +0000)]
Reload the LDT selector after an AMD-v #VMEXIT.
cpu_switch() always reloads the LDT, so this can only affect the
hypervisor process itself. Fix this by explicitly reloading the host
LDT selector after each #VMEXIT. The stock bhyve process on FreeBSD
never uses a custom LDT, so this change is cosmetic.
Reviewed by: kib
Tested by: Mike Tancsa <mike@sentex.net>
Approved by: re (gjb)
MFC after: 2 weeks
erj [Mon, 15 Oct 2018 17:23:41 +0000 (17:23 +0000)]
iavf(4): Finish rename/rebrand internally
Rename functions and variables from ixlv to iavf to match the
user-facing name change. There shouldn't be any functional changes
with this change, but this may help with browsing the source code
and reducing diffs in the future.
luporl [Mon, 15 Oct 2018 16:43:07 +0000 (16:43 +0000)]
Initialize SPRG0 before its first possible use
At early boot, PCPU_GET(), that obtains a pointer from SPRG0, was being
used with SPRG0 not yet initialized. If it pointed to an invalid
address, the machine would hang.
hselasky [Mon, 15 Oct 2018 10:29:29 +0000 (10:29 +0000)]
Fix deadlock when destroying VLANs.
Synchronizing the epoch before freeing the multicast addresses while holding
the VLAN_XLOCK() might lead to a deadlock. Use deferred freeing of the VLAN
multicast addresses to resolve deadlock. Backtrace:
erj [Sun, 14 Oct 2018 05:09:43 +0000 (05:09 +0000)]
em/igb/ix(4): Port two Tx/Rx fixes made to ixl in r339338
- Fix assert/panic on receive when Jumbo Frames are enabled.
From the commit I made to ixl:
"It turns out that *_isc_rxd_available is supposed to return how many
packets are available to be cleaned on the rx ring. This patch removes
a section of code where if the budget argument is 1, the function would return
one if there was a descriptor available, not necessarily a packet.
This is okay in regular mtu 1500 traffic since the max frame size is less
than the configured receive buffer size (2048), but this doesn't work when
received packets can span more than one descriptor, as is the case when the
mtu is 9000 and the receive buffer size is 4096."
- Fix possible Tx hang because *_isc_txd_credits_update returns incorrect result
From the commit by Krzysztof Galazka to ixl: "Function isc_txd_update_credits
called with clear set to false should return 1 if there are TX descriptors
already handled by HW. It was always returning 0 causing troubles with UDP TX
traffic."
emaste [Sat, 13 Oct 2018 21:26:07 +0000 (21:26 +0000)]
elfcopy: delete filter_reloc, it is broken and unnecessary
elfcopy contained logic to filter individual relocations in STRIP_ALL
mode. However, this is not valid; relocations emitted by the linker are
required, unless they apply to an entire section being removed (which is
handled by other logic in elfcopy).
Note that filter_reloc was also buggy: for RELA relocation sections it
operated on uninitialized rel.r_info resulting in invalid operation.
The logic most likely needs to be inverted: instead of removing
relocations because their associated symbols are being removed, we must
keep symbols referenced by relocations. That said, in practice we do
not encounter this code path today: objects being stripped are either
dynamically linked binaries which retain .dynsym, or static binaries
with no relocations.
Just remove filter_reloc. This fixes certain cases including statically
linked binaries containing ifuncs. Stripping binaries with relocations
referencing removed symbols was already broken, and after this change
may still be broken in a different way.
PR: 232176
Reviewed by: kaiw, kib, markj
Approved by: re (rgrimes)
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17519
mjg [Sat, 13 Oct 2018 21:18:31 +0000 (21:18 +0000)]
amd64: partially depessimize cpu_fetch_syscall_args and cpu_set_syscall_retval
Vast majority of syscalls take 6 or less arguments. Move handling of other
cases to a fallback function. Similarly, special casing for _syscall
and __syscall
magic syscalls is moved away.
Return is almost always 0. The change replaces 3 branches with 1 in the common
case. Also the 'frame' variable convinces clang not to reload it on each access.
Reviewed by: kib
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17542
mjg [Sat, 13 Oct 2018 21:17:28 +0000 (21:17 +0000)]
amd64: convert libc bcopy to a C func to avoid future bloat
The function is of limited use and is an almost a direct clone of
memmove/memcpy (with arguments swapped). Introduction of ERMS variants
of string routines would mean avoidable growth of libc.
bcopy will get redefined to a __builtin_memmove later on with this
symbol only left for compatibility.
Reviewed by: kib
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17539
bz [Fri, 12 Oct 2018 22:51:45 +0000 (22:51 +0000)]
In udp_input() when walking the pcblist we can come across
an inp marked FREED after the epoch(9) changes.
Check once we hold the lock and skip the inp if it is the case.
Contrary to IPv6 the locking of the inp is outside the multicast
section and hence a single check seems to suffice.
erj [Fri, 12 Oct 2018 22:40:54 +0000 (22:40 +0000)]
ixl/iavf(4): Change ixlv to iavf and update it to use iflib(9)
Finishes the conversion of the 40Gb Intel Ethernet drivers to iflib(9) for
FreeBSD 12.0, and fixes numerous bugs in both ixl(4) and iavf(4).
This commit also re-adds the VF driver to GENERIC since it now compiles and
functions.
The VF driver name was changed from ixlv(4) to iavf(4) because the VF driver is
now intended to be used with future products, not just with Fortville/Fort Park
VFs.
A man page update that documents these drivers is forthcoming in a separate
commit.
mav [Fri, 12 Oct 2018 16:55:28 +0000 (16:55 +0000)]
Avoid zero-sized kmem_alloc() in vdev_compact_children().
The device evacuation code adds a dependency that
vdev_compact_children() be able to properly empty the vdev_child
array by setting it to NULL and zeroing vdev_children. Under Linux,
kmem_alloc() and related functions return a sentinel pointer rather
than NULL for zero-sized allocations.
This is a part of ZoL port of device removal patch:
commit a1d477c24c7badc89c60955995fd84d311938486
Author: Matthew Ahrens <mahrens@delphix.com> Ported-by: Tim Chase <tim@chase2k.com>
Approved by: re (kib)
MFC after: 1 week
kib [Fri, 12 Oct 2018 16:00:21 +0000 (16:00 +0000)]
Call initializecpucache() before ifuncs are resolved.
The function tweaks CPU capabilities based on the VM platform and
tunables, which affected selection of the cache flush method before
ifuncs were used, and should affect the cache flush in the same way
after ifunc.
PR: 232081
Reported by: phk
Analyzed by: avg
Sponsored by: The FreeBSD Foundation
Approved by: re (gjb)
kib [Fri, 12 Oct 2018 15:30:15 +0000 (15:30 +0000)]
bhyve: emulate CLFLUSH and CLFLUSHOPT.
Apparently CLFLUSH on mmio can cause VM exit, as reported in the PR.
I do not see that anything useful can be done except emulating page
faults on invalid addresses.
Due to the instruction encoding pecularity, also emulate SFENCE.
PR: 232081
Reported by: phk
Reviewed by: araujo, avg, jhb (all: previous version)
Sponsored by: The FreeBSD Foundation
Approved by: re (gjb)
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D17482
mav [Fri, 12 Oct 2018 15:14:22 +0000 (15:14 +0000)]
Add ZIO_TYPE_FREE support for indirect vdevs.
Upstream code expects only ZIO_TYPE_READ and some ZIO_TYPE_WRITE
requests to removed (indirect) vdevs, while on FreeBSD there is also
ZIO_TYPE_FREE (TRIM). ZIO_TYPE_FREE requests do not have the data
buffers, so don't need the pointer adjustment.
PR: 228750, 229007
Reviewed by: allanjude, sef
Approved by: re (kib)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D17523
bz [Fri, 12 Oct 2018 11:30:46 +0000 (11:30 +0000)]
r217592 moved the check for imo in udp_input() into the conditional block
but leaving the variable assignment outside the block, where it is no longer
used. Move both the variable and the assignment one block further in.
This should result in no functional changes. It will however make upcoming
changes slightly easier to apply.
yuripv [Thu, 11 Oct 2018 18:30:12 +0000 (18:30 +0000)]
Restore some of the ctype definitions reported in the PR from pre-CLDR
data, namely 0xE000-0xF8FF private use area, and 0xFF00-0xFFF half- and
fullwidth punctuation.
While here, update tools/tools/locale/README based on my experience
rebuilding the locale data.
PR: 225692
Reviewed by: bapt, cem (previous version)
Approved by: re (gjb), kib (mentor)
Differential Revision: https://reviews.freebsd.org/D17471
jhb [Thu, 11 Oct 2018 18:27:19 +0000 (18:27 +0000)]
Fully restore the GDTR, IDTR, and LDTR after VT-x VM exits.
The VT-x VMCS only stores the base address of the GDTR and IDTR. As a
result, VM exits use a fixed limit of 0xffff for the host GDTR and
IDTR losing the smaller limits set in when the initial GDT is loaded
on each CPU during boot. Explicitly save and restore the full GDTR
and IDTR contents around VM entries and exits to restore the correct
limit.
Similarly, explicitly save and restore the LDT selector. VM exits
always clear the host LDTR as if the LDT was loaded with a NULL
selector and a userspace hypervisor is probably using a NULL selector
anyway, but save and restore the LDT explicitly just to be safe.
kevans [Thu, 11 Oct 2018 17:18:49 +0000 (17:18 +0000)]
Disable kernels_autodetect on installation media
This feature is disabled on install media as these generally won't have any
interesting kernels to be listed other than the default kernel, so the
potential performance penalty in these situations likely isn't worth it.
kevans [Thu, 11 Oct 2018 17:17:54 +0000 (17:17 +0000)]
Enable lualoader's kernel autodetection, disabled on install media
As documented in loader.conf(5), kernels_autodetect="YES" will cause the
Lua scripts to effectively scan /boot for directories with a "kernel" file
inside, to be listed in the loader menu.
emaste [Thu, 11 Oct 2018 13:19:17 +0000 (13:19 +0000)]
lld: set sh_link and sh_info for .rela.plt sections
ELF spec says that for SHT_REL and SHT_RELA sh_link should reference the
associated string table and sh_info should reference the "section to
which the relocation applies." ELF Tool Chain's elfcopy / strip use
this (in part) to control whether or not the relocation entry is copied
to the output.
nwhitehorn [Thu, 11 Oct 2018 00:54:39 +0000 (00:54 +0000)]
Loader GELI support, like lua loader, seems to be broken on PowerPC as
well as on SPARC64 and can cause boot failures even when no encrypted
disks are present. Presumably, the reasons, while unknown, are the same
and most-likely are the result of some endian-unsafe code. Pending
finding the actual problem, extend the blacklist entry for these parts
of loader on SPARC to also cover all PowerPC platforms.
In vdev_queue_aggregate() the zio_execute() bypass should not be
called under the vdev queue lock. This can result in a deadlock
as shown in the stack traces below.
Drop the vdev queue lock then walk the parents of the aggregate IO
to determine the list of component IOs to be bypassed. This can
be done safely without holding the io_lock since the new aggregate
IO has not yet been returned and its parents cannot change.
mw [Wed, 10 Oct 2018 10:34:17 +0000 (10:34 +0000)]
Update Armada 38x UART device tree binding
Recent changes in Linux updated Marvell Armada 38x
UART compatible string. As a result the FreeBSD driver
(uart_dev_snps) does not probe. This commit fixes the
situation, however not applying any functional modification
to the driver methods.
brooks [Tue, 9 Oct 2018 22:22:15 +0000 (22:22 +0000)]
Don't include the broken riscv64sf TARGET_ARCH in universe.
riscv64sf has been broken due to duplicate symbols for months and
degrades the quality of universe builds. Remove it until this is
resolved leaving a comment to it is not re-added.
des [Tue, 9 Oct 2018 19:27:42 +0000 (19:27 +0000)]
Fix portability issues with the Capsicum patch committed in r339216:
- Wrap access to pw_change and pw_expire in the appropriate #ifdefs.
- Wrap calls to login_cap(3) API in appropriate #ifdefs.
- Add wrapper for transferring time_t, which is still only 32 bits wide
on FreeBSD i386.
- Use a temporary variable to deserialize size_t.