Martin Matuska [Thu, 25 Oct 2018 21:44:17 +0000 (21:44 +0000)]
MFV r339640,339641,339644:
Sync libarchive with vendor
Relevant vendor changes:
PR #1013: Add missing h_base offset when performing absolute seeks in
xar decompression
PR #1061: Add support for extraction of RAR v5 archives
PR #1066: Fix out of bounds read on empty string filename for gnutar, pax
and v7tar
PR #1067: Fix temporary file path buffer overflow in tests
IS #1068: Correctly process and verify integer arguments passed to
bsdcpio and bsdtar
PR #1070: Don't default XAR entry atime/mtime to the current time
Andrew Turner [Thu, 25 Oct 2018 17:39:41 +0000 (17:39 +0000)]
Implement a BSD licensed crtbegin/crtend
These are needed for .ctors/.dtors and .jcr handling. The former needs
all the function pointers to be called in the correct order from the
.init/.fini section. The latter just needs to call a gcj specific function
if it exists with a pointer to the start of the .jcr section.
This is currently disabled until __dso_handle support is added.
Warner Losh [Thu, 25 Oct 2018 17:17:11 +0000 (17:17 +0000)]
Update comment for AMI00[12]0 override.
The AML is even stupider than always returning 0. It will only return
non-zero for an OS that reports itself as "Windows 2015", at least
on the Threadripper board's AML that I've examined.
Those AMLs also suggest we may need this quirk for AMI0030 as well.
There may be other cases where we need to override the _STA in a
generic way, so we should consider writing code to do that.
- Fix some typos.
- Remove redundant semicolons from the synopsis section.
- Stylize variable names and types with Vt and Va respectively.
- Use a list to present non-implemented functions.
- Sort the order of the sections.
- Add a history section.
- Use Nm when "libefivar" is mentioned.
Navdeep Parhar [Thu, 25 Oct 2018 14:37:26 +0000 (14:37 +0000)]
cxgbe(4): Allow "pass" filters to distribute matching traffic using a
subset of a VI's RSS indirection table.
This makes it possible to make groups out of rx queues and steer
different kinds of traffic to different groups. For example, an
interface with 8 rx queues could have all non-TCP traffic delivered to
queues 0-3 and all TCP traffic to queues 4-7.
Note that it is already possible for filters to steer traffic to a
particular queue or to distribute it using the full indirection table
(much like normal rx does).
Brooks Davis [Thu, 25 Oct 2018 04:10:41 +0000 (04:10 +0000)]
Deprecate a number of less used 10 and 10/100 Ethernet devices.
The current deprecated list is: ae, bm, cs, de, dme, ed, ep, ex, fe,
pcn, sf, sn, tl, tx, txp, vx, wb, xe
The list as refined as part of FCP-0101. Per the FCP, devices may be
removed from the deprecation list if enough users are found or they are
converted to iflib.
Kyle Evans [Thu, 25 Oct 2018 02:14:35 +0000 (02:14 +0000)]
lualoader: Improve module loading diagnostics
Some fixes:
- Maintain historical behavior more accurately w.r.t verbose_loading;
verbose_loading strictly prints "${module_name...}" and later "failed!"
or "ok" based on load success
- With or without verbose_loading, dump command_errbuf on load failure.
This usually happens prior to ok/failed if we're verbose_loading
Reviewed by: imp
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D17694
Kyle Evans [Thu, 25 Oct 2018 02:04:01 +0000 (02:04 +0000)]
Update lualoader test script a little bit
Use userboot.so from the test directory if possible, fall back to .OBJDIR.
This avoids a problem that we've had since userboot coexistence was added,
where userboot.so alone no longer exists in the .OBJDIR but is instead just
a link installed later.
Mark Johnston [Wed, 24 Oct 2018 16:41:47 +0000 (16:41 +0000)]
Use a vm_domainset iterator in keg_fetch_slab().
Previously, it used a hand-rolled round-robin iterator. This meant that
the minskip logic in r338507 didn't apply to UMA allocations, and also
meant that we would call vm_wait() for individual domains rather than
permitting an allocation from any domain with sufficient free pages.
Discussed with: jeff
Tested by: pho
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17420
Bjoern A. Zeeb [Wed, 24 Oct 2018 08:45:33 +0000 (08:45 +0000)]
Allow the bhyve VNC server to listen on IPv6 for incoming connections.
Alternatively to IPv4 address:port this will allow to listen on IPv6
link-local (incl. scope), a specific address, or ::. Addresses have
to be given in RFC2732 format so that [::]:port parsing will work.
This patch also starts to introduce WITH_INET/INET6_SUPPORT to bhyve.
PR: 232018
Submitted by: Dave Rush (northwoodlogic.free gmail.com) (original)
Reviewed by: Dave Rush (updated verison)
MFC after: 3 days
Kyle Evans [Wed, 24 Oct 2018 03:14:10 +0000 (03:14 +0000)]
menu.lua: Abort autoboot sequence on failed command
Currently, a timeout in the menu autoboot sequence would effectively do
nothing. We would return from the autoboot handling, then begin processing
the menu without redrawing it.
This change makes the behavior a little more friendly. Returning the user to
the menu can't have any good effects, so abort the autoboot sequence and
drop to the loader prompt.
Kyle Evans [Wed, 24 Oct 2018 02:02:37 +0000 (02:02 +0000)]
lualoader: unload upon kernel change if a kernel was previously loaded
In the majority of cases, a kernel is not loaded before we hit the menu.
However, if a password is set, we'll trigger autoboot and have loadelf'd
beforehand. We also need to take into account one dropping to the loader
prompt and twiddling with things manually; if they try to toggle through
kernels, we'll assume they mean it.
Kristof Provost [Wed, 24 Oct 2018 00:19:44 +0000 (00:19 +0000)]
pf: Fix copy/paste error in IPv6 address rewriting
We checked the destination address, but replaced the source address. This was
fixed in OpenBSD as part of their NAT rework, which we don't want to import
right now.
ffs_subr.c requires calculate_crc32c() from libkern. Unfortunately we
cannot just add libkern/crc32.c to libstand because crc32.o is already
compiled from contrib/zlib/crc32.c. Use the include trick to rename
the source.
Note that libstand also provides crc32.c which seems to be unused.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D17677
Use bypass to catch any NFS VOP dispatch and route it through the
wrapper which does sigdeferstop() and then dispatches original
VOP. NFS does not need a bypass below it, which is not supported.
The vop offset in the vop_vector is added since otherwise it is
impossible to get vop_op_t from the internal table, and I did not
wanted to create the layered fs only to wrap NFS VOPs.
VFS_OP()s wrap is straightforward.
Requested and reviewed by: mjg (previous version)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D17658
Kirk McKusick [Tue, 23 Oct 2018 21:10:06 +0000 (21:10 +0000)]
Continuing efforts to provide hardening of FFS, this change adds a
check hash to the superblock. If a check hash fails when an attempt
is made to mount a filesystem, the mount fails with EINVAL (Invalid
argument). This avoids a class of filesystem panics related to
corrupted superblocks. The hash is done using crc32c.
Check hases are added only to UFS2 and not to UFS1 as UFS1 is primarily
used in embedded systems with small memories and low-powered processors
which need as light-weight a filesystem as possible.
Reviewed by: kib
Tested by: Peter Holm
Sponsored by: Netflix
Mark Johnston [Tue, 23 Oct 2018 16:35:58 +0000 (16:35 +0000)]
Refactor domainset iterators for use by malloc(9) and UMA.
Before this change we had two flavours of vm_domainset iterators: "page"
and "malloc". The latter was only used for kmem_*() and hard-coded its
behaviour based on kernel_object's policy. Moreover, its use contained
a race similar to that fixed by r338755 since the kernel_object's
iterator was being run without the object lock.
In some cases it is useful to be able to explicitly specify a policy
(domainset) or policy+iterator (domainset_ref) when performing memory
allocations. To that end, refactor the vm_dominset_* KPI to permit
this, and get rid of the "malloc" domainset_iter KPI in the process.
Reviewed by: jeff (previous version)
Tested by: pho (part of a larger patch)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17417
Toomas Soome [Tue, 23 Oct 2018 14:44:32 +0000 (14:44 +0000)]
loader: biosdisk interface should be able to cope with 4k sectors
The 4kn support in current bios specific biosdisk.c is broken, as the code
is only implementing the support for the 512B sector size.
This work is building the support for custom size sectors, we still do assume
the requested data to be multiple of 512B blocks and we only do address the
biosdisk.c interface here.
For reference, see also:
https://www.illumos.org/issues/8303
https://www.illumos.org/rb/r/547
As the GELI is moved above biosdisk "layer", the GELI should just work
Glen Barber [Tue, 23 Oct 2018 14:38:08 +0000 (14:38 +0000)]
Add debug.witness.trace=0 back to the installer sysctl.conf(5),
incorrectly removed from head when it should have been removed
from stable/12 post-branch.
Reported by: bdrewery
Sponsored by: The FreeBSD Foundation
This makes it directly use MAP_EXCL and MAP_ALIGNED() instead
of weird workarounds involving mapping at random places and then
unmapping parts of them.
Toomas Soome [Tue, 23 Oct 2018 13:38:39 +0000 (13:38 +0000)]
libsa: re-send ACK for older data packets in tftp
In current tftp code we drop out-of-order packets; however, we should play
nice and re-send ACK for older data packets we are receiving. This will
hopefully stop server repeating those packets we already have received.
Note we do not answer duplicates from "previous" session (that is, session
with different port number), those will eventually time out.
Add the check that current VNET is ready and access to srchash is allowed.
This change is similar to r339646. The callback that checks for appearing
and disappearing of tunnel ingress address can be called during VNET
teardown. To prevent access to already freed memory, add check to the
callback and epoch_wait() call to be sure that callback has finished its
work.
Ed Maste [Tue, 23 Oct 2018 13:07:03 +0000 (13:07 +0000)]
ar: report errno on warning/error
Previously ar would report an error like "ar: fatal: Write error"
without including additional errno information. Change warnings and
errors to include archive_errno() so that the user may have some idea
of the reason for the failure.
Reviewed by: brooks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17650
Add the check that current VNET is ready and access to srchash is
allowed.
ipsec_srcaddr() callback can be called during VNET teardown, since
ingress address checking subsystem isn't VNET specific. And thus
callback can make access to already freed memory. To prevent this,
use V_ipsec_idhtbl pointer as indicator of VNET readiness. And make
epoch_wait() after resetting it to NULL in vnet_ipsec_uninit() to
be sure that ipsec_srcaddr() is finished its work.
Relevant vendor changes:
PR #1013: Add missing h_base offset when performing absolute seeks in
xar decompression
PR #1061: Add support for extraction of RAR v5 archives
PR #1066: Fix out of bounds read on empty string filename for gnutar, pax
and v7tar
PR #1067: Fix temporary file path buffer overflow in tests
IS #1068: Correctly process and verify integer arguments passed to
bsdcpio and bsdtar
PR #1070: Don't default XAR entry atime/mtime to the current time
netmap: align codebase to the current upstream (sha 8374e1a7e6941)
Changelist:
- Move large parts of VALE code to a new file and header netmap_bdg.[ch].
This is useful to reuse the code within upcoming projects.
- Improvements and bug fixes to pipes and monitors.
- Introduce nm_os_onattach(), nm_os_onenter() and nm_os_onexit() to
handle differences between FreeBSD and Linux.
- Introduce some new helper functions to handle more host rings and fake
rings (netmap_all_rings(), netmap_real_rings(), ...)
- Added new sysctl to enable/disable hw checksum in emulated netmap mode.
- nm_inject: add support for NS_MOREFRAG
Alex Richardson [Tue, 23 Oct 2018 06:31:31 +0000 (06:31 +0000)]
Fix ncurses fallback.c build with a strict build shell
The script uses shift three times and when building with a strict /bin/sh
shifting without any arguments will cause the script to fail. In this case
the target will fail and we write an empty output file. When doing a
NO_CLEAN build after this will mean fallback.c is up to date and clang
will happily compile the empty input file which leads to strange build
errors later.
Fixed by passing three empty arguments to MkFallback.sh and only creating
fallback.c if MKfallback.sh succeeds.
Alex Richardson [Tue, 23 Oct 2018 06:31:25 +0000 (06:31 +0000)]
Only compute the X_COMPILER_*/X_LINKER_* variables when needed
When building CheriBSD we have to set XLD/XCC/XCFLAGS on the command line.
This triggers the $XCC != $CC case in bsd.compiler.mk (and the same for LD
in bsd.linker.mk) which causes it to call ${XCC} --version and
${XLD} --version (plus various awk+sed+echo calls) in every subdirectory.
For incremental builds and stages that only walk the source tree this is
often the majority of the time spent in that directory.
By only computing the value of the X_COMPILER_*/X_LINKER_* variables if
_WANT_TOOLCHAIN_CROSS_VARS is set we can reduce the number of cc/ld calls
to once per build stage instead of once per recursive make.
With this change (and no changes to the sources) the `make includes` stage
now takes 28 seconds at -j1 instead of 86 seconds.
Alex Richardson [Tue, 23 Oct 2018 06:31:19 +0000 (06:31 +0000)]
Fix regex for extracting SHM_* values for libsysdecode
There was an additional + after the {6} which is apparently ignored by the
FreeBSD regex implementation but was giving me an error when compiling on
MacOS.
While changing this also make sure that tables.h is not created if mktables
fails. The current rule would create a partial tables.h which causes following
incremental builds to use that broken file and fail with an unrelated
compilation error or even succeed even though they shouldn't.
Eric Joyner [Tue, 23 Oct 2018 04:37:29 +0000 (04:37 +0000)]
iflib: drain enqueued tasks before detaching from taskqgroup
The taskqgroup_detach function does not check if task is already enqueued when
detaching it. This may lead to kernel panic if enqueued task starts after
context state lock is destroyed. Ensure that the already enqueued admin tasks
are executed before detaching them.
The issue was discovered during validation of D16429. Unloading of if_ixlv
followed by immediate removal of VFs with iovctl -D may lead to panic on
NODEBUG kernel.
As well, check if iflib is in detach before enqueueing new admin or iov
tasks, to prevent new tasks from executing while the taskqgroup tasks
are being drained.
Justin Hibbits [Tue, 23 Oct 2018 01:56:52 +0000 (01:56 +0000)]
dpaa: Mark BMan and QMan as earlier driver modules
The BMan softc must exist when dtsec devices are created, else a NULL
pointer is dereferenced. QMan likely as well. Until now, we have relied on
order within the fdt parsing to attach correctly, but this obviously is not
foolproof. Mark these as BUS_PASS_SUPPORTDEV so they're probed and attached
explicitly before dtsec devices.
Navdeep Parhar [Mon, 22 Oct 2018 23:06:23 +0000 (23:06 +0000)]
cxgbe(4): Use automatic cidx updates with ofld and ctrl queues.
The bits that explicitly request cidx updates do not work reliably with
all possible WRs that can be sent over the queue. The F_FW_WR_EQUIQ
requests that still remain may also have to be replaced with explicit
credit flush WRs in the future.
MFC after: 2 days
Sponsored by: Chelsio Communications
Brooks Davis [Mon, 22 Oct 2018 22:13:00 +0000 (22:13 +0000)]
Remove the need for backslashes in syscalls.master.
Join non-special lines together until we hit a line containing a '}'
character. This allows the function declaration body to be split
across multiple lines without backslash continuation characters.
Continue to join lines ending with backslashes to allow gradual
migration and to support out-of-tree syscall vectors
Brooks Davis [Mon, 22 Oct 2018 21:50:43 +0000 (21:50 +0000)]
Remove __restrict qualifiers from syscalls.master.
The restruct qualifier is intended to aid code generation in the
compiler, but the only access to storage through these pointers is via
structs using copyin/copyout and the like which can not be written in C
or C++ and thus the compiler gains nothing from the qualifiers.
As such, the qualifiers add no value in current usage.
John Baldwin [Mon, 22 Oct 2018 21:25:28 +0000 (21:25 +0000)]
Add a "live" mode to ktrdump.
Support a "live" mode in ktrdump enabled via the -l flag. In this
mode, ktrdump polls the kernel's trace buffer periodically (currently
hardcoded as a 50 millisecond interval) and dumps any newly added
entries. Fancier logic for the timeout (e.g. a command line option or
some kind of backoff based on the time since the last entry) can be
added later as the need arises.
While here, fix some bugs from when this was Capsicum-ized:
- Use caph_limit_stream() for the output stream so that isatty() works
and the output can be line-buffered (especially useful for live
mode).
- Use caph_limit_stderr() to permit error messages to be displayed if
an error occurs after cap_enter().
John Baldwin [Mon, 22 Oct 2018 21:17:36 +0000 (21:17 +0000)]
A couple of style fixes in recent TCP changes.
- Add a blank line before a block comment to match other block comments
in the same function.
- Sort the prototype for sbsndptr_adv and fix whitespace between return
type and function name.
Tijl Coosemans [Mon, 22 Oct 2018 20:55:35 +0000 (20:55 +0000)]
Define linuxkpi readq for 64-bit architectures. It is used by drm-kmod.
Currently the compiler picks up the definition in machine/cpufunc.h.
Add compiler memory barriers to read* and write*. The Linux x86
implementation of these functions uses inline asm with "memory" clobber.
The Linux x86 implementation of read_relaxed* and write_relaxed* uses the
same inline asm without "memory" clobber.
Implement ioread* and iowrite* in terms of read* and write* so they also
have memory barriers.
Qualify the addr parameter in write* as volatile.
Like Linux, define macros with the same name as the inline functions.
Only define 64-bit versions on 64-bit architectures because generally
32-bit architectures can't do atomic 64-bit loads and stores.
Regroup the functions a bit and add brief comments explaining what they do:
- __raw_read*, __raw_write*: atomic, no barriers, no byte swapping
- read_relaxed*, write_relaxed*: atomic, no barriers, little-endian
- read*, write*: atomic, with barriers, little-endian
Add a comment that says our implementation of ioread* and iowrite*
only handles MMIO and does not support port IO.
Mark Johnston [Mon, 22 Oct 2018 20:13:51 +0000 (20:13 +0000)]
Make it possible to disable NUMA support with a tunable.
This provides a chicken switch for anyone negatively impacted by
enabling NUMA in the amd64 GENERIC kernel configuration. With
NUMA disabled at boot-time, information about the NUMA topology
is not exposed to the rest of the kernel, and all of physical
memory is viewed as coming from a single domain.
This method still has some performance overhead relative to disabling
NUMA support at compile time.
PR: 231460
Reviewed by: alc, gallatin, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17439
Ed Maste [Mon, 22 Oct 2018 18:40:21 +0000 (18:40 +0000)]
Makefile.inc1: clean up stale dependency hacks
Our dependency tracking cannot directly cope with certain source tree
changes, particularly with respect to removing or moving source files or
replacing generated files. We have a collection of ad-hoc workarounds
to handle these cases. As there is a (small) build-time cost inherent
in these workarounds, we do not want to keep them indefinitely. Thus,
remove workarounds from 2017.
Mark Johnston [Mon, 22 Oct 2018 17:04:04 +0000 (17:04 +0000)]
Swap in processes unless there's a global memory shortage.
On NUMA systems, we would not swap in processes unless all domains
had some free pages. This is too conservative in general. Instead,
permit swapins so long as at least one domain has free pages, and add
a kernel stack NUMA policy which ensures that we will try to allocate
kernel stack pages from any domain.
Reported and tested by: pho, Jan Bramkamp <crest@bultmann.eu>
Reviewed by: alc, kib
Discussed with: jeff
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17304
Mark Johnston [Mon, 22 Oct 2018 16:16:42 +0000 (16:16 +0000)]
Don't import 0 into vmem quantum caches.
vmem uses UMA cache zones to implement the quantum cache. Since
uma_zalloc() returns 0 (NULL) to signal an allocation failure, UMA
should not be used to cache resource 0. Fix this by ensuring that 0 is
never cached in UMA in the first place, and by modifying vmem_alloc()
to fall back to a search of the free lists if the cache is depleted,
rather than blocking in qc_import().
Reported by and discussed with: Brett Gutstein <bgutstein@rice.edu>
Reviewed by: alc
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D17483
Gleb Smirnoff [Mon, 22 Oct 2018 15:48:07 +0000 (15:48 +0000)]
If we lost race or were migrated during bucket allocation for the per-CPU
cache, then we put new bucket on generic bucket cache. However, code didn't
honor UMA_ZONE_NOBUCKETCACHE flag, so potentially we could start a cache
on a zone that clearly forbids that. Fix this.
Andriy Gapon [Mon, 22 Oct 2018 15:33:05 +0000 (15:33 +0000)]
nfsrvd_readdirplus: for some errors, do not fail the entire request
Instead, a failing entry is skipped.
This change consist of two logical changes.
A failure to vget or lookup an entry is considered to be a result of a
concurrent removal, which is the only reasonable explanation given that
the filesystem is busied. So, the entry would be silently skipped.
In the case of a failure to get attributes of an entry for an NFSv3
request, the entry would be silently skipped. There can be legitimate
reasons for the failure, but NFSv3 does not provide any means to report
the error, so we have two options: either fail the whole request or
ignore the failed entry. Traditionally, the old NFS server used the
latter option, so the code is reverted to it. Making the whole
directory unreadable because of a single entry seems to be unpractical.
Additionally, some bits of code are slightly re-arranged to account for
the new control flow and to honor style(9).