kevans [Wed, 5 Feb 2020 04:32:49 +0000 (04:32 +0000)]
service(8): set the environment of the "daemon" class before invoking
As mentioned in r357562, this gives the user a single place to configure
environment variables that need to be used for various services -- the
"daemon" class -- for, e.g., configuring a system-wide HTTP proxy.
This is a part of D21481.
Submitted by: Andrew Gierth <andrew_tao173.riddles.org.uk>
kevans [Wed, 5 Feb 2020 04:27:44 +0000 (04:27 +0000)]
init(8): set environment variables from the "daemon" class as well
Specifically, when running /etc/rc. This allows one to specify via
login.conf(5) an environment that should be used when running services to
ease, e.g., setting up env vars for an HTTP proxy consistently across cron
and services alike.
Future changes will extend cron(8)/service(8) to use environment vars
specified in login.conf(5) as well to promote a more cohesive experience.
This is a part of D21481.
Submitted by: Andrew Gierth <andrew_tao173.riddles.org.uk>
rlibby [Tue, 4 Feb 2020 22:40:45 +0000 (22:40 +0000)]
uma: multipage chicken switch
Add a switch to allow disabling multipage slabs, in order to facilitate
measuring memory usage and performance effects. The tunable
vm.debug.uma_multipage_slabs defaults to 1 and can be set to 0 to
disable. The name may change soon.
rlibby [Tue, 4 Feb 2020 22:40:34 +0000 (22:40 +0000)]
uma: grow slabs to enforce minimum memory efficiency
Memory efficiency can be poor with awkward item sizes (e.g. 1/2 or 1
page size + epsilon). In order to achieve a minimum memory efficiency,
select a slab size with a potentially larger number of pages if it
yields a lower portion of waste.
This may mean using page_alloc instead of uma_small_alloc, which could
be more costly.
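The slab-growing idea above can be sketched in a few lines of C. This is a minimal illustration, not the UMA code: the page size, the 8-page cap, the 90% target, and the function names are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE  4096u	/* assumed page size */
#define SKETCH_MAX_PAGES  8u	/* hypothetical cap on slab size */
#define SKETCH_MIN_EFF    90u	/* hypothetical efficiency target, percent */

/* Percentage of a slab of `pages` pages that is occupied by whole items. */
static unsigned
slab_efficiency_pct(size_t item_size, unsigned pages)
{
	size_t slab, items;

	assert(item_size > 0 && pages > 0);
	slab = (size_t)pages * SKETCH_PAGE_SIZE;
	items = slab / item_size;
	return ((unsigned)((items * item_size * 100) / slab));
}

/*
 * Grow the slab one page at a time until the target is met.  If no size
 * up to the cap meets it, return the most efficient size seen.  Returns 0
 * if the item does not fit in the largest slab considered.
 */
static unsigned
choose_slab_pages(size_t item_size)
{
	unsigned pages, eff, best_pages = 0, best_eff = 0;

	assert(item_size > 0);
	pages = (unsigned)((item_size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE);
	for (; pages <= SKETCH_MAX_PAGES; pages++) {
		eff = slab_efficiency_pct(item_size, pages);
		if (eff > best_eff) {
			best_eff = eff;
			best_pages = pages;
		}
		if (eff >= SKETCH_MIN_EFF)
			break;
	}
	return (best_pages);
}
```

With these assumed constants, an item of half a page plus one byte (2049 bytes) wastes about half of a one-page slab, and the loop settles on a five-page slab that holds nine items.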
rlibby [Tue, 4 Feb 2020 22:40:23 +0000 (22:40 +0000)]
uma: convert mbuf_jumbo_alloc to UMA_ZONE_CONTIG & tag others
Remove mbuf_jumbo_alloc and let large mbuf zones use the new uma default
contig allocator (a copy of mbuf_jumbo_alloc). Tag other zones which
require contiguous objects, even if they don't use the new default
contig allocator, so that uma knows about their constraints.
rlibby [Tue, 4 Feb 2020 22:39:58 +0000 (22:39 +0000)]
uma: pcpu_page_free needs to startup_free pages from startup_alloc
After r357392, it is apparent that we do have some early-boot PCPU
zones. Make it so we can safely free pages from them if they are
actually used during early boot.
kevans [Tue, 4 Feb 2020 21:43:39 +0000 (21:43 +0000)]
ObsoleteFiles: Update after simple_httpd removal
There should have perhaps been an entry in OptionalObsoleteFiles for it
before, but alas; let it be removed now with `make delete-old` if it was
installed.
kevans [Tue, 4 Feb 2020 21:27:39 +0000 (21:27 +0000)]
Remove simple_httpd
simple_httpd was granted a reprieve from the picobsd removal based on having
some reported user; it turns out this user isn't actually using the version
in base and merging their changes would be difficult at this point, so the
version in base will simply continue to rot. Retire it now; it may make a
comeback in ports with the improved version.
No notice issued because its current visibility has only been for ~3
months, and a notice has been previously issued about picobsd removal.
markj [Tue, 4 Feb 2020 21:17:59 +0000 (21:17 +0000)]
size: Avoid returning a stack pointer from xlatetom().
The callers only check whether the returned pointer is non-NULL, so this
was harmless in practice, but change the return value to guard against
the issue.
CID: 1411597
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
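The pattern behind this kind of fix can be sketched as follows. This is not the size(1) source: `struct hdr_sketch` and `decode_hdr` are hypothetical names. The point is that when callers only need success or failure, a boolean return with caller-owned storage avoids handing out the address of a local buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical decoded header, for illustration only. */
struct hdr_sketch {
	unsigned int magic;
	unsigned int size;
};

/*
 * Decode a header from raw bytes into storage owned by the caller and
 * report success as a bool.  Nothing that lives on this function's stack
 * escapes through the return value.
 */
static bool
decode_hdr(const unsigned char *buf, size_t len, struct hdr_sketch *out)
{
	assert(buf != NULL && out != NULL);
	if (len < sizeof(*out))
		return (false);
	memcpy(out, buf, sizeof(*out));
	return (true);
}
```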
markj [Tue, 4 Feb 2020 21:16:56 +0000 (21:16 +0000)]
elfcopy: Avoid leaking dst's fd when we fail to copy a file.
We should really create the output file in the same directory as the
destination file so that rename() works. This will be done in a future
change as part of some work to run in capability mode.
CID: 1262523
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
bdragon [Tue, 4 Feb 2020 20:40:45 +0000 (20:40 +0000)]
[PowerPC] Fix VSX context handling
In r356767, memcpy/memmove/bcopy optimizations were added to libc to
improve performance.
This exposed an existing kernel issue in VSX handling. The PSL_VSX flag was
not being excluded from the psl_userstatic set, which meant that any thread
that used these and then called swapcontext(3) would get an EINVAL error.
Fixing this exposed a second issue - in r344123, the FPU was being forced
off in set_mcontext(). However, this was neglecting to ensure VSX was turned
off at the same time.
While here, add some code comments to explain what's going on.
jeff [Tue, 4 Feb 2020 20:33:01 +0000 (20:33 +0000)]
Add an explicit busy state for free pages. This improves behavior with
potential bugs that access freed pages as well as providing a path
towards lockless page lookup.
jeff [Tue, 4 Feb 2020 20:28:06 +0000 (20:28 +0000)]
Use literal bucket sizes for smaller buckets rather than the rounding
system. Small bucket sizes already pack well even if they are an odd
number of words. This prevents any potential new instances of the
problem fixed in r357463 as well as making the system easier to
understand.
Revert r357201: downgrade sqlite3 from sqlite3-3.31.0 (3310000) to
sqlite3-3.30.1 (3300100), as it causes svnlite segfaults on PowerPC,
resulting in corruption.
Reported by: Mark Millard <marklmi at yahoo.com>
Francis Little <oggy at farscape.co.uk>
kib [Tue, 4 Feb 2020 19:05:58 +0000 (19:05 +0000)]
tmpfs: add nomtime mount option,
which disables tracking mtime updates due to writes through the shared
mapped areas backed by tmpfs files. This removes the periodic scans that
downgrade rw mapped pages to ro to note the writes.
kib [Tue, 4 Feb 2020 19:03:37 +0000 (19:03 +0000)]
Enable vm_object_mightbedirty() and vm_object_page_clean() for swap
objects backing tmpfs vnodes data.
The clean scan is limited to only remove write permissions from the
mapped pages of the objects. This fixes the issue that tmpfs vnode
mtime is not updated from writes to the mmaped area after the initial
page-in.
Noted by: mjg
Reviewed by: markj
Discussed with: jeff
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D23432
kevans [Tue, 4 Feb 2020 18:45:28 +0000 (18:45 +0000)]
psm: use make_dev_s instead of make_dev
This most importantly reduces duplication, but it also removes any potential
race with usage of dev->si_drv1 since it's now set prior to the device being
constructed enough to be accessible.
mav [Tue, 4 Feb 2020 15:53:51 +0000 (15:53 +0000)]
A few micro-optimizations to the dbuf layer.
Move db_link into the same cache line as db_blkid and db_level.
This significantly reduces avl_add() time in dbuf_create() on
systems with large RAM and a huge number of dbufs per dnode.
Avoid a few accesses to dbuf_caches[].size, which is highly congested
under high IOPS and never stays in cache for long. Use the local
value we receive from zfs_refcount_add_many() anyway.
Remove the cache_size_bytes_max bump from dbuf_evict_one(). I don't see
a point in doing it on dbuf eviction after we have done it on insertion in
dbuf_rele_and_unlock().
Reviewed by: mahrens, Brian Behlendorf
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
andrew [Tue, 4 Feb 2020 12:33:00 +0000 (12:33 +0000)]
Print useful debug data on unhandled kernel fault on arm64
When panicking because of an unhandled data abort from the kernel, it is
useful to know the register state and faulting address to aid debugging.
Print these registers before calling panic.
dchagin [Tue, 4 Feb 2020 05:27:05 +0000 (05:27 +0000)]
Fix clock_gettime() and clock_getres() for cpu clocks:
- handle the CLOCK_{PROCESS,THREAD}_CPUTIME_ID specified directly;
- fix thread id calculation as in the Linuxulator we should
convert the user supplied thread id to struct thread * by linux_tdfind();
- fix CPUCLOCK_SCHED case by using kern_{process,thread}_cputime()
directly as native get_cputime() used by kern_clock_gettime() uses
native tdfind()/pfind() to find process/thread.
dchagin [Tue, 4 Feb 2020 05:23:34 +0000 (05:23 +0000)]
linux_to_native_clockid() properly initializes the nwhich variable (or returns an error),
so don't initialize nwhich in its declaration and remove the stale comment from r161304.
jeff [Tue, 4 Feb 2020 02:41:24 +0000 (02:41 +0000)]
Use STAILQ instead of TAILQ for bucket lists. We only need FIFO behavior
and this is more space efficient.
Stop queueing recently used buckets to the head of the list. If the bucket
goes to a different processor the cache coherency will be more expensive.
We already try to encourage cache-hot behavior in the per-cpu layer.
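The space argument for STAILQ can be seen with the queue(3) macros directly. The sketch below uses hypothetical `bucket_sketch` structures rather than the real UMA bucket: a STAILQ entry carries one link pointer where a TAILQ entry carries two, and FIFO use needs only tail insertion and head removal.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical bucket with a singly-linked tail queue entry. */
struct bucket_sketch {
	int id;
	STAILQ_ENTRY(bucket_sketch) link;	/* one pointer */
};
STAILQ_HEAD(bucket_list, bucket_sketch);

/* Same payload with a doubly-linked entry, for size comparison only. */
struct tbucket_sketch {
	int id;
	TAILQ_ENTRY(tbucket_sketch) link;	/* two pointers */
};

/* FIFO enqueue: always append at the tail. */
static void
bucket_put(struct bucket_list *l, struct bucket_sketch *b)
{
	assert(l != NULL && b != NULL);
	STAILQ_INSERT_TAIL(l, b, link);
}

/* FIFO dequeue: take from the head, or return NULL if empty. */
static struct bucket_sketch *
bucket_get(struct bucket_list *l)
{
	struct bucket_sketch *b;

	assert(l != NULL);
	b = STAILQ_FIRST(l);
	if (b != NULL)
		STAILQ_REMOVE_HEAD(l, link);
	return (b);
}
```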
arichardson [Tue, 4 Feb 2020 00:06:16 +0000 (00:06 +0000)]
Set the LMA of the riscv kernel to the OpenSBI jump target by default
This allows us to boot FreeBSD RISCV on QEMU using the -kernel command line
option. When using that option, QEMU maps the kernel ELF file to the
addresses specified in the LMAs in the program headers.
Since version 4.2 QEMU ships with OpenSBI fw_jump by default so this allows
booting FreeBSD using the following command line:
qemu-system-riscv64 -bios default -kernel /.../boot/kernel/kernel -nographic -M virt
Without this change the -kernel option cannot be used since the LMAs start
at address zero and QEMU already maps a ROM to these low physical addresses.
For targets that require a different kernel LMA the make variable
KERNEL_LMA can be overridden in the config file. For example, adding
`makeoptions KERNEL_LMA=0xc0200000` will create an ELF file that will be
loaded at 0xc0200000.
Before:
There are 4 program headers, starting at offset 64
ext_arg2 is the only item in the third cacheline in an mbuf and could be
cold by the time rxb_free runs. Put the information needed by rxb_free
in the same line as the refcount, which is very likely to be hot given
that rxb_free runs when the refcount is decremented and reaches 0.
0mp [Mon, 3 Feb 2020 23:40:27 +0000 (23:40 +0000)]
ports.7: Update examples with install-missing-packages
The ports framework recently grew support for installing dependencies with
a dedicated target called "install-missing-packages". Let's retire the
carefully constructed one-liner that was used for getting dependencies so
far and use the official ports target instead.
mjg [Mon, 3 Feb 2020 22:32:49 +0000 (22:32 +0000)]
fd: streamline fget_unlocked
clang has the unfortunate property of paying little attention to prediction
hints when faced with a loop spanning the majority of the routine.
In particular fget_unlocked has an unlikely corner case where it starts almost
from scratch. Faced with this clang generates a maze of taken jumps, whereas
gcc produces jump-free code (in the expected case).
Work around the problem by providing a variant which only tries once and
resorts to calling the original code if anything goes wrong.
While here, note that the 'seq' parameter is almost never passed, so the
few callers that pass it are redirected to call the original code directly.
mjg [Mon, 3 Feb 2020 22:26:00 +0000 (22:26 +0000)]
ktrace: provide ktrstat_error
This eliminates a branch from its consumers, trading it for an extra call
if ktrace is enabled for curthread. Given that this is almost never true,
the tradeoff is worth it.
glebius [Mon, 3 Feb 2020 20:48:57 +0000 (20:48 +0000)]
A couple of protocol drain routines (frag6_drain and sctp_drain) may send
packets, which is unexpected behaviour for a memory reclamation routine.
Regardless, we need to enter the network epoch to do that.
markj [Mon, 3 Feb 2020 19:29:02 +0000 (19:29 +0000)]
Disable the smallest UMA bucket size on 32-bit platforms.
With r357314, sizeof(struct uma_bucket) grew to 16 bytes on 32-bit
platforms, so BUCKET_SIZE(4) is 0. This resulted in the creation of a
bucket zone for buckets with zero capacity. A more general fix is
planned, but for now this bandaid allows 32-bit platforms to boot again.
PR: 243837
Discussed with: jeff
Reported by: pho, Jenkins via lwhsu
Tested by: pho
Sponsored by: The FreeBSD Foundation
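The arithmetic behind the zero-capacity bucket can be worked through in C. The formula below is an assumption about the general shape of BUCKET_SIZE(n) (an allocation of n pointer-sized words minus the bucket header, expressed in pointer slots); `bucket_capacity` and its parameters are hypothetical names, and only the 16-byte header on 32-bit platforms comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Pointer slots left in an allocation of n_words pointer-sized words once
 * a header of hdr_bytes has been subtracted.  Assumed shape of the macro,
 * with sizes passed in so that a 32-bit layout can be modeled on any host.
 */
static size_t
bucket_capacity(size_t n_words, size_t ptr_bytes, size_t hdr_bytes)
{
	size_t total;

	assert(ptr_bytes > 0);
	total = n_words * ptr_bytes;
	if (total <= hdr_bytes)
		return (0);
	return ((total - hdr_bytes) / ptr_bytes);
}
```

With 4-byte pointers and a 16-byte header, `bucket_capacity(4, 4, 16)` is (16 - 16) / 4 = 0, which matches the zero-capacity zone described above, while the next size up, `bucket_capacity(8, 4, 16)`, yields 4 slots.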
kevans [Mon, 3 Feb 2020 18:59:07 +0000 (18:59 +0000)]
namei: preserve errors from fget_cap_locked
Most notably, we want to make sure we don't clobber any capabilities-related
errors. This is a regression from r357412 (O_SEARCH) that was picked up by
the capsicum tests.
markj [Mon, 3 Feb 2020 18:23:50 +0000 (18:23 +0000)]
Dynamically select LSE-based atomic(9)s on arm64.
Once all CPUs are online, determine if they all support LSE atomics and
set lse_supported to indicate this. For now the atomic(9)
implementations are still always inlined, though it would be preferable
to create out-of-line functions to avoid text bloat. This was not done
here since big.LITTLE systems exist in which some CPUs implement LSE
while others do not, and ifunc resolution must occur well before this
scenario can be detected. It does seem unlikely that FreeBSD will
ever run on such platforms, however, so converting atomic(9) to use
ifuncs is probably a good next step.
Add a LSE_ATOMICS arm64 kernel configuration option to unconditionally
select LSE-based atomic(9) implementations when the target system is
known.
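The selection rule described above can be sketched as plain C. This is only a model of the decision, not the kernel code: the per-CPU feature array, `all_cpus_have_lse`, `select_impl`, and the enum are hypothetical, and the `forced_by_config` flag stands in for the LSE_ATOMICS option.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum atomic_impl_sketch { IMPL_LLSC, IMPL_LSE };

/* True only if every online CPU reports LSE support. */
static bool
all_cpus_have_lse(const bool *cpu_has_lse, size_t ncpus)
{
	size_t i;

	assert(cpu_has_lse != NULL && ncpus > 0);
	for (i = 0; i < ncpus; i++)
		if (!cpu_has_lse[i])
			return (false);
	return (true);
}

/*
 * Pick the implementation: LSE if detection succeeded on all CPUs, or if
 * the build unconditionally requested it; LL/SC otherwise.
 */
static enum atomic_impl_sketch
select_impl(bool lse_supported, bool forced_by_config)
{
	return ((lse_supported || forced_by_config) ? IMPL_LSE : IMPL_LLSC);
}
```

A mixed system, where only some cores report the feature, falls back to LL/SC under this rule unless the configuration forces LSE.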
markj [Mon, 3 Feb 2020 18:23:35 +0000 (18:23 +0000)]
Add LSE-based atomic(9) implementations.
These make use of the cas*, ld* and swp instructions added in ARMv8.1.
Testing shows them to be significantly more performant than LL/SC-based
implementations.
No functional change here since the wrappers still unconditionally
select the _llsc variants.
markj [Mon, 3 Feb 2020 18:23:14 +0000 (18:23 +0000)]
Add wrappers for arm64 atomics.
Add a _llsc suffix for the existing LL/SC-based implementations and add
trivial wrappers. This is in preparation for supporting LSE-based
atomic(9) implementations.
markj [Mon, 3 Feb 2020 18:22:59 +0000 (18:22 +0000)]
Provide a single implementation for each of the arm64 atomic(9) ops.
Parameterize the macros by type width as well as acq/rel semantics.
This makes modifying the implementations much less tedious and
error-prone and makes it easier to support alternate LSE-based
implementations. No functional change intended.
markj [Mon, 3 Feb 2020 16:41:40 +0000 (16:41 +0000)]
addr2line: Cache CU DIEs upon a successful address lookup.
Previously, addr2line would sequentially search all CUs for each input
address. For some uses, notably syzkaller's code coverage map generator,
this was extremely slow. Add a CU cache into which entries are added
following a successful lookup, and search the cache before falling back
to a scan. When translating a large number of addresses this yields
slightly better performance than GNU addr2line.
Garbage-collect an unused hash table which appears to have been intended
for the same purpose. A hash table doesn't seem particularly suitable
since each CU spans a range of addresses.
Submitted by: Tiger Gao <tig@freebsdfoundation.org>
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23418
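The cache-then-scan lookup described above can be sketched as follows. This is an illustration under stated assumptions, not the addr2line source: `struct cu_sketch`, the fixed-size pointer array used as the cache, and the `nscans` counter are all hypothetical, and a real implementation would likely use an ordered structure keyed on address ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compilation unit covering the range [lopc, hipc). */
struct cu_sketch {
	uint64_t lopc;
	uint64_t hipc;
	const char *name;
};

#define CU_CACHE_MAX 16
static const struct cu_sketch *cu_cache[CU_CACHE_MAX];
static size_t ncached;
static size_t nscans;	/* number of fallback scans, for illustration */

/* Search the cache first; fall back to a full scan and cache the hit. */
static const struct cu_sketch *
cu_lookup(const struct cu_sketch *cus, size_t ncus, uint64_t addr)
{
	size_t i;

	assert(cus != NULL);
	for (i = 0; i < ncached; i++)
		if (addr >= cu_cache[i]->lopc && addr < cu_cache[i]->hipc)
			return (cu_cache[i]);
	nscans++;
	for (i = 0; i < ncus; i++) {
		if (addr >= cus[i].lopc && addr < cus[i].hipc) {
			if (ncached < CU_CACHE_MAX)
				cu_cache[ncached++] = &cus[i];
			return (&cus[i]);
		}
	}
	return (NULL);
}
```

Because each entry covers a range rather than a single key, the cache is probed with a range test, which is also why a plain hash table keyed on the address is a poor fit.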
0mp [Mon, 3 Feb 2020 15:22:46 +0000 (15:22 +0000)]
units(1): Refactor the manual page and update usage information
Changes to units.1:
- Change the description to a more descriptive "conversion calculator".
- Sort options.
- Split the description into sections to make it easier to navigate the
manual page.
- Improve the description of various options.
- Document the default value of the output format.
- Use more mdoc macros for better readability.
- Document the behavior of the PATH environment variable.
- Improve examples.
- Add sections: EXIT STATUS, DIAGNOSTICS, and HISTORY.
- Document that units(1) cannot convert negative values and it handles long
unit lists poorly.
- Update the documentation of the -V flag to match the implementation.
units(1) prints its version and the units data file instead of its
version and usage information.
This commit does not attempt to change the current behavior of units(1).
What's left to do is probably defining a better versioning (at the moment
units(1) always reports "FreeBSD units" as its version) and changing the
behavior of the -V flag to only print version.
andrew [Mon, 3 Feb 2020 14:38:19 +0000 (14:38 +0000)]
Remove the GICv3 ITS irq and replace it with an ID
In r357324 most of the use of gi_irq was moved to gi_lpi. Complete this
with the last few places we need the IRQ value and create gi_id for the
per-device value we need.
mjg [Mon, 3 Feb 2020 14:28:31 +0000 (14:28 +0000)]
fd: fix f_count acquire in fget_unlocked
The code was using a hand-rolled fcmpset loop, while in other places the same
count is manipulated with the refcount API.
This turned from a stylistic issue into a bug after the API got extended
to support flags. As a result the hand-rolled loop could bump the count high
enough to set the bit flag. Another bump + refcount_release would then free
the file prematurely.