markj [Wed, 5 Feb 2020 19:08:45 +0000 (19:08 +0000)]
Stop compiling dtrace modules with -DSMP.
I believe this is left over from when dtrace was being ported and
developed out-of-tree. Now it just ensures that dtrace.ko and a non-SMP
kernel have incompatible KBIs.
markj [Wed, 5 Feb 2020 19:08:21 +0000 (19:08 +0000)]
Define MAXCPU consistently between the kernel and KLDs.
This reverts r177661. The change is no longer very useful since
out-of-tree KLDs will be built to target SMP kernels anyway. Moveover
it breaks the KBI in !SMP builds since cpuset_t's layout depends on the
value of MAXCPU, and several kernel interfaces, notably
smp_rendezvous_cpus(), take a cpuset_t as a parameter.
PR: 243711
Reviewed by: jhb, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23512
kevans [Wed, 5 Feb 2020 17:21:36 +0000 (17:21 +0000)]
O_SEARCH test: drop O_SEARCH|O_RDWR local diff
In FreeBSD's O_SEARCH implementation, O_SEARCH in conjunction with O_RDWR or
O_WRONLY is explicitly rejected. In this case, O_RDWR was not necessary
anyways as the file will get created with or without it.
This was submitted upstream as misc/54940 and committed in rev 1.8 of the
file.
emaste [Wed, 5 Feb 2020 16:53:02 +0000 (16:53 +0000)]
linuxulator: implement sendfile
Submitted by: Bora Özarslan <borako.ozarslan@gmail.com>
Submitted by: Yang Wang <2333@outlook.jp>
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19917
markj [Wed, 5 Feb 2020 16:09:21 +0000 (16:09 +0000)]
Avoid releasing object PIP in vn_sendfile() if no pages were grabbed.
sendfile(2) optionally takes a set of headers that get prepended to the
file data. If the request length is less than that of the headers,
sendfile may not allocate an sfio structure, in which case its pointer
is null and we should be careful not to dereference. This was
introduced in r356902.
Reported by: syzkaller
Sponsored by: The FreeBSD Foundation
kevans [Wed, 5 Feb 2020 14:00:27 +0000 (14:00 +0000)]
wc(1): account for possibility of file == NULL
file could reasonably be NULL here if we we're using stdin. Albeit less
likely in normal usage, one could actually hit either of these warnings on
stdin.
luporl [Wed, 5 Feb 2020 11:34:10 +0000 (11:34 +0000)]
Add SYSCTL to get KERNBASE and relocated KERNBASE
This change adds 2 new SYSCTLs, to retrieve the original and relocated KERNBASE
values. This provides an easy, architecture independent way to calculate the
running kernel displacement (current/load address minus original base address).
The initial goal for this change is to add a new libkvm function that returns
the kernel displacement, both for live kernels and crashdumps. This would in
turn be used by kgdb to find out how to relocate kernel symbols (if needed).
kevans [Wed, 5 Feb 2020 04:32:49 +0000 (04:32 +0000)]
service(8): set the environment of the "daemon" class before invoking
As mentioned in r357562, this gives the user a single place to configure
environment variables that need to be used for various services -- the
"daemon" class -- for, e.g., configuring a system-wide HTTP proxy.
This is a part of D21481.
Submitted by: Andrew Gierth <andrew_tao173.riddles.org.uk>
kevans [Wed, 5 Feb 2020 04:27:44 +0000 (04:27 +0000)]
init(8): set environment variables from the "daemon" class as well
Specifically, when running /etc/rc. This allows one to specify via
login.conf(5) an environment that should be used when running services to
ease, e.g., setting up env vars for an HTTP proxy consistently across cron
and services alike.
Future changes will extend cron(8)/service(8) to use environment vars
pecified in login.conf(5) as well to promote a more cohesive experience.
This is a part of D21481.
Submitted by: Andrew Gierth <andrew_tao173.riddles.org.uk>
rlibby [Tue, 4 Feb 2020 22:40:45 +0000 (22:40 +0000)]
uma: multipage chicken switch
Add a switch to allow disabling multipage slabs, in order to facilitate
measuring memory usage and performance effects. The tunable
vm.debug.uma_multipage_slabs defaults to 1 and can be set to 0 to
disable. The name may change soon.
rlibby [Tue, 4 Feb 2020 22:40:34 +0000 (22:40 +0000)]
uma: grow slabs to enforce minimum memory efficiency
Memory efficiency can be poor with awkward item sizes (e.g. 1/2 or 1
page size + epsilon). In order to achieve a minimum memory efficiency,
select a slab size with a potentially larger number of pages if it
yields a lower portion of waste.
This may mean using page_alloc instead of uma_small_alloc, which could
be more costly.
rlibby [Tue, 4 Feb 2020 22:40:23 +0000 (22:40 +0000)]
uma: convert mbuf_jumbo_alloc to UMA_ZONE_CONTIG & tag others
Remove mbuf_jumbo_alloc and let large mbuf zones use the new uma default
contig allocator (a copy of mbuf_jumbo_alloc). Tag other zones which
require contiguous objects, even if they don't use the new default
contig allocator, so that uma knows about their constraints.
rlibby [Tue, 4 Feb 2020 22:39:58 +0000 (22:39 +0000)]
uma: pcpu_page_free needs to startup_free pages from startup_alloc
After r357392, it is apparent that we do have some early-boot PCPU
zones. Make it so we can safely free pages from them if they are
actually used during early boot.
kevans [Tue, 4 Feb 2020 21:43:39 +0000 (21:43 +0000)]
ObsoleteFiles: Update after simple_httpd removal
There should have perhaps been an entry in OptionalObsoleteFiles for it
before, but alas- let it be removed now with `make delete-old` if it was
installed.
kevans [Tue, 4 Feb 2020 21:27:39 +0000 (21:27 +0000)]
Remove simple_httpd
simple_httpd was granted a reprieve from the picobsd removal based on having
some reported user; it turns out this user isn't actually using the version
in base and merging their changes would be difficult at this point, so the
version in base will simply continue to rot. Retire it now, it may make a
comeback to ports with the improved version.
No notice issued because its current visibility has only been for ~3
months, and a notice has been previously issued about picobsd removal.
markj [Tue, 4 Feb 2020 21:17:59 +0000 (21:17 +0000)]
size: Avoid returning a stack pointer from xlatetom().
The callers only check whether the returned pointer is non-NULL, so this
was harmless in practice, but change the return value to guard against
the issue.
CID: 1411597
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
markj [Tue, 4 Feb 2020 21:16:56 +0000 (21:16 +0000)]
elfcopy: Avoid leaking dst's fd when we fail to copy a file.
We should really create the output file in the same directory as the
destination file so that rename() works. This will be done in a future
change as part of some work to run in capability mode.
CID: 1262523
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
bdragon [Tue, 4 Feb 2020 20:40:45 +0000 (20:40 +0000)]
[PowerPC] Fix VSX context handling
In r356767, memcpy/memmove/bcopy optimizations were added to libc to
improve performance.
This exposed an existing kernel issue in VSX handling. The PSL_VSX flag was
not being excluded from the psl_userstatic set, which meant that any thread
that used these and then called swapcontext(3) would get an EINVAL error.
Fixing this exposed a second issue - in r344123, the FPU was being forced
off in set_mcontext(). However, this was neglecting to ensure VSX was turned
off at the same time.
While here, add some code comments to explain what's going on.
jeff [Tue, 4 Feb 2020 20:33:01 +0000 (20:33 +0000)]
Add an explicit busy state for free pages. This improves behavior with
potential bugs that access freed pages as well as providing a path
towards lockless page lookup.
jeff [Tue, 4 Feb 2020 20:28:06 +0000 (20:28 +0000)]
Use literal bucket sizes for smaller buckets rather than the rounding
system. Small bucket sizes already pack well even if they are an odd
number of words. This prevents any potential new instances of the
problem fixed in r357463 as well as making the system easier to
understand.
Revert r357201: downgrade sqlite3 from sqlite3-3.31.0 (3310000) to
sqlite3-3.30.1 (3300100), as it causes svnlite segfaults on PowerPC,
resulting in corruption.
Reported by: Mark Millard <marklmi at yahoo.com>
Francis Little <oggy at farscape.co.uk>
kib [Tue, 4 Feb 2020 19:05:58 +0000 (19:05 +0000)]
tmpfs: add nomtime mount option,
which disables tracking mtime updates due to writes through the shared
mapped areas backed by tmpfs files. This removes periodic scans which
downgrades rw mapped pages to ro to note the writes.
kib [Tue, 4 Feb 2020 19:03:37 +0000 (19:03 +0000)]
Enable vm_object_mightbedirty() and vm_object_page_clean() for swap
objects backing tmpfs vnodes data.
The clean scan is limited to only remove write permissions from the
mapped pages of the objects. This fixes the issue that tmpfs vnode
mtime is not updated from writes to the mmaped area after the initial
page-in.
Noted by: mjg
Reviewed by: markj
Discussed with: jeff
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D23432
kevans [Tue, 4 Feb 2020 18:45:28 +0000 (18:45 +0000)]
psm: use make_dev_s instead of make_dev
This most importantly reduces duplication, but it also removes any potential
race with usage of dev->si_drv1 since it's now set prior to the device being
constructed enough to be accessible.
mav [Tue, 4 Feb 2020 15:53:51 +0000 (15:53 +0000)]
Few microoptimizations to dbuf layer.
Move db_link into the same cache line as db_blkid and db_level.
It allows significantly reduce avl_add() time in dbuf_create() on
systems with large RAM and huge number of dbufs per dnode.
Avoid few accesses to dbuf_caches[].size, which is highly congested
under high IOPS and never stays in cache for a long time. Use local
value we are receiving from zfs_refcount_add_many() any way.
Remove cache_size_bytes_max bump from dbuf_evict_one(). I don't see
a point to do it on dbuf eviction after we done it on insertion in
dbuf_rele_and_unlock().
Reviewed by: mahrens, Brian Behlendorf
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
andrew [Tue, 4 Feb 2020 12:33:00 +0000 (12:33 +0000)]
Print useful debug data on unhandled kernel fault on arm64
When panicing because of an unhandled data abort from the kernel it is
useful to know the register state and faulting address to aid debugging.
Print these registers before calling panic.
dchagin [Tue, 4 Feb 2020 05:27:05 +0000 (05:27 +0000)]
Fix clock_gettime() and clock_getres() for cpu clocks:
- handle the CLOCK_{PROCESS,THREAD}_CPUTIME_ID specified directly;
- fix thread id calculation as in the Linuxulator we should
convert the user supplied thread id to struct thread * by linux_tdfind();
- fix CPUCLOCK_SCHED case by using kern_{process,thread}_cputime()
directly as native get_cputime() used by kern_clock_gettime() uses
native tdfind()/pfind() to find proccess/thread.
dchagin [Tue, 4 Feb 2020 05:23:34 +0000 (05:23 +0000)]
linux_to_native_clockid() properly initializes nwhich variable (or return error),
so don't initialize nwhich in declaration and remove stale comment from r161304.
jeff [Tue, 4 Feb 2020 02:41:24 +0000 (02:41 +0000)]
Use STAILQ instead of TAILQ for bucket lists. We only need FIFO behavior
and this is more space efficient.
Stop queueing recently used buckets to the head of the list. If the bucket
goes to a different processor the cache coherency will be more expensive.
We already try to encourage cache-hot behavior in the per-cpu layer.
arichardson [Tue, 4 Feb 2020 00:06:16 +0000 (00:06 +0000)]
Set the LMA of the riscv kernel to the OpenSBI jump target by default
This allows us to boot FreeBSD RISCV on QEMU using the -kernel command line
options. When using that option, QEMU maps the kernel ELF file to the
addresses specified in the LMAs in the program headers.
Since version 4.2 QEMU ships with OpenSBI fw_jump by default so this allows
booting FreeBSD using the following command line:
qemu-system-riscv64 -bios default -kernel /.../boot/kernel/kernel -nographic -M virt
Without this change the -kernel option cannot be used since the LMAs start
at address zero and QEMU already maps a ROM to these low physical addresses.
For targets that require a different kernel LMA the make variable
KERNEL_LMA can be overwritten in the config file. For example, adding
`makeoptions KERNEL_LMA=0xc0200000` will create an ELF file that will be
loaded at 0xc0200000.
Before:
There are 4 program headers, starting at offset 64
ext_arg2 is the only item in the third cacheline in an mbuf and could be
cold by the time rxb_free runs. Put the information needed by rxb_free
in the same line as the refcount, which is very likely to be hot given
that rxb_free runs when the refcount is decremented and reaches 0.
0mp [Mon, 3 Feb 2020 23:40:27 +0000 (23:40 +0000)]
ports.7: Update examples with install-missing-packages
The ports framework recently grew support for installing dependencies with
a dedicated target called "install-missing-packages". Let's retire the
carefully constructed one-liner that was used for getting dependencies so
far and use the official ports target instead.
mjg [Mon, 3 Feb 2020 22:32:49 +0000 (22:32 +0000)]
fd: streamline fget_unlocked
clang has the unfortunate property of paying little attention to prediction
hints when faced with a loop spanning the majority of the rotuine.
In particular fget_unlocked has an unlikely corner case where it starts almost
from scratch. Faced with this clang generates a maze of taken jumps, whereas
gcc produces jump-free code (in the expected case).
Work around the problem by providing a variant which only tries once and
resorts to calling the original code if anything goes wrong.
While here note that the 'seq' parameter is almost never passed, thus the
seldom users are redirected to call it directly.
mjg [Mon, 3 Feb 2020 22:26:00 +0000 (22:26 +0000)]
ktrace: provide ktrstat_error
This eliminates a branch from its consumers trading it for an extra call
if ktrace is enabled for curthread. Given that this is almost never true,
the tradeoff is worth it.
glebius [Mon, 3 Feb 2020 20:48:57 +0000 (20:48 +0000)]
Couple protocol drain routines (frag6_drain and sctp_drain) may send
packets. An unexpected behaviour for memory reclamation routine.
Anyway, we need enter the network epoch for doing that.