FreeBSD/FreeBSD.git
mav [Sun, 15 Dec 2019 23:28:53 +0000 (23:28 +0000)]
Properly detect ATA sanitize errors.

It seems I did not read the specifications carefully enough.  There are devices
that do not set the successful completion bit, causing the previous code to
report a false error.

MFC after: 1 week

alc [Sun, 15 Dec 2019 22:41:57 +0000 (22:41 +0000)]
Apply a small optimization to pmap_remove_l3_range().  Specifically, hoist a
PHYS_TO_VM_PAGE() operation that always returns the same vm_page_t out of
the loop.  (Since arm64 is configured as VM_PHYSSEG_SPARSE, the
implementation of PHYS_TO_VM_PAGE() is more costly than that of
VM_PHYSSEG_DENSE platforms, like amd64.)

MFC after: 1 week
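
The hoisting described in the commit above can be illustrated with a minimal sketch. Every name here (struct page, expensive_lookup(), drop_refs_naive(), drop_refs_hoisted()) is hypothetical and stands in for the real pmap code, which is not reproduced; the only point is that a lookup whose result does not change inside the loop is performed once instead of once per iteration.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the FreeBSD tree. */
struct page {
    int ref_count;
};

static struct page table_page;  /* the one page every lookup resolves to */
static int lookup_calls;        /* counts how often the costly lookup runs */

/* Models a costly physical-address-to-page lookup. */
static struct page *
expensive_lookup(unsigned long pa)
{
    (void)pa;
    lookup_calls++;
    return (&table_page);
}

/* Before: the loop-invariant lookup is repeated on every iteration. */
void
drop_refs_naive(unsigned long table_pa, size_t n)
{
    for (size_t i = 0; i < n; i++)
        expensive_lookup(table_pa)->ref_count--;
}

/* After: the lookup is hoisted out of the loop and performed once. */
void
drop_refs_hoisted(unsigned long table_pa, size_t n)
{
    struct page *m = expensive_lookup(table_pa);

    for (size_t i = 0; i < n; i++)
        m->ref_count--;
}
```

Both functions leave the reference count in the same state; only the number of lookups differs, which matters most when the lookup itself is costly.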

tsoome [Sun, 15 Dec 2019 21:52:40 +0000 (21:52 +0000)]
loader: rewrite zfs vdev initialization

In some cases the pool discovery will get stuck in an infinite loop while
setting up the vdev children.

To fix this, we split the vdev setup into two parts: first we create vdevs
based on the configuration we get from the pool label, then we process the
pool config from the MOS and update the pool config if needed.

Testing done: confirm previously hung loader is not hung any more.

MFC after: 1 week

jeff [Sun, 15 Dec 2019 21:26:50 +0000 (21:26 +0000)]
schedlock 4/4

Don't hold the scheduler lock while doing context switches.  Instead we
unlock after selecting the new thread and switch within a spinlock
section leaving interrupts and preemption disabled to prevent local
concurrency.  This means that mi_switch() is entered with the thread
locked but returns without.  This dramatically simplifies scheduler
locking because we will not hold the schedlock while spinning on
blocked lock in switch.

This change has not been made to 4BSD but in principle it would be
more straightforward.

Discussed with: markj
Reviewed by: kib
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22778

jhibbits [Sun, 15 Dec 2019 21:20:18 +0000 (21:20 +0000)]
powerpc/powernv: Set the PTCR for the Nest MMU

The Nest MMU manages address translation for accelerators on the POWER9.  To
do so, it needs a page table, so export the system page table to the Nest
MMU.  This will quietly fail on pre-POWER9 systems that do not have a NMMU.

The NMMU is currently unused, so this change is currently effectively a NOP,
but the NMMU and VAS will eventually be used.

jeff [Sun, 15 Dec 2019 21:19:41 +0000 (21:19 +0000)]
schedlock 3/4

Eliminate lock recursion from turnstiles.  This was simply used to avoid
tracking the top-level turnstile lock.  Explicitly check for it before
picking up and dropping locks.

Reviewed by: kib
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22746

jeff [Sun, 15 Dec 2019 21:18:07 +0000 (21:18 +0000)]
schedlock 2/4

Do all sleepqueue post-processing in sleepq_remove_thread() so that we
do not require the thread lock after a context switch.

Reviewed by: jhb, kib
Differential Revision: https://reviews.freebsd.org/D22745

ian [Sun, 15 Dec 2019 21:16:35 +0000 (21:16 +0000)]
Rewrite arm kernel stack unwind code to work when unwinding through modules.

The arm kernel stack unwinder has apparently never been able to unwind when
the path of execution leads through a kernel module. There was code that
tried to handle modules by looking for the unwind data in them, but it did
so by trying to find symbols which have never existed in arm kernel
modules. That caused the unwind code to panic, and because part of panic
handling calls into the unwind code, that just created a recursion loop.

Locating the unwind data in a loaded module requires accessing the Elf
section headers to find the SHT_ARM_EXIDX section. For preloaded modules
those headers are present in a metadata blob. For dynamically loaded
modules, the headers are present only while the loading is in progress; the
memory is freed once the module is ready to use. For that reason, there is
new code in kern/link_elf.c, wrapped in #ifdef __arm__, to extract the
unwind info while the headers are loaded. The values are saved into new
fields in the linker_file structure which are also conditional on __arm__.

In arm/unwind.c there is new code to locally cache the per-module info
needed to find the unwind tables. The local cache is crafted for lockless
read access, because the unwind code often needs to run in context where
sleeping is not allowed.  A large comment block describes the local cache
list, so I won't repeat it all here.

jeff [Sun, 15 Dec 2019 21:11:15 +0000 (21:11 +0000)]
schedlock 1/4

Eliminate recursion from most thread_lock consumers.  Return from
sched_add() without the thread_lock held.  This eliminates unnecessary
atomics and lock word loads as well as reducing the hold time for
scheduler locks.  This will eventually allow for lockless remote adds.

Discussed with: kib
Reviewed by: jhb
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22626

jhibbits [Sun, 15 Dec 2019 21:08:40 +0000 (21:08 +0000)]
powerpc/mpc85xx: Clean up Freescale SATA driver a little

* Remove unused ATA_IN/OUT macros, they just clutter up the file.
* Fix some RID management bits for the channel memory resource.

ian [Sun, 15 Dec 2019 18:05:18 +0000 (18:05 +0000)]
Support --all-repeats in uniq(1) for compatibility with gnu coreutils.

This adds a new -D/--all-repeats option to uniq(1), which outputs each copy
of any repeated lines (as opposed to a single copy of a repeated line). You
can specify a separator option to output a blank line before or after each
group of repeated lines. This adds compatibility with the GNU coreutils
version of uniq(1).

This change also re-groups the -c, -d, -D, -u options in the usage display
and man page to indicate that they are mutually exclusive of each other. This
matches the posix/opengroup definition of uniq(1) command line args. Note
that this change does NOT actually enforce the mutual exclusion in the code;
for now, it simply documents that the arguments should be considered
exclusive with each other.

Differential Revision: https://reviews.freebsd.org/D22262
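
The "output each copy of any repeated lines" behaviour described in the commit above can be sketched as follows. This is not the uniq(1) implementation: all_repeats() is a hypothetical helper that works on an in-memory array of lines rather than a stream, and the separator option is not modelled. A line is selected when it equals its predecessor or its successor, so every member of a run of identical adjacent lines is kept.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy into out[] every line that belongs to a run of
 * two or more identical adjacent lines, and return how many were copied.
 * out[] must have room for n pointers.
 */
size_t
all_repeats(const char *const *lines, size_t n, const char **out)
{
    size_t k = 0;

    for (size_t i = 0; i < n; i++) {
        int same_prev = i > 0 && strcmp(lines[i], lines[i - 1]) == 0;
        int same_next = i + 1 < n && strcmp(lines[i], lines[i + 1]) == 0;

        if (same_prev || same_next)
            out[k++] = lines[i];
    }
    return (k);
}
```

By contrast, -d style output would keep only one copy per run, which is the distinction the commit message draws.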

cem [Sun, 15 Dec 2019 17:33:26 +0000 (17:33 +0000)]
Revert r355760, r355759

And remove the inline/deprecated attribute use entirely in stdlib.h, from
r355747.  The intent was to provide a buildable API transitionary period, but
clearly that was counter-productive.

Reported by: delphij, imp, others

kevans [Sun, 15 Dec 2019 16:28:12 +0000 (16:28 +0000)]
kbd: convert kbdd_* macros to inline functions

This reduces the noise when interested parties wish to de-Giant kbd; these
accesses to kbdsw will need to be properly locked.
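
A minimal sketch of the macro-to-inline-function conversion described above, using made-up names (struct dev, struct dev_switch, DEV_ENABLE_MACRO, dev_enable(), demo_enable()) rather than the real kbdd_* and kbdsw definitions. The inline function performs the same dispatch through the switch table, but gives a single, type-checked place where locking could later be added.

```c
/* Hypothetical device and method-switch types, not the real kbd ones. */
struct dev;

struct dev_switch {
    int (*enable)(struct dev *);
};

struct dev {
    const struct dev_switch *sw;
    int enabled;
};

/* Before: dispatch through the switch table via a macro. */
#define DEV_ENABLE_MACRO(d) ((*(d)->sw->enable)(d))

/* After: the same dispatch as an inline function. */
static inline int
dev_enable(struct dev *d)
{
    /* A lock around the switch-table access could be added here later. */
    return ((*d->sw->enable)(d));
}

/* Example method used to exercise both forms. */
int
demo_enable(struct dev *d)
{
    d->enabled = 1;
    return (0);
}
```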

mmel [Sun, 15 Dec 2019 14:28:38 +0000 (14:28 +0000)]
Properly synchronize completion DMA buffers.
Within command completion processing the callback function may access the
DMAed data buffer. Synchronize it before use, not after.
This allows NVMe disks to be used on non-DMA-coherent arm64 systems.

MFC after: 3 weeks

tsoome [Sun, 15 Dec 2019 14:09:49 +0000 (14:09 +0000)]
loader: zfsimpl.c cstyle cleanup

No functional changes intended.

MFC after: 1 week

jeff [Sun, 15 Dec 2019 06:26:47 +0000 (06:26 +0000)]
Fix a mistake in r355765.  We need to activate the page if it is not yet
on a pagequeue.

Reported by: pho

kevans [Sun, 15 Dec 2019 04:22:50 +0000 (04:22 +0000)]
kbd: drop _KERNEL #ifdef in kbdreg.h

This #ifdef is misleading as there are actually no user-serviceable parts
inside and, as far as I can tell, there is no pollution leading from
userland to this header. Furthermore, it becomes a slight nuisance when
attempting to move things around in this header.

jeff [Sun, 15 Dec 2019 04:08:24 +0000 (04:08 +0000)]
Previously we did not support invalid pages in default objects.  This means
that if fault fails to progress and needs to restart the loop it must free
the page it is working on and allocate again on restart.  Resolve the few
places that need to be modified to support this condition and simply
deactivate the page.  Presently, we only permit this when fault restarts
for busy contention.  This has an added benefit of removing some object
trylocking in this case.

While here consolidate some page cleanup logic into fault_page_free() and
fault_page_release() to reduce redundant code and automate some teardown.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D22653

jeff [Sun, 15 Dec 2019 03:15:06 +0000 (03:15 +0000)]
Add a deferred free mechanism for freeing swap space that does not require
an exclusive object lock.

Previously swap space was freed on a best effort basis when a page that
had valid swap was dirtied, thus invalidating the swap copy.  This may be
done inconsistently and requires the object lock which is not always
convenient.

Instead, track when swap space is present.  The first dirty is responsible
for deleting space or setting PGA_SWAP_FREE which will trigger background
scans to free the swap space.

Simplify the locking in vm_fault_dirty() now that we can reliably identify
the first dirty.

Discussed with: alc, kib, markj
Differential Revision: https://reviews.freebsd.org/D22654

jeff [Sun, 15 Dec 2019 02:02:27 +0000 (02:02 +0000)]
Slightly optimize locking in vm_map_copy_swap_entry().  Anonymous objects
require the object lock to synchronize collapse.  Other swap objects such
as tmpfs do not.

Reported by: mjg
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D22747

jeff [Sun, 15 Dec 2019 02:00:32 +0000 (02:00 +0000)]
Handle pagein clustering in vm_page_grab_valid() so that it can be used by
exec_map_first_page().  This will also enable pagein clustering for other
interested consumers (tmpfs, md, etc).

Discussed with: alc
Approved by: kib
Differential Revision: https://reviews.freebsd.org/D22731

pfg [Sun, 15 Dec 2019 01:56:56 +0000 (01:56 +0000)]
cdefs: use more accurate GCC version for the deprecated attribute.

The message argument in the "deprecated" attribute was introduced in GCC 4.5 *.
Use the accurate version number for consistency, as done already with other
attributes.

* https://gcc.gnu.org/onlinedocs/gcc-4.5.0/gcc/Function-Attributes.html

kevans [Sun, 15 Dec 2019 01:26:57 +0000 (01:26 +0000)]
<unistd.h>: remove redundant __BSD_VISIBLE

This bit is already inside of a larger __BSD_VISIBLE block.

Reported by: vangyzen

cem [Sat, 14 Dec 2019 23:39:32 +0000 (23:39 +0000)]
linuxkpi: Drop incompatible __deprecated definition

Probably all of these linuxkpi stubs should be '#ifndef' guarded, but maybe
that would prevent people from noticing when they are defined.

Introduced in r355759.  For some reason I only ran a buildworld and not a
kernel.  Mea culpa.

Reported by: Mark Millard
X-MFC-with: r355759

cem [Sat, 14 Dec 2019 21:52:49 +0000 (21:52 +0000)]
cdefs: Add __deprecated(message) function attribute macro

The legacy version of GCC4 currently in base does not support the
parameterized form of this function attribute, as recently introduced in
stdlib.h (r355747).

As we have done for other function attributes with similar compatibility
problems, add a version-compatible definition in sys/cdefs.h.  Note that
Clang defines itself to be GCC 4, so one must check for __clang__ in
addition to __GNUC__ version.  On legacy GCC 4, the macro expands to just
the __deprecated__ attribute; on modern GCC or Clang, the macro expands to
the parameterized variant with the message.

Ignoring legacy or unsupported compilers, the macro is also beneficial in
that it is a bit more ergonomic than the full
__attribute__((__deprecated__())) boilerplate.

Reported by: CI (but not tinderbox); imp and others
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D22817
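
The version check described in the commit above can be sketched roughly as follows. This is not the sys/cdefs.h definition: MY_DEPRECATED, old_api() and new_api() are hypothetical names, the GCC 4.5 threshold is the one cited in the later "more accurate GCC version" commit in this log, and the sketch assumes any Clang in use accepts the message form.

```c
/*
 * Hypothetical macro.  Clang reports itself as an old GCC 4, so it is
 * checked by name rather than by its __GNUC__ version.
 */
#if defined(__clang__) || \
    (defined(__GNUC__) && \
    (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5)))
/* Modern GCC or Clang: parameterized form carrying the message. */
#define MY_DEPRECATED(msg) __attribute__((__deprecated__(msg)))
#elif defined(__GNUC__)
/* Legacy GCC: plain attribute, the message is dropped. */
#define MY_DEPRECATED(msg) __attribute__((__deprecated__))
#else
/* Unknown compiler: expand to nothing. */
#define MY_DEPRECATED(msg)
#endif

/* Callers of old_api() get a deprecation warning at compile time. */
MY_DEPRECATED("use new_api() instead")
int old_api(int x);

int
new_api(int x)
{
    return (x * 2);
}

int
old_api(int x)
{
    return (new_api(x));
}
```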

rmacklem [Sat, 14 Dec 2019 21:49:47 +0000 (21:49 +0000)]
Update the mount_nfs.8 man page to include NFSv4.2.

r355677 added NFSv4.2 support to the NFS client. This patch updates the
mount_nfs.8 man page to reflect that.
It also clarifies that the "nolockd" option does not apply to NFSv4 mounts.

This is a content change.

dougm [Sat, 14 Dec 2019 19:44:42 +0000 (19:44 +0000)]
Simplify the processing of a leaf mask to find big-enough ranges of set
bits, by storing and modifying the complement of the original leaf
mask, and by avoiding some unnecessary intermediate variables in
computing the shift amounts. The logic is similar to what has recently
been committed to sys/sys/bitstring.h.

Compute better hint updates for the case when the cursor starts in
mid-leaf, and eliminates some otherwise viable solutions. Assume the
worst case, that all the eliminated offsets could have been solutions,
and you can still compute a better hint than we use now.

Eliminate some unnecessary conditional control flow.

Approved by: alc
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22666

mmel [Sat, 14 Dec 2019 14:56:34 +0000 (14:56 +0000)]
Add driver for Rockchip PCIe root complex found in RK3399 SOC.
Unfortunately, there are some limitations:
- the memory aperture of this controller is only 16MiB, so it is nearly
  unusable for graphics cards
- every attempt to generate a type 1 config cycle causes a trap.
  These config cycles are disabled now and we don't support cards
  with a PCIe switch.
- in some cases, an attempt to do a config cycle to a (probably) not-yet-ready
  card also causes a trap. This cannot be detected at runtime, but it seems
  to be a very rare issue.

MFC after: 3 weeks
Differential Revision:  https://reviews.freebsd.org/D22724

trasz [Sat, 14 Dec 2019 13:37:17 +0000 (13:37 +0000)]
Add sync_file_range(2) implementation to linux(4); it's a thin wrapper
over the usual fsync(2).

This silences some warnings when running "apt-get upgrade".

Reviewed by: brooks, emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22371

trasz [Sat, 14 Dec 2019 13:32:37 +0000 (13:32 +0000)]
Regen after r355752.

MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22371

trasz [Sat, 14 Dec 2019 13:30:43 +0000 (13:30 +0000)]
Fix definitions for linuxulator's sync_file_range(2).

Reviewed by: brooks, emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22371

trasz [Sat, 14 Dec 2019 10:58:06 +0000 (10:58 +0000)]
Add 'sesutil show' subcommand to show enclosure and its contents
in a user-friendly way.

Reviewed by: allanjude, bcr (manpages)
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D22567

trasz [Sat, 14 Dec 2019 10:53:52 +0000 (10:53 +0000)]
Add -M option to nc(1), which makes it print the TCP connection
statistics obtained with stats(3) in JSON format to standard error.

Reviewed by: allanjude, thj, cem (earlier version)
Tested by: thj
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D21324

cem [Sat, 14 Dec 2019 08:28:10 +0000 (08:28 +0000)]
Deprecate sranddev(3) API

It serves no useful purpose and wasn't as popular as its equally meritless
cousin, srandomdev(3).

Setting aside the problems with rand(3) in general, the problem with this
interface is that the seed isn't shared with the caller (other than by
attacking the output of the generator, which is trivial, but not a hallmark of
pleasant API design).  The (arguable) utility of rand(3) or random(3) is as a
semi-fast simulation generator which produces consistent results from a given
seed.  These are mutually at odds.  Furthermore, sometimes people got the
mistaken impression that a high quality random seed meant a weak generator like
rand(3) or random(3) could be used for things like cryptographic key
generation.  This is absolutely not so.

The API was never part of a standard and was not widely used in tree.  Existing
in-tree uses have all been removed.

Possible replacement in out of tree codebases:

char buf[3];
time_t t;

time(&t);
strftime(buf, sizeof(buf), "%S", gmtime(&t));
srand(atoi(buf));

Relnotes: yes

rlibby [Sat, 14 Dec 2019 05:21:56 +0000 (05:21 +0000)]
uma dbg: flexible size for slab debug bitset too

Recently (r355315) the size of the struct uma_slab bitset field us_free
became dynamic instead of conservative.  Now, make the debug bitset
size dynamic too.  The debug bitset is INVARIANTS-only, so in fact we
don't care too much about the space savings that results from this, but
enabling minimally-sized slabs on INVARIANTS builds is still important
in order to be able to test new slab layouts effectively.

Reviewed by: jeff (previous version), markj (previous version)
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22759

kp [Sat, 14 Dec 2019 02:06:07 +0000 (02:06 +0000)]
pf: Make request_maxcount runtime adjustable

There's no reason for this to be a tunable. It's perfectly safe to
change this at runtime.

Reviewed by: Lutz Donnerhacke
Differential Revision: https://reviews.freebsd.org/D22737

kp [Sat, 14 Dec 2019 02:03:47 +0000 (02:03 +0000)]
pfctl: Warn users when they run into kernel limits

Warn users when they try to add/delete/modify more items than the kernel will
allow.

Reviewed by: allanjude (previous version), Lutz Donnerhacke
Differential Revision: https://reviews.freebsd.org/D22733

mjg [Sat, 14 Dec 2019 00:43:17 +0000 (00:43 +0000)]
Remove the useless return value from proc_set_cred

scottl [Fri, 13 Dec 2019 23:46:59 +0000 (23:46 +0000)]
Add accessors for the Vendor Specific Extended Capability (VSEC)
Parse out the VSEC.  If the user invokes a second -c command line option,
do a hex dump of the vendor data.

Reviewed by: imp
MFC after: 3 days
Sponsored by: Intel
Differential Revision: http://reviews.freebsd.org/D22808

jhb [Fri, 13 Dec 2019 23:33:54 +0000 (23:33 +0000)]
Expand net epoch in the cxgbe TOE driver to satisfy assertions.

Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D22483

jkim [Fri, 13 Dec 2019 23:28:52 +0000 (23:28 +0000)]
MFV: r355716

Merge ACPICA 20191213.

ian [Fri, 13 Dec 2019 23:22:49 +0000 (23:22 +0000)]
Include ofw_bus_if.h in SRCS only on systems configured with the FDT option.

imp [Fri, 13 Dec 2019 22:32:05 +0000 (22:32 +0000)]
Better copyright advice

Document the common practices around copyrights with "all rights reserved" in
them as new copyright notices get added.

It's an open question whether to point people at the fact that since the Berne
convention was ratified, All rights reserved is largely obsolete.
https://en.wikipedia.org/wiki/All_rights_reserved#Obsolescence has the
details. The committer's guide will be revised shortly, and it's likely that's a
better place for this discussion. If not, I'll add a blurb here.

Reviewed by: jhb@, brooks@
Differential Review: https://reviews.freebsd.org/D22800

avg [Fri, 13 Dec 2019 22:04:13 +0000 (22:04 +0000)]
zfs boot: fix a crash in a rarely taken path in fzap_lookup

Instead of passing NULL to fzap_name_equal and crashing, just return
ENOENT.  This happened when higher bits of a hash of the searched key
(its hash prefix) matched a hash prefix of some key in the ZAP, but the
full hash value of the searched key did not match any key in the ZAP.

I observed this problem when loader tried to look up
"features_for_read" in a particular old pool that predates pool
features.

MFC after: 2 weeks
Sponsored by: Panzura

imp [Fri, 13 Dec 2019 21:39:20 +0000 (21:39 +0000)]
Be consistent about checking return value from bus_delayed_attach_children.

Most places checked, but a couple of last-minute changes didn't. Make them all use
the return value.

Noticed by: rpokala@

imp [Fri, 13 Dec 2019 21:39:10 +0000 (21:39 +0000)]
Don't use contractions. Fix the date.

Contractions cause problems for translators, so s/aren't/are not/ in the one
place this slipped through.

While here, noticed I committed with the date I did the work, not today's
date. Fix that too.

Noticed by: bjk@

rmacklem [Fri, 13 Dec 2019 21:38:08 +0000 (21:38 +0000)]
Silence some "might not be initialized" warnings for riscv64.

None of these cases were actually using the variable(s) uninitialized, but
I figured that silencing the warnings via initializing them made sense.

Some of these predated r355677.

jhb [Fri, 13 Dec 2019 21:03:12 +0000 (21:03 +0000)]
Remove the deprecated timeout(9) interface.

All in-tree consumers have been converted to callout(9).

Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D22602

np [Fri, 13 Dec 2019 20:38:58 +0000 (20:38 +0000)]
cxgbe(4): Use the _XT variant of the CPL used to transmit NIC traffic.

CPL_TX_PKT_XT disables the internal parser on the chip and instead
relies on the driver to provide the exact length of the L2 and L3
headers.  This allows hw checksumming and TSO to be used with L2 and
L3 encapsulations that the chip doesn't understand directly.

Note that netmap tx still uses the old CPL as it never uses the hw
to generate the checksum on tx.

Reviewed by: jhb@
MFC after: 1 month
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D22788

bdragon [Fri, 13 Dec 2019 20:30:26 +0000 (20:30 +0000)]
[PowerPC] Fully define gdtoa settings on powerpc64.

The settings in arith.h were not fully defined on powerpc64 after the gdtoa
switchover. Generate them using arithchk.c, similar to what AMD64 did for
r114814.

Technically, none of this is necessary in FreeBSD gdtoa, but since the other
platforms have full definitions, we might as well have full definitions
too.

Approved by: jhibbits (in irc)
Differential Revision: https://reviews.freebsd.org/D22775

jhb [Fri, 13 Dec 2019 19:56:48 +0000 (19:56 +0000)]
Use callout(9) instead of deprecated timeout(9).

Reviewed by: imp
Tested by: Scott Benesh
Differential Revision: https://reviews.freebsd.org/D22598

imp [Fri, 13 Dec 2019 19:39:33 +0000 (19:39 +0000)]
Create new wrapper function: bus_delayed_attach_children()

Delay the attachment of children, when requested, until after interrupts are
running. This is often needed to allow children to run transactions on i2c or
spi busses. It's a common enough idiom that it will be useful to have its own
wrapper.

Reviewed by: ian
Differential Revision: https://reviews.freebsd.org/D21465

jhb [Fri, 13 Dec 2019 19:27:51 +0000 (19:27 +0000)]
Use a callout instead of timeout(9) for delayed zio's.

Reviewed by: avg
Differential Revision: https://reviews.freebsd.org/D22597

jhb [Fri, 13 Dec 2019 19:26:04 +0000 (19:26 +0000)]
Use callout(9) instead of deprecated timeout(9) for fail points.

Allocate the callout structure on-demand from
fail_point_use_timeout_path() since most fail points do not use
timeouts.

Reviewed by: markj (earlier version), cem
Differential Revision: https://reviews.freebsd.org/D22599

jhb [Fri, 13 Dec 2019 19:21:58 +0000 (19:21 +0000)]
Support software breakpoints in the debug server on Intel CPUs.

- Allow the userland hypervisor to intercept breakpoint exceptions
  (BP#) in the guest.  A new capability (VM_CAP_BPT_EXIT) is used to
  enable this feature.  These exceptions are reported to userland via
  a new VM_EXITCODE_BPT that includes the length of the original
  breakpoint instruction.  If userland wishes to pass the exception
  through to the guest, it must be explicitly re-injected via
  vm_inject_exception().

- Export VMCS_ENTRY_INST_LENGTH as a VM_REG_GUEST_ENTRY_INST_LENGTH
  pseudo-register.  Injecting a BP# on Intel requires setting this to
  the length of the breakpoint instruction.  AMD SVM currently ignores
  writes to this register (but reports success) and fails to read it.

- Rework the per-vCPU state tracked by the debug server.  Rather than
  a single 'stepping_vcpu' global, add a structure for each vCPU that
  tracks state about that vCPU ('stepping', 'stepped', and
  'hit_swbreak').  A global 'stopped_vcpu' tracks which vCPU is
  currently reporting an event.  Event handlers for MTRAP and
  breakpoint exits loop until the associated event is reported to the
  debugger.

  Breakpoint events are discarded if the breakpoint is not present
  when a vCPU resumes in the breakpoint handler to retry submitting
  the breakpoint event.

- Maintain a linked-list of active breakpoints in response to the GDB
  'Z0' and 'z0' packets.

Reviewed by: markj (earlier version)
MFC after: 2 months
Differential Revision: https://reviews.freebsd.org/D20309

trasz [Fri, 13 Dec 2019 18:44:02 +0000 (18:44 +0000)]
Add kern_kill() and use it in Linuxulator.  It's just a cleanup,
no functional changes.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22645

trasz [Fri, 13 Dec 2019 18:39:36 +0000 (18:39 +0000)]
Add kern_getsid() and use it in Linuxulator; no functional changes.

Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22647

imp [Fri, 13 Dec 2019 18:35:48 +0000 (18:35 +0000)]
Move to using bool instead of boolean_t

While there are subtle semantic differences between bool and boolean_t, none of
them matter in these cases. Prefer true/false when dealing with bool
type. Preserve a couple of TRUEs since they are passed into int args into CAM.
Preserve a couple of FALSEs when used for status.done, an int.

Differential Revision: https://reviews.freebsd.org/D20999

markj [Fri, 13 Dec 2019 18:28:01 +0000 (18:28 +0000)]
Restore the reservation of boot pages for bucket zones after r355707.

uma_startup2() sets booted = BOOT_BUCKETS after calling bucket_init(),
but before that assignment, startup_alloc() will use pages from the
reserved pool, so the bucket zones themselves are still allocated using
startup pages.

Reviewed by: rlibby
Reported by: Jenkins via lwhsu
Differential Revision: https://reviews.freebsd.org/D22797

bdragon [Fri, 13 Dec 2019 18:18:14 +0000 (18:18 +0000)]
[PowerPC] Enable TLS usage in system libraries on ELFv2.

Currently, __NO_TLS is defined to 1 on powerpc64. TLS usage works much
better on ELFv2 due to the modern tooling, so take the opportunity to
reenable TLS on ELFv2.

If you are using a self-built ELFv2 environment on powerpc64, you will
have to run installworld twice due to RuneLocale changes. This is the only
known regression, and if you are using the ELFv2 isos, you likely already
have the updated libraries installed, as this change is part of the
patchset that the isos integrate.

(No UPDATING note about this because ELFv2 is still an unofficial build.)

Reviewed by: luporl, Alfredo Dal'Ava Junior <alfredo.junior@eldorado.org.br>
Differential Revision: https://reviews.freebsd.org/D22524

mav [Fri, 13 Dec 2019 17:52:09 +0000 (17:52 +0000)]
Fix $() handling, broken since the beginning at r108014.

Due to an off-by-one error in bracket counting, it consumed the rest of the
string, preventing later variable expansions.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
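
The commit above does not show the parser, so the following is only a generic sketch of bracket counting for a `$(...)` construct, with a hypothetical find_close() helper. The depth starts at 1 because the opening `$(` has already been consumed; starting it one too high, or testing for zero at the wrong moment, is the kind of off-by-one that would make the scan run to the end of the string.

```c
#include <stddef.h>

/*
 * Hypothetical helper: s points just past an opening "$(".  Return the
 * index of the matching ')' or -1 if the construct is not closed.
 */
long
find_close(const char *s)
{
    int depth = 1;  /* the "$(" already seen counts as one open bracket */

    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] == '(')
            depth++;
        else if (s[i] == ')' && --depth == 0)
            return ((long)i);
    }
    return (-1);
}
```

With a correct match position, anything after the closing bracket, such as a later `$var`, remains available for expansion.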

rmacklem [Fri, 13 Dec 2019 16:28:48 +0000 (16:28 +0000)]
Add an entry to RELNOTES for r355677.

emaste [Fri, 13 Dec 2019 14:48:44 +0000 (14:48 +0000)]
revert r355609

tsoome [Fri, 13 Dec 2019 12:36:16 +0000 (12:36 +0000)]
loader: cd9660_open() warn: is 'buf' large enough for 'struct iso_primary_descriptor'?

We allocate an amount of memory (void * or char *), and then assign this
buffer to struct iso_primary_descriptor *vd. Make sure we
allocate enough bytes.

In fact we do allocate enough, but it is a good idea to make sure this really
is so.

MFC after: 1 week
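
One way to make such a buffer-size guarantee explicit is sketched below. This is an assumption about the general technique, not the change made in cd9660_open(): struct volume_desc, BLOCK_SIZE, desc_buf_size() and alloc_desc() are hypothetical, and the 2048-byte block size is only an illustrative value.

```c
#include <stdlib.h>

#define BLOCK_SIZE 2048  /* assumed block size, for illustration only */

/* Hypothetical descriptor layout; all members are bytes, so no padding. */
struct volume_desc {
    unsigned char type;
    char id[5];
    unsigned char version;
    char data[2041];
};

/* Compile-time check that one block can hold the descriptor. */
_Static_assert(sizeof(struct volume_desc) <= BLOCK_SIZE,
    "block buffer too small for struct volume_desc");

/* Size the buffer from both quantities instead of assuming one is larger. */
size_t
desc_buf_size(void)
{
    return (sizeof(struct volume_desc) > BLOCK_SIZE ?
        sizeof(struct volume_desc) : BLOCK_SIZE);
}

/* Allocate a buffer that is safe to view as a struct volume_desc. */
struct volume_desc *
alloc_desc(void)
{
    return (malloc(desc_buf_size()));
}
```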

ae [Fri, 13 Dec 2019 11:47:58 +0000 (11:47 +0000)]
Make TCP options parsing stricter.

Rework tcpopts_parse() to be more strict. Use const pointer. Add length
checks for specific TCP options. The main purpose of the change is
to avoid possible access outside of the mbuf's data.

Reported by: Maxime Villard
Reviewed by: melifaro, emaste
MFC after: 1 week
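
The kind of length checking described above can be sketched as a bounds-checked walk over a TCP options buffer. This is not the reworked tcpopts_parse(): parse_tcp_options() and the OPT_* names are hypothetical, and only the MSS option (kind 2, length 4) gets an option-specific check here. The option kinds and the kind/length/value layout follow the TCP specification.

```c
#include <stddef.h>
#include <stdint.h>

#define OPT_EOL 0  /* end of option list */
#define OPT_NOP 1  /* one-byte padding */
#define OPT_MSS 2  /* maximum segment size, total length 4 */

/*
 * Hypothetical parser: walk len bytes of options through a const pointer.
 * Return 0 if the list is well formed, -1 otherwise.  *mss is set when a
 * valid MSS option is present.
 */
int
parse_tcp_options(const uint8_t *opts, size_t len, uint16_t *mss)
{
    size_t i = 0;

    while (i < len) {
        uint8_t kind = opts[i];
        uint8_t optlen;

        if (kind == OPT_EOL)
            break;
        if (kind == OPT_NOP) {
            i++;
            continue;
        }
        /* Every other option needs a length byte inside the buffer. */
        if (len - i < 2)
            return (-1);
        optlen = opts[i + 1];
        /* The option must cover its header and fit in what is left. */
        if (optlen < 2 || optlen > len - i)
            return (-1);
        if (kind == OPT_MSS) {
            if (optlen != 4)
                return (-1);
            *mss = (uint16_t)((opts[i + 2] << 8) | opts[i + 3]);
        }
        i += optlen;
    }
    return (0);
}
```

Each read of opts[] is preceded by a check against len, so a truncated or malformed option is rejected rather than read past the end of the buffer.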

rlibby [Fri, 13 Dec 2019 11:21:28 +0000 (11:21 +0000)]
Revert r355706 & r355710

The quick fix didn't work.  I'll sort it out tomorrow.

Revert r355710: "libmemstat: unbreak build"
Revert r355706: "uma dbg: flexible size for slab debug bitset too"

4 years agolibmemstat: unbreak build
rlibby [Fri, 13 Dec 2019 10:34:19 +0000 (10:34 +0000)]
libmemstat: unbreak build

r355706 added an instance of offsetof() to the UMA private kernel header
file uma_int.h.  Userspace memstat_uma.c includes that header, and
chokes on offsetof() because apparently the definition in sys/types.h is
ifdef _KERNEL.  Now, include sys/stddef.h which has an identical
definition.
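
As a rough illustration of why a header with a flexibly sized trailing field
wants offsetof(), here is a userspace sketch. struct demo_slab and
demo_slab_size() are hypothetical and are not the uma_int.h definitions;
portable userspace code gets offsetof() from <stddef.h>, whereas the commit
itself includes sys/stddef.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical slab header with a flexibly sized free bitmap at the end. */
struct demo_slab {
    uint32_t ds_freecount;
    uint8_t  ds_flags;
    uint64_t ds_free[];     /* flexible array member */
};

/* Bytes needed for a header whose bitmap holds nwords 64-bit words. */
static size_t
demo_slab_size(size_t nwords)
{
    return (offsetof(struct demo_slab, ds_free) +
        nwords * sizeof(uint64_t));
}
```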

Pointyhat to: rlibby
Sponsored by: Dell EMC Isilon

4 years agobitset: rename confusing macro NAND to ANDNOT
rlibby [Fri, 13 Dec 2019 09:32:16 +0000 (09:32 +0000)]
bitset: rename confusing macro NAND to ANDNOT

s/BIT_NAND/BIT_ANDNOT/, and for CPU and DOMAINSET too.  The actual
implementation is "and not" (or "but not"), i.e. A but not B.
Fortunately this does appear to be what all existing callers want.

Don't supply a NAND (not (A and B)) operation at this time.
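
The difference between the two operations, shown on a single 64-bit word.
This is a sketch only: word_andnot() and word_nand() are hypothetical helpers
for illustration, not the bitset macros themselves.

```c
#include <stdint.h>

/* "and not": bits set in a but not in b, i.e. what ANDNOT computes. */
static inline uint64_t
word_andnot(uint64_t a, uint64_t b)
{
    return (a & ~b);
}

/* True NAND: not (a and b).  Not supplied by the commit; for contrast. */
static inline uint64_t
word_nand(uint64_t a, uint64_t b)
{
    return (~(a & b));
}
```

With a = 0xC and b = 0xA, "and not" leaves only bit 2 set (0x4), while a true
NAND sets every bit except bit 3.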

Discussed with: jeff
Reviewed by: cem
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22791

4 years agouma: report slab efficiency
rlibby [Fri, 13 Dec 2019 09:32:09 +0000 (09:32 +0000)]
uma: report slab efficiency

Reviewed by: jeff
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22766

4 years agouma: delay bucket_init() until we might actually enable buckets
rlibby [Fri, 13 Dec 2019 09:32:03 +0000 (09:32 +0000)]
uma: delay bucket_init() until we might actually enable buckets

This helps with a bootstrapping problem in upcoming work.

We don't first enable buckets until uma_startup2(), so we can delay
bucket creation until then.  The other two paths to bucket_enable() are
both later, one in the pageout daemon (SI_SUB_KTHREAD_PAGE vs SI_SUB_VM)
and one in uma_timeout() (first activated in uma_startup3()).  Note that
although some bucket functions are accessible before uma_startup2()
(e.g. bucket_select() in zone_ctor()), none of them inspect ubz_zone.

Discussed with: jeff
Reviewed by: markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22765

4 years agouma dbg: flexible size for slab debug bitset too
rlibby [Fri, 13 Dec 2019 09:31:59 +0000 (09:31 +0000)]
uma dbg: flexible size for slab debug bitset too

Recently (r355315) the size of the struct uma_slab bitset field us_free
became dynamic instead of conservative.  Now, make the debug bitset
size dynamic too.  The debug bitset is INVARIANTS-only, so in fact we
don't care too much about the space savings that result from this, but
enabling minimally-sized slabs on INVARIANTS builds is still important
in order to be able to test new slab layouts effectively.

Reviewed by: jeff, markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22759

4 years agoAdd kern.geom.part.separator tunable. This makes it possible
trasz [Fri, 13 Dec 2019 09:28:44 +0000 (09:28 +0000)]
Add kern.geom.part.separator tunable.  This makes it possible
to specify an optional separator to insert before partition name;
e.g., if it's set to "c/", you'll get "ada0c/s1" instead of "ada0s1".
(It cannot be set to just "/", since ada0 is a device node, not
a directory.)

Reviewed by: imp
MFC after: 2 weeks
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D22193

4 years agoloader: clean up devopen and devclose a bit
tsoome [Fri, 13 Dec 2019 08:41:37 +0000 (08:41 +0000)]
loader: clean up devopen and devclose a bit

devopen should undo setup of f->f_dev in case of error.
devclose can just call free().

MFC after: 1 week

4 years agoloader: vdisk dereference after free
tsoome [Fri, 13 Dec 2019 08:20:20 +0000 (08:20 +0000)]
loader: vdisk dereference after free

print out the information and then free the memory used.

MFC after: 1 week

4 years agontpd(8): Don't use OpenSSL's RAND API
cem [Fri, 13 Dec 2019 05:54:38 +0000 (05:54 +0000)]
ntpd(8): Don't use OpenSSL's RAND API

The !USE_OPENSSL_CRYPTO_RAND path uses arc4random_buf() correctly.

In general, we should prefer to avoid things OpenSSL does poorly when a good
alternative exists in libc.

4 years agolibtelnet: Replace bogus use of srandomdev + random to generate "public key pair"
cem [Fri, 13 Dec 2019 05:42:57 +0000 (05:42 +0000)]
libtelnet: Replace bogus use of srandomdev + random to generate "public key pair"

I'm pretty skeptical that any crypto in telnet is worth using, but if we're
ostensibly generating keys, arc4random is strictly better than the previous
construct.

4 years agobsnmpd(1): Replace dubious srandomdev+random(3) with arc4random(3)
cem [Fri, 13 Dec 2019 05:13:25 +0000 (05:13 +0000)]
bsnmpd(1): Replace dubious srandomdev+random(3) with arc4random(3)

4 years agolibtacplus: Remove bogus srandomdev+random
cem [Fri, 13 Dec 2019 05:11:34 +0000 (05:11 +0000)]
libtacplus: Remove bogus srandomdev+random

Replace with arc4random.

TACACS+ is a 1993 Cisco extension to the 1984 TACACS.  Is this something we want
in base still?  The directory has been substantively unmaintained since 2002,
at least.

4 years agolibradius: Rip out dubious use of srandomdev(3)+random(3)
cem [Fri, 13 Dec 2019 04:55:17 +0000 (04:55 +0000)]
libradius: Rip out dubious use of srandomdev(3)+random(3)

These functions appear to intend to produce unpredictable results.  Just use
arc4random.

While here, use an explicit_bzero instead of memset where the intent is clearly
to zero out a secret (clear_passphrase).

4 years agokern/subr_unit: Rip srandomdev, random(3) out of dead code
cem [Fri, 13 Dec 2019 04:48:20 +0000 (04:48 +0000)]
kern/subr_unit: Rip srandomdev, random(3) out of dead code

The simulation cannot be reproduced, so the value of using a deterministic PRNG
like random(3) is dubious.  The number of repetitions used in the sample isn't a
problem for the Chacha implementation of arc4random we have today.  (Also, no
one actually runs this code; it was provided as an example of the work the
author did validating the implementation.  It's not even test code.)

4 years agorandom(6): produce random results
cem [Fri, 13 Dec 2019 04:37:39 +0000 (04:37 +0000)]
random(6): produce random results

This program is trash and there's no reason to keep it in base.  But as long as
we're shipping a silly program named 'random', let's actually make it random.

4 years agofsirand(8): Just use arc4random(3)
cem [Fri, 13 Dec 2019 04:12:13 +0000 (04:12 +0000)]
fsirand(8): Just use arc4random(3)

Remove single use of dubious srandomdev(3) + random(3) and replace with
arc4random(3), as is used already in this program.

Follow-up question: Do we really need this program anymore?  In base?

4 years agokeyserv(8): unifdef out __FreeBSD__ and KEYSERV_RANDOM
cem [Fri, 13 Dec 2019 04:03:05 +0000 (04:03 +0000)]
keyserv(8): unifdef out __FreeBSD__ and KEYSERV_RANDOM

This doesn't appear to have an active upstream (and it's a steaming pile of
bad 90s crypto design).  Rip out the completely horrible bits and leave the
only mildly less horrible bits.  The whole thing should probably be deleted; to
the extent it purports to provide a security feature: it doesn't.

4 years agoIf device_delete_children() returns an error, bail on the rest of the
ian [Fri, 13 Dec 2019 02:20:26 +0000 (02:20 +0000)]
If device_delete_children() returns an error, bail on the rest of the
detach work and return the error.  Especially don't call iicbus_reset()
since the most likely cause of failing to detach children is that one
of them has IO in progress.

4 years agoDocument that the debug server supports writing to guest memory.
jhb [Fri, 13 Dec 2019 02:18:44 +0000 (02:18 +0000)]
Document that the debug server supports writing to guest memory.

This was added in r348212.

4 years agoFix a mismerge in r355683 and remove the local gdb_port from main.
jhb [Fri, 13 Dec 2019 02:15:34 +0000 (02:15 +0000)]
Fix a mismerge in r355683 and remove the local gdb_port from main.

4 years agoClean up some of my copyrights; add SPDX tag and remove All rights reserved.
ian [Fri, 13 Dec 2019 01:38:48 +0000 (01:38 +0000)]
Clean up some of my copyrights; add SPDX tag and remove All rights reserved.

4 years agoAdd some more initializations to quiet riscv build.
rmacklem [Fri, 13 Dec 2019 01:34:25 +0000 (01:34 +0000)]
Add some more initializations to quiet riscv build.

The one case in nfs_copy_file_range() was a legitimate case, although
it would probably never occur in practice.

4 years agoDon't call into the debug server if it isn't configured.
jhb [Fri, 13 Dec 2019 01:17:20 +0000 (01:17 +0000)]
Don't call into the debug server if it isn't configured.

Reviewed by: markj (as part of a larger diff)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D20309

4 years agoFix the build for MAC not defined and a couple of might not be initialized.
rmacklem [Fri, 13 Dec 2019 00:45:14 +0000 (00:45 +0000)]
Fix the build for MAC not defined and a couple of might not be initialized.

r355677 broke the build when MAC is not defined, and a couple of
"might not be initialized" warnings were generated for riscv. Others seem
to be erroneous.

Hopefully there won't be too many more build errors.

Pointy hat goes on me.

4 years agor355677 requires that vop_stdioctl() be global so it can be called from NFS.
rmacklem [Fri, 13 Dec 2019 00:14:12 +0000 (00:14 +0000)]
r355677 requires that vop_stdioctl() be global so it can be called from NFS.

r355677 modified the NFS client so that it does lseek(SEEK_DATA/SEEK_HOLE)
for NFSv4.2, but calls vop_stdioctl() otherwise. As such, vop_stdioctl()
needs to be a global function.

Missed during the code merge for r355677.

4 years agoAvoid relying on silent type casting in the native atomic_load_32.
markj [Thu, 12 Dec 2019 23:55:34 +0000 (23:55 +0000)]
Avoid relying on silent type casting in the native atomic_load_32.

Reported by: np

4 years agoBump __FreeBSD_version since r355677 changes the internal interface
rmacklem [Thu, 12 Dec 2019 23:37:04 +0000 (23:37 +0000)]
Bump __FreeBSD_version since r355677 changes the internal interface
between the NFS modules such that they all need to be upgraded to
post r355677 simultaneously.

4 years agoAdd an entry to UPDATING for r355677.
rmacklem [Thu, 12 Dec 2019 23:33:32 +0000 (23:33 +0000)]
Add an entry to UPDATING for r355677.

4 years agoAdd support for NFSv4.2 to the NFS client and server.
rmacklem [Thu, 12 Dec 2019 23:22:55 +0000 (23:22 +0000)]
Add support for NFSv4.2 to the NFS client and server.

This patch adds support for NFSv4.2 (RFC-7862) and Extended Attributes
(RFC-8276) to the NFS client and server.
NFSv4.2 is comprised of several optional features that can be supported
in addition to NFSv4.1. This patch adds the following optional features:
   - posix_fadvise(POSIX_FADV_WILLNEED/POSIX_FADV_DONTNEED)
   - posix_fallocate()
   - intra server file range copying via the copy_file_range(2) syscall
     --> Avoiding data transfer over the wire to/from the NFS client.
   - lseek(SEEK_DATA/SEEK_HOLE)
   - Extended attribute syscalls for "user" namespace attributes as defined
     by RFC-8276.

Although this patch is fairly large, it should not affect support for
the other versions of NFS. However it does add two new sysctls that allow
a sysadmin to limit which minor versions of NFSv4 a server supports, allowing
a sysadmin to disable NFSv4.2.

Unfortunately, when the NFS stats structure was last revised, it was assumed
that there would be no additional operations added beyond what was
specified in RFC-7862. However RFC-8276 did add additional operations,
forcing the NFS stats structure to be revised again. It now has extra unused
entries in all arrays, so that future extensions to NFSv4.2 can be
accommodated without revising this structure again.

A future commit will update nfsstat(1) to report counts for the new NFSv4.2
specific operations/procedures.

This patch affects the internal interface between the nfscommon, nfscl and
nfsd modules and, as such, they all must be upgraded simultaneously.
I will do a version bump (although arguably not needed), due to this.

This code has survived a "make universe" but has not been built with a
recent GCC. If you encounter build problems, please email me.

Relnotes: yes

4 years agortld: make checks for mmap(2) failures compliant with documentation.
kib [Thu, 12 Dec 2019 22:59:22 +0000 (22:59 +0000)]
rtld: make checks for mmap(2) failures compliant with documentation.

On error, mmap(2) returns MAP_FAILED.  There is no need to use its
definition or to cast.

Sponsored by: The FreeBSD Foundation
MFC after: 1 week

4 years agocxgbe(4): Never use hardware checksumming in netmap tx.
np [Thu, 12 Dec 2019 21:33:00 +0000 (21:33 +0000)]
cxgbe(4): Never use hardware checksumming in netmap tx.

MFC after: 1 week
Sponsored by: Chelsio Communications

4 years agoImplement atomic state updates using the new vm_page_astate_t structure.
markj [Thu, 12 Dec 2019 21:13:20 +0000 (21:13 +0000)]
Implement atomic state updates using the new vm_page_astate_t structure.

Introduce primitives vm_page_astate_load() and vm_page_astate_fcmpset()
to operate on the 32-bit per-page atomic state.  Modify
vm_page_pqstate_fcmpset() to use them.  No functional change intended.

Introduce PGA_QUEUE_OP_MASK, a subset of PGA_QUEUE_STATE_MASK that only
includes queue operation flags.  This will be used in subsequent
patches.

Reviewed by: alc, jeff, kib
Sponsored by: Netflix, Intel
Differential Revision: https://reviews.freebsd.org/D22753

4 years agolibpmc: add MIT SPDX tag to header file
emaste [Thu, 12 Dec 2019 20:55:43 +0000 (20:55 +0000)]
libpmc: add MIT SPDX tag to header file

The jevents tool includes a copy of the jsmn json parser which is MIT
licensed.  Upstream the MIT license appears in the jsmn.c source and a
standalone LICENSE file, but the latter is not included in the copy
contained in libpmc and the jsmn.h header carried no license information.
Add an SPDX tag to clarify the situation.

4 years agoRather than pass the address of the packet information control block to
cy [Thu, 12 Dec 2019 20:44:49 +0000 (20:44 +0000)]
Rather than pass the address of the packet information control block to
ipf_pcksum6(), directly pass the address of the mbuf to it. This reduces
one pointer dereference. ipf_pcksum6() doesn't use the packet information
control block except to obtain the mbuf address.

MFC after: 3 days