cem [Sat, 14 Dec 2019 21:52:49 +0000 (21:52 +0000)]
cdefs: Add __deprecated(message) function attribute macro
The legacy version of GCC4 currently in base does not support the
parameterized form of this function attribute, as recent introduced in
stdlib.h (r355747).
As we have done for other function attributes with similar compatibility
problems, add a version-compatibile definition in sys/cdefs.h. Note that
Clang defines itself to be GCC 4, so one must check for __clang__ in
addition to __GNUC__ version. On legacy GCC 4, the macro expands to just
the __deprecated__ attribute; on modern GCC or Clang, the macro expands to
the parameterized variant with the message.
Ignoring legacy or unsupported compilers, the macro is also beneficial in
that it is a bit more ergonomic than the full
__attribute__((__deprecated__())) boilerplate.
Reported by: CI (but not tinderbox); imp and others
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D22817
rmacklem [Sat, 14 Dec 2019 21:49:47 +0000 (21:49 +0000)]
Update the mount_nfs.8 man page to include NFSv4.2.
r355677 added NFSv4.2 support to the NFS client. This patch updates the
mount_nfs.8 man page to reflect that.
It also clarifies that the "nolockd" option does not apply to NFSv4 mounts.
dougm [Sat, 14 Dec 2019 19:44:42 +0000 (19:44 +0000)]
Simplify the processing a leaf mask to find big-enough ranges of set
bits, by storing and modifying the complement of the original leaf
mask, and by avoiding some unnecessary intermediate variables in
computing the shift amounts. The logic is similar to what has recently
been committed to sys/sys/bitstring.h.
Compute better hint updates for the case when the cursor starts in
mid-leaf, and eliminates some otherwise viable solutions. Assume the
worst case, that all the eliminated offsets could have been solutions,
and you can still compute a better hint than we use now.
Eliminate some unnecessary conditional control flow.
mmel [Sat, 14 Dec 2019 14:56:34 +0000 (14:56 +0000)]
Add driver for Rockchip PCIe root complex found in RK3399 SOC.
Unfortunately, there are some limitations:
- memory aperture of his controller is only 16MiB, so it is nearly
unusable for graphic cards
- every attempt to generate type 1 config cycle always causes trap.
These config cycles are disabled now and we don't support cards
with PCIe switch.
- in some cases, attempt to do config cycle to (probably) not-yet ready
card also causes trap. This cannot be detected at runtime, but it seems
like very rare issue.
cem [Sat, 14 Dec 2019 08:28:10 +0000 (08:28 +0000)]
Deprecate sranddev(3) API
It serves no useful purpose and wasn't as popular as its equally meritless
cousin, srandomdev(3).
Setting aside the problems with rand(3) in general, the problem with this
interface is that the seed isn't shared with the caller (other than by
attacking the output of the generator, which is trivial, but not a hallmark of
pleasant API design). The (arguable) utility of rand(3) or random(3) is as a
semi-fast simulation generator which produces consistent results from a given
seed. These are mutually at odd. Furthermore, sometimes people got the
mistaken impression that a high quality random seed meant a weak generator like
rand(3) or random(3) could be used for things like cryptographic key
generation. This is absolutely not so.
The API was never part of a standard and was not widely used in tree. Existing
in-tree uses have all been removed.
rlibby [Sat, 14 Dec 2019 05:21:56 +0000 (05:21 +0000)]
uma dbg: flexible size for slab debug bitset too
Recently (r355315) the size of the struct uma_slab bitset field us_free
became dynamic instead of conservative. Now, make the debug bitset
size dynamic too. The debug bitset is INVARIANTS-only, so in fact we
don't care too much about the space savings that results from this, but
enabling minimally-sized slabs on INVARIANTS builds is still important
in order to be able to test new slab layouts effectively.
scottl [Fri, 13 Dec 2019 23:46:59 +0000 (23:46 +0000)]
Add accessors for the Vendor Specific Extended Capability (VSEC)
Parse out the VSEC. If the user invokes a second -c command line option,
do a hex dump of the vendor data.
imp [Fri, 13 Dec 2019 22:32:05 +0000 (22:32 +0000)]
Better copyright advice
Document the common practices around copyrights with "all rights reserved" in
them as new copyright notices get added.
It's an open question qhether to point people at the fact that since the Berne
convention was ratified, All rights reserved is largely obsolete.
https://en.wikipedia.org/wiki/All_rights_reserved#Obsolescence has the
details. The committer's guide will be revised shortly, and it's likely that's a
better place for this discussion. If not, I'll add a blurb here.
avg [Fri, 13 Dec 2019 22:04:13 +0000 (22:04 +0000)]
zfs boot: fix a crash in a rarely taken path in fzap_lookup
Instead of passing NULL to fzap_name_equal and crashing, just return
ENOENT. This happened when higher bits of a hash of the searched key
(its hash prefix) matched a hash prefix of some key in the ZAP, but the
full hash value of the searched key did not match any key in the ZAP.
I observerved this problem when loader tried to look up
"features_for_read" in a particular old pool that predates pool
features.
np [Fri, 13 Dec 2019 20:38:58 +0000 (20:38 +0000)]
cxgbe(4): Use the _XT variant of the CPL used to transmit NIC traffic.
CPL_TX_PKT_XT disables the internal parser on the chip and instead
relies on the driver to provide the exact length of the L2 and L3
headers. This allows hw checksumming and TSO to be used with L2 and
L3 encapsulations that the chip doesn't understand directly.
Note that netmap tx still uses the old CPL as it never uses the hw
to generate the checksum on tx.
bdragon [Fri, 13 Dec 2019 20:30:26 +0000 (20:30 +0000)]
[PowerPC] Fully define gdtoa settings on powerpc64.
The settings in arith.h were not fully defined on powerpc64 after the gdtoa
switchover. Generate them using arithchk.c, similar to what AMD64 did for
r114814.
Technically, none of this is necessary in FreeBSD gdtoa, but since the other
platforms have full definitions, we might as well have full definitions
too.
Approved by: jhibbits (in irc)
Differential Revision: https://reviews.freebsd.org/D22775
imp [Fri, 13 Dec 2019 19:39:33 +0000 (19:39 +0000)]
Create new wrapper function: bus_delayed_attach_children()
Delay the attachment of children, when requested, until after interrutps are
running. This is often needed to allow children to run transactions on i2c or
spi busses. It's a common enough idiom that it will be useful to have its own
wrapper.
Reviewed by: ian
Differential Revision: https://reviews.freebsd.org/D21465
jhb [Fri, 13 Dec 2019 19:21:58 +0000 (19:21 +0000)]
Support software breakpoints in the debug server on Intel CPUs.
- Allow the userland hypervisor to intercept breakpoint exceptions
(BP#) in the guest. A new capability (VM_CAP_BPT_EXIT) is used to
enable this feature. These exceptions are reported to userland via
a new VM_EXITCODE_BPT that includes the length of the original
breakpoint instruction. If userland wishes to pass the exception
through to the guest, it must be explicitly re-injected via
vm_inject_exception().
- Export VMCS_ENTRY_INST_LENGTH as a VM_REG_GUEST_ENTRY_INST_LENGTH
pseudo-register. Injecting a BP# on Intel requires setting this to
the length of the breakpoint instruction. AMD SVM currently ignores
writes to this register (but reports success) and fails to read it.
- Rework the per-vCPU state tracked by the debug server. Rather than
a single 'stepping_vcpu' global, add a structure for each vCPU that
tracks state about that vCPU ('stepping', 'stepped', and
'hit_swbreak'). A global 'stopped_vcpu' tracks which vCPU is
currently reporting an event. Event handlers for MTRAP and
breakpoint exits loop until the associated event is reported to the
debugger.
Breakpoint events are discarded if the breakpoint is not present
when a vCPU resumes in the breakpoint handler to retry submitting
the breakpoint event.
- Maintain a linked-list of active breakpoints in response to the GDB
'Z0' and 'z0' packets.
imp [Fri, 13 Dec 2019 18:35:48 +0000 (18:35 +0000)]
Move to using bool instead of boolean_t
While there are subtle semantic differences between bool and boolean_t, none of
them matter in these cases. Prefer true/false when dealing with bool
type. Preserve a couple of TRUEs since they are passed into int args into CAM.
Preserve a couple of FALSEs when used for status.done, an int.
markj [Fri, 13 Dec 2019 18:28:01 +0000 (18:28 +0000)]
Restore the reservation of boot pages for bucket zones after r355707.
uma_startup2() sets booted = BOOT_BUCKETS after calling bucket_init(),
but before that assignment, startup_alloc() will use pages from the
reserved pool, so the bucket zones themselves are still allocated using
startup pages.
bdragon [Fri, 13 Dec 2019 18:18:14 +0000 (18:18 +0000)]
[PowerPC] Enable TLS usage in system libraries on ELFv2.
Currently, __NO_TLS is defined to 1 on powerpc64. TLS usage works much
better on ELFv2 due to the modern tooling, so take the opportunity to
reenable TLS on ELFv2.
If you are using a self-built ELFv2 environment on powerpc64, you will
have to run installworld twice due to RuneLocale changes. This is the only
known regression, and if you are using the ELFv2 isos, you likely already
have the updated libraries installed, as this change is part of the
patchset that the isos integrate.
(No UPDATING note about this because ELFv2 is still an unofficial build.)
tsoome [Fri, 13 Dec 2019 12:36:16 +0000 (12:36 +0000)]
loader: cd9660_open() warn: is 'buf' large enough for 'struct iso_primary_descriptor'?
We do allocate amount of memory (void * or char *), and then assign this
buffer to struct iso_primary_descriptor *vd. Make sure we do
allocate enough bytes.
In fact we do allocate enough, but it is good idea to make sure this really
is so.
ae [Fri, 13 Dec 2019 11:47:58 +0000 (11:47 +0000)]
Make TCP options parsing stricter.
Rework tcpopts_parse() to be more strict. Use const pointer. Add length
checks for specific TCP options. The main purpose of the change is
avoiding of possible out of mbuf's data access.
rlibby [Fri, 13 Dec 2019 10:34:19 +0000 (10:34 +0000)]
libmemstat: unbreak build
r355706 added an instance of offsetof() to the UMA private kernel header
file uma_int.h. Userspace memstat_uma.c includes that header, and
chokes on offsetof() because apparently the definition in sys/types.h is
ifdef _KERNEL. Now, include sys/stddef.h which has an identical
definition.
Pointyhat to: rlibby
Sponsored by: Dell EMC Isilon
rlibby [Fri, 13 Dec 2019 09:32:16 +0000 (09:32 +0000)]
bitset: rename confusing macro NAND to ANDNOT
s/BIT_NAND/BIT_ANDNOT/, and for CPU and DOMAINSET too. The actual
implementation is "and not" (or "but not"), i.e. A but not B.
Fortunately this does appear to be what all existing callers want.
Don't supply a NAND (not (A and B)) operation at this time.
Discussed with: jeff
Reviewed by: cem
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22791
rlibby [Fri, 13 Dec 2019 09:32:03 +0000 (09:32 +0000)]
uma: delay bucket_init() until we might actually enable buckets
This helps with a bootstrapping problem in upcoming work.
We don't first enable buckets until uma_startup2(), so we can delay
bucket creation until then. The other two paths to bucket_enable() are
both later, one in the pageout daemon (SI_SUB_KTHREAD_PAGE vs SI_SUB_VM)
and one in uma_timeout() (first activated in uma_startup3()). Note that
although some bucket functions are accessible before uma_startup2()
(e.g. bucket_select() in zone_ctor()), none of them inspect ubz_zone.
rlibby [Fri, 13 Dec 2019 09:31:59 +0000 (09:31 +0000)]
uma dbg: flexible size for slab debug bitset too
Recently (r355315) the size of the struct uma_slab bitset field us_free
became dynamic instead of conservative. Now, make the debug bitset
size dynamic too. The debug bitset is INVARIANTS-only, so in fact we
don't care too much about the space savings that results from this, but
enabling minimally-sized slabs on INVARIANTS builds is still important
in order to be able to test new slab layouts effectively.
trasz [Fri, 13 Dec 2019 09:28:44 +0000 (09:28 +0000)]
Add kern.geom.part.separator tunable. This makes it possible
to specify an optional separator to insert before partition name;
eg if it's set to "c/", you'll get "ada0c/s1" instead of "ada0s1".
(It cannot be set to just “/“, since ada0 is a device node, not
a directory.)
cem [Fri, 13 Dec 2019 05:42:57 +0000 (05:42 +0000)]
libtelnet: Replace bogus use of srandomdev + random to generate "public key pair"
I'm pretty skeptical that any crypto in telnet is worth using, but if we're
ostensibly generating keys, arc4random is strictly better than the previous
construct.
cem [Fri, 13 Dec 2019 05:11:34 +0000 (05:11 +0000)]
libtacplus: Remove bogus srandomdev+random
Replace with arc4random.
TACAS+ is a 1993 Cisco extension to the 1984 TACAS. Is this something we want
in base still? The directory has been substantively unmaintained since 2002,
at least.
cem [Fri, 13 Dec 2019 04:48:20 +0000 (04:48 +0000)]
kern/subr_unit: Rip srandomdev, random(3) out of dead code
The simulation cannot be reproduced, so the value of using a deterministic PRNG
like random(3) is dubious. The number of repitions used in the sample isn't a
problem for the Chacha implementation of arc4random we have today. (Also, no
one actually runs this code; it was provided as an example of the work the
author did validating the implementation. It's not even test code.)
cem [Fri, 13 Dec 2019 04:37:39 +0000 (04:37 +0000)]
random(6): produce random results
This program is trash and there's no reason to keep it in base. But as long as
we're shipping a silly program named 'random', let's actually make it random.
cem [Fri, 13 Dec 2019 04:03:05 +0000 (04:03 +0000)]
keyserv(8): unifdef out __FreeBSD__ and KEYSERV_RANDOM
This doesn't appear to have some active upstream (and it's a steaming pile of
bad 90s crypto design). Rip out the completely horrible bits and leave the
only mildly less horrible bits. The whole thing should probably be deleted; to
the extent it purports to provide a security feature: it doesn't.
ian [Fri, 13 Dec 2019 02:20:26 +0000 (02:20 +0000)]
If device_delete_children() returns an error, bail on the rest of the
detach work and return the error. Especially don't call iicbus_reset()
since the most likely cause of failing to detach children is that one
of them has IO in progress.
rmacklem [Fri, 13 Dec 2019 00:45:14 +0000 (00:45 +0000)]
Fix the build for MAC not defined and a couple of might not be initialized.
r355677 broke the build for the not MAC defined case and a couple of
might not be initialized warnings were generated for riscv. Others seem
to be erroneous.
Hopefully there won't be too many more build errors.
rmacklem [Fri, 13 Dec 2019 00:14:12 +0000 (00:14 +0000)]
r355677 requires that vop_stdioctl() be global so it can be called from NFS.
r355677 modified the NFS client so that it does lseek(SEEK_DATA/SEEK_HOLE)
for NFSv4.2, but calls vop_stdioctl() otherwise. As such, vop_stdioctl()
needs to be a global function.
rmacklem [Thu, 12 Dec 2019 23:37:04 +0000 (23:37 +0000)]
Bump __FreeBSD_version since r355677 changes the internal interface
between the NFS modules such that they all need to be upgraded to
post r355677 simultaneously.
rmacklem [Thu, 12 Dec 2019 23:22:55 +0000 (23:22 +0000)]
Add support for NFSv4.2 to the NFS client and server.
This patch adds support for NFSv4.2 (RFC-7862) and Extended Attributes
(RFC-8276) to the NFS client and server.
NFSv4.2 is comprised of several optional features that can be supported
in addition to NFSv4.1. This patch adds the following optional features:
- posix_fadvise(POSIX_FADV_WILLNEED/POSIX_FADV_DONTNEED)
- posix_fallocate()
- intra server file range copying via the copy_file_range(2) syscall
--> Avoiding data tranfer over the wire to/from the NFS client.
- lseek(SEEK_DATA/SEEK_HOLE)
- Extended attribute syscalls for "user" namespace attributes as defined
by RFC-8276.
Although this patch is fairly large, it should not affect support for
the other versions of NFS. However it does add two new sysctls that allow
a sysadmin to limit which minor versions of NFSv4 a server supports, allowing
a sysadmin to disable NFSv4.2.
Unfortunately, when the NFS stats structure was last revised, it was assumed
that there would be no additional operations added beyond what was
specified in RFC-7862. However RFC-8276 did add additional operations,
forcing the NFS stats structure to revised again. It now has extra unused
entries in all arrays, so that future extensions to NFSv4.2 can be
accomodated without revising this structure again.
A future commit will update nfsstat(1) to report counts for the new NFSv4.2
specific operations/procedures.
This patch affects the internal interface between the nfscommon, nfscl and
nfsd modules and, as such, they all must be upgraded simultaneously.
I will do a version bump (although arguably not needed), due to this.
This code has survived a "make universe" but has not been built with a
recent GCC. If you encounter build problems, please email me.
markj [Thu, 12 Dec 2019 21:13:20 +0000 (21:13 +0000)]
Implement atomic state updates using the new vm_page_astate_t structure.
Introduce primitives vm_page_astate_load() and vm_page_astate_fcmpset()
to operate on the 32-bit per-page atomic state. Modify
vm_page_pqstate_fcmpset() to use them. No functional change intended.
Introduce PGA_QUEUE_OP_MASK, a subset of PGA_QUEUE_STATE_MASK that only
includes queue operation flags. This will be used in subsequent
patches.
emaste [Thu, 12 Dec 2019 20:55:43 +0000 (20:55 +0000)]
libpmc: add MIT SPDX tag to header file
The jevents tool includes a copy of the jsmn json parser which is MIT
licensed. Upstream the MIT license appears in the jsmn.c source and a
standalone LICENSE file, but the latter is not included in the copy
contained in libpmc and the jsmn.h header carried no license information.
Add an SPDX tag to clarify the situation.
cy [Thu, 12 Dec 2019 20:44:49 +0000 (20:44 +0000)]
Rather than pass the address of the packet information control block to
ipf_pcksum6(), directly pass the adddress of the mbuf to it. This reduces
one pointer dereference. ipf_pcksum6() doesn't use the packet information
control block except to obtain the mbuf address.
bdragon [Thu, 12 Dec 2019 17:40:32 +0000 (17:40 +0000)]
rtld: do not try to mmap a zero-sized PT_LOAD
When a PT_LOAD segment has a zero p_filesz, skip the data mmap, as mmapping
zero bytes from a file is an error.
A PT_LOAD with zero p_filesz is legal (but somewhat uncommon due to segment
merging in modern linkers, as it is more efficient to merge .data and .bss
by just extending p_memsz in the previous segment, assuming compatible
page protection.)
This was seen on ports/graphics/glew on a powerpc64 ELFv2 experimental
build.
bdragon [Thu, 12 Dec 2019 16:49:55 +0000 (16:49 +0000)]
[PowerPC] Fix powerpc 32 bit build in mmu_oea64.c
Due to ppc32 building mmu_oea64.c (for use when in bridge mode on a G5), we
need to guard the new moea64_page_array_startup code behind __powerpc64__
to avoid a compile error, since vm_offset_t is not 64-bit on ppc32.
ae [Thu, 12 Dec 2019 13:28:46 +0000 (13:28 +0000)]
Follow RFC 4443 p2.2 and always use own addresses for reflected ICMPv6
datagrams.
Previously destination address from original datagram was used. That
looked confusing, especially in the traceroute6 output.
Also honor IPSTEALTH kernel option and do TTL/HLIM decrementing only
when stealth mode is disabled.
Reported by: Marco van Tol <marco at tols org>
Reviewed by: melifaro
MFC after: 2 weeks
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D22631
cem [Thu, 12 Dec 2019 04:44:09 +0000 (04:44 +0000)]
arm: libgcc_s: Fix ABI breakage introduced in r354347
Provide the symbol version for llvm-libunwind's _Unwind_Backtrace that libgcc
has historically provided on arm, in addition to the (default) standard version
used on all other arch.
markj [Thu, 12 Dec 2019 02:43:24 +0000 (02:43 +0000)]
Rename tdq_ipipending and clear it in sched_switch().
This fixes a regression after r355311. Specifically, sched_preempt()
may trigger a context switch by calling thread_lock(), since
thread_lock() calls critical_exit() in its slow path and the interrupted
thread may have already been marked for preemption. This would happen
before tdq_ipipending is cleared, blocking further preemption IPIs. The
CPU can be left in this state indefinitely if the interrupted thread
migrates.
Rename tdq_ipipending to tdq_owepreempt. Any switch satisfies a remote
preemption request, so clear tdq_owepreempt in sched_switch() instead of
sched_preempt() to avoid subtle problems of the sort described above.
Reviewed by: jeff, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22758
emaste [Thu, 12 Dec 2019 02:18:18 +0000 (02:18 +0000)]
ObsoleteFiles.inc: remove stale comment
A comment at the top of the file claimed that the file was grouped into
OLD_FILES, OLD_LIBS, then OLD_DIRS, but that hasn't been the case since
the mid-2000s. Delete the stale comment, add a new comment for the
historical split entries, and move the one more recent entry (from 2013)
to group it into a single logical change.
kevans [Thu, 12 Dec 2019 01:41:55 +0000 (01:41 +0000)]
Add sigsetop extensions commonly found in musl libc and glibc
These functions (sigandset, sigisemptyset, sigorset) are commonly available
in at least musl libc and glibc; sigorset, at least, has proven quite useful
in qemu-bsd-user work for tracking the current process signal mask in a more
self-documenting/aesthetically pleasing manner.
kevans [Thu, 12 Dec 2019 01:35:56 +0000 (01:35 +0000)]
stand: liblua: drop default buffer size to 128
Lua allocates LUAL_BUFFERSIZE buffers on the stack for various string
functions (string.format, string.gsub) -- this works out to be somewhat
significant and not necessary, based on how we use string operations.
Dropping it risks having to allocate per call to format/gsub, but this is
not the case for our usage. This simply stops allocating 8K buffers on the
stack when luaL_Buffer is used.
kevans [Thu, 12 Dec 2019 01:33:45 +0000 (01:33 +0000)]
usr.sbin/ntp: don't emit versions w/ make -s
<sys.mk> defines ECHO=echo when not using make -s, and ECHO=true when using
make -s.
export ECHO for ntp products and use it in the mkver script to echo the
version. This suppresses the output as appropriate. ECHO is given a default
value to make sure things still work as expected for anyone that isn't
redefining ECHO.