kib [Wed, 29 May 2019 14:05:27 +0000 (14:05 +0000)]
Do not go into sleep in sleepq_catch_signals() when SIGSTOP from
PT_ATTACH was consumed.
In particular, do not clear TDP_FSTP in ptracestop() if td_wchan is
non-NULL. Leave it to sleepq_catch_signal() to clear and convert zero
return code to EINTR.
Otherwise, per submitter report, if the PT_ATTACH SIGSTOP was
delivered right after the thread was added to the sleepqueue but not
yet really sleep, and cursig() caused debugger attach, the thread
sleeps instead of returning to the userspace boundary with EINTR.
avg [Wed, 29 May 2019 09:08:20 +0000 (09:08 +0000)]
revert r273728 and parts of r306589, iicbus no-stop by default feature
Since drm2 removal, there has not been any consumer of the feature in the
tree. I am also unaware of any out-of-tree consumer.
More importantly, the feature has been broken from the very start, both
before and after r306589, because the ivar was set on a device that does
not support it and it was read from another device that also does not
support it.
A bus-wide no-stop flag cannot be implemented as an ivar as iicbus
attaches as a child of various drivers. Implementing the ivar in each
and every I2C driver is just impractical.
If we ever want to implement this feature properly, then probably the
easiest way to do it would be via a flag in the softc of iicbus.
In fact, we might have to do that in the stable branches if we want to
fix the code for them.
Reported by: ian (long time ago)
MFC after: 1 month (maybe)
X-MFC-note: cannot just merge the change, must keep drm2 happy
jhibbits [Wed, 29 May 2019 02:02:56 +0000 (02:02 +0000)]
Add missing powerpc64 relocation support to libdwarf
Summary:
Due to missing relocation support in libdwarf for powerpc64, handling of dwarf
info on unlinked objects was bogus.
Examining raw dwarf data on objects compiled on ppc64 with a modern compiler
(in-tree gcc tends to hide the issue, since it only rarely generates relocations
in .debug_info and uses DW_FORM_str instead of DW_FORM_strp for everything), you
will find that the dwarf data appears corrupt, with repeated references to the
compiler version where things like types and function names should appear.
This happens because the 0 offset of .debug_str contains the compiler version,
and without applying the relocations, *all* indirect strings in .dwarf_info will
end up pointing to it.
This corruption then propogates to the CTF data, as ctfconvert relies on
libdwarf to read the dwarf info, for every compiled object (when building a
kernel.)
However, if you examine the dwarf data on a compiled executable, it will appear
correct, because during final link the relocations get applied and baked in by
the linker.
kevans [Wed, 29 May 2019 01:08:30 +0000 (01:08 +0000)]
if_bridge(4): Complete bpf auditing of local traffic over the bridge
There were two remaining "gaps" in auditing local bridge traffic with
bpf(4):
Locally originated outbound traffic from a member interface is invisible to
the bridge's bpf(4) interface. Inbound traffic locally destined to a member
interface is invisible to the member's bpf(4) interface -- this traffic has
no chance after bridge_input to otherwise pass it over, and it wasn't
originally received on this interface.
I call these "gaps" because they don't affect conventional bridge setups.
Alas, being able to establish an audit trail of all locally destined traffic
for setups that can function like this is useful in some scenarios.
johalun [Tue, 28 May 2019 20:54:59 +0000 (20:54 +0000)]
pseudofs: Ignore unsupported commands in vop_setattr.
Users of pseudofs (e.g. lindebugfs), should be able to receive
input from command line via commands like "echo 1 > /path/to/file".
Currently this fails because sh tries to truncate the file first and
vop_setattr returns not supported error for this. This patch simply
ignores the error and returns 0 instead.
mav [Tue, 28 May 2019 18:32:04 +0000 (18:32 +0000)]
Fix array out of bound panic introduced in r306219.
As I see, different NICs in different configurations may have different
numbers of TX and RX queues. The code was assuming 1:1 mapping between
event queues (interrupts) and TX/RX queues. Since number of interrupts
is set to maximum of TX and RX queues, when those two are different, the
system is doomed.
I have no documentation or deep knowledge about this hardware, so this
change is based on general observations and code reading. If some of my
guesses are wrong, please do better. I just confirmed HP NC550SFP NICs
are working now.
kevans [Tue, 28 May 2019 16:12:16 +0000 (16:12 +0000)]
bectl(8): Address Coverity complaints
CID 1400451: case 0 is missing a break/return and falling through to the
default case. waitpid(0, ...) makes little sense in the child, we likely
wanted to terminate immediately.
CID 1400453: size argument uses sizeof(char **) instead of sizeof(char *)
and is assigned to a char **; sizeof's match but "this isn't a portable
assumption".
dougm [Tue, 28 May 2019 15:47:00 +0000 (15:47 +0000)]
Implement the ffs and fls functions, and their longer counterparts, in
cpufunc, in terms of __builtin_ffs and the like, for arm32 v6 and v7
architectures, and use those, rather than the simple libkern
implementations, in building arm32 kernels.
Reviewed by: manu
Approved by: kib, markj (mentors)
Tested by: iz-rpi03_hs-karlsruhe.de, mikael.urankar_gmail.com, ian
Differential Revision: https://reviews.freebsd.org/D20412
ae [Tue, 28 May 2019 11:45:00 +0000 (11:45 +0000)]
Rework r348303 to reduce the time of holding global BPF lock.
It appeared that using NET_EPOCH_WAIT() while holding global BPF lock
can lead to another panic:
spin lock 0xfffff800183c9840 (turnstile lock) held by 0xfffff80018e2c5a0 (tid 100325) too long
panic: spin lock held too long
...
#0 sched_switch (td=0xfffff80018e2c5a0, newtd=0xfffff8000389e000, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2133
#1 0xffffffff80bf9912 in mi_switch (flags=256, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:439
#2 0xffffffff80c21db7 in sched_bind (td=<optimized out>, cpu=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2704
#3 0xffffffff80c34c33 in epoch_block_handler_preempt (global=<optimized out>, cr=0xfffffe00005a1a00, arg=<optimized out>)
at /usr/src/sys/kern/subr_epoch.c:394
#4 0xffffffff803c741b in epoch_block (global=<optimized out>, cr=<optimized out>, cb=<optimized out>, ct=<optimized out>)
at /usr/src/sys/contrib/ck/src/ck_epoch.c:416
#5 ck_epoch_synchronize_wait (global=0xfffff8000380cd80, cb=<optimized out>, ct=<optimized out>) at /usr/src/sys/contrib/ck/src/ck_epoch.c:465
#6 0xffffffff80c3475e in epoch_wait_preempt (epoch=0xfffff8000380cd80) at /usr/src/sys/kern/subr_epoch.c:513
#7 0xffffffff80ce970b in bpf_detachd_locked (d=0xfffff801d309cc00, detached_ifp=<optimized out>) at /usr/src/sys/net/bpf.c:856
#8 0xffffffff80ced166 in bpf_detachd (d=<optimized out>) at /usr/src/sys/net/bpf.c:836
#9 bpf_dtor (data=0xfffff801d309cc00) at /usr/src/sys/net/bpf.c:914
To fix this add the check to the catchpacket() that BPF descriptor was
not detached just before we acquired BPFD_LOCK().
andrew [Tue, 28 May 2019 10:55:59 +0000 (10:55 +0000)]
The alignment is passed into contigmalloc_domainset in the 7th argument.
KUBSAN was complaining the pointer contigmalloc_domainset returned was
misaligned. Fix this by using the correct argument to find the alignment
in the function signature.
cem [Mon, 27 May 2019 17:33:20 +0000 (17:33 +0000)]
kldxref(8): Sort MDT_MODULE info first in linker.hints output
MDT_MODULE info is required to be ordered before any other MDT metadata for
a given kld because it serves as an implicit record boundary between
distinct klds for linker.hints consumers. kldxref(8) has previously relied
on the assumption that MDT_MODULE was ordered relative to other module
metadata in kld objects by source code ordering.
However, C does not require implementations to emit file scope objects in
any particular order, and it seems that GCC 6.4.0 and/or binutils 2.32 ld
may reorder emitted objects with respect to source code ordering.
So: just take two passes over a given .ko's module metadata, scanning for
the MDT_MODULE on the first pass and the other metadata on subsequent
passes. It's not super expensive and not exactly a performance-critical
piece of code. This ensures MDT_MODULE is always ordered before
MDT_PNP_INFO and other MDTs, regardless of compiler/linker movement. As a
fringe benefit, it removes the requirement that care be taken to always
order MODULE_PNP_INFO after DRIVER_MODULE in source code.
kib [Mon, 27 May 2019 15:21:26 +0000 (15:21 +0000)]
Correct some inconsistencies in the earliest created kernel page
tables which affect demotion.
The last last-level page table under 2M mappings below KERNend was
only partially initialized. When that page was used as the hardware
page table for demotion of the 2M mapping, the result was not
consistent. Since pmap_demote_pde() is switched to use PG_PROMOTED as
the test for the validity of the saved last level page table page, we
can keep page table pages zero-initialized instead. Demotion would
fill them as needed.
Only map the created page tables beyond KERNend, there is no need to
pre-promote PTmap after KERNend, because the extra mapping is not used.
Only round up *firstaddr to 2M boundary when it is below rounded
KERNend. Sometimes the allocpages() calls advance *firstaddr past the
end of the last 2MB page mapping. In that case, this conditional
avoids wasting an average of 1MB of physical memory.
Update comments to explain action in more clean and direct language.
Reported and tested by: pho
In collaboration with: alc
Sponsored by: The FreeBSD Foundation (kib)
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D20380
ae [Mon, 27 May 2019 12:41:41 +0000 (12:41 +0000)]
Fix possible NULL pointer dereference.
bpf_mtap() can invoke catchpacket() for already detached descriptor.
And this can lead to NULL pointer dereference, since bd_bif pointer
was reset to NULL in bpf_detachd_locked(). To avoid this, use
NET_EPOCH_WAIT() when descriptor is removed from interface's descriptors
list. After the wait it is safe to modify descriptor's content.
mckusick [Mon, 27 May 2019 06:22:43 +0000 (06:22 +0000)]
Add function name and line number debugging information to softupdates
worklist structures to help track their movement between work lists.
No functional change to the operation of soft updates intended.
jhibbits [Mon, 27 May 2019 03:18:56 +0000 (03:18 +0000)]
powerpc/dtrace: Fix fbt function probing for ELFv2
'.' function names exist only in ELFv1. ELFv2 does away with function
descriptors, and look more like they do on powerpc(32) and most other
platforms, as direct function pointers. Stop blacklisting regular function
names in ELFv2.
jchandra [Sun, 26 May 2019 23:04:21 +0000 (23:04 +0000)]
arm64 nexus: remove incorrect warning
acpi_config_intr() will be called when an arm64 system booted with ACPI.
We do the interrupt mapping for ACPI interrupts in nexus_acpi_map_intr()
on arm64, so acpi_config_intr() has to just return success without
printing this error message.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D19432
tuexen [Sun, 26 May 2019 17:18:14 +0000 (17:18 +0000)]
When an ACK segment as the third message of the three way handshake is
received and support for time stamps was negotiated in the SYN/SYNACK
exchange, perform the PAWS check and only expand the syn cache entry if
the check is passed.
Without this check, endpoints may get stuck on the incomplete queue.
Reviewed by: jtl@
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D20374
dim [Sun, 26 May 2019 15:44:58 +0000 (15:44 +0000)]
Pull in r361696 from upstream llvm trunk (by Sanjay Patel):
[SelectionDAG] soften assertion when legalizing narrow vector FP ops
The test based on PR42010:
https://bugs.llvm.org/show_bug.cgi?id=42010
...may show an inaccuracy for PPC's target defs, but we should not be
so aggressive with an assert here. There's no telling what
out-of-tree targets look like.
This fixes an assertion when building the graphics/mesa-dri port for
PowerPC64.
Reported by: Mark Millard <marklmi26-fbsd@yahoo.com>
PR: 238082
MFC after: 3 days
sef [Sat, 25 May 2019 07:26:30 +0000 (07:26 +0000)]
Add an AESNI-optimized version of the CCM/CBC cryptographic and authentication
code. The primary client of this is probably going to be ZFS encryption.
cem [Sat, 25 May 2019 01:59:24 +0000 (01:59 +0000)]
virtio_pci(4): Fix typo in read_ivar method
Prior to this revision, vtpci's BUS_READ_IVAR method on VIRTIO_IVAR_SUBVENDOR
accidentally returned the PCI subdevice.
The typo seems to have been introduced with the original commit adding
VIRTIO_IVAR_{{SUB,}DEVICE,{SUB,}VENDOR} to virtio_pci. The commit log and code
strongly suggest that the ivar was intended to return the subvendor rather than
the subdevice; it was likely just a copy/paste mistake.
mckusick [Sat, 25 May 2019 00:07:49 +0000 (00:07 +0000)]
When using the destroy option to shut down a nop GEOM module, I/O
operations already in its queue were not being properly drained.
The GEOM framework does the queue draining, but the module needs
to wait for the draining to happen. The waiting is done by adding
a g_nop_providergone() function to wait for the I/O operations to
finish up. This change is similar to change -r345758 made to the
memory-disk driver.
kib [Fri, 24 May 2019 23:28:11 +0000 (23:28 +0000)]
Fix too loose assert in pmap_large_unmap().
The upper bound for the valid address from the large map used
LARGEMAP_MAX_ADDRESS instead of LARGEMAP_MIN_ADDRESS. Provide a
function-like macro for proper upper value.
Noted by: markj
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D20386
cem [Fri, 24 May 2019 22:33:14 +0000 (22:33 +0000)]
Disable intr_storm_threshold mechanism by default
The ixl.4 manual page has documented that the threshold falsely detects
interrupt storms on 40Gbit NICs as long ago as 2015, and we have seen
similar false positives with the ioat(4) DMA device (which can push GB/s).
For example, synthetic load can be generated with tools/tools/ioat
'ioatcontrol 0 200 8192 1 1000' (allocate 200x8kB buffers, generate an
interrupt for each one, and do this for 1000 milliseconds). With
storm-detection disabled, the Broadwell-EP version of this device is capable
of generating ~350k real interrupts per second.
The following historical context comes from jhb@: Originally, the threshold
worked around incorrect routing of PCI INTx interrupts on single-CPU systems
which would end up in a hard hang during boot. Since the threshold was
added, our PCI interrupt routing was improved, most PCI interrupts use
edge-triggered MSI instead of level-triggered INTx, and typical systems have
multiple CPUs available to service interrupts.
On the off chance that the threshold is useful in the future, it remains
available as a tunable and sysctl.
jhb [Fri, 24 May 2019 22:30:40 +0000 (22:30 +0000)]
Restructure mbuf send tags to provide stronger guarantees.
- Perform ifp mismatch checks (to determine if a send tag is allocated
for a different ifp than the one the packet is being output on), in
ip_output() and ip6_output(). This avoids sending packets with send
tags to ifnet drivers that don't support send tags.
Since we are now checking for ifp mismatches before invoking
if_output, we can now try to allocate a new tag before invoking
if_output sending the original packet on the new tag if allocation
succeeds.
To avoid code duplication for the fragment and unfragmented cases,
add ip_output_send() and ip6_output_send() as wrappers around
if_output and nd6_output_ifp, respectively. All of the logic for
setting send tags and dealing with send tag-related errors is done
in these wrapper functions.
For pseudo interfaces that wrap other network interfaces (vlan and
lagg), wrapper send tags are now allocated so that ip*_output see
the wrapper ifp as the ifp in the send tag. The if_transmit
routines rewrite the send tags after performing an ifp mismatch
check. If an ifp mismatch is detected, the transmit routines fail
with EAGAIN.
- To provide clearer life cycle management of send tags, especially
in the presence of vlan and lagg wrapper tags, add a reference count
to send tags managed via m_snd_tag_ref() and m_snd_tag_rele().
Provide a helper function (m_snd_tag_init()) for use by drivers
supporting send tags. m_snd_tag_init() takes care of the if_ref
on the ifp meaning that code alloating send tags via if_snd_tag_alloc
no longer has to manage that manually. Similarly, m_snd_tag_rele
drops the refcount on the ifp after invoking if_snd_tag_free when
the last reference to a send tag is dropped.
This also closes use after free races if there are pending packets in
driver tx rings after the socket is closed (e.g. from tcpdrop).
In order for m_free to work reliably, add a new CSUM_SND_TAG flag in
csum_flags to indicate 'snd_tag' is set (rather than 'rcvif').
Drivers now also check this flag instead of checking snd_tag against
NULL. This avoids false positive matches when a forwarded packet
has a non-NULL rcvif that was treated as a send tag.
- cxgbe was relying on snd_tag_free being called when the inp was
detached so that it could kick the firmware to flush any pending
work on the flow. This is because the driver doesn't require ACK
messages from the firmware for every request, but instead does a
kind of manual interrupt coalescing by only setting a flag to
request a completion on a subset of requests. If all of the
in-flight requests don't have the flag when the tag is detached from
the inp, the flow might never return the credits. The current
snd_tag_free command issues a flush command to force the credits to
return. However, the credit return is what also frees the mbufs,
and since those mbufs now hold references on the tag, this meant
that snd_tag_free would never be called.
To fix, explicitly drop the mbuf's reference on the snd tag when the
mbuf is queued in the firmware work queue. This means that once the
inp's reference on the tag goes away and all in-flight mbufs have
been queued to the firmware, tag's refcount will drop to zero and
snd_tag_free will kick in and send the flush request. Note that we
need to avoid doing this in the middle of ethofld_tx(), so the
driver grabs a temporary reference on the tag around that loop to
defer the free to the end of the function in case it sends the last
mbuf to the queue after the inp has dropped its reference on the
tag.
- mlx5 preallocates send tags and was using the ifp pointer even when
the send tag wasn't in use. Explicitly use the ifp from other data
structures instead.
- Sprinkle some assertions in various places to assert that received
packets don't have a send tag, and that other places that overwrite
rcvif (e.g. 802.11 transmit) don't clobber a send tag pointer.
asomers [Fri, 24 May 2019 20:27:50 +0000 (20:27 +0000)]
Remove "struct ucred*" argument from vtruncbuf
vtruncbuf takes a "struct ucred*" argument. AFAICT, it's been unused ever
since that function was first added in r34611. Remove it. Also, remove some
"struct ucred" arguments from fuse and nfs functions that were only used by
vtruncbuf.
Reviewed by: cem
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20377
luporl [Fri, 24 May 2019 18:41:31 +0000 (18:41 +0000)]
Make options MD_ROOT_MEM default on PPC64
Having this option enabled by default on PowerPC64 kernels makes
booting ISO images much easier when on PowerNV.
With it, the ISO may simply be given to the -i flag of kexec.
Better yet, the ISO may be loop mounted on PetitBoot and its
kernel may be used to load itself.
Without this option, booting ISOs on remote PPC64 machines usually
involve preparing a separate kernel, with this option enabled.
ken [Fri, 24 May 2019 17:58:29 +0000 (17:58 +0000)]
Fix FC-Tape bugs caused in part by r345008.
The point of r345008 was to reset the Command Reference Number (CRN)
in some situations where a device stayed in the topology, but had
changed somehow.
This can include moving from a switch connection to a direct
connection or vice versa, or a device that temporarily goes away
and comes back. (e.g. moving to a different switch port)
There were a couple of bugs in that change:
- We were reporting that a device had not changed whenever the
Establish Image Pair bit was not set. That is not quite correct.
Instead, if the Establish Image Pair bit stays the same (set or
not), the device hasn't changed in that way.
- We weren't setting PRLI Word0 in the port database when a new
device arrived, so comparisons with the old value for the
Establish Image Pair bit weren't really possible. So, make sure
PRLI Word0 is set in the port database for new devices.
- We were resetting the CRN whenever the Establish Image Pair bit
was set for a device, even when the device had stayed the same
and the value of the bit hadn't changed. Now, only reset the
CRN for devices that have changed, not devices that sayed the
same.
The result of all of this was that if we had a single FC device on
an FC port and it went away and came back, we would wind up
correctly resetting the CRN.
But, if we had multiple devices connected via a switch, and there
was any change in one or more of those devices, all of the devices
that stayed the same would also have their CRN values reset.
The result, from a user standpoint, is that the tape drives, etc.
would all start to time out commands and the initiator would send
aborts.
sys/dev/isp/isp.c:
In isp_pdb_add_update(), look at whether the Establish
Image Pair bit has changed as part of the check to
determine whether a device is still the same. This was
causing erroneous change notifications. Also, when
creating a new port database entry, initialize the
PRLI Word 0 values.
sys/dev/isp/isp_freebsd.c:
In isp_async(), in the changed/stayed case, instead of
looking at the Establish Image Pair bit to determine
whether to reset the CRN, look at the command value.
(Changed vs. Stayed.) Only reset the CRN for devices
that have changed.
kib [Fri, 24 May 2019 17:19:06 +0000 (17:19 +0000)]
Fix a corner case in demotion of kernel mappings.
It is possible for the kernel mapping to be created with superpage by
directly installing pde using pmap_enter_2mpage() without filling the
corresponding page table page. This can happen e.g. if the range is
already backed by reservation and vm_fault_soft_fast() conditions are
satisfied, which was observed on the pipe_map.
In this case, demotion must fill the page obtained from the pmap
radix, same as if the page is newly allocated. Use PG_PROMOTED bit as
an indicator that the page is valid, instead of the wire count of the
page table page.
Since the PG_PROMOTED bit is set on pde when we leave TLB entries for
4k pages around, which in particular means that the ptes were filled,
it provides more correct indicator. Note that pmap_protect_pde()
clears PG_PROMOTED, which handles the case when protection was changed
on the superpage without adjusting ptes.
Reported by: pho
In collaboration with: alc
Tested by: alc, pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D20380
markj [Fri, 24 May 2019 15:45:43 +0000 (15:45 +0000)]
Modernize the MAKE_JUST_KERNELS hint in the top-level makefile.
It doesn't make sense to limit to -j12 anymore, build scalability
is better than it used to be. Fold the hint into the description
of the universe target.
ae [Fri, 24 May 2019 11:06:24 +0000 (11:06 +0000)]
Add `missing` and `or-flush` options to "ipfw table <NAME> create"
command to simplify firewall reloading.
The `missing` option suppresses EEXIST error code, but does check that
existing table has the same parameters as new one. The `or-flush` option
implies `missing` option and additionally does flush for table if it
is already exist.
Submitted by: lev
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D18339
delphij [Fri, 24 May 2019 02:44:15 +0000 (02:44 +0000)]
cryptodeflate: Drop z_stream zbuf.state->dummy from SDT probe.
For older versions of zlib, dummy was a workaround for compilers that do not
handle opaque type definition well; on FreeBSD, it's representing a value
that is not really useful for monitoring purposes, and the field would be gone
in newer zlib versions.
PR: 229763
Submitted by: Yoshihiro Ota <ota at j.email.ne.jp>
Differential Revision: https://reviews.freebsd.org/D20222
kevans [Fri, 24 May 2019 01:53:45 +0000 (01:53 +0000)]
bectl(8): Add a test for jail/unjail of numeric BE names
Fixed by r348215, bectl ujail first attempts the trivial fetch of a jid by
passing the first argument to 'ujail' to jail_getid(3) in case a jid/name
have been passed in instead of a BE name. For numerically named BEs, this
was doing the wrong thing: instead of failing to locate the jid specified
and falling back to mountpath search, jail_getid(3) would return the input
as-is.
While here, I've fixed bectl_jail_cleanup which still used a hard-coded pool
name that was overlooked w.r.t. other work that was in-flight around the
same time.
kevans [Fri, 24 May 2019 01:28:07 +0000 (01:28 +0000)]
jail_getid(3): validate jid string input
Currently, if jail_getid(3) is passed in a numeric string, it assumes that
this is a jid string and passes it back converted to an int without checking
that it's a valid/existing jid. This breaks consumers that might use
jail_getid(3) to see if it can trivially grab a jid from a name if that name
happens to be numeric but not actually the name/jid of the jail. Instead of
returning -1 for the jail not existing, it'll return the int version of the
input and the consumer will not fallback to trying other methods.
Pass the numeric input to jail_get(2) as the jid for validation, rather than
the name. This works well- the kernel enforces that jid=name if name is
numeric, so doing the safe thing and checking numeric input as a jid will
still DTRT based on the description of jail_getid.
jhb [Fri, 24 May 2019 00:34:13 +0000 (00:34 +0000)]
Add support for writing to guest memory in the debug server.
- Add a write_mem counterpart to read_mem to handle writes to MMIO.
- Add support for the GDB 'M' packet to write bytes to the guest's
memory. For MMIO writes, attempt to batch writes up into words.
This is imprecise, but if you write a single 2 or 4-byte aligned
word, it should be treated as a single MMIO write operation.
- While here, tidy up the parsing of the 'm' command used for reading
memory to match 'M'.
jhb [Thu, 23 May 2019 22:31:55 +0000 (22:31 +0000)]
Add deprecation warnings for weaker algorithms to geli(4).
- Triple DES has been formally deprecated in Kerberos (RFC 8429)
and is soon to be deprecated in IPsec (RFC 8221).
- Blowfish is deprecated. FreeBSD doesn't support its successor
(Twofish).
- MD5 is generally considered a weak digest that has known attacks.
geli refuses to create new volumes using these algorithms via 'geli
init'. It also warns when attaching to existing volumes or creating
temporary volumes via 'geli onetime' . The plan is to fully remove
support for these algorithms in FreeBSD 13.
Note that none of these algorithms have ever been the default
algorithm used by geli(8). Users would have had to explicitly select
these algorithms when creating volumes in the past.
jhb [Thu, 23 May 2019 22:06:57 +0000 (22:06 +0000)]
Add deprecation warnings for IPsec algorithms deprecated in RFC 8221.
All of these algorithms are either explicitly marked MUST NOT, or they
are implicitly MUST NOTs by virtue of not being included in IETF's
list of protocols at all despite having assignments from IANA.
Specifically, this adds warnings for the following ciphers:
- des-cbc
- blowfish-cbc
- cast128-cbc
- des-deriv
- des-32iv
- camellia-cbc
Warnings for the following authentication algorithms are also added:
- hmac-md5
- keyed-md5
- keyed-sha1
- hmac-ripemd160
cem [Thu, 23 May 2019 21:02:27 +0000 (21:02 +0000)]
random(4): deduplicate explicit_bzero() in harvest
Pull the responsibility for zeroing events, which is general to any
conceivable implementation of a random device algorithm, out of the
algorithm-specific Fortuna code and into the callers. Most callers
indirect through random_fortuna_process_event(), so add the logic there.
Most callers already explicitly bzeroed the events they provided, so the
logic in Fortuna was mostly redundant.
Add one missing bzero in randomdev_accumulate(). Also, remove a redundant
bzero in the same function -- randomdev_hash_finish() is obliged to bzero
the hash state.
cem [Thu, 23 May 2019 20:12:24 +0000 (20:12 +0000)]
EKCD: Add Chacha20 encryption mode
Add Chacha20 mode to Encrypted Kernel Crash Dumps.
Chacha20 does not require messages to be multiples of block size, so it is
valid to use the cipher on non-block-sized messages without the explicit
padding AES-CBC would require. Therefore, allow use with simultaneous dump
compression. (Continue to disallow use of AES-CBC EKCD with compression.)
dumpon(8) gains a -C cipher flag to select between chacha and aes-cbc.
It defaults to chacha if no -C option is provided. The man page documents this
behavior.
cperciva [Thu, 23 May 2019 19:55:53 +0000 (19:55 +0000)]
Use ACPI SPCR on x86
This takes the SPCR code currently in uart_cpu_arm64.c, moves it into
a new uart_cpu_acpi.c (with some associated refactoring), and uses it
from both arm64 and x86.
An SPCR serial port address AccessWidth field value of 0 ("reserved")
is now treated as 1 ("byte access") in order to work around a buggy
SPCR table on Amazon EC2 i3.metal instances.
Reviewed by: manu, Greg V
MFC after: 3 days
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D20357
manu [Thu, 23 May 2019 19:26:50 +0000 (19:26 +0000)]
loader: Add pnp functions for autoloading modules based on linker.hints
This adds some new commands to loader :
- pnpmatch
This takes a pnpinfo string as argument and tries to find a kernel module
associated with it. -v and -d option are available and are the same as in
devmatch (v is verbose, d dumps the hints).
- pnpload
This takes a pnpinfo string as argument and tries to load a kernel module
associated with it.
- pnpautoload
This will attempt to load every kernel module for each buses. Each buses are
probed, the probe function will generate pnpinfo string and load kernel module
associated with it if it exists.
Only simplebus for FDT system is implemented for now.
Since we need the dtb and overlays to be applied before searching the tree
fdt_devmatch_next will load and apply the dtb + overlays.
All the pnp parsing code comes from devmatch and is the same at 99%.
manu [Thu, 23 May 2019 17:36:19 +0000 (17:36 +0000)]
arm: allwinner: clk: Use the new frac clock
Some clocks used the NM type but this clock is for the ones with the
formula "clk = clkin / n / m" and not "clk = clkin * n / m"
Use the new frac clock for them.
manu [Thu, 23 May 2019 17:35:40 +0000 (17:35 +0000)]
arm: allwinner: clk: Add new clock aw_clk_frac
Add a clock driver for clock that can either be used in integer mode
with one N factor and one M divider or in fractional mode where the
output frequency is chosen between two predifined output.