pfg [Sat, 27 Jan 2018 15:33:52 +0000 (15:33 +0000)]
{ext2|ufs}_readdir: Set limit on valid ncookies values.
Sanitize the values that will be assigned to ncookies so that we ensure
they are sane and we can handle them.
Let ncookies signed as it was before r328346. The valid range is such
that unsigned values are not required and we are not able to avoid at
least one cast anyways.
kib [Sat, 27 Jan 2018 11:49:37 +0000 (11:49 +0000)]
Use PCID to optimize PTI.
Use PCID to avoid complete TLB shootdown when switching between user
and kernel mode with PTI enabled.
I use the model close to what I read about KAISER, user-mode PCID has
1:1 correspondence to the kernel-mode PCID, by setting bit 11 in PCID.
Full kernel-mode TLB shootdown is performed on context switches, since
KVA TLB invalidation only works in the current pmap. User-mode part of
TLB is flushed on the pmap activations as well.
Similarly, IPI TLB shootdowns must handle both kernel and user address
spaces for each address. Note that machines which implement PCID but
do not have INVPCID instructions, cause the usual complications in the
IPI handlers, due to the need to switch to the target PCID temporary.
This is racy, but because for PCID/no-INVPCID we disable the
interrupts in pmap_activate_sw(), IPI handler cannot see inconsistent
state of CPU PCID vs PCPU pmap/kcr3/ucr3 pointers.
On the other hand, on kernel/user switches, CR3_PCID_SAVE bit is set
and we do not clear TLB.
I can imagine alternative use of PCID, where there is only one PCID
allocated for the kernel pmap. Then, there is no need to shootdown
kernel TLB entries on context switch. But copyout(3) would need to
either use method similar to proc_rwmem() to access the userspace
data, or (in reverse) provide a temporal mapping for the kernel buffer
into user mode PCID and use trampoline for copy.
mmel [Sat, 27 Jan 2018 11:19:41 +0000 (11:19 +0000)]
Implement mitigation for Spectre version 2 attacks on ARMv7.
Similarly as we already do for arm64, for mitigation is necessary to
flush branch predictor when we:
- do task switch
- receive prefetch abort on non-userspace address
The user can disable this mitigation by setting 'machdep.disable_bp_hardening'
sysctl variable, or it can check actual system status by reading
'machdep.spectre_v2_safe'
The situation is complicated by fact that:
- for Cortex-A8, the BPIALL instruction is effectively NOP until the IBE bit
in ACTLR is set.
- for Cortex-A15, the BPIALL is always NOP. The branch predictor can be
only flushed by doing ICIALLU with special bit (Enable invalidates of BTB)
set in ACTLR.
Since access to the ACTLR register is locked to secure monitor/firmware on
most boards, they will also need update of firmware / U-boot.
In worst case, when secure monitor is on-chip ROM (e.g. PandaBoard),
the board is unfixable.
mmel [Sat, 27 Jan 2018 09:49:47 +0000 (09:49 +0000)]
Fix pmap_fault().
- special fault handling for break-before-make mechanism should be also
applied for instruction translation faults, not only for data translation
faults.
- since arm64_address_translate_...() functions are not atomic,
use these with disabled interrupts.
jhb [Sat, 27 Jan 2018 00:39:49 +0000 (00:39 +0000)]
Clarify some comments in the MIPS makecontext().
- N32 and N64 do not have a $a0-3 gap.
- Use 'sp += 4' to skip over the gap for O32 rather than '+= i'. It
doesn't make a functional change, but makes the code match the comment.
jhb [Fri, 26 Jan 2018 23:21:50 +0000 (23:21 +0000)]
Move per-operation data out of the csession structure.
Create a struct cryptop_data which contains state needed for a single
symmetric crypto operation and move that state out of the session. This
closes a race with the CRYPTO_F_DONE flag that can result in use after
free.
While here, remove the 'cse->error' member. It was just a copy of
'crp->crp_etype' and cryptodev_op() and cryptodev_aead() checked both
'crp->crp_etype' and 'cse->error'. Similarly, do not check for an
error from mtx_sleep() since it is not used with PCATCH or a timeout
so cannot fail with an error.
imp [Fri, 26 Jan 2018 23:14:46 +0000 (23:14 +0000)]
Fix a sleepable malloc in ndastart. We shouldn't be sleeping
here. Return ENOMEM when we can't malloc a buffer for the DSM
TRIM. This should fix the WITNESS warnings similar to the following:
uma_zalloc_arg: zone "16" with the following non-sleepable locks held:
exclusive sleep mutex CAM device lock (CAM device lock) r = 0 (0xfffff800080c34d0) locked @ /usr/src/sys/cam/nvme/nvme_da.c:351
imp [Fri, 26 Jan 2018 21:50:59 +0000 (21:50 +0000)]
Now that exit is __dead2, we need to tag ub_exit() as __dead2. To do
that, we have to put a while (1); after the syscall that will never
return to fake out the compiler....
mckusick [Fri, 26 Jan 2018 18:17:11 +0000 (18:17 +0000)]
For many years the message "fsync: giving up on dirty" has occationally
appeared on UFS/FFS filesystems. In some cases it was promptly followed
by a panic of "softdep_deallocate_dependencies: dangling deps". This fix
should eliminate both of these occurences.
Submitted by: Andreas Longwitz <longwitz at incore.de>
Reviewed by: kib
Tested by: Peter Holm (pho)
PR: 225423
MFC after: 1 week
imp [Fri, 26 Jan 2018 17:56:20 +0000 (17:56 +0000)]
Gross hack to omit printing hex floating point when the lua number
type is int64. While lua is setup for the representation, it's not
setup to properly print the numbers as ints. This is the least-gross
way around that, and won't affect the bootloader where we do this.
ian [Fri, 26 Jan 2018 17:55:17 +0000 (17:55 +0000)]
Add support to the imx5/6 watchdog for the external reset signal. Also, if
the "power down" watchdog used by the ROM boot code is still active when the
regular watchdog is activated, turn off the power-down watchdog.
This adds support for the "fsl,ext-reset-output" FDT property. When
present, that property indicates that a chip reset is accomplished by
asserting the WDOG1_B external signal, which is supposed to trigger some
external component such as a PMIC to ready the hardware for reset (for
example, adjusting voltages from idle to full-power levels), and assert the
POR signal to SoC when ready. To guard against misconfiguation leading to a
non-rebootable system, the external reset signal is backstopped by code
that asserts a normal internal chip reset if nothing responds to the
external reset signal within one second.
imp [Fri, 26 Jan 2018 17:24:25 +0000 (17:24 +0000)]
Preserve the original luaconf.h in a convenient place. Clients will
almost certainly need to override this, so reinforce that. If that's
not hte case, clients can always do a #include luaconf.h.dist.
hselasky [Fri, 26 Jan 2018 10:49:02 +0000 (10:49 +0000)]
Decouple Linux files from the belonging character device right after open
in the LinuxKPI. This is done by calling finit() just before returning a magic
value of ENXIO in the "linux_dev_fdopen" function.
The Linux file structure should mimic the BSD file structure as much as
possible. This patch decouples the Linux file structure from the belonging
character device right after the "linux_dev_fdopen" function has returned.
This fixes an issue which allows a Linux file handle to exist after a
character device has been destroyed and removed from the directory index
of /dev. Only when the reference count of the BSD file handle reaches zero,
the Linux file handle is destroyed. This fixes use-after-free issues related
to accessing the Linux file structure after the character device has been
destroyed.
While at it add a missing NULL check for non-present file operation.
Calling a NULL pointer will result in a segmentation fault.
mckusick [Fri, 26 Jan 2018 00:58:32 +0000 (00:58 +0000)]
Refactoring of reading and writing of the UFS/FFS superblock.
Specifically reading is done if ffs_sbget() and writing is done
in ffs_sbput(). These functions are exported to libufs via the
sbget() and sbput() functions which then used in the various
filesystem utilities. This work is in preparation for adding
subperblock check hashes.
jhibbits [Fri, 26 Jan 2018 00:58:02 +0000 (00:58 +0000)]
Minimum changes for ctl to build on architectures with non-matching physical and
virtual address sizes
Summary:
Some architectures use physical addresses larger than virtual. This is the
minimal changeset needed to get CAM/CTL to build on these targets. No
functional changes. More changes would likely be needed for this to be fully
functional on said platforms, but they can be made when needed.
jhibbits [Fri, 26 Jan 2018 00:56:09 +0000 (00:56 +0000)]
Minimal change to build linuxkpi on architectures with physical addresses larger
than virtual
Summary:
Some architectures have physical/bus addresses that are much larger
than virtual addresses. This change just quiets a warning, as DMAP is not used
on those architectures, and on 64-bit platforms uintptr_t is the same size as
vm_paddr_t and void *.
jkim [Thu, 25 Jan 2018 23:38:05 +0000 (23:38 +0000)]
Add declaration of SSL_get_selected_srtp_profile() for OpenSSL.
Because there was an extra declaration in the vendor version, we locally
removed the second one in r238405 with 1.0.1c. Later, upstream fixed it in
1.0.2d but they removed the first one. Therefore, both were removed in our
version unfortunately. Now we revert to the vendor one to re-add it.
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D10525
imp [Thu, 25 Jan 2018 21:38:30 +0000 (21:38 +0000)]
Track Ref / DeRef and Hold / Unhold that da is doing to track down
leaks. We assume each source can be taken / dropped only once and
don't recurse. These are only enabled via DA_TRACK_REFS or
INVARIANTS. There appreas to be a reference leak under extreme load,
and these should help us colaberatively work it out. It also documents
better the reference / holding protocol better.
imp [Thu, 25 Jan 2018 21:38:09 +0000 (21:38 +0000)]
When devices are invalidated, there's some cases where ccbs for that
device still wind up in xpt_done after the path has been
invalidated. Since we don't always need sim or devq, add some guard
rails to only fail if we have to use them.
nwhitehorn [Thu, 25 Jan 2018 18:09:26 +0000 (18:09 +0000)]
Treat DSE exceptions like DSI exceptions when generating signinfo.
Both can generate SIGSEGV, but DSEs would have put the wrong address
into the siginfo structure when the signal was delivered.
ian [Thu, 25 Jan 2018 17:53:33 +0000 (17:53 +0000)]
Minor cleanups... Move DRIVER_MODULE() and other boilerplate stuff to the
bottom of the file, where it is in most imx5/6 drivers. Switch from an RD2
macro using bus_space_read_2() to an inline function using bus_read_2();
likewise for WR2. Use RESOURCE_SPEC_END to end the resource_spec list.
markj [Thu, 25 Jan 2018 15:35:34 +0000 (15:35 +0000)]
Use tcpinfoh_t for TCP headers in the tcp:::debug-{drop,input} probes.
The header passed to these probes has some fields converted to host
order by tcp_fields_to_host(), so the tcpinfo_t translator doesn't do
what we want.
mckusick [Wed, 24 Jan 2018 23:57:40 +0000 (23:57 +0000)]
More throughly integrate libufs into fsck_ffs by using its cgput()
routine to write out the cylinder groups rather than recreating the
calculation of the cylinder-group check hash in fsck_ffs.
emaste [Wed, 24 Jan 2018 21:39:40 +0000 (21:39 +0000)]
Add efi.8 as a man page link to uefi.8
FreeBSD and industry has been inconsistent in the use of UEFI and EFI.
They are essentially just different versions of the same specification
and are often used interchangeably. Make it easier for users to find
information by making efi(8) an alias for uefi(8).
emaste [Wed, 24 Jan 2018 21:11:35 +0000 (21:11 +0000)]
uefi.8: update HISTORY and remove incomplete AUTHORS
- EFI support appeared in 5.0 for ia64
- arm64 UEFI support added in 11.0
The AUTHORS section included the folks responsible for the bulk of the
work to bring UEFI support to amd64, but missed those who did the
original work on ia64, the initial port to i386, the ports to arm64 and
arm, and have generally maintained and improved general UEFI support
since then. It's unwieldly to include everyone and would quickly become
outdated again anyhow, so just remove the AUTHORS section.
Reviewed by: manu
Discussed with: jhb
Differential Revision: https://reviews.freebsd.org/D14033
jhb [Wed, 24 Jan 2018 20:14:57 +0000 (20:14 +0000)]
Expand the software fallback for GCM to cover more cases.
- Extend ccr_gcm_soft() to handle requests with a non-empty payload.
While here, switch to allocating the GMAC context instead of placing
it on the stack since it is over 1KB in size.
- Allow ccr_gcm() to return a special error value (EMSGSIZE) which
triggers a fallback to ccr_gcm_soft(). Move the existing empty
payload check into ccr_gcm() and change a few other cases
(e.g. large AAD) to fallback to software via EMSGSIZE as well.
- Add a new 'sw_fallback' stat to count the number of requests
processed via the software fallback.
jhb [Wed, 24 Jan 2018 20:13:07 +0000 (20:13 +0000)]
Clamp DSGL entries to a length of 2KB.
This works around an issue in the T6 that can result in DMA engine
stalls if an error occurs while processing a DSGL entry with a length
larger than 2KB.
jhb [Wed, 24 Jan 2018 20:12:00 +0000 (20:12 +0000)]
Fail crypto requests when the resulting work request is too large.
Most crypto requests will not trigger this condition, but a request
with a highly-fragmented data buffer (and a resulting "large" S/G
list) could trigger it.
jhb [Wed, 24 Jan 2018 20:11:00 +0000 (20:11 +0000)]
Don't discard AAD and IV output data for AEAD requests.
The T6 can hang when processing certain AEAD requests if the request
sets a flag asking the crypto engine to discard the input IV and AAD
rather than copying them into the output buffer. The existing driver
always discards the IV and AAD as we do not need it. As a workaround,
allocate a single "dummy" buffer when the ccr driver attaches and
change all AEAD requests to write the IV and AAD to this scratch
buffer. The contents of the scratch buffer are never used (similar to
"bogus_page"), and it is ok for multiple in-flight requests to share
this dummy buffer.
jhb [Wed, 24 Jan 2018 20:08:10 +0000 (20:08 +0000)]
Reject requests with AAD and IV larger than 511 bytes.
The T6 crypto engine's control messages only support a total AAD
length (including the prefixed IV) of 511 bytes. Reject requests with
large AAD rather than returning incorrect results.
jhb [Wed, 24 Jan 2018 20:04:08 +0000 (20:04 +0000)]
Always store the IV in the immediate portion of a work request.
Combined authentication-encryption and GCM requests already stored the
IV in the immediate explicitly. This extends this behavior to block
cipher requests to work around a firmware bug. While here, simplify
the AEAD and GCM handlers to not include always-true conditions.
ae [Wed, 24 Jan 2018 19:48:25 +0000 (19:48 +0000)]
Adopt revision 1.76 and 1.77 from NetBSD:
Fix a vulnerability in IPsec-IPv6-AH, that allows an attacker to remotely
crash the kernel with a single packet.
In this loop we need to increment 'ad' by two, because the length field
of the option header does not count the size of the option header itself.
If the length is zero, then 'count' is incremented by zero, and there's
an infinite loop. Beyond that, this code was written with the assumption
that since the IPv6 packet already went through the generic IPv6 option
parser, several fields are guaranteed to be valid; but this assumption
does not hold because of the missing '+2', and there's as a result a
triggerable buffer overflow (write zeros after the end of the mbuf,
potentially to the next mbuf in memory since it's a pool).
Add the missing '+2', this place will be reinforced in separate commits.
Reported by: Maxime Villard <maxv at NetBSD.org>
MFC after: 1 week