kevans [Tue, 25 Jun 2019 18:47:40 +0000 (18:47 +0000)]
libbe(3): restructure be_mount, skip canmount check for BE dataset
Further cleanup after r349380; loader and kernel will both ignore canmount
on the root dataset as well, so we should not be so strict about it when
mounting it. be_mount is restructured to make it more clear that depth==0 is
special, and to not try fetching these properties that we won't care about.
mav [Tue, 25 Jun 2019 18:35:23 +0000 (18:35 +0000)]
Avoid extra taskq_dispatch() calls by DMU.
The DMU sync code calls taskq_dispatch() for each sublist of os_dirty_dnodes
and os_synced_dnodes. Since the number of sublists defaults to the number
of CPUs, it dispatches an equal, potentially large, number of tasks,
waking up many CPUs to handle them, even if only one or a few of the
sublists actually have any work to do.
This change adds a check for empty sublists to avoid this.
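A minimal sketch of the idea, not the actual DMU code: `struct sublist`, `dispatch_task()` and `sync_sublists()` are hypothetical stand-ins for the multilist sublists and taskq_dispatch(). Empty sublists are skipped, so no task is queued for them.

```c
#include <stddef.h>

/* Hypothetical stand-in for one multilist sublist. */
struct sublist {
	size_t nitems;		/* number of dirty dnodes queued here */
};

static int dispatched;		/* counts simulated taskq_dispatch() calls */

/* Hypothetical stand-in for taskq_dispatch(). */
static void
dispatch_task(struct sublist *sl)
{
	(void)sl;
	dispatched++;
}

/* Dispatch one task per non-empty sublist; return the number queued. */
static int
sync_sublists(struct sublist *lists, size_t n)
{
	int queued = 0;

	for (size_t i = 0; i < n; i++) {
		if (lists[i].nitems == 0)
			continue;	/* nothing to sync, do not wake a CPU */
		dispatch_task(&lists[i]);
		queued++;
	}
	return (queued);
}
```

With four sublists of which two are empty, only two tasks would be dispatched instead of four.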
kevans [Tue, 25 Jun 2019 18:13:39 +0000 (18:13 +0000)]
libbe(3): mount: the BE dataset is mounted at /
Other parts of libbe(3) were fairly strict on the mountpoint property of the
BE dataset, and be_mount was not much better. It was improved in r347027 to
allow mountpoint=none for depth==0, but this bit was still sensitive to
mountpoint != / and mountpoint != none. Given that other parts of libbe(3)
no longer restrict the mountpoint property here, and the rest of the base
system is generally OK and will assume that a BE is mounted at /, let's do
the same.
luporl [Tue, 25 Jun 2019 17:15:44 +0000 (17:15 +0000)]
[PowerPC64] Don't mark module data as static
Fixes panic when loading ipfw.ko and if_epair.ko built with modern compiler.
Similar to arm64 and riscv, when using a modern compiler (!gcc4.2), code
generated tries to access data in the wrong location, causing kernel panic
(data storage interrupt trap) when loading if_epair and ipfw.
Issue was reproduced with kernel/module compiled using gcc8 and clang8. It
affects both ELFv1 and ELFv2 ABI environments.
mav [Tue, 25 Jun 2019 17:00:53 +0000 (17:00 +0000)]
Fix strsep_quote() on strings without quotes.
For strings without quotes and escapes dstptr and srcptr are equal, so
zeroing *dstptr before checking *srcptr is not a good idea. In practice
it means that in -maproot=65534:65533 everything after the colon is lost.
The problem was there since r293305, but before r346976 it was covered by
improper strsep_quote() usage.
PR: 238725
MFC after: 3 days
Sponsored by: iXsystems, Inc.
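The ordering problem can be illustrated with a simplified in-place tokenizer. This is a sketch under assumptions, not the mountd code: `next_token()` is a hypothetical function that only handles backslash escapes. The point it shows is that the read pointer must be inspected before the write pointer is used to terminate the token, because the two are equal when nothing was unescaped.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified in-place tokenizer (not the mountd code).
 * Backslash escapes are removed while copying, so the write pointer
 * (dst) trails the read pointer (src), or equals it when the token
 * contains no escapes.
 */
static char *
next_token(char **stringp, const char *delim)
{
	char *src, *dst, *tok;

	if ((tok = *stringp) == NULL)
		return (NULL);
	src = dst = tok;
	while (*src != '\0' && strchr(delim, *src) == NULL) {
		if (*src == '\\' && src[1] != '\0')
			src++;		/* drop the backslash, keep next char */
		*dst++ = *src++;
	}
	/*
	 * Inspect *src BEFORE terminating the token: when dst == src,
	 * writing '\0' first would erase the delimiter and make the
	 * remainder of the string look empty.
	 */
	*stringp = (*src != '\0') ? src + 1 : NULL;
	*dst = '\0';
	return (tok);
}
```

Swapping the last two assignments reproduces the reported symptom on an input such as 65534:65533: the first token is returned, but the part after the colon is lost.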
hselasky [Tue, 25 Jun 2019 11:54:41 +0000 (11:54 +0000)]
Convert all IPv4 and IPv6 multicast memberships into using a STAILQ
instead of a linear array.
The multicast memberships for the inpcb structure are protected by a
non-sleepable lock, INP_WLOCK(), which needs to be dropped when
calling the underlying possibly sleeping if_ioctl() method. When using
a linear array to keep track of multicast memberships, the computed
memory location of the multicast filter may suddenly change, due to
concurrent insertion or removal of elements in the linear array. This
in turn leads to various invalid memory access issues and kernel
panics.
To avoid this problem, put all multicast memberships on a STAILQ based
list. Then the memory location of the IPv4 and IPv6 multicast filters
become fixed during their lifetime and use after free and memory leak
issues are easier to track, for example by: vmstat -m | grep multi
All list manipulation has been factored into inline functions
including some macros, to easily allow for a future hash-list
implementation, if needed.
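A rough userland sketch of why a list helps, using the STAILQ macros from sys/queue.h. The `struct membership` record and the two helpers are hypothetical and much simpler than the kernel structures; the sketch only shows that each record is allocated separately, so its address does not move when other records are inserted or removed.

```c
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical membership record; the real kernel structures differ. */
struct membership {
	int group;				/* illustrative group id */
	STAILQ_ENTRY(membership) link;
};

STAILQ_HEAD(membership_head, membership);

/* Allocate a record and append it; its address stays fixed until freed. */
static struct membership *
membership_add(struct membership_head *head, int group)
{
	struct membership *m;

	if ((m = malloc(sizeof(*m))) == NULL)
		return (NULL);
	m->group = group;
	STAILQ_INSERT_TAIL(head, m, link);
	return (m);
}

/* Unlink and free the record for a group; return 1 if found, else 0. */
static int
membership_remove(struct membership_head *head, int group)
{
	struct membership *m;

	STAILQ_FOREACH(m, head, link) {
		if (m->group == group) {
			STAILQ_REMOVE(head, m, membership, link);
			free(m);
			return (1);	/* stop iterating after the free */
		}
	}
	return (0);
}
```

A pointer obtained from `membership_add()` remains valid while other groups come and go, which is not true of a pointer into an array that may be compacted or reallocated.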
dougm [Tue, 25 Jun 2019 07:44:37 +0000 (07:44 +0000)]
vm_map_protect may return an INVALID_ARGUMENT or PROTECTION_FAILURE
error response after clipping the first map entry in the region to be
reserved. This creates a pair of matching entries that should have
been "simplified" back into one, or never created. This change defers
the clipping of that entry until those two vm_map_protect failure
cases have been ruled out.
imp [Tue, 25 Jun 2019 06:14:31 +0000 (06:14 +0000)]
Replay r349342 by imp accidentally reverted by r349352
Use the cam_ed copy of ata_params rather than malloc and freeing
memory for it. This reaches into internal bits of xpt a little, and
I'll clean that up later.
imp [Tue, 25 Jun 2019 06:14:16 +0000 (06:14 +0000)]
Replay r349339 by imp accidentally reverted by r349352
Go ahead and completely fix the ata_params before calling the veto
function. This breaks nothing that uses it in the tree since
ata_params is ignored in storvsc_ada_probe_veto which is the only
in-tree consumer.
This commit has evolved from the original work to include Capsicum
integration. As part of that, it only opens the host audio devices once
and leaves them open, instead of opening and closing them on each guest
access. Thanks to Peter Grehan and Marcelo Araujo for their help in
bringing the work forward and providing some of the final technical push.
Submitted by: Alex Teaca <iateaca@freebsd.org>
Differential Revision: D7840, D12419
imp [Tue, 25 Jun 2019 06:13:56 +0000 (06:13 +0000)]
Replay r349333 by emaste accidentally reverted by r349352
vtfontcvt: improve .bdf validation
Previously, if a FONTBOUNDINGBOX or DWIDTH entry had missing or invalid
values and sscanf failed, we would proceed with partially initialized
bounding box / device width variables.
Reported by: afl (FONTBOUNDINGBOX)
MFC with: r349100
Sponsored by: The FreeBSD Foundation
imp [Tue, 25 Jun 2019 04:50:09 +0000 (04:50 +0000)]
Remove NAND and NANDFS support
NANDFS has been broken for years. Remove it. The NAND drivers that
remain are for ancient parts that are no longer relevant. They are
polled, have terrible performance and are just for ancient arm
hardware. NAND parts have evolved significantly from this early work
and little to none of it would be relevant should someone need to
update to support raw nand. This code has been off by default for
years and has violated the vnode protocol leading to panics since it
was committed.
Numerous posts to arch@ and other locations have found no actual users
for this software.
Relnotes: Yes
No Objection From: arch@
Differential Revision: https://reviews.freebsd.org/D20745
jhibbits [Tue, 25 Jun 2019 02:35:22 +0000 (02:35 +0000)]
powerpc: Transition to Secure-PLT, like most other OSs (Toolchain part)
Summary:
Toolchain follow-up to r349350. LLVM patches will be submitted upstream for
9.0 as well.
The bsd.cpu.mk change is required because GNU ld assumes BSS-PLT if it
cannot determine for certain that it needs Secure-PLT, and some binaries do
not compile in such a way to make it know to use Secure-PLT.
jhibbits [Tue, 25 Jun 2019 00:40:44 +0000 (00:40 +0000)]
powerpc: Transition to Secure-PLT, like most other OSs
Summary:
PowerPC has two PLT models: BSS-PLT and Secure-PLT. BSS-PLT uses runtime
code generation to generate the PLT stubs. Secure-PLT was introduced with
GCC 4.1 and Binutils 2.17 (base has GCC 4.2.1 and Binutils 2.17), and is a
more secure PLT format, using a read-only linkage table, with the dynamic
linker populating a non-executable index table.
This is the libc, rtld, and kernel support only. The toolchain and build
parts will be updated separately.
bcran [Mon, 24 Jun 2019 23:18:42 +0000 (23:18 +0000)]
loader: add HTTP support using UEFI
Add support for an HTTP "network filesystem" using the UEFI's HTTP
stack.
This also supports HTTPS, but TianoCore EDK2 implementations currently
crash while fetching loader files.
Only IPv4 is supported at the moment. IPv6 support is planned for a
follow-up changeset.
Note that we include some headers from the TianoCore EDK II project in
stand/efi/include/Protocol verbatim, including links to the license instead
of including the full text because that's their preferred way of
communicating it, despite not being normal FreeBSD project practice.
jchandra [Mon, 24 Jun 2019 21:24:55 +0000 (21:24 +0000)]
arm64 acpi_iort: add some error handling
Print warnings for some bad kernel configurations (like NUMA disabled
with multiple domains). Check and report some firmware errors (like
incorrect proximity domain entries).
imp [Mon, 24 Jun 2019 20:23:19 +0000 (20:23 +0000)]
Use the cam_ed copy of ata_params rather than malloc and freeing
memory for it. This reaches into internal bits of xpt a little, and
I'll clean that up later.
imp [Mon, 24 Jun 2019 20:18:49 +0000 (20:18 +0000)]
Go ahead and completely fix the ata_params before calling the veto
function. This breaks nothing that uses it in the tree since
ata_params is ignored in storvsc_ada_probe_veto which is the only
in-tree consumer.
This commit has evolved from the original work to include Capsicum
integration. As part of that, it only opens the host audio devices once
and leaves them open, instead of opening and closing them on each guest
access. Thanks to Peter Grehan and Marcelo Araujo for their help in
bringing the work forward and providing some of the final technical push.
Submitted by: Alex Teaca <iateaca@freebsd.org>
Differential Revision: D7840, D12419
emaste [Mon, 24 Jun 2019 17:25:14 +0000 (17:25 +0000)]
vtfontcvt: improve .bdf validation
Previously, if a FONTBOUNDINGBOX or DWIDTH entry had missing or invalid
values and sscanf failed, we would proceed with partially initialized
bounding box / device width variables.
Reported by: afl (FONTBOUNDINGBOX)
MFC with: r349100
Sponsored by: The FreeBSD Foundation
ian [Mon, 24 Jun 2019 01:42:09 +0000 (01:42 +0000)]
Build an armv7 LINT kernel in addition to armv5 LINT. You might think this
had been done years ago. I did. All this time we've only compiled a LINT
kernel for TARGET_ARCH=arm. Now separate LINT-V5 and LINT-V7 configs are
generated and built.
There are two new files in arm/conf, NOTES.armv5 and NOTES.armv7, containing
some of what used to be in the arm NOTES file. That file now contains only
the bits that are common to v5 and v7.
The makeLINT.mk file now creates the LINT-V5 and LINT-V7 files by concatenating
sys/conf/NOTES, arm/conf/NOTES, and arm/conf/NOTES.armv{5,7} in that order.
kib [Sun, 23 Jun 2019 21:21:11 +0000 (21:21 +0000)]
amd64 pmap: block on turnstile for lock-less DI.
Port the code to block on turnstile instead of yielding, to lock-less
delayed invalidation. The yield might cause a tight loop due to priority
inversion.
Since it is impossible to avoid a race between block and wake-up, arm
a 1-tick callout to wake up when the thread blocks itself.
Reported and tested by: mjg
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 2 months
Differential revision: https://reviews.freebsd.org/D20636
kib [Sun, 23 Jun 2019 21:15:31 +0000 (21:15 +0000)]
Switch to check for effective user id in r349320, and disable dumping
into existing files for sugid processes.
Although using the real user id expresses the intent, it actually breaks
suid coredumps, while not making any difference for non-sugid
processes. The reason for the breakage is that non-existent core file
is created with the effective uid (unless weird hacks like SUIDDIR are
configured).
Then, if user enabled kern.sugid_coredump, core dumping should not
overwrite core files owned by effective uid, but we cannot pretend to
use real uid for dumping.
PR: 68905
admbugs: 358
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
mav [Sun, 23 Jun 2019 19:05:01 +0000 (19:05 +0000)]
Improve AHCI Enclosure Management and SES interoperation.
Since the SES specs do not define a mechanism to map enclosure slots to SATA
disks, the AHCI EM code I wrote many years ago appeared quite useless,
which always bugged me. I wondered whether it was a good idea, but
if LSI HBAs do that, why shouldn't I?
This change introduces a simple non-standard mechanism for the mapping
into both AHCI EM and SES code, that makes AHCI EM on capable controllers
(most of Intel's) a first-class SES citizen, allowing it to report disk
physical path to GEOM, show devices inserted into each enclosure slot in
`sesutil map` and `getencstat`, control locate and fault LEDs for specific
devices with `sesutil locate adaX on` and `sesutil fault adaX on`, etc.
I've successfully tested this on Supermicro X10DRH-i motherboard connected
with sideband cable of its S-SATA Mini-SAS connector to SAS815TQ backplane.
It can indicate with LEDs Locate, Fault and Rebuild/Remap SES statuses for
each disk identical to real SES of Supermicro SAS2 backplanes.
ian [Sun, 23 Jun 2019 17:23:56 +0000 (17:23 +0000)]
Add the rtc8583 driver to conf/files. Also, move sy8106a from
file.allwinner to conf/files... it's not allwinner-specific, some day
other platforms could use the same regulator chip.
arichardson [Sun, 23 Jun 2019 10:47:07 +0000 (10:47 +0000)]
Fix two WARNS=6 warnings in opendir.c and telldir.c
This is in preparation for compiling these files as part of rtld (which is
built with WARNS=6). See https://reviews.freebsd.org/D20663 for more details.
sevan [Sat, 22 Jun 2019 22:34:59 +0000 (22:34 +0000)]
Remove question mark from the link between NetBSD & Darwin.
As linked to in bug 26137 as a source
https://web.archive.org/web/20001012121507/http://www.opensource.apple.com/projects/darwin/faq.html
mentions:
"We already synchronize our code periodically with NetBSD for most of our user commands"
alc [Sat, 22 Jun 2019 16:26:38 +0000 (16:26 +0000)]
Introduce pmap_remove_l3_range() and use it in two places:
(1) pmap_remove(), where it eliminates redundant TLB invalidations by
pmap_remove() and pmap_remove_l3(), and (2) pmap_enter_l2(), where it may
optimize the TLB invalidations by batching them.
dougm [Sat, 22 Jun 2019 03:16:01 +0000 (03:16 +0000)]
Modify swapon(8) to invoke BIO_DELETE to trim swap devices, either if
'-E' appears on the swapon command line, or if "trimonce" appears as
an fstab option.
vangyzen [Sat, 22 Jun 2019 01:20:45 +0000 (01:20 +0000)]
VirtIO SCSI: validate seg_max on attach
Until r349278, bhyve presented a seg_max to the guest that was too large.
Detect this case and clamp it to the virtqueue size. Otherwise, we would
fail the "too many segments to enqueue" assertion in virtqueue_enqueue().
I hit this by running a guest with a MAXPHYS of 256 KB.
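As a sketch of the clamping described above, assuming a hypothetical helper name and ignoring whatever other limits the real driver applies:

```c
#include <stdint.h>

/*
 * Hypothetical helper: never trust a device-reported seg_max that is
 * larger than the virtqueue can hold.
 */
static uint32_t
clamp_seg_max(uint32_t seg_max, uint32_t vq_size)
{
	return (seg_max > vq_size ? vq_size : seg_max);
}
```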
mav [Sat, 22 Jun 2019 01:06:41 +0000 (01:06 +0000)]
Make ELEMENT INDEX validation more strict.
SES specifications tell: "The Additional Element Status descriptors shall
be in the same order as the status elements in the Enclosure Status
diagnostic page". It allows us to question an ELEMENT INDEX that is lower
than values we already processed. There are many SAS2 enclosures with
this kind of problem.
While there, add more specific error messages for cases when ELEMENT INDEX
is obviously wrong. Also skip elements with INVALID bit set.
scottl [Fri, 21 Jun 2019 23:40:26 +0000 (23:40 +0000)]
Refactor xpt_getattr() to make it more readable. No outwardly
visible functional changes, though code flow was modified a bit
internally to lessen the need for goto jumps and chained if
conditionals.
mav [Fri, 21 Jun 2019 23:29:16 +0000 (23:29 +0000)]
Fix individual_element_index when some type has 0 elements.
When some type has 0 elements, saved_individual_element_index was set
to -1 on second type bump, since individual_element_index was not
restored after the first. To me it looks easier just to increment
saved_individual_element_index separately than think when to save it.
vangyzen [Fri, 21 Jun 2019 18:57:33 +0000 (18:57 +0000)]
bhyve: Fix vtscsi maximum segment config
The seg_max value reported to the guest should be two less than the
host's maximum, in order to leave room for the request and the
response. This is analogous to r347033 for virtio_block.
We hit the "too many segments to enqueue" assertion on OneFS because
we increase MAXPHYS to 256 KB.
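The arithmetic can be sketched as follows. The macro and function names are hypothetical, and returning 0 when the host limit leaves no room is an assumption made for the sketch, not something the commit states.

```c
/* Descriptors reserved per request: one request header, one response. */
#define VTSCSI_RESERVED_SEGS	2	/* hypothetical name */

/* Hypothetical helper: data segments the guest may use per request. */
static unsigned int
guest_seg_max(unsigned int host_max_segs)
{
	if (host_max_segs <= VTSCSI_RESERVED_SEGS)
		return (0);	/* assumption: no room for data segments */
	return (host_max_segs - VTSCSI_RESERVED_SEGS);
}
```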
johalun [Fri, 21 Jun 2019 18:48:07 +0000 (18:48 +0000)]
LinuxKPI: Additions to rcu list.
- Add rcu list functions.
- Make rcu hlist's foreach macro use rcu calls instead of the non-rcu macro.
- Bump FreeBSD version so we have a checkpoint for the vboxvideo drm driver.
ian [Fri, 21 Jun 2019 15:12:17 +0000 (15:12 +0000)]
Do some general cleanup and light wordsmithing.
Sort methods alphabetically. Wrap long lines. Start sentences on a new
line. Remove contractions (not because it's a good idea, just to silence
igor). Add some explanation of the units for the period and duty arguments
and the convention for channel numbers.
ian [Fri, 21 Jun 2019 14:46:43 +0000 (14:46 +0000)]
Catch up with recent changes in pwmbus(9). The pwm(9) and pwmbus(9)
interfaces were unified into pwmbus(9), and the PWMBUS_CHANNEL_MAX method
was renamed PWMBUS_CHANNEL_COUNT. The pwmbus_attach_bus() function just
went away completely. Also, fix a few typos such as s/is/if/.
ian [Fri, 21 Jun 2019 14:24:33 +0000 (14:24 +0000)]
Add support for the PWM(9) API. This allows configuring the pwm output using
pwm(9), but also maintains the historical sysctl config interface for
compatibility with existing apps. The two config systems are not compatible
with each other; if you use both interfaces to change configurations you're
likely to end up with incorrect output or none at all.
ian [Fri, 21 Jun 2019 14:01:02 +0000 (14:01 +0000)]
Some mundane tweaks and cleanups to help de-clutter the diffs of some
upcoming functional changes.
Add an ofw_compat_data table for probing compat strings, and use it to add
PNP data. Remove some stray semicolons at the end of macro definitions,
and add a PWM_LOCK_ASSERT macro to round out the usual suite. Move the
device_t and driver_methods structs to the end of the file. Tweak comments.
emaste [Fri, 21 Jun 2019 13:42:40 +0000 (13:42 +0000)]
nandsim: correct test to avoid out-of-bounds access
Previously nandsim_chip_status returned EINVAL iff both of user-provided
chip->ctrl_num and chip->num were out of bounds. If only one failed the
bounds check arbitrary memory would be read and returned.
The NAND framework is not built by default, nandsim is not intended for
production use (it is a simulator), and the nandsim device has root-only
permissions.
admbugs: 827
Reported by: Daniel Hodson of elttam
MFC after: 3 days
Security: kernel information leak or DoS
Sponsored by: The FreeBSD Foundation
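The difference between the two tests can be shown in isolation. This is an illustrative sketch, not the nandsim source: the limits and function names are hypothetical.

```c
#include <errno.h>

#define MAX_CTRL	4	/* hypothetical controller limit */
#define MAX_CHIP	4	/* hypothetical per-controller chip limit */

/* Old behaviour: rejects only when BOTH indices are out of bounds. */
static int
check_indices_buggy(unsigned int ctrl_num, unsigned int num)
{
	if (ctrl_num >= MAX_CTRL && num >= MAX_CHIP)
		return (EINVAL);
	return (0);
}

/* Corrected behaviour: rejects when EITHER index is out of bounds. */
static int
check_indices_fixed(unsigned int ctrl_num, unsigned int num)
{
	if (ctrl_num >= MAX_CTRL || num >= MAX_CHIP)
		return (EINVAL);
	return (0);
}
```

With ctrl_num = 9 and num = 0, the first variant returns 0 and the caller would go on to index past the end of the controller table, while the second returns EINVAL.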
ae [Fri, 21 Jun 2019 10:54:51 +0000 (10:54 +0000)]
Add "tcpmss" opcode to match the TCP MSS value.
With this opcode it is possible to match TCP packets with the specified
MSS option, whose value corresponds to the value configured in the opcode.
It is allowed to specify a single value, a range of values, or an array of
specific values or ranges. E.g.
# ipfw add deny log tcp from any to any tcpmss 0-500
kp [Fri, 21 Jun 2019 07:58:08 +0000 (07:58 +0000)]
ip_output: pass PFIL_FWD in the slow path
If we take the slow path for forwarding we should still tell our
firewalls (hooked through pfil(9)) that we're forwarding. Pass the
ip_output() flags to ip_output_pfil() so it can set the PFIL_FWD flag
when we're forwarding.
imp [Fri, 21 Jun 2019 03:49:36 +0000 (03:49 +0000)]
Mount and unmount devfs around calls to add packages.
pkg now uses /dev/null for some of its operations. NanoBSD's packaging
stuff didn't mount that for the chroot it ran in, so any config that
added packages would see the error:
pkg: Cannot open /dev/null:No such file or directory
when trying to actually add those packages. It's easy enough for
nanobsd to mount /dev and it won't hurt anything that was already
working and may help things that weren't (like this). I moved the
mount/unmount pair to be in the right push/pop order from the
submitted patch.
PR: 238727
Submitted by: mike tancsa
Tested by: Karl Denninger
cem [Fri, 21 Jun 2019 00:16:30 +0000 (00:16 +0000)]
sys: Remove DEV_RANDOM device option
Remove 'device random' from kernel configurations that reference it (most).
Replace perhaps mistaken 'nodevice random' in two MIPS configs with 'options
RANDOM_LOADABLE' instead. Document removal in UPDATING; update NOTES and
random.4.
takawata [Thu, 20 Jun 2019 23:52:33 +0000 (23:52 +0000)]
Fix the case where no root hub object exists while a host controller object exists in the ACPI namespace.
Also you can disable ACPI support for USB by setting
debug.acpi.disabled="usb"
asomers [Thu, 20 Jun 2019 23:07:20 +0000 (23:07 +0000)]
fcntl: fix overflow when setting F_READAHEAD
VOP_READ and VOP_WRITE take the seqcount in blocks in a 16-bit field.
However, fcntl allows you to set the seqcount in bytes to any nonnegative
31-bit value. The result can be a 16-bit overflow, which will be
sign-extended in functions like ffs_read. Fix this by sanitizing the
argument in kern_fcntl. As a matter of policy, limit to IO_SEQMAX rather
than INT16_MAX.
Also, fifos have overloaded the f_seqcount field for a completely different
purpose ever since r238936. Formalize that by using a union type.
Reviewed by: cem
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20710
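A userland sketch of the sanitizing step, under stated assumptions: `readahead_to_seqcount()` and `SEQ_MAX` are hypothetical, with `SEQ_MAX` standing in for IO_SEQMAX and its value chosen for illustration only. It is not the kern_fcntl code; it only shows converting a byte count to blocks and capping the result so it fits in a 16-bit field.

```c
#include <limits.h>
#include <stdint.h>

#define SEQ_MAX	0x7F	/* hypothetical cap, standing in for IO_SEQMAX */

/* Convert a readahead size in bytes to a block count that fits int16_t. */
static int16_t
readahead_to_seqcount(int arg, int bsize)
{
	int blocks;

	if (arg < 0 || bsize <= 0)
		return (0);
	/* Avoid signed overflow in the round-up below. */
	if (arg > INT_MAX - bsize + 1)
		arg = INT_MAX - bsize + 1;
	blocks = (arg + bsize - 1) / bsize;
	return ((int16_t)(blocks > SEQ_MAX ? SEQ_MAX : blocks));
}
```

Without the cap, a large request such as INT_MAX bytes with 512-byte blocks yields a block count that does not fit in 16 bits.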