geom_part: make it possible to recover a broken GPT after some LBAs are cut off
This is a follow-up to r365477.
If a pre-formatted device has a GPT and a partition covering
the last available LBAs, and the device is attached through
a bridge that reduces the number of LBAs, then forcing GEOM
to use the primary GPT may not be enough. We should also make it
possible to recover the GPT, which requires either deleting or
resizing the partition.
This change enables the "gpart delete" and "gpart resize" commands
on a corrupted GPT, to be followed by "gpart recover".
It still does not allow modifying a corrupted GPT without
first setting sysctl kern.geom.part.check_integrity=0.
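The truncation problem can be illustrated with a small model. This is only a sketch: it assumes 512-byte sectors and the conventional 128-entry GPT, so the backup table and header take the last 33 sectors. The function names and disk sizes are hypothetical and are not taken from geom_part.

```python
# Illustrative model only; not the geom_part implementation.
# Assumes 512-byte sectors and a 128-entry GPT: 32 sectors of entries
# plus 1 header sector stored at the very end of the disk.
BACKUP_GPT_SECTORS = 33


def last_usable_lba(total_sectors: int) -> int:
    """Last LBA a partition may occupy on a disk of the given size."""
    return total_sectors - 1 - BACKUP_GPT_SECTORS


def partitions_to_fix(partitions, total_sectors):
    """Return names of partitions that extend past the usable area.

    `partitions` is a list of (name, first_lba, last_lba) tuples.
    """
    limit = last_usable_lba(total_sectors)
    return [name for name, _first, last in partitions if last > limit]


# A disk formatted at 1,000,000 sectors, later seen through a bridge
# that hides the last 2,048 sectors (hypothetical numbers).
original = 1_000_000
truncated = original - 2_048
parts = [("p1", 40, 500_000), ("p2", 500_001, last_usable_lba(original))]
```

In this model, p2 is the partition that would have to be deleted or resized before the table could be recovered on the truncated device.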
installworld: run `certctl rehash` after installation completes
This was originally introduced back in r360833, and subsequently reverted
because it was broken for -DNO_ROOT builds and it may not have been the
correct place for it.
While debatably this may still not be 'the correct place,' it's much cleaner
than scattering rehashes all throughout the tree. brooks has fixed the issue
with -DNO_ROOT by properly writing to the METALOG in r361397.
Do note that this is different than what was originally committed; brooks
had revisions in D24932 that made it actually use the revised unprivileged
mode and write to METALOG, along with being a little more friendly to
foreign crossbuilds and just using the certctl in-tree.
With this change, I believe we should now have a populated /etc/ssl/certs in
the VM images.
Orphans affect job control state, so we must account for them when
changing pg_jobc.
Instead of p_pptr, use proc_realparent() to get the parent relevant for
job control.
Use the correct calculation of the parent for an exiting process. For jobc
purposes, we must use the real parent, but if it is also exiting, we should
fall back to the reaper, then recursively find a non-exiting reaper.
Reported by: trasz
PR: 249257
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D26416
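The parent lookup described above can be sketched as a toy model. The Proc class, its fields and jobc_parent() are hypothetical stand-ins, and the walk up the reaper chain is an assumption about how "recursively find non-exiting reaper" behaves, not the kernel code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Proc:
    """Toy process record; field names are illustrative only."""
    name: str
    exiting: bool = False
    realparent: Optional["Proc"] = None
    reaper: Optional["Proc"] = None


def jobc_parent(p: Proc) -> Optional[Proc]:
    """Parent to use for job control accounting of process p."""
    parent = p.realparent
    if parent is None or not parent.exiting:
        return parent
    # The real parent is exiting: fall back to its reaper, and keep
    # walking up the reaper chain until a non-exiting process is found.
    candidate = parent.reaper
    while candidate is not None and candidate.exiting:
        candidate = candidate.reaper
    return candidate
```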
Mark Johnston [Wed, 16 Sep 2020 13:51:47 +0000 (13:51 +0000)]
Move PLTs to the beginning of amd64 kernel modules.
As with .text, the aim is to ensure that executable sections are
segregated from the rest, to avoid creation of writeable and executable
mappings. Recent versions of LLVM emit a PLT in firmware modules.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26444
Rick Macklem [Wed, 16 Sep 2020 02:25:18 +0000 (02:25 +0000)]
Fix a LOR between the NFS server and server side krpc.
Recent testing of the NFS-over-TLS code found a LOR between the mutex lock
used for sessions and the sleep lock used for server side krpc socket
structures.
The code in nfsrv_checksequence() would call SVC_RELEASE() with the mutex
held. Normally this is ok, since all that happens is SVC_RELEASE()
decrements a reference count. However, if the socket has just been shut
down, SVC_RELEASE() drops the reference count to 0 and acquires a sleep
lock during destruction of the server side krpc structure.
This patch fixes the problem by moving the SVC_RELEASE() call in
nfsrv_checksequence() down a few lines to below where the mutex is released.
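The reordering can be pictured with a small user-space model. This is a sketch only: session_mutex, Xprt and check_sequence() are hypothetical names, and a Python lock stands in for the kernel mutex. The point is that the release, which may run a destructor, happens only after the mutex is dropped.

```python
import threading

session_mutex = threading.Lock()   # stands in for the session mutex
events = []


class Xprt:
    """Stand-in for a server-side krpc structure with a refcount."""

    def __init__(self):
        self.refs = 1

    def release(self):
        # Dropping the last reference runs the destructor, which in the
        # real code may need a sleep lock, so the mutex must not be held.
        assert not session_mutex.locked(), "release called with mutex held"
        self.refs -= 1
        if self.refs == 0:
            events.append("destroyed")


def check_sequence(xprt):
    to_release = None
    with session_mutex:
        events.append("mutex held")
        # Remember what to release instead of releasing it here.
        to_release = xprt
    events.append("mutex dropped")
    if to_release is not None:
        to_release.release()
```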
Or it could be explained as lockless (for vnode lock) reads. Reads
are performed from the node tn_obj object. The lifecycle of a tmpfs regular
vnode object differs significantly from that of a normal OBJT_VNODE: it
stays alive as long as ref_count > 0.
Ensure liveness of the tmpfs VREG node, and consequently of v_object,
inside VOP_READ_PGCACHE by referencing the tmpfs node in tmpfs_open().
Provide a custom tmpfs fo_close() method on the file to ensure that close
is paired with open.
Add a tmpfs VOP_READ_PGCACHE that takes advantage of all tmpfs quirks.
Supporting page-ins for reads on tmpfs is quite cheap in terms of code
size, even if we do not own the tmpfs vnode lock. Also, we can handle
holes in a tmpfs node without additional effort, and there is no
limit on the transfer size.
Reviewed by: markj
Discussed with and benchmarked by: mjg (previous version)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D26346
There are several negative side effects of not calling into the VOP layer
at all for page cache reads. The biggest is the missed activation of
EVFILT_READ knotes.
Calling into the VOP layer also allows the filesystem to make a more
fine-grained decision to refuse a read from the page cache.
Keep the VIRF_PGREAD flag around; it is still useful for nullfs and for
asserts.
Reviewed by: markj
Tested by: pho
Discussed with: mjg
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D26346
Eric Joyner [Tue, 15 Sep 2020 21:07:30 +0000 (21:07 +0000)]
e1000: Properly retain promisc flag
From Franco:
The iflib rewrite forced the promisc flag but it was not reported
to the system. Noticed on a stock VM that went into unsolicited
promisc mode when dhclient was started during bootup.
[PowerPC64LE] Use correct in_masks table on LE to fix checksumming
Because a check that should have been an endian check was an #if 0,
the wrong checksum mask table was being used on LE, which caused
extreme strangeness in DNS resolution -- *some* hosts would be resolvable,
but most would not.
This fixes DNS resolution.
(I am committing some parts of the LE patchset ahead of time to reduce the
amount of work I have to do while committing the main patchset.)
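Why a mask table has to match the byte order can be shown with a generic sketch. The layout of the real in_masks table is not given here, so the helpers below are hypothetical and only model masking the first n bytes of a 64-bit word loaded from memory.

```python
def load_word(data: bytes, byteorder: str) -> int:
    """Model a 64-bit load of 8 bytes on a CPU of the given byte order."""
    return int.from_bytes(data, byteorder)


def first_bytes_mask(n: int, byteorder: str) -> int:
    """Mask keeping the first n bytes (in memory order) of a loaded word."""
    if byteorder == "little":
        # The first bytes in memory are the least significant bits.
        return (1 << (8 * n)) - 1
    # Big endian: the first bytes in memory are the most significant bits.
    return ((1 << (8 * n)) - 1) << (64 - 8 * n)


def kept_bytes(data: bytes, n: int, load_order: str, mask_order: str) -> bytes:
    """Bytes that survive masking, written back in memory order."""
    word = load_word(data, load_order) & first_bytes_mask(n, mask_order)
    return word.to_bytes(8, load_order)
```

With matching orders the first n bytes survive; applying the big-endian mask to a little-endian load keeps the last n bytes instead, so a checksum built on it covers the wrong data.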
Intercept and report #UD to the VM on SVM/AMD if the VM tries to execute an
SVM instruction. Otherwise, SVM allows their execution, and the instructions
operate on host physical addresses despite being executed in guest mode.
Mark Johnston [Tue, 15 Sep 2020 19:23:22 +0000 (19:23 +0000)]
Simplify unix socket connection peer locking.
unp_pcb_owned_lock2() has some sharp edges and forces callers to deal
with a bunch of cases. Simplify it:
- Rename to unp_pcb_lock_peer().
- Return the connected peer instead of forcing callers to load it
beforehand.
- Handle self-connected sockets.
- In unp_connectat(), just lock the accept socket directly. It should
not be possible for the nascent socket to participate in any other
lock orders.
- Get rid of connect_internal(). It does not provide any useful
checking anymore.
- Block in unp_connectat() when a different thread is concurrently
attempting to lock both sides of a connection. This provides simpler
semantics for callers of unp_pcb_lock_peer().
- Make unp_connectat() return EISCONN if the socket is already
connected. This fixes a race[1] when multiple threads attempt to
connect() to different addresses using the same datagram socket.
Upper layers will disconnect a connected datagram socket before
calling the protocol connect's method, but there is no synchronization
between this and protocol-layer code.
Reported by: syzkaller [1]
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26299
Mark Johnston [Tue, 15 Sep 2020 19:22:37 +0000 (19:22 +0000)]
Simplify unp_disconnect() callers.
In all cases, PCBs are unlocked after unp_disconnect() returns. Since
unp_disconnect() may release the last PCB reference, callers may have to
bump the refcount before the call just so that the PCB stays valid for
unlocking, and then release the reference again.
Change unp_disconnect() to release PCB locks as well as connection
references; this lets us remove several refcount manipulations. Tighten
assertions.
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26297
Mark Johnston [Tue, 15 Sep 2020 19:22:16 +0000 (19:22 +0000)]
Rename unp_pcb_lock2().
unp_pcb_lock_pair() seems like a better name. Also make it handle the
case where the two sockets are the same instead of making callers do it.
No functional change intended.
Reviewed by: glebius, kevans, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26296
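A helper of this shape can be sketched in user space. This is not the kernel function: the Pcb class and both helpers are hypothetical, and ordering by id() is an assumption standing in for whatever fixed order the real code uses.

```python
import threading


class Pcb:
    """Toy protocol control block carrying only a lock."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()


def pcb_lock_pair(a: Pcb, b: Pcb) -> None:
    """Lock both PCBs in a consistent order; lock once if they are the same."""
    if a is b:
        a.lock.acquire()
        return
    # A fixed global order avoids deadlock when two threads lock the
    # same pair with the arguments swapped.
    first, second = sorted((a, b), key=id)
    first.lock.acquire()
    second.lock.acquire()


def pcb_unlock_pair(a: Pcb, b: Pcb) -> None:
    """Undo pcb_lock_pair(), again handling the same-object case."""
    a.lock.release()
    if a is not b:
        b.lock.release()
```

Handling the a-is-b case inside the helper is what spares each caller from checking it separately.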
Mark Johnston [Tue, 15 Sep 2020 19:21:58 +0000 (19:21 +0000)]
Improve unix socket PCB refcounting.
- Use refcount_init().
- Define an INVARIANTS-only zone destructor to assert that various
bits of PCB state aren't left dangling.
- Annotate unp_pcb_rele() with __result_use_check.
- Simplify control flow.
Reviewed by: glebius, kevans, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26295
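The reason a release routine's return value should not be ignored can be shown with a toy refcount. The class below is a hypothetical model, not the unpcb code: rele() reports whether the object was freed, and a caller that ignores the result risks touching freed state.

```python
class Refcounted:
    """Toy object with an explicit reference count."""

    def __init__(self):
        self.refs = 1          # the creator holds the first reference
        self.freed = False

    def hold(self) -> None:
        """Take an additional reference."""
        assert not self.freed
        self.refs += 1

    def rele(self) -> bool:
        """Drop a reference; return True if the object was freed."""
        assert self.refs > 0
        self.refs -= 1
        if self.refs == 0:
            self.freed = True
        return self.freed
```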
Mark Johnston [Tue, 15 Sep 2020 19:21:33 +0000 (19:21 +0000)]
Update unix domain socket locking comments.
- Define a locking key for unpcb members.
- Rewrite some of the locking protocol description to make it less
verbose and avoid referencing some subroutines which will be renamed.
- Reorder includes.
Reviewed by: glebius, kevans, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26294
The first issue was lack of quoting around INSTALLFLAGS, which set it
incorrectly and produced an error on -M.
The second issue was that we weren't actually doing the install in
unprivileged mode, making it effectively useless. This was designed to pass
through the proper metalog/unpriv flags to install(1), so just let it
happen.
Warner Losh [Tue, 15 Sep 2020 15:21:29 +0000 (15:21 +0000)]
Include sys/types.h here
It's included by header pollution in most of the compile
environments. However, in the standalone environment, it's not
included. Go ahead and include it always since the overhead is low and
it is simpler that way.
In D12421, the ability to compile stand/ in little-endian was added, with the
intention to extend loader.kboot to run in Petitboot.
However, no further work was done, as the kernel then gained self-execution
capabilities as Petitboot was taught to load FreeBSD kernels directly.
The FreeBSD installer on powerpc64 (on POWER8 and POWER9) uses
/boot/etc/kboot.conf instead of loader.
As this option does nothing but cause stand/ to be miscompiled and actively
causes confusion, remove it.
(I have a functioning petitboot loader in my local tree, however, it turned
out to be quite inconvenient to use due to the current petitboot plugin design
so I put it on hold.)
Warner Losh [Mon, 14 Sep 2020 23:51:14 +0000 (23:51 +0000)]
We don't need the sc_ekeys_lock in standalone environment.
When we bring in geli into the boot loader, we are single threaded so
we don't have to worry about locking. We have no mutexes, and don't need
to use them, so comment it out.
Warner Losh [Mon, 14 Sep 2020 23:30:04 +0000 (23:30 +0000)]
Don't do the busy dance in icee_open/close
We don't need to do the busy dance for this driver. It's handled by
destroy_dev() entirely. Since all we did was busy/unbusy in
open/close, just delete them. We therefore don't need to track closes
either.
Warner Losh [Mon, 14 Sep 2020 23:27:51 +0000 (23:27 +0000)]
Tweak what's visible in the standalone environment. We define offsetof
in stand.h typically, but when this is included we can define it
multiple times. However, we don't define bool in stand.h at the
moment, so allow it to be defined inside types.h when we're building
for the standalone environment.
Ian Lepore [Mon, 14 Sep 2020 17:33:28 +0000 (17:33 +0000)]
Add product ID strings for a couple of Microchip USB hubs. Also, update the
vendor ID string to say just "Microchip Technology" -- the buyout of
Standard Microsystems happened in 2012 and the SMC/SMSC names are pretty
much retired at this point.
Andrew Turner [Mon, 14 Sep 2020 16:18:53 +0000 (16:18 +0000)]
Cleanups for gprof:
* Remove identical or almost identical headers
* Only build aout.c on amd64 and i386. None of the other current
architectures ever supported running a.out binaries
* Enable on all architectures
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D26369
Andrew Turner [Mon, 14 Sep 2020 16:12:28 +0000 (16:12 +0000)]
Use MACHINE_CPUARCH when checking for arm64
Use MACHINE_CPUARCH with arm64 (aarch64) when we build code that could run
on any 64-bit Arm instruction set. This will simplify checks in downstream
consumers targeting prototype instruction sets.
The only place we check for MACHINE_ARCH == aarch64 is when building the
device tree blobs, as these target current-generation ISAs.
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D26370
Currently, the only thing that prevents a functioning 64-bit FICL build is
a few integer types that were intended to be fixed-width.
Changing them to C99 integer types allows building a functioning 64-bit
FICL.
While this isn't applicable to the default settings of any in-tree loaders,
it is necessary for a future Petitboot loader, due to the requirement that
it be compiled as a 64-bit program.
[PowerPC] Make cpu frequency detection endian-independent
On ibm,extended-clock-frequency, ensure we be64toh() the value.
On clock-frequency, remove the right-shifting hack (which was needed due to
reading a 32 bit value into a 64 bit variable) and switch to OF_getencprop()
for reading (which will handle endian conversion internally.)
Reviewed by: jhibbits (in irc)
Sponsored by: Tag1 Consulting, Inc.
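The right-shifting hack and its endian dependence can be reproduced in a small model. This is a sketch that assumes the property is a 4-byte big-endian cell; the helper names are hypothetical, and getencprop_u32() only imitates the conversion the message attributes to OF_getencprop().

```python
import struct


def read_u32_into_u64_slot(prop: bytes, host_order: str) -> int:
    """Model copying a 4-byte property into the start of a zeroed
    64-bit variable, then reading that variable on the given host."""
    slot = bytearray(8)
    slot[0:4] = prop
    return int.from_bytes(slot, host_order)


def getencprop_u32(prop: bytes) -> int:
    """Decode a 4-byte big-endian cell into a host integer."""
    return struct.unpack(">I", prop)[0]


freq = 512_000_000                      # hypothetical clock-frequency value
prop = struct.pack(">I", freq)          # as stored in the device tree

be_hack = read_u32_into_u64_slot(prop, "big") >> 32
le_hack = read_u32_into_u64_slot(prop, "little") >> 32
```

On the big-endian model the shift recovers the value; on the little-endian model it yields 0, while decoding the cell explicitly gives the same answer on both.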
Gordon Tetlow [Mon, 14 Sep 2020 14:45:30 +0000 (14:45 +0000)]
Partially revert r346018 and use the if/then construct instead of shell.
There are a couple of places in the tree that directly parse the newvers.sh
script looking for the BRANCH variable. I found two locations, one in
release/Makefile and the other in bin/freebsd-version/Makefile.
While there is a good argument that BRANCH_OVERRIDE should properly
propagate in those circumstances and the new behavior is thus better, the
reality is this change broke freebsd-update's ability to find timestamps in
binaries and resulted in a large number of gratuitous changes.
Reported by: freebsd-update
Discussed with: cperciva
MFC after: 1 day
Andrew Turner [Mon, 14 Sep 2020 08:59:16 +0000 (08:59 +0000)]
Allow for interrupts on pl061 children
Add enough infrastructure for interrupts on children of the pl061 GPIO
controller. As gpiobus already provided these, the pl061 driver also needs
to pass requests up the newbus hierarchy.
Currently there are no children that expect to configure interrupts, however
this is expected to change to support the ACPI Event Information interface.
Alex Richardson [Mon, 14 Sep 2020 08:51:18 +0000 (08:51 +0000)]
pfctl_test: avoid 200 calls to atf_get_srcdir
I have been trying to reduce the time that testsuite runs take for CheriBSD
on QEMU (currently about 22 hours). One of the slowest tests is pfctl_test:
Just listing the available test cases currently takes 98 seconds on a
CheriBSD RISC-V system due to all the processes being spawned. This trivial
patch reduces the time to 92 seconds. The better solution would be to
rewrite the test in C/C++ which I may do as a follow-up change.
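The general idea of paying for an expensive lookup once can be sketched as follows. This is not the pfctl_test change itself, which lives in a shell-based test; get_srcdir_slow() and the path it returns are hypothetical stand-ins for a lookup that spawns a process.

```python
import functools

calls = {"count": 0}


def get_srcdir_slow() -> str:
    """Stand-in for an expensive lookup that spawns a helper process."""
    calls["count"] += 1
    return "/hypothetical/srcdir"


@functools.lru_cache(maxsize=None)
def get_srcdir() -> str:
    """Cached wrapper: the slow lookup runs at most once."""
    return get_srcdir_slow()


def list_test_cases(n: int):
    """Build n test case paths, each of which needs the source directory."""
    return [f"{get_srcdir()}/case{i}" for i in range(n)]
```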
Scott Long [Mon, 14 Sep 2020 05:58:12 +0000 (05:58 +0000)]
Refine the busdma template interface. Provide tools for filling in fields
that can be extended, but also ensure compile-time type checking. Refactor
common code out of arch-specific implementations. Move the mpr and mps
drivers to this new API. The template type remains visible to the consumer
so that it can be allocated on the stack, but should be considered opaque.
__FreeBSD_version bump for r365605 (crunchgen producing WARNS-clean)
The change in D26397 will need a __FreeBSD_version to base off of for
bootstrapping crunchgen, to prevent avoidable build failures just because the
host has an outdated crunchgen.
Rick Macklem [Mon, 14 Sep 2020 00:44:50 +0000 (00:44 +0000)]
Fix a case where the NFSv4.0 server might crash if delegations are enabled.
asomers@ reported a crash on an NFSv4.0 server with a backtrace of:
kdb_backtrace
vpanic
panic
nfsrv_docallback
nfsrv_checkgetattr
nfsrvd_getattr
nfsrvd_dorpc
nfssvc_program
svc_run_internal
svc_thread_start
fork_exit
fork_trampoline
where the panic message was "docallb", which indicates that a callback
was attempted when the ClientID is unconfirmed.
This would not normally occur, but it is possible to have an unconfirmed
ClientID structure with delegation structure(s) chained off it if the
client were to issue a SetClientID with the same "id" but different
"verifier" after acquiring delegations on the previously confirmed ClientID.
The bug appears to be that nfsrv_checkgetattr() failed to check for
this uncommon case of an unconfirmed ClientID with a delegation structure
that no longer refers to a delegation the client knows about.
This patch adds a check for this case, handling it as if no delegation
exists, which is the case when the above occurs.
Although difficult to reproduce, this change should avoid the panic().
Colin Percival [Sun, 13 Sep 2020 19:56:53 +0000 (19:56 +0000)]
Spawn the DHCPv6 client in EC2 instances via rtsold.
Prior to this commit, EC2 AMIs used a "dual-dhclient" tool which was
launched in place of dhclient and spawned both the base system dhclient
for IPv4 and the ISC dhclient from ports for IPv6.
Now that rtsold supports the "M bit" (managed configuration), we can go
back to having the base system dhclient spawned normally, and provide a
script to rtsold which spawns the ISC dhclient from ports when rtsold
decides that it is appropriate.
Thanks to: bz
MFC after: 1 week
Sponsored by: https://www.patreon.com/cperciva
Colin Percival [Sun, 13 Sep 2020 19:11:45 +0000 (19:11 +0000)]
Bump the size of EC2 AMIs up to 5 GB.
The FreeBSD base system continues to expand. 4 GB is now insufficient;
we passed 3 GB in May 2019; we passed 2 GB in August 2017. Over half
of the disk space used is in /usr/lib/debug/.
Without this change, instances boot but are unusable, since the first
thing which breaks when VM filesystems are too small is the "pkg install"
in the VM building process.
Mike Karels [Sat, 12 Sep 2020 23:49:43 +0000 (23:49 +0000)]
bcm2838_pci.c: Respect DMA limits of controller.
Fixes for Raspberry Pi 4B PCIe / USB:
- Pass through a DMA tag for the controller.
- In theory the controller can access the lower 3 GB, but testing found
that unreliable. OpenBSD also restricts DMA to the lowest 960 MiB.
- Rename some constants to be a bit more meaningful.
Submitted by: Robert Crowston, crowston at protonmail.com
Reviewed by: mkarels, outside reviewers
Differential Revision: https://reviews.freebsd.org/D26344
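The two address limits mentioned above translate into the following bounds. This is an arithmetic sketch only: dma_ok() is a hypothetical helper, and applying the 960 MiB bound to this driver is an assumption based on the OpenBSD comparison.

```python
MIB = 1 << 20
GIB = 1 << 30

THEORETICAL_LIMIT = 3 * GIB      # what the controller can reach in theory
CONSERVATIVE_LIMIT = 960 * MIB   # the bound the message attributes to OpenBSD


def dma_ok(addr: int, length: int, limit: int) -> bool:
    """True if the buffer [addr, addr + length) lies entirely below limit."""
    return addr >= 0 and length > 0 and addr + length <= limit
```

A page ending exactly at 0x3C000000 passes the conservative check, while one byte more does not.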
Warner Losh [Sat, 12 Sep 2020 17:24:04 +0000 (17:24 +0000)]
Update flp test for new diskinfo output
The floppy test passes with this. The others fail due to 'integrity
checks' failing in GPART. It's not at all clear whether those integrity
checks are legit or whether the test samples were bogusly generated
by FreeBSD.
Michael Tuexen [Sat, 12 Sep 2020 11:24:36 +0000 (11:24 +0000)]
Fix the length of probe packets when using UDP.
Since https://svnweb.freebsd.org/changeset/base/365378 a raw socket is
used for sending UDP probe packets instead of a UDP socket. So don't
compensate for the UDP header anymore.
Michael Tuexen [Sat, 12 Sep 2020 11:19:54 +0000 (11:19 +0000)]
Simplify code, no functional change.
Since https://svnweb.freebsd.org/base?view=revision&revision=365378
UDP is handled the same way as SCTP and TCP (using a raw socket).
Therefore use the same code path.
amd64: prevent KCSan false positives on LAPIC mapping
For configurations without x2APIC support (guests, older hardware), the global
LAPIC MMIO mapping will trigger false-positive KCSan reports as it will appear
that multiple CPUs are concurrently reading and writing the same address.
This isn't actually true, as the underlying physical access will be performed
on the local CPU's APIC. Additionally, because LAPIC access can happen during
event timer configuration, the resulting KCSan printf can produce a panic due
to attempted recursion on event timer resources.
Add a __nosanitizethread preprocessor define to prevent the compiler from
inserting TSan hooks, and apply it to the x86 LAPIC accessors.
This update adds support for:
- HW VLAN tagging
- HW checksum offload for IPv4 and IPv6
- tx and rx aggregation (for full gige speeds)
- multiple transactions
In my testing, I am able to get 900-950Mbps, depending on whether
TCP or UDP is used, which is a significant improvement over the previous
91Mbps (~8kint/sec*1500bytes/packet*1packet/int).
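The back-of-envelope estimate for the old rate can be checked directly. A sketch, taking the figures from the message at face value; reading "Mbps" as 2**20 bits per second is an assumption made only to show how a figure near 91 can arise.

```python
def throughput_bits_per_sec(interrupts_per_sec: int,
                            bytes_per_packet: int,
                            packets_per_interrupt: int) -> int:
    """Bits per second when each interrupt delivers a fixed number of packets."""
    return interrupts_per_sec * packets_per_interrupt * bytes_per_packet * 8


old = throughput_bits_per_sec(8_000, 1_500, 1)
old_mbit = old / 1_000_000     # decimal megabits per second
old_mibit = old / (1 << 20)    # binary (2**20) megabits per second
```

With one packet per interrupt the ceiling is 96 decimal megabits per second, or about 91.6 in binary units, which is why delivering several packets per interrupt is needed to approach gigabit rates.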