Toomas Soome [Sat, 16 Jan 2021 17:51:23 +0000 (19:51 +0200)]
loader.efi: handle multiple gop instances
Some systems may provide multiple GOP instances and not all are
bound to hardware. The current loader is picking up the first GOP,
which may not be usable. Instead we load the GOP handle array,
and test every handle to have registered ConOut protocol. If ConOut is
present, we can use this GOP handle to open GOP protocol.
Marius Strobl [Sat, 16 Jan 2021 15:58:38 +0000 (16:58 +0100)]
man4: bring back ofw_console.4 and openfirm.4
Back when I wrote openfirm.4, sparc64 was the only architecture to
include the corresponding device. However, nowadays all supported
architectures will provied this Open Firmware interface, even x86
when built with FDT support.
As for ofw_console(4), powerpc actually was the first architecture
to ship it but we'll probably not see another consumer in future.
rtinit[1]() is a function used to add or remove interface address prefix routes,
similar to ifa_maintain_loopback_route().
It was intended to be family-agnostic. There is a problem with this approach
in reality.
1) IPv6 code does not use it for the ifa routes. There is a separate layer,
nd6_prelist_(), providing interface for maintaining interface routes. Its part,
responsible for the actual route table interaction, mimics rtenty() code.
2) rtinit tries to combine multiple actions in the same function: constructing
proper route attributes and handling iterations over multiple fibs, for the
non-zero net.add_addr_allfibs use case. It notably increases the code complexity.
3) dstaddr handling. flags parameter re-uses RTF_ flags. As there is no special flag
for p2p connections, host routes and p2p routes are handled in the same way.
Additionally, mapping IFA flags to RTF flags makes the interface pretty messy.
It make rtinit() to clash with ifa_mainain_loopback_route() for IPV4 interface
aliases.
4) rtinit() is the last customer passing non-masked prefixes to rib_action(),
complicating rib_action() implementation.
5) rtinit() coupled ifa announce/withdrawal notifications, producing "false positive"
ifa messages in certain corner cases.
To address all these points, the following has been done:
* rtinit() has been split into multiple functions:
- Route attribute construction were moved to the per-address-family functions,
dealing with (2), (3) and (4).
- funnction providing net.add_addr_allfibs handling and route rtsock notificaions
is the new routing table inteface.
- rtsock ifa notificaion has been moved out as well. resulting set of funcion are only
responsible for the actual route notifications.
Side effects:
* /32 alias does not result in interface routes (/32 route and "host" route)
* RTF_PINNED is now set for IPv6 prefixes corresponding to the interface addresses
Mariusz Zaborski [Sat, 16 Jan 2021 11:55:42 +0000 (12:55 +0100)]
cat: persistent errno
There is no guarantee that after close(2)/free the errno will remain
persistent. The caller of the udom_open function depends on the errno
for reporting errors.
AMD 10GbE hardware is designed to have two buffers per receive descriptor to
support split header feature. For this purpose, the driver was designed to use
2 iflib freelists per receive queue. So, that buffers from 2 freelists are used
to refill an entry in the receive descriptor. The current design holds good
with regular data traffic.
But, when netmap comes into play, the current design will not fit in. The
current netmap interfaces and netmap implementation in iflib doesn't seem
to accomodate the design of 2 freelists per receive queue. So, exercising
Netmap capability with inbuilt tools like bridge, pkt-gen doesn't work with
the 2 freelists driver design.
So, the driver design is changed to accomodate the current netmap interfaces
and netmap implementation in iflib by using single freelist per receive queue
approach when Netmap capability is exercised without disturbing the current
2 freelists approach.
The dev.ax.sph_enable tunable can be set to 0 to configure the single
free list mode.
Thanks to Stephan Dewt for his Initial set of code changes for the stated
problem.
Kyle Evans [Sat, 16 Jan 2021 06:02:43 +0000 (00:02 -0600)]
bectl: remove spurious aok variable
This rode in with the OpenZFS import. It may have been necessary at some
point, but it is no longer and it breaks the WITHOUT_DYNAMICROOT build as
it collides with the definition in libspl.
Kyle Evans [Sat, 16 Jan 2021 05:58:12 +0000 (23:58 -0600)]
bectl: tests: use -R <mount> instead of specifying altroot
-R is currently shorthand for cachefile=none, altroot=<mount>. This is
functionally the same, but perhaps more resilient to future changes that
could be necessary that may be added when -R is specified.
Kirk McKusick [Sat, 16 Jan 2021 00:33:00 +0000 (16:33 -0800)]
Eliminate a locking panic when cleaning up UFS snapshots after a
disk failure.
Each vnode has an embedded lock that controls access to its contents.
However vnodes describing a UFS snapshot all share a single snapshot
lock to coordinate their access and update. As part of mounting a
UFS filesystem with snapshots, each of the vnodes describing a
snapshot has its individual lock replaced with the snapshot lock.
When the filesystem is unmounted the vnode's original lock is
returned replacing the snapshot lock.
When a disk fails while the UFS filesystem it contains is still
mounted (for example when a thumb drive is removed) UFS forcibly
unmounts the filesystem. The loss of the drive causes the GEOM
subsystem to orphan the provider, but the consumer remains until
the filesystem has finished with the unmount. Information describing
the snapshot locks was being prematurely cleared during the orphaning
causing the return of the snapshot vnode's original locks to fail.
The fix is to not clear the needed information prematurely.
Kirk McKusick [Sat, 16 Jan 2021 00:00:17 +0000 (16:00 -0800)]
Eliminate lock order reversal in UFS when unmounting filesystems
with snapshots.
Each vnode has an embedded lock that controls access to its contents.
However vnodes describing a UFS snapshot all share a single snapshot
lock to coordinate their access and update. As part of mounting a
UFS filesystem with snapshots, each of the vnodes describing a
snapshot has its individual lock replaced with the snapshot lock.
When the filesystem is unmounted the vnode's original lock is
returned replacing the snapshot lock.
The lock order reversal happens because vnode locks must be acquired
before snapshot locks. When unmounting we must lock both the snapshot
lock and the vnode lock before swapping them so that the vnode will
be continuously locked during the swap. For each vnode representing
a snapshot, we must first acquire the snapshot lock to ensure
exclusive access to it and its original lock. We then face a lock
order reversal when we try to acquire the original vnode lock. The
problem is eliminated by doing a non-blocking exclusive lock on the
original lock which will always succeed since there are no users
of that lock.
Kyle Evans [Fri, 15 Jan 2021 14:15:40 +0000 (08:15 -0600)]
lualoader: use floor division to get correct type
This fixes the positioning of the "Welcome to FreeBSD" heading, which was
misplaced after the recent update to Lua 5.4. The issue was previously
masked by a compatibility knob in Lua 5.3 that would cause float-tagged
numbers to render faithfully without the decimal component. Lua 5.4 dropped
that and ensures that it always prints a decimal component, even if it has
to append a ".0" to the value.
Standard division produces a "float", floor division (//) can be used to
guarantee an integer. Floating point operations have been completely ripped
out of the liblua compiled for the bootloader, so this is a nop. This is
decidedly better than trying to hack out the float tag entirely.
Reported-by: mjg, probably others
MFC-after: 3 days
Gleb Smirnoff [Mon, 11 Jan 2021 19:11:29 +0000 (11:11 -0800)]
Add 'tmp' to the list of FILESYSTEMS dependencies. Some scripts that
depend on FILESYSTEMS run mktemp(1). For systems that have read-only
root this is broken until memory disk based /tmp is instantiated. At
least 'os-release' and 'motd' are subject to this problem.
Gleb Smirnoff [Mon, 11 Jan 2021 20:13:41 +0000 (12:13 -0800)]
Revert 97ec6eba653a07. There shouldn't be a dependency of 'tmp' on
remote filesystems. Discussed this with Brooks and he can't find
evidence that provoked the change in 2005. If anything gets broken
I will fix it in a different way, not via rc sequence change.
Andrew Turner [Fri, 15 Jan 2021 18:48:43 +0000 (18:48 +0000)]
Extract the logic from pmap_kextract
This allows us to use it when we only need to check if the virtual address
is valid. For example when checking if an address in the DMAP region is
mapped.
Emmanuel Vadot [Fri, 15 Jan 2021 14:28:46 +0000 (15:28 +0100)]
Switch to the new device-tree vendor tree
The old vendor tree was never fully merged and doing partial merge isn't
supported with git subtree merge so a new one was created.
Switch the build to use the new DTS from sys/contrib/device-tree
This also bump the DTS used to be in sync with Linux 5.9
While here change the way to get the linux version, simply hardcode
the value in sys/dts/freebsd-compatible.dts and use awk to get that
to put it in the CFLAGS.
As a bonus we now have the bindings docs available
in sys/contrib/device-tree/Bindings/ so no need to link to the Linux repo
or to the vendor tree.
It changed the #pinctrl-cells value to be equal to 2 and the macro
that generates the values.
Based on the bindings docs a value of 2 is only acceptable if the node
used pinctrl-single,bits and not pinctrl-single,pins
This allow booting further on the beaglebone black with 5.9 DTS
Simon J. Gerraty [Fri, 15 Jan 2021 01:33:05 +0000 (17:33 -0800)]
pkgfs_open: follow symlinks
Caller is not interested in symlinks follow them.
Throw an error if too many links encountered.
Reviewed by: stevek
Sponsored by: Juniper Networks
--This line, and those below, will be ignored--
> Description of fields to fill in above: 76 columns --|
> PR: If a GNATS PR is affected by the change.
> Differential Revision: https://reviews.freebsd.org/D### (*full* phabric URL needed).
> Submitted by: If someone else sent in the change.
> Reviewed by: If someone else reviewed your modification.
> Approved by: If you needed approval for this commit.
> Obtained from: If the change is from a third party.
> MFC after: N [day[s]|week[s]|month[s]]. Request a reminder email.
> MFH: Ports tree branch name. Request approval for merge.
> Relnotes: Set to 'yes' for mention in release notes.
> Security: Vulnerability reference (one per line) or description.
> Sponsored by: If the change was sponsored by an organization.
> Empty fields above will be automatically removed.
Ed Maste [Wed, 13 Jan 2021 18:08:31 +0000 (13:08 -0500)]
elfctl: prefix disable flags with "no"
Some ELF feature flags indicate a request to opt-out of some feature,
for example NT_FREEBSD_FCTL_ASLR_DISABLE indicates that ASLR should be
disabled for the tagged binary. Using "aslr" as the short name for the
flag is confusing as it seems to indicate a request for ASLR to be
enabled. Rename "noaslr", and make a similar change for other opt-out
flags.
Reviewed by: bapt, manu, markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28139
Ed Maste [Wed, 13 Jan 2021 19:21:38 +0000 (14:21 -0500)]
elfctl: add backwards compatibility for "no" prefixes
I am going to prefix opt-out ELF feature flag names with "no" to make
their meaning more clear (review D28139), but there are some uses of the
existing names already (e.g., the PR referenced below).
For now accept the older, unprefixed name as well, and emit a warning.
We can revert this after FreeBSD 13 branches.
Michael Tuexen [Wed, 13 Jan 2021 21:48:17 +0000 (22:48 +0100)]
tcp: add sysctl to tolerate TCP segments missing timestamps
When timestamp support has been negotiated, TCP segements received
without a timestamp should be discarded. However, there are broken
TCP implementations (for example, stacks used by Omniswitch 63xx and
64xx models), which send TCP segments without timestamps although
they negotiated timestamp support.
This patch adds a sysctl variable which tolerates such TCP segments
and allows to interoperate with broken stacks.
Add the *.swp and *~ pattern for vim temporary files. Expand the
cscope ones to include all files possibly generated by cscope and also
add some known object formats.
Also remove the leading '?' from cscope.out, or else it doesn't match
the cscope.out file generated by default (as it expects an extra
character in front).
While the package names are a bit prettier this confuse pkg about upgrading :
$ pkg version -t 13.0.s2021011313063 13.0.a1
>
$ pkg version -t 13.0.s2021011313063 13.0_ALPHA1
<
Note that the current scheme isn't good when bumping from ALPHA to BETA or
even BETA to RC:
$ pkg version -t 13.0_ALPHA1 13.0_BETA1
=
$ pkg version -t 13.0_BETA1 13.0_RC1
=
But more thoughts have to be put into this renaming.
Adrian Chadd [Tue, 12 Jan 2021 21:13:20 +0000 (13:13 -0800)]
[mips] revert r366664 - flip mips back from -O2 to -O
Now that I have -head fitting in 8MB of flash again, I can test
out freebsd-head on my home AP test setup. Unfortunately,
the introduction of -O2 in r366664 causes the following infinite
loop shortly after boot:
------
MAP: No valid partition found at map/rootfs.uzip
Warning: no time-of-day clock registered, system time will not be set accurately
start_init: trying /sbin/init
BAD_PAGE_FAULT: pid 1 tid 100001 (init), uid 0: pc 0x4042c320 got a read fault (type 0x2) at 0x2e3a0
Trapframe Register Dump:
zero: 0 at: 0 v0: 0 v1: 0
a0: 0x1af34 a1: 0 a2: 0 a3: 0x7fffeff0
t0: 0 t1: 0 t2: 0 t3: 0
t4: 0 t5: 0 t6: 0 t7: 0
t8: 0 t9: 0x152e8 s0: 0x7fffee84 s1: 0
s2: 0 s3: 0 s4: 0 s5: 0
s6: 0 s7: 0 k0: 0 k1: 0
gp: 0x362c0 sp: 0x7fffedf0 s8: 0 ra: 0x40417df0
sr: 0xf413 mullo: 0 mulhi: 0 badvaddr: 0x2e3a0
cause: 0xffffffff80000008 pc: 0x4042c31c
Page table info for pc address 0x4042c320: pde = 0x80712000, pte = 0xa002065a
Dumping 4 words starting at pc address 0x4042c320: 8f9980e0808200001040006700809825
Page table info for bad address 0x2e3a0: pde = 0, pte = 0
------
I'm not yet sure why, but until I figure it out with the mips64/cheri
folk this should be reverted.
This should only use -O on GCC generated code for MIPS platforms.
Kyle Evans [Thu, 14 Jan 2021 06:34:29 +0000 (00:34 -0600)]
build: `make check`: use a PATH search instead for Kyua
which(1) accepts both relative/absolute paths as well as lone binary
names. Set KYUA to kyua and use which(1) to confirm that it can find one;
if it cannot, just advise the user to set KYUA directly to the kyua binary
rather than assuming a relative location from LOCALBASE.
This allows `make check` to be operated with the version of kyua in base
without losing the flexibility of specifying another one.
ngie@ notes that the original intention was to avoid redundant $PATH lookups
and improve the determinism of the target. A future change will likely push
us back to this state, perhaps in the form of reverting this entirely and
just switching to using kyua in base. Accepting any in $PATH should be
considered a transitional move, at least until it's declared otherwise,
since kyua was only semi-recently added to base.
Kyle Evans [Thu, 14 Jan 2021 06:33:07 +0000 (00:33 -0600)]
tools: git hooks: drop "submitted by" from commit template
With the switch to git, we should strive to properly attribute every
commit appropriately with the metadata that's provided to do so. In this
case, the submitter should be recorded via the author metadata. Committing
an arbitrary patch, one can set it as such:
git commit --author="John Smith <smith@example.com>"
Mitchell Horne [Wed, 13 Jan 2021 18:30:50 +0000 (14:30 -0400)]
arm64: fix early devmap assertion
The purpose of this KASSERT is to ensure that we do not run out of space
in the early devmap. However, the devmap grew beyond its initial size of
2MB in r336519, and this assertion did not grow with it.
A devmap mapping of a 1080p framebuffer requires 1920x1080 bytes, or
1.977 MB, so it is just barely able to fit without triggering the
assertion, provided no other devices are mapped before it. With the
addition of `options GDB` in GENERIC by bbfa199cbc16, the uart is now
mapped for the purposes of a debug port, before mapping the framebuffer.
The presence of both these conditions pushes the selected virtual
address just below the threshold, triggering the assertion.
To fix this, use the correct size of the devmap, defined by
PMAP_MAPDEV_EARLY_SIZE. Since this code is shared with RISC-V, define
it for that platform as well (although it is a different size).
PR: 25241
Reported by: gbe
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
John Baldwin [Wed, 13 Jan 2021 21:13:01 +0000 (13:13 -0800)]
Enable accelerated AES-XTS software crypto in GENERIC.
In particular, using GELI on a root filesystem will only use
accelerated software crypto drivers if they are available before the
root filesystem is mounted. While these modules can be loaded from
the loader, including them in GENERIC provides a better out-of-the-box
experience for users.
Both aesni(4) and armv8crypto(4) provide accelerated implementations
of the default cipher used by GELI (AES-XTS) in addition to other
ciphers.
Kristof Provost [Wed, 13 Jan 2021 18:30:01 +0000 (19:30 +0100)]
pf: Don't hold PF_RULES_WLOCK during copyin() on DIOCRCLRTSTATS
We cannot hold a non-sleepable lock during copyin(). This means we can't
safely count the table, so instead we fall back to the pf_ioctl_maxcount
used in other ioctls to protect against overly large requests.
Emmanuel Vadot [Wed, 13 Jan 2021 17:23:51 +0000 (18:23 +0100)]
Add driver for Synopsys Designware Watchdog timer.
This driver supports some arm and arm64 boards equipped with
"snps,dw-wdt"-compatible watchdog device.
Tested on RK3399-based board (RockPro64).
Once started watchdog device cannot be stopped.
Interrupt handler has mode to kick watchdog even when software does not do it
properly.
This can be controlled via sysctl: dev.dwwdt.prevent_restart.
Also - driver handles system shutdown and prevents from restart when system
is asked to reboot.
Toomas Soome [Wed, 13 Jan 2021 17:05:51 +0000 (19:05 +0200)]
loader.efi: initial terminal size should base on UEFI terminal size
We do select font based on desired terminal size, we do query
UEFI terminal size with conout->QueryMode(), but by mistake, the fallback
values are used.