tsoome [Wed, 5 Aug 2020 14:32:20 +0000 (14:32 +0000)]
MFOpenZFS: Add support for boot environment data to be stored in the label
We are building new bootonce mechanism (previously zfs bootnext) and it is
based on this OpenZFS change. Since this patch is nicely self contained,
I am commiting it as is, and we can stack our changes.
Original patch description follows:
Modern bootloaders leverage data stored in the root filesystem to
enable some of their powerful features. GRUB specifically has a grubenv
file which can store large amounts of configuration data that can be
read and written at boot time and during normal operation. This allows
sysadmins to configure useful features like automated failover after
failed boot attempts. Unfortunately, due to the Copy-on-Write nature
of ZFS, the standard behavior of these tools cannot handle writing to
ZFS files safely at boot time. We need an alternative way to store
data that allows the bootloader to make changes to the data.
This work is very similar to work that was done on Illumos to enable
similar functionality in the FreeBSD bootloader. This patch is different
in that the data being stored is a raw grubenv file; this file can store
arbitrary variables and values, and the scripting provided by grub is
powerful enough that special structures are not required to implement
advanced behavior.
We repurpose the second padding area in each label to store the grubenv
file, protected by an embedded checksum. We add two ioctls to get and
set this data, and libzfs_core and libzfs functions to access them more
easily. There are no direct command line interfaces to these functions;
these will be added directly to the bootloader utilities.
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #10009
Obtained from: OpenZFS
Sponsored by: Netflix, Klara Inc.
gbe [Wed, 5 Aug 2020 11:41:41 +0000 (11:41 +0000)]
environ(7): Update the description and include some more environment variables
- Add a better introduction to the DESCRIPTION section
- Add a description for MANPATH and POSIXLY_CORRECT
- Asorted improvements for the usage of some macros
mjg [Wed, 5 Aug 2020 09:25:59 +0000 (09:25 +0000)]
cache: convert the hash from LIST to SLIST
This reduces struct namecache by sizeof(void *).
Negative side is that we have to find the previous element (if any) when
removing an entry, but since we normally don't expect collisions it should be
fine.
Note this adds cache_get_hash calls which can be eliminated.
mjg [Wed, 5 Aug 2020 09:24:38 +0000 (09:24 +0000)]
cache: reduce zone alignment to 8 bytes
It used to be sizeof of the given struct to accomodate for 32 bit mips
doing 64 bit loads, but the same can be achieved with requireing just
64 bit alignment.
While here reorder struct namecache so that most commonly used fields
are closer.
Upper level protocols defer checksums calculation in hope we have
checksums offloading in a network card. CSUM_DELAY_DATA flag is used
to determine that checksum calculation was deferred. And IP output
routine checks for this flag before pass mbuf to lower layer. Forwarded
packets have not this flag.
NAT64 uses checksums adjustment when it translates IP headers.
In most cases NAT64 is used for forwarded packets, but in case when it
handles locally originated packets we need to finish checksum calculation
that was deferred to correctly adjust it.
Add check for presence of CSUM_DELAY_DATA flag and finish checksum
calculation before adjustment.
Reported and tested by: Evgeniy Khramtsov <evgeniy at khramtsov org>
MFC after: 1 week
mjg [Tue, 4 Aug 2020 23:07:00 +0000 (23:07 +0000)]
cache: add NCF_WIP flag
This allows making half-constructed entries visible to the lockless lookup,
which now can check for either "not yet fully constructed" and "no longer valid"
state.
mjg [Tue, 4 Aug 2020 23:04:29 +0000 (23:04 +0000)]
cache: add cache_purge_vgone
cache_purge locklessly checks whether the vnode at hand has any namecache
entries. This can race with a concurrent purge which managed to remove
the last entry, but may not be done touching the vnode.
Make sure we observe the relevant vnode lock as not taken before proceeding
with vgone.
Paired with the fact that doomed vnodes cannnot receive entries this restores
the invariant that there are no namecache-related writing users past cache_purge
in vgone.
kibab [Tue, 4 Aug 2020 21:58:43 +0000 (21:58 +0000)]
Minor cleanups in mmc_xpt.c
* Downgrade some CAM debug messages from _INFO to _DEBUG level;
* Add KASSERT for the case when we suspect incorrect CAM SIM initialization (using cam_sim_alloc() instead of cam_sim_alloc_dev());
* Use waiting version of xpt_alloc_ccb(), we are not in hurry;
* With the waiting version we cannot get NULL return, so remove the NULL check;
* In some csses, the name of mmcprobe_done has been written as mmc_probedone();
* Send AC_LOST_DEVICE if we, well, lost the device;
* Misc style(9) fixes.
Reviewed by: manu
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D25843
1.) #if/#else wasn't taken into account at all for maxsyscall figures, but
2.) We didn't validate contiguous syscall numbers anyways...
This kind of inconsistency is bad as we don't currently ensure explicit
indexing of, e.g., the sysent array if one syscall is unimplemented/missing.
This could be fixed and might be more robust, but it's also good to have the
"documentation" that comes from being explicit as to what the missing
syscalls are.
The new version looks much like the awk version; stash off the current
'last highest syscall seen' if we hit an #if, restore to that if we hit an
#else, and make sure that we're explicitly always defining the next syscall.
The logic at the tail end of process_syscall_def that moves maxsyscall has
been 'cleaned up' a little since we're now ensuring that it's monotonically
increasing earlier in the function. At the moment I think it's unlikely we'd
see range-definitions that are not UNIMPL, but there's no reason to
specifically handle that case for bumping maxsyscall there.
This change was provoked by reading the commit message for r363832 and
realizing that this validation hadn't been included in the initial rewrite
to lua.
mav [Tue, 4 Aug 2020 19:27:03 +0000 (19:27 +0000)]
Remove extra memset() left after r342388.
This memset() wiped MPI2_FUNCTION_SCSI_TASK_MGMT set by mprsas_alloc_tm(),
that broke target reset on device removal, making later re-insertion into
the same slot impossible, since firmware was still waiting for the driver
to finish with the removed device.
markj [Tue, 4 Aug 2020 13:58:36 +0000 (13:58 +0000)]
Remove free_domain() and uma_zfree_domain().
These functions were introduced before UMA started ensuring that freed
memory gets placed in domain-local caches. They no longer serve any
purpose since UMA now provides their functionality by default. Remove
them to simplyify the kernel memory allocator interfaces a bit.
Reviewed by: cem, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25937
gbe [Tue, 4 Aug 2020 08:46:28 +0000 (08:46 +0000)]
directory(3): Add an ERRORS section
- Add an ERRORS section for opendir(3) and closedir(3)
- Document also the errors of readdir(3), readdir_r(3) and telldir(3)
- Convert the code sample into an EXAMPLES section
kevans [Tue, 4 Aug 2020 02:47:24 +0000 (02:47 +0000)]
bsdgrep: switch to libregex for GNU_GREP_COMPAT
libregex is incomplete, but it's a bit less buggy than the in-base
libgnuregex and mostly OK.
While here, rename -DIWTH_GNU -> -DWITH_GNU_COMPAT; the option implies
that we're compatible with the GNU counterpart, not that we're including GNU
anything.
kevans [Tue, 4 Aug 2020 02:14:51 +0000 (02:14 +0000)]
libregex: Implement a subset of the GNU extensions
The entire patch-set is not yet mature enough for commit, but this usable
subset is generally enough for googletest to be happy with and mostly map to
some existing concepts, so they're not as invasive.
The specific changes included here are:
- Branching in BREs with \|
- \w and \W for [[:alnum:]] and [^[:alnum:]] respectively
- \s and \S for [[:space:]] and [^[:space:]] respectively
- Additional quantifiers in BREs, \? and \+ (self-explanatory)
There's some #ifdef'd out work for allowing empty branches as a match-all.
This is a feature that's under assessment... future work will determine
how standard this behavior is and act accordingly.
The tests compare the command output (including of error cases) with the
expected output and exit code.
Not all tests are executed, since some expect to have a known good bc and
dc binary installed and compare results of large amounts of generated data
being processed by both versions to test for regressions.
This version omits the printing of a copyright header in interactive mode
and the dc command now exits after execution of the commands passed via -e
or -f instead of switching to interactive mode. To pass further commands
via STDIN when dc has been invoked with -e or -f, add "-f -" to the
parameter list.
This version omits the printing of a copyright header in interactive mode
and the dc command now exits after execution of the commands passed via -e
or -f instead of switching to interactive mode. To pass further commands
via STDIN when dc has been invoked with -e or -f, add "-f -" to the
parameter list.
This version makes dc exit after processing all commands passed via -e or -f
instead of waiting for more input on STDIN (add "-f -" to the command line
to emulate the behavior of versionm 3.1.3 and earlier, if desired).
The version and copyright message are no longer printed for interactive
sessions as was the case with the prior implementation in the FreeBSD base
system.
arichardson [Mon, 3 Aug 2020 18:08:10 +0000 (18:08 +0000)]
Allow bootstrapping mtree on Linux systems
Linux glibc has a dummy lchmod that always fails and emitting a linker
warning when used. Don't fail the build due to that warning when
bootstrapping by setting LD_FATAL_WARNINGS=no.
arichardson [Mon, 3 Aug 2020 18:08:04 +0000 (18:08 +0000)]
Allow building setmode.c on Linux/macOS
We bootstrap this file to allow compiling FreeBSD on Linux systems since
some boostrap tools use setmode(). Unfortunately, glibc's sys/stat.h
declares a non-static getumask() function (which is unimplemented!) and
that conflicts with the local getumask() function. To work around this
simply use a different name here.
jhb [Mon, 3 Aug 2020 17:53:15 +0000 (17:53 +0000)]
Pass the full CFLAGS to cpp for MKlib_gen.sh.
GCC's cpp was exiting immediately when it failed to find requested
includes (<ncurses_cfg.h> and <ncurses_defs.h>). clang-cpp emitted an
error for the missing header files but continued processing the file
(thus not honoring any macros defined in the missing headers).
Arguably, the awk script is buggy since it doesn't check the return
value of the command it executes.
andrew [Mon, 3 Aug 2020 16:43:40 +0000 (16:43 +0000)]
Allow the Raspberry Pi firmware driver to be a bus
There are child nodes in the device tree, e.g. the Raspberry Pi firmware
GPIO device. Add support for this to be a bus so we can attach these
children.
Reviewed by: manu
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D25848
andrew [Mon, 3 Aug 2020 16:26:10 +0000 (16:26 +0000)]
Allow child classes of simplebus to call attach directly
Reduce code duplication when a bus is subclassed from simplebus by allowing
them to call simplebus_attach directly. This is useful when the child bus
will just implement the same calls.
As not all children will expect to have a ranges property, e.g. the
Raspberry Pi firmware, allow this property to be missing.
Reviewed by: manu
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D25925
libc: Provide sub fp(s|g)etmask() implementations for RISC-V
RISC-V doesn't support floating-point exceptions.
RISC-V Instruction Set Manual: Volume I: User-Level ISA, 11.2 Floating-Point
Control and Status Register: "As allowed by the standard, we do not support
traps on floating-point exceptions in the base ISA, but instead require
explicit checks of the flags in software. We considered adding branches
controlled directly by the contents of the floating-point accrued exception
flags, but ultimately chose to omit these instructions to keep the ISA simple."
We still need these functions, because some applications (notably Perl) call
them, but we cannot provide a meaningful implementation.
andrew [Mon, 3 Aug 2020 10:19:50 +0000 (10:19 +0000)]
Handle Raspberry Pi 4 xhci firmware loading.
The newer hardware revisions of the Raspberry Pi 4 removed the ability of
the VIA VL805 xhci controller to load its own firmware. Instead the
firmware must be installed at the appropriate time by the VideoCore
coprocessor.
Submitted by: Robert Crowston <crowston_protonmail.com>
Differential Revision: https://reviews.freebsd.org/D25261
jah [Sun, 2 Aug 2020 20:18:37 +0000 (20:18 +0000)]
vt(4): CONS_HISTORY/CONS_CLRHIST should operate on issuing terminal
Currently the CONS_HISTORY and CONS_CLRHIST ioctls modify the state of the
active terminal instead of the terminal against which the ioctl was issued.
Because of the way vidcontrol(1) works, these are the same in most cases.
But a poorly-timed window switch can make them differ. This is reproducible
by issuing e.g. 'vidcontrol -s 2 && vidcontrol -C' to switch from vty 1 to
vty 2; teken will reset the cursor position on vty 1 but vt(4) will clear
the history buffer of vty 2, producing an interesting state of affairs.
mjg [Sun, 2 Aug 2020 20:02:06 +0000 (20:02 +0000)]
vfs: store precomputed namecache hash in the vnode
This significantly speeds up path lookup, Cascade Lake doing access(2) on ufs
on /usr/obj/usr/src/amd64.amd64/sys/GENERIC/vnode_if.c, ops/s:
before: 2535298
after: 2797621
Over +10%.
The reversed order of computation here does not seem to matter for hash
distribution.
0mp [Sun, 2 Aug 2020 16:59:14 +0000 (16:59 +0000)]
core(5) appeared in Version 1 AT&T UNIX
Based on the scans of manual pages available at
https://www.bell-labs.com/usr/dmr/www/man51.pdf,
which are a part of the following collection:
https://www.bell-labs.com/usr/dmr/www/1stEdman.html.