Warner Losh [Fri, 17 Jan 2020 01:20:48 +0000 (01:20 +0000)]
Small tweak to the default behavior of shutdown -c
'shutdown -c' is supposed to power cycle the system rather than doing a normal
reboot. However, when that fails, it halts the system. This is not quite right
since the intent isn't to halt the system but to restart. Make the default init
behavior be to restart the system. The halt(8) interface can be used if you'd
like to powercycle or halt.
Warner Losh [Fri, 17 Jan 2020 01:16:23 +0000 (01:16 +0000)]
We only want to send the speedup to the lower layers when there's a shortage.
Only send a speedup when there's a shortage. While this is a little racy, lost
races aren't a big deal for this function. If there's a shorage just popping up
after we check these values, then we'll catch it next time. If there's a
shortage that's just clearing up, we may do some work at the lower layers a
little sooner than we otherwise would have. Sicne shortages are relatively rare
events, both races are acceptable.
Warner Losh [Fri, 17 Jan 2020 01:16:19 +0000 (01:16 +0000)]
Use buf to send speedup
It turns out there's a problem with using g_io to send the speedup. It leads to
a race when there's a resource shortage when a disk fails.
Instead, send BIO_SPEEDUP via struct buf. This is pretty straight forward,
except we need to transfer the bio_flags from b_ioflags for BIO_SPEEDUP commands
in g_vfs_strategy.
Warner Losh [Fri, 17 Jan 2020 01:15:55 +0000 (01:15 +0000)]
Pass BIO_SPEEDUP through all the geom layers
While some geom layers pass unknown commands down, not all do. For the ones that
don't, pass BIO_SPEEDUP down to the providers that constittue the geom, as
applicable. No changes to vinum or virstor because I was unsure how to add this
support, and I'm also unsure how to test these. gvinum doesn't implement
BIO_FLUSH either, so it may just be poorly maintained. gvirstor is for testing
and not supportig BIO_SPEEDUP is fine.
Kristof Provost [Thu, 16 Jan 2020 22:08:05 +0000 (22:08 +0000)]
Fix pfdenied not returning any results
When _a is empty we end up with an invalid invocation of pfctl, and no output.
We must add quotes to make it clear to pfctl that we're passing an empty anchor
name.
Mateusz Guzik [Thu, 16 Jan 2020 21:45:21 +0000 (21:45 +0000)]
vfs: increment numvnodes without the vnode list lock unless under pressure
The vnode list lock is only needed to reclaim free vnodes or kick the vnlru
thread (or to block and not miss a wake up (but note the sleep has a timeout so
this would not be a correctness issue)). Try to get away without the lock by
just doing an atomic increment.
The lock is contended e.g., during poudriere -j 104 where about half of all
acquires come from vnode allocation code.
Note the entire scheme needs a rewrite, the above just reduces it's SMP impact.
Conrad Meyer [Thu, 16 Jan 2020 21:38:44 +0000 (21:38 +0000)]
random(6): Fix off-by-one
After r355693, random(6) -f sometimes fail to output all the lines of the
input file. This is because the range from which random indices are chosen
is too big, so occasionally the random selection doesn't correspond to any
line and nothing gets printed.
(Ed. note: Mea culpa. Working on r355693, I was confused by the sometime
use of 1-indexing, sometimes 0-indexing in randomize_fd().)
Submitted by: Ryan Moeller <ryan AT freqlabs.com>
X-MFC-With: r355693
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D23199
Alan Somers [Thu, 16 Jan 2020 21:31:56 +0000 (21:31 +0000)]
setextattr: Increase stdin buffer size to 4096
Extended attribute values can potentially be quite large. One test for ZFS
is supposed to set a 200MB xattr. However, the buffer size for reading
values from stdin with setextattr -i is so small that the test times out
waiting for tiny chunks of data to be buffered and appended to an sbuf.
Increasing the buffer size should help alleviate some of the burden of
reallocating larger sbufs when writing large extended attributes.
Submitted by: Ryan Moeller <ryan@freqlabs.com>
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D23211
Emmanuel Vadot [Thu, 16 Jan 2020 21:21:20 +0000 (21:21 +0000)]
arm64: rockchip: Add new interface for rk_pinctrl
The gpio controller in rockchips SoC in a child of the pinctrl driver
and cannot control pullups and pulldowns.
Use the new fdt_pinctrl interface for accessing pin capabilities and
setting them.
We can now report that every pins is capable of being IN or OUT function
and PULLUP PULLDOWN.
If the pin isn't in gpio mode no changes will be allowed.
Emmanuel Vadot [Thu, 16 Jan 2020 21:19:27 +0000 (21:19 +0000)]
fdt_pinctrl: Add new methods for gpios
Most of the gpio controller cannot configure or get the configuration
of the pin muxing as it's usually handled in the pinctrl driver.
But they can know what is the pinmuxing driver either because they are
child of it or via the gpio-range property.
Add some new methods to fdt_pinctrl that a pin controller can implement.
Some methods are :
fdt_pinctrl_is_gpio: Use to know if the pin in the gpio mode
fdt_pinctrl_set_flags: Set the flags of the pin (pullup/pulldown etc ...)
fdt_pinctrl_get_flags: Get the flags of the pin (pullup/pulldown etc ...)
Emmanuel Vadot [Thu, 16 Jan 2020 20:52:26 +0000 (20:52 +0000)]
regulator_fixed: Add a get_voltage method
Some consumer cannot know the voltage of the regulator without it.
While here, refuse to attach is min_voltage != max_voltage, it
shouldn't happens anyway.
Emmanuel Vadot [Thu, 16 Jan 2020 20:02:41 +0000 (20:02 +0000)]
arm: allwinner: Add support for bank supply
Each GPIO bank is powered by a different pin and so can be powered at different
voltage from different regulators.
Add a new config that now hold the pinmux data and the banks available on each
SoCs.
Since the aw_gpio driver being also the pinmux one it's attached before the PMIC
so add a config_intrhook_oneshot function that will enable the needed regulators
when the system is fully functional.
Emmanuel Vadot [Thu, 16 Jan 2020 19:59:00 +0000 (19:59 +0000)]
axp8xx: Add a regnode_init method
This method will set the desired voltaged based on values in the DTS.
It will not enable the regulator, this is the job of either a consumer
or regnode_set_constraint SYSINIT if the regulator is boot_on or always_on.
Alex Richardson [Thu, 16 Jan 2020 14:15:00 +0000 (14:15 +0000)]
Allow building bin/cat on non-FreeBSD systems
`cat -l` is needed during the installworld phase and other system's cat
don't support that flag. To avoid portability issues when compiling on
Linux/macOS (such as the the direct access to &fp->_mbstate), we
disable the entire multibyte support when building as a boostrap tool.
Alex Richardson [Thu, 16 Jan 2020 14:14:55 +0000 (14:14 +0000)]
Allow bootstrapping mkimg on macOS/Linux
On these systems the (u)int64_t typedefs will not be implicitly defined by the
previous includes, so include <stdint.h> in the header that uses uint64_t.
[MIPS][ELF] Use PC-relative relocations in .eh_frame when possible
When compiling position-independent executables, we now use
DW_EH_PE_pcrel | DW_EH_PE_sdata4. However, the MIPS ABI does not define a
64-bit PC-relative ELF relocation so we cannot use sdata8 for the large
code model case. When using the large code model, we fall back to the
previous behaviour of generating absolute relocations.
With this change clang-generated .o files can be linked by LLD without
having to pass -Wl,-z,notext (which creates text relocations).
This is simpler than the approach used by ld.bfd, which rewrites the
.eh_frame section to convert absolute relocations into relative references.
I saw in D13104 that apparently ld.bfd did not accept pc-relative relocations
for MIPS ouput at some point. However, I also checked that recent ld.bfd
can process the clang-generated .o files so this no longer seems true.
[MIPS] Don't emit R_(MICRO)MIPS_JALR relocations against data symbols
The R_(MICRO)MIPS_JALR optimization only works when used against functions.
Using the relocation against a data symbol (e.g. function pointer) will
cause some linkers that don't ignore the hint in this case (e.g. LLD prior
to commit 5bab291) to generate a relative branch to the data symbol
which crashes at run time. Before this patch, LLVM was erroneously emitting
these relocations against local-dynamic TLS function pointers and global
function pointers with internal visibility.
These two changes should allow using lld for MIPS64 (and maybe also MIPS32)
by default.
The second commit is not strictly necessary for clang+lld since LLD9 will
not perform the R_MIPS_JALR optimization (it was only added for 10) but it
is probably required in order to use recent ld.bfd.
Jeff Roberson [Thu, 16 Jan 2020 05:01:21 +0000 (05:01 +0000)]
Simplify VM and UMA startup by eliminating boot pages. Instead use careful
ordering to allocate early pages in the same way boot pages were but only
as needed. After the KVA allocator has started up we allocate the KVA that
we consumed during boot. This also makes the boot pages freeable since they
have vm_page structures allocated with the rest of memory.
Parts of this patch were written and tested by markj.
Leandro Lupori [Wed, 15 Jan 2020 20:25:52 +0000 (20:25 +0000)]
[PPC64] memcpy/memmove/bcopy optimization
For copies shorter than 512 bytes, the data is copied using plain
ld/std instructions.
For 512 bytes or more, the copy is done in 3 phases:
Phase 1: copy from the src buffer until it's aligned at a 16-byte boundary
Phase 2: copy as many aligned 64-byte blocks from the src buffer as possible
Phase 3: copy the remaining data, if any
In phase 2, this code uses VSX instructions when available. Otherwise,
it uses ldx/stdx.
Kirk McKusick [Wed, 15 Jan 2020 18:53:32 +0000 (18:53 +0000)]
Peter Holm reports that his test that does an umount(8) on an active
mount point while numerous tests are running that are writing to
files on that mount point cause the unmount(8) to hang forever.
The unmount(8) system call is handled in the kernel by the dounmount()
function. The cause of the hang is that prior to dounmount() calling
VFS_UNMOUNT() it is calling VFS_SYNC(mp, MNT_WAIT). The MNT_WAIT
flag indicates that VFS_SYNC() should not return until all the dirty
buffers associated with the mount point have been written to disk.
Because user processes are allowed to continue writing and can do
so faster than the data can be written to disk, the call to VFS_SYNC()
can never finish.
Unlike VFS_SYNC(), the VFS_UNMOUNT() routine can suspend all processes
when they request to do a write thus having a finite number of dirty
buffers to write that cannot be expanded. There is no need to call
VFS_SYNC() before calling VFS_UNMOUNT(), because VFS_UNMOUNT() needs
to flush everything again anyway after suspending writes, to catch
anything that was dirtied between the VFS_SYNC() and writes being
suspended.
The fix is to simply remove the unnecessary call to VFS_SYNC() from
dounmount().
Reported by: Peter Holm
Analysis by: Chuck Silvers
Tested by: Peter Holm
MFC after: 7 days
Sponsored by: Netflix
Kyle Evans [Wed, 15 Jan 2020 15:59:32 +0000 (15:59 +0000)]
mips trampoline: don't bother with unwind tables
The utility here seems somewhat limited, but clang will attempt to generate
.eh_frame and actively fail in doing so. It is perhaps worth investigating
why it's being generated in the first place (GCC doesn't do so), but this
isn't a high priority.
Ed Maste [Wed, 15 Jan 2020 13:52:13 +0000 (13:52 +0000)]
Update WITHOUT_BINUTILS* descriptions
In the WITHOUT_ descriptions we don't need to mention that ld.bfd is
limited to powerpc. When WITHOUT_BINUTILS is specified ld.bfd is not
installed on any CPU architecture.
Gleb Smirnoff [Wed, 15 Jan 2020 06:05:20 +0000 (06:05 +0000)]
Introduce NET_EPOCH_CALL() macro and use it everywhere where we free
data based on the network epoch. The macro reverses the argument
order of epoch_call(9) - first function, then its argument. NFC
Gleb Smirnoff [Wed, 15 Jan 2020 03:40:32 +0000 (03:40 +0000)]
Since this code dereferences struct ifnet, it must include if_var.h
explicitly, not via header pollution. While here move TCPSTATES
declaration right above the include that is going to make use of it.
Gleb Smirnoff [Wed, 15 Jan 2020 03:34:21 +0000 (03:34 +0000)]
- Move global network epoch definition to epoch.h, as more different
subsystems tend to need to know about it, and including if_var.h is
huge header pollution for them. Polluting possible non-network
users with single symbol seems much lesser evil.
- Remove non-preemptible network epoch. Not used yet, and unlikely
to get used in close future.
Mateusz Guzik [Wed, 15 Jan 2020 01:30:32 +0000 (01:30 +0000)]
rtld: remove hand rolled memset and bzero
They were introduced to take care of ifunc, but right now no architecture
provides ifunc'ed variants. Since rtld uses memset extensively this results in
a pessmization. Should someone want to use ifunc here they should provide a
mandatory symbol (e.g., rtld_memset).
Kirk McKusick [Tue, 14 Jan 2020 22:27:46 +0000 (22:27 +0000)]
When sync'ing a mount point, the mount point's vnodes were scanned
twice. Once to update the changed inodes, and a second time to update
changed quota information. This change merges these two scans into a
single scan which does both inode and quota updates.
Kyle Evans [Tue, 14 Jan 2020 17:50:13 +0000 (17:50 +0000)]
Revert r353140: Re-add ALLOW_MIPS_SHARED_TEXTREL, sprinkle it around
arichardson has an actual fix for the same issue that this was working
around; given that we don't build with llvm today, go ahead and revert the
workaround in advance.
Ed Maste [Tue, 14 Jan 2020 17:35:34 +0000 (17:35 +0000)]
Update WITH_/WITHOUT_CLANG_IS_CC descriptions
Describe /usr/bin/cc etc. as links to the compiler, and don't conflate
WITHOUT_CLANG_IS_CC with installing GCC. Leave a reference to WITH_GCC
and WITHOUT_CLANG_IS_CC installing links to GCC, although this will be
removed in ~1.5 months when GCC 4.2.1 is removed from the tree.
Andriy Gapon [Tue, 14 Jan 2020 13:20:16 +0000 (13:20 +0000)]
storvsc: port a Linux patch, properly set residual data length on errors
This change is based on Linux commit 40630f462824ee. csio.resid should
account for transfer_len only for success and SRB_STATUS_DATA_OVERRUN
condition.
I am not sure how exactly this change works, but I have a report from a
user that they see lots of checksum errors when running a pool scrub
concurrently with iozone -l 1 -s 100G. After applying this patch the
problem cannot be reproduced.
asprintf returns -1, not an arbitrary value < 0. Also upon error the
(very sloppy specification) leaves an undefined value in *ret, so it is
wrong to inspect it, the error condition is enough.
Alexander Motin [Tue, 14 Jan 2020 03:27:57 +0000 (03:27 +0000)]
Restore loop break in vm_pageout_lowmem().
r355004 removed return statement from this loop with intention to also
call uma_reclaim_wakeup(). But in case of vm.lowmem_period=0 it causes
infinite loop.
Ryan Libby [Tue, 14 Jan 2020 02:14:15 +0000 (02:14 +0000)]
uma: split slabzone into two sizes
By allowing more items per slab, we can improve memory efficiency for
small allocs. If we were just to increase the bitmap size of the
slabzone, we would then waste slabzone memory. So, split slabzone into
two zones, one especially for 8-byte allocs (512 per slab). The
practical effect should be reduced memory usage for counter(9).
Ryan Libby [Tue, 14 Jan 2020 02:14:02 +0000 (02:14 +0000)]
malloc: remove assumptions about MINALLOCSIZE
Remove assumptions about the minimum MINALLOCSIZE, in order to allow
testing of smaller MINALLOCSIZE. A following patch will lower the
MINALLOCSIZE, but not so much that the present patch is required for
correctness at these sites.
Jeff Roberson [Tue, 14 Jan 2020 02:00:24 +0000 (02:00 +0000)]
Fix a long standing bug in journaled soft-updates. The dirrem structure
needs to handle file removal, directory removal, file move, directory move,
etc. The code in handle_workitem_remove() needs to propagate any completed
journal entries to the write that will render the change stable. In the
case of a moved directory this means the new parent. However, for an
overwrite that frees a directory (DIRCHG) we must move the jsegdep to the
removed inode to be released when it is stable in the cg bitmap or the
unlinked inode list. This case was previously unhandled and caused a
panic.
netmap: disable passthrough with no hypervisor support
The netmap passthrough subsystem requires proper support in the
hypervisor. In particular, two PCI device ids (from the Red Hat
PCI vendor id 0x1b36) need to be assigned to the two netmap
virtual devices. We then disable these devices until the ids have
not been assigned, in order to avoid conflicts with other
virtual devices emulated by upstream QEMU.
vmx: fix initialization of TSO related descriptor fields
Fix a mistake introduced by r343291, which ported the vmx(4)
driver to iflib.
In case of TSO, the hlen field of the (first) tx descriptor must
be initialized to the cumulative length of Ethernet, IP and TCP
headers. The length of the TCP header was missing.
Dimitry Andric [Mon, 13 Jan 2020 20:31:10 +0000 (20:31 +0000)]
Merge commit f46ba4f07 from llvm git (by Simon Atanasyan):
[mips] Use less registers to load address of TargetExternalSymbol
There is no pattern matched `add hi, (MipsLo texternalsym)`. As a
result, loading an address of 32-bit symbol requires two registers
and one more additional instruction:
```
addiu $1, $zero, %lo(foo)
lui $2, %hi(foo)
addu $25, $2, $1
```
This patch adds the missed pattern and enables generation more
effective set of instructions:
```
lui $1, %hi(foo)
addiu $25, $1, %lo(foo)
```
Merge commit 59bb3609f from llvm git (by Simon Atanasyan):
[mips] Fix 64-bit address loading in case of applying 32-bit mask to
the result
If result of 64-bit address loading combines with 32-bit mask, LLVM
tries to optimize the code and remove "redundant" loading of upper
32-bits of the address. It leads to incorrect code on MIPS64 targets.
MIPS backend creates the following chain of commands to load 64-bit
address in the `MipsTargetLowering::getAddrNonPICSym64` method:
```
(add (shl (add (shl (add %highest(sym), %higher(sym)),
16),
%hi(sym)),
16),
%lo(%sym))
```
If the mask presents, LLVM decides to optimize the chain of commands.
It really does not make sense to load upper 32-bits because the
0x0fffffff mask anyway clears them. After removing redundant commands
we get this chain:
```
(add (shl (%hi(sym), 16), %lo(%sym))
```
There is no patterns matched `(MipsHi (i64 symbol))`. Due a bug in
`SYM_32` predicate definition, backend incorrectly selects a pattern
for a 32-bit symbols and uses the `lui` instruction for loading
`%hi(sym)`.
As a result we get incorrect set of instructions with unnecessary
16-bit left shifting:
```
lui at,0x0
R_MIPS_HI16 foo
dsll at,at,0x10
daddiu at,at,0
R_MIPS_LO16 foo
```
This patch resolves two problems:
- Fix `SYM_32/SYM_64` predicates to prevent selection of patterns
dedicated to 32-bit symbols in case of using N64 ABI.
- Add missed patterns for 64-bit symbols for `%hi/%lo`.
Toomas Soome [Mon, 13 Jan 2020 20:02:27 +0000 (20:02 +0000)]
Backout 356693. The libsa malloc does provide necessary alignment and
memalign by 4 will reduce alignment for some platforms. Thanks for Ian for
pointing this out.
Kyle Evans [Mon, 13 Jan 2020 18:26:27 +0000 (18:26 +0000)]
tap(4): also note that we drop configured addresses
This provides a specific pointer for users of tap(4) to understand why their
interfaces are losing their addresses, and specifically how to workaround
this if they need different behavior.
This manpage received a .Dd bump earlier today in r35688, so no bump occurs
this time.
Mateusz Guzik [Mon, 13 Jan 2020 14:33:51 +0000 (14:33 +0000)]
ufs: relax an overzealous assert added in r356671
Part of i_flag can persist across a drop to hold count of 0, at which
point the vnode is taken off the lazy list. Then whoever locks and unlocks
the vnode can trip on the assert.
This trips over kyua running a test untarring character devices to ufs.