After r343619 ipfw uses own locking for packets flow. PULLUP_LEN() macro
is used in ipfw_chk() to make m_pullup(). When m_pullup() fails, it just
returns via `goto pullup_failed`. There are two places where PULLUP_LEN()
is called with IPFW_PF_RLOCK() held.
Add PULLUP_LEN_LOCKED() macro to use in these places to be able release
the lock, when m_pullup() fails.
Sponsored by: Yandex LLC
NOTE: since r343619 was not merged, this commit is mostly NOP, but
it is needed to reduce code difference between stable and head/.
Emmanuel Vadot [Mon, 16 Dec 2019 18:00:05 +0000 (18:00 +0000)]
MFC r354115:
dtc: Allow multiple dts-v1 tag
Some dts are including dtsi that also contain a /dts-v1/ tag at the
top. GNU DTC doesn't seems to have a problem with that so fix our
dtc to behave the same.
Cy Schubert [Mon, 16 Dec 2019 02:38:47 +0000 (02:38 +0000)]
MFC r355670:
Rather than pass the address of the packet information control block to
ipf_pcksum6(), directly pass the adddress of the mbuf to it. This reduces
one pointer dereference. ipf_pcksum6() doesn't use the packet information
control block except to obtain the mbuf address.
Xin LI [Sat, 14 Dec 2019 09:49:36 +0000 (09:49 +0000)]
MFC r345744, r348122, r355247
r345744: random(4): Attempt to persist entropy promptly
r348122: save-entropy(8), rc.d/random: Set nodump flag
r355247: Reduce disk write load in /usr/libexec/save-entropy.
Scott Long [Fri, 13 Dec 2019 05:29:26 +0000 (05:29 +0000)]
Merge r355134,355375,355589
Clean up and clarify meta commentary on TAA. Add a state to denote
that TSX doesn't exist on the CPU.
x86: Add missed break to TAA status sysctl
Fix the TAA state machine to do the right thing when the TAA
migitation is available in microcode and the operator has set
the sysctl to automatic mode.
A future change to posixshm to add file sealing will move locking out of
shm_dotruncate as kern_shm_open() will require the lock to be held across
the dotruncate until the seal is actually applied. For this, the cookie is
passed into shm_dotruncate_locked which asserts RCA_WLOCKED.
r343777:
Fix zapping of static hints and env in init_static_kenv(). Environments
are terminated by 2 NULs, but only 1 NUL was zapped. Zapping only 1
NUL just splits the first string into an empty string and a corrupted
string. All other strings in static hints and env remained live early
in the boot when they were supposed to be disabled.
Support calling init_static_kenv() very early in the boot, so as to
use the env very early in the boot. Then the pointer to the loader
env may change after the first call due to enabling paging or otherwise
remapping the pointer. Another call is needed to register the change.
Don't use the previous pointer in this (or any) later call.
r352244:
kenv: assert that an empty static buffer passed in is "empty"
Garbage in the passed-in buffer can cause problems if any attempts to read
the kenv are inadvertently made between init_static_kenv and the first
kern_setenv -- assuming there is one.
This is cheap and easy, so do it. This also helps rule out some class of
bugs as one tries to debug; tunables fetch from the static environment up
until SI_SUB_KMEM + 1, and many of these buffers are global ~4k buffers that
rely on BSS clearing while others just grab a page of free memory and use it
(e.g. xen).
r352245:
Follow up r352244: kenv: tighten up assertions
As I like to forget: static kenv var formatting is actually such that an
empty environment would be double null bytes. We should make sure that a
non-zero buffer has at least enough for this, though most of the current
usage is with a 4k buffer.
Rick Macklem [Thu, 12 Dec 2019 22:00:10 +0000 (22:00 +0000)]
MFC: r354989
Fix the pNFS server's reporting of SpaceUsed (va_bytes).
The pNFS server currently reports SpaceUsed (va_bytes) for the metadata
file. This in not correct, since the metadata file is always empty and,
as such, va_bytes is just the allocation for the empty file.
This patch adds va_bytes to the list of attributes acquired from the
DS for a file, so that it includes the allocated data size and is updated
when the file is written.
For files created on a pNFS server before this patch is applied, the
va_bytes value is estimated by rounding va_size up to a multiple of
BLKDEV_IOSIZE. Once the file is written after this patch has been
applied to the metadata server, the va_bytes returned for the file
will be correct.
This patch only affects a pNFS metadata server.
Found during testing of the NFSv4.2 pNFS server for the Allocate operation.
(Not yet in head/current.)
r348803 by bz:
bcm2835_sdhci.c: save block registers to avoid controller bug
Extending what the initial revision, r273264, r276985, r277346 have
started for the transfer mode and command registers, another pair of
16bit registers written in sequence are block size and block count,
which fall together onto the same 32bit line and hence the same
register(s) would be written twice in sequence for those as well.
Use a similar approach to transfer mode and command and save the writes
to either of the block regiters and then only execute a write once.
We can do this as with transfer mode their values are meaningless until
a command is issued so we can use that write to command as a trigger
to also write out the block registers.
Compared to transfer mode and command the value of block count can
change, so we need to keep state and actually read the block registers
back the first time after a write.
r348804 by bz:
bcm2835_sdhci.c: exit DMA if not enough data left to avoid timeout errors
In the DMA case, given we disable the data interrupts, we never seem
to get DATA_END. Given we are relying on DMA interrupts we are not
using the SDHCI state machine and hence only call into
sdhci_platform_will_handle() for the first check of data.
We do not call "will handle" for any following round trips of the same
transaction if block size * count > BCM_DMA_BLOCK_SIZE.
Manually check "left" in the DMA interrupt handler to see if we have at
least another full BCM_DMA_BLOCK_SIZE to handle.
Without this change we would DMA that and then even start a DMA with
left == 0 which would lead to a timeout and error.
Now we re-enable data interrupts and return and let the SDHCI generic
interrupt handler and state machine pick the SPACE_AVAIL up and then
find that it should punt to the pio_handler for the remaining bytes
or finish the data transaction.
With this change block mode seems to work beyond 7 * 64byte blocks,
which worked as it was below BCM_DMA_BLOCK_SIZE.
r354488:
bcm_lintc: don't attach if "interrupt-controller" is missing
This is a standard required property for interrupt controllers, and present
on the bcm_lintc nodes for currently supported RPi models. For the RPi4, we
have both bcm_lintc as well as GIC-400, but only one may be active at a
time.
Don't probe bcm_lintc if it's missing the "interrupt-controller" property --
in RPi 4 DTS, the bcm_lintc node is actually missing this along with other
required interrupt properties. Presumably, if the earlier boot stages will
support switching to the legacy interrupt controller (as is suggested
possible by the documentation), the DTS will need to be updated to indicate
the proper interrupt-parent and hopefully also mark this node as an
interrupt-controller instead.
r354524:
bcm2835_dma: Mark IRQs shareable
On the RPi4, some of these IRQs are shared. Start moving toward a mode where
we accept that shared IRQs happen and simply ignore interrupts that are
seemingly for no reason.
I would like to be more verbose here, but my 30-minute assessment of the
current world order is that mapping a resource/rid to an actual IRQ number
(as found in FDT) data is not a simple matter. Determining if more than one
handler is attached to an IRQ is closer to feasible, but it's unclear which
way is the cleaner path. Beyond that, we're only really using it to be
slightly more verbose when something's going wrong, so for now just suppress
and drop a complaint-comment.
This was originally submitted (via freebsd-arm@) by Robert Crowston; the
additional verbosity was dropped by kevans@.
r354560:
bcm2835_sdhci: add some very basic support for rpi4
DMA is currently disabled while I work out why it's broken, but this is
enough for upstream U-Boot + rpi-firmware + our rpi3-psci-monitor to boot
with the right config.
The RPi 4 is still not in a good "supported" state, as we have no
USB/PCI-E/Ethernet drivers, but if air-gapped pies only able to operate over
cereal is your thing, here's your guy.
r354561:
bcm2835_sdhci: remove unused power_id field
This was once set, but I removed it by the time I committed it because both
configurations use the same POWER_ID. This can be separated back out if the
situation changes.
r354563:
bcm2835: commit missing constant from r354560
Surgically pulling the patch from my debugging work lead to this slopiness-
my apologies.
r354577:
arm64: add SOC_BRCM_BCM2838, build it in GENERIC
BCM2838/BCM2711 is the Raspberry Pi 4, which we will soon be able to boot
on once some ports bits are worked out.
r354579:
bcm2835_sdhci: don't panic in DMA interrupt if curcmd went away
This is an exceptional case; generally found during controller errors.
A panic when we attempt to acess slot->curcmd->data is less ideal than
warning, and other verbiage will be emitted to indicate the exact error.
r354823:
bcm2835_sdhci: push DATA_END handling out of DMA interrupt path
This simplifies the DMA interrupt handler quite a bit. The sdhci framework
will call platform_finish_transfer() if it's received SDHCI_INT_DATA_END, so
we can take care of any final cleanup there and simply not worry about the
possibility of it ending in the DMA interrupt path.
r354825:
bcm2835_sdhci: use a macro for interrupts we handle
This is just further simplification, very little functional change. In the
DMA interrupt handler, we *do* now acknowledge both DATA_AVAIL | SPACE_AVAIL
every time -- these operations are mutually exclusive, so while this is a
functional change, it's effectively a nop. Removing the 'mask' local allows
us to further simplify in a future change.
r354844:
bcm2835_sdhci: drop an assert in start_dma_seg
Trivial change to clarify locking expectations... no functional change.
r354845:
bcm2835_sdhci: some style cleanup, no functional change
This allows easy and care-free scaling of NUM_DMA_SEGS with proper-ish
calculations to make sure we can actually handle the number of segments we'd
like to handle on average so that performance comparisons can be easily made
at different values if/once we can actually handle it. It also makes it
helps the untrained reader understand more quickly the reasoning behind the
choice of maxsize/maxsegs/maxsegsize.
r354868:
bcm2835_sdhci: various refactoring of DMA path
This round of refactoring is mostly about streamlining the interrupt handler
to make it easier to verify and reason about operations taking place while
trying to bring FreeBSD up on the RPi4.
r354875:
bcm2835: push address mapping conversion for DMA/mailbox to runtime
We could maintain the static conversions for the !AArch64 Raspberry Pis, but
I'm not sure it's worth it -- we'll traverse the platform list exactly once
(of which there are only two for armv7), then every conversion there-after
traverses the memory map listing of which there are at-most two entries for
these boards: sdram and peripheral space.
Detecting this at runtime is necessary for the AArch64 SOC, though, because
of the distinct IO windows being otherwise not discernible just from support
compiled into the kernel. We currently select the correct window based on
/compatible in the FDT.
We also use a similar mechanism to describe the DMA restrictions- the RPi 4
can have up to 4GB of RAM while the DMA controller and mailbox mechanism can
technically, kind of, only access the lowest 1GB. See the comment in
bcm2835_vcbus.h for a fun description/clarification of this.
r354876:
bcm2835_vcbus: add compatibility name for ^/sys/contrib/vchiq
It's unclear how this didn't get caught in my last iteration, but the fix is
easy- the interface is still compatible, it was just gratuituously renamed
to match my arbitrary definition of consistency... VCBUS, the BCM2835 name,
represents an address on the VideoCore CPU Bus.
In a similar fashion, while it is a physical address, the ARMC portion
represents that these are addresses as seen by the ARM CPU.
To make things even more fun, the BCM2711 peripheral documentation describes
not virtual address space vs. physical address space, but instead the 32-bit
address map vs. the address map in "Low Peripheral" mode. The latter of
these is what the *ARMC* macros translate to/from.
r354930:
bcm2835_sdhci: clean up DMA segments in error handling path
Later parts assume that this would've been done if interrupts are enabled,
but this is the only case in which that wouldn't have been true. This commit
also reorders operations such that we're done touching slot/slot->intmask
before we call back into the SDHCI framework and exit.
r354931:
Revert r354930: wrong diff, right message.
r354932:
bcm2835_sdhci: roll back r354823
r354823 kicked DATA_END handling out of the DMA interrupt path "to make
things easy", but this was likely a mistake -- if we know we're done after
we've finished pending DMA operations, we should go ahead and acknowledge
it rather than waiting for the controller to finalize it. If it's not ready,
we'll simply re-enable interrupts and wait for it anyways, to be re-entered
in sdhci_data_intr.
r354933:
bcm2835_sdhci: clean up DMA segments in error handling path
Later parts assume that this would've been done if interrupts are enabled,
but this is the only case in which that wouldn't have been true. This commit
also reorders operations such that we're done touching slot/slot->intmask
before we call back into the SDHCI framework and exit.
r354956:
bcm2835_sdhci: only inspect interrupts we handle
We'll write the value we read back to ack pending interrupts, but we should
at least make it clear to ourselves that we only want to ack pending
transfer interrupts.
r355016:
bcm2835_vcbus: add the *other* rpi4 compat string
The DTS I used initially had brcm,bcm2838; the new one uses brcm,bcm2711.
Add that one as well.
r355025:
bcm2835_sdhci: "fix" DMA on the RPi 4
According to the documentation I have, DREQ pacing should be required here.
The DREQ# hasn't changed since the BCM2835. As soon as we attempt to setup
DREQ, DMA stalls and there's no clear reason why as of yet. Setting this
back to NONE seems to work just as well, though it's yet to be determined if
this is a sustainable model in high-throughput scenarios.
r355026:
bcm2835_dma: rip out the "use_dma" flag, make it non-optional
Now that it works for the Raspberry Pi 4, we can discontinue our workarounds
that were put in place to at least get a bootable kernel for other testing.
r355031:
bcm2835_sdhci: fix non-INVARIANTS build
sc is now only used to make sure we're not re-entering the data handling
path erroneously.
r355563:
RPI: Fix DMA/SDHCI on the BCM2836 (Raspberry Pi 2)
r354875 pushed VCBUS <-> ARMC translations to runtime determination, but
incorrectly mapped addresses for the BCM2836 -- SOC_BCM2835 and SOC_BCM2836
are actually mutually exclusive, so the BCM2836 config (GENERIC) would have
taken the latter path in the header and used 0x3f000000 as peripheral start.
Easily fixed -- split out the BCM2836 into its own memmap config and use
that instead if SOC_BCM2836 is included. With this, we get back to userland
again.
Kyle Evans [Thu, 12 Dec 2019 18:53:45 +0000 (18:53 +0000)]
MFC r355460: libbe: fix build against sysutils/openzfs, part 1
This is the half of the changes required that work as-is with both in-tree
ZFS and the new hotness, sysutils/openzfs. Highlights are less dependency
on header pollution (from somewhere) and using 'mnttab' instead of
'extmnttab'. In the in-tree ZFS, the latter is a #define for the former,
but in the port extmnttab is actually a distinct struct that's a super-set
of mnttab. We really want mnttab here anyways, so just use it.
r354247: lualoader: rewrite try_include using lfs + dofile
Actual modules get require()'d in, rather than try_include(). All instances
of try_include should be provided with proper hooks/API in the rest of
loader to do the work they need to do, since we can't rely on them to exist.
Convert this now to lfs + dofile since we won't really be treating them as
modules.
lfs is required because dofile will properly throw an error if the file
doesn't exist, which is not in the spirit of 'optionally included'.
Getting out of the pcall game allows us to provide a loader.exit() style
call that backs out to the common bits of loader (autoboot sequence unless
disabled with a loader.setenv("autoboot_delay", "NO")). The most ideal way
identified so far to implement loader.exit() is to throw a special
abort-style error that indicates to the caller in interp_lua that we've not
actually errored out, just continue execution. Otherwise, we have to hack in
logic to bubble up and return from loader.lua without continuing further,
which gets kind of ugly depending on the context in which we're aborting.
A compat shim is provided temporarily in case the executing loader doesn't
yet have loader.lua_path, which was just added in r354246.
r355349: lualoader: correct a typo from r354247
r354247 converted try_include to lfs + dofile with the loader.lua_path added
just before. Fortunately, there was a hardcoded /boot/lua fallback in case
loader.lua_path wasn't being set yet- I typo'd it as loader.lua_paths.
Dimitry Andric [Thu, 12 Dec 2019 18:16:32 +0000 (18:16 +0000)]
MFC r355568:
Correctly check for C++17 and higher when declaring timespec_get()
Summary:
In rS338751, the check to declare `timespec_get()` for C++17 and higher
was incorrectly done against a `cplusplus` define, while it should have
been `__cplusplus`.
Fix this by using `__cplusplus`, and also bump `__FreeBSD_version` so it
becomes possible to correctly check for `timespec_get()` in upstream
libc++ headers.
Alexander Motin [Thu, 12 Dec 2019 00:29:48 +0000 (00:29 +0000)]
MFC r355182: Fix use-after-free in case of L2ARC prefetch failure.
In case L2ARC read failed, l2arc_read_done() creates _different_ ZIO
to read data from the original storage device. Unfortunately pointer
to the failed ZIO remains in hdr->b_l1hdr.b_acb->acb_zio_head, and if
some other read try to bump the ZIO priority, it will crash.
The problem is reproducible by corrupting L2ARC content and reading
some data with prefetch if l2arc_noprefetch tunable is changed to 0.
With the default setting the issue is probably not reproducible now.
MFC r355322:
Use refcount from "in_joingroup_locked()" when joining multicast
groups. Do not acquire additional references. This makes the IPv4 IGMP
code in line with the IPv6 MLD code.
Background:
The IPv4 multicast code puts an extra reference on the in_multi struct
when joining groups. This becomes visible when using daemons like
igmpproxy from ports, that multicast entries do not disappear from the
output of ifmcstat(8) when multicast streams are disconnected.
This fixes a regression issue after r349762.
While at it factor the ip_mfilter_insert() and ip6_mfilter_insert() calls
to avoid repeated "is_new" check.
Just as disks can have nested partitions, the same happens with cd devices,
so we need to detect device paths and make sure we will not mix the handles.
To address this:
we fetch handle array and create linked list of block devices.
we walk the list and detect parent devices and set children pd_parent.
for {fd, cd, hd}, we walk device list and pick up our devices and store to
corresponding list. We make sure we store parent device first.
For sorting we use 3 steps: We check for floppy, we check for cd and then
everything else must be hd.
In general, it seems the floppy devices have no parent.
CD can have both parents and children (multiple boot entries, partitions
from the hybrid disk image).
Tested by: cross+freebsd@distal.com on Cisco UCS systems, C200 series (C220M5, C240M4, C220M3).
Also on MBP with UEFI 1.10
Matt Macy [Sun, 8 Dec 2019 04:19:05 +0000 (04:19 +0000)]
MFC r351770,r352920-r352923
MFCs to dd appear to have been haphazard so selectively
MFCing individually didn't work easily.
Add conv=fsync flag to dd
The fsync flag performs an fsync(2) on the output file before closing it.
This will be useful for the ZFS test suite.
Add conv=fdatasync flag to dd
The fdatasync flag performs an fdatasync(2) on the output file before closing it.
This will be useful for the ZFS test suite.
dd: Check result of close(2) for errors
close(2) can return errors from previous operations which should not be ignored.
Add oflag=fsync and oflag=sync capability to dd
Sets the O_FSYNC flag on the output file. oflag=fsync and oflag=sync are
synonyms just as O_FSYNC and O_SYNC are synonyms. This functionality is
intended to improve portability of dd commands in the ZFS test suite.
Add iflag=fullblock to dd
Normally, count=n means read(2) will be called n times on the input to dd. If
the read() returns short, as may happen when reading from a pipe, fewer bytes
will be copied from the input. With conv=sync the buffer is padded with zeros
to fill the rest of the block.
iflag=fullblock causes dd to continue reading until the block is full, so that
count=n means n full blocks are copied. This flag is compatible with illumos
and GNU dd and is used in the ZFS test suite.
Submitted by: Ryan Moeller, Thomas Hurst
Reviewed by: manpages, mmacy@
Sponsored by: iXsystems, Inc.
Ian Lepore [Sat, 7 Dec 2019 17:44:21 +0000 (17:44 +0000)]
MFC r354973:
Rewrite iicdev_writeto() to use a single buffer and a single iic_msg, rather
than effectively doing scatter/gather IO with a pair of iic_msgs that direct
the controller to do a single transfer with no bus STOP/START between the
two buffers. It turns out we have multiple i2c hardware drivers that don't
honor the NOSTOP and NOSTART flags; sometimes they just try to do the
transfers anyway, creating confusing failures or leading to corrupted data.
Ian Lepore [Sat, 7 Dec 2019 17:34:04 +0000 (17:34 +0000)]
MFC r352938:
Add 8 and 16 bit versions of atomic_cmpset and atomic_fcmpset for arm.
This adds 8 and 16 bit versions of the cmpset and fcmpset functions. Macros
are used to generate all the flavors from the same set of instructions; the
macro expansion handles the couple minor differences between each size
variation (generating ldrexb/ldrexh/ldrex for 8/16/32, etc).
In addition to handling new sizes, the instruction sequences used for cmpset
and fcmpset are rewritten to be a bit shorter/faster, and the new sequence
will not return false when *dst==*old but the store-exclusive fails because
of concurrent writers. Instead, it just loops like ldrex/strex sequences
normally do until it gets a non-conflicted store. The manpage allows LL/SC
architectures to bogusly return false, but there's no reason to actually do
so, at least on arm.
Ian Lepore [Sat, 7 Dec 2019 17:24:07 +0000 (17:24 +0000)]
MFC r352338:
Create a mechanism for encoding a system errno into the IIC_Exxxxx space.
Errors are communicated between the i2c controller layer and upper layers
(iicbus and slave device drivers) using a set of IIC_Exxxxxx constants which
effectively define a private number space separate from (and having values
that conflict with) the system errno number space. Sometimes it is necessary
to report a plain old system error (especially EINTR) from the controller or
bus layer and have that value make it back across the syscall interface
intact.
I initially considered replicating a few "crucial" errno values with similar
names and new numbers, e.g., IIC_EINTR, IIC_ERESTART, etc. It seemed like
that had the potential to grow over time until many of the errno names were
duplicated into the IIC_Exxxxx space.
So instead, this defines a mechanism to "encode" an errno into the IIC_Exxxx
space by setting the high bit and putting the errno into the lower-order
bits; a new errno2iic() function does this. The existing iic2errno()
recognizes the encoded values and extracts the original errno out of the
encoded value. An interesting wrinkle occurs with the pseudo-error values
such as ERESTART -- they aleady have the high bit set, and turning it off
would be the wrong thing to do. Instead, iic2errno() recognizes that lots of
high bits are on (i.e., it's a negative number near to zero) and just
returns that value as-is.
Thus, existing drivers continue to work without needing any changes, and
there is now a way to return errno values from the lower layers. The first
use of that is in iicbus_poll() which does mtx_sleep() with the PCATCH flag,
and needs to return the errno from that up the call chain.
Ian Lepore [Sat, 7 Dec 2019 17:17:34 +0000 (17:17 +0000)]
MFC r352196, r352333, r352342, r353653
r352196:
In am335x_dmtpps, use a spin mutex to interlock between PPS capture and PPS
ioctl(2) handling. This allows doing the pps_event() work in the polling
routine, instead of using a taskqueue task to do that work.
Also, add PNPINFO, and switch to using make_dev_s() to create the cdev.
Using a spin mutex and calling pps_event() from the polling function works
around the situation which requires more than 2 sets of timecounter
timehands in a single-core system to get reliable PPS capture. That problem
would happen when a single-core system is idle in cpu_idle() then gets woken
up with an event timer event which was scheduled to handle a hardclock tick.
That processing path would end up calling tc_windup 3 or 4 times between
when the tc polling function was called and when the taskqueue task would
eventually run, and with only two sets of timehands, the th_generation count
would always be too old to allow the captured PPS data to be used.
r352333:
Include <lock.h>, required to use spinlocks in this code.
r352342:
Make the ti_sysc device quiet. It's an internal utility pseudo-device
that makes the upstream FDT data work right, so we don't need to see a
couple dozen instances of it spam the dmesg at boot time unless it's a
verbose boot.
r353653:
Update some comments; no functional changes. Some historical old comments
in this driver indicate that the SD_CAPA register is write-once and after
being set one time the values in it cannot be changed. That turns out not
to be the case -- the values written to it survive a reset, but they can
be rewritten/changed at any time.
r355214:
Ignore "gpio-hog" nodes when instantiating ofw_gpiobus children. Also,
in ofw_gpiobus_probe() return BUS_PROBE_DEFAULT rather than 0; we are not
the only possible driver to handle this device, we're just slightly better
than the base gpiobus (which probes at BUS_PROBE_GENERIC).
In the time since this code was first written, the gpio controller bindings
aquired the concept of a "hog" node which could be used to preset one or
more gpio pins as input or output at a specified level. This change doesn't
fully implement the hogging concept, it just filters out hog nodes when
instantiating child devices by scanning for child nodes in the fdt data.
The whole concept of having child nodes under the controller node is not
supported by the standard bindings, and appears to be a freebsd extension,
probably left over from the days when we had no support for cross-tree
phandle references in the fdt data.
r355239:
Add an OFWBUS_PNP_INFO() macro for devices that hang directly off the root
ofwbus. Also, apply some style(9) whitespace fixing to the
SIMPLEBUS_PNP_INFO() macro (no functional change).
r355274:
Move most of the gpio_pin_* functions from ofw_gpiobus.c to gpiobus.c so
that they can be used by drivers on non-FDT-configured systems. Only the
functions related to acquiring pins by parsing FDT data remain in
ofw_gpiobus. Also, add two new functions for acquiring gpio pins based on
child device_t and index, or on the bus device_t and pin number. And
finally, defer reserving pins for gpiobus children until they acquire the
pin, rather than reserving them as soon as the child is added (before it's
even known whether the child will attach).
This will allow drivers configured with hints (or any other mechanism) to
use the same code as drivers configured via FDT data. Until now, a hinted
driver and an FDT driver had to be two completely different sets of code,
because hinted drivers could only use gpiobus calls to manipulate pins,
while fdt-configured drivers could not use that API (due to not always being
children of the bus that owns the pins) and had to use the newer
gpio_pin_xxxx() functions. Now drivers can be written in the more
traditional form, where most of the code is shared and only the resource
acquisition code at attachment time changes.
r355276:
Rewrite gpioiic(4) to use the gpio_pin_* API, and to conform to the modern
FDT bindings document for gpio-i2c devices.
Using the gpio_pin_* functions to acquire/release/manipulate gpio pins
removes the constraint that both gpio pins must belong to the same gpio
controller/bank, and that the gpioiic instance must be a child of gpiobus.
Removing those constraints allows the driver to be fully compatible with
the modern dts bindings for a gpio bitbanged i2c bus.
For hinted attachment, the two gpio pins still must be on the same gpiobus,
and the device instance must be a child of that bus. This preserves
compatibility for existing installations that have use gpioiic(4) with hints.
r355277:
Fix leading whitespace (spaces->tabs) in comments; no functional change.
r355295:
Remove "all rights reserved" from copyright after getting a response from
Luiz that he also was not intentionally asserting that right, it was already
there when he added his name.
r355298:
Do not initialize the flags field in struct gpiobus_pin from the flags in
struct gpio_pin. It turns out these two sets of flags are completely
unrelated to each other.
Also, update the comment for GPIO_ACTIVE_LOW to reflect the fact that it
does get set, somewhat unobviously, by code that parses FDT data. The bits
from the FDT cell containing flags are just copied to gpiobus_pin.flags, so
there's never any obvious reference to the symbol GPIO_ACTIVE_LOW being
stored into the flags field.
Ed Maste [Sat, 7 Dec 2019 03:56:36 +0000 (03:56 +0000)]
MFC r238022 (cem): dhclient: fix braino in previous bugfix r300174
The previous revision missed the exact same error in a copy paste block
of the same code in another function. Fix the identical case, too.
A DHCP client identifier is simply the hardware type (one byte)
concatenated with the hardware address (some variable number of bytes,
but at most 16). Limit the size of the temporary buffer to match and
the rest of the calculations shake out correctly.
PR: 238022
Reported by: Young <yangx92 AT hotmail.com>
Submitted by: Young <yangx92 AT hotmail.com>
Summary:
Various paths through hypot(x, y) will multiply x and y by a power of
two, perform the calculation in a range where IEEE-754 provides greater
precision, then undo the multiplication to determine the true result.
Undoing that multiplication is implemented as t1*w, where t1=2**k.
2**k is often computed by taking the high word of 1.0, then adding k<<20
(for doubles or long doubles) or k<<23 (for floats) to it, then
overwriting that high word. But when k is negative this left-shifts a
negative value -- and that's undefined behavior in many editions of C
and C++.
This patch should fix all hypot implementations to compute 2**k without
triggering this particular bit of undefined behavior.
Test Plan: I've only very lightly tested out the hypot(double, double)
change, in SpiderMonkey's JavaScript engine, for consistency with prior
behavior. The other functions' changes have more or less only been
eyeballed. Careful examination appreciated! Do note, however, that an
error in any of these changes would most likely produce a value that is
incorrect by a factor of two, so any mistake would most likely be
glaring if invoked.
Alexander Motin [Wed, 4 Dec 2019 15:14:14 +0000 (15:14 +0000)]
MFC r355165: Make DMAR allow Intel NTB device to access its own BAR0.
I have no good explanation why it happens, but I found that in B2B mode
at least Xeon v4 NTB leaks accesses to its configuration memory at BAR0
originated from the link side to its host side. DMAR predictably blocks
those, making access to remote scratchpad registers in B2B mode impossible.
This change creates identity mapping in DMAR covering the BAR0 addresses,
making the NTB work fine with DMAR enabled. It seems like allowing single
4KB range at 32KB offset may be enough, but I don't see a reason to be so
specific.
r355065:
Linux epoll: Don't deregister file descriptor after EPOLLONESHOT is fired
Linux epoll does not remove descriptor after one-shot event has been triggered.
Set EV_DISPATCH kqueue flag rather then EV_ONESHOT to get the same behavior.
Required by Linux Steam client.
PR: 240590
Reported by: Alex S <iwtcex@gmail.com>
Reviewed by: emaste, imp
Differential Revision: https://reviews.freebsd.org/D22513
r355066:
Linux epoll: Check both read and write kqueue events existence in EPOLL_CTL_ADD
Linux epoll EPOLL_CTL_ADD op handler should always check registration
of both EVFILT_READ and EVFILT_WRITE kevents to deceide if supplied
file descriptor fd is already registered with epoll instance.
r355067:
Linux epoll: Register events with zero event mask
Such an events are legal and should be interpreted as EPOLLERR | EPOLLHUP.
Register a disabled kqueue event in that case as we do not support EPOLLHUP yet.
Required by Linux Steam client.
PR: 240590
Reported by: Alex S <iwtcex@gmail.com>
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D22516
r355068:
Linux epoll: Allow passing of any negative timeout value to epoll_wait
Linux epoll allow passing of any negative timeout value to epoll_wait()
to cause unbound blocking
Eric van Gyzen [Tue, 3 Dec 2019 22:57:10 +0000 (22:57 +0000)]
MFC r354624
tip/cu: check for EOF on input on the local side
If cu reads an EOF on the input side, it goes into a tight loop
sending a garbage byte to the remote. With this change, it exits
gracefully, along with its child.
Ravi Pokala [Tue, 3 Dec 2019 22:51:25 +0000 (22:51 +0000)]
MFC r354102:
Args for buf_track() might be unused
If neither FULL_BUF_TRACKING nor BUF_TRACKING are defined, then the body of
buf_track() becomes empty. Mark the arguments with "__unused" so the
compiler doesn't complain about unused arguments in that case.
Ravi Pokala [Tue, 3 Dec 2019 22:49:24 +0000 (22:49 +0000)]
MFC r343583:
Remove unecessary "All rights reserved" from files under my or Panasas's
copyright.
When all member nations of the Buenos Aires Convention adopted the Berne
Convention, the phrase "All rights reserved" became unnecessary to assert
copyright. Remove it from files under my or Panasas's copyright. The files
related to jedec_dimm(4) also bear avg@'s copyright; he has approved this
change.
Kyle Evans [Tue, 3 Dec 2019 19:00:12 +0000 (19:00 +0000)]
MFC r354712: arm64: fix BUS_DMA_ALLOCNOW for non-paged aligned sizes
For any size that isn't page-aligned, we end up not pre-allocating enough
for a single mapping because we truncate the size instead of rounding up to
make sure the last bit is accounted for, leaving us one page shy of what we
need to fulfill a request.
Kyle Evans [Tue, 3 Dec 2019 18:58:45 +0000 (18:58 +0000)]
MFC r354541: csu: Fix dynamiclib/init_test:jcr_test on !HAVE_CTORS archs
.jcr still needs a 0-entry added in crtend, even on !HAVE_CTORS archs, as
we're still getting .jcr sections added -- presumably due to the reference
in crtbegin. Without this terminal, the .jcr section (without data) overlaps
with the next section and register_classes in crtbegin will be examining the
wrong item.
These files already have 'device' lines that they require; adding a
dependency on SOC_* options is an extra restriction that adds extra
verbosity when future supported Broadcom-based SOC will also feature the
same compatible device.
Users wishing to not compile these devices in should remove the 'device'
lines from their config.
Summary:
- basic: test application of patches created by diff -u at the
beginning/middle/end of file, which have differing amounts of context
before and after chunks being added
- limited_ctx: stems from PR 74127 in which a rogue line was getting added
when the patch should have been rejected. Similar behavior was
reproducible with larger contexts near the beginning/end of a file. See
r326084 for details
- file_creation: patch sourced from /dev/null should create the file
- file_nodupe: said patch sourced from /dev/null shouldn't dupe the contents
when re-applied (personal vendetta, WIP, see comment)
- file_removal: this follows from nodupe; the reverse of a patch sourced
from /dev/null is most naturally deleting the file, as is expected based
on GNU patch behavior (WIP)
r351866: patch(1): fix the file removal test, strengthen it a bit
To remain compatible with GNU patch, we should ensure that once we're
removing empty files after a reversed /dev/null patch we don't remove files
that have been modified. GNU patch leaves these intact and just reverses the
hunk that created the file, effectively implying --remove-empty-files for
reversed /dev/null patches.
r354328: patch(1): give /dev/null patches special treatment
We have a bad habit of duplicating contents of files that are sourced from
/dev/null and applied more than once... take the more sane (in most ways)
GNU route and complain if the file exists and offer reversal options.
This still falls short a little bit as selecting "don't reverse, apply
anyway" will still give you duplicated file contents. There's probably other
issues as well, but awareness is the first step to happiness.
Kyle Evans [Tue, 3 Dec 2019 18:50:18 +0000 (18:50 +0000)]
MFC r354246: liblua: add loader.lua_path
As described previously, loader.lua_path is absolute path where scripts are
installed. A future commit will use this to build paths for dofile in
try_include, rather than the current pcall/require setup that makes it more
difficult to coordinate loader aborts from local.lua -- we do not need the
flexibility of require(), and local.lua is in-fact not a 'module-like' file
as we will not be referencing anything from it.
Kyle Evans [Tue, 3 Dec 2019 18:38:51 +0000 (18:38 +0000)]
MFC r354236: mdmfs(8): add -k skel option to populate fs from a skeleton
mdmfs(8) lacks the ability to populate throwaway memory filesystems from an
existing directory.
This features permits an interesting setup where /var for instance lives on
a device where wear-leveling is something you want to avoid as much as
possible and nonetheless you don't want to lose your logs, ports metadata,
etc. Here are the steps:
1. Copy /var to /var.bak;
2. Mount an mfs into /var using -k /var.bak at startup;
3. Synchronize /var to /var.bak weekly and on shutdown.
Note that this more or less mimics OpenBSD's mount_mfs(8) -P flag.
Kyle Evans [Tue, 3 Dec 2019 18:28:39 +0000 (18:28 +0000)]
MFC rarm: correct kernelstack allocation size
This appears to be a copy-pasto from previous lines that propagated to v6
over the years. Indeed, nothing references kernelstack beyond
USPACE_SVC_STACK_TOP and it would be odd if anything did.
Kyle Evans [Tue, 3 Dec 2019 18:25:16 +0000 (18:25 +0000)]
MFC r354245, r354833, r354837: add flua to the base system
r354245: stand: consolidate knowledge of lua path
Multiple places coordinate to 'know' where lua scripts are installed. Knock
this down to being formally defined (and overridable) in exactly one spot,
defs.mk, and spread the knowledge to loaders and liblua alike. A future
commit will expose this to lua as loader.lua_path, so it can build absolute
paths to lua scripts as needed.
r354833: Add flua to the base system, install to /usr/libexec
FreeBSDlua ("flua") is a FreeBSD-private lua, flavored with whatever
extensions we need for base system operations. We currently support a subset
of lfs and lposix that are used in the rewrite of makesyscall.sh into lua,
added in r354786.
flua is intentionally written such that one can install standard lua and
some set of lua modules from ports and achieve the same effect.
linit_flua is a copy of linit.c from contrib/lua with lfs and lposix added
in. This is similar to what we do in stand/. linit.c has been renamed to
make it clear that this has flua-specific bits.
luaconf has been slightly obfuscated to make extensions more difficult. Part
of the problem is that flua is already hard enough to use as a bootstrap
tool because it's not in PATH- attempting to do extension loading would
require a special bootstrap version of flua with paths changed to protect
the innocent.
src.lua.mk has been added to make it easy for in-tree stuff to find flua,
whether it's bootstrap-flua or relying on PATH frobbing by Makefile.inc1.
r354837: flua: newer GCC complains about format-nonliteral at WARNS=2
Alexander Motin [Tue, 3 Dec 2019 16:51:26 +0000 (16:51 +0000)]
MFC r355023: Do not retry long ready waits if previous gave nothing.
I have some disks reporting "Logical unit is in process of becoming ready"
for about half an hour before finally reporting failure. During that time
CAM waits for the readiness during ~2 minutes for each request, that makes
system boot take very long time.
This change reduces wait times for the following requests to ~1 second if
previously long wait for that device has timed out.
Alexander Motin [Tue, 3 Dec 2019 16:48:21 +0000 (16:48 +0000)]
MFC r355010: Make CAM use root_mount_hold_token() to delay boot.
Before this change CAM used config_intrhook_establish() for this purpose,
but that approach does not allow to delay it again after releasing once.
USB stack uses root_mount_hold() to delay boot until bus scan is complete.
But once it is, CAM had no time to scan SCSI bus, registered by umass(4),
if it already done other scans and called config_intrhook_disestablish().
The new approach makes it work smooth, assuming the USB device is found
during the initial bus scan. Devices appearing on USB bus later may still
require setting kern.cam.boot_delay, but hopefully those are minority.
Alexander Motin [Tue, 3 Dec 2019 16:42:32 +0000 (16:42 +0000)]
MFC r341756 (by scottl):
Don't allocate the config_intrhook separately from the softc, it's small
enough that it costs more code to handle the malloc/free than it saves.
linprocfs: Make sure to report -1 as tty when we have no controlling tty.
When reporting a process' stats, we can't just provide the tty as an
unsigned long, as if we have no controlling tty, the tty would be NODEV, or
-1. Instaed, just special-case NODEV.
Andriy Gapon [Tue, 3 Dec 2019 07:22:16 +0000 (07:22 +0000)]
MFC r354349: if_ixv: disable RSS configuration on 82599 and X540 VFs
It is reported that those VFs share their RSS configuration with PF and,
thus, they cannot be configured independently.
Also:
- add missing opt_rss.h to if_ixv.c, otherwise RSS kernel option could
not be seen
- do not enable IXGBE_FEATURE_RSS on the older VFs
- set flowid / hash type to M_HASHTYPE_NONE or M_HASHTYPE_OPAQUE_HASH
(based on what the hardware reports) if IXGBE_FEATURE_RSS is not set