I accidentally swapped 'linux_fixup_elf' to 'linux_elf_fixup' in amd64's
declaration (only), while bringing this change over from git and
encountering a conflict.
linux*_sysvec.c: rationalize whitespace and comments
There's a fair amount of duplication between MD linuxulator files.
Make indentation and comments consistent between the three versions of
linux_sysvec.c to reduce diffs when comparing them.
MFC r331056:
Share a single bsd-linux errno table across MD consumers
Three copies of the linuxulator linux_sysvec.c contained identical
BSD to Linux errno translation tables, and future work to support other
architectures will also use the same table. Move the table to a common
file to be used by all. Make it 'const int' to place it in .rodata.
(Some existing Linux architectures use MD errno values, but x86 and Arm
share the generic set.)
This change should introduce no functional change; a followup will add
missing errno values.
A version of each of the MD files by necessity exists for each CPU
architecture supported by the Linuxolator. Clean these up so that new
architectures do not inherit whitespace issues.
r317849 (partial, required by r332506):
cxgbe/t4_tom: Per-connection rate limiting for TCP sockets handled by
the TOE.
Sponsored by: Chelsio Communications
r332506:
cxgbe(4): Add support for Connection Offload Policy (aka COP).
COP allows fine-grained control on whether to offload a TCP connection
using t4_tom, and what settings to apply to a connection selected for
offload. t4_tom must still be loaded and IFCAP_TOE must still be
enabled for full TCP offload to take place on an interface. The
difference is that IFCAP_TOE used to be the only knob and would enable
TOE for all new connections on the inteface, but now the driver will
also consult the COP, if any, before offloading to the hardware TOE.
A policy is a plain text file with any number of rules, one per line.
Each rule has a "match" part consisting of a socket-type (L = listen,
A = active open, P = passive open, D = don't care) and a pcap-filter(7)
expression, and a "settings" part that specifies whether to offload the
connection or not and the parameters to use if so. The general format
of a rule is: [socket-type] expr => settings
Example. See cxgbetool(8) for more information.
[L] ip && port http => offload
[L] port 443 => !offload
[L] port ssh => offload
[P] src net 192.168/16 && dst port ssh => offload !nagle !timestamp cong newreno
[P] dst port ssh => offload !nagle ecn cong tahoe
[P] dst port http => offload
[A] dst port 443 => offload tls
[A] dst net 192.168/16 => offload !timestamp cong highspeed
The driver processes the rules for each new listen, active open, or
passive open and stops at the first match. There is an implicit rule at
the end of every policy that prohibits offload when no rule in the
policy matches:
[D] all => !offload
This is a reworked and expanded version of a patch submitted by
Krishnamraju Eraparaju @ Chelsio.
Sponsored by: Chelsio Communications
r332787:
cxgbe(4): Fix bugs in the handling of COP rules that match on VLAN tag.
Retrieve the tag from the correct ifnet and use the provided tag
(instead of hardcoded 0xffff, implying no tag) in the routines that
process offload policy.
r346545:
libbe(3): allow creation of arbitrary depth boot environments
libbe currently only provides an API to create a recursive boot environment,
without any formal support for intentionally limiting the depth. This
changeset adds an API, be_create_depth, that may be used to arbitrarily
restrict the depth of the new BE.
r346546:
libbe(3): Add a test for be creation
r346680:
libbe(3): Copy received properties as well
This was inherently broken on send|recv datasets.
r346700:
libbe(3): Fix mis-application of patch (SHLIBDIR)
Rob's patch in D18564 cemented the SHLIBDIR because bsd.own.mk (included by
src.opts.mk) sets it to /usr/lib. r346546 did somehow not apply this part of
the patch, leaving it to get installed to the wrong place and subsequently
removed via ObsoleteFiles.
r346705:
libbe(3): Fix libcompat build
SHLIBDIR should still be optionally set, just before src.opts.mk is included
so that libcompat can properly override it. This fixes lib32 failures
reported by both Jenkins and Michael Butler.
if_bridge and if_vxlan conversion to this deterministic MAC address KPI has
been MFC as well. This is potentially error prone as the generated address
range for these has decreased, but I've deemed this acceptable for stable
branches due to collisions for thees interfaces being easily remedied.
I have no intention of switching anything else to this KPI in any stable
branches.
r345139:
ether: centralize fake hwaddr generation
We currently have two places with identical fake hwaddr generation --
if_vxlan and if_bridge. Lift it into if_ethersubr for reuse in other
interfaces that may also need a fake addr.
r345151:
ether_fakeaddr: Use 'b' 's' 'd' for the prefix
This has the advantage of being obvious to sniff out the designated prefix
by eye and it has all the right bits set. Comment stolen from ffec.
I've removed bryanv@'s pending question of using the FreeBSD OUI range --
no one has followed up on this with a definitive action, and there's no
particular reason to shoot for it and the administrative overhead that comes
with deciding exactly how to use it.
r346324:
net: adjust randomized address bits
Give devices that need a MAC a 16-bit allocation out of the FreeBSD
Foundation OUI range. Change the name ether_fakeaddr to ether_gen_addr now
that we're dealing real MAC addresses with a real OUI rather than random
locally-administered addresses.
r346328:
Compile sha1.c when ether support is included
sha1 is used by ether_gen_addr after r346324. Perhaps in an ideal world we
could detect that the kernel's been compiled without sha1_* bits included
and silently fallback to arc4random instead because these platforms/kernel
configs are far and few between. It's fairly lightweight, though, so just
include it for now.
MFC: r346191
Add support for INET6 addresses to the kernel code that dumps open/lock state.
PR#223036 reported that INET6 callback addresses were not printed by
nfsdumpstate(8). This kernel patch adds INET6 addresses to the dump structure,
so that nfsdumpstate(8) can print them out, post-r346190.
MFC r333322: Keep CARP state as INIT when net.inet.carp.allow=0.
Currently when net.inet.carp.allow=0 CARP state remains as MASTER, which is
not very useful (if there are other masters -- it can lead to split brain,
if there are none -- it makes no sense). Having it as INIT makes it clear
that carp packets are disabled.
* Handle SIGPIPE in gssd
We've got some cases where the other end of gssd's AF_LOCAL socket gets
closed, resulting in an error (and SIGPIPE) when it tries to do I/O to it.
Closing without cleaning up means the next time nfsd starts up, it hangs,
unkillably; this allows gssd to handle that particular error.
* Limit the retry cound in gssd_syscall to 5.
The default is INT_MAX, which effectively means forever. And it's an
uninterruptable RPC call, so it will never stop.
MFC r345656: Do not map small IOCTL buffers to KVA, but copy.
CAM IOCTL interfaces traditionally mapped user-space data buffers to KVA.
It was nice originally, but now it takes too much to handle respective
TLB shootdowns, while small kernel memory allocations up to 64KB backed
by UMA and accompanied by copyin()/copyout() can be much cheaper.
For large buffers mapping still may have sense, and unmapped I/O would
be even better, but the last unfortunately is more tricky, since unmapped
I/O API is too specific to struct bio now.
Update carp to set DSCP value CS7(Network Traffic) in the flowlabel field of
packets by default. Currently carp only sets TOS_LOWDELAY in IPv4 which was
deprecated in 1998. This also implements sysctl that can revert carp back to
it's old behavior if desired.
This will allow implementation of QOS on modern network devices to make sure
carp packets aren't dropped during interface contention.
Submitted by: Nick Wolff <darkfiberiru AT gmail.com>
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D14536
MFC r345438,r345842,r346259,r346261: TPM as possible entropy source
r345438:
Allow using TPM as entropy source
TPM has a built-in RNG, with its own entropy source.
The driver was extended to harvest 16 random bytes from TPM every 10 seconds.
A new build option "TPM_HARVEST" was introduced - for now, however, it
is not enabled by default in the GENERIC config.
Reviewed by: markm, delphij
Approved by: secteam
r345842:
Add a cv_wait to the TPM2.0 harvesting function
MFC r339826 (by yuripv):
Provide basic descriptions for VMX exit reason (from "Intel 64 and IA-32
Architectures Software Developer’s Manual Volume 3"). Add the document
to SEE ALSO in bhyve.8 (and pet manlint here a bit).
r344569:
Implement parallel mounting for ZFS filesystem
It was first implemented on Illumos and then ported to ZoL.
This patch is a port to FreeBSD of the ZoL version.
This patch also includes a fix for a race condition that was amended
With such patch Delphix has seen a huge decrease in latency of the mount phase
(https://github.com/openzfs/openzfs/commit/a3f0e2b569 for details).
With that current change Gandi has measured improvments that are on par with
those reported by Delphix.
Import a fix from illumos (thanks Toomas Soomas for pointing at it)
See https://www.illumos.org/issues/10205 for more details
Illumos commit: https://github.com/illumos/illumos-gate/commit/247b7da039fd88350c50e3d7fef15bdab6bef215
MFC r345200: MFV r336930: 9284 arc_reclaim_thread has 2 jobs
`arc_reclaim_thread()` calls `arc_adjust()` after calling
`arc_kmem_reap_now()`; `arc_adjust()` signals `arc_get_data_buf()` to
indicate that we may no longer be `arc_is_overflowing()`.
The problem is, `arc_kmem_reap_now()` can take several seconds to
complete, has no impact on `arc_is_overflowing()`, but due to how the
code is structured, can impact how long the ARC will remain in the
`arc_is_overflowing()` state.
The fix is to use seperate threads to:
1. keep `arc_size` under `arc_c`, by calling `arc_adjust()`, which
improves `arc_is_overflowing()`
2. keep enough free memory in the system, by calling
`arc_kmem_reap_now()` plus `arc_shrink()`, which improves
`arc_available_memory()`.
MFC r340311: Do not ignore arc_adjust() return value.
This covers scenario when ARC may not shrink as fast as it could:
1. arc_size < arc_c and arc_adjust() does not evict anything, returning
zero to arc_reclaim_thread();
2. arc_available_memory() reports memory pressure, which can not be
satisfied by arc_kmem_reap_now();
3. arc_shrink() reduces arc_c and calls arc_adjust(), return of which is
ignored;
4. even if the last arc_adjust() could not satisfy arc_size < arc_c,
arc_reclaim_thread() will still go to sleep, since the first one
returned zero.
This fixes an assert in vdev_queue_change_io_priority():
VERIFY3(zio->io_priority < ZIO_PRIORITY_NUM_QUEUEABLE) failed (7 < 6)
PANIC at vdev_queue.c:832:vdev_queue_change_io_priority()
Reviewed-by: Tom Caputi <tcaputi@datto.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Use cached feature info in spa_add_feature_stats()
Avoid issuing I/O to the pool when retrieving feature flags information.
Trying to read the ZAPs from disk means that zpool clear would hang if
the pool is suspended and recovery would require a reboot. To keep the
feature stats resident in memory, we hang a cached nvlist off of the
spa. It is built up from disk the first time spa_add_feature_stats() is
called, and refreshed thereafter using the cached feature reference
counts. spa_add_feature_stats() gets called at pool import time so we
can be sure the cached nvlist will be available if the pool is later
suspended.
Signed-off-by: Ned Bass <bass6@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3082
MFC r346356:
Implement flag for telling cuse(3) clients if the peer is running in 32-bit
compat mode or not. This is useful when implementing compatibility ioctl(2)
handlers in userspace.
[ndis] Fix unregistered use of FPU by NDIS in kernel on amd64
amd64 miniport drivers are allowed to use FPU which triggers "Unregistered use
of FPU in kernel" panic.
Wrap all variants of MSCALL with fpu_kern_enter/fpu_kern_leave. To reduce
amount of allocations/deallocations done via
fpu_kern_alloc_ctx/fpu_kern_free_ctx maintain cache of fpu_kern_ctx elements.
If during DIOCRSETTFLAGS pfrio_buffer is NULL copyin() will fault, which we're
not allowed to do with a lock held.
We must count the number of entries in the table and release the lock during
copyin(). Only then can we re-acquire the lock. Note that this is safe, because
pfr_set_tflags() will check if the table and entries exist.
This was discovered by a local syzcaller instance.
myjob will get loaded at the next second==0, but this echo job will not
run until cron restarts. These jobs are normally handled in
run_reboot_jobs(), which sets e->lastexit of INTERVAL jobs to the startup
time so they run 'n' seconds later.
Fix this by special-casing TargetTime > 0 in the database load. Preexisting
jobs will be handled at startup during run_reboot_jobs as normal, but if
we've reloaded a database during runtime we'll hit this case and set
e->lastexit to the current time when we process it. They will then run every
'n' seconds from that point, and a full restart of cron is no longer
required to make these jobs work.
ian [Mon, 22 Apr 2019 15:26:21 +0000 (15:26 +0000)]
MFC r337364:
Document 64-bit arm in terms of arch name (aarch64) not machine (arm64).
Other architectures are documented in terms of the name that is displayed by
'uname -p', aka MACHINE_ARCH and TARGET_ARCH in the build system, now
aarch64 matches the rest of them.
ian [Mon, 22 Apr 2019 15:23:06 +0000 (15:23 +0000)]
MFC r346312:
Only set up the interrupts that will actually be used in arm generic_timer.
The code previously set up interrupt handlers for all the interrupt
resources available, including for timers that are not in use. That could
lead to interrupt storms. For example, if boot firmware enabled the virtual
timer but the kernel is using the physical timer, it could get flooded with
interrupts on the virtual timer which it cannot shut off. By only setting
up an interrupt handler for the hardware that will actually be used, any
interrupts from other timer units will remain masked in the interrupt
controller.
ian [Mon, 22 Apr 2019 15:20:46 +0000 (15:20 +0000)]
MFC r345475-r345476
r345475:
Truncate a too-long interrupt handler name when there is only one handler.
There are only 19 bytes available for the name of an interrupt plus the
name(s) of handlers/drivers using it. There is a mechanism from the days of
shared interrupts that replaces some of the handler names with '+' when they
don't all fit into 19 bytes.
In modern times there is typically only one device on an interrupt, but long
device names are the norm, especially with embedded systems. Also, in systems
with multiple interrupt controllers, the names of the interrupts themselves
can be long. For example, 'gic0,s54: imx6_anatop0' doesn't fit, and
replacing the device driver name with a '+' provides no useful info at all.
When there is only one handler but its name was too long to fit, this
change truncates enough leading chars of the handler name (replacing them
with a '-' char to indicate that some chars are missing) to use all 19
bytes, preserving the unit number typically on the end of the name. Using
the prior example, this results in: 'gic0,s54:-6_anatop0' which provides
plenty of info to figure out which device is involved.
PR: 211946
Reviewed by: gonzo@ (prior version without the '-' char)
Differential Revision: https://reviews.freebsd.org/D19675
r345476:
Revert accidental change that should not have been included in r345475.
I had changed this value as part of a local experiment, and neglected to
change it back before committing the other changes.
ian [Mon, 22 Apr 2019 15:09:47 +0000 (15:09 +0000)]
MFC r345480, r346013
r345480:
Support device-independent labels for geom_flashmap slices.
While geom_flashmap has always supported label names for its slices, it does
so by appending "s.labelname" to the provider device name, meaning you still
have to know the name and unit of the hardware device to use the labels.
These changes add support for device-independent geom_flashmap labels, using
the standard geom_label infrastructure. geom_flashmap now creates a softc
struct attached to its geom, and as it creates slices it stores the label
into an array in the softc. The new geom_label_flashmap uses those labels
when tasting a geom_flashmap provider.
A large set of changes that collectively modernize the at45d and mx25l
(DataFlash and SpiFlash) drivers, add FDT support, and add geom_flashmap
support to them.
r335159 by manu:
mx25l: Add pnp info
r344505:
Add a functional detach() implementation to make module unloading possible.
r344506:
Add support for probing/attaching on FDT-based systems.
r344507:
Switch to using config_intrhook_oneshot(). That allows the error handling
in the delayed attach to use early returns, which allows reducing the level
of indentation. So all in all, what looks like a lot of changes is really
no change in behavior, mostly just moving whitespace around.
r344523:
Include the jedec "extended device information string" in the criteria used
to match a chip to our table of metadata describing the chips. At least one
new DataFlash chip has a 3-byte jedec ID identical to its predecessors and
differs only in the extended info, and it has different metadata requiring a
unique entry in the table. This paves the way for supporting such chips.
The metadata table now includes two new fields, extmask and extid. The two
bytes of extended info obtained from the chip are ANDed with extmask then
compared to extid, so it's possible to use only a subset of the extended
info in the matching.
We now always read 6 bytes of jedec ID info. Most chips don't return any
extended info, and the values read back for those two bytes may be
indeterminate, but such chips have extmask and extid values of 0x0000 in the
table, so the extid effectively doesn't participate in the matching on those
chips and it doesn't matter what they return in the extended info bytes.
r344525:
Add a metadata entry for the AT45DB641E chip. This chip has the same 3-byte
jedec ID as its older cousin the AT45DB642D, but uses a different page size.
The only way to distinguish between the two chips is that the 2D chip has
0 bytes of extended ID info and the new 1E has 1 byte of extended ID. The
actual value of the extended ID byte is all zeroes. In other words, it's
the presence of the extended info that identifies this chip. (Presumably
a future upgrade might define non-zero values for the extended ID byte.)
r344526:
Resolve a name conflict when both SpiFlash and DataFlash devices are present.
Both SpiFlash (mx25l) and DataFlash (at45d) drivers create a disk device
with a name of /dev/flash/spiN where N is the driver's unit number. If
both types of devices are present in the same system, this creates a fatal
conflict that prevents attachment of whichever device attaches second
(because mx25l0 and at45d0 both try to create a spi0).
This gives each type of device a unique name (mx25lN or at45dN respectively)
and also adds an alias of spiN for compatibility. When both device types
appear in the same system, only the first to attach gets the spiN alias.
When the second device attaches there is a non-fatal warning that the alias
can't be created, but both devices are still accessible via their primary
names (and there is no need for the spiN name to work for backwards
compatibility on such a system, because it has never been possible to use
the spiN names when both devices exist).
r344529:
Fix a paste-o that broke the build on all arches.
r344556:
Set maximum bus clock speed from hints when attaching hinted spibus(4) children.
Some devices (such as spigen(4)) document that this works, but it appears that the
code to implement it never got added.
r344606:
Add support for geom_flashmap by providing a getattr() for "SPI:device".
r344607:
Compile fdt_slicer and geom_flashmap when the at45d device is included.
r344608:
Update a comment to reflect reality; no functional changes.
r344609:
Make it possible to load fdt_slicer as a module (unloading works too fwiw).
r344610:
Add manpages for at45d(4) and mx25l(4).
r344611:
Add a module dependency on fdt_slicer.
r344612:
Add a module dependency on fdt_slicer. Also, move the PNP_INFO to its more
usual location, down near the DRIVER_MODULE() stuff.
r344614:
Rename some functions and variables to have shorter names, which allows
unwrapping multiple lines of code. Also, convert some short multiline
comments into single-line comments. Change old-school FALSE to false.
All in all, no functional changes, it's just more compact and readable.
r344615:
Child nodes with a compatible property are not slices, according to the
devicetree/bindings/mtd/partitions.txt document, so just ignore them.
r344616:
Add support to fdt_slicer for the new style partition data documented in
devicetree/bindings/mtd/partition.txt.
In the old style, all the children of the device node which did not have a
compatible property were the partitions. In the new style, there is a child
node of the device which has a compatible string of "fixed-partitions", and
its children are the individual partitions.
Also, support the read-only property by setting the corresponding slice flag.
r344681:
Build fdt support modules on systems that use fdt data.
kern.opts.mk sets make var OPT_FDT to a non-empty value if platform.h
contains OPT_FDT.
r344684:
Undo accidental part of r344681.
I think I must have accidentally mouse-click pasted while scrolling and
didn't notice it.
r344685:
Add required header file to SRCS.
r344686:
Add another required header file.
For some reason this seems to be required on aarch64, but I can build armv7
from clean without needing this in the list. (The file does get included,
so the mystery is why armv7 works.)
r344728:
Bugfix: use a dummy buffer for the inactive side of a transfer.
This is especially important for writes. SPI is inherently a bidirectional
bus; you receive data (even if it's garbage) while writing. We should not
receive that data into the same buffer we're writing to the device.
When reading it doesn't matter what we send to the device, but using the
dummy buffer for that as well is pleasingly symmetrical.
r344733:
Add some comments. Give #define'd names to some scattered numbers. Change
some #define'd names to be more descriptive. When reporting a post-write
compare failure, report the page number, not the byte address of the page.
The latter is the only functional change, it makes the number match the
words of the error message.
r344734:
Allow the sector size of the disk device to be configured using hints or
FDT data. The sector size must be a multiple of the device's page size.
If not configured, use the historical default of the device page size.
Setting the disk sector size to 512 or 4096 allows a variety of standard
filesystems to be used on the device. Of course you wouldn't want to be
writing frequently to a SPI flash chip like it was a disk drive, but for
data that gets written once (or rarely) and read often, using a standard
filesystem is a nice convenient thing.
r344981:
Give the mx25l device sole ownership of the name /dev/flash/spi* instead of
trying to use disk_add_alias() to make spi* an alias for mx25l*. It turns
out disk_add_alias() works for partitions, but not slices, and that's hard
to fix.
This change is, in effect, a partial revert of r344526.
The mips world relies on the existence of flashmap names formatted as
/dev/flash/spi0s.name, whereas pretty much nothing relies on at45d devices
using the /dev/spi* names (because until recently the at45d driver didn't
even work reliably). So this change makes mx25l devices the sole owner of
the /dev/flash/spi* namespace, which actually makes some sense because it is
a SpiFlash(tm) device, so flash/spi isn't a horrible name.
ian [Mon, 22 Apr 2019 13:55:06 +0000 (13:55 +0000)]
MFC r342639:
When allocating a new keyboard at vt_upgrade() time, unwind any cngrabs
done on the old keyboard and then do the corresponding number of grabs
on the new keyboard.
This fixes a race that can leave the system with a non-functioning
keyboard. It goes like this...
- The bios claims there is an AT keyboard, atkbd attaches.
- SI_SUB_INT_CONFIG_HOOKS runs.
- USB probes devices. Devices begin attaching, including disks.
- GELI prompts for a password for a just-attached disk, which results
in a cngrab() while atkbd is the keyboard.
- A USB keyboard attaches.
- vt_upgrade() runs and switches the keyboard to the new USB keyboard,
but because cngrab was never called for it, it's not activated and
keystrokes are ignored.
- Now there is no functional keyboard and no way to get one; even
plugging in a different USB keyboard doesn't help, because the console
is still grabbed, still waiting for a GELI pw.
ian [Mon, 22 Apr 2019 13:51:25 +0000 (13:51 +0000)]
MFC r337731:
Export the eeprom device size via readonly sysctl. Also export the write
page size and address size, although they are likely to be inherently
less-interesting values outside of the driver.
ian [Mon, 22 Apr 2019 13:45:08 +0000 (13:45 +0000)]
MFC r336137-r336138, r336202, r336214, r336216
r336137:
Add a manpage for the imx_spi driver.
r336138:
Add pnp info to the imx_spi driver.
r336202:
Enhancements and fixes for the spigen(4) driver...
- Resources used by spigen_mmap_single() are now tracked using
devfs_set_cdevpriv() rather than in the softc.
- Since resources are now tracked per-open-fd, there is no need to try to
impose any exclusive-open logic, so flags related to that are removed.
- Flags used to track open status to prevent detach() when the device is
open are replaced with calls to device_busy()/device_unbusy(). That
extends the protection up the hierarchy so that the spibus and hardware
controller drivers also can't be detached while the device is open/in use.
- Arbitrary limits on the maximum size of a transfer are removed, along with
the sysctl variables that allowed the limits to be changed. There is just
no reason to limit the size of a spi transfer to the machine's page size.
Or to any other arbitrary value, really.
- Most of the locking is removed. It was mostly protecting access to flags
and fields in the softc that no longer exist. The locking that remains is
just to prevent concurrent calls to device_[un]busy().
- The code was calling malloc() with M_WAITOK while holding a mutex in
several places. Since most of the locking is gone, that's fixed.
r336214:
Add various spi devices to NOTES.
r336216:
Actually build and install the spigen.4 manpage.
ian [Mon, 22 Apr 2019 05:00:29 +0000 (05:00 +0000)]
MFC r336094, r336096
r336094:
Catch up with improvements in RTC handling... It's no longer necessary to
ignore the timestamp passed in to settime() due to inaccuracy, the core
routines now pass in a nanosecond-accurate time freshly-obtained before
calling each driver's settime() method. Also, add calls to the new
debugging output helpers.
r336096:
Make the imx6_snvs driver usable as a module, add pnp info. Add a manpage.
ian [Mon, 22 Apr 2019 04:56:41 +0000 (04:56 +0000)]
MFC r333073-r333074
r333073 by manu:
arm: Fix duplicate ahci DRIVER_MODULE
Name each ahci driver uniquely.
This remove the warning printed at each arm boot :
module_register: cannot register simplebus/ahci from kernel; already loaded from kernel
r333074 by manu:
arm: Fix duplicate ehci DRIVER_MODULE
Name each ehci driver uniquely.
This remove the warning printed at each arm boot :
module_register: cannot register simplebus/ehci from kernel; already loaded from kernel
ian [Mon, 22 Apr 2019 04:15:22 +0000 (04:15 +0000)]
MFC r336070, r336072-r336073, r336076
r336070:
Add pnp info and a module makefile for the imx_wdog watchdog driver.
r336072:
Correctly calculate the value to put in the imx wdog countdown register.
The correct value is seconds*2-1. The code was using just seconds*2, which
led to being off by a half-second -- usually not a big deal, except when the
value was the max (128) it overflowed so zero would get written to the
countdown register, which equates to a timeout of a half second.
r336073:
Add support to the imx watchdog for the FDT "timeout-sec" property, by
automatically initializing the watchdog using the given value. Also,
attach at BUS_PASS_TIMER to extend watchdog protection to more of the
kernel init process.
r336076:
Add a manpage for the imx5/6 watchdog driver.
ian [Mon, 22 Apr 2019 04:07:51 +0000 (04:07 +0000)]
MFC r335982, r335985, r335988-r335989
r335982:
Fix an out-of-bounds array access... the irq data for teardown is in two
arrays, as elements 0 and 1 of one array and elements 1 and 2 of the other.
Run the loop 0..1 instead of 1..2 and use named constants to offset into
one of the arrays.
PR: 229508
r335985:
Remove a test and early-out which just can't possibly be right. It causes
detach() to do nothing if attach() succeeded, which is the opposite of
what's needed. Also, move device_delete_children() from the end to the
beginning of detach(), so that children won't be trying to make use of the
hardware we're in the process of shutting down.
PR: 229510
r335988:
Add a missing call to usb_bus_mem_free_all() when detaching.
r335989:
Detach all children before beginning to tear down the hardware, instead of
doing it last. Also, remove the local tracking of whether usb's busdma
memory allocation got done, because it's safe to call the free_all
function even if it wasn't.
ian [Mon, 22 Apr 2019 04:02:16 +0000 (04:02 +0000)]
MFC r335594:
Retrieve the bus clock speed and mode (polarity/phase) from the child device
and set up the hardware accordingly on each transfer. This replaces the old
configuration done via sysctl, and allows both fdt configuration data and
userland control via the spigen device to work.
Submitted by: Bob Frazier
Differential Revision: https://reviews.freebsd.org/D15031
ian [Mon, 22 Apr 2019 03:55:02 +0000 (03:55 +0000)]
MFC r335527, r335529, r335593
r335527:
Add spi(8), a utility for communicating with a device on a SPI bus from
userland, conceptually similar to what i2c(8) provides for i2c devices.
Submitted by: Bob Frazier
Differential Revision: https://reviews.freebsd.org/D15029
r335529:
Eliminate gcc "shadowed declaration" warnings by using idx rather than
index as a variable name.
r335593:
Add an example for displaying the manufacturer and size info from a
standard spiflash chip.
ian [Mon, 22 Apr 2019 03:52:11 +0000 (03:52 +0000)]
MFC r335506
r335506:
Incorporate bus and chip select numbers into spigen(4) cdev names. Rather
than assigning spigen device names in order of creation, this uses a device
name that corresponds to the owning spibus and chip-select index.
Example: /dev/spigen0.1 would be a child of spibus0, and use cs = 1
The intent is for systems like Raspberry Pi to have a consistent way of
using an SPI interface with a specific cs value from a user application.
Otherwise, there is no consistent way of knowing which cs pin will be
assigned to a particular spigen device. The alternative is to specify
everything in "the right order" in an overlay file, which is less than
ideal. Additionally, this duplicates (to some extent) the way Linux handles
a similar situation with their 'spidev' device, so it would be somewhat
familiar to those who also use Linux.
A new kernel config option, SPIGEN_LEGACY_CDEVNAME, causes the driver to
also create /dev/spigenN device name aliases, with N incrementing in the
order of device instantiation. This is provided to ease the transition
for existing systems using the original naming convention (particularly
when these changes are MFC'd to stable branches).
Comment out checks that are causing failures on ^/stable/11, post-r337133
Some usr.bin/procstat code has not been MFCed to ^/stable/11 for command line
handling, and unfortunately, r337133 (which was MFCed as r337542) relies on
that support.
Comment out the checks and add a pointer to the PR for code archaeology and
to cause future potential parties to verify changes when backporting them.
r334817:
Add new functionality and syntax to cron(1) to allow to run jobs at a
given interval, which is counted in seconds since exit of the previous
invocation of the job. Example user crontab entry:
@25 sleep 10
The example will launch 'sleep 10' every 35 seconds. This is a rather
useless example above, but clearly explains the functionality.
The practical goal here is to avoid overlap of previous job invocation
to a new one, or to avoid too short interval(s) for jobs that last long
and doesn't have any point of immediate launch soon after previous run.
Another useful effect of interval jobs can be noticed when a cluster of
machines periodically communicates with a single node. Running the task
time based creates too much load on the node. Running interval based
spreads invocations across machines in cluster. Note that -j/-J won't
help in this case.
r334910:
Remove old, dead compat code.
We no longer need to od these things conditionally, and the fallbacks
are to 4.2BSD era defaults, which nobody uses anymore. Vixie cron has
diverged from upstream anyway in our tree, and it's not clear there's
actually a viable upstream anymore. Plus, we don't follow the
vendor-supplied code pattern here.
I'm doing this to reduce false positives from grep.
When loading bigger variables form UEFI it is necessary to know their
size beforehand, so that an appropriate amount of memory can be
allocated. The easiest way to do this is to try to read the variable
with buffer size equal 0, expecting EFI_BUFFER_TOO_SMALL error to be
returned. Allow such possible approach in efi_getenv routine.
Extracted from a bigger patch as suggested by imp.
r344238:
Restore loader(8)'s ability for lsdev to show partitions within a bsd slice.
I'm pretty sure this used to work at one time, perhaps long ago. It has
been failing recently because if you call disk_open() with dev->d_partition
set to -1 when d_slice refers to a bsd slice, it assumes you want it to
open the first partition within that slice. When you then pass that open
dev instance to ptable_open(), it tries to read the start of the 'a'
partition and decides there is no recognizable partition type there.
This restores the old functionality by resetting d_offset to the start
of the raw slice after disk_open() returns. For good measure, d_partition
is also set back to -1, although that doesn't currently affect anything.
I would have preferred to make disk_open() avoid such rude assumptions and
if you ask for partition -1 you get the raw slice. But the commit history
shows that someone already did that once (r239058), and had to revert it
(r239232), so I didn't even try to go down that road.
r344239:
Use a couple local variables to avoid repetitive long expressions that
cause line-wrapping.
r344240:
Make lsdev -v output line up in neat columns by using a fixed width for
the size field and a tab between the partition type and the size.
r344247:
Make uboot_devdesc properly alias disk_devdesc, so that parsing the u-boot
loaderdev variable works correctly.
The uboot_devdesc struct is variously cast back and forth between
uboot_devdesc and disk_devdesc as pointers are handed off through various
opaque interfaces. uboot_devdesc attempted to mimic the layout of
disk_devdesc by having a devdesc struct, followed by a union of some
device-specific stuff that included a struct that contains the same fields
as a disk_devdesc. However, one of those fields inside the struct is 64-bit
which causes the entire union to be 64-bit aligned -- 32 bits of padding
is added between the struct devdesc and the union, so the whole mess ends
up NOT properly mimicking a disk_devdesc after all. (In disk_devdesc there
is also 32 bits of padding, but it shows up immediately before the d_offset
field, rather than before the whole collection of d_* fields.)
This fixes the problem by using an anonymous union to overlay the devdesc
field uboot network devices need with the disk_devdesc that uboot storage
devices need. This is a different solution than the one contributed with
the PR (so if anything goes wrong, the blame goes to me), but 95% of the
credit for this fix goes to Pawel Worach and Manuel Stuhn who analyzed the
problem and proposed a fix.
r344254:
Use DEV_TYP_NONE instead of -1 to indicate no device was specified.
DEV_TYP_NONE has a value of zero, which makes more sense since the device
type is a bunch of bits describing the device, crammed into an int.
r344255:
Fix more places to use DEV_TYP_NONE instead of -1 to indicate 'no device'.
r344260:
Allow the u-boot loaderdev env var to be formatted in the "usual" loader(8)
way: device<unit>[s|p]<slice><partition>. E.g., disk0s2a or disk3p12.
The code first tries to parse the variable in this format using the
standard disk_parsedev(). If that fails, it falls back to parsing the
legacy format that has been supported by ubldr for years.
In addition to 'disk', all the valid uboot device names can also be used:
mmc, sata, usb, ide, scsi. The 'disk' device serves as an alias for all
those types and will match the Nth storage-type device found (where N is
the unit number).
r344268:
loader: ptable_close() should check its argument
If the passed in table is NULL, just return.
r344335:
Fix the handling of legacy-format devices in the u-boot loaderdev variable.
When I added support for the standard loader(8) disk0s2a: type formats,
the parsing of legacy format was broken because it also contains a colon,
but it comes before the slice and partition. That would cause disk_parsedev()
to return success with the slice and partition set to wildcard values.
This change examines the string first, and if it contains spaces, dots, or
a colon at any position other than the end, it must be a legacy-format
string and we don't even try to use disk_parsedev() on it.
r344839:
Add retry loop around GetMemoryMap call to fix fragmentation bug
The call to BS->AllocatePages can cause the memory map to become framented,
causing BS->GetMemoryMap to return EFI_BUFFER_TOO_SMALL more than once. For
example this can happen on the MinnowBoard Turbot, causing the boot to stop
with an error. Avoid this by calling GetMemoryMap in a loop.
r345066:
stand: Improve some debugging experience
Some of these files using <FOO>_DEBUG defined a DEBUG() macro to serve as a
debug-printf. -DDEBUG is useful to enable some debugging output across
multiple ELF/common parts, so switch the DEBUG-as-printf macros over to
something more like DPRINTF that is more commonly used for this kind of
thing and less likely to conflict.
userboot/elf64_freebsd debugging also assumed %llx for uint64; use PRIx64
instead.
r345330:
loader: fix loading of kernels with . in path
The loader indended to search the kernel file name (only) for . but
instead searched the entire path, so paths like
"boot/test.elfv2/kernel" would not work.
r341101:
powerpcspe: Don't crash the loader on ubldr with SPE instructions.
-msoft-float seems to be insufficient for disabling the SPE on powerpcspe.
Force it off with -mno-spe as well. This prevents a crash in ubldr on
powerpcspe.
r341231:
loader: command_bcache() should print unsigned values
All bcache counters are unsigned.
r341276:
When handling CMD_CRIT error set command_errmsg to NULL after we dump it out,
so that it does not result in error message printed twice.
OK load doodoo
can't find 'doodoo'
can't find 'doodoo'
OK
r341329:
loader.efi: fix EFI getchar() for multiple consoles
This fix is ported from illumos (issue #9970), the analysis and initial
implementation was done by John Levon.
See also: https://www.illumos.org/issues/9970
Currently, efi_cons_getchar() will wait for a key. While this seems to make
sense, the implementation of getchar() in common/console.c will loop across
getchar() for all consoles without doing ischar() first.
This means that if we've configured multiple consoles, we can't input into
the serial, as getchar() will be sat waiting for input only from efi_console.c
This patch does implement a bit more generic key buffer to support
translation of input keys, and we use generic efi_readkey() to reduce
duplication from calls from getchar() and poll().
r341433:
Move inclusion of src.opts.mk later.
src.opts.mk includes bsd.own.mk. This in turn defines CTFCONVERT_CMD
depending on the MK_CTF value. We then set MK_CTF to no, which has no
real effect. The solution is to set all the MK_foo values before
including src.opts.mk.
This should stop the cdboot binary from exploding in size for releases
built WITH_CTF=yes in src.conf.
r341780:
powerpc/ubldr: Teach powerpc's ubldr to boot 64-bit kernels
This is just a copy of powerpc/ofw's ppc64_elf_freebsd.c modified to fit
ubldr's boot format.
r342054:
Print an error message in efi_main.c if we can't allocate memory for the heap
With the default Qemu parameters, only 128MB RAM gets given to a VM. This causes
the loader to be unable to allocate the 64MB it needs for the heap. This change
makes the cause of the error more obvious.
r342055:
Cast error message in efi_main.c to CHAR16* to avoid build error
r342721:
loader.efi: update memmap command to recognize new attributes
Also move memory type to string translation to libefi for later use.
r342742:
loader.efi: efi variable rework and lsefi command added
This update does add diag and debug capabilities to interpret the efi
variables, configuration and protocols (lsefi).
The side effect is that we add/update bunch of related headers.
r342840:
Create MK_LOADER_VERBOSE and connect it to ELF_VERBOSE in the loader
code.
r343008:
Add Dell Chromebook to the list of devices with E820 extmem quirk enabled
Just like for Acer C270 chromebook the E820 extmem workaround is required for
FreeBSD to boot on Dell chromebook.
r343225:
Unbreak mip64 build after r328437
Add exit and getchar functions to beri/boot2 code. They are required by
panic_action functin introduced in r328437
r338262:
stand: fdt: Drop some write-only assignments/variables and leaked bits
Generally straightforward enough; a copy of argv[1] was being made in
command_fdt_internal, solely used for a comparison within the
handler-search, then promptly leaked.
r339334:
loader.efi: add poweroff command
Add poweroff command to make life a bit easier.
r339796:
Simplify the EFI delay() function by calling BS->Stall()
r340240:
loader: ptable_open() check for ptable_cd9660read result is wrong
The ptable_*read() functions return NULL on read errors (and partition table
closed as an side effect). The ptable_open must check the return value and
act properly.
r340857:
Nuke out buffer overflow safety marker code, it duplicates similar code in
the malloc()/free() as well as having potential of softening the handling
in case error is detected down to a mere warning as compared to hard panic
in free().
r340917:
Update pxeboot(8) manual page to reflect the next-server change in the ISC DHCP v3 server.
r341007:
Bump the date of pxeboot(8) manual page for r340917.
r337321:
Make it possible for init to execute any executable, not just sh(1)
scripts. This means one should be able to eg rewrite their /etc/rc
in Python.
r337435:
Move description of init_shell, init_script, and init_chroot kenv
tunables from loader(8) to init(8), since it's init that actually
uses them. Add .Xrs at their old place.
r337707:
Move around text in loader(8), in particular stuff related to ZFS,
to restore the usual section order.
r337740:
Add init_exec kenv(1) variable, to make init(8) execute a file
after opening the console, replacing init as PID 1.
From the user point of view, it makes it possible to run eg the
shell as PID 1, using 'set init_exec=/bin/sh' at the loader(8)
prompt.
r337834:
Add SECURITY section to loader(8).
r337836:
Improve formatting.
r337968:
Consistently use NULL to terminate the argv; no functional changes.
new_package may not set *pp if it errors out, leaving pkg uninitialized.
r339970:
Remove unnecessary include from libstand.
r342151:
loader: zfs reader should not probe partitionless disks
First of all, normal setups can not boot such pools as the tools
do not support installing boot programs.
Secondly, for proper pool configuration detection, we need to checks all
four label copies on disk, 2 from front and 2 from the end of the disk,
but zfs label does not contain the size of the disk - so we depend on
firmware to report the correct disk size or use information from the
partition table.
Without partition table, we only can rely on firmware to report and support
disk IO properly.
There is a specific case: 8TB disks are reported by BIOS to have 4294967295
sectors (0x00000000ffffffff), the sectors reported by OS is 15628053168
(0x00000003a3812ab0), so the reported size is less than actual but is hitting
32-bit max. Unfortuantely the real limit must be even lower because probing
this disk in this system will wnd up with hung system.
UEFI boot of this system seems not to be affected.
r342161:
loader: zfs reader should not probe partitionless disks (UEFI case)
With r342151 I did fix the BIOS version of zfs_probe_dev() from accessing
the whole disk, but the fix was not complete - we actually did not check
if the device name was really for whole disk. Since UEFI version
is only calling the zfs_probe_dev() with partitions and not with whole
disk, the UEFI loader was not able to find the zfs pools.
This update does correct the issue by calling archsw.arch_getdev() to
translate the device name back to dev_desc, and we have whole disk when both
partition and slice values are -1.
r343123:
loader should ignore active multi_vdev_crash_dump feature on zpool
Since the loader zfs reader does not need to read the dump zvol, we can
just enable the feature.
r344226:
Fix memory corruption bug introduced in r325310
The bug occurred when a bounce buffer was used and the requested read
size was greater than the size of the bounce buffer. This commit also
rewrites the read logic so that it is easier to systematically verify
all alignment and size cases.
r344234:
It turns out r344226 narrowed the overrun bug but did not eliminate it entirely
This commit fixes a remaining output buffer overrun in the
single-sector case when there is a non-zero tail.
r344248:
cd9660: dirmatch fails to unmatch when name is prefix for directory record
Loader does fail to properly match the file name in directory record and
does open file based on prefix match.
For fix, we check the name lengths first.
r344387:
loader: really fix cd9660 dirmatch
The cd9660_open() does pass whole path to dirmatch() and we need to
compare only the current path component, not full path.
Additinally, skip over duplicate / (if any) and check if the last component
in the path was meant to be directory (having trailing /). If it is in fact
a file, error out.
r341253:
The libstand's panic() appends its own '\n' to the message, so that users of the API
don't need to supply one.
r341328:
loader: create separate lists for fd, cd and hd, merge bioscd with biosdisk
Create unified block IO implementation in BIOS version, like it is done in UEFI
side. Implement fd, disk and cd device lists, this will split floppy devices
from disks and will allow us to have consistent, predictable device naming
(modulo BIOS issues).
r342619:
loader: create bio_alloc and bio_free for bios bounce buffer
We do have 16KB buffer space defined in pxe.c, move it to bio.c and implement
bio_alloc()/bio_free() interface to make it possible to use this space for
other BIOS calls (notably, from biosdisk.c).
r342626:
Add Copyright.
r342707:
i386_parsedev() needs to support fd devices
r342785:
With buggy int13 ah=15, we can mis-identify the floppy devices.
We have no option than trust INT13 ah=08 return code during the init phase.
r342865:
biospci_write_config args were backwards
biospci_write_config args swapped length and value to write. Some
hardware coped just fine, while other hardware had issues.
r339658:
loader: biosdisk interface should be able to cope with 4k sectors
The 4kn support in current bios specific biosdisk.c is broken, as the code
is only implementing the support for the 512B sector size.
This work is building the support for custom size sectors, we still do assume
the requested data to be multiple of 512B blocks and we only do address the
biosdisk.c interface here.
For reference, see also:
https://www.illumos.org/issues/8303
https://www.illumos.org/rb/r/547
As the GELI is moved above biosdisk "layer", the GELI should just work
r339959:
loader: issue edd probe before legacy ah=08 and detect no media
while probing for drives, use int13 extended info before standard one and
provide workaround for case we are not getting needed information in case
of floppy drive.
In case of INT13 errors, there are (at least) 3 error codes appearing in case
of missin media - 20h, 31h and 80h. Flag the no media and do not print an
error.
r340047:
loader: do not probe floppy devices for zfs
The subject is telling it all.
r340049:
loader: biosdisk should check if the media is present
The bd_print/bd_open/bd_strategy need to make sure the device does have
media, before getting into performing IO operations. Some systems can
hung if the device without a media is accessed.
r340215:
loader: always set media size from partition.
The disk access is validated by using partition table definitions, therefore
we have no need for if statements, just set the disk size.
Of course the partition table itself may be incorrect/inconsistent, but if
so, we are in trouble anyhow.
Also switch u_int to uint32_t. Also replace "write" by "dowrite".
No functional changes intended.
r337354:
loader: 337353 did miss to rename 2 write instances
2 write instances got somehow missed.
r337356:
loader: bd_open() should cleanup from disk_open() error
Since bd_open() does early increment for reference counter and bcache
allocation, it also should undo those in case of the error.
Also remove unused variables rdev, g_err.
r337872:
libi386: remove BD_SUPPORT_FRAGS
BD_SUPPORT_FRAGS is preprocessor knob to allow partial reads in bioscd/biosdisk
level. However, we already have support for partial reads in bcache, and there
is no need to have duplication via preprocessor controls.
Note that bioscd/biosdisk interface is assumed to perform IO in 512B blocks,
so the only translation we have to do is 512 <-> native block size.
r337878:
libi386: remove bd_read() and bd_write() wrappers
Those wroappers are nice, but do not really add much value.
r337881:
libi386: use BD_RD and BR_WR constants
Use BD_RD and BD_WR instead of 0 and 1.
r337890:
libi386: small style updates in biosdisk
Use break instead of return in for loop, as done earlier. Insert and remove
some blank lines. No functional changes intended.
r337891:
libi386: bd_io_workaround() is to be called for reads only
bd_io() can perform either reads or writes, we only need bd_io_workaround()
for reads.
r338188:
loader: bios loader should allow to chain load a file
The current chain command does accept only device, allow also a file to be used,
such as /boot/pmbr or /boot/mbr (or stored third party MBR/VBR block).