Defining INTRNG removes some necessary registers and declarations of
pic_init_secondary, pic_ipi_send, pic_ipi_read and pic_ipi_clear.
Because Marvell ArmadaXP and Armada38X always use INTRNG, include all
INTRNG code and remove code that does not use it.
Separate pic register declarations for Armada38X are unnecessary; it
works properly with the ArmadaXP config.
9434 Speculative prefetch is blocked by device removal code.
Device removal code does not set spa_indirect_vdevs_loaded for pools
that never experienced device removal. At least one visible consequence
of this is a completely blocked speculative prefetcher. This patch sets
the variable in such situations.
Add the etdump utility for dumping El Torito boot catalog information.
This can be used to check existing images but will be used in the future to
find EFI ESP images placed in El Torito catalogs so they can be used for
hybrid boot purposes.
fix i386 build with CPU_ELAN (LINT for instance) after r331878
x86/cpu_machdep.c now needs to include elan_mmcr.h when CPU_ELAN is set.
While here, also remove the now unneeded inclusion of isareg.h in i386
and amd64 vm_machdep.c.
Reported by: lwhsu
MFC after: 14 days
X-MFC with: r331878
r330675 introduced an extra window check in the LRO code to ensure it
captured and reported the highest window advertisement with the same
SEQ/ACK. However, the window comparison uses modulo 2**16 math, rather
than directly comparing the absolute values. Because windows use
absolute values and not modulo 2**16 math (i.e. they don't wrap), we
need to compare the absolute values.
Reviewed by: gallatin
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D14937
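The difference between the two comparisons in the LRO fix above can be sketched as follows. This is an illustrative Python model, not the tcp_lro code: gt_mod16, gt_absolute and highest_window are hypothetical names, the window values are made up, and the serial-number-style test is only an assumption about what a modulo 2**16 comparison looks like.

```python
def gt_mod16(a, b):
    """Serial-number style 'a > b' under modulo 2**16 arithmetic (assumed form)."""
    diff = (a - b) & 0xFFFF
    return 0 < diff < 0x8000


def gt_absolute(a, b):
    """Plain comparison; advertised window values do not wrap."""
    return a > b


def highest_window(windows, gt):
    """Return the largest advertised window according to comparator gt."""
    best = windows[0]
    for w in windows[1:]:
        if gt(w, best):
            best = w
    return best


# Two advertisements seen for the same SEQ/ACK (hypothetical values).
windows = [1000, 60000]
print(highest_window(windows, gt_mod16))     # modulo math keeps the smaller window
print(highest_window(windows, gt_absolute))  # absolute comparison keeps the larger one
```

With the modulo comparison, 60000 looks "behind" 1000 because the difference exceeds half of the 16-bit space, so the larger advertisement is dropped.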
andrew [Tue, 3 Apr 2018 11:01:50 +0000 (11:01 +0000)]
Switch users of fdt_is_enabled to use ofw_bus_node_status_okay. These are
equivalent, so to prepare to remove the former, move users to call the
latter.
fix signatures of cpu_reset_real and cpu_reset_proxy, broken in r331878
When I moved these functions from i386 and amd64 to x86 I dropped their
prototype declarations (that were correct) and left only their definitions
that became incorrect.
Reported by: bde
MFC after: 15 days
X-MFC with: r331878
Fix accidental USB port resets by GPIO on Zynq/Zedboard boards
The Zynq/Zedboard GPIO driver attempts to tri-state all GPIO pins on
boot up but the order in which I reset the hardware can cause the pins
to be briefly held low before being tri-stated. This is a problem on
boards that use GPIO pins to reset devices.
In particular, the Zybo and ZC-706 boards use a GPIO pin as a USB PHY
reset. If U-boot enables the USB port before booting the kernel, the
GPIO driver attach causes a glitch on the USB PHY reset and the USB
port loses power. My fix is to have the GPIO driver leave the pins in
whatever configuration U-boot placed them.
Default loader.conf: Drop efi_max_resolution to 1x1
Effectively disabling the mode changing bits in the loader. No matter which
way we go with it, it seems to be wrong: either the firmware doesn't change
the resolution and reports the resolution we requested, or the firmware
changes the resolution and doesn't report the resolution we requested. In
some cases, it does the right thing, but the bad cases outweigh those.
Interested individuals can still set efi_max_resolution to 1080p or whatnot
in loader.conf(5) to restore the new behavior, but the new behavior does not
work out well for many cases.
cxgbe: Implement tcp_info handler for connections handled by t4_tom.
The TCB is read using a memory window right now. A better alternative to
get self-consistent, uncached information would be to use a GET_TCB
request but waiting for a reply from hw while holding non-sleepable
locks is quite inconvenient.
Import CK as of commit b19ed4c6a56ec93215ab567ba18ba61bf1cfbac8
It should fix ck_pr_[load|store]_ptr on mips and riscv, make sure no
*fence instructions are used on i386, as older cpus don't support it, and
make sure we don't rely on gcc builtins that can lead to calls to
libatomic when linked with -O0.
Ensure the background laundering threshold is positive after a scan.
The division added in r331732 meant that we wouldn't attempt a
background laundering until at least v_free_target - v_free_min clean
pages had been freed by the page daemon since the last laundering. If
the inactive queue is depleted but not completely empty (e.g., because
it contains busy pages), it can thus take a long time to meet this
threshold. Restore the pre-r331732 behaviour of using a non-zero
background laundering threshold if at least one inactive queue scan has
elapsed since the last attempt at background laundering.
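A minimal sketch of why a division can leave the threshold at zero. It assumes, without confirmation from the text above, that the threshold is scaled by an integer quotient of pages freed over v_free_target - v_free_min. The function names and the numbers are hypothetical, and scale_positive shows only one possible way to keep the factor non-zero.

```python
def scale_truncating(nfreed, free_target, free_min):
    """Integer quotient: stays 0 until a full (target - min) pages are freed."""
    return nfreed // (free_target - free_min)


def scale_positive(nfreed, free_target, free_min):
    """Same quotient plus one, so the result is at least 1 after any scan."""
    span = free_target - free_min
    return (nfreed + span) // span


# Hypothetical values: only 10 pages freed since the last laundering.
print(scale_truncating(10, 5000, 1000))
print(scale_positive(10, 5000, 1000))
```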
unify amd64 and i386 cpu_reset() in x86/cpu_machdep.c
Because I didn't see any reason not to.
I've been making some changes to the code and couldn't help but notice
that the i386 and amd64 code was nearly identical.
x86 cpu_reset: if failed to switch to BSP proceed to cpu_reset_real
If cpu_reset() is called on an AP and if it somehow fails to wake the
BSP, then it's better to attempt the reset on the AP than just sit there
spinning on an unusable and undebuggable system.
x86 cpu_reset_proxy: no need to stop_cpus() the original processor
The processor is "parked" in a spin-loop already and that's sufficient
for the reset. There is nothing that stop_cpus() would add here, only
extra complexity and fragility.
The original processor does not need to enable interrupts now, in fact,
it must not do that.
In uma_startup_count() handle special case when zone will fit into
single slab, but with alignment adjustment it won't. Again, when
there is only one item in a slab alignment can be ignored. See
previous revision of this file for more info.
Handle a special case when a slab can fit only one allocation,
and zone has a large alignment. With alignment taken into
account uk_rsize will be greater than space in a slab. However,
since we have only one item per slab, it is always naturally
aligned.
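The arithmetic behind this special case can be sketched as below. This is a simplified model, not the UMA code: items_per_slab, the 80-byte slab header and the item size are hypothetical, and alignment is given in bytes.

```python
def round_up(size, align):
    """Round size up to the next multiple of align (align in bytes)."""
    return (size + align - 1) // align * align


def items_per_slab(size, align, slab_space, single_item_case=True):
    """How many items fit in a slab once alignment is applied."""
    rsize = round_up(size, align)
    if rsize <= slab_space:
        return slab_space // rsize
    if single_item_case and size <= slab_space:
        # Only one item: it starts at the slab base, so the rounded-up
        # size does not need to fit.
        return 1
    return 0


SLAB_SPACE = 4096 - 80  # 4k page minus a hypothetical slab header

# 3980 bytes rounds up to 4032 with 64-byte alignment, more than 4016.
print(items_per_slab(3980, 64, SLAB_SPACE, single_item_case=False))
print(items_per_slab(3980, 64, SLAB_SPACE))
```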
Code that will panic before this change with 4k page:
ian [Sun, 1 Apr 2018 18:53:27 +0000 (18:53 +0000)]
Fix the build on arches with default unsigned char. Capture the fubyte()
return value in an int as well as the char, and test the full int value
for fubyte() failure.
jeff [Sun, 1 Apr 2018 04:50:05 +0000 (04:50 +0000)]
Add a uma cache of free pages in the DEFAULT freepool. This gives us
per-cpu alloc and free of pages. The cache is filled with as few trips
to the phys allocator as possible by the use of a new
vm_phys_alloc_npages() function which allocates as many as N pages.
This code was originally by markj with the import function rewritten by
me.
This commit splits all of the logodefs/graphics out into their own files
and provides a method for these files to register their logodefs with the
drawer. Graphics are now loaded on demand if they don't exist in the current
set of logodefs.
The drawer module becomes a little easier to navigate through without all of
the graphics mixed in. It's also easy to do one-off graphics like the
9.2 Die Hard tribute by dteske@ without adding even more to our memory
requirements.
- No need for a 'goto' when our entire loop body is then wrapped in a
conditional.
- No need to leave commented out prints lying around.
- If an expression is clearly going to be either nil or an expression that
isn't likely to be a boolean, we might as well use `or` to specify a
default value for the expression. e.g. `loader.getenv(...) or "no"`
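The `or` idiom from the last item has a close analogue in Python, sketched here with a plain dictionary standing in for the loader environment. getenv_or and the variable names are hypothetical. One difference to keep in mind: Lua treats only nil and false as false, while Python's `or` also replaces an empty string.

```python
def getenv_or(env, name, default="no"):
    """Return env[name], or default when the variable is unset."""
    return env.get(name) or default


env = {"example_enable": "YES"}  # hypothetical loader environment

print(getenv_or(env, "example_enable"))
print(getenv_or(env, "example_missing"))
```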
kevans [Sat, 31 Mar 2018 23:49:00 +0000 (23:49 +0000)]
lualoader: Don't assume that {module}_load is set
The previous iteration of this assumed that {module}_load was set. In the
old world order of default loader.conf(5), this was probably a safe
assumption given that we had almost every module explicitly not-loaded in
it.
In the new world order, this is no longer the case, so one could delete a
_load line inadvertently while leaving a _name, _type, _flags, _before,
_after, or _error. This would have caused a confusing Lua error and borked
module loading.
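The failure mode described above can be modeled in a few lines. This is a Python sketch rather than the Lua loader code, and the module table, wants_load_unguarded and wants_load are hypothetical.

```python
def wants_load_unguarded(opts):
    """Assumes 'load' is always present; fails when it is not."""
    return opts["load"].lower() == "yes"


def wants_load(opts):
    """Treats a missing 'load' option as "no"."""
    return (opts.get("load") or "no").lower() == "yes"


# Hypothetical parsed loader.conf entries, keyed by module name.
modules = {
    "zfs": {"load": "YES"},
    "sound": {"name": "snd_driver"},  # _load line deleted, _name left behind
}

print([name for name, opts in modules.items() if wants_load(opts)])
```

The unguarded variant raises an error on the "sound" entry, which is the kind of confusing failure the change avoids.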
imp [Sat, 31 Mar 2018 22:02:59 +0000 (22:02 +0000)]
fwohcireg.h is 99% the same between the boot loader and the
kernel. Delete it and fix up the 1% difference because there's no need
for them to be different.
benno [Sat, 31 Mar 2018 15:04:41 +0000 (15:04 +0000)]
Synchronise with NetBSD's version of EFI handling for El Torito images.
When I implemented my EFI support I failed to check if the upstream version
of makefs in NetBSD had done the same. Override my version with theirs to
make it easier to stay in sync with them in the future.
jah [Sat, 31 Mar 2018 05:17:12 +0000 (05:17 +0000)]
Remove MK_AUTO_OBJ from env passed to PORTS_MODULES
This fixes a failure to resolve object file paths seen when buildkernel
(which sets MK_AUTO_OBJ=yes) and installkernel (which sets MK_AUTO_OBJ=no)
are run as separate steps. r329232 partially fixed this scenario by removing
MAKEOBJDIR, but it seems the AUTO_OBJ setting also needs to be on the same
page for the build and install steps.
brooks [Fri, 30 Mar 2018 21:38:53 +0000 (21:38 +0000)]
Document and enforce assumptions about struct (in6_)ifreq.
- The two types must be type-punnable for shared members of ifr_ifru.
This allows compatibility accessors to be shared.
- There must be no padding gap between ifr_name and ifr_ifru. This is
assumed in tcpdump's use of SIOCGIFFLAGS output which attempts to be
broadly portable. This is true for all current architectures, but very
large (256-bit) fat-pointers could violate this invariant.
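The "no padding gap" assumption can be illustrated with Python's ctypes, which lays structures out using the platform's C rules. The types below are a hypothetical, reduced mirror of struct ifreq with only a few union members; they are not the kernel definitions, and padding_before_ifru is an illustrative helper.

```python
import ctypes

IFNAMSIZ = 16  # length of the interface name field


class IfrIfru(ctypes.Union):
    """Hypothetical subset of the ifr_ifru union members."""
    _fields_ = [
        ("ifru_flags", ctypes.c_short * 2),
        ("ifru_metric", ctypes.c_int),
        ("ifru_data", ctypes.c_void_p),
    ]


class IfReq(ctypes.Structure):
    _fields_ = [
        ("ifr_name", ctypes.c_char * IFNAMSIZ),
        ("ifr_ifru", IfrIfru),
    ]


def padding_before_ifru(struct_type):
    """Bytes of padding between the end of ifr_name and the start of ifr_ifru."""
    name = struct_type.ifr_name
    return struct_type.ifr_ifru.offset - (name.offset + name.size)


print(padding_before_ifru(IfReq))
```

Because the name field is 16 bytes long, any union alignment that divides 16 leaves no gap; an alignment larger than 16 bytes is what would break the invariant.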
brooks [Fri, 30 Mar 2018 20:24:29 +0000 (20:24 +0000)]
Fall back to ether_ioctl() by default.
The common practice in ethernet device drivers is to fall back to
ether_ioctl() to implement generic ioctls not implemented by the driver
and to fail if no handler exists.
Convert these drivers to follow that practice rather than calling
ether_ioctl() for specific cases.
vxge(4) already had the default case, but it was only called on failure
to match.
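The fallback pattern can be sketched as a dispatch with a default case. This is a Python model, not driver code: ether_ioctl_model stands in for the generic handler, the set of commands it accepts is made up for illustration, and "SIOCBOGUS" is a hypothetical unknown command.

```python
import errno


def ether_ioctl_model(cmd):
    """Stand-in for the generic handler: accepts a few commands, fails otherwise."""
    generic = {"SIOCSIFMTU", "SIOCSIFADDR"}
    return 0 if cmd in generic else errno.EINVAL


def driver_ioctl(cmd):
    """Handle driver-specific commands; defer everything else to the generic handler."""
    if cmd == "SIOCSIFFLAGS":
        return 0  # handled by the driver itself
    # Default case: fall back instead of listing specific commands.
    return ether_ioctl_model(cmd)


print(driver_ioctl("SIOCSIFMTU"))
print(driver_ioctl("SIOCBOGUS") == errno.EINVAL)
```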
hselasky [Fri, 30 Mar 2018 19:45:48 +0000 (19:45 +0000)]
Reorganize health recovery in mlx5core.
- Move the semaphore locking and unlocking to the same function.
- Flags are no longer needed if the reset and crdump will be done in the
same function.
hselasky [Fri, 30 Mar 2018 19:43:15 +0000 (19:43 +0000)]
Prepare for FW dump in error state in mlx5core.
- Move firmware dump prep and cleanup to init_one() and remove_one() so that
the init and cleanup will happen only upon driver reload.
- Add some prints to indicate firmware dump.
gjb [Fri, 30 Mar 2018 19:08:37 +0000 (19:08 +0000)]
Add logic for "families" for GCE images.
This allows for GCE consumers to easily detect the latest major
version of FreeBSD when using the gcloud command line utility.
To ensure snapshot builds do not conflict with release-style
builds (ALPHA, BETA, RC, RELEASE), the '-snap' suffix is appended
to the GCE image family name.
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
brooks [Fri, 30 Mar 2018 18:50:13 +0000 (18:50 +0000)]
Use an accessor function to access ifr_data.
This fixes 32-bit compat (no ioctl command definitions are required
as struct ifreq is the same size). This is believed to be sufficient to
fully support ifconfig on 32-bit systems.
manu [Fri, 30 Mar 2018 16:37:08 +0000 (16:37 +0000)]
efinet: Do not return only if ReceiveFilter fails
If the network interface or the UEFI implementation does not support the
ReceiveFilter interface, do not return; just print a message.
U-Boot doesn't support it and likely never will. Also even if this fails
it doesn't mean that network in EFI isn't supported.
ken [Fri, 30 Mar 2018 15:28:25 +0000 (15:28 +0000)]
Bring in the Broadcom/Emulex Fibre Channel driver, ocs_fc(4).
The ocs_fc(4) driver supports the following hardware:
Emulex 16/8G FC GEN 5 HBAS
LPe15004 FC Host Bus Adapters
LPe160XX FC Host Bus Adapters
Emulex 32/16G FC GEN 6 HBAS
LPe3100X FC Host Bus Adapters
LPe3200X FC Host Bus Adapters
The driver supports target and initiator mode, and also supports FC-Tape.
Note that the driver only currently works on little endian platforms. It
is only included in the module build for amd64 and i386, and in GENERIC
on amd64 only.
kib [Fri, 30 Mar 2018 10:55:31 +0000 (10:55 +0000)]
Make vm_map_max/min/pmap KBI stable.
There are out of tree consumers of vm_map_min() and vm_map_max(), and
I believe there are consumers of vm_map_pmap(), although the latter is
arguably less in need of a KBI-stable interface. For the consumers'
benefit, make modules using this KPI not depend on the struct vm_map
layout.
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D14902
emaste [Fri, 30 Mar 2018 03:38:08 +0000 (03:38 +0000)]
makefs: sync fragment and block size with newfs
r222319 in newfs raised the default blocksize for UFS/FFS filesystems
from 16K to 32K and the default fragment size from 2K to 4K, with a
rationale that most disks were now running with 4K sectors.
MFC after: 2 weeks
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
landonf [Thu, 29 Mar 2018 19:44:15 +0000 (19:44 +0000)]
bhnd(4): include a subset of the ChipCommon capability flags in bhnd_chipid;
this provides early access to device capability flags required by bhnd(4)
bus and bhndb(4) bridge drivers.
davidcs [Thu, 29 Mar 2018 17:36:34 +0000 (17:36 +0000)]
1. Add additional debug prints.
2. Break transmit when IFF_DRV_RUNNING is OFF.
3. set desc_count=0 for default case in switch in ql_rcv_isr()
MFC after: 5 days
markj [Thu, 29 Mar 2018 17:19:59 +0000 (17:19 +0000)]
Have TD_LOCKS_DEC() assert that td_locks is positive.
This makes it easier to catch lock accounting bugs, since the problem
is otherwise only detected upon a return to user mode (or never, for
kernel threads).
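A small model of the idea: checking the counter at the point of the decrement reports the bug where it happens instead of much later. ThreadModel and its methods are hypothetical Python stand-ins, not the kernel macros.

```python
class ThreadModel:
    """Tracks a per-thread lock count, like td_locks."""

    def __init__(self):
        self.td_locks = 0

    def locks_inc(self):
        self.td_locks += 1

    def locks_dec(self):
        # Catch the accounting bug at the faulty release, not at a later check.
        if self.td_locks <= 0:
            raise AssertionError("td_locks would go negative")
        self.td_locks -= 1


td = ThreadModel()
td.locks_inc()
td.locks_dec()
print(td.td_locks)
```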
brooks [Thu, 29 Mar 2018 15:58:49 +0000 (15:58 +0000)]
GC never enabled support for SIOCGADDRROM and SIOCGCHIPID.
When de(4) was imported in 1997 the world was not ready for these ioctls.
In over 20 years that hasn't changed so it seems safe to assume their
time will never come.
markj [Thu, 29 Mar 2018 14:27:40 +0000 (14:27 +0000)]
Fix the background laundering mechanism after r329882.
Rather than using the number of inactive queue scans as a metric for
how many clean pages are being freed by the page daemon, have the
page daemon keep a running counter of the number of pages it has freed,
and have the laundry thread use that when computing the background
laundering threshold.
jeff [Thu, 29 Mar 2018 02:54:50 +0000 (02:54 +0000)]
Implement several enhancements to NUMA policies.
Add a new "interleave" allocation policy which stripes pages across
domains with a stride or width keeping contiguity within a multi-page
region.
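One way to picture striping with a stride is the mapping below. It is a sketch under the assumption that consecutive groups of `stride` pages rotate through the domains; interleave_domain is a hypothetical name and the kernel's actual selection logic may differ.

```python
def interleave_domain(page_index, stride, ndomains):
    """Pick a domain so that each run of `stride` pages stays in one domain."""
    return (page_index // stride) % ndomains


# 12 pages, stride of 4 pages, 2 domains (hypothetical values).
print([interleave_domain(p, 4, 2) for p in range(12)])
```

Pages within the same stride-sized region land in the same domain, which preserves contiguity inside that region while still spreading regions across domains.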
Move the kernel to the dedicated numbered cpuset #2 making it possible
to assign kernel threads and memory policy separately from user. This
also eliminates the need for the complicated interrupt binding code.
Add a sysctl API for viewing and manipulating domainsets. Refactor some
of the cpuset_t manipulation code using the generic bitset type so that
it can be used for both. This probably belongs in a dedicated subr file.
kevans [Thu, 29 Mar 2018 00:55:11 +0000 (00:55 +0000)]
stand: Add workaround for HP BIOS issues
hrs@ and kuriyama@ have found that on some HP BIOS, a system will fail to
boot immediately after installation with the claim that it can't work out
which disk it is booting from.
They tracked it down to a buffer overrun, and found that it could be
alleviated by doing a dummy read beforehand.
jhb [Thu, 29 Mar 2018 00:12:50 +0000 (00:12 +0000)]
Reformat the enum of syscall argument types.
List enum values on separate lines to minimize diffs as new types are
added. Split the enum values up into groups and use some simple sorting
within groups (scalar enums are sorted by size, then base, all other
groups are generally sorted alphabetically).
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: Matt Ahrens <Matt.Ahrens@delphix.com>
With compressed ARC (6950) we use up to 25% of our CPU to decompress indirect
blocks, under a workload of random cached reads. To reduce this decompression
cost, we would like to increase the size of the dbuf cache so that more
indirect blocks can be stored uncompressed.
If we are caching entire large files of recordsize=8K, the indirect blocks
use 1/64th as much memory as the data blocks (assuming they have the same
compression ratio). We suggest making the dbuf cache be 1/32nd of all memory,
so that in this scenario we should be able to keep all the indirect blocks
decompressed in the dbuf cache. (We want it to be more than the 1/64th that
the indirect blocks would use because we need to cache other stuff in the
dbuf cache as well.)
In real world workloads, this won't help as dramatically as the example
above, but we think it's still worth it because the risk of decreasing
performance is low. The potential negative performance impact is that we
will be slightly reducing the size of the ARC (by ~3%).
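The figures above can be checked with a little arithmetic. The sketch assumes that each data block is referenced by one 128-byte block pointer in a level-1 indirect block, which is where the 1/64 ratio for recordsize=8K comes from; the helper names and the 64 GiB memory size are illustrative.

```python
from fractions import Fraction

BLKPTR_SIZE = 128  # bytes per block pointer (assumption stated above)


def indirect_fraction(recordsize):
    """Memory used by level-1 indirect blocks relative to the data they map."""
    return Fraction(BLKPTR_SIZE, recordsize)


def dbuf_cache_bytes(total_memory, divisor=32):
    """Proposed dbuf cache size: 1/32nd of all memory."""
    return total_memory // divisor


print(indirect_fraction(8 * 1024))              # ratio for recordsize=8K
print(dbuf_cache_bytes(64 * 2**30) // 2**30)    # GiB of dbuf cache on a 64 GiB system
print(float(Fraction(1, 32)) * 100)             # percent of memory, the "~3%" above
```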
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Allan Jude <allanjude@freebsd.org>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: George Wilson <george.wilson@delphix.com>
arc_loan_compressed_buf() increments arc_loaned_bytes by psize unconditionally.
In the case of zfs_compressed_arc_enabled=0, when the buf is returned via
arc_return_buf(), if ARC_BUF_COMPRESSED(buf) is false, then arc_loaned_bytes
is decremented by lsize, not psize.
Switch to using arc_buf_size(buf), instead of psize, which will return
psize or lsize, depending on the result of ARC_BUF_COMPRESSED(buf).
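The accounting mismatch can be reproduced with a toy model. Buf and loan_and_return are hypothetical Python stand-ins; only the psize/lsize selection mirrors the description above, and the sizes are made up.

```python
from dataclasses import dataclass


@dataclass
class Buf:
    psize: int        # physical (compressed) size
    lsize: int        # logical (uncompressed) size
    compressed: bool  # models ARC_BUF_COMPRESSED(buf)


def arc_buf_size(buf):
    """psize for compressed buffers, lsize otherwise."""
    return buf.psize if buf.compressed else buf.lsize


def loan_and_return(buf, loan_size):
    """Net change of the loaned-bytes counter after a loan and a return."""
    loaned = 0
    loaned += loan_size          # models arc_loan_compressed_buf()
    loaned -= arc_buf_size(buf)  # models arc_return_buf()
    return loaned


# Compressed ARC disabled: the buffer is not compressed.
buf = Buf(psize=4096, lsize=16384, compressed=False)
print(loan_and_return(buf, buf.psize))          # old behavior: counter drifts
print(loan_and_return(buf, arc_buf_size(buf)))  # with arc_buf_size(): balanced
```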
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: Allan Jude <allanjude@freebsd.org>
We want to be able to pass various settings during import/open of a pool,
which are not only related to rewind. Instead of adding a new policy and
duplicating a bunch of code, we should just rename rewind_policy to a more
generic term like load_policy.
For instance, we'd like to set spa->spa_import_flags from the nvlist,
rather than from a flags parameter passed to spa_import as in some cases we want
those flags not only for the import case, but also for the open case. One
such flag could be ZFS_IMPORT_MISSING_LOG (as used in zdb) which would
allow zfs to open a pool when logs are missing.
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
ztest failed with uncorrectable IO error despite having the fix for #7163.
Both sides of the mirror have CANT_OPEN_BAD_LABEL, which also distinguishes
it from that issue.
Definitely seems like a race condition between vdev_validate and spa_sync:
1. Thread A (spa_sync): vdev label is updated to latest txg
2. Thread B (vdev_validate): vdev label's txg is compared to spa_last_synced_txg and is ahead.
3. Thread A (spa_sync): spa_last_synced_txg is updated to latest txg.
Solution: do not check txg in vdev_validate unless config lock is held.
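The interleaving above can be replayed deterministically in a small model. This is a Python sketch with hypothetical names (validate, run_interleaving) and made-up txg numbers; it only mirrors the three steps and the stated solution, not the actual ZFS code.

```python
def validate(label_txg, last_synced_txg, config_lock_held, fixed):
    """Return True when the vdev label is accepted."""
    if fixed and not config_lock_held:
        return True  # skip the txg check when the config lock is not held
    return label_txg <= last_synced_txg


def run_interleaving(fixed):
    """Replay steps 1-3 with thread B running without the config lock."""
    last_synced_txg = 10
    label_txg = 10
    label_txg = 11                                            # 1. thread A updates the label
    ok = validate(label_txg, last_synced_txg, False, fixed)   # 2. thread B validates
    last_synced_txg = 11                                      # 3. thread A updates the synced txg
    return ok


print(run_interleaving(fixed=False))  # label looks "ahead" and is rejected
print(run_interleaving(fixed=True))   # check skipped without the lock
```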
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matt Ahrens <matthew.ahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
The idea of Storage Pool Checkpoint (aka zpool checkpoint) deals with
exactly that. It can be thought of as a “pool-wide snapshot” (or a
variation of extreme rewind that doesn’t corrupt your data). It remembers
the entire state of the pool at the point that it was taken and the user
can revert back to it later or discard it. Its generic use case is an
administrator that is about to perform a set of destructive actions to ZFS
as part of a critical procedure. She takes a checkpoint of the pool before
performing the actions, then rewinds back to it if one of them fails or puts
the pool into an unexpected state. Otherwise, she discards it. With the
assumption that no one else is making modifications to ZFS, she basically
wraps all these actions into a “high-level transaction”.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>