Kristof Provost [Sun, 25 Feb 2018 08:56:44 +0000 (08:56 +0000)]
pf: Cope with overly large net.pf.states_hashsize
If the user configures a states_hashsize or source_nodes_hashsize value we may
not have enough memory to allocate this. This used to lock up pf, because these
allocations used M_WAITOK.
Cope with this by attempting the allocation with M_NOWAIT and falling back to
the default sizes (with M_WAITOK) if these fail.
Kyle Evans [Sun, 25 Feb 2018 05:14:06 +0000 (05:14 +0000)]
lualoader: Explain deviation from naming guidelines
cli_execute is likely the only exception that we should make, due to it
being a global. We don't really need other globals, so this won't really end
up an epidemic.
Kyle Evans [Sun, 25 Feb 2018 05:00:54 +0000 (05:00 +0000)]
lualoader: Pull autoboot handling out into menu.run()
There's no reason for autoboot handling to be mixed in with menu processing.
It is a distinct process that should only be done once when entering the
menu system.
menu.process has been modified to take an initial keypress to process and to
only draw the screen initially if it's been invalidated. The keypress is
kind of a kludge, although it could be argued to be a potentially useful
kludge if there are other processes that may need to feed a keypress into
the menu system.
Kyle Evans [Sun, 25 Feb 2018 04:44:45 +0000 (04:44 +0000)]
lualoader: Pull menu redrawing specifics out of menu.process
In general, every menu redraw is going to require a screen clear and cursor
reset. Each redraw also has the potential to invalidate the alias table, so
we move the alias table being used out into a module variable. This allows
third party consumers to also inspect or update the alias table if they need
to.
While here, stop searching the alias table once we've found a match.
Kyle Evans [Sun, 25 Feb 2018 04:11:08 +0000 (04:11 +0000)]
lualoader: Clean up menu handling a little bit
This is driven by an urge to separate out the bits that really only need to
happen when the menu system starts up. Key points:
- menu.process now does the bulk of menu handling. It retains autoboot
handling for dubious reasons, and it no longer accepts a 'nil' menu to
process as 'the default'. Its return value is insignificant.
- The MENU_SUBMENU handler now returns nothing. If menu.process has exited,
then we continue processing menu items on the parent menu as expected.
- menu.run is now the entry point of the menu system. It checks whether the
menu should be skipped, processes the default menu, then returns.
Mark Johnston [Sat, 24 Feb 2018 20:47:22 +0000 (20:47 +0000)]
Restore the pre-r329882 inactive page shortage computation.
With r329882, in the absence of a free page shortage we would only take
len(PQ_INACTIVE)+len(PQ_LAUNDRY) into account when deciding whether to
aggressively scan PQ_ACTIVE. Previously we would also include the
number of free pages in this computation, ensuring that we wouldn't scan
PQ_ACTIVE with plenty of free memory available. The change in behaviour
was most noticeable immediately after booting, when PQ_INACTIVE and
PQ_LAUNDRY are nearly empty.
Kyle Evans [Sat, 24 Feb 2018 20:07:39 +0000 (20:07 +0000)]
lualoader: throw out nextboot's usage of standard config processing
It should use the common parser, but it should not be processed like a
standard file. Rewite check_nextboot to read the file in, check whether it
should continue, then parse as needed.
This allows us to throw the recently introduced check_and_halt callback
swiftly out the window.
Kyle Evans [Sat, 24 Feb 2018 20:00:31 +0000 (20:00 +0000)]
lualoader: Strip config.parse of its I/O privileges
config.parse is now purely a parser, rather than a whole proccessor. The
standard process for loading a config file has been split out into
config.processFile.
This clears the way for having nextboot read its own config file and decide
there whether it should parse the rest of the file.
Kyle Evans [Sat, 24 Feb 2018 19:51:18 +0000 (19:51 +0000)]
lualoader: Split config file I/O out into a separate function
This is step 1 towards revoking config.parse of it I/O privileges. Ideally,
all reading would be done before config.parse and config.parse would just
take text and parse it rather than being charged with the entire process.
Devin Teske [Sat, 24 Feb 2018 17:13:15 +0000 (17:13 +0000)]
Updates and enhancements to io.d to aid DTrace scripting
+ Add dev_type do devinfo_t
+ Add b_cmd to bufinfo_t
+ Add constants for BIO_* and DEVSTAT_TYPE_*
+ Add inline for converting BIO_* int to string
+ Add inline for converting DEVSTAT_TYPE_* int to string
+ Add mask for dev_type & DEVSTAT_TYPE_MASK to string
+ Add mask for dev_type & DEVSTAT_TYPE_IF_MASK to string
Reviewed by: markj
Sponsored by: Smule, Inc.
Differential Revision: https://reviews.freebsd.org/D14396
Andrew Turner [Sat, 24 Feb 2018 10:33:31 +0000 (10:33 +0000)]
Correctly set the 16kB page size field in the ITS BASER register. Some
new arm64 hardware, e.g. ThunderX2, seems to use this page size so was
failing to attach as the register value read back was incorrect.
Hide all vm/vm_pageout.h content under #ifdef _KERNEL.
There are no parts useful for usermode applications in
vm/vm_pageout.h. Even for the specific applications like fstat and
lsof.
In my opinion, this protection is redundant and instead userspace
should not include the header at all. Since there are apparently
broken third party codebases, give them a bit of slack by providing
transitional period.
Reported by: julian
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Fix race when detach is called right after attach in if_axge, that the
network device pointer might be NULL. Wait for any pending tasks to
complete before calling axge_stop().
Kyle Evans [Sat, 24 Feb 2018 03:47:04 +0000 (03:47 +0000)]
lualoader: Add comment on trailing space, don't operate on nil
Functionally, the latter error wouldn't necessarily hurt anything. io.write
will just error out as it's not passed a valid file handle. Still, we can do
better than that.
Kyle Evans [Sat, 24 Feb 2018 03:43:10 +0000 (03:43 +0000)]
lualoader: Correct test and name
The functionality was correct, but our style guidelines tend to request that
we shy away from using boolean operations in place of explicit comparisons
to nil.
Kyle Evans [Sat, 24 Feb 2018 03:35:35 +0000 (03:35 +0000)]
lualoader: Add nextboot support
config.parse now takes an extra callback that is invoked on the full text of
the config file. This callback dictates where we should actually try to
parse this file or not.
For nextboot, we use this to halt parsing if we see 'nextboot_enable="NO"'.
If we don't, parse it and write 'nextboot_enable="NO" ' to it. The same
caveat as with forth still applies- writing is only supported by UFS.
Justin Hibbits [Sat, 24 Feb 2018 01:46:56 +0000 (01:46 +0000)]
Remove platform_cpu_idle() and platform_cpu_idle_wakeup() interfaces
These interfaces were put in place to let QorIQ SoCs dictate CPU idling
semantics, in order to support capabilities such as NAP mode and deep sleep.
However, this never stabilized, and the idling support reverted back to
CPU-level rather than SoC level. Move this code back to cpu.c instead. If
at a later date the lower power modes do come to fruition, it should be done
by overriding the cpu_idle_hook instead of this platform hook.
Jung-uk Kim [Sat, 24 Feb 2018 01:24:57 +0000 (01:24 +0000)]
Partially revert r197863 to reduce diff against i386.
When I wrote the patch, I wanted to remove SYSINIT() usage from amd64 code.
There is no reason to keep the divergence any more because iwasaki merged
most amd64 suspend/resume code to i386 with r235622. Note this also fixed
an enge case reported by royger. [1]
Jeff Roberson [Fri, 23 Feb 2018 22:51:51 +0000 (22:51 +0000)]
Add a generic Proportional Integral Derivative (PID) controller algorithm and
use it to regulate page daemon output.
This provides much smoother and more responsive page daemon output, anticipating
demand and avoiding pageout stalls by increasing the number of pages to match
the workload. This is a reimplementation of work done by myself and mlaier at
Isilon.
Conrad Meyer [Fri, 23 Feb 2018 20:15:19 +0000 (20:15 +0000)]
Remove unused error return from API that cannot fail
No implementation of fpu_kern_enter() can fail, and it was causing needless
error checking boilerplate and confusion. Change the return code to void to
match reality.
(This trivial change took nine days to land because of the commit hook on
sys/dev/random. Please consider removing the hook or otherwise lowering the
bar -- secteam never seems to have free time to review patches.)
Alan Somers [Fri, 23 Feb 2018 16:31:00 +0000 (16:31 +0000)]
Add the ZFS test suite
It was originally written by Sun as part of the STF (Solaris test framework).
They open sourced it in OpenSolaris, then HighCloud partially ported it to
FreeBSD, and Spectra Logic finished the port. We also added many testcases,
fixed many broken ones, and converted them all to the ATF framework. We've had
help along the way from avg, araujo, smh, and brd.
By default most of the tests are disabled. Set the disks Kyua variable to
enable them.
Warner Losh [Fri, 23 Feb 2018 04:04:25 +0000 (04:04 +0000)]
Do not include float interfaces when using libsa.
We don't support float in the boot loaders, so don't include
interfaces for float or double in systems headers. In addition, take
the unusual step of spiking double and float to prevent any more
accidental seepage.
Warner Losh [Fri, 23 Feb 2018 04:04:03 +0000 (04:04 +0000)]
Centralize lua defines
We need to ensure that we defined Numbers as int64_t everywhere we
build for lua. Previously, we were compiling part of the code with
Numbers as int64_t and part as double. Move lua number definition to a
more-central location
Kyle Evans [Fri, 23 Feb 2018 04:03:07 +0000 (04:03 +0000)]
lualoader: Use "local function x()" instead of "local x = function()"
The latter is good, but the former is more elegant and clear about what 'x'
is. Adopt it, preferably only using the latter kind of notation where needed
as values for tables.
1. Added support to offline a port if is error recovery on successful.
2. Sysctls to enable/disable driver_state_dump and error_recovery.
3. Sysctl to control the delay between hw/fw reinitialization and
restarting the fastpath.
4. Stop periodic stats retrieval if interface has IFF_DRV_RUNNING flag off.
5. Print contents of PEG_HALT_STATUS1 and PEG_HALT_STATUS2 on heartbeat
failure.
6. Speed up slowpath shutdown during error recovery.
7. link_state update using atomic_store.
8. Added timestamp information on driver state and minidump captures.
9. Added support for Slowpath event logging
10.Added additional failure injection types to simulate failures.
Alan Somers [Fri, 23 Feb 2018 03:11:43 +0000 (03:11 +0000)]
libifconfig: multiple feature additions
Added the ability to:
* Create virtual interfaces
* Create vlan interfaces
* Get interface fib
* Get interface groups
* Get interface status
* Get nd6 info
* Get media status
* Get additional ifaddr info in a convenient struct
* Get vhids
* Get carp info
* Get lagg and laggport status
* Iterate over all interfaces and ifaddrs
And add more examples, too.
Note that this is a backwards-incompatible change. But that's ok, because it's
a private library.
Kyle Evans [Fri, 23 Feb 2018 02:53:50 +0000 (02:53 +0000)]
Add copyright notice to core.lua
I've also made some not-insignificant changes/additions to this file, to
include the added constants, ACPI changes, boot environment listing, and
some utility functions.
Pedro F. Giffuni [Fri, 23 Feb 2018 00:28:00 +0000 (00:28 +0000)]
getpeereid(3): Fix behavior on failure to match documentation.
According to the getpeereid(3) documentation, on failure the value -1 is
returned and the global variable errno is set to indicate the error. We
were returning the error instead.
Obtained from: Apple's Libc-1244.30.3
MFC after: 5 days
Don Lewis [Fri, 23 Feb 2018 00:12:51 +0000 (00:12 +0000)]
Decrease latency by not wrapping the idle loop's potentially lengthy
search for a thread to steal inside a critical section. Since this
allows the search to be preempted, restart the search if preemption
happens since the search results found earlier may no longer be
valid.
Decrease the latency of starting a thread that may be assigned to
this CPU during the search by polling for incoming threads during
the search and switching to that thread instead of continuing the
search.
Test for stale search results and restart the search before going
through the expense of calling tdq_lock_pair(). Retry some tests
after grabbing the locks since things may have changed while waiting
to get both locks.
Eliminate special case handling for stealing from an SMT peer that
uses 1 as the steal threshold. This can only succeed if a thread
has been assigned but our SMT peer has not yet started executing
it. This is quite rare and when it happens the other SMT thread
is generally waiting for the same tdq lock that we hold. Basically
both SMT threads are racing to grab the same spin lock.
Add the kern.sched.always_steal knob from a ULE patch by jeff@.
Incorporate another idea from Jeff's ULE patch. If the sched_switch()
detects that the CPU is about to go idle, try to steal a thread
before switching to the idle thread. Since the search for a thread
to steal has to be done inside a critical section in this context,
limit the impact on latency by adding the knob kern.sched.trysteal_limit
to limit the topological distance of the search and don't restart
the search if we detect stale results. If this search can't find
an stealable thread, the idle loop can do a more complete search.
Also poll for threads being assigned to this CPU during the search
and switch to them instead of continuing the search. This change
is responsibile for the majority of the improvement in parallel
buildworld times.
In sched_balance_group() change the minimum threshold from stealing
a thread from 1 to 2. Poaching a newly assigned thread from a CPU
that is waking up hasn't yet switched to that thread from idle is
likely very rare and is likely to have the same lock race as is
seen when stealing threads in the idle loop. Also use tdq_notify()
to kick the destintation CPU instead of always sending an IPI.
Update a stale comment, the number of transferable threads is not
calculated.
Ravi Pokala [Thu, 22 Feb 2018 23:18:46 +0000 (23:18 +0000)]
jedec_dimm(4): report asset info and temperatures for DDR3 and DDR4 DIMMs
A super-set of the functionality of jedec_ts(4). jedec_dimm(4) reports asset
information (Part Number, Serial Number) encoded in the "Serial Presence
Detect" (SPD) data on JEDEC DDR3 and DDR4 DIMMs. It also calculates and
reports the memory capacity of the DIMM, in megabytes. If the DIMM includes
a "Thermal Sensor On DIMM" (TSOD), the temperature is also reported.
Reviewed by: cem
MFC after: 1 week
Relnotes: yes
Sponsored by: Panasas
Differential Revision: https://reviews.freebsd.org/D14392
Discussed with: avg, cem
Tested by: avg, cem (previous version, no semantic changes)
Kyle Evans [Thu, 22 Feb 2018 20:10:23 +0000 (20:10 +0000)]
lualoader: Attend to some 80-col issues, pointed out by luacheck
Graphics have a tendency to cause 80-col issues, so make an exception to our
standard indentation guidelines for these graphics. This does not hamper
readability too badly.
Two 40-column strings of spaces is trivially replaced with
string.rep(" ", 80)
[chvgpio] add GPIO driver for Intel Z8xxx SoC family
Add chvgpio(4) driver for Intel Z8xxx SoC family. This product
was formerly known as Cherry Trail but Linux and OpenBSD drivers
refer to it as Cherry View. This driver is derived from OpenBSD
one so the name is kept for alignment with another BSD system.
Submitted by: Tom Jones <tj@enoti.me>
Reviewed by: gonzo, wblock(man page)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D13086
Kyle Evans [Thu, 22 Feb 2018 18:49:53 +0000 (18:49 +0000)]
Fix userboot w/ ZFS after r329725
r329725 cleaned up ZFS commands duplicated in multiple places, but userboot
was not setting HAVE_ZFS when MK_ZFS != "no". This resulted in a failure to
boot (as seen in PR 226118) in bhyve, with the following message:
/boot/userboot.so: Undefined symbol "ldi_get_size"
Return correct error code to user-space when a system call receives a
signal in the LinuxKPI.
The read(), write() and mmap() system calls can return either EINTR or
ERESTART upon receiving a signal. Add code to figure out the correct
return value by temporarily storing the return code from the relevant
FreeBSD kernel APIs in the Linux task structure.
MFC after: 3 days
Sponsored by: Mellanox Technologies
Andriy Gapon [Thu, 22 Feb 2018 13:06:27 +0000 (13:06 +0000)]
another rework of getzfsvfs / getzfsvfs_impl code
This change is designed to account for yet another difference between
illumos and FreeBSD VFS. In FreeBSD a filesystem driver is supposed to
clean up mnt_data in its VFS_UNMOUNT method because it's the last call
into the driver before a struct mount object is destroyed. The VFS
drains all references to the object before destroying it, but for the
driver it's already as good as gone.
In contrast, illumos VFS provides another method, VFS_FREEVFS, that is
called when all references are drained. So, the driver can keep its
data after VFS_UNMOUNT and clean it up in VFS_FREEVFS after all
references are gone. This is what ZFS does on illumos.
So there a reference to a filesystem is sufficient to guarantee that the
ZFS specific data, aka zfsvfs_t, stays around (even if the filesystem
gets unmounted). In FreeBSD we need to vfs_busy the filesystem to get
the same guarantee. vfs_ref guarantees only that the struct mount is
kept.
The following rules should be observed in getzfsvfs / getzfsvfs_impl on
FreeBSD:
- if we need access to zfsvfs_t then we must use vfs_busy
- if only we need to access struct mount (aka vfs_t), then vfs_ref is
enough
- when illumos code actually needs only the vfs_t, they still can pass
the zfsvfs_t and get the vfs_t from it; that can work in FreeBSD if
the filesystem is busied, but when it's just referenced then we have
to pass the vfs_t explicitly
- we cannot call vfs_busy while holding a dataset because that creates a
LOR with dp_config_rwlock
As a result:
- getzfsvfs_impl now only references the filesystem, same as in illumos,
but unlike illumos it has to return the vfs_t
- the consumers are updated to account for the change
- getzfsvfs busies the filesystem (and drops the reference from
getzfsvfs_impl)
Also, zfs_unmount_snap() now gets a busied a filesystem, references it
and then unbusies it essentially reverting actions done in getzfsvfs.
This is needed because the code may perform some checks that require the
zfsvfs_t. So, those are done before the unbusying.
Andriy Gapon [Thu, 22 Feb 2018 11:41:00 +0000 (11:41 +0000)]
followup to r329556, completely remove the covered vnode assert
vrele() acquires the vnode lock only if the hold count drops to zero.
In other scenarios it needs only the interlock. So,
zfsctl_snapdir_lookup() can race with vfs_mount_destroy() -> vrele()
such that the lookup adds a new reference and then vrele() drops the
mountpoint's reference and only then we check the reference count.
It would be just one in this case.
In fact, the assert should have been removed in r323483 when the code
learned how to deal with the uncovered vnode.
Marcelo Araujo [Thu, 22 Feb 2018 08:25:39 +0000 (08:25 +0000)]
The firewall_type is ignored if not set in rc.conf or rc.conf.local,
after r190575 there is an option to call rc.firewall with the firewall_type
passed in as an argument.
Submitted by: David P. Discher <dpd@dpdtech.com>
MFC after: 3 weeks.
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D14286