We need to subtract the TLS_TCB_SIZE to get to the real data pointer, since
r13 points to the end of the TCB structure. Prior to this, devel/protobuf-c
port broke with recent update to devel/protobuf, which exposed this issue.
cem [Tue, 1 Jan 2019 19:56:49 +0000 (19:56 +0000)]
linuxkpi: Remove extraneous NULL check on M_WAITOK allocation
The check was not introduced in r342628, but the subsequent unchecked access to
refs was added then, prompting a Coverity warning about "Null pointer
dereferences (FORWARD_NULL)." The warning is bogus due to M_WAITOK, but so is
the NULL check that hints it, so just remove it.
vmaffione [Mon, 31 Dec 2018 11:17:58 +0000 (11:17 +0000)]
netmap: add suite of unit tests
Import the unit tests from upstream (https://github.com/luigirizzo/netmap ba02539859d46d33), and make them ready for use with Kyua.
There are currently 38 regression tests, which test the kernel control ABI
exposed by netmap to userspace applications:
1: test for port info get
2-5: tests for basic port registration
6-9: tests for VALE
10-11: tests for getting netmap allocator info
12-15: tests for netmap pipes
16: test on polling mode
17-18: tests on options
19-27: tests for sync-kloop subsystem
28-30: tests for null ports
31-38: tests for the legacy NIOCREGIF registers
ian [Mon, 31 Dec 2018 01:09:23 +0000 (01:09 +0000)]
When allocating a new keyboard at vt_upgrade() time, unwind any cngrabs
done on the old keyboard and then do the corresponding number of grabs
on the new keyboard.
This fixes a race that can leave the system with a non-functioning
keyboard. It goes like this...
- The bios claims there is an AT keyboard, atkbd attaches.
- SI_SUB_INT_CONFIG_HOOKS runs.
- USB probes devices. Devices begin attaching, including disks.
- GELI prompts for a password for a just-attached disk, which results
in a cngrab() while atkbd is the keyboard.
- A USB keyboard attaches.
- vt_upgrade() runs and switches the keyboard to the new USB keyboard,
but because cngrab was never called for it, it's not activated and
keystrokes are ignored.
- Now there is no functional keyboard and no way to get one; even
plugging in a different USB keyboard doesn't help, because the console
is still grabbed, still waiting for a GELI pw.
bcran [Mon, 31 Dec 2018 00:20:58 +0000 (00:20 +0000)]
Fix ESP generation when using a gmirror, and when booting from RO medium
When using a gmirror, entries in /dev can be removed. So instead of using
kern.disks, get the list of disks from "gpart status -sg".
We assume that any 'efi' partition that can't be mounted as msdosfs should
be used as an ESP. However, the ESP on the CD/DVD can't be mounted read-write
and so was being treated as if unformatted. Try the mount as read-only
instead, to catch cases like this.
marius [Sun, 30 Dec 2018 23:08:06 +0000 (23:08 +0000)]
o Don't allocate resources for SDMA in sdhci(4) if the controller or the
front-end doesn't support SDMA or the latter implements a platform-
specific transfer method instead. While at it, factor out allocation
and freeing of SDMA resources to sdhci_dma_{alloc,free}() in order to
keep the code more readable when adding support for ADMA variants.
o Base the size of the SDMA bounce buffer on MAXPHYS up to the maximum
of 512 KiB instead of using a fixed 4-KiB-buffer. With the default
MAXPHYS of 128 KiB and depending on the controller and medium, this
reduces the number of SDHCI interrupts by a factor of ~16 to ~32 on
sequential reads while an increase of throughput of up to ~84 % was
seen.
Front-ends for broken controllers that only support an SDMA buffer
boundary of a specific size may set SDHCI_QUIRK_BROKEN_SDMA_BOUNDARY
and supply a size via struct sdhci_slot. According to Linux, only
Qualcomm MSM-type SDHCI controllers are affected by this, though.
Requested by: Shreyank Amartya (unconditional bump to 512 KiB)
o Introduce a SDHCI_DEPEND macro for specifying the dependency of the
front-end modules on the sdhci(4) one and bump the module version
of sdhci(4) to 2 via an also newly introduced SDHCI_VERSION in order
to ensure that all components are in sync WRT struct sdhci_slot.
o In sdhci(4):
- Make pointers const where applicable,
- replace a few device_printf(9) calls with slot_printf() for
consistency, and
- sync some local functions with their prototypes WRT static.
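The bounce buffer sizing described above can be sketched as follows. This is
only an illustration of the arithmetic, not the driver code: the names are
hypothetical, and the sketch assumes one SDMA boundary interrupt per buffer
filled, which is where the factor of 32 for a 128 KiB transfer comes from.

```c
#include <stddef.h>

/* Hypothetical constants for illustration; not the driver's real names. */
#define SKETCH_SDMA_MAX_BUF	(512 * 1024)	/* upper bound from the text */
#define SKETCH_SDMA_OLD_BUF	(4 * 1024)	/* previous fixed buffer size */

/* Bounce buffer size: follow MAXPHYS, but never exceed 512 KiB. */
size_t
sketch_sdma_buf_size(size_t maxphys)
{
	return (maxphys < SKETCH_SDMA_MAX_BUF ? maxphys : SKETCH_SDMA_MAX_BUF);
}

/* Buffers (and, by assumption, boundary interrupts) needed for len bytes. */
size_t
sketch_boundary_irqs(size_t len, size_t bufsize)
{
	return ((len + bufsize - 1) / bufsize);
}
```

With the default MAXPHYS of 128 KiB, a full-sized transfer needs 32 of the old
4 KiB buffers but only one buffer of the new size.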
delphij [Sun, 30 Dec 2018 23:04:02 +0000 (23:04 +0000)]
Fix various issues with Chinese locales:
- Change short weekday names to use only one Chinese character.
(Note: this is a somewhat unfortunate compromise, because some
applications use short buffers for weekday names. The 星期 system was
created in ~1905 to replace the traditional 七曜 system, which can use
日月火水木金土 to represent Sunday through Saturday with just one
character each, without any confusion.)
- for zh_CN locales: use Arabic numerals for month names, matching the
practice of all other CJK locales
- Regenerate zh_CN.{GB2312,GBK} locales from zh_CN.UTF-8.
kib [Sun, 30 Dec 2018 15:46:45 +0000 (15:46 +0000)]
Fix linux_destroy_dev() behaviour when there are still files open from
the destroying cdev.
Currently linux_destroy_dev() waits for the reference count on the
linux cdev to drain, and each open file holds a reference.
Practically it means that linux_destroy_dev() is blocked until all
userspace processes that have the cdev open, exit. FreeBSD devfs does
not have such problem, because device refcount only prevents freeing
of the cdev memory, and separate 'active methods' counter blocks
destroy_dev() until all threads leave the cdevsw methods. After that,
attempts to enter cdevsw methods are refused with an error.
Implement a somewhat similar mechanism for LinuxKPI cdevs. Demote the cdev
refcount to only mean a hold on the linux cdev memory. Add a sirefs
count that both tracks the number of threads inside the cdev methods and
serves as a single-bit indicator that the cdev is being destroyed. In the
latter case, the call is redirected to the dummy cdev.
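A minimal userland sketch of the idea, using C11 atomics rather than the
kernel primitives: one integer carries both the count of threads inside the
methods and a single "being destroyed" bit. All names here are hypothetical
and the actual LinuxKPI code may be organized differently.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_DESTROYING	0x80000000u	/* single-bit destroy indicator */

struct sketch_cdev {
	atomic_uint sirefs;	/* low bits: threads inside cdev methods */
};

/* Try to enter a cdev method; refuse once destruction has started. */
bool
sketch_enter(struct sketch_cdev *cd)
{
	unsigned int old = atomic_load(&cd->sirefs);

	do {
		if (old & SKETCH_DESTROYING)
			return (false);	/* caller would use a dummy cdev */
	} while (!atomic_compare_exchange_weak(&cd->sirefs, &old, old + 1));
	return (true);
}

/* Leave a cdev method entered via sketch_enter(). */
void
sketch_leave(struct sketch_cdev *cd)
{
	atomic_fetch_sub(&cd->sirefs, 1);
}

/* Set the destroy bit; returns how many threads are still inside. */
unsigned int
sketch_begin_destroy(struct sketch_cdev *cd)
{
	unsigned int old = atomic_fetch_or(&cd->sirefs, SKETCH_DESTROYING);

	return (old & ~SKETCH_DESTROYING);
}
```

A destroyer would set the bit and then wait for the remaining count to reach
zero; the waiting itself is left out of this sketch.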
tsoome [Sun, 30 Dec 2018 09:35:47 +0000 (09:35 +0000)]
loader: create bio_alloc and bio_free for bios bounce buffer
We do have 16KB buffer space defined in pxe.c, move it to bio.c and implement
bio_alloc()/bio_free() interface to make it possible to use this space for
other BIOS calls (notably, from biosdisk.c).
mckusick [Sun, 30 Dec 2018 05:03:41 +0000 (05:03 +0000)]
For consistency with FFS2's fifoops2 and both versions of FFS's
vnodeops make FFS1's fifoops1 use ffs_lock. Also delete ffs_reallocblks
from fifoops1 which is needed only for fifoops2 because of its
support for extended attributes that need to allocate blocks.
cy [Sun, 30 Dec 2018 04:25:48 +0000 (04:25 +0000)]
TCP_PAWS_IDLE does not exist in NetBSD and illumos. In FreeBSD
TCP_PAWS_IDLE is defined in netinet/tcp_seq.h; however, this header
isn't included explicitly or implicitly at this point, therefore
as far as ipfilter is concerned TCP_PAWS_IDLE is not defined. Remove
the #ifdef and include netinet/tcp.h unconditionally.
cem [Sat, 29 Dec 2018 21:18:01 +0000 (21:18 +0000)]
Update to Zstandard 1.3.8
This merge brings in a couple new files, which needed to be attached to the
build; a new dependency on <limits.h>, which must be stubbed; and a name
change in the Context parameter constants, from ZSTD_p_foo to ZSTD_c_foo.
Significantly, it fixes a kernel build error with GCC where floating-point
functions were included in the kernel build, by hiding them under the same
compile-time #ifdef that already covered their invocation. That issue was
introduced to FreeBSD in the 1.3.7 update and tracked upstream here:
https://github.com/facebook/zstd/issues/1386
The full 1.3.8 release notes can be found on Github:
ngie [Sat, 29 Dec 2018 20:02:20 +0000 (20:02 +0000)]
Remove legacy rc.d infrastructure references from rc(8)
Legacy rc.d scripts (.sh extension) have not been supported since
r193118. Remove the outdated references to the legacy format, as they
are no longer valid.
kib [Sat, 29 Dec 2018 15:55:44 +0000 (15:55 +0000)]
For hw.{physmem,realmem,usermem} MIBs, clamp instead of truncating.
If the memory size does not fit into u_long, current code truncates
the returned value and returns complete nonsense. Make the result
slightly more useful by clamping it at ULONG_MAX.
Reported and tested by: pho
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
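The difference between truncating and clamping can be illustrated with a
small sketch. It uses uint32_t as a stand-in for a 32-bit u_long so that it
behaves the same on every platform; the function names are hypothetical and
this is not the sysctl handler code.

```c
#include <stdint.h>

/* Truncation keeps only the low 32 bits, so large values wrap around. */
uint32_t
sketch_truncate32(uint64_t v)
{
	return ((uint32_t)v);
}

/* Clamping saturates at the largest representable value instead. */
uint32_t
sketch_clamp32(uint64_t v)
{
	return (v > UINT32_MAX ? UINT32_MAX : (uint32_t)v);
}
```

For 5 GiB of memory, truncation reports 1 GiB, while clamping reports the
maximum value, which is at least a sensible lower bound.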
This fixes 'Assertion failed: ((VT.getVectorNumElements() +
N2C->getZExtValue() <= N1.getValueType().getVectorNumElements()) &&
"Extract subvector overflow!"), function getNode' when building the
multimedia/aom port (with AVX2 enabled).
emaste [Fri, 28 Dec 2018 22:47:55 +0000 (22:47 +0000)]
ar: detect and error out on 32-bit symbol table overflow
BSD ar currently does not support the /SYM64/ 64-bit symbol table, and
previously truncated to 32-bits, silently producing corrupted archives
larger than 4GB.
This is another overflow case in addition to r342575.
PR: 234454
Reported by: Aijaz Baig, imp
MFC after: 2 weeks
MFC with: r342575
Sponsored by: The FreeBSD Foundation
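The "detect and error out" policy can be sketched as a range check before a
value is stored in a 32-bit field. The helper name and the choice of
EOVERFLOW are assumptions made for illustration; ar itself may report the
error differently.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Store off in a 32-bit field, failing instead of silently truncating
 * when the value does not fit.
 */
int
sketch_store_offset32(uint64_t off, uint32_t *out)
{
	if (off > UINT32_MAX)
		return (EOVERFLOW);
	*out = (uint32_t)off;
	return (0);
}
```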
0mp [Fri, 28 Dec 2018 19:49:58 +0000 (19:49 +0000)]
Add a style.mdoc(5) manual page.
The aim of this manual page is to act as a style and formatting guide for
mdoc(7) manual pages. Currently, mdoc(7) does not provide much guidance
when it comes to the usage of macros making it difficult to format manual
pages in a consistent way.
emaste [Fri, 28 Dec 2018 17:00:12 +0000 (17:00 +0000)]
ar: detect and error out on 32-bit symbol table overflow
BSD ar currently does not support the /SYM64/ 64-bit symbol table, and
previously truncated to 32-bits, silently producing corrupted archives
larger than 4GB.
Note that this is only a partial fix; additional checks will come.
PR: 234454
Reported by: Aijaz Baig, imp
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
crees [Fri, 28 Dec 2018 15:11:22 +0000 (15:11 +0000)]
There is no way of escaping literal $ signs in auto_master(5), which
makes for difficulty with hidden Samba shares; shares with $ at the end
of their name. This enables the use of ${DOLLAR} to work around this.
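As an illustration of the effect only, the following sketch replaces each
${DOLLAR} in a string with a literal $. The function name and the share path
used in the note below are hypothetical; autofs may well implement this
through its general variable expansion rather than a dedicated routine.

```c
#include <string.h>

/*
 * Copy src into dst (capacity dstlen), replacing each "${DOLLAR}" with "$".
 * Returns 0 on success, -1 if dst is too small.
 */
int
sketch_expand_dollar(const char *src, char *dst, size_t dstlen)
{
	static const char var[] = "${DOLLAR}";
	const size_t varlen = sizeof(var) - 1;
	size_t n = 0;

	if (dstlen == 0)
		return (-1);
	while (*src != '\0') {
		if (n + 1 >= dstlen)	/* keep room for the terminator */
			return (-1);
		if (strncmp(src, var, varlen) == 0) {
			dst[n++] = '$';
			src += varlen;
		} else {
			dst[n++] = *src++;
		}
	}
	dst[n] = '\0';
	return (0);
}
```

Given "//server/share${DOLLAR}", the sketch produces "//server/share$".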
jilles [Fri, 28 Dec 2018 13:32:14 +0000 (13:32 +0000)]
pfind, pfind_any: Correct zombie logic
SVN r340744 erroneously changed pfind() to return any process including
zombies and pfind_any() to return only non-zombie processes.
In particular, this caused kill() on a zombie process to fail with [ESRCH].
There is no direct test case for this but /usr/tests/bin/sh/builtins/kill1.0
occasionally triggers it (as reported by lwhsu).
Conversely, returning zombies from pfind() seems likely to violate
invariants and cause panics, but I have not looked at this.
jhibbits [Fri, 28 Dec 2018 01:34:08 +0000 (01:34 +0000)]
libm: Include float.h to get LDBL_MANT_DIG
The long double aliases of double functions are only exposed as aliases if
LDBL_MANT_DIG is 53 (same as DBL_MANT_DIG). Without float.h included these
files were not exposing weak aliases as expected, leading to link failures
if programs use the *l functions. This should fix editors/calligra on
targets with 64-bit long double, which uses erfl and erfcl. Found on
powerpc64.
will [Thu, 27 Dec 2018 23:27:48 +0000 (23:27 +0000)]
beinstall: try to save progress from pkg updates.
This is primarily aimed at failed updates due to package conflicts, and
affects treatment of failed updates. Whereas before potentially a large
number of packages would need to be synced for each attempt, they can now
be persisted. Requires rsync. There may be better ways to implement this,
e.g. using secondary cache path that is only used on followup attempts and
then wiped on success, which avoids polluting current cache.
mav [Thu, 27 Dec 2018 19:15:24 +0000 (19:15 +0000)]
Switch from mutexes to atomics in GEOM_DEV I/O path.
Mutexes in I/O path there were used twice per I/O to atomically access
several variables to close and/or destroy the device on last request
completion. I found the way to fit all required info into one integer,
suitable for atomic operations. It opened race window on device close,
but addition of timeout to the msleep() there should cover it.
Profiling shows removal of significant spinning time on those mutexes
and IOPS increase from ~600K to >800K to NVMe on 72-core systems.
mav [Thu, 27 Dec 2018 18:28:19 +0000 (18:28 +0000)]
Reimplement nvd(4) detach handling.
Previous code typically crashed in case of NVMe device unplug or even clean
detach while some I/Os are still in flight. To fix this the new code calls
disk_gone() and waits for confirmation of all references gone before calling
disk_destroy(), freeing other resources and allowing controller detach.
While there, fix disk lists locking and reimplement unit numbers assignment.
0mp [Thu, 27 Dec 2018 14:44:01 +0000 (14:44 +0000)]
iscsictl.8: Add missing flag parameters
- Add missing parameters to flags in the description of available options.
- Remove spaces between alternative parameters and "|".
- Align descriptions of options to the longest option.
- Use em dash instead of a hyphen.
andrew [Thu, 27 Dec 2018 14:14:41 +0000 (14:14 +0000)]
Pass VM_PROT_EXECUTE to vm_fault for instruction faults.
We need to tell vm_fault the reason for the fault was because we tried to
execute from the memory location. Without this it may return with success
as we only request read-only memory, then we return to the same location
and try to execute from the same memory address. This leads to an infinite
loop raising the same fault and returning to the same invalid location.
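The selection of the protection passed to the fault handler can be sketched
as below. The enum and constants are hypothetical stand-ins for the real
VM_PROT_* flags and the platform's fault classification, and the exact flag
combinations are an assumption; the point is only that an instruction fault
must ask for execute permission rather than read.

```c
/* Hypothetical fault classification and protection bits for illustration. */
enum sketch_fault_kind {
	SKETCH_FAULT_READ,
	SKETCH_FAULT_WRITE,
	SKETCH_FAULT_INSN
};

#define SKETCH_PROT_READ	0x1
#define SKETCH_PROT_WRITE	0x2
#define SKETCH_PROT_EXECUTE	0x4

/* Pick the access type to request from the fault handler. */
int
sketch_fault_prot(enum sketch_fault_kind kind)
{
	switch (kind) {
	case SKETCH_FAULT_INSN:
		return (SKETCH_PROT_EXECUTE);
	case SKETCH_FAULT_WRITE:
		return (SKETCH_PROT_READ | SKETCH_PROT_WRITE);
	case SKETCH_FAULT_READ:
	default:
		return (SKETCH_PROT_READ);
	}
}
```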
kib [Thu, 27 Dec 2018 13:02:15 +0000 (13:02 +0000)]
Bump sys_errlist size to keep ABI backward-compatible for some time.
Addition of the new errno values requires adding new elements to
sys_errlist array, which is actually ABI-incompatible, since ELF
records the object size. Expand array in advance to 150 elements so
that our users have to go over the issue only once, at least until
more than 53 new errors are added.
I did not bump the symbol version, same as it was not done for
previous increases of the array size. Runtime linker only copies as
much data into binary object on copy relocation as the binary object
specifies. This is not fixable for binaries which access sys_errlist
directly.
While there, correct comment and calculation of the temporary buffer
size for the message printed for unknown error. The on-stack buffer
is used only for the number and delimiter since r108603.
Requested by: mckusick
Reviewed by: mckusick, yuripv
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D18656
danfe [Thu, 27 Dec 2018 08:48:54 +0000 (08:48 +0000)]
Amend the `-i batt' option description and explain that the battery
is specified by its number (index), starting with zero. Previously,
sometimes users would try to literally invoke `acpiconf -i batt' in
their console and become confused as to why this did not work.
mckusick [Thu, 27 Dec 2018 07:18:53 +0000 (07:18 +0000)]
When loading an inode from disk, verify that its mode is valid.
If invalid, return EINVAL. Note that inode check-hashes greatly
reduce the chance that these errors will go undetected.
Reported by: Christopher Krah <krah@protonmail.com>
Reported as: FS-5-UFS-2: Denial Of Service in nmount-3 (ffs_read)
Reviewed by: kib
MFC after: 1 week
Sponsored by: Netflix
M sys/fs/ext2fs/ext2_vnops.c
M sys/kern/vfs_subr.c
M sys/ufs/ffs/ffs_snapshot.c
M sys/ufs/ufs/ufs_vnops.c
avg [Wed, 26 Dec 2018 11:03:14 +0000 (11:03 +0000)]
MFV r342532: 5882 Temporary pool names
Note that this commit brings only formatting changes that were done
during the final review of the illumos change, because FreeBSD got the
main changes before illumos.
https://www.illumos.org/issues/5882
This is an import of the temporary pool names functionality from ZoL:
https://github.com/zfsonlinux/zfs/commit/e2282ef57edc79cdce2a4b9b7e3333c56494a807
https://github.com/zfsonlinux/zfs/commit/26b42f3f9d03f85cc7966dc2fe4dfe9216601b0e
https://github.com/zfsonlinux/zfs/commit/2f3ec9006146844af6763d1fa4e823fd9047fd54
https://github.com/zfsonlinux/zfs/commit/00d2a8c92f614f49d23dea5d73f7ea7eb489ccf1
https://github.com/zfsonlinux/zfs/commit/83e9986f6eefdf0afc387f06407087bba3ead4e9
https://github.com/zfsonlinux/zfs/commit/023bbe6f017380f4a04c5060feb24dd8cdda9fce
It is intended to assist the creation and management of virtual machines
that have their rootfs on ZFS on hosts that also have their rootfs on
ZFS. These situations cause SPA namespace collisions when the standard
name rpool is used in both cases. The solution is either to give each
guest pool a name unique to the host, which is not always desirable, or
boot a VM environment containing an ISO image to install it, which is
cumbersome.
kadesai [Wed, 26 Dec 2018 10:47:52 +0000 (10:47 +0000)]
Problem statement:
Due to hardware errata in Aero controllers, reads to certain
fusion registers could intermittently return all zeroes.
This behavior is transient in nature and subsequent reads will return
valid value.
Fix:
For Aero controllers, reads from certain registers are retried
up to three times if the read returns zero.
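The workaround amounts to a bounded retry loop, sketched here with a
hypothetical read callback and a fake register for demonstration. The sketch
assumes "three times" means three reads in total; the driver's own helper and
its exact retry accounting are not shown in this log.

```c
#include <stdint.h>

#define SKETCH_MAX_READS	3	/* assumed: three reads in total */

typedef uint32_t (*sketch_read_fn)(void *ctx);

/* Read a register, retrying while it returns zero, up to the limit. */
uint32_t
sketch_read_retry(sketch_read_fn rd, void *ctx)
{
	uint32_t val = 0;
	int i;

	for (i = 0; i < SKETCH_MAX_READS; i++) {
		val = rd(ctx);
		if (val != 0)
			break;
	}
	return (val);
}

/* Fake register: returns zero zeros_left times, then a fixed value. */
struct sketch_fake_reg {
	int zeros_left;
	uint32_t value;
	int reads;
};

uint32_t
sketch_fake_read(void *ctx)
{
	struct sketch_fake_reg *r = ctx;

	r->reads++;
	if (r->zeros_left > 0) {
		r->zeros_left--;
		return (0);
	}
	return (r->value);
}
```

A register that really holds zero still costs three reads and returns zero,
which is the price of not being able to tell a transient zero from a real one.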
kadesai [Wed, 26 Dec 2018 10:47:08 +0000 (10:47 +0000)]
This patch will add support for 32 bit atomic request descriptor for Aero adapters.
For Aero adapters-
1. Driver will use 32 bit atomic descriptor to fire IOs and DCMDs.
2. Driver will use 64 bit request descriptor to fire IOC INIT.
3. Driver will use the 32 bit atomic descriptor only if Aero firmware supports it;
otherwise driver will use 64 bit request descriptor.
For the rest of the adapters (Ventura, Invader and Thunderbolt), driver will use 64 bit request
descriptors only.
kadesai [Wed, 26 Dec 2018 10:46:23 +0000 (10:46 +0000)]
This patch will add support for latest generation MegaRAID adapters- Aero(39xx).
Driver will throw a warning message when a Configurable secure type controller is
encountered.
kadesai [Wed, 26 Dec 2018 10:42:45 +0000 (10:42 +0000)]
On Aero/Sea A0 cards retry MPT Fusion register reads up to three times
Due to a HW erratum on the Aero/Sea A0 chipset, in secure boot mode and under heavy IO load,
a read of the MPT Fusion registers will sometimes return zero.
As a workaround, the driver will retry the MPT Fusion register
read operation up to three times upon reading a zero value from these
registers.
kadesai [Wed, 26 Dec 2018 10:40:27 +0000 (10:40 +0000)]
Added support for NVMe Task Management
The following changes were made in the driver as part of TM handling on NVMe drives.
The changes below apply only to NVMe drives, and only when the custom NVMe TM handling bit is set to zero by the IOC.
1. Issue LUN reset & Target reset TMs with the Target reset method field set to Protocol Level reset (0x3),
2. For LUN & target reset TMs, use the ControllerResetTO value provided by firmware in PCIe Device Page 0 as the timeout,
3. If LUN reset fails to terminate the IO, escalate directly to host reset instead of going for target reset TM,
4. For Abort TM, use the NVMeAbortTO value given by the IOC in Manufacturing Page 11 as the timeout,
5. Log message "PCie Host Reset failed" message up on receiving P
In the above mps_pass_thru structure, the application expects the PtrReply
buffer to contain the MPI reply followed by the sense data. So, updated driver
to copy sense data at PtrReply + sizeof(MPI2 reply) location where
application wants the driver to copy back the sense data info.
https://www.illumos.org/issues/9630
Rename and destroy are very useful operations that deserve to be in
libzfs_core. And they are not hard to implement either.
kevans [Tue, 25 Dec 2018 15:18:41 +0000 (15:18 +0000)]
bectl: use jail id as the default jail name for a boot environment
By default, bectl is setting the jail 'name' parameter to the boot
environment name, which causes an error when the boot environment name is
not a valid jail name. With the attached fix, when no name is supplied, the
default jail name will be the jail id - this is the same behavior as the
jail command.
Additionally, this commit addresses two other bugs that prevented unjailing
in scenarios where the jail name does not match the boot environment name:
1. In 'bectl_locate_jail', 'mountpoint' is used to resolve the boot
environment path, but really 'mounted' should be used. 'mountpoint' is the
path where the zfs dataset will be mounted. 'mounted' is the path where
the dataset is actually mounted.
2. in 'bectl_search_jail_paths', 'jail_getv' would fail after the first
call. Which is fine, if the boot environment you're unjailing is the next
one up. According to 'man jail_getv', it's expecting name and value
strings. 'jail_getv' is being passed an integer for the lastjid, so amend
that to use a string instead.
Test cases have been amended to reflect the bugs found.
PR: 233637
Submitted by: Rob <rob.fx907_gmail.com>
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D18607
mav [Mon, 24 Dec 2018 23:52:35 +0000 (23:52 +0000)]
Increase MTX_POOL_SLEEP_SIZE from 128 to 1024.
This value remained unchanged for 15 years, and now this bump reduces
lock spinning in GEOM and BIO layers while doing ~1.6M IOPS to 4 NVMe
on 72-core system from ~25% to ~5% at the cost of additional 28KB RAM.
While there, align struct mtx_pool fields to cache lines.
mav [Mon, 24 Dec 2018 23:28:11 +0000 (23:28 +0000)]
Remove CAM SIM lock from NVMe SIM.
CAM does not require SIM lock since FreeBSD 10.4, and NVMe code never
required it at all, using per-queue locks instead. This formally allows
parallel request submission in CAM mode as much as single per-device and
per-queue locks of CAM allow.
scottl [Mon, 24 Dec 2018 05:54:36 +0000 (05:54 +0000)]
Commands for user-initiated device resets should come from the high-priority
allocator. Prior to this change, they would leak from the normal allocator.