r335284:
audit(4): add tests for extattr_get_file(2) and friends
This commit includes extattr_{get_file, get_fd, get_link, list_file,
list_fd, list_link}. It does not include any syscalls that modify, set, or
delete extended attributes, as those are in a different audit class.
r334471:
audit(4): Add tests for the fr class of syscalls
readlink and readlinkat are the only syscalls in this class. open and
openat are as well, but they'll be handled in a different file. Also, tidy
up the copyright headers of recently added files in this area.
r334487:
audit(4): Add tests for the fw class of syscalls.
truncate and ftruncate are the only syscalls in this class, apart from
certain variations of open and openat, which will be handled in a different
file.
r334496:
audit(4): add tests for the fd audit class
The only syscalls in this class are rmdir, unlink, unlinkat, rename, and
renameat. Also, set is_exclusive for all audit(4) tests, because they can
start and stop auditd.
asomers [Tue, 2 Oct 2018 15:18:48 +0000 (15:18 +0000)]
MFC r334360, r334362, r334388, r334395
r334360:
Add initial set of tests for audit(4)
This change includes the framework for testing the auditability of various
syscalls, and includes changes for the first 12. The tests will start
auditd(8) if needed, though they'll be much faster if it's already running.
The syscalls tested in this commit include mkdir(2), mkdirat(2), mknod(2),
mknodat(2), mkfifo(2), mkfifoat(2), link(2), linkat(2), symlink(2),
symlinkat(2), rename(2), and renameat(2).
r334362 by emaste:
Temporarily disconnect audit tests
Audit tests added in r334360 broke the build on a number of archs.
Remove the subdir from the top level tests/sys/Makefile until they're
fixed.
r334388:
Fix OpenBSM with GCC with -Wredundant-decls
Upstream change ed47534 consciously added some redundant functional
declarations, and I'm not sure why. AFAICT they were never required. On
FreeBSD, they break the build with GCC (but not Clang) for any program
including libbsm.h with WARNS=6.
Fix by cherry-picking upstream change
https://github.com/openbsm/openbsm/commit/0553c27
asomers [Mon, 1 Oct 2018 17:36:58 +0000 (17:36 +0000)]
MFC r337222:
Fix LOCAL_PEERCRED with socketpair(2)
Enable the LOCAL_PEERCRED socket option for unix domain stream sockets
created with socketpair(2). Previously, it only worked with unix domain
stream sockets created with socket(2)/listen(2)/connect(2)/accept(2).
PR: 176419
Reported by: Nicholas Wilson <nicholas@nicholaswilson.me.uk>
Differential Revision: https://reviews.freebsd.org/D16350
sobomax [Mon, 1 Oct 2018 17:26:41 +0000 (17:26 +0000)]
MFC r309554 and r309631 which breaks down overly long monolithic
souce file and reduces duplication by auto-generating functions
that only differ in the value of the SCM_XXX constant used.
This also fixes unintentional breakage introduced in earlier
MFC in r338617 that happens to rely on some of those changes.
asomers [Mon, 1 Oct 2018 16:04:07 +0000 (16:04 +0000)]
MFC r338216:
tftpd: Fix data corruption bug with netascii
Transferring files in netascii format requires, among other things,
translating all CR characters to a CR,NUL pair. tftpd does this correctly
except when the CR occurs as the last octet of a packet. In that case, it
erroneously drops the NUL which should be part of the following packet. The
bug was caused by using 0 as a sentinel value in a variable that could
legitimately hold 0. Fix it by switching the sentinel value to -1.
PR: 178055
Reported by: Richard <rsitze@gmail.com>
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D16853
asomers [Mon, 1 Oct 2018 15:49:43 +0000 (15:49 +0000)]
MFC r336871, r336874
r336871:
getrusage(2): fix return value under 32-bit emulation
According to the man page, getrusage(2) should return EFAULT if the rusage
argument lies outside of the process's address space. But due to an
oversight in r100384, that's never been the case during 32-bit emulation.
Fix it.
asomers [Mon, 1 Oct 2018 15:45:20 +0000 (15:45 +0000)]
MFC r336594:
Fix tmpfs detection in the sys/fs/tmpfs tests
This code was originally written for NetBSD. r306031 tried to adapt it to
FreeBSD, but didn't correctly handle the case that tmpfs was available, but
not already loaded. Fix the logic to load the module if necessary. The
tmpfs tests shouldn't be skipped anymore.
Also, fix a comment that was dislocated by r306031.
asomers [Mon, 1 Oct 2018 15:43:56 +0000 (15:43 +0000)]
MFC r336587:
tftpd(8): when completing an WRQ, flush the file before acknowleding receipt
tftpd(8) should flush a newly written file to disk before ACKing the final DATA
packet. Otherwise there is a narrow race window when a subsequent read may not
see the file. This is somewhat related to r330710, but the race window is much
smaller. Hopefully this will fix the intermittent tests in Jenkins.
asomers [Mon, 1 Oct 2018 15:40:06 +0000 (15:40 +0000)]
MFC r336582:
makefs(8): add test case for PR 229929
Fix two failing makefs test cases by adding "-M 1m", which was already used
for every other FFS test case. Add a new test case for the underlying
issue: with no -M, -m, or -s options, makefs can underestimate image size.
MFC r313168 (by pkelsey):
Fix VIMAGE-related bugs in TFO. The autokey callout vnet context was
not being initialized, and the per-vnet fastopen context was only
being initialized for the default vnet.
MFC r338890:
Update ifr_name before invoking IPSECSREQID ioctl, this fixes the case,
when `ifconfig ipsec create reqid N` command invoked without interface
unit number. The "name" global variable is updated after interface
cloning in the ifclonecreate() and contains actual interface name.
During scans (scrubs or resilvers), it sorts the blocks in each transaction
group by block offset; the result can be a significant improvement. (On my
test system just now, which I put some effort to introduce fragmentation into
the pool since I set it up yesterday, a scrub went from 1h2m to 33.5m with the
changes.) I've seen similar rations on production systems.
r336180
Fix up some missed and mis-merges from the sequential scan code
(r334844). Most of the changes involve moving some code around to
reduce conflicts with future merges. One of the missing changes
included a notification on scrub cancellation.
r336458
Fix a couple of typos in r334844 noticed by Richard Kojedzinszky
r338111:
[ig4] add ACPI Device HID for AMD platforms
Added ACPI Device HID AMDI0010 for the designware I2C controllers in
future AMD platforms. Also, when verifying component version check for
minimal value instead of exact match.
r338215:
[ig4] Fix I/O timeout issue with Designware I2C controller on AMD platforms
Due to hardware limitation AMD I2C controller can't trigger pending
interrupt if interrupt status has been changed after clearing
interrupt status bits. So, I2C will lose the interrupt and IO will be
timed out. Implements a workaround to disable I2C controller interrupt
and re-enable I2C interrupt before existing interrupt handler.
r336051:
ig4(4): Fix Apollo lake entries platform identifier
Identify Apollo Lake controllers as IG4_APL and not as a IG4_SKYLAKE
Reported by: rpokala@
r336142:
ig4(4): add devmatch(8) PNP info
Now that we have all devices ids in a table add MODULE_PNP_INFO macro
to let devmatch autoload module
r336326:
Remove MODULE_PNP_INFO for ig4(4) driver
ig4(4) does not support suspend/resume but present on the hardware where
such functionality is critical, like laptops. Remove PNP info to avoid
breaking suspend/resume on the systems where ig4(4) load is not explicitly
requested by the user.
PR: 229791
Reported by: Ali Abdallah
r337719:
[ig4] Fix initialization sequence for newer ig4 chips
Newer chips may require assert/deassert after power down for proper
startup. Check respective flag in DEVIDLE_CTRL and perform operation
if neccesssary.
From PCI Spec rev 2.2, 6.2.1. Device Identification:
Vendor ID This field identifies the manufacturer of the device. Valid
vendor identifiers are allocated by the PCI SIG to ensure uniqueness.
0FFFFh is an invalid value for Vendor ID.
sef [Sat, 29 Sep 2018 00:44:23 +0000 (00:44 +0000)]
MFC r336017,r338799
r336017
This exposes ZFS user and group quotas via the normal
quatactl(2) mechanism. (Read-only at this point, however.)
In particular, this is to allow rpc.rquotad query quotas
for NFS mounts, allowing users to see their quotas on the
hosts using the datasets.
The changes specifically:
* Add new RPC entry points for querying quotas.
* Changes the library routines to allow non-UFS quotas.
* Changes rquotad to check for quotas on mounted filesystems,
rather than being limited to entries in /etc/fstab
* Lastly, adds a VFS entry-point for ZFS to query quotas.
Note that this makes one unavoidable behavioural change: if quotas
are enabled, then they can be queried, as opposed to the current
method of checking for quotas being specified in fstab. (With
ZFS, if there are user or group quotas, they're used, always.)
Describe the role of tags and mapping objects as abstractions.
Describe static vs dynamic transaction types and give a brief overview
of the set of functions and object life cycles used for static vs
dynamic.
While here, fix a few other typos and expand a bit on parent tags.
gordon [Thu, 27 Sep 2018 18:50:10 +0000 (18:50 +0000)]
There are various cases where we modify the inp_vflag and inp_inc.inc_flags
fields during a syscall, but don't restore those fields if the operation
fails. This can leave the inp structure in an inconsistent state and cause
various problems.
Restore the inp_vflag and inp_inc.inc_flags fields when the underlying
operation fails and the inp could be in an inconsistent state.
This is a direct commit to the branch as the code is different enough in
the other branches to make it difficult to resolve a merge.
Submitted by: jtl@
Reported by: Jakub Jirasek, Secunia Research at Flexera
Reviewed by: jhb@
Approved by: so
Security: FreeBSD-EN-18:11.listen
Security: CVE-2018-6925
When we're at our vnode limit, getnewvnode will call into the vnode LRU
cache to free up vnodes. If the vnode we try to recycle is a ZFS vnode we
end up, eventually, in zfs_rmnode. If the ZFS vnode we're recycling
represents something with extended attributes, zfs_rmnode will call
zfs_zget which will attempt to allocate another vnode. If the next vnode we
try to recycle is also a ZFS vnode representing something with extended
attributes we can recurse further. This ends up being unbounded and can end
up overflowing the stack.
In order to avoid this, restructure zfs_rmnode to simply add the extended
attribute directory's object ID to the unlinked set, thus not requiring the
allocation of a vnode. We then schedule a task that calls zfs_unlinked_drain
which will do the work of properly marking the vnodes for unlinking.
zfs_unlinked_drain is also called on mount so these will be cleaned up
there.
r338205:
Create separate taskqueue to call zfs_unlinked_drain().
r334810 introduced zfs_unlinked_drain() dispatch to taskqueue on every
deletion of a file with extended attributes. Using system_taskq for that
with its multiple threads in case of multiple files deletion caused all
available CPU threads to uselessly spin on busy locks, completely blocking
the system.
Use of single dedicated taskqueue is the only easy solution I've found,
while in would be great if we could specify that some task should be
executed only once at a time, but never in parallel, while many tasks
could use different threads same time.
r338206:
Add dmu_tx_assign() error handling in zfs_unlinked_drain().
The error handling got lost during r334810, while according to the report
error there may happen in case of dataset being over quota. In such case
just leave the node in the unlinked list to be freed sometimes later.
MFC r333307 (by sbruno):
Cleanup sundry clang warnings for code that is not upstream in illumos.
https://github.com/illumos/illumos-gate/edit/master/usr/src/lib/libzfs/common/libzfs_sendrecv.c
Patch our version of it to quiesce warnings until someone decides to sync
up our code:
libzfs_sendrecv.c:2555:30: warning: format specifies type 'unsigned long'
but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
sprintf(guidname, "%lu", thisguid);
~~~ ^~~~~~~~
%llu
libzfs_sendrecv.c:2612:29: warning: format specifies type 'unsigned long'
but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
sprintf(guidname, "%lu", parent_fromsnap_guid);
~~~ ^~~~~~~~~~~~~~~~~~~~
%llu
libzfs_sendrecv.c:2645:29: warning: format specifies type 'unsigned long'
but the argument has type 'uint64_t' (aka 'unsigned long long') [-Wformat]
sprintf(guidname, "%lu", parent_fromsnap_guid);
~~~ ^~~~~~~~~~~~~~~~~~~~
%llu
MFC r334231, r334779, r335322, and r338208 to stable/11 from head
These include:
r334231: iflib: Add new shared flag: IFLIB_ADMIN_ALWAYS_RUN
r334779: iflib: Record TCP checksum info in iflib when TCP checksum is requested
r335322: iflib: Style fixes
r338208: if_media: Add new 2.5G/5G/25G/40G/50G/100G/200G/400G media types
MFC r338754:
Update the pkg-stage.sh script used to populate packages on the
dvd1.iso installation medium from including KDE4 to KDE5, as the
KDE4-based ports have been marked as deprecated in the Ports
Collection.
MFC 337270: Install the 32-bit compat sanitizer libraries.
The lib32 build was already building the i386 version of
the clang sanitizers (libclang_rt) but they were not being
installed. This enables the installation.
MK_TOOLCHAIN=no was originally added to the install make
environment to disable includes so that NO_INCS could be
removed. The MK_TOOLCHAIN in bsd.incs.mk was subsequently
renamed to MK_INCLUDES, but bsd.lib.mk doesn't even include
bsd.incs.mk when LIBRARIES_ONLY is defined which the install
make environment for compat libs now defines. However,
setting MK_TOOLCHAIN=no forced MK_CLANG=no which disabled
libclang_rt during the install32 phase. Remove MK_TOOLCHAIN=no
since LIBRARIES_ONLY is now sufficient.
Since the libcompat environment overrides both LIBDIR and
SHLIBDIR, libclang_rt/Makefile.inc has to set both variables
to force the libraries to be installed to the location
expected by the compiler.
mm [Wed, 19 Sep 2018 09:41:33 +0000 (09:41 +0000)]
MFC r338600:
Update libarchive to 3.3.3
As all important changes have already been merged from libarchive git
this is just a version number bump, documentation update and some
polishing for cpio tests. Other source code changes are not relevant to
FreeBSD.
MFC r338679:
Improve LibUSB debugging by simultaneously allowing both function
and transfer prints. Make sure the debug level comes from the
correct USB context.
MFC r338616:
Fix issues about cancelling USB transfers in LibUSB when the USB device has
been detached. When a USB device has been detached the kernel file handle
stops responding to commands. USB applications which continue to run after
the USB device has been detached, depend on LibUSB generated events to tear
down its pending USB transfers. Add code to handle the needed cleanup when
processing the USB transfer(s) fails and prevent new USB transfer(s) from
being submitted.
dim [Tue, 18 Sep 2018 20:46:55 +0000 (20:46 +0000)]
MFC r309748 (by glebius):
Treat R_X86_64_PLT32 relocs as R_X86_64_PC32.
If we load a binary that is designed to be a library, it produces
relocatable code via assembler directives in the assembly itself
(rather than compiler options). This emits R_X86_64_PLT32 relocations,
which are not handled by the kernel linker.
dim [Tue, 18 Sep 2018 07:29:01 +0000 (07:29 +0000)]
MFC r338697:
Pull in r325478 from upstream clang trunk (by Ivan A. Kosarev):
[CodeGen] Initialize large arrays by copying from a global
Currently, clang compiles explicit initializers for array elements
into series of store instructions. For large arrays of built-in types
this results in bloated output code and significant amount of time
spent on the instruction selection phase. This patch fixes the issue
by initializing such arrays with global constants that store the
binary image of the initializer.
MFC 335913:
Use 'e' instead of 'i' constraints with 64-bit atomic operations on amd64.
The ADD, AND, OR, and SUB instructions take at most a 32-bit
sign-extended immediate operand. 64-bit constants that do not fit into
that constraint need to be loaded into a register. The 'i' constraint
tells the compiler it can pass any integer constant to the assembler,
whereas the 'e' constrain only permits constants that fit into a 32-bit
sign-extended value. This fixes using
atomic_add/clear/set/subtract_long/64 with constants that do not fit into
a 32-bit sign-extended immediate.
MFC r338613:
Fix for backends which doesn't support capsicum.
Not all libpcap backends use the BPF compatible set
of IOCTLs. For example the mlx5 backend uses libibverbs
which is currently not capsicum compatible.
MFC r337992, r338125:
POSIX compliance improvements in the pthread(3) functions.
This basically adds makes use of the C99 restrict keyword, and also
adds some 'const's to four threading functions: pthread_mutexattr_gettype(),
pthread_mutexattr_getprioceiling(), pthread_mutexattr_getprotocol(), and
pthread_mutex_getprioceiling. The changes are in accordance to POSIX/SUSv4-2018.
MFC 332454,334009,334122: Various fixes for x86 debug exceptions.
332454:
Fix PSL_T inheritance on exec for x86.
The miscellaneous x86 sysent->sv_setregs() implementations tried to
migrate PSL_T from the previous program to the new executed one, but
they evaluated regs->tf_eflags after the whole regs structure was
bzeroed. Make this functional by saving PSL_T value before zeroing.
Note that if the debugger is not attached, executing the first
instruction in the new program with PSL_T set results in SIGTRAP, and
since all intercepted signals are reset to default dispostion on
exec(2), this means that non-debugged process gets killed immediately
if PSL_T is inherited. In particular, since suid images drop
P_TRACED, attempt to set PSL_T for execution of such program would
kill the process.
Another issue with userspace PSL_T handling is that it is reset by
trap(). It is reasonable to clear PSL_T when entering SIGTRAP
handler, to allow the signal to be handled without recursion or
delivery of blocked fault. But it is not reasonable to return back to
the normal flow with PSL_T cleared. This is too late to change, I
think.
334009:
Cleanups related to debug exceptions on x86.
- Add constants for fields in DR6 and the reserved fields in DR7. Use
these constants instead of magic numbers in most places that use DR6
and DR7.
- Refer to T_TRCTRAP as "debug exception" rather than a "trace trap"
as it is not just for trace exceptions.
- Always read DR6 for debug exceptions and only clear TF in the flags
register for user exceptions where DR6.BS is set.
- Clear DR6 before returning from a debug exception handler as
recommended by the SDM dating all the way back to the 386. This
allows debuggers to determine the cause of each exception. For
kernel traps, clear DR6 in the T_TRCTRAP case and pass DR6 by value
to other parts of the handler (namely, user_dbreg_trap()). For user
traps, wait until after trapsignal to clear DR6 so that userland
debuggers can read DR6 via PT_GETDBREGS while the thread is stopped
in trapsignal().
334122:
x86: stop unconditionally clearing PSL_T on the trace trap.
We certainly should clear PSL_T when calling the SIGTRAP signal
handler, which is already done by all x86 sendsig(9) ABI code. On the
other hand, there is no obvious reason why PSL_T needs to be cleared
when returning from the signal handler. For instance, Linux allows
userspace to set PSL_T and keep tracing enabled for the desired
period. There are userspace programs which would use PSL_T if we make
it possible, for instance sbcl.
Remember if PSL_T was set by PT_STEP or PT_SETSTEP by mean of TDB_STEP
flag, and only clear it when the flag is set.
marius [Thu, 13 Sep 2018 10:18:47 +0000 (10:18 +0000)]
MFC: r333647, r338275, r338280, r338513
- If present, take advantage of the R/W cache of eMMC revision 1.5 and
later devices. These caches work akin to the ones found in HDDs/SSDs
that ada(4)/da(4) also enable if existent, but likewise increase the
likelihood of data loss in case of a sudden power outage etc. On the
other hand, write performance is up to twice as high for e. g. 1 GiB
files depending on the actual chip and transfer mode employed.
For maximum data integrity, the usage of eMMC caches can be disabled
via the hw.mmcsd.cache tunable.
- Get rid of the NOP mmcsd_open().
- Obtain the bus mode (MMC or SD) from the directly superordinated
bus rather than reaching up to the bridge and use the cached mode
in mmcsd_delete(), too.
- Use le32dec(9) for decoding EXT_CSD values where it makes sense. [1]
- Locally cache some instance variable values in mmc_discover_cards()
in order to improve the code readability a bit.
MFC r312296 and r323254, which is new a socket option
SO_TS_CLOCK to pick from several different clock sources to
return timestamps when SO_TIMESTAMP is enabled and two
new nanosecond-precision timestamp types. This also fixes
recvmsg32() system call to properly down-convert layout of the
64-bit structures to match what 32-bit app(s) expect.
Bump __FreeBSD_version to indicate presence of a new
functionality.
MFC r338491:
ibcore: Fix endless loop in searching for matching VLAN device
In r337943 ifnet's if_pcp was set to the PCP value in use
instead of IFNET_PCP_NONE.
Current ibcore code assumes that if_pcp is IFNET_PCP_NONE with
VLAN interfaces so it can identify prio-tagged traffic.
Fix that by explicitly verifying that that the if_type is IFT_ETHER
and not IFT_L2VLAN.
eugen [Wed, 12 Sep 2018 08:46:49 +0000 (08:46 +0000)]
MFC r338468: Fix "ipfw fwd" to work for incoming IPv4 packets
when ip_tryforward() chooses fast forwarding path, as it already works
for IPv6 and for both of them on old slow path.
MFC r338541:
Introduce and use sgid_index in CM requests in ibcore.
For RoCE, when CM requests are received for RC and UD connections,
netdevice of the incoming request is unavailable. Because of that CM
requests are always forwarded to init_net namespace.
Now that we have the GID index available, introduce SGID index in
incoming CM requests and refer to the netdevice of it.
While at it fix some incorrect uses of init_net and make sure
the rdma_create_id() function stores the VNET it is passed.
MFC r338526:
Implement get network interface by params function in ipoib.
Also fix the validate_ipv4_net_dev() and validate_ipv6_net_dev() functions
which had source and destination addresses swapped, and didn't set the
scope ID for IPv6 link-local addresses.
This allows applications like krping to work using IPoIB devices.
MFC r338492:
Add support for receive side scaling stride, RSSS, in mlx5en(4).
The receive side scaling stride parameter is a value which define the interval
between active receive side queues. The traffic for the inactive queues is
redirected to the nearest active queue by use of modulus. The default value
of this parameter is one, which means all receive side queues are used.
The point of this feature is to redirect more traffic to fewer receive side
queues in order to take more advantage of sorted large receive offload,
sorted LRO. The sorted LRO works better when more packets are accumulated
per service interval.