Mark Johnston [Fri, 13 Oct 2017 19:27:33 +0000 (19:27 +0000)]
Make the PHOLD in linux_wait_event_common() unconditional.
After some in-progress work is committed, this would otherwise be the only
instance of #if(n)def NO_SWAPPING in the tree. Moreover, the requisite
opt_vm.h include was missing, so the PHOLD/PRELE calls were always being
compiled in anyway.
Alan Cox [Fri, 13 Oct 2017 16:31:50 +0000 (16:31 +0000)]
Address two problems with sendfile(..., SF_NOCACHE) and apply one
"optimization". First, sendfile(..., SF_NOCACHE) frees pages without
checking whether those pages are mapped. This can leave the system
with mappings to free or repurposed pages. Second, a page can be
busied between the time of the current busy test and acquiring the
object lock. Essentially, the test performed before the object lock
is acquired can only be regarded as an optimization to short-circuit
further work on the page. It cannot, however, be relied upon to prove
that it is safe to free the page. Third, when sendfile(..., SF_NOCACHE)
was originally implemented, vm_page_deactivate_noreuse() did not yet
exist. Use vm_page_deactivate_noreuse() instead of vm_page_deactivate(),
because it comes closer to freeing the page.
Don't call selrecord() outside the select system call in the LinuxKPI, because
then td->td_sel is NULL and this will result in a segfault inside selrecord().
This happens when only using kqueue() to poll for read and write events.
If select() and kqueue() is mixed there won't be a segfault.
Andriy Gapon [Fri, 13 Oct 2017 09:21:41 +0000 (09:21 +0000)]
i2c(8): clean up and clarify read operation
The code went to a lot of trouble to issue either a start+stop condition
or a repeated start condition only to follow it with a stop condition
and a read(2) call that issues a new start condition.
So, fix the read in I2C_MODE_REPEATED_START mode by using I2CREAD ioctl
within the running transaction. This obviously requires that the slave
address has the read bit set which was not required before.
Another problem was with width parameter of zero and
I2C_MODE_REPEATED_START mode. In that case we issued a repeated start
without any preceding start.
While here, remove the redundant (unused) argument to I2CSTOP throughout
the program.
Also, clarify the meaning of -w option, especially "-w 0", in the manual
page.
Reviewed by: no one
Differential Revision: https://reviews.freebsd.org/D12331
Adrian Chadd [Thu, 12 Oct 2017 21:58:51 +0000 (21:58 +0000)]
[ath] Begin using the replacement EDCA functions.
As part of ath10k and other chipset support, the EDCA stuff has to be moved
to potentially be per-VAP. For hardware that doesn't support it (ie,
everything that we currently support) it can just fetch the "current"
global EDCA parameters for the NIC.
This is one of those parameters that is linked to the currently active
channel context / VAP in Linux mac80211 parlance.
Adrian Chadd [Thu, 12 Oct 2017 21:56:58 +0000 (21:56 +0000)]
[net80211] begin handling multiple hardware decap'ed A-MSDU in the RX path.
The duplicate detection code currently expects A-MSDU frames to be encaped -
they're decap'ed /after/ duplicate detection.
However for ath10k (and iwm hardware later on) the firmware supports
doing A-MSDU decap in hardware - which shows up as multiple frames with
the same sequence number and IV.
This is the first part of decap handling - if we see a stretch of A-MSDU
frames from the driver with the MORE bit set, then don't treat them
as duplicates.
This isn't 100% complete as crypto sequence number handling and "A-MSDU in
A-MPDU" needs handling, but it's a start.
This should be a glorified no-op for everyone. Please tell me if it isn't.
Ed Maste [Thu, 12 Oct 2017 15:45:53 +0000 (15:45 +0000)]
allow posix_fallocate in capability mode
posix_fallocate is logically equivalent to writing zero blocks to the
desired file size and there is no reason to prevent calling it in
capability mode. posix_fallocate already checked for the CAP_WRITE
right, so we merely need to list it in capabilities.conf.
Reviewed by: allanjude
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12640
Warner Losh [Thu, 12 Oct 2017 15:16:22 +0000 (15:16 +0000)]
Define prototype for exit and ensure references
Define a prototype for exit in stand.h. Provide a reference to exit in
a few conf.c files to ensure that its definition gets pulled in early.
Since exit() is a MD routine, it isn't defined in libsa. However,
libsa tends to be listed last and will soon have panic() in it which
calls exit(). The reference to exit early ensures that the MD exit is
available to satisfy linking for static libraries.
Warner Losh [Thu, 12 Oct 2017 14:57:05 +0000 (14:57 +0000)]
Move ufsread.c
Move ufsread.c from sys/boot/common (which used to be all the common
files for /boot/loader, but grew to be all the common files for
sys/boot, but that's now sys/boot/libsa's job) to sys/boot/libsa.
Mateusz Guzik [Thu, 12 Oct 2017 13:59:23 +0000 (13:59 +0000)]
xinstall: plug an infinite loop in directory creation
If stat continues to fail with ENOENT and mkdir with EEXIST the code wont
finish. In particular this can show up when the target path follows through
a symlink to a non-existent directory.
Matt Joras [Wed, 11 Oct 2017 21:53:53 +0000 (21:53 +0000)]
When unmounting a tmpfs, do not call free_unr.
tmpfs uses unr(9) to allocate inodes. Previously when unmounting it
would individually free the units when it freed each vnode. This is
unnecessary as we can use the newly-added unrhdr_clear function to clear
out the unr in onde go. This measurably reduces the time to unmount a
tmpfs with many files.
Matt Joras [Wed, 11 Oct 2017 21:53:50 +0000 (21:53 +0000)]
Add clearing function for unr(9).
Previously before you could call unrhdr_delete you needed to
individually free every allocated unit. It is useful to be able to tear
down the unr without having to go through this process, as it is
significantly faster than freeing the individual units.
Gleb Smirnoff [Wed, 11 Oct 2017 20:36:09 +0000 (20:36 +0000)]
Declare more TCP globals in tcp_var.h, so that alternative TCP stacks
can use them. Gather all TCP tunables in tcp_var.h in one place and
alphabetically sort them, to ease maintainance of the list.
Don't copy and paste declarations in tcp_stacks/fastpath.c.
Ed Maste [Wed, 11 Oct 2017 19:26:39 +0000 (19:26 +0000)]
libunwind: use upstream patch to disable executable stacks
arm uses '@' as a comment character, and cannot use @progbits in the
.section directive. Apply the upstream noexec stach change which avoids
this issue.
Add sanity checks in ql_hw_send() qla_send() to ensure that empty slots
in Tx Ring map to empty slot in Tx_buf array before Transmits. If the
checks fail further Transmission on that Tx Ring is prevented.
Conrad Meyer [Wed, 11 Oct 2017 14:59:04 +0000 (14:59 +0000)]
hwpmc(4): Force sufficiently wide type for left shift
Ordinary input to this macro comes from pe_code, which is uint16_t. Coverity
points out that shifting such a value discards the result of a 24 bit shift,
which is not what we want.
Ed Maste [Wed, 11 Oct 2017 14:34:06 +0000 (14:34 +0000)]
OptionalObsoleteFiles: remove diff from MK_GNU_DIFF=no block
diff (and man page) are not from GNU, as of r317209, and should not be
deleted if WITHOUT_GNU_DIFF is set. (WITHOUT_GNU_DIFF still controls
whether diff3 is built.)
The th_bintime, th_microtime and th_nanotime members of the timehand
all cache the last system time (uptime + boottime). Only the format
differs. Do not re-calculate the bintime and simply use the value
used to calculate the microtime and nanotime.
Group all the updates under the relevant comment. Remove obsoleted
XXX part.
Submitted by: Sebastian Huber <sebastian.huber@embedded-brains.de>
MFC after: 1 week
Justin Hibbits [Wed, 11 Oct 2017 02:39:20 +0000 (02:39 +0000)]
Do exception offset computations in 64 bits, not 32.
This fixes clang-built binaries on a gcc powerpc64 world. Gets us one step
closer to a clang-built world. The same change was made in later upstream
binutils.
Rick Macklem [Tue, 10 Oct 2017 21:05:40 +0000 (21:05 +0000)]
Fix forced dismount when a pNFS mount is hung on a DS.
When a "pnfs" NFSv4.1 mount is hung because of an unresponsive DS,
a forced dismount wouldn't work, because the RPC socket for the DS
was not being closed. This patch fixes this.
This will only affect "pnfs" mounts where the pNFS server's DS
is unresponsive (crashed or network partitioned or...).
Found during testing of the pNFS server.
Revert Commit r324290
Add sanity checks in ql_hw_send() qla_send() to ensure that empty slots
in Tx Ring map to empty slot in Tx_buf array before Transmits. If the
checks fail further Transmission on that Tx Ring is prevented.
Jung-uk Kim [Tue, 10 Oct 2017 19:20:38 +0000 (19:20 +0000)]
Do not check whether AcpiOsGetTimer() is called during boot.
From ACPICA 20170929, AcpiOsGetTimer() should be available early because
While() loop timeout mechanism was reimplemented with it. Unfortunately,
it means AcpiLoadTables() may cause panic when a While() loop is executed.
After having lengthy discussions with ACPICA developers, I have concluded
that dummy timecounter is good enough for the purpose and it is the least
intrusive solution for now. Also, they reminded me the ACPI specification
implies OS timer function should be available before loading tables.
Kirk McKusick [Tue, 10 Oct 2017 16:17:03 +0000 (16:17 +0000)]
Growfs got missed in r323923 that added a check hash to cylinder groups.
This makes the needed changes to add/update cylinder group check hashes
when a filesystem is expanded.
Reported by: kib and Warner Losh (imp)
Reviewed by: kib
Tested by: Peter Holm (pho)
Sepherosa Ziehau [Tue, 10 Oct 2017 08:32:03 +0000 (08:32 +0000)]
hyperv/hn: Workaround erroneous hash type observed on WS2016.
Background:
- UDP 4-tuple hash type is unconditionally enabled in Hyper-V on WS2016,
which is _not_ affected by NDIS_OBJTYPE_RSS_PARAMS.
- Non-fragment UDP/IPv4 datagrams' hash type is delivered to VM as
TCP_IPV4.
Currently this erroneous behavior only applies to WS2016/Windows10.
Force l3/l4 protocol check, if the RXed packet's hash type is TCP_IPV4,
and the Hyper-V is running on WS2016/Windows10. If the RXed packet is
UDP datagram, adjust mbuf hash type to UDP_IPV4.
Enji Cooper [Tue, 10 Oct 2017 05:58:33 +0000 (05:58 +0000)]
Check the exit code from fsck_ffs instead of relying on MODIFIED being in the output
^/head@r323923 changed when MODIFIED is printed at exit. It's better to follow the
documented way of determining whether or not a filesystem is clean per fsck_ffs, i.e.,
ensure that the exit code is either 0 or 7.
The pass/fail determination is brittle prior to this commit, and ^/head@r323923 made
the issue apparent -- thus this needs to be fixed independent of ^/head@r323923.
Warner Losh [Tue, 10 Oct 2017 01:31:44 +0000 (01:31 +0000)]
Rather than laying whack-a-mole with including the path to stand.h,
always include it. Remove places where we explicitly include it. This
also helps reduce the 'cut-and-paste' factor of these Makefiles.
Warner Losh [Mon, 9 Oct 2017 22:12:53 +0000 (22:12 +0000)]
Create sys/boot/libsa and build libstand.a there
Build libstand from inside the sys/boot build. Redirect all users in
sys/boot to grab it from there. We still build it as libstand.a for
the moment. When lib/libstand is moved here, we'll change the name.
Warner Losh [Mon, 9 Oct 2017 22:12:46 +0000 (22:12 +0000)]
Define LIBSA* and use them instead of overloaded LIBSTAND
LIBSA is the current stand alone library. LIBSA32 is the 32-bit
version of the library. LIBSAU is the userboot version of libsa. Use
the proper define instead of the more generic define.
Warner Losh [Mon, 9 Oct 2017 22:12:32 +0000 (22:12 +0000)]
Define SASRC and use it
Define SASRC to point to the current libstand sources. Include
../Makefile.inc early enough in a few places so we can .include
"${SASRC}/Makefile" and have it work. Create a new pass-up
Makefile.inc in sys/boot/userboot to allow this pattern to work.
Gleb Smirnoff [Mon, 9 Oct 2017 21:06:16 +0000 (21:06 +0000)]
Improvements to sendfile(2) mbuf free routine.
o Fall back to default m_ext free mech, using function pointer in
m_ext_free, and remove sf_ext_free() called directly from mbuf code.
Testing on modern CPUs showed no regression.
o Provide internally used flag EXT_FLAG_SYNC, to mark that I/O uses
SF_SYNC flag. Lack of the flag allows us not to dereference
ext_arg2, saving from a cache line miss.
o Create function sendfile_free_page() that later will be used, for
multi-page mbufs. For now compiler will inline it into
sendfile_free_mext().
In collaboration with: gallatin
Differential Revision: https://reviews.freebsd.org/D12615
Gleb Smirnoff [Mon, 9 Oct 2017 20:51:58 +0000 (20:51 +0000)]
In mb_dupcl() don't copy full m_ext, to avoid cache miss. Respectively,
in mb_free_ext() always use fields from the original refcount holding
mbuf (see. r296242) mbuf. Cuts another cache miss from mb_free_ext().
However, treat EXT_EXTREF mbufs differently, since they are different -
they don't have a refcount holding mbuf.
Provide longer comments in m_ext declaration to explain this change
and change from r296242.
In collaboration with: gallatin
Differential Revision: https://reviews.freebsd.org/D12615
Gleb Smirnoff [Mon, 9 Oct 2017 20:35:31 +0000 (20:35 +0000)]
Shorten list of arguments to mbuf external storage freeing function.
All of these arguments are stored in m_ext, so there is no reason
to pass them in the argument list. Not all functions need the second
argument, some don't even need the first one. The second argument
lives in next cache line, so not dereferencing it is a performance
gain. This was discovered in sendfile(2), which will be covered by
next commits.
The second goal of this commit is to bring even more flexibility
to m_ext mbufs, allowing to create more fields in m_ext, opaque to
the generic mbuf code, and potentially set and dereferenced by
subsystems.