imp [Thu, 12 Oct 2017 14:57:05 +0000 (14:57 +0000)]
Move ufsread.c
Move ufsread.c from sys/boot/common (which used to be all the common
files for /boot/loader, but grew to be all the common files for
sys/boot, but that's now sys/boot/libsa's job) to sys/boot/libsa.
mjg [Thu, 12 Oct 2017 13:59:23 +0000 (13:59 +0000)]
xinstall: plug an infinite loop in directory creation
If stat continues to fail with ENOENT and mkdir with EEXIST the code wont
finish. In particular this can show up when the target path follows through
a symlink to a non-existent directory.
mjoras [Wed, 11 Oct 2017 21:53:53 +0000 (21:53 +0000)]
When unmounting a tmpfs, do not call free_unr.
tmpfs uses unr(9) to allocate inodes. Previously when unmounting it
would individually free the units when it freed each vnode. This is
unnecessary as we can use the newly-added unrhdr_clear function to clear
out the unr in onde go. This measurably reduces the time to unmount a
tmpfs with many files.
mjoras [Wed, 11 Oct 2017 21:53:50 +0000 (21:53 +0000)]
Add clearing function for unr(9).
Previously before you could call unrhdr_delete you needed to
individually free every allocated unit. It is useful to be able to tear
down the unr without having to go through this process, as it is
significantly faster than freeing the individual units.
glebius [Wed, 11 Oct 2017 20:36:09 +0000 (20:36 +0000)]
Declare more TCP globals in tcp_var.h, so that alternative TCP stacks
can use them. Gather all TCP tunables in tcp_var.h in one place and
alphabetically sort them, to ease maintainance of the list.
Don't copy and paste declarations in tcp_stacks/fastpath.c.
emaste [Wed, 11 Oct 2017 19:26:39 +0000 (19:26 +0000)]
libunwind: use upstream patch to disable executable stacks
arm uses '@' as a comment character, and cannot use @progbits in the
.section directive. Apply the upstream noexec stach change which avoids
this issue.
davidcs [Wed, 11 Oct 2017 18:25:05 +0000 (18:25 +0000)]
Add sanity checks in ql_hw_send() qla_send() to ensure that empty slots
in Tx Ring map to empty slot in Tx_buf array before Transmits. If the
checks fail further Transmission on that Tx Ring is prevented.
cem [Wed, 11 Oct 2017 14:59:04 +0000 (14:59 +0000)]
hwpmc(4): Force sufficiently wide type for left shift
Ordinary input to this macro comes from pe_code, which is uint16_t. Coverity
points out that shifting such a value discards the result of a 24 bit shift,
which is not what we want.
emaste [Wed, 11 Oct 2017 14:34:06 +0000 (14:34 +0000)]
OptionalObsoleteFiles: remove diff from MK_GNU_DIFF=no block
diff (and man page) are not from GNU, as of r317209, and should not be
deleted if WITHOUT_GNU_DIFF is set. (WITHOUT_GNU_DIFF still controls
whether diff3 is built.)
kib [Wed, 11 Oct 2017 11:03:11 +0000 (11:03 +0000)]
The th_bintime, th_microtime and th_nanotime members of the timehand
all cache the last system time (uptime + boottime). Only the format
differs. Do not re-calculate the bintime and simply use the value
used to calculate the microtime and nanotime.
Group all the updates under the relevant comment. Remove obsoleted
XXX part.
Submitted by: Sebastian Huber <sebastian.huber@embedded-brains.de>
MFC after: 1 week
jhibbits [Wed, 11 Oct 2017 02:39:20 +0000 (02:39 +0000)]
Do exception offset computations in 64 bits, not 32.
This fixes clang-built binaries on a gcc powerpc64 world. Gets us one step
closer to a clang-built world. The same change was made in later upstream
binutils.
rmacklem [Tue, 10 Oct 2017 21:05:40 +0000 (21:05 +0000)]
Fix forced dismount when a pNFS mount is hung on a DS.
When a "pnfs" NFSv4.1 mount is hung because of an unresponsive DS,
a forced dismount wouldn't work, because the RPC socket for the DS
was not being closed. This patch fixes this.
This will only affect "pnfs" mounts where the pNFS server's DS
is unresponsive (crashed or network partitioned or...).
Found during testing of the pNFS server.
davidcs [Tue, 10 Oct 2017 20:45:45 +0000 (20:45 +0000)]
Revert Commit r324290
Add sanity checks in ql_hw_send() qla_send() to ensure that empty slots
in Tx Ring map to empty slot in Tx_buf array before Transmits. If the
checks fail further Transmission on that Tx Ring is prevented.
jkim [Tue, 10 Oct 2017 19:20:38 +0000 (19:20 +0000)]
Do not check whether AcpiOsGetTimer() is called during boot.
From ACPICA 20170929, AcpiOsGetTimer() should be available early because
While() loop timeout mechanism was reimplemented with it. Unfortunately,
it means AcpiLoadTables() may cause panic when a While() loop is executed.
After having lengthy discussions with ACPICA developers, I have concluded
that dummy timecounter is good enough for the purpose and it is the least
intrusive solution for now. Also, they reminded me the ACPI specification
implies OS timer function should be available before loading tables.
mckusick [Tue, 10 Oct 2017 16:17:03 +0000 (16:17 +0000)]
Growfs got missed in r323923 that added a check hash to cylinder groups.
This makes the needed changes to add/update cylinder group check hashes
when a filesystem is expanded.
Reported by: kib and Warner Losh (imp)
Reviewed by: kib
Tested by: Peter Holm (pho)
sephe [Tue, 10 Oct 2017 08:32:03 +0000 (08:32 +0000)]
hyperv/hn: Workaround erroneous hash type observed on WS2016.
Background:
- UDP 4-tuple hash type is unconditionally enabled in Hyper-V on WS2016,
which is _not_ affected by NDIS_OBJTYPE_RSS_PARAMS.
- Non-fragment UDP/IPv4 datagrams' hash type is delivered to VM as
TCP_IPV4.
Currently this erroneous behavior only applies to WS2016/Windows10.
Force l3/l4 protocol check, if the RXed packet's hash type is TCP_IPV4,
and the Hyper-V is running on WS2016/Windows10. If the RXed packet is
UDP datagram, adjust mbuf hash type to UDP_IPV4.
ngie [Tue, 10 Oct 2017 05:58:33 +0000 (05:58 +0000)]
Check the exit code from fsck_ffs instead of relying on MODIFIED being in the output
^/head@r323923 changed when MODIFIED is printed at exit. It's better to follow the
documented way of determining whether or not a filesystem is clean per fsck_ffs, i.e.,
ensure that the exit code is either 0 or 7.
The pass/fail determination is brittle prior to this commit, and ^/head@r323923 made
the issue apparent -- thus this needs to be fixed independent of ^/head@r323923.
imp [Tue, 10 Oct 2017 01:31:44 +0000 (01:31 +0000)]
Rather than laying whack-a-mole with including the path to stand.h,
always include it. Remove places where we explicitly include it. This
also helps reduce the 'cut-and-paste' factor of these Makefiles.
imp [Mon, 9 Oct 2017 22:12:53 +0000 (22:12 +0000)]
Create sys/boot/libsa and build libstand.a there
Build libstand from inside the sys/boot build. Redirect all users in
sys/boot to grab it from there. We still build it as libstand.a for
the moment. When lib/libstand is moved here, we'll change the name.
imp [Mon, 9 Oct 2017 22:12:46 +0000 (22:12 +0000)]
Define LIBSA* and use them instead of overloaded LIBSTAND
LIBSA is the current stand alone library. LIBSA32 is the 32-bit
version of the library. LIBSAU is the userboot version of libsa. Use
the proper define instead of the more generic define.
imp [Mon, 9 Oct 2017 22:12:32 +0000 (22:12 +0000)]
Define SASRC and use it
Define SASRC to point to the current libstand sources. Include
../Makefile.inc early enough in a few places so we can .include
"${SASRC}/Makefile" and have it work. Create a new pass-up
Makefile.inc in sys/boot/userboot to allow this pattern to work.
glebius [Mon, 9 Oct 2017 21:06:16 +0000 (21:06 +0000)]
Improvements to sendfile(2) mbuf free routine.
o Fall back to default m_ext free mech, using function pointer in
m_ext_free, and remove sf_ext_free() called directly from mbuf code.
Testing on modern CPUs showed no regression.
o Provide internally used flag EXT_FLAG_SYNC, to mark that I/O uses
SF_SYNC flag. Lack of the flag allows us not to dereference
ext_arg2, saving from a cache line miss.
o Create function sendfile_free_page() that later will be used, for
multi-page mbufs. For now compiler will inline it into
sendfile_free_mext().
In collaboration with: gallatin
Differential Revision: https://reviews.freebsd.org/D12615
glebius [Mon, 9 Oct 2017 20:51:58 +0000 (20:51 +0000)]
In mb_dupcl() don't copy full m_ext, to avoid cache miss. Respectively,
in mb_free_ext() always use fields from the original refcount holding
mbuf (see. r296242) mbuf. Cuts another cache miss from mb_free_ext().
However, treat EXT_EXTREF mbufs differently, since they are different -
they don't have a refcount holding mbuf.
Provide longer comments in m_ext declaration to explain this change
and change from r296242.
In collaboration with: gallatin
Differential Revision: https://reviews.freebsd.org/D12615
glebius [Mon, 9 Oct 2017 20:35:31 +0000 (20:35 +0000)]
Shorten list of arguments to mbuf external storage freeing function.
All of these arguments are stored in m_ext, so there is no reason
to pass them in the argument list. Not all functions need the second
argument, some don't even need the first one. The second argument
lives in next cache line, so not dereferencing it is a performance
gain. This was discovered in sendfile(2), which will be covered by
next commits.
The second goal of this commit is to bring even more flexibility
to m_ext mbufs, allowing to create more fields in m_ext, opaque to
the generic mbuf code, and potentially set and dereferenced by
subsystems.
hselasky [Mon, 9 Oct 2017 18:33:29 +0000 (18:33 +0000)]
When showing the sleepqueues from the in-kernel debugger,
properly dump all the sendqueues and not just the first one
History:
It appears that in the commit which introduced the code,
r165272, the array indexes of "sq_blocked[0]" and "td_name[i]"
were interchanged. In r180927 "td_name[i]" was corrected to
"td_name[0]", but "sq_blocked[0]" was left unchanged.
alc [Mon, 9 Oct 2017 18:19:06 +0000 (18:19 +0000)]
The recent change to initialization of blists (r324420) relied on '-1'
appearing only where the code explicitly set it, but since much of the
data was not initialized, '-1' appeared other places too, and led to
panics. Clear the allocated data before initializing nonzero values by
allocating with M_ZERO.
Submitted by: Doug Moore <dougm@rice.edu>
Reported by: Oleg V. Nauman <oleg@theweb.org.ua>, cy
Tested by: Oleg V. Nauman <oleg@theweb.org.ua>
MFC after: 1 week
X-MFC with: r324420
Differential Revision: https://reviews.freebsd.org/D12627
kib [Mon, 9 Oct 2017 16:20:39 +0000 (16:20 +0000)]
Change amd64_get_ldt() to return 'EOF' when the LDT is not yet
allocated, when requested range of descriptors does not fit into
currently allocated LDT, or trim the return if the range fits
partially. Before, the function returned EINVAL.
Reviewed by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kib [Mon, 9 Oct 2017 16:19:26 +0000 (16:19 +0000)]
Change i386_get_ldt() to return 'EOF' when the requested range of
descriptors does not fit into currently allocated LDT, or trim the
return if the range fits partially. Before, the function returned
EINVAL.
Fix two bugs in r324366: use capped num counter for malloc size, and
do not leak allocated buffer on EINVAL (by handling EINVAL case as
normal, see above).
Reviewed by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kib [Mon, 9 Oct 2017 16:07:27 +0000 (16:07 +0000)]
Improvements to set_user_ldt().
Remove mtx_owned() checks from set_user_ldt(). Split the function
into _locked() version which requires the dt_lock spinlock owned, and
make set_user_ldt() a wrapper. Add a comment in swtch.s noting that
the call to the new set_user_ldt() cannot recurse on dt_lock.
Remove #ifdef SMP block, the addend is always zero on UP.
Fix type of set_user_ldt_rv(), making it match the type used for
smb_rendezvous() callback, and remove the cast. Use curproc.
Reviewed by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kib [Mon, 9 Oct 2017 15:39:43 +0000 (15:39 +0000)]
Reset the fs and gs bases on exec(2).
The values from the old address space do not make sense for the new
program. In particular, gsbase might be the TLS base for the old
program but the new program has no TLS now.
amd64 already handles this correctly.
Reported and reviewed by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kevans [Mon, 9 Oct 2017 14:50:02 +0000 (14:50 +0000)]
patch(1): Don't overrun line buffer in some cases
Patches like file.txt attached to PR 190195 with a final line formed
like ">(EOL)" could cause a copy past the end of the current line buffer. In the
case of PR 191641, this caused a duplicate line to be copied into the resulting
file.
Instead of running past the end, treat it as if it were a blank line.
alc [Sun, 8 Oct 2017 22:17:39 +0000 (22:17 +0000)]
The blst_radix_init function has two purposes - to compute the number of
nodes to allocate for the blist, and to initialize them. The computation
can be done much more quickly by identifying the terminating node, if any,
at every level of the tree and then summing the number of nodes at each
level that precedes the topmost terminator. The initialization can also be
done quickly, since settings at the root mark the tree as all-allocated, and
only a few terminator nodes need to be marked in the rest of the tree.
Eliminate blst_radix_init, and perform its two functions more simply in
blist_create.
The allocation of the blist takes places in two pieces, but there's no good
reason to do so, when a single allocation is sufficient, and simpler.
Allocate the blist struct, and the array of nodes associated with it, with a
single allocation.
ian [Sun, 8 Oct 2017 18:38:22 +0000 (18:38 +0000)]
Fix imx6 hdmi init after r323553, which used a config_intrhook to defer the
attachment of i2c devices needed by hdmi.
The hdmi init also uses an intrhook callback to defer initialization, and if
the hdmi callback runs first, the i2c devices will not yet have registered
their device_t in association with their FDT phandle, which allows cross-
device references on FDT systems.
Now the hdmi deferred init checks for the i2c device registration, and if
it's not complete yet, it registers as an eventhandler watching for newbus
attach events. When the i2c device eventually attaches, the hdmi driver
unregisters from watching further events, and continues with the hdmi init.
Because the function signatures for an intrhook callback and an event
handler callback are the same, a single function is used for both callbacks.
ian [Sun, 8 Oct 2017 17:33:49 +0000 (17:33 +0000)]
Add eventhandler notifications for newbus device attach/detach.
The detach case is slightly complicated by the fact that some in-kernel
consumers may want to know before a device detaches (so they can release
related resources, stop using the device, etc), but the detach can fail. So
there are pre- and post-detach notifications for those consumers who need to
handle all cases.
A couple salient comments from the review, they amount to some helpful
documentation about these events, but there's currently no good place for
such documentation...
Note that in the current newbus locking model, DETACH_BEGIN and
DETACH_COMPLETE/FAILED sequence of event handler invocation might interweave
with other attach/detach events arbitrarily. The handlers should be prepared
for such situations.
Also should note that detach may be called after the parent bus knows the
hardware has left the building. In-kernel consumers have to be prepared to
cope with this race.
ian [Sun, 8 Oct 2017 17:21:16 +0000 (17:21 +0000)]
Restore the ability to deregister an eventhandler from within the callback.
When the EVENTHANDLER(9) subsystem was created, it was a documented feature
that an eventhandler callback function could safely deregister itself. In
r200652 that feature was inadvertantly broken by adding drain-wait logic to
eventhandler_deregister(), so that it would be safe to unload a module upon
return from deregistering its event handlers.
There are now 145 callers of EVENTHANDLER_DEREGISTER(), and it's likely many
of them are depending on the drain-wait logic that has been in place for 8
years. So instead of creating a separate eventhandler_drain() and adding it
to some or all of those 145 call sites, this creates a separate
eventhandler_drain_nowait() function for the specific purpose of
deregistering a callback from within the running callback.
alc [Sun, 8 Oct 2017 16:54:42 +0000 (16:54 +0000)]
Replace an unnecessary call to vm_page_activate() by an assertion that
the page is already wired or queued. Prior to the elimination of PG_CACHED
pages, vm_page_grab() might have returned a valid, previously PG_CACHED
page, in which case enqueueing the page was necessary. Now, that can't
happen. Moreover, activating the page is a dubious choice, since the page
is not being accessed.
sbruno [Sat, 7 Oct 2017 23:33:14 +0000 (23:33 +0000)]
Fix symlink if_igb.ko in -current such that its relative and doesn't
end up with non-standard DESTDIR information in its symlink. This
can happen very trivially if the release scripts are used.
sbruno [Sat, 7 Oct 2017 23:30:57 +0000 (23:30 +0000)]
Check so_error early in sendfile() call. Prior to this patch, if a
connection was reset by the remote end, sendfile() would just report
ENOTCONN instead of ECONNRESET.