Pedro F. Giffuni [Fri, 17 Jan 2014 21:21:28 +0000 (21:21 +0000)]
gcc: Drop useless objc change from r260311.
Among some of the objc changes from Apple that crept into r260311,
Radar 5355344 is incomplete and is not used since we don't carry
ObjC in the base system.
The dead code seems to have caused issues in some Tinderboxes so
get rid of it altogether.
Fix a possible memory use after free and leak situation associated
with USB device detach when using character device handles. This also
includes LibUSB. It turns out that "usb_close()" cannot always get a
reference to clean up its USB transfers and such, if called during the
kernel USB device detach.
Adrian Chadd [Fri, 17 Jan 2014 05:26:55 +0000 (05:26 +0000)]
Implement a kqueue notification path for sendfile.
This fires off a kqueue note (of type sendfile) to the configured kqfd
when the sendfile transaction has completed and the relevant memory
backing the transaction is no longer in use by this transaction.
This is analogous to SF_SYNC waiting for the mbufs to complete -
except now you don't have to wait.
Both SF_SYNC and SF_KQUEUE should work together, even if it
doesn't necessarily make any practical sense.
This is designed for use by applications which use backing cache/store
files (eg Varnish) or POSIX shared memory (not sure anything is using
it yet!) to know when a region of memory is free for re-use. Note
it doesn't mark the region as free overall - only free from this
transaction. The application developer still needs to track which
ranges are in the process of being recycled and wait until all
pending transactions are completed.
Adrian Chadd [Fri, 17 Jan 2014 05:13:08 +0000 (05:13 +0000)]
Implement the extension api for sendfile to allow for kqueue notifications.
This is still under a bit of flux, as the final API hasn't been nailed
down. It's also unclear whether we should define the two new types in the
header or not - it may allow bad code to compile that shouldn't (ie,
since uintX's are defined, the developer may not include sys/types.h.)
Reviewed by: peter, imp, bde
Sponsored by: Netflix, Inc.
Neel Natu [Fri, 17 Jan 2014 04:21:39 +0000 (04:21 +0000)]
If a VM-exit happens during an NMI injection then clear the "NMI Blocking" bit
in the Guest Interruptibility-state VMCS field.
If we fail to do this then a subsequent VM-entry will fail because it is an
error to inject an NMI into the guest while "NMI Blocking" is turned on. This
is described in "Checks on Guest Non-Register State" in the Intel SDM.
Submitted by: David Reed (david.reed@tidalscale.com)
Fix a regression introduced in r237618 that caused killall to
confuse killall -INT with killall -I (interactive confirmation),
which resulted in the wrong signal (TERM) being delivered to
the process(es).
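One way to keep the two arguments apart is to match the whole argument
rather than its first letter. The sketch below is not the killall source;
classify_arg(), the enum and the short signal table are hypothetical and
exist only to show the distinction.

```c
#include <assert.h>
#include <signal.h>
#include <string.h>

enum arg_kind { ARG_INTERACTIVE, ARG_SIGNAL, ARG_OTHER };

/* Small illustrative table; a real tool would cover every signal name. */
struct sig_entry {
    const char *name;
    int num;
};

static const struct sig_entry sig_table[] = {
    { "HUP", SIGHUP }, { "INT", SIGINT },
    { "KILL", SIGKILL }, { "TERM", SIGTERM },
};

/*
 * Classify one command-line argument. "-I" is the interactive flag only
 * when it is the entire argument; "-INT" is looked up as a signal name.
 */
enum arg_kind classify_arg(const char *arg, int *signo)
{
    size_t i;

    assert(arg != NULL && signo != NULL);
    if (strcmp(arg, "-I") == 0)
        return ARG_INTERACTIVE;
    if (arg[0] == '-') {
        for (i = 0; i < sizeof(sig_table) / sizeof(sig_table[0]); i++) {
            if (strcmp(arg + 1, sig_table[i].name) == 0) {
                *signo = sig_table[i].num;
                return ARG_SIGNAL;
            }
        }
    }
    return ARG_OTHER;
}
```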
Add a command line argument to turn off blocking waiting for the user
to press Ctrl-C (-b). This allows tests with tight loops of mcgrabs
that can stress the multicast tables.
Glen Barber [Thu, 16 Jan 2014 16:12:09 +0000 (16:12 +0000)]
Update the pkg-stage target to be more compatible with pkg-1.2:
- Add a release-dvd.conf pkg(8) configuration file to override
the default FreeBSD.conf configuration.
- Remove architecture-specific pkg-stage.conf files, consolidate,
and move their contents to scripts/pkg-stage.sh.
- Use 'pkg -vv' to determine the ABI, which is used as the
cache directory.
Prior to these changes, it would be possible for pkg-stage to fetch
conflicting binary packages from multiple repositories.
Tested against: head@r260522, stable/10@r260522
MFC after: 3 days
X-Insta-MFC: possibly
Sponsored by: The FreeBSD Foundation
Andriy Gapon [Thu, 16 Jan 2014 13:24:10 +0000 (13:24 +0000)]
fix a bug in ZFS mirror code for handling multiple DVAs
The bug was introduced in r256956 "Improve ZFS N-way mirror read
performance".
The code in vdev_mirror_dva_select erroneously considers already
tried DVAs for the next attempt. Thus, it is possible that a failing DVA
would be retried forever.
As a secondary effect, if the attempts fail with a checksum error, then
checksum error reports are accumulated until the original request
ultimately fails or succeeds. But because retrying goes on indefinitely,
the accumulation of checksum reports is effectively a memory leak.
Reviewed by: gibbs
MFC after: 13 days
Sponsored by: HybridCluster
Andriy Gapon [Thu, 16 Jan 2014 12:31:27 +0000 (12:31 +0000)]
zfs_deleteextattr: name buffer from namei is needed by zfs_rename
If we prematurely free the name buffer and it gets quickly recycled,
then zfs_rename may see data from another lookup or even unmapped memory
via cn_nameptr.
Andriy Gapon [Thu, 16 Jan 2014 12:26:54 +0000 (12:26 +0000)]
fix a bug in ZFS mirror code for handling multiple DVAs
The bug was introduced in r256956 "Improve ZFS N-way mirror read
performance".
The code in vdev_mirror_dva_select erroneously considers already
tried DVAs for the next attempt. Thus, it is possible that a failing DVA
would be retried forever.
As a secondary effect, if the attempts fail with a checksum error, then
checksum error reports are accumulated until the original request
ultimately fails or succeeds. But because retrying goes on indefinitely,
the accumulation of checksum reports is effectively a memory leak.
Reviewed by: gibbs
MFC after: 13 days
Sponsored by: HybridCluster
Andriy Gapon [Thu, 16 Jan 2014 12:22:46 +0000 (12:22 +0000)]
zfs: getnewvnode_reserve must be called outside of a zfs transaction
Otherwise we could run into the following deadlock.
A thread has a transaction open and assigned to a transaction group.
That would prevent the transaction group from being quiesced and synced.
The thread is blocked in getnewvnode_reserve waiting for a vnode to
be reclaimed. The vnlru thread is blocked trying to enter a ZFS VOP because
a filesystem is suspended by an ongoing rollback or receive operation.
In its turn the operation is waiting for the current transaction group
to be synced.
zfs_zget is always used outside of active transactions, but zfs_mknode
is always used in a transaction context. Thus, we hoist
getnewvnode_reserve from zfs_mknode to its callers.
While there, assert that ZFS always calls getnewvnode while having
a vnode reserved.
Reported by: adrian
Tested by: adrian
MFC after: 17 days
Sponsored by: HybridCluster
Problem case:
Original lookup returns route with GW set, so gw points to
rte->rt_gateway.
After that we're changing dst and performing lookup another time.
Since fwd host is most probably directly reachable, resulting
rte does not contain rt_gateway, so gw is not set. Finally, we
end with packet transmitted to proper interface but wrong
link-layer address.
Steven Kreuzer [Wed, 15 Jan 2014 15:16:11 +0000 (15:16 +0000)]
Remove the reference to FreeBSD 6.2-RELEASE from the 'Upgrading from
previous releases' paragraph, since all supported versions of FreeBSD
now support binary upgrades.
Remove 'of course,' from the footnote reminding users to create a backup
before attempting a binary update.
Marcel Moolenaar [Wed, 15 Jan 2014 03:57:41 +0000 (03:57 +0000)]
In the nested TLB fault handler, for a direct-mapped address, make
sure to clear the lower 12 bits. We're adding the translation
attributes to the physical address and non-zero bits in the first
12 bits would give us something unexpected, including invalid bit
values. Those trigger nested general protection faults.
We do not have to clear the region bits, because they are ignored
anyway, so we can replace an existing dep instruction with the one
we need.
This fixes GP faults for the swapper thread, as it's the only thread
that has a direct-mapped stack. Since the bug is in the nested TLB
fault handler, the frequency of hitting the GP is in the order of
hours/days under load.
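The masking step can be illustrated in C, although the real handler is
written in assembly. The function name and the attribute value used in
the note below are made up for this sketch; only the "clear the lower
12 bits before combining" step comes from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* The low 12 bits of the entry are reserved for translation attributes. */
#define ATTR_MASK UINT64_C(0xfff)

/*
 * Combine a physical address with translation attributes. Clearing the
 * low 12 bits first guarantees that stray offset bits in the address
 * cannot alter the attribute field.
 */
uint64_t make_translation(uint64_t pa, uint64_t attrs)
{
    assert((attrs & ~ATTR_MASK) == 0);
    return (pa & ~ATTR_MASK) | attrs;
}
```

With pa = 0x12345678 and a hypothetical attribute value of 0x661, the
result is 0x12345661. Adding the attributes without masking would give
0x12345cd9, whose low 12 bits no longer match the intended attributes.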
Jilles Tjoelker [Tue, 14 Jan 2014 22:05:33 +0000 (22:05 +0000)]
libc/resolv: Use poll() instead of kqueue().
The resolver in libc creates a kqueue for watching a single file descriptor.
This can be done using poll() which should be lighter on the kernel and
reduce possible problems with rlimits (file descriptors, kqueues).
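Waiting on a single descriptor with poll() needs no extra kernel object.
The helper below is a minimal sketch and not the resolver's code;
read_with_timeout() is a hypothetical name, and using ETIMEDOUT to report
a timeout is a choice made for this example.

```c
#include <sys/types.h>
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for fd to become readable, then read from it.
 * Returns the byte count from read(), or -1 with errno set (ETIMEDOUT
 * on timeout). For simplicity, an interrupted poll() restarts with the
 * full timeout.
 */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int n;

    assert(fd >= 0 && buf != NULL);
    do {
        n = poll(&pfd, 1, timeout_ms);
    } while (n < 0 && errno == EINTR);
    if (n < 0)
        return -1;
    if (n == 0) {
        errno = ETIMEDOUT;
        return -1;
    }
    return read(fd, buf, len);
}
```

Unlike a kqueue, the pollfd structure lives on the caller's stack, so
nothing has to be created or closed and no descriptor is consumed.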
Julio Merino [Tue, 14 Jan 2014 18:45:32 +0000 (18:45 +0000)]
Replace hand-crafted Kyuafiles with automatic generation.
Redo r260506 by using the new TEST_METADATA functionality of bsd.test.mk
to mark the sh(1) and test(1) tests as not supporting root. This is to
get rid of hand-crafted Kyuafiles for these very simple cases.
Julio Merino [Tue, 14 Jan 2014 18:39:30 +0000 (18:39 +0000)]
Use TAP_TESTS_PERL to register the legacy_test in bin/pax.
Redo r260586 by using the new functionality in tap.test.mk to transparently
support perl-based test programs.
As a side-effect, we get rid of an explicit path to /usr/bin/perl by
replacing it with /usr/local/bin/perl (or as defined in tap.test.mk).
This also fixes the name of the legacy_test source file because this should
have always been legacy_test.pl and not legacy_test.sh. My mistake when
originally moving the code around without realizing that this was a perl
script.
Julio Merino [Tue, 14 Jan 2014 18:35:56 +0000 (18:35 +0000)]
Support perl-based TAP-compliant test programs.
Introduce a TAP_TESTS_PERL primitive to list test programs written in perl.
Only do this in tap.test.mk because I only expect perl-based test programs
with this interface.
This is very similar to TAP_TESTS_SH but the difference is that we record
in the Kyuafile that the test program requires a perl interpreter. This
in turn makes Kyua mark the test as skipped if the perl package is not yet
installed, instead of mysteriously failing to run the program.
Julio Merino [Tue, 14 Jan 2014 18:32:47 +0000 (18:32 +0000)]
Support defining test program metadata from the Makefiles.
Introduce a new, per-test-program TEST_METADATA.<program> variable that
contains a list of key/value pairs describing metadata properties for
that test program. These properties are later written into the
auto-generated Kyuafile when using the KYUAFILE=auto functionality.
This is to avoid having to supply hand-crafted Kyuafiles when the needs
for metadata overrides are trivial.
While doing this, and because I am documenting TEST_METADATA, take the
chance to document the TEST_INTERFACE setting as well.
Don't output any modifier keys before we see a valid
non-modifier key press. This prevents so-called "ghost
keyboards" keeping modifier keys pressed while not
actually seen as a real keyboard.
Neel Natu [Tue, 14 Jan 2014 01:55:58 +0000 (01:55 +0000)]
Add an API to rendezvous all active vcpus in a virtual machine. The rendezvous
can be initiated in the context of a vcpu thread or from the bhyve(8) control
process.
The first use of this functionality is to update the vlapic trigger-mode
register when the IOAPIC pin configuration is changed.
Prior to this change we would update the TMR in the virtual-APIC page at
the time of interrupt delivery. But this doesn't work with Posted Interrupts
because there is no way to program the EOI_exit_bitmap[] in the VMCS of
the target at the time of interrupt delivery.
Marcel Moolenaar [Mon, 13 Jan 2014 19:08:25 +0000 (19:08 +0000)]
When building a cross-kgdb, suppress the registration of the
standard core target by declaring coreops_suppress_target with
an initializer. This is also happening for non-cross kgdb, by
virtue of having fbsd-threads.c in libgdb and having it do the
exact same thing. Since fbsd-threads.c is not included in
libgdb when building a cross debugger, we ended up with more
than one core file target (the standard gdb core file target and
kgdb's libkvm based core file target) and this behaves the same
as not having a core target at all.
Marcel Moolenaar [Mon, 13 Jan 2014 19:01:14 +0000 (19:01 +0000)]
Re-apply the part of r260022 that was reverted by r260030 with
one significant difference: for LIB32 builds both TARGET_ARCH
and MACHINE_ARCH are defined. TARGET_ARCH confusingly holds the
architecture of the host (e.g. amd64), while MACHINE_ARCH holds
the architecture we're trying to build (e.g. i386). With both
set and different, r260022 changed the behaviour to interpret
the condition as building a cross-amd64 libkvm on i386, when
obviously we're trying to build an i386 version on amd64. When
COMPAT_32BIT is defined, we're building LIB32 and ignore the
value of TARGET_ARCH as we did before.
Implement better error recovery for Transaction Translators, TTs,
found in High Speed USB HUBs which translate from High Speed USB into
FULL or LOW speed USB. In some rare cases SPLIT transactions might get
lost, which might leave the TT in an unknown state. Whenever we detect
such an error try to issue either a clear TT buffer request, or if
that is not possible reset the whole TT.
Julio Merino [Mon, 13 Jan 2014 10:47:26 +0000 (10:47 +0000)]
Prevent misc_helpers from running as a test.
Do this by generating misc_helpers explicitly, without using the
ATF_TESTS_SH functionality.
While this script is technically an atf-sh test program, it is not intended
to be run as a test and therefore it mustn't end up in the Kyuafile. Using
ATF_TESTS_SH means that misc_helpers ended up registered in the Kyuafile
and then failed to run as a test.
The alternative would be to supply an explicit Kyuafile from this directory
that lists the known test files, but doing it the way described above will
be easier to maintain.
Jilles Tjoelker [Sun, 12 Jan 2014 20:30:55 +0000 (20:30 +0000)]
fts: Stat things relative to the directory fd, if possible.
As a result, the kernel needs to process shorter pathnames if fts is not
changing directories (that is, if fts follows symlinks (the -L option to
utilities), fts cannot open ".", or FTS_NOCHDIR was specified).
Side effect: If pathnames exceed PATH_MAX, [ENAMETOOLONG] is not hit at the
stat stage but later (opendir or application fts_accpath) or not at all.
Alan Cox [Sun, 12 Jan 2014 19:04:20 +0000 (19:04 +0000)]
Correctly update the count of stuck pages, "addl_page_shortage", in
vm_pageout_scan(). There were missing increments in two less common cases.
Don't conflate the count of stuck pages and the pageout deficit provided by
vm_page_alloc{,_contig}(). (A proposed fix to the OOM code depends on this.)
Handle held pages consistently in the inactive queue scan. In the more
common case, we did not move the page to the tail of the queue. Whereas, in
the less common case, we did. There's no particular reason to move the page
in the less common case, so remove it.
Perform the calculation of the page shortage for the active queue scan a
little earlier, before the active queue lock is acquired. The correctness
of this calculation doesn't depend on the active queue lock being held.
Eliminate a redundant variable, "pcount". Use the more descriptive
variable, "maxscan", in its place.
Apply a few nearby style fixes, e.g., eliminate stray whitespace and excess
parentheses.
Gavin Atkinson [Sat, 11 Jan 2014 22:41:10 +0000 (22:41 +0000)]
Remove spaces from boot messages when we print the CPU ID/Family/Stepping
to match the rest of the CPU identification lines, and once again fit
into 80 columns in the usual case.
Jilles Tjoelker [Sat, 11 Jan 2014 21:12:27 +0000 (21:12 +0000)]
find: Allow -type d without statting everything.
fts(3) detects directories even in FTS_NOSTAT mode (so it can descend into
them).
No functional change is intended, but find commands that use -type d but no
primaries that still require stat/lstat calls make considerably fewer system
calls.