marcel [Mon, 20 Jan 2014 18:37:35 +0000 (18:37 +0000)]
In pmap_set_pte(), make sure to enforce ordering by inserting a memory
fence. Under system load, the CPU has been found to change the order
by which the stores are made visible. When the tag is made visible
before the other TLB values, other CPUs may use the invalid TLB values
and do bad things.
While here (i.e. not a fix) don't return errors from pmap_remove_vhpt()
to callers of pmap_remove_pte(). Those callers don't check the return
value and as such don't do what is needed to keep a consistent state.
More importantly, pmap_remove_vhpt() can't really have an error without
it indicating something unintended. Using KASSERT is therefore better.
nwhitehorn [Mon, 20 Jan 2014 18:15:06 +0000 (18:15 +0000)]
Add a new flag to /etc/ttys: onifconsole. This is equivalent to "on" if the
device is an active kernel console and "off" otherwise. This is designed to
allow serial-booting x86 systems to provide a login prompt on the serial line
by default without providing one on all systems by default.
jhb [Mon, 20 Jan 2014 15:51:02 +0000 (15:51 +0000)]
- Allow PCI devices that are attached to a driver to be identified by their
device name instead of just the selector.
- Accept an optional device argument to -l to restrict the output to only
listing details about a single device. This is mostly useful in
conjunction with other flags like -e or -c to allow a user to query
details about a single device.
dteske [Mon, 20 Jan 2014 03:39:08 +0000 (03:39 +0000)]
Dummy commit (whitespace changes and style nits) to show previous commit
(SVN r260866) was [in-part] Submitted-by: Christoph Mallon ...
<christoph.mallon@gmx.de>
dteske [Mon, 20 Jan 2014 03:31:16 +0000 (03:31 +0000)]
Dummy commit (s/__num/__number/) in f_expand_number() to describe that the
previous commit here (SVN r260894) was [in-part] from Submitted-by:
Christoph Mallon <christoph.mallon@gmx.de>
kaiw [Mon, 20 Jan 2014 01:35:14 +0000 (01:35 +0000)]
Clang 3.4 will sometimes emit DIE for struct/union member before
emitting the DIE for the type of that member. ctfconvert can not
handle this properly and will calculate a wrong member bit offset.
Same struct/union type from different .o file will be treated as
different types when their member bit offsets are different, and
gets added/merged multiple times. This will in turn cause many other
structs/pointers/typedefs that refer to the duplicated struct/union
gets added/merged multiple times and eventually causes numerous
duplicated CTF types in the kernel.debug file.
The simple workaround here is to make use of DW_AT_byte_size attribute
of the member DIE to calculate the bits occupied by the member's type,
without actually resolving the type.
imp [Sun, 19 Jan 2014 19:39:13 +0000 (19:39 +0000)]
Introduce grab and ungrab upcalls. When the kernel desires to grab the
console, it calls the grab functions. These functions should turn off
the RX interrupts, and any others that interfere. This makes mountroot
prompt work again. If there's more generalized need other than
prompting, many of these routines should be expanded to do those new
things.
Should have been part of r260889, but waasn't due to command line typo.
imp [Sun, 19 Jan 2014 19:36:11 +0000 (19:36 +0000)]
Introduce grab and ungrab upcalls. When the kernel desires to grab the
console, it calls the grab functions. These functions should turn off
the RX interrupts, and any others that interfere. This makes mountroot
prompt work again. If there's more generalized need other than
prompting, many of these routines should be expanded to do those new
things.
melifaro [Sun, 19 Jan 2014 16:07:27 +0000 (16:07 +0000)]
Further rework netinet6 address handling code:
* Set ia address/mask values BEFORE attaching to address lists.
Inet6 address assignment is not atomic, so the simplest way to
do this atomically is to fill in ia before attach.
* Validate irfa->ia_addr field before use (we permit ANY sockaddr in old code).
* Do some renamings:
in6_ifinit -> in6_notify_ifa (interaction with other subsystems is here)
in6_setup_ifa -> in6_broadcast_ifa (LLE/Multicast/DaD code)
in6_ifaddloop -> nd6_add_ifa_lle
in6_ifremloop -> nd6_rem_ifa_lle
* Split working with LLE and route announce code for last two.
Add temporary in6_newaddrmsg() function to mimic current rtsock behaviour.
* Call device SIOCSIFADDR handler IFF we're adding first address.
In IPv4 we have to call it on every address change since ARP record
is installed by arp_ifinit() which is called by given handler.
IPv6 stack, on the opposite is responsible to call nd6_add_ifa_lle() so
there is no reason to call SIOCSIFADDR often.
kaiw [Sun, 19 Jan 2014 13:48:02 +0000 (13:48 +0000)]
* Make die_mem_offset() be able to handle DW_AT_data_member_location
attributes generated by Clang 3.4.
* Document how different compilers generate DW_AT_data_member_location
attributes differently.
* Document the quirks about DW_FORM_data[48].
kaiw [Sun, 19 Jan 2014 13:42:49 +0000 (13:42 +0000)]
* Allow API dwarf_loclist_n() and dwarf_loclist() to be called with
attributes that have form DW_FORM_sec_offset.
* If the .debug_info section conforms to DWARF4, do not allow the value
of attributes with form DW_FORM_data[48] to be used as section
offset.
marcel [Sun, 19 Jan 2014 04:45:52 +0000 (04:45 +0000)]
Enable vt. This brings VGA-based console and terminal support to
ia64 for the very first time. Only 9 years in the making...
Note that the vt/vga driver does not actually make sure there's
VGA hardware at the standard/legacy VGA I/O port and memory I/O
addresses. This can cause machine checks if the H/W does not have
a VGA controller.
marcel [Sun, 19 Jan 2014 00:38:18 +0000 (00:38 +0000)]
Revision 258428 changed gcc by virtue of having _bswapsi2 _bswapdi2 in
libgcc, but this was not propagated to this file. Revision 260844 added
them here for ia64 unbeknownst revision 258428. Fix it for all...
adrian [Sat, 18 Jan 2014 23:48:20 +0000 (23:48 +0000)]
If the flowid is available for the mbuf that finalised the creation
of a syncache connection, copy it into the inp_flowid field.
Without this, an incoming TCP connection won't have an inp_flowid marked
until some data comes in, and this means that things like the per-CPU
TCP timer option will choose a different CPU for the timer work.
(It also means that if one grabbed the flowid via an ioctl from userland,
it won't be available until some data has been received.)
melifaro [Sat, 18 Jan 2014 23:24:51 +0000 (23:24 +0000)]
Simplify filling sockaddr_dl structure for if_resolvemulti()
callback providers. link_init_sdl() function can be used to
fill most of the parameters. Use caller stack instead of
allocation / freing memory for each request. Do not drop support
for extra-long (probably non-existing) link-layer protocols by
introducing link_alloc_sdl() (used by if_resolvemulti() callback)
and link_free_sdl() (used by caller).
Since this change breaks KBI, MFC requires slightly different approach
(link_init_sdl() auto-allocating buffer if necessary to handle cases
with unmodified if_resolvemulti() callers).
dteske [Sat, 18 Jan 2014 22:33:49 +0000 (22:33 +0000)]
Fix a bad comparison operator (s/==/=/), and address a use-case issue where-
in the one-line comment associated with the dumpdev setting was not present
for the case where the user deselects the dumpdev service (restoring pre-
r256348 behaviour.
neel [Sat, 18 Jan 2014 21:47:12 +0000 (21:47 +0000)]
Some processor's don't allow NMI injection if the STI_BLOCKING bit is set in
the Guest Interruptibility-state field. However, there isn't any way to
figure out which processors have this requirement.
So, inject a pending NMI only if NMI_BLOCKING, MOVSS_BLOCKING, STI_BLOCKING
are all clear. If any of these bits are set then enable "NMI window exiting"
and inject the NMI in the VM-exit handler.
alc [Sat, 18 Jan 2014 20:02:59 +0000 (20:02 +0000)]
Style changes in vm_pageout_scan():
1. Be consistent in the style of "act_delta" manipulations between the
inactive and active queue scans.
2. Explicitly compare to zero.
3. The deactivation of a page is based is based on its recent history
and not just the current call to vm_pageout_scan(). The variable
"act_delta" represents the current state of the page, and not its
history. Avoid possible confusion by not (ab)using "act_delta" for
the making the deactivation decision.
kaiw [Sat, 18 Jan 2014 17:59:22 +0000 (17:59 +0000)]
API dwarf_attrval_flag() should properly handle an attribute with
(DWARF4) form DW_FORM_flag_present which implicitly indicates the
presence of the attribute. Manual page is updated to reflect this
change.
Note that this was previously fixed in the old libdwarf.
marcel [Sat, 18 Jan 2014 04:09:39 +0000 (04:09 +0000)]
For ia64, add _bswapsi2 & _bswapdi2. The audio/flac port uses the
bswap32 builtin and the compiler emits a call to the libgcc function
rather than generating inline code.
neel [Sat, 18 Jan 2014 02:20:10 +0000 (02:20 +0000)]
If the guest exits due to a fault while it is executing IRET then restore
the state of "Virtual NMI blocking" in the guest's interruptibility-state
field before resuming the guest.
pfg [Fri, 17 Jan 2014 21:21:28 +0000 (21:21 +0000)]
gcc: Drop useless objc change from r260311.
Among some of the objc changes from Apple that crept into r260311,
Radar 5355344 is incomplete and is not used since we don't carry
ObjC in the base system.
The dead code seems to have caused issues in some Tinderboxes so
get rid of it altogether.
hselasky [Fri, 17 Jan 2014 10:35:18 +0000 (10:35 +0000)]
Fix a possible memory use after free and leak situation associated
with USB device detach when using character device handles. This also
includes LibUSB. It turns out that "usb_close()" cannot always get a
reference to clean up its USB transfers and such, if called during the
kernel USB device detach.
kaiw [Fri, 17 Jan 2014 08:44:12 +0000 (08:44 +0000)]
We should not set the unnamed DIE's name to "__anon__" since that will
bring back a known issue with DTrace regarding type name
comparison. Instead, we can set the name to an empty string.
adrian [Fri, 17 Jan 2014 05:26:55 +0000 (05:26 +0000)]
Implement a kqueue notification path for sendfile.
This fires off a kqueue note (of type sendfile) to the configured kqfd
when the sendfile transaction has completed and the relevant memory
backing the transaction is no longer in use by this transaction.
This is analogous to SF_SYNC waiting for the mbufs to complete -
except now you don't have to wait.
Both SF_SYNC and SF_KQUEUE should work together, even if it
doesn't necessarily make any practical sense.
This is designed for use by applications which use backing cache/store
files (eg Varnish) or POSIX shared memory (not sure anything is using
it yet!) to know when a region of memory is free for re-use. Note
it doesn't mark the region as free overall - only free from this
transaction. The application developer still needs to track which
ranges are in the process of being recycled and wait until all
pending transactions are completed.
adrian [Fri, 17 Jan 2014 05:13:08 +0000 (05:13 +0000)]
Implement the extension api for sendfile to allow for kqueue notifications.
This is still under a bit of flux, as the final API hasn't been nailed
down. It's also unclear whether we should define the two new types in the
header or not - it may allow bad code to compile that shouldn't (ie,
since uintX's are defined, the developer may not include sys/types.h.)
Reviewed by: peter, imp, bde
Sponsored by: Netflix, Inc.
neel [Fri, 17 Jan 2014 04:21:39 +0000 (04:21 +0000)]
If a VM-exit happens during an NMI injection then clear the "NMI Blocking" bit
in the Guest Interruptibility-state VMCS field.
If we fail to do this then a subsequent VM-entry will fail because it is an
error to inject an NMI into the guest while "NMI Blocking" is turned on. This
is described in "Checks on Guest Non-Register State" in the Intel SDM.
Submitted by: David Reed (david.reed@tidalscale.com)
csjp [Fri, 17 Jan 2014 03:30:24 +0000 (03:30 +0000)]
fix a regression introduced in r237618 that would result in
killall confusing killall -INT with killall -I (interactive
confirmation) which resulted in the wrong signal (TERM)
being delivered to the process(s).
kaiw [Thu, 16 Jan 2014 22:28:33 +0000 (22:28 +0000)]
If function die_name() finds a DIE without a name, set its name to
"__anon__". This hack is used to workaround a issue that compilers
like GCC could generate DW_TAG_base_type DIE without a name.
Note that we didn't need this before because the old libdwarf
internally set all the unnamed DIE's name to "__anon__".
kaiw [Thu, 16 Jan 2014 21:47:27 +0000 (21:47 +0000)]
Use FreeBSD's ELF headers instead of the elfdefinitions.h header which
comes with elftoolchain. This version of libelf doesn't need to be
portable; using FreeBSD's own ELF headers will avoid conflicts and
make integration easier.
gnn [Thu, 16 Jan 2014 21:46:43 +0000 (21:46 +0000)]
Add a command line argument to turn off blocking waiting for the user
to press Ctrl-C (-b). This allows tests with tight loops of mcgrabs
that can stress the multicast tables.
gjb [Thu, 16 Jan 2014 16:12:09 +0000 (16:12 +0000)]
Update the pkg-stage target to be more compatible with pkg-1.2:
- Add a release-dvd.conf pkg(8) configuration file to override
the default FreeBSD.conf configuration.
- Remove architecture-specific pkg-stage.conf files, consolidate,
and move their contents to scripts/pkg-stage.sh.
- Use 'pkg -vv' to determine the ABI, which is used as the
cache directory.
Prior to these changes, it would be possible for pkg-stage to fetch
conflicting binary packages from multiple repositories.
Tested against: head@r260522, stable/10@r260522
MFC after: 3 days
X-Insta-MFC: possibly
Sponsored by: The FreeBSD Foundation
avg [Thu, 16 Jan 2014 13:24:10 +0000 (13:24 +0000)]
fix a bug in ZFS mirror code for handling multiple DVAa
The bug was introduced in r256956 "Improve ZFS N-way mirror read
performance".
The code in vdev_mirror_dva_select erroneously considers already
tried DVAs for the next attempt. Thus, it is possible that a failing DVA
would be retried forever.
As a secondary effect, if the attempts fail with checksum error, then
checksum error reports are accumulated until the original request
ultimately fails or succeeds. But because retrying is going on indefinitely
the cheksum reports accumulation will effectively be a memory leak.
Reviewed by: gibbs
MFC after: 13 days
Sponsored by: HybridCluster
avg [Thu, 16 Jan 2014 12:31:27 +0000 (12:31 +0000)]
zfs_deleteextattr: name buffer from namei is needed by zfs_rename
If we prematurely free the name buffer and it gets quickly recycled,
then zfs_rename may see data from another lookup or even unmapped memory
via cn_nameptr.
avg [Thu, 16 Jan 2014 12:26:54 +0000 (12:26 +0000)]
fix a bug in ZFS mirror code for handling multiple DVAa
The bug was introduced in r256956 "Improve ZFS N-way mirror read
performance".
The code in vdev_mirror_dva_select erroneously considers already
tried DVAs for the next attempt. Thus, it is possible that a failing DVA
would be retried forever.
As a secondary effect, if the attempts fail with checksum error, then
checksum error reports are accumulated until the original request
ultimately fails or succeeds. But because retrying is going on indefinitely
the cheksum reports accumulation will effectively be a memory leak.
Reviewed by: gibbs
MFC after: 13 days
Sponsored by: HybridCluster
avg [Thu, 16 Jan 2014 12:22:46 +0000 (12:22 +0000)]
zfs: getnewvnode_reserve must be called outside of a zfs transaction
Otherwise we could run into the following deadlock.
A thread has a transaction open and assigned to a transaction group.
That would prevent the transaction group from be quiesced and synced.
The thread is blocked in getnewvnode_reserve waiting for a vnode to
a be reclaimed. vnlru thread is blocked trying to enter ZFS VOP because
a filesystem is suspended by an ongoing rollback or receive operation.
In its turn the operation is waiting for the current transaction group
to be synced.
zfs_zget is always used outside of active transactions, but zfs_mknode
is always used in a transaction context. Thus, we hoist
getnewvnode_reserve from zfs_mknode to its callers.
While there, assert that ZFS always calls getnewvnode while having
a vnode reserved.
Reported by: adrian
Tested by: adrian
MFC after: 17 days
Sponsored by: HybridCluster
melifaro [Thu, 16 Jan 2014 11:50:00 +0000 (11:50 +0000)]
Fix ipfw fwd for IPv4 traffic broken by r249894.
Problem case:
Original lookup returns route with GW set, so gw points to
rte->rt_gateway.
After that we're changing dst and performing lookup another time.
Since fwd host is most probably directly reachable, resulting
rte does not contain rt_gateway, so gw is not set. Finally, we
end with packet transmitted to proper interface but wrong
link-layer address.