neel [Mon, 19 Jan 2015 06:53:31 +0000 (06:53 +0000)]
MOVS instruction emulation.
These instructions are emitted by 'bus_space_read_region()' when accessing
MMIO regions.
Since MOVS can be used with a repeat prefix start decoding the REPZ and
REPNZ prefixes. Also start decoding the segment override prefix since MOVS
allows overriding the source operand segment register.
neel [Mon, 19 Jan 2015 06:51:04 +0000 (06:51 +0000)]
Fix a bug in libvmmapi 'vm_copy_setup()' where it would return success even if
the 'gpa' was in the guest MMIO region. This would manifest as a segmentation
fault in 'vm_map_copyin()' or 'vm_map_copyout()' because 'vm_map_gpa()' would
return NULL for this 'gpa'.
Fix this by calling 'vm_map_gpa()' in 'vm_copy_setup' and returning a failure
if the 'gpa' cannot be mapped. This matches the behavior of 'vm_copy_setup()'
in vmm.ko.
nwhitehorn [Mon, 19 Jan 2015 05:14:07 +0000 (05:14 +0000)]
Provide a tunable (machdep.moea64_bpvo_pool_size) to set the bootstrap
PVO pool size. The default errs on the exceedingly large side, so absent
any intelligent automatic tuning, at least let the user set it to save
RAM on memory-constrained systems.
ian [Mon, 19 Jan 2015 04:56:17 +0000 (04:56 +0000)]
For armv6 builds, add -mfloat-abi=softfp. This tells the compiler it can
use floating point hardware instructions (because all armv6/7 systems we
support have fp hardware), but it passes args using a soft-float compatible
ABI. This should give noticible performance improvement (but not as much
as using the armv6hf arch).
rstone [Mon, 19 Jan 2015 00:33:32 +0000 (00:33 +0000)]
When mountd is creating sockets, it iterates over all addresses specified
in the "hosts" array and eventually looks up the network address with
getaddrinfo(). At one point it checks for a numeric address and if it
sees one, it sets a hint parameter to force getaddrinfo to interpret the
host as a numeric address. However that hint is not cleared for subsequent
iterations of the loop and if any hosts seen after this point are host names,
getaddrinfo will fail on the name. The result of this bug is that you cannot
pass a host name to the -h flag.
Unfortunately, the first iteration will either process ::1 or 127.0.0.1,
so the flag is set on the first iteration and all host names will fail
to be processed.
The same bug applies to rpc.lockd and rpc.statd, so fix them too.
Differential Revision: https://reviews.freebsd.org/D1507
Reported by: Dylan Martin
MFC after: 1 week
Sponsored by: Sandvine Inc.
smh [Sun, 18 Jan 2015 23:15:49 +0000 (23:15 +0000)]
Clean ZFS spa config before syncing
A number of entries that can be present in the spa config shouldn't be saved
to disk so add a method to ensure this is case. Without this if the last
caller to vdev_config_generate requested stats then we can end up in the
cache file.
Also only skip a none writable pool in the cache file generation if its
active. This prevents unavailable pools incorrectly getting removed from
cache file.
ian [Sun, 18 Jan 2015 20:47:21 +0000 (20:47 +0000)]
Save the command-and-flags value into the shadow register when it is written.
This doesn't actually change any behavior, because it just allows a 16-bit
read of the command register to return the correct value, and nothing
actually does a 16-bit read of that register.
adrian [Sun, 18 Jan 2015 18:06:40 +0000 (18:06 +0000)]
Refactor / restructure the RSS code into generic, IPv4 and IPv6 specific
bits.
The motivation here is to eventually teach netisr and potentially
other networking subsystems a bit more about how RSS work queues / buckets
are configured so things have a hope of auto-configuring in the future.
* net/rss_config.[ch] takes care of the generic bits for doing
configuration, hash function selection, etc;
* topelitz.[ch] is now in net/ rather than netinet/;
* (and would be in libkern if it didn't directly include RSS_KEYSIZE;
that's a later thing to fix up.)
* netinet/in_rss.[ch] now just contains the IPv4 specific methods;
* and netinet/in6_rss.[ch] now just contains the IPv6 specific methods.
This should have no functional impact on anyone currently using
the RSS support.
kib [Sun, 18 Jan 2015 15:13:11 +0000 (15:13 +0000)]
Add procctl(2) PROC_TRACE_CTL command to enable or disable debugger
attachment to the process. Note that the command is not intended to
be a security measure, rather it is an obfuscation feature,
implemented for parity with other operating systems.
Discussed with: jilles, rwatson
Man page fixes by: rwatson
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
dim [Sun, 18 Jan 2015 14:14:47 +0000 (14:14 +0000)]
Upgrade our copy of clang and llvm to 3.5.1 release. This is a bugfix
only release, no new features have been added.
Please note that this version requires C++11 support to build; see
UPDATING for more information.
Release notes for llvm and clang can be found here:
<http://llvm.org/releases/3.5.1/docs/ReleaseNotes.html>
<http://llvm.org/releases/3.5.1/tools/clang/docs/ReleaseNotes.html>
hselasky [Sun, 18 Jan 2015 14:04:55 +0000 (14:04 +0000)]
Make the linuxapi module only build when WITH_OFED=YES is specified.
There needs to be some more testing done before it is ready for all
platforms and architectures.
cperciva [Sun, 18 Jan 2015 12:45:26 +0000 (12:45 +0000)]
When disabling C3+ CPU states due to the CPU_QUIRK_NO_C3 quirk, don't
accidentally enable non-existent states.
This bug was triggered if ACPI advertises the presence of a C2 state
which we fail to parse via acpi_PkgGas due to our lack of support for
FFixedHW resources, and causes an immediate panic when an attempt is
made to enter the (NULL) state.
One affected platform is the EC2 c4.8xlarge VM instance type; there
may be others.
hselasky [Sun, 18 Jan 2015 10:53:48 +0000 (10:53 +0000)]
Extend fixes made in r277308 to fix build of LINT kernels for i386 and
amd64. Until further we need some custom C-flags when building the
Linux compat API.
neel [Sun, 18 Jan 2015 03:08:30 +0000 (03:08 +0000)]
Simplify instruction restart logic in bhyve.
Keep track of the next instruction to be executed by the vcpu as 'nextrip'.
As a result the VM_RUN ioctl no longer takes the %rip where a vcpu should
start execution.
Also, instruction restart happens implicitly via 'vm_inject_exception()' or
explicitly via 'vm_restart_instruction()'. The APIs behave identically in
both kernel and userspace contexts. The main beneficiary is the instruction
emulation code that executes in both contexts.
bhyve(8) VM exit handlers now treat 'vmexit->rip' and 'vmexit->inst_length'
as readonly:
- Restarting an instruction is now done by calling 'vm_restart_instruction()'
as opposed to setting 'vmexit->inst_length' to 0 (e.g. emulate_inout())
- Resuming vcpu at an arbitrary %rip is now done by setting VM_REG_GUEST_RIP
as opposed to changing 'vmexit->rip' (e.g. vmexit_task_switch())
bz [Sun, 18 Jan 2015 01:28:08 +0000 (01:28 +0000)]
There are still kernel configs and mk files depending on the OFED option.
This will need a proper cleanup and in the meantime after r277302 unbreak
LINT builds.
ian [Sat, 17 Jan 2015 19:57:03 +0000 (19:57 +0000)]
Add a new SDHCI quirk, SDHCI_QUIRK_DONT_SET_HISPD_BIT. Apparently some
sdhci controllers, such as the one on a Raspberry Pi, mishandle the signal
timing in high speed signaling mode, but run just fine in standard mode
with the bus running at frequencies between 25-50MHz (which shouldn't work).
This is the solution adopted by U-Boot and other OSes (linux and *BSD)
for the timeouts on Raspberry Pi boards with certain SD cards. Some
research shows that this quirk is also used on a few other boards, so the
fix is a generic quirk instead of being in the RPi-specific driver code.
This change is based on information discovered by Michal Meloun.
ian [Sat, 17 Jan 2015 18:40:46 +0000 (18:40 +0000)]
Minor cleanups, comment changes. No need to load 3 values when setting up
the stack for secondary cores, the other two values are only used for zeroing
bss on the primary core. No need to store the size of the stack at the
top of the stack (seems to be a leftover instruction from some cut-n-paste).
hselasky [Sat, 17 Jan 2015 16:36:39 +0000 (16:36 +0000)]
Start importing the basic OFED linux compatibility layer changes made
by dumbbell@ to be able to compile this layer as a dependency module.
Clean up some Makefiles and remove the no longer used OFED define.
Currently only i386 and amd64 targets are supported.
smh [Sat, 17 Jan 2015 14:44:59 +0000 (14:44 +0000)]
Mechanically convert cddl sun #ifdef's to illumos
Since the upstream for cddl code is now illumos not sun, mechanically
convert all sun #ifdef's to illumos #ifdef's which have been used in all
newer code for some time.
Also do a manual pass to correct the use if #ifdef comments as per style(9)
as well as few uses of #if defined(__FreeBSD__) vs #ifndef illumos.
br [Sat, 17 Jan 2015 12:31:26 +0000 (12:31 +0000)]
o Notify USB host about connection when operating in device mode.
Required when communicating to Mac OS X USB host stack.
o Also don't set stall bit to TX pipe in device mode as seems Mac OS X
don't clears it as it should.
adrian [Sat, 17 Jan 2015 06:43:30 +0000 (06:43 +0000)]
Override the bt enable/disable methods for AR9462 (jupiter) and
AR9565 (Aphrodite.) These need to use the MCI routines, not
the legacy 2-wire / 3-wire bluetooth coexistence methods.
imp [Sat, 17 Jan 2015 02:17:55 +0000 (02:17 +0000)]
The sn driver isn't UCODE sourceless. While it is true there's an
binary FPGA image that's in an include file in this directory, that
include file isn't actually used. It is only for certain Trump Cards
that we don't yet support. When support was anticipated for them, we
got permission to include the required FPGA image in our sources under
the BSDL, but didn't start actually including the file. This was done
to provide a public paper trail for this file.
ngie [Sat, 17 Jan 2015 00:58:24 +0000 (00:58 +0000)]
Fix lib/libthr/tests/detach_test
- Eliminate race with liberal use of sleep(3) [1]
- Fix NetBSD-specific implementation way of testing result from pthread_cancel
by testing with `td` instead of `NULL` [2]
adrian [Sat, 17 Jan 2015 00:02:18 +0000 (00:02 +0000)]
Until there's a full MCI implementation - just implement a placeholder
MCI bluetooth coexistence method for WB222.
The rest of MCI requires a bunch more work, including adding a DMA buffer
for the MCI hardware to bounce messages in/out of and handling MCI
interrupts. But the more important part here is telling the HAL
the btcoex is enabled and MCI is in use so it configures the correct
initial bluetooth parameters in the wireless NIC and configures
things like bluetooth traffic weights and such.
So, this at least gets the HAL to do some of the right things in
configuring the inital bluetooth coexistence stuff, but doesn't
actually do full btcoex. That'll take.. some effort.
will [Fri, 16 Jan 2015 21:39:08 +0000 (21:39 +0000)]
Add a ${CP} alias for copying files in the build.
Some users build FreeBSD as non-root in Perforce workspaces. By default,
Perforce sets files read-only unless they're explicitly being edited.
As a result, the -f argument must be used to cp in order to override the
read-only flag when copying source files to object directories. Bare use of
'cp' should be avoided in the future.
emaste [Fri, 16 Jan 2015 18:59:15 +0000 (18:59 +0000)]
crunchide: Correct 64-bit section header offset
For 64-bit binaries the Elf_Ehdr e_shoff is at offset 40, not 44.
Instead of using an incorrect hardcoded offset, let the compiler
figure it out for us with offsetof().
Differential Revision: https://reviews.freebsd.org/D1543
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
nwhitehorn [Fri, 16 Jan 2015 18:47:20 +0000 (18:47 +0000)]
Add two fake properties ("fdtbootcpu" and "fdtmemreserv") to the device
tree's /chosen node to provide out-of-band header fields of the FDT. This
emulation is not perfect without corresponding changes to ofw_fdt_nextprop(),
but is enough to enable lookup by memory-map-parsing code.
alc [Fri, 16 Jan 2015 18:17:09 +0000 (18:17 +0000)]
Revamp the default page clustering strategy that is used by the page fault
handler. For roughly twenty years, the page fault handler has used the
same basic strategy: Fetch a fixed number of non-resident pages both ahead
and behind the virtual page that was faulted on. Over the years,
alternative strategies have been implemented for optimizing the handling
of random and sequential access patterns, but the only change to the
default strategy has been to increase the number of pages read ahead to 7
and behind to 8.
The problem with the default page clustering strategy becomes apparent
when you look at how it behaves on the code section of an executable or
shared library. (To simplify the following explanation, I'm going to
ignore the read that is performed to obtain the header and assume that no
pages are resident at the start of execution.) Suppose that we have a
code section consisting of 32 pages. Further, suppose that we access
pages 4, 28, and 16 in that order. Under the default page clustering
strategy, we page fault three times and perform three I/O operations,
because the first and second page faults only read a truncated cluster of
12 pages. In contrast, if we access pages 8, 24, and 16 in that order, we
only fault twice and perform two I/O operations, because the first and
second page faults read a full cluster of 16 pages. In general, truncated
clusters are more common than full clusters.
To address this problem, this revision changes the default page clustering
strategy to align the start of the cluster to a page offset within the vm
object that is a multiple of the cluster size. This results in many fewer
truncated clusters. Returning to our example, if we now access pages 4,
28, and 16 in that order, the cluster that is read to satisfy the page
fault on page 28 will now include page 16. So, the access to page 16 will
no longer page fault and perform an I/O operation.
Since the revised default page clustering strategy is typically reading
more pages at a time, we are likely to read a few more pages that are
never accessed. However, for the various programs that we looked at,
including clang, emacs, firefox, and openjdk, the reduction in the number
of page faults and I/O operations far outweighed the increase in the
number of pages that are never accessed. Moreover, the extra resident
pages allowed for many more superpage mappings. For example, if we look
at the execution of clang during a buildworld, the number of (hard) page
faults on the code section drops by 26%, the number of superpage mappings
increases by about 29,000, but the number of never accessed pages only
increases from 30.38% to 33.66%. Finally, this leads to a small but
measureable reduction in execution time.
mav [Fri, 16 Jan 2015 12:35:55 +0000 (12:35 +0000)]
Don't count status as sent until CTIO completes successfully.
If we aggregated status sending with data move and got error, allow status
to be updated and resent again separately. Without this command may stuck
without status sent at all.
smh [Fri, 16 Jan 2015 10:44:39 +0000 (10:44 +0000)]
Eliminate illumos whole disk special case when searching for a ZFS vdev
This special case prevented locating vdevs which start with c[0-9] e.g.
gptid/c6cde092-504b-11e4-ba52-c45444453598 hence it was impossible to
online a vdev via its path.
melifaro [Fri, 16 Jan 2015 10:09:28 +0000 (10:09 +0000)]
Eliminate SIOCGIFADDR handling in bpf.
Quoting 19 years bpf.4 manual from bpf-1.2a1:
"
(SIOCGIFADDR is obsolete under BSD systems. SIOCGIFCONF should be
used to query link-level addresses.)
"
* SIOCGIFADDR was not imported in NetBSD (bpf.c 1.36) and OpenBSD.
* Last bits (e.g. manpage claiming SIOCGIFADDR exists) was cleaned
from NetBSD via kern/21513 5 years ago,
from OpenBSD via documentation/6352 5 years ago.
kib [Fri, 16 Jan 2015 07:06:58 +0000 (07:06 +0000)]
For sigaction(2), ignore possible garbage in sa_flags for sa_handler
== SIG_DFL or SIG_IGN. Sloppy code does not fully initialize struct
sigaction for such cases, and being too demanding in the case of
default handler does not catch anything.
Reported and tested by: Alex Tutubalin <lexa@lexa.ru>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
imp [Fri, 16 Jan 2015 06:19:52 +0000 (06:19 +0000)]
Always enable I/O, memory and dma cycles. Some BIOSes don't enable
them, sometimes they are reset for power state transitions or during
whatever happens while suspended. Also, it is good practice to always
do this.
imp [Fri, 16 Jan 2015 06:19:08 +0000 (06:19 +0000)]
Back out the refactor. It turns out to cause interrupt storms on
resume sometimes (but not others). On powerup, other wierd issues show
up (sometimes the card comes up, but with really bogus pci config
space stuff. There may be more, but given my experience of historical
fussiness, stick to what works and make more minimal changes to that.
dim [Thu, 15 Jan 2015 21:17:36 +0000 (21:17 +0000)]
Import libc++ trunk r224926. This fixes a number of bugs, completes
C++14 support[1], adds more C++1z features[2], and fixes the following
LWG issues[3]:
1450: Contradiction in regex_constants
2003: String exception inconsistency in erase.
2075: Progress guarantees, lock-free property, and scheduling
assumptions
2104: unique_lock move-assignment should not be noexcept
2112: User-defined classes that cannot be derived from
2132: std::function ambiguity
2135: Unclear requirement for exceptions thrown in
condition_variable::wait()
2142: packaged_task::operator() synchronization too broad?
2182: Container::[const_]reference types are misleadingly specified
2186: Incomplete action on async/launch::deferred
2188: Reverse iterator does not fully support targets that overload
operator&
2193: Default constructors for standard library containers are explicit
2205: Problematic postconditions of regex_match and regex_search
2213: Return value of std::regex_replace
2240: Probable misuse of term "function scope" in [thread.condition]
2252: Strong guarantee on vector::push_back() still broken with C++11?
2257: Simplify container requirements with the new algorithms
2258: a.erase(q1, q2) unable to directly return q2
2263: Comparing iterators and allocator pointers with different
const-character
2268: Setting a default argument in the declaration of a member
function assign of std::basic_string
2271: regex_traits::lookup_classname specification unclear
2272: quoted should use char_traits::eq for character comparison
2278: User-defined literals for Standard Library types
2280: begin / end for arrays should be constexpr and noexcept
2285: make_reverse_iterator
2288: Inconsistent requirements for shared mutexes
2291: std::hash is vulnerable to collision DoS attack
2293: Wrong facet used by num_put::do_put
2299: Effects of inaccessible key_compare::is_transparent type are not
clear
2301: Why is std::tie not constexpr?
2304: Complexity of count in unordered associative containers
2306: match_results::reference should be value_type&, not const
value_type&
2308: Clarify container destructor requirements w.r.t. std::array
2313: tuple_size should always derive from integral_constant<size_t, N>
2314: apply() should return decltype(auto) and use decay_t before
tuple_size
2315: weak_ptr should be movable
2316: weak_ptr::lock() should be atomic
2317: The type property queries should be UnaryTypeTraits returning
size_t
2320: select_on_container_copy_construction() takes allocators, not
containers
2322: Associative(initializer_list, stuff) constructors are
underspecified
2323: vector::resize(n, t)'s specification should be simplified
2324: Insert iterator constructors should use addressof()
2329: regex_match()/regex_search() with match_results should forbid
temporary strings
2330: regex("meow", regex::icase) is technically forbidden but should
be permitted
2332: regex_iterator/regex_token_iterator should forbid temporary
regexes
2339: Wording issue in nth_element
2341: Inconsistency between basic_ostream::seekp(pos) and
basic_ostream::seekp(off, dir)
2344: quoted()'s interaction with padding is unclear
2346: integral_constant's member functions should be marked noexcept
2350: min, max, and minmax should be constexpr
2356: Stability of erasure in unordered associative containers
2357: Remaining "Assignable" requirement
2359: How does regex_constants::nosubs affect basic_regex::mark_count()?
2360: reverse_iterator::operator*() is unimplementable
royger [Thu, 15 Jan 2015 16:27:20 +0000 (16:27 +0000)]
loader: implement multiboot support for Xen Dom0
Implement a subset of the multiboot specification in order to boot Xen
and a FreeBSD Dom0 from the FreeBSD bootloader. This multiboot
implementation is tailored to boot Xen and FreeBSD Dom0, and it will
most surely fail to boot any other multiboot compilant kernel.
In order to detect and boot the Xen microkernel, two new file formats
are added to the bootloader, multiboot and multiboot_obj. Multiboot
support must be tested before regular ELF support, since Xen is a
multiboot kernel that also uses ELF. After a multiboot kernel is
detected, all the other loaded kernels/modules are parsed by the
multiboot_obj format.
The layout of the loaded objects in memory is the following; first the
Xen kernel is loaded as a 32bit ELF into memory (Xen will switch to
long mode by itself), after that the FreeBSD kernel is loaded as a RAW
file (Xen will parse and load it using it's internal ELF loader), and
finally the metadata and the modules are loaded using the native
FreeBSD way. After everything is loaded we jump into Xen's entry point
using a small trampoline. The order of the multiboot modules passed to
Xen is the following, the first module is the RAW FreeBSD kernel, and
the second module is the metadata and the FreeBSD modules.
Since Xen will relocate the memory position of the second
multiboot module (the one that contains the metadata and native
FreeBSD modules), we need to stash the original modulep address inside
of the metadata itself in order to recalculate its position once
booted. This also means the metadata must come before the loaded
modules, so after loading the FreeBSD kernel a portion of memory is
reserved in order to place the metadata before booting.
In order to tell the loader to boot Xen and then the FreeBSD kernel the
following has to be added to the /boot/loader.conf file:
The first argument contains the command line that will be passed to the Xen
kernel, while the second argument is the path to the Xen kernel itself. This
can also be done manually from the loader command line, by for example
typing the following set of commands:
OK unload
OK load /boot/xen dom0_mem=1024M dom0_max_vcpus=2 dom0pvh=1 console=com1,vga
OK load kernel
OK load zfs
OK load if_tap
OK load ...
OK boot