neel [Sat, 24 May 2014 23:12:30 +0000 (23:12 +0000)]
Add libvmmapi functions vm_copyin() and vm_copyout() to copy into and out
of the guest linear address space. These APIs in turn use a new ioctl
'VM_GLA2GPA' to convert the guest linear address to guest physical.
Use the new copyin/copyout APIs when emulating ins/outs instruction in
bhyve(8).
adrian [Sat, 24 May 2014 20:37:15 +0000 (20:37 +0000)]
Add a new taskqueue setup method that takes a cpuid to pin the
taskqueue worker thread(s) to.
For now it isn't a taskqueue/taskthread error to fail to pin
to the given cpuid.
Thanks to rpaulo@, kib@ and jhb@ for feedback.
Tested:
* igb(4), with local RSS patches to pin taskqueues.
TODO:
* ask the doc team for help in documenting the new API call.
* add a taskqueue_start_threads_cpuset() method which takes
a cpuset_t - but this may require a bunch of surgery to
bring cpuset_t into scope.
ian [Sat, 24 May 2014 16:21:16 +0000 (16:21 +0000)]
Eliminate one of the causes of spurious interrupts on armv6. The arm weak
memory ordering model allows writes to different devices to complete out
of order, leading to a situation where the write that clears an interrupt
source at a device can complete after a write that unmasks and EOIs the
interrupt at the interrupt controller, leading to a spurious re-interrupt.
This adds a generic barrier function specific to the needs of interrupt
controllers, and calls that function from the GIC and TI AINTC controllers.
There may still be other soc-specific controllers that need to make the call.
Reviewed by: cognet, Svatopluk Kraus <onwahe@gmail.com>
MFC after: 3 days
mav [Sat, 24 May 2014 13:00:49 +0000 (13:00 +0000)]
Increase taskqueue thread priority from idle to PRIBIO.
Idle priority is not even time-share, so if the system is busy in any way,
those events may never be executed. Since in some cases the system waits
for events processed by that thread, that may cause deadlocks.
kib [Sat, 24 May 2014 10:23:06 +0000 (10:23 +0000)]
Right now, the rtld prefork hook locks the rtld bind lock in the read
mode. This allows the binder to be functional in the child after the
fork (assuming no lazy loading of a filter is needed), but other rtld
services which require write lock on rtld_bind_lock cause deadlock, if
called by child.
Change the _rtld_atfork() to lock the bind lock in write mode, making
the rtld fully functional after the fork.
Pre-resolve the symbols which are called by the libthr fork()
interposer, since dynamic resolution causes deadlock due to the
rtld_bind_lock already owned in the write mode.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
cy [Sat, 24 May 2014 06:05:21 +0000 (06:05 +0000)]
Move mutex creation from ipf_log_soft_init() to ipf_log_soft_create()
to be consistent with mutex destruction in ipf_log_soft_destroy(). As a
result mutex destruction in ipf_log_soft_fini() is redundant.
bz [Fri, 23 May 2014 20:15:01 +0000 (20:15 +0000)]
Move the tcp_fields_to_host() and tcp_fields_to_net() (inline)
functions to the tcp_var.h header file in order to avoid further
duplication with upcoming commits.
alc [Fri, 23 May 2014 16:22:36 +0000 (16:22 +0000)]
There is no reason to perform the pmap_remove() on the kernel pmap while
the kmem object lock is held. Do the pmap_remove() before acquiring the
kmem object lock.
imp [Fri, 23 May 2014 14:34:22 +0000 (14:34 +0000)]
Allow CC to not actually exist. During the ports INDEX run, all the
Makefiles are evaluated without building things. In a normal build,
the prerequisites would be built, and CC would be an actual thing. In
an INDEX build, though, they don't exist. Redirect stderr to get rid
of annoying messages, and assume that the compiler version is 0 if the
actual compiler can't tell us. Do this in preference to guessing based
on numbers because gcc410 might be 4.10 or 4.1.0, and without
carefully crafted special knowledge we can't differentiate between them
easily (also ming-gcc has no clues at all). Elsewhere, don't trust
the compiler version if it is 0.
hselasky [Fri, 23 May 2014 08:46:28 +0000 (08:46 +0000)]
Initial import of character device in userspace support for FreeBSD.
The CUSE library is a wrapper for the devfs kernel functionality which
is exposed through /dev/cuse . In order to function the CUSE kernel
code must either be enabled in the kernel configuration file or loaded
separately as a module. Currently none of the committed items are
connected to the default builds, except for installing the needed
header files. The CUSE code will be connected to the default world and
kernel builds in a follow-up commit.
The CUSE module was written by Hans Petter Selasky, somewhat inspired
by similar functionality found in FUSE. The CUSE library can be used
for many purposes. Currently CUSE is used when running Linux kernel
drivers in user-space, which need to create a character device node to
communicate with its applications. CUSE has full support for almost
all devfs functionality found in the kernel:
- kevents
- read
- write
- ioctl
- poll
- open
- close
- mmap
- private per file handle data
Requested by several people. Also see "multimedia/cuse4bsd-kmod" in
ports.
luigi [Fri, 23 May 2014 08:10:07 +0000 (08:10 +0000)]
Add libraries to the initial build for picobsd.
Add a -j option so we can tune the amount of parallel make;
the default we used (-j 8) is large and was giving problems
with SUBDIR_PARALLEL due to some missing dependencies.
neel [Fri, 23 May 2014 05:04:50 +0000 (05:04 +0000)]
A CentOS 6.4 guest will write 0xff to the 8259 mask register before beginning
the proper ICWx initialization sequence. It assumes, probably correctly, that
the boot firmware has done the 8259 initialization.
Since grub-bhyve does not initialize the 8259 this write to the mask register
takes a code path in which 'error' remains uninitialized (ready=0,icw_num=0).
Fix this by initializing 'error' at the start of the function.
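The shape of the bug can be sketched in a few lines. This is a toy model, not the bhyve 8259 emulation: struct toy_pic, toy_mask_write() and the handling of each case are hypothetical, and only the ready/icw_num state names come from the message above. With ready=0 and icw_num=0 no branch assigns 'error', so the initializer at the top is what makes the return value defined.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy interrupt-controller state; not the real emulation structure. */
struct toy_pic {
    bool ready;     /* initialization sequence has finished */
    int  icw_num;   /* next initialization word expected, 0 if none */
    int  mask;
};

/* Handle a write to the mask register; returns 0 on success. */
static int
toy_mask_write(struct toy_pic *pic, int val)
{
    int error = 0;  /* set up front so every path returns a defined value */

    if (pic->ready) {
        if (val < 0 || val > 0xff)
            error = -1;
        else
            pic->mask = val;
    } else {
        switch (pic->icw_num) {
        case 2:
        case 3:
            pic->icw_num++;
            break;
        case 4:
            pic->icw_num = 0;
            pic->ready = true;
            break;
        /* icw_num == 0 matches no case: nothing here assigns 'error'. */
        }
    }
    return (error);
}
```

Without the initializer, the ready=0, icw_num=0 path would return an indeterminate value.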
imp [Fri, 23 May 2014 00:20:48 +0000 (00:20 +0000)]
When libelf and libdwarf were updated, we didn't bump the minimal
version needed for CTF tools, so sometimes we'd use the host's CTF
tools that didn't work. Be sure to bootstrap in that case.
imp [Fri, 23 May 2014 00:20:44 +0000 (00:20 +0000)]
Add .../share/mk to the default system make path. This will fix the
problem with broken in-tree builds (which are used far more
pervasively than I'd known outside the tree). However, weird results
may now happen if at any point in the tree above you there happens to
be a directory that has a subdirectory of share/mk, as unpredictable
results will follow. This was considered the lesser of the two evils,
at least for now. In the future this will be removed again when the
underlying issues are resolved.
ian [Thu, 22 May 2014 23:38:17 +0000 (23:38 +0000)]
Map device memory using PTE_DEVICE attributes, and also ensure that the
shared flag is set on normal-memory mappings made via pmap_kenter() for SMP.
The "shared flag" part of this change isn't obvious from the diff. Here's
the deal: by using the array of preformatted page table entry templates
instead of constructing the PTE from scratch, we automatically get the
right attribute bits set for both caching and shared.
dteske [Thu, 22 May 2014 19:36:29 +0000 (19:36 +0000)]
Fix syntax error thrown at the point of creating the root pool, caused by
an embedded newline appearing within the options string surrounded by
double-quotes. Rework the logic that goes into setting dataset options on
the root pool dataset while we're here -- added two new variables (which
can be altered via scripting) ZFSBOOT_POOL_CREATE_OPTIONS and also
ZFSBOOT_BOOT_POOL_CREATE_OPTIONS for setting pool/dataset attributes at
the time of pool creation. The former is for setting options on the root
pool (zroot) and the latter is for setting options on the optional separate
boot pool (bootpool) implicitly enabled when using either GELI or MBR. The
default value for the root pool variable (ZFSBOOT_POOL_CREATE_OPTIONS) is
"-O compress=lz4 -O atime=off" and the default value for separate boot pool
variable (ZFSBOOT_BOOT_POOL_CREATE_OPTIONS) is NULL (no additional options
for the separate boot pool dataset).
Reviewed by: allanjude
MFC after: 7 days
X-MFC-with: r266107-266109
gjb [Thu, 22 May 2014 19:22:03 +0000 (19:22 +0000)]
Add forward-compatibility glue with pkg-1.3:
- Use ASSUME_ALWAYS_YES=YES instead of ASSUME_ALWAYS_YES=1
since pkg-1.3 expects "yes" or "true" values.
- Before exporting PKG_ABI, strip extra characters from what
is parsed from 'pkg -vv'. This causes problems further down
when creating the packages directory for inclusion on the
dvd1.iso. Previously PKG_ABI would be 'freebsd:9:x86:64',
but now is '"freebsd:9:x86:64";' in pkg-1.3
Tested on: stable/9@r265858 with ports-mgmt/pkg-devel
MFC After: 3 days
Sponsored by: The FreeBSD Foundation
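The PKG_ABI cleanup described above amounts to dropping the surrounding double quotes and the trailing semicolon. The release tooling is presumably a shell script; the sketch below only shows the same transformation in C, and strip_abi() is a hypothetical helper, not part of pkg or the release scripts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy 'raw' into 'out' without a trailing ';' and without surrounding
 * double quotes. Returns 'out', or NULL if the result does not fit.
 */
static char *
strip_abi(const char *raw, char *out, size_t outlen)
{
    size_t len = strlen(raw);

    if (len > 0 && raw[len - 1] == ';')
        len--;                      /* drop trailing semicolon */
    if (len > 0 && raw[len - 1] == '"')
        len--;                      /* drop closing quote */
    if (len > 0 && raw[0] == '"') {
        raw++;                      /* drop opening quote */
        len--;
    }
    if (len >= outlen)
        return (NULL);
    memcpy(out, raw, len);
    out[len] = '\0';
    return (out);
}
```

A value in the older, unquoted form passes through unchanged, so the same code path can serve both pkg versions.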
neel [Thu, 22 May 2014 17:22:37 +0000 (17:22 +0000)]
Allow vmx_getdesc() and vmx_setdesc() to be called for a vcpu that is in the
VCPU_RUNNING state. This will let the VMX exit handler inspect the vcpu's
segment descriptors without having to exit the critical section.
trasz [Thu, 22 May 2014 15:29:25 +0000 (15:29 +0000)]
Make iwn(4) able to get itself back into working condition after
"fatal firmware error" happens. Previously it was necessary to reset
it manually, using "/etc/rc.d/netif restart".
Approved by: adrian@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
trasz [Thu, 22 May 2014 14:56:34 +0000 (14:56 +0000)]
Make iwn(4) able to get itself back into working condition after
"fatal firmware error" happens. Previously it was necessary to reset
it manually, using "/etc/rc.d/netif restart".
Approved by: adrian@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
hselasky [Thu, 22 May 2014 11:58:15 +0000 (11:58 +0000)]
- Fix a bug where the TLBPC value was forced to being odd for IN
direction isochronous transfers.
- Remove setting of fields which do not belong to the respective
TRBs. These fields are currently set to zero, so this is more of a
cosmetic change.
MFC after: 3 days
Submitted by: Horse Ma <HMa@wyse.com>
mav [Thu, 22 May 2014 07:27:04 +0000 (07:27 +0000)]
Make ng_mppc not disable the node in case of multiple packet loss.
Quite often it can be just packet reordering, and killing the link in such
a case is inconvenient. Add a few sysctls to control that behavior.
hselasky [Thu, 22 May 2014 06:28:09 +0000 (06:28 +0000)]
- Stop transfers when RSU init fails.
- Make sure TX/RX lists don't leak and are only allocated once.
- Fix off-by-one transfer index computation.
- Give firmware loading more time.
delphij [Thu, 22 May 2014 00:01:31 +0000 (00:01 +0000)]
Explicitly link libzfs against libavl as it is done in OpenSolaris
(4543:12bb2876a62e). Without this, some third party applications
may break because of the lack of AVL-related symbols.
The FreeBSD base system is not affected because the FreeBSD ZFS command
line tools were all linked against libavl, which hid the underlying
issue.
marcel [Wed, 21 May 2014 17:39:49 +0000 (17:39 +0000)]
Fix CID 1204379 (vtoc8.c) & CID 1204380 (bsd.c): Cast ncyls to lba_t
before multiplying the 32-bit integrals to avoid any possibility of
truncation before widening. Not a likely scenario to begin with...
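A minimal sketch of why the cast has to come before the multiplication. The lba_t typedef here is a hypothetical 64-bit stand-in and both function names are made up for illustration; unsigned operands are used so that the wraparound in the first version is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit block address type. */
typedef int64_t lba_t;

/* Multiplies in 32 bits first; the product wraps before it is widened. */
static lba_t
total_sectors_truncating(uint32_t ncyls, uint32_t nheads, uint32_t nsecs)
{
    return (ncyls * nheads * nsecs);
}

/* Widens ncyls first, so the whole product is computed in 64 bits. */
static lba_t
total_sectors_widened(uint32_t ncyls, uint32_t nheads, uint32_t nsecs)
{
    return ((lba_t)ncyls * nheads * nsecs);
}
```

With 1000000 cylinders, 255 heads and 63 sectors the true product is 16065000000, which does not fit in 32 bits; only the second function returns it.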
hselasky [Wed, 21 May 2014 16:52:55 +0000 (16:52 +0000)]
- Split transmit queue into one for each type. Apparently there will
be a race when using a single active queue for all transmit types.
- Last argument of usb_pause_mtx() is ticks and not milliseconds.
- Remove unused watchdog.
- Remove some unused fields from the RSU softc structure.
- Workaround usbd_transfer_start() recursion from inside of completion
callback.
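On the ticks-versus-milliseconds item: a delay expressed in milliseconds has to be scaled by the tick rate before it is passed to an interface that counts ticks. The helper below is a hypothetical sketch, not the USB stack's own conversion; 'hz' stands for the number of ticks per second.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert milliseconds to ticks, rounding up so that
 * a short non-zero delay never becomes zero ticks.
 */
static uint32_t
ms_to_ticks(uint32_t ms, uint32_t hz)
{
    assert(hz > 0);
    return ((uint32_t)(((uint64_t)ms * hz + 999) / 1000));
}
```

At hz=100, passing the raw value 100 as ticks would wait a full second, whereas ms_to_ticks(100, 100) yields the intended 10 ticks.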
hrs [Wed, 21 May 2014 10:04:51 +0000 (10:04 +0000)]
- Fix a bug which can make sysctl() fail when -F is specified.
- Increase WID_IF_DEFAULT() from 6 to 8 (the default for AF_INET6) because
we have interfaces with longer names than 6 chars like epairN{a,b}.
- Style fixes.
hselasky [Wed, 21 May 2014 09:26:02 +0000 (09:26 +0000)]
- Replace some constants with macros.
- Need to set the pre-fetch memory address when reading the host memory.
- We currently assume that no endianness conversion is needed.
bjk [Wed, 21 May 2014 03:11:27 +0000 (03:11 +0000)]
Check for mismatched vref()/vdrop()
Assert that the hold count has not fallen below the use count, a situation
that would only happen when a vref() (or similar) is erroneously paired
with a vdrop(). This situation has not been observed in the wild, but
could be helpful for someone implementing a new filesystem.
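The invariant can be illustrated with a toy model. It assumes that every use reference also takes a hold, so the hold count can only drop below the use count when a reference is released with the wrong call. The struct and the toy_* functions are hypothetical and are not the kernel's vnode code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two counters; not the kernel's struct vnode. */
struct toy_vnode {
    int usecount;   /* active references */
    int holdcount;  /* holds; assumed to include one per use reference */
};

/* The invariant being asserted: holds never fall below uses. */
static bool
toy_counts_ok(const struct toy_vnode *vp)
{
    return (vp->holdcount >= vp->usecount);
}

static void
toy_vref(struct toy_vnode *vp)
{
    vp->usecount++;
    vp->holdcount++;
}

static void
toy_vrele(struct toy_vnode *vp)
{
    vp->usecount--;
    vp->holdcount--;
}

static void
toy_vhold(struct toy_vnode *vp)
{
    vp->holdcount++;
}

static void
toy_vdrop(struct toy_vnode *vp)
{
    vp->holdcount--;
}
```

Pairing toy_vref() with toy_vrele(), or toy_vhold() with toy_vdrop(), keeps the invariant; pairing toy_vref() with toy_vdrop() leaves usecount at 1 and holdcount at 0, which is the mismatch the new check is meant to catch.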
scottl [Tue, 20 May 2014 22:43:17 +0000 (22:43 +0000)]
Old PCIe implementations cannot allow a DMA transfer to cross a 4GB
boundary. This was addressed several years ago by creating a parent
tag hierarchy for the root buses that set the boundary restriction
for appropriate buses and allowed child devices to inherit it.
Somewhere along the way, this restriction was turned into a case for
marking the tag as a candidate for needing bounce buffers, instead
of just splitting the segment along the boundary line. This flag
also causes all maps associated with this tag to be non-NULL, which
in turn causes bus_dmamap_sync() to take the slow path of function
pointer indirection to discover that there's no bouncing work to
do. The end result is a lot of pages set aside in bounce pools
that will never be used, and a slow path for data buffers in nearly
every DMA-capable PCIe device. For example, our workload at Netflix
was spending nearly 1% of all CPU time going through this slow path.
Fix this problem by being more selective about when to set the
COULD_BOUNCE flag. Only set it when the boundary restriction
exists and the consumer cannot do more than a single DMA segment
at once. This fixes the case of dynamic buffers (mbufs, bio's)
but doesn't address static buffers allocated from bus_dmamem_alloc().
That case will be addressed in the future.
For those interested, this was discovered thanks to DTrace Flame
Graphs.
Discussed with: jhb, kib
Obtained from: Netflix, Inc.
MFC after: 3 days
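To illustrate the difference between splitting and bouncing, here is a sketch of detecting a boundary crossing and splitting a buffer along it. It is not the busdma implementation; struct toy_seg, crosses_boundary() and split_at_boundary() are hypothetical names, and the boundary is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_seg {
    uint64_t addr;
    uint64_t len;
};

/* True if [addr, addr + len) spans a multiple of 'boundary'. */
static int
crosses_boundary(uint64_t addr, uint64_t len, uint64_t boundary)
{
    assert(len > 0 && boundary > 0 && (boundary & (boundary - 1)) == 0);
    return ((addr & ~(boundary - 1)) !=
        ((addr + len - 1) & ~(boundary - 1)));
}

/*
 * Split a buffer into segments that never cross 'boundary'. Returns the
 * number of segments written, or 0 if more than 'max' would be needed
 * (or if 'len' is 0).
 */
static size_t
split_at_boundary(uint64_t addr, uint64_t len, uint64_t boundary,
    struct toy_seg *segs, size_t max)
{
    size_t n = 0;

    while (len > 0) {
        uint64_t room = boundary - (addr & (boundary - 1));
        uint64_t chunk = len < room ? len : room;

        if (n == max)
            return (0);
        segs[n].addr = addr;
        segs[n].len = chunk;
        n++;
        addr += chunk;
        len -= chunk;
    }
    return (n);
}
```

A consumer that accepts two or more segments can always be served by the split. Only a consumer limited to a single segment gets 0 back for a crossing buffer, which corresponds to the one case where the message says COULD_BOUNCE is still set.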
neel [Tue, 20 May 2014 20:30:28 +0000 (20:30 +0000)]
Add PG_RW check when translating a guest linear to guest physical address.
Set the accessed and dirty bits in the page table entry. If it fails then
restart the page table walk from the beginning. This might happen if another
vcpu modifies the page tables simultaneously.
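The check-then-update step can be sketched with a compare-and-swap. This is a toy with a single-level table and C11 atomics, not the vmm code: the TOY_* constants, the table layout and toy_gla2gpa() are hypothetical. In a real multi-level walk the restart would go back to the top-level table, while here it simply re-reads the one entry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PG_V      0x001ULL              /* valid */
#define TOY_PG_RW     0x002ULL              /* writable */
#define TOY_PG_A      0x020ULL              /* accessed */
#define TOY_PG_M      0x040ULL              /* modified (dirty) */
#define TOY_PG_FRAME  0x000ffffffffff000ULL
#define TOY_PAGE_MASK 0xfffULL
#define TOY_NPTE      512

/*
 * Translate 'gla' through a one-level toy page table. Returns true and
 * stores the result in *gpa, or false if the access would fault.
 */
static bool
toy_gla2gpa(_Atomic uint64_t *pt, uint64_t gla, bool writing, uint64_t *gpa)
{
    _Atomic uint64_t *ptep = &pt[(gla >> 12) % TOY_NPTE];
    uint64_t pte, want;

restart:
    pte = atomic_load(ptep);
    if ((pte & TOY_PG_V) == 0)
        return (false);
    if (writing && (pte & TOY_PG_RW) == 0)
        return (false);         /* write to a read-only mapping */

    /* Set accessed, and dirty on a write; retry if the entry changed. */
    want = pte | TOY_PG_A | (writing ? TOY_PG_M : 0);
    if (want != pte && !atomic_compare_exchange_strong(ptep, &pte, want))
        goto restart;

    *gpa = (want & TOY_PG_FRAME) | (gla & TOY_PAGE_MASK);
    return (true);
}
```

The compare-and-swap fails only if another thread changed the entry between the load and the update, which is the concurrent-modification case the message describes.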
sjg [Tue, 20 May 2014 18:25:46 +0000 (18:25 +0000)]
Use an intermediate target to associate with _SUBDIR which is marked .MAKE;
this allows make -n to do tree walks as expected without
doing anything else (as intended).
Use prefix _sub. to help avoid conflict with any real target.