emaste [Fri, 16 May 2014 03:05:53 +0000 (03:05 +0000)]
Speed up pmcstat by improving string hash
In one case generating callgraph output from a 24MB system-wide sampling
data file took 17.4 seconds on average. Profiling showed pmcstat
spending a lot of time in strcmp, due to hash collisions.
Replacing the XOR-only hash with FNV-1a reduces the run time for my
test by 40%.
bjk [Fri, 16 May 2014 01:50:04 +0000 (01:50 +0000)]
Review pass through jail.8
Replace usage of "prison" with "jail", since that term has mostly dropped
out of use. Note once at the beginning that the "prison" term is equivalent,
but do not use it otherwise. [1]
Some grammar issues.
Some mdoc formatting fixes.
Consistently use \(em for em dashes, with spaces around it.
silby [Fri, 16 May 2014 01:38:38 +0000 (01:38 +0000)]
Remove the function tcp_twrecycleable; it has been #if 0'd for
eight years. The original concept was to improve the
corner case where you run out of ephemeral ports, but it
was causing performance problems and the mechanism
of limiting the number of time_wait sockets serves
the same purpose in the end.
hrs [Thu, 15 May 2014 19:26:20 +0000 (19:26 +0000)]
- Do not override sin6_scope_id in LLA when it is already set to non-zero.
This fixes destination list in output of netstat -r.
- Plug a memory leak.
- Add RTM_VERSION check.
- Minor style fixes.
marcel [Thu, 15 May 2014 19:19:57 +0000 (19:19 +0000)]
MFuser/marcel/mkimg:
Add support for different output formats:
1. The output file that was previously written is now called the raw format.
2. Add the vmdk output format to create VMDK images.
When the format is not given, the raw output format is assumed.
imp [Thu, 15 May 2014 15:45:45 +0000 (15:45 +0000)]
Makefile.inc is also included by the tests subdirectory, which results
in SUBDIRS having tests added to it, which fails. Work around this by
checking to make sure tests exists before adding it to subdirs and
work to get the generated file fixed so we can rename Makefile.inc to
something else so it isn't automatically included by subdirs...
marcel [Thu, 15 May 2014 14:48:25 +0000 (14:48 +0000)]
Replace unchecked calls to write(2) and lseek(2) with calls to
sparse_write() and make sure to check for errors. In particular,
lseek(2) may not be possible (e.g. the output file is stdout)
and sparse_write(2) will do what is optimal (i.e. use lseek(2)
when possible).
jhb [Thu, 15 May 2014 14:16:55 +0000 (14:16 +0000)]
Implement a PCI interrupt router to route PCI legacy INTx interrupts to
the legacy 8259A PICs.
- Implement an ICH-comptabile PCI interrupt router on the lpc device with
8 steerable pins configured via config space access to byte-wide
registers at 0x60-63 and 0x68-6b.
- For each configured PCI INTx interrupt, route it to both an I/O APIC
pin and a PCI interrupt router pin. When a PCI INTx interrupt is
asserted, ensure that both pins are asserted.
- Provide an initial routing of PCI interrupt router (PIRQ) pins to
8259A pins (ISA IRQs) and initialize the interrupt line config register
for the corresponding PCI function with the ISA IRQ as this matches
existing hardware.
- Add a global _PIC method for OSPM to select the desired interrupt routing
configuration.
- Update the _PRT methods for PCI bridges to provide both APIC and legacy
PRT tables and return the appropriate table based on the configured
routing configuration. Note that if the lpc device is not configured, no
routing information is provided.
- When the lpc device is enabled, provide ACPI PCI link devices corresponding
to each PIRQ pin.
- Add a VMM ioctl to adjust the trigger mode (edge vs level) for 8259A
pins via the ELCR.
- Mark the power management SCI as level triggered.
- Don't hardcode the number of elements in Packages in the source for
the DSDT. iasl(8) will fill in the actual number of elements, and
this makes it simpler to generate a Package with a variable number of
elements.
gavin [Thu, 15 May 2014 03:08:20 +0000 (03:08 +0000)]
Fix typo. Note that although this file is under contrib, it has diverged
sufficiently from upstream (including a full whitespace commit and large
portions rewritten) that this change does not move us further from the
upstream.
imp [Thu, 15 May 2014 01:27:24 +0000 (01:27 +0000)]
Undo changes to the generated Makefile. Move tests directory to proper
location, including updating the test to work in the more-fragile
fmake -> bmake bootstrap environment.
grehan [Thu, 15 May 2014 01:06:27 +0000 (01:06 +0000)]
Update dis_tables.c to the latest Illumos version.
This includes decodes of recent Intel instructions, in particular
VT-x and related instructions. This allows the FBT provider to
locate the exit points of routines that include these new
instructions.
Illumos issues:
3414 Need a new word of AT_SUN_HWCAP bits
3415 Add isainfo support for f16c and rdrand
3416 Need disassembler support for rdrand and f16c
3413 isainfo -v overflows 80 columns
3417 mdb disassembler confuses rdtscp for invlpg
1518 dis should support AMD SVM/AMD-V/Pacifica instructions
1096 i386 disassembler should understand complex nops
1362 add kvmstat for monitoring of KVM statistics
1363 add vmregs[] variable to DTrace
1364 need disassembler support for VMX instructions
1365 mdb needs 16-bit disassembler support
marcel [Thu, 15 May 2014 00:48:05 +0000 (00:48 +0000)]
Have the format resize twice. The first time is before we call
scheme_write(). The second time is immediately before we call
the format's write function. The first call is needed to have
the scheme know the right image size. The second call is needed
to compensate for the scheme adjusting the size while writing.
marcel [Thu, 15 May 2014 00:24:49 +0000 (00:24 +0000)]
Commit rough but working code:
o The VMDK image can be queried and comverted by qemu-img
o VMware Fusion finds no errors when recovering using
'vmware-vdiskmanager -R' and the disk can be added to any
vortual machine.
markm [Wed, 14 May 2014 19:11:15 +0000 (19:11 +0000)]
Give suitably-endowed ARMs a register similar to the x86 TSC register.
Here, "suitably endowed" means that the System Control Coprocessor
(#15) has Performance Monitoring Registers, including a CCNT (Cycle
Count) register.
The CCNT register is used in a way similar to the TSC register in
x86 processors by the get_cyclecount(9) function. The entropy-harvesting
thread is a heavy user of this function, and will benefit from not
having to call binuptime(9) instead.
One problem with the CCNT register is that it is 32-bit only, so
the upper 32-bits of the returned number are always 0. The entropy
harvester does not care, but in case any one else does, follow-up
work may include an interrup trap to increment an upper-32-bit
counter on CCNT overflow.
Another problem is that the CCNT register is not readable in user-mode
code; in can be made readable by userland, but then it is also
writable, and so is a good chunk of the PMU system. For that reason,
the CCNT is not enabled for user-mode access in this commit.
Like the x86, there is one CCNT per core, so they don't all run in
perfect sync.
markj [Wed, 14 May 2014 19:02:00 +0000 (19:02 +0000)]
Bind ip/tcp/udp provider translators and symbols to the same versions as in
illumos, rather than using "1.0" everywhere.
Some of the translators use D functions that are not present in version
1.0 (e.g. inet_ntoa()) which can result in libdtrace crashing when running
scripts that restrict themselves to version 1.0
(e.g. with "-x version=1.0").
jmmv [Wed, 14 May 2014 18:43:13 +0000 (18:43 +0000)]
Move old fmake tests into bmake and hook them to the build.
This first step is mostly to prevent the code from rotting even further
and to ensure these do not get wiped when fmake's code is removed from
the tree.
These tests are currently being skipped because they detect the underlying
make is not fmake and thus disable themselves -- and the reason is that
some of the tests fail, possibly due to legitimate bugs. Enabling them to
run against bmake will come separately.
Lastly, it would be ideal if these tests were fed upstream but they are
not ready for that yet. In the interim, just put them under usr.bin/bmake/
while we sort things out. The existence of a different unit-tests directory
within here makes me feel less guilty about this.
Change confirmed working with a clean amd64 build.
dim [Wed, 14 May 2014 17:11:57 +0000 (17:11 +0000)]
Use the new -d option that was added to tblgen between llvm/clang 3.3
and 3.4 to generate dependency files for the '.inc.h' files generated
from .td files, and .sinclude those dependency files in clang.build.mk.
This will make future incremental builds of lib/clang and usr.bin/clang
work correctly, whenever any of the .td files get modified.
Note that this will not fix any problems with incremental builds from
*before* this revision, since there will not yet be any generated
dependency files. A quick workaround is to run the following:
find /usr/obj -type f -name '*.inc.h' | xargs rm
and then a regular incremental buildworld (e.g. with -DNO_CLEAN).
hselasky [Wed, 14 May 2014 17:04:02 +0000 (17:04 +0000)]
Implement USB device side driver code for SAF1761 and compatible
chips, based on datasheet and existing USS820 DCI driver. This code is
not yet tested.
marcel [Wed, 14 May 2014 15:46:07 +0000 (15:46 +0000)]
Give output formats a chance to (re-)size the image before the
scheme adds the partitioning metadata. This is needed by VMDK
to round the image size to the grain size.
hselasky [Wed, 14 May 2014 11:25:59 +0000 (11:25 +0000)]
Make sure the USB audio driver is loaded last. This is important when
built as part of a kernel module to prevent panics when the USB audio
driver kernel module is unloaded.
hselasky [Wed, 14 May 2014 07:33:06 +0000 (07:33 +0000)]
Change the USB audio kernel module linking order, so that the USB
audio device driver is detached first and not its children. This fixes
a panic in some cases when unloading "snd_uaudio" while a USB device
is plugged. The linking order affects the order in which the module
dependencies are registered.
neel [Tue, 13 May 2014 16:40:27 +0000 (16:40 +0000)]
Don't include the guest memory segments in the bhyve(8) process core dump.
This has not added a lot of value when debugging bhyve issues while greatly
increasing the time and space required to store the core file.
Passing the "-C" option to bhyve(8) will change the default and dump guest
memory in the core dump.
hselasky [Tue, 13 May 2014 13:46:38 +0000 (13:46 +0000)]
- Isochronous transfers should use the alternate next transfer
descriptor upon receiving a short packet, in host and device mode.
- Correct some comments.
alc [Tue, 13 May 2014 13:20:23 +0000 (13:20 +0000)]
On a fork allow read-only wired pages to be copy-on-write shared between the
parent and child processes. Previously, we copied these pages even though
they are read only. However, the reason for copying them is historical and
no longer exists. In recent times, vm_map_protect() has developed the
ability to copy pages when write access is added to wired copy-on-write
pages. So, in this case, copy-on-write sharing of wired pages is not to be
feared. It is not going to lead to copy-on-write faults on wired memory.
yongari [Tue, 13 May 2014 05:19:29 +0000 (05:19 +0000)]
Disable TX IP/TCP/UDP checksum offloading for RTL8168C/RTL8168CP.
Previously only TX IP checksum offloading was disabled but it's
reported that TX checksum offloading for UDP datagrams with IP
options also generates corrupted frames. Reporter's controller is
RTL8168CP but I guess RTL8168C also have the same issue since it
shall share the same core.
neel [Mon, 12 May 2014 23:35:10 +0000 (23:35 +0000)]
abort(3) the process in response to a VMEXIT_ABORT. This usually happens in
response to an unhandled VM exit or an unexpected error so a core is useful.
ian [Mon, 12 May 2014 13:05:03 +0000 (13:05 +0000)]
Interrupts need to be disabled on entry to cpu_sleep() for ARM. Given
that and the need to be in a critical section when switching to idleclock
mode for event timers, use spinlock_enter()/exit() to achieve both needs.
The ARM WFI (wait for interrupt) instruction blocks until an interrupt is
asserted, and it will unblock even if interrupts are masked, and it will
unblock immediately if an interrupt is already pending. It is necessary
to execute it with interrupts disabled, otherwise the interrupt that
should unblock it may occur and be serviced just prior to executing the
instruction. At that point the system is inappropriately asleep until
the next timer tick or some other random interrupt happens.
In general, interrupts need to be disabled continuously from the time the
decision is made that there is no work to be done and sleeping is needed
until actually going to sleep, to avoid a race where handling a new
interrupt changes the basis for deciding there is no work to be done.
tuexen [Mon, 12 May 2014 09:46:48 +0000 (09:46 +0000)]
Disable TX checksum offload for UDP-Lite completely. It wasn't used for
partial checksum coverage, but even for full checksum coverage it doesn't
work.
This was discussed with Kevin Lo (kevlo@).
nwhitehorn [Mon, 12 May 2014 02:56:27 +0000 (02:56 +0000)]
Repair some races in IPI handling:
1. Make sure IPI mask is set before sending the IPI
2. Operate atomically on PS3 PIC outstanding interrupt list
3. Make sure IPIs are EOI'ed before, not after, processing. Without this,
a second IPI could be sent partway through processing the first one,
get erroneously acknowledge by the EOI to the first, and be lost. In
particular in the case of smp_rendezvous(), this can be fatal.
In combination, this makes the PS3 boot SMP again. It probably also fixes
some latent bugs elsewhere.
imp [Sun, 11 May 2014 23:22:32 +0000 (23:22 +0000)]
Attempt to walk a fine line between current usage (/usr/ports which
does an out-of-tree build without setting MAKESYSPATH) and recently
added requirements (JIRA's building the modules in a non-standard
layout). So, when MAKESYSPATH is defined, trust that it will do the
right thing (to catch the JIRA use case). When it isn't defined,
assume a standard FreeBSD tree and reach over to grab bsd.mkopt.mk (to
fix the /usr/ports use case). Both camps cannot be appeased otherwise,
so we have this kludge until it can be sorted out.
jilles [Sun, 11 May 2014 21:21:14 +0000 (21:21 +0000)]
accept(),accept4(): Don't set *addrlen = 0 on [ECONNABORTED].
If the underlying protocol reported an error (e.g. because a connection was
closed while waiting in the queue), this error was also indicated by
returning a zero-length address. For all other kinds of errors (e.g.
[EAGAIN], [ENFILE], [EMFILE]), *addrlen is unmodified and there are
successful cases where a zero-length address is returned (e.g. a connection
from an unbound Unix-domain socket), so this error indication is not
reliable.
As reported in Austin Group bug #836, modifying *addrlen on error may cause
subtle bugs if applications retry the call without resetting *addrlen.
dim [Sun, 11 May 2014 21:07:00 +0000 (21:07 +0000)]
Allow libstdc++ and libsupc++ to compile with clang again, after the
bsd.*.mk infrastructure changes. Apparently, you must now modify
CXXFLAGS *before* including bsd.lib.mk, or your changes will be lost.