Don't appease clang static analyzer after all and roll back
the free(3) of mntbuf ... again. There's no point in doing
useless extra work when we're about to exit.
Octeon 2 (6xxx) and newer CPUs don't use the clock CPU speed for its
I/O clock. Thankfully, the simple executive provies a way to querry
the proper clock that works on all models. Move to asking for the SCLK
via this interface.
This gets the serial console working after we start init and open the
console and set the divisor (which turned the output from good to
bad). I can login on the console now.
Use a thread for the processing of virtio tx descriptors rather
than blocking the vCPU thread. This improves bulk data performance
by ~30-40% and doesn't harm req/resp time for stock netperf runs.
Future work will use a thread pool rather than a thread per tx queue.
In the case where the controller supports an sg_list LESS than our predefined
and tuned value, we would advertise the unsupported value to CAM and it would
merrily destroy the controller with way too many IO operations.
This manifests itself in a Zero Memory RAID configuration for a P410 and
possibly other controllers.
Remove deprecated APIs to get the total and free memory available to vmm.ko.
These APIs were relevant when memory for virtual machine allocation was
hard partitioned away from the rest of the system but that is no longer
the case. The sysctls that provided this information were garbage collected
a while back.
Use the offsets from pcb.h rather than regnum.h to store the registers
in the pcb. setjmp/longjmp in the kernel also used these values, so
continue to use them although their use isn't technically the pcb
register array (matching is all that's important for setjmp/longjmp in
the kernel). Finally, eliminate the old register names from regnum.h.
This is a lexical change only. The non-debug .o files have the same md5.
Introduce a pointer to const variable gw, which points either at the
same place as dst, or to the sockaddr in the routing table.
The const constraint of gw makes us safe from modifing routing table
accidentially. And "onstantness" of dst allows us to remove several
bandaids, when we switched it back at &ro->ro_dst, now it always
points there.
revert r248644 because of the regression for usdt probes
USDT probes are advertised to kernel by initialization code with
atrribute((constructor))). It seems that on Solaris the .init-ish code
of the main object is executed before RD_PREINIT point is hit. On
FreeBSD that is not the case. And because on FreeBSD there is no other
well-defined point between RD_PREINIT and main() we have to parse a
DTrace script when main is hit, for time being.
A footnote: currently we actually post RD_POSTINIT event, but that's a
bug because the event is triggered by hitting r_debug_state which
happens before any init code is executed.
Add RIP-relative addressing to the instruction decoder.
Rework the guest register fetch code to allow the RIP to
be extracted from the VMCS while the kernel decoder is
functioning.
Move hptmv and mpt drivers shutdown a bit later to the SHUTDOWN_PRI_LAST
stage of shutdown_post_sync. That should allow CAM to do final cache flush
at the SHUTDOWN_PRI_DEFAULT without using polling magic.
This fixes the issue with the "randomly changing" default
route. What it was is there are two places in ip_output.c
where we do a goto again. One place was fine, it
copies out the new address and then resets dst = ro->rt_dst;
But the other place does *not* do that, which means earlier
when we found the gateway, we have dst pointing there
aka dst = ro->rt_gateway is done.. then we do a
goto again.. bam now we clobber the default route.
The fix is just to move the again so we are always
doing dst = &ro->rt_dst; in the again loop.
dim [Wed, 24 Apr 2013 17:20:45 +0000 (17:20 +0000)]
When rebooting (exiting) from the BTX loader, make sure to restore the
GDT from the correct segment, otherwise a triple fault would be caused.
In some virtual environments (VMware, VirtualBox, etc) this could lead
to a unhandled error or hang in the guest emulation software.
Thanks to avg and jhb for a few hints in the right direction.
Noticed by: Jeremy Chadwick <jdc@koitsu.org> (and many others)
MFC after: 1 week
andre [Wed, 24 Apr 2013 13:54:55 +0000 (13:54 +0000)]
Base the calculation of maxmbufmem in part on kmem_map size
instead of kernel_map size to prevent kernel memory exhaustion
by mbufs and a subsequent panic on physical page allocation
failure.
On architectures without a direct map all mbuf memory (except
for jumbo mbufs larger than PAGE_SIZE) comes from kmem_map.
It is the limiting factor hence.
For architectures with a direct map using the size of kmem_map
is a good proxy of available kernel memory as well. If it is
much smaller the mbuf limit may be sub-optimal but remains
reasonable, while avoiding panics under exhaustion.
The overall mbuf memory limit calculation may be reconsidered
again later, however due to the many different mbuf sizes and
different backing KVM maps it is a tricky subject.
Found by: pho's new network stress test
Pointed out by: alc (kmem_map instead of kernel_map)
Tested by: pho
dim [Tue, 23 Apr 2013 18:58:39 +0000 (18:58 +0000)]
Pull in r180121 from upstream llvm trunk:
LoopVectorizer: Fix 15830. When scalarizing and unrolling stores make
sure that the order in which the elements are scalarized is the same
as the original order.
This fixes a miscompilation in FreeBSD's regex library.
This should fix lib/libc/regex/regcomp.c at -O3 with clang 3.3 r178860
on CPUs with SSE. Before this change, the vectorizer could incorrectly
rearrange the second loop in computejumps(), leading to possibly invalid
entries in the re_gets::charjump table.
The net result was that for example "sed s/@CC@/foo/" failed to work
correctly, leading to trouble with many configure scripts.
Return a lun count of 1 and a lun id of 0 when CAM attempts a REPORT_LUNS
command on a disk device. This quieseces some noise on the console that
recently appeared.
Teach the virtio block device to deal with direct as well as indirect
descriptors. Prior to this change the device would only work with guests
that chose to use indirect descriptors.
Modify the device reset callback to actually reset the device state.
Literally follow POSIX:
If the bs= expr operand is specified and no conversions other than sync,
noerror, or notrunc are requested, the data returned from each input
block shall be written as a separate output block.
In particular, when both bs=size and conv=sparce were specified, the
resulted file was fully filled, instead of sparce.
andre [Tue, 23 Apr 2013 14:06:32 +0000 (14:06 +0000)]
When doing RFC3042 limited transmit on the first on second
duplicate ACK make sure we actually have new data to send.
This prevents us from sending unneccessary pure ACKs.
Reported by: Matt Miller <matt@matthewjmiller.net>
Tested by: Matt Miller <matt@matthewjmiller.net>
MFC after: 2 weeks
Fix installkernel requiring users/groups defined in CHECK_UIDS
and CHECK_GIDS to exist since r152680. This is only needed for
installworld. The documented procedure of running mergemaster -p
to check for missing users is only needed for installworld, not
for installkernel. This fixes auditdistd incorrectly being
required for installkernel.
Add an option for the GE FES based packet engines. Its board IDs
overlap with the standard ones, so kernels for this family of boards
need the option OCTEON_VENDOR_GEFES.
mm [Tue, 23 Apr 2013 06:28:35 +0000 (06:28 +0000)]
The zfs synctask code restructuring introduced a new bug that makes it
impossible to set quota and reservation on pools lower than version 22.
Problem has been reported and a solution discussed with vendor.
Illumos ZFS issues:
3739 cannot set zfs quota or reservation on pool version < 22
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reported by: Steve Wills <swills@FreeBSD.org>
MFC after: 3 days
Prevent device.subr from auto-loading in the nameservers module.
This module doesn't need device support (but device.subr is loaded
indirectly through media/tcpip.subr which contains resolv stuff).
Partially uncommit r249779. The changes to share/common.subr were good
while the remaining changes were part of a much larger ``secret sauce''
involved in an up-coming commit that I'm still laboring on.
Partially implement generic_bs_*_8() for MIPS platforms.
This is known to work with TARGET_ARCH=mips64 with FreeBSD/BERI.
Assuming that other definitions in cpufunc.h are correct it will
work on non-o64 ABI systems except sibyte. On sibyte and o32 systems
generic_bs_*_8() will remain panic() implementations.
- Some BIOSes use an Extended IRQ resource descriptor in _PRS for a link
that uses non-ISA IRQs but use a plain IRQ resource in _CRS. However,
a non-ISA IRQ can't fit into a plain IRQ resource. If we encounter a
link like this, build the resource buffer from _PRS instead of _CRS.
- Set the correct size of the end tag in a resource buffer.
Tested by: Benjamin Lee <ben@b1c1l1.com>
MFC after: 2 weeks