yongari [Mon, 7 Mar 2011 00:42:22 +0000 (00:42 +0000)]
MFC r219102:
Make sure the ownership of an RX descriptor is changed as the last
operation. Previously ownership was transferred to hardware before
the address of the new RX buffer was set, so it was possible for
hardware to use the wrong RX buffer address.
While here, keep the compiler from re-ordering instructions by declaring
descriptor members volatile. Memory barriers would do the same job
but volatile is supposed to be cheaper than using memory barriers,
especially on MP systems.
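The ordering described in this entry can be sketched as follows. This is a minimal illustration, not the driver's code: the rx_desc layout, the DESC_OWN bit and rx_desc_refill() are hypothetical names chosen for the example. Note that volatile only constrains the compiler; whether that is sufficient depends on the platform's memory ordering.

```c
#include <stdint.h>

#define	DESC_OWN	0x80000000u	/* hypothetical "owned by hardware" bit */

/* Hypothetical RX descriptor; members are volatile as the log describes. */
struct rx_desc {
	volatile uint32_t addr_lo;	/* low 32 bits of the buffer address */
	volatile uint32_t addr_hi;	/* high 32 bits of the buffer address */
	volatile uint32_t status;	/* buffer length plus ownership bit */
};

/*
 * Attach a new RX buffer to a descriptor. The buffer address is written
 * first; the store that sets DESC_OWN comes last, so hardware never sees
 * a descriptor it owns that still carries the old address.
 */
void
rx_desc_refill(struct rx_desc *d, uint64_t paddr, uint32_t buflen)
{
	d->addr_lo = (uint32_t)(paddr & 0xffffffffu);
	d->addr_hi = (uint32_t)(paddr >> 32);
	d->status = buflen | DESC_OWN;	/* ownership transfer: last store */
}
```

Because all three members are volatile, the compiler must emit the stores in program order, which is the property the commit relies on.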
marius [Sun, 6 Mar 2011 11:51:39 +0000 (11:51 +0000)]
MFC: r216013, r216083
Several chipset drivers alter parameters relevant for the DMA tag creation,
i.e. alignment, max_address, max_iosize and segsize (only max_address is
thought to have a negative impact regarding this issue though), after
calling ata_dmainit() either directly or indirectly so these values have
no effect or at least no effect on the DMA tags and the defaults are used
for the latter instead. So change the drivers to set these parameters
up-front and ata_dmainit() to honor them.
marius [Sun, 6 Mar 2011 11:43:02 +0000 (11:43 +0000)]
- Use the correct DMA tag/map pair for synchronizing the FC scratch area.
- Allocate coherent DMA memory for the request/response queue area and
the FC scratch area.
These changes allow isp(4) to work properly on sparc64 with usage of the
IOMMU streaming buffers enabled.
Add opteron-sse3, athlon64-sse3 and k8-sse3 cpu types to bsd.cpu.mk.
- add "sse3" to MACHINE_CPU for the new cpu types
- for i386, default to CPUTYPE=prescott for the new cpu types
Backport svn r124339 from gcc 4.3 and add opteron-sse3, athlon64-sse3
and k8-sse3 cpu-types for -march=/-mtune= gcc options.
These new cpu-types include the SSE3 instruction set that is supported
by all newer AMD Athlon 64 and Opteron processors.
All three cpu-types are supported by clang and all gcc versions
starting with 4.3 SVN rev 124339 (at that time GPLv2 licensed).
das [Sun, 6 Mar 2011 08:50:15 +0000 (08:50 +0000)]
Bump __FreeBSD_version for the MFC of log2(), for the benefit of ports
such as opencity and inkscape that have workarounds for the lack of a
log2() in the base system.
MFC r212067 (pjd):
Eliminate confusing while () loop. In the first version of the code it was
there to avoid gotos, but in the current version it serves no purpose.
MFC r214623 (pjd):
Fix ztest when it is executed by just 'ztest' and not by full path
'/usr/bin/ztest'.
jhb [Thu, 3 Mar 2011 20:13:44 +0000 (20:13 +0000)]
MFC 218968:
Properly handle BARs bigger than 4G. The '1' was treated as an int
causing the size calculation to be truncated to the size of an int
(32-bits on all current architectures).
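The truncation described in this entry can be illustrated with a small sketch. The helper names and the ln2size parameter (the number of address bits the BAR decodes, assumed to be below 64) are hypothetical and are not taken from the PCI code.

```c
#include <stdint.h>

/*
 * What the old calculation effectively produced: only the low 32 bits
 * of the size survived. The truncation is emulated with a cast here,
 * because shifting a plain int by 32 or more is undefined behavior in C
 * and must not be executed directly.
 */
uint64_t
bar_size_truncated(unsigned int ln2size)
{
	return ((uint32_t)((uint64_t)1 << ln2size));
}

/* Fixed form: widen the constant to 64 bits before shifting. */
uint64_t
bar_size(unsigned int ln2size)
{
	return ((uint64_t)1 << ln2size);
}
```

For an 8GB BAR (ln2size of 33) the widened form yields 0x200000000, while the 32-bit result is 0. For BARs below 4G both forms agree, which is why the problem only shows up with large BARs.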
jhb [Thu, 3 Mar 2011 19:57:38 +0000 (19:57 +0000)]
MFC 218777:
Save a copy of errno before invoking syslog() if accept() or select() fail.
syslog() can trash the errno value causing nfsd to exit for non-fatal
errors like ECONNABORTED from accept().
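A minimal sketch of the pattern, assuming a hypothetical helper named log_failure_keep_errno(); this is not the nfsd code itself. The point is only that errno is copied before syslog() runs and that later decisions use the copy.

```c
#include <errno.h>
#include <syslog.h>

/*
 * Hypothetical helper: log a failed call and return the errno value
 * that was current on entry. syslog() may change errno, so the value
 * is saved first and restored afterwards.
 */
int
log_failure_keep_errno(const char *what)
{
	int saved = errno;	/* copy before any other library call */

	syslog(LOG_ERR, "%s failed: %m", what);
	errno = saved;		/* undo anything syslog() did to errno */
	return (saved);
}
```

A caller would then compare the returned value, rather than errno as left behind by syslog(), against ECONNABORTED to decide whether the failure is fatal.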
jhb [Thu, 3 Mar 2011 18:52:11 +0000 (18:52 +0000)]
MFC 216487,217754,218524:
- Pass JFLAG as JFLAG from tinderbox to universe.
- For `make tinderbox` there is no need to print the extra commands.
- Add a new UNIVERSE_TARGET variable for 'make universe'. If it is set,
then that target is invoked for each architecture rather than the
default action of building world and kernels for each architecture.
- Add a 'make toolchains' wrapper which uses UNIVERSE_TARGET to build
toolchains for all architectures.
- Document JFLAG, MAKE_JUST_KERNELS, and MAKE_JUST_WORLDS variables for
'make universe'.
If dlclose() is called recursively from a _fini() function, the inner
dlclose() call may unload the object of the outer call prematurely
because objects are unreferenced before _fini() calls.
Fix this by unreferencing objects after calling objlist_call_fini() in
dlclose(). Therefore objlist_call_fini() now calls the fini function if
the reference count of an object is 1. In addition we must restart the
list_fini traversal after every _fini() call because another dlclose()
call might have modified the reference counts.
Add an XXX comment to objlist_call_fini() about possible race with
dlopen().
jhb [Thu, 3 Mar 2011 16:58:58 +0000 (16:58 +0000)]
MFC 218270:
Use M_WAITOK rather than M_NOWAIT when creating taskqueues via the
TASKQUEUE_DEFINE macros. All the places that use these macros to create
taskqueues assume that the operation succeeds.
jkim [Thu, 3 Mar 2011 00:24:55 +0000 (00:24 +0000)]
MFC: r217515, r217519, r217539
Add reader/writer lock around mem_range_attr_get() and mem_range_attr_set().
Compile sys/dev/mem/memutil.c for all supported platforms and remove now
unnecessary dev_mem_md_init(). Consistently define mem_range_softc from
mem.c for all platforms. Add missing #include guards for machine/memdev.h
and sys/memrange.h. Clean up some nearby style(9) nits.
jhb [Wed, 2 Mar 2011 21:50:59 +0000 (21:50 +0000)]
MFC 217239:
Add a nested include of <sys/linker_set.h> to make the sysctl(9) manpage
accurate. <sys/linker_set.h> is one of the very few headers similar to
<sys/queue.h> for which nested includes are allowed.
dchagin [Wed, 2 Mar 2011 20:08:52 +0000 (20:08 +0000)]
MFC r218879:
Do not clobber %rdx.
Before calling the vfork() syscall, the Linux user-space stores the current PID
in %rdx and restores it when the parent process leaves the kernel.
dchagin [Wed, 2 Mar 2011 20:04:54 +0000 (20:04 +0000)]
MFC r218744:
To avoid excessive code duplication, create a wrapper to fill regs
from the stack frame. Change the trap() code to use the newly created function
instead of explicit regs assignment.
dchagin [Wed, 2 Mar 2011 20:01:24 +0000 (20:01 +0000)]
MFC r218719 (by hand, depends on r209592):
Make the linux_rt_sigtimedwait() system call actually work.
1) Translate the native signal number into the appropriate Linux signal.
2) Remove bogus code, which can lead to a panic as it calls
kern_sigtimedwait with the same ksiginfo.
3) Return the corresponding signal number.
jhb [Wed, 2 Mar 2011 19:27:01 +0000 (19:27 +0000)]
MFC 217805:
Fix a LOR by dropping the global ifnet locks while allocating a new ifnet
table in if_grow(). The order of the SYSINITs for ifnet state was swapped
so that the various locks were initialized before being used.
jkim [Wed, 2 Mar 2011 19:09:49 +0000 (19:09 +0000)]
MFC: r216634, r216673
Improve PCB flags handling and make it more robust. Add two new functions
for manipulating pcb_flags. These inline functions are very similar to
atomic_set_int(9) and atomic_clear_int(9) but without unnecessary LOCK
prefix for SMP. Add comments about the rationale. Use these functions
wherever possible. Although there are some places where it is not strictly
necessary (e.g., a PCB is copied to create a new PCB), it is done across
the board for sake of consistency. Turn pcb_full_iret into a PCB flag as
it is safe now. Move rarely used fields before pcb_flags and reduce size
of pcb_flags to four bytes. Fix some style(9) nits in pcb.h while I am in
the neighborhood.
netchild [Wed, 2 Mar 2011 09:53:13 +0000 (09:53 +0000)]
MFC r215664:
When using the 32-bit Linux version of Sun's Java Development Kit 1.6
on FreeBSD (amd64), invocations of "javac" (or "java") eventually
end with the output of "Killed" and exit code 137.
This is caused by:
1. After calling exec() in a multithreaded Linux program, threads are not
destroyed and continue running. They get killed after the program being
executed finishes.
2. linux_exit_group doesn't return the correct exit code when called from
a thread other than the group leader, which happens regularly with the Sun JVM.
The submitters fix this in a similar way to how NetBSD handles this.
I took the PRs away from dchagin, who seems to have been out of touch with
this for a while (no response from him).
The patches committed here are from [2], with some small style
modifications from me.
PR: 141439 [1], 144194 [2]
Submitted by: Stefan Schmidt <stefan.schmidt@stadtbuch.de>, gk
Reviewed by: rdivacky (in april 2010)
MFC r215675:
Do not take the process lock. The assignment to u_short inside the
properly aligned structure is atomic on all supported architectures, and
the thread that should see side-effect of assignment is the same thread
that does assignment.
Use a more appropriate conditional to detect the linux ABI.
dchagin [Wed, 2 Mar 2011 06:09:52 +0000 (06:09 +0000)]
MFC r218100:
The kern_wait() code already removes the SIGCHLD signal for the waited
process. Removing other SIGCHLD signals is not needed and may cause
problems.
kib [Wed, 2 Mar 2011 00:36:28 +0000 (00:36 +0000)]
MFC r218972:
Move the max_threads_per_proc and max_threads_hits variables to the
file where they are used. Declare the kern.threads sysctl node at the
same location. Since no external use for the variables exists, make them
static.
MFC r218976 (by pluknet):
Clean up the now unused #include statement.
kib [Tue, 1 Mar 2011 21:51:32 +0000 (21:51 +0000)]
MFC r210431:
Remove the linux_exec_copyin_args(), freebsd32_exec_copyin_args() may
serve as well. COMPAT_FREEBSD32 is a prerequisite for COMPAT_LINUX32.
MFC r210451:
Use a forward declaration for enum uio_seg in imgact.h. This allows removing
the inclusion of sys/uio.h from the header.
MFC r210498:
Revert r210451, and the similar part of r210431. A forward declaration
of an enum tag whose definition is not complete is not allowed by
C99; it is a gcc extension.
dchagin [Tue, 1 Mar 2011 20:44:14 +0000 (20:44 +0000)]
MFC r217896:
Add a macro to test the sv_flags of any process. Change some places to test
the flags instead of explicitly comparing with the addresses of known
sysentvec structures.
yongari [Tue, 1 Mar 2011 00:04:34 +0000 (00:04 +0000)]
MFC r218289:
Disable TX IP checksum offloading for RTL8168C controllers. The
controller in question generates frames with bad IP checksum value
if packets contain IP options. For instance, packets generated by
ping(8) with record route option have wrong IP checksum value. The
controller correctly computes checksum for normal TCP/UDP packets
though.
There are two known RTL8168/8111C variants on the market and the issue
I observed happened on RL_HWREV_8168C_SPIN2. I'm not sure whether
RL_HWREV_8168C also has the same issue, but it is safer to
assume it does since they should share the same core.
RTL8102E, which was supposed to be released at the time of the
RTL8168/8111C announcement, does not have the issue.
Tested by: Konstantin V. Krotov ( kkv <> insysnet dot ru )
yongari [Tue, 1 Mar 2011 00:01:34 +0000 (00:01 +0000)]
MFC r217911:
Add support for RTL8105E PCIe Fast Ethernet controller. It seems
the controller has a kind of embedded controller/memory and vendor
applies a large set of magic code via undocumented PHY registers in
device initialization stage. I guess it's a firmware image for the
embedded controller in RTL8105E since the code is too big compared
to other DSP fixups. However I have no idea what that magic code
does or what the purpose of the embedded controller is. Fortunately the
driver seems to still work without loading the firmware.
While I'm here change device description of RTL810xE controller.
yongari [Mon, 28 Feb 2011 23:41:27 +0000 (23:41 +0000)]
MFC r217902:
Do not use the interrupt taskqueue on controllers with MSI/MSI-X
capability. One reason for using the interrupt taskqueue in re(4) was
to reduce the number of TX/RX interrupts under load because re(4)
controllers have no good TX/RX interrupt moderation mechanism.
Basic TX interrupt moderation is done by hardware for most
controllers but RX interrupt moderation through undocumented
register showed poor RX performance so it was disabled in r215025.
Using taskqueue to handle RX interrupt greatly reduced number of
interrupts but re(4) consumed all available CPU cycles to run the
taskqueue under high TX/RX network load. This can happen even with
RTL810x fast ethernet controller and I believe this is not
acceptable for most systems.
To mitigate the issue, use one-shot timer register to moderate RX
interrupts. The timer register provides programmable one-shot timer
and can be used to suppress interrupt generation. The timer runs at
125MHz on PCIe controllers so the minimum time allowed for the
timer is 8ns. Data sheet says the register is 32 bits but
experimentation shows only lower 13 bits are valid so maximum time
that can be programmed is 65.528us. This yields theoretical maximum
number of RX interrupts that could be generated per second is about
15260. Combined with TX completion interrupts re(4) shall generate
less than 20k interrupts. This number is still slightly high
compared to other intelligent ethernet controllers but system is
very responsive even under high network load.
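The figures in the paragraph above follow from the 125MHz clock and the 13 valid register bits. The sketch below reproduces that arithmetic; the macro and function names are hypothetical and do not come from the driver, and usec_to_ticks() is only an assumed way a microsecond setting might map onto the timer.

```c
#include <stdint.h>

#define	TIMER_HZ	125000000ULL	/* timer clock on PCIe controllers */
#define	TIMER_BITS	13		/* valid low bits found by experiment */
#define	NSEC_PER_SEC	1000000000ULL

/* Length of one timer tick in nanoseconds: 1e9 / 125e6. */
uint64_t
timer_tick_ns(void)
{
	return (NSEC_PER_SEC / TIMER_HZ);
}

/* Largest value the valid bits can hold: 2^13 - 1. */
uint64_t
timer_max_ticks(void)
{
	return ((1ULL << TIMER_BITS) - 1);
}

/* Longest programmable delay in nanoseconds. */
uint64_t
timer_max_ns(void)
{
	return (timer_max_ticks() * timer_tick_ns());
}

/* RX interrupts per second when every delay is at the maximum. */
uint64_t
max_rx_intr_per_sec(void)
{
	return (NSEC_PER_SEC / timer_max_ns());
}

/* Assumed conversion of a delay in microseconds to ticks, clamped. */
uint64_t
usec_to_ticks(uint64_t usec)
{
	uint64_t ticks = usec * 1000 / timer_tick_ns();

	return (ticks > timer_max_ticks() ? timer_max_ticks() : ticks);
}
```

With these inputs the tick is 8ns, the maximum delay is 8191 ticks or 65528ns (65.528us), and the resulting rate is 15260 interrupts per second, matching the numbers quoted in the log.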
Introduce sysctl variable dev.re.%d.int_rx_mod that controls amount
of time to delay RX interrupt processing in units of us. Value 0
completely disables RX interrupt moderation. To provide old
behavior for controllers that have MSI/MSI-X capability, introduce
a new tunable hw.re.intr_filter. If the tunable is set to non-zero
value, driver will use interrupt taskqueue. The default value of
the tunable is 0. This tunable has no effect on controllers that
have no MSI/MSI-X capability or if MSI/MSI-X is explicitly disabled
by the administrator.
While I'm here, clean up interrupt setup/teardown since re(4) uses
a single MSI/MSI-X message at this moment.
rwatson [Mon, 28 Feb 2011 23:28:35 +0000 (23:28 +0000)]
Merge userspace DTrace support from head to stable/8:
r209721:
Merge from vendor-sys/opensolaris:
* add fasttrap files
r209731:
Introduce USD_{SET,GET}{BASE,LIMIT}. These help setting up the user
segment descriptor hi and lo values. Idea from Solaris.
Reviewed by: kib
r209763:
Fix style issues with the previous commit, namely
use-tab-instead-of-space and don't use underscores in macro variables.
Pointed out by: bde
r210292:
Fix typo in comment.
r210357:
MFamd64:
Add USD_GETBASE(), USD_SETBASE(), USD_GETLIMIT() and USD_SETLIMIT().
r210611:
Bump the witness pendlist to 768 to accommodate the increased number of
spinlocks.
r211553:
Add sysname to struct opensolaris_utsname. This is needed by one DTrace
test.
r211566:
Add a sysname char * to struct opensolaris_utsname.
r211606:
Add the FreeBSD definition for the fasttrap ioctls.
r211607:
Add a compatibility function dtrace_instr_size_isa() that on
FreeBSD does the same as dtrace_dis_isize().
r211608:
Kernel DTrace support for:
o uregs (sson@)
o ustack (sson@)
o /dev/dtrace/helper device (needed for USDT probes)
r211610:
Add more compatibility structure members needed by the upcoming fasttrap
DTrace device.
r211611:
Destroy the helper device when unloading.
r211613:
Fix style issues.
r211614:
Bump KDTRACE_THREAD_ZERO and use M_ZERO as a malloc flag instead of
calling bzero.
r211615:
Remove an elif and add an or-clause.
r211616:
Add an extra comment to the SDT probes definition. This allows us to
use '-' in probe names, matching the probe names in Solaris.
Add userland SDT probes definitions to sys/sdt.h.
r211617:
Call the systrace_probe_func() when the error value.
r211618:
Port this to FreeBSD. We miss some suword functions, so we use copyout.
r211738:
Port the fasttrap provider to FreeBSD. This provider is responsible for
injecting debugging probes in the userland programs and is the basis for
the pid provider and the usdt provider.
r211744:
MD fasttrap implementation.
r211745:
Replace a pksignal() call with tdksignal().
Pointed out by: kib
r211746:
Update for the recent location of the fasttrap code.
r211747:
Replace structure assignments with explicit memcpy calls. This allows
Clang to compile this file: it was using the builtin memcpy and we want
to use the memcpy defined in gptboot.c. (Clang can't compile boot2 yet).
Submitted by: Dimitry Andric <dimitry at andric.com>
Reviewed by: jhb
r211751:
Add a trap code for DTrace induced traps.
r211752:
Add two DTrace trap type values. Used by fasttrap.
r211753:
Enable fasttrap and make dtraceall depend on fasttrap when building i386
or amd64.
r211804:
Call the necessary DTrace function pointers when we have different kinds
of traps.
r211813:
Add the necessary DTrace function pointers.
r211839:
Sync DTrace bits with amd64 and fix the build.
r211924:
Register an interrupt vector for DTrace return probes. There is some
code missing in lapic to make sure that we don't overwrite this entry,
but this will be done in a subsequent commit.
r211925:
Replace a memory barrier with a mutex barrier.
r211926:
Add the path necessary to find fasttrap_isa.h to CFLAGS.
r211929:
Remove debugging.
r212004:
When DTrace is enabled, make sure we don't overwrite the IDT_DTRACE_RET
entry with an IRQ for some hardware component.
Reviewed by: jhb
r212093:
Make the /dev/dtrace/helper node have the mode 0660. This allows
programs that refuse to run as root (pgsql) to install probes when their
user is part of the wheel group.
r212357:
Fix two bugs in DTrace:
* when the process exits, remove the associated USDT probes
* when the process forks, duplicate the USDT probes.
r212465:
Avoid a LOR (sleepable after non-sleepable) in
fasttrap_tracepoint_enable().
r212494:
Revamp locking a bit. This fixes three problems:
* processes now can't go away while we are inserting probes (fixes a panic)
* if a trap happens, we won't be holding the process lock (fixes a hang)
* fix a LOR between the process lock and the fasttrap bucket list lock
Thanks to kib for pointing out some problems.
r212568:
Bump __FreeBSD_version to reflect the userland DTrace changes
Sponsored by: The FreeBSD Foundation
Userspace DTrace work by: rpaulo
yongari [Mon, 28 Feb 2011 21:21:24 +0000 (21:21 +0000)]
MFC r217857:
Prefer MSI-X to MSI on controllers that support MSI-X. All
recent PCIe controllers (RTL8102E or later and RTL8168/8111C or
later) support either 2 or 4 MSI-X messages. Unfortunately the vendor
has not publicly released RSS related information yet. However
switching to MSI-X is one step forward toward supporting RSS.
ken [Mon, 28 Feb 2011 16:39:15 +0000 (16:39 +0000)]
MFC: r219036
Silence 'out of chain frames' warnings and bump the number of frames.
mps.c: Hide the 'out of chain frames' warning behind MPS_INFO.
mps_sas.c: Hide the SIM queue freeze/unfreeze messages behind MPS_INFO.
mpsvar.h: Bump the number of chain frames from 1024 to 2048. From
testing, it looks like this makes it less likely that we'll
run out of chain frames, and it doesn't cost much memory
(32K).
alc [Sat, 26 Feb 2011 21:27:41 +0000 (21:27 +0000)]
MFC r217453
For some time now, the kernel and kmem objects have been ordinary
OBJT_PHYS objects. Thus, there is no need for handling them specially
in vm_fault(). In fact, this special case handling would have led to
an assertion failure just before the call to pmap_enter().
alc [Sat, 26 Feb 2011 21:08:09 +0000 (21:08 +0000)]
MFC r216090
Correct an error in the allocation of the vm_page_dump array in
vm_page_startup(). Specifically, the dump_avail array should be used
instead of the phys_avail array to calculate the size of vm_page_dump.