jhb [Fri, 17 Aug 2012 14:14:25 +0000 (14:14 +0000)]
Allow static DMA allocations that allow for enough segments to do page-sized
segments for the entire allocation to use kmem_alloc_attr() to allocate
KVM rather than using kmem_alloc_contig(). This avoids requiring
a single physically contiguous chunk in this case.
Submitted by: Peter Jeremy (original version)
MFC after: 1 month
rrs [Fri, 17 Aug 2012 05:51:46 +0000 (05:51 +0000)]
Ok jhb, lets move the ifa_free() down to the bottom to
assure that *all* tables and such are removed before
we start to free. This won't protect the Hash in ip_input.c
but in theory should protect any other uses that *do* use locks.
alc [Fri, 17 Aug 2012 05:02:29 +0000 (05:02 +0000)]
Fix two problems with pmap_clear_modify().
First, pmap_clear_modify() is write protecting all mappings to the specified
page, not just clearing the modified bit. Specifically, it sets PTE_RO on
the PTE, which is wrong. Moreover, it is calling vm_page_dirty(), which is
not the expected behavior for pmap_clear_modify(). Generally speaking, the
machine-independent VM layer masks these mistakes. For example, setting
PTE_RO will result in additional soft faults, but not a catastrophe.
Second, pmap_clear_modify() may not clear the modified bits because it only
iterates over the PV list when the page has the PV_TABLE_MOD flag set and
elsewhere the pmap clears the PV_TABLE_MOD flag anytime a modified mapping
is write protected or destroyed. However, the page may still have other
mappings with the modified bit set.
mckay [Fri, 17 Aug 2012 02:27:17 +0000 (02:27 +0000)]
Correct a regression introduced during the import of file(1) 5.11.
Magic tests containing "search" or "regex" directives were incorrectly
compiled by "mkmagic" and were effectively ignored. This caused troff
files (for example) to be detected as simply "ASCII text" instead of
as "troff or preprocessor input, ASCII text".
PR: bin/170415
Approved by: consensus on developers@
MFC after: 3 days
davidxu [Fri, 17 Aug 2012 02:26:31 +0000 (02:26 +0000)]
Implement syscall clock_getcpuclockid2, so we can get a clock id
for process, thread or others we want to support.
Use the syscall to implement POSIX API clock_getcpuclock and
pthread_getcpuclockid.
lstewart [Fri, 17 Aug 2012 01:49:51 +0000 (01:49 +0000)]
The TCP PAWS fix for kernels with fast tick rates (r231767) changed the TCP
timestamp related stack variables to reference ms directly instead of ticks.
The h_ertt(4) Khelp module relies on TCP timestamp information in order to
calculate its enhanced RTT estimates, but was not updated as part of r231767.
Consequently, h_ertt has not been calculating correct RTT estimates since
r231767 was comitted, which in turn broke all delay-based congestion control
algorithms because they rely on the h_ertt RTT estimates.
Fix the breakage by switching h_ertt to use tcp_ts_getticks() in place of all
previous uses of the ticks variable. This ensures all timestamp related
variables in h_ertt use the same units as the TCP stack and therefore results in
meaningful comparisons and RTT estimate calculations.
Reported & tested by: Naeem Khademi (naeemk at ifi uio no)
Discussed with: bz
MFC after: 3 days
np [Fri, 17 Aug 2012 00:49:29 +0000 (00:49 +0000)]
Support for TCP DDP (Direct Data Placement) in the T4 TOE module.
Basically, this is automatic rx zero copy when feasible. TCP payload is
DMA'd directly into the userspace buffer described by the uio submitted
in soreceive by an application.
- Works with sockets that are being handled by the TCP offload engine
of a T4 chip (you need t4_tom.ko module loaded after cxgbe, and an
"ifconfig +toe" on the cxgbe interface).
- Does not require any modification to the application.
- Not enabled by default. Use hw.t4nex.<X>.toe.ddp="1" to enable it.
np [Thu, 16 Aug 2012 22:33:56 +0000 (22:33 +0000)]
Initialize various DDP parameters in the main cxgbe(4) driver:
- Setup multiple DDP page sizes. When the driver attempts DDP it will
try to combine physically contiguous pages into regions of these sizes.
- Set the indicate size such that the payload carried in the indicate can
be copied in the header mbuf (and the 16K rx buffer can be recycled).
- Set DDP threshold to the max payload that the chip will coalesce and
deliver to the driver (this is ~16K by default, which is also why the
offload rx queue is backed by 16K buffers). If the chip is able to
coalesce up to the max it's allowed to, it's a good sign that the peer
is transmitting in bulk without any TCP PSH.
np [Thu, 16 Aug 2012 20:15:29 +0000 (20:15 +0000)]
Add a routine (t4_set_tcb_field) to update arbitrary parts of a hardware
TCB. Filters are programmed by modifying the TCB too (via a different
routine) and the reply to any TCB update is delivered via a
CPL_SET_TCB_RPL. Figure out whether the reply is for a filter-write or
something else and route it appropriately.
jhb [Thu, 16 Aug 2012 17:17:08 +0000 (17:17 +0000)]
Add locking for sscdisk(4) and mark it MPSAFE. Since this driver just
makes calls out to the emulator, the locking is fairly simple. A global
mutex protects the list of ssc disks, and each ssc disk has a mutex
to protect it's bioq.
des [Thu, 16 Aug 2012 08:29:49 +0000 (08:29 +0000)]
- When running out of swzone, instead of spewing an error message every
tick until the situation is resolved (if ever), just print a single
message when running out and another when space becomes available.
- When adding more swap, warn if the total amount exceeds half the
theoretical maximum we can handle.
gonzo [Thu, 16 Aug 2012 00:51:50 +0000 (00:51 +0000)]
Merge somewhat modified r230399 from projects/armv6:
Add timeout to wait for network controllers to appear when netbooting.
USB ethernet adapter initialization usually is delayed and
they're not available immidiately after autoconfiguration. So we need
to wait a bit before giving up
jfv [Wed, 15 Aug 2012 17:12:40 +0000 (17:12 +0000)]
Customer report of a panic on boot due to the old
"m_getjcl:invalid cluster type" that occurred some
time back with the igb driver. This happens often when
booting over the net. I believe the NIC hardware is left
in a warm state when handed over to the driver, and a stray
RX interrupt happens earlier than the code is prepared for
it to happen. This change was verified to fix the problem,
its kind of a bandaid... but it is similar to what was done
in the igb code.
hselasky [Wed, 15 Aug 2012 16:19:39 +0000 (16:19 +0000)]
Streamline use of cdevpriv and correct some corner cases.
1) It is not useful to call "devfs_clear_cdevpriv()" from
"d_close" callbacks, hence for example read, write, ioctl and
so on might be sleeping at the time of "d_close" being called
and then then freed private data can still be accessed.
Examples: dtrace, linux_compat, ksyms (all fixed by this patch)
2) In sys/dev/drm* there are some cases in which memory will
be freed twice, if open fails, first by code in the open
routine, secondly by the cdevpriv destructor. Move registration
of the cdevpriv to the end of the drm open routines.
3) devfs_clear_cdevpriv() is not called if the "d_open" callback
registered cdevpriv data and the "d_open" callback function
returned an error. Fix this.
kib [Wed, 15 Aug 2012 15:56:21 +0000 (15:56 +0000)]
Add a sysctl kern.pid_max, which limits the maximum pid the system is
allowed to allocate, and corresponding tunable with the same
name. Note that existing processes with higher pids are left intact.
hselasky [Wed, 15 Aug 2012 15:42:57 +0000 (15:42 +0000)]
Revert r239178 and implement two new functions, namely
"device_free_softc()" and "device_claim_softc()",
to allow USB serial drivers refcounting the softc.
These functions are used to grab the softc from
auto-free and to free the softc back to the correct
malloc type, respectivly.
adrian [Wed, 15 Aug 2012 08:14:16 +0000 (08:14 +0000)]
Extend the non-aggregate TX descriptor chain routine to be aware of:
* the descriptor ID, and
* the multi-buffer support that the EDMA chips support.
This is required for successful MAC transmission of multi-descriptor
frames. The MAC simply hangs if there are NULL buffers + 0 length pointers,
but the descriptor did have TxMore set.
This won't be done for the 11n aggregate path, as that will be modified
to use the newer API (ie, ath_hal_filltxdesc() and then set first|middle|
last_aggr), which will deprecate some of the current code.
TODO:
* Populate the numTxMaps field in the HAL, then make sure that's fetched
by the driver. Then I can undo that hack.
Tested:
* AR9380, AP mode, TX'ing non-aggregate 802.11n frames;
* AR9280, STA/AP mode, doing aggregate and non-aggregate traffic.
gonzo [Wed, 15 Aug 2012 03:49:10 +0000 (03:49 +0000)]
Merging of projects/armv6, part 4
r233822:
Remove useless and wrong piece of code in fdt_get_range() which i
overwrites passed phandle_t node. Modify debug printf in fdt_reg_to_rl()
to be consistent (that is, print start and end *virtual* addresses).
r230560:
Handle "ranges;"
Make fdt_reg_to_rl() responsible for mapping the device memory, instead
on just hoping that there's only one simplebus, and using fdt_immr_va as
the base VA.
r230315
Add a function to get the PA from range, instead of (ab)using
fdt_immr_pa, and use it for the UART driver
gonzo [Wed, 15 Aug 2012 03:21:56 +0000 (03:21 +0000)]
Merging of projects/armv6, part 3
r238211:
Support TARGET_ARCH=armv6 and TARGET_ARCH=armv6eb
This adds a new TARGET_ARCH for building on ARM
processors that support the ARMv6K multiprocessor
extensions. In particular, these processors have
better support for TLS and mutex operations.
This mostly touches a lot of Makefiles to extend
existing patterns for inferring CPUARCH from ARCH.
It also configures:
* GCC to default to arm1176jz-s
* GCC to predefine __FreeBSD_ARCH_armv6__
* gas to default to ARM_ARCH_V6K
* uname -p to return 'armv6'
* make so that MACHINE_ARCH defaults to 'armv6'
It also changes a number of headers to use
the compiler __ARM_ARCH_XXX__ macros to configure
processor-specific support routines.
gonzo [Wed, 15 Aug 2012 03:03:03 +0000 (03:03 +0000)]
Merging projects/armv6, part 1
Cummulative patch of changes that are not vendor-specific:
- ARMv6 and ARMv7 architecture support
- ARM SMP support
- VFP/Neon support
- ARM Generic Interrupt Controller driver
- Simplification of startup code for all platforms
np [Wed, 15 Aug 2012 01:03:13 +0000 (01:03 +0000)]
The size of the buffers in an Ethernet freelist has to be higher than the
interface's MTU. Initialize such freelists with correct values.
This wasn't a problem for common MTUs (1500 and 9000) as the buffers (2048
and 9216 in size) happened to have enough spare room. I ran into it when
playing around with unusual MTUs.
gavin [Tue, 14 Aug 2012 22:21:46 +0000 (22:21 +0000)]
Rename command defines to match names used in the datasheet, in order to
make maintaining this driver from the documentation easier in the future.
This is a mostly mechanical change.
In uslcom_param(), move the zeroing of the final two fields of the
flowctrl structure outside of the "if CRTSCTS" section - not only were
they being zeroed in both the clauses, but these two fields have nothing
to do with hardware flow control anyway.
np [Tue, 14 Aug 2012 21:47:41 +0000 (21:47 +0000)]
Convert some fixed parameters to tunables (with reasonable default
values).
- cong_drop specifies what to do on congestion: nothing, backpressure,
or drop.
- fl_pktshift specifies the padding before Ethernet payload.
- fl_pad specifies the boundary upto which to pad Ethernet payload.
- spg_len controls the length of the status page.
jh [Tue, 14 Aug 2012 19:16:30 +0000 (19:16 +0000)]
Reserve room for the terminating NUL when setting or getting kernel
environment variables. KENV_MNAMELEN and KENV_MVALLEN doesn't include
space for the terminating NUL.
kib [Tue, 14 Aug 2012 12:15:01 +0000 (12:15 +0000)]
Add a hackish debugging facility to provide a bit of information about
reason for generated trap. The dump of basic signal information and 8
bytes of the faulting instruction are printed on the controlling
terminal of the process, if the machdep.uprintf_signal syscal is
enabled.
The print is the only practical way to debug traps from a.out
processes I am aware of. Because I have to reimplement it each time I
debug an issue with a.out support on amd64, commit the hack to main
tree.
kib [Tue, 14 Aug 2012 12:13:27 +0000 (12:13 +0000)]
Real hardware, as opposed to QEMU, does not allow to have a call gate
in long mode which transfers control to 32bit code segment. Unbreak
the lcall $7,$0 implementation on amd64 by putting the 64bit user code
segment' selector into call gate, and execute the 64bit trampoline
which converts the return frame into 32bit format and switches back to
32bit mode for executing int $0x80 trampoline.
Note that all jumps over the hoops are performed in the user mode.
kib [Tue, 14 Aug 2012 12:11:48 +0000 (12:11 +0000)]
For old mmap syscall, when executing on amd64 or ia64, enforce the
PROT_EXEC if prot is non-zero, process is 32bit and
kern.elf32.i386_read_exec syscal is enabled. This workaround is needed
for old i386 a.out binaries, where dynamic linker did not specified
PROT_EXEC for mapping of the text.
The kern.elf32.i386_read_exec MIB name looks weird for a.out binaries,
but I reused the existing knob which already has the needed semantic.
kib [Tue, 14 Aug 2012 11:47:07 +0000 (11:47 +0000)]
Adjust the r205536, by allowing a non-zero offset for anonymous
mappings for a.out binaries. Apparently, a.out ld.so from FreeBSD
1.1.5.1 can issue such requests.
Reported and tested by: Dan Plassche <dplassche@gmail.com>
MFC after: 1 week
kib [Tue, 14 Aug 2012 11:45:47 +0000 (11:45 +0000)]
Do not leave invalid pages in the object after the short read for a
network file systems (not only NFS proper). Short reads cause pages
other then the requested one, which were not filled by read response,
to stay invalid.
Change the vm_page_readahead_finish() interface to not take the error
code, but instead to make a decision to free or to (de)activate the
page only by its validity. As result, not requested invalid pages are
freed even if the read RPC indicated success.