bschmidt [Tue, 13 Mar 2012 21:25:25 +0000 (21:25 +0000)]
Update the rt2860's firmware and add a Makefile for the module. While
here remove the ucode header file which was used to generate the fw files
but by now is outdated.
dim [Tue, 13 Mar 2012 19:40:56 +0000 (19:40 +0000)]
Pull in a fix (still under GPLv2) for a double free in gdb, leading to
an assert, which can occur if you repeatedly dlopen() and dlclose() a
.so file in a tight loop. This was reported on freebsd-current@ by
Alexandre Martins, with a sample to reproduce the behaviour.
dim [Tue, 13 Mar 2012 19:18:34 +0000 (19:18 +0000)]
Update comments and CFLAGS in sys/conf/kern.mk, introduced in r221879,
to match reality: clang does _not_ disable SSE automatically when
-mno-mmx is used, you have to specify -mno-sse explicitly.
Note this was the case even before r232894, which only makes a change in
the 'positive' flag case; e.g. when you specify -msse, MMX gets enabled
too.
nwhitehorn [Tue, 13 Mar 2012 18:59:19 +0000 (18:59 +0000)]
Work around a binutils bug on powerpc64 where the TOC would not be
properly reloaded when calling _fini() in large binaries with multiple
TOC sections (e.g. GCC), leading to a segmentation fault. Adding -mlongcall
to crt1 flags causes the compiler to emit explicit TOC load instructions
for all function calls, including _fini().
mav [Tue, 13 Mar 2012 10:21:08 +0000 (10:21 +0000)]
Add kern.eventtimer.activetick tunable/sysctl, specifying whether each
hardclock() tick should be run on every active CPU, or on only one.
On my tests, avoiding extra interrupts because of this on 8-CPU Core i7
system with HZ=10000 saves about 2% of performance. At this moment option
implemented only for global timers, as reprogramming per-CPU timers is
too expensive now to be compensated by this benefit, especially since we
still have to regularly run hardclock() on at least one active CPU to
update system uptime. For global timer it is quite trivial: timer runs
always, but we just skip IPIs to other CPUs when possible.
Option is enabled by default now, keeping previous behavior, as periodic
hardclock() calls are still used at least to implement setitimer(2) with
ITIMER_VIRTUAL and ITIMER_PROF arguments. But since default schedulers don't
depend on it since r232917, we are much more free to experiment with it.
mav [Tue, 13 Mar 2012 08:18:54 +0000 (08:18 +0000)]
Rewrite thread CPU usage percentage math to not depend on periodic calls
with HZ rate through the sched_tick() calls from hardclock().
Potentially it can be used to improve precision, but now it is just minus
one more reason to call hardclock() for every HZ tick on every active CPU.
SCHED_4BSD never used sched_tick(), but keep it in place for now, as at
least SCHED_FBFS existing in patches out of the tree depends on it.
adrian [Tue, 13 Mar 2012 06:28:52 +0000 (06:28 +0000)]
Fix link status handling on if_arge upon system boot to allow bootp/NFS to
function.
From the submitter:
This patch fixes an issue I encountered using an NFS root with an
ar71xx-based MikroTik RouterBoard 450G on -current where the kernel fails
to contact a DHCP/BOOTP server via if_arge when it otherwise should be able
to. This may be the same issue that Monthadar Al Jaberi reported against
an RSPRO on 6 March, as the signature is the same:
%%%
DHCP/BOOTP timeout for server 255.255.255.255
DHCP/BOOTP timeout for server 255.255.255.255
DHCP/BOOTP timeout for server 255.255.255.255
.
.
.
DHCP/BOOTP timeout for server 255.255.255.255
DHCP/BOOTP timeout for server 255.255.255.255
arge0: initialization failed: no memory for rx buffers
DHCP/BOOTP timeout for server 255.255.255.255
arge0: initialization failed: no memory for rx buffers
%%%
The primary issue that I found is that the DHCP/BOOTP message that
bootpc_call() is sending never makes it onto the wire, which I believe is
due to the following:
- Last December, a change was made to the ifioctl that bootpc_call() uses
to adjust the netmask around the sosend().
- The new ioctl (SIOCAIFADDR) performs an if_init when invoked, whereas the
old one (SIOCSIFNETMASK) did not.
- if_arge maintains its own sense of link state in sc->arge_link_status.
- On a single-phy interface, sc->arge_link_status is initialized to 0 in
arge_init_locked().
- sc->arge_link_status remains 0 until a phy state change notification
causes arge_link_task to run, notice the link is up, and set it to 1.
- The inits caused by the ifioctls in bootpc_call are reinitializing the
interface, but not the phy, so sc->arge_link_status goes to 0 and remains
there.
- arge_start_locked() always sees sc->arge_link_status == 0 and returns
without queuing anything.
The attached patch changes arge_init_locked() such that in the single-phy
case, instead of initializing sc->arge_link_status to 0, it runs
arge_link_task() to set it according to the current phy state. This change
has allowed my setup to mount an NFS root successfully.
Submitted by: Patrick Kelsey <kelsey@ieee.org>
Reviewed by: juli
jmallett [Tue, 13 Mar 2012 06:22:49 +0000 (06:22 +0000)]
Don't build kernel.tramp on Octeon. Probably building it should be opt-in
not opt-out, but I don't know enough about which ports need it to get the
defaults right.
adrian [Tue, 13 Mar 2012 06:15:20 +0000 (06:15 +0000)]
Correctly (I hope) deallocate the if_arge RX buffer ring on arge_stop().
I had some interesting hangs until I realised I should try flushing the
DDR FIFO register and lo and behold, hangs stopped occuring.
I've put in a few DDR flushes here and there in case people decide to
reuse some of these functions. It's very very likely they're almost
all superflous.
To test:
* Connect to a network with a _lot_ of broadcast traffic
* Do this:
# while true; do ifconfig arge0 down; ifconfig arge0 up; done
This fixes the mbuf exhaustion that has been reported when the interface
state flaps up/down.
jmallett [Tue, 13 Mar 2012 00:45:27 +0000 (00:45 +0000)]
sysinstall was removed from usr.sbin/Makefile in r225937. Because per-arch
Makefiles were split out in this directory and others in userland, it makes it
quite easy to miss per-arch conditionals when changing something generally.
jpaetzel [Mon, 12 Mar 2012 21:28:54 +0000 (21:28 +0000)]
Improve ZFS exporting functionality, only export pools which are on a
specific device we happen to be writing to. This fixes an issue when
running pc-sysinstall on a running system which needs ZFS and the main
disk gets exported.
jmallett [Mon, 12 Mar 2012 21:25:32 +0000 (21:25 +0000)]
o) Use ABI, not ISA_* options, to determine whether to compile bits if libkern
required for the ABI the kernel is being built for.
XXX This is implemented in a kind-of nasty way that involves including source
files, but it's still an improvement.
o) Retire ISA_* options since they're unused and were always wrong.
dim [Mon, 12 Mar 2012 21:07:22 +0000 (21:07 +0000)]
Pull in r145194 from upstream clang trunk:
Make our handling of MMX x SSE closer to what gcc does:
* Enabling sse enables mmx.
* Disabling (-mno-mmx) mmx, doesn't disable sse (we got this right already).
* The order in not important. -msse -mno-mmx is the same as -mno-mmx -msse.
adrian [Mon, 12 Mar 2012 20:32:23 +0000 (20:32 +0000)]
Configuration changes/updates!
* enable ALQ and net80211/ath ALQ logging by default, to make it possible
to get debug register traces.
* Update some comments
* Enable HWPMC for testing.
gonzo [Mon, 12 Mar 2012 20:24:59 +0000 (20:24 +0000)]
- Although we pass first 4 arguments in registers, function callinf ABI requires
space to be reserved for them in stack. _rtld() prologue saves a1 and a2 in
this space.
scottl [Mon, 12 Mar 2012 19:29:35 +0000 (19:29 +0000)]
Final pass at having devices use their bus parent for dma tags. The
remaining drivers that haven't been converted have various problems or
complexities that will be dealt with later. This list includes:
hptrr, hptmv, hpt27xx - device aggregation across multiple parents
drm - want to talk to the maintainer first
tsec, sec - Openfirmware devices, not sure if changes are warranted
fatm - Done except for unused testing code
usb - want to talk to the maintainer first
ce, cp, ctau, cx - Significant driver changes needed to convey parent info
There are also devices tucked into architecture subtrees that I'll leave
for the respective maintainers to deal with.
rrs [Mon, 12 Mar 2012 15:05:17 +0000 (15:05 +0000)]
This fixes PR 165210. Basically we just
add in the netgraph interface to the list of
acceptable interfaces. A todo at the next
IETF code blitz, though is we need to review
why we screen interfaces, there was a reason ;-).
melifaro [Mon, 12 Mar 2012 14:07:57 +0000 (14:07 +0000)]
- Add ipfw eXtended tables permitting radix to be used for any kind of keys.
- Add support for IPv6 and interface extended tables
- Make number of tables to be loader tunable in range 0..65534.
- Use IP_FW3 opcode for all new extended table cmds
No ABI changes are introduced. Old userland will see valid tables for
IPv4 tables and no entries otherwise. Flush works for any table.
IP_FW3 socket option is used to encapsulate all new opcodes:
/* IP_FW3 header/opcodes */
typedef struct _ip_fw3_opheader {
uint16_t opcode; /* Operation opcode */
uint16_t reserved[3]; /* Align to 64-bit boundary */
} ip_fw3_opheader;
New opcodes added:
IP_FW_TABLE_XADD, IP_FW_TABLE_XDEL, IP_FW_TABLE_XGETSIZE, IP_FW_TABLE_XLIST
ipfw(8) table argument parsing behavior is changed:
'ipfw table 999 add host' now assumes 'host' to be interface name instead of
hostname.
New tunable:
net.inet.ip.fw.tables_max controls number of table supported by ipfw in given
VNET instance. 128 is still the default value.
New syntax:
ipfw add skipto tablearg ip from any to any via table(42) in
ipfw add skipto tablearg ip from any to any via table(4242) out
This is a bit hackish, special interface name '\1' is used to signal interface
table number is passed in p.glob field.
kib [Mon, 12 Mar 2012 12:15:47 +0000 (12:15 +0000)]
Rtld on diet part 1:
Provide rtld-private implementations of __stack_chk_guard,
__stack_chk_fail() and __chk_fail() symbols, to be used by functions
linked from libc_pic.a. This avoids use of libc stack_protector.c,
which pulls in syslog(3) and stdio as dependency.
Also, do initialize rtld-private copy __stack_chk_guard, previously
libc-provided one was not initialized, since we do not call rtld
object _init() methods.
kib [Mon, 12 Mar 2012 10:36:03 +0000 (10:36 +0000)]
When iterating over the dso program headers, the object is not initialized
yet, and object segments are not yet mapped. Only parse the notes that
appear in the first page of the dso (as it should be anyway), and use
the preloaded page content.
jmallett [Mon, 12 Mar 2012 07:34:15 +0000 (07:34 +0000)]
Remove platform APIs which are not used by any code and which had only stub
implementations or no implementation on all platforms.
Some of these functions might be good ideas, but their semantics were unclear
given the lack of implementation, and an unlucky porter could be fooled into
trying to implement them or, worse, being baffled when something like
platform_trap_enter() failed to be called.
mav [Mon, 12 Mar 2012 07:02:16 +0000 (07:02 +0000)]
Tune cpuset macros to optimize cases when CPU_SETSIZE fits into single
machine word. For example, it turns CPU_SET() into expected shift and OR,
removing two extra shifts and additional index on memory access.
Generated code checked for kernel (optimized) and user-level (unoptimized)
cases with GCC and CLANG.
yongari [Mon, 12 Mar 2012 03:47:30 +0000 (03:47 +0000)]
Make if_ierrors updated whenever any of the following counters are
updated.
o Number of times NIC ran out of RX buffer descriptors
o Number of inbound packet errors
o Number of inbound packets that were chosen to be discarded
Previously only the discarded packet counter was used to update
if_ierrors. This change fixes wrong if_ierrors counter on
BCM570[0-4] controllers. For BCM5705 and later controllers bge(4)
already correctly counted it.
yongari [Mon, 12 Mar 2012 02:42:47 +0000 (02:42 +0000)]
Show PCI bus speed and width as well as running mode of PCI-X
device in device attach. This would help to narrow down issue to a
specific controller and operating mode of the controller.
While I'm here rename BGE_MISCCFG_BOARD_ID with
BGE_MISCCFG_BOARD_ID_MASK.
yongari [Mon, 12 Mar 2012 02:09:47 +0000 (02:09 +0000)]
Add workaround for PCI-X BCM5704 controller that live behind
AMD-8131 PCI-X bridge. The bridge seems to reorder write access to
mailbox registers such that it caused watchdog timeouts by
out-of-order TX completions.
Tested by: Michael L. Squires <mikes <> siralan dot org >
Reviewed by: jhb
gonzo [Mon, 12 Mar 2012 01:23:09 +0000 (01:23 +0000)]
- Rename apb_intr to apb_filter since it's a filter handler
- Pass interrupt trapframe for handlers dow the chain
- Add PMC interrupt handler
PMC interrupt is a special case, so we want handle it as soon as possible
with minimum overhead. So we handle it apb filter routine.
adrian [Mon, 12 Mar 2012 01:15:58 +0000 (01:15 +0000)]
Begin modifying the PB92 config file to actually generate a flashable,
bootable image.
The kernel has to fit inside an 896KiB area in a 4MB SPI flash.
So a bunch of stuff can't be included (and more is to come), including
(unfortunately) IPv6.
TODO:
* GPIO modules need to be created
* Shrink the image a bit more by removing some of the CAM layer debugging
strings.
emaste [Mon, 12 Mar 2012 01:06:29 +0000 (01:06 +0000)]
Remove extraneous log message
When ntp switched between PLL and FLL mode it produced a log message
"kernel time sync status change %04x". This issue is reported in ntp
bug 452[1] which claims that this behaviour is normal and the log
message isn't necessary. I'm not sure exactly when it was removed, but
it's gone in the latest ntp release (4.2.6p5).
kib [Sun, 11 Mar 2012 20:23:46 +0000 (20:23 +0000)]
Do not fall back to slow synchronous i/o when low on memory or buffers.
The bawrite() schedules the write to happen immediately, and its use
frees the current thread to do more cleanups.
kib [Sun, 11 Mar 2012 20:18:14 +0000 (20:18 +0000)]
In ffs_syncvnode(), pass boolean false as second argument of ffs_update().
Synchronous inode block update is not needed for MNT_LAZY callers (syncer),
and since waitfor values are not zero, code did unneccessary synchronous
update.
kib [Sun, 11 Mar 2012 20:03:09 +0000 (20:03 +0000)]
Add support for preinit, init and fini arrays. Some ABIs, in
particular on ARM, do require working init arrays.
Traditional FreeBSD crt1 calls _init and _fini of the binary, instead
of allowing runtime linker to arrange the calls. This was probably
done to have the same crt code serve both statically and dynamically
linked binaries. Since ABI mandates that first is called preinit
array functions, then init, and then init array functions, the init
have to be called from rtld now.
To provide binary compatibility to old FreeBSD crt1, which calls _init
itself, rtld only calls intializers and finalizers for main binary if
binary has a note indicating that new crt was used for linking. Add
parsing of ELF notes to rtld, and cache p_osrel value since we parsed
it anyway.
The patch is inspired by init_array support for DragonflyBSD, written
by John Marino.
Reviewed by: kan
Tested by: andrew (arm, previous version), flo (sparc64, previous version)
MFC after: 3 weeks
adrian [Sun, 11 Mar 2012 19:08:56 +0000 (19:08 +0000)]
Upgrade the netgraph vlan node to support 802.1q, encapsulation type,
PCP and CFI fields.
* Ethernet_type for VLAN encapsulation is tunable, default is 0x8100;
* PCP (Priority code point) and CFI (canonical format indicator) is
tunable per VID;
* Tunable encapsulation to support 802.1q
* Encapsulation/Decapsulation code improvements
New messages have been added for this netgraph node to support the
new features.
However, the legacy "vlan" id is still supported and compiled in by
default. It can be disabled in a future release.
TODO:
* Documentation
* Examples
PR: kern/161908
Submitted by: Ivan <rozhuk.im@gmail.com>
luigi [Sun, 11 Mar 2012 17:35:12 +0000 (17:35 +0000)]
- remove an extra parenthesis in a closing brace;
- add the macro NETMAP_RING_FIRST_RESERVED() which returns
the index of the first non-released buffer in the ring
(this is useful for code that retains buffers for some time
instead of processing them immediately)
marius [Sun, 11 Mar 2012 13:39:19 +0000 (13:39 +0000)]
Fix a bug introduced in r223938; on big-endian machines coping a 32-bit
quantum bytewise to the address of a 64-bit variable results in writing
to the "wrong" 32-bit half so adjust the address accordingly. This fix
is implemented in a hackish way for two reasons:
o in order to be able to get it into 8.3 with zero impact on the little-
endian architectures where this bug has no effect and
o to avoid blowing the x86 boot2 out of the water again when compiling
it with clang, which all sane versions of this fix tested do.
This change fixes booting from UFS1 file systems on big-endian machines.
jmallett [Sun, 11 Mar 2012 06:17:49 +0000 (06:17 +0000)]
Merge the Cavium Octeon SDK 2.3.0 Simple Executive code and update FreeBSD to
make use of it where possible.
This primarily brings in support for newer hardware, and FreeBSD is not yet
able to support the abundance of IRQs on new hardware and many features in the
Ethernet driver.
Because of the changes to IRQs in the Simple Executive, we have to maintain our
own list of Octeon IRQs now, which probably can be pared-down and be specific
to the CIU interrupt unit soon, and when other interrupt mechanisms are added
they can maintain their own definitions.
Remove unmasking of interrupts from within the UART device now that the
function used is no longer present in the Simple Executive. The unmasking
seems to have been gratuitous as this is more properly handled by the buses
above the UART device, and seems to work on that basis.
jmallett [Sun, 11 Mar 2012 06:11:31 +0000 (06:11 +0000)]
Disable -Winline on MIPS in preparation for the import of the latest version
of the Cavium Simple Executive, which violates large function growth rules
in such a way that simply increasing the large function growth parameter is
insufficient.