Doug Barton [Thu, 18 Mar 2010 19:00:35 +0000 (19:00 +0000)]
Update to 9.6.2-P1, the latest patchfix release which deals with
the problems related to the handling of broken DNSSEC trust chains.
This fix is only relevant for those who have DNSSEC validation
enabled and configure trust anchors from third parties, either
manually, or through a system like DLV.
Doug Barton [Thu, 18 Mar 2010 18:58:17 +0000 (18:58 +0000)]
For those of us mere mortals who do not aspire to the lofty heights
of kernel hackery, add MAKE_JUST_WORLDS so that we can take part in
the 'make universe' goodnes without using unecessary time and resources.
Rui Paulo [Thu, 18 Mar 2010 11:06:38 +0000 (11:06 +0000)]
Fix a couple of bugs with 802.11n:
o Process the BAR frame on the adhoc, mesh and sta modes
o Fix the format of the ADDBA reply frame
o Fix references to the spec section numbers
Also, print the all the MCS rates in bootverbose.
Sponsored by: iXsystems, Inc.
Obtained from: //depot/user/rpaulo/80211n/...
Bjoern A. Zeeb [Thu, 18 Mar 2010 09:09:59 +0000 (09:09 +0000)]
Add ddb support to the "new" link layer code ("new-arp"):
- show all lltables [1] (optional flag to also show the llentries as well)
- show lltable <struct lltable *>
- show llentry <struct llentry *>
Marius Strobl [Wed, 17 Mar 2010 22:45:09 +0000 (22:45 +0000)]
o Add support for UltraSparc-IV+:
- Swap the configuration of the first and second large dTLB as with
US-IV+ these can only hold entries of certain page sizes each, which
we happened to chose the non-working way around.
- Additionally ensure that the large iTLB is set up to hold 8k pages
(currently this happens to be a NOP though).
- Add a workaround for US-IV+ erratum #2.
- Turn off dTLB parity error reporting as otherwise we get seemingly
false positives when copying in the user window by simulating a
fill trap on return to usermode. Given that these parity errors can
be avoided by disabling multi issue mode and the problem could be
reproduced with a second machine this appears to be a silicon bug of
some sort.
- Add a membar #Sync also before the stores to ASI_DCACHE_TAG. While
at it, turn of interrupts across the whole cheetah_cache_flush() for
simplicity instead of around every flush. This should have next to no
impact as for cheetah-class machines we typically only need to flush
the caches a few times during boot when recovering from peeking/poking
non-existent PCI devices, if at all.
- Just use KERNBASE for FLUSH as we also do elsewhere as the US-IV+
documentation doesn't seem to mention that these CPUs also ignore the
address like previous cheetah-class CPUs do. Again the code changing
LSU_IC is executed seldom enough that the negligible optimization of
using %g0 instead should have no real impact.
With these changes FreeBSD runs stable on V890 equipped with US-IV+
and -j128 buildworlds in a loop for days are no problem. Unfortunately,
the performance isn't were it should be as a buildworld on a 4x1.5GHz
US-IV+ V890 takes nearly 3h while on a V440 with (theoretically) less
powerfull 4x1.5GHz US-IIIi it takes just over 1h. It's unclear whether
this is related to the supposed silicon bug mentioned above or due to
another issue. The documentation (which contains a sever bug in the
description of the bits added to the context registers though) at least
doesn't mention any requirements for changes in the CPU handling besides
those implemented and the cache as well as the TLB configurations and
handling look fine.
o Re-arrange cheetah_init() so it's easier to add support for SPARC64
V up to VIIIfx CPUs, which only require parts of this initialization.
Kip Macy [Wed, 17 Mar 2010 21:10:09 +0000 (21:10 +0000)]
- cache line align arcs_lock array (h/t Marius Nuennerich)
- fix ARCS_LOCK_PAD to use architecture defined CACHE_LINE_SIZE
- cache line align buf_hash_table ht_locks array
Marius Strobl [Wed, 17 Mar 2010 20:23:14 +0000 (20:23 +0000)]
- Add TTE and context register bits for the additional page sizes supported
by UltraSparc-IV and -IV+ as well as SPARC64 V, VI, VII and VIIIfx CPUs.
- Replace TLB_PCXR_PGSZ_MASK and TLB_SCXR_PGSZ_MASK with TLB_CXR_PGSZ_MASK
which just is the complement of TLB_CXR_CTX_MASK instead of trying to
assemble it from the page size bits which vary across CPUs.
- Add macros for the remainder of the SFSR bits, which are useful for at
least debugging purposes.
Andrew Gallatin [Wed, 17 Mar 2010 20:13:09 +0000 (20:13 +0000)]
Fix 2 bugs in mxge_attach()
- Don't leak slice resources when mxge_alloc_rings() fails
- Start taskq threads only after we know attach will succeed. At
boot time, taskqueue_terminate() will loop infinately, waiting
for the threads to exit, and hang the system.
Marius Strobl [Wed, 17 Mar 2010 20:01:01 +0000 (20:01 +0000)]
- Add quirk handling for Sun Fire V1280. The firmware of these machines
provides no ino-bitmap properties so forge them using the default set
of controller interrupts and let schizo_setup_intr() take care of the
children, hoping for non-fancy routing.
- Add quirk handling for Sun Fire V890. When booting these machines from
disk a Schizo comes up with PCI error residing which triggers as soon
as we register schizo_pci_bus() even when clearing it from all involved
registers (it's no longer indicated once we're in schizo_pci_bus()
though). Thus make PCI bus errors non-fatal until we actually touch the
bus. With this change schizo_pci_bus() typically triggers once during
attach in this case. Obviously this approach isn't exactly race free
but it's about the best we can do about this problem as we're not
guaranteed that the interrupt will actually trigger on V890 either, as
it certainly doesn't when for example netbooting them.
Matt Jacob [Wed, 17 Mar 2010 02:48:14 +0000 (02:48 +0000)]
Put gone device timer into a structure tag that can hold more than 32 seconds. Oops.
Untangle some of the confusion about what role means when it's in the FCPARAM/SDPARAM
or isp_fc/isp_spi structures. This fixed a problem about seeing targets appear if you've
turned off autologin and find them, or rather don't, via camcontrol rescan.
Marcel Moolenaar [Wed, 17 Mar 2010 00:37:15 +0000 (00:37 +0000)]
Revamp the interrupt code based on the previous commit:
o Introduce XIV, eXternal Interrupt Vector, to differentiate from
the interrupts vectors that are offsets in the IVT (Interrupt
Vector Table). There's a vector for external interrupts, which
are based on the XIVs.
o Keep track of allocated and reserved XIVs so that we can assign
XIVs without hardcoding anything. When XIVs are allocated, an
interrupt handler and a class is specified for the XIV. Classes
are:
1. architecture-defined: XIV 15 is returned when no external
interrupt are pending,
2. platform-defined: SAL reports which XIV is used to wakeup
an AP (typically 0xFF, but it's 0x12 for the Altix 350).
3. inter-processor interrupts: allocated for SMP support and
non-redirectable.
4. device interrupts (i.e. IRQs): allocated when devices are
discovered and are redirectable.
o Rewrite the central interrupt handler to call the per-XIV
interrupt handler and rename it to ia64_handle_intr(). Move
the per-XIV handler implementation to the file where we have
the XIV allocation/reservation. Clock interrupt handling is
moved to clock.c. IPI handling is moved to mp_machdep.c.
o Drop support for the Intel 8259A because it was broken. When
XIV 0 is received, the CPU should initiate an INTA cycle to
obtain the interrupt vector of the 8259-based interrupt. In
these cases the interrupt controller we should be talking to
WRT to masking on signalling EOI is the 8259 and not the I/O
SAPIC. This requires adriver for the Intel 8259A which isn't
available for ia64. Thus stop pretending to support ExtINTs
and instead panic() so that if we come across hardware that
has an Intel 8259A, so have something real to work with.
o With XIVs for IPIs dynamically allocatedi and also based on
priority, define the IPI_* symbols as variables rather than
constants. The variable holds the XIV allocated for the IPI.
o IPI_STOP_HARD delivers a NMI if possible. Otherwise the XIV
assigned to IPI_STOP is delivered.
Better way to find out available file system types is to use lsvfs(1).
Using 'sysctl vfs' is not only ugly, but is also not reliable - not all
file system types create entries in vfs sysctl tree.
Kip Macy [Tue, 16 Mar 2010 22:17:21 +0000 (22:17 +0000)]
- reduce contention by breaking up ARC state locks in to 16 for data
and 16 for metadata
- export L2ARC tunables as sysctls
- add several kstats to track L2ARC state more precisely
- avoid holding a contended lock when atomically incrementing a
contended counter (no lock protection needed for atomics)
Jung-uk Kim [Tue, 16 Mar 2010 19:59:14 +0000 (19:59 +0000)]
Fix a long standing regression of readdir(3) in fdescfs(5) introduced
in r1.48. We were stopping at the first null pointer when multiple file
descriptors were opened and one in the middle was closed. This restores
traditional behaviour of fdescfs.
Qing Li [Tue, 16 Mar 2010 17:59:12 +0000 (17:59 +0000)]
Verify interface up status using its link state only
if the interface has such capability. The interface
capability flag indicates whether such capability
exists. This approach is much more backward compatible.
Physical device driver changes will be part of another
commit.
Also updated the ifconfig utility to show the LINKSTATE
capability if present.
John Baldwin [Tue, 16 Mar 2010 16:01:19 +0000 (16:01 +0000)]
- Extend the machine check record structure to include several fields useful
for parsing model-specific and other fields in machine check events
including the global machine check capabilities and status registers,
CPU identification, and the FreeBSD CPU ID.
- Report these added fields in the console log of a machine check so that
a record structure can be reconstituted from the console messages.
- Parse new architectural errors including memory controller errors.
Luigi Rizzo [Mon, 15 Mar 2010 17:14:27 +0000 (17:14 +0000)]
+ implement (two lines) the kernel side of 'lookup dscp N' to use the
dscp as a search key in table lookups;
+ (re)implement a sysctl variable to control the expire frequency of
pipes and queues when they become empty;
+ add 'queue number' as optional part of the flow_id. This can be
enabled with the command
queue X config mask queue ...
and makes it possible to support priority-based schedulers, where
packets should be grouped according to the priority and not some
fields in the 5-tuple.
This is implemented as follows:
- redefine a field in the ipfw_flow_id (in sys/netinet/ip_fw.h) but
without changing the size or shape of the structure, so there are
no ABI changes. On passing, also document how other fields are
used, and remove some useless assignments in ip_fw2.c
- implement small changes in the userland code to set/read the field;
- revise the functions in ip_dummynet.c to manipulate masks so they
also handle the additional field;
Marcel Moolenaar [Mon, 15 Mar 2010 16:53:09 +0000 (16:53 +0000)]
Have cpu_throw() loop on blocked_lock as well. This bug has existed
a long time and has gone unnoticed just as long, because I kept
using sched_4bsd (due to sched_ule not working with preemption),
but GENERIC had sched_ule by default -- including SMP.
While here, remove unused inclusion of <machine/clock.h>, remove
totally bogus inclusion of <i386/include/specialreg.h>.
Luigi Rizzo [Mon, 15 Mar 2010 15:43:35 +0000 (15:43 +0000)]
Implement "lookup dscp N" which does a lookup of the DSCP (top 6 bits
of ip->ip_tos) in a table. This can be useful to direct traffic to
different pipes/queues according to the DSCP of the packet, as follows:
Nathan Whitehorn [Mon, 15 Mar 2010 00:27:40 +0000 (00:27 +0000)]
Fix two small bugs. The PowerPC 970 does not support non-coherent memory
access, and reflects this by autonomously writing LPTE_M into PTE entries.
As such, we should not panic if LPTE_M changes by itself. While here,
fix a harmless typo in moea64_sync_icache().
Pyun YongHyeon [Sun, 14 Mar 2010 23:23:57 +0000 (23:23 +0000)]
It seems PCI_OUR_REG_[1-5] registers are not mapped on PCI
configuration space on Yukon Ultra(88E8056) such that accesses to
these registers were NOPs which in turn make msk(4) instable on
this controller. Use indirect access method to access
PCI_OUR_REG_[1-5] registers. This should fix a long standing
instability bug which prevented msk(4) working on Yukon Ultra.
Special thanks to koitsu who gave me remote access to his system.
Attilio Rao [Sun, 14 Mar 2010 22:38:18 +0000 (22:38 +0000)]
Checkin a facility for specifying a passthrough FIB from userland.
arcconf tool by Adaptec already seems to use for identifying the
Serial Number of the devices.
Some simple things (like FIB setup and bound checks) are retrieved
from the Adaptec's driver, but this implementation is quite different
because it does use the normal buffer dmat area for loading segments
and not a special one (like the Adaptec's one does).
Robert Watson [Sun, 14 Mar 2010 18:59:11 +0000 (18:59 +0000)]
Abstract out initialization of most aspects of struct inpcbinfo from
their calling contexts in {IP divert, raw IP sockets, TCP, UDP} and
create new helper functions: in_pcbinfo_init() and in_pcbinfo_destroy()
to do this work in a central spot. As inpcbinfo becomes more complex
due to ongoing work to add connection groups, this will reduce code
duplication.
Ed Schouten [Sun, 14 Mar 2010 10:18:58 +0000 (10:18 +0000)]
Trim down libcompat by removing <regexp.h>.
Erwin ran an exp-run with libcompat and <regexp.h> removed. It turns out
the regexp library is almost entirely unused. In fact, it looks like it
is sometimes used by accident. Because these function names clash with
libc's <regex.h>, some application use both <regex.h> and libcompat,
which means they link against the wrong regex library.
This commit removes the regexp library and reimplements re_comp() and
re_exec() using <regex.h>. It seems the grammar of the regular
expressions accepted by these functions is similar to POSIX EREs.
After this commit, 1 low-profile port will be broken, but the maintainer
already has a patch for it sitting in his mailbox.
Doug Barton [Sun, 14 Mar 2010 05:22:46 +0000 (05:22 +0000)]
Make it more clear in the docs that -a is not compatible with -iFU,
and enforce this in the code. Apparently a lot of users mistakenly
combine -a with these flags and are then mystified that no changes
were made.
While I'm here, fix a trailing space in mergemaster.8
Weongyo Jeong [Sun, 14 Mar 2010 01:57:32 +0000 (01:57 +0000)]
fixes a broken software beacon miss handler. There is a race to check
vap->iv_bmiss_count == 0 in ieee80211_swbmiss because iv_swbmiss_task is
enqueued by taskqueue.
Jilles Tjoelker [Sat, 13 Mar 2010 22:53:17 +0000 (22:53 +0000)]
sh: Do not abort on a redirection error if there is no command word.
Although simple commands without a command word (only assignments and/or
redirections) are much like special builtins, POSIX and most shells seem to
agree that redirection errors should not abort the shell in this case. Of
course, the assignments persist and assignment errors are fatal.
To get the old behaviour portably, use the ':' special builtin.
To get the new behaviour portably, given that there are no assignments, use
the 'true' regular builtin.
Jilles Tjoelker [Sat, 13 Mar 2010 22:30:52 +0000 (22:30 +0000)]
sh: Add test for assignment errors (e.g. trying to change a readonly var).
We currently ignore readonly status for assignments before regular builtins
and external programs (these assignments are not persistent anyway), so just
check that the readonly variable really is not changed.
The test depends on the command builtin changes for 'command :'.
Jilles Tjoelker [Sat, 13 Mar 2010 20:43:11 +0000 (20:43 +0000)]
sh: Fix longjmp clobber warnings in parser.c.
Make parsebackq a function instead of an emulated nested function.
This puts the setjmp usage in a smaller function where it is easier to avoid
bad optimizations.
This also "reverts" some FreeBSD local changes so we should now
be back to using entirely stock OpenSSL. The local changes were
simple $FreeBSD$ lines additions, which were required in the CVS
days, and the patch for FreeBSD-SA-09:15.ssl which has been
superseded with OpenSSL 0.9.8m's RFC5746 'TLS renegotiation
extension' support.
Jaakko Heinonen [Sat, 13 Mar 2010 12:02:44 +0000 (12:02 +0000)]
Use an unique directory name instead of hardcoded /tmp/.diskless.
A malicious user could create a file named /tmp/.diskless and cause
the script to misbehave.
PR: conf/141258
Reported by: Jon Passki
MFC after: 1 week
Rebecca Cran [Sat, 13 Mar 2010 11:17:39 +0000 (11:17 +0000)]
Change the 'amt' parameter in format_k2 from int to unsigned long long
to match the values passed in and prevent the SIZE field being corrupted
when more than 2TB is allocated.
Ed Schouten [Sat, 13 Mar 2010 09:21:00 +0000 (09:21 +0000)]
Remove COMPAT_43TTY from stock kernel configuration files.
COMPAT_43TTY enables the sgtty interface. Even though its exposure has
only been removed in FreeBSD 8.0, it wasn't used by anything in the base
system in FreeBSD 5.x (possibly even 4.x?). On those releases, if your
ports/packages are less than two years old, they will prefer termios
over sgtty.
Juli Mallett [Sat, 13 Mar 2010 04:55:47 +0000 (04:55 +0000)]
o) Use octeon_fpa_alloc_phys in a situation in which we don't need a usable
pointer, rather than octeon_fpa_alloc.
o) Report half duplex status properly.
o) Do not unconditionally update the last known link status in the softc. If
report_link isn't set, when octeon_rgmx_config_speed is called the first
time it will tell the driver (essentially) that we have already marked the
interface up. Likewise, don't change media speed and duplex if only the
link status is at issue. [1]
o) Remove manual changing of link state and let octeon_rgmx_config_speed do the
heavy lifting. [1]
Randall Stewart [Fri, 12 Mar 2010 22:58:52 +0000 (22:58 +0000)]
The proper fix for the delayed SCTP checksum is to
have the delayed function take an argument as to the offset
to the SCTP header. This allows it to work for V4 and V6.
This of course means changing all callers of the function
to either pass the header len, if they have it, or create
it (ip_hl << 2 or sizeof(ip6_hdr)).
PR: 144529
MFC after: 2 weeks
Xin LI [Fri, 12 Mar 2010 21:14:56 +0000 (21:14 +0000)]
Two optimizations to MI strlen(3) inspired by David S. Miller's
blog posting [1].
- Use word-sized test for unaligned pointer before working
the hard way.
Memory page boundary is always integral multiple of a word
alignment boundary. Therefore, if we can access memory
referenced by pointer p, then (p & ~word mask) must be also
accessible.
- Better utilization of multi-issue processor's ability of
concurrency.
The previous implementation utilized a formular that must be
executed sequentially. However, the ~, & and - operations can
actually be caculated at the same time when the operand were
different and unrelated.
The original Hacker's Delight formular also offered consistent
performance regardless whether the input would contain
characters with their highest-bit set, as it catches real
nul characters only.
These two optimizations has shown further improvements over the
previous implementation on microbenchmarks on i386 and amd64 CPU
including Pentium 4, Core Duo 2 and i7.
Jung-uk Kim [Fri, 12 Mar 2010 19:14:58 +0000 (19:14 +0000)]
Tidy up callout for select(2) and read timeout.
- Add a missing callout_drain(9) before the descriptor deallocation.[1]
- Prefer callout_init_mtx(9) over callout_init(9) and let the callout
subsystem handle the mutex for callout function.
PR: kern/144453
Submitted by: Alexander Sack (asack at niksun dot com)[1]
MFC after: 1 week
Pyun YongHyeon [Fri, 12 Mar 2010 18:41:41 +0000 (18:41 +0000)]
Implement Rx checksum offloading for Yukon EC, Yukon Ultra,
Yukon FE and Yukon Ultra2. These controllers provide very simple
checksum computation mechanism and it requires additional pseudo
header checksum computation in upper stack. Even though I couldn't
see much performance difference with/without Rx checksum offloading
it may help notebook based controllers.
Actually controller can compute two checksum value by giving
different starting position of checksum computation on received
frame. However, for long time, Marvell's checksum offloading engine
have been known to have several silicon bugs so don't blindly trust
computed partial checksum value. Instead, compute partial checksum
twice by giving the same checksum computation position and compare
the result. If the value is different it's clear indication of
hardware bug. This configuration lose IP checksum offloading
capability but I think it's better to take safe route.
Note, Rx checksum offloading for Yukon XL was still disabled due to
known silicon bug.