Marcel Moolenaar [Thu, 24 May 2012 21:07:10 +0000 (21:07 +0000)]
Just return if the size of the window is 0. This can happen when the
FDT does not define all ranges possible for a particular node (e.g.
PCI).
While here, only update the trgt_mem and trgt_io pointers if there's
no error. This avoids that we knowingly write an invalid target (= -1).
Marcel Moolenaar [Thu, 24 May 2012 20:58:40 +0000 (20:58 +0000)]
o Rename kernload_ap to bp_kernelload. This to introduce a common prefix
for variables that live in the boot page.
o Add bp_trace (yes, it's in the boot page) that gets zeroed before we
try to wake a core and to which the core being woken can write markers
so that we know where the core was in case it doesn't wake up. The
boot code does not yet write markers (too follow).
o Disable the boot page translation to allow the last 4K page to be used
for whatever we please. It would get mapped otherwise.
o Fix kernstart in the case of SMP. The start argument is typically page
aligned due to the alignment requirements that come with having a boot
page. The point of using trunc_page is that we get the actual load
address given that the entry point is immediately following the ELF
headers. In the SMP case this ended up exactly 4K after the load
address. Hence subtracting 1 from start.
Marcel Moolenaar [Thu, 24 May 2012 20:45:44 +0000 (20:45 +0000)]
Fix the memory barriers for CPUs that do not like lwsync and wedge or cause
exceptions early enough during boot that the kernel will do ithe same.
Use lwsync only when compiling for LP64 and revert to the more proven isync
when compiling for ILP32. Note that in the end (i.e. between revision 222198
and this change) ILP32 changed from using sync to using isync. As per Nathan
the isync is needed to make sure I/O accesses are properly serialized with
locks and isync tends to be more effecient than sync.
While here, undefine __ATOMIC_ACQ and __ATOMIC_REL at the end of the file
so as not to leak their definitions.
Marcel Moolenaar [Thu, 24 May 2012 20:24:49 +0000 (20:24 +0000)]
Preset (clear) the ranges we're supposed to fill from the FDT. If a
particular range (either I/O memory or I/O port) is not defined in
the FDT, we're not handing uninitialized structures back to our caller.
Marcel Moolenaar [Thu, 24 May 2012 20:12:46 +0000 (20:12 +0000)]
Allow building for the PowerPC EABI by providing a dummy __eabi()
function. The purpose of the __eabi() function is to set up the
runtime and is called first thing by main(). The runtime is already
set up for us prior to caling main, so there's nothing to do for
us in the EABI case.
Marcel Moolenaar [Thu, 24 May 2012 20:00:58 +0000 (20:00 +0000)]
Fix an inconsistency I just ran into for LDADD and DPADD. The description
for both of them use different, and presumably wrong, variables in the
example. They set LDFILES and SRCLIB respectively. I guess that's what
DPADD and LDADD were called first ...
Marcel Moolenaar [Thu, 24 May 2012 19:48:15 +0000 (19:48 +0000)]
Work better with how make/bmake works:
1. Avoid a cd back into ${.CURDIR} to run mkbuiltins when we know make
will first cd into ${.OBJDIR}. Keep the cwd to what make sets it to.
2. Don't tell mkbuiltins where to write to (= ${.OBJDIR}), but where to
get sources from (= ${.CURDIR}). This to compensate for point 1.
This fixes a problem with bmake's mk files that optimize ${.OBJDIR} to
expand to "." after changing cwd, not taking into account that the
target is pretty much undoing that and not getting the full path to the
object tree anymore.
Bjoern A. Zeeb [Thu, 24 May 2012 18:25:09 +0000 (18:25 +0000)]
MFp4 bz_ipv6_fast:
Introduce a (for now copied stripped down) in6_cksum_pseudo()
function. We should be able to use this from in6_cksum() but
we should also ponder possible MD specific improvements.
It takes an extra csum argument to allow for easy checks as
will be done by the upper layer protocol input paths.
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
Reviewed by: gnn (as part of the whole)
MFC After: 3 days
Gleb Smirnoff [Thu, 24 May 2012 18:22:57 +0000 (18:22 +0000)]
Revert r220768 for ng_ksocket. This node is special and
when it is cloning, its constructor method may be called
in a context that isn't allowed to sleep.
Bjoern A. Zeeb [Thu, 24 May 2012 18:05:10 +0000 (18:05 +0000)]
MFp4 bz_ipv6_fast:
Optimize in6_cksum(), re-ordering work and limiting variable
initialization, removing a bzero() for mostly re-initialized
struct values, making use of the newly introduced in6_getscope(),
as well as converting an if/panic to a KASSERT().
Sponsored by: The FreeBSD Foundation
Sponsored by: iXsystems
Reviewed by: gnn (as part of the whole)
MFC After: 3 days
Alan Cox [Thu, 24 May 2012 15:25:35 +0000 (15:25 +0000)]
MF amd64 r233097, r233122
With the changes over the past year to how accesses to the page's dirty
field are synchronized, there is no need for pmap_protect() to acquire
the page queues lock unless it is going to access the pv lists or
PMAP1/PADDR1.
Alexander Motin [Thu, 24 May 2012 14:07:44 +0000 (14:07 +0000)]
MFprojects/zfsd:
Revamp the CAM enclosure services driver.
This updated driver uses an in-kernel daemon to track state changes and
publishes physical path location information\for disk elements into the
CAM device database.
Alexander Motin [Thu, 24 May 2012 11:07:39 +0000 (11:07 +0000)]
MFprojects/zfsd:
- Add low-level support for SATA Enclosure Management Bridge (SEMB)
devices -- SATA equivalents of the SCSI SES/SAF-TE devices.
- Add some utility functions for SCSI SAF-TE devices access.
Maksim Yevmenkin [Wed, 23 May 2012 18:56:29 +0000 (18:56 +0000)]
Tweak condition for disabling allocation from per-CPU buckets in
low memory situation. I've observed a situation where per-CPU
allocations were disabled while there were enough free cached pages.
Basically, cnt.v_free_count was sitting stable at a value lower
than cnt.v_free_min and that caused massive performance drop.
John Baldwin [Wed, 23 May 2012 13:45:52 +0000 (13:45 +0000)]
Rework the previous change to honor MADT processor IDs when probing
processor objects. Instead of forcing the new-bus CPU objects to use
a unit number equal to pc_cpuid, adjust acpi_pcpu_get_id() to honor the
MADT IDs by default. As with the previous change, setting
debug.acpi.cpu_unordered to 1 in the loader will revert to the old
behavior.
John Baldwin [Wed, 23 May 2012 13:41:12 +0000 (13:41 +0000)]
Only check to see if a memory resource is a PCI ROM BAR when activating
and deactivating PCI resources. Previously, if a device had more than
48 MSI interrupts, then activating message 48 (which has a rid == PCIR_BIOS)
would incorrectly try to enable the PCI ROM BAR.
Tested by: Olivier Cinquin ocinquin uci edu
MFC after: 3 days
Pyun YongHyeon [Wed, 23 May 2012 03:35:08 +0000 (03:35 +0000)]
Don't force max payload size to 128. Root complex and Endpoint will
negotiate with each other on the TLP payload size so blindly
forcing the size to 128 can cause a completion error which in turn
will stop device.
Reported by: Geans Pin < geanspin <> broadcom dot com >
MFC after: 5 days
Pyun YongHyeon [Wed, 23 May 2012 01:20:25 +0000 (01:20 +0000)]
Make IPMI work in the bce driver even when the interface is
configured down. Formerly, IPMI communication was lost whenever the
interface was not up. The reason was that the BCE_EMAC_MODE
register was not configured with the correct media settings. There
are two parts to the fix.
First, resetting the chip in bce_reset() causes the BCE_EMAC_MODE
register to be initialized to a default value that does not
necessarily correspond to the actual media settings. The fix
implemented here is a bit of a hack. Ideally, at the end of
bce_reset() we would poll the PHY to determine the negotiated media,
and then we would set the BCE_EMAC_MODE register accordingly. That
is difficult, since the PHY is abstracted behind the MII layer and is
not supposed to be queried directly from the MAC driver. Instead,
we read the BCE_EMAC_MODE register at the beginning of bce_reset()
and then restore its media bits to their original values before
returning. If IPMI is up and running, then the link is already
established and the BCE_EMAC_MODE register is already set appropriately
when bce_reset() is called. If IPMI is not running, no harm is
done by preserving the BCE_EMAC_MODE settings. The driver will set
the register properly once the interface is configured up and link
is established.
Second, bce_miibus_statchg() is sometimes called when the link is
down. In that case, the reported media settings are invalid.
Formerly, the driver used them anyway to setup the BCE_EMAC_MODE
register. We now avoid changing any MAC registers unless link is
active and the reported media settings are valid.
Submitted by: jdp
Tested by: jdp
MFC after: 5 days
Adrian Chadd [Tue, 22 May 2012 19:50:21 +0000 (19:50 +0000)]
Re-up the TX ath_buf limit from 128 to 512.
I'll have to leave this high for now, until I've done some significant
surgery with how ath_bufs (and descriptors) are handled.
This should significantly cut down on the opportunities for a full TX
queue hanging traffic. I'll continue making things work though; I'm
mostly doing this for users. :)
Adrian Chadd [Tue, 22 May 2012 19:37:12 +0000 (19:37 +0000)]
Fix some corner cases in the ieee80211_send_bar() handling.
* If the first call succeeded but failed to transmit, a timer would
reschedule it via bar_timeout(). Unfortunately bar_timeout() didn't
check the return value from the ieee80211_send_bar() reattempt and
if that failed (eg the driver ic_raw_xmit() failed), it would never
re-arm the timer.
* If BARPEND is cleared (which ieee80211_send_bar() will do if it can't
TX), then re-arming the timer isn't enough - once bar_timeout() occurs,
it'll see BARPEND is 0 and not run through the rest of the routine.
So when rearming the timer, also set that flag.
* If the TX wasn't occuring, bar_tx_complete() wouldn't be called and the
driver callback wouldn't be called either. So the driver had no idea
that the BAR TX attempt had failed. In the ath(4) case, TX would stay
paused.
(There's no callback to indicate that BAR TX had failed or not;
only a "BAR TX was attempted". That's a separate, later problem.)
So call the driver callback (ic_bar_response()) before the ADDBA session
is torn down, so it has a chance of being notified that things didn't
quite go to plan.
I've verified that yes, this does suspend traffic for ath(4), retry BAR
TX even if the driver is failing ic_raw_xmit(), and then eventually giving
up and sending a DELBA. I'll address the "out of ath_buf" issue in ath(4)
in a subsequent commit - this commit just fixes the edge case where any
driver is (way) out of internal buffers/descriptors and fails frame TX.
Fix world after byacc import:
- old yacc(1) use to magicially append stdlib.h, while new one don't
- new yacc(1) do declare yyparse by itself, fix redundant declaration of
'yyparse'
Fix panic with RACCT that could occur in low memory (or out of swap)
situations, due to fork1() calling racct_proc_exit() without calling
racct_proc_fork() first.
Submitted by: Mateusz Guzik <mjguzik at gmail dot com> (earlier version)
Reviewed by: Mateusz Guzik <mjguzik at gmail dot com>
Add the code for new Intel GPU driver, which supports GEM, KMS and
works with new generations of GPUs (IronLake, SandyBridge and
supposedly IvyBridge).
The driver is not connected to the build yet.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
A rewrite of the i810 bits of the agp(4) driver. New driver supports
operations required by GEMified i915.ko. It also attaches to SandyBridge
and IvyBridge CPU northbridges now.
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Fix enforcement of file size limit with O_APPEND on ZFS.
vn_rlimit_fsize takes uio->uio_offset and uio->uio_resid into account
when determining whether given write would exceed RLIMIT_FSIZE.
When APPEND flag is specified, ZFS updates uio->uio_offset to point to the
end of file.
But this happens after a call to vn_rlimit_fsize, so vn_rlimit_fsize check
can be rendered ineffective by thread that opens some file with O_APPEND
and lseeks below RLIMIT_FSIZE before calling write.
Hartmut Brandt [Tue, 22 May 2012 09:59:49 +0000 (09:59 +0000)]
Fix a compilation error with some compilers: __attribute__
requires two parenthesis for its argument, but instead of using
__attribute__ directly, use the appropriate __nonnull macro
from cdefs.h.
Hartmut Brandt [Tue, 22 May 2012 07:23:41 +0000 (07:23 +0000)]
Make dumptid non-static. It is used by libkvm to detect whether
this is a VNET-kernel or not. gcc used to put the static symbol into
the symbol table, clang does not. This fixes the 'netstat: no namelist'
error seen on clang+VNET systems.
Andrew Turner [Tue, 22 May 2012 07:04:23 +0000 (07:04 +0000)]
Fix booting on ARM.
In PHYS_TO_VM_PAGE() when VM_PHYSSEG_DENSE is set the check if we are past
the end of vm_page_array was incorrect causing it to return NULL. This
value is then used in vm_phys_add_page causing a data abort.
Adrian Chadd [Tue, 22 May 2012 06:31:03 +0000 (06:31 +0000)]
Fix up some corner cases with aggregation handling.
I've come across a weird scenario in net80211 where two TX streams will
happily attempt to setup an aggregation session together.
If we're very lucky, it happens concurrently on separate CPUs and the
total lack of locking in the net80211 aggregation code causes this stuff
to race. Badly.
So >1 call would occur to the ath(4) addba start, but only one call would
complete to addba complete or timeout. The TID would thus stay paused.
The real fix is to implement some proper per-node (or maybe per-TID)
locking in net80211, which then could be leveraged by the ath(4) TX
aggregation code.
Whilst I'm at it, shuffle around the debugging messages a bit.
I like to keep people on their toes.
Jim Harris [Mon, 21 May 2012 22:54:33 +0000 (22:54 +0000)]
Wait until completion context unwinds before retrying CCBs that have been
queued internally. This works around issue in the isci HAL where it cannot
accept new I/O to a device after a resetting->ready state transition until
the completion context has unwound.
This issue was found by submitting non-tagged CCBs through pass(4) interface
to a SATA disk with an extremely small timeout value (5ms). This would trigger
internal resets with I/O in the isci(4) internal queues.
The small timeout value had not been intentional (and original reporter has
since changed his test to use 5sec instead), but it did uncover this corner
case that would result in a hung disk.
Sponsored by: Intel
Reported and tested by: Ravi Pokala <rpokala at panasas dot com>
Reviewed by: scottl (earlier version)
MFC after: 1 week
Call bpf_jitter() before acquiring BPF global lock due to malloc() being used inside bpf_jitter.
Eliminate bpf_buffer_alloc() and allocate BPF buffers on descriptor creation and BIOCSBLEN ioctl.
This permits us not to allocate buffers inside bpf_attachd() which is protected by global lock.
Fix old panic when BPF consumer attaches to destroying interface.
'flags' field is added to the end of bpf_if structure. Currently the only
flag is BPFIF_FLAG_DYING which is set on bpf detach and checked by bpf_attachd()
Problem can be easily triggered on SMP stable/[89] by the following command (sort of):
'while true; do ifconfig vlan222 create vlan 222 vlandev em0 up ; tcpdump -pi vlan222 & ; ifconfig vlan222 destroy ; done'
Fix possible use-after-free when BPF detaches itself from interface, freeing bpf_bif memory,
while interface is still UP and there can be routes via this interface.
Freeing is now delayed till ifnet_departure_event is received via eventhandler(9) api.
Convert bpfd rwlock back to mutex due lack of performance gain (currently checking if packet
matches filter is done without holding bpfd lock and we have to acquire write lock if packet matches)
Fix panic on attaching to non-existent interface (introduced by r233937, pointed by hrs@)
Fix panic on tcpdump being attached to interface being removed (introduced by r233937, pointed by hrs@ and adrian@)
Protect most of bpf_setf() by BPF global lock
Add several forgotten assertions (thanks to adrian@)
Document current locking model inside bpf.c
Document EVENTHANDLER(9) usage inside BPF.
Guy Helmer [Mon, 21 May 2012 21:10:00 +0000 (21:10 +0000)]
Add checks for memory allocation failures in appropriate places, and
avoid creating bad entries in the grp list as a result of memory allocation
failures while building new entries.
PR: bin/83340
Reviewed by: delphij (prior version of patch)