emax [Wed, 30 May 2012 00:38:24 +0000 (00:38 +0000)]
MFC r235854
Tweak condition for disabling allocation from per-CPU buckets in
low memory situation. I've observed a situation where per-CPU
allocations were disabled while there were enough free cached pages.
Basically, cnt.v_free_count was sitting stable at a value lower
than cnt.v_free_min and that caused massive performance drop.
jimharris [Tue, 29 May 2012 23:10:10 +0000 (23:10 +0000)]
MFC r234106, r235751:
Queue CCBs internally instead of using CAM_REQUEUE_REQ status. This fixes
problem where userspace apps such as smartctl fail due to CAM_REQUEUE_REQ
status getting returned when tagged commands are outstanding when smartctl
sends its I/O using the pass(4) interface.
trasz [Tue, 29 May 2012 15:08:35 +0000 (15:08 +0000)]
MFC r235787:
Fix panic with RACCT that could occur in low memory (or out of swap)
situations, due to fork1() calling racct_proc_exit() without calling
racct_proc_fork() first.
fabient [Tue, 29 May 2012 14:50:21 +0000 (14:50 +0000)]
MFC r233628, r234598, r235229, r235831, r226986.
Add software PMC support.
New kernel events can be added at various location for sampling or counting.
This will for example allow easy system profiling whatever the processor is
with known tools like pmcstat(8).
Simultaneous usage of software PMC and hardware PMC is possible, for example
looking at the lock acquire failure, page fault while sampling on
instructions.
yongari [Tue, 29 May 2012 04:44:51 +0000 (04:44 +0000)]
MFC r235821:
Don't force max payload size to 128. Root complex and Endpoint will
negotiate with each other on the TLP payload size so blindly
forcing the size to 128 can cause a completion error which in turn
will stop device.
yongari [Tue, 29 May 2012 04:30:02 +0000 (04:30 +0000)]
MFC r235816:
Make IPMI work in the bce driver even when the interface is
configured down. Formerly, IPMI communication was lost whenever the
interface was not up. The reason was that the BCE_EMAC_MODE
register was not configured with the correct media settings. There
are two parts to the fix.
First, resetting the chip in bce_reset() causes the BCE_EMAC_MODE
register to be initialized to a default value that does not
necessarily correspond to the actual media settings. The fix
implemented here is a bit of a hack. Ideally, at the end of
bce_reset() we would poll the PHY to determine the negotiated media,
and then we would set the BCE_EMAC_MODE register accordingly. That
is difficult, since the PHY is abstracted behind the MII layer and is
not supposed to be queried directly from the MAC driver. Instead,
we read the BCE_EMAC_MODE register at the beginning of bce_reset()
and then restore its media bits to their original values before
returning. If IPMI is up and running, then the link is already
established and the BCE_EMAC_MODE register is already set appropriately
when bce_reset() is called. If IPMI is not running, no harm is
done by preserving the BCE_EMAC_MODE settings. The driver will set
the register properly once the interface is configured up and link
is established.
Second, bce_miibus_statchg() is sometimes called when the link is
down. In that case, the reported media settings are invalid.
Formerly, the driver used them anyway to setup the BCE_EMAC_MODE
register. We now avoid changing any MAC registers unless link is
active and the reported media settings are valid.
alc [Mon, 28 May 2012 23:31:48 +0000 (23:31 +0000)]
MFC r230120
Neither tmpfs_nocacheread() nor tmpfs_mappedwrite() needs to call
vm_object_pip_{add,subtract}() on the swap object because the swap
object can't be destroyed while the vnode is exclusively locked.
Moreover, even if the swap object could have been destroyed during
tmpfs_nocacheread() and tmpfs_mappedwrite() this code is broken
because vm_object_pip_subtract() does not wake up the sleeping thread
that is trying to destroy the swap object.
Free invalid pages after an I/O error. There is no virtue in keeping
them around in the swap object creating more work for the page daemon.
(I believe that any non-busy page in the swap object will now always
be valid.)
vm_pager_get_pages() does not return a standard errno, so its return
value should not be returned by tmpfs without translation to an errno
value.
There is no reason for the wakeup on vpg in tmpfs_mappedwrite() to
occur with the swap object locked.
Eliminate printf()s from tmpfs_nocacheread() and tmpfs_mappedwrite().
(The swap pager already spams your console if data corruption is
imminent.)
alc [Mon, 28 May 2012 22:58:13 +0000 (22:58 +0000)]
MFC r229821
Correct an error of omission in the implementation of the truncation
operation on POSIX shared memory objects and tmpfs. Previously, neither of
these modules correctly handled the case in which the new size of the object
or file was not a multiple of the page size. Specifically, they did not
handle partial page truncation of data stored on swap. As a result, stale
data might later be returned to an application.
Interestingly, a data inconsistency was less likely to occur under tmpfs
than POSIX shared memory objects. The reason being that a different mistake
by the tmpfs truncation operation helped avoid a data inconsistency. If the
data was still resident in memory in a PG_CACHED page, then the tmpfs
truncation operation would reactivate that page, zero the truncated portion,
and leave the page pinned in memory. More precisely, the benevolent error
was that the truncation operation didn't add the reactivated page to any of
the paging queues, effectively pinning the page. This page would remain
pinned until the file was destroyed or the page was read or written. With
this change, the page is now added to the inactive queue.
MFC r230180
When tmpfs_write() resets an extended file to its original size after an
error, we want tmpfs_reg_resize() to ignore I/O errors and unconditionally
update the file's size.
rstone [Sun, 27 May 2012 18:55:23 +0000 (18:55 +0000)]
MFC r234691
Implement the D "cpu" variable, which returns curcpu. I have chosen not
to follow the example of OpenSolaris and its descendants, which implemented
cpu as an inline that took a value out of curthread. At certain points in
the FreeBSD scheduler curthread->td_oncpu will no longer be valid (in
particular, just before the thread gets descheduled) so instead I have
implemented this as its own built-in variable.
rstone [Sun, 27 May 2012 14:48:14 +0000 (14:48 +0000)]
MFC r233552
Instead of only iterating over the set of known SDT probes when sdt.ko is
loaded and unloaded, also have sdt.ko register callbacks with kern_sdt.c
that will be called when a newly loaded KLD module adds more probes or
a module with probes is unloaded.
This fixes two issues: first, if a module with SDT probes was loaded after
sdt.ko was loaded, those new probes would not be available in DTrace.
Second, if a module with SDT probes was unloaded while sdt.ko was loaded,
the kernel would panic the next time DTrace had cause to try and do
anything with the no-longer-existent probes.
This makes it possible to create SDT probes in KLD modules, although there
are still two caveats: first, any SDT probes in a KLD module must be part
of a DTrace provider that is defined in that module. At present DTrace
only destroys probes when the provider is destroyed, so you can still
panic the system if a KLD module creates new probes in a provider from a
different module(including the kernel) and then unload the the first module.
Second, the system will panic if you unload a module containing SDT probes
while there is an active D script that has enabled those probes.
rmacklem [Sun, 27 May 2012 01:24:08 +0000 (01:24 +0000)]
MFC: r234740
Fix a leak of namei lookup path buffers that occurs when a
ZFS volume is exported via the new NFS server. The leak occurred
because the new NFS server code didn't handle the case where
a file system sets the SAVENAME flag in its VOP_LOOKUP() and
ZFS does this for the DELETE case.
des [Sat, 26 May 2012 23:42:52 +0000 (23:42 +0000)]
Make yacc a bootstrap tool if building on head, since the new yacc is
not 100% backward compatible. Also, build yacc before lex, because lex
requires yacc but not vice versa.
mdf [Sat, 26 May 2012 20:13:24 +0000 (20:13 +0000)]
MFC r235297, r235316:
Add a -v and -N option to kenv(1), so it can be more easily used in
scripts the way sysctl(8) is. The -N option, like in sysctl(8),
displays only the kenv names, not their values. The -v option prints an
individual kenv variable name with its value as name="value". This is
the inverse of sysctl(8)'s -n flag, since the default behaviour of
kenv(1) is already like sysctl(8) -n.
rmacklem [Sat, 26 May 2012 13:12:14 +0000 (13:12 +0000)]
MFC: r235332
PR# 165923 reported intermittent write failures for dirty
memory mapped pages being written back on an NFS mount.
Since any thread can call VOP_PUTPAGES() to write back a
dirty page, the credentials of that thread may not have
write access to the file on an NFS server. (Often the uid
is 0, which may be mapped to "nobody" in the NFS server.)
Although there is no completely correct fix for this
(NFS servers check access on every write RPC instead of at
open/mmap time), this patch avoids the common cases by
holding onto a credential that recently opened the file
for writing and uses that credential for the write RPCs
being done by VOP_PUTPAGES() for both NFS clients.
marius [Sat, 26 May 2012 09:31:23 +0000 (09:31 +0000)]
MFC: r234524
o Fixes:
- When switching to 4-bit operation, send a SET_CLR_CARD_DETECT command
to disconnect the card-detect pull-up resistor from the DAT3 line before
sending the SET_BUS_WIDTH command.
- Add the missing "reserved" zero entry to the mantissa table used to
decode various CSD fields. This was causing SD cards to report that they
could run at 30 MHz instead of the maximum 25 MHz mandated in the spec.
o Enhancements:
- At the MMC layer, format various info from the CID into a string that
uniquely identifies the card instance (manufacturer number, serial
number, product name and revision, etc). Export it as an instance
variable.
- At the MMCSD layer, display the formatted card ID string, and also
report the clock speed of the hardware (not the card's max speed), and
the number of bits and number of blocks per transfer. It comes out like
this now:
mmcsd0: 968MB <SD SD01G 8.0 SN 276886905 MFG 08/2008 by 3 SD> at mmc0
22.5MHz/4bit/128-block
o Use DEVMETHOD_END.
o Use NULL instead of 0 for pointers.
marius [Sat, 26 May 2012 09:16:37 +0000 (09:16 +0000)]
MFC: r234901
- Add missing locking in at91_usart_getc().
- Align the RX buffers on the cache line size, otherwise the requirement
of partial cache line flushes on every are pretty much guaranteed. [1]
- Make the code setting the RX timeout match its comment (apparently,
start and stop bits were missed in the previous calculation). [1]
- Cover the busdma operations in at91_usart_bus_{ipend,transmit}() with
the hardware mutex, too, so these don't race against each other.
- In at91_usart_bus_ipend(), reduce duplication in the code dealing with
TX interrupts.
- In at91_usart_bus_ipend(), turn the code dealing with RX interrupts
into an else-if cascade in order reduce its complexity and to improve
its run-time behavior.
- In at91_usart_bus_ipend(), add missing BUS_DMASYNC_PREREAD calls on
the RX buffer map before handing things over to the hardware again. [1]
- In at91_usart_bus_getsig(), used a variable of sufficient width for
storing the contents of USART_CSR.
- Use KOBJMETHOD_END.
- Remove an unused header.
Submitted by: Ian Lepore [1]
Reviewed by: Ian Lepore
marius [Sat, 26 May 2012 09:13:24 +0000 (09:13 +0000)]
MFC: r234561
Interrupts must be disabled while handling a partial cache line flush,
as otherwise the interrupt handling code may modify data in the non-DMA
part of the cache line while we have it stashed away in the temporary
stack buffer, then we end up restoring a stale value.
marius [Sat, 26 May 2012 09:11:45 +0000 (09:11 +0000)]
MFC: r234560
- Add support for MCI1 revision 2xx controllers and a work-around for their
"Data Write Operation and number of bytes" erratum.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
marius [Sat, 26 May 2012 09:05:45 +0000 (09:05 +0000)]
MFC: r234291, r234292
Add support for the Atmel SAM9XE family of microcontrollers, which
consist of a ARM926EJ-S processor core with up to 512 Kbytes of on-chip
flash. Tested with SAM9XE512.
marius [Sat, 26 May 2012 08:54:26 +0000 (08:54 +0000)]
MFC: r234898, r235207
Add initial support for booting from ZFS on sparc64. At least on Sun Fire
V100, the firmware is known to be broken and not allowing to simultaneously
open disk devices, causing attempts to boot from a mirror or RAIDZ to cause
a crash. This will be worked around later. The firmwares of newer sun4u models
don't seem to exhibit this problem though.
thompsa [Sat, 26 May 2012 08:44:26 +0000 (08:44 +0000)]
MFC r234936 (emaste)
Relax restriction on direct tx to child ports
Lagg(4) restricts the type of packet that may be sent directly to a child
port, to avoid undesired output from accidental misconfiguration.
Previously only ETHERTYPE_PAE was permitted.
BPF writes to a lagg(4) child port are presumably intentional, so just
allow them, while still blocking other packets that should take the
aggregation path.
thompsa [Sat, 26 May 2012 08:25:25 +0000 (08:25 +0000)]
MFC r232321
Correct the description for CTLFLAG_TUN and CTLFLAG_RDTUN, the declaring of a
system tunable has never been implemented. This flag is only used by sysctl(8)
to provide a helpful error message.
thompsa [Sat, 26 May 2012 07:44:35 +0000 (07:44 +0000)]
MFC r235147
Do not reinitialise the interface if it is already running, this prevents the
bootp+nfs code from working as it calls init on each dhcp send and rx fails to
start in time.
thompsa [Sat, 26 May 2012 07:44:00 +0000 (07:44 +0000)]
MFC r235144
The DEVICE_POLLING dereference of sc->tsec_ifp needs to be checked for null
first or this will panic. Condense three blocks that check sc->tsec_ifp into
one while I am here.
thompsa [Sat, 26 May 2012 07:41:05 +0000 (07:41 +0000)]
MFC r234163
Set the proto to LAGG_PROTO_NONE before calling the detach routine so packets
are discarded, this is an issue because lacp drops the lock which may allow
network threads to access freed memory. Expand the lock coverage so the
detach/attach happen atomically.
thompsa [Sat, 26 May 2012 07:35:44 +0000 (07:35 +0000)]
MFC r232118
Only look for a usable MAC address for the bridge ID from ports within our
bridge, this allows us to have more than one independent bridge in the same
STP domain.
thompsa [Sat, 26 May 2012 07:34:46 +0000 (07:34 +0000)]
MFC r232014,r232030,r232070
- bstp_input() always consumes the packet so remove the mbuf handling dance
around it.
- Now that network interfaces advertise if they support linkstate notifications
we do not need to perform a media ioctl every 15 seconds.
- Indicate this function decrements the timer as well as testing for expiry.
marius [Fri, 25 May 2012 19:56:40 +0000 (19:56 +0000)]
MFC: r235681
Rewrite nd6_sysctl_{d,p}rlist() to avoid misaligned accesses to char arrays
casted to structs by getting rid of these buffers entirely. In r169832, it
was tried to paper over this issue by 32-bit aligning the buffers. Depending
on compiler optimizations that still was insufficient for 64-bit architectures
with strong alignment requirements though.
While at it, add comments regarding the total lack of locking in this area.
marius [Fri, 25 May 2012 17:50:50 +0000 (17:50 +0000)]
MFC: r228919
Add locally implemented atomic intrinsics to libcompiler_rt.
The built-in atomic operations are not implemented in our version of GCC
4.2 for the ARM and MIPS architectures. Instead of emitting locked
instructions, they generate calls to functions that can be implemented
in the C runtime.
Only implement the atomic operations that are used by <stdatomic.h> for
datatype sizes that are supported by atomic(9). This means that on these
architectures, we can only use atomic operations on 32-bits and 64-bits
variables, which is typically sufficient.
This makes <stdatomic.h> work on all architectures except MIPS, since
MIPS still uses libgcc.
marius [Fri, 25 May 2012 17:48:45 +0000 (17:48 +0000)]
MFC: r229135
Upgrade libcompiler_rt to upstream revision 147390.
This version of libcompiler_rt adds support for __mulo[sdt]i4(), which
computes a multiply and its overflow flag. There are also a lot of
cleanup fixes to headers that don't really affect us.
Updating to this revision should make it a bit easier to contribute
changes back to the LLVM developers.
marius [Fri, 25 May 2012 17:20:00 +0000 (17:20 +0000)]
MFC: r235487
Switch sparc64 to using libcompiler_rt; since r230021 (MFC'ed to stable/9 in
r236012) we have a workaround in place allowing it to be used there and since
r235388 (MFC'ed to stable/9 in r236002) we also have usable div/mod
optimizations like libgcc has.
marius [Fri, 25 May 2012 17:15:53 +0000 (17:15 +0000)]
MFC: r230021
Add a workaround to prevent endless recursion in compiler-rt.
SPARC and MIPS CPUs don't have special instructions to count
leading/trailing zeroes. The compiler-rt library provides fallback
rountines for these. The 64-bit routines, __clzdi2 and __ctzdi2, are
implemented as simple wrappers around the compiler built-in
__builtin_clz(), assuming these will expand to either 32-bit
CPU instructions or calls to __clzsi2 and __ctzsi2.
Unfortunately, our GCC 4.2 probably thinks that because the operand is
stored in a 64-bit register, it might just be a better idea to invoke
its 64-bit equivalent, simply resulting into endless recursion. Fix this
by defining __builtin_clz and __builtin_ctz to __clzsi2 and __ctzsi2
explicitly.
marius [Fri, 25 May 2012 17:14:47 +0000 (17:14 +0000)]
MFC: r222656
Upgrade libcompiler_rt from revision 117047 to 132478.
It seems there have only been a small amount to the compiler-rt source
code in the mean time. I'd rather have the code in sync as much as
possible by the time we release 9.0. Changes:
- The libcompiler_rt library is now dual licensed under both the
University of Illinois "BSD-Like" license and the MIT license.
- Our local modifications for using .hidden instead of .private_extern
have been upstreamed, meaning our changes to lib/assembly.h can now be
reverted.
- A possible endless recursion in __modsi3() has been fixed.
- Support for ARM EABI has been added, but it has no effect on FreeBSD
(yet).
- The functions __udivmodsi4 and __divmodsi4 have been added.
Requested by: many, including bf@ and Pedro Giffuni
bschmidt [Fri, 25 May 2012 16:07:39 +0000 (16:07 +0000)]
MFC r232946,232958,235233:
r232946:
Update the rt2860's firmware and add a Makefile for the module. While
here remove the ucode header file which was used to generate the fw files
but by now is outdated.
r232958:
Import the latest microcode.h which was used to generate the current
firmware files and adjust the Makefile.
r235233:
Add support for Ralink RT2800/RT3000 chipsets.
marius [Fri, 25 May 2012 15:15:25 +0000 (15:15 +0000)]
MFC: r235388
- Get rid of debugging support in order to get rid of the V8-specific C
compiler frame size used there so this whole thing is V8/V9-agnostic.
- Use 32-bit function alignment as GCC does when using UltraSPARC I or
higher optimizations.
- Don't waste delay slots when possible.
marius [Fri, 25 May 2012 15:14:22 +0000 (15:14 +0000)]
MFC: r230025
Add SPARC64 version of div/mod written in assembly.
This version is similar to the code shipped with libgcc. It is based on
the code from the SPARC64 architecture manual, provided without any
restrictions.
marius [Fri, 25 May 2012 14:40:56 +0000 (14:40 +0000)]
MFC: r234348
Turn on PREEMPTION by default. After fixing several bugs over time, the
last show-stopper keeping PREEMPTION from being usable on sparc64 should
have been dealt with in r230662 (MFC'ed to stable/9 in r230662).
At least on 2-way systems, PREEMPTION causes a little bit of a degradation
in worldstone performance. However, FreeBSD seems to have started building
up regressions in !PREEMPTION cases so sparc64 better should not be an
oddball in this regard.
fabient [Fri, 25 May 2012 07:25:30 +0000 (07:25 +0000)]
MFC r233611:
- Support inlined location in calltree output.
In case of multiple level of inlining all the locations are flattened.
Require recent binutils/addr2line (head works or binutils from ports
with the right $PATH order).
- Multiple fixes in the calltree output (recursion case, ...)
- Fix the calltree top view that previously hide some shared nodes.
Tested with Kcachegrind(kdesdk4)/qcachegrind(head).
GEOM_JOURNAL: Shutting down geom gjournal 3464572051.
panic: destroying non-empty racct: 1 allocated for resource 6
which were tracked by jh@ to be caused by checking p->p_flag,
while it wasn't initialised yet. Basically, during fork, the code
checked p_flag, concluded the process isn't marked as P_SYSTEM,
incremented the counter, and later on, when exiting, checked that
the process was marked as P_SYSTEM, and thus didn't decrement it.
Also, I believe there wasn't any good reason for checking P_SYSTEM
in the first place.
rmacklem [Thu, 24 May 2012 12:28:11 +0000 (12:28 +0000)]
MFC: r235568
A problem with the NFSv4 server was reported by Andrew Leonard
to freebsd-fs@, where the setfacl of an NFSv4 acl would fail.
This was caused by the VOP_ACLCHECK() call for ZFS replying
EOPNOTSUPP. After discussion with rwatson@, it was determined
that a call to VOP_ACLCHECK() before doing VOP_SETACL() is not
required. This patch fixes the problem by deleting the
VOP_ACLCHECK() call.
trasz [Thu, 24 May 2012 10:02:42 +0000 (10:02 +0000)]
MFC r234385:
Fix bug where NFSv4 ACL enforcement code wouldn't unconditionally
allow the owner to read and write ACL and file attributes when there
was no entry with subject matching the owner. In other words,
'getfacl meh' shouldn't fail for the owner if the ACL looks like this:
trasz [Thu, 24 May 2012 09:59:58 +0000 (09:59 +0000)]
MFC r226043:
Remove assertion against empty NFSv4 ACLs. An empty ACL is not exactly
valid - we don't allow for setting it on a file, for example - but it's
not something we should assert on.
For STABLE kernel, it changes nothing, because it's not compiled with
INVARIANTS. If it was, it would fix crashes. It also fixes an assert
in libc encountered with NFSv4 without nfsuserd(8) running.