rstone [Wed, 30 May 2012 23:42:48 +0000 (23:42 +0000)]
MFC r235459 and r235471
r235459:
Implement the DTrace sched provider. This implementation aims to be
compatible with the sched provider implemented by Solaris and its open-
source derivatives. Full documentation of the sched provider can be found
on Oracle's DTrace wiki pages.
Note that for compatibility with scripts originally written for Solaris,
serveral probes are defined that will never fire. These probes are defined
to fire when Solaris-specific features perform certain actions. As these
features are not present in FreeBSD, the probes can never fire.
Also, I have added a two probes that are not defined in Solaris, lend-pri
and load-change. These probes have been added to make it possible to
collect schedgraph data with DTrace.
Finally, a few probes are defined in Solaris to take a cpuinfo_t *
argument. As it was not immediately clear to me how to translate that to
FreeBSD, currently those probes are passed NULL in place of a cpuinfo_t *.
Sponsored by: Sandvine Incorporated
r235471:
Fix typo in function name SDT_PROBE4 and unbreak 4BSD UP.
jhb [Wed, 30 May 2012 16:13:34 +0000 (16:13 +0000)]
MFC 234501:
The amr(4) firmware contains a rather dubious "feature" where it
assumes for small buffers (< 64k) that the OS driver is actually using
a buffer rounded up to the next power of 2. It also assumes that the
buffer is at least 4k in size. Furthermore, there is at least one
known instance of megarc sending a request with a 12k buffer where the
firmware writes out a 24k-ish reply.
To workaround the data corruption triggered by this "feature", ensure
that buffers for user commands use a minimum size of 32k, and that
buffers between 32k and 64k use a 64k buffer.
jimharris [Tue, 29 May 2012 23:15:23 +0000 (23:15 +0000)]
MFC r234106, r235751:
Queue CCBs internally instead of using CAM_REQUEUE_REQ status. This fixes
problem where userspace apps such as smartctl fail due to CAM_REQUEUE_REQ
status getting returned when tagged commands are outstanding when smartctl
sends its I/O using the pass(4) interface.
yongari [Tue, 29 May 2012 04:53:46 +0000 (04:53 +0000)]
MFC r235821:
Don't force max payload size to 128. Root complex and Endpoint will
negotiate with each other on the TLP payload size so blindly
forcing the size to 128 can cause a completion error which in turn
will stop device.
yongari [Tue, 29 May 2012 04:32:27 +0000 (04:32 +0000)]
MFC r235816:
Make IPMI work in the bce driver even when the interface is
configured down. Formerly, IPMI communication was lost whenever the
interface was not up. The reason was that the BCE_EMAC_MODE
register was not configured with the correct media settings. There
are two parts to the fix.
First, resetting the chip in bce_reset() causes the BCE_EMAC_MODE
register to be initialized to a default value that does not
necessarily correspond to the actual media settings. The fix
implemented here is a bit of a hack. Ideally, at the end of
bce_reset() we would poll the PHY to determine the negotiated media,
and then we would set the BCE_EMAC_MODE register accordingly. That
is difficult, since the PHY is abstracted behind the MII layer and is
not supposed to be queried directly from the MAC driver. Instead,
we read the BCE_EMAC_MODE register at the beginning of bce_reset()
and then restore its media bits to their original values before
returning. If IPMI is up and running, then the link is already
established and the BCE_EMAC_MODE register is already set appropriately
when bce_reset() is called. If IPMI is not running, no harm is
done by preserving the BCE_EMAC_MODE settings. The driver will set
the register properly once the interface is configured up and link
is established.
Second, bce_miibus_statchg() is sometimes called when the link is
down. In that case, the reported media settings are invalid.
Formerly, the driver used them anyway to setup the BCE_EMAC_MODE
register. We now avoid changing any MAC registers unless link is
active and the reported media settings are valid.
des [Mon, 28 May 2012 21:25:19 +0000 (21:25 +0000)]
MFH r208957: check return value from archive_read_new()
MFH r214137, r214140, r214174, r224584: support reading from stdin
MFH r230044, r233456: style and markup nits
MFH r234311: correct name in copyright statement
rstone [Sun, 27 May 2012 19:03:16 +0000 (19:03 +0000)]
MFC r225640
Clear transmit checksum offload context state upon lem(4) interface
initialization. Prior to this change packets may be transmitted with an
incorrect checksum.
em(4) already has an equivalent change in r213234.
rstone [Sun, 27 May 2012 18:57:20 +0000 (18:57 +0000)]
MFC r234691
Implement the D "cpu" variable, which returns curcpu. I have chosen not
to follow the example of OpenSolaris and its descendants, which implemented
cpu as an inline that took a value out of curthread. At certain points in
the FreeBSD scheduler curthread->td_oncpu will no longer be valid (in
particukar, just before the thread gets descheduled) so instead I have
implemented this as its own built-in variable.
marius [Sun, 27 May 2012 15:48:25 +0000 (15:48 +0000)]
MFC: r235255 (partial)
Change the module order of this MAC driver to be last so its is
deterministically handled after the corresponding PHY drivers when
loaded as module. Otherwise, when this MAC/PHY driver combination
is compiled into a single module, probing the PHY drivers may fail.
This makes r151438 actually work.
Reported and tested by: yongari (for fxp(4))
Given that r226154 isn't part of stable/8, the other drivers fixed
as part of the original r235255 aren't affected here.
rstone [Sun, 27 May 2012 14:52:31 +0000 (14:52 +0000)]
MFC r233552
Instead of only iterating over the set of known SDT probes when sdt.ko is
loaded and unloaded, also have sdt.ko register callbacks with kern_sdt.c
that will be called when a newly loaded KLD module adds more probes or
a module with probes is unloaded.
This fixes two issues: first, if a module with SDT probes was loaded after
sdt.ko was loaded, those new probes would not be available in DTrace.
Second, if a module with SDT probes was unloaded while sdt.ko was loaded,
the kernel would panic the next time DTrace had cause to try and do
anything with the no-longer-existent probes.
This makes it possible to create SDT probes in KLD modules, although there
are still two caveats: first, any SDT probes in a KLD module must be part
of a DTrace provider that is defined in that module. At present DTrace
only destroys probes when the provider is destroyed, so you can still
panic the system if a KLD module creates new probes in a provider from a
different module(including the kernel) and then unload the the first module.
Second, the system will panic if you unload a module containing SDT probes
while there is an active D script that has enabled those probes.
rstone [Sun, 27 May 2012 14:25:16 +0000 (14:25 +0000)]
MFC r227441
Correct the types of the arguments to return probes of the syscall
provider. Previously we were erroneously supplying the argument types of
the corresponding entry probe.
rmacklem [Sun, 27 May 2012 14:20:46 +0000 (14:20 +0000)]
MFC: r235332
PR# 165923 reported intermittent write failures for dirty
memory mapped pages being written back on an NFS mount.
Since any thread can call VOP_PUTPAGES() to write back a
dirty page, the credentials of that thread may not have
write access to the file on an NFS server. (Often the uid
is 0, which may be mapped to "nobody" in the NFS server.)
Although there is no completely correct fix for this
(NFS servers check access on every write RPC instead of at
open/mmap time), this patch avoids the common cases by
holding onto a credential that recently opened the file
for writing and uses that credential for the write RPCs
being done by VOP_PUTPAGES() for both NFS clients.
rmacklem [Sun, 27 May 2012 12:47:35 +0000 (12:47 +0000)]
MFC: r234740
Fix a leak of namei lookup path buffers that occurs when a
ZFS volume is exported via the new NFS server. The leak occurred
because the new NFS server code didn't handle the case where
a file system sets the SAVENAME flag in its VOP_LOOKUP() and
ZFS does this for the DELETE case.
rstone [Sun, 27 May 2012 00:38:36 +0000 (00:38 +0000)]
MFC 227430
On i386, fbt probes are implemented by writing an invalid opcode over
certain instructions in a function prologue or epilogue. DTrace has a
hook into the invalid opcode fault handler that checks whether the fault
was due to an probe and if so, runs the DTrace magic.
Upon returning from an invalid opcode fault caused by a probe, DTrace must
emulate the instruction that was replaced with the invalid opcode and then
return control to the instruction following the invalid opcode.
There were a pair of related bugs in the emulation for the leave
instruction. The leave instruction is used to pop off a stack frame prior
to returning from a function. The emulation for this instruction must
move the trap frame for the invalid opcode fault down the stack to the
bottom of the stack frame that is being removed, and then execute an iret.
At two points in this process, the emulation code was storing values above
the current value of the stack pointer. This opened up a window in which
if we were two take an interrupt, the trap frame for the interrupt would
overwrite the values stored on the stack, causing the system to panic
later.
The first bug was that at one point the emulation code saves the new value
for $esp above the current stack pointer value. The fix is to save this
value instead inside of the original trap frame. At this point we do
not need the original trap frame so this is safe.
The second bug is that when the emulate code loads $esp from the stack, it
points part-way through the new trap frame instead of at its beginning.
The emulation code adjusts the stack pointer to the correct value
immediately afterwards, but this still leaves a one instruction window in
which an interrupt would corrupt this trap frame. Fix this by adjusting
the stack frame value before loading it into $esp.
This fixes panics in invop_leave on i386 when using fbt return probes.
rstone [Sun, 27 May 2012 00:34:56 +0000 (00:34 +0000)]
MFC 227429
The generated Makefile for the kernel was not running ctfconvert on
object files corresponding to source files that had the compile-with
option set in conf/files. This means that any fbt probes for functions
in that object file would not have correct argument types.
The fix is to run ctfconvert on any target file that does not have the
no-obj option set in files.
PR: bin/160275
Reported by: Paul Ambrose (ambrosehua AT gmail DOT com)
marius [Sat, 26 May 2012 09:31:25 +0000 (09:31 +0000)]
MFC: r234524
o Fixes:
- When switching to 4-bit operation, send a SET_CLR_CARD_DETECT command
to disconnect the card-detect pull-up resistor from the DAT3 line before
sending the SET_BUS_WIDTH command.
- Add the missing "reserved" zero entry to the mantissa table used to
decode various CSD fields. This was causing SD cards to report that they
could run at 30 MHz instead of the maximum 25 MHz mandated in the spec.
o Enhancements:
- At the MMC layer, format various info from the CID into a string that
uniquely identifies the card instance (manufacturer number, serial
number, product name and revision, etc). Export it as an instance
variable.
- At the MMCSD layer, display the formatted card ID string, and also
report the clock speed of the hardware (not the card's max speed), and
the number of bits and number of blocks per transfer. It comes out like
this now:
mmcsd0: 968MB <SD SD01G 8.0 SN 276886905 MFG 08/2008 by 3 SD> at mmc0
22.5MHz/4bit/128-block
o Use DEVMETHOD_END.
o Use NULL instead of 0 for pointers.
marius [Sat, 26 May 2012 09:13:38 +0000 (09:13 +0000)]
MFC: r234561
Interrupts must be disabled while handling a partial cache line flush,
as otherwise the interrupt handling code may modify data in the non-DMA
part of the cache line while we have it stashed away in the temporary
stack buffer, then we end up restoring a stale value.
marius [Sat, 26 May 2012 08:54:44 +0000 (08:54 +0000)]
MFC: r234898, r235207
Add initial support for booting from ZFS on sparc64. At least on Sun Fire
V100, the firmware is known to be broken and not allowing to simultaneously
open disk devices, causing attempts to boot from a mirror or RAIDZ to cause
a crash. This will be worked around later. The firmwares of newer sun4u models
don't seem to exhibit this problem though.
thompsa [Sat, 26 May 2012 08:44:50 +0000 (08:44 +0000)]
MFC r234936 (emaste)
Relax restriction on direct tx to child ports
Lagg(4) restricts the type of packet that may be sent directly to a child
port, to avoid undesired output from accidental misconfiguration.
Previously only ETHERTYPE_PAE was permitted.
BPF writes to a lagg(4) child port are presumably intentional, so just
allow them, while still blocking other packets that should take the
aggregation path.
thompsa [Sat, 26 May 2012 08:25:41 +0000 (08:25 +0000)]
MFC r232321
Correct the description for CTLFLAG_TUN and CTLFLAG_RDTUN, the declaring of a
system tunable has never been implemented. This flag is only used by sysctl(8)
to provide a helpful error message.
thompsa [Sat, 26 May 2012 08:00:34 +0000 (08:00 +0000)]
MFC r234163
Set the proto to LAGG_PROTO_NONE before calling the detach routine so packets
are discarded, this is an issue because lacp drops the lock which may allow
network threads to access freed memory. Expand the lock coverage so the
detach/attach happen atomically.
thompsa [Sat, 26 May 2012 07:58:58 +0000 (07:58 +0000)]
MFC r232118
Only look for a usable MAC address for the bridge ID from ports within our
bridge, this allows us to have more than one independent bridge in the same
STP domain.
thompsa [Sat, 26 May 2012 07:58:12 +0000 (07:58 +0000)]
MFC r232014,r232030,r232070
- bstp_input() always consumes the packet so remove the mbuf handling dance
around it.
- Now that network interfaces advertise if they support linkstate notifications
we do not need to perform a media ioctl every 15 seconds.
- Indicate this function decrements the timer as well as testing for expiry.
rstone [Fri, 25 May 2012 23:24:16 +0000 (23:24 +0000)]
MFC r227342
The in-kernel CTF parser caches the result of its first attempt to parse
CTF data from a module. On subsequent attempts to retrieve CTF data for
a module, return an error if there no CTF data.
This fixes a panic if you try to enable fbt probes on a module without CTF
data twice.
Submitted by: Paul Ambrose (ambrosehua AT gmail DOT com)
marius [Fri, 25 May 2012 19:57:01 +0000 (19:57 +0000)]
MFC: r235681
Rewrite nd6_sysctl_{d,p}rlist() to avoid misaligned accesses to char arrays
casted to structs by getting rid of these buffers entirely. In r169832, it
was tried to paper over this issue by 32-bit aligning the buffers. Depending
on compiler optimizations that still was insufficient for 64-bit architectures
with strong alignment requirements though.
While at it, add comments regarding the total lack of locking in this area.
bschmidt [Fri, 25 May 2012 16:39:56 +0000 (16:39 +0000)]
MFC r232946,232958,235233:
r232946:
Update the rt2860's firmware and add a Makefile for the module. While
here remove the ucode header file which was used to generate the fw files
but by now is outdated.
r232958:
Import the latest microcode.h which was used to generate the current
firmware files and adjust the Makefile.
r235233:
Add support for Ralink RT2800/RT3000 chipsets.
fabient [Fri, 25 May 2012 07:23:24 +0000 (07:23 +0000)]
MFC r233611:
- Support inlined location in calltree output.
In case of multiple level of inlining all the locations are flattened.
Require recent binutils/addr2line (head works or binutils from ports
with the right $PATH order).
- Multiple fixes in the calltree output (recursion case, ...)
- Fix the calltree top view that previously hide some shared nodes.
Tested with Kcachegrind(kdesdk4)/qcachegrind(head).
rmacklem [Thu, 24 May 2012 13:15:15 +0000 (13:15 +0000)]
MFC: r235568
A problem with the NFSv4 server was reported by Andrew Leonard
to freebsd-fs@, where the setfacl of an NFSv4 acl would fail.
This was caused by the VOP_ACLCHECK() call for ZFS replying
EOPNOTSUPP. After discussion with rwatson@, it was determined
that a call to VOP_ACLCHECK() before doing VOP_SETACL() is not
required. This patch fixes the problem by deleting the
VOP_ACLCHECK() call.
mav [Thu, 24 May 2012 02:46:35 +0000 (02:46 +0000)]
MFC r234458, r234603, r234610, r234727, r234816, r234848, r234868,
r234869, r234899, r234940, r234993, r234994, r235071 -c r235076, r235080,
r235096:
- Add support for the DDF metadata format, as defined by the SNIA Common
RAID Disk Data Format Specification v2.0;
- Add support for reading non-degraded RAID4/5/5E/5EE/5R/6/MDF volumes.
mav [Thu, 24 May 2012 01:43:08 +0000 (01:43 +0000)]
MFC r235270:
- Prevent error status leak if write to some of the RAID1/1E volume disks
failed while write to some other succeeded. Instead mark disk as failed.
- Make RAID1E less aggressive in failing disks to avoid volume breakage.
mav [Thu, 24 May 2012 01:30:17 +0000 (01:30 +0000)]
MFC r235558, r235569:
Add support for writing to HID devices through the interrupt output pipe.
Supermicro LCD screen modules seem to not support accessing reports through
the control pipes, but working fine with the interrupt pipes.
jamie [Wed, 23 May 2012 15:47:07 +0000 (15:47 +0000)]
MFC r222465, r223224, r224841, r232613:
Check for IPv4 or IPv6 to be available by the kernel to not
provoke errors trying to query options not available.
Make it possible to compile out INET or INET6 only parts.
yongari [Wed, 23 May 2012 02:12:00 +0000 (02:12 +0000)]
MFC r235151:
Implement basic remote PHY support. Remote PHY allows the
controller to perform MDIO type accesses to a remote transceiver
using message pages defined through MRBE(multirate backplane
ethernet). It's used in blade systems(e.g Dell Blade m610) which
are connected to pass-through blades rather than traditional
switches.
This change directly manipulates firmware's mailboxes to control
remote PHY such that it does not use mii(4). Alternatively, as
David said, it could be implemented in brgphy(4) by creating a fake
PHY and let brgphy(4) do necessary mii accesses and bce(4) can
implement mailbox accesses based on the type of brgphy(4)'s mii
accesses. Personally, I think it would make brgphy(4) hard to
maintain since it would have to access many bce(4) registers in
brgphy(4). Given that there are users who are suffering from lack
of remote PHY support, it would be better to get working system
rather than waiting for complete/perfect implementation.
jhb [Mon, 21 May 2012 21:16:08 +0000 (21:16 +0000)]
MFC 234190,234196,234280:
- Extend the KDB interface to add a per-debugger callback to print a
backtrace for an arbitrary thread (rather than the calling thread).
A kdb_backtrace_thread() wrapper function uses the configured debugger
if possible, otherwise it falls back to using stack(9) if that is
available.
- Replace a direct call to db_trace_thread() in propagate_priority()
with a call to kdb_backtrace_thread() instead.
- Add support for UPDATING remote fetching.
- Reorganize EXAMPLES section in pkg_updating(1).
- Style fixes.
- Replace hardcoded INDEX version. [1]
- Fix a buffer overlap. [2]
- Remove empty package when fetching fails and -K is used. [3]
- Remove useless chmod2() after mkdtemp(3). [4]
- Replace mkdir(1) call with mkdir(2). [5]
- Get rid of some vsystem() calls.
- Switch from lstat(2) to open(2) in fexists().
- Try rename(2) in move_file() first.
- Fix pkg_delete, check if the file we're trying to delete is a
symlink before complaining that it doesn't exist. Typical case
would be a leftover library symlink that's left over after the
actual library has been removed.
- Print the package name on deletion errors.
- Staticify elide_root()
- In usr.sbin/pkg_install/updating/main.c, use the size of the destination
buffer as size argument to strlcpy(), not the length of the source
bz [Mon, 21 May 2012 00:03:13 +0000 (00:03 +0000)]
MFC r232513:
Correct typo in the RFC number for the constants based on IANA assignments
for IPv6 Neighbor Discovery Option types for "IPv6 Router Advertisement
Options for DNS Configuration". It is RFC 6106.
bz [Sun, 20 May 2012 22:58:37 +0000 (22:58 +0000)]
MFC r232514:
In nd6_options() ignore the RFC 6106 options completely rather than printing
them if nd6_debug is enabled as unknown. Leave a comment about the RFC4191
option as I am undecided so far.
bz [Sun, 20 May 2012 20:25:57 +0000 (20:25 +0000)]
MFC r231532:
MFp4 204292:
Ignore the NAT_T extension types so we can at least dump the SADB from
the in-base libipsec/setkey without error when NAT_T support is present
in the kernel, though not printing the additional information yet.
However in case there is no NAT_T support in kernel still consider them
to be an error.
bz [Sat, 19 May 2012 22:50:22 +0000 (22:50 +0000)]
MFC r234643:
Do not toggle IFCAP_TSO4 if we would also do TSO6. Given the driver does
not currently announce/support TSO6 that cannot happen. Clean it up anyway
for consistency.
bz [Sat, 19 May 2012 22:16:12 +0000 (22:16 +0000)]
MFC r234620:
If we pass down 64k - L2 hdr size + 1 to 64K L3+ data adding an ether
header will make the data go over the 64k limits announced to busdma as
maxsize and the transaction will fail.
With TSO this can result in a TCP regression due to the lost packet.
According to the data sheets ixgbe(4) 82598 and 82599 can handle up to
256k so increase the maximum.
Reported by: Jon Kåre Hellan, UNINETT (jon.kare.hellan uninett.no)
Tested by: Jon Kåre Hellan, UNINETT (jon.kare.hellan uninett.no)
bz [Sat, 19 May 2012 18:33:08 +0000 (18:33 +0000)]
MFC r231767:
Fix PAWS (Protect Against Wrapped Sequence numbers) in cases when
hz >> 1000 and thus getting outside the timestamp clock frequenceny of
1ms < x < 1s per tick as mandated by RFC1323, leading to connection
resets on idle connections.
Always use a granularity of 1ms using getmicrouptime() making all but
relevant callouts independent of hz.
Use getmicrouptime(), not getmicrotime() as the latter may make a jump
possibly breaking TCP nfsroot mounts having our timestamps move forward
for more than 24.8 days in a second without having been idle for that
long.
bz [Sat, 19 May 2012 14:32:47 +0000 (14:32 +0000)]
MFC r233713:
Remove the magic mfi_array is 288 bytes and just use the
sizeof the array since it is not 288 bytes.
Change reporting of a "SYSTEM" disk to "JBOD" to match
LSI MegaCli and firmware reporting.
This means that mfiutil command to "create jbod" is now a
little confusing since a RAID per drive is not really what
LSI defines JBOD to be. This should be fixed in the future
and support added to really create LSI JBOD and enable that
feature on cards that support it.
sbruno [Fri, 18 May 2012 19:48:33 +0000 (19:48 +0000)]
MFC of head thunderbolt support for mfi(4)
r233711 -- IFV head_mfi into head for initial thunderbolt support
r233768 -- atomic_t --> mfi_atomic
r233805 -- fix tinderbuild, move megasas_sge to mfivar.h
r233877 -- remove atomic.h from includes
r235014 -- fix reading of sector >= 2^32 or 2^21, repair RAID handling
r235016 -- style(9)
r235040 -- fix returns from mfi_tbolt_sync_map_info()
r235318 -- repair panic on PAE i386
r235321 -- repair the repair of panics on PAE i386
jhb [Fri, 18 May 2012 18:53:28 +0000 (18:53 +0000)]
MFC 234186
If a linker file contains at least one module, but all of the modules
fail to load (the MOD_LOAD event fails) during a kldload(2), unload the
linker file and fail the kldload(2) with ENOEXEC.
sbruno [Fri, 18 May 2012 18:29:44 +0000 (18:29 +0000)]
MFC r235210
Modify the binding of queues to attach to as many CPUs
as possible when using more than one igb(4) adapter. This
means that queues will not be bound to the same CPUs if
there are more CPUs availble.
This is only applicable to a system that has multiple interfaces.