des [Mon, 28 May 2012 21:25:19 +0000 (21:25 +0000)]
MFH r208957: check return value from archive_read_new()
MFH r214137, r214140, r214174, r224584: support reading from stdin
MFH r230044, r233456: style and markup nits
MFH r234311: correct name in copyright statement
rstone [Sun, 27 May 2012 19:03:16 +0000 (19:03 +0000)]
MFC r225640
Clear transmit checksum offload context state upon lem(4) interface
initialization. Prior to this change packets may be transmitted with an
incorrect checksum.
em(4) already has an equivalent change in r213234.
rstone [Sun, 27 May 2012 18:57:20 +0000 (18:57 +0000)]
MFC r234691
Implement the D "cpu" variable, which returns curcpu. I have chosen not
to follow the example of OpenSolaris and its descendants, which implemented
cpu as an inline that took a value out of curthread. At certain points in
the FreeBSD scheduler curthread->td_oncpu will no longer be valid (in
particukar, just before the thread gets descheduled) so instead I have
implemented this as its own built-in variable.
marius [Sun, 27 May 2012 15:48:25 +0000 (15:48 +0000)]
MFC: r235255 (partial)
Change the module order of this MAC driver to be last so its is
deterministically handled after the corresponding PHY drivers when
loaded as module. Otherwise, when this MAC/PHY driver combination
is compiled into a single module, probing the PHY drivers may fail.
This makes r151438 actually work.
Reported and tested by: yongari (for fxp(4))
Given that r226154 isn't part of stable/8, the other drivers fixed
as part of the original r235255 aren't affected here.
rstone [Sun, 27 May 2012 14:52:31 +0000 (14:52 +0000)]
MFC r233552
Instead of only iterating over the set of known SDT probes when sdt.ko is
loaded and unloaded, also have sdt.ko register callbacks with kern_sdt.c
that will be called when a newly loaded KLD module adds more probes or
a module with probes is unloaded.
This fixes two issues: first, if a module with SDT probes was loaded after
sdt.ko was loaded, those new probes would not be available in DTrace.
Second, if a module with SDT probes was unloaded while sdt.ko was loaded,
the kernel would panic the next time DTrace had cause to try and do
anything with the no-longer-existent probes.
This makes it possible to create SDT probes in KLD modules, although there
are still two caveats: first, any SDT probes in a KLD module must be part
of a DTrace provider that is defined in that module. At present DTrace
only destroys probes when the provider is destroyed, so you can still
panic the system if a KLD module creates new probes in a provider from a
different module(including the kernel) and then unload the the first module.
Second, the system will panic if you unload a module containing SDT probes
while there is an active D script that has enabled those probes.
rstone [Sun, 27 May 2012 14:25:16 +0000 (14:25 +0000)]
MFC r227441
Correct the types of the arguments to return probes of the syscall
provider. Previously we were erroneously supplying the argument types of
the corresponding entry probe.
rmacklem [Sun, 27 May 2012 14:20:46 +0000 (14:20 +0000)]
MFC: r235332
PR# 165923 reported intermittent write failures for dirty
memory mapped pages being written back on an NFS mount.
Since any thread can call VOP_PUTPAGES() to write back a
dirty page, the credentials of that thread may not have
write access to the file on an NFS server. (Often the uid
is 0, which may be mapped to "nobody" in the NFS server.)
Although there is no completely correct fix for this
(NFS servers check access on every write RPC instead of at
open/mmap time), this patch avoids the common cases by
holding onto a credential that recently opened the file
for writing and uses that credential for the write RPCs
being done by VOP_PUTPAGES() for both NFS clients.
rmacklem [Sun, 27 May 2012 12:47:35 +0000 (12:47 +0000)]
MFC: r234740
Fix a leak of namei lookup path buffers that occurs when a
ZFS volume is exported via the new NFS server. The leak occurred
because the new NFS server code didn't handle the case where
a file system sets the SAVENAME flag in its VOP_LOOKUP() and
ZFS does this for the DELETE case.
rstone [Sun, 27 May 2012 00:38:36 +0000 (00:38 +0000)]
MFC 227430
On i386, fbt probes are implemented by writing an invalid opcode over
certain instructions in a function prologue or epilogue. DTrace has a
hook into the invalid opcode fault handler that checks whether the fault
was due to an probe and if so, runs the DTrace magic.
Upon returning from an invalid opcode fault caused by a probe, DTrace must
emulate the instruction that was replaced with the invalid opcode and then
return control to the instruction following the invalid opcode.
There were a pair of related bugs in the emulation for the leave
instruction. The leave instruction is used to pop off a stack frame prior
to returning from a function. The emulation for this instruction must
move the trap frame for the invalid opcode fault down the stack to the
bottom of the stack frame that is being removed, and then execute an iret.
At two points in this process, the emulation code was storing values above
the current value of the stack pointer. This opened up a window in which
if we were two take an interrupt, the trap frame for the interrupt would
overwrite the values stored on the stack, causing the system to panic
later.
The first bug was that at one point the emulation code saves the new value
for $esp above the current stack pointer value. The fix is to save this
value instead inside of the original trap frame. At this point we do
not need the original trap frame so this is safe.
The second bug is that when the emulate code loads $esp from the stack, it
points part-way through the new trap frame instead of at its beginning.
The emulation code adjusts the stack pointer to the correct value
immediately afterwards, but this still leaves a one instruction window in
which an interrupt would corrupt this trap frame. Fix this by adjusting
the stack frame value before loading it into $esp.
This fixes panics in invop_leave on i386 when using fbt return probes.
rstone [Sun, 27 May 2012 00:34:56 +0000 (00:34 +0000)]
MFC 227429
The generated Makefile for the kernel was not running ctfconvert on
object files corresponding to source files that had the compile-with
option set in conf/files. This means that any fbt probes for functions
in that object file would not have correct argument types.
The fix is to run ctfconvert on any target file that does not have the
no-obj option set in files.
PR: bin/160275
Reported by: Paul Ambrose (ambrosehua AT gmail DOT com)
marius [Sat, 26 May 2012 09:31:25 +0000 (09:31 +0000)]
MFC: r234524
o Fixes:
- When switching to 4-bit operation, send a SET_CLR_CARD_DETECT command
to disconnect the card-detect pull-up resistor from the DAT3 line before
sending the SET_BUS_WIDTH command.
- Add the missing "reserved" zero entry to the mantissa table used to
decode various CSD fields. This was causing SD cards to report that they
could run at 30 MHz instead of the maximum 25 MHz mandated in the spec.
o Enhancements:
- At the MMC layer, format various info from the CID into a string that
uniquely identifies the card instance (manufacturer number, serial
number, product name and revision, etc). Export it as an instance
variable.
- At the MMCSD layer, display the formatted card ID string, and also
report the clock speed of the hardware (not the card's max speed), and
the number of bits and number of blocks per transfer. It comes out like
this now:
mmcsd0: 968MB <SD SD01G 8.0 SN 276886905 MFG 08/2008 by 3 SD> at mmc0
22.5MHz/4bit/128-block
o Use DEVMETHOD_END.
o Use NULL instead of 0 for pointers.
marius [Sat, 26 May 2012 09:13:38 +0000 (09:13 +0000)]
MFC: r234561
Interrupts must be disabled while handling a partial cache line flush,
as otherwise the interrupt handling code may modify data in the non-DMA
part of the cache line while we have it stashed away in the temporary
stack buffer, then we end up restoring a stale value.
marius [Sat, 26 May 2012 08:54:44 +0000 (08:54 +0000)]
MFC: r234898, r235207
Add initial support for booting from ZFS on sparc64. At least on Sun Fire
V100, the firmware is known to be broken and not allowing to simultaneously
open disk devices, causing attempts to boot from a mirror or RAIDZ to cause
a crash. This will be worked around later. The firmwares of newer sun4u models
don't seem to exhibit this problem though.
thompsa [Sat, 26 May 2012 08:44:50 +0000 (08:44 +0000)]
MFC r234936 (emaste)
Relax restriction on direct tx to child ports
Lagg(4) restricts the type of packet that may be sent directly to a child
port, to avoid undesired output from accidental misconfiguration.
Previously only ETHERTYPE_PAE was permitted.
BPF writes to a lagg(4) child port are presumably intentional, so just
allow them, while still blocking other packets that should take the
aggregation path.
thompsa [Sat, 26 May 2012 08:25:41 +0000 (08:25 +0000)]
MFC r232321
Correct the description for CTLFLAG_TUN and CTLFLAG_RDTUN, the declaring of a
system tunable has never been implemented. This flag is only used by sysctl(8)
to provide a helpful error message.
thompsa [Sat, 26 May 2012 08:00:34 +0000 (08:00 +0000)]
MFC r234163
Set the proto to LAGG_PROTO_NONE before calling the detach routine so packets
are discarded, this is an issue because lacp drops the lock which may allow
network threads to access freed memory. Expand the lock coverage so the
detach/attach happen atomically.
thompsa [Sat, 26 May 2012 07:58:58 +0000 (07:58 +0000)]
MFC r232118
Only look for a usable MAC address for the bridge ID from ports within our
bridge, this allows us to have more than one independent bridge in the same
STP domain.
thompsa [Sat, 26 May 2012 07:58:12 +0000 (07:58 +0000)]
MFC r232014,r232030,r232070
- bstp_input() always consumes the packet so remove the mbuf handling dance
around it.
- Now that network interfaces advertise if they support linkstate notifications
we do not need to perform a media ioctl every 15 seconds.
- Indicate this function decrements the timer as well as testing for expiry.
rstone [Fri, 25 May 2012 23:24:16 +0000 (23:24 +0000)]
MFC r227342
The in-kernel CTF parser caches the result of its first attempt to parse
CTF data from a module. On subsequent attempts to retrieve CTF data for
a module, return an error if there no CTF data.
This fixes a panic if you try to enable fbt probes on a module without CTF
data twice.
Submitted by: Paul Ambrose (ambrosehua AT gmail DOT com)
marius [Fri, 25 May 2012 19:57:01 +0000 (19:57 +0000)]
MFC: r235681
Rewrite nd6_sysctl_{d,p}rlist() to avoid misaligned accesses to char arrays
casted to structs by getting rid of these buffers entirely. In r169832, it
was tried to paper over this issue by 32-bit aligning the buffers. Depending
on compiler optimizations that still was insufficient for 64-bit architectures
with strong alignment requirements though.
While at it, add comments regarding the total lack of locking in this area.
bschmidt [Fri, 25 May 2012 16:39:56 +0000 (16:39 +0000)]
MFC r232946,232958,235233:
r232946:
Update the rt2860's firmware and add a Makefile for the module. While
here remove the ucode header file which was used to generate the fw files
but by now is outdated.
r232958:
Import the latest microcode.h which was used to generate the current
firmware files and adjust the Makefile.
r235233:
Add support for Ralink RT2800/RT3000 chipsets.
fabient [Fri, 25 May 2012 07:23:24 +0000 (07:23 +0000)]
MFC r233611:
- Support inlined location in calltree output.
In case of multiple level of inlining all the locations are flattened.
Require recent binutils/addr2line (head works or binutils from ports
with the right $PATH order).
- Multiple fixes in the calltree output (recursion case, ...)
- Fix the calltree top view that previously hide some shared nodes.
Tested with Kcachegrind(kdesdk4)/qcachegrind(head).
rmacklem [Thu, 24 May 2012 13:15:15 +0000 (13:15 +0000)]
MFC: r235568
A problem with the NFSv4 server was reported by Andrew Leonard
to freebsd-fs@, where the setfacl of an NFSv4 acl would fail.
This was caused by the VOP_ACLCHECK() call for ZFS replying
EOPNOTSUPP. After discussion with rwatson@, it was determined
that a call to VOP_ACLCHECK() before doing VOP_SETACL() is not
required. This patch fixes the problem by deleting the
VOP_ACLCHECK() call.
mav [Thu, 24 May 2012 02:46:35 +0000 (02:46 +0000)]
MFC r234458, r234603, r234610, r234727, r234816, r234848, r234868,
r234869, r234899, r234940, r234993, r234994, r235071 -c r235076, r235080,
r235096:
- Add support for the DDF metadata format, as defined by the SNIA Common
RAID Disk Data Format Specification v2.0;
- Add support for reading non-degraded RAID4/5/5E/5EE/5R/6/MDF volumes.
mav [Thu, 24 May 2012 01:43:08 +0000 (01:43 +0000)]
MFC r235270:
- Prevent error status leak if write to some of the RAID1/1E volume disks
failed while write to some other succeeded. Instead mark disk as failed.
- Make RAID1E less aggressive in failing disks to avoid volume breakage.
mav [Thu, 24 May 2012 01:30:17 +0000 (01:30 +0000)]
MFC r235558, r235569:
Add support for writing to HID devices through the interrupt output pipe.
Supermicro LCD screen modules seem to not support accessing reports through
the control pipes, but working fine with the interrupt pipes.
jamie [Wed, 23 May 2012 15:47:07 +0000 (15:47 +0000)]
MFC r222465, r223224, r224841, r232613:
Check for IPv4 or IPv6 to be available by the kernel to not
provoke errors trying to query options not available.
Make it possible to compile out INET or INET6 only parts.
yongari [Wed, 23 May 2012 02:12:00 +0000 (02:12 +0000)]
MFC r235151:
Implement basic remote PHY support. Remote PHY allows the
controller to perform MDIO type accesses to a remote transceiver
using message pages defined through MRBE(multirate backplane
ethernet). It's used in blade systems(e.g Dell Blade m610) which
are connected to pass-through blades rather than traditional
switches.
This change directly manipulates firmware's mailboxes to control
remote PHY such that it does not use mii(4). Alternatively, as
David said, it could be implemented in brgphy(4) by creating a fake
PHY and let brgphy(4) do necessary mii accesses and bce(4) can
implement mailbox accesses based on the type of brgphy(4)'s mii
accesses. Personally, I think it would make brgphy(4) hard to
maintain since it would have to access many bce(4) registers in
brgphy(4). Given that there are users who are suffering from lack
of remote PHY support, it would be better to get working system
rather than waiting for complete/perfect implementation.
jhb [Mon, 21 May 2012 21:16:08 +0000 (21:16 +0000)]
MFC 234190,234196,234280:
- Extend the KDB interface to add a per-debugger callback to print a
backtrace for an arbitrary thread (rather than the calling thread).
A kdb_backtrace_thread() wrapper function uses the configured debugger
if possible, otherwise it falls back to using stack(9) if that is
available.
- Replace a direct call to db_trace_thread() in propagate_priority()
with a call to kdb_backtrace_thread() instead.
- Add support for UPDATING remote fetching.
- Reorganize EXAMPLES section in pkg_updating(1).
- Style fixes.
- Replace hardcoded INDEX version. [1]
- Fix a buffer overlap. [2]
- Remove empty package when fetching fails and -K is used. [3]
- Remove useless chmod2() after mkdtemp(3). [4]
- Replace mkdir(1) call with mkdir(2). [5]
- Get rid of some vsystem() calls.
- Switch from lstat(2) to open(2) in fexists().
- Try rename(2) in move_file() first.
- Fix pkg_delete, check if the file we're trying to delete is a
symlink before complaining that it doesn't exist. Typical case
would be a leftover library symlink that's left over after the
actual library has been removed.
- Print the package name on deletion errors.
- Staticify elide_root()
- In usr.sbin/pkg_install/updating/main.c, use the size of the destination
buffer as size argument to strlcpy(), not the length of the source
bz [Mon, 21 May 2012 00:03:13 +0000 (00:03 +0000)]
MFC r232513:
Correct typo in the RFC number for the constants based on IANA assignments
for IPv6 Neighbor Discovery Option types for "IPv6 Router Advertisement
Options for DNS Configuration". It is RFC 6106.
bz [Sun, 20 May 2012 22:58:37 +0000 (22:58 +0000)]
MFC r232514:
In nd6_options() ignore the RFC 6106 options completely rather than printing
them if nd6_debug is enabled as unknown. Leave a comment about the RFC4191
option as I am undecided so far.
bz [Sun, 20 May 2012 20:25:57 +0000 (20:25 +0000)]
MFC r231532:
MFp4 204292:
Ignore the NAT_T extension types so we can at least dump the SADB from
the in-base libipsec/setkey without error when NAT_T support is present
in the kernel, though not printing the additional information yet.
However in case there is no NAT_T support in kernel still consider them
to be an error.
bz [Sat, 19 May 2012 22:50:22 +0000 (22:50 +0000)]
MFC r234643:
Do not toggle IFCAP_TSO4 if we would also do TSO6. Given the driver does
not currently announce/support TSO6 that cannot happen. Clean it up anyway
for consistency.
bz [Sat, 19 May 2012 22:16:12 +0000 (22:16 +0000)]
MFC r234620:
If we pass down 64k - L2 hdr size + 1 to 64K L3+ data adding an ether
header will make the data go over the 64k limits announced to busdma as
maxsize and the transaction will fail.
With TSO this can result in a TCP regression due to the lost packet.
According to the data sheets ixgbe(4) 82598 and 82599 can handle up to
256k so increase the maximum.
Reported by: Jon Kåre Hellan, UNINETT (jon.kare.hellan uninett.no)
Tested by: Jon Kåre Hellan, UNINETT (jon.kare.hellan uninett.no)
bz [Sat, 19 May 2012 18:33:08 +0000 (18:33 +0000)]
MFC r231767:
Fix PAWS (Protect Against Wrapped Sequence numbers) in cases when
hz >> 1000 and thus getting outside the timestamp clock frequenceny of
1ms < x < 1s per tick as mandated by RFC1323, leading to connection
resets on idle connections.
Always use a granularity of 1ms using getmicrouptime() making all but
relevant callouts independent of hz.
Use getmicrouptime(), not getmicrotime() as the latter may make a jump
possibly breaking TCP nfsroot mounts having our timestamps move forward
for more than 24.8 days in a second without having been idle for that
long.
bz [Sat, 19 May 2012 14:32:47 +0000 (14:32 +0000)]
MFC r233713:
Remove the magic mfi_array is 288 bytes and just use the
sizeof the array since it is not 288 bytes.
Change reporting of a "SYSTEM" disk to "JBOD" to match
LSI MegaCli and firmware reporting.
This means that mfiutil command to "create jbod" is now a
little confusing since a RAID per drive is not really what
LSI defines JBOD to be. This should be fixed in the future
and support added to really create LSI JBOD and enable that
feature on cards that support it.
sbruno [Fri, 18 May 2012 19:48:33 +0000 (19:48 +0000)]
MFC of head thunderbolt support for mfi(4)
r233711 -- IFV head_mfi into head for initial thunderbolt support
r233768 -- atomic_t --> mfi_atomic
r233805 -- fix tinderbuild, move megasas_sge to mfivar.h
r233877 -- remove atomic.h from includes
r235014 -- fix reading of sector >= 2^32 or 2^21, repair RAID handling
r235016 -- style(9)
r235040 -- fix returns from mfi_tbolt_sync_map_info()
r235318 -- repair panic on PAE i386
r235321 -- repair the repair of panics on PAE i386
jhb [Fri, 18 May 2012 18:53:28 +0000 (18:53 +0000)]
MFC 234186
If a linker file contains at least one module, but all of the modules
fail to load (the MOD_LOAD event fails) during a kldload(2), unload the
linker file and fail the kldload(2) with ENOEXEC.
sbruno [Fri, 18 May 2012 18:29:44 +0000 (18:29 +0000)]
MFC r235210
Modify the binding of queues to attach to as many CPUs
as possible when using more than one igb(4) adapter. This
means that queues will not be bound to the same CPUs if
there are more CPUs availble.
This is only applicable to a system that has multiple interfaces.
jhb [Thu, 17 May 2012 15:45:00 +0000 (15:45 +0000)]
MFC 233708,234154:
Fix a few issues with transmit handling in em(4) and igb(4):
- Do not define the foo_start() methods or set if_start in the ifnet if
multiq transmit is enabled. Also, set if_transmit and if_qflush before
ether_ifattach rather than after when multiq transmit is enabled. This
helps to ensure that the drivers never try to mix different transmit
methods.
- Properly restart transmit during resume. igb(4) was not restarting it
at all, and em(4) was restarting even if the link was down and was
calling the wrong method if multiq transmit was enabled.
- Remove all the 'more' handling for transmit completions. Transmit
completion processing does not have a processing limit, so it always
runs to completion and never has more work to do when it returns.
Instead, the previous code was returning 'true' anytime there were
packets in the queue that weren't still in the process of being
transmitted. The effect was that the driver would continuously
reschedule a task to process TX completions in effect running at 100%
CPU polling the hardware until it finished transmitting all of the
packets in the ring. Now it will just wait for the next TX completion
interrupt.
- Restart packet transmission when the link becomes active.
- Fix the MSI-X queue interrupt handlers to restart packet transmission if
there are pending packets in the relevant software queue (IFQ or buf_ring)
after processing TX completions. This is the root cause for the OACTIVE
hangs as if the MSI-X queue handler drained all the pending packets from
the TX ring, nothing would ever restart it. As such, remove some
previously-added workarounds to reschedule a task to poll the TX ring
anytime OACTIVE was set.
- Use a dedicated task to handle deferred transmits from the if_transmit
method instead of reusing the existing per-queue interrupt task.
Reusing the per-queue interrupt task could result in both an interrupt
thread and the taskqueue thread trying to handle received packets on a
single queue resulting in out-of-order packet processing.
- Call ether_ifdetach() earlier in igb_detach().
- Drain tasks and free taskqueues during igb_detach().
jhb [Wed, 16 May 2012 21:07:19 +0000 (21:07 +0000)]
MFC 234152:
Allow device_busy() and device_unbusy() to be invoked while a device is
being attached. This is implemented by adding a new DS_ATTACHING state
while a device's DEVICE_ATTACH() method is being invoked. A driver is
required to not fail an attach of a busy device. The device's state will
be promoted to DS_BUSY rather than DS_ACTIVE() if the device was marked
busy during DEVICE_ATTACH()
jhb [Wed, 16 May 2012 20:05:31 +0000 (20:05 +0000)]
MFC 218221,233709,233781,233793:
- Use a dedicated taskqueue with a thread that runs at a software-interrupt
priority for the periodic polling of the machine check registers.
- Don't malloc() new MCA records for machine checks logged due to a
CMCI or MC# exception. Instead, use a pre-allocated pool of records.
When a CMCI or MC# exception fires, schedule a task to refill the pool.
The pool is sized to hold at least one record per available machine
bank, and one record per CPU. This should handle the case of all CPUs
triggering a single bank at once as well as the case a single CPU
triggering all of its banks. The periodic scans still use malloc()
since they are run from a safe context.
- Make machine check exception logging more readable. On newer Intel systems,
an uncorrected ECC error tends to fire on all CPUs in a package
simultaneously and the current printf hacks are not sufficient to make
the messages legible. Instead, use the existing mca_lock spinlock to
serialize calls to mca_log() and change the machine check code to panic
directly when an unrecoverable error is encoutered rather than falling
back to a trap_fatal() call in trap() (which adds nearly a screen-full of
logging messages that aren't useful for machine checks).
delphij [Wed, 16 May 2012 20:02:29 +0000 (20:02 +0000)]
MFC r234244:
The scandir(3) function expects fourth parameter, compar, be in type of:
int (*compar)(const struct dirent **, const struct dirent **)
The current code defines sortq() to accept two void *, then cast them
to const struct dirent **. Because the code does not really need this
cast, we can eliminate the casts by changing the function prototype
to match scandir(3) expectation.
jimharris [Wed, 16 May 2012 00:03:58 +0000 (00:03 +0000)]
MFC r235043:
Fix off-by-one error in sati_inquiry_block_device_translate_data(). Bug would
result in INQUIRY VPD 0x81 to SATA devices to return only 63 bytes of data
instead of 64 during SCSI/ATA translation.