adrian [Sun, 28 Jul 2013 04:53:00 +0000 (04:53 +0000)]
Refactor the VAP transmit path code into a utility function that both
the normal and the mesh transmit paths can use.
The API is a bit horrible because it both consumes the mbuf and frees
the node reference regardless of whether it succeeds or not.
It's a hold-over from how the code behaves; it'd be nice to have it
not free the node reference / mbuf if TX fails and let the caller
decide what to do.
DTrace: re-apply r249426 now that the underlying issues have been solved.
Merge change from illumos:
3519 DTrace fails to resolve const types from fbt
3520 dtrace internal error -- token type 316 is not a valid D
compilation token
3521 clean up dtrace unit tests
DTrace: re-merge remainder of r249367 (original from Illumos).
Bring back some important fixes from Illumos:
3022 DTrace: keys should not affect the sort order when sorting by value
3023 it should be possible to dereference dynamic variables
3024 D integer narrowing needs some work
We particularly avoid the LD_NOLAZYLOAD changes that Illumos made
as those don't apply to FreeBSD and were causing problems in
interactive mode.
Synchronize device cache on close only if there were some write operations.
While these operations are not really needed otherwise, at least for SCSI
they may cause extra errors if some other initiator holds write exclusive
reservation on the LUN (SYNCHRONIZE CACHE handled as "write" operation).
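A minimal sketch of the "flush only if something was written" idea. The Device class and its fields are hypothetical stand-ins for illustration, not the driver's actual state or API.

```python
class Device:
    """Hypothetical stand-in for a disk peripheral; not the real driver API."""

    def __init__(self):
        self.dirty = False     # set once any write has been issued
        self.sync_count = 0    # how many cache synchronizations were sent

    def write(self, data):
        # Any write marks the device as needing a flush on close.
        self.dirty = True

    def close(self):
        # Skip the flush when nothing was written, so a read-only open
        # never issues a command that a reservation could reject.
        if self.dirty:
            self.sync_count += 1
            self.dirty = False
```

With this shape, opening and closing a device without writing sends no synchronization at all.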
Use kern_ioctl() rather than ioctl() for testing the FBT provider, since the
latter doesn't exist in FreeBSD. All the tests under fbtprovider pass now.
At some point after stable/7 the ACPI and ISA interfaces to the IPMI controller
no longer have the parent in the device tree. This causes the identify
function in ipmi_isa.c to attempt to probe and poke at the ISA IPMI interface.
Move the check for ipmi_attached out of the ipmi_isa_attach function and into
the ipmi_isa_identify function. Remove the check of the device tree for
ipmi devices attached.
This probing appears to make Broadcom management firmware on Dell machines
crash and emit NMI EISA warnings at various times requiring power cycles
of the machines to restore.
Bump MAX_TIMEOUT to 6 seconds as a hack for super slow IPMI interfaces that
need longer to respond to our initial probes on startup.
Tested on Dell R410, R510, R815, HP DL160G6
This is MFC candidate for 9.2R
Reviewed by: peter
MFC after: 2 weeks
Sponsored by: Yahoo! Inc.
marius [Sat, 27 Jul 2013 15:28:31 +0000 (15:28 +0000)]
- Set the System Identifier in the Primary Volume Descriptor to FreeBSD
rather than NetBSD.
- Correctly set the Expiration Time in the Primary Volume Descriptor;
according to ISO 9660 8.4.26.1 unspecified date and time are denoted
by the digit 0 in RBP 1 to 16 but the number 0 in RBP 17. [1]
- Merge iso9660_rrip.c rev. 1.11 from NetBSD: name_len should be read
as unsigned byte. [2]
Note: This is according to ISO 9660 9.1.10.
- Rock Ridge TF entries should use a length of 5, because after the 4
bytes of generic SUSP header there is one byte of flags. See typedef
of ISO_RRIP_TF in iso9660_rrip.h. [1]
Submitted by: Thomas Schmitt [1]
Obtained from: NetBSD [2]
MFC after: 3 days
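As an illustration of the Expiration Time item above, the following sketch builds the 17-byte "unspecified" date and time value as the cited clause describes it. The function name is made up for this example and is not taken from makefs.

```python
def unspecified_volume_datetime() -> bytes:
    """Build an 'unspecified' 17-byte volume date/time field (illustrative)."""
    digits = b"0" * 16     # RBP 1 to 16: the digit 0, i.e. ASCII 0x30
    offset = bytes([0])    # RBP 17: the number 0, i.e. a binary zero byte
    return digits + offset
```

The distinction matters because filling all 17 bytes with the same value gets either the digits or the final byte wrong.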
Introduce 3 seconds timeout on `graid stop` command (mostly with -f flag).
Since completion waiting goes in g_event thread, it may cause GEOM deadlock
if consumer on top (for example, ZFS) uses g_event thread for closing.
jeff [Fri, 26 Jul 2013 23:22:05 +0000 (23:22 +0000)]
Improve page LRU quality and simplify the logic.
- Don't short-circuit aging tests for unmapped objects. This biases
against unmapped file pages and transient mappings.
- Always honor PGA_REFERENCED. We can now use this after soft busying
to lazily restart the LRU.
- Don't transition directly from active to cached bypassing the inactive
queue. This frees recently used data much too early.
- Rename actcount to act_delta to be more consistent with use and meaning.
Add support for packet-sniffing tracers to cxgbe(4). This works with
all T4 and T5 based cards and is useful for analyzing TSO, LRO, TOE, and
for general purpose monitoring without tapping any cxgbe or cxl ifnet
directly.
Tracers on the T4/T5 chips provide access to Ethernet frames exactly as
they were received from or transmitted on the wire. On transmit, a
tracer will capture a frame after TSO segmentation, hw VLAN tag
insertion, hw L3 & L4 checksum insertion, etc. It will also capture
frames generated by the TCP offload engine (TOE traffic is normally
invisible to the kernel). On receive, a tracer will capture a frame
before hw VLAN extraction, runt filtering, other badness filtering,
before the steering/drop/L2-rewrite filters or the TOE have had a go at
it, and of course before sw LRO in the driver.
There are 4 tracers on a chip. A tracer can trace only in one direction
(tx or rx). For now cxgbetool will set up tracers to capture the first
128B of every transmitted or received frame on a given port. This is a
small subset of what the hardware can do. A pseudo ifnet with the same
name as the nexus driver (t4nex0 or t5nex0) will be created for tracing.
The data delivered to this ifnet is an additional copy made inside the
chip. Normal delivery to cxgbe<n> or cxl<n> will be made as usual.
/* watch cxl0, which is the first port hanging off t5nex0. */
# cxgbetool t5nex0 tracer 0 tx0 (watch what cxl0 is transmitting)
# cxgbetool t5nex0 tracer 1 rx0 (watch what cxl0 is receiving)
# cxgbetool t5nex0 tracer list
# tcpdump -i t5nex0 <== all that cxl0 sees and puts on the wire
If you were doing TSO, a tcpdump on cxl0 may have shown you ~64K
"frames" with no L3/L4 checksum but this will show you the frames that
were actually transmitted.
adrian [Fri, 26 Jul 2013 19:41:13 +0000 (19:41 +0000)]
Break out the static, global LACP debug options into a per-lagg unit
sysctl tree.
* Create a net.link.lagg.X.lacp node
* Add a debug node under that for tx_test and rx_test
* Add lacp_strict_mode, defaulting to 1
tx_test and rx_test are still a bitmap of unit numbers for now.
At some point it would be nice to create child nodes of the lagg bundle
for each sub-interface, and then populate those with various knobs
and statistics.
jeff [Fri, 26 Jul 2013 19:06:14 +0000 (19:06 +0000)]
- Use kmem_malloc() rather than kmem_alloc() for GDT/LDT/tss allocations etc.
This eliminates some unusual uses of that API in favor of more typical
uses of kmem_malloc().
Make path matching in devfs rules consistent and sane (and safer)
Before this change path matching had the following features:
- for device nodes the patterns were matched against full path
- in the above case '/' in a path could be matched by a wildcard
- for directories and links only the last component was matched
So, for example, a pattern like 're*' could match the following entries:
- re0 device
- responder/u0 device
- zvol/recpool directory
Although it was possible to work around this behavior (once it was spotted
and understood), it was very confusing and contrary to documentation.
Now we always match a full path for all types of devfs entries (devices,
directories, links) and a '/' has to be matched explicitly.
This behavior follows the shell globbing rules.
This change was originally developed by Jaakko Heinonen.
Many thanks!
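The new matching rule can be sketched roughly as follows. This is only an approximation written for illustration: it handles '*' and '?' but not bracket expressions, and the helper names are hypothetical rather than anything in devfs.

```python
import re


def glob_to_regex(pattern):
    """Translate a simple glob into a regex where wildcards never match '/'."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^/]*")    # any run of characters except '/'
        elif ch == "?":
            parts.append("[^/]")     # exactly one character except '/'
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def path_matches(pattern, path):
    """Match the pattern against the full path, as the new behavior does."""
    return glob_to_regex(pattern).fullmatch(path) is not None
```

Under this rule 're*' matches re0 but neither responder/u0 nor zvol/recpool; the latter needs a pattern such as 'zvol/re*'.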
marius [Fri, 26 Jul 2013 14:23:25 +0000 (14:23 +0000)]
- Once we have shifted arguments thrice, base-bits-dir is $1 rather than $4.
Introduce $BASEBITSDIR for clarity and in order to avoid repeating this
mistake in the future. Fixing this ensures that we pick up the newly built
boot code and loader native to the target, which is especially relevant
when cross-building release images.
- It is pointless to specify an endianness for ISO 9660 images so strip that.
marius [Fri, 26 Jul 2013 14:22:03 +0000 (14:22 +0000)]
Ensure that makefs.h is included when using ufs_bswap.h so the FFS_EI macro
is picked up when defined. Previously, ffs_subr.c was always built without
support for opposite endianness as it doesn't include makefs.h on its own.
Assume that all Apple products using interface class 255, subclass 253
and protocol 1 are USB ethernet adapters. This avoids keeping and updating
the product list every now and then. This patch will add support for the
USB ethernet interface found in the iPad.
GCC can generate bogus dwarf attributes with DW_AT_byte_size
set to 0xFFFFFFFF.
The issue was originally detected in NetBSD but the fix has been
adapted for portability and to avoid compiler warnings.
Enhance the description of NOTE_TRACK:
- NOTE_TRACK has never triggered a NOTE_TRACK event from the parent pid.
If NOTE_FORK is set, the listener will get a NOTE_FORK event from
the parent pid, but not a separate NOTE_TRACK event.
- Explicitly note that the event added to monitor the child process
preserves the fflags from the original event.
- Move the description of NOTE_TRACKERR under NOTE_TRACK as it is not a
bit for the user to set (which is what this list purports to be).
Also, explicitly note that if an error occurs, the NOTE_CHILD event
will not be generated.
Set the device description after we call uart_probe(). In uart_probe()
we call device-specific probe functions, which can (and typically will)
set the device description based on low-level device probe information.
In the end we never actually used the device description that we so
carefully maintained in the PCI match table. By setting the device
description after we call uart_probe(), we'll print the more user-
friendly description by default.
Avoid trashing IP fragments:
- Only enable UDP/TCP hardware checksums if CSUM_UDP or CSUM_TCP is set.
- Only enable IP hardware checksums if CSUM_IP is set.
ext2fs: Don't assume that on-disk format of a directory is the same
as in <sys/dirent.h>
ext2_readdir() has always been very fs specific and different
with respect to its ufs_ counterpart. Recent changes from UFS
have made it possible to share more closely the implementation.
MFUFS r252438:
Always start parsing at DIRBLKSIZ aligned offset, skip first entries if
uio_offset is not DIRBLKSIZ aligned. Return EINVAL if buffer is too
small for single entry.
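The alignment step described above amounts to rounding the offset down to a block boundary and remembering how much to skip. A small sketch, assuming a block size of 512 purely for illustration (the real DIRBLKSIZ depends on the filesystem):

```python
DIRBLKSIZ = 512  # illustrative value only; an assumption for this example


def aligned_start(uio_offset, blksize=DIRBLKSIZ):
    """Return (start, skip): where to begin parsing and how many bytes
    of leading entries lie before the requested offset."""
    start = uio_offset - (uio_offset % blksize)   # round down to a block
    return start, uio_offset - start
```

Parsing always begins at start, and entries that end at or before uio_offset are walked over without being returned.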
Replace the RESET blocks with regular functions and a reset() function that
calls them all.
This code generation tool is unusual and does not appear to provide much
benefit. I do not think isolating the knowledge about which modules need to
be reset is worth an almost 500-line build tool and wider scope for
variables used by the reset functions.
Also, relying on reset functions is often wrong: the cleanup should be done
in exception handlers so that no stale state remains after 'command eval'
and the like.
These cleanup operations are not needed because they are already performed
after an optimized command substitution (whether there was an error or not).
Match function definition to declaration and call-site.
SVN r95378 refactored ahc_9005_subdevinfo_valid out into a separate
function but swapped the vendor/subvendor and device/subdevice pairs of
the parameters.
Found by: Coverity Prevent, CID 744931
Reviewed by: gibbs
snd_ds1(4): Fix order of arguments for stereo/16bit mode
This function is called 4 times in this file, with swapped parameter
ordering. Fix the function definition instead of all the call sites.
16bit/stereo or 8bit/mono playback is unaffected and was probably
working fine before, this should fix 16bit/mono and 8bit/stereo
playback.
Clear entire map structure including locks so that the
locks don't accidentally appear to have been already
initialized.
In particular, this fixes a consistent kernel crash on
armv6 with:
panic: lock "vm map (user)" 0xc09cc050 already initialized
that appeared with r251709.
per style(9):
Kernel include files (i.e. sys/*.h) come first; normally, include
<sys/types.h> OR <sys/param.h>, but not both. <sys/types.h> includes
<sys/cdefs.h>, and it is okay to depend on that.
Further restrict the MAC addresses that we use for UUID generation
to those that are universally administered. While it is possible to
add locally administered MAC addresses, it's unclear whether those
are (expected) to be more unique than random multicast MAC addresses
or not.
With many U-Boot configurations assigning fixed and non-official MAC
addresses to ethernet ports and without setting the 'X' flag, this
change may have very little value in the embedded (development)
space. Uniqueness of the universally administered addresses is non-
existent on the (H/W) bench and questionable under the (S/W) desk.
In short: this change is aimed at production environments...
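For reference, whether a MAC address is universally or locally administered is encoded in the second least significant bit of its first octet. The check below is a sketch with a made-up function name, not the code from the UUID generator, and it deliberately tests only that one bit.

```python
def is_universally_administered(mac: bytes) -> bool:
    """True if the U/L bit (0x02 of the first octet) is clear."""
    if len(mac) != 6:
        raise ValueError("expected a 6-byte MAC address")
    return (mac[0] & 0x02) == 0
```

For example, 00:1b:21:aa:bb:cc passes the check while 02:1b:21:aa:bb:cc, which has the locally administered bit set, does not.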
Fix bug in universe where if upgrade_checks wants a new make,
it gets built 16 times in parallel in the same location.
While we are at it, until we finish getting rid of fmake,
be explicit about the make we want to use, thus avoid the problem
of the temp make being the wrong version.
In uuid_ether_add(), avoid false positives due to the limited type
used to hold the sum of the bytes of the MAC address. While here,
rename the variable that holds the sum from 'c' to 'sum'.
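To see how a narrow accumulator can produce a false positive, assume the sum of the address bytes is used to reject an all-zero address (an assumption made for this sketch, as are the function names). An 8-bit sum wraps around, so some non-zero addresses also sum to zero:

```python
def looks_all_zero_narrow(mac: bytes) -> bool:
    """Sum the bytes in an emulated 8-bit accumulator (the problematic form)."""
    total = 0
    for b in mac:
        total = (total + b) & 0xFF   # wraps at 256 like an 8-bit type
    return total == 0


def looks_all_zero_wide(mac: bytes) -> bool:
    """Sum the bytes without wrapping; only all-zero input sums to zero."""
    return sum(mac) == 0
```

The address 00:80:80:00:00:00 sums to 256, which the narrow version wraps to zero and wrongly treats as an all-zero address.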
rename scheduler->swapper and SI_SUB_RUN_SCHEDULER->SI_SUB_LAST
Also directly call swapper() at the end of mi_startup instead of
relying on swapper being the last thing in sysinits order.
Rationale:
- "RUN_SCHEDULER" was misleading, scheduling already takes place at that stage
- "scheduler" was misleading, the function swaps in the swapped out processes
- another SYSINIT(SI_SUB_RUN_SCHEDULER, SI_ORDER_ANY) could never be
invoked depending on its relative order with scheduler; this was not obvious
and the bug actually used to exist
Reviewed by: kib (earlier version)
MFC after: 14 days
zfs: move vnode creation from zfs_znode_cache_constructor to zfs_znode_alloc
All other places where a znode is allocated do not need z_vnode at all.
These are:
- zfs_create_share_dir
- zfs_create_fs
This change ensures two things:
- VN_LOCK_ASHARE is not erroneously called for VFIFO vnodes
- vn_lock is called on a fully constructed vnode with correct v_ops
The change also allows zfs_znode_cache_constructor to be a normal
kmem_cache constructor again (as it is in upstream).
This avoids a problem where zfs_znode_cache_destructor
may be called on un-constructed znodes.
Decouple the UUID generator from network interfaces by having MAC
addresses added to the UUID generator using uuid_ether_add(). The
UUID generator keeps an arbitrary number of MAC addresses, under
the assumption that they are rarely removed (= uuid_ether_del()).
This achieves the following:
1. It brings us closer to having the network stack as a loadable
module.
2. It allows the UUID generator to filter MAC addresses for best
results (= highest chance of uniqueness).
3. MAC addresses can come from anywhere, irrespective of whether
it's used for an interface or not.
A side-effect of the change is that when no MAC addresses have been
added, a random multicast MAC address is created once and re-used if
needed. Previously, when a random MAC address was needed, it was
created for every call. Thus, a change in behaviour is introduced
for when no MAC addresses exist.
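The side-effect described above can be pictured with the following sketch. The NodeTable class and its method names are hypothetical, and returning the first registered address is an arbitrary choice made for the example, not a statement about the kernel's selection policy.

```python
import os


class NodeTable:
    """Illustrative registry of MAC addresses for a UUID generator."""

    def __init__(self):
        self._macs = []          # addresses added by callers
        self._fallback = None    # random address, created at most once

    def ether_add(self, mac):
        if mac not in self._macs:
            self._macs.append(bytes(mac))

    def ether_del(self, mac):
        if mac in self._macs:
            self._macs.remove(mac)

    def node(self):
        if self._macs:
            return self._macs[0]
        if self._fallback is None:
            rnd = bytearray(os.urandom(6))
            rnd[0] |= 0x01       # set the multicast bit
            self._fallback = bytes(rnd)
        return self._fallback    # reused on every later call
```

Calling node() repeatedly with no registered addresses returns the same random multicast address each time, which is the behavioural change noted in the commit message.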
jeff [Tue, 23 Jul 2013 22:52:38 +0000 (22:52 +0000)]
- Correct a stale comment. We don't have vclean() anymore. The work is
done by vgonel() and destroy_vobject() should only be called once from
VOP_INACTIVE().
Fix a bug introduced in r252646 that causes a page with the PG_PTE_PAT bit set
to be interpreted as a superpage. This is because PG_PTE_PAT is at the same
bit position in PTE as PG_PS is in a PDE.
This caused a number of regressions on amd64 systems: panic when starting
X applications, freeze during shutdown etc.
Pointy hat to: me
Tested by: gperez@entel.upc.edu, joel, dumbbell
Reviewed by: kib
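The clash is easy to see with the bit values written out. In the sketch below the two constants are the x86 bit positions (bit 7 in both entry types); the is_superpage() helper is hypothetical and only illustrates why the level of the entry has to be known before testing the bit. It is not the actual fix.

```python
PG_PS = 0x080        # page-size bit, meaningful in a page directory entry
PG_PTE_PAT = 0x080   # PAT bit, meaningful in a 4KB page table entry


def is_superpage(entry, is_pde):
    """Only a PDE with PG_PS set describes a superpage; the same bit in a
    PTE is the PAT index bit and must not be interpreted as PG_PS."""
    return is_pde and (entry & PG_PS) != 0
```

Testing the bit without checking the entry type would classify any 4KB mapping with the PAT bit set as a superpage.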
Remove a large part of struct ipsecstat. Only a few fields of this
structure are used, and they already have equivalent fields in struct
newipsecstat, which was introduced with FAST_IPSEC and then merged
together with the old ipsecstat structure.
This fixes a kernel stack overflow on some architectures after migrating
ipsecstat to PCPU counters.
Add a new flag (ETHERSWITCH_VID_VALID) to say which vlangroups are in use.
This fixes the case where etherswitch prints the information of the port 0
vlan group (in port based vlan mode) with no member ports.
Add the ETHERSWITCH_VID_VALID support to ip17x driver.
Add the ETHERSWITCH_VID_VALID support to rt8366 driver.
arswitch doesn't need to be updated as it doesn't support vlan management
yet.
Add isnan() and isinf() to the global namespace in libstdc++'s <cmath>.
The standard (n3242, section 17.6.1.1, paragraph 4) says that, because these are
declared as macros in the C specification (even though they are
implemented as functions in the C++ library) they should be in the global
namespace.
A surprising number of configure checks rely on this. It was broken by recent
cleanups to math.h.
In pci_cfgregread() and pci_cfgregwrite(), multiplex the domain and
bus number into the bus argument. The bus number occupies the least
significant 8 bits. The PCI domain occupies the most significant 24
bits.
On the Altix 350, the PCI domain is a required parameter, but
changing the prototype of the pci_cfgreg*() functions to include a
separate domain argument has wide-spread consequences across the
supported architectures. We'd be changing a known interface.
Multiplexing is an acceptable kluge to give us what we need with
manageable impact. Note that the PCI bus number fits in 8 bits,
so the multiplexing of the domain is a backward compatible change.
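The encoding described above can be sketched as a pair of helpers. The function names are invented for the example; only the bit layout (bus in the low 8 bits, domain in the upper 24) comes from the commit message.

```python
def mux_bus(domain, bus):
    """Pack a PCI domain and bus number into one 32-bit bus argument."""
    if not 0 <= bus <= 0xFF:
        raise ValueError("bus must fit in 8 bits")
    if not 0 <= domain <= 0xFFFFFF:
        raise ValueError("domain must fit in 24 bits")
    return (domain << 8) | bus


def demux_bus(arg):
    """Split a multiplexed bus argument back into (domain, bus)."""
    return (arg >> 8) & 0xFFFFFF, arg & 0xFF
```

Because domain 0 leaves the upper bits clear, existing callers that pass a plain bus number keep working unchanged.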
In ia64_mca_init(), don't limit the allocation of the info block to
fall within the first 256MB of memory. The origin/reason for that
limitation is not known, but it's not believed to be required for
proper initialization. What is known is that the Altix 350 does not
have physical memory at that address (by virtue of the address space
bits).
Keep the boundary at 256MB so that the info block will be covered
by a single direct-mapped translation.
While here, change the flags to M_NOWAIT to eliminate confusion. It
does not change the behaviour of contigmalloc(). What it does is
make the flags argument explicitly say what the actual behaviour
is.
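The 256MB boundary mentioned above means the allocated block must not straddle a multiple of 256MB. A small sketch of that condition, with a hypothetical helper name:

```python
BOUNDARY = 256 * 1024 * 1024   # 256MB, as in the commit message


def crosses_boundary(addr, size, boundary=BOUNDARY):
    """True if [addr, addr + size) spans a multiple of the boundary."""
    return addr // boundary != (addr + size - 1) // boundary
```

A block that passes this check lies entirely within one 256MB-aligned region, regardless of how high in memory that region is.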