will [Thu, 18 Sep 2014 17:28:21 +0000 (17:28 +0000)]
Start the process of cleaning up FreeBSD's firewire driver.
sys/dev/firewire/firewire.c:
sys/dev/firewire/firewire.h:
sys/dev/firewire/firewirereg.h:
sys/dev/firewire/fwcrom.c:
sys/dev/firewire/fwdev.c:
sys/dev/firewire/fwdma.c:
sys/dev/firewire/fwmem.c:
sys/dev/firewire/fwohci.c:
sys/dev/firewire/fwohci_pci.c:
sys/dev/firewire/fwohcivar.h:
sys/dev/firewire/if_fwe.c:
sys/dev/firewire/if_fwip.c:
sys/dev/firewire/sbp.c:
sys/dev/firewire/sbp_targ.c:
Unifdef the code, removing support for DragonflyBSD
and FreeBSD prior to version 5.
adrian [Thu, 18 Sep 2014 16:20:17 +0000 (16:20 +0000)]
Fix the handling of EOP in status descriptors for if_igb(4) and don't
double-free mbufs.
Like ixgbe(4) chipsets, EOP is only set on the final descriptor
in a chain of descriptors. So, to free the whole list of descriptors,
we should free the current slot _and_ the assembled list of descriptors
that make up the fragment list.
The existing code was setting discard once it saw EOP + an error status;
it then freed all the subsequent descriptors until the next EOP. That's
totally the wrong order.
- Use if_inc_counter() to increment various counters.
- Do not ever set a counter to a value. For those counters
that we don't increment, but return directly from hardware
create cases in if_get_counter() method.
will [Thu, 18 Sep 2014 15:37:53 +0000 (15:37 +0000)]
bpobj_iterate_impl(): Close a refcount leak iterating on a sublist.
If bpobj_space() returned non-zero here, the sublist would have been
left open, along with the bonus buffer hold it requires. This call
does not invoke any calls to bpobj_close() itself.
This bug doesn't have any known vector, but was found on inspection.
MFC after: 1 week
Sponsored by: Spectra Logic
Affects: All ZFS versions starting 21 May 2010 (illumos cde58dbc)
MFSpectraBSD: r1050998 on 2014/03/26
While not too late rename 'ifnet_counter' to 'ift_counter'. One of the
imporant moments that we discussed with Marcel and Anuranjan was that
a converted driver should return false for 'grep ifnet if_driver.c' :)
Add a function to set if_get_counter method for an ifnet. To be used
in the drivers that are already converted to "Juniper drvapi". This
can be revisited in future.
will [Thu, 18 Sep 2014 14:09:42 +0000 (14:09 +0000)]
zfs_setprop_error(): Handle errno value E2BIG.
This errno value is emitted by dsl_props_set_check() in
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_prop.c, and
is used to mean that the property value is too long. For the record,
the maximum length is ZAP_MAXVALUELEN, which is 8*1024 bytes.
Instead of claiming an unknown error (and abort()ing), provide
something more specific to the scenario involved. As far as I
can tell, E2BIG is not emitted for any other scenario.
MFC after: 1 week
Sponsored by: Spectra Logic
Affects: All ZFS versions starting 27 Feb 2009 (illumos ccba0801)
This change modified the value returned by
dsl_props_set_check(), so that it can distinguish between
a name that's too long and a value that's too long, but
libzfs was not updated accordingly.
MFSpectraBSD: r1051499 on 2014/03/28 11:07:59
will [Thu, 18 Sep 2014 14:02:25 +0000 (14:02 +0000)]
Fix an assert to tolerate spare parents with more than 2 children.
This can occur if a spare is being spared, which would yield three
children: the original pool drive, the previous spare, and the spare
that is replacing it.
MFC after: 1 week
Sponsored by: Spectra Logic
Affects: All ZFS versions starting 7 Jun 2006 (illumos 94de1d4c)
MFSpectraBSD: r668345 on 2013/06/04 17:10:43
While not too late rename if_get_counter_compat() to if_get_counter_default().
The compat counters will go away, but the function will remain in its place,
and in all places where it is going to be called.
- Use NULL instead of 0 for fpcurthread.
- Note the quirk with the interrupt enabled state of the dna handler.
- Use just panic() instead of printf() and panic(). Print tid instead
of pid, the fpu state is per-thread.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
For consistency with the shared header file (and to avoid confusion
with mbufs normally called *m in one place), rename the function
arguments to "mem".
This is a non-functional change.
Reviewed by: gnn, eric.joyner intel.com
MFC after: 3 days
Implement most of timer_{create,settime,gettime,getoverrun,delete}
for amd64/linux32. Fix the entirely bogus (untested) version from
r161310 for i386/linux using the same shared code in compat/linux.
It is unclear to me if we could support more clock mappings but
the current set allows me to successfully run commercial
32bit linux software under linuxolator on amd64.
Revert r271735. The comment is absolutely correct, we do not support 802.1p priority tagging. I got confused with the packet tagged and packet to be tagged.
r258695 introduces a sanity check for makefs in order to verify that
the minimum image size specified is always less than the maximum
image size. If makefs(1) is invoked specifying minimum image size,
but not maximum one, the program exits with an error. Example:
# sudo -E makefs -M 538968064 -B be /home/davide/disk.img $DESTDIR
makefs: `/home/davide/tftproot/mips' minsize of 538968064 rounded up
to ffs bsize of 8192 exceeds maxsize 0. Lower bsize, or round the
minimum and maximum sizes to bsize.
Assert then that minsize < maxsize iff maxsize is specified.
This change allows me to build MIPS images using makefs(1) and following
what specified in the wiki again.
The lagg(4) interface is based on trunk(4) interface from OpenBSD.
The FreeBSD is the only system that has the FEC protocol, that is a simple alias
to loadbalance protocol and does not implement the ancient Cisco FEC standard.
From now on, we remove the fec protocol from the documentation and keep the FEC
code only for compatibility.
Phabric: D539
Reviewed by: glebius, thompsa
Approved by: glebius
Sponsored by: QNAP Systems Inc.
will [Thu, 18 Sep 2014 02:01:36 +0000 (02:01 +0000)]
Fix a kernel panic when unloading isp(4).
In the current implementation, the isp_kthread() threads never exit.
The target threads do have an exit mode from isp_attach(), but it is
not invoked from isp_detach().
Ensure isp_detach() notifies threads started for each channel, such
that they exit before their parent device softc detaches, and thus
before the module does. Otherwise, a page fault panic occurs later in:
For isp_kthread() (and isp(4) target threads), td->td_wmesg references
now-unmapped memory after the module has been unloaded. These threads
are typically msleep()ing at the time of unload, but they could also
attempt to execute now-unmapped code segments.
will [Thu, 18 Sep 2014 01:57:36 +0000 (01:57 +0000)]
Root the lib32 object tree under the overall object tree.
This enables a common root directory for all object files for a given tree,
which eases sharing a common MAKEOBJDIRPREFIX, and cleaning up of object trees.
In particular, one can simply (from the source directory) rm -rf /usr/obj$(pwd)
to destroy all object files for it. Or to copy/sync files, etc.
The vm_mmap_cdev() explicitely converts absence of both MAP_SHARED and
MAP_PRIVATE flags to MAP_SHARED. Apparently, some code in tree, in
particular, libgeom, relied on this behaviour, see r271721. For
regular file types, the absence of the flags is interpreted as
MAP_PRIVATE, and libc nlist used this (fixed in r271723).
Allow the implicit flags for legacy binaries. Bump __FreeBSD_version
to get the ABI note on new binaries to check for in mmap code.
Remove the test for presence of one of the MAP_ANON, MAP_SHARED or
MAP_PRIVATE flags before fget_mmap(). For MAP_ANON, we already verify
that passed fd == -1. For fd != -1, test after fget_mmap() (for newer
binaries) covers the case.
Reported by: bdrewery, pho
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
- Remove mention of MAP_INHERIT. It hasn't been implemented for thirteen
years.
- Remove mention of unimplemented MAP_SWAP. There are no future plans to
implement it.
This feature is required by Mesa 9.2+. Without this, a GL application
crashes with the following message:
# glxinfo
name of display: :0.0
Gen6+ requires Kernel 3.6 or later.
Assertion failed: (ctx->Version > 0), function handle_first_current,
file ../../src/mesa/main/context.c, line 1498.
Abort (core dumped)
Now, Mesa 10.2.4 and 10.3-rc3 works fine:
# glxinfo
name of display: :0
display: :0 screen: 0
direct rendering: Yes
...
OpenGL renderer string: Mesa DRI Intel(R) 965GM
OpenGL version string: 2.1 Mesa 10.2.4
...
The code was imported from Linux 3.8.13.
Reviewed by: kib@
Tested by: kwm@, danfe@, Henry Hu,
Lundberg, Johannes <johannes@brilliantservice.co.jp>,
Johannes Dieterich <dieterich.joh@gmail.com>,
Lutz Bichler <lutz.bichler@gmail.com>,
MFC after: 3 days
Relnotes: yes
Fix LUN discovery for targets that don't support REPORT_LUNS, broken
in r263741. At least with CTL (slightly modified to report SPC2) there
is still some problem: it doesn't seem to find LUNs higher than 7.
Fix tpc_create_token() introduced in r269497 to encode CREATOR LOGICAL UNIT
DESCRIPTOR field as Identification Descriptor CSCD descriptor, not just as
Identification Descriptor.
Implement a workaround to allow this test program to be compiled with clang.
It seems that if a pragma is used to define a weak alias for a local
function, the pragma must appear after the function is defined.
Fix a number of typos and programming errors in the userland CTF tests. It
seems that they would only pass by chance on illumos; on FreeBSD, they still
fail since userland CTF is not yet supported.
Summary:
Fix the stack tracing for dtrace/powerpc by using the trapexit/asttrapexit
return address sentinels instead of checking within the kernel address space.
As part of this, I had to add new inline functions. FBT traces the kernel, so
we have to have special case handling for this, since a trap will create a full
new trap frame, and there's no way to pass around the 'real' stack. I handle
this by special-casing 'aframes == 0' with the trap frame. If aframes counts
out to the trap frame, then assume we're looking for the full kernel trap frame,
so switch to the real stack pointer.
Use a devd event to start hv_kvpd instead of doing so in rc.d script.
This is cleaner and eliminates the unneeded startup of KVP daemon on
systems that do not run as a Hyper-V guest.
Rework atse_rx_cycles handling: count packets instead of fills, and use the
limit only when polling, not when in interrupt mode. Otherwise, we may
stop reading the FIFO midpacket and clear the event mask even though the
FIFO still has data to read, which could stall receive when a large packet
arrives. Add a comment about races in the Altera FIFO interface: we may
need to do a little more work to handle races than we are.
Experimentation suggests that the Altera Triple-Speed Ethernet documentation
is incorrect and bits in the event and interrupt-enable registers are not
irrationally rearranged relative to the status register.
Substantially rework interrupt handling in the atse(4) driver:
- Introduce a new macro ATSE_TX_PENDING() which checks whether there is
any pending data to transmit, either in an in-progress packet or in
the TX queue.
- Introduce new ATSE_RX_STATUS_READ() and ATSE_TX_STAUTS_WRITE() macros
that query the FIFO status registers rather than event registers,
offering level- rather than edge-triggered FIFO conditions.
- For RX, interrupt only on full/overflow/underflow; for TX, interrupt
only on empty/overflow/underflow.
- Add new ATSE_RX_INTR_READ() and ATSE_RX_INTR_WRITE() macros useful for
debugging interrupt behaviour.
- Add a debug.atse_intr_debug_enable sysctl that causes various pieces
of FIFO state to be printed out on each RX or TX interrupt. This is
disabled by default but good to turn on if the interface appears to
wedge. Also print debugging information when polling.
- In the watchdog handler, do receive, not just transmit, processing, to
ensure that the rx, not just tx, queue is being handled -- and, in
particular, will be drained such that interrupts can resume.
- Rework both atse_rx_intr() and atse_tx_intr() to eliminate many race
conditions, and add comments on why various things are in various
orders. Interactions between modifications to the event and interrupt
masks are quite subtle indeed, and we must actively check for a number
of races (e.g., event mask cleared; packet arrives; interrupts enabled).
We also now use the status registers rather than event registers for
FIFO status checks to avoid other races; we continue to use event
registers for underflow/overflow.
With this change, interrupt-driven operation of atse appears (for the
time being) robust.
Fix source address selection on unbound sockets in the presence of multiple
fibs. Use the mbuf's or the socket's fib instead of RT_ALL_FIBS. Fixes PR
187553. Also fixes netperf's UDP_STREAM test on a nondefault fib.
sys/netinet/ip_output.c
In ip_output, lookup the source address using the mbuf's fib instead
of RT_ALL_FIBS.
sys/netinet/in_pcb.c
in in_pcbladdr, lookup the source address using the socket's fib,
because we don't seem to have the mbuf fib. They should be the same,
though.
tests/sys/net/fibs_test.sh
Clear the expected failure on udp_dontroute.
Use a consistent type for the number of HMAC algorithms.
This fixes a bug which resulted in a warning on the userland
stack, when compiled on Windows.
Thanks to Peter Kasting from Google for reporting the issue and
provinding a potential fix.
Small cleanup which addresses a warning regaring the truncation
of a 64-bit entity to a 32-bit entity. This issue was reported by
Peter Kasting from Google.
FreeBSD-SA-14:19.tcp raised attention to the state of our stack
towards blind SYN/RST spoofed attack.
Originally our stack used in-window checks for incoming SYN/RST
as proposed by RFC793. Later, circa 2003 the RST attack was
mitigated using the technique described in P. Watson
"Slipping in the window" paper [1].
After that, the checks were only relaxed for the sake of
compatibility with some buggy TCP stacks. First, r192912
introduced the vulnerability, just fixed by aforementioned SA.
Second, r167310 had slightly relaxed the default RST checks,
instead of utilizing net.inet.tcp.insecure_rst sysctl.
In 2010 a new technique for mitigation of these attacks was
proposed in RFC5961 [2]. The idea is to send a "challenge ACK"
packet to the peer, to verify that packet arrived isn't spoofed.
If peer receives challenge ACK it should regenerate its RST or
SYN with correct sequence number. This should not only protect
against attacks, but also improve communication with broken
stacks, so authors of reverted r167310 and r192912 won't be
disappointed.
o Revert r167310.
o Implement "challenge ACK" protection as specificed in RFC5961
against RST attack. On by default.
- Carefully preserve r138098, which handles empty window edge
case, not described by the RFC.
- Update net.inet.tcp.insecure_rst description.
o Implement "challenge ACK" protection as specificed in RFC5961
against SYN attack. On by default.
- Provide net.inet.tcp.insecure_syn sysctl, to turn off
RFC5961 protection.
The changes were tested at Netflix. The tested box didn't show
any anomalies compared to control box, except slightly increased
number of TCP connection in LAST_ACK state.
Reviewed by: rrs
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
Make a type conversion explicit. When compiling this code on
Windows as part of the SCTP userland stack, this fixes a
warning reported by Peter Kasting from Google.
Cache GELI passphrases entered at the console during the boot process,
in order to improve user-friendliness when a system has multiple disks
encrypted using the same passphrase.
When examining a new GELI provider, the most recently used passphrase
will be attempted before prompting for a passphrase; and whenever a
passphrase is entered, it is cached for later reference. When the root
disk is mounted, the cached passphrase is zeroed (triggered by the
"mountroot" event), in order to minimize the possibility of leakage
of passphrases. (After root is mounted, the "taste and prompt for
passphrases on the console" code path is disabled, so there is no
potential for a passphrase to be stored after the zeroing takes place.)
This behaviour can be disabled by setting kern.geom.eli.boot_passcache=0.
Reviewed by: pjd, dteske, allanjude
MFC after: 7 days
* Makefile:
. Hook e_lgammal[_r].c to the build.
. Create man page links for lgammal[-r].3.
* Symbol.map:
. Sort lgammal to its rightful place.
. Add FBSD_1.4 section for the new lgamal_r symbol.
* ld128/e_lgammal_r.c:
. 128-bit implementataion of lgammal_r().
* ld80/e_lgammal_r.c:
. Intel 80-bit format implementation of lgammal_r().
* src/e_lgamma.c:
. Expose lgammal as a weak reference to lgamma for platforms
where long double is mapped to double.
* src/e_lgamma_r.c:
. Use integer literal constants instead of real literal constants.
Let compiler(s) do the job of conversion to the appropriate type.
. Expose lgammal_r as a weak reference to lgamma_r for platforms
where long double is mapped to double.
* src/e_lgammaf_r.c:
. Fixed the Cygnus Support conversion of e_lgamma_r.c to float.
This includes the generation of new polynomial and rational
approximations with fewer terms. For each approximation, include
a comment on an estimate of the accuracy over the relevant domain.
. Use integer literal constants instead of real literal constants.
Let compiler(s) do the job of conversion to the appropriate type.
This allows the removal of several explicit casts of double values
to float.
* src/e_lgammal.c:
. Wrapper for lgammal() about lgammal_r().
adrian [Mon, 15 Sep 2014 21:09:19 +0000 (21:09 +0000)]
Disable flow-director support until it's been debugged and verified.
The flowdirector feature shares on-chip memory with other things
such as the RX buffers. In theory it should be configured in a way
that doesn't interfere with the rest of operation. In practice,
the RX buffer calculation didn't take the flow-director allocation
into account and there'd be overlap. This lead to various garbage
frames being received containing what looks like internal NIC state.
What _I_ saw was traffic ending up in the wrong RX queues.
If I was doing a UDP traffic test with only one NIC ring receiving
traffic, everything is fine. If I fired up a second UDP stream
which came in on another ring, there'd be a few percent of traffic
from both rings ending up in the wrong ring. Ie, the RSS hash would
indicate it was supposed to come in ring X, but it'd come in ring Y.
However, when the allocation was fixed up, the developers at Verisign
still saw traffic stalls.
The flowdirector feature ends up fiddling with the NIC to do various
attempts at load balancing connections by populating flow table rules
based on sampled traffic. It's likely that all of that has to be
carefully reviewed and made less "magic".
So for now the flow director feature is disabled (which fixes both
what I was seeing and what they were seeing) until it's all much
more debugged and verified.
Tested:
* (me) 82599EB 2x10G NIC, RSS UDP testing.
* (verisign) not sure on the NIC (but likely 82599), 100k-200k/sec TCP
transaction tests.
Submitted by: Marc De La Gueronniere <mdelagueronniere@verisign.com>
MFC after: 1 week
Sponsored by: Verisign, Inc.