rmacklem [Tue, 8 Nov 2016 21:47:00 +0000 (21:47 +0000)]
MFC: r307891
Fix the man page to reflect the change done by r307890 to mountd.c
so that the "-n" option uses the sysctl for the correct NFS server.
This is a content change.
rmacklem [Tue, 8 Nov 2016 21:39:15 +0000 (21:39 +0000)]
MFC: r307890
mountd(8) was erroneously setting the sysctl for the old NFS server
when the new/default NFS server was running, for the "-n" option.
This patch fixes the problem for stable/10 and stable/9.
Since the new NFS server uses vfs.nfsd.nfs_privport == 0 by default,
there wouldn't have been many users affected by the code not setting
it to 0 when the "-n" option was specified.
hselasky [Mon, 7 Nov 2016 09:19:04 +0000 (09:19 +0000)]
MFC r307518:
Fix device delete child function.
When detaching device trees, parent devices must be detached prior to
detaching their children. This is because parent devices can have
pointers to the child devices in their softcs which are not
invalidated by device_delete_child(). This can cause use after free
issues and panic().
Device drivers implementing trees must ensure that their detach function
detaches or deletes all of its children before returning.
While at it, remove now-redundant device_detach() calls before
device_delete_child() and device_delete_children(), mostly in
the USB controller drivers.
Tested by: Jan Henrik Sylvester <me@janh.de>
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D8070
avos [Sun, 6 Nov 2016 16:44:33 +0000 (16:44 +0000)]
MFC r288990:
Fix regression from r248371. We need to copy the packet header to the new
mbuf. Unlike in the pre-r248371 code, assert that M_PKTHDR is set
only on the first mbuf.
avos [Sun, 6 Nov 2016 13:50:54 +0000 (13:50 +0000)]
MFC r283636:
- Don't request BUS_DMA_ALLOCNOW for DMA tags; that requires an enormous
amount of memory.
- Don't request segsize of BUS_SPACE_MAXSIZE_32BIT, when maxsize is
MCLBYTES.
With this change bwi_attach() can succeed on i386.
Sources from the "current" build tree and generated sources in the
object tree should be used instead of sources and headers from the
already installed source tree on the build host.
This was noticed while addressing issues in the upcoming amd update.
r307801:
Align whitespace.
r307801 is related to r307800 however it was a separate commit to
HEAD in order to maintain a separation between the functional change
and a correction of style.
jhb [Fri, 4 Nov 2016 22:03:41 +0000 (22:03 +0000)]
MFC 302313:
cxgbe(4): Avoid a NULL dereference while dumping the L2 table. Entries
used by switching filters that rewrite L2 information do not have any
associated ifnet.
301516:
cxgbetool: Allow max-rate > 10Gbps for rate-limited traffic.
301520:
cxgbe(4): Create a reusable struct type for scheduling class parameters.
301531:
cxgbe(4): Break up set_sched_class. Validate the channel number and
min/max rates against their actual limits (which are chip and port
specific) instead of hardcoded constants.
301535:
cxgbe(4): Track the state of the hardware traffic schedulers in the
driver. This works as long as everyone uses set_sched_class_params
to program them.
301540:
cxgbe(4): Provide information about traffic classes in the sysctl mib.
301542:
cxgbe(4): A couple of fixes to set_sched_queue.
- Validate the scheduling class against the actual limit (which is chip
specific) instead of a magic number.
- Return an error if an attempt is made to manipulate the tx queues of a
VI that hasn't been initialized.
301628:
cxgbe(4): Add a sysctl to manage the binding of a txq to a traffic class.
jhb [Fri, 4 Nov 2016 21:48:22 +0000 (21:48 +0000)]
MFC 297875: cxgbe(4): Always read the entire mailbox into the reply buffer.
The size of the reply can be different from the size of the command in
case a debug firmware asserts. fw_asrt() needs the entire reply in
order to decode the location of the assert.
jhb [Fri, 4 Nov 2016 21:43:10 +0000 (21:43 +0000)]
MFC 297776,297777,297779: Add DDB commands to cxgbe(4).
297776:
Add a function to lookup a device_t object by name.
This just walks the global list of devices looking for one with the
requested name. The one use case outside of devctl2's implementation
is for DDB commands that wish to lookup devices by name.
297777:
Add a 'show t4 tcb <nexus> <tid>' command to dump a TCB from DDB.
This allows the contents of a TCB to be extracted from a T4/T5 card in
DDB after a panic.
297779:
Add a 'show t4 devlog <nexus>' DDB command.
This command displays the adapter's firmware device log similar to the
dev.<nexus>.misc.devlog sysctl.
jhb [Fri, 4 Nov 2016 21:02:33 +0000 (21:02 +0000)]
MFC 297194:
cxgbe(4): Be consistent and call ETHER_BPF_MTAP before writing anything
to the descriptor ring no matter what path the frame takes within the
driver's tx.
jhb [Fri, 4 Nov 2016 20:56:28 +0000 (20:56 +0000)]
MFC 296975: cxgbe(4): Tidy up PAUSE frame accounting.
Figure out if the chip is counting PAUSE frames in the "normal" stats
and take them out if it is. This fixes a bug in the tx stats because
the default hardware behavior is different for Tx and Rx but the driver
was treating both the same way. The result was that OPACKETS, OBYTES,
and OMCASTS were under-reported (if tx_pause > 0) before this change.
Note that the mac_stats sysctl still gives you the raw value of these
statistics straight from the device registers.
jhb [Fri, 4 Nov 2016 20:38:26 +0000 (20:38 +0000)]
MFC 296950,296951: Configuration updates.
296950:
cxgbe(4): Update some register settings in the default configuration
files to match the "uwire" configuration.
296951:
cxgbe(4): Enable additional capabilities in the default configuration
files. All features with FreeBSD drivers of some kind are now in the
default configuration.
296735:
Fix the following gcc warnings on sparc64, when TCP_OFFLOAD is not
defined:
sys/dev/cxgbe/t4_main.c:7474: warning: 'sysctl_tp_tick' defined but not used
sys/dev/cxgbe/t4_main.c:7505: warning: 'sysctl_tp_dack_timer' defined but not used
sys/dev/cxgbe/t4_main.c:7519: warning: 'sysctl_tp_timer' defined but not used
This just adds a bunch of #ifdef TCP_OFFLOAD in the right places.
296949:
cxgbe(4): Remove a couple of pointless assignments in sysctl_meminfo.
Do not display range if start = stop (this is a workaround for some
unused regions).
jhb [Fri, 4 Nov 2016 19:07:12 +0000 (19:07 +0000)]
MFC 296552,296596,296603,296624,296627: Fixes related to memory windows.
296552:
cxgbe(4): Rename regwin_lock to reg_lock. It is used to protect access
to indirect registers only.
296596:
cxgbe(4): Allow the addr/len pair that is being validated in
validate_mem_range to span multiple memory types. Update
validate_mt_off_len to use validate_mem_range.
296603:
cxgbe(4): Add general purpose routines that offer safe access to the
chip's memory windows. Convert existing users of these windows to the
new routines.
296624:
cxgbe(4): Fix bug in r296603. The memory window needs to be
repositioned if the start address isn't in the window already. One
of the bounds checks used the end address instead.
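The bounds check described for r296624 can be illustrated in userland C.
This is only a sketch under stated assumptions: struct toy_memwin, its
field names, and toy_memwin_needs_reposition() are hypothetical and are
not the driver's actual code. The point it shows is that both bounds are
tested against the start address of the access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one memory window. */
struct toy_memwin {
	uint32_t mw_curpos;	/* chip address the window currently starts at */
	uint32_t mw_aperture;	/* window size in bytes */
};

/*
 * Return true if 'start' lies outside the window, i.e. the window has to
 * be moved before the access.  Both bounds are checked against 'start';
 * the subtraction form avoids overflow in curpos + aperture.
 */
static bool
toy_memwin_needs_reposition(const struct toy_memwin *mw, uint32_t start)
{
	assert(mw->mw_aperture > 0);
	return (start < mw->mw_curpos ||
	    start - mw->mw_curpos >= mw->mw_aperture);
}
```

With a window at 0x1000 of size 0x800, addresses 0x1000 through 0x17ff
need no repositioning, while 0x0fff and 0x1800 do.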
296627:
cxgbe(4): Improvements to the code that deals with the firmware's log.
- Query the location of the log very early during attach. Refresh the
location later after establishing contact with the firmware.
- Save the log's location as a flat address in devlog_params.
- Use a memory window instead of backdoor access to the EDC/MC to read
the log.
jhb [Fri, 4 Nov 2016 18:45:06 +0000 (18:45 +0000)]
MFC 295778,296249,296333,296383,296471,296478,296481,296485,296488-296491,
296493-296496,296544,296710-296711,297863,299685: Catch up to changes to
the internal shared code.
Note that this merge includes two different firmware updates, but the
effective change is to update to the last version (1.15.37.0). As such,
I've trimmed the log message of the first update (1.15.28.0).
In addition, the M_WAIT macro added in t4_regs.h had to be renamed to
CXGBE_M_WAIT to avoid a collision on 10.x that is not present on 11.
295778:
cxgbe: catch up with the latest hardware-related definitions.
296249:
cxgbe(4): Update T5 and T4 firmwares to 1.15.28.0.
296333:
cxgbe(4): First of many changes to reduce diffs with internal shared
code:
- Rename some CamelCase variables.
- s/t4_link_start/t4_link_l1cfg/g
- Pull in t4_get_port_type_description.
- Move t4_wait_op_done to t4_hw.c.
- Flip the order of the RDMA stats.
- Remove unused function t4_iq_start_stop.
- Move t4_wait_op_done and t4_wait_op_done_val to t4_hw.c
296383:
cxgbe(4): Very basic T6 awareness. This is part of ongoing work to
update to the latest internal shared code.
- Add a chip_params structure to keep track of hardware constants for
all generations of Terminators handled by cxgbe.
- Update t4_hw_pci_read_cfg4 to work with T6.
- Update the hardware debug sysctls (hidden within dev.<tNnex>.<n>.misc.*) to
work with T6. Most of the changes are in the decoders for the CIM
logic analyzer and the MPS TCAM.
- Acquire the regwin lock around indirect register accesses.
296471:
cxgbe(4): Updated register dumps.
- Get the list of registers to read during a regdump from the shared
code instead of the OS specific code. This follows a similar move
internally. The shared code includes the list for T6.
- Update cxgbetool to be able to decode T5 VF, T6, and T6 VF register
dumps (and catch up with some updates to T4 and T5 register decode).
296478:
cxgbe(4): Add a struct sge_params to store per-adapter SGE parameters.
Move the code that reads all the parameters to t4_init_sge_params in the
shared code. Use these per-adapter values instead of globals.
296481:
cxgbe(4): Overhaul the shared code that deals with the chip's TP block,
which is responsible for filtering and RSS.
Add the ability to use filters that match on PF/VF (aka "VNIC id") while
here. This is mutually exclusive with filtering on outer VLAN tag with
Q-in-Q.
296485:
cxgbe(4): Update the interrupt handlers for hardware errors.
296488:
cxgbe(4): Updates to mailbox routines in the shared code.
296489:
cxgbe(4): Updates to the shared routines that deal with the serial EEPROM,
flash, and VPD.
296490:
cxgbe(4): Remove __devinit and SPEED_<foo> as part of catch up with
internal shared code.
296491:
cxgbe(4): Updates to shared routines that get/set various parameters via
the firmware.
296493:
cxgbe(4): Use t4_link_down_rc_str in shared code to decode the reason
the link is down, instead of doing it in OS specific code.
296494:
cxgbe(4): Many new functions in the shared code, unused at this time.
296495:
cxgbe(4): Fix t4_tp_get_rdma_stats.
296496:
cxgbe(4): Minor updates to the shared routines that deal with firmware images.
296544:
cxgbe(4): Reshuffle and rototill t4_hw.c, solely to reduce diffs with
the internal shared code.
296710:
cxgbe(4): Catch up with the latest list of card capabilities as reported
by the firmware.
296711:
cxgbe(4): Fix typo in previous commit.
297863:
Rename the 'M_B' macro in t4_regs.h to 'CXGBE_M_B'.
This fixes a conflict with the M_B macro in powerpc's
<machine/db_machdep.h> exposed by the recent addition of DDB commands
to the cxgbe driver.
299685:
cxgbe(4): Update T5 and T4 firmwares to 1.15.37.0.
These firmwares were obtained from the "Chelsio T5/T4 Unified Wire
v2.12.0.3 for Linux" release. Changes since 1.14.4.0 (which is the
firmware in -STABLE branches) are in the "Release Notes" accompanying
the Unified Wire release and are copy-pasted here as well.
Version : 1.15.37.0
Date : 04/27/2016
================================================================================
FIXES
-----
BASE:
- Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where the default ingress
queue was ignored.
- Fixed an issue where adapter failed to load fw by adjusting DRAM frequency.
- Fixed an issue in watchdog which was causing VM bring-up failure after reboot.
- Fixed 40G link failures with some switches when auto-negotiation enabled.
- Fixed to improve on link bring-up time.
- Per port buffer groups size doubled to improve performance.
- Fixed an issue where bogus d3hot bits were set causing traffic stall.
- Fixed an issue where sometimes adapter was not seen after reboot.
- Fixed an issue where iWARP was crashing in conjunction with traffic management.
- Fixed an issue where link failed to come up after removing twinax cable and
inserting optical module.
ETH
- Fixed a link flap issue on T580-CR.
OFLD
- Fixed a potential iSCSI data corruption issue by disabling RxFragEn flag.
FOiSCSI
- Fixed an issue in recovery path where connection was getting closed before
recovery processing was done.
- Fixed an issue in TCP port reuse.
- Fixed an issue in recovery path when large number (>64) of iSCSI connections
were in use.
- Returned ENETUNREACH if IP had not been provisioned yet and driver tried to
use given interface.
- Fixed an issue where fw was sending ENETUNREACH event for normal tcp
disconnection.
DCBX
- Fixed an issue where iscsi tlv is sent incorrectly to host. (DCBX CEE)
- Fixed an issue where apply bit set for APP id was affecting the ETS and PFC
settings.(DCBX IEEE)
- Fixed an issue where app priority values are not handled correctly in fw.
(DCBX IEEE)
- Fixed an issue where enable/disable dcbx can cause crash. (DCBX CEE,DCBX IEEE)
FOFCoE
- Removed BB6 support.
ENHANCEMENTS
------------
BASE:
- Added new interface to program DCA settings in SGE contexts; allow 32-byte
IQE size
- Added PTP interface fw_ptp_ts to support PTP Frequency and Offset adjustment.
- Added MPS raw interface.
ETH:
- New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.
OFLD:
- WR opcode is returned to host in cqe error response.
22.2. T4 Firmware
+++++++++++++++++
Version : 1.15.37.0
Date : 04/27/2016
================================================================================
FIXES
-----
BASE:
- Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where default ingress queue
was ignored.
- Fixed an issue in watchdog which was causing VM bring-up failure after reboot.
- Per port buffer groups size doubled to improve performance.
- Fixed an issue where iWARP was crashing in conjunction with traffic management.
FOiSCSI:
- Fixed an issue in recovery path where connection was getting closed before
recovery processing was done.
- Fixed an issue in TCP port reuse.
- Fixed an issue in recovery path when large number (>64) of iSCSI connections
were in use.
- Returned ENETUNREACH if IP had not been provisioned yet and driver tried to
use given interface.
DCBX
- Fixed an issue where iscsi tlv is sent incorrectly to host.(DCBX CEE)
- Fixed an issue where enable/disable dcbx can cause crash in firmware.(DCBX CEE)
FOiSCSI
- Fixes an issue where fw was sending ENETUNREACH event for normal tcp
disconnection.
FOFCoE
- Removed BB6 support.
ENHANCEMENTS
------------
BASE:
- Added MPS raw interface.
ETH:
- New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.
================================================================================
jhb [Fri, 4 Nov 2016 18:16:00 +0000 (18:16 +0000)]
MFC 287297,296236: Cleanups to cxgbetool.
287297:
- Replace N(a)/N(i)/N(T)/LEN(a)/ARRAY_SIZE(a) with nitems()
- Add missing <err.h> for err() and <sys/sysctl.h> for sysctlbyname()
- NULL -> 0 for 5th parameter of sysctlbyname()
Note, the original commit touched several files under tools/tools, but
this commit only includes changes to cxgbetool.
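For readers unfamiliar with nitems(), the following sketch shows the idiom
that replaces the various ad-hoc array-length macros. The macro body
mirrors the definition FreeBSD ships in <sys/param.h> (guarded here so the
snippet also builds elsewhere); the table and the lookup function are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the <sys/param.h> definition; guarded for portability. */
#ifndef nitems
#define	nitems(x)	(sizeof((x)) / sizeof((x)[0]))
#endif

/* Hypothetical table, for illustration only. */
static const char *const toy_port_types[] = { "fiber", "twinax", "base-t" };

/* Return the name for an index, or "unknown" when it is out of range. */
static const char *
toy_port_type_name(size_t idx)
{
	assert(nitems(toy_port_types) == 3);
	if (idx >= nitems(toy_port_types))
		return ("unknown");
	return (toy_port_types[idx]);
}
```

Note that nitems() only works on true arrays; applied to a pointer it
silently computes the wrong value.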
296236:
Fix some whitespace nits in cxgbetool.c. No functional change.
trasz [Fri, 4 Nov 2016 14:06:21 +0000 (14:06 +0000)]
MFC r297207:
Make the autofs(5) -hosts map more robust, primarily to make it correctly
handle NFS shares containing whitespace. This also adds the -E parameter
to showmount(8).
jhb [Fri, 4 Nov 2016 04:01:59 +0000 (04:01 +0000)]
MFC 301932: Use sbused() instead of sbspace() to avoid signed issues.
Inserting a full mbuf with an external cluster into the socket buffer
resulted in sbspace() returning -MLEN. However, since sb_hiwat is
unsigned, the -MLEN value was converted to unsigned in comparisons. As a
result, the socket buffer was never autosized. Note that sb_lowat is signed
to permit direct comparisons with sbspace(), but sb_hiwat is unsigned.
Follow suit with what tcp_output() does and compare the value of sbused()
with sb_hiwat instead.
Note: Since stable/10 does not include sbused(), this uses sb->sb_cc
instead.
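The signedness pitfall described above can be reproduced in plain C. This
is a userland sketch, not the kernel code: struct toy_sockbuf,
toy_sbspace(), and both predicates are hypothetical stand-ins, the
thresholds are illustrative, and int/unsigned int are used so that the
conversion happens on every platform (the kernel's types differ). The
explicit cast in the buggy variant spells out the conversion the compiler
would otherwise perform implicitly.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a socket buffer. */
struct toy_sockbuf {
	unsigned int sb_cc;	/* bytes currently queued */
	unsigned int sb_hiwat;	/* high-water mark (unsigned) */
};

/* Free space as a signed value; negative when the buffer is overfull. */
static int
toy_sbspace(const struct toy_sockbuf *sb)
{
	return ((int)sb->sb_hiwat - (int)sb->sb_cc);
}

/*
 * Buggy test: the signed result is converted to unsigned for the
 * comparison, so a negative value turns into a huge positive one and
 * an overfull buffer never looks "nearly full".
 */
static bool
nearly_full_buggy(const struct toy_sockbuf *sb)
{
	return ((unsigned int)toy_sbspace(sb) < sb->sb_hiwat / 8);
}

/* Fixed test: compare the unsigned byte count with sb_hiwat directly. */
static bool
nearly_full_fixed(const struct toy_sockbuf *sb)
{
	assert(sb->sb_hiwat > 0);
	return (sb->sb_cc >= sb->sb_hiwat / 8 * 7);
}
```

With sb_cc = 1256 and sb_hiwat = 1000 the free space is -256; the buggy
predicate reports false while the fixed one reports true.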
jhb [Fri, 4 Nov 2016 03:49:53 +0000 (03:49 +0000)]
MFC 290175,290633,299206,300895,301898: Various TOE fixes.
290175:
cxgbe/tom: decide whether to shove segments or not only if there is
payload to transmit.
290633:
cxgbe/t4_tom: add a knob to the default configuration file to tune
the TOE for LAN operation. It is possible to set this to other values
(cluster for networks with little loss and really tight RTTs, and wan
for relatively large RTTs and/or lossy networks) depending on the
environment in which the TOE is being used.
None of this affects plain NIC operation in any way.
299206:
Set the correct vnet in TOE event handlers.
300895:
cxgbe/t4_tom: Exempt RDMA connections from a TCP sanity test for now, to
avoid panicking debug kernels.
t4_tom does not keep track of a connection once it switches to ULP mode
iWARP. If the connection falls out of ULP mode the driver/hardware seq#
etc. are out of sync. A better fix would be to figure out what the
current seq# are, update the driver's state, and perform all sanity
checks as usual.
301898:
cxgbe/t4_tom: Fix inverted assertion in r300895. It is RDMA
connections and not others that are allowed to fail the receive window
check.
jhb [Fri, 4 Nov 2016 03:25:34 +0000 (03:25 +0000)]
MFC 277763,280146,287631: Various fixes to DDP.
277763:
Lock the socket buffer before jumping to the 'out' label if sblock()
fails in t4_soreceive_ddp().
280146:
Move special DDP handling for closing a connection into a new
handle_ddp_close() function in t4_ddp.c as the logic is similar
to handle_ddp_data(). This allows all knowledge of the special
DDP mbufs to be private to t4_ddp.c as well.
287631:
Add a comment to clarify how to determine the amount of received DDP
data.
jch [Thu, 3 Nov 2016 19:58:12 +0000 (19:58 +0000)]
MFC r307966:
Remove an extraneous call to soisconnected() in syncache_socket(),
introduced with r261242. The useful and expected soisconnected()
call is done in tcp_do_segment().
Found as part of the unrelated PR:212920 investigation.
Slightly improves (~2%) the maximum number of TCP accepts per second.
rmacklem [Thu, 3 Nov 2016 00:58:50 +0000 (00:58 +0000)]
MFC: r307694
A problem w.r.t. interoperation between the FreeBSD NFSv4.1 server with
delegations enabled and the Linux NFSv4.1 client was reported in
reviews.freebsd.org/D7891.
I believe that the FreeBSD server behaviour conforms to the RFC and that
the Linux client has a bug. Therefore, I do not think the proposed patch
is appropriate. When nfsrv_writedelegifpos is non-zero, the FreeBSD
server will issue a write delegation for a read open if possible.
The Linux client then erroneously assumes that the credentials used for
the read open can write the file.
This patch reverses the default value for nfsrv_writedelegifpos to 0 so
that the default behaviour is Linux compatible and adds a sysctl that can
be used to set nfsrv_writedelegifpos.
This change should only affect users that are mounting a FreeBSD server
with delegations enabled (they are not enabled by default) with a Linux
NFSv4.1 client mount.
Certain warning alerts are ignored if they are received. This can mean that
no progress will be made if one peer continually sends those warning alerts.
Implement a count so that we abort the connection if we receive too many.
Issue reported by Shi Lei.
This is a direct commit to stable/10 and stable/9.
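The counting approach described above can be sketched as follows. This is
not the committed patch: struct toy_conn, toy_on_warning_alert(), and the
limit of 5 are assumptions made purely for illustration, and the log does
not say when (or whether) the real counter is reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	TOY_MAX_WARN_ALERTS	5	/* assumed limit, for illustration */

/* Hypothetical per-connection state. */
struct toy_conn {
	unsigned int warn_alerts;	/* ignorable warning alerts seen */
};

/*
 * Account for one ignored warning alert.  Return true once the peer has
 * sent more than the permitted number, meaning the caller should abort
 * the connection instead of waiting for progress that never comes.
 */
static bool
toy_on_warning_alert(struct toy_conn *c)
{
	assert(c != NULL);
	c->warn_alerts++;
	return (c->warn_alerts > TOY_MAX_WARN_ALERTS);
}
```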
jhb [Mon, 31 Oct 2016 22:45:11 +0000 (22:45 +0000)]
MFC 291665,291685,291856,297467,302110,302263: Add support for VIs.
291665:
Add support for configuring additional virtual interfaces (VIs) on a port.
Each virtual interface has its own MAC address, queues, and statistics.
The dedicated netmap interfaces (ncxgbeX / ncxlX) were already implemented
as additional VIs on each port. This change allows additional non-netmap
interfaces to be configured on each port. Additional virtual interfaces
use the naming scheme vcxgbeX or vcxlX.
Additional VIs are enabled by setting the hw.cxgbe.num_vis tunable to a
value greater than 1 before loading the cxgbe(4) or cxl(4) driver.
NB: The first VI on each port is the "main" interface (cxgbeX or cxlX).
T4/T5 NICs provide a limited number of MAC addresses for each physical port.
As a result, a maximum of six VIs can be configured on each port (including
the "main" interface and the netmap interface when netmap is enabled).
One user-visible result is that when netmap is enabled, packets received
or transmitted via the netmap interface are no longer counted in the stats
for the "main" interface, but are now accounted to the netmap interface.
The netmap interfaces now also have a new-bus device and export various
information sysctl nodes via dev.n(cxgbe|cxl).X.
The cxgbetool 'clearstats' command clears the stats for all VIs on the
specified port along with the port's stats. There is currently no way to
clear the stats of an individual VI.
291685:
Fix build for !TCP_OFFLOAD case.
291856:
Fix RSS build.
297467:
Remove #ifdef's from various structures used in the cxgbe/cxl driver.
This provides a constant ABI and layout for these structures (especially
struct adapter) avoiding some foot shooting.
302110:
cxgbe(4): Merge netmap support from the ncxgbe/ncxl interfaces to the
vcxgbe/vcxl interfaces and retire the 'n' interfaces. The main
cxgbe/cxl interfaces and tunables related to them are not affected by
any of this and will continue to operate as usual.
The driver used to create an additional 'n' interface for every
cxgbe/cxl interface if "device netmap" was in the kernel. The 'n'
interface shared the wire with the main interface but was otherwise
autonomous (with its own MAC address, etc.). It did not have normal
tx/rx but had a specialized netmap-only data path. r291665 added
another set of virtual interfaces (the 'v' interfaces) to the driver.
These had normal tx/rx but no netmap support.
This revision consolidates the features of both the interfaces into the
'v' interface which now has a normal data path, TOE support, and native
netmap support. The 'v' interfaces need to be created explicitly with
the hw.cxgbe.num_vis tunable. This means "device netmap" will not
result in the automatic creation of any virtual interfaces.
The following tunables can be used to override the default number of
queues allocated for each 'v' interface. nofld* = 0 will disable TOE on
the virtual interface and nnm* = 0 will disable native netmap
support.
# number of normal NIC queues
hw.cxgbe.ntxq_vi
hw.cxgbe.nrxq_vi
# number of TOE queues
hw.cxgbe.nofldtxq_vi
hw.cxgbe.nofldrxq_vi
# number of netmap queues
hw.cxgbe.nnmtxq_vi
hw.cxgbe.nnmrxq_vi
hw.cxgbe.nnm{t,r}xq{10,1}g tunables have been removed.
--- tl;dr version ---
The workflow for netmap on cxgbe starting with FreeBSD 11 is:
1) "device netmap" in the kernel config.
2) "hw.cxgbe.num_vis=2" in loader.conf. num_vis > 2 is ok too, you'll
end up with multiple autonomous netmap-capable interfaces for every
port.
3) "dmesg | grep vcxl | grep netmap" to verify that the interface has
netmap queues.
4) Use any of the 'v' interfaces for netmap. pkt-gen -i vcxl<n>... .
One major improvement is that the netmap interface has a normal data
path as expected.
5) Just ignore the cxl interfaces if you want to use netmap only. No
need to bring them up. The vcxl interfaces are completely independent
and everything should just work.
---------------------
302263:
cxgbe(4): Do not bring up an interface when IFCAP_TOE is enabled on it.
The interface's queues are functional after VI_INIT_DONE (which is short
of interface-up) and that's all that's needed for t4_tom to communicate
with the chip.
jhb [Mon, 31 Oct 2016 22:03:44 +0000 (22:03 +0000)]
MFC 289401: cxgbe(4): support for the kernel RSS option.
You need PCBGROUP and RSS in the kernel config to use this.
Note: Since RSS is not present in 10.x this is mostly a no-op and is
stubbed out by removing the #include of opt_rss.h. This is merged
primarily to reduce conflicts in future merges, however it does add a
couple of diagnostic messages related to RSS buckets vs RX queue
counts.
dim [Mon, 31 Oct 2016 18:37:44 +0000 (18:37 +0000)]
Pull in r228705 from upstream libc++ trunk (by Eric Fiselier):
[libcxx] Fix PR 22468 - std::function<void()> does not accept
non-void-returning functions
Summary:
The bug can be found here: https://llvm.org/bugs/show_bug.cgi?id=22468
`__invoke_void_return_wrapper` is needed to properly handle calling a
function that returns a value but where the std::function return type
is void. Without this '-Wsystem-headers' will cause
`function::operator()(...)` to not compile.
sbruno [Mon, 31 Oct 2016 16:48:16 +0000 (16:48 +0000)]
MFC r308038:
The buffer address is always overwritten in the extended descriptor format,
we have to refresh it ... always. This fixes problems reported in NetMap
with em(4) devices after conversion to extended descriptor format in
svn r293331.
mav [Mon, 31 Oct 2016 07:21:37 +0000 (07:21 +0000)]
MFC r307523: Make pass driver better support CAM_CDB_POINTER flag.
Previously the pass driver just ignored the flag, making random kernel code
access a user-space pointer, sometimes causing crashes even for correctly
written applications if the user-level context was switched or swapped out.
This patch tries to copyin the CDB into kernel space to avoid that.
ed [Sat, 29 Oct 2016 15:04:24 +0000 (15:04 +0000)]
Add posix_tnode to <search.h>.
In r307227 I've refactored the binary search tree functions to use the
posix_tnode type. As this change does not apply cleanly to this version
of FreeBSD, only make the change that matters: add the definition of the
newly introduced type.
This will ease source-level compatibility going forward.
Until we can resolve the numerous hole_birth bugs that have cropped up
recently, and come up with a way going forwards to protect users from
corruption, we should disable the hole_birth feature. Using a tunable
allows those who are confident that their data is correct to continue to
take advantage of the feature.
Closes #188
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Author: Paul Dagnelie <pcd@delphix.com>
dsl_dataset_space is looking at the ds_bp's fill count while
dmu_objset_write_ready() is concurrently modifying it. This fix adds an
rrwlock to protect the ds_bp.
Closes #180
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Author: Paul Dagnelie <pcd@delphix.com>
mav [Sat, 29 Oct 2016 08:48:01 +0000 (08:48 +0000)]
MFC r307507, r307509, r307515:
Consider device as clean even if SYNCHRONIZE CACHE failed.
If device reservation was preempted by other initiator, our sync request
will always fail. Without this change CAM tried to sync cache on every
following device close, including numerous GEOM tasting opens/closes,
causing lots of useless noise in logs.
mav [Sat, 29 Oct 2016 08:45:06 +0000 (08:45 +0000)]
MFC r307350: Add LUN options to limit UNMAP and WRITE SAME sizes.
CTL itself has no limits on UNMAP and WRITE SAME sizes. But depending
on the backend, large requests may take too much time. To avoid that, new
configuration options allow hinting to the initiator the maximal sizes it
should not exceed.
mav [Fri, 28 Oct 2016 18:25:32 +0000 (18:25 +0000)]
MFC r300881, r302058 (by asomers):
Avoid issuing spa config updates for physical path when not necessary
ZFS's configuration needs to be updated whenever the physical path for a
device changes, but not when a new device is introduced. This is because new
devices necessarily cause config updates, but only if they are actually
accepted into the pool.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
Split vdev_geom_set_physpath out of vdev_geom_attrchanged. When
setting the vdev's physical path, only request a config update if
the physical path has changed. Don't request it when opening a
device for the first time, because the config sync will happen
anyway upstack.
sys/geom/geom_dev.c
Split g_dev_set_physpath and g_dev_set_media out of
g_dev_attrchanged
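The "only request a config update if the physical path has changed" rule
can be modelled in a few lines of userland C. Everything here is a
hypothetical stand-in (struct toy_vdev, TOY_PATHLEN, toy_set_physpath());
the real logic lives in vdev_geom.c and is not reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define	TOY_PATHLEN	64	/* assumed buffer size, for illustration */

/* Hypothetical, simplified vdev state. */
struct toy_vdev {
	char physpath[TOY_PATHLEN];	/* last recorded physical path */
	bool opened;			/* has the device been opened before? */
};

/*
 * Record the new physical path.  Return true only if a config update
 * should be requested: the path differs from the stored one AND this is
 * not the first open (the first open triggers a config sync anyway).
 */
static bool
toy_set_physpath(struct toy_vdev *vd, const char *path)
{
	bool changed, first_open;

	assert(strlen(path) < TOY_PATHLEN);
	first_open = !vd->opened;
	changed = strcmp(vd->physpath, path) != 0;
	snprintf(vd->physpath, sizeof(vd->physpath), "%s", path);
	vd->opened = true;
	return (changed && !first_open);
}
```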
mav [Fri, 28 Oct 2016 18:24:05 +0000 (18:24 +0000)]
MFC r300059 (by asomers): Speed up vdev_geom_open_by_guids
Speedup is hard to measure because the only time vdev_geom_open_by_guids
gets called on many drives at the same time is during boot. But with
vdev_geom_open hacked to always call vdev_geom_open_by_guids, operations
like "zpool create" speed up by 65%.
* Read all of a vdev's labels in parallel instead of sequentially.
* In vdev_geom_read_config, don't read the entire label, including
the uberblock. That's a waste of RAM. Just read the vdev config
nvlist. Reduces the IO and RAM involved with tasting from 1MB to
448KB.
mav [Fri, 28 Oct 2016 18:22:00 +0000 (18:22 +0000)]
MFC r298814 (by asomers): Fix a use-after-free when "zpool import" fails
Clear vd->vdev_tsd in vdev_geom_close_locked instead of vdev_geom_detach.
In the latter function, it would fail to happen in certain circumstances
where cp->private was unset. Ideally, the latter should never happen, but
it can happen when vdev open fails, or where spares are involved.
mav [Fri, 28 Oct 2016 18:20:14 +0000 (18:20 +0000)]
MFC r298786 (by asomers):
Refactor vdev_geom_attach and friends to reduce code duplication
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
Move checks for provider's sectorsize and mediasize into a single
location in vdev_geom_attach. Remove the zfs::vdev::taste class;
it's ok to use the regular vdev class for tasting. Consolidate guid
checks into a single location in vdev_attach_ok. Consolidate some
error handling code from vdev_geom_attach into vdev_geom_detach,
closing a resource leak of geom consumers in the process.
Using zvols as backing devices for ZFS pools is fraught with panics and
deadlocks. For example, attempting to online a missing device in the
presence of a zvol can cause a panic when vdev_geom tastes the zvol. Better
to completely disable vdev_geom from ever opening a zvol. The solution
relies on setting a thread-local variable during vdev_geom_open, and
returning EOPNOTSUPP during zvol_open if that thread-local variable is set.
Remove the check for MUTEX_HELD(&zfsdev_state_lock) in zvol_open. Its intent
was to prevent a recursive mutex acquisition panic. However, the new check
for the thread-local variable also fixes that problem.
Also, fix a panic in vdev_geom_taste_orphan. For an unknown reason, this
function was set to panic. But it can occur that a device disappears during
tasting, and it causes no problems to ignore this departure.
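The thread-local guard described above (set a flag during vdev_geom_open,
refuse with EOPNOTSUPP in zvol_open while it is set) can be sketched in
portable C11. The names toy_in_vdev_open, toy_zvol_open(), and
toy_vdev_open() are hypothetical, and the kernel uses its own per-thread
storage mechanism rather than _Thread_local; this only illustrates the
control flow.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Per-thread flag: true while this thread is inside the vdev open path. */
static _Thread_local bool toy_in_vdev_open;

/* Stand-in for zvol_open(): refuse opens that originate from vdev tasting. */
static int
toy_zvol_open(void)
{
	if (toy_in_vdev_open)
		return (EOPNOTSUPP);
	return (0);
}

/*
 * Stand-in for vdev_geom_open(): mark the thread, attempt the open (which
 * here happens to reach a zvol), then clear the mark again.
 */
static int
toy_vdev_open(void)
{
	int error;

	assert(!toy_in_vdev_open);	/* no nesting expected in this sketch */
	toy_in_vdev_open = true;
	error = toy_zvol_open();
	toy_in_vdev_open = false;
	return (error);
}
```

Because the flag is per-thread, an ordinary open of the zvol from any
other thread (or from the same thread outside the vdev open path) still
succeeds.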