truckman [Thu, 25 May 2017 22:41:34 +0000 (22:41 +0000)]
MFC r318527
Fix the queue delay estimation in PIE/FQ-PIE when the timestamp
(TS) method is used. When packet timestamp is used, the "current_qdelay"
keeps storing the last queue delay value calculated in the dequeue
function. Therefore, when a burst of packets arrives followed by
a pause, the "current_qdelay" will store a high value caused by the
burst and stick to that value during the pause because the queue
delay measurement is done inside the dequeue function. This causes
the drop probability calculation function to calculate high drop
probability value instead of zero and prevents the burst allowance
mechanism from working properly. Fix this problem by resetting
"current_qdelay" inside the drop probability calculation function
when the queue length is zero and TS option is used.
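
The reset described above can be sketched as follows. This is a minimal illustration, not the committed code: struct pie_state, its fields, and pie_qdelay_for_update() are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified PIE state; the real structures differ. */
struct pie_state {
    uint32_t current_qdelay;   /* last delay measured at dequeue time */
    uint32_t qlen;             /* packets currently queued */
    bool     use_timestamps;   /* TS method selected */
};

/*
 * Return the delay the drop probability calculation should use.
 * With the TS method an empty queue means nothing is waiting, so the
 * stale value left behind by the last dequeue is cleared first.
 */
uint32_t
pie_qdelay_for_update(struct pie_state *st)
{
    assert(st != NULL);
    if (st->use_timestamps && st->qlen == 0)
        st->current_qdelay = 0;
    return (st->current_qdelay);
}
```

After a burst drains and the queue stays empty, the next probability update then sees a delay of zero rather than the burst's peak.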
truckman [Thu, 25 May 2017 17:23:26 +0000 (17:23 +0000)]
MFC r318511
The result of right shifting a negative signed value is implementation
defined. On machines without arithmetic shift instructions, zero bits
may be shifted in from the left, giving a large positive result instead
of the desired divide-by power-of-2. Fix this by operating on the
absolute value and compensating for the possible negation later.
Reverse the order of the underflow/overflow tests and the exponential
decay calculation to avoid the possibility of an erroneous overflow
detection if p is a sufficiently small non-negative value. Also
check for negative values of prob before doing the exponential decay
to avoid another instance of right shifting a negative value.
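
The absolute-value technique can be illustrated with a small helper. This is a sketch only; div_pow2() is a hypothetical name and not the function touched by the commit. Note that it rounds toward zero, whereas an arithmetic shift would round toward negative infinity.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Divide a signed value by 2^shift without ever right shifting a
 * negative number: shift the magnitude, then restore the sign.
 */
int64_t
div_pow2(int64_t v, unsigned shift)
{
    uint64_t mag;

    /* shift >= 1 keeps the magnitude of INT64_MIN representable. */
    assert(shift > 0 && shift < 64);
    /* Negate in unsigned arithmetic so INT64_MIN does not overflow. */
    mag = (v < 0) ? -(uint64_t)v : (uint64_t)v;
    mag >>= shift;
    return ((v < 0) ? -(int64_t)mag : (int64_t)mag);
}
```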
np [Thu, 25 May 2017 02:00:37 +0000 (02:00 +0000)]
MFC r318014, r318091, r318125, and r318263.
r318014:
cxgbe(4): Fixes related to the knob that controls link autonegotiation.
- Do not leak the adapter lock in sysctl_autoneg.
- Accept only 0 or 1 as valid settings for autonegotiation.
- A fixed speed must be requested by the driver when autonegotiation is
disabled otherwise the firmware will reject the l1cfg command. Use
the top speed supported by the port for now.
r318091:
cxgbe(4): Do not assume that if_qflush is always followed by interface-down.
r318125:
Adjust whitespace and fix a comment. No functional change.
r318263:
cxgbe(4): netmap-only interrupts for a VI do not have an associated rxq
or ofld_rxq and should be ignored by vi_intr_iq.
np [Thu, 25 May 2017 01:43:28 +0000 (01:43 +0000)]
MFC r317702, r317847, r318307
r317702:
cxgbe(4): Support routines for Tx traffic scheduling.
- Create a new file, t4_sched.c, and move all of the code related to
traffic management from t4_main.c and t4_sge.c to this file.
- Track both Channel Rate Limiter (ch_rl) and Class Rate Limiter (cl_rl)
parameters in the PF driver.
- Initialize all the cl_rl limiters with somewhat arbitrary default
rates and provide routines to update them on the fly.
- Provide routines to reserve and release traffic classes.
r317847:
cxgbe(4): The Tx scheduler initialization either works or doesn't. It
doesn't need a refresh in either case.
r318307:
cxgbe(4): Avoid an out of bounds access when an attempt to unbind a tx
queue from a traffic class fails.
gjb [Thu, 25 May 2017 01:31:12 +0000 (01:31 +0000)]
MFC r308737, r308779:
r308737:
Pass SWAPSIZE in env(1) when invoking mk-vmimage.sh, otherwise
mkimg(1) does not create the second partition after r307008.
r308779:
Pass SWAPSIZE in env(1) when invoking mk-vmimage.sh for the
vm-image target, missed in r308737.
np [Thu, 25 May 2017 00:16:41 +0000 (00:16 +0000)]
MFC r316971:
cxgbe: Add a tunable to configure the SGE time scaler, which is
available starting with T6. The values in the timer holdoff registers
are multiplied by the scaling factor before use.
dev.<nexus>.<n>.holdoff_timers shows the final values of the
timers in microseconds.
np [Wed, 24 May 2017 20:29:20 +0000 (20:29 +0000)]
MFC r313318:
cxgbe(4): Allow tunables that control the number of queues to be set to
'-n' to tell the driver to create _up to_ 'n' queues if enough cores are
available. For example, setting hw.cxgbe.nrxq10g="-32" will result in
16 queues if the system has 16 cores, 32 if it has 32.
There is no change in the default number of queues of any type.
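
The "-n" semantics can be sketched like this. resolve_nqueues() is a hypothetical helper, and the handling of non-negative values (returned unchanged) is an assumption made for the example rather than something stated in the commit.

```c
#include <assert.h>

/*
 * Interpret a queue-count tunable. A negative value -n means
 * "up to n queues, limited by the number of cores".
 */
int
resolve_nqueues(int tunable, int ncores)
{
    int limit;

    assert(ncores > 0);
    if (tunable >= 0)
        return (tunable);   /* assumption: taken as an exact request */
    limit = -tunable;
    return (limit < ncores ? limit : ncores);
}
```

With a tunable of -32 this yields 16 on a 16-core system and 32 on a 32-core system, matching the example in the commit message.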
np [Wed, 24 May 2017 20:01:12 +0000 (20:01 +0000)]
MFC r313346:
cxgbe/t4_tom: Fix CLIP entry refcounting on the passive side. Every
IPv6 connection being handled by the TOE should have a reference on its
CLIP entry.
r311880:
The iw_cxgb and iw_cxgbe drivers should not use a FreeBSD device_t where
a linuxkpi style device is expected. If OFED/linuxkpi actually starts
using this field then we'll have to figure out whether to create fake
devices for these drivers or have linuxkpi deal with NULL device.
This mismatch was first reported as part of D6585.
r314167:
cxgbe/iw_cxgbe: Minor changes for T6.
r316118:
cxgbe/iw_cxgbe: T6 has no limit on the amount of memory that can be
registered in one ib_reg_phys_mr.
r316571:
cxgbe/iw_cxgbe: Remove bad cast that resulted in incorrect length for
memory regions larger than 4GB.
r316573:
cxgbe/iw_cxgbe: Replace a magic constant with something more readable
(and accurate).
T4 and later have an extra bit for page shift so the maximum page size
is 8TB (shift of 12 + 31) instead of 128MB (12 + 15). This saves space
in the chip's PBL (physical buffer list) when registering very large
memory regions.
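
The page size figures follow directly from the shift arithmetic. The helper below is only a worked check of those numbers; max_page_size_bytes() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Largest page size for a base shift plus the largest extra shift. */
uint64_t
max_page_size_bytes(unsigned base_shift, unsigned max_extra_shift)
{
    assert(base_shift + max_extra_shift < 64);
    return ((uint64_t)1 << (base_shift + max_extra_shift));
}
```

A shift of 12 + 31 = 43 gives 2^43 bytes (8TB), while 12 + 15 = 27 gives 2^27 bytes (128MB).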
r316580:
cxgbe/iw_cxgbe: Remove another bad cast. This should have been
included in r316571.
badger [Tue, 23 May 2017 12:40:50 +0000 (12:40 +0000)]
move p_sigqueue to the end of struct proc
In order to preserve KBI in stable branches, replace the existing
p_sigqueue slot with padding and move the expanded (as of r315949)
p_sigqueue to the end of the struct.
This is a repeat of r317529 (which concerned td_sigqueue in struct
thread) for p_sigqueue in struct proc.
Virtualbox modules (and possibly others) are affected without this fix.
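
The layout trick can be sketched with stand-in types. Everything here (small_q, big_q, obj_old, obj_new) is hypothetical and much smaller than struct proc; the point is only that keeping a same-typed slot as padding leaves the offsets of later members untouched.

```c
#include <assert.h>
#include <stddef.h>

struct small_q { void *head; };                        /* original member */
struct big_q   { void *head; void *tail; int flags; }; /* expanded member */

/* Layout that existing modules were compiled against. */
struct obj_old {
    int            a;
    struct small_q q;
    int            b;
};

/* The old slot is kept as padding; the larger member goes last. */
struct obj_new {
    int            a;
    struct small_q q_pad;   /* same type, so size and alignment match */
    int            b;
    struct big_q   q;
};

/* Returns 1 when every pre-existing member kept its offset. */
int
layout_preserved(void)
{
    return (offsetof(struct obj_old, a) == offsetof(struct obj_new, a) &&
        offsetof(struct obj_old, q) == offsetof(struct obj_new, q_pad) &&
        offsetof(struct obj_old, b) == offsetof(struct obj_new, b));
}
```

Padding with an unrelated type such as a char array of the same size could change alignment and shift later members, which is why the sketch reuses the original type.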
rmacklem [Mon, 22 May 2017 21:52:06 +0000 (21:52 +0000)]
MFC: r317931
Fix mount_nfs so that it doesn't create mounttab entries for NFSv4 mounts.
The NFSv4 protocol doesn't use the Mount protocol, so it doesn't make sense
to add an entry for an NFSv4 mount to /var/db/mounttab. Also, r308871
modified umount so that it doesn't remove any entry created by mount_nfs.
rmacklem [Mon, 22 May 2017 19:57:20 +0000 (19:57 +0000)]
MFC: r317906
Fix the client side krpc from doing TCP reconnects for ERESTART from sosend().
When sosend() replies ERESTART in the client side krpc, it indicates that
the RPC message hasn't yet been sent and that the send queue is full or
locked while a signal is posted for the process.
Without this patch, this would result in a RPC_CANTSEND reply from
clnt_vc_call(), which would cause clnt_reconnect_call() to create a new
TCP transport connection. For most NFS servers, this wasn't a serious problem,
although it did imply retries of outstanding RPCs, which could possibly
have missed the DRC.
For an NFSv4.1 mount to AmazonEFS, this caused a serious problem, since
AmazonEFS often didn't retain the NFSv4.1 session and would reply with
NFS4ERR_BAD_SESSION. This implies to the client a crash/reboot which
requires open/lock state recovery.
Three options were considered to fix this:
- Return the ERESTART all the way up to the system call boundary and then
have the system call redone. This is fraught with risk, due to convoluted
code paths, asynchronous I/O RPCs etc. cperciva@ worked on this, but it
is still a work in progress and may not be feasible.
- Set SB_NOINTR for the socket buffer. This fixes the problem, but makes
the sosend() completely non interruptible, which kib@ considered
inappropriate. It also would break forced dismount when a thread
was blocked in sosend().
- Modify the retry loop in clnt_vc_call(), so that it loops for this case
for up to 15sec. Testing showed that the sosend() usually succeeded by
the 2nd retry. The extreme case observed was 111 loop iterations, or
about 100msec of delay.
This third alternative is what is implemented in this patch, since the
change is:
- localized
- straightforward
- forced dismount is not broken by it.
This patch has been tested by cperciva@ extensively against AmazonEFS.
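
The third alternative amounts to a bounded retry loop. The sketch below is not the clnt_vc_call() code: struct retry_ctx, the callbacks, and the fake transport are hypothetical and exist only to make the idea testable, and the real loop may pause between attempts.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef ERESTART
#define ERESTART (-1)   /* placeholder where userland headers omit it */
#endif

struct retry_ctx {
    int  (*try_send)(void *);   /* returns 0 or an errno value */
    long (*now_ms)(void *);     /* monotonic clock, milliseconds */
    void *arg;
};

/* Retry while the send reports ERESTART, for at most limit_ms. */
int
send_with_restart_retry(const struct retry_ctx *ctx, long limit_ms)
{
    long start;
    int error;

    assert(ctx != NULL && limit_ms > 0);
    start = ctx->now_ms(ctx->arg);
    for (;;) {
        error = ctx->try_send(ctx->arg);
        if (error != ERESTART)
            return (error);     /* success or a different error */
        if (ctx->now_ms(ctx->arg) - start >= limit_ms)
            return (error);     /* give up after the time limit */
    }
}

/* Fake transport for illustration: each attempt costs 1 ms. */
struct fake { int failures_left; long clock_ms; };

int
fake_send(void *arg)
{
    struct fake *f = arg;

    f->clock_ms += 1;
    if (f->failures_left > 0) {
        f->failures_left--;
        return (ERESTART);
    }
    return (0);
}

long
fake_now(void *arg)
{
    return (((struct fake *)arg)->clock_ms);
}
```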
davidcs [Mon, 22 May 2017 19:36:26 +0000 (19:36 +0000)]
MFC r318382
1. Move Rx Processing to fp_taskqueue(). With this CPU utilization for
processing interrupts drops to around 1% for 100G and under 1% for
other speeds.
2. Use sysctls for TRACE_LRO_CNT and TRACE_TSO_PKT_LEN
3. remove unused mtx tx_lock
4. bind taskqueue kernel thread to the appropriate cpu core
5. when tx_ring is full, stop further transmits till at least 1/16th of
the Tx Ring is empty. In our case 1K entries. Also if there are
rx_pkts to process, put the taskqueue thread to sleep for 100ms,
before enabling interrupts.
6. Use rx_pkt_threshold of 128.
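
The transmit throttle in item 5 reduces to a simple threshold test. tx_may_resume() is a hypothetical helper, and the 16K ring size is inferred from the "1K entries" figure rather than stated in the commit.

```c
#include <assert.h>

/* Resume transmits only once at least 1/16th of the ring is free. */
int
tx_may_resume(unsigned ring_size, unsigned free_entries)
{
    assert(ring_size >= 16);
    return (free_entries >= ring_size / 16);
}
```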
MSDOS and Windows GNU grep uses -u to mean "print byte offsets as if
running on an UNIX system." The option has no effect on systems that
do not use CRLF line endings.
hselasky [Mon, 22 May 2017 08:19:08 +0000 (08:19 +0000)]
MFC r318531:
mlx4: Use the CQ quota for SRIOV when creating completion EQs
When creating EQs to handle CQ completion events for the PF or for
VFs, we create enough EQE entries to handle completions for the max
number of CQs that can use that EQ.
When SRIOV is activated, the max number of CQs a VF (or the PF) can
obtain is its CQ quota (determined by the Hypervisor resource
tracker). Therefore, when creating an EQ, the number of EQE entries
that the VF should request for that EQ is the CQ quota value (and not
the total number of CQs available in the firmware).
Under SRIOV, the PF also must use its CQ quota, because the resource
tracker also controls how many CQs the PF can obtain.
Using the firmware total CQs instead of the CQ quota when creating EQs
resulted in wasting MTT entries, due to allocating more EQEs than were
needed.
ngie [Mon, 22 May 2017 06:06:48 +0000 (06:06 +0000)]
MFC r315793:
intro(3): fix markup
- Use `Em` with `.It` macro when referring to other libraries, instead of
`Xr`.
- Use `.Em` instead of `.Xr` when referring to libraries.
- Remove commented out lines.
hselasky [Fri, 19 May 2017 12:53:50 +0000 (12:53 +0000)]
MFC r313555:
Flexible and asymmetric allocation of EQs and MSI-X vectors for PF/VFs.
Previously, the mlx4 driver queried the firmware in order to get the
number of supported EQs. Under SRIOV, since this was done before the
driver notified the firmware how many VFs it actually needs, the
firmware had to take into account a worst case scenario and always
allocated four EQs per VF, where one was used for events while the
others were used for completions. Now, when the firmware supports the
asymmetric allocation scheme, denoted by exposing num_sys_eqs > 0 (-->
MLX4_DEV_CAP_FLAG2_SYS_EQS), we use the QUERY_FUNC command to query
the firmware before enabling SRIOV. Thus we can get more EQs and MSI-X
vectors per function. Moreover, when running in the new
firmware/driver mode, the limitation that the number of EQs should be
a power of two is lifted.
Obtained from: Linux (dual BSD/GPLv2 licensed)
Submitted by: Dexuan Cui @ microsoft . com
Differential Revision: https://reviews.freebsd.org/D8867
Sponsored by: Mellanox Technologies
hselasky [Fri, 19 May 2017 12:39:35 +0000 (12:39 +0000)]
MFC r313556:
Change mlx4 QP allocation scheme.
When using Blue-Flame, BF, the QPN overrides the VLAN, CV, and SV
fields in the WQE. Thus, BF may only be used for QPNs with bits 6,7
unset.
The current ethernet driver code reserves a TX QP range with 256b
alignment.
This is wrong because if there are more than 64 TX QPs in use, QPNs >=
base + 65 will have bits 6/7 set.
This problem is not specific for the Ethernet driver, any entity that
tries to reserve more than 64 BF-enabled QPs should fail. Also, using
ranges is not necessary here and is wasteful.
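
The eligibility rule can be expressed as a mask test. This sketch assumes zero-based bit numbering, so bits 6 and 7 correspond to the mask 0xC0; qpn_bf_eligible() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

#define QPN_BF_BLOCKED_BITS 0xC0u   /* bits 6 and 7, zero-based (assumed) */

/* A QPN may use Blue-Flame only when bits 6 and 7 are both clear. */
int
qpn_bf_eligible(uint32_t qpn)
{
    return ((qpn & QPN_BF_BLOCKED_BITS) == 0);
}
```

Within a 256-aligned range only the low-numbered QPNs pass this test, which is why a contiguous range cannot supply an arbitrary number of BF-enabled QPs.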
The new mechanism introduced here will support reservation for "Eth
QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs
(when hypervisors support WC in VMs). The flow we use is:
1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation,
and request "BF enabled QPs" if BF is supported for the function
2. In the ALLOC_RES FW command, change param1 to:
a. param1[23:0] - number of QPs
b. param1[31-24] - flags controlling QPs reservation
Bit 31 refers to Eth blueflame supported QPs. Those QPs must have bits
6 and 7 unset in order to be used in Ethernet.
Bits 24-30 of the flags are currently reserved.
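
The param1 layout can be sketched as a pair of pack/unpack helpers. The macro and function names are hypothetical; only the bit positions come from the description above.

```c
#include <assert.h>
#include <stdint.h>

#define QP_COUNT_MASK   0x00FFFFFFu   /* param1[23:0]: number of QPs */
#define QP_FLAG_ETH_BF  (1u << 31)    /* bit 31: Eth BF-capable QPs */

/* Combine a QP count and reservation flags into param1. */
uint32_t
pack_param1(uint32_t nqps, uint32_t flags)
{
    assert((nqps & ~QP_COUNT_MASK) == 0);
    assert((flags & QP_COUNT_MASK) == 0);
    return (flags | nqps);
}

/* Extract the QP count from param1. */
uint32_t
param1_count(uint32_t param1)
{
    return (param1 & QP_COUNT_MASK);
}
```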
When a function tries to allocate a QP, it states the required
attributes for this QP. Those attributes are considered "best-effort".
If an attribute, such as Ethernet BF enabled QP, is a must-have
attribute, the function has to check that attribute is supported
before trying to do the allocation.
In a lower layer of the code, mlx4_qp_reserve_range masks out the bits
which are unsupported. If SRIOV is used, the PF validates those
attributes and masks out unsupported attributes as well. In order to
notify VFs which attributes are supported, the VF uses QUERY_FUNC_CAP
command. This command's mailbox is filled by the PF, which notifies
which QP allocation attributes it supports.
Obtained from: Linux (dual BSD/GPLv2 licensed)
Submitted by: Dexuan Cui @ microsoft . com
Differential Revision: https://reviews.freebsd.org/D8868
Sponsored by: Mellanox Technologies
rpokala [Thu, 18 May 2017 23:41:34 +0000 (23:41 +0000)]
Persistently store NIC's hardware MAC address, and add a way to retrieve it
jhb pointed out that (struct ifnet) is part of the network driver KBI, and
thus the offsets of internal fields must not change. Therefore, move the new
"if_hw_addr" field to the end, and consume one of the "if_pspare"s; that's
what they're there for. Because netmap on stable/10 uses "if_pspare[0]", the
new field replaces the *last* element of that array; that way,
offsetof(if_pspare) is unchanged compared to before r318430.
marius [Thu, 18 May 2017 21:00:53 +0000 (21:00 +0000)]
MFC: r318282
- Unlike as in the PCI case, when attached to ACPI, Intel Bay Trail
and Braswell eMMC and SDXC controllers share the same IDs. Like in
the PCI case, Braswell eMMC needs the SDHCI_QUIRK_DATA_TIMEOUT_1MHZ
quirk (see r311794 for the corresponding change to the sdhci(4)
PCI front-end), though. However, due to the shared ACPI IDs, this
is trickier to do.
- Intel Apollo Lake eMMC and SDXC controllers are affected by the
APL18 ("Using 32-bit Addressing Mode With SD/eMMC Controller May
Lead to Unpredictable System Behavior") silicon bug. When this
erratum hits, typically both SDHCI and XHCI controllers wedge.
According to Intel, using ADMA2 with 64-bit addressing and 96-bit
descriptors serves as a workaround. Until such times when sdhci(4)
has ADMA2 support, flag DMA as broken for affected interfaces.
This turns out to work around the problem, too, at the cost of
performance.
- In the sdhci(4) ACPI front-end, probe the Intel Apollo Lake eMMC
and SDXC controllers, too.
marius [Thu, 18 May 2017 20:46:27 +0000 (20:46 +0000)]
MFC: r315598
o Add support for eMMC DDR bus speed mode up to 52 MHz to sdhci(4)
and mmc(4). Given that support for DDR52 is not denoted by SDHCI
capability registers, availability of that timing is indicated by
a new quirk SDHCI_QUIRK_MMC_DDR52 and only enabled for Intel SDHCI
controllers so far.
Compared to 50 MHz at SDR high speed typically yielding ~45 MB/s
read throughput with the eMMC chips tested, read performance goes
up to ~80 MB/s at DDR52.
As a side-effect, this change also fixes communication with some
eMMC devices at SDR high speed mode due to the signaling voltage
and UHS bits in the SDHCI controller no longer being left in an
inappropriate state.
o In sdhci(4), add two tunables hw.sdhci.quirk_clear as well as
hw.sdhci.quirk_set, which (when hooked up in the front-end)
allow to set/clear sdhci(4) quirks for debugging and testing
purposes. However, especially for SDHCI controllers on the
PCI bus which have no specific support code so far and, thus,
are picked up as generic SDHCI controllers, hw.sdhci.quirk_set
allows for setting the necessary quirks (if required).
o In mmc(4), check and handle the return values of some more
function calls instead of assuming that everything went right.
In case failures actually are not problematic, indicate that
by casting the return value to void.
rpokala [Wed, 17 May 2017 22:29:25 +0000 (22:29 +0000)]
MFC r318160, 318176: Persistently store NIC's hardware MAC address, and add
a way to retrieve it
NOTE: Due to restructuring, the merges didn't apply cleanly; the resulting
change is almost identical to what went into stable/11, but in some cases in
different locations.
The MAC address reported by `ifconfig ${nic} ether' does not always match
the address in the hardware, as reported by the driver during attach. In
particular, NICs which are components of a lagg(4) interface all report the
same MAC.
When attaching, the NIC driver passes the MAC address it read from the
hardware as an argument to ether_ifattach(). Keep a second copy of it, and
create ioctl(SIOCGHWADDR) to return it. Teach `ifconfig' to report it along
with the active MAC address.
sephe [Wed, 17 May 2017 02:40:06 +0000 (02:40 +0000)]
MFC 318136
hyperv/vmbus: Reorganize vmbus device tree
For GEN1 Hyper-V, vmbus is attached to pcib0, which contains the
resources for PCI passthrough and SR-IOV. There is no
acpi_syscontainer0 on GEN1 Hyper-V.
For GEN2 Hyper-V, vmbus is attached to acpi_syscontainer0, which
contains the resources for PCI passthrough and SR-IOV. There is
no pcib0 on GEN2 Hyper-V.
The ACPI VMBUS device now only holds its _CRS, which is empty as
of this commit; its existence is mainly for upward compatibility.
Device tree structure is suggested by jhb@.
Tested-by: dexuan@
Collaborated-with: dexuan@
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D10565
cy [Wed, 17 May 2017 01:38:32 +0000 (01:38 +0000)]
MFC r318281:
Separate the ipfilter function/static string from the error with a
colon (:) in error messages to assist the user in parsing out the error
from where or which object the error message refers to.
dim [Tue, 16 May 2017 18:54:25 +0000 (18:54 +0000)]
MFC r318259:
Silence a -Wunused warning about the junk variable being used to raise
an inexact floating point exception. The variable cannot be eliminated,
unfortunately, otherwise the desired addition triggering the exception
will be emitted neither by clang, nor by gcc.
brooks [Mon, 15 May 2017 23:13:49 +0000 (23:13 +0000)]
MFC r317660, r317710
r317660:
Support clnt_raw's use of FD_SETSIZE as a fake file descriptor.
Accomplish this by allocating space for it in __svc_xports and allowing
it to be registered. The failure to allocate space was causing an
out-of-bounds read in svc_getreq_common(). The failure to register
caused PR 211804.
brooks [Mon, 15 May 2017 22:50:54 +0000 (22:50 +0000)]
MFC r317845-r317846
r317845:
Provide a freebsd32 implementation of sigqueue()
The previous misuse of sys_sigqueue() was sending random register or
stack garbage to 64-bit targets. The freebsd32 implementation preserves
the sival_int member of value when signaling a 64-bit process.
Document the mixed ABI implementation of union sigval and the
incompatibility of sival_ptr with pointer integrity schemes.
davidcs [Mon, 15 May 2017 18:21:36 +0000 (18:21 +0000)]
MFC r317996
Fix bug where MTX_DEF lock was held while taskqueue_drain() was invoked.
Check IFF_DRV_RUNNING flag is set prior to calling ql_hw_set_multi()
marius [Sun, 14 May 2017 14:21:11 +0000 (14:21 +0000)]
MFC: r317982
- Also outside of the KOBJOPLOOKUP macro - which in turn is used by
the code auto-generated for *.m - kobj_lookup_method(9) is useful;
for example in back-ends or base class device drivers in order to
determine whether a default method has been overridden. Thus, allow
for the kobj_method_t pointer argument - used by KOBJOPLOOKUP in
order to update the cache entry - of kobj_lookup_method(9), to be
NULL. Actually, that pointer is redundant as it's just set to the
same kobj_method_t that the kobj_lookup_method(9) function returns
in the first place, but probably it serves to reduce the number of
instructions generated for KOBJOPLOOKUP.
- For the same reason, move updating kobj_lookup_{hits,misses} (if
KOBJ_STATS is defined) from kobj_lookup_method(9) to KOBJOPLOOKUP.
As a side-effect, this gets rid of the convoluted approach of always
incrementing kobj_lookup_hits in KOBJOPLOOKUP and then in case of
a cache miss, decrementing it in kobj_lookup_method(9) again.
marius [Sun, 14 May 2017 14:04:32 +0000 (14:04 +0000)]
MFC: r317578
Fix a bug introduced as part of r287726 (MFCed to stable/10 in
r292789); use the right device_t for determining the softc of
the bridge in psycho_route_interrupt(). [1]
While at it, update the corresponding comment that the code in
question is also necessary for U30s in addition to E450s (a fact
that has been known for ages).
gjb [Sun, 14 May 2017 10:15:04 +0000 (10:15 +0000)]
MFC r318190:
Update release/scripts/atlas-upload.sh to account for API changes
made recently by Atlas Hashicorp. The data returned from GET and
POST requests has changed, which caused a number of regex patterns
to fail to be properly identified as 'success' or 'failure', which
ended up in upload/publish failures.
rmacklem [Sun, 14 May 2017 00:23:27 +0000 (00:23 +0000)]
MFC: r317576
Modify the NFSv4.1/pNFS client to ask for a maximum length of layout.
The code specified the length of a layout as INT64_MAX instead of
UINT64_MAX. This could result in getting a layout for less than the
full file for extremely large files. Although having little practical
effect, this patch corrects this in the code.
Detected during recent testing of the pNFS server.
marius [Thu, 11 May 2017 21:13:06 +0000 (21:13 +0000)]
MFC: r315431
- Add macros for the content of SDHCI_ADMA_ERR and SDHCI_HOST_CONTROL2
registers.
- Add slot type capability bits. These bits should allow recognizing
removable card slots, embedded cards and shared buses (shared bus
supposedly is always comprised of non-removable cards).
- Dump CAPABILITIES2, ADMA_ERR, HOST_CONTROL2 and ADMA_ADDRESS_LO
registers in sdhci_dumpregs().
- The drive type support flags in the CAPABILITIES2 register are for
drive types A,C,D, drive type B is the default setting (value 0) of
the drive strength field in the SDHCI_HOST_CONTROL2 register.
o Move the DRIVER_MODULE() statements that declare mmc(4) to be a child
of the various bridge drivers out of dev/mmc.c and into the bridge
drivers.
o Add ACPI platform support for SDHCI driver.
o Fix some overly long lines, whitespace and other bugs according to
style(9) as well as spelling etc. in mmc(4), mmcsd(4) and sdhci(4).
o In the mmc(4) bridges and sdhci(4) (bus) front-ends:
- Remove redundant assignments of the default bus_generic_print_child
device method,
- use DEVMETHOD_END,
- use NULL instead of 0 for pointers.
o Trim/adjust includes.
o Add and use a MMC_DECLARE_BRIDGE macro for declaring mmc(4) bridges
as kernel drivers and their dependency onto mmc(4).
o Add support for eMMC "partitions". Besides the user data area, i. e.
the default partition, eMMC v4.41 and later devices can additionally
provide up to:
1 enhanced user data area partition
2 boot partitions
1 RPMB (Replay Protected Memory Block) partition
4 general purpose partitions (optionally with an enhanced or extended
attribute)
Besides simply subdividing eMMC devices, some Intel NUCs having UEFI
code in the boot partitions etc., another use case for the partition
support is the activation of pseudo-SLC mode, which manufacturers of
eMMC chips typically associate with the enhanced user data area and/
or the enhanced attribute of general purpose partitions.
CAVEAT EMPTOR: Partitioning eMMC devices is a one-time operation.
o Now that properly issuing CMD6 is crucial (so data isn't written to
the wrong partition for example), make a step into the direction of
correctly handling the timeout for these commands in the MMC layer.
Also, do a SEND_STATUS when CMD6 is invoked with an R1B response as
recommended by relevant specifications.
o Add an IOCTL interface to mmcsd(4); this is sufficiently compatible
with Linux so that the GNU mmc-utils can be ported to and used with
FreeBSD (note that due to the remaining deficiencies outlined above
SANITIZE operations issued by/with `mmc` currently most likely will
fail). These latter have been added to ports as sysutils/mmc-utils.
Among others, the `mmc` tool of mmc-utils allows for partitioning
eMMC devices (tested working).
o For devices following the eMMC specification v4.41 or later, year 0
is 2013 rather than 1997; so correct this for assembling the device
ID string properly.
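
The year correction can be sketched as a base-year selection. This is a simplification that follows only the statement above; mmc_mfg_year() and its flag parameter are hypothetical, and how the real driver detects a v4.41 or later device is not shown.

```c
#include <assert.h>

/*
 * Decode the 4-bit manufacturing year field of the device ID.
 * Year 0 is 1997 for older devices and 2013 for eMMC v4.41 or later.
 */
int
mmc_mfg_year(unsigned year_field, int emmc_4_41_or_later)
{
    assert(year_field < 16);
    return ((emmc_4_41_or_later ? 2013 : 1997) + (int)year_field);
}
```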
o Let mmcsd.ko depend on mmc.ko. Additionally, bump MMC_VERSION as at
least for some of the above a matching pair is required.
jhb [Thu, 11 May 2017 17:26:34 +0000 (17:26 +0000)]
MFC 313407,313449: Copy ELF machine/flags from binaries to core dumps.
313407:
Copy the e_machine and e_flags fields from the binary into an ELF core dump.
In the kernel, cache the machine and flags fields from ELF header to use in
the ELF header of a core dump. For gcore, copy these fields over from
the ELF header in the binary.
This matters for platforms which encode ABI information in the flags field
(such as o32 vs n32 on MIPS).
313449:
Trim trailing whitespace (mostly introduced in r313407).
jhb [Thu, 11 May 2017 04:29:20 +0000 (04:29 +0000)]
MFC 313999: Consolidate statements to initialize files.
Previously, the first lines of various generated files from system call
tables were generated in two sections. Some of the initialization was
done in BEGIN, and the rest was done when the first line was encountered.
The main reason for this split before r313564 was that most of the
initialization done in the second section depended on the $FreeBSD$ tag
extracted from the system call table. Now that the $FreeBSD$ tag is no
longer used, consolidate all of the file initialization in the BEGIN
section.
This change was tested by confirming that the content of generated files
did not change.
jhb [Wed, 10 May 2017 23:09:17 +0000 (23:09 +0000)]
MFC 313564:
Drop the "created from" line from files generated by makesyscalls.sh.
This information is less useful when the generated files are included in
source control along with the source. If needed it can be reconstructed
from the $FreeBSD$ tag in the generated file. Removing this information
from the generated output permits committing the generated files along
with the change to the system call master list without having inconsistent
metadata in the generated files.
- Allow overriding the FDT slicer with a custom slicer.
- Teach the flashmap code about SPI flash.
- Allow different slicers for different flash types to be registered
with geom_flashmap(4) and teach it about MMC for slicing enhanced
user data area partitions. The FDT slicer still is the default for
CFI, NAND and SPI flash on FDT-enabled platforms.
- In addition to a device_t, also pass the name of the GEOM provider
in question to the slicers as a single device may provide more than
one provider.
- Build a geom_flashmap.ko.
- Use MODULE_VERSION() so other modules can depend on geom_flashmap(4).
- Remove redundant/superfluous GEOM routines that either do nothing
or provide/just call default GEOM (slice) functionality.
- Trim/adjust includes
marius [Wed, 10 May 2017 20:46:59 +0000 (20:46 +0000)]
MFC: r311817
In dummynet(4), random chunks of memory are cast to struct dn_*,
potentially leading to fatal unaligned accesses on architectures with
strict alignment requirements. This change fixes dummynet(4) as far
as accesses to 64-bit members of struct dn_* are concerned, tripping
up on sparc64 with accesses to 32-bit members happening to be correctly
aligned there. In other words, this only fixes the tip of the iceberg;
larger parts of dummynet(4) still need to be rewritten in order to
properly work on all of !x86.
In principle, considering the amount of code in dummynet(4) that needs
this erroneous pattern corrected, an acceptable workaround would be to
declare all struct dn_* packed, forcing compilers to do byte-accesses
as a side-effect. However, given that the structs in question aren't
laid out well either, this would break ABI/KBI.
While at it, replace all existing bcopy(9) calls with memcpy(9) for
performance reasons, as there is no need to check for overlap in these
cases.
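
The general remedy for this class of problem is to copy bytes into a properly aligned object instead of dereferencing a cast pointer. The helper below is a generic illustration, not code from dummynet(4); load_u64() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a 64-bit value from a possibly unaligned address.
 * memcpy() makes no alignment assumption about its source, whereas
 * *(const uint64_t *)p may fault on strict-alignment machines.
 */
uint64_t
load_u64(const void *p)
{
    uint64_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof(v));
    return (v);
}
```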
marius [Wed, 10 May 2017 20:29:01 +0000 (20:29 +0000)]
MFC: r310712
- Use correct offsets into the keys set array. As the elements of this
zero-length array are dynamically sized at run-time based on the use
of hints, compilers can't be expected to figure out these offsets on
their own. [1]
- Fix incorrect comparison in cmp_nans(). [2]
PR: 204571 [1], 202301 [2]
Submitted by: David Binderman [2]
marius [Wed, 10 May 2017 20:12:23 +0000 (20:12 +0000)]
MFC: r293642
- Add support for Advantech PCI-1602 Rev. B1 and PCI-1603 cards. [1]
- Add a description of Advantech PCI-1602 Rev. A boards. [1]
- Properly set up REG_ACR also for PCI-1602 Rev. A based on what the
Advantech-supplied Linux driver does.
- Additionally use the macros of <dev/ic/ns16550.h> to replace existing
magic values and get rid of trivial comments.
- Fix the style of some comments.
PR: 205359 [1]
Submitted by: Jan Mikkelsen (original patch) [1]
ken [Wed, 10 May 2017 18:59:20 +0000 (18:59 +0000)]
MFC r317740:
Correct loop mode CRN resets to adhere to FCP-4 section 4.10
Prior to this change, the CRN (Command Reference Number) is reset on any
firmware LIP, LOOP DOWN, or LOOP RESET event in violation of FCP-4 which
specifies that the CRN should only be reset in response to a LIP Reset
(LIPyx) primitive. FCP-4 also indicates PLOGI/LOGO and PRLI/PRLO ELS
actions as conditions for resetting the CRN for the associated initiator
port.
These violations manifest themselves when the HBA is removed from the
loop, or a target device is removed (especially during an outstanding
command) without power cycling. If the HBA and the target device
determine upon re-establishing the loop that no PLOGI or PRLI is
required, and the target does not issue a LIPxy to the initiator, the
CRN for the target will have been improperly reset by the isp driver. As
a result, the target port will silently ignore all FCP commands issued
during the device probe (which will time out) preventing the device from
attaching.
This change corrects the CRN reset behavior in response to loop state
changes and introduces CRN resets for the above mentioned ELS actions
as encountered through async PDB change events.
This change also adds cleanup of outstanding commands in isp_loop_dead()
that was previously missing.
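
The corrected policy can be summarized as a predicate over event types. The enum and function below are hypothetical and do not mirror the isp(4) async codes; they only restate which events should reset the CRN according to the description above.

```c
#include <assert.h>

/* Hypothetical event classes, not the driver's actual constants. */
enum loop_event {
    EV_LIP,             /* plain LIP */
    EV_LIP_RESET,       /* LIP Reset primitive */
    EV_LOOP_DOWN,
    EV_LOOP_UP,
    EV_ELS_LOGIN_LOGOUT /* PLOGI/LOGO or PRLI/PRLO */
};

/* Only a LIP Reset or an ELS login/logout resets the CRN. */
int
should_reset_crn(enum loop_event ev)
{
    switch (ev) {
    case EV_LIP_RESET:
    case EV_ELS_LOGIN_LOGOUT:
        return (1);
    default:
        return (0);
    }
}
```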
sys/dev/isp/isp.c
Add the last login state to debug output when syncing the pdb
sys/dev/isp/isp_freebsd.c
Replace binary statement setting aborted ccb status in
isp_watchdog() with the XS_SETERR macro used elsewhere
In isp_loop_dead(), abort or complete pending commands as done
in isp_watchdog()
In isp_async(), segregate the ISPASYNC_LOOP_RESET action from
ISPASYNC_LIP, ISPASYNC_LOOP_DOWN, and ISPASYNC_LOOP_UP
fallthroughs, and only reset the CRN in the RESET case. Also add
checks to handle false LOOP RESET actions that do not have a
proper associated LIP primitive, and log the primitive in the
debug messages
In isp_async(), remove the goto from ISP_ASYNC_DEV_STAYED, and
only reset the CRN in the DEV_CHANGED action
In isp_async(), when processing an ISPASYNC_CHANGE_PDB status,
reset CRN(s) for the associated nphdl (or all ports) if the
change reason is some form of ELS login/logout. Also remove
assignment to fc since it is not used in the scope
sys/dev/isp/ispmbox.h
Add macro definition for the global N-Port handle, and correct a
macro typo 'PDB24XX_AE_PRLI_DONJE'
sys/dev/isp/ispvar.h
Add macros FCP_AL_DA_ALL, FCP_AL_PA, and FCP_IS_DEST_ALPD for
more legible code when determining if an AL_PD port matches the
portid for a given struct fcparam* by value or by virtue of the
AL_PD port being 0xFF
ken [Wed, 10 May 2017 15:20:39 +0000 (15:20 +0000)]
MFC r317775:
Fix error recovery behavior in the pass(4) driver.
After FreeBSD SVN revision 236814, the pass(4) driver changed from
only doing error recovery when the CAM_PASS_ERR_RECOVER flag was
set on a CCB to sometimes doing error recovery if the passed in
retry count was non-zero.
Error recovery would happen if two conditions were met:
1. The error recovery action was simply a retry. (Which is most
cases.)
2. The retry_count is non-zero. (Which happened a lot because of
cut-and-pasted code.)
This explains a bug I noticed with camcontrol:
# camcontrol tur da34 -v
Unit is ready
# camcontrol reset da34
Reset of 1:172:0 was successful
At this point, there should be a Unit Attention:
# camcontrol tur da34 -v
Unit is ready
No Unit Attention.
Try it again:
# camcontrol reset da34
Reset of 1:172:0 was successful
Now set the retry_count to 0 for the TUR:
# camcontrol tur da34 -v -C 0
Unit is not ready
(pass42:mps1:0:172:0): TEST UNIT READY. CDB: 00 00 00 00 00 00
(pass42:mps1:0:172:0): CAM status: SCSI Status Error
(pass42:mps1:0:172:0): SCSI status: Check Condition
(pass42:mps1:0:172:0): SCSI sense: UNIT ATTENTION asc:29,2 (SCSI bus reset
occurred)
(pass42:mps1:0:172:0): Field Replaceable Unit: 2
There is the unit attention. camcontrol(8) has a default
retry_count of 1, in case someone sets the -E flag without
setting -C.
The CAM_PASS_ERR_RECOVER behavior was only broken with the
CAMIOCOMMAND ioctl, which is the synchronous pass(4) API. It has
worked as intended (error recovery is only done when the flag
is set) in the asynchronous API (CAMIOQUEUE ioctl).
sys/cam/scsi/scsi_pass.c:
In passsendccb(), when calling cam_periph_runccb(), only
specify the error routine when CAM_PASS_ERR_RECOVER is set.
share/man/man4/pass.4:
Document that CAM_PASS_ERR_RECOVER is needed to enable
error recovery.
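
The fix boils down to choosing the error routine from the flag alone. The sketch below uses stand-in names (PASS_ERR_RECOVER, select_error_routine(), sample_recover()) and does not reproduce the cam_periph_runccb() interface.

```c
#include <assert.h>
#include <stddef.h>

#define PASS_ERR_RECOVER 0x1u   /* stand-in for CAM_PASS_ERR_RECOVER */

typedef int (*error_fn)(int);

/* Example recovery routine used only for illustration. */
int
sample_recover(int error)
{
    return (error);
}

/*
 * Hand back the recovery routine only when the caller asked for error
 * recovery; the retry count no longer plays a part in the decision.
 */
error_fn
select_error_routine(unsigned flags, error_fn recover)
{
    assert(recover != NULL);
    return ((flags & PASS_ERR_RECOVER) ? recover : NULL);
}
```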
rmacklem [Wed, 10 May 2017 01:39:21 +0000 (01:39 +0000)]
MFC: r317465
Fix handling of a NFSv4.1 callback reply from the session cache.
The nfsv4_seqsession() call returns NFSERR_REPLYFROMCACHE when it has a
reply in the session, due to a requestor retry. The code erroneously
assumed a return of 0 for this case. This patch fixes this and adds
a KASSERT(). This would be an extremely rare occurrence. It was found
during code inspection during the pNFS server development.
dim [Tue, 9 May 2017 16:58:08 +0000 (16:58 +0000)]
MFC r317888 and two upstream prerequisites:
Pull in r227097 from upstream libc++ trunk (by Marshall Clow):
Fix PR21428. Buffer was one byte too small in octal formatting case.
Add test
Pull in r268009 from upstream libc++ trunk (by Eric Fiselier):
Fix PR21428 for long. Buffer was one byte too small in octal
formatting case. Rename previously added test
Pull in r302362 from upstream libc++ trunk (by me):
Ensure showbase does not overflow do_put buffers
Summary:
In https://bugs.freebsd.org/207918, Daniel McRobb describes how using
std::showbase with ostreams can cause truncation of unsigned long long
when output format is octal. In fact, this can even happen with
unsigned int and unsigned long.
To ensure this does not happen, add one additional character to the
do_put buffers if std::showbase is on. Also add a test case.
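
The one-character difference is easy to observe with printf-style formatting, where the '#' flag plays the role of std::showbase for octal output. This is an analogy in C, not the libc++ do_put code; octal_width() is a hypothetical helper.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Number of characters needed to print v in octal. */
int
octal_width(unsigned long long v, int showbase)
{
    int n;

    /* snprintf with a zero-sized buffer only reports the length. */
    n = showbase ? snprintf(NULL, 0, "%#llo", v) :
        snprintf(NULL, 0, "%llo", v);
    assert(n > 0);
    return (n);
}
```

A buffer sized for the digits alone is therefore one character short once the base prefix is requested.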
brooks [Tue, 9 May 2017 16:29:06 +0000 (16:29 +0000)]
MFC r317707:
Correct an out-of-bounds read in regcomp when the RE is bad.
When passed the invalid regular expression "a**", the error is
eventually detected and seterr() is called. It sets p->error
appropriately and p->next and p->end to nuls which is a never used char
nuls[10] which is zeros due to .bss initialization. Unfortunately,
p_ere_exp() and p_simp_re() both have fall through cases where they set
the error, decrement p->next and access it which means a read from
whatever .bss variable comes before nuls.
Found with regex_test:repet_multi and CHERI bounds checking.