Keep state incorrectly assumes keep frags. This is counter to the
ipfilter man pages. This also currently restricts keep frags to only when
keep state is used, which is redundant because keep state currently
assumes keep frags. This commit fixes this.
To the user this change means that to maintain the current behaviour
one must add keep frags to any ipfilter keep state rule (as documented
in the man pages).
This patch also allows the flexability to specify and use keep frags
separate from keep state, as documented in an example in ipf.conf.5,
instead of the currently broken behaviour.
MFC: r316692
Set initial values for nfsstatfs in the NFSv4 client.
The AmazonEFS NFSv4.1 server does not support the FILES_FREE and FILES_TOTAL
attributes. As such, an NFSv4.1 mount to the server would return garbage
for these values. This patch initializes the fields of the nfsstatfs structure,
so that "df" and friends will at least return consistent bogus values.
MFC: r316669
Avoid starvation of the server crash recovery thread for the NFSv4 client.
This patch gives a requestor of the exclusive lock on the client state
in the NFSv4 client priority over shared lock requestors. This avoids
the server crash recovery thread being starved out by other threads doing
RPCs.
MFC: r316667
Fix the NFSv4 client hndling of a stale write verifier in the Commit operation.
When the NFSv4 client Commit operation encountered a stale write verifier,
it erroneously mapped that to EIO. This could have caused recently written
data to be lost when a server crashes/reboots between an UNSTABLE write
and the subsequent commit. This patch fixes this.
The bug was only for the NFSv4 client and did not affect NFSv3.
MFC: r316666
Fix the NFSv4.1 client for NFSERR_BADSESSION recovery via ReclaimComplete.
For the ReclaimComplete operation, the RPC layer should not loop on
NFSERR_BADSESSION. If it does, the recovery thread (nfscl) can get stuck
looping and will not do a recovery.
This patch fixes it so it does not loop. This bug only affects NFSv4.1 and
only when a server reboots.
MFC: r316655
Fix parsing failure for NFSv4 Setattr operation for failed case.
If an operation that preceeds a Setattr in an NFSv4 compound fails,
there is no bitmap of attributes to parse. Without this patch, the
parsing would fail and return EBADRPC instead of the correct failure
error. This could break recovery from a server crash/reboot.
MFC r316699:
Do not adjust interface MTU automatically. Leave this task to the system
administrator.
Before r274246 interface MTU was adjusted only when GRE key is configured.
The r274246 has changed this behavior to automatically adjust MTU when any
option, that changes the size of GRE header is configured.
This patch removes automatic MTU adjustment from if_gre(4) and if_me(4),
and restores the behavior that was prior to r274246.
MFC: r310491
Fix NFSv4.1 client recovery from NFS4ERR_BAD_SESSION errors.
For most NFSv4.1 servers, a NFS4ERR_BAD_SESSION error is a rare failure
that indicates that the server has lost session/open/lock state.
However, recent testing by cperciva@ against the AmazonEFS server found
several problems with client recovery from this due to it generating this
failure frequently.
Briefly, the problems fixed are:
- If all session slots were in use at the time of the failure, some processes
would continue to loop waiting for a slot on the old session forever.
- If an RPC that doesn't use open/lock state failed with NFS4ERR_BAD_SESSION,
it would fail the RPC/syscall instead of initiating recovery and then
looping to retry the RPC.
- If a successful reply to an RPC for an old session wasn't processed
until after a new session was created for a NFS4ERR_BAD_SESSION error,
it would erroneously update the new session and corrupt it.
- The use of the first element of the session list in the nfs mount
structure (which is always the current metadata session) was slightly
racey. With changes for the above problems it became more racey, so all
uses of this head pointer was wrapped with a NFSLOCKMNT()/NFSUNLOCKMNT().
- Although the kernel malloc() usually allocates more bytes than requested
and, as such, this wouldn't have caused problems, the allocation of a
session structure was 1 byte smaller than it should have been.
(Null termination byte for the string not included in byte count.)
There are probably still problems with a pNFS data server that fails
with NFS4ERR_BAD_SESSION, but I have no server that does this to test
against (the AmazonEFS server doesn't do pNFS), so I can't fix these yet.
Although this patch is fairly large, it should only affect the handling
of NFS4ERR_BAD_SESSION error replies from an NFSv4.1 server.
Thanks go to cperciva@ for the extension testing he did to help isolate/fix
these problems.
andrew [Mon, 24 Apr 2017 16:49:30 +0000 (16:49 +0000)]
MFC r302788, r303026, r305471
r302788:
Fix the type used to hold the value returned from getopt. On arm64 char is
unsigned so will never be -1.
r303026:
Add missing flags from acpidump. These are defined in the header, but not
printed. The HW_REDUCED flag is useful as it should be set on arm64 to
comply with the ARM Server Base Boot Requirements.
r305471:
Teach acpidump how to parse ACPI 5.1 tables found on the development
ThunderX units in the netperf cluster.
CID 1229913 Fix output of "camcontrol persist -i report_capabilities".
The reported Persistent Reservation Types were wrong in all
cases.
CID 1356029 Annotate the code so Coverity will know that this is a false
positive.
CID 1366830 Fix a memory leak in "camcontrol timestamp -s"
CID 1366832 Fix a segfault that could be caused by bad drive firmware
Also, fix the man page entry for the "camcontrol epc state" command to match
what the code does.
r316342:
Consolidate random sleeps in periodic scripts
Multiple periodic scripts sleep for a random amount of time in order to
mitigate the thundering herd problem. This is bad, because the sum of
multiple uniformly distributed random variables approaches a normal
distribution, so the problem isn't mitigated as effectively as it would be
with a single sleep.
This change creates a single configurable anticongestion sleep. periodic
will only sleep if at least one script requires it, and it will never sleep
more than once per invocation. It also won't sleep if periodic was run
interactively, fixing an unrelated longstanding bug.
MFC r316677: Do not register in CTL portal groups without portals.
From config synthax point of view such portal groups are not incorrect,
but they are useless since can not receive any connection. And since
CTL port resource is very limited, it is good to save it.
MFC r314290: Implement use of multiple transfers per I/O.
This change removes limitation of single S/G list entry and limitation on
maximal I/O size, using multiple data transfers per I/O if needed. Also
it removes code duplication between send and receive paths, which are now
completely equal.
When forwarding pf tracks the size of the largest fragment in a fragmented
packet, and refragments based on this size.
It failed to ensure that this size was a multiple of 8 (as is required for all
but the last fragment), so it could end up generating incorrect fragments.
For example, if we received an 8 byte and 12 byte fragment pf would emit a first
fragment with 12 bytes of payload and the final fragment would claim to be at
offset 8 (not 12).
We now assert that the fragment size is a multiple of 8 in ip6_fragment(), so
other users won't make the same mistake.
Reported by: Antonios Atlasis <aatlasis at secfu net>
Remove excessive horizontal whitespace from hier(7) by correctly
using "-width". The http://mdocml.bsd.lv/mdoc/details/width.html
says: "Do not use macros in the argument specifying the width,
that's not portable. While GNU troff can handle it, mandoc cannot."
The same problem seems to exist in many other man pages.
MFC r316065: Enable route and LLE (ndp) caching in TCP/IPv6
tcp_output.c was using a route on the stack for IPv6, which does not
allow route caching or LLE/ndp caching. Switch to using the route
(v6 flavor) in the in_pcb, which was already present, which caches
both L3 and L2 lookups.
MFC r302664:
mkimg(1): minor cleanups with argument order in calloc(3).
Generally the first argument in calloc is supposed to stand for a count
and the second for a size. Try to make that consistent. While here,
attempt to make some use of the overflow detection capability in
calloc(3).
MFC r316824:
The rule field in the ipfw_dyn_rule structure is used as storage
to pass rule number and rule set to userland. In r272840 the kernel
internal rule representation was changed and the rulenum field of
struct ip_fw_rule got the type uint32_t, but userlevel representation
still have the type uint16_t. To not overflow the size of pointer
on the systems with 32-bit pointer size use separate variable to
copy rulenum and set.
dim [Fri, 21 Apr 2017 17:03:48 +0000 (17:03 +0000)]
MFC r316989:
Pull in r300404 from upstream llvm trunk (by me):
Use correct registers for "A" inline asm constraint
Summary:
In PR32594, inline assembly using the 'A' constraint on x86_64 causes
llvm to crash with a "Cannot select" stack trace. This is because
`X86TargetLowering::getRegForInlineAsmConstraint` hardcodes that 'A'
means the EAX and EDX registers.
However, on x86_64 it means the RAX and RDX registers, and on 16-bit
x86 (ia16?) it means the old AX and DX registers.
Add new register classes in `X86RegisterInfo.td` to support these
cases, and amend the logic in `getRegForInlineAsmConstraint` to cope
with different subtargets. Also add a test case, derived from
PR32594.
MFC r310181 (matthew) (originally r309314):
Allow a user-overridable setting 'PKG_CMD' to control the command used
to create a repo during 'make packages'.
MFC r316770:
Clear h/w csum flags on mbuf handled by UDP.
When checksums of received IP and UDP header already checked, UDP uses
sbappendaddr_locked() to pass received data to the socket.
sbappendaddr_locked() uses given mbuf as is, and if NIC supports checksum
offloading, mbuf contains csum_data and csum_flags that were calculated
for already stripped headers. Some NICs support only limited checksums
offloading and do not use CSUM_PSEUDO_HDR flag, and csum_data contains
some value that UDP/TCP should use for pseudo header checksum calculation.
When L2TP is used for tunneling with mpd5, ng_ksocket receives mbuf with
filled csum_flags and csum_data, that were calculated for outer headers.
When L2TP header is stripped, a packet that was tunneled goes to the IP
layer and due to presence of csum_flags (without CSUM_PSEUDO_HDR) and
csum_data, the UDP/TCP checksum check fails for this packet.
Reported by: Irina Liakh <spell at itl ua>
Tested by: Irina Liakh <spell at itl ua>
MFC r316822,316823:
Rework r316770 to make it protocol independent and general, like we
do for streaming sockets.
And do more cleanup in the sbappendaddr_locked_internal() to prevent
leak information from existing mbuf to the one, that will be possible
created later by netgraph.
Fix a use after free panic in ipfilter's fragment processing.
Memory is malloc'd, then a search for a match in the fragment table
is made and if the fragment matches, the wrong fragment table is
freed, causing a use after free panic. This commit fixes this.
A symptom of the problem is a kernel page fault in bcopy() called by
ipf_frag_lookup() at line 715 in ip_frag.c. Another symptom is a
kernel page fault in ipf_frag_delete() when called by ipf_frag_expire()
via ipf_slowtimer().
r308569:
Always call PHYS_TO_VM_PAGE() in is_managed(). Fast road for addresses
under first_page cannot be taken as this variable is connected only to
vm_page_array segment. There could be more segments in system like the ones
for various fictitious page ranges. These can be situated under
vm_page_array segment and so, they could be skipped before this fix.
However, as far as I know, there is no report associated with it.
r308570:
The return type of is_managed() was changed from boolean_t to bool type in
r308569. Now, propagate this change further for consistency sake.
andrew [Wed, 19 Apr 2017 15:59:16 +0000 (15:59 +0000)]
MFC 313772:
Load the new sp_el0 with interrupts disabled in fork_trampoline. If an
interrupt arrives in fork_trampoline after sp_el0 was written we may then
switch to a new thread, enter userland so change this stack pointer, then
return to this code with the wrong value. This fixes this case by moving
the load of sp_el0 until after interrupts have been disabled.
andrew [Wed, 19 Apr 2017 15:46:34 +0000 (15:46 +0000)]
MFC 305355:
Explicitly include all .rodata.* sections in the kernel .rodata. This
helps link the kernel with lld as it will then put all these into a single
.rodata section.
MFC r303442, r305343: remove CONSTRUCTORS from linker scripts
r303442: remove CONSTRUCTORS from kernel linker scripts
r305343: remove CONSTRUCTORS from MIPS uboot linker script
The linker script CONSTRUCTORS keyword is only meaningful "when linking
object file formats which do not support arbitrary sections, such as
ECOFF and XCOFF"[1] and is ignored for other object file formats.
LLVM's lld does not yet accept (and ignore) CONSTRUCTORS, so just remove
CONSTRUCTORS from the linker script as it has no effect.
andrew [Wed, 19 Apr 2017 14:07:35 +0000 (14:07 +0000)]
Fix the arm64 userland building with lld:
MFC 308124:
On arm64 build the efi loader with -fPIC. Without this clang 3.9 will
generate relocation in the self relocation code.
MFC 316608:
Add -fPIC to the standalone build flags on arm64. This is needed as
loader.efi is position independend, however we were not building it as
such causing a build failure when building with lld.
MFC 315452:
Mark the EFI PE header as allocated. While ld.bfd doesn't seem to care
about not having this flag ld.lld fails to link without it.
MFC r316720
Fix defects reported by Coverity
1. Deadcode in ecore_init_cache_line_size(), qlnx_ioctl() and
qlnx_clean_filters()
2. ARRAY_VS_SINGLETON issue in qlnx_remove_all_mcast_mac() and
qlnx_update_rx_prod()
MFC r316310
Update man page for commit r316309 "Add support for optional Soft LRO".
The driver provides the ability to select either HW or Software LRO, when
LRO is enabled (default HW LRO).
MFC r316716:
Inherit IPv6 checksum offloading flags to vlan interfaces.
if_vlan(4) interfaces inherit IPv4 checksum offloading flags from the
parent when VLAN_HWCSUM and VLAN_HWTAGGING flags are present on the
parent interface. Do the same for IPv6 checksum offloading flags.
r315458:
Constrain IPv6 routes to single FIBs when net.add_addr_allfibs=0
sys/netinet6/icmp6.c
Use the interface's FIB for source address selection in ICMPv6 error
responses.
sys/netinet6/in6.c
In in6_newaddrmsg, announce arrival of local addresses on the
interface's FIB only. In in6_lltable_rtcheck, use a per-fib ND6
cache instead of a single cache.
sys/netinet6/in6_src.c
In in6_selectsrc, use the caller's fib instead of the default fib.
In in6_selectsrc_socket, remove a superfluous check.
sys/netinet6/nd6.c
In nd6_lle_event, use the interface's fib for routing socket
messages. In nd6_is_new_addr_neighbor, check all FIBs when trying
to determine whether an address is a neighbor. Also, simplify the
code for point to point interfaces.
sys/netinet6/nd6.h
sys/netinet6/nd6.c
sys/netinet6/nd6_rtr.c
Make defrouter_select fib-aware, and make all of its callers pass in
the interface fib.
sys/netinet6/nd6_nbr.c
When inputting a Neighbor Solicitation packet, consider the
interface fib instead of the default fib for DAD. Output NS and
Neighbor Advertisement packets on the correct fib.
sys/netinet6/nd6_rtr.c
Allow installing the same host route on different interfaces in
different FIBs. If rt_add_addr_allfibs=0, only install or delete
the prefix route on the interface fib.
tests/sys/netinet/fibs_test.sh
Clear some expected failures, but add a skip for the newly revealed
BUG217871.
r315656:
Fix back-to-back runs of sys/netinet/fibs_test;slaac_on_nondefault_fib6
This test was failing if run twice because rtadvd takes too long to die.
The rtadvd process from the first run was still running when the
second run created its interfaces. The solution is to use SIGKILL during
the cleanup instead of SIGTERM so rtadvd will die faster.
While I'm here, randomize the addresses used for the test, which makes bugs
like this easier to spot, and fix the cleanup order to be the opposite of
the setup order
The module is designed for modification of a packets of any protocols.
For now it implements only TCP MSS modification. It adds the external
action handler for "tcp-setmss" action.
A rule with tcp-setmss action does additional check for protocol and
TCP flags. If SYN flag is present, it parses TCP options and modifies
MSS option if its value is greater than configured value in the rule.
Then it adjustes TCP checksum if needed. After handling the search
continues with the next rule.
This opcode can be used to attach some data to external action opcode.
And unlike to O_EXTERNAL_INSTANCE opcode, this opcode does not require
creating of named instance to pass configuration arguments to external
action handler. The data is coming just next to O_EXTERNAL_ACTION opcode.
The userlevel part currenly supports formatting for opcode with ipfw_insn
size, by default it expects u16 numeric value in the arg1.
Fixes for NVIDIA Tegra124 clocks:
- EMC clock have standard peripheral clock block. Use it. - Implement full
frequency set method for PLLD2. This PLL
is used as HDMI pixel clock so we must be able to set it to wide range of
frequencies, within 5% tolerance allowed by HDMI specification. Due to
this, full state space search (over m, n, p fields) is necessary.
r308286:
TEGRA: Add basic driver for memory controller. For now, it only reports
memory and SMMU access errors.
r308287:
TEGRA: Fix numerous issues in clock code. Define and export clocks related
to XUSB driver.
r310593:
Fix late monitor hotplug event. If system starts without attached monitor,
DRM create framebuffer for VT console. Later, when monitor is attached, the
hotplug event must issue full modeset procedure to setup CRTC. In original
code, this was done in drm_fb_helper_set_par(), but we don't have this
function implemented yet. Use unrolled version of drm_fb_helper_set_par()
to ensure same functionality.
r310599:
Import drm_patform.c, an implementation of non-PCI based attachment for
graphics drivers. It will be used in upcoming driver for Nvidia Tegra
boards.
r309532:
Add IDs for HDA codecs found on Nvidia Tegra SoCs.
r310674:
Limit number of stripes supported by HDA codec to maximum number announced
by HDA controller. Incorrectly implermented HDA codec may report support
for more stripes that HDA controller already have. Due to this, always
limit number of enabled stripes by global controller maximum.
r308612:
Allow DRM2 code to be built on platforms without AGP. This patch is taken
from original drm-3.8 code.
r308614:
Allow embeding DRM2 code into kernel. It's usefull for development (for
netboot) and it also helps to boot FreeBSD on some embeded platforms (where
we must boot kernel directly, without standard boot loader).
ARM: Disconnect elf_trampoline.c from ARMv6 build. The trampoline code never
functioned properly for Cortex CPUs, and its functionality is already
provided by ubldr.