pfctl: Don't break connections on skipped interfaces on reload
On reload we used to first flush everything, including the list of skipped
interfaces. This can lead to termination of these connections if they send
packets before the new configuration is applied.
Note that this doesn't currently happen on 12 or 11, because of special EACCES
handling introduced in r315514. This special behaviour in tcp_output() may
change, hence the fix in pfctl.
PR: 214613
Submitted by: Andreas Longwitz <longwitz at incore.de>
MFC 332735:
Fix two off-by-one errors when allocating MSI and MSI-X interrupts.
x86 enforces an (arbitray) limit on the number of available MSI and
MSI-X interrupts to simplify code (in particular, interrupt_source[]
is statically sized). This means that an attempt to allocate an MSI
vector needs to fail if it would go beyond the limit, but the checks
for exceeding the limit had an off-by-one error. In the case of MSI-X
which allocates interrupts one at a time this meant that IRQ 768 kept
getting handed out multiple times for msix_alloc() instead of failing
because all MSI IRQs were in use.
MFC r333015:
Add network device event for priority code point, PCP, changes.
When the PCP is changed for either a VLAN network interface or when
prio tagging is enabled for a regular ethernet network interface,
broadcast the IFNET_EVENT_PCP event so applications like ibcore can
update its GID tables accordingly.
MFC r332902: pwd_mkdb: default to network (big) endian hash order
For cross-architecture reproducibility. The db(3) functions work with
hashes of either endianness, and the current (v4) version password db
entries already store integers in network order. Do so with the hash as
well so that identical password databases can be created on big- and
little-endian hosts.
The -B and -L flags exist to set the endianness for legacy (v3) entries
when the -l flag is used, and they will still control hash endianness
(at least until the backwards compatibility infrastructure is removed
[a change that will not be merged to stable/11]).
MFC 332733:
Workaround fixed I/O port resources encoded as I/O port ranges in _CRS.
ACPI I/O port descriptors use _MIN and _MAX fields to specify the set
of allowable base (start) addresses for an I/O port resource along with
a _LEN field specifying the length. A fixed resource is supposed to be
encoded with _MIN == _MAX, but some buggy firmwares instead set _MAX to
the end of the fixed range. Relocating I/O ranges only make sense in
_PRS (possible resource settings), not in _CRS (current resource settings),
so if an I/O port range with _MAX set set to the end of the range is
present in _CRS, treat it as a fixed I/O port resource starting at
_MIN.
dim [Fri, 27 Apr 2018 19:21:39 +0000 (19:21 +0000)]
MFC r332833:
Recommit r332501, with an additional upstream fix for "Cannot lower
EFLAGS copy that lives out of a basic block!" errors on i386.
Pull in r325446 from upstream clang trunk (by me):
[X86] Add 'sahf' CPU feature to frontend
Summary:
Make clang accept `-msahf` (and `-mno-sahf`) flags to activate the
`+sahf` feature for the backend, for bug 36028 (Incorrect use of
pushf/popf enables/disables interrupts on amd64 kernels). This was
originally submitted in bug 36037 by Jonathan Looney
<jonlooney@gmail.com>.
As described there, GCC also uses `-msahf` for this feature, and the
backend already recognizes the `+sahf` feature. All that is needed is
to teach clang to pass this on to the backend.
The mapping of feature support onto CPUs may not be complete; rather,
it was chosen to match LLVM's idea of which CPUs support this feature
(see lib/Target/X86/X86.td).
I also updated the affected test case (CodeGen/attr-target-x86.c) to
match the emitted output.
Pull in r328944 from upstream llvm trunk (by Chandler Carruth):
[x86] Expose more of the condition conversion routines in the public
API for X86's instruction information. I've now got a second patch
under review that needs these same APIs. This bit is nicely
orthogonal and obvious, so landing it. NFC.
Pull in r329414 from upstream llvm trunk (by Craig Topper):
[X86] Merge itineraries for CLC, CMC, and STC.
These are very simple flag setting instructions that appear to only
be a single uop. They're unlikely to need this separation.
Pull in r329657 from upstream llvm trunk (by Chandler Carruth):
[x86] Introduce a pass to begin more systematically fixing PR36028
and similar issues.
The key idea is to lower COPY nodes populating EFLAGS by scanning the
uses of EFLAGS and introducing dedicated code to preserve the
necessary state in a GPR. In the vast majority of cases, these uses
are cmovCC and jCC instructions. For such cases, we can very easily
save and restore the necessary information by simply inserting a
setCC into a GPR where the original flags are live, and then testing
that GPR directly to feed the cmov or conditional branch.
However, things are a bit more tricky if arithmetic is using the
flags. This patch handles the vast majority of cases that seem to
come up in practice: adc, adcx, adox, rcl, and rcr; all without
taking advantage of partially preserved EFLAGS as LLVM doesn't
currently model that at all.
There are a large number of operations that techinaclly observe
EFLAGS currently but shouldn't in this case -- they typically are
using DF. Currently, they will not be handled by this approach.
However, I have never seen this issue come up in practice. It is
already pretty rare to have these patterns come up in practical code
with LLVM. I had to resort to writing MIR tests to cover most of the
logic in this pass already. I suspect even with its current amount
of coverage of arithmetic users of EFLAGS it will be a significant
improvement over the current use of pushf/popf. It will also produce
substantially faster code in most of the common patterns.
This patch also removes all of the old lowering for EFLAGS copies,
and the hack that forced us to use a frame pointer when EFLAGS copies
were found anywhere in a function so that the dynamic stack
adjustment wasn't a problem. None of this is needed as we now lower
all of these copies directly in MI and without require stack
adjustments.
Lots of thanks to Reid who came up with several aspects of this
approach, and Craig who helped me work out a couple of things
tripping me up while working on this.
Pull in r329673 from upstream llvm trunk (by Chandler Carruth):
[x86] Model the direction flag (DF) separately from the rest of
EFLAGS.
This cleans up a number of operations that only claimed te use EFLAGS
due to using DF. But no instructions which we think of us setting
EFLAGS actually modify DF (other than things like popf) and so this
needlessly creates uses of EFLAGS that aren't really there.
In fact, DF is so restrictive it is pretty easy to model. Only STD,
CLD, and the whole-flags writes (WRFLAGS and POPF) need to model
this.
I've also somewhat cleaned up some of the flag management instruction
definitions to be in the correct .td file.
Adding this extra register also uncovered a failure to use the
correct datatype to hold X86 registers, and I've corrected that as
necessary here.
Pull in r330264 from upstream llvm trunk (by Chandler Carruth):
[x86] Fix PR37100 by teaching the EFLAGS copy lowering to rewrite
uses across basic blocks in the limited cases where it is very
straight forward to do so.
This will also be useful for other places where we do some limited
EFLAGS propagation across CFG edges and need to handle copy rewrites
afterward. I think this is rapidly approaching the maximum we can and
should be doing here. Everything else begins to require either heroic
analysis to prove how to do PHI insertion manually, or somehow
managing arbitrary PHI-ing of EFLAGS with general PHI insertion.
Neither of these seem at all promising so if those cases come up,
we'll almost certainly need to rewrite the parts of LLVM that produce
those patterns.
We do now require dominator trees in order to reliably diagnose
patterns that would require PHI nodes. This is a bit unfortunate but
it seems better than the completely mysterious crash we would get
otherwise.
Together, these should ensure clang does not use pushf/popf sequences to
save and restore flags, avoiding problems with unrelated flags (such as
the interrupt flag) being restored unexpectedly.
Requested by: jtl
PR: 225330
MFC r332898:
Pull in r329771 from upstream llvm trunk (by Craig Topper):
[X86] In X86FlagsCopyLowering, when rewriting a memory setcc we need
to emit an explicit MOV8mr instruction.
Previously the code only knew how to handle setcc to a register.
This should fix a crash in the chromium build.
This fixes various assertion failures while building ports targeting
i386:
* www/firefox: isReg() && "This is not a register operand!"
* www/iridium, www/qt5-webengine: (I.atEnd() || std::next(I) ==
def_instr_end()) && "getVRegDef assumes a single definition or no
definition"
* devel/powerpc64-gcc: FromReg != ToReg && "Cannot replace a reg with
itself"
MFC 332657:
Properly do a deep copy of the ioctls capability array for fget_cap().
fget_cap() tries to do a cheaper snapshot of a file descriptor without
holding the file descriptor lock. This snapshot does not do a deep
copy of the ioctls capability array, but instead uses a different
return value to inform the caller to retry the copy with the lock
held. However, filecaps_copy() was returning 1 to indicate that a
retry was required, and fget_cap() was checking for 0 (actually
'!filecaps_copy()'). As a result, fget_cap() did not do a deep copy
of the ioctls array and just reused the original pointer. This cause
multiple file descriptor entries to think they owned the same pointer
and eventually resulted in duplicate frees.
The only code path that I'm aware of that triggers this is to create a
listen socket that has a restricted list of ioctls and then call
accept() which calls fget_cap() with a valid filecaps structure from
getsock_cap().
To fix, change the return value of filecaps_copy() to return true if
it succeeds in copying the caps and false if it fails because the lock
is required. I find this more intuitive than fixing the caller in
this case. While here, change the return type from 'int' to 'bool'.
Finally, make filecaps_copy() more robust in the failure case by not
copying any of the source filecaps structure over. This avoids the
possibility of leaking a pointer into a structure if a similar future
caller doesn't properly handle the return value from filecaps_copy()
at the expense of one more branch.
I also added a test case that panics before this change and now passes.
MFC: r332813
Fix use of pointer after being set NULL.
Using a pointer after setting it NULL is probably not a good plan.
Spotted by inspection during changes for Flexible File Layout Ioerr handling.
This code path obviously isn't normally executed.
MFC: r332790
Fix OpenDowngrade for NFSv4.1 if a client sets the OPEN_SHARE_ACCESS_WANT* bits.
The NFSv4.1 RFC specifies that the OPEN_SHARE_ACCESS_WANT bits can be set
in the OpenDowngrade share_access argument and are basically ignored.
I do not know of a extant NFSv4.1 client that does this, but this little
patch fixes it just in case.
It also changes the error from NFSERR_BADXDR to NFSERR_INVAL since the NFSv4.1
RFC specifies this as the error to be returned if bogus bits are set.
(The NFSv4.0 RFC didn't specify any error for this, so the error reply can
be changed for NFSv4.0 as well.)
Found by inspection while looking at a problem with OpenDowngrade reported
for the ESXi 6.5 NFSv4.1 client.
MFC r332812:
Add dead_bpf_if structure, that should be used as fake bpf_if
during ifnet detach.
Since destroying interface is not atomic operation and due to the
lack of synhronization during destroy, it is possible, that in the
time between bpfdetach() and if_free() some queued on destroying
interface mbuf will be used by ether_input_internal() and
bpf_peers_present() can dereference NULL bpf_if pointer. To protect
from this, assign pointer to empty bpf_if_ext structure instead of
NULL pointer after bpfdetach().
MFC r332949 (by markj):
Use dead_bpf_if instead of bp_null.
This fixes a -Wunused error when DEV_BPF and NETGRAPH_BPF are not
defined.
MFC r332090: stand: pass --no-rosegment for i386 bits when linking with lld
btxld does not correctly handle input with other than 2 PT_LOAD
segments. Passing --no-rosegment lets lld produce output eqivalent to
ld.bfd: 2 PT_LOAD segments and no PT_GNU_RELRO.
ian [Thu, 26 Apr 2018 16:40:59 +0000 (16:40 +0000)]
MFC r308767 by br:
Make gpiobus early driver at BUS_PAS_BUS.
The gpiobus driver is attached explicitly and generally should be
at the same pass as its parent. Making it use BUS_PAS_BUS ensures
that it attaches immediately after parent adds it (assuming the
parent itself attached at BUS_PAS_BUS and above).
r332346:
Fix the position of $bootable so that -o platformid=efi applies correctly.
r332661:
Generate hybrid ISO images for amd64.
This keeps the existing El Torito entries for BIOS and UEFI boot code and
adds a GPT in the ISO image's System Area containing boot code for BIOS that
will load /boot/loader from the ISO filesystem and execute it. We then use
etdump to find the EFI System Partition image in the El Torito catalog and
add an entry to the GPT that allows EFI to find it.
r333005:
Allow etdump, makefs and mkimg to be overridden.
Recent changes to makefs and mkimg have led to situations where the
disconnect between this script and the versions installed on the host cause
failures. Provide a way to work around this that doesn't require the
installation of new versions to the host system if that's not desired.
With this change mkisoimages.sh will honour the $ETDUMP, $MAKEFS and $MKIMG
environment variables but fall back to the previous behaviour of finding them
within $PATH.
r331463, r331467, and r331468 are all replaced by r331843 but are included for
completeness' sake. r331463 also contained a change to
release/amd64/mkisoimages.sh which I've left out since those changes were
also obsoleted by a later commit which will be merged later.
r331843:
Synchronise with NetBSD's version of EFI handling for El Torito images.
Be more precise when including headers so that we're less likely to
depend on namespace pollution and as such become more portable. This
means including headers like <sys/types.h> or <stdlib.h>, but also
making sure we include system/host headers before local headers.
While here: define ENOATTR as ENOMSG in mtree.c. There is no ENOATTR
on Linux.
With this, makefs is ready for compilation on macOS and Linux.
When booted via isoboot(8) loader will be handed a disk that simply contains
an ISO9660 image. Currently this confuses it greatly. Teach it how to spot
that it's in this situation and that ISO9660 has one "partition" covering
the whole disk.
Add isoboot(8) for booting BIOS systems from HDDs containing ISO images.
This is part of a project for adding the ability to create hybrid CD/USB boot
images. In the BIOS case when booting from something that isn't a CD we need
some extra boot code to actually find our next stage (loader) within an
ISO9660 filesystem. This code will reside in a GPT partition (similar to
gptboot(8) from which it is derived) and looks for /boot/loader in an
ISO9660 filesystem on the image.
r332436:
Add the ability to specify absolute and relative offsets to size partitions.
To create hybrid boot media we want to specify a partition at a known location.
This extends the syntax of size partitions to include an optional offset that
can be absolute or relative. It also introduces validation to make sure that
this hasn't resulted in overlapping partitions. I haven't added this to the
file and process partition specifications yet but the mechanics are designed
such that if someone comes up with a good way of specifying the offset it
will be fairly easy to add in.
Add includes for <curses.h> and <termcap.h> where necessary, and
rename a few internal functions to have a "top_" prefix to avoid
clashes with standard names from curses.h/termcap.h headers.
Top now compiles without warnings on both gcc and clang.
r331949:
Add the etdump utility for dumping El Torito boot catalog information.
This can be used to check existing images but will be used in the future to
find EFI ESP images placed in El Torito catalogs so they can be used for
hybrid boot purposes.
r332427:
Check the return value of fseek.
r332438:
Remove a debugging printf that crept in.
Sponsored by: iXsystems, Inc.
Pointy hat to: benno
r331949:
Add the etdump utility for dumping El Torito boot catalog information.
This can be used to check existing images but will be used in the future to
find EFI ESP images placed in El Torito catalogs so they can be used for
hybrid boot purposes.
ian [Tue, 24 Apr 2018 17:13:31 +0000 (17:13 +0000)]
MFC r332518, r332527
r332518:
Add an option to daemon(8) to specify a delay between restarts of a
supervised program. The existing -r option has a hard-coded delay of one
second. This change adds a -R option which takes a delay in seconds. This
can be used to prevent log spam and rapid restarts, similar to init(8)'s
behavior of adding a delay between rapid restarts when it's supervising a
program.
r332527:
Fix cut-and-pasted line to have the right option letter.
r331868:
Add opt_platform.h for several modules that have #ifdef FDT in the source.
Submitted by: Andre Albsmeier <Andre.Albsmeier@siemens.com>
r332046:
Add a missing MODULE_DEPEND().
r332194:
Add support for writing/changing spi device ivars. The SPI mode (polarity
and phase) and the maximum bus speed can be changed. The chip select
number cannot be changed, because the device instances which are children
of spibus are inherently associated with the chip select number they were
instantiated for.
r332195:
A couple minor improvements to spibus.c...
- Change the description string to "SPI bus" (was "spibus bus").
- This is the default driver for a SPI bus, not a generic implementation,
so return the probe value that indicates such.
- Use device_delete_children() at detach time, instead of a local loop
to enumerate the children and detach each one individually.
r332196:
Return BUS_PROBE_DEFAULT, not zero, because this is not the one driver
implementation that must be used, it's just the base system default driver.
Also add a comment noting that we're being more liberal about the bus
frequency property than the dts binding documents require.
r332198:
Arrange the list of generated sources as 1-per-line alphbetical, and add
the files required when building for FDT-based systems.
r332219:
Remove the existing identify() hack to force-add a spigen device on
FDT-based systems, and instead add proper FDT probe code. Because this
driver is freebsd-specific and just provides generic userland access to run
spibus transactions, there is no bindings document to mandate a compatible
string, so just arbitrarily use "freebsd,spigen".
r332231:
Generate a spibus_set_[ivarname]() convenience function for each ivar,
now that they can be set.
r332233:
Add an ioctl to get/set the SPI transfer mode. Also, make the bus clock
frequency ioctl actually set the corresponding ivar instead of just storing
the value locally in the softc (and then not using it for anything). Also,
return the correct error code if the ioctl cmd is not recognized.
r332240:
Add the ioctl definitions for spigen get/set spi mode. Should have been
part of r332233.
r332258:
Don't check for impossible NULL return from malloc(..., M_WAITOK).
r332259:
Cast the data pointer to the correct type for the data being accessed (as
opposed to one that accidentally worked on the one arch I test-compiled for
on my first try).
Reported by: np@, O. Hartmann <ohartmann@walstatt.org>
Pointy hat: ian@
r332261:
Add a manpage for spigen(4).
r332292:
Allow hinted attachment on FDT-based systems. Instead of returning ENXIO
when the FDT data doesn't enable the device instance, return
BUS_PROBE_NOWILDCARD, the same as for non-FDT systems.
MFC r332789: pwd_mkdb: warn that legacy support is deprecated (if specified)
r283981 switched pwd_mkdb to emit only v4 database entries by default,
and introduced a -l (legacy) option emit v3 entries in addition. The
commit message claims that legacy support will be removed in 12.0, so
emit a warning now if it is used.
MFC r332875: pwd_mkdb: add deprecation notice in manpage too
Followon to r332789; as reported on the -current and -stable lists and
in review D15144 the -l option will be removed before FreeBSD 12.0.
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
r332385:
hyperv/storvsc: storvsc_io_done(): do not use CAM_SEL_TIMEOUT
CAM_SEL_TIMEOUT was introduced in
https://reviews.freebsd.org/D7521 (r304251), which claimed:
"VM shall response to CAM layer with CAM_SEL_TIMEOUT to filter those
invalid LUNs. Never use CAM_DEV_NOT_THERE which will block LUN scan
for LUN number higher than 7."
But it turns out this is not correct:
I think what really filters the invalid LUNs in r304251 is that:
before r304251, we could set the CAM_REQ_CMP without checking
vm_srb->srb_status at all:
ccb->ccb_h.status |= CAM_REQ_CMP.
r304251 checks vm_srb->srb_status and sets ccb->ccb_h.status properly,
so the invalid LUNs are filtered.
I changed my code version to r304251 but replaced the CAM_SEL_TIMEOUT
with CAM_DEV_NOT_THERE, and I confirmed the invalid LUNs can also be
filtered, and I successfully hot-added and hot-removed 8 disks to/from
the VM without any issue.
CAM_SEL_TIMEOUT has an unwanted side effect -- see cam_periph_error():
For a selection timeout, we consider all of the LUNs on
the target to be gone. If the status is CAM_DEV_NOT_THERE,
then we only get rid of the device(s) specified by the
path in the original CCB.
This means: for a VM with a valid LUN on 3:0:0:0, when the VM inquires
3:0:0:1 and the host reports 3:0:0:1 doesn't exist and storvsc returns
CAM_SEL_TIMEOUT to the CAM layer, CAM will detech 3:0:0:0 as well: this
is the bug I reported recently:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=226583
MFC r332674:
Increase the msdosfs partition size on arm SoC images where the
current size may not be sufficiently large for development and/or
testing.
r288291 added a call to limits(1), which isn't available before partitions
are mounted. This broke the ddb rc script, which does not provide its own
start_cmd.
Alleviate the situation here by providing a start_cmd. We still have other
problems with diskless setups that need to be considered, but this is a
start.
MFC r319216:
Fix an unnecessary/incorrect check in the PKTOPT_EXTHDRCPY macro.
This macro allocates memory and, if malloc does not return NULL, copies
data into the new memory. However, it doesn't just check whether malloc
returns NULL. It also checks whether we called malloc with M_NOWAIT. That
is not necessary.
While it may be that malloc() will only return NULL when the M_NOWAIT flag
is set, we don't need to check for this when checking malloc's return
value. Further, in this case, the check was not completely accurate,
because it checked for flags == M_NOWAIT, rather than treating it as a bit
field and checking for (flags & M_NOWAIT).
MFC r319215:
Fix two places in the ICMP6 code where we could dereference a NULL pointer
in the icmp6_input() function.
When processing an ICMP6_ECHO_REQUEST, if IP6_EXTHDR_GET fails, it will
set nicmp6 and n to NULL. Therefore, we should condition our modification
to nicmp6 on n being not NULL.
And, when processing an ICMP6_WRUREQUEST in the (mode != FQDN) case, if
m_dup_pkthdr() fails, the code will set n to NULL. However, the very next
line dereferences n. Therefore, when m_dup_pkthdr() fails, we should
discontinue further processing and follow the same path as when m_gethdr()
fails.
Reported by: clang static analyzer
Sponsored by: Netflix, Inc.
MFC r319214:
Enforce the limit on ICMP messages before doing work to formulate the
response.
Delete an unneeded rate limit for UDP under IPv6. Because ICMP6
messages have their own rate limit, it is unnecessary to apply a
second rate limit to UDP messages.
MFC r314286:
Do some minimal work to better conform to the 802.3ad (LACP) standard.
In particular, don't set the synchronized bit for the peer unless it truly
appears to be synchronized to us. Also, don't set our own synchronized bit
unless we have actually seen a remote system.
Prior to this change, we were seeing some strange behavior, such as:
1. We send an advertisement with the Activity, Aggregation, and Default
flags, followed by an advertisement with the Activity, Aggregation,
Synchronization, and Default flags. However, we hadn't seen an
advertisement from another peer and were still advertising the default
(NULL) peer. A closer examination of the in-kernel data structures (using
kgdb) showed that the system had added the default (NULL) peer as a valid
aggregator for the segment.
2. We were receiving an advertisement from a peer that included the
default (NULL) peer instead of including our system information. However,
we responded with an advertisement that included the Synchronization flag
for both our system and the peer. (Since the peer's advertisement did not
include our system information, we shouldn't add the synchronization bit
for the peer.)
MFC r314116:
Fix a panic during boot caused by inadequate locking of some vt(4) driver
data structures.
vt_change_font() calls vtbuf_grow() to change some vt driver data
structures. It uses TF_MUTE to prevent the console from trying to use
those data structures while it changes them.
During the early stage of the boot process, the vt driver's tc_done
routine uses those data structures; however, it is currently called
outside the TF_MUTE check.
Move the tc_done routine inside the locked TF_MUTE check.
MFC r313447:
Ensure the idle thread's loop services interrupts in a timely way when
using the ACPI C1/mwait sleep method.
Previously, the mwait instruction would return when an interrupt was
pending; however, the idle loop did not actually enable interrupts when
this occurred. This led to a situation where the idle loop could quickly
spin through the C1/mwait sleep method a number of times when an interrupt
was pending. (Eventually, the situation corrected itself when something
other than an interrupt triggered the idle loop to either enable
interrupts or schedule another thread.)
MFC r307083:
Currently, when tcp_input() receives a packet on a session that matches a
TCPCB, it checks (so->so_options & SO_ACCEPTCONN) to determine whether or
not the socket is a listening socket. However, this causes the code to
access a different cacheline. If we first check if the socket is in the
LISTEN state, we can avoid accessing so->so_options when processing packets
received for ESTABLISHED sessions.
If INVARIANTS is defined, the code still needs to access both variables to
check that so->so_options is consistent with the state.
MFC r330511:
We shouldn't need to execute code in the recursive page table mappings;
therefore, it should be safe to set the NX bit on the PML4E for the
recursive page table mappings. According to the Intel docs, the effect
of the NX bit should propogate to any page reached through a PML4E which
has the NX bit set.
MFC r330510:
Prior to r329071, pmap_bootstrap() used pmap_kmem_choose() to round the
first available virtual address to a 2MB boundary. After r329071,
create_pagetables() rounds firstaddr up to a 2MB boundary. This ensures
the kernel is mapped in super-pages, which is the point of the logic
in pmap_kmem_choose(). Therefore, it is no longer necessary for
pmap_bootstrap() to round up to the 2MB boundary again.
As pmap_bootstrap() was the only user of pmap_kmem_choose(), we can
delete pmap_kmem_choose().
MFC r332780,r332783:
Intel drives have an optimal alignment for I/O. While they honor I/Os
that cross this boundary, they perform better when this isn't the
case. Intel uses the 3rd byte in the vendor specific area for
this. The DC P3500 was previously listed without any explanation. Add
the DC P3520 and DC P4500 to the list.
There won't be any others drives needing this quirk. Intel has
standardized a field in the namespace data in 1.3 (noiob). A future
patch will use that if it exists, with fallback to this method.
Submitted by: Keith Busch
Reviewed by: jimharris@
[[ plus tweak comments from 332783 ]]
MFC r329171:
Mark the pages used for the initial page-table entries as wired. This
makes them consistent with the way other page-table pages are allocated.
It also provides the rest of the VM system a good clue that these pages
are used.
MFC r329071:
On bootup, the amd64 pmap initialization code creates page-table
mappings for the pages used for the kernel and some initial allocations
used for the page table. It maps the kernel and the blocks used for
these initial allocations using 2MB pages.
However, if the kernel does not end on a 2MB boundary, it still maps the
last portion using a 2MB page, but reports that the unused 4K blocks
within this 2MB allocation are free physical blocks. This means that
these same physical blocks could also be mapped elsewhere - for example,
into a user process. Given the proximity to the kernel text and data
area, it seems wise to avoid allowing someone to write data to physical
blocks also mapped into these virtual addresses.
(Note that this isn't a security vulnerability: the direct map makes
most/all memory on the system mapped into kernel space. And, nothing
in the kernel should be trying to access these pages, as the virtual
addresses are unused. It simply seems wise to avoid reusing these
physical blocks while they are mapped to virtual addresses so close
to the kernel text and data area.)
Consequently, let's reserve the physical blocks covered by the
page-table mappings for these initial allocations.
MFC r331488:
This change adds a flag to the DAD entry to indicate whether it is
currently on the queue. This prevents accidentally doubly-removing a DAD
entry from the queue, while also simplifying some of the logic in
nd6_dad_stop().
MFC r331926:
r330675 introduced an extra window check in the LRO code to ensure it
captured and reported the highest window advertisement with the same
SEQ/ACK. However, the window comparison uses modulo 2**16 math, rather
than directly comparing the absolute values. Because windows use
absolute values and not modulo 2**16 math (i.e. they don't wrap), we
need to compare the absolute values.
MFC r332120:
If a user closes the socket before we call tcp_usr_abort(), then
tcp_drop() may unlock the INP. Currently, tcp_usr_abort() does not
check for this case, which results in a panic while trying to unlock
the already-unlocked INP (not to mention, a use-after-free violation).
Make tcp_usr_abort() check the return value of tcp_drop(). In the case
where tcp_drop() returns NULL, tcp_usr_abort() can skip further steps
to abort the connection and simply unlock the INP_INFO lock prior to
returning.
Update stable/11 from 11.1-STABLE to 11.2-PRERELEASE, marking the
official start of the code slush.
Set the default mdoc(7) version to 11.2, and update the clang(1)
TARGET_TRIPLE to reflect 11.2. While here, add missing FreeBSD
major versions to mdoc(7).
Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation
When the compressed ARC feature was added in commit d3c2ae1
the method of reference counting in the ARC was modified. As
part of this accounting change the arc_buf_add_ref() function
was removed entirely.
This would have be fine but the arc_buf_add_ref() function
served a second undocumented purpose of updating the ARC access
information when taking a hold on a dbuf. Without this logic
in place a cached dbuf would not migrate its associated
arc_buf_hdr_t to the MFU list. This would negatively impact
the ARC hit rate, particularly on systems with a small ARC.
This change reinstates the missing call to arc_access() from
dbuf_hold() by implementing a new arc_buf_access() function.
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
MFC r332457:
Use cfg->nomatch_verdict as return value from NAT64LSN handler when
given mbuf is considered as not matched.
If mbuf was consumed or freed during handling, we must return
IP_FW_DENY, since ipfw's pfil handler ipfw_check_packet() expects
IP_FW_DENY when mbuf pointer is NULL. This fixes KASSERT panics
when NAT64 is used with INVARIANTS. Also remove unused nomatch_final
field from struct nat64lsn_cfg.
Reported by: Justin Holcomb <justin at justinholcomb dot me>