cem [Tue, 29 Oct 2019 18:24:36 +0000 (18:24 +0000)]
libexecinfo test: Don't strip installed test
It turns out that a test of backtrace symbol resolution and formatting
requires symbols. Another option mightt be building with -rdynamic instead,
but this works for now.
Re-enabled skipped CI test, as it should now pass.
glebius [Tue, 29 Oct 2019 17:36:06 +0000 (17:36 +0000)]
There is a long standing problem with multicast programming for NICs
and IPv6. With IPv6 we may call if_addmulti() in context of processing
of an incoming packet. Usually this is interrupt context. While most
of the NIC drivers are able to reprogram multicast filters without
sleeping, some of them can't. An example is e1000 family of drivers.
With iflib conversion the problem was somewhat hidden. Iflib processes
packets in private taskqueue, so going to sleep doesn't trigger an
assertion. However, the sleep would block operation of the driver and
following incoming packets would fill the ring and eventually would
start being dropped. Enabling epoch for the full time of a packet
processing again started to trigger assertions for e1000.
Fix this problem once and for all using a general taskqueue to call
if_ioctl() method in all cases when if_addmulti() is called in a
non sleeping context. Note that nobody cares about returned value.
glebius [Tue, 29 Oct 2019 17:28:25 +0000 (17:28 +0000)]
Merge td_epochnest with td_no_sleeping.
Epoch itself doesn't rely on the counter and it is provided
merely for sleeping subsystems to check it.
- In functions that sleep use THREAD_CAN_SLEEP() to assert
correctness. With EPOCH_TRACE compiled print epoch info.
- _sleep() was a wrong place to put the assertion for epoch,
right place is sleepq_add(), as there ways to call the
latter bypassing _sleep().
- Do not increase td_no_sleeping in non-preemptible epochs.
The critical section would trigger all possible safeguards,
no sleeping counter is extraneous.
dim [Tue, 29 Oct 2019 16:51:12 +0000 (16:51 +0000)]
Pull in r373338 from upstream llvm trunk (by Simon Pilgrim):
Revert rL349624 : Let TableGen write output only if it changed,
instead of doing so in cmake, attempt 2
Differential Revision: https://reviews.llvm.org/D55842
-----------------
As discussed on PR43385 this is causing Visual Studio msbuilds to
perpetually rebuild all tablegen generated files
Pull in r373664 from upstream llvm trunk (by Nico Weber):
Reland r349624: Let TableGen write output only if it changed, instead
of doing so in cmake
Move the write-if-changed logic behind a flag and don't pass it with
the MSVC generator. msbuild doesn't have a restat optimization, so
not doing write-if-change there doesn't have a cost, and it should
fix whatever causes PR43385.
This should fix the scenario where an incremental build from before
r353358 (the clang 9.0.0 upgrade) to r353358 or later fails to update
the timestamp of the generated lib/clang/headers/arm_fp16.h header.
After such a build, installing world from read-only source and object
directories would attempt to generate the header again, leading to
"clang-tblgen: error opening arm_fp16.h.d:Read-only file system".
sjg [Mon, 28 Oct 2019 20:45:29 +0000 (20:45 +0000)]
Building head on stable/11 requires libzstd
Add lib/libzstd to _elftoolchain_libs
tools/build/Makefile needs to create the install dir for libzstd
Since this would make the line too long, rework to use a list
in one per line format (easier to add in future)
and dispense with the .for loop
vmaffione [Mon, 28 Oct 2019 19:00:27 +0000 (19:00 +0000)]
netmap: enter NET_EPOCH on generic txsync
After r353292, netmap generic adapter on if_vlan interfaces panics on
asserting the NET_EPOCH. In more detail, this happens when
nm_os_generic_xmit_frame() is called, that is in the generic txsync
routine.
Fix the issue by entering the NET_EPOCH during the generic txsync.
We amortize the cost of entering/exiting over a whole batch of
transmissions.
PR: 241489
Reported by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
cem [Mon, 28 Oct 2019 17:12:45 +0000 (17:12 +0000)]
Remove bogus requirement from libexecinfo test
The bogus requirement was causing CI infrastructure (which does not mount
procfs) to skip the test. Procfs has not been needed by libexecinfo on
FreeBSD (nor NetBSD) for years. Both now use a sysctl to obtain the path to
the current process image.
lwhsu [Sun, 27 Oct 2019 21:07:50 +0000 (21:07 +0000)]
Follow r354121 to fix some python3 errors in sys.netpfil.*
stderr:
Traceback (most recent call last):
File "/usr/tests/sys/netpfil/common/pft_ping.py", line 135, in <module>
main()
File "/usr/tests/sys/netpfil/common/pft_ping.py", line 124, in main
ping(args.sendif[0], args.to[0], args)
File "/usr/tests/sys/netpfil/common/pft_ping.py", line 74, in ping
raw = sp.raw(str(PAYLOAD_MAGIC))
File "/usr/local/lib/python3.6/site-packages/scapy/compat.py", line 52, in raw
return bytes(x)
TypeError: string argument without an encoding
MFC with: r354121
Sponsored by: The FreeBSD Foundation
bz [Sat, 26 Oct 2019 21:19:55 +0000 (21:19 +0000)]
Upgrade (scapy) py2 tests to work on py3.
In order to move python2 out of the test framework to avoid py2 vs. py3
confusions upgrade the remaining test cases using scapy to work with py3.
That means only one version of scapy needs to be installed in the CI system.
It also gives a path forward for testing i386 issues observed in the CI
system with some of these tests.
Fixes are:
- Use default python from environment (which is 3.x these days).
- properly ident some lines as common for the rest of the file to avoid
errors.
- cast the calculated offset to an int as the division result is considered
a float which is not accepted input.
- when comparing payload to a magic number make sure we always add the
payload properly to the packet and do not try to compare string in
the result but convert the data payload back into an integer.
- fix print formating.
Discussed with: lwhsu, kp (taking it off his todo :)
MFC after: 2 weeks
asomers [Sat, 26 Oct 2019 17:11:02 +0000 (17:11 +0000)]
MFZoL: Avoid retrieving unused snapshot props
This patch modifies the zfs_ioc_snapshot_list_next() ioctl to enable it
to take input parameters that alter the way looping through the list of
snapshots is performed. The idea here is to restrict functions that
throw away some of the snapshots returned by the ioctl to a range of
snapshots that these functions actually use. This improves efficiency
and execution speed for some rollback and send operations.
Reviewed-by: Tom Caputi <tcaputi@datto.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matt Ahrens <mahrens@delphix.com> Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Closes #8077
zfsonlinux/zfs@4c0883fb4af0d5565459099b98fcf90ecbfa1ca1
manu [Sat, 26 Oct 2019 17:10:27 +0000 (17:10 +0000)]
dtc: Allow multiple dts-v1 tag
Some dts are including dtsi that also contain a /dts-v1/ tag at the
top. GNU DTC doesn't seems to have a problem with that so fix our
dtc to behave the same.
cem [Sat, 26 Oct 2019 06:59:59 +0000 (06:59 +0000)]
Sync up with NetBSD libexecinfo changes 2014-2019
Drop portions that are unlit or redundant with llvm-libunwind: builtin.c,
unwind.h, and unwind_arm_ehabi_stub.c.
This code should now work with -fPIE binaries, should we choose to build any
that way.
When backtrace() array is full, signal an error so the underlying
Itanium-style C++ exception handling library (llvm-libunwind) knows to stop
tracing instead of continuing. (It should stop on its own when it finishes
unwinding, so this is mostly an extra seatbelt against an infinite loop bug
in the unwinder.)
mhorne [Fri, 25 Oct 2019 21:39:29 +0000 (21:39 +0000)]
RISC-V: skip cpu-map when parsing elf_hwcap
The fill_elf_hwcap() function expects to find only cpu nodes under the
/cpus entry of the device tree. Newer versions of QEMU insert a cpu-map
node which describes the CPU topology, breaking this function. To fix
this, simply skip any non-cpu entries.
rpokala [Fri, 25 Oct 2019 21:32:28 +0000 (21:32 +0000)]
Args for buf_track() might be unused
If neither FULL_BUF_TRACKING nor BUF_TRACKING are defined, then the body of
buf_track() becomes empty. Mark the arguments with "__unused" so the
compiler doesn't complain about unused arguments in that case.
Reported by: Bruce Leverett (Panasas)
Reviewed by: cem (on IRC)
MFC after: 1 month
Sponsored by: Panasas
dim [Fri, 25 Oct 2019 21:00:49 +0000 (21:00 +0000)]
Pull in r372186 from upstream llvm trunk (by Eli Friedman):
[ARM] VFPv2 only supports 16 D registers.
r361845 changed the way we handle "D16" vs. "D32" targets; there used
to be a negative "d16" which removed instructions from the
instruction set, and now there's a "d32" feature which adds
instructions to the instruction set. This is good, but there was an
oversight in the implementation: the behavior of VFPv2 was changed.
In particular, the "vfp2" feature was changed to imply "d32". This is
wrong: VFPv2 only supports 16 D registers.
In practice, this means if you specify -mfpu=vfpv2, the compiler will
generate illegal instructions.
This patch gets rid of "vfp2d16" and "vfp2d16sp", and fixes "vfp2"
and "vfp2sp" so they don't imply "d32".
markj [Fri, 25 Oct 2019 20:15:04 +0000 (20:15 +0000)]
Apply kernel module linker scripts to firmware files.
Use a separate make variable to specify the linker script so that it is
only applied at link time and not during intermediate generation of .fwo
files.
This ensures that the .text padding inserted by the amd64 linker script
is applied to the stub module load handlers embedded in firmware
modules.
kib [Fri, 25 Oct 2019 20:09:42 +0000 (20:09 +0000)]
amd64: move pcb out of kstack to struct thread.
This saves 320 bytes of the precious stack space.
The only negative aspect of the change I can think of is that the
struct thread increased by 320 bytes obviously, and that 320 bytes are
not swapped out anymore. I believe the freed stack space is much more
important than that. Also, current struct thread size is 1392 bytes
on amd64, so UMA will allocate two thread structures per (4KB) slab,
which leaves a space for pcb without increasing zone memory use.
Reviewed by: alc, markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D22138
peterj [Fri, 25 Oct 2019 19:38:02 +0000 (19:38 +0000)]
Fix use of uninitialised variable.
The RK805 regs array was being allocated before it's required size was
known, causing the driver to use memory it didn't own. That memory
was subsequently allocated and used elsewhere causing later fatal data
aborts in rk805_map().
Whilst I'm here, add a sanity check to catch unsupported PMICs (this
shouldn't ever get hit because the probe should have failed).
Reviewed by: manu
MFC after: 1 week
Sponsored by: Google
bz [Fri, 25 Oct 2019 18:54:06 +0000 (18:54 +0000)]
Properly set VNET when nuking recvif from fragment queues.
In theory the eventhandler invoke should be in the same VNET as
the the current interface. We however cannot guarantee that for
all cases in the future.
So before checking if the fragmentation handling for this VNET
is active, switch the VNET to the VNET of the interface to always
get the one we want.
manu [Fri, 25 Oct 2019 18:10:02 +0000 (18:10 +0000)]
arm64: rockchip: Add RK3399 TypeC phy driver
This is a driver for the USB3 PHY present in the RK3399.
While the phy support DP (Display Port) the driver doesn't has we have
no driver to test this with for now.
All the lane and pll configuration is just magic values from rockchip.
While the manual have some info on those registers it's really hard to
understand how to calculate those values (if there is a way).
bz [Fri, 25 Oct 2019 17:41:08 +0000 (17:41 +0000)]
frag6-test: update for r354046, conform to 8200 for overlapping fragments
The change to conform to RFC 8200 for overlapping fragments now frees
the entire reassembly queue if the overlapping fragments are not an
exact match.
As a result we do see one less packet in the timeout statistics from
expiry. No other statistics change as the event is not counted.
It can be argued that we should improve the statistics counters in
that case.
This test case update should have been committed alongside the original
commit.
bz [Fri, 25 Oct 2019 16:29:09 +0000 (16:29 +0000)]
frag6: do not leak counter in error cases
When allocating the IPv6 fragement packet queue entry we do checks
against counters and if we pass we increment one of the counters
to claim the spot. Right after that we have two cases (malloc and MAC)
which can both fail in which case we free the entry but never released
our claim on the counter. In theory this can lead to not accepting new
fragments after a long time, especially if it would be MAC "refusing"
them.
Rather than immediately subtracting the value in the error case, only
increment it after these two cases so we can no longer leak it.
avg [Fri, 25 Oct 2019 15:38:09 +0000 (15:38 +0000)]
owc_gpiobus_read_data: compare times in sbintime_t units
Previously the code used sbttous() before microseconds comparison in one
place, sbttons() and nanoseconds in another, division by SBT_1US and
microseconds in yet another.
Now the code consistently uses multiplication by SBT_1US to convert
microseconds to sbintime_t before comparing them with periods between
calls to sbinuptime(). This is fast, this is precise enough (below
0.03%) and the periods defined by the protocol cannot overflow.
avg [Fri, 25 Oct 2019 09:37:54 +0000 (09:37 +0000)]
gpioiic: set output after switching to output mode if presetting it failed
Some controllers cannot preset future output value while the pin is in
input mode. This adds a fallback for those controllers. The new code
assumes that a controller reports an error in that case.
For example, all hardware supported by nctgpio behaves in that way.
This is a temporary measure. In the future we will use
GPIO_PIN_PRESET_LOW / GPIO_PIN_PRESET_HIGH to preset the output either
in hardware, if supported, or in software (e.g., in
gpiobus_pin_setflags).
While here, I extracted common functionality of gpioiic_set{sda,scl} and
gpioiic_get{sda,scl} to gpioiic_setpin and gpioiic_getpin respectively.
brooks [Thu, 24 Oct 2019 22:34:48 +0000 (22:34 +0000)]
binutils: Fix bugs found by -Wpointer-compare
The MIPS bug was introduced by upstream commit 7403cb630, which failed
to account for the additional indirection introduced and also dropped
one of the checks; change it to the standard "NULL-or-empty" check as
used elsewhere in BFD, which is also what upstream now has.
brooks [Thu, 24 Oct 2019 22:23:53 +0000 (22:23 +0000)]
nda(4): Remove unnecessary union and avoid Clang -Wsizeof-array-divwarning
Clang trunk recently gained this new warning, and complains about the
sizeof(trim->data) / sizeof(struct nvme_dsm_range) expression, since
the left hand side's element type (char) does not match the right hand
side's type. The byte buffer is unnecessary so we can remove it to clean
up the code and fix the warning at the same time.
When we receive the packet with the first fragmented part (fragoff=0)
we remember the length of the unfragmentable part and the next header
(and should probably also remember ECN) as meta-data on the reassembly
queue.
Someone replying this packet so far could change these 2 (3) values.
While changing the next header seems more severe, for a full size
fragmented UDP packet, for example, adding an extension header to the
unfragmentable part would go unnoticed (as the framented part would be
considered an exact duplicate) but make reassembly fail.
So do not allow updating the meta-data after we have seen the first
fragmented part anymore.
The frag6_20 test case is added which failed before triggering an
ICMPv6 "param prob" due to the check for each queued fragment for
a max-size violation if a fragoff=0 packet was received.
mckusick [Thu, 24 Oct 2019 21:28:37 +0000 (21:28 +0000)]
After the unlink() of one name of a file with multiple links, a
stat() of one of the remaining names of the file does not show an
updated ctime (inode modification time) until several seconds after
the unlink() completes. The problem only occurs when the filesystem
is running with soft updates enabled. When running with soft updates,
the ctime is not updated until the soft updates background process
has settled all the needed I/O operations.
This commit causes the ctime to be updated immediately during the
unlink(). A side effect of this change is that the ctime is updated
again when soft updates has finished its processing because that
is the time that is correct from the perspective of programs that
look at the disk (like dump). This change does not cause any extra
I/O to be done, it just ensures that stat() updates the ctime before
handing it back.
PR: 241373
Reported by: Alan Somers
Tested by: Alan Somers
MFC after: 3 days
Sponsored by: Netflix
bz [Thu, 24 Oct 2019 20:22:52 +0000 (20:22 +0000)]
frag6: handling of overlapping fragments to conform to RFC 8200
While the comment was updated in r350746, the code was not.
RFC8200 says that unless fragment overlaps are exact (same fragment
twice) not only the current fragment but the entire reassembly queue
for this packet must be silently discarded, which we now do if
fragment offset and fragment length do not match.
bz [Thu, 24 Oct 2019 20:08:33 +0000 (20:08 +0000)]
frag6 test cases: check more counters, wait for expiry
When done with tests check that both the per-VNET and the global-fragmented-
packets-in-system counters are zero to make sure we do not leak counters or
queue entries.
This implies that for all test cases we either have to check for the ICMPv6
packet sent in case of TLL=0 expiry (if it is sent) or sleep at least long
enough for the TTL to expire for all packets (e.g., fragments where we do not
have the off=0 packet).
This also means that statistics are now updated to include all the expired
packets.
There are cases when we do not check for counters to be zero and this is
when testing VNET teardown to behave properly and not panic, when we are
intentionally leaving fragments in the system.
tuexen [Thu, 24 Oct 2019 20:05:10 +0000 (20:05 +0000)]
Ensure that the flags indicating IPv4/IPv6 are not changed by failing
bind() calls. This would lead to inconsistent state resulting in a panic.
A fix for stable/11 was committed in
https://svnweb.freebsd.org/base?view=revision&revision=338986
An accelerated MFC is planned as discussed with emaste@.
Reported by: syzbot+2609a378d89264ff5a42@syzkaller.appspotmail.com
Obtained from: jtl@
MFC after: 1 day
Sponsored by: Netflix, Inc.
bz [Thu, 24 Oct 2019 20:00:37 +0000 (20:00 +0000)]
frag6: export another counter read-only by sysctl
Similar to the system global counter also export the per-VNET counter
"frag6_nfragpackets" detailing the current number of fragment packets
in this VNET's reassembly queues.
The read-only counter is helpful for in-VNET statistical monitoring and
for test-cases.
bz [Thu, 24 Oct 2019 19:57:18 +0000 (19:57 +0000)]
frag6: fix counter leak in error case and optimise code
In case the first fragmented part (off=0) arrives we check for the
maximum packet size for each fragmented part we already queued with the
addition of the unfragmentable part from the first one.
For one we do not have to enter the loop at all if this is the first
fragmented part to arrive, and we can skip the check.
Should we encounter an error case we send an ICMPv6 message for any
fragment exceeding the maximum length limit. While dequeueing the
original packet and freeing it, statistics were not updated and leaked
both the reassembly queue count for the fragment and the global
fragment count. Found by code inspection and confirmed by tightening
test cases checking more statistical and system counters.
bz [Thu, 24 Oct 2019 19:47:32 +0000 (19:47 +0000)]
frag6.c: do not leak packet queue entry in error case
When we are checking for the maximum reassembled packet size of the
fragmentable part and run into the error case (packet too big),
we are leaking the packet queue enntry if this was a first fragment
to arrive.
Properly cleanup, removing the queue entry from the bucket, decrementing
counters, and freeing the memory.
mckusick [Thu, 24 Oct 2019 19:47:18 +0000 (19:47 +0000)]
Soft updates needs to keep an on-disk linked list of inodes that
have been unlinked, but are still referenced by open file descriptors.
These inodes cannot be freed until the final file descriptor reference
has been closed. If the system crashes while they are still being
referenced, these inodes and their referenced blocks need to be
freed by fsck. By having them on a linked list with the head pointer
in the superblock, fsck can quickly find and process them rather
than having to check every inode in the filesystem to see if it is
unreferenced.
When updating the head pointer of this list of unlinked inodes in
the superblock, the superblock check-hash was not getting updated.
If the system crashed with the incorrect superblock check-hash, the
superblock would appear to be corrupted. This patch ensures that
the superblock check-hash is updated when updating the head pointer
of the unlinked inodes list.
There is no need to MFC as superblock check hashes first appeared in
13.0.
gallatin [Thu, 24 Oct 2019 18:39:05 +0000 (18:39 +0000)]
Add a tunable to set the pgcache zone's maxcache
When it is set to 0 (the default), a heavy Netflix-style web workload
suffers from heavy lock contention on the vm page free queue called from
vm_page_zone_{import,release}() as the buckets are frequently drained.
When setting the maxcache, this contention goes away.
We should eventually try to autotune this, as well as make this
zone eligable for uma_reclaim().
Reviewed by: alc, markj
Not Objected to by: jeff
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D22112
jhb [Thu, 24 Oct 2019 18:13:26 +0000 (18:13 +0000)]
Use a counter with a random base for explicit IVs in GCM.
This permits constructing the entire TLS header in ktls_frame() rather
than ktls_seq(). This also matches the approach used by OpenSSL which
uses an incrementing nonce as the explicit IV rather than the sequence
number.
bz [Thu, 24 Oct 2019 12:16:15 +0000 (12:16 +0000)]
frag6: leave a note about upper layer header checks TBD
Per sepcification the upper layer header needs to be within the first
fragment. The check was not done so far and there is an open review for
related work, so just leave a note as to where to put it.
Move the extraction of frag offset up to this as it is needed to determine
whether this is a first fragment or not.
bz [Thu, 24 Oct 2019 11:58:24 +0000 (11:58 +0000)]
frag6: check global limits before hash and lock
Check whether we are accepting more fragments (based on global limits)
before doing expensive operations of calculating the hash and taking the
bucket lock. This slightly increases a "race" between check time and
incrementing counters (which is already there) possibly allowing a few
more fragments than the maximum limits. However, when under attack,
we rather save this CPU time for other packets/work.
bz [Thu, 24 Oct 2019 08:15:40 +0000 (08:15 +0000)]
frag6: small improvements
Rather than walking the mbuf chain manually use m_last() which doing
exactly that for us.
Defer initializing srcifp for longer as there are multiple exit paths
out of the function which do not need it set. Initialize before taking
the lock though.
Rename the mtx lock to match the type better.
bz [Thu, 24 Oct 2019 07:53:10 +0000 (07:53 +0000)]
frag6: remove IP6_REASS_MBUF macro
The IP6_REASS_MBUF() macro did some pointer gynmastics to end up with the
same type as it gets in [*(cast **)&]. Spelling it out instead saves all
this and makes the code more readable and less obfuscated directly using
the structure field.
rrs [Thu, 24 Oct 2019 05:54:30 +0000 (05:54 +0000)]
Fix a small bug in bbr when running under a VM. Basically what
happens is we are more delayed in the pacer calling in so
we remove the stack from the pacer and recalculate how
much time is left after all data has been acknowledged. However
the comparision was backwards so we end up with a negative
value in the last_pacing_delay time which causes us to
add in a huge value to the next pacing time thus stalling
the connection.
jhibbits [Thu, 24 Oct 2019 03:51:33 +0000 (03:51 +0000)]
powerpc/booke: Simplify the MPC85XX PCIe root complex driver
Summary:
Due to bugs in the enumeration code, fsl_pcib_init() was not configuring
sub-bridges properly, so devices hanging off a separate bridge would not
be found. Since the generic PCI code already supports probing child
buses, just delete this code and initialize only the device itself,
letting the generic code handle all the additional probing and
initializing.
This also deletes setup for some PCI peripherals found on some MPC85XX
evaluation boards. The code can be resurrected if needed, but overly
complicated this code in the first place.
erj [Wed, 23 Oct 2019 23:20:49 +0000 (23:20 +0000)]
iflib: call ether_ifdetach and netmap_detach before stop
From Jake:
Calling ether_ifdetach after iflib_stop leads to a potential race where
a stale ifp pointer can remain in the route entry list for IPv6 traffic.
This will potentially cause a page fault or other system instability if
the ifp pointer is accessed.
Move both iflib_netmap_detach and ether_ifdetach to be called prior to
iflib_stop. This avoids the race above, and helps ensure that other ifp
references are removed before stopping the interface.
bz [Wed, 23 Oct 2019 23:10:12 +0000 (23:10 +0000)]
frag6: add "big picture"
Add some ASCII relation of how the bits plug together. The terminology
difference of "fragmented packets" and "fragment packets" is subtle.
While here clear up more whitespace and comments.
bz [Wed, 23 Oct 2019 23:01:18 +0000 (23:01 +0000)]
frag6: replace KAME hand-rolled queues with queue(9) TAILQs
Remove the KAME custom circular queue for fragments and fragmented packets
and replace them with a standard TAILQ.
This make the code a lot more understandable and maintainable and removes
further hand-rolled code from the the tree using a standard interface instead.
Hide the still public structures under #ifdef _KERNEL as there is no
use for them in user space.
The naming is a bit confusing now as struct ip6q and the ip6q[] buckets
array are not the same anymore; sadly struct ip6q is also used by the
MAC framework and we cannot rename it.
markj [Wed, 23 Oct 2019 20:39:21 +0000 (20:39 +0000)]
Modify release_page() to handle a missing fault page.
r353890 introduced a case where we may call release_page() with
fs.m == NULL, since the fault handler may now lock the vnode prior
to allocating a page for a page-in.
Reported by: jhb
Reviewed by: kib
MFC with: r353890
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22120
emaste [Wed, 23 Oct 2019 19:35:26 +0000 (19:35 +0000)]
arm64: enable options NUMA in GENERIC
As with amd64 NUMA is required for reasonable operation on big-iron
arm64 systems and is expected to have no significant impact on small
systems. Enable it now for wider testing in advance of FreeBSD 13.0.
You can use the 'vm.ndomains' sysctl to see if multiple domains are in
use - for example (from Cavium/Marvell ThunderX2):
# sysctl vm.ndomains
vm.ndomains: 2
No objection: manu
Sponsored by: The FreeBSD Foundation
cem [Wed, 23 Oct 2019 19:03:03 +0000 (19:03 +0000)]
amd64: Add CFI directives for libc syscall stubs
No functional change (in program code). Additional DWARF metadata is
generated in the .eh_frame section. Also, it is now a compile-time
requirement that machine/asm.h ENTRY() and END() macros are paired. (This
is subject to ongoing discussion and may change.)
This DWARF metadata allows llvm-libunwind to unwind program stacks when the
program is executing the function. The goal is to collect accurate
userspace stacktraces when programs have entered syscalls.
(The motivation for "Call Frame Information," or CFI for short -- not to be
confused with Control Flow Integrity -- is to sufficiently annotate assembly
functions such that stack unwinders can unwind out of the local frame
without the requirement of a dedicated framepointer register; i.e.,
-fomit-frame-pointer. This is necessary for C++ exception handling or
collecting backtraces.)
For the curious, a more thorough description of the metadata and some
examples may be found at [1] and documentation at [2]. You can also look at
'cc -S -o - foo.c | less' and search for '.cfi_' to see the CFI directives
generated by your C compiler.
rstone [Wed, 23 Oct 2019 17:20:20 +0000 (17:20 +0000)]
Add missing M_NOWAIT flag
The LinuxKPI linux_dma code calls PCTRIE_INSERT with a
mutex held, but does not set M_NOWAIT when allocating
nodes, leading to a potential panic. All of this code
can handle an allocation failure here, so prefer an
allocation failure to sleeping on memory.
Also fix a related case where NOWAIT/WAITOK was not
specified. In this case it's not clear whether sleeping
is allowed so be conservative and assume not. There are
a lot of other paths in this code that can fail due to
a lack of memory anyway.
dim [Wed, 23 Oct 2019 17:02:45 +0000 (17:02 +0000)]
Build toolchain components as dynamically linked executables by default
Summary:
Historically, we have built toolchain components such as cc, ld, etc as
statically linked executables. One of the reasons being that you could
sometimes save yourself from botched upgrades, by e.g. recompiling a
"known good" libc and reinstalling it.
In this day and age, we have boot environments, virtual machine
snapshots, cloud backups, and other much more reliable methods to
restore systems to working order. So I think the time is ripe to flip
this default, and link the toolchain components dynamically, just like
almost all other executables on FreeBSD.
Maybe at some point they can even become PIE executables by default! :)
dim [Wed, 23 Oct 2019 16:57:11 +0000 (16:57 +0000)]
Bump clang's default target CPU for the i386 architecture (aka "x86") to
i686, as per the discussion on the freebsd-arch mailing list. Earlier
in r352030, I had already bumped it to i586, to work around missing
atomic 64 bit functions for the i386 architecture.