Bring kernel space CloudABI code in sync with HEAD.
MFC r312353, r312354 and r312355:
Sync in the latest CloudABI generated source files.
Languages like C++17 and Go provide direct support for slice types:
pointer/length pairs. The CloudABI generator now has more complete for
this, meaning that for the C binding, pointer/length pairs now use an
automatic naming scheme of ${name} and ${name}_len.
Apart from this change and some reformatting, the ABI definitions are
identical. Binary compatibility is preserved entirely.
MFC r315700:
Make file descriptor passing work for CloudABI's sendmsg().
Reduce the potential amount of code duplication between cloudabi32 and
cloudabi64 by creating a cloudabi_sock_recv() utility function. The
cloudabi32 and cloudabi64 modules will then only contain code to convert
the iovecs to the native pointer size.
In cloudabi_sock_recv(), we can now construct an SCM_RIGHTS cmsghdr in
an mbuf and pass that on to kern_sendit().
MFC r315736:
Make file descriptor passing for CloudABI's recvmsg() work.
Similar to the change for sendmsg(), create a pointer size independent
implementation of recvmsg() and let cloudabi32 and cloudabi64 call into
it. In case userspace requests one or more file descriptors, call
kern_recvit() in such a way that we get the control message headers in
an mbuf. Iterate over all of the headers and copy the file descriptors
to userspace.
The existing ELF image activator requires the brandinfo to provide such
a string unconditionally, even if the executable format in question
doesn't use this type of branding. Skip matching when it's a null
pointer.
MFC r315857: Remove "UNMAPPED" messages printed on da periph attach.
I think this message is not very useful for end user. Also its formatting
does not match other messages printed at that time. Those who really need
this information can always find it in `camcontrol negotiate daX -v`.
Leap-second smearing is an experimental option that may be specified in
ntp.conf(5) and the -x option on the command line to spread the effect
of a leap-second over an interval as specified by the leapsmearinterval
config file statement. Recommended values are between 7200 (2 hours) and
86400 (24 hours).
It is advised that leap-second smearing not be used for public NTP
servers (https://www.meinbergglobal.com/download/burnicki/Leap\
%20Second%20Smearing%20With%20NTP.pdf). It is also advised that NTP
clients not use a mix of NTP servers using leap-second smearing with
NTP servers not using leap-second smearing as that could cause
undefined client behaviour.
Leap-second smearing was committed to ports net/ntp and net/ntp-devel
by r426825 on 2016-11-22.
For three years now CAM does not use SIM lock, but still enforces SIM to
use it. Remove this requirement, allowing SIMs to have any locking they
prefer, if they pass no mutex to cam_sim_alloc().
MFC r306041: Always pass -m to ld for converting binary files to ELF
This is in preparation for linking with LLVM's lld, which does not have
a compiled-in default output emulation. lld requires that it is
specified via the -m option, or obtained from the object file(s) being
linked.
This will also allow all build targets to share a common linker binary.
MFC r303156: Remove duplicate symbols from libroken version-script.map
Upstream commit r24759 (efed563) prefixed some symbols with rk_, but
introduced 6 duplicate symbols in the version script (because the
rk_-prefixed versions of the symbols were already present).
MFC r304041:
Move logging via BPF support into separate file.
* make interface cloner VNET-aware;
* simplify cloner code and use if_clone_simple();
* migrate LOGIF_LOCK() to rmlock;
* add ipfw_bpf_mtap2() function to pass mbuf to BPF;
* introduce new additional ipfwlog0 pseudo interface. It differs from
ipfw0 by DLT type used in bpfattach. This interface is intended to
used by ipfw modules to dump packets with additional info attached.
Currently pflog format is used. ipfw_bpf_mtap2() function uses second
argument to determine which interface use for dumping. If dlen is equal
to ETHER_HDR_LEN it uses old ipfw0 interface, if dlen is equal to
PFLOG_HDRLEN - ipfwlog0 will be used.
MFC r304046, 304108:
Add ipfw_nat64 module that implements stateless and stateful NAT64.
The module works together with ipfw(4) and implemented as its external
action module.
Stateless NAT64 registers external action with name nat64stl. This
keyword should be used to create NAT64 instance and to address this
instance in rules. Stateless NAT64 uses two lookup tables with mapped
IPv4->IPv6 and IPv6->IPv4 addresses to perform translation.
A configuration of instance should looks like this:
1. Create lookup tables:
# ipfw table T46 create type addr valtype ipv6
# ipfw table T64 create type addr valtype ipv4
2. Fill T46 and T64 tables.
3. Add rule to allow neighbor solicitation and advertisement:
# ipfw add allow icmp6 from any to any icmp6types 135,136
4. Create NAT64 instance:
# ipfw nat64stl NAT create table4 T46 table6 T64
5. Add rules that matches the traffic:
# ipfw add nat64stl NAT ip from any to table(T46)
# ipfw add nat64stl NAT ip from table(T64) to 64:ff9b::/96
6. Configure DNS64 for IPv6 clients and add route to 64:ff9b::/96
via NAT64 host.
Stateful NAT64 registers external action with name nat64lsn. The only
one option required to create nat64lsn instance - prefix4. It defines
the pool of IPv4 addresses used for translation.
A configuration of instance should looks like this:
1. Add rule to allow neighbor solicitation and advertisement:
# ipfw add allow icmp6 from any to any icmp6types 135,136
2. Create NAT64 instance:
# ipfw nat64lsn NAT create prefix4 A.B.C.D/28
3. Add rules that matches the traffic:
# ipfw add nat64lsn NAT ip from any to A.B.C.D/28
# ipfw add nat64lsn NAT ip6 from any to 64:ff9b::/96
4. Configure DNS64 for IPv6 clients and add route to 64:ff9b::/96
via NAT64 host.
MFC r304048:
Replace __noinline with special debug macro NAT64NOINLINE.
MFC r304061:
Use %ju to print unsigned 64-bit value.
MFC r304076:
Make statistics nat64lsn, nat64stl an nptv6 output netstat-like:
"@value @description" and fix build due to -Wformat errors.
MFC r304378 (by bz):
Try to fix gcc compilation errors (which are right).
nat64_getlasthdr() returns an int, which can be -1 in case of error,
storing the result in an uint8_t and then comparing to < 0 is not
helpful. Do what is done in the rest of the code and make proto an
int here as well.
MFC r309187:
Fix ICMPv6 Time Exceeded error message translation.
MFC r314718:
Use new ipfw_lookup_table() in the nat64 too.
MFC r315204,315233:
Use memset with structure size.
MFC r303012:
Add ipfw_nptv6 module that implements Network Prefix Translation for IPv6
as defined in RFC 6296. The module works together with ipfw(4) and
implemented as its external action module. When it is loaded, it registers
as eaction and can be used in rules. The usage pattern is similar to
ipfw_nat(4). All matched by rule traffic goes to the NPT module.
MFC r304076:
Make statistics nat64lsn, nat64stl an nptv6 output netstat-like:
"@value @description" and fix build due to -Wformat errors.
MFC r314507:
Fix NPTv6 rule counters when one_pass is not enabled.
Consider the rule matching when both @done and @retval values
returned from ipfw_run_eaction() are zero. And modify ipfw_nptv6()
to return IP_FW_DENY and @done=0 when addresses do not match.
MFC r316029: lld: hack version and help output for compatibility with libtool
GNU libtool checks the output from invoking the linker with --version
and --help, in order to determine the linker "flavour" and the command-
ine arguments to use for various link operations (e.g. generating shared
libraries). To detect GNU ld it looks for the strings "GNU" and
"supported targets:.*elf". Since LLD is compatible with GNU ld we
include those same strings to fool libtool.
Quoting from a comment in the change:
This is somewhat ugly hack, but in reality, we had no choice other
than doing this. Considering the very long release cycle of Libtool,
it is not easy to improve it to recognize LLD as a GNU compatible
linker in a timely manner. Even if we can make it, there are still a
lot of "configure" scripts out there that are generated by old
version of Libtool. We cannot convince every software developer to
migrate to the latest version and re-generate scripts. So we have
this hack.
Merge r315910:
Make sendfile(2) more robust against file change. This fixes a possible
crash when the file shrinks. This also fixes sendfile(2) not sending more
data in a case when the file grows, and the request is open-ended or
specifies a size that is greater than old file size.
dim [Sun, 2 Apr 2017 17:24:58 +0000 (17:24 +0000)]
Update clang, llvm, lld, lldb, compiler-rt and libc++ to 4.0.0 release:
MFC r309142 (by emaste):
Add WITH_LLD_AS_LD build knob
If set it installs LLD as /usr/bin/ld. LLD (as of version 3.9) is not
capable of linking the world and kernel, but can self-host and link many
substantial applications. GNU ld continues to be used for the world and
kernel build, regardless of how this knob is set.
It is on by default for arm64, and off for all other CPU architectures.
Sponsored by: The FreeBSD Foundation
MFC r310840:
Reapply 310775, now it also builds correctly if lldb is disabled:
Move llvm-objdump from CLANG_EXTRAS to installed by default
We currently install three tools from binutils 2.17.50: as, ld, and
objdump. Work is underway to migrate to a permissively-licensed
tool-chain, with one goal being the retirement of binutils 2.17.50.
LLVM's llvm-objdump is intended to be compatible with GNU objdump
although it is currently missing some options and may have formatting
differences. Enable it by default for testing and further investigation.
It may later be changed to install as /usr/bin/objdump, it becomes a
fully viable replacement.
Upgrade our copies of clang, llvm, lld, lldb, compiler-rt and libc++ to
4.0.0 (branches/release_40 296509). The release will follow soon.
Please note that from 3.5.0 onwards, clang, llvm and lldb require C++11
support to build; see UPDATING for more information.
Also note that as of 4.0.0, lld should be able to link the base system
on amd64 and aarch64. See the WITH_LLD_IS_LLD setting in src.conf(5).
Though please be aware that this is work in progress.
Release notes for llvm, clang and lld will be available here:
<http://releases.llvm.org/4.0.0/docs/ReleaseNotes.html>
<http://releases.llvm.org/4.0.0/tools/clang/docs/ReleaseNotes.html>
<http://releases.llvm.org/4.0.0/tools/lld/docs/ReleaseNotes.html>
Thanks to Ed Maste, Jan Beich, Antoine Brodin and Eric Fiselier for
their help.
Relnotes: yes
Exp-run: antoine
PR: 215969, 216008
MFC r314708:
For now, revert r287232 from upstream llvm trunk (by Daniil Fukalov):
[SCEV] limit recursion depth of CompareSCEVComplexity
Summary:
CompareSCEVComplexity goes too deep (50+ on a quite a big unrolled
loop) and runs almost infinite time.
Added cache of "equal" SCEV pairs to earlier cutoff of further
estimation. Recursion depth limit was also introduced as a parameter.
This commit is the cause of excessive compile times on skein_block.c
(and possibly other files) during kernel builds on amd64.
We never saw the problematic behavior described in this upstream commit,
so for now it is better to revert it. An upstream bug has been filed
here: https://bugs.llvm.org/show_bug.cgi?id=32142
Reported by: mjg
MFC r314795:
Reapply r287232 from upstream llvm trunk (by Daniil Fukalov):
[SCEV] limit recursion depth of CompareSCEVComplexity
Summary:
CompareSCEVComplexity goes too deep (50+ on a quite a big unrolled
loop) and runs almost infinite time.
Added cache of "equal" SCEV pairs to earlier cutoff of further
estimation. Recursion depth limit was also introduced as a parameter.
Pull in r296992 from upstream llvm trunk (by Sanjoy Das):
[SCEV] Decrease the recursion threshold for CompareValueComplexity
Fixes PR32142.
r287232 accidentally increased the recursion threshold for
CompareValueComplexity from 2 to 32. This change reverses that
change by introducing a separate flag for CompareValueComplexity's
threshold.
The latter revision fixes the excessive compile times for skein_block.c.
The new compiler_rt library imported with clang 4.0.0 have several fatal
issues (non-functional __udivsi3 for example) with ARM specific instrict
functions. As temporary workaround, until upstream solve these problems,
disable all thumb[1][2] related feature.
MFC r315016:
Update clang, llvm, lld, lldb, compiler-rt and libc++ to 4.0.0 release.
We were already very close to the last release candidate, so this is a
pretty minor update.
Relnotes: yes
MFC r316005:
Revert r314907, and pull in r298713 from upstream compiler-rt trunk (by
Weiming Zhao):
builtins: Select correct code fragments when compiling for Thumb1/Thum2/ARM ISA.
Summary:
Value of __ARM_ARCH_ISA_THUMB isn't based on the actual compilation
mode (-mthumb, -marm), it reflect's capability of given CPU.
Due to this:
- use __tbumb__ and __thumb2__ insteand of __ARM_ARCH_ISA_THUMB
- use '.thumb' directive consistently in all affected files
- decorate all thumb functions using
DEFINE_COMPILERRT_THUMB_FUNCTION()
---------
Note: This patch doesn't fix broken Thumb1 variant of __udivsi3 !
Let firmware do its best first, and if it can't, try software recovery.
I would remove software timeout handler completely, but found bunch of
complains on command timeout on sparc64 mailing list few years ago, so
better be safe in case of interrupt loss.
MFC r315579, r315670: Add initial support for multiple MSI-X vectors.
For 24xx and above use 2 vectors (default and response queue).
For 26xx and above use 3 vectors (default, response and ATIO queues).
Due to global lock interrupt hardlers never run simultaneously now, but
at least this allows to save one regitster read per interrupt.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
* In zfs_freebsd_setattr, if the caller wants to set the birthtime,
set the bits that zfs_settattr expects
* In zfs_setattr, if XAT_CREATETIME is set, set xoa_createtime,
expected by zfs_xvattr_set. The two levels of indirection seem
excessive, but it minimizes diffs vs OpenZFS.
* In zfs_setattr, check for overflow of va_birthtime (from delphij)
* Remove red herring in zfs_getattr
sys/cddl/contrib/opensolaris/uts/common/sys/vnode.h
* Un-booby-trap some macros
New tests are under review at https://github.com/pjd/pjdfstest/pull/6
ATF tests have a default WARNS of 0, unlike other usermode programs. This
change is technically a noop, but it documents that the msun tests don't
work with any warnings enabled, at least not on all architectures.
r310146:
Use the right bitwise OR operation for clearing single-step at trap time.
r311912:
Force all TOC references in asm to include '@toc'
r312369:
Use the explicit expanded form of cmp.
r312617:
Hide the 'MOREARGS' macro, it conflicts with contrib code, and is only used
in one file.
r312614:
Don't pass -Wa,-many through clang, the integrated as doesn't support it.
r312659:
Avoid using non-zero argument for __builtin_frame_address().
r312974:
Add a INTR_TRIG_INVALID, and use it in the powerpc interrupt code.
r312977:
Force the setting of bit 7 in the sysmouse packet byte 1 to be unsigned.
r313005:
Update CFLAGS for clang compatibility
r314826:
Clang in base now supports -mlongcall, so remove this hack
Open Firmware runs in virtual mode on the Powermac G5. This runs inside
the
kernel page table, which preserves all address translations made by OF
before
the kernel starts; as a result, the kernel address space is a strict
superset of
OF's.
Where this explodes is if OF uses an unmapped SLB entry. The SLB
fault handler
runs in real mode and refers to the PCPU pointer in SPRG0,
which blows up the
kernel. Having a value of SPRG0 that works for the kernel is
less fatal than
preserving OF's value in this case.
===
The result of this is seemingly random panics from
NULL dereferences, or hangs
immediately upon boot. By not restoring SPRG0 for
Open Firmware entry the
kernel PCPU pointer is preserved and SLB faults
are successful, resulting in a
stable kernel.
Change the "devfs_fsync: vop_stdfsync failed" from panic to a printf.
It's not a proper fix, but should be better than what we have now.
Since it got broken some six months ago it results in an incredibly
annoying and trivially reproducible panic every time eg an USB disk
gets disconnected.
Get rid of foo_sys() in linuxulator code. It was commented out, and it
would be useless anyway - there is no point in pretending to have block
devices; our "block" devices are in fact character ones, and can only
be accessed as such.
Most of these are null pointer dereferences or missing error checks in the
unit tests. One is a missing error check in xnb_attach_failed. None can
cause real problems in running systems.
If one of the scripts listed in (daily|weekly|monthly)_local is executable,
999.local should simply execute it. Only if the script isn't executable
should 999.local assume it needs /bin/sh.
Interesting fixes which were not already merged: 0c7c611 Merge C++ demangler bug fixes from ELF Tool Chain (#40) 2b208d9 __cxa_demangle_gnu3: demangle 'z' as '...', not 'ellipsis' (#41)
mm [Fri, 31 Mar 2017 20:16:24 +0000 (20:16 +0000)]
MFC r315636,315876,316095:
Sync libarchive with vendor
Vendor changes/bugfixes (FreeBSD-related):
r315636:
PR 867 (bsdcpio): show numeric uid/gid when names are not found
PR 870 (seekable zip): accept files with valid ZIP64 EOCD headers
PR 880 (pax): Fix handling of "size" pax header keyword
PR 887 (crypto): Discard 3072 bytes instead of 1024 of first keystream
OSS-Fuzz issue 806 (mtree): rework mtree_atol10 integer parser
Break ACL read/write code into platform-specific source files
r315876:
Store extended attributes with extattr_set_link() if no fd is provided
Add extended attribute tests to libarchive and bsdtar
Fix tar's test_option_acls
Support the UF_HIDDEN file flag
r316095:
Constify variables in several places
Unify platform ACL code in a single source file
Fix unused variable if compiling on FreeBSD without NFSv4 ACL support
ed [Fri, 31 Mar 2017 08:43:07 +0000 (08:43 +0000)]
MFC r315892:
Include <sys/systm.h> to obtain the memcpy() prototype.
I got a report of this source file not building on Raspberry Pi. It's
interesting that this only fails for that target and not for others.
Again, that's no reason not to include the right headers.
truckman [Fri, 31 Mar 2017 06:20:06 +0000 (06:20 +0000)]
MFC r315516
Change several constants used by the PIE algorithm from unsigned to signed.
- PIE_MAX_PROB is compared to variable of int64_t and the type promotion
rules can cause the value of that variable to be treated as unsigned.
If the value is actually negative, then the result of the comparsion
is incorrect, causing the algorithm to perform poorly in some
situations. Changing the constant to be signed cause the comparision
to work correctly.
- PIE_SCALE is also compared to signed values. Fortunately they are
also compared to zero and negative values are discarded so this is
more of a cosmetic fix.
- PIE_DQ_THRESHOLD is only compared to unsigned values, but it is small
enough that the automatic promotion to unsigned is harmless.
dchagin [Thu, 30 Mar 2017 20:12:23 +0000 (20:12 +0000)]
MFC r314402:
FreeBSD does not have analgue for epoll EPOLLPRI event type.
So, do not set EPOLLPRI event acidently.
Also, do not set EPOLLWRNORM and EPOLLRDNORM events as epoll
do not set this events.
dchagin [Thu, 30 Mar 2017 20:00:57 +0000 (20:00 +0000)]
MFC r314293:
Return EOVERFLOW error in case then the size of tv_sec field of struct timespec
in COMPAT_LINUX32 Linuxulator's not equal to the size of native tv_sec.
tsoome [Thu, 30 Mar 2017 17:23:40 +0000 (17:23 +0000)]
boot1.efi: can't boot from ZFS on 4kn HDD
The boot1.efi immediate issue from PR216964 is that we are reading into
too small buffer, from UEFI spec 2.6:
The size of the Buffer in bytes. This must be a multiple of the intrinsic block size of the device.
The secondary issue is that LBA calculation does not check reminder from
division.
This fix does check the provided buffer size and if we read less than
media sector size or the read offset is not aligned to sector boundary,
we allocate bounce buffer and perform the read by single sector.
ae [Thu, 30 Mar 2017 14:20:27 +0000 (14:20 +0000)]
MFC r303018:
Add named dynamic states support to ipfw(4).
The keep-state, limit and check-state now will have additional argument
flowname. This flowname will be assigned to dynamic rule by keep-state
or limit opcode. And then can be matched by check-state opcode or
O_PROBE_STATE internal opcode. To reduce possible breakage and to maximize
compatibility with old rulesets default flowname introduced.
It will be assigned to the rules when user has omitted state name in
keep-state and check-state opcodes. Also if name is ambiguous (can be
evaluated as rule opcode) it will be replaced to default.
MFC r304087:
Do not warn about ambiguous state name when we inspect a comment token.
MFC r304089:
Add an ability to attach comment to check-state rules.
MFC r310727 (by marius):
Fix a bug in r272840; given that the optlen parameter of setsockopt(2)
is a 32-bit socklen_t, do_get3() passes the kernel to access the wrong
32-bit half on big-endian LP64 machines when simply casting the 64-bit
size_t optlen to a socklen_t pointer.
While at it and given that the intention of do_get3() apparently is to
hide/wrap the fact that socket options are used for communication with
ipfw(4), change the optlen parameter of do_set3() to be of type size_t
and as such more appropriate than uintptr_t, too.
MFC r315305:
Change the syntax of ipfw's named states.
Since the state name is an optional argument, it often can conflict
with other options. To avoid ambiguity now the state name must be
prefixed with a colon.
r314547
loader.efi: reduce the size of the staging area if necessary
The loader assumes physical memory in [2MB, 2MB + EFI_STAGING_SIZE)
is Conventional Memory, but actually it may not, e.g. in the case
of Hyper-V Generation-2 VM (i.e. UEFI VM) running on Windows
Server 2012 R2 host, there is a BootServiceData memory block at
the address 47.449MB and the memory is not writable.
Without the patch, the loader will crash in efi_copy_finish():
see PR 211746.
The patch verifies the end of the staging area, and reduces its
size if necessary. This way, the loader will not try to write into
the BootServiceData memory any longer.
Thank Marcel Moolenaar for helping me on this issue!
The patch also allocates the staging area in the first 1GB memory.
See the comment in the patch for this.
r314770
loader.efi: fix recent UEFI-boot regression on physical machines
This patch fixes my recent patch
"loader.efi: reduce the size of the staging area if necessary", which
causes EFI-boot failure on physical machines since Mar 2:
on the host there is a 1MB LoaderData memory range, which splits
the big Conventional Memory range into a small one (15MB) and a
big one: the small one is too small to hold the staging area.
We can actually use the LoaderData range safely, because when
amd64_tramp -> efi_copy_finish() starts to run, we're almost at
the very end of the efi loader code and we're going to "return"
to the kernel entry, so we're pretty sure we won't access any loader
data any more.
For people who are interested in the details: please see
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211746#c22
PS, some people also reported the regression happened to FreeBSD VM
running on Bhyve in EFI mode. This patch should resolve it too,
though I don't have such a setup to test.
r314828
loader.efi: fix an off-by-one bug in efi_verify_staging_size()
Also remove the warning message: it may not be unusual to see
the memory range containing 2MB is not of EfiConventionalMemory.
Sponsored by: Microsoft
r314891
loader.efi: finally fix the off-by-one bug in efi_verify_staging_size()
r314828(loader.efi: fix an off-by-one bug in efi_verify_staging_size())
doesn't really fix the bug and this patch adds the missing part.
It's a shame that I didn't make everything correct at the very beginning...
Sponsored by: Microsoft
r314956
loader.efi: only reduce the size of the staging area on Hyper-V
Doing this on physical hosts turns out to be problematic, e.g. see comment
24 and 28 in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211746.
To fix the real underlying issue correctly & thoroughly, IMO we need
a relocatable kernel, but that would require a lot of complicated long
term work: https://reviews.freebsd.org/D9686?id=25414#inline-56969
For now, let's only apply efi_verify_staging_size() to VMs running on
Hyper-V, and restore the old behavior on physical machines since that
has been working for people for a long period of time, though that's
potentially unsafe...
Sponsored by: Microsoft
r314962
loader.efi: only include the machine/ header files on x86
The 2 files may not exist on other archs like aarch64 and hence we
can have a build failure there.
Reported by: lwhsu
Sponsored by: Microsoft
r315235
loader.efi: use stricter check for Hyper-V
Some other hypervisors like Xen can pretend to be Hyper-V but obviously
they can't implement all Hyper-V features. Let's make sure we're genuine
Hyper-V here.