Merge the following from ^/projects/release-arm64 to allow
building FreeBSD/arm64 VM images and memstick.img installation
medium:
r281786, r281788, r281792:
r281786:
Add support for building arm64/aarch64 virtual machine images.
r281788:
Copy amd64/make-memstick.sh to arm64/make-memstick.sh for
aarch64 memory stick images.
Although arm64 does not yet have USB support, the memstick
image should be bootable with certain virtualization tools,
such as qemu.
r281792:
Add a buildenv_setup() prototype, intended to be overridden as
needed.
For example, the arm64/aarch64 build needs devel/aarch64-binutils,
so buildenv_setup() in the release.conf for this architecture
handles the installation of the port before buildworld/buildkernel.
Mark Johnston [Mon, 20 Apr 2015 22:08:11 +0000 (22:08 +0000)]
Move the definition of struct bpf_if to bpf.c.
A couple of fields are still exposed via struct bpf_if_ext so that
bpf_peers_present() can be inlined into its callers. However, this change
eliminates some type duplication in the resulting CTF container, since
otherwise ctfmerge(1) propagates the duplication through all types that
contain a struct bpf_if.
- Speed up significantly by not using subshells for data already fetched.
Ran against /usr/local/sbin/pkg:
Before: 25.12 real 12.41 user 33.14 sys
After: 0.53 real 0.49 user 0.13 sys
- Exit with 1 if any missing or unresolved symbol is detected.
- Add option '-U' to skip looking up unresolved symbols.
- Don't consider provided weak objects as unresolved (nm V).
phabricator related changes:
- don't lint either contrib or crypto: these are both externally written
directories
- add additional linters for spelling (check common typos like teh ->
the)
- chmod linter checks for executable bit on bad files
- merge-conflict checks for merge conflict tokens that may have been
resolved incorrectly
- filename checks for bad characters in filenames
- json for json syntax correctness
- remove history.immutable: it is meaningless on subversion, and causes
workflow problems when trying to use git. It is set to 'true' by
default with hg
Eric van Gyzen [Mon, 20 Apr 2015 20:03:26 +0000 (20:03 +0000)]
Always send log(9) messages to the message buffer.
It is truer to the semantics of logging for messages to *always*
go to the message buffer, where they can eventually be collected
and, in fact, be put into a log file.
This restores the behavior prior to r70239, which seems to have
changed it inadvertently.
Submitted by: Eric Badger <eric@badgerio.us>
Reviewed by: jhb
Approved by: kib (mentor)
Obtained from: Dell Inc.
MFC after: 1 week
When building VM disk images, vm_copy_base() uses tar(1) to
copy the userland from one md(4)-mounted filesystem to a clean
filesystem to prevent remnants of files that were added and
removed from resulting in an unclean filesystem. When newfs(8)
creates the first filesystem with journaled soft-updates enabled,
the /.sujournal file in the new filesystem cannot be overwritten
by the /.sujournal in the original filesystem.
To avoid this particular error case, do not enable journaled
soft-updates when creating the md(4)-backed filesystems, and
instead use tunefs(8) to enable journaled soft-updates after
the new filesystem is populated in vm_copy_base().
While here, fix a long standing bug where the build environment
/boot files were used by mkimg(1) when creating the VM disk
images by using the files in .OBJDIR.
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Ed Maste [Mon, 20 Apr 2015 17:43:55 +0000 (17:43 +0000)]
vidcontrol: skip invalid video modes returned by vt(4)
vt(4) has a stub CONS_MODEINFO ioctl that does not provide any data
but returns success. This needs to be fixed in the kernel, but address
it in vidcontrol(1) as well in case it's run on an older kernel.
Reviewed by: bde
Sponsored by: The FreeBSD Foundation
Alexander Motin [Mon, 20 Apr 2015 10:44:46 +0000 (10:44 +0000)]
Activate write-only optimization if bpf device opened with O_WRONLY.
dhclient opens bpf as write-only to send packets. It never reads received
packets from that descriptor, but processing them in the kernel takes time.
Packet timestamping is especially costly on systems with an expensive
timecounter, such as a bhyve guest, where network speed dropped by half.
Remove code to support the top of the stack layout for FreeBSD 1.x/2.x
kernel, but keep explanation of the old ps_strings structure to make
it clear what sanity check tries to accomplish.
Noted by: Oliver Pinter <oliver.pinter@hardenedbsd.org>
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
ed(1): Fix [-Werror=logical-not-parentheses]
/usr/src/bin/ed/glbl.c:64:36: error: logical not is only applied to
the left hand side of comparison [-Werror=logical-not-parentheses]
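As an illustration of what this warning is about, here is a minimal sketch.
The function names are hypothetical and the code is not taken from
bin/ed/glbl.c; it only shows why '!a == b' and '!(a == b)' can differ.

```c
#include <stdbool.h>

/* Hypothetical helpers for illustration; not the code from glbl.c. */

/*
 * What the compiler sees for "!a == b": '!' binds tighter than '==',
 * so the expression is parsed as (!a) == b.  The parentheses are
 * written out here so that the sketch itself builds without the warning.
 */
static bool
not_binds_first(int a, int b)
{
	return ((!a) == b);
}

/* What is usually intended: negate the result of the whole comparison. */
static bool
not_whole_comparison(int a, int b)
{
	return (!(a == b));
}
```

With a = 2 and b = 1 the first form is false (0 == 1) while the second is
true, which is the kind of silent difference the warning is meant to catch.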
Marius Strobl [Sun, 19 Apr 2015 20:15:57 +0000 (20:15 +0000)]
Refine the workaround for Intel HSD131 [1] added in r269052:
- Use the full mask described by the erratum as with a sufficiently high
number of these false-positives, the overflow bit (bit 62) additionally
gets set [7].
- HSD131 has been carried over into several other Haswell-derived CPUs,
including the next generation, i.e. Intel Broadwell. Thus, also skip
reporting of these benign errors by default on CPU models affected by
HSM142, HSW131 and BDM48 [2 - 5], which describe the HSD131 silicon bug for
additional models.
Also, the Celeron 2955U with a CPU ID of 0x45 has been reported to be
affected by this fault [6], although the specification update concerned
with HSM142 [2] only refers to 0x3c and 0x46.
Submitted by: David Froehlich [7]
MFC after: 3 days
Adrian Chadd [Sun, 19 Apr 2015 17:15:55 +0000 (17:15 +0000)]
Refactor out the _PXM -> VM domain lookup done in ACPI, in preparation for
its use in upcoming code.
This is inspired by something in jhb's NUMA IRQ allocation patchset.
However, the tricky bit here is that the PXM lookup for a node may
fail, requiring a lookup on the parent node. So if it doesn't
exist, don't fail - just go up to the parent. Only error out of the
lookup if the ACPI lookup returns an error.
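The parent-walking rule described above can be sketched roughly as follows.
This is a userland model only: struct node and pxm_lookup() are hypothetical
stand-ins, whereas the real code works on ACPI handles and evaluates _PXM.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for an ACPI namespace node.  All fields are
 * assumptions made for this sketch.
 */
struct node {
	struct node	*parent;
	int		 has_pxm;	/* nonzero if this node defines _PXM */
	int		 pxm;		/* proximity domain, valid if has_pxm */
	int		 error;		/* nonzero simulates a failed lookup */
};

/*
 * Returns 0 and stores the domain in *domain on success, -1 if a lookup
 * reported an error, and 1 if neither the node nor any ancestor defines
 * _PXM.
 */
static int
pxm_lookup(const struct node *n, int *domain)
{
	for (; n != NULL; n = n->parent) {
		if (n->error != 0)
			return (-1);	/* a real failure: stop here */
		if (n->has_pxm) {
			*domain = n->pxm;
			return (0);
		}
		/* _PXM is simply absent on this node: try the parent. */
	}
	return (1);
}
```

The point of the design is that "not present" and "lookup failed" are kept
apart: only the latter ends the walk early.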
Adrian Chadd [Sun, 19 Apr 2015 17:07:51 +0000 (17:07 +0000)]
Update pkt-gen to optionally use randomised source/destination
IPv4 addresses/ports.
When doing traffic testing of actual code that /does/ things to the
packet (rather than say, 'bridge.c'), it's typically a good idea to
use a variety of cache-busting and flow-tracking-busting packet
spreads. The pkt-gen method of testing an IP range was to walk
it linearly - which is fine, but not useful enough.
This can be used to completely randomize the source/destination
addresses (eg to test out flow-tracking-busting) and to keep the
destination fixed whilst randomising the source (eg to test out
what a DDoS may look like.)
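A rough sketch of the difference between the linear walk and a randomised
pick follows. struct ip_range and both functions are hypothetical and do not
mirror pkt-gen's own data structures; rand() is used only for brevity and
the modulo introduces a small bias.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical address range, in host byte order; end is inclusive. */
struct ip_range {
	uint32_t start;
	uint32_t end;
	uint32_t cur;	/* next address for the linear walk */
};

/* Linear walk: return the current address, then advance and wrap. */
static uint32_t
next_linear(struct ip_range *r)
{
	uint32_t a = r->cur;

	r->cur = (r->cur == r->end) ? r->start : r->cur + 1;
	return (a);
}

/* Randomised pick: any address in [start, end] on each call. */
static uint32_t
next_random(const struct ip_range *r)
{
	uint32_t span = r->end - r->start + 1;
	uint32_t v = ((uint32_t)rand() << 16) ^ (uint32_t)rand();

	/* span wraps to 0 when the range covers the whole 32-bit space. */
	return (span == 0 ? v : r->start + v % span);
}
```

Keeping the destination fixed while randomising the source then amounts to
giving the destination a range whose start and end are equal.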
identd: remove redundant zeroing
se_rpc_lowvers was set to 0 twice, so remove one of them
I cannot find any other variable that one of them may have been a typo for.
README: changes and fixups
Two orthogonal goals:
- try to make README look a little nicer on phabricator by using
Remarkup syntax for commands (using `` instead of using a closing ')
- try to make README look a little nicer on github.
- Don't encourage `make world` when the handbook specifies otherwise
- Change language around documentation to be a bit clearer
sh: Fix the trap builtin to be POSIX-compliant for 'trap exit SIG' and 'trap n n...'.
The parser considered 'trap exit INT' to reset the default for both EXIT and
INT. This behavior is not POSIX compliant. This was avoided if a value was
specified for 'exit', but then disallows exiting with the signal received. A
possible workaround is using ' exit'.
However POSIX does allow this type of behavior if the parameters are all
integers. Fix the handling for this and clarify its support in the manpage
since it is specifically allowed by POSIX.
The lseek(2), mmap(2), truncate(2), ftruncate(2), pread(2), and
pwrite(2) syscalls are wrapped to provide compatibility with pre-7.x
kernels which required padding before the off_t parameter. The
fcntl(2) contains compatibility code to handle kernels before the
struct flock was changed during the 8.x CURRENT development. The
shims were reasonable to allow easier revert to the older kernel at
that time.
Now, two or three major releases later, shims do not serve any
purpose. Such old kernels cannot handle current libc, so revert the
compatibility code.
Make padded syscalls support conditional under the COMPAT6 config
option. For COMPAT32, the syscalls were under COMPAT6 already.
Remove WITHOUT_SYSCALL_COMPAT build option, whose only purpose was to
(partially) disable the removed shims.
Reviewed by: jhb, imp (previous versions)
Discussed with: peter
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This supports e500v1, e500v2, and e500mc. Tested only on e500v2, but the
performance counters are identical across all, with e500mc having some
additional events.
Make wait6(2), waitid(3) and ppoll(2) cancellation points. The
waitid() function is required to be cancellable by the standard. The
wait6() and ppoll() follow the other syscalls in their groups.
Reviewed by: jhb, jilles (previous versions)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Mark Johnston [Sat, 18 Apr 2015 21:00:36 +0000 (21:00 +0000)]
Add manual pages for the io, ip, proc, sched, tcp and udp DTrace providers.
The format of these pages is somewhat experimental, so they may be subject
to further tweaking.
Mark Johnston [Sat, 18 Apr 2015 20:36:58 +0000 (20:36 +0000)]
Remove unimplemented sched provider probes.
They were added for compatibility with the sched provider in Solaris and
illumos, but our sched provider is already incompatible since it uses native
types, so there isn't much point in keeping them around.
Mark Johnston [Sat, 18 Apr 2015 20:31:59 +0000 (20:31 +0000)]
SDT(9): add a section on SDT providers, mentioning the "sdt" provider.
Add examples demonstrating how one can list available providers and the
DTrace probes provided by a provider.
Alexander Motin [Sat, 18 Apr 2015 20:10:19 +0000 (20:10 +0000)]
Workaround bhyve virtual disks operation on top of GEOM providers.
GEOM does not support scatter/gather lists in its I/Os. Such requests
are cut into pieces by physio(), which may be problematic if those pieces
are not multiples of the provider's sector size. If such a case is detected,
move the data through a temporary sequential buffer.
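The check and the bounce-buffer copy can be sketched as below for the write
direction. This is not the bhyve code: iov_sector_aligned() and iov_gather()
are hypothetical helper names, and the sketch works on plain struct iovec
arrays in userland.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Return nonzero if every segment length is a multiple of the sector
 * size, i.e. the request can be split per segment without trouble.
 */
static int
iov_sector_aligned(const struct iovec *iov, int iovcnt, size_t secsz)
{
	for (int i = 0; i < iovcnt; i++)
		if (iov[i].iov_len % secsz != 0)
			return (0);
	return (1);
}

/*
 * Copy all segments into one sequential buffer.  Returns the number of
 * bytes copied, or 0 if the buffer is too small for the whole request.
 */
static size_t
iov_gather(const struct iovec *iov, int iovcnt, void *buf, size_t bufsz)
{
	size_t off = 0;

	for (int i = 0; i < iovcnt; i++) {
		if (iov[i].iov_len > bufsz - off)
			return (0);
		memcpy((char *)buf + off, iov[i].iov_base, iov[i].iov_len);
		off += iov[i].iov_len;
	}
	return (off);
}
```

A caller would issue the I/O directly when iov_sector_aligned() holds and
otherwise gather into the temporary buffer first; a read would do the
reverse copy after the I/O completes.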
Initialize td_sel in the thread_init(). Struct thread is not zeroed
on the initial allocation, but seltdinit() assumes that td_sel is NULL
or a valid pointer. Note that thread_fini()/seltdfini() also relies
on this, but correctly resets td_sel to NULL.
Change ipsec_address() and ipsec_logsastr() functions to take two
additional arguments - buffer and size of this buffer.
ipsec_address() is used to convert sockaddr structure to presentation
format. The IPv6 part of this function returns a pointer to an on-stack
buffer, which becomes invalid by the time the caller uses it. The IPv4
version uses 4 static buffers and returns a pointer to a new buffer each
time it is called. But it is still possible to get corrupted data when
several threads use this function.
ipsec_logsastr() is used to format string about SA entry. It also
uses static buffer and has the same problem with concurrent threads.
To fix these problems add the buffer pointer and size of this
buffer to arguments. Now each caller will pass buffer and its size
to these functions. Also convert all places where these functions
are used (except disabled code).
And now ipsec_address() uses inet_ntop() function from libkern.
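The caller-supplied buffer pattern can be sketched in userland as follows.
addr_to_str() is a hypothetical name and its signature is an assumption; the
log does not show the exact kernel prototype. The sketch uses the userland
inet_ntop(3) rather than the libkern one.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stddef.h>

/*
 * Convert a sockaddr to presentation format in a buffer owned by the
 * caller.  No static or on-stack storage is returned, so concurrent
 * callers cannot overwrite each other's result.  Returns buf on
 * success and NULL on an unsupported family or a too-small buffer.
 */
static const char *
addr_to_str(const struct sockaddr *sa, char *buf, socklen_t size)
{
	switch (sa->sa_family) {
	case AF_INET:
		return (inet_ntop(AF_INET,
		    &((const struct sockaddr_in *)sa)->sin_addr, buf, size));
	case AF_INET6:
		return (inet_ntop(AF_INET6,
		    &((const struct sockaddr_in6 *)sa)->sin6_addr, buf, size));
	default:
		return (NULL);
	}
}
```

A caller would typically declare a local char buf[INET6_ADDRSTRLEN] and pass
sizeof(buf), so the lifetime of the string is tied to the caller's frame.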
Requeue mbuf via netisr when we use IPSec tunnel mode and IPv6.
ipsec6_common_input_cb() uses a partial copy of ip6_input() to parse
headers. But this isn't correct when we use tunnel mode IPSec.
When we strip the outer IPv6 header from the decrypted packet, it
can become an IPv4 packet and should be handled by ip_input. Also when
we use tunnel mode IPSec with IPv6 traffic, we should pass the decrypted
packet with the inner IPv6 header to ip6_input; it will correctly handle
it and can also decide to forward it.
The "skip" variable points to offset where payload starts. In tunnel
mode we reset it to zero after stripping the outer header. So, when
it is zero, we should requeue mbuf via netisr.
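The decision described above can be modelled roughly as below. This sketch
only classifies a decrypted packet by the IP version nibble in its first
byte; the names are hypothetical and the actual requeueing through netisr is
not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
enum next_input { NEXT_DROP, NEXT_IPV4, NEXT_IPV6 };

/*
 * In tunnel mode "skip" is reset to zero once the outer header has been
 * stripped, so a zero value means the packet must be requeued.
 */
static int
needs_requeue(int skip)
{
	return (skip == 0);
}

/*
 * Pick the input path for the inner packet.  Both IPv4 and IPv6 carry
 * the version number in the upper four bits of the first byte.
 */
static enum next_input
classify_inner(const uint8_t *pkt, size_t len)
{
	if (len < 1)
		return (NEXT_DROP);
	switch (pkt[0] >> 4) {
	case 4:
		return (NEXT_IPV4);
	case 6:
		return (NEXT_IPV6);
	default:
		return (NEXT_DROP);
	}
}
```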
Fix handling of scoped IPv6 addresses in IPSec code.
* in ipsec_encap() embed scope zone ids into link-local addresses
in the new IPv6 header, this helps ip6_output() disambiguate the
scope;
* teach key_ismyaddr6() to use in6_localip(). in6_localip() is less
strict than key_sockaddrcmp(). It doesn't compare all fields of
struct sockaddr_in6, but it is faster and it should be safe,
because all SA's data was checked for correctness. Also, since
IPv6 link-local addresses in the &V_in6_ifaddrhead are stored in
kernel-internal form, we need to embed scope zone id from SA into
the address before calling in6_localip.
* in ipsec_common_input() take scope zone id embedded in the address
and use it to initialize sin6_scope_id, then use this sockaddr
structure to lookup SA, because we keep addresses in the SADB without
embedded scope zone id.
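To make the two address forms concrete, here is a userland sketch. It assumes
the KAME-style kernel-internal layout, in which the zone id of a link-local
address is kept in the second 16-bit word of the address; embed_scope() and
extract_scope() are hypothetical names, not the kernel's own helpers.

```c
#include <netinet/in.h>
#include <stdint.h>

/*
 * Embed the low 16 bits of the zone id into bytes 2-3 of a link-local
 * address (network byte order).  Other addresses are left untouched.
 */
static void
embed_scope(struct in6_addr *a, uint32_t zoneid)
{
	if (IN6_IS_ADDR_LINKLOCAL(a)) {
		a->s6_addr[2] = (zoneid >> 8) & 0xff;
		a->s6_addr[3] = zoneid & 0xff;
	}
}

/*
 * Reverse step: return the embedded zone id (a candidate value for
 * sin6_scope_id) and clear it from the address.  Returns 0 for
 * addresses that are not link-local.
 */
static uint32_t
extract_scope(struct in6_addr *a)
{
	uint32_t zoneid = 0;

	if (IN6_IS_ADDR_LINKLOCAL(a)) {
		zoneid = ((uint32_t)a->s6_addr[2] << 8) | a->s6_addr[3];
		a->s6_addr[2] = 0;
		a->s6_addr[3] = 0;
	}
	return (zoneid);
}
```

The first function corresponds to the direction used before comparing
against addresses kept in embedded form, the second to the direction used
before an SADB lookup, where addresses carry no embedded zone id.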
The only thing used from this code is the ipip_output() function, which does
IPIP encapsulation. Other parts of XF_IP4 code were removed in r275133.
Also it isn't possible to configure the use of XF_IP4, either from userland
via setkey(8) or from the kernel.
Simplify the ipip_output() function and rename it to ipsec_encap().
* move IP_DF handling from ipsec4_process_packet() into ipsec_encap();
* since ipsec_encap() is called from ipsec[64]_process_packet(), it
is safe to assume that mbuf is contiguous at least to IP header
for used IP version. Remove all unneeded m_pullup(), m_copydata
and related checks.
* use V_ip_defttl and V_ip6_defhlim for outer headers;
* use V_ip4_ipsec_ecn and V_ip6_ipsec_ecn for outer headers;
* move all diagnostic messages to the ipsec_encap() callers;
* simplify handling of ipsec_encap() results: if it returns non zero
value, print diagnostic message and free mbuf.
* some style(9) fixes.
More accurately collect name-cache statistics in sysctl functions
sysctl_debug_hashstat_nchash() and sysctl_debug_hashstat_rawnchash().
These changes are in preparation for allowing changes in the size
of the vnode hash tables driven by increases and decreases in the
maximum number of vnodes in the system.
Add the necessary support to use both TX queues available on if_emac.
Each TX queue can hold one packet (yes, if_emac can send only two(!)
packets at a time).
Even with this change the very limited FIFO buffers (3 KiB for TX and 13 KiB
for RX) fill up too quickly to sustain higher throughput.
For the TCP case it turns out that TX isn't the limiting factor, but the RX
side is (the FIFO fills up and starts to discard packets, so the sender has
to slow down).
Pedro F. Giffuni [Fri, 17 Apr 2015 22:26:01 +0000 (22:26 +0000)]
Drop experimental dir_index support.
The htree directory index is a highly desirable feature for research
purposes and was meant to improve performance in our ext2/3 driver.
Unfortunately our implementation has two problems:
- It never really delivered any performance improvement.
- It appears to corrupt the filesystem in undetermined circumstances.
Strictly speaking dir_index is not required for read/write support in
ext2/3 and our limited ext4 support still works fine without it.
Regain stability in the ext2 driver by removing it. We may need it back
(fixed) if we want to support encrypted ext4 but thanks to the
wonders of version control we can always revert this change and bring it
back.