MFC r303019:
Use g_resize_provider() to change the size of GEOM_DISK provider,
when it is being opened. This should fix the possible loss of a resize
event when disk capacity changed.
MFC r303288:
Do not invoke resize method if geom is being withered.
MFC r303637:
Do not invoke resize event if initial disk size is zero. Some disks
report the size only after first opening. And due to the events are
asynchronous, some consumers can receive this event too late and
this confuses them. This partially restores previous behaviour, and
at the same time this should fix the problem, when already opened
provider loses resize event.
MFC 304858,305485,305497: Fix various issues with PCI pass through and VT-d.
304858:
Enable I/O MMU when PCI pass through is first used.
Rather than enabling the I/O MMU when the vmm module is loaded,
defer initialization until the first attempt to pass a PCI device
through to a guest. If the I/O MMU fails to initialize or is not
present, than fail the attempt to pass a PCI device through to a
guest.
The hw.vmm.force_iommu tunable has been removed since the I/O MMU is
no longer enabled during boot. However, the I/O MMU support can be
disabled by setting the hw.vmm.iommu.enable tunable to 0 to prevent
use of the I/O MMU on any systems where it is buggy.
305485:
Leave ppt devices in the host domain when they are not attached to a VM.
This allows a pass through device to be reset to a normal device driver
on the host and reused on the host. ppt devices are now always active in
some I/O MMU domain when the I/O MMU is active, either the host domain
or the domain of a VM they are attached to.
305497:
Update the I/O MMU in bhyve when PCI devices are added and removed.
When the I/O MMU is active in bhyve, all PCI devices need valid entries
in the DMAR context tables. The I/O MMU code does a single enumeration
of the available PCI devices during initialization to add all existing
devices to a domain representing the host. The ppt(4) driver then moves
pass through devices in and out of domains for virtual machines as needed.
However, when new PCI devices were added at runtime either via SR-IOV or
HotPlug, the I/O MMU tables were not updated.
This change adds a new set of EVENTHANDLERS that are invoked when PCI
devices are added and deleted. The I/O MMU driver in bhyve installs
handlers for these events which it uses to add and remove devices to
the "host" domain.
MFC 303881: Reliably return PCI_GETCONF_LAST_DEVICE from PCIOCGETCONF.
Previously the loop in PCIIOCGETCONF would terminate as soon as it
found enough matches. Now it will continue iterating through the
PCI device list and only terminate if it finds another matching device
for which it has no room to store a conf structure. This means that
PCI_GETCONF_LAST_DEVICE is reliably returned when the number of
matching devices is equal to the number of slots in the matches
buffer. For example, if a program requests the conf structure for a
single PCI function with a specified domain/bus/slot/function it will
now get PCI_GETCONF_LAST_DEVICE instead of PCI_GETCONF_MORE_DEVS.
While here, simplify the loop conditional a bit more by explicitly
breaking out of the loop if copyout() fails and removing a redundant
i < pci_numdevs check.
MFC 303887: Add a dmardump utility to dump the VT-d context tables.
This tool parses the ACPI DMAR table looking for DMA remapping devices.
For each device it walks the root table and any context tables
referenced to display mapping info for PCI devices.
Note that acpidump -t already parses the info in the ACPI DMAR tables
directly. This tool examines some of the data structures the DMAR
remapping engines use to translate DMA requests.
- Add constants for the fields in the root-entry table address register,
namely the root type type (RTT) and root table address (RTA) mask.
- Add macros for the bitmask of the domain ID field in the second word
of context table entries as well as a helper macro (DMAR_CTX2_GET_DID)
to extract the domain ID from a context table entry.
MFC 303204: Install a handler for firmware work request error messages.
If a driver sends an malformed or disallowed work request, the firmware
responds with a work request error. Previously the driver treated this is
as an unexpected message and panicked. Now it decodes the error message
to aid in debugging.
MFC 303721: Permit the name of the /dev/iov entry to be set by the driver.
The PCI_IOV option creates character devices in /dev/iov for each PF
device driver that registers support for creating VFs. By default the
character device is named after the PF device (e.g. /dev/iov/foo0).
This change adds a variant of pci_iov_attach() called pci_iov_attach_name()
that allows the name of the /dev/iov entry to be specified by the
driver.
To preserve the ABI, this version does not modify the existing
PCI_IOV_ATTACH kobj method as was done in HEAD. Instead, a new
PCI_IOV_ATTACH_NAME method has been added that accepts the name as an
additional parameter. The PCI bus driver now provides an implementation
of PCI_IOV_ATTACH_NAME. A default implementation of PCI_IOV_ATTACH is
provided that calls PCI_IOV_ATTACH_NAME passing 'device_get_nameunit(dev)'
as the name.
MFC r306417: portsnap: only move expected snapshot contents from snap/ to files/
Previously it was possible to smuggle in addional files that would
be used by later portsnap runs. Now we only move those files expected
to be in the snapshot into files/ and require that there are no
unexpected files.
This was used by portsnap attacks 2, 3, and 4 in the "non-cryptanalytic
attacks against FreeBSD update components" anonymous gist.
1) Microoptimize %p case.
2) Implememt %u for GNU compatibility.
3) Don't forget to advance buf for %w/%u.
4) Fail with incomplete week (week 0) request and no such week in the
year.
5) Fix yday formula when Sunday requested and the week started from Monday.
6) Fail with impossible yday for incomplete week (week 0) and direct %w/%u
request.
7) Shift yday/wday to the first day of the year, if incomplete week
(week 0) requested and no %w/%u used.
8) For already non-standard %z extension implement GNU compatible formats:
+hh and -hh.
9) Check for incorrect values for %z.
MFC r306091:
Add a way for the architecture to specify the calling ABI for methods
in the EFI Runtime Services Table. On amd64, the calling conventions
are MS.
If present, honor the USB port mode (host or peripheral) set on DTS, if not,
keep the beaglebone defaults: USB0 -> peripheral/gadget, USB1 -> host.
This is only a workaround as in fact fact this hardware is capable of detect
the USB port mode based on type of cable and act according with the detected
mode. Unfortunately the driver does not handle that at moment.
mm [Sun, 25 Sep 2016 22:02:27 +0000 (22:02 +0000)]
MFC r305819:
Sync libarchive with vendor including important security fixes.
Issues fixed (FreeBSD):
PR #778: ACL error handling
Issue #745: Symlink check prefix optimization is too aggressive
Issue #746: Hard links with data can evade sandboxing restrictions
This update fixes the vulnerability #3 and vulnerability #4 as reported in
"non-cryptanalytic attacks against FreeBSD update components".
https://gist.github.com/anonymous/e48209b03f1dd9625a992717e7b89c4f
Fix for vulnerability #2 has already been merged in r305188.
When mlx5e_sq_xmit() returns an error code and the mbuf pointer is set,
we should not free the mbuf, because the caller will keep the mbuf in
the drbr. Make sure the mbuf pointer is correctly set upon function
exit.
MFC r305874:
mlx5en: Allow setting the software MTU size below 1500 bytes
The hardware MTU size can't be set to a value less than 1500 bytes due
to side-band management support. Allow setting the software MTU size
below 1500 bytes, thus creating a mismatch between hardware and
software MTU sizes.
MFC r305873:
mlx5en: Factor out common sendqueue code for use with rate limiting SQs.
Try to reuse code to setup sendqueues when possible by making some static
functions global. Further split the mlx5e_close_sq_wait() function to
separate out reusable parts.
This change also reduces the size of the mlx5e_sq structure so that the last
queue_state element will fit into the previous cacheline and then the mlx5e_sq
structure becomes one cacheline less for amd64.
MFC r305869:
mlx5en: Minor completion queue control path code refactor.
Move setting of CQ moderation mode together with the other
CQ moderation parameters. Pass completion event vector as
a separate argument to mlx5e_open_cq(), because its value is
different for each call. Pass mlx5e_priv pointer instead of
mlx5e_channel pointer so that code can be used by rate
limiting sendqueues.
MFC r305867:
Update the MLX5 core module:
- Add new firmware commands and update existing ones.
- Add more firmware related structures and update existing ones.
- Some minor fixes, like adding missing \n to some prints.
MFC bspatch Capsicumization, sanity checks, and other improvements
r304691: bspatch: apply style(9)
Make style changes (and trivial refactoring of open calls) now in order
to reduce noise in diffs for future capsicum changes.
r304807 (allanjude): Capsicumize bspatch
Move all of the fopen() and open() calls to the top of main()
Restrict each FD to least privilege (read/seek only, write only, etc)
cap_enter(), and make all except the output FD read/seek only.
r304821: bspatch: remove output file in the case of error
r305486: bspatch: add sanity checks on sizes to avoid integer overflow
Note that this introduces an explicit 2GB limit, but this was already
implicit in variable and function argument types.
This is based on the "non-cryptanalytic attacks against freebsd
update components" anonymous gist. Further refinement is planned.
r305737: bspatch: remove superfluous newlines from errx strings
r305822: bspatch: use #define for header size instead of magic number
r306026: bspatch: Remove backwards-compatibility sys/capability.h support
bspatch previously included sys/capability.h or sys/capsicum.h based
on __FreeBSD_version, as FreeBSD is the upstream for bsdiff and we may
see this file incorporated into other third-party software.
The Capsicum header is now installed as sys/capsicum.h in stable/10 and
FreeBSD 10.3, so we can just use sys/capsicum.h and simplify the logic.
MFC r305595:
In dqsync(), when called from quotactl(), um_quotas entry might appear
cleared since nothing prevents completion of the parallel quotaoff.
There is nothing to sync in this case, and no reason to panic.
MFC r305593:
There is no need to upgrade the last dvp lock on lookups for modifying
operations. Instead of upgrading, assert that the lock is exclusive.
Explain the cause in comments.
andrew [Wed, 21 Sep 2016 09:50:50 +0000 (09:50 +0000)]
MFC 304799:
Map coherent memory in a non-coherent dma tag as uncached. This is similar
to what the 32-bit arm code does, with the exception that it always assumes
the tag is non-coherent.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
andrew [Wed, 21 Sep 2016 09:45:14 +0000 (09:45 +0000)]
MFC 305285:
Add a pc_clock pcpu field and use it to implement cpu_est_clockrate. This
will allow drivers that manage the clock frequency to communicate this with
the reset of the kernel.
andrew [Wed, 21 Sep 2016 09:06:06 +0000 (09:06 +0000)]
MFC r304892:
Print both the kernel read and write translation in DDB when asking for
a virtual to physical translation. These may be different, e.g. when a
page is mapped as read-only.
ip6_output() was missing cache invalidation code analougous to
ip_output.c. r304545 disabled L2 caching for UDP/IPv6 as a workaround.
This change adds the missing cache invalidation code and reverts
r304545.
MFC r305778:
Fix swap tables between sets when this functional is enabled.
We have 6 opcode rewriters for table opcodes. When `set swap' command
invoked, it is called for each rewriter, so at the end we get the same
result, because opcode rewriter uses ETLV type to match opcode. And all
tables opcodes have the same ETLV type. To solve this problem, use
separate sets handler for one opcode rewriter. Use it to handle TEST_ALL,
SWAP_ALL and MOVE_ALL commands.
This fixes the scenario hit by the Jenkins job where it's infecting
the build with --sysroot, etc options from the Jenkins build in the
tests.
Prefix all intermediate variables (_CFLAGS, etc) with "ATF_BUILD" [*].
Requested by: jmmv
r305170:
Don't bake all of CC/CPP/CXX into CFLAGS
Capture executable names for CC, CPP, CXX (assumed to be the
first non-CCACHE_BIN word).
This change strips out all of the cross-compiler arguments, (-target,
-B, etc), added to ${CC}, etc via ${CROSSENV} in Makefile.inc1, so it
doesn't infect the build and subsequently the test.
Add comments noting why this logic is being added, and why the logic in
r305041 was necessary/what it was trying to achieve.
This is required after recent changes made to the toolchain to always
specify --sysroot, -target, -B, etc with clang in buildworld (presumably
r304681).
Use SRCTOP instead of a homegrown definition for it (SRCDIR)
r305019:
Remove unnecessary variable (SRCDIR) replaced by SRCTOP in Makefile.common
r305020:
Remove redundant declarations and simplify ../ in pathing
- TESTSBASE and LOCALBASE are already defined in bsd.tests.mk
- TESTSDIR is automatically divined as ${TESTSBASE}${RELDIR:H} after
r289158.
- Replace SRCDIR with SRCTOP
andrew [Fri, 16 Sep 2016 12:42:36 +0000 (12:42 +0000)]
MFC 305546:
When synchronising the instruction and data caches we only need to clean
the data cache to the point of unification. This is the point where the
two caches are unified to a single unified cache so cleaning past here
is just extra unneeded work.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
andrew [Fri, 16 Sep 2016 12:39:21 +0000 (12:39 +0000)]
MFC 305545:
Only call cpu_icache_sync_range when inserting an executable page. If the
page is non-executable the contents of the i-cache are unimportant so this
call is just adding unneeded overhead when inserting pages.
While doing research using gem5 with an O3 pipeline and 1k/32k/1M iTLB/L1
iCache/L2 Bjoern Zeeb (bz@) observed a fairly high rate of calls into
arm64_icache_sync_range() from pmap_enter() along with a high number of
instruction fetches and iTLB/iCache hits.
Limiting the calls to arm64_icache_sync_range() to only executable pages,
we observe the iTLB and iCache Hit going down by about 43%. These numbers
are quite misleading when looked at alone as at the same time instructions
retired were reduced by 19.2% and instruction fetches were reduced by 38.8%.
Overall this reduced the runtime of the test program by 22.4%.
On Juno hardware, in steady-state, running the same test, using the cycle
count to determine runtime, we do see a reduction of up to 28.9% in runtime.
While these numbers certainly depend on the program executed, we expect an
overall performance improvement.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
andrew [Fri, 16 Sep 2016 12:17:01 +0000 (12:17 +0000)]
MFC 303744:
Remove the pvh_global_lock lock from the arm64 pmap. It is unneeded on arm64
as invalidation will have completed before the pmap_invalidate_* functions
have complete.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation