Expose more information about PCI devices (and GPUs in particular) via
linsysfs to libdrm.
This allows unmodified modern 64-bit Linux libdrm to work, which allows
modern Linux Mesa to work. The submitter reports that he tested the change
with an Ubuntu 16.04 chroot + amdgpu from graphics/drm-next-kmod.
PR: 222375
Submitted by: Greg V <greg AT unrelenting.technology>
MCA: Expand AMD Thresholding support to cover all banks
When it was added in r314636, AMD Thresholding was hardcoded to only
bank 4 (Northbridge) for some reason. However, even on family 10h the
MCAx_MISC register Valid/Present bits determine whether thresholding is
supported on that bank.
Expand thresholding support to monitor all monitorable banks. This
simplifies some of the logic and makes it more consistent with our Intel
CMCI support.
Rick Macklem [Sun, 17 Sep 2017 22:18:01 +0000 (22:18 +0000)]
Fix bogus FREAD with NFSV4OPEN_ACCESSREAD. No functional change.
The code in nfscl_doflayoutio() bogusly used FREAD instead of
NFSV4OPEN_ACCESSREAD. Since both happen to be defined as "1", this
worked and the patch doesn't result in a functional change.
Found by inspection during development of Flex File Layout support.
Don't use a non-zero argument for __builtin_frame_address
__builtin_frame_address with a non-zero argument is unsafe and rejected by
newer gcc. Since it doesn't seem to impact the stacktrace, don't bother
with gymnastics to unwind to a different frame for starting.
Print the correct bitmask for the running Book-E CPU
All the Book-E world is no longer e500v{1,2}. e500mc the 64-bit derivatives do
not use the DOZE/NAP bits with MSR[WE], instead using the `wait' instruction to
wait for interrupts, and SoC plane controls (via CCSR) for power management.
Ed Maste [Sun, 17 Sep 2017 14:03:54 +0000 (14:03 +0000)]
libsysdecode: report invalid cap_rights_t
Previously we'd have an assertion failure in cap_rights_is_set if
sysdecode_cap_rights is called with an invalid cap_rights_t, so test for
validity first.
PR: 222258
Reviewed by: cem
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12391
Alan Cox [Sat, 16 Sep 2017 18:12:15 +0000 (18:12 +0000)]
Modify blst_leaf_alloc to take only the cursor argument.
Modify blst_leaf_alloc to find allocations that cross the boundary between
one leaf node and the next when those two leaves descend from the same
meta node.
Update the hint field for leaves so that it represents a bound on how
large an allocation can begin in that leaf, where it currently represents
a bound on how large an allocation can be found within the boundaries of
the leaf.
The first phase of blst_leaf_alloc currently shrinks sequences of
consecutive 1-bits in mask until each has been shrunken by count-1 bits,
so that any bits remaining show where an allocation can begin, or until
all the bits have disappeared, in which case the allocation fails. This
change amends that so that the high-order bit is copied, as if, when the
last block was free, it was followed by an endless stream of free
blocks. It also amends the early stopping condition, so that the shrinking
of 1-sequences stops early when there are none, or there is only one
unbounded one remaining.
The search for the first set bit is unchanged, and the code path
thereafter is mostly unchanged unless the first set bit is in a position
that makes some of those copied sign bits matter. In that case, we look
for a next leaf, and at what blocks it can provide, to see if a
cross-boundary allocation is possible.
The hint is updated on a successful allocation that clears the last bit,
but it not updated on a failed allocation that leaves the last bit
set. So, as long as the last block is free, the hint value for the leaf is
large. As long as the last block is free, and there's a next leaf, a large
allocation can begin here, perhaps. A stricter rule than this would mean
that allocations and frees in one leaf could require hint updates to the
preceding leaf, and this change seeks to leave the freeing code
unmodified.
Define BLIST_BMAP_MASK, and use it for bit masking in blst_leaf_free and
blist_leaf_fill, as well as in blst_leaf_alloc.
The usbphy node for allwinner have two kind of resources, one for the
phy_ctrl and one per phy. Instead of blindy allocating resources, alloc
the phy_ctrl and pmu ones separately.
Also add a configuration struct for all different phy that hold the difference
between them (number of phys, unknow needed register write etc ...).
While here remove A83T code as upstream and FreeBSD dts don't have
nodes for USB.
This (plus 323640) re-enable OHCI on Pine64 on the bottom USB port.
The top USB port is routed to the OHCI0/EHCI0 which is by default in OTG mode.
While the phy code can handle the re-route to standard OHCI/EHCI we still need
a driver for musb to probe and configure it in host mode.
EHCI is still buggy on Pine64 (hang the board) so do not enable it for now.
Tested On: Bananapi (A20), BananapiM2 (A31S), OrangePi One (H3) Pine64 (A64)
r323392 introduce gpio_pin_get/gpio_pin_set for a10_gpio driver.
When called via gpio method they must aquire the device lock while
when they are called via gpio_pin_configure the lock is already aquire.
Introduce a10_gpio_pin_{s,g}et_locked and call them in pin_gpio_configure
instead.
Simon J. Gerraty [Sat, 16 Sep 2017 05:42:27 +0000 (05:42 +0000)]
Use OBJS_SRCS_FILTER to control setting OBJS from SRCS
Some makefiles do reachover builds.
In some cases it is convenient to list subdirs of the distribution
in SRCS.
It is not very convenient, or always even desirable to have corresponding
subdirs in .OBJDIR, so OBJS_SRCS_FILTER allows the makefile to choose.
The default value 'R' matches existing practice.
But a makefile can set OBJS_SRCS_FILTER= T (the R gets added by
bsd.init.mk) to avoid the need for subdirs in .OBJDIR
Stephen Hurd [Sat, 16 Sep 2017 02:41:38 +0000 (02:41 +0000)]
Revert r323516 (iflib rollup)
This was really too big of a commit even if everything worked, but there
are multiple new issues introduced in the one huge commit, so it's not
worth keeping this until it's fixed.
I'll work on splitting this up into logical chunks and introduce them one
at a time over the next week or two.
John Baldwin [Fri, 15 Sep 2017 22:40:57 +0000 (22:40 +0000)]
Avoid reusing the wrong buffer for a DDP AIO request.
To optimize the case of ping-ponging between two buffers, the DDP code
caches the last two buffers used keeping the pages wired and page pods
stored in the NIC's RAM. If a new aio_read() request uses one of the
same buffers, then the work of holding pages, etc. can be avoided.
However, the starting virtual address of an aio buffer was not saved,
only the page count, length, and initial page offset. Thus, an
aio_read() request could match a different buffer in the address
space. (Earlier during development vm_fault_hold_quick_pages() was
always called and the vm_page_t values were compared, but that was
eventually removed without being adequately replaced.) Fix by storing
the starting virtual address and comparing that (along with other
fields) to determine if a buffer can be reused.
MFC after: 3 days
Sponsored by: Chelsio Communications
Warner Losh [Fri, 15 Sep 2017 20:16:06 +0000 (20:16 +0000)]
Allow multiple TRIMs to be done for nda
Don't call cam_iosched_trim_done or cam_iosched_submit_trim for nda
since its hardware can handle almost an arbitrary number of TRIMs and
we don't have to be careful to only ever do one.
Warner Losh [Fri, 15 Sep 2017 20:15:55 +0000 (20:15 +0000)]
Update comments on what the CAM_IOSCHED_FLAG_TRIM_ACTIVE means.
It's intended only for those situations where the periph driver
ones to limit the number of trims active to one and only one.
Also update comments on associated functions.
Ed Maste [Fri, 15 Sep 2017 20:05:55 +0000 (20:05 +0000)]
open(2): update ENOTCAPABLE description for .. lookups
After r308212 Capsicum permits .. lookups in capability mode, as long as
path component traversal does not escape the directory corresponding to
the provided file descriptor.
We should add a description of the vfs.lookup_cap_dotdot and
vfs.lookup_cap_dotdot_nonlocal sysctls, perhaps as a cross-reference to
capsicum(4). I intend to look at that soon.
Reviewed by: bjk, cem, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12343
- The exit probe was not appropriately filtered to only the known pid so it
was firing on any random process that would exit rather the only the one
we cared about.
- The dtest script executes the tst.raise*.exe in the background from
POSIX sh without jobs control. POSIX mandates that SIGINT be set to
SIG_IGN in this case. The test executable never actually tested that
SIGINT could be caught despite trying to block and delay the signal.
So the SIGINT sent from raise() is never actually received since it
is ignored. This could be fixed by calling 'trap - INT' from dtest
before running the executable but I've opted to just use SIGUSR1
instead in these specific tests rather than adding more logic to
test that SIGINT is not ignored at startup.
These 2 issues meant that the tests would randomly work but only if a process
coincidentally exited during the test.
Miscellaneous fixes and improvements to MMCCAM stack
* Demote the level of several debug messages to CAM_DEBUG_TRACE
* Add detection for SDHC cards that can do 1.8V. No voltage switch sequence
is issued yet;
* Don't create a separate LUN for each SDIO function. We need just one to make
pass(4) attach;
* Remove obsolete mmc_sdio* files. SDIO functionality will be moved into the
separate device that will manage a new sdio(4) bus;
* Terminate probing if got no reply to CMD0;
* Make bcm2835 SDHCI host controller driver compile with 'option MMCCAM'.
Batch freeing of the pages in vm_object_page_remove() under the same
free queue mutex lock owning session, same as it was done for the
object termination in r323561.
Reported and tested by: mjg
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Start the phasing out of TRE by disabling it by default. r317254 introduced
a BSD_GREP_FASTMATCH knob (defaulting to on) for testing of bsdgrep with and
without TRE enabled. More bugs have cropped up since then, and
WITHOUT_BSD_GREP_FASTMATCH has shown in testing to be more stable than its
counterpart.
gmirror: treat ENXIO as disk disconnect, not media error
In theory, all data access errors mean that a member is out of sync
at most. But they were treated as more serious errors to avoid the
situation where a flaky disk gets repeatedly disconnected, re-synchronized,
reconnected and then disconnected again.
ENXIO is a special error that means that the member disk disappeared,
so it should get the same handling as the GEOM orphaning event.
There is a better chance that when the disk is reconnected, it will be
a good member again.
When ENXIO happens on a read we use the exisiting G_MIRROR_BUMP_SYNCID
mechanism which means that the mirror's syncid is increased as soon
as there is a write to the mirror. That's because no data has got out
of sync yet, but the problematic memeber is disconnected, so the future
write will make it stale.
When ENXIO happens on a write we use a new G_MIRROR_BUMP_SYNCID_NOW
mechanism which means that we update the mirror metadata as soon as
possible because the problematic memeber is already behind.
Andrew Turner [Fri, 15 Sep 2017 12:57:34 +0000 (12:57 +0000)]
Add the ARMv8.3 ID register fields. These were found in the A-Profile
exploration tools documentation:
https://developer.arm.com/products/architecture/a-profile/exploration-tools
John Baldwin [Thu, 14 Sep 2017 21:06:08 +0000 (21:06 +0000)]
Fix some incorrect sysctl pointers for some error stats.
The bad_session, sglist_error, and process_error sysctl nodes were
returning the value of the pad_error node instead of the appropriate
error counters.
When a newborn socket moves from incomplete queue to complete
one, we need to obtain the listening socket lock after the child,
which is a wrong order. The old code did that in potentially
endless loop of mtx_trylock(). The new one does only one attempt
of mtx_trylock(), and in case of failure references listening
socket, unlocks child and locks everything in right order. In
case if listening socket shuts down during that, just bail out.
Reported & tested by: Jason Eggleston <jeggleston llnw.com>
Reported & tested by: Jason Wolfe <jason llnw.com>
Andrew Turner [Thu, 14 Sep 2017 17:29:51 +0000 (17:29 +0000)]
Add support for handling undefined instructions in userspace and the
kernel. We can register callbacks to perform the required operation on the
saved registers before returning.
This is initially used to work around a bug in old versions of QEMU that
trigger such an exception when reading from an ID register when it should
load z zero value.
I expect this could be used with other exception types, e.g. to emulate
special register access from userland.
John Baldwin [Thu, 14 Sep 2017 15:07:48 +0000 (15:07 +0000)]
Add a NT_ARM_VFP ELF core note to hold VFP registers for each thread.
The core note matches the format and layout of NT_ARM_VFP on Linux.
Debuggers use the AT_HWCAP flags to determine how many VFP registers
are actually used and their format.
John Baldwin [Thu, 14 Sep 2017 14:26:55 +0000 (14:26 +0000)]
Add AT_HWCAP and AT_EHDRFLAGS on all platforms.
A new 'u_long *sv_hwcap' field is added to 'struct sysentvec'. A
process ABI can set this field to point to a value holding a mask of
architecture-specific CPU feature flags. If an ABI does not wish to
supply AT_HWCAP to processes the field can be left as NULL.
The support code for AT_EHDRFLAGS was already present on all systems,
just the #define was not present. This is a step towards unifying the
AT_* constants across platforms.
dounmount: do not release the mount point's reference on the covered vnode
As long as mnt_ref is not zero there can be a consumer that might try
to access mnt_vnodecovered. For this reason the covered vnode must not
be freed until mnt_ref goes to zero.
So, move the release of the covered vnode to vfs_mount_destroy.
Warner Losh [Thu, 14 Sep 2017 05:48:23 +0000 (05:48 +0000)]
Implement gawk multiple-arg extension to and, or, and xor.
gawk allows multiple arguemnts to bit-wiste and, or and xor
functions. Implement an arbitrary number of arguments for these
functions. Also, use NULL in preference to 0 to match rest of file.
Warner Losh [Thu, 14 Sep 2017 05:47:55 +0000 (05:47 +0000)]
Bring in bit operation functions, ala gawk.
These are from OpenBSD:
>>> Extend awk with bitwise operations. This is an extension to the awk
>>> spec and documented as such, but comes in handy from time to time.
>>> The prototypes make it compatible with a similar GNU awk extension.
>>>
>>> ok millert@, enthusiasm from deraadt@
Edited to fix cut and paste in error messages, as well as
using tabs instead of spaces after #defines added.
Use soref() in sendfile(2) instead fhold() to reference a socket.
The problem is that fdrop() requires syscall context, as it may
enter sleep in some cases. The reason to use it in the original
non-blocking sendfile implementation, was to avoid use of global
ACCEPT_LOCK() on every I/O completion. Now in head sorele() no
longer requires this lock.
Mark Johnston [Wed, 13 Sep 2017 21:54:37 +0000 (21:54 +0000)]
Widen uk_pgoff, the slab header offset field.
16 bits is only wide enough for kegs with an item size of up to 64KB.
At that size or larger, slab headers are typically offpage because the
item size is a multiple of the page size, but there is no requirement
that this be the case.
We can widen the field without affecting the layout of struct uma_keg
since the removal of uk_slabsize in r315077 left an adjacent hole.
Do not relock free queue mutex for each page, free whole terminating
object' page queue under the single mutex lock.
First, all pages on the queue are prepared for free by calls to
vm_page_free_prep(), and pages which should not be returned to the
physical allocator (e.g. wired or fictitious) are simply removed from
the queue. On the second pass, vm_page_free_phys_pglist() inserts all
pages from the queue without relocking the mutex.
The change improves the object termination, e.g. on the process exit
where large anonymous memory objects otherwise cause relocks the free
queue mutex for each page. More, if several such processes are
exiting or execing in parallel, the mutex was highly contended on
the address space demolition.
Diagnosed and tested by: mjg (previous version)
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Split vm_page_free_toq() into two parts, preparation vm_page_free_prep()
and insertion into the phys allocator free queues vm_page_free_phys().
Also provide a wrapper vm_page_free_phys_pglist() for batched free.
Reviewed by: alc, markj
Tested by: mjg (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
On some AMD FCH devices driven by intpm(4) (read: mine), the SMBus I/O port
range is split in two and the low range is only 0x10 wide. intpm(4) does
not access any registers above 0x0f, so there is no need for the wider
range.
Allan Jude [Wed, 13 Sep 2017 17:00:02 +0000 (17:00 +0000)]
Increase EFI boot file size frok 128k to 384k
generate_fat.sh does the following:
- create an 800kb zero-filled file
- create an md device backed by this file
- format the device fat12
- mount the filesystem
- create the EFI ESP directory structure
- create the EFI boot file (BOOTx64 for amd64, BOOTaa64 for aarch64, etc)
- Adds a marker to the beginning of the file, and pad it to 384kb
- 384kb was chosen as it is less than half of 800kb, thus allowing
users to keep a backup of their older boot file in the small partition
- Unmount the filesystem
- Scan the image and find the offset where the marker was inserted
- The process requires root, to make image generation easier, images for
each architecture are pregenerated, compressed with xz, and checked
into svn.
The Makefile that generates boot1.efifat does the following:
- Ensure the compiled boot1.efi file is no larger than the generated image
- Decompress the template created by generate-fat.sh
- dd the contents of boot1.efi into boot1.efifat starting at the offset
where the marker is found. This allows any file less than the maximum
size to be written into the fat filesystem without having to mount it,
so no root privileges are required.
Later work by imp and myself makes bsdinstall create a 200mb fat16 instead
of using this process, but it is retained to make image generation easier.
Submitted by: Eric McCorkle (original version)
Reviewed by: emaste, tsoome, Eric McCorkle
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D9680
Ian Lepore [Wed, 13 Sep 2017 16:54:27 +0000 (16:54 +0000)]
Defer attaching and probing iicbus and its children until interrupts are
available, in i2c controller drivers that require interrupts for transfers.
This is the result of auditing all 22 existing drivers that attach iicbus.
These drivers were the only ones remaining that require interrupts and were
not using config_intrhook to defer attachment. That has led, over the
years, to various i2c slave device drivers needing to use config_intrhook
themselves rather than performing bus transactions in their probe() and
attach() methods, just in case they were attached too early.
Fix two issues with not ready data in sockets (read: sendfile)
in UNIX sockets.
o Check that socket is still connected in uipc_ready(). If not
we are responsible to free mbufs.
o In uipc_send() if socket appears to be disconnected, but we
are sending data with pending I/Os, don't free mbufs.
Reported by: Kevin Bowling <kbowling llnw.com>
Tested by: Kevin Bowling <kbowling llnw.com>
PR: 222259
Reported by: Mark Martinec <Mark.Martinec ijs.si>
MFC after: 3 days
Gordon Tetlow [Wed, 13 Sep 2017 16:35:16 +0000 (16:35 +0000)]
Deorbit catman. The tradeoff of disk for performance has long since tipped
in favor of just rendering the manpage instead of relying on pre-formatted
catpages. Note, this does not impede the ability to use existing catpages,
it just removes the utility to generate them.
Mark Johnston [Wed, 13 Sep 2017 15:44:54 +0000 (15:44 +0000)]
Fix a logic error in the item size calculation for internal UMA zones.
Kegs for internal zones always keep the slab header in the slab itself.
Therefore, when determining the allocation size, we need to take the
slab header size into account.
https://www.illumos.org/issues/5815
When panic() is called from within ztest, the mdb ::status command isn't as
useful as it could be since the global panicstr variable isn't updated. We
should modify the function to make sure panicstr is set, so ::status can
present the error message just like it does on a failed assertion.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Gordon Ross <gordon.ross@nexenta.com>
Reviewed by: Rich Lowe <richlowe@richlowe.net>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Prakash Surya <prakash.surya@delphix.com>
MFC after: 4 weeks
https://www.illumos.org/issues/5815
When panic() is called from within ztest, the mdb ::status command isn't as
useful as it could be since the global panicstr variable isn't updated. We
should modify the function to make sure panicstr is set, so ::status can
present the error message just like it does on a failed assertion.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Gordon Ross <gordon.ross@nexenta.com>
Reviewed by: Rich Lowe <richlowe@richlowe.net>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Prakash Surya <prakash.surya@delphix.com>