If the kernel is not compiled with the CAPABILITIES kernel options
fget_unlocked doesn't return the sequence number so fd_modify will
always report modification, in that case we got infinity loop.
fget_cap_locked returns a referenced file, but the fgetvp_rights does
not need it. Instead, due to the filedesc lock being held, it can
ref the vnode after the file was looked up.
Fix up fget_cap_locked to be consistent with other _locked helpers and not
ref the file.
Add a table of vnode locks and use them along with bucketlocks to provide
concurrent modification support. The approach taken is to preserve the
current behaviour of the namecache and just lock all relevant parts before
any changes are made.
Lookups still require the relevant bucket to be locked.
Mark Johnston [Thu, 22 Sep 2016 23:22:53 +0000 (23:22 +0000)]
Re-check the systrace probe ID before calling dtrace_probe().
Otherwise there exists a narrow window during which a syscall probe can be
disabled and cause a concurrently-running thread to call dtrace_probe()
with an invalid probe ID.
amdsbwd, intpm: unify bits specific to AMD chipsets (FCHs, southbridges)
AMD chipsets have proprietary mechanisms for dicovering resources.
Those resources are not discoverable via plug-and-play mechanisms
like PCI configuration registers or ACPI.
For this reason a chipset-specific knowledge of proprietary registers
is required.
At present there are two FreeBSD drivers that require the proprietary
resource discovery. One is amdsbwd which is a driver for the watchdog
timer in the AMD chipsets. The other is intpm SMBus driver when it
attaches to the newer AMD chipsets where the resources of the SMBus HBA
are not described in the regular PCI way.
In both cases the resources are discovered by accessing AMD PMIO space.
Thus, many definitions are shared between the two drivers.
This change puts those defintions into a common header file.
As an added benefit, intpm driver now supports newest FCHs built into
AMD processors of Family 15h, models 70h-7Fh and Family 16h, models
30h-3Fh.
Fix regression from r297400, which truncates headers in case of low socket
buffer and put a small optimization for low socket buffer case:
- Do not hack uio_resid, and let m_uiotombuf() properly take care of it. This
fixes truncation of headers at low buffer.
- If headers ate all the space, jump right to the end of the cycle, to
avoid doing single page I/O and allocating zero length mbuf.
- Clear hdr_uio only if space is positive, which indicates that all uio
was copied in.
Warner Losh [Thu, 22 Sep 2016 15:17:36 +0000 (15:17 +0000)]
Revert and redo r306083.
Update the device tree source files to a Linux 4.7-RC.
The dts tree currently can't be merged w/o specific revisions.
Note: due to a stupid bug in the commit checking script, I couldn't
just remove the svn:keyword tag from the new files, I had to add
fbsd:nokeywords to all the files (including ones that didn't need it)
Ruslan Bukin [Thu, 22 Sep 2016 12:41:53 +0000 (12:41 +0000)]
Adjust the sopt_val pointer on bigendian systems (e.g. MIPS64EB).
sooptcopyin() checks if size of data provided by user is <= than we can
accept, else it strips down the size. On bigendian platforms we have to
move pointer as well so we copy the actual data.
Ed Schouten [Thu, 22 Sep 2016 12:08:26 +0000 (12:08 +0000)]
Make the cloudabi32 kernel module available on ARMv6.
Now that all of the necessary bits for ARMv6 support for CloudABI have
been checked in, let's hook the kernel module up to the build and
document its existence.
Ed Schouten [Thu, 22 Sep 2016 08:14:59 +0000 (08:14 +0000)]
Make it possible to safely use TPIDRURW from userspace.
On amd64, arm64 and i386, we have the possibility to switch between TLS
areas in userspace. The nice thing about this is that it makes it easier
to do light-weight threading, if we ever feel like doing that. On armv6,
let's go into the same direction by making it possible to safely use the
TPIDRURW register, which is intended for this purpose.
Clean up the ARMv6 code to remove md_tp entirely. Simply add a dedicated
field to the PCB to hold the value of TPIDRURW across context switches,
like we do for any other register. As userspace currently uses the
read-only TPIDRURO register, simply ensure that we keep both values in
sync where possible. The system calls for modifying the read-only
register will simply write the intended value into both registers, so
that it lazily ends up in the PCB during the next context switch.
The getsecs() function is implemented in platform- and bootfw-specific
files and, in a number of these places, there were problems with how they
were declared.
Some used int return instead of time_t. On some architectures the bit
width of time_t did not naturally fit into an integer and could lead to
some unexpected behavior. (For example, 32-bit ARM builds uses a 64-bit
time_t.)
Make sure the function prototypes always specify void for the argument
list when they do not have any arguemnts, otherwise some compilers can
complain about the prototype.
Mark Johnston [Thu, 22 Sep 2016 04:49:31 +0000 (04:49 +0000)]
Annotate syscall provider pointer arguments with the "userland" keyword.
This causes dtrace to automatically copyin arguments from userland, so
one no longer has to explicitly use the copyin() action to do so. Moreover,
copyin() on userland addresses is a no-op, so existing scripts should be
unaffected by this change.
Adrian Chadd [Wed, 21 Sep 2016 20:56:10 +0000 (20:56 +0000)]
[iwm] use rate control info from the node txrates; use mgmtrate for EAPOL frames
This changes the transmit rate control code to do a few things:
* use fixed rates (mcast, ucast, mgmt) where required.
* Don't use a hard-coded 11a or 11bg rate for non-data frames -
use what net80211 says we should use.
* use mgmtrate for EAPOL frames.
Adrian Chadd [Wed, 21 Sep 2016 19:48:07 +0000 (19:48 +0000)]
[net80211] don't add IBSS node table entries for neighbors from other SSIDs.
The adhoc probe/beacon input path was creating nodes for all SSIDs.
This wasn't a problem when the NICs were configured to only process
frames for the current BSSID, but that didn't allow IBSS merges.
Once avos and I flipped on "beacons from all BSSIDs" to allow for
correct IBSS merging, we found this interesting behaviour.
This adds a check against the current SSID.
* If there's no VAP SSID, allow anything
* If there's a VAP SSID, check if the incoming frame has a suitable
SSID and if so, allow it.
This prevents nodes being created for other SSIDs in probe and beacon
frames - ie, beacons overlapping IBSSes with different SSIDs, and
probe requests from arbitrary devices.
event generation is disabled by default in favour of sysmouse. This
behavoiur is controlled by kern.evdev.rcpt_mask sysctl, bit 2 should
be set to give priority to hw over sysmouse
Submitted by: Vladimir Kondratiev <wulf@cicgroup.ru>
Reviewed by: hans
Differential Revision: https://reviews.freebsd.org/D7863
event generation is disabled by default in favour of kbdmux. This
behavoiur is controlled by kern.evdev.rcpt_mask sysctl, bit 3 should
be set to give priority to hw over mux
Submitted by: Vladimir Kondratiev <wulf@cicgroup.ru>
Reviewed by: hans
Differential Revision: https://reviews.freebsd.org/D7957
Ed Schouten [Wed, 21 Sep 2016 13:03:55 +0000 (13:03 +0000)]
Refine the dirname(3) compatibility workaround a bit more.
Right now our workaround is so good that it doesn't throw any warnings
on misuse. This means that people will keep on using the old version
of dirname(3) silently without fixing their code.
Go ahead and change the prototype of __old_dirname() to also use a plain
char *, so that we still get a compiler warning. This won't have any
negative effect on building older versions of FreeBSD on HEAD, as those
are built with -Werror disabled.
Ed Schouten [Wed, 21 Sep 2016 13:02:43 +0000 (13:02 +0000)]
Fix misuse of the basename() and dirname() functions.
These functions are allowed to overwrite their input. Pull a copy of the
input parameter and call dirname() and basename() on that instead. Do
ensure that we reload the pathname value between calls.
Add kernel interfaces to call EFI Runtime Services.
Runtime services require special execution environment for the call.
Besides that, OS must inform firmware about runtime virtual memory map
which will be active during the calls, with the SetVirtualAddressMap()
runtime call, done while the 1:1 mapping is still used. There are two
complication: the SetVirtualAddressMap() effectively must be done from
loader, which needs to know kernel address map in advance. More,
despite not explicitely mentioned in the specification, both 1:1 and
the map passed to SetVirtualAddressMap() must be active during the
SetVirtualAddressMap() call. Second, there are buggy BIOSes which
require both mappings active during runtime calls as well, most likely
because they fail to identify all relocations to perform.
On amd64, we can get rid of both problems by providing 1:1 mapping for
the duration of runtime calls, by temprorary remapping user addresses.
As result, we avoid the need for loader to know about future kernel
address map, and avoid bugs in BIOSes. Typically BIOS only maps
something in low 4G. If not runtime bugs, we would take advantage of
the DMAP, as previous versions of this patch did.
Similar but more complicated trick can be used even for i386 and 32bit
runtime, if and when the EFI boot on i386 is supported. We would need
a trampoline page, since potentially whole 4G of VA would be switched
on calls, instead of only userspace portion on amd64.
Context switches are disabled for the duration of the call, FPU access
is granted, and interrupts are not disabled. The later is possible
because kernel is mapped during calls.
To test, the sysctl mib debug.efi_time is provided, setting it to 1
makes one call to EFI get_time() runtime service, on success the efitm
structure is printed to the control terminal. Load efirt.ko, or add
EFIRT option to the kernel config, to enable code.
Discussed with: emaste, imp
Tested by: emaste (mac, qemu)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Rename efi_systbl to efi_systbl_phys, the variable contains the
physical address of the EFI System Table. Add _KERNEL guard around
its declaration in sys/efi.h.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
There is no way to see anything about the faults occuring in
loader.efi. Some intel BIOSes do output a line into serial port at
115200/8/1 regardless of the current port settings with the EFI error
number, but this is too little, and not always available, esp. if the
user does not know where to look.
The patch adds a simple facility to grab exceptions and at least dump
generic registers and some exception details. Due to the relative
complexity of correctly taking over the BIOS IDT setup, only install
the facility on user request.
Two new commands, 'grab_faults' and 'ungrab_faults' are provided,
first one takes over, second undoes the first. It is supposed that
user would execute 'grab' by the developer direction of collecting the
debugging data. The 'fault' command generates exception to test the
setup.
Fault handlers use dedicated stack to improve chances of catching
stack/TSS exceptions. Due to this, BIOS IDT is duplicated into a
private copy, and debugger needs to find a free GDT slot for TSS. This
is done in somewhat complicated efi_redirect_exceptions().
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D7935
Make resettodr_lock accessible outside subr_rtc.c. Protect
CLOCK_GETTIME() with the lock.
Now all time-related accesses to the CMOS for RTC should be under the
lock. This is needed to allow upcoming EFI Runtime Services support
to provide required execution environment for the firmware calls.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Add amd64 functions to load/store GDT register, store IDT and TR registers.
Note that lgdt() name is already used for function which, besides
loading GDT, also reloads segment descriptors cache, thus new function
is named bare_lgdt().
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Export the pmap_cache_bits() and pmap_pinit_pml4() functions from the
amd64 pmap.
The new pmap_pinit_pml4() function initializes the level 4 page table
with entries for the kernel mappings. Both functions are needed for
upcoming EFI Runtime Services support.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Michael Tuexen [Wed, 21 Sep 2016 08:28:18 +0000 (08:28 +0000)]
Fix the handling of unordered fragmented user messages using DATA chunks.
There were two bugs:
* There was an accounting bug resulting in reporting a too small a_rwnd.
* There are a bug when abandoning messages in the reassembly queue.
1) Microoptimize %p case.
2) Implememt %u for GNU compatibility.
3) Don't forget to advance buf for %w/%u.
4) Fail with incomplete week (week 0) request and no such week in the
year.
5) Fix yday formula when Sunday requested and the week started from Monday.
6) Fail with impossible yday for incomplete week (week 0) and direct %w/%u
request.
7) Shift yday/wday to the first day of the year, if incomplete week
(week 0) requested and no %w/%u used.
The assumption that the channel is only opened upon synthetic device
attach time no longer holds, e.g. Hyper-V network device MTU changes.
We have to allow device drivers to preallocate bufrings, e.g. in
attach DEVMETHOD, to prevent bufring allocation failure once the
system memory is fragmented after running for a while.
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D7960
Wojciech Macek [Wed, 21 Sep 2016 05:33:18 +0000 (05:33 +0000)]
Add support for SPI-mapped MSI interrupts outside of GICv2m.
SPI-mapped MSI interrupts coming from a controller other
than GICv2m need to have their trigger and polarity
properly configured. This patch fixes MSI/MSI-X
on Annapurna Alpine platform with GICv2.
Wojciech Macek [Wed, 21 Sep 2016 05:22:49 +0000 (05:22 +0000)]
Add support for SPI-mapped MSI interrupts in GICv3.
PIC_SETUP_INTR implementation in GICv3 did not allow
for setting up interrupts without included FDT
description. GICv2m-like MSI interrupts, which map
MSI messages to SPI interrupt lines, may not have
a description in FDT. Add support for such interrupts
by setting the trigger and polarity to the appropriate
values for MSI (edge, high) and get the hardware
IRQ number from the corresponding ISRC.
There is nothing CPU specific here, and it's usable by both fdt and Open
Firmware based systems. Rather than keeping the same file in every one, just
add it to the ofw/fdt block in the main file.
Add a ofw_parse_bootargs function, and use it for powerpc
Summary:
If the environment variable is set, U-boot adds a 'bootargs' property to
/chosen. This is already handled by ARM and MIPS, but should be handled in a
central location. For now, ofw_subr.c is a good place until we determine if it
should be moved to init_main.c, or somewhere more central to all architectures.
Eventually arm and mips should be modified to use ofw_parse_bootargs() as well,
rather than using the duplicate code already.
Reviewed By: adrian
Differential Revision: https://reviews.freebsd.org/D7846
If present, honor the USB port mode (host or peripheral) set on DTS, if not,
keep the beaglebone defaults: USB0 -> peripheral/gadget, USB1 -> host.
This is only a workaround as in fact fact this hardware is capable of detect
the USB port mode based on type of cable and act according with the detected
mode. Unfortunately the driver does not handle that at moment.
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC (Netgate)
Alan Somers [Tue, 20 Sep 2016 18:47:33 +0000 (18:47 +0000)]
Fix periodic scripts when an NFS mount covers a local mount
100.chksetuid and 110.neggrpperm try to search through all UFS and ZFS
filesystems. But their logic contains an error. They also search through
remote filesystems that are mounted on top of the root of a local
filesystem. For example, if a user installs a FreeBSD system with the
default ZFS layout, he'll get a zroot/usr/home filesystem. If he then mounts
/usr/home over NFS, these scripts would search through /usr/home.
Ed Maste [Tue, 20 Sep 2016 17:07:14 +0000 (17:07 +0000)]
Always pass -m to ld for converting binary files to kernel ELF objects
This is in preparation for linking with LLVM's lld, which does not have
a compiled-in default output emulation. lld requires that it is
specified via the -m option, or obtained from the object file(s) being
linked.
This will also allow all build targets to share a common linker binary.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D7837
Enji Cooper [Tue, 20 Sep 2016 16:31:57 +0000 (16:31 +0000)]
Port sizes_test and statvfs_test to FreeBSD
Similar to r306030, use a simpler method for getting the value of
`hw.pagesize`, i.e. `sysctl -n hw.pagesize`. The awk filtering method doesn't
work on FreeBSD