Dexuan Cui [Thu, 9 Mar 2017 12:09:07 +0000 (12:09 +0000)]
loader.efi: only reduce the size of the staging area on Hyper-V
Doing this on physical hosts turns out to be problematic, e.g. see comments
24 and 28 in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211746.
To fix the real underlying issue correctly & thoroughly, IMO we need
a relocatable kernel, but that would require a lot of complicated long
term work: https://reviews.freebsd.org/D9686?id=25414#inline-56969
For now, let's only apply efi_verify_staging_size() to VMs running on
Hyper-V, and restore the old behavior on physical machines since that
has been working for people for a long period of time, though that's
potentially unsafe...
Don't create any threads before SI_SUB_INIT_IF in the LinuxKPI. Else
kthread_add() will assert it is called too soon. This fixes a startup
issue when COMPAT_LINUXKPI is enabled in the kernel configuration
file.
Cy Schubert [Thu, 9 Mar 2017 05:29:24 +0000 (05:29 +0000)]
Configure leap-second smearing (always).
Leap-second smearing is an experimental option that may be specified in
ntp.conf(5) and the -x option on the command line to spread the effect
of a leap-second over an interval as specified by the leapsmearinterval
config file statement. Recommended values are between 7200 (2 hours) and
86400 (24 hours).
It is advised that leap-second smearing not be used for public NTP
servers (https://www.meinbergglobal.com/download/burnicki/Leap\
%20Second%20Smearing%20With%20NTP.pdf). It is also advised that NTP
clients not use a mix of NTP servers using leap-second smearing with
NTP servers not using leap-second smearing as that could cause
undefined client behaviour.
Leap-second smearing was committed to ports net/ntp and net/ntp-devel
by r426825 on 2016-11-22.
As discussed during AsiaBSDcon devsummit, import the manpage from OpenBSD, which
has been rewritten in mdoc(7) format, making it readable by default with
mandoc. It has also been extended by OpenBSD to cover all awk(1) options.
Andrew Thompson [Thu, 9 Mar 2017 01:26:10 +0000 (01:26 +0000)]
ec2.conf and vmimage.subr can be used from the installation livecd after
install to prepare an AMI image. This can be used to create a ZFS AMI disk
image using a virtual machine.
Change ec2.conf to use the pkg tool from a chroot rather than trying to
bootstrap it and failing on the livecd read-only filesystem.
spigen provides a userland API to the SPI bus. Make it available as a loadable
module so people using official ARM images can enable it on devices like
BBB or RPi without rebuilding the kernel.
Gleb Smirnoff [Thu, 9 Mar 2017 00:55:19 +0000 (00:55 +0000)]
Make inp_lock_assert() depend on INVARIANT_SUPPORT, not INVARIANTS.
This allows INVARIANTS-enabled modules that use this function to load
successfully on a kernel that has INVARIANT_SUPPORT only.
Gleb Smirnoff [Thu, 9 Mar 2017 00:45:15 +0000 (00:45 +0000)]
Reduce stack usage in link_elf_load_file(), allocating struct nameidata.
This function may be called recursively, when a module pulls its dependencies.
Under certain circumstances, e.g. a quad chain of dependencies and the presence
of dtrace, we may run out of stack.
Warner Losh [Thu, 9 Mar 2017 00:33:38 +0000 (00:33 +0000)]
efidp manipulates UEFI Device Paths in various ways. At the moment, it
formats and parses UEFI standard Device Paths. In the future it will
also translate between FreeBSD driver names and UEFI Device Paths.
Warner Losh [Thu, 9 Mar 2017 00:31:36 +0000 (00:31 +0000)]
Finish implementing -d/--device/--device-path flag to print variable
as if it were a device path.
Remove language about a=b syntax on the command line. This will not be
implemented due to its limited usefulness. UEFI variables are binary
blobs, on the whole, and a simple work around exists for
strings. Clarify that the new value of the variable is taken from
stdin. Update manual with history.
Warner Losh [Thu, 9 Mar 2017 00:31:31 +0000 (00:31 +0000)]
Bring in EDK2 routines for printing and parsing device paths.
This commit implements the (mostly?) Linux compatible
efidp_format_device_path and efidp_parse_device_path APIs. These are
the only APIs exposed through this library. However, they are built on
code from Tianocore's EDK2 MdePkg. They are brought in as new files
here for reasons described in FreeBSD-update.
Symbol versioning will be introduced to control what's exported from
the EDK2 code.
Some structural changes may be necessary when we move to sharing with
sys/boot/efi.
Enji Cooper [Wed, 8 Mar 2017 23:58:10 +0000 (23:58 +0000)]
sbin/devfs: clarify usage
- Note existence of -m option.
- Note that -s applies to rule keyword, only, by adding usage text
specifically for the `rule` and `ruleset` keywords.
Don't go into any further detail in usage(..) -- it's best that one
reads the manpage to get a better idea of how things work as there are
a number of different option-specific keywords and arguments, as well
as some rule grammar.
Ian Lepore [Wed, 8 Mar 2017 18:53:32 +0000 (18:53 +0000)]
Handle fifo size differences between older and newer revs of pl011 hardware.
Starting with rev 5 (which is inexplicably indicated by a version number
of '3' in the Peripheral ID register), the pl011 doubled the size of the
rx and tx fifos, to 32 bytes, so read the ID register and set the size
variables in the softc accordingly.
An interesting wrinkle in this otherwise-simple concept is that the
bcm2835 SoC, used in Raspberry Pi systems among others, has the rev 5
pl011 hardware, but somehow also has the older 16-byte fifos. We check
the FDT data to see if the hardware is part of a bcm283x system and use
the smaller size if so.
Thanks to jchandra@ for pointing out that newer hardware has bigger fifos.
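The size selection described above can be sketched as follows. This is a
minimal illustration in Python, not the driver code: the function name and
its two parameters are hypothetical, and the only facts taken from the
commit text are the revision value 3, the 16- and 32-byte sizes, and the
bcm283x exception.

```python
def pl011_fifo_size(periph_id_rev, is_bcm283x):
    """Return the assumed rx/tx fifo size in bytes.

    periph_id_rev: revision value read from the Peripheral ID register
                   (hypothetical parameter; 3 indicates rev 5 hardware).
    is_bcm283x:    True when the FDT data says this is a bcm283x system.
    """
    # Rev 5 and later hardware normally has the doubled, 32-byte fifos,
    # but the bcm283x variant keeps the older 16-byte fifos.
    if periph_id_rev >= 3 and not is_bcm283x:
        return 32
    return 16
```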
Sean Bruno [Wed, 8 Mar 2017 17:29:40 +0000 (17:29 +0000)]
Use the buildworld includes and defaults when building pkt-gen. This
means that you need a world built to reliably build pkt-gen, but it keeps
the build from failing when your source doesn't match your host's running
version, e.g. building 12 on 11.
https://www.illumos.org/issues/7867
It seems that in the case where arc_hdr_free_pdata() sees HDR_L2_WRITING() we
would fail to update the ARC space statistics.
In the normal case those statistics are updated in arc_free_data_buf(). But in
the arc_hdr_free_on_write() path we don't do that.
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Andriy Gapon <avg@FreeBSD.org>
https://www.illumos.org/issues/7843
get_clones_stat() could be very slow if a snapshot has many (thousands) clones.
Clone names are added to an nvlist that's created with NV_UNIQUE_NAME.
So, each time a new name is appended to the list, the whole list is searched
linearly to see if that name is not already in the list. That results in the
quadratic complexity.
That should be easy to fix as we know in advance that we should not get any
duplicate names, so we can drop NV_UNIQUE_NAME when creating the list.
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Andriy Gapon <avg@FreeBSD.org>
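The quadratic behaviour described above can be illustrated with a small
Python sketch. It does not use the nvlist API: UniqueNameList and PlainList
are hypothetical stand-ins that only model whether each append scans the
existing entries for a duplicate name.

```python
class UniqueNameList:
    """Hypothetical stand-in for a list that enforces unique names."""

    def __init__(self):
        self.pairs = []
        self.comparisons = 0  # name comparisons performed, for illustration

    def add(self, name, value):
        # Every append scans the whole list looking for the same name.
        for i, (existing, _) in enumerate(self.pairs):
            self.comparisons += 1
            if existing == name:
                del self.pairs[i]
                break
        self.pairs.append((name, value))


class PlainList:
    """Hypothetical stand-in for a list without the uniqueness check."""

    def __init__(self):
        self.pairs = []
        self.comparisons = 0

    def add(self, name, value):
        # No duplicate check, so an append costs constant time.
        self.pairs.append((name, value))


def fill(lst, n):
    """Append n distinct clone names (made-up names) and return the list."""
    for i in range(n):
        lst.add("pool/clone%d" % i, True)
    return lst
```

With n distinct names the checked variant performs n*(n-1)/2 comparisons,
while the unchecked one performs none, which is why dropping the check is
safe and cheap when the names are known to be unique.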
Michal Meloun [Wed, 8 Mar 2017 11:40:27 +0000 (11:40 +0000)]
Unbreak ARMv6 world.
The new compiler_rt library imported with clang 4.0.0 has several fatal
issues (non-functional __udivsi3 for example) with ARM specific intrinsic
functions. As a temporary workaround, until upstream solves these problems,
disable all thumb[1][2] related features.
Alexander Motin [Wed, 8 Mar 2017 11:24:33 +0000 (11:24 +0000)]
Add initial support for UNMAP granularity.
Report UNMAP granularity as stripesize/-offset if we have no other values
to report there.
Add new quirk DA_Q_STRICT_UNMAP for cases when the target is too strict about
misaligned UNMAP requests, reporting errors instead of being suboptimal.
Setting this quirk makes the da periph forcefully align all UNMAP requests
to avoid those errors, at the cost of some odd ranges not being UNMAP'ed.
This makes UNMAP usable within VMware 6.x VMs, just not 100% efficient.
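The forced alignment can be sketched as below. This is an illustrative
Python model, not the da(4) code: align_unmap and its parameters are
hypothetical names, and the units are assumed to be logical blocks. The
start of the range is rounded up and the end rounded down to the
granularity, so the unaligned head and tail are simply not unmapped.

```python
def align_unmap(lba, count, granularity, offset=0):
    """Shrink [lba, lba + count) to whole granules.

    Returns (start, length) of the aligned range, or None when no whole
    granule fits inside the request.
    """
    if granularity <= 1:
        return lba, count
    end = lba + count
    # Round the start up to the next granule boundary.
    start = lba + (offset - lba) % granularity
    # Round the end down to the previous granule boundary.
    end -= (end - offset) % granularity
    if start >= end:
        return None  # the request covers no complete granule
    return start, end - start
```

For example, with a granularity of 8 blocks a request for blocks 5..24
shrinks to blocks 8..23, and a request for blocks 1..5 is dropped.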
Enji Cooper [Wed, 8 Mar 2017 05:31:54 +0000 (05:31 +0000)]
usr.bin/fortune: convert to OBJTOP/SRCTOP idioms
- Use OBJTOP/SRCTOP-relative paths when looking for include files and
strfile.
- Add FORTUNES_OBJ and FORTUNES_SRC to abbreviate usr.bin/fortune
pathing.
This is being done to simplify make output/idioms.
Warner Losh [Wed, 8 Mar 2017 02:47:59 +0000 (02:47 +0000)]
Copy needed include files from EDK2. This is a minimal set gleaned
from the .depend files after the build:
cp -r ../vendor/edk2/MdePkg/Include sys/contrib/edk2
cd lib/libefivar
make
pushd `make -V .OBJDIR`
cat .depend*.o | grep sys/contrib | cut -d' ' -f 3 |
sort -u | sed -e 's=/full/path/sys/contrib/edk2/==' > /tmp/xxx
popd
cd ../../sys/contrib/edk2
rm -rf Include
for i in `cat /tmp/xxx`; do
svn cp svn+ssh://repo.freebsd.org/base/vendor/edk2/dist/MdePkg/$i $i
done
svn cp svn+ssh://repo.freebsd.org/base/vendor/edk2/dist/MdePkg/MdePkg.dec .
The original EDK2 repo is ~265MB, the MdePkg is ~23MB, all
MdePkg/Includes is ~7MB and this minimal set is ~1.3MB.
Warner Losh [Tue, 7 Mar 2017 23:06:41 +0000 (23:06 +0000)]
Avoid dereferencing uninitialized elements in the error path.
Some drives sometimes have errors for things like setting the number
of queue entries in the submission queue. The error paths taken for
these drives lead to a panic from dereferencing uninitialized data.
Warner Losh [Tue, 7 Mar 2017 23:02:59 +0000 (23:02 +0000)]
cdw10 takes the low 32-bits and cdw11 takes the upper 32-bits of the
lba. Rather than do a cast to uint64_t, which clang warns might be
unaligned, do the stores 32-bits at a time.
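The two 32-bit stores amount to splitting the 64-bit LBA with a mask and a
shift. A minimal Python sketch of the arithmetic, with split_lba as a
hypothetical helper name:

```python
def split_lba(lba):
    """Return the (low, high) 32-bit halves of a 64-bit LBA."""
    if not 0 <= lba < 1 << 64:
        raise ValueError("lba must fit in 64 bits")
    low = lba & 0xFFFFFFFF           # low 32 bits, first command word
    high = (lba >> 32) & 0xFFFFFFFF  # upper 32 bits, second command word
    return low, high
```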
Marius Strobl [Tue, 7 Mar 2017 22:42:44 +0000 (22:42 +0000)]
Add and use a MMC_DECLARE_BRIDGE macro for declaring mmc(4) bridges
as kernel drivers and their dependency on mmc(4); this allows for
incrementing the mmc(4) module version but also for entire omission
of these bridge declarations for mmccam(4) in a single place, i.e.
in dev/mmc/bridge.h.
Justin Hibbits [Tue, 7 Mar 2017 22:11:57 +0000 (22:11 +0000)]
Fix booting with >4GB RAM on PowerMac G5 hardware
===
From Nathan Whitehorn:
Open Firmware runs in virtual mode on the Powermac G5. This runs inside the
kernel page table, which preserves all address translations made by OF before
the kernel starts; as a result, the kernel address space is a strict superset of
OF's.
Where this explodes is if OF uses an unmapped SLB entry. The SLB fault handler
runs in real mode and refers to the PCPU pointer in SPRG0, which blows up the
kernel. Having a value of SPRG0 that works for the kernel is less fatal than
preserving OF's value in this case.
===
The result of this is seemingly random panics from NULL dereferences, or hangs
immediately upon boot. By not restoring SPRG0 for Open Firmware entry the
kernel PCPU pointer is preserved and SLB faults are successful, resulting in a
stable kernel.
Warner Losh [Tue, 7 Mar 2017 21:47:54 +0000 (21:47 +0000)]
Make multi-namespace nvme drives more robust.
Fix assumptions about name spaces in the NVMe driver. First, it assumes
cdata.nn is the number of configured devices. However, it is the
number of supported name spaces. Second, it assumes that there will
never be more than 16 name spaces supported, but a certain drive I'm
testing reports 1024. Third, it assumes that name spaces are a tightly
packed namespace, but the standard seems to indicate otherwise. Finally, it
assumes that an error would be generated when querying an
unconfigured namespace. Instead, it succeeds but the identify data is
all zeros.
Fix these by limiting the number of name spaces we probe to 16. Remove
aborting when we find one in error. When the size of the name space is
zero, ignore it.
This is admittedly a band-aid. The long term fix will be to
participate in the enumeration and name space change protocols
defined in the NVMe standard.
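The probing rules above can be sketched as follows. This is a simplified
Python model, not the driver: usable_namespaces and the identify callback
are hypothetical, with identify(nsid) assumed to return the reported
namespace size, 0 for all-zero identify data, or None on error.

```python
MAX_PROBED = 16  # probe limit taken from the commit text


def usable_namespaces(nn, identify):
    """Return [(nsid, size)] for the name spaces that look configured.

    nn:       number of supported name spaces reported by the controller.
    identify: hypothetical callback, see the assumptions above.
    """
    found = []
    for nsid in range(1, min(nn, MAX_PROBED) + 1):
        size = identify(nsid)
        if size is None:
            continue  # an error on one name space no longer aborts the scan
        if size == 0:
            continue  # zero size means unconfigured, so ignore it
        found.append((nsid, size))
    return found
```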
EDK2 is Intel's BSD Licensed UEFI implementation. We'll be bringing in
various routines from there rather than reimplementing them from
scratch for libefivar and the EFI boot loader.
The upstream repo has ^M ending on everything (sometimes multiple
times!), so the following script was run prior to import so changes we
have to do don't first include changing every line:
Also, only the MdePkg was brought in (it's 17MB, while the entire repo
is 250MB). It's almost completely certain nothing else will be used,
but if it is, it can be brought in alongside MdePkg in the future.
Add support for constant pointer constructs to READ_ONCE() in the
LinuxKPI. When the type of the argument is constant the temporary
variable cannot be assigned after the barrier. Instead assign the
temporary variable by initialization.
Enji Cooper [Tue, 7 Mar 2017 17:53:53 +0000 (17:53 +0000)]
Add bsd.man.mk references for MAN under bsd.lib.mk and bsd.prog.mk
The latter set of manpages directly consume bsd.man.mk, so the bsd.man.mk
behavior should be the source of truth for underlying behavior, whereas
the other manpage fragment descriptions should document how they tweak
the variable behavior, if at all (bsd.prog.mk does tweak the default
value, as noted in its description)
Alexander Motin [Tue, 7 Mar 2017 17:41:08 +0000 (17:41 +0000)]
Add mechanism to unload CAM periph drivers.
For now it allows unloading the CTL kernel module if there are no target-capable
SIMs in CAM. As a next step, full teardown of CAM targets can be implemented.
Dmitry Chagin [Tue, 7 Mar 2017 17:07:16 +0000 (17:07 +0000)]
Reduce code duplication between MD Linux code by moving SYSV IPC 64-bit
related struct definitions out into the MI path.
Invert the native ipc structs to the Linux ipc structs conversion logic.
Since the 64-bit variant of the ipc structs has more precision, convert native
ipc structs to the 64-bit Linux ipc structs and then truncate 64-bit values
into the non 64-bit ones if needed. Unlike Linux, return EOVERFLOW if the
values do not fit.
Fix SYSV IPC for 64-bit Linuxulator which never sets IPC_64 bit.
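The truncate-or-fail step can be sketched in Python. This only illustrates
the idea and is not the Linuxulator code: narrow, ipc64_to_ipc, and the
field names and widths in the usage note are all hypothetical.

```python
import errno


def narrow(value, bits):
    """Return value unchanged if it fits in an unsigned field of `bits` bits."""
    if value < 0 or value >> bits:
        # Report the overflow instead of silently truncating.
        raise OSError(errno.EOVERFLOW, "value does not fit")
    return value


def ipc64_to_ipc(fields64, widths):
    """Convert 64-bit variant fields to the narrower variant.

    fields64: mapping of field name to value in the 64-bit struct.
    widths:   mapping of field name to its width in bits in the
              non 64-bit struct (hypothetical layout).
    """
    return {name: narrow(fields64[name], widths[name]) for name in widths}
```

For instance, ipc64_to_ipc({"uid": 1000}, {"uid": 16}) returns the value
unchanged, while a uid of 70000 raises OSError with errno EOVERFLOW.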
Andriy Gapon [Tue, 7 Mar 2017 16:07:52 +0000 (16:07 +0000)]
firewire/sbp: try to improve locking, plus a few style nits
This change tries to fix the most obvious locking problems.
sbp_cam_scan_lun() is never called with the sbp lock held, so the lock
needs to be acquired internally (if it's needed at all).
Without this change a kernel with INVARIANTS panics when a firewire disk
is connected:
panic: mutex sbp not owned at /usr/src/sys/dev/firewire/sbp.c:967
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff80420bbb = db_trace_self_wrapper+0x2b/frame 0xfffffe0504df0930
kdb_backtrace() at 0xffffffff80670359 = kdb_backtrace+0x39/frame 0xfffffe0504df09e0
vpanic() at 0xffffffff8063986c = vpanic+0x14c/frame 0xfffffe0504df0a20
panic() at 0xffffffff806395b3 = panic+0x43/frame 0xfffffe0504df0a80
__mtx_assert() at 0xffffffff8061c40d = __mtx_assert+0xed/frame 0xfffffe0504df0ac0
sbp_cam_scan_lun() at 0xffffffff80474667 = sbp_cam_scan_lun+0x37/frame 0xfffffe0504df0af0
xpt_done_process() at 0xffffffff802aacfa = xpt_done_process+0x2da/frame 0xfffffe0504df0b30
xpt_done_td() at 0xffffffff802ac2e5 = xpt_done_td+0xd5/frame 0xfffffe0504df0b80
fork_exit() at 0xffffffff805ff72f = fork_exit+0xdf/frame 0xfffffe0504df0bf0
fork_trampoline() at 0xffffffff8082483e = fork_trampoline+0xe/frame 0xfffffe0504df0bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Also, I tried to reduce the scope of the sbp lock to avoid holding it
while doing bus_dma allocations.
The code badly needs some re-engineering. SBP really should implement
a CAM transport, so that it avoids control flow inversion when re-scanning
the bus. Also, the sbp lock seems to be too coarse.
Additionally, the commit includes some changes not related to locking.
- sbp_cam_scan_lun: restore CAM_DEV_QFREEZE before re-queueing the ccb
because xpt_setup_ccb resets ccb_h.flags
- sbp_post_busreset: call xpt_release_simq only if it's actually frozen
- don't place private SIMQ_FREEZED flag (sic, "freezed") into sim->flags,
use sbp->flags for that
- some style fixes and control flow enhancements
Andriy Gapon [Tue, 7 Mar 2017 15:43:49 +0000 (15:43 +0000)]
qlxgbe: add GCC_MS_EXTENSIONS to CFLAGS to make old base GCC happy
The module uses unnamed structure and union fields and base GCC in
stable/10 doesn't like it.
I think that that is a C11 feature, so it is courteous of more modern
compilers to not complain about it when compiling in C99 mode.
Eric Badger [Tue, 7 Mar 2017 13:41:01 +0000 (13:41 +0000)]
don't stop in issignal() if P_SINGLE_EXIT is set
Suppose a traced process is stopped in ptracestop() due to receipt of a
SIGSTOP signal, and is awaiting orders from the tracing process on how
to handle the signal. Before sending any such orders, the tracing
process exits. This should kill the traced process. But suppose a second
thread handles the SIGKILL and proceeds to exit1(), calling
thread_single(). The first thread will now awaken and will have a chance
to check once more if it should go to sleep due to the SIGSTOP. It must
not sleep after P_SINGLE_EXIT has been set; this would prevent the
SIGKILL from taking effect, leaving a stopped orphan behind after the
tracing process dies.
When selecting a brand based on old Elf branding, prefer the brand whose
interpreter exactly matches the one requested by the activated image.
This change applies r295277, which did the same for note branding, to
the old brand selection, with the same reasoning of fixing compat32
interpreter substitution.
PR: 211837
Reported by: kenji@kens.fm
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Consistently use vm_ooffset_t type for the vm object offset in
elf_load_section.
The values passed currently as vm_offset_t are phdr.p_offset, which
have the native Elf word size. Since elf_load_section interprets them
as the file offset, use vm object offset type.
Noted and reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This change makes the workqueue implementation behave more like in
Linux, both functionality wise and structure wise.
All workqueue code has been moved to linux_work.c
Add an atomic based statemachine to the work_struct to ensure proper
operation. Prior to this change the work_struct was directly mapped to a
FreeBSD task. When a taskqueue has multiple threads the same task may
end up being executed on more than one worker thread simultaneously.
This might cause problems with code coming from Linux, which expects
serial behaviour, similar to Linux tasklets.
Move all global workqueue function names into the linux_xxx domain to
avoid symbol name clashes in the future.
Implement a few more workqueue related functions and macros.
Create two multithreaded taskqueues for the LinuxKPI during module
load, one for time-consuming callbacks and one for non-time consuming
callbacks.
Roger Pau Monné [Tue, 7 Mar 2017 09:18:52 +0000 (09:18 +0000)]
xen/netfront: fix inbound packet flags for checksum offload
Currently netfront is setting the flags of inbound packets with the checksum
not present (offloaded) to (CSUM_IP_CHECKED | CSUM_IP_VALID | CSUM_DATA_VALID |
CSUM_PSEUDO_HDR). According to the mbuf(9) man page this is not the correct
combination of flags, it should instead be (CSUM_DATA_VALID |
CSUM_PSEUDO_HDR).
Reviewed by: Wei Liu <wei.liu2@citrix.com>
MFC after: 2 weeks
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D9831
Roger Pau Monné [Tue, 7 Mar 2017 09:17:48 +0000 (09:17 +0000)]
xenstore: fix suspension when using the xenstore device
Lock the xenstore request mutex when suspending user-space processes, in order
to prevent any process from holding this lock when going into suspension, or
else the xenstore suspend process is going to deadlock.
Roger Pau Monné [Tue, 7 Mar 2017 09:16:51 +0000 (09:16 +0000)]
xen: add support for canceled suspend
When running on Xen, it's possible that a suspend request to the hypervisor
fails (return from HYPERVISOR_suspend different than 0). This means that the
suspend hasn't succeeded, and the resume procedure needs to properly handle this
case.
First of all, when such a situation happens there's no need to reset the vector
callback, hypercall page, shared info, event channels or grant table, because
their state is preserved. Also, the PV drivers don't need to be reset to the
initial state, since the connection with the backend has not been interrupted.