Cy Schubert [Thu, 22 Jun 2017 12:46:48 +0000 (12:46 +0000)]
In poolnodecommand(): TTL (-T) is only valid when adding a node to a
pool (ippool -a) not when removing a node from a pool (ippool -r).
Flag -T as an error in ippool -r.
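The rule described above can be sketched as a small option check. This is only an illustration: the enum and the function name are hypothetical and are not taken from the ippool(8) source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operation codes; the real ippool(8) code uses its own. */
enum pool_op { POOL_OP_ADD_NODE, POOL_OP_REMOVE_NODE };

/*
 * Return true when the option combination is acceptable.
 * A TTL (-T) only makes sense when a node is being added (-a);
 * giving it together with a removal (-r) is a usage error.
 */
bool
node_options_valid(enum pool_op op, bool ttl_given)
{
	assert(op == POOL_OP_ADD_NODE || op == POOL_OP_REMOVE_NODE);
	if (ttl_given && op != POOL_OP_ADD_NODE)
		return false;
	return true;
}
```

A caller would print the usage message and exit when the function returns false.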
Bryan Drewery [Thu, 22 Jun 2017 05:34:41 +0000 (05:34 +0000)]
Rework logic for skipping .depend/.meta file read/stat/writes.
- Rename _SKIP_READ_DEPEND to _SKIP_DEPEND since it also avoids writing.
- This now uses .NOMETA to avoid reading any .meta files related to
DEPENDOBJS. Objects not in OBJS/DEPENDOBJS may still have their .meta
files read in if they are in the dependency graph.
- This also avoids statting .meta and .depend files in the META_MODE +
-DNO_FILEMON case.
Xin LI [Thu, 22 Jun 2017 05:10:16 +0000 (05:10 +0000)]
Fix use-after-free introduced in r300388.
In r300388, endnetconfig() was called on nc_handle which would release
the associated netconfig structure, which means tmpnconf->nc_netid
would be a use-after-free.
Solve this by doing endnetconfig() in return paths instead.
Reported by: jemalloc via kevlo
Reviewed by: cem, ngie (earlier version)
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D11288
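A minimal sketch of the ordering problem, using hypothetical stand-in types (toy_handle, toy_entry) rather than the real netconfig API: the entry is owned by the handle, so the handle may only be released after the last use of the entry, which here is on the return path.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for a netconfig handle and the entry it owns. */
struct toy_entry { char *netid; };
struct toy_handle { struct toy_entry ent; };

static struct toy_handle *
toy_open(const char *netid)
{
	struct toy_handle *h = malloc(sizeof(*h));
	size_t len = strlen(netid) + 1;

	if (h == NULL)
		return NULL;
	h->ent.netid = malloc(len);
	if (h->ent.netid == NULL) {
		free(h);
		return NULL;
	}
	memcpy(h->ent.netid, netid, len);
	return h;
}

/* Frees the handle and everything it owns. */
static void
toy_close(struct toy_handle *h)
{
	free(h->ent.netid);
	free(h);
}

/* Returns 1 on match, 0 on mismatch, -1 on allocation failure. */
int
netid_matches(const char *netid, const char *wanted)
{
	struct toy_handle *h;
	const struct toy_entry *e;
	int match;

	assert(netid != NULL && wanted != NULL);
	h = toy_open(netid);
	if (h == NULL)
		return -1;		/* error path: nothing to release */
	e = &h->ent;
	/* Calling toy_close(h) here would leave e->netid dangling. */
	match = (strcmp(e->netid, wanted) == 0);
	toy_close(h);			/* released only after the last use of e */
	return match;
}
```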
Ed Maste [Thu, 22 Jun 2017 02:46:36 +0000 (02:46 +0000)]
makefs: add copies of NetBSD makefs msdos source files
We do not treat makefs as contrib code. Import copies of makefs msdos
files from NetBSD so that we can track our changes to these files.
These are copied from NetBSD, with only a change to use __FBSDID and
$FreeBSD$ instead of __KERNEL_RCSID and $NetBSD$. A copy of the
original $NetBSD$ tag remains in each source file.
Submitted by: Siva Mahadevan
Sponsored by: The FreeBSD Foundation
Pedro F. Giffuni [Thu, 22 Jun 2017 02:43:32 +0000 (02:43 +0000)]
ext2fs: add dir_nlink feature support.
ext4 on linux has always supported more than 32000 directories through
the dir_nlink feature, but FreeBSD was unable to catch up on this feature.
As part of the 64 bit inode changes nlink_t has been extended and this
feature is now possible.
Conrad Meyer [Thu, 22 Jun 2017 02:19:39 +0000 (02:19 +0000)]
join(1): Fix field ordering for -v output
Per POSIX, join(1) (in modes other than -o) is a concatenation of selected
character fields. The joined field is first, followed by fields in the
order they occurred in the input files.
Our join(1) utility previously handled this correctly for lines with a match
in the other file. But it failed to order output fields correctly for
unmatched lines, printed in -a and -v modes.
A simple test case is:
$ touch a
$ echo "2 1" > b
$ join -v2 -2 2 a b
1 2
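The ordering rule for an unmatched line can be sketched as follows. The function name and signature are illustrative assumptions, not the code in join(1): the join field is emitted first, followed by the remaining fields in input order.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill out[] with the fields of one unmatched line: the join field
 * first, then every other field in the order it appeared in the input.
 * out[] must have room for n pointers. Returns the number written.
 */
size_t
order_unmatched(const char *const *fields, size_t n, size_t join_idx,
    const char **out)
{
	size_t i, k = 0;

	assert(join_idx < n);
	out[k++] = fields[join_idx];
	for (i = 0; i < n; i++) {
		if (i != join_idx)
			out[k++] = fields[i];
	}
	return k;
}
```

For the test case above, the input line "2 1" with the second field as the join field yields the fields "1" then "2".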
Rick Macklem [Thu, 22 Jun 2017 00:17:15 +0000 (00:17 +0000)]
Ensure that the credentials field of the NFSv4 client open structure is
initialized.
bdrewery@ has reported panics "newnfs_copycred: negative nfsc_ngroups".
The only way I can see that this occurs is that the credentials field of
the open structure gets used before being filled in.
I am not sure quite how this happens, but for the file create case, the
code is serialized via the vnode lock on the directory. If, somehow, a
link to the same file gets created just after file creation, this might
occur.
This patch ensures that the credentials field is initialized to a reasonable
set of credentials before the structure is linked into any list, so
this should ensure it is initialized before use.
I am committing the patch now, since bdrewery@ notes that the panics
are intermittent and it may be months before he knows if the patch fixes
his problem.
Bryan Drewery [Wed, 21 Jun 2017 23:01:18 +0000 (23:01 +0000)]
Follow-up r308602: Don't add missing headers to .depend.tables.h.
This also avoids an error from egrep when a header is missing. This can happen
with something like WITHOUT_BLUETOOTH set when searching for
$include_dir/netgraph/bluetooth/include/ng_btsocket.h. The warning was
not an error (from set -e) due to being on the left side of a pipe. Now the
all_headers list is only filled with existing headers.
Zbigniew Bodek [Wed, 21 Jun 2017 18:28:37 +0000 (18:28 +0000)]
Enable arm,io-coherent property of PL310 L2 cache on Armada 38x platforms
This patch disables outer cache sync in PL310 driver
by adding "arm,io-coherent" property. In addition to
the previous patches it was the last bit needed
for enabling proper operation of Armada 38x SoCs
with the IO cache coherency.
Zbigniew Bodek [Wed, 21 Jun 2017 18:27:05 +0000 (18:27 +0000)]
Create root DMA tag and fix MBUS windows on DMA coherent platforms
Armada 38x SoCs, in order to work properly in IO-coherent mode,
require an update of the MBUS windows attributes.
This patch also configures nexus coherent dma tag, because all
busses and children devices have to inherit this setting in runtime.
The latter has to be executed as a sysinit (SI_SUB_DRIVERS type),
so that bus_dma_tag_create() can be executed properly.
Submitted by: Michal Mazur <mkm@semihalf.com>
Marcin Wojtas <mw@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Reviewed by: ian
Differential revision: https://reviews.freebsd.org/D11203
Zbigniew Bodek [Wed, 21 Jun 2017 18:25:35 +0000 (18:25 +0000)]
Enable setting the dma tag at the nexus level
Allow setting the dma tag for nexus in the platform init code,
so that all busses and devices would be able to inherit it.
This change is useful e.g. for setting coherent dma tag for
the platforms with hardware IO cache coherency.
Submitted by: ian
Michal Mazur <mkm@semihalf.com>
Reviewed by: ian
Differential revision: https://reviews.freebsd.org/D11202
Zbigniew Bodek [Wed, 21 Jun 2017 18:23:28 +0000 (18:23 +0000)]
Introduce support for DMA coherent ARM platforms
- Inherit BUS_DMA_COHERENT flag from parent buses
- Use cacheable memory attributes on dma coherent platform
- Disable cache synchronization on coherent platform
Changes are based on ARMv8 busdma code and commit r299683.
Submitted by: Michal Mazur <mkm@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Reviewed by: ian
Differential revision: https://reviews.freebsd.org/D11201
Andriy Gapon [Wed, 21 Jun 2017 18:19:27 +0000 (18:19 +0000)]
bhyveload: correctly query size of disks
On FreeBSD fstat(2) works fine for querying sizes of plain files,
but not so much for character devices.
So, use DIOCGMEDIASIZE to try to get the correct size for disks
and disk-like devices (e.g. zvols).
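A minimal sketch of the approach, assuming a helper named media_size() (a hypothetical name, not the function in bhyveload): use DIOCGMEDIASIZE for character devices where that ioctl is available, and fall back to the fstat(2) size for plain files.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#ifdef __FreeBSD__
#include <sys/disk.h>		/* DIOCGMEDIASIZE */
#endif
#include <unistd.h>

/* Return the size in bytes of the object behind fd, or -1 on error. */
off_t
media_size(int fd)
{
	struct stat sb;

	if (fstat(fd, &sb) != 0)
		return -1;
#ifdef DIOCGMEDIASIZE
	if (S_ISCHR(sb.st_mode)) {
		off_t size;

		/* st_size is not meaningful for a character device. */
		if (ioctl(fd, DIOCGMEDIASIZE, &size) == 0)
			return size;
		return -1;
	}
#endif
	return sb.st_size;	/* plain files: fstat(2) is sufficient */
}
```

On systems without DIOCGMEDIASIZE the sketch compiles down to the fstat(2) path only.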
Allow the VM fault handler to be NULL in the LinuxKPI when handling a
memory map request. When the VM fault handler is NULL a return code of
VM_PAGER_BAD is returned from the character device's pager populate
handler. This fixes compatibility with Linux.
Cy Schubert [Wed, 21 Jun 2017 12:19:05 +0000 (12:19 +0000)]
Fix -S handling within poolcommand(). Specifying a seed (-S) is only
valid when adding a pool (ippool -A), not when removing a pool
(ippool -R). It is a command line syntax error if a seed (-S)
is specified when removing a pool (-R).
Andriy Gapon [Wed, 21 Jun 2017 08:12:07 +0000 (08:12 +0000)]
fix several fallouts from r320156, ZFS ABD import
All of the problems were related to the FreeBSD-only features.
One was caused by a mismerge in the zfsbootcfg support code.
All others were in the TRIM support code.
Andriy Gapon [Wed, 21 Jun 2017 08:10:45 +0000 (08:10 +0000)]
fix several fallouts from r320156, ZFS ABD import
All of the problems were related to the FreeBSD-only features.
One was caused by a mismerge in the zfsbootcfg support code.
All others were in the TRIM support code.
Sepherosa Ziehau [Wed, 21 Jun 2017 06:44:56 +0000 (06:44 +0000)]
hyperv/storvsc: Reduce log verbosity
On some Windows hosts the TEST_UNIT_READY command will return
SRB_STATUS_ERROR and sense data "NOT READY asc:3a,1 (Medium
not present - tray closed)". This occurs periodically and
does not hurt anything else, so we prefer to ignore this kind
of error.
PR: 219973
Submitted by: Hongjiang Zhang <hongzhan microsoft com>
MFC after: 3 days
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D11271
Ed Maste [Wed, 21 Jun 2017 00:33:16 +0000 (00:33 +0000)]
add -znotext to kernel module link invocation
ARM kernel modules require .text relocations (DT_TEXTREL) in shared
object output, which is not allowed by default by lld. Add the -znotext
option to enable this. For simplicity add it unconditionally: it is
already default and thus either redundant (GNU BFD ld and gold from
ports) or ignored as an unknown option (GNU BFD ld 2.17.50 in the base
system).
Reviewed by: kib
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D11250
Bryan Drewery [Tue, 20 Jun 2017 22:08:02 +0000 (22:08 +0000)]
buildworld: Pass which world phase the build is in down to submakes.
This is useful for having directories behave differently depending
on the phase - such as enabling SUBDIR_PARALLEL or disabling
redundant building of library directories already done by
earlier 'make _libraries'.
Enji Cooper [Tue, 20 Jun 2017 20:46:08 +0000 (20:46 +0000)]
ln(1): fix -F behavior
When '-F' option is used, the target directory needs to be unlinked.
Currently, the modified target ("target/source") is being unlinked, and
since it doesn't yet exist, the original target isn't removed.
This is fixed by skipping the block where target is modified to
"target/source" when '-F' option is set.
Hence, a symbolic link (with the same name as the original target) to
the source_file is produced.
Update the test for ln(1) to reflect fix for option '-F'
Bryan Drewery [Tue, 20 Jun 2017 20:34:30 +0000 (20:34 +0000)]
LIBADD: Try to support partial tree checkouts in some limited cases.
LIBADD is only supported for in-tree builds because we do not install
share/mk/src.libnames.mk (which provides LIBADD support) into /usr/share/mk.
So if a partial checkout is done then the LIBADDs are ignored and no LDADD is
ever added.
Provide limited support for this case for when LIBADD is composed entirely of
base libraries. This is to avoid clashes with ports and other out-of-tree
LIBADD uses that should not be mapped to LDADD and because we do not want to
support LIBADD out-of-tree right now.
https://www.illumos.org/issues/8021
The ARC buf data project (known simply as "ABD" since its genesis in the ZoL
community) changes the way the ARC allocates `b_pdata` memory from using linear
`void *` buffers to using scatter/gather lists of fixed-size 1KB chunks. This
improves ZFS's performance by helping to defragment the address space occupied
by the ARC, in particular for cases where compressed ARC is enabled. It could
also ease future work to allocate pages directly from `segkpm` for minimal-
overhead memory allocations, bypassing the `kmem` subsystem.
This is essentially the same change as the one which recently landed in ZFS on
Linux, although they made some platform-specific changes while adapting this
work to their codebase:
1. Implemented the equivalent of the `segkpm` suggestion for future work
mentioned above to bypass issues that they've had with the Linux kernel memory
allocator.
2. Changed the internal representation of the ABD's scatter/gather list so it
could be used to pass I/O directly into Linux block device drivers. (This
feature is not available in the illumos block device interface yet.)
FreeBSD notes:
- the actual (default) chunk size is 4KB (despite the text above saying 1KB)
- we can try to reimplement ABDs, so that they are not permanently
mapped into the KVA unless explicitly requested, especially on
platforms with scarce KVA
- we can try to use unmapped I/O and avoid intermediate allocation of a
linear, virtual memory mapped buffer
- we can try to avoid extra data copying by referring to chunks / pages
in the original ABD
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Dan Kimmel <dan.kimmel@delphix.com>
Andriy Gapon [Tue, 20 Jun 2017 16:55:30 +0000 (16:55 +0000)]
revert r315852 which introduced zio_buf_alloc_nowait for use in vdev_queue_aggregate
I think that the change is still good, but reconciling it with a planned
merge of the ARC buf data scatter-ization is a bit more tedious
than I can handle.
Andriy Gapon [Tue, 20 Jun 2017 16:45:48 +0000 (16:45 +0000)]
fstyp: move sys/ include path after zfs include paths
The reason is that FreeBSD refcount.h shadows ZFS refcount.h and that
will lead to a build error after a planned import of the ARC buf data
scatter-ization.
It's possible that some day we will have an opposite problem where
a ZFS header would shadow an essential FreeBSD header.
So, we need to think about a better long term solution.
Andriy Gapon [Tue, 20 Jun 2017 16:40:31 +0000 (16:40 +0000)]
remove bogus declaration of malloc from tcp_wrappers
The declaration was already inactive when INET6 was enabled
and it causes a build error in the other case because of
a conflict with the correct definition in stdlib.h.
Pedro F. Giffuni [Tue, 20 Jun 2017 14:28:51 +0000 (14:28 +0000)]
ext2fs: Add uninit_bg feature support.
From the linux tune2fs(8) manpage:
"Allow the kernel to initialize bitmaps and inode tables and keep a high
watermark for the unused inodes in a filesystem, to reduce e2fsck(8) time.
This first e2fsck run after enabling this feature will take the full time,
but subsequent e2fsck runs will take only a fraction of the original time,
depending on how full the file system is."
Zbigniew Bodek [Tue, 20 Jun 2017 11:11:42 +0000 (11:11 +0000)]
Disable PL310 outer cache sync for IO coherent platforms
When a PL310 cache is used on a system that provides hardware
coherency, the outer cache sync operation is useless, and can be
skipped. Moreover, on some systems, it is harmful as it causes
deadlocks between the Marvell coherency mechanism, the Marvell PCIe
or Crypto controllers and the Cortex-A9.
To avoid this, this commit introduces a new Device Tree property
'arm,io-coherent' for the L2 cache controller node, valid only for the
PL310 cache. It identifies the usage of the PL310 cache in an I/O
coherent configuration. Internally, it makes the driver disable the
outer cache sync operation.
Note, that other outer-cache operations are not removed, as they may
be needed for certain situations, such as booting secondary CPUs.
Moreover, in order to enable IO coherent operation, the decision
whether to use L2 cache maintenance callbacks is done in busdma
layer, which was enabled in one of the previous commits.
Submitted by: Michal Mazur <mkm@semihalf.com>
Marcin Wojtas <mw@semihalf.com>
Reviewed by: mmel
Obtained from: Semihalf
Differential revision: https://reviews.freebsd.org/D11245
Zbigniew Bodek [Tue, 20 Jun 2017 11:09:38 +0000 (11:09 +0000)]
Implement workaround for Armada 38X family HW issue between CPU and devices
There is a hardware problem between Cortex-A9 CPUs and on-chip devices
in Armada 38X SoCs that may cause a hang under heavy load. This can,
however, be worked around by mapping all registers and PCI IO
as strongly ordered instead of device memory.
Jason Evans [Tue, 20 Jun 2017 07:25:38 +0000 (07:25 +0000)]
Decrease relative branch brittleness.
Replace conditional branches with trampolines to unconditional branches when
jumping to labels within other compilation units. This increases the offset
range from +-1 MiB to +-128 MiB.
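The two ranges can be reproduced with a short calculation. This assumes an AArch64-style encoding, which the commit message does not name: a signed 19-bit immediate for conditional branches and a signed 26-bit immediate for unconditional ones, both counted in 4-byte instructions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reach, in bytes in either direction, of a PC-relative branch whose
 * signed immediate has imm_bits bits and counts 4-byte instructions.
 */
int64_t
branch_reach_bytes(unsigned imm_bits)
{
	assert(imm_bits >= 1 && imm_bits <= 32);
	return ((int64_t)1 << (imm_bits - 1)) * 4;
}
```

Under that assumption, 19 bits gives 2^18 * 4 = 1 MiB and 26 bits gives 2^25 * 4 = 128 MiB, matching the figures above.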
Emmanuel Vadot [Tue, 20 Jun 2017 02:09:50 +0000 (02:09 +0000)]
Switch back to the BSDL DTC (Device Tree Compiler).
The BSDL dtc has grown the needed features (overlays mostly) and is able to
compile all of our base DTS.
You can use WITH_GPL_DTC if you need the GPL one or DTC= in make.conf(5)
to specify an alternate location for the compiler to use.
Rick Macklem [Mon, 19 Jun 2017 22:07:53 +0000 (22:07 +0000)]
Add the definition of maxbcachebuf to sys/buf.h.
r320070 removed the definition of maxbcachebuf from sys/param.h to
fix the build for arm.
This patch adds the definition of maxbcachebuf to sys/buf.h, which
should be ok, since sys/buf.h is not being included in arm/arm/elf_note.S.
Do not queue dmar_map_entries with zeroed gseq to
dmar_qi_invalidate_locked(). Zero gseq stops the processing in the qi
task. Do not assign possibly uninitialized on-stack gseq to map
entries when requeuing them on unit tlb_flush queue. Random garbage
in gseq is interpreted as too high invalidation sequence number and
again stops the processing in the task.
Make the sequence numbers generation completely contained in
dmar_qi_invalidate_locked() and dmar_qi_emit_wait_seq(). Upper code
directly passes boolean requesting emitting wait command instead of
trying to provide hint to avoid it by passing NULL gseq pointer.
Microoptimize the requeueing to tlb_flush queue by doing it for the
whole queue.
Diagnosed and tested by: Brett Gutstein <bgutstein@rice.edu>
Discussed with: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Mark Johnston [Mon, 19 Jun 2017 21:09:50 +0000 (21:09 +0000)]
Fix the !TD_IS_IDLETHREAD(curthread) locking assertions.
Most of the lock slowpaths assert that the calling thread isn't an idle
thread. However, this may not be true if the system has panicked, and in
some cases the assertion appears before a SCHEDULER_STOPPED() check.
Kenneth D. Merry [Mon, 19 Jun 2017 20:48:00 +0000 (20:48 +0000)]
Fix a potential sleep while holding a mutex in the sa(4) driver.
If the user issues a MTIOCEXTGET ioctl, and the tape drive in question has
a serial number that is longer than 80 characters, we malloc a buffer in
saextget() to hold the output of cam_strvis().
Since a mutex is held in that codepath, doing a M_WAITOK malloc could lead
to sleeping while holding a mutex. Change it to a M_NOWAIT malloc and bail
out if we fail to allocate the memory. Devices with serial numbers longer
than 80 bytes are very rare (I don't recall seeing one), so this
should be a very unusual case to hit. But it is a bug that should be fixed.
sys/cam/scsi/scsi_sa.c:
In saextget(), if we need to malloc a buffer to hold the output of
cam_strvis(), don't wait for the memory. Fail and return an error
if we can't allocate the memory immediately.
PR: kern/220094
Submitted by: Jia-Ju Bai <baijiaju1990@163.com>
MFC after: 3 days
Sponsored by: Spectra Logic
Ignore the P_SYSTEM process flag, and do not request
VM_MAP_WIRE_SYSTEM mode when wiring the newly grown stack.
System maps do not create auto-grown stacks. Any stack we handle,
even for P_SYSTEM, must be for user address space. A P_SYSTEM process
with mapped user space is either init(8) or an aio worker attached to
another user process with an aio buffer pointing into the stack area. In
either case, VM_MAP_WIRE_USER mode should be used.
Noted and reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Bryan Drewery [Mon, 19 Jun 2017 18:08:20 +0000 (18:08 +0000)]
buildworld: Define SYSROOT to WORLDTMP.
This is to allow downstream Makefiles to know for sure they are building
against a sysroot rather than only depending on ${DESTDIR} or other
assumptions.
Allow negative aio_offset only for the read and write LIO ops on
device nodes.
Otherwise, the current check of aio_offset == -1LL makes it possible
to pass negative file offsets down to the filesystems. This trips
assertions and is even unsafe for e.g. FFS which keeps metadata at
negative offsets.
Reported and tested by: pho
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D11266
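The rule stated above can be sketched as a predicate. The enum and function below are hypothetical names for illustration and do not mirror the kernel's aio code.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical classification of the requested LIO operation. */
enum lio_kind { LIO_KIND_READ, LIO_KIND_WRITE, LIO_KIND_OTHER };

/*
 * Non-negative offsets are always acceptable. A negative offset is
 * accepted only for read and write operations on device nodes, so it
 * can never be passed down to a filesystem.
 */
bool
aio_offset_ok(off_t offset, enum lio_kind op, bool is_device_node)
{
	if (offset >= 0)
		return true;
	return is_device_node &&
	    (op == LIO_KIND_READ || op == LIO_KIND_WRITE);
}
```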
Emmanuel Vadot [Mon, 19 Jun 2017 06:30:04 +0000 (06:30 +0000)]
allwinner: Configure pins for DTS >= Linux 4.11
Starting with DTS from Linux 4.11, the pins list, function, drive and pull
are no longer prefixed with "allwinner,".
Allow the pinctrl driver to handle both cases.
Rick Macklem [Sun, 18 Jun 2017 21:48:31 +0000 (21:48 +0000)]
Fix the NFS client/server so that it actually uses the 64bit ino_t filenos.
The code still doesn't use d_off. That will come in a future commit.
The code also removes the checks for servers returning a fileno that
doesn't fit in 32bits, since that should work ok now.
Bump __FreeBSD_version since this patch changes the interface between
the NFS kernel modules.
Warner Losh [Sun, 18 Jun 2017 21:03:48 +0000 (21:03 +0000)]
Create a new option ARM_USE_V6_BUSDMA to force an armv4/5 kernel to
use the armv6 busdma interface. This interface uses more memory than
the armv4 one, but bounces more data more often so may be more correct
than the armv4 one. It is intended for debugging purposes only at the
moment.
Warner Losh [Sun, 18 Jun 2017 21:03:35 +0000 (21:03 +0000)]
Load the transmit dma buffer at attach time as well. We don't need to
load and unload it all the time since the buffer never changes. In
addition, we were loading it with a hardware spin lock held, which
makes the sleepable lock in busdma (for the bounce pages) trigger a
witness warning, as well as ipend being called with it held by uart,
which made it impossible to unload.
These differences don't matter with the v4 busdma implementation, but
they do with the v6 implementation since the latter likes to bounce
transactions more, and will always do so for Atmel's driver.
It's more efficient as well as being more correct.
Pedro F. Giffuni [Sun, 18 Jun 2017 20:55:46 +0000 (20:55 +0000)]
ext2fs: Enable RO huge_file feature support.
We support reading ext4 "huge" files but we cannot write (anything)
on ext4. Formally enable the feature so that we can mount such
filesystems.
Alan Cox [Sun, 18 Jun 2017 18:23:39 +0000 (18:23 +0000)]
Change blist_alloc()'s allocation policy from first-fit to next-fit so
that disk writes are more likely to be sequential. This change is
beneficial on both the solid state and mechanical disks that I've
tested. (A similar change in allocation policy was made by DragonFly
BSD in 2013 to speed up Poudriere with "stressful memory parameters".)
Increase the width of blst_meta_alloc()'s parameter "skip" and the local
variables whose values are derived from it to 64 bits. (This matches the
width of the field "skip" that is stored in the structure "blist" and
passed to blst_meta_alloc().)
Eliminate a pointless check for a NULL blist_t.
Simplify blst_meta_alloc()'s handling of the ALL-FREE case.
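The difference between the two policies can be shown with a toy single-block allocator. This is a simplified sketch with hypothetical names; the real blist is a radix tree and is not modelled here. First-fit always searches from block 0, while next-fit resumes from a cursor just past the previous allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NBLOCKS 16

struct toy_alloc {
	bool used[TOY_NBLOCKS];
	size_t cursor;		/* block after the most recent allocation */
};

/*
 * Allocate one block and return its index, or -1 if none is free.
 * With next_fit set, the search starts at the cursor and wraps around;
 * otherwise it starts at block 0 (first-fit).
 */
int
toy_alloc_block(struct toy_alloc *a, bool next_fit)
{
	size_t start = next_fit ? a->cursor : 0;
	size_t i, b;

	for (i = 0; i < TOY_NBLOCKS; i++) {
		b = (start + i) % TOY_NBLOCKS;
		if (!a->used[b]) {
			a->used[b] = true;
			a->cursor = (b + 1) % TOY_NBLOCKS;
			return (int)b;
		}
	}
	return -1;
}

void
toy_free_block(struct toy_alloc *a, size_t b)
{
	assert(b < TOY_NBLOCKS);
	a->used[b] = false;
}
```

After allocating blocks 0, 1 and 2 and then freeing block 0, first-fit hands out block 0 again, whereas next-fit continues with block 3, which keeps consecutive allocations adjacent.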
Ian Lepore [Sun, 18 Jun 2017 18:22:52 +0000 (18:22 +0000)]
Add a driver for the imx6 EPIT timer that can be used as the system
timecounter instead of the GPT timer, freeing up the more flexible GPT
hardware for other uses. The EPIT driver is a standard (always in the
kernel) driver, and the existing GPT driver is now optional and included
only if you ask for device imx_gpt.
Ian Lepore [Sun, 18 Jun 2017 17:26:54 +0000 (17:26 +0000)]
Only register as the platform DELAY() implementation if the setup of the
global timer was successful, since the implementation tries to read it.
Notably, if the platform has a variable-frequency global timer (because
of dynamic frequency scaling), it doesn't set up the global timer for use
as a system timecounter, and in that case it also can't use it for DELAY.
Such platforms use different timer hardware for both timecounter and DELAY.
Mark Johnston [Sun, 18 Jun 2017 16:43:57 +0000 (16:43 +0000)]
Avoid including list.h in LinuxKPI headers.
list.h includes a number of FreeBSD headers as a workaround for the
LIST_HEAD name collision. To reduce pollution, avoid including list.h
in commonly used headers when it is not explicitly needed.
Rick Macklem [Sun, 18 Jun 2017 12:28:43 +0000 (12:28 +0000)]
Take "extern int maxbcachebuf" out of sys/param.h, since it breaks the
arm build.
In the arm build, elf_note.S includes sys/param.h and then does an
elf macro called ELFNOTE(). Although the compile error doesn't make
sense to me, I believe it just means that an "extern ..." can't exist
in param.h for this inclusion case.
I suspect adding #if !defined(LOCORE) might fix the build, but this
commit just takes the definition out.
I will ask freebsd-current@ what is the best way to deal with this
and do a subsequent commit after that.