FreeBSD/FreeBSD.git
7 years agoCorrect sysctl names.
Dag-Erling Smørgrav [Wed, 9 Aug 2017 07:24:58 +0000 (07:24 +0000)]
Correct sysctl names.

7 years agohyperv/hn: Implement transparent mode network VF.
Sepherosa Ziehau [Wed, 9 Aug 2017 05:59:45 +0000 (05:59 +0000)]
hyperv/hn: Implement transparent mode network VF.

How network VF works with hn(4) on Hyper-V in transparent mode:

- Each network VF has a corresponding hn(4).
- The network VF and its corresponding hn(4) have the same hardware
  address.
- Once the network VF is attached, the corresponding hn(4) waits several
  seconds to make sure that the network VF attach routine completes, then:
  o  Set the intersection of the network VF's if_capabilities and the
     corresponding hn(4)'s if_capabilities to the corresponding hn(4)'s
     if_capabilities.  And adjust the corresponding hn(4) if_capable and
     if_hwassist accordingly. (*)
  o  Make sure that the corresponding hn(4)'s TSO parameters meet the
     constraints posed by both the network VF and the corresponding hn(4).
     (*)
  o  The network VF's if_input is overridden.  The overriding if_input
     changes the input packet's rcvif to the corresponding hn(4).  The
     network layers are tricked into thinking that all packets are
     received by the corresponding hn(4).
  o  If the corresponding hn(4) was brought up, bring up the network VF.
     Transmissions dispatched to the corresponding hn(4) are
     redispatched to the network VF.
  o  Bringing down the corresponding hn(4) also brings down the network
     VF.
  o  All IOCTLs issued to the corresponding hn(4) are passed through to
     the network VF; the corresponding hn(4) changes its internal state
     if necessary.
  o  The media status of the corresponding hn(4) solely relies on the
     network VF.
  o  If there are multicast filters on the corresponding hn(4), allmulti
     will be enabled on the network VF. (**)
- Once the network VF is detached, undo all the damage done to the
  corresponding hn(4) in the above item.
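
A minimal sketch of the capability intersection and TSO adjustment from the
list above. It is not the hn(4) code: struct nic, its fields, and the CAP_*
bits are hypothetical stand-ins for the ifnet fields and IFCAP_* flags.

```c
#include <stdint.h>

/* Hypothetical capability bits, standing in for the kernel's IFCAP_* flags. */
#define CAP_TXCSUM	0x1u
#define CAP_RXCSUM	0x2u
#define CAP_TSO4	0x4u
#define CAP_LRO		0x8u

/* Hypothetical, heavily reduced view of a network interface. */
struct nic {
	uint32_t capabilities;	/* what the device can do */
	uint32_t capenable;	/* what is currently switched on */
	uint32_t tso_max_size;	/* largest TSO burst, in bytes */
};

/* Restrict the synthetic device to what both devices support. */
static void
nic_intersect(struct nic *syn, const struct nic *vf)
{
	/* Keep only the capabilities present on both devices. */
	syn->capabilities &= vf->capabilities;
	/* Nothing may stay enabled that is no longer supported. */
	syn->capenable &= syn->capabilities;
	/* The tighter TSO limit satisfies both devices. */
	if (vf->tso_max_size < syn->tso_max_size)
		syn->tso_max_size = vf->tso_max_size;
}
```

Taking the bitwise AND and the smaller TSO limit means that any packet the
stack hands to the synthetic device is also acceptable to the VF.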

NOTE:
No operation should be issued directly to the network VF, if the
network VF transparent mode is enabled.  The network VF transparent mode
can be enabled by setting tunable hw.hn.vf_transparent to 1.  The network
VF transparent mode is _not_ enabled by default, as of this commit.

The benefit of the network VF transparent mode is that the network VF
attachment and detachment are transparent to all network layers; e.g. live
migration detaches and reattaches the network VF.

The major drawbacks of the network VF transparent mode:
- The netmap(4) support is lost, even if the VF supports it.
- ALTQ does not work, since if_start method cannot be properly supported.

(*)
These decisions were made so that things will not be messed up too much
during the transition period.

(**)
This does _not_ need to go through the fancy multicast filter management
stuff like what vlan(4) has, at least currently:
- As of this writing, multicast does not work in Azure.
- As of this writing, multicast packets go through the corresponding hn(4).

MFC after: 3 days
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D11803

7 years agoAdd an entry to UPDATING for r322297 which restores the ability
Kirk McKusick [Wed, 9 Aug 2017 05:21:57 +0000 (05:21 +0000)]
Add an entry to UPDATING for r322297 which restores the ability
of fsck to automatically find alternate superblocks when the
standard one is trashed or unavailable.

MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D11589

7 years agoSince the switch to GPT disk labels, fsck for UFS/FFS has been
Kirk McKusick [Wed, 9 Aug 2017 05:17:21 +0000 (05:17 +0000)]
Since the switch to GPT disk labels, fsck for UFS/FFS has been
unable to automatically find alternate superblocks. This checkin
places the information needed to find alternate superblocks to the
end of the area reserved for the boot block.

Filesystems created with a newfs of this vintage or later will
create the recovery information. If you have a filesystem created
prior to this change and wish to have a recovery block created for
your filesystem, you can do so by running fsck in foreground mode
(i.e., do not use the -p or -y options). As it starts, fsck will
ask ``SAVE DATA TO FIND ALTERNATE SUPERBLOCKS'' to which you should
answer yes.

Discussed with: kib, imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D11589

7 years agoIntroduce vm_page_grab_pages(), which is intended to replace loops calling
Alan Cox [Wed, 9 Aug 2017 04:23:04 +0000 (04:23 +0000)]
Introduce vm_page_grab_pages(), which is intended to replace loops calling
vm_page_grab() on consecutive page indices.  Besides simplifying the code
in the caller, vm_page_grab_pages() allows for batching optimizations.
For example, the current implementation replaces calls to vm_page_lookup()
on consecutive page indices by cheaper calls to vm_page_next().
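
The batching idea can be illustrated with a toy model, assuming pages kept on
a list sorted by index. struct page, page_lookup(), page_next() and
grab_pages() are hypothetical names, not the kernel's vm_page API. Unlike
vm_page_grab_pages(), this sketch simply stops at the first gap.

```c
#include <stddef.h>

/* Hypothetical page record on a list sorted by index. */
struct page {
	unsigned long index;
	struct page *next;
};

/* Full lookup: search from the head of the list. */
static struct page *
page_lookup(struct page *head, unsigned long index)
{
	struct page *p;

	for (p = head; p != NULL; p = p->next)
		if (p->index == index)
			return (p);
	return (NULL);
}

/* Constant-time step: the successor, if it holds exactly index + 1. */
static struct page *
page_next(struct page *p)
{
	if (p->next != NULL && p->next->index == p->index + 1)
		return (p->next);
	return (NULL);
}

/* Collect up to count consecutive pages starting at first. */
static size_t
grab_pages(struct page *head, unsigned long first, struct page **out,
    size_t count)
{
	struct page *p;
	size_t n = 0;

	if (count == 0)
		return (0);
	p = page_lookup(head, first);	/* one full lookup */
	while (p != NULL) {
		out[n++] = p;
		if (n == count)
			break;
		p = page_next(p);	/* cheap step to the neighbour */
	}
	return (n);
}
```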

Reviewed by: kib, markj
Tested by: pho (an earlier version)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D11926

7 years agoUpdate pl310 node in Armada 38x DTS to match the one used in Linux
Marcin Wojtas [Wed, 9 Aug 2017 01:31:05 +0000 (01:31 +0000)]
Update pl310 node in Armada 38x DTS to match the one used in Linux

Since the cache controller nodes fixup is added to the platform code,
this patch aligns it to the Linux device tree representation.

Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11884

7 years agoEnable pl310 coherent operation in platform init for Armada 38x
Marcin Wojtas [Wed, 9 Aug 2017 01:25:47 +0000 (01:25 +0000)]
Enable pl310 coherent operation in platform init for Armada 38x

Updating the PL310 software context sc_io_coherent field in the
platform_pl310_init() routine for Armada 38x avoids the need for the
'arm,io-coherent' property, which is by default not present
in the device tree node in Linux.

This is another step toward DT unification between the two operating
systems. The improvement will also work after enabling
PLATFORM for Marvell ARMv7 SoCs.

Reviewed by: andrew, cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11883

7 years agodf(1): Add --si as an alias for -H
Kyle Evans [Wed, 9 Aug 2017 01:24:52 +0000 (01:24 +0000)]
df(1): Add --si as an alias for -H

Reviewed by: cem (earlier version), emaste
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D11749

7 years agoRemove clock-frequency properties from Armada 38x timer nodes
Marcin Wojtas [Wed, 9 Aug 2017 01:20:53 +0000 (01:20 +0000)]
Remove clock-frequency properties from Armada 38x timer nodes

Since the timers' base frequency setting is added to the platform code,
this patch removes clock-frequency properties from global
and twd timers, aligning both to the Linux device tree.

Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11882

7 years agodu(1): Add --si option to display in terms of powers of 1000
Kyle Evans [Wed, 9 Aug 2017 01:19:19 +0000 (01:19 +0000)]
du(1): Add --si option to display in terms of powers of 1000
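
As an illustration of the difference between powers of 1000 and powers of
1024, here is a small sketch. It is not du's implementation; scale_bytes()
and unit_prefix are hypothetical names, and the quotient is truncated rather
than rounded.

```c
#include <stdint.h>

/* Unit prefixes, indexed by how many times the value was divided. */
static const char unit_prefix[] = { 'B', 'k', 'M', 'G', 'T', 'P', 'E' };

/*
 * Divide bytes by base (1000 or 1024) until the quotient drops below base.
 * Stores the truncated quotient in *scaled and returns the unit prefix.
 */
static char
scale_bytes(uint64_t bytes, unsigned base, uint64_t *scaled)
{
	unsigned i = 0;

	while (bytes >= base && i < sizeof(unit_prefix) - 1) {
		bytes /= base;
		i++;
	}
	*scaled = bytes;
	return (unit_prefix[i]);
}
```

With base 1000, one million bytes scales to 1 with prefix 'M'; with base 1024
the same value scales to 976 with prefix 'k'.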

Reviewed by: cem (earlier version), emaste
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D11748

7 years agoDynamically configure timers' base frequency for Armada 38x
Marcin Wojtas [Wed, 9 Aug 2017 01:14:29 +0000 (01:14 +0000)]
Dynamically configure timers' base frequency for Armada 38x

Instead of using 'clock-frequency' device tree property for global/twd
mpcore timers of Armada 38x SoCs, set it in platform_late_init stage
with arm_tmr_change_frequency() function.

Reviewed by: cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11881

7 years agoEnable using ofw_bus_find_compatible in early platform code
Marcin Wojtas [Wed, 9 Aug 2017 01:06:40 +0000 (01:06 +0000)]
Enable using ofw_bus_find_compatible in early platform code

Before this patch, ofw_bus_find_compatible() used memory allocations
to find the compatible node and the property's length. This way there
was always a suitably sized buffer for the property; however, this
approach also had a disadvantage: ofw_bus_find_compatible() could not
be used when malloc is not available, e.g. during the fdt fixup stage.

In order to remove the usage limitation of ofw_bus_find_compatible(),
this patch modifies the function to use ofw_bus_node_is_compatible()
(instead of the variant with the _int suffix), which uses a fixed
buffer on the stack instead of dynamic allocations.
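
A sketch of matching against a "compatible" property through a fixed stack
buffer, so that no allocator is needed. It assumes the property is a run of
back-to-back NUL-terminated strings. The function names and the 255-byte
limit are illustrative, not the ofw_bus API.

```c
#include <stddef.h>
#include <string.h>

#define COMPAT_BUF_LEN	255	/* assumed upper bound on the property size */

/*
 * compat holds len bytes of back-to-back NUL-terminated strings.
 * Return 1 if want is one of them, 0 otherwise.
 */
static int
compat_list_contains(const char *compat, size_t len, const char *want)
{
	size_t want_len = strlen(want);

	while (len > 0) {
		const char *nul = memchr(compat, '\0', len);
		size_t cur = (nul != NULL) ? (size_t)(nul - compat) : len;

		if (cur == want_len && memcmp(compat, want, cur) == 0)
			return (1);
		if (cur == len)		/* last entry had no terminator */
			break;
		compat += cur + 1;
		len -= cur + 1;
	}
	return (0);
}

/* Copy the property into a stack buffer first; no malloc involved. */
static int
node_is_compatible(const char *prop, size_t prop_len, const char *want)
{
	char buf[COMPAT_BUF_LEN];

	if (prop_len == 0 || prop_len > sizeof(buf))
		return (0);
	memcpy(buf, prop, prop_len);
	return (compat_list_contains(buf, prop_len, want));
}
```

The trade-off is that a property longer than the buffer is treated as not
compatible in this sketch, where an allocating version could size the buffer
to fit.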

Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: nwhitehorn, cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11880

7 years agoregex(3): Refactor fast/slow stepping bits in the matching engine
Kyle Evans [Wed, 9 Aug 2017 01:04:36 +0000 (01:04 +0000)]
regex(3): Refactor fast/slow stepping bits in the matching engine

Adding features for matching is fairly straightforward, but this requires
some duplication because of this fast/slow setup. They can be fairly
trivially combined into a single walk(), so do it to make future additions
less error prone.

Reviewed by: cem (earlier version), emaste, pfg
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D11233

7 years agoAdd support for "compatible" parameter in ofw_fdt_fixup
Marcin Wojtas [Wed, 9 Aug 2017 00:56:29 +0000 (00:56 +0000)]
Add support for "compatible" parameter in ofw_fdt_fixup

Sometimes it's convenient to provide fixup to many boards
that use the same SoC family (eg. Marvell Armada 38x).
Instead of putting multiple entries in fdt_fixup_table,
use one entry which refers to all boards with given SoC.

Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: nwhitehorn, cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11878

7 years agoRestore original /soc ranges on Marvell Armada 38x boards
Marcin Wojtas [Wed, 9 Aug 2017 00:51:45 +0000 (00:51 +0000)]
Restore original /soc ranges on Marvell Armada 38x boards

Because fdt_get_ranges can now process multiple 'ranges' entries,
restoring the ranges from original Linux device trees is possible.

Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11877

7 years agoEnable parsing simple-bus 'ranges' with multiple entries
Marcin Wojtas [Wed, 9 Aug 2017 00:45:25 +0000 (00:45 +0000)]
Enable parsing simple-bus 'ranges' with multiple entries

This patch makes it possible to boot with up to 8 ranges in soc.
Dynamic allocation cannot be used, because the fdt_get_ranges
function is called early, when malloc is not available.

Change is required for the alignment of Marvell Armada 38x
device trees present in sys/gnu/dts/arm - originally
the platform has 6 entries in simple-bus 'ranges'.
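
A sketch of decoding 'ranges' into a fixed-size array, under simplifying
assumptions: each field is a single host-endian 32-bit cell, and an
oversized property is rejected. struct range and parse_ranges() are
hypothetical names; the real function honours the cell counts of the node
and its parent.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES	8	/* limit stated in the commit message */

/* One (child address, parent address, size) triple. */
struct range {
	uint32_t child_base;
	uint32_t parent_base;
	uint32_t size;
};

/*
 * Decode up to MAX_RANGES triples from cells into the caller's array.
 * Returns the number of entries stored, or -1 if the input is malformed
 * or has more entries than the fixed array can hold.
 */
static int
parse_ranges(const uint32_t *cells, size_t ncells,
    struct range out[MAX_RANGES])
{
	size_t i, n;

	if (ncells % 3 != 0)
		return (-1);
	n = ncells / 3;
	if (n > MAX_RANGES)
		return (-1);
	for (i = 0; i < n; i++) {
		out[i].child_base = cells[3 * i];
		out[i].parent_base = cells[3 * i + 1];
		out[i].size = cells[3 * i + 2];
	}
	return ((int)n);
}
```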

Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: manu, nwhitehorn, cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11876

7 years agoRemove the ds133x and s35390a i2c RTC drivers for now. They both do i2c
Ian Lepore [Tue, 8 Aug 2017 22:58:34 +0000 (22:58 +0000)]
Remove the ds133x and s35390a i2c RTC drivers for now.  They both do i2c
transfers in their probe() or attach() routines, and that doesn't work
when the low-level controller requires interrupts to be functional.

The DS133x family of chips is nearly identical to the DS1307 and support
for them should be added to that driver, then the ds133x driver can be
deleted.  The s35390a driver just needs a non-trivial workover.  In both
cases that work will be done and committed separately.

7 years agoAdd missing parenthesis on error message
Renato Botelho [Tue, 8 Aug 2017 22:40:26 +0000 (22:40 +0000)]
Add missing parenthesis on error message

Approved by: loos
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC (Netgate)

7 years agopf_get_sport(): Prevent possible endless loop when searching for an unused nat port
Kristof Provost [Tue, 8 Aug 2017 21:09:26 +0000 (21:09 +0000)]
pf_get_sport(): Prevent possible endless loop when searching for an unused nat port

This is an import of Alexander Bluhm's OpenBSD commit r1.60,
the first chunk had to be modified because on OpenBSD the
'cut' declaration is located elsewhere.

Upstream report by Jingmin Zhou:
https://marc.info/?l=openbsd-pf&m=150020133510896&w=2

OpenBSD commit message:
 Use a 32 bit variable to detect integer overflow when searching for
 an unused nat port.  Prevents a possible endless loop if high port
 is 65535 or low port is 0.
 report and analysis Jingmin Zhou; OK sashan@ visa@
Quoted from: https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/net/pf_lb.c
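
A sketch of the overflow problem and the 32-bit counter remedy, written from
the description above rather than copied from pf. find_free_port() and the
three predicates are hypothetical, and the caller is assumed to pass
low <= cut <= high.

```c
#include <stdint.h>

typedef int (*port_busy_fn)(uint16_t port);

/* Sample predicates used only for demonstration. */
static int all_ports_busy(uint16_t port) { (void)port; return (1); }
static int only_port_65535_free(uint16_t port) { return (port != 65535); }
static int only_port_0_free(uint16_t port) { return (port != 0); }

/*
 * Search [low, high] for a free port, first upward from cut, then
 * downward below it.  The counter is 32 bits wide: with a uint16_t
 * counter, 65535 + 1 wraps to 0 and "tmp <= high" never becomes false
 * when high == 65535; likewise 0 - 1 wraps to 65535 when low == 0.
 */
static int
find_free_port(uint16_t low, uint16_t high, uint16_t cut,
    port_busy_fn busy, uint16_t *out)
{
	uint32_t tmp;

	/* Ascending: tmp can reach 65536, which ends the loop. */
	for (tmp = cut; tmp <= high && tmp <= 0xffff; tmp++) {
		if (!busy((uint16_t)tmp)) {
			*out = (uint16_t)tmp;
			return (1);
		}
	}
	/* Descending: 0 - 1 becomes 0xffffffff, caught by the range guard. */
	for (tmp = (uint32_t)cut - 1; tmp >= low && tmp <= 0xffff; tmp--) {
		if (!busy((uint16_t)tmp)) {
			*out = (uint16_t)tmp;
			return (1);
		}
	}
	return (0);
}
```

Calling find_free_port(0, 65535, 100, all_ports_busy, &port) returns 0 after
one pass over the range instead of spinning forever.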

PR: 221201
Submitted by: Fabian Keil <fk@fabiankeil.de>
Obtained from:  OpenBSD via ElectroBSD
MFC after: 1 week

7 years agoTurns out to be even simpler to just not create /dev/efi if we don't
Warner Losh [Tue, 8 Aug 2017 21:01:11 +0000 (21:01 +0000)]
Turns out to be even simpler to just not create /dev/efi if we don't
have an EFI runtime.

7 years agoFail to open efirt device when no EFI on system.
Warner Losh [Tue, 8 Aug 2017 20:44:16 +0000 (20:44 +0000)]
Fail to open efirt device when no EFI on system.

libefivar expects opening /dev/efi to indicate whether we can make efi
runtime calls. With a null routine, it was always succeeding, leading
efi_variables_supported() to return the wrong value. Only succeed if
we have an efi_runtime table. Also, while I'm here, out of an
abundance of caution, add a likely redundant check to make sure
efi_systbl is not NULL before dereferencing it. I know it can't be
NULL if efi_cfgtbl is non-NULL, but the compiler doesn't.

7 years agorwho/ruptime/rwhod shouldn't be gated by RCMDS.
Jeremie Le Hen [Tue, 8 Aug 2017 20:17:07 +0000 (20:17 +0000)]
rwho/ruptime/rwhod shouldn't be gated by RCMDS.

As peter@ points out in pr/220953:
"rwho, rwhod and ruptime are not part of the remote login suite (rsh, rlogin
etc).

They should *not* be in the rcmds package which is disabled by default.  We
rely on rwho/rwhod/ruptime in the freebsd.org cluster."

This commit is a re-commit of r322029 and r322031 with a better commit log, as
pointed out by ngie@.

This also includes the necessary changes to OptionalObsoleteFiles.inc, as
requested by jhb@.

PR: 220953
Reported by: peter@, jhb@
Differential Revision: https://reviews.freebsd.org/D11743

7 years agoRevert r322029 and r322031 so as to recommit them with a better commit log.
Jeremie Le Hen [Tue, 8 Aug 2017 20:07:08 +0000 (20:07 +0000)]
Revert r322029 and r322031 so as to recommit them with a better commit log.

PR: 220953
Reported by: ngie@

7 years agoFix few issues of LinuxKPI workqueue.
Alexander Motin [Tue, 8 Aug 2017 19:36:34 +0000 (19:36 +0000)]
Fix few issues of LinuxKPI workqueue.

LinuxKPI workqueue wrappers reported "successful" cancellation for work
items that had already completed normally.  This change brings the reported
status and the actual cancellation into sync.  This is required for drm-next
operation.

Reviewed by: hselasky (earlier version)
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D11904

7 years agoRemove now-unused badsb declaration, missed in r322200
Ed Maste [Tue, 8 Aug 2017 18:31:40 +0000 (18:31 +0000)]
Remove now-unused badsb declaration, missed in r322200

Sponsored by: The FreeBSD Foundation

7 years agoFix a NULL pointer dereference in mly_user_command().
John Baldwin [Tue, 8 Aug 2017 17:49:57 +0000 (17:49 +0000)]
Fix a NULL pointer dereference in mly_user_command().

If mly_user_command fails to allocate a command slot it jumps to an 'out'
label used for error handling.  The error handling code checks for a data
buffer in 'mc->mc_data' to free before checking if 'mc' is NULL.  Fix by
just returning directly if we fail to allocate a command and only using
the 'out' label for subsequent errors when there is actual cleanup to
perform.
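
The control-flow pattern described above can be sketched as follows. This is
not the mly(4) code: struct cmd, cmd_alloc() and user_command() are
hypothetical, and the allocator takes a flag so that the failure path can be
exercised.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical command slot with an optional data buffer. */
struct cmd {
	void *data;
};

/* Hypothetical allocator that can be forced to fail. */
static struct cmd *
cmd_alloc(int fail)
{
	return (fail ? NULL : calloc(1, sizeof(struct cmd)));
}

static int
user_command(int fail_alloc, size_t len)
{
	struct cmd *c;
	int error = 0;

	c = cmd_alloc(fail_alloc);
	if (c == NULL)
		return (ENOMEM);	/* nothing to clean up: return directly */

	if (len > 0) {
		c->data = malloc(len);
		if (c->data == NULL) {
			error = ENOMEM;
			goto out;	/* the slot itself must be released */
		}
	}
	/* ... the command would be issued here ... */
out:
	free(c->data);			/* c is known to be non-NULL here */
	free(c);
	return (error);
}
```

Because the only path into the out label runs after a successful slot
allocation, the cleanup code never has to dereference a NULL pointer.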

PR: 217747
Reported by: PVS-Studio
Reviewed by: emaste
MFC after: 1 week

7 years agoMake p1003_1b.aio_listio_max a tunable
Alan Somers [Tue, 8 Aug 2017 16:14:31 +0000 (16:14 +0000)]
Make p1003_1b.aio_listio_max a tunable

p1003_1b.aio_listio_max is now a tunable. Its value is reflected in the
sysctl of the same name, and the sysconf(3) variable _SC_AIO_LISTIO_MAX.
Its value will be bounded from below by the compile-time constant
AIO_LISTIO_MAX and from above by the compile-time constant
MAX_AIO_QUEUE_PER_PROC and the tunable vfs.aio.max_aio_queue.
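
A sketch of the bounding rule, with placeholder constants in place of the
kernel's values. clamp_listio_max() is a hypothetical name, and applying the
lower bound last is an assumption about how conflicting bounds are resolved.

```c
/* Placeholder values; the real constants live in the kernel sources. */
#define LISTIO_FLOOR	16	/* stands in for AIO_LISTIO_MAX */
#define PER_PROC_CEIL	256	/* stands in for MAX_AIO_QUEUE_PER_PROC */

/* Bound a requested value by both ceilings, then by the floor. */
static int
clamp_listio_max(int requested, int max_aio_queue)
{
	int v = requested;

	if (v > PER_PROC_CEIL)
		v = PER_PROC_CEIL;
	if (v > max_aio_queue)
		v = max_aio_queue;
	if (v < LISTIO_FLOOR)		/* floor applied last (assumption) */
		v = LISTIO_FLOOR;
	return (v);
}
```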

Reviewed by: jhb, kib
MFC after: 3 weeks
Relnotes: yes
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D11601

7 years agoUse the correct queue depth for nda devices.
Warner Losh [Tue, 8 Aug 2017 16:06:16 +0000 (16:06 +0000)]
Use the correct queue depth for nda devices.

Submitted by: Matt Williams

7 years agoFix logic error in the assert, causing the condition to be always true.
Konstantin Belousov [Tue, 8 Aug 2017 15:46:29 +0000 (15:46 +0000)]
Fix logic error in the assert, causing the condition to be always true.

Also improve the formatting of the corresponding KASSERT message.

Based on the submission by: Svyatoslav <razmyslov@viva64.com>
Found by: PVS-Studio
PR: 217741
Reviewed by: emaste
Sponsored by: The FreeBSD Foundation (kib)
MFC after: 1 week

7 years agotests/sys/netinet/fibs_test: skip selected tests when firewalls are enabled
Alan Somers [Tue, 8 Aug 2017 15:37:21 +0000 (15:37 +0000)]
tests/sys/netinet/fibs_test: skip selected tests when firewalls are enabled

Some tests send packets over epair(4) interfaces. Firewalls can cause
spurious failures.

Reviewed by: ngie
MFC after: 3 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D11917

7 years agoFix typo in cyapa out of bounds check.
Michael Gmelin [Tue, 8 Aug 2017 13:27:32 +0000 (13:27 +0000)]
Fix typo in cyapa out of bounds check.

PR: 217783
Submitted by: razmyslov@viva64.com
MFC after: 1 week

7 years agovmstat: Always emit a space after the free-memory column
Emmanuel Vadot [Tue, 8 Aug 2017 12:18:11 +0000 (12:18 +0000)]
vmstat: Always emit a space after the free-memory column

When displaying in non-human form, if the free-memory number
is large (more than 7 digits), there is no space between it and
the page fault column.
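
The effect can be reproduced with a short sketch. The widths and the
format_row() helper are hypothetical, not vmstat's actual format string. A
field width only pads short values, so a literal space is what guarantees a
gap once the number outgrows its column.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a free-memory value and a page-fault count.  "%7lu" pads short
 * numbers to 7 characters but cannot shrink longer ones, so the literal
 * space after it keeps the two columns apart in every case.
 */
static int
format_row(char *buf, size_t size, unsigned long free_mem,
    unsigned long faults)
{
	return (snprintf(buf, size, "%7lu %4lu", free_mem, faults));
}
```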

PR: 221290
Submitted by: Josuah Demangeon <mail@josuah.net> (Original version)

7 years agoMake sure the received IP header gets 32-bit aligned for short packets
Hans Petter Selasky [Tue, 8 Aug 2017 11:49:36 +0000 (11:49 +0000)]
Make sure the received IP header gets 32-bit aligned for short packets
in the mlx5en(4) driver.

MFC after: 1 week
Sponsored by: Mellanox Technologies

7 years agoCount drop events due to lack of PCI bandwidth as queue drops and not as
Hans Petter Selasky [Tue, 8 Aug 2017 11:36:57 +0000 (11:36 +0000)]
Count drop events due to lack of PCI bandwidth as queue drops and not as
input errors in the mlx5en(4) driver. This improves the sysadmin view of
physical port errors.

Submitted by: gallatin@
MFC after: 1 week
Sponsored by: Mellanox Technologies

7 years agoFix for mlx4en(4) to properly call m_defrag().
Hans Petter Selasky [Tue, 8 Aug 2017 11:35:02 +0000 (11:35 +0000)]
Fix for mlx4en(4) to properly call m_defrag().

The m_defrag() function can only defrag mbuf chains which have a valid
mbuf packet header. In r291699 when the mlx4en(4) driver was converted
into using BUSDMA(9), the call to m_defrag() was moved after the part
of the transmit routine which strips the header from the mbuf chain.
This effectively disabled the mbuf defrag mechanism and such packets
simply got dropped.

This patch removes the stripping of mbufs from a chain and loads all
mbufs using busdma. If busdma finds there are no segments, unload
the DMA map and free the mbuf right away, because that means all
data in the mbuf has been inlined in the TX ring. Else proceed
as usual.

Add a per-ring counter for the number of defrag attempts and
make sure the oversized_packets counter gets zeroed while at it.

The counters are per-ring to avoid excessive cache misses in the
TX path.

Submitted by: mjoras@
Differential Revision: https://reviews.freebsd.org/D11683
MFC after: 1 week
Sponsored by: Mellanox Technologies

7 years agoMFV r322242: 8373 TXG_WAIT in ZIL commit path
Andriy Gapon [Tue, 8 Aug 2017 11:26:03 +0000 (11:26 +0000)]
MFV r322242: 8373 TXG_WAIT in ZIL commit path

illumos/illumos-gate@d28671a3b094af696bea87f52272d4c4d89321c7
https://github.com/illumos/illumos-gate/commit/d28671a3b094af696bea87f52272d4c4d89321c7

https://www.illumos.org/issues/8373
  The code that writes ZIL blocks uses dmu_tx_assign(TXG_WAIT) to assign
  a transaction to a transaction group.  That seems to be logically
  incorrect as writing of the ZIL block does not introduce any new dirty
  data.  Also, when there is a lot of dirty data, the call can introduce
  significant delays into the ZIL commit path, thus affecting all
  synchronous writes. Additionally, ARC throttling may affect the ZIL
  writing.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

MFC after: 2 weeks

7 years ago8373 TXG_WAIT in ZIL commit path
Andriy Gapon [Tue, 8 Aug 2017 11:24:13 +0000 (11:24 +0000)]
8373 TXG_WAIT in ZIL commit path

illumos/illumos-gate@d28671a3b094af696bea87f52272d4c4d89321c7
https://github.com/illumos/illumos-gate/commit/d28671a3b094af696bea87f52272d4c4d89321c7

https://www.illumos.org/issues/8373
  The code that writes ZIL blocks uses dmu_tx_assign(TXG_WAIT) to assign a
  transaction to a transaction group.
  That seems to be logically incorrect as writing of the ZIL block does not
  introduce any new dirty data.
  Also, when there is a lot of dirty data, the call can introduce significant
  delays into the ZIL commit path,
  thus affecting all synchronous writes. Additionally, ARC throttling may affect
  the ZIL writing.
  We probably need a new mechanism similar to dmu_tx_create_assigned to assign
  ZIL transactions.
  (Ab)using TXG_WAITED does not seem to be sufficient.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

7 years agoMFV r322240: 8491 uberblock on-disk padding to reserve space for smoothly merging...
Andriy Gapon [Tue, 8 Aug 2017 11:21:58 +0000 (11:21 +0000)]
MFV r322240: 8491 uberblock on-disk padding to reserve space for smoothly merging zpool checkpoint & MMP in ZFS

illumos/illumos-gate@79c2b812ee2010ebf20fdd92dc5f06b59000a94c
https://github.com/illumos/illumos-gate/commit/79c2b812ee2010ebf20fdd92dc5f06b59000a94c

https://www.illumos.org/issues/8491
  The zpool checkpoint feature in DxOS added a new field in the uberblock.
  The Multi-Modifier Protection Pull Request from ZoL adds two new fields in the
  uberblock (Reference: https://github.com/zfsonlinux/zfs/pull/6279).
  As these two changes come from two different sources and once upstreamed and
  deployed will introduce an incompatibility with each other we want
  to upstream a change that will reserve the padding for both of them so
  integration goes smoothly and everyone gets both features.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Olaf Faaland <faaland1@llnl.gov>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>

MFC after: 3 weeks

7 years ago8491 uberblock on-disk padding to reserve space for smoothly merging zpool checkpoint...
Andriy Gapon [Tue, 8 Aug 2017 11:19:56 +0000 (11:19 +0000)]
8491 uberblock on-disk padding to reserve space for smoothly merging zpool checkpoint & MMP in ZFS

illumos/illumos-gate@79c2b812ee2010ebf20fdd92dc5f06b59000a94c
https://github.com/illumos/illumos-gate/commit/79c2b812ee2010ebf20fdd92dc5f06b59000a94c

https://www.illumos.org/issues/8491
  The zpool checkpoint feature in DxOS added a new field in the uberblock.
  The Multi-Modifier Protection Pull Request from ZoL adds two new fields in the
  uberblock (Reference: https://github.com/zfsonlinux/zfs/pull/6279).
  As these two changes come from two different sources and once upstreamed and
  deployed will introduce an incompatibility with each other we want
  to upstream a change that will reserve the padding for both of them so
  integration goes smoothly and everyone gets both features.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Olaf Faaland <faaland1@llnl.gov>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>

7 years agoMFV r322238: 7915 checks in l2arc_evict could use some cleaning up
Andriy Gapon [Tue, 8 Aug 2017 11:19:14 +0000 (11:19 +0000)]
MFV r322238: 7915 checks in l2arc_evict could use some cleaning up

illumos/illumos-gate@267ae6c3a88d2fc39276af66caafa978b0935b82
https://github.com/illumos/illumos-gate/commit/267ae6c3a88d2fc39276af66caafa978b0935b82

https://www.illumos.org/issues/7915
  l2arc_evict() is strictly serialized with respect to
  l2arc_write_buffers() and l2arc_write_done().  Normally, l2arc_evict()
  and l2arc_write_buffers() are called from the same thread, so they can
  not be concurrent.  Also, l2arc_write_buffers() uses zio_wait() on the
  parent zio of all cache zio-s.  That ensures that l2arc_write_done()
  is completed before l2arc_write_buffers() returns.  Finally, if a
  cache device is removed, then l2arc_evict() is called under SCL_ALL in
  the exclusive mode.  That ensures that it can not be concurrent with
  the normal L2ARC accesses to the device (including writing and
  evicting buffers).  Given the above, some checks and actions in
  l2arc_evict() do not make sense.  For instance, it must never
  encounter the write head header let alone remove it from the buffer
  list.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Andriy Gapon <avg@FreeBSD.org>

MFC after: 2 weeks

7 years ago7915 checks in l2arc_evict could use some cleaning up
Andriy Gapon [Tue, 8 Aug 2017 11:15:36 +0000 (11:15 +0000)]
7915 checks in l2arc_evict could use some cleaning up

illumos/illumos-gate@267ae6c3a88d2fc39276af66caafa978b0935b82
https://github.com/illumos/illumos-gate/commit/267ae6c3a88d2fc39276af66caafa978b0935b82

https://www.illumos.org/issues/7915
  l2arc_evict() is strictly serialized with respect to l2arc_write_buffers() and
  l2arc_write_done().
  Normally, l2arc_evict() and l2arc_write_buffers() are called from the same
  thread, so they can not be concurrent.
  Also, l2arc_write_buffers() uses zio_wait() on the parent zio of all cache
  zio-s.
  That ensures that l2arc_write_done() is completed before l2arc_write_buffers()
  returns.
  Finally, if a cache device is removed, then l2arc_evict() is called under
  SCL_ALL in the exclusive mode.
  That ensures that it can not be concurrent with the normal L2ARC accesses to
  the device (including writing and evicting buffers).
  Given the above, some checks and actions in l2arc_evict() do not make sense.
  For instance, it must never encounter the write head header let alone remove it
  from the buffer list.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Andriy Gapon <avg@FreeBSD.org>

7 years agoMFV r322236: 8126 ztest assertion failed in dbuf_dirty due to dn_nlevels changing
Andriy Gapon [Tue, 8 Aug 2017 11:14:40 +0000 (11:14 +0000)]
MFV r322236: 8126 ztest assertion failed in dbuf_dirty due to dn_nlevels changing

illumos/illumos-gate@dcb6872c565819ac88acbc2ece999ef241c8b982
https://github.com/illumos/illumos-gate/commit/dcb6872c565819ac88acbc2ece999ef241c8b982

https://www.illumos.org/issues/8126
  The sync thread is concurrently modifying dn_phys->dn_nlevels
  while dbuf_dirty() is trying to assert something about it, without
  holding the necessary lock. We need to move this assertion further down
  in the function, after we have acquired the dn_struct_rwlock.

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

MFC after: 2 weeks

7 years ago8126 ztest assertion failed in dbuf_dirty due to dn_nlevels changing
Andriy Gapon [Tue, 8 Aug 2017 11:13:27 +0000 (11:13 +0000)]
8126 ztest assertion failed in dbuf_dirty due to dn_nlevels changing

illumos/illumos-gate@dcb6872c565819ac88acbc2ece999ef241c8b982
https://github.com/illumos/illumos-gate/commit/dcb6872c565819ac88acbc2ece999ef241c8b982

https://www.illumos.org/issues/8126
  The sync thread is concurrently modifying dn_phys->dn_nlevels
  while dbuf_dirty() is trying to assert something about it, without
  holding the necessary lock. We need to move this assertion further down
  in the function, after we have acquired the dn_struct_rwlock.

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

7 years ago8067 zdb should be able to dump literal embedded block pointer
Andriy Gapon [Tue, 8 Aug 2017 11:10:37 +0000 (11:10 +0000)]
8067 zdb should be able to dump literal embedded block pointer

illumos/illumos-gate@4923c69fddc0887da5604a262585af3efd82ee20
https://github.com/illumos/illumos-gate/commit/4923c69fddc0887da5604a262585af3efd82ee20

https://www.illumos.org/issues/8067
  Add an option to zdb to print a literal embedded block pointer supplied on the
  command line:
  zdb -E [-A] word0:word1:...:word15

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@gmail.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

7 years agozfs: no need for __DECONST after abd constification in r322233
Andriy Gapon [Tue, 8 Aug 2017 11:07:34 +0000 (11:07 +0000)]
zfs: no need for __DECONST after abd constification in r322233

Note that vdev_label_write_pad2() is FreeBSD specific.

MFC after: 2 weeks
X-MFC after: r322233

7 years agoMFV r322232: 8426 mark immutable buffer arguments as such in abd.h
Andriy Gapon [Tue, 8 Aug 2017 10:59:18 +0000 (10:59 +0000)]
MFV r322232: 8426 mark immutable buffer arguments as such in abd.h

illumos/illumos-gate@9b195260e22529ac0e2580faaf89402420589c1c
https://github.com/illumos/illumos-gate/commit/9b195260e22529ac0e2580faaf89402420589c1c

https://www.illumos.org/issues/8426
  abd_copy_from_buf and abd_cmp_buf do not modify their void *buf arguments, so
  qualify them with const.
  abd_copy_from_buf_off and abd_cmp_buf_off already had that type for the
  corresponding arguments.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

MFC after: 2 weeks

7 years ago8426 mark immutable buffer arguments as such in abd.h
Andriy Gapon [Tue, 8 Aug 2017 10:58:01 +0000 (10:58 +0000)]
8426 mark immutable buffer arguments as such in abd.h

illumos/illumos-gate@9b195260e22529ac0e2580faaf89402420589c1c
https://github.com/illumos/illumos-gate/commit/9b195260e22529ac0e2580faaf89402420589c1c

https://www.illumos.org/issues/8426
  abd_copy_from_buf and abd_cmp_buf do not modify their void *buf arguments, so
  qualify them with const.
  abd_copy_from_buf_off and abd_cmp_buf_off already had that type for the
  corresponding arguments.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

7 years ago8430 dir_is_empty_readdir() doesn't properly handle error from fdopendir()
Andriy Gapon [Tue, 8 Aug 2017 10:55:42 +0000 (10:55 +0000)]
8430 dir_is_empty_readdir() doesn't properly handle error from fdopendir()

illumos/illumos-gate@ba6e7e6505150388de6dc6a88741164118a421bf
https://github.com/illumos/illumos-gate/commit/ba6e7e6505150388de6dc6a88741164118a421bf

https://www.illumos.org/issues/8430
  we should close dirfd if fdopendir() fails.
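
The pattern described above can be sketched as follows. This is an
illustration, not the libzfs code: the function name dir_is_empty_fd() and its
return convention are assumptions. The point is that fdopendir() only takes
ownership of the descriptor on success, so the failure path has to close it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the directory is empty, 0 if it is
 * not, and -1 on error.  The caller hands over ownership of dirfd.
 */
int
dir_is_empty_fd(int dirfd)
{
	DIR *dirp;
	struct dirent *dp;
	int empty = 1;

	assert(dirfd >= 0);
	dirp = fdopendir(dirfd);
	if (dirp == NULL) {
		/* fdopendir() failed, so dirfd is still ours to close. */
		(void) close(dirfd);
		return (-1);
	}
	while ((dp = readdir(dirp)) != NULL) {
		if (strcmp(dp->d_name, ".") == 0 ||
		    strcmp(dp->d_name, "..") == 0)
			continue;
		empty = 0;
		break;
	}
	/* closedir() also closes the underlying descriptor. */
	(void) closedir(dirp);
	return (empty);
}
```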

Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Sowrabha Gopal <sowrabha.gopal@delphix.com>

7 years agoMFV r322229: 7600 zfs rollback should pass target snapshot to kernel
Andriy Gapon [Tue, 8 Aug 2017 10:52:01 +0000 (10:52 +0000)]
MFV r322229: 7600 zfs rollback should pass target snapshot to kernel

illumos/illumos-gate@77b171372ed21642e04c873ef1e87fe2365520df
https://github.com/illumos/illumos-gate/commit/77b171372ed21642e04c873ef1e87fe2365520df

https://www.illumos.org/issues/7600
  At present, the kernel side code seems to blindly rollback to whatever happens
  to be the latest snapshot at the time when the rollback task is processed.
  The expected target's name should be passed to the kernel driver and the sync
  task should validate that the target exists and that it is the latest snapshot
  indeed.
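
The validation described above can be sketched in userland C. This is not the
kernel sync-task code; the enum, the function name rollback_check() and the
oldest-first array of snapshot names are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes; the real code uses errno values. */
enum rollback_status {
	ROLLBACK_OK = 0,
	ROLLBACK_TARGET_MISSING,
	ROLLBACK_TARGET_NOT_LATEST
};

/*
 * snaps[] holds the snapshot names of one dataset, oldest first.
 * The rollback is allowed only if target exists and is the last entry.
 */
enum rollback_status
rollback_check(const char *const *snaps, size_t nsnaps, const char *target)
{
	size_t i;

	assert(target != NULL);
	assert(snaps != NULL || nsnaps == 0);
	for (i = 0; i < nsnaps; i++) {
		if (strcmp(snaps[i], target) == 0)
			break;
	}
	if (i == nsnaps)
		return (ROLLBACK_TARGET_MISSING);
	if (i != nsnaps - 1)
		return (ROLLBACK_TARGET_NOT_LATEST);
	return (ROLLBACK_OK);
}
```

Running the check at the time the task is processed, rather than when the
request is issued, is what closes the window in which a newer snapshot could
appear.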

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

MFC after: 3 weeks

7 years ago7600 zfs rollback should pass target snapshot to kernel
Andriy Gapon [Tue, 8 Aug 2017 10:49:56 +0000 (10:49 +0000)]
7600 zfs rollback should pass target snapshot to kernel

illumos/illumos-gate@77b171372ed21642e04c873ef1e87fe2365520df
https://github.com/illumos/illumos-gate/commit/77b171372ed21642e04c873ef1e87fe2365520df

https://www.illumos.org/issues/7600
  At present, the kernel side code seems to blindly rollback to whatever happens
  to be the latest snapshot at the time when the rollback task is processed.
  The expected target's name should be passed to the kernel driver and the sync
  task should validate that the target exists and that it is the latest snapshot
  indeed.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

7 years agoMFV r322227: 8377 Panic in bookmark deletion
Andriy Gapon [Tue, 8 Aug 2017 10:48:52 +0000 (10:48 +0000)]
MFV r322227: 8377 Panic in bookmark deletion

illumos/illumos-gate@42418f9e73f0d007aa87675ecc206c26fc8e073e
https://github.com/illumos/illumos-gate/commit/42418f9e73f0d007aa87675ecc206c26fc8e073e

https://www.illumos.org/issues/8377
  The problem is that when dsl_bookmark_destroy_check() is executed from open
  context (the pre-check), it fills in dbda_success based on the existence of the
  bookmark.
  But the bookmark (or containing filesystem as in this case) can be destroyed
  before we get to syncing context. When we re-run dsl_bookmark_destroy_check()
  in syncing context, it will not add the deleted bookmark to dbda_success,
  intending for dsl_bookmark_destroy_sync() to not process it. But because the
  bookmark is still in dbda_success from the open-context call, we do try to
  destroy it.
  The fix is that dsl_bookmark_destroy_check() should not modify dbda_success
  when called from open context.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

MFC after: 2 weeks

7 years ago8377 Panic in bookmark deletion
Andriy Gapon [Tue, 8 Aug 2017 10:47:56 +0000 (10:47 +0000)]
8377 Panic in bookmark deletion

illumos/illumos-gate@42418f9e73f0d007aa87675ecc206c26fc8e073e
https://github.com/illumos/illumos-gate/commit/42418f9e73f0d007aa87675ecc206c26fc8e073e

https://www.illumos.org/issues/8377
  The problem is that when dsl_bookmark_destroy_check() is executed from open
  context (the pre-check), it fills in dbda_success based on the existence of the
  bookmark.
  But the bookmark (or containing filesystem as in this case) can be destroyed
  before we get to syncing context. When we re-run dsl_bookmark_destroy_check()
  in syncing context, it will not add the deleted bookmark to dbda_success,
  intending for dsl_bookmark_destroy_sync() to not process it. But because the
  bookmark is still in dbda_success from the open-context call, we do try to
  destroy it.
  The fix is that dsl_bookmark_destroy_check() should not modify dbda_success
  when called from open context.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

7 years agoMFV r322223: 8378 crash due to bp in-memory modification of nopwrite block
Andriy Gapon [Tue, 8 Aug 2017 10:46:51 +0000 (10:46 +0000)]
MFV r322223: 8378 crash due to bp in-memory modification of nopwrite block

illumos/illumos-gate@b7edcb940884114e61382937505433c4c38c0278
https://github.com/illumos/illumos-gate/commit/b7edcb940884114e61382937505433c4c38c0278

https://www.illumos.org/issues/8378
  The problem is that zfs_get_data() supplies a stale zgd_bp to dmu_sync(), which
  we then nopwrite against.
  zfs_get_data() doesn't hold any DMU-related locks, so after it copies db_blkptr
  to zgd_bp, dbuf_write_ready() could change db_blkptr, and dbuf_write_done()
  could remove the dirty record. dmu_sync() then sees the stale BP and that the
  dbuf is not dirty, so it is eligible for nop-writing.
  The fix is for dmu_sync() to copy db_blkptr to zgd_bp after acquiring the
  db_mtx. We could still see a stale db_blkptr, but if it is stale then the dirty
  record will still exist and thus we won't attempt to nopwrite.

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

MFC after: 2 weeks

7 years ago8378 crash due to bp in-memory modification of nopwrite block
Andriy Gapon [Tue, 8 Aug 2017 10:44:48 +0000 (10:44 +0000)]
8378 crash due to bp in-memory modification of nopwrite block

illumos/illumos-gate@b7edcb940884114e61382937505433c4c38c0278
https://github.com/illumos/illumos-gate/commit/b7edcb940884114e61382937505433c4c38c0278

https://www.illumos.org/issues/8378
  The problem is that zfs_get_data() supplies a stale zgd_bp to dmu_sync(), which
  we then nopwrite against.
  zfs_get_data() doesn't hold any DMU-related locks, so after it copies db_blkptr
  to zgd_bp, dbuf_write_ready() could change db_blkptr, and dbuf_write_done()
  could remove the dirty record. dmu_sync() then sees the stale BP and that the
  dbuf is not dirty, so it is eligible for nop-writing.
  The fix is for dmu_sync() to copy db_blkptr to zgd_bp after acquiring the
  db_mtx. We could still see a stale db_blkptr, but if it is stale then the dirty
  record will still exist and thus we won't attempt to nopwrite.

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

7 years agoMFV r322221: 7910 l2arc_write_buffers() may write beyond target_sz
Andriy Gapon [Tue, 8 Aug 2017 10:43:41 +0000 (10:43 +0000)]
MFV r322221: 7910 l2arc_write_buffers() may write beyond target_sz

FreeBSD note: the essence of this change was committed to FreeBSD in
r314274.  This commit catches up with differences between what was
committed to FreeBSD and what was committed to OpenZFS, mainly more
logical variable names.

illumos/illumos-gate@16a7e5ac116c85d965007a5f201104b564e82210
https://github.com/illumos/illumos-gate/commit/16a7e5ac116c85d965007a5f201104b564e82210

https://www.illumos.org/issues/7910
  It seems that the change in issue #6950 resurrected the problem that was
  earlier fixed by the change in issue #5219.
  Please also see the following FreeBSD bug report:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216178

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

MFC after: 2 weeks

7 years ago7910 l2arc_write_buffers() may write beyond target_sz
Andriy Gapon [Tue, 8 Aug 2017 10:37:03 +0000 (10:37 +0000)]
7910 l2arc_write_buffers() may write beyond target_sz

illumos/illumos-gate@16a7e5ac116c85d965007a5f201104b564e82210
https://github.com/illumos/illumos-gate/commit/16a7e5ac116c85d965007a5f201104b564e82210

https://www.illumos.org/issues/7910
  It seems that the change in issue #6950 resurrected the problem that was
  earlier fixed by the change in issue #5219.
  Please also see the following FreeBSD bug report:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216178

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

7 years ago8416 abd.h is not C++ friendly
Andriy Gapon [Tue, 8 Aug 2017 10:31:42 +0000 (10:31 +0000)]
8416 abd.h is not C++ friendly

illumos/illumos-gate@5e2a074725cb7c16ea1c6554da11ab4d6b4e7aee
https://github.com/illumos/illumos-gate/commit/5e2a074725cb7c16ea1c6554da11ab4d6b4e7aee

https://www.illumos.org/issues/8416
  A C++ compiler fails to compile abd_is_linear(), which is an inline function
  defined in abd.h, with the following error:
       error: cannot initialize return object of type 'boolean_t' with an
       rvalue of type 'bool'
  That happens because a bool can not be converted to an enum in C++.
  That's a problem because abd.h can be visible through other header files that a
  C++ program that works with ZFS can include.

Reviewed by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Alek Pinchuk <pinchuk.alek@gmail.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

7 years agoMFV r322217: 8418 zfs_prop_get_table() call in zfs_validate_name() is a no-op
Andriy Gapon [Tue, 8 Aug 2017 10:30:49 +0000 (10:30 +0000)]
MFV r322217: 8418 zfs_prop_get_table() call in zfs_validate_name() is a no-op

illumos/illumos-gate@e09ba01dcda5e24964b8632718777b39166d86e4
https://github.com/illumos/illumos-gate/commit/e09ba01dcda5e24964b8632718777b39166d86e4

https://www.illumos.org/issues/8418
  The following line in zfs_validate_name() is just a no-op and it
  should be removed:
      108    (void) zfs_prop_get_table();

Reviewed by: Vitaliy Gusev <gusev.vitaliy@icloud.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Marcel Telka <marcel@telka.sk>

MFC after: 2 weeks

7 years ago8418 zfs_prop_get_table() call in zfs_validate_name() is a no-op
Andriy Gapon [Tue, 8 Aug 2017 10:28:01 +0000 (10:28 +0000)]
8418 zfs_prop_get_table() call in zfs_validate_name() is a no-op

illumos/illumos-gate@e09ba01dcda5e24964b8632718777b39166d86e4
https://github.com/illumos/illumos-gate/commit/e09ba01dcda5e24964b8632718777b39166d86e4

https://www.illumos.org/issues/8418
  The following line in zfs_validate_name() is just a no-op and it should be
  removed:
  108    (void) zfs_prop_get_table();

Reviewed by: Vitaliy Gusev <gusev.vitaliy@icloud.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Marcel Telka <marcel@telka.sk>

7 years agoMake test scripts under tests/... non-executable
Enji Cooper [Tue, 8 Aug 2017 04:59:16 +0000 (04:59 +0000)]
Make test scripts under tests/... non-executable

Executable bits should be set at install time instead of in the repo.
Setting executable bits on files triggers false positives with Phabricator.

MFC after: 2 months

7 years agoAdd round_jiffies_up(), local_clock() and __setup_timer() to the LinuxKPI.
Mark Johnston [Tue, 8 Aug 2017 04:34:02 +0000 (04:34 +0000)]
Add round_jiffies_up(), local_clock() and __setup_timer() to the LinuxKPI.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11871

7 years agoAdd macros for defining attribute groups and for WO and RW attributes.
Mark Johnston [Tue, 8 Aug 2017 04:30:22 +0000 (04:30 +0000)]
Add macros for defining attribute groups and for WO and RW attributes.

Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11872

7 years agoregex(3): Handle invalid {} constructs consistently and adjust tests
Kyle Evans [Tue, 8 Aug 2017 04:10:46 +0000 (04:10 +0000)]
regex(3): Handle invalid {} constructs consistently and adjust tests

Currently, regex(3) exhibits the following wrong behavior as demonstrated
with sed:

 - echo "a{1,2,3}b" | sed -r "s/{/_/"     (1)
 - echo "a{1,2,3}b" | sed "s/\}/_/"       (2)
 - echo "a{1,2,3}b" | sed -r "s/{}/_/"    (3)

Cases (1) and (3) should throw errors but they actually succeed, and (2)
throws an error when it should match the literal '}'. The correct behavior
was decided by comparing to the behavior with the equivalent BRE (1)(3) or
ERE (2) and consulting POSIX, along with some reasonable evaluation.
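
To check how a given libc treats these patterns, a small probe around the
POSIX regcomp()/regexec() calls is enough. The helper name regex_probe() is
hypothetical; cflags is 0 for a BRE and REG_EXTENDED for an ERE, matching
sed without and with -r.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical probe: returns -1 if regcomp() rejects the pattern,
 * 1 if the pattern matches somewhere in subject, and 0 otherwise
 * (a regexec() failure other than "no match" is also reported as 0).
 */
int
regex_probe(const char *pattern, int cflags, const char *subject)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && subject != NULL);
	if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
		return (-1);
	rc = regexec(&re, subject, 0, NULL, 0);
	regfree(&re);
	return (rc == 0 ? 1 : 0);
}
```

The three cases above correspond to regex_probe("{", REG_EXTENDED, ...),
regex_probe("\\}", 0, ...) and regex_probe("{}", REG_EXTENDED, ...) with
"a{1,2,3}b" as the subject. The results differ between regex implementations,
so none are asserted here.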

Tests were also adjusted/added accordingly.

PR: 166861
Reviewed by: emaste, ngie, pfg
Approved by: emaste (mentor)
MFC after: never
Differential Revision: https://reviews.freebsd.org/D10315

7 years agopgrep naively appends the delimiter to all PIDs including the last
Lawrence Stewart [Tue, 8 Aug 2017 00:31:10 +0000 (00:31 +0000)]
pgrep naively appends the delimiter to all PIDs including the last
e.g. "pgrep -d, getty" outputs "1399,1386,1309,1308,1307,1306,1305,1302,"
Ensure the list is correctly delimited by suppressing the emission of the
delimiter after the final PID.
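
A sketch of the delimiting logic, not the pgrep source: the helper name
join_pids() is hypothetical. It writes the delimiter before every PID except
the first, which has the same effect as suppressing it after the last.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: join n PIDs into buf, separated by delim.
 * Returns the resulting string length, or -1 if buf is too small.
 */
int
join_pids(char *buf, size_t bufsz, const long *pids, size_t n,
    const char *delim)
{
	size_t off = 0;
	size_t i;
	int w;

	assert(buf != NULL && bufsz > 0 && delim != NULL);
	buf[0] = '\0';
	for (i = 0; i < n; i++) {
		/* No delimiter in front of the first element. */
		w = snprintf(buf + off, bufsz - off, "%s%ld",
		    i == 0 ? "" : delim, pids[i]);
		if (w < 0 || (size_t)w >= bufsz - off)
			return (-1);
		off += (size_t)w;
	}
	return ((int)off);
}
```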

Reviewed by: imp, kib
MFC after: 1 week
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D8537

7 years ago- If available, use TRIM instead of ERASE for implementing BIO_DELETE.
Marius Strobl [Mon, 7 Aug 2017 23:33:05 +0000 (23:33 +0000)]
- If available, use TRIM instead of ERASE for implementing BIO_DELETE.
  This also involves adding a quirk table as TRIM is broken for some
  Kingston eMMC devices, though. Compared to ERASE (declared "legacy"
  in the eMMC specification v5.1), TRIM has the advantage of operating
  on write sectors rather than on erase sectors, which typically are
  of a much larger size. Thus, employing TRIM, we don't need to fiddle
  with coalescing BIO_DELETE requests that are also of (write) sector
  units into erase sectors, which might not even add up in all cases.
- For some SanDisk iNAND devices, the CMD38 argument, e. g. ERASE,
  TRIM etc., has to be specified via EXT_CSD[113], which now is also
  handled via a quirk.
- My initial understanding was that for eMMC partitions, the granularity
  should be used as erase sector size, e. g. 128 KB for boot partitions.
  However, rereading the relevant parts of the eMMC specification v5.1,
  this isn't actually correct. So drop the code which used partition
  granularities for delmaxsize and stripesize. For the most part, this
  change is a NOP, though, because a) for ERASE, mmcsd_delete() used
  the erase sector size unconditionally for all partitions anyway and
  b) g_disk_limit() doesn't actually take the stripesize into account.
- Take some more advantage of mmcsd_errmsg() in mmcsd(4) for making
  error codes human readable.

7 years agoEliminate useless adjustments of aliased device.
Warner Losh [Mon, 7 Aug 2017 22:42:46 +0000 (22:42 +0000)]
Eliminate useless adjustments of aliased device.

No need to set any fields in the cloned device. devfs uses symlinks,
so the adev entries returned won't be presented to the drivers. Since
we don't save copies, nothing else will see them. This code came from
the old compat code, and it appears to be obsolete or never needed.

Submitted by: kib@
Differential Review: https://reviews.freebsd.org/D11919

7 years agoRevert the parts of r322097 related to /etc/wall_cmos_clock handling as
Marius Strobl [Mon, 7 Aug 2017 21:38:10 +0000 (21:38 +0000)]
Revert the parts of r322097 related to /etc/wall_cmos_clock handling as
the previous behavior actually is required for setting up configurations
in which the RTC is using UTC but the timezone is not. Still, besides
uniform error handling, that file should get the same treatment in the
non-interactive variants supported by tzsetup(8).

7 years agoUPDATING: clarify what the RCMDS knob controls
Ed Maste [Mon, 7 Aug 2017 21:29:55 +0000 (21:29 +0000)]
UPDATING: clarify what the RCMDS knob controls

7 years agoIn debug mode, print the differences between the superblock and
Warner Losh [Mon, 7 Aug 2017 21:23:59 +0000 (21:23 +0000)]
In debug mode, print the differences between the superblock and
alternate superblock when the values disagree and we're going to
reject it.

Differential Revision: https://reviews.freebsd.org/D11589

7 years agoMake it possible to ignore superblock mismatch. This will not fix such
Warner Losh [Mon, 7 Aug 2017 21:23:54 +0000 (21:23 +0000)]
Make it possible to ignore superblock mismatch. This will not fix such
a mismatch, but will allow fsck to continue when the last alternate
superblock gets corrupted somehow.

Also, remove searching for alternate super blocks. It should have been
removed two years ago with r276737 by imp@. Leave minor vestiges in
place in case someone wants to solve the hard problem of knowing where
alternate superblocks live without access to data formerly stored in
disklabels.

Differential Revision: https://reviews.freebsd.org/D11589

7 years agoAdd nvd alias to nda nodes.
Warner Losh [Mon, 7 Aug 2017 21:12:43 +0000 (21:12 +0000)]
Add nvd alias to nda nodes.

All ndaX and ndaXpY nodes will appear as nvdX and nvdXpY as well
(through symlinks in devfs via the normal disk aliasing mechanism in
GEOM).

Differential Revision: https://reviews.freebsd.org/D11873

7 years agoExpose API to allow disks to ask for alias names in devfs.
Warner Losh [Mon, 7 Aug 2017 21:12:38 +0000 (21:12 +0000)]
Expose API to allow disks to ask for alias names in devfs.

Implement disk_add_alias to allow aliases to be added to disks. All
disks have a primary name (say "foo") and can also have secondary names
(say "bar") such that all instances of "foo" also have a "bar"
alias. So if you have foo0, foo0p1, foo1, foo1s1 and foo1s1a nodes
created by the foo driver and gpart, device nodes bar0, bar0p1, bar1,
bar1s1 and bar1s1a will appear as symlinks back to the original nodes.
This generalizes to multiple aliases. However, since the unit number
follows the primary name, multiple device drivers can't create the
same aliases unless those drivers coordinate the unit number space (e.g.
you couldn't add an alias 'disk' to both 'da' and 'ada' because it's
possible to have both da0 and ada0, which would make 'disk0' ambiguous).
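
The naming rule can be illustrated with a small string helper. This is not the
GEOM code and does not call disk_add_alias; alias_node_name() is a
hypothetical function that only shows how a node name maps to its alias and
why two primaries sharing one alias collide.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: rewrite a node name such as "foo1s1a" from its
 * primary prefix ("foo") to an alias prefix ("bar").  Returns 0 on
 * success, -1 if node does not belong to primary or out is too small.
 */
int
alias_node_name(char *out, size_t outsz, const char *node,
    const char *primary, const char *alias)
{
	size_t plen;
	int w;

	assert(out != NULL && node != NULL && primary != NULL &&
	    alias != NULL);
	plen = strlen(primary);
	if (strncmp(node, primary, plen) != 0)
		return (-1);
	/* The unit number must follow the primary name directly. */
	if (!isdigit((unsigned char)node[plen]))
		return (-1);
	w = snprintf(out, outsz, "%s%s", alias, node + plen);
	if (w < 0 || (size_t)w >= outsz)
		return (-1);
	return (0);
}
```

With the alias "disk" applied to both "da" and "ada", da0 and ada0 would both
map to disk0, which is the ambiguity the message describes.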

Differential Revision: https://reviews.freebsd.org/D11873

7 years agoAdd alias support to gpart.
Warner Losh [Mon, 7 Aug 2017 21:12:33 +0000 (21:12 +0000)]
Add alias support to gpart.

When we're creating new providers for each of the partitions, add
aliases to the geom before we create the provider so when geom_dev
tastes the provider, the aliases are in place so the proper /dev
entries are created. So foo5p6 gets created as an alias for bar5p6
when foo is an alias for bar in the geom we're partitioning with
g_part. This also copies aliases from the container geom (eg disk) to
the label geom (the disk with GPT partitioning) so that aliases nest
properly.

Differential Revision: https://reviews.freebsd.org/D11873

7 years agoAdd aliasing concept to geom.
Warner Losh [Mon, 7 Aug 2017 21:12:28 +0000 (21:12 +0000)]
Add aliasing concept to geom.

Add an alias name list to geoms. Use them in geom_dev to create
aliases. Previously, geom_dev would create a device node for the name
of the geom. Now, additional nodes are created pointing back to the
primary node with make_dev_alias_p. Aliases must be in place on the
geom before any tasting occurs.

Differential Revision: https://reviews.freebsd.org/D11873

7 years agogjournal is broken in handling its flush_queue. If we have 10 bio's
Kirk McKusick [Mon, 7 Aug 2017 19:40:03 +0000 (19:40 +0000)]
gjournal is broken in handling its flush_queue. If we have 10 bio's
in the flush_queue:
         1 2 3 4 5 6 7 8 9 10
and another 10 bio's go into the flush queue after only the first five
bio's are removed from the flush queue, the queue should look like:
         6 7 8 9 10 11 12 13 14 15 16 17 18 19 20,
but because of the bug we end up with
         6 11 12 13 14  15 16 17 18 19 20 7 8 9 10.
So the sequence of the bio's is damaged in the flush queue (and
therefore in the journal on disk!). This error can be triggered by
ffs_snapshot() when a block is read with readblock() and gjournal finds
this block in the broken flush queue before it goes to the correct
active queue.

The fix is to place all new blocks at the end of the queue.
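
The intended ordering can be shown with a plain FIFO that only ever appends at
the tail. This is a sketch, not the gjournal code (which queues struct bio);
the struct and function names here are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a queued bio and the flush queue. */
struct bio_node {
	int id;
	struct bio_node *next;
};

struct flush_queue {
	struct bio_node *head;
	struct bio_node *tail;
};

void
fq_init(struct flush_queue *q)
{
	q->head = q->tail = NULL;
}

/* New entries always go to the end, so arrival order is preserved. */
void
fq_insert_tail(struct flush_queue *q, struct bio_node *n)
{
	assert(q != NULL && n != NULL);
	n->next = NULL;
	if (q->tail == NULL)
		q->head = n;
	else
		q->tail->next = n;
	q->tail = n;
}

/* Remove and return the oldest entry, or NULL if the queue is empty. */
struct bio_node *
fq_remove_head(struct flush_queue *q)
{
	struct bio_node *n = q->head;

	if (n != NULL) {
		q->head = n->next;
		if (q->head == NULL)
			q->tail = NULL;
		n->next = NULL;
	}
	return (n);
}
```

Replaying the scenario above (queue 1 to 10, remove five, queue 11 to 20)
leaves 6 through 20 in order.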

Submitted by: Dr. Andreas Longwitz <longwitz@incore.de>
Discussed with: kib
MFC after: 1 week

7 years agosysctl kern.geom.journal.cache.limit shows negative value for FreeBSD/amd64
Kirk McKusick [Mon, 7 Aug 2017 19:18:27 +0000 (19:18 +0000)]
sysctl kern.geom.journal.cache.limit shows negative value for FreeBSD/amd64
system having over 4GB RAM. That's due to:

1) the limit being u_int instead of u_long like vm.kmem_size (the limit is
   half of vm.kmem_size by default for amd64);
2) sysctl handler g_journal_cache_limit_sysctl() using u_int instead of u_long.

The fix is to replace u_int with u_long for the kern.geom.journal.cache.limit
sysctl variable.
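
A small sketch of why a 32-bit variable can show up as negative. It assumes,
for illustration only, a limit of 3 GiB (half of a 6 GiB vm.kmem_size) held in
32 bits and then interpreted as signed; the helper name as_signed_32() is
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: truncate a byte count to 32 bits, then return
 * the value a signed 32-bit reader would see for that bit pattern.
 */
int32_t
as_signed_32(uint64_t bytes)
{
	uint32_t truncated = (uint32_t)bytes;	/* what a u_int can hold */
	int32_t result;

	/* Portable two's-complement reinterpretation, no overflow. */
	if (truncated >= 0x80000000u)
		result = (int32_t)(truncated - 0x80000000u) + INT32_MIN;
	else
		result = (int32_t)truncated;
	/* Same bit pattern; only the interpretation differs. */
	assert((uint32_t)result == truncated);
	return (result);
}
```

For 3 GiB the helper returns a negative number, while a 64-bit u_long holds
the same value unchanged.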

PR: 198500
Submitted by: Dr. Andreas Longwitz <longwitz@incore.de>
Reported by: Eugene Grosbein
Discussed with: kib
MFC after: 1 week

7 years agoRespect SIMPLE_BACKUP_SUFFIX environment variable in indent(1)
Kyle Evans [Mon, 7 Aug 2017 18:01:27 +0000 (18:01 +0000)]
Respect SIMPLE_BACKUP_SUFFIX environment variable in indent(1)

Instead of using a non-configurable ".BAK" suffix, respect the
SIMPLE_BACKUP_SUFFIX environment variable also used by patch(1). This
simplifies cleanup operations in some patch/indent workflows.
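
A sketch of the lookup, not the indent(1) source. The function names are
hypothetical, and treating an empty variable the same as an unset one is an
assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: build "<file><suffix>", falling back to ".BAK"
 * when suffix is NULL or empty.  Returns 0 on success, -1 if out is
 * too small.
 */
int
backup_name_with(char *out, size_t outsz, const char *file,
    const char *suffix)
{
	int w;

	assert(out != NULL && file != NULL);
	if (suffix == NULL || *suffix == '\0')
		suffix = ".BAK";	/* the previous fixed suffix */
	w = snprintf(out, outsz, "%s%s", file, suffix);
	return ((w < 0 || (size_t)w >= outsz) ? -1 : 0);
}

/* Same, taking the suffix from the environment as patch(1) does. */
int
backup_name(char *out, size_t outsz, const char *file)
{
	return (backup_name_with(out, outsz, file,
	    getenv("SIMPLE_BACKUP_SUFFIX")));
}
```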

Reviewed by: cem (earlier version), emaste, pstef
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D10921

7 years agoAvoid DI recursion when reclaim_pv_chunk() is called from
Konstantin Belousov [Mon, 7 Aug 2017 17:29:54 +0000 (17:29 +0000)]
Avoid DI recursion when reclaim_pv_chunk() is called from
pmap_advise() or pmap_remove().

Reported and tested by: pho (previous version)
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

7 years agoExplain why delayed invalidation is not required in pmap_protect() and
Konstantin Belousov [Mon, 7 Aug 2017 17:23:10 +0000 (17:23 +0000)]
Explain why delayed invalidation is not required in pmap_protect() and
pmap_remove_pages().

Submitted by: alc
MFC after: 1 week

7 years agoFollow-up to r321684 (Don't use libc++ when cross-building for gcc
Dimitry Andric [Mon, 7 Aug 2017 16:23:53 +0000 (16:23 +0000)]
Follow-up to r321684 (Don't use libc++ when cross-building for gcc
arches), and handle two more cases where libc++ includes could be
incorrectly enabled, in case the host compiler is clang 5.0.0, and the
target (cross) compiler is gcc 4.2.1.

Noted by: bdrewery
MFC after: 3 days
X-MFC-With: 321684

7 years agoFix hrtimer_active() in case of cancellation.
Alexander Motin [Mon, 7 Aug 2017 14:34:05 +0000 (14:34 +0000)]
Fix hrtimer_active() in case of cancellation.

While there, switch to FreeBSD internal callout active status.

Reviewed by: markj, hselasky
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D11900

7 years agoo Replace __riscv__ with __riscv
Ruslan Bukin [Mon, 7 Aug 2017 14:09:57 +0000 (14:09 +0000)]
o Replace __riscv__ with __riscv
o Replace __riscv64 with (__riscv && __riscv_xlen == 64)

This is required to support the new GCC 7.1 compiler.
This is compatible with the current GCC 6.1 compiler.

RISC-V is an extensible ISA and the idea here is to have a built-in define
for each extension, so together with __riscv we will have some subset
of these as well (depending on the -march string passed to the compiler):

__riscv_compressed
__riscv_atomic
__riscv_mul
__riscv_div
__riscv_muldiv
__riscv_fdiv
__riscv_fsqrt
__riscv_float_abi_soft
__riscv_float_abi_single
__riscv_float_abi_double
__riscv_cmodel_medlow
__riscv_cmodel_medany
__riscv_cmodel_pic
__riscv_xlen
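
A short sketch of how code can test these defines. Only the macros named in
this message are used; the function names are hypothetical, and on a
non-RISC-V compiler every branch falls through to the generic case.

```c
#include <assert.h>
#include <string.h>

/* Register width reported by the compiler, or 0 when not on RISC-V. */
int
riscv_xlen(void)
{
#if defined(__riscv)
	return (__riscv_xlen);
#else
	return (0);
#endif
}

/* Replacement for the old __riscv64 test. */
const char *
riscv_target(void)
{
#if defined(__riscv) && __riscv_xlen == 64
	return ("riscv64");
#elif defined(__riscv)
	return ("riscv (not 64-bit)");
#else
	return ("not riscv");
#endif
}

/* Per-extension define: set when -march includes the C extension. */
int
riscv_has_compressed(void)
{
#if defined(__riscv_compressed)
	return (1);
#else
	return (0);
#endif
}

/* Cross-check: the 64-bit label is used exactly when xlen is 64. */
void
riscv_self_check(void)
{
	assert((riscv_xlen() == 64) ==
	    (strcmp(riscv_target(), "riscv64") == 0));
}
```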

Reviewed by: ngie
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D11901

7 years agocxgbe(4): Add the T6 and T5 Unified Wire configuration files to the
Navdeep Parhar [Mon, 7 Aug 2017 14:04:19 +0000 (14:04 +0000)]
cxgbe(4): Add the T6 and T5 Unified Wire configuration files to the
kernel, just like for T4, when the driver is compiled into the kernel.

Reported by: mav@
MFC after: 3 days
Sponsored by: Chelsio Communications

7 years agoEnhance top(1) to filter on multiple usernames
Pietro Cerutti [Mon, 7 Aug 2017 08:45:08 +0000 (08:45 +0000)]
Enhance top(1) to filter on multiple usernames

Reviewed by: cognet, bapt
Approved by: cognet
MFC after: 1 week
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D11840

7 years agorfcomm_pppd.8: fix a typo (SPD -> SDP).
Andriy Voskoboinyk [Sun, 6 Aug 2017 21:54:43 +0000 (21:54 +0000)]
rfcomm_pppd.8: fix a typo (SPD -> SDP).

MFC after: 3 days

7 years agocxgbe(4): Avoid a NULL dereference that would occur during module unload
Navdeep Parhar [Sun, 6 Aug 2017 19:45:59 +0000 (19:45 +0000)]
cxgbe(4): Avoid a NULL dereference that would occur during module unload
if there were problems earlier during attach.

MFC after: 3 days
Sponsored by: Chelsio Communications

7 years agoRemove trivial comments. Remove and-ing with UINT_MAX for minor(),
Konstantin Belousov [Sun, 6 Aug 2017 12:27:20 +0000 (12:27 +0000)]
Remove trivial comments.  Remove and-ing with UINT_MAX for minor(); the
cast to int already does the required truncation of significant bits.

Requested and reviewed by: bde
Sponsored by: The FreeBSD Foundation

7 years agoRemove dead target introduced in r178828.
Cy Schubert [Sun, 6 Aug 2017 06:35:40 +0000 (06:35 +0000)]
Remove dead target introduced in r178828.

MFC after: 1 week

7 years agokrb5_err.h is generated from a .et file in kerberos5/lib/libkrb5.
Cy Schubert [Sun, 6 Aug 2017 06:31:47 +0000 (06:31 +0000)]
krb5_err.h is generated from a .et file in kerberos5/lib/libkrb5.
As kerberos5/lib/krb5 include files are already referenced it makes
no sense to generate it again here.

MFC after: 1 month

7 years agoMark each cpu in the appropriate cpuset_domain set. This allows devices to
Andrew Turner [Sat, 5 Aug 2017 20:57:34 +0000 (20:57 +0000)]
Mark each cpu in the appropriate cpuset_domain set. This allows devices to
handle cases where they can only run on a single domain.

To allow all devices access to this set we need to move reading the domain
earlier in the boot. It was previously handled in the CPU driver, but
that is too late for the GICv3 ITS driver.

Sponsored by: DARPA, AFRL

7 years agoAdd myself to the calendar.freebsd.
Andriy Voskoboinyk [Sat, 5 Aug 2017 19:57:45 +0000 (19:57 +0000)]
Add myself to the calendar.freebsd.

Reported by: mckusick

7 years agoDon't check result of chflags in f_flag_cleanup()
Enji Cooper [Sat, 5 Aug 2017 16:58:02 +0000 (16:58 +0000)]
Don't check result of chflags in f_flag_cleanup()

This will prevent false positives from occurring if the test is run on
ZFS since ZFS doesn't support fflags throbbing like UFS.

PR: 221189
MFC after: 4 days
MFC with: r321949

7 years ago- Move creation and unlinking of /etc/wall_cmos_clock from the handling
Marius Strobl [Sat, 5 Aug 2017 12:59:03 +0000 (12:59 +0000)]
- Move creation and unlinking of /etc/wall_cmos_clock from the handling
  of the initial UTC dialog to install_zoneinfo() so that file gets the
  necessary treatment also when that dialog is skipped via "-s", when
  selecting UTC from the time zone menu or on the command-line instead
  etc.
- Make the initial UTC dialog actually work by giving the relevant files
  the necessary treatment and then exit when choosing "Yes" there instead
  of moving on to the time zone menu regardless.
- Since r301131, /etc/localtime is also installed when selecting UTC in
  interactive configurations (which previously meant only via the time
  zone menu, though). Thus, the code added in r230298 which treats a
  NULL zone file name as UTC and removes /etc/localtime in that case can
  go again.
- Consistently refer to "could not delete" (as chosen by the oldest such
  code in here) when unlink(2) fails instead of a mixture of "delete"
  and "unlink" in error messages.

7 years agoadd myself to calendar.freebsd
Christoph Moench-Tegeder [Sat, 5 Aug 2017 10:03:47 +0000 (10:03 +0000)]
add myself to calendar.freebsd

Reported by: mckusick

7 years agoProvide more detailed specification for major(), minor() and makedev().
Konstantin Belousov [Sat, 5 Aug 2017 07:52:15 +0000 (07:52 +0000)]
Provide more detailed specification for major(), minor() and makedev().

Remove some statements which are no longer correct after ino64, and
clarify others.

The rewording is not in fact specific to ino64 and improvements are
useful on the stable branches.

Noted and reviewed by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

7 years agoDetect hypervisors early. We used to set lower hz on hypervisors by default
Jung-uk Kim [Sat, 5 Aug 2017 06:56:46 +0000 (06:56 +0000)]
Detect hypervisors early.  We used to set lower hz on hypervisors by default
but it was broken since r273800 (and r278522, its MFC to stable/10) because
identify_cpu() is called too late, i.e., after init_param1().

MFC after: 3 days

7 years agoloadpoolfile() implements a -R (NORESOLVE) option which is not listed
Cy Schubert [Sat, 5 Aug 2017 06:46:06 +0000 (06:46 +0000)]
loadpoolfile() implements a -R (NORESOLVE) option which is not listed
in usage(). This commit trues up usage() with loadpoolfile().

7 years agolibefi/time.c cstyle cleanup
Toomas Soome [Sat, 5 Aug 2017 05:20:03 +0000 (05:20 +0000)]
libefi/time.c cstyle cleanup

libefi/time.c is a mix of different styles; this update cleans it up.
Also fix 0 versus NULL, and zero the tv structure in case we get an error
from the UEFI firmware.

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D11861

7 years agoFix matching of NATed ICMP queries (resolving NATed MTU discovery).
Cy Schubert [Sat, 5 Aug 2017 00:28:42 +0000 (00:28 +0000)]
Fix matching of NATed ICMP queries (resolving NATed MTU discovery).

MFC after: 1 month

7 years agoSelectively print "hwaddr" from ifconfig(8).
Matt Joras [Fri, 4 Aug 2017 21:06:47 +0000 (21:06 +0000)]
Selectively print "hwaddr" from ifconfig(8).

ifconfig(8) printing the hwaddr is only really useful if it differs from
the link layer address.

Reported by: jhb
Reviewed by: rpokala
Approved by: rstone (mentor)
Differential Revision: https://reviews.freebsd.org/D11777