FreeBSD/FreeBSD.git
6 years ago Enable OF_setprop API function to add property in FDT
Marcin Wojtas [Thu, 10 Aug 2017 13:45:56 +0000 (13:45 +0000)]
Enable OF_setprop API function to add property in FDT

This patch modifies the function ofw_fdt_setprop (called by OF_setprop),
so that it can add a property when replacing is not possible.
Adding properties is needed to fix up FDTs that have missing
properties.

Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: nwhitehorn, cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11879

6 years ago Further revise r322327 and r322352 in release/packages/kernel.ucl.
Glen Barber [Thu, 10 Aug 2017 13:32:04 +0000 (13:32 +0000)]
Further revise r322327 and r322352 in release/packages/kernel.ucl.

Use PPID and PID to kill off the pre-install and parent pkg(8)
processes unless the user enters 'Y' or 'y' at the prompt to confirm
upgrading the kernel and userland at the same time.

This restores some of the logic and intent of r322327, with the
caveat of printing "child process terminated unexpectedly."

MFC after: 5 days
MFC with: r322327, r322352
Sponsored by: The FreeBSD Foundation

6 years ago Use integer type to pass around jiffies and/or ticks values in the
Hans Petter Selasky [Thu, 10 Aug 2017 13:05:40 +0000 (13:05 +0000)]
Use integer type to pass around jiffies and/or ticks values in the
LinuxKPI because in FreeBSD ticks are 32-bit.

MFC after: 1 week
Sponsored by: Mellanox Technologies

6 years ago Mark PROFILE option as broken when targeting mips64
Ed Maste [Thu, 10 Aug 2017 13:01:19 +0000 (13:01 +0000)]
Mark PROFILE option as broken when targeting mips64

The assembly in sys/mips/include/profile.h will only work for o32 ABI.

Submitted by: Alexander Richardson
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11950

6 years ago Fixes for wait event in the LinuxKPI. These are regression issues
Hans Petter Selasky [Thu, 10 Aug 2017 13:00:10 +0000 (13:00 +0000)]
Fixes for wait event in the LinuxKPI. These are regression issues
after r319757.

1) Correct the return value from __wait_event_common() from 1 to 0 in
case the timeout is specified as MAX_SCHEDULE_TIMEOUT. In the other
case __ret is zero and will be substituted in the last part of the
macro with the appropriate value before return.

2) Make sure the "timeout" argument is cast to "int" before
evaluating negativity. Otherwise the signedness of a "long" might be
checked instead of the signedness of an integer.

3) The wait_event() function should not have a return value.
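
Item 2 can be illustrated with a minimal sketch. It assumes a 64-bit type standing in for the "long" passed by Linux code, and a compiler that keeps the low 32 bits when an out-of-range value is converted to int (implementation-defined in C, and the behaviour of GCC and Clang). The helper names are hypothetical and do not appear in the LinuxKPI.

```c
#include <stdint.h>

/* Sign check on the wide value, as the caller passed it. */
static int
negative_as_wide(int64_t timeout)
{
	return (timeout < 0);
}

/*
 * Sign check after narrowing to int, which is what matters when the
 * value is later handled as a 32-bit tick count.  The conversion is
 * implementation-defined when the value does not fit in an int.
 */
static int
negative_as_int(int64_t timeout)
{
	return ((int)timeout < 0);
}
```

For a value such as 0x80000000 the two checks disagree: it is positive as a 64-bit quantity but negative once narrowed, which is why the cast has to happen before the test.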

Found by: KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies

6 years ago Make sure the linux_wait_event_common() function in the LinuxKPI properly
Hans Petter Selasky [Thu, 10 Aug 2017 12:51:04 +0000 (12:51 +0000)]
Make sure the linux_wait_event_common() function in the LinuxKPI properly
handles a timeout value of MAX_SCHEDULE_TIMEOUT which basically means there
is no timeout. This is a regression issue after r319757.

While at it change the type of returned variable from "long" to "int" to
match the actual return type.

MFC after: 1 week
Sponsored by: Mellanox Technologies

6 years ago Add myself.
Oleg Bulyzhin [Thu, 10 Aug 2017 12:31:55 +0000 (12:31 +0000)]
Add myself.

Reported by: mckusick

6 years ago Revise part of r322327 in release/packages/kernel.ucl.
Glen Barber [Thu, 10 Aug 2017 12:30:34 +0000 (12:30 +0000)]
Revise part of r322327 in release/packages/kernel.ucl.

It appears I misunderstand process forking and signal handling in
how the pre-/post-install scripts are executed internally by pkg(8).
In some cases (not all), ^C when prompted to cancel the kernel
package update will stop the pre-install script from executing, but
allow pkg(8) to continue extracting the package when it is not the
intent.

In order to keep somewhat of an anti-footshooting measure in place,
print the recommendation to install the kernel package first if
ASSUME_ALWAYS_YES is false and TERM is set, then sleep for 5 seconds
to allow the user to see the message.

MFC after: 5 days
MFC with: r322327
X-MFC-Note: Maybe not until I am happy with this..
Sponsored by: The FreeBSD Foundation

6 years ago Add two NFIT fields missed in r321298.
Alexander Motin [Thu, 10 Aug 2017 10:59:05 +0000 (10:59 +0000)]
Add two NFIT fields missed in r321298.

MFC after: 2 weeks

6 years ago calendars: add myself to the FreeBSD calendar
Roger Pau Monné [Thu, 10 Aug 2017 09:17:16 +0000 (09:17 +0000)]
calendars: add myself to the FreeBSD calendar

Reported by: mckusick

6 years ago x86: bump MAX_APIC_ID to 512
Roger Pau Monné [Thu, 10 Aug 2017 09:16:40 +0000 (09:16 +0000)]
x86: bump MAX_APIC_ID to 512

Introduce a new define to take into account the xAPIC ID limit, for
systems where x2APIC is not available/reliable.

Also change some of the usages of the APIC ID to use an unsigned int
(which is the correct storage type to deal with x2APIC IDs as found in
x2APIC MADT entries).

This allows booting FreeBSD on a box with 256 CPUs and APIC IDs up to
295:

FreeBSD/SMP: Multiprocessor System Detected: 256 CPUs
FreeBSD/SMP: 1 package(s) x 64 core(s) x 4 hardware threads
Package HW ID = 0
Core HW ID = 0
CPU0 (BSP): APIC ID: 0
CPU1 (AP/HT): APIC ID: 1
CPU2 (AP/HT): APIC ID: 2
CPU3 (AP/HT): APIC ID: 3
[...]
Core HW ID = 73
CPU252 (AP): APIC ID: 292
CPU253 (AP/HT): APIC ID: 293
CPU254 (AP/HT): APIC ID: 294
CPU255 (AP/HT): APIC ID: 295
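
The listing can be read with a small sketch. It assumes that the 4 hardware threads per core occupy the low 2 bits of the APIC ID, which is consistent with APIC ID 292 appearing under Core HW ID 73 above; the function names are illustrative and are not taken from the FreeBSD sources.

```c
#include <stdint.h>

#define THREAD_BITS	2u	/* assumption: 4 hardware threads per core */

/* Core number encoded in the upper bits of the APIC ID. */
static unsigned int
apic_core_id(unsigned int apic_id)
{
	return (apic_id >> THREAD_BITS);
}

/* Hardware thread number encoded in the low bits of the APIC ID. */
static unsigned int
apic_thread_id(unsigned int apic_id)
{
	return (apic_id & ((1u << THREAD_BITS) - 1));
}

/* An 8-bit ID field cannot hold values above 255. */
static int
fits_in_8_bits(unsigned int apic_id)
{
	return (apic_id <= UINT8_MAX);
}
```

Because IDs such as 295 do not fit in 8 bits, storing them in an unsigned int as the commit describes avoids silent truncation.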

Submitted by: kib (previous version)
Relnotes: yes
MFC after: 1 month
Reviewed by: kib
Differential revision: https://reviews.freebsd.org/D11913

6 years ago x86: make the arrays that depend on MAX_APIC_ID dynamic
Roger Pau Monné [Thu, 10 Aug 2017 09:16:03 +0000 (09:16 +0000)]
x86: make the arrays that depend on MAX_APIC_ID dynamic

So that MAX_APIC_ID can be bumped without wasting memory.

Note that the usage of MAX_APIC_ID in the SRAT parsing forces the
parser to allocate memory directly from the phys_avail physical memory
array, which is probably not the best approach, but I haven't found
any other way to allocate memory so early in boot. This memory is not
returned to the system afterwards, but at least it's sized according
to the maximum APIC ID found in the MADT table.
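
The sizing idea can be sketched in userland C. This is only an illustration: calloc() stands in for the early-boot allocation (the commit takes memory from phys_avail instead), and the function names and the int element type are assumptions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Return the largest APIC ID in a list of IDs discovered at boot. */
static unsigned int
find_max_apic_id(const unsigned int *ids, size_t n)
{
	unsigned int max = 0;

	for (size_t i = 0; i < n; i++) {
		if (ids[i] > max)
			max = ids[i];
	}
	return (max);
}

/*
 * Allocate one zeroed slot per possible APIC ID in 0..max_id, instead
 * of a static array sized for the compile-time maximum.
 */
static int *
alloc_apic_table(unsigned int max_id)
{
	return (calloc((size_t)max_id + 1, sizeof(int)));
}
```

With the IDs from the 256-CPU example, the table needs 296 slots rather than one per value up to the compile-time limit.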

Sponsored by: Citrix Systems R&D
MFC after: 1 month
Reviewed by: kib
Differential revision: https://reviews.freebsd.org/D11912

6 years ago apic_enumerator: only set mp_ncpus and mp_maxid at probe cpus phase
Roger Pau Monné [Thu, 10 Aug 2017 09:15:18 +0000 (09:15 +0000)]
apic_enumerator: only set mp_ncpus and mp_maxid at probe cpus phase

Populate the lapics arrays and call cpu_add/lapic_create in the setup
phase instead. Also store the max APIC ID found in the newly
introduced max_apic_id global variable.

This is a requirement in order to make the static arrays currently
using MAX_LAPIC_ID dynamic.

Sponsored by: Citrix Systems R&D
MFC after: 1 month
Reviewed by: kib
Differential revision: https://reviews.freebsd.org/D11911

6 years ago Don't leak mbufs if clusters exceeds the number of segments. This would
Sean Bruno [Thu, 10 Aug 2017 03:43:23 +0000 (03:43 +0000)]
Don't leak mbufs if clusters exceeds the number of segments.  This would
leak mbufs over time causing crashes.

PR: 221202
Submitted by: Matt Macy <matt@mattmacy.io>
Reported by: gergely.czuczy@harmless.hu
Sponsored by: Limelight Networks

6 years ago Export IFCAP_HWSTATS so that we don't experience double stats counting
Sean Bruno [Thu, 10 Aug 2017 03:11:05 +0000 (03:11 +0000)]
Export IFCAP_HWSTATS so that we don't experience double stats counting
on iflib enabled devices.

PR: 220198
Submitted by: Matt Macy <matt@mattmacy.io>
Reported by: Ben Woods <woodsb02@freebsd.org>
Sponsored by: Limelight Networks

6 years ago Add sbruno@ birthday information.
Sean Bruno [Thu, 10 Aug 2017 02:55:22 +0000 (02:55 +0000)]
Add sbruno@ birthday information.

Reported by: mckusick

6 years ago Add myself (rlibby) to calendar.freebsd
Ryan Libby [Thu, 10 Aug 2017 02:15:40 +0000 (02:15 +0000)]
Add myself (rlibby) to calendar.freebsd

Reported by: mckusick
Approved by: markj (mentor)
Differential Revision: https://reviews.freebsd.org/D11947

6 years ago Pick 'Remove external linkage for spin_adaptive' from upstream jemalloc
Ryan Libby [Wed, 9 Aug 2017 22:58:42 +0000 (22:58 +0000)]
Pick 'Remove external linkage for spin_adaptive' from upstream jemalloc

Apply the changes from upstream jemalloc 048c6679.  This is actually not
quite a cherry pick due to makefile difference and because FreeBSD does
not carry the msvc project files which were also modified in that
commit.

Approved by: jasone (maintainer), markj (mentor)
Sponsored by: Dell EMC Isilon

6 years ago Provide compile option to choose receive processing in either Ithread or Taskqueue Thread.
David C Somayajulu [Wed, 9 Aug 2017 22:18:49 +0000 (22:18 +0000)]
Provide compile option to choose receive processing in either Ithread or Taskqueue Thread.

6 years ago Add myself to calendar.freebsd
Kyle Evans [Wed, 9 Aug 2017 21:44:55 +0000 (21:44 +0000)]
Add myself to calendar.freebsd

Requested by: mckusick
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D11936

6 years ago i386/boot2: -fno-asynchronous-unwind-tables for gcc
Ryan Libby [Wed, 9 Aug 2017 20:13:49 +0000 (20:13 +0000)]
i386/boot2: -fno-asynchronous-unwind-tables for gcc

The amd64 build of boot2 was failing with gcc 6.3.0 due to being more
than 1 kB too large. It was apparently generating a .eh_frame section
which was not being removed by objcopy -S. The .eh_frame section seems
to be mandatory per the amd64 ABI, but boot2 is compiled for i386 (uses
-m32), so the section should be optional in this context. Suppress
generation of .eh_frame with the -fno-asynchronous-unwind-tables flag to
gcc. This saves 1348 bytes (the limit is 7680 bytes).

Reviewed by: dim, imp
Approved by: markj (mentor)
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11928

6 years ago Make user supplied data checks a bit stricter.
Andrey V. Elsukov [Wed, 9 Aug 2017 19:58:38 +0000 (19:58 +0000)]
Make user supplied data checks a bit stricter.

key_msg2sp() is used for parsing data from the setsockopt(IP[V6]_IPSEC_POLICY)
call. This socket option is usually used to configure IPsec bypass for
a socket. Only a privileged user can set this socket option.
The message syntax is described here:
http://www.kame.net/newsletter/20021210/

Our libipsec is usually used to create the correct request.
Add additional checks:
* that sadb_x_ipsecrequest_len is not out of bounds of user supplied buffer
* that src/dst's sa_len is the same
* that 2*sa_len is not out of bounds of user supplied buffer
* that 2*sa_len fits into bounds of sadb_x_ipsecrequest
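
The four checks above can be sketched as a standalone predicate. This is a simplified model, not the kernel's key_msg2sp(): the function name, the parameter list, and the idea of passing the lengths as plain size_t values are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * buflen:     size of the user supplied buffer
 * hdr_len:    size of the fixed request header
 * req_len:    length the request claims for itself
 * src_sa_len, dst_sa_len: claimed lengths of the two addresses
 */
static bool
ipsecrequest_bounds_ok(size_t buflen, size_t hdr_len, size_t req_len,
    size_t src_sa_len, size_t dst_sa_len)
{
	/* The request must hold its header and stay inside the buffer. */
	if (req_len < hdr_len || req_len > buflen)
		return (false);
	/* Source and destination addresses must have the same length. */
	if (src_sa_len != dst_sa_len)
		return (false);
	/* Both addresses must fit in the user supplied buffer. */
	if (src_sa_len > (buflen - hdr_len) / 2)
		return (false);
	/* Both addresses must also fit inside the request itself. */
	if (src_sa_len > (req_len - hdr_len) / 2)
		return (false);
	return (true);
}
```

The comparisons divide the remaining space by two rather than doubling the address length, so a large claimed length cannot overflow the arithmetic. The third check is implied by the fourth once the first has passed; it is kept here only to mirror the list.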

Reported by: Ilja van Sprundel
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11796

6 years ago Add a dependency on the kernel package for the runtime package.
Glen Barber [Wed, 9 Aug 2017 19:16:54 +0000 (19:16 +0000)]
Add a dependency on the kernel package for the runtime package.

The idea here is that, provided upstream pkg(8) maintainers accept
the proposed change, the kernel.ucl will contain a post-install
script causing pkg(8) to emit a message informing the user to reboot the
system after the kernel is upgraded using 'pkg upgrade', so the
new userland is installed on the running new kernel.  At present,
this functionality does not exist in pkg(8), but will help ensure
the upgrade path follows that from UPDATING.  To work around this
for now, evaluate ASSUME_ALWAYS_YES, and prompt the user if they
wish to proceed if not set to true.

Since there is a kernel dependency, and a non-GENERIC kernel may
be in use, update Makefile.inc1 to replace '%KERNCONF%' in the
runtime.ucl with the first-built kernel set either via command line
or in make.conf(5).

MFC after: 5 days
Sponsored by: The FreeBSD Foundation

6 years ago lldb: Make i386-*-freebsd expression work on JIT path
Ed Maste [Wed, 9 Aug 2017 19:09:23 +0000 (19:09 +0000)]
lldb: Make i386-*-freebsd expression work on JIT path

* Enable i386 ABI creation for freebsd
* Added an extra argument in ABISysV_i386::PrepareTrivialCall for mmap
  syscall
* Unlike Linux, the last argument of mmap is actually 64-bit (off_t).
  This requires us to push an additional word for the higher order bits.
* Prior to this change, a ktrace dump would show mmap failures due to
  an invalid argument coming from the 6th mmap argument.
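
The "additional word" can be sketched as follows. This only shows how a 64-bit offset splits into two 32-bit stack words; the struct and function names are hypothetical, and the order in which the words are pushed depends on the calling convention, which is not modelled here.

```c
#include <stdint.h>

/* The two 32-bit words that together carry a 64-bit off_t argument. */
struct off_words {
	uint32_t lo;	/* low-order 32 bits */
	uint32_t hi;	/* high-order 32 bits */
};

static struct off_words
split_off64(int64_t off)
{
	uint64_t u = (uint64_t)off;
	struct off_words w;

	w.lo = (uint32_t)(u & 0xffffffffu);
	w.hi = (uint32_t)(u >> 32);
	return (w);
}
```

If only one word is supplied for the offset, the callee reads whatever follows on the stack as the other half, which matches the invalid 6th argument described above.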

Submitted by: Karnajit Wangkhem
Differential Revision: https://reviews.llvm.org/D34776

6 years ago cat: fix build with -DNO_UDOM_SUPPORT
Ed Maste [Wed, 9 Aug 2017 18:23:46 +0000 (18:23 +0000)]
cat: fix build with -DNO_UDOM_SUPPORT

Sponsored by: The FreeBSD Foundation

6 years ago capsicum_helpers: Add FIODTYPE to default ioctls allowed
Kyle Evans [Wed, 9 Aug 2017 18:15:07 +0000 (18:15 +0000)]
capsicum_helpers: Add FIODTYPE to default ioctls allowed

FIODTYPE will be needed by hexdump(1) to speed up the -s flag on devices
that should be able to support fseek(3); specifically, in an attempt to
correct for the fact that most tape drives don't support seeking yet don't
indicate this when fseeko(3) is invoked. Related: D10939

Reviewed by: cem, emaste, oshogbo
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D10937

6 years ago Split identify_cpu() into two functions for amd64 as we do for i386. This
Jung-uk Kim [Wed, 9 Aug 2017 18:09:09 +0000 (18:09 +0000)]
Split identify_cpu() into two functions for amd64 as we do for i386.  This
reduces diff between amd64 and i386.  Also, it fixes a regression introduced
in r322076, i.e., identify_hypervisor() failed to identify some hypervisors.
This function assumes cpu_feature2 is already initialized.

Reported by: dexuan
Tested by: dexuan

6 years ago libusb(3): Expose device caps as libusb_bos_descriptor::dev_capability
Kyle Evans [Wed, 9 Aug 2017 18:06:27 +0000 (18:06 +0000)]
libusb(3): Expose device caps as libusb_bos_descriptor::dev_capability

Some libusb consumers in Linux-land (in this case, libusb4java) expect a
dev_capability member that they can use to enumerate the device
capabilities.

No particular layout is expected of this, just that it can be traversed
using the bLength member until bNumDeviceCapabilities are read and that the
consumer may then use one of the libusb_get_*_descriptor methods to extract
specific (usb 2.0 vs. ss) capability information.
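
The traversal described above can be sketched without libusb itself. The function below is a hypothetical helper, not part of the libusb API: it walks a raw byte buffer in which each capability descriptor starts with its bLength byte, and it assumes a minimum descriptor size of 3 bytes (bLength, bDescriptorType, bDevCapabilityType).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Step through `want` capability descriptors stored back to back in
 * buf.  Returns the number of descriptors walked, or -1 if the data
 * runs out or a bLength value is implausible.
 */
static int
count_dev_caps(const uint8_t *buf, size_t buflen, unsigned int want)
{
	size_t off = 0;
	unsigned int seen = 0;

	while (seen < want) {
		if (off >= buflen)
			return (-1);		/* ran out of data */
		uint8_t blen = buf[off];	/* bLength of this entry */
		if (blen < 3 || blen > buflen - off)
			return (-1);		/* too short or overruns */
		off += blen;
		seen++;
	}
	return ((int)seen);
}
```

Rejecting a zero or undersized bLength matters in practice, because advancing by zero bytes would otherwise loop forever on malformed data.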

In collaboration with: hselasky
Reviewed by: hselasky
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D11494

6 years ago Plug uninitialized stack variable leak in sendfile(2).
Gleb Smirnoff [Wed, 9 Aug 2017 17:48:38 +0000 (17:48 +0000)]
Plug uninitialized stack variable leak in sendfile(2).

Reported by: Ilja Van Sprundel <ivansprundel ioactive.com>
Submitted by: Domagoj Stolfa <domagoj.stolfa gmail.com>
MFC after: 1 week
Security: uninitialized stack variable leak

6 years ago Upgrade our copies of clang, llvm and libc++ to r310316 from the
Dimitry Andric [Wed, 9 Aug 2017 17:32:39 +0000 (17:32 +0000)]
Upgrade our copies of clang, llvm and libc++ to r310316 from the
upstream release_50 branch.

MFC after: 2 months
X-MFC-with: r321369

6 years ago Also provide a warning for geom_fox.
Warner Losh [Wed, 9 Aug 2017 16:37:37 +0000 (16:37 +0000)]
Also provide a warning for geom_fox.

Differential Review: https://reviews.freebsd.org/D11935
Requested by: jhb@
MFC After: 3 days

6 years ago Mark geom classes as deprecated.
Warner Losh [Wed, 9 Aug 2017 16:15:24 +0000 (16:15 +0000)]
Mark geom classes as deprecated.

geom_bsd, geom_mbr and geom_sunlabel have been obsolete since Marcel
Moolenaar's geom_part was in FreeBSD 7. They haven't been in GENERIC
since FreeBSD 8. Add warning when used.

geom_vol_ffs has been obsolete since ufs support to geom_label was
committed in FreeBSD 5. It hasn't been in GENERIC since FreeBSD 5.
Add warning when used.

geom_fox has been obsolete since gmultipath was committed in FreeBSD 7.
(no warning added, since this is a very obscure class).

These will all be removed in FreeBSD 12.

MFC After: 3 days
Differential Revision: https://reviews.freebsd.org/D11935

Note: Classes will be removed after MFC

6 years ago Missing remnant of 322309.
Alexander Motin [Wed, 9 Aug 2017 13:46:16 +0000 (13:46 +0000)]
Missing remnant of 322309.

MFC after: 1 week

6 years ago Add birthday information for jonathan@.
Jonathan Anderson [Wed, 9 Aug 2017 13:25:27 +0000 (13:25 +0000)]
Add birthday information for jonathan@.

As requested by mckusick@...

6 years ago Add to if_enc(4) ability to capture packets via BPF after pfil processing.
Andrey V. Elsukov [Wed, 9 Aug 2017 12:24:07 +0000 (12:24 +0000)]
Add to if_enc(4) ability to capture packets via BPF after pfil processing.

New flag 0x4 can be configured in net.enc.[in|out].ipsec_bpf_mask.
When it is set, if_enc(4) additionally captures a packet via BPF after
invoking pfil hook. This may be useful for debugging.

MFC after: 2 weeks
Sponsored by: Yandex LLC
Differential Revision: https://reviews.freebsd.org/D11804

6 years ago Use "Ibex Peak" codename for "5 Series/3400 Series" chipsets.
Alexander Motin [Wed, 9 Aug 2017 12:21:17 +0000 (12:21 +0000)]
Use "Ibex Peak" codename for "5 Series/3400 Series" chipsets.

This is shorter and unifies naming with later chipsets.

MFC after: 1 week

6 years ago Add new Intel Lewisburg and Union Point chipset PCI IDs.
Alexander Motin [Wed, 9 Aug 2017 12:03:12 +0000 (12:03 +0000)]
Add new Intel Lewisburg and Union Point chipset PCI IDs.

While there, polish some old AHCI ones, since they are still reused.

MFC after: 1 week

6 years ago Fix comment typo.
Oleg Bulyzhin [Wed, 9 Aug 2017 10:46:34 +0000 (10:46 +0000)]
Fix comment typo.

6 years ago Print maximum MTU when trying to set invalid MTU in the mlx4en(4) driver.
Hans Petter Selasky [Wed, 9 Aug 2017 10:32:51 +0000 (10:32 +0000)]
Print maximum MTU when trying to set invalid MTU in the mlx4en(4) driver.
Useful for debugging.

Submitted by: Sepherosa Ziehau <sephe@dragonflybsd.org>
MFC after: 3 days
Sponsored by: Mellanox Technologies

6 years ago Increment queue drops in the network statistics when transmitted packets
Hans Petter Selasky [Wed, 9 Aug 2017 10:30:55 +0000 (10:30 +0000)]
Increment queue drops in the network statistics when transmitted packets
are dropped by the mlx4en(4) driver.

Submitted by: Sepherosa Ziehau <sephe@dragonflybsd.org>
MFC after: 3 days
Sponsored by: Mellanox Technologies

6 years ago Add support for RX and TX statistics when the mlx4en(4) PCI device
Hans Petter Selasky [Wed, 9 Aug 2017 10:27:21 +0000 (10:27 +0000)]
Add support for RX and TX statistics when the mlx4en(4) PCI device
is in VF or SRIOV mode typically in a virtual machine environment.

Submitted by: Sepherosa Ziehau <sephe@dragonflybsd.org>
MFC after: 3 days
Sponsored by: Mellanox Technologies

6 years ago Do not lose CCB flags after r320493.
Alexander Motin [Wed, 9 Aug 2017 09:13:15 +0000 (09:13 +0000)]
Do not lose CCB flags after r320493.

There is at least CAM_UNLOCKED that should be kept.

MFC after: 3 days

6 years ago Correct sysctl names.
Dag-Erling Smørgrav [Wed, 9 Aug 2017 07:24:58 +0000 (07:24 +0000)]
Correct sysctl names.

6 years ago hyperv/hn: Implement transparent mode network VF.
Sepherosa Ziehau [Wed, 9 Aug 2017 05:59:45 +0000 (05:59 +0000)]
hyperv/hn: Implement transparent mode network VF.

How network VF works with hn(4) on Hyper-V in transparent mode:

- Each network VF has a corresponding hn(4).
- The network VF and its corresponding hn(4) have the same hardware
  address.
- Once the network VF is attached, the corresponding hn(4) waits several
  seconds to make sure that the network VF attach routine completes, then:
  o  Set the intersection of the network VF's if_capabilities and the
     corresponding hn(4)'s if_capabilities to the corresponding hn(4)'s
     if_capabilities.  And adjust the corresponding hn(4) if_capable and
     if_hwassist accordingly. (*)
  o  Make sure that the corresponding hn(4)'s TSO parameters meet the
     constraints posed by both the network VF and the corresponding hn(4).
     (*)
  o  The network VF's if_input is overridden.  The overriding if_input
     changes the input packet's rcvif to the corresponding hn(4).  The
     network layers are tricked into thinking that all packets are
     received by the corresponding hn(4).
  o  If the corresponding hn(4) was brought up, bring up the network VF.
     Transmissions dispatched to the corresponding hn(4) are
     redispatched to the network VF.
  o  Bringing down the corresponding hn(4) also brings down the network
     VF.
  o  All IOCTLs issued to the corresponding hn(4) are passed through to
     the network VF; the corresponding hn(4) changes its internal state
     if necessary.
  o  The media status of the corresponding hn(4) solely relies on the
     network VF.
  o  If there are multicast filters on the corresponding hn(4), allmulti
     will be enabled on the network VF. (**)
- Once the network VF is detached, undo all the changes made to the
  corresponding hn(4) in the above item.
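
The first two sub-items of the attach step amount to simple combining rules, sketched below. The function names are hypothetical and do not come from the hn(4) driver, and treating the TSO constraint as a single maximum size where the tighter limit wins is an assumption; the commit message does not spell out the exact parameters.

```c
#include <stdint.h>

/* Keep only the capability bits that both interfaces advertise. */
static uint32_t
shared_capabilities(uint32_t vf_caps, uint32_t hn_caps)
{
	return (vf_caps & hn_caps);
}

/* Assumed rule: the smaller of the two TSO size limits applies. */
static uint32_t
shared_tso_limit(uint32_t vf_max, uint32_t hn_max)
{
	return (vf_max < hn_max ? vf_max : hn_max);
}
```

Advertising only the intersection means the upper layers never request an offload that one of the two paths cannot honour.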

NOTE:
No operation should be issued directly to the network VF, if the
network VF transparent mode is enabled.  The network VF transparent mode
can be enabled by setting tunable hw.hn.vf_transparent to 1.  The network
VF transparent mode is _not_ enabled by default, as of this commit.

The benefit of the network VF transparent mode is that the network VF
attachment and detachment are transparent to all network layers; e.g. live
migration detaches and reattaches the network VF.

The major drawbacks of the network VF transparent mode:
- The netmap(4) support is lost, even if the VF supports it.
- ALTQ does not work, since if_start method cannot be properly supported.

(*)
These decisions were made so that things will not be messed up too much
during the transition period.

(**)
This does _not_ need to go through the fancy multicast filter management
machinery that vlan(4) has, at least currently:
- As of this writing, multicast does not work in Azure.
- As of this writing, multicast packets go through the corresponding hn(4).

MFC after: 3 days
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D11803

6 years ago Add an entry to UPDATING for r322297 which restores the ability
Kirk McKusick [Wed, 9 Aug 2017 05:21:57 +0000 (05:21 +0000)]
Add an entry to UPDATING for r322297 which restores the ability
of fsck to automatically find alternate superblocks when the
standard one is trashed or unavailable.

MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D11589

6 years ago Since the switch to GPT disk labels, fsck for UFS/FFS has been
Kirk McKusick [Wed, 9 Aug 2017 05:17:21 +0000 (05:17 +0000)]
Since the switch to GPT disk labels, fsck for UFS/FFS has been
unable to automatically find alternate superblocks. This checkin
places the information needed to find alternate superblocks at the
end of the area reserved for the boot block.

Filesystems created with a newfs of this vintage or later will
create the recovery information. If you have a filesystem created
prior to this change and wish to have a recovery block created for
your filesystem, you can do so by running fsck in foreground mode
(i.e., do not use the -p or -y options). As it starts, fsck will
ask ``SAVE DATA TO FIND ALTERNATE SUPERBLOCKS'' to which you should
answer yes.

Discussed with: kib, imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D11589

6 years ago Introduce vm_page_grab_pages(), which is intended to replace loops calling
Alan Cox [Wed, 9 Aug 2017 04:23:04 +0000 (04:23 +0000)]
Introduce vm_page_grab_pages(), which is intended to replace loops calling
vm_page_grab() on consecutive page indices.  Besides simplifying the code
in the caller, vm_page_grab_pages() allows for batching optimizations.
For example, the current implementation replaces calls to vm_page_lookup()
on consecutive page indices by cheaper calls to vm_page_next().

Reviewed by: kib, markj
Tested by: pho (an earlier version)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D11926

6 years ago Update pl310 node in Armada 38x DTS to match the one used in Linux
Marcin Wojtas [Wed, 9 Aug 2017 01:31:05 +0000 (01:31 +0000)]
Update pl310 node in Armada 38x DTS to match the one used in Linux

Since the cache controller nodes fixup is added to the platform code,
this patch aligns it to the Linux device tree representation.

Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11884

6 years ago Enable pl310 coherent operation in platform init for Armada 38x
Marcin Wojtas [Wed, 9 Aug 2017 01:25:47 +0000 (01:25 +0000)]
Enable pl310 coherent operation in platform init for Armada 38x

Updating the PL310 software context sc_io_coherent field in the
platform_pl310_init() routine for Armada 38x helps to avoid
using the 'arm,io-coherent' property, which is by default not present
in the device tree node in Linux.

This is another step toward DT unification between the two operating
systems. The improvement will also work after enabling
PLATFORM for Marvell ARMv7 SoCs.

Reviewed by: andrew, cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11883

6 years ago df(1): Add --si as an alias for -H
Kyle Evans [Wed, 9 Aug 2017 01:24:52 +0000 (01:24 +0000)]
df(1): Add --si as an alias for -H

Reviewed by: cem (earlier version), emaste
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D11749

6 years ago Remove clock-frequency properties from Armada 38x timer nodes
Marcin Wojtas [Wed, 9 Aug 2017 01:20:53 +0000 (01:20 +0000)]
Remove clock-frequency properties from Armada 38x timer nodes

Since the timers' base frequency setting is added to the platform code,
this patch removes clock-frequency properties from global
and twd timers, aligning both to the Linux device tree.

Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11882

6 years ago du(1): Add --si option to display in terms of powers of 1000
Kyle Evans [Wed, 9 Aug 2017 01:19:19 +0000 (01:19 +0000)]
du(1): Add --si option to display in terms of powers of 1000

Reviewed by: cem (earlier version), emaste
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D11748

6 years ago Dynamically configure timers' base frequency for Armada 38x
Marcin Wojtas [Wed, 9 Aug 2017 01:14:29 +0000 (01:14 +0000)]
Dynamically configure timers' base frequency for Armada 38x

Instead of using 'clock-frequency' device tree property for global/twd
mpcore timers of Armada 38x SoCs, set it in platform_late_init stage
with arm_tmr_change_frequency() function.

Reviewed by: cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11881

6 years ago Enable using ofw_bus_find_compatible in early platform code
Marcin Wojtas [Wed, 9 Aug 2017 01:06:40 +0000 (01:06 +0000)]
Enable using ofw_bus_find_compatible in early platform code

Before this patch, ofw_bus_find_compatible() allocated memory in order
to find the compatible node and the property's length. This guaranteed
a suitably sized buffer for the property, but it also meant that
ofw_bus_find_compatible() could not be used when malloc is not
available, e.g. during the FDT fixup stage.

To remove this limitation, this patch modifies the function to use
ofw_bus_node_is_compatible_int() (instead of the variant without the
_int suffix), which uses a fixed buffer on the stack instead of dynamic
allocations.
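
The allocation-free approach can be sketched as follows. This is not the FreeBSD implementation; the function name is hypothetical. It assumes the caller has already read the "compatible" property, a sequence of NUL-terminated strings, into a buffer of known length such as a fixed array on the stack.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if `want` is one of the NUL-terminated entries packed
 * into list[0..len).  No memory is allocated.
 */
static bool
compat_list_contains(const char *list, size_t len, const char *want)
{
	size_t off = 0;

	while (off < len) {
		/* Find the terminator of the current entry. */
		const char *end = memchr(list + off, '\0', len - off);
		if (end == NULL)
			return (false);	/* unterminated trailing entry */
		if (strcmp(list + off, want) == 0)
			return (true);
		off = (size_t)(end - list) + 1;
	}
	return (false);
}
```

A caller in early boot code could declare something like `char buf[255];` on the stack, fill it from the node's property, and pass the number of bytes read as `len`; the buffer size here is an arbitrary example.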

Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: nwhitehorn, cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11880

6 years ago regex(3): Refactor fast/slow stepping bits in the matching engine
Kyle Evans [Wed, 9 Aug 2017 01:04:36 +0000 (01:04 +0000)]
regex(3): Refactor fast/slow stepping bits in the matching engine

Adding features for matching is fairly straightforward, but this requires
some duplication because of this fast/slow setup. They can be fairly
trivially combined into a single walk(), so do it to make future additions
less error prone.

Reviewed by: cem (earlier version), emaste, pfg
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D11233

6 years ago Add support for "compatible" parameter in ofw_fdt_fixup
Marcin Wojtas [Wed, 9 Aug 2017 00:56:29 +0000 (00:56 +0000)]
Add support for "compatible" parameter in ofw_fdt_fixup

Sometimes it's convenient to provide a fixup for many boards
that use the same SoC family (e.g. Marvell Armada 38x).
Instead of putting multiple entries in fdt_fixup_table,
use one entry which refers to all boards with the given SoC.

Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: nwhitehorn, cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11878

6 years ago Restore original /soc ranges on Marvell Armada 38x boards
Marcin Wojtas [Wed, 9 Aug 2017 00:51:45 +0000 (00:51 +0000)]
Restore original /soc ranges on Marvell Armada 38x boards

Because fdt_get_ranges can now process multiple 'ranges' entries,
restoring the ranges from the original Linux device trees is possible.

Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11877

6 years ago Enable parsing simple-bus 'ranges' with multiple entries
Marcin Wojtas [Wed, 9 Aug 2017 00:45:25 +0000 (00:45 +0000)]
Enable parsing simple-bus 'ranges' with multiple entries

This patch makes it possible to boot with up to 8 ranges in soc.
Dynamic allocation cannot be used, because the fdt_get_ranges
function is called early, when malloc is not available.

Change is required for the alignment of Marvell Armada 38x
device trees present in sys/gnu/dts/arm - originally
the platform has 6 entries in simple-bus 'ranges'.
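
A fixed-capacity parser of this kind can be sketched as below. The names and the layout are illustrative assumptions: each entry is modelled as three 32-bit cells (bus address, host address, size), whereas real device trees size these fields from #address-cells and #size-cells. Only the limit of 8 comes from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES	8	/* limit stated in the commit message */

struct range {
	uint32_t bus;
	uint32_t host;
	uint32_t size;
};

/*
 * Copy the (bus, host, size) triples in `cells` into the caller's
 * fixed array.  Returns the number of entries, or -1 if the cell
 * count is not a multiple of 3 or exceeds MAX_RANGES entries.
 */
static int
parse_ranges(const uint32_t *cells, size_t ncells,
    struct range out[MAX_RANGES])
{
	size_t n;

	if (ncells % 3 != 0)
		return (-1);
	n = ncells / 3;
	if (n > MAX_RANGES)
		return (-1);
	for (size_t i = 0; i < n; i++) {
		out[i].bus = cells[3 * i];
		out[i].host = cells[3 * i + 1];
		out[i].size = cells[3 * i + 2];
	}
	return ((int)n);
}
```

Because the output array lives in the caller's frame or in static storage, the routine works before malloc is available; the cost is a hard upper bound on the number of entries.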

Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: manu, nwhitehorn, cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11876

6 years ago Remove the ds133x and s35390a i2c RTC drivers for now. They both do i2c
Ian Lepore [Tue, 8 Aug 2017 22:58:34 +0000 (22:58 +0000)]
Remove the ds133x and s35390a i2c RTC drivers for now.  They both do i2c
transfers in their probe() or attach() routines, and that doesn't work
when the low-level controller requires interrupts to be functional.

The DS133x family of chips is nearly identical to the DS1307 and support
for them should be added to that driver, then the ds133x driver can be
deleted.  The s35390a driver just needs a non-trivial workover.  In both
cases that work will be done and committed separately.

6 years ago Add missing parenthesis on error message
Renato Botelho [Tue, 8 Aug 2017 22:40:26 +0000 (22:40 +0000)]
Add missing parenthesis on error message

Approved by: loos
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC (Netgate)

6 years ago pf_get_sport(): Prevent possible endless loop when searching for an unused nat port
Kristof Provost [Tue, 8 Aug 2017 21:09:26 +0000 (21:09 +0000)]
pf_get_sport(): Prevent possible endless loop when searching for an unused nat port

This is an import of Alexander Bluhm's OpenBSD commit r1.60.
The first chunk had to be modified because on OpenBSD the
'cut' declaration is located elsewhere.

Upstream report by Jingmin Zhou:
https://marc.info/?l=openbsd-pf&m=150020133510896&w=2

OpenBSD commit message:
 Use a 32 bit variable to detect integer overflow when searching for
 an unused nat port.  Prevents a possible endless loop if high port
 is 65535 or low port is 0.
 report and analysis Jingmin Zhou; OK sashan@ visa@
Quoted from: https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/net/pf_lb.c

PR: 221201
Submitted by: Fabian Keil <fk@fabiankeil.de>
Obtained from:  OpenBSD via ElectroBSD
MFC after: 1 week
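
Illustrative note (not part of the commit): the wraparound described in the
OpenBSD message can be sketched as below. This is a minimal sketch that
assumes a simplified upward-only search; find_free_port(), the port_in_use_fn
callback and the two example predicates are hypothetical names, not the code
in pf_lb.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical predicate: returns nonzero when the port is already taken. */
typedef int (*port_in_use_fn)(uint16_t port);

/*
 * Search [low, high] upward for a free port.  The cursor is 32 bits wide,
 * so when high == 65535 it can reach 65536 and the loop terminates.  With a
 * uint16_t cursor, the increment would wrap to 0 and "tmp <= high" would
 * stay true forever.
 */
static int
find_free_port(uint16_t low, uint16_t high, port_in_use_fn in_use,
    uint16_t *out)
{
	uint32_t tmp;

	assert(low <= high);
	for (tmp = low; tmp <= high; tmp++) {
		if (!in_use((uint16_t)tmp)) {
			*out = (uint16_t)tmp;
			return (0);
		}
	}
	return (-1);	/* every port in the range is in use */
}

/* Example predicate: every port is taken. */
static int
all_in_use(uint16_t port)
{
	(void)port;
	return (1);
}

/* Example predicate: only port 65535 is free. */
static int
all_but_last_in_use(uint16_t port)
{
	return (port != 65535);
}
```

The same reasoning applies to a downward search with a low port of 0, where
a 16-bit cursor would wrap from 0 to 65535 instead of going negative.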

6 years agoTurns out to be even simpler to just not create /dev/efi if we don't
Warner Losh [Tue, 8 Aug 2017 21:01:11 +0000 (21:01 +0000)]
Turns out to be even simpler to just not create /dev/efi if we don't
have an EFI runtime.

6 years agoFail to open efirt device when no EFI on system.
Warner Losh [Tue, 8 Aug 2017 20:44:16 +0000 (20:44 +0000)]
Fail to open efirt device when no EFI on system.

libefivar expects opening /dev/efi to indicate whether we can make efi
runtime calls. With a null routine, it was always succeeding, leading
efi_variables_supported() to return the wrong value. Only succeed if
we have an efi_runtime table. Also, while I'm here, out of an
abundance of caution, add a likely redundant check to make sure
efi_systbl is not NULL before dereferencing it. I know it can't be
NULL if efi_cfgtbl is non-NULL, but the compiler doesn't.

6 years agorwho/ruptime/rwhod shouldn't be gated by RCMDS.
Jeremie Le Hen [Tue, 8 Aug 2017 20:17:07 +0000 (20:17 +0000)]
rwho/ruptime/rwhod shouldn't be gated by RCMDS.

As peter@ points out in pr/220953:
"rwho, rwhod and ruptime are not part of the remote login suite (rsh, rlogin
etc).

They should *not* be in the rcmds package which is disabled by default.  We
rely on rwho/rwhod/ruptime in the freebsd.org cluster."

This commit is a re-commit of r322029 and r322031 with a better commit log, as
pointed out by ngie@.

This also includes the necessary changes to OptionalObsoleteFiles.inc, as
requested by jhb@.

PR: 220953
Reported by: peter@, jhb@
Differential Revision: https://reviews.freebsd.org/D11743

6 years agoRevert r322029 and r322031 so as to recommit them with a better commit log.
Jeremie Le Hen [Tue, 8 Aug 2017 20:07:08 +0000 (20:07 +0000)]
Revert r322029 and r322031 so as to recommit them with a better commit log.

PR: 220953
Reported by: ngie@

6 years agoFix few issues of LinuxKPI workqueue.
Alexander Motin [Tue, 8 Aug 2017 19:36:34 +0000 (19:36 +0000)]
Fix few issues of LinuxKPI workqueue.

LinuxKPI workqueue wrappers reported "successful" cancellation for work
items that had already completed normally.  This change brings the reported
status and the real cancellation fact into sync.  This is required for
drm-next operation.

Reviewed by: hselasky (earlier version)
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D11904

6 years agoRemove now-unused badsb declaration, missed in r322200
Ed Maste [Tue, 8 Aug 2017 18:31:40 +0000 (18:31 +0000)]
Remove now-unused badsb declaration, missed in r322200

Sponsored by: The FreeBSD Foundation

6 years agoFix a NULL pointer dereference in mly_user_command().
John Baldwin [Tue, 8 Aug 2017 17:49:57 +0000 (17:49 +0000)]
Fix a NULL pointer dereference in mly_user_command().

If mly_user_command fails to allocate a command slot it jumps to an 'out'
label used for error handling.  The error handling code checks for a data
buffer in 'mc->mc_data' to free before checking if 'mc' is NULL.  Fix by
just returning directly if we fail to allocate a command and only using
the 'out' label for subsequent errors when there is actual cleanup to
perform.

PR: 217747
Reported by: PVS-Studio
Reviewed by: emaste
MFC after: 1 week
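
Illustrative note (not part of the commit): the error-handling shape
described above can be sketched as follows. This is a minimal sketch under
stated assumptions; struct cmd, cmd_alloc() and user_command() are
hypothetical stand-ins, not the mly(4) driver code, and the ENOMEM return
values are chosen only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's command structure. */
struct cmd {
	void *data;
};

/* Hypothetical allocator; returns NULL when 'fail' is nonzero. */
static struct cmd *
cmd_alloc(int fail)
{
	return (fail ? NULL : calloc(1, sizeof(struct cmd)));
}

static int
user_command(int fail_alloc, size_t len)
{
	struct cmd *c;
	int error = 0;

	c = cmd_alloc(fail_alloc);
	if (c == NULL)
		return (ENOMEM);	/* nothing to clean up: return directly */

	if (len > 0) {
		c->data = malloc(len);
		if (c->data == NULL) {
			error = ENOMEM;
			goto out;
		}
	}
	/* ... the command would be issued here ... */
out:
	/* Reached only with a valid 'c', so reading c->data is safe. */
	assert(c != NULL);
	free(c->data);
	free(c);
	return (error);
}
```

Returning early on the first failure keeps the 'out' label free of NULL
checks on the command pointer itself.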

6 years agoVendor import of libc++ release_50 branch r310316:
Dimitry Andric [Tue, 8 Aug 2017 16:53:40 +0000 (16:53 +0000)]
Vendor import of libc++ release_50 branch r310316:
https://llvm.org/svn/llvm-project/libcxx/branches/release_50@310316

6 years agoVendor import of clang release_50 branch r310316:
Dimitry Andric [Tue, 8 Aug 2017 16:53:22 +0000 (16:53 +0000)]
Vendor import of clang release_50 branch r310316:
https://llvm.org/svn/llvm-project/cfe/branches/release_50@310316

6 years agoVendor import of llvm release_50 branch r310316:
Dimitry Andric [Tue, 8 Aug 2017 16:52:53 +0000 (16:52 +0000)]
Vendor import of llvm release_50 branch r310316:
https://llvm.org/svn/llvm-project/llvm/branches/release_50@310316

6 years agoMake p1003_1b.aio_listio_max a tunable
Alan Somers [Tue, 8 Aug 2017 16:14:31 +0000 (16:14 +0000)]
Make p1003_1b.aio_listio_max a tunable

p1003_1b.aio_listio_max is now a tunable. Its value is reflected in the
sysctl of the same name, and the sysconf(3) variable _SC_AIO_LISTIO_MAX.
Its value will be bounded from below by the compile-time constant
AIO_LISTIO_MAX and from above by the compile-time constant
MAX_AIO_QUEUE_PER_PROC and the tunable vfs.aio.max_aio_queue.

Reviewed by: jhb, kib
MFC after: 3 weeks
Relnotes: yes
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D11601
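
Illustrative note (not part of the commit): the bounding described above
amounts to a clamp. The sketch below assumes the lower bound does not exceed
the effective upper bound; clamp_listio_max() and its parameter names are
hypothetical, and the kernel's actual handling may differ.

```c
#include <assert.h>

/*
 * Clamp a requested aio_listio_max value.
 *   floor        - stands in for the compile-time AIO_LISTIO_MAX
 *   per_proc_cap - stands in for the compile-time MAX_AIO_QUEUE_PER_PROC
 *   queue_cap    - stands in for the vfs.aio.max_aio_queue tunable
 */
static int
clamp_listio_max(int requested, int floor, int per_proc_cap, int queue_cap)
{
	int ceiling = per_proc_cap < queue_cap ? per_proc_cap : queue_cap;

	assert(floor <= ceiling);	/* assumption of this sketch */
	if (requested < floor)
		return (floor);
	if (requested > ceiling)
		return (ceiling);
	return (requested);
}
```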

6 years agoUse the correct queue depth for nda devices.
Warner Losh [Tue, 8 Aug 2017 16:06:16 +0000 (16:06 +0000)]
Use the correct queue depth for nda devices.

Submitted by: Matt Williams

6 years agoFix logic error in the assert, causing the condition to be always true.
Konstantin Belousov [Tue, 8 Aug 2017 15:46:29 +0000 (15:46 +0000)]
Fix logic error in the assert, causing the condition to be always true.

Also improve the formatting of the corresponding KASSERT message.

Based on the submission by: Svyatoslav <razmyslov@viva64.com>
Found by: PVS-Studio
PR: 217741
Reviewed by: emaste
Sponsored by: The FreeBSD Foundation (kib)
MFC after: 1 week

6 years agotests/sys/netinet/fibs_test: skip selected tests when firewalls are enabled
Alan Somers [Tue, 8 Aug 2017 15:37:21 +0000 (15:37 +0000)]
tests/sys/netinet/fibs_test: skip selected tests when firewalls are enabled

Some tests send packets over epair(4) interfaces. Firewalls can cause
spurious failures.

Reviewed by: ngie
MFC after: 3 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D11917

6 years agoFix typo in cyapa out of bounds check.
Michael Gmelin [Tue, 8 Aug 2017 13:27:32 +0000 (13:27 +0000)]
Fix typo in cyapa out of bounds check.

PR: 217783
Submitted by: razmyslov@viva64.com
MFC after: 1 week

6 years agovmstat: Always emit a space after the free-memory column
Emmanuel Vadot [Tue, 8 Aug 2017 12:18:11 +0000 (12:18 +0000)]
vmstat: Always emit a space after the free-memory column

When displaying in non-human form, if the free-memory number
is large (more than 7 digits), there is no space between it and
the page fault column.

PR: 221290
Submitted by: Josuah Demangeon <mail@josuah.net> (Original version)
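
Illustrative note (not part of the commit): a printf field width is only a
minimum, so a value wider than its column runs into the next one unless a
literal separator is printed. The sketch below assumes a 7-character
free-memory column and a 5-character fault column; format_columns() and
those widths are hypothetical, not vmstat's actual format string.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format two numeric columns.  The explicit space after "%7ju" keeps the
 * columns apart even when free_mem needs more than 7 digits.
 */
static int
format_columns(char *buf, size_t len, uintmax_t free_mem, uintmax_t faults)
{
	assert(buf != NULL && len > 0);
	return (snprintf(buf, len, "%7ju %5ju", free_mem, faults));
}
```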

6 years agoMake sure the received IP header gets 32-bit aligned for short packets
Hans Petter Selasky [Tue, 8 Aug 2017 11:49:36 +0000 (11:49 +0000)]
Make sure the received IP header gets 32-bit aligned for short packets
in the mlx5en(4) driver.

MFC after: 1 week
Sponsored by: Mellanox Technologies

6 years agoCount drop events due to lack of PCI bandwidth as queue drops and not as
Hans Petter Selasky [Tue, 8 Aug 2017 11:36:57 +0000 (11:36 +0000)]
Count drop events due to lack of PCI bandwidth as queue drops and not as
input errors in the mlx5en(4) driver. This improves the sysadmin view of
physical port errors.

Submitted by: gallatin@
MFC after: 1 week
Sponsored by: Mellanox Technologies

6 years agoFix for mlx4en(4) to properly call m_defrag().
Hans Petter Selasky [Tue, 8 Aug 2017 11:35:02 +0000 (11:35 +0000)]
Fix for mlx4en(4) to properly call m_defrag().

The m_defrag() function can only defrag mbuf chains which have a valid
mbuf packet header. In r291699 when the mlx4en(4) driver was converted
into using BUSDMA(9), the call to m_defrag() was moved after the part
of the transmit routine which strips the header from the mbuf chain.
This effectively disabled the mbuf defrag mechanism and such packets
simply got dropped.

This patch removes the stripping of mbufs from a chain and loads all
mbufs using busdma. If busdma finds there are no segments, unload
the DMA map and free the mbuf right away, because that means all
data in the mbuf has been inlined in the TX ring. Else proceed
as usual.

Add a per-ring counter for the number of defrag attempts and
make sure the oversized_packets counter gets zeroed while at it.

The counters are per-ring to avoid excessive cache misses in the
TX path.

Submitted by: mjoras@
Differential Revision: https://reviews.freebsd.org/D11683
MFC after: 1 week
Sponsored by: Mellanox Technologies

6 years agoMFV r322242: 8373 TXG_WAIT in ZIL commit path
Andriy Gapon [Tue, 8 Aug 2017 11:26:03 +0000 (11:26 +0000)]
MFV r322242: 8373 TXG_WAIT in ZIL commit path

illumos/illumos-gate@d28671a3b094af696bea87f52272d4c4d89321c7
https://github.com/illumos/illumos-gate/commit/d28671a3b094af696bea87f52272d4c4d89321c7

https://www.illumos.org/issues/8373
  The code that writes ZIL blocks uses dmu_tx_assign(TXG_WAIT) to assign
  a transaction to a transaction group.  That seems to be logically
  incorrect as writing of the ZIL block does not introduce any new dirty
  data.  Also, when there is a lot of dirty data, the call can introduce
  significant delays into the ZIL commit path, thus affecting all
  synchronous writes. Additionally, ARC throttling may affect the ZIL
  writing.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

MFC after: 2 weeks

6 years ago8373 TXG_WAIT in ZIL commit path
Andriy Gapon [Tue, 8 Aug 2017 11:24:13 +0000 (11:24 +0000)]
8373 TXG_WAIT in ZIL commit path

illumos/illumos-gate@d28671a3b094af696bea87f52272d4c4d89321c7
https://github.com/illumos/illumos-gate/commit/d28671a3b094af696bea87f52272d4c4d89321c7

https://www.illumos.org/issues/8373
  The code that writes ZIL blocks uses dmu_tx_assign(TXG_WAIT) to assign a
  transaction to a transaction group.
  That seems to be logically incorrect as writing of the ZIL block does not
  introduce any new dirty data.
  Also, when there is a lot of dirty data, the call can introduce significant
  delays into the ZIL commit path,
  thus affecting all synchronous writes. Additionally, ARC throttling may affect
  the ZIL writing.
  We probably need a new mechanism similar to dmu_tx_create_assigned to assign
  ZIL transactions.
  (Ab)using TXG_WAITED does not seem to be sufficient.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

6 years agoMFV r322240: 8491 uberblock on-disk padding to reserve space for smoothly merging...
Andriy Gapon [Tue, 8 Aug 2017 11:21:58 +0000 (11:21 +0000)]
MFV r322240: 8491 uberblock on-disk padding to reserve space for smoothly merging zpool checkpoint & MMP in ZFS

illumos/illumos-gate@79c2b812ee2010ebf20fdd92dc5f06b59000a94c
https://github.com/illumos/illumos-gate/commit/79c2b812ee2010ebf20fdd92dc5f06b59000a94c

https://www.illumos.org/issues/8491
  The zpool checkpoint feature in DxOS added a new field in the uberblock.
  The Multi-Modifier Protection Pull Request from ZoL adds two new fields in the
  uberblock (Reference: https://github.com/zfsonlinux/zfs/pull/6279).
  As these two changes come from two different sources and once upstreamed and
  deployed will introduce an incompatibility with each other we want
  to upstream a change that will reserve the padding for both of them so
  integration goes smoothly and everyone gets both features.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Olaf Faaland <faaland1@llnl.gov>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>

MFC after: 3 weeks

6 years ago8491 uberblock on-disk padding to reserve space for smoothly merging zpool checkpoint...
Andriy Gapon [Tue, 8 Aug 2017 11:19:56 +0000 (11:19 +0000)]
8491 uberblock on-disk padding to reserve space for smoothly merging zpool checkpoint & MMP in ZFS

illumos/illumos-gate@79c2b812ee2010ebf20fdd92dc5f06b59000a94c
https://github.com/illumos/illumos-gate/commit/79c2b812ee2010ebf20fdd92dc5f06b59000a94c

https://www.illumos.org/issues/8491
  The zpool checkpoint feature in DxOS added a new field in the uberblock.
  The Multi-Modifier Protection Pull Request from ZoL adds two new fields in the
  uberblock (Reference: https://github.com/zfsonlinux/zfs/pull/6279).
  As these two changes come from two different sources and once upstreamed and
  deployed will introduce an incompatibility with each other we want
  to upstream a change that will reserve the padding for both of them so
  integration goes smoothly and everyone gets both features.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Olaf Faaland <faaland1@llnl.gov>
Approved by: Gordon Ross <gwr@nexenta.com>
Author: Serapheim Dimitropoulos <serapheim@delphix.com>

6 years agoMFV r322238: 7915 checks in l2arc_evict could use some cleaning up
Andriy Gapon [Tue, 8 Aug 2017 11:19:14 +0000 (11:19 +0000)]
MFV r322238: 7915 checks in l2arc_evict could use some cleaning up

illumos/illumos-gate@267ae6c3a88d2fc39276af66caafa978b0935b82
https://github.com/illumos/illumos-gate/commit/267ae6c3a88d2fc39276af66caafa978b0935b82

https://www.illumos.org/issues/7915
  l2arc_evict() is strictly serialized with respect to
  l2arc_write_buffers() and l2arc_write_done().  Normally, l2arc_evict()
  and l2arc_write_buffers() are called from the same thread, so they can
  not be concurrent.  Also, l2arc_write_buffers() uses zio_wait() on the
  parent zio of all cache zio-s.  That ensures that l2arc_write_done()
  is completed before l2arc_write_buffers() returns.  Finally, if a
  cache device is removed, then l2arc_evict() is called under SCL_ALL in
  the exclusive mode.  That ensures that it can not be concurrent with
  the normal L2ARC accesses to the device (including writing and
  evicting buffers).  Given the above, some checks and actions in
  l2arc_evict() do not make sense.  For instance, it must never
  encounter the write head header let alone remove it from the buffer
  list.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Andriy Gapon <avg@FreeBSD.org>

MFC after: 2 weeks

6 years ago7915 checks in l2arc_evict could use some cleaning up
Andriy Gapon [Tue, 8 Aug 2017 11:15:36 +0000 (11:15 +0000)]
7915 checks in l2arc_evict could use some cleaning up

illumos/illumos-gate@267ae6c3a88d2fc39276af66caafa978b0935b82
https://github.com/illumos/illumos-gate/commit/267ae6c3a88d2fc39276af66caafa978b0935b82

https://www.illumos.org/issues/7915
  l2arc_evict() is strictly serialized with respect to l2arc_write_buffers() and
  l2arc_write_done().
  Normally, l2arc_evict() and l2arc_write_buffers() are called from the same
  thread, so they can not be concurrent.
  Also, l2arc_write_buffers() uses zio_wait() on the parent zio of all cache
  zio-s.
  That ensures that l2arc_write_done() is completed before l2arc_write_buffers()
  returns.
  Finally, if a cache device is removed, then l2arc_evict() is called under
  SCL_ALL in the exclusive mode.
  That ensures that it can not be concurrent with the normal L2ARC accesses to
  the device (including writing and evicting buffers).
  Given the above, some checks and actions in l2arc_evict() do not make sense.
  For instance, it must never encounter the write head header let alone remove it
  from the buffer list.

Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Andriy Gapon <avg@FreeBSD.org>

6 years agoMFV r322236: 8126 ztest assertion failed in dbuf_dirty due to dn_nlevels changing
Andriy Gapon [Tue, 8 Aug 2017 11:14:40 +0000 (11:14 +0000)]
MFV r322236: 8126 ztest assertion failed in dbuf_dirty due to dn_nlevels changing

illumos/illumos-gate@dcb6872c565819ac88acbc2ece999ef241c8b982
https://github.com/illumos/illumos-gate/commit/dcb6872c565819ac88acbc2ece999ef241c8b982

https://www.illumos.org/issues/8126
  The sync thread is concurrently modifying dn_phys->dn_nlevels
  while dbuf_dirty() is trying to assert something about it, without
  holding the necessary lock. We need to move this assertion further down
  in the function, after we have acquired the dn_struct_rwlock.

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

MFC after: 2 weeks

6 years ago8126 ztest assertion failed in dbuf_dirty due to dn_nlevels changing
Andriy Gapon [Tue, 8 Aug 2017 11:13:27 +0000 (11:13 +0000)]
8126 ztest assertion failed in dbuf_dirty due to dn_nlevels changing

illumos/illumos-gate@dcb6872c565819ac88acbc2ece999ef241c8b982
https://github.com/illumos/illumos-gate/commit/dcb6872c565819ac88acbc2ece999ef241c8b982

https://www.illumos.org/issues/8126
  The sync thread is concurrently modifying dn_phys->dn_nlevels
  while dbuf_dirty() is trying to assert something about it, without
  holding the necessary lock. We need to move this assertion further down
  in the function, after we have acquired the dn_struct_rwlock.

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

6 years ago8067 zdb should be able to dump literal embedded block pointer
Andriy Gapon [Tue, 8 Aug 2017 11:10:37 +0000 (11:10 +0000)]
8067 zdb should be able to dump literal embedded block pointer

illumos/illumos-gate@4923c69fddc0887da5604a262585af3efd82ee20
https://github.com/illumos/illumos-gate/commit/4923c69fddc0887da5604a262585af3efd82ee20

https://www.illumos.org/issues/8067
  Add an option to zdb to print a literal embedded block pointer supplied on the
  command line:
  zdb -E [-A] word0:word1:...:word15

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@gmail.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

6 years agozfs: no need for __DECONST after abd constification in r322233
Andriy Gapon [Tue, 8 Aug 2017 11:07:34 +0000 (11:07 +0000)]
zfs: no need for __DECONST after abd constification in r322233

Note that vdev_label_write_pad2() is FreeBSD specific.

MFC after: 2 weeks
X-MFC after: r322233

6 years agoMFV r322232: 8426 mark immutable buffer arguments as such in abd.h
Andriy Gapon [Tue, 8 Aug 2017 10:59:18 +0000 (10:59 +0000)]
MFV r322232: 8426 mark immutable buffer arguments as such in abd.h

illumos/illumos-gate@9b195260e22529ac0e2580faaf89402420589c1c
https://github.com/illumos/illumos-gate/commit/9b195260e22529ac0e2580faaf89402420589c1c

https://www.illumos.org/issues/8426
  abd_copy_from_buf and abd_cmp_buf do not modify their void *buf arguments, so
  qualify them with const.
  abd_copy_from_buf_off and abd_cmp_buf_off already had that type for the
  corresponding arguments.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

MFC after: 2 weeks

6 years ago8426 mark immutable buffer arguments as such in abd.h
Andriy Gapon [Tue, 8 Aug 2017 10:58:01 +0000 (10:58 +0000)]
8426 mark immutable buffer arguments as such in abd.h

illumos/illumos-gate@9b195260e22529ac0e2580faaf89402420589c1c
https://github.com/illumos/illumos-gate/commit/9b195260e22529ac0e2580faaf89402420589c1c

https://www.illumos.org/issues/8426
  abd_copy_from_buf and abd_cmp_buf do not modify their void *buf arguments, so
  qualify them with const.
  abd_copy_from_buf_off and abd_cmp_buf_off already had that type for the
  corresponding arguments.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

6 years ago8430 dir_is_empty_readdir() doesn't properly handle error from fdopendir()
Andriy Gapon [Tue, 8 Aug 2017 10:55:42 +0000 (10:55 +0000)]
8430 dir_is_empty_readdir() doesn't properly handle error from fdopendir()

illumos/illumos-gate@ba6e7e6505150388de6dc6a88741164118a421bf
https://github.com/illumos/illumos-gate/commit/ba6e7e6505150388de6dc6a88741164118a421bf

https://www.illumos.org/issues/8430
  we should close dirfd if fdopendir() fails.

Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Sowrabha Gopal <sowrabha.gopal@delphix.com>

6 years agoMFV r322229: 7600 zfs rollback should pass target snapshot to kernel
Andriy Gapon [Tue, 8 Aug 2017 10:52:01 +0000 (10:52 +0000)]
MFV r322229: 7600 zfs rollback should pass target snapshot to kernel

illumos/illumos-gate@77b171372ed21642e04c873ef1e87fe2365520df
https://github.com/illumos/illumos-gate/commit/77b171372ed21642e04c873ef1e87fe2365520df

https://www.illumos.org/issues/7600
  At present, the kernel side code seems to blindly rollback to whatever happens
  to be the latest snapshot at the time when the rollback task is processed.
  The expected target's name should be passed to the kernel driver and the sync
  task should validate that the target exists and that it is the latest snapshot
  indeed.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

MFC after: 3 weeks

6 years ago7600 zfs rollback should pass target snapshot to kernel
Andriy Gapon [Tue, 8 Aug 2017 10:49:56 +0000 (10:49 +0000)]
7600 zfs rollback should pass target snapshot to kernel

illumos/illumos-gate@77b171372ed21642e04c873ef1e87fe2365520df
https://github.com/illumos/illumos-gate/commit/77b171372ed21642e04c873ef1e87fe2365520df

https://www.illumos.org/issues/7600
  At present, the kernel side code seems to blindly rollback to whatever happens
  to be the latest snapshot at the time when the rollback task is processed.
  The expected target's name should be passed to the kernel driver and the sync
  task should validate that the target exists and that it is the latest snapshot
  indeed.

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

6 years agoMFV r322227: 8377 Panic in bookmark deletion
Andriy Gapon [Tue, 8 Aug 2017 10:48:52 +0000 (10:48 +0000)]
MFV r322227: 8377 Panic in bookmark deletion

illumos/illumos-gate@42418f9e73f0d007aa87675ecc206c26fc8e073e
https://github.com/illumos/illumos-gate/commit/42418f9e73f0d007aa87675ecc206c26fc8e073e

https://www.illumos.org/issues/8377
  The problem is that when dsl_bookmark_destroy_check() is executed from open
  context (the pre-check), it fills in dbda_success based on the existence of the
  bookmark.
  But the bookmark (or containing filesystem as in this case) can be destroyed
  before we get to syncing context. When we re-run dsl_bookmark_destroy_check()
  in syncing context, it will not add the deleted bookmark to dbda_success,
  intending for dsl_bookmark_destroy_sync() to not process it. But because the
  bookmark is still in dbda_success from the open-context call, we do try to
  destroy it.
  The fix is that dsl_bookmark_destroy_check() should not modify dbda_success
  when called from open context.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

MFC after: 2 weeks

6 years ago8377 Panic in bookmark deletion
Andriy Gapon [Tue, 8 Aug 2017 10:47:56 +0000 (10:47 +0000)]
8377 Panic in bookmark deletion

illumos/illumos-gate@42418f9e73f0d007aa87675ecc206c26fc8e073e
https://github.com/illumos/illumos-gate/commit/42418f9e73f0d007aa87675ecc206c26fc8e073e

https://www.illumos.org/issues/8377
  The problem is that when dsl_bookmark_destroy_check() is executed from open
  context (the pre-check), it fills in dbda_success based on the existence of the
  bookmark.
  But the bookmark (or containing filesystem as in this case) can be destroyed
  before we get to syncing context. When we re-run dsl_bookmark_destroy_check()
  in syncing context, it will not add the deleted bookmark to dbda_success,
  intending for dsl_bookmark_destroy_sync() to not process it. But because the
  bookmark is still in dbda_success from the open-context call, we do try to
  destroy it.
  The fix is that dsl_bookmark_destroy_check() should not modify dbda_success
  when called from open context.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

6 years agoMFV r322223: 8378 crash due to bp in-memory modification of nopwrite block
Andriy Gapon [Tue, 8 Aug 2017 10:46:51 +0000 (10:46 +0000)]
MFV r322223: 8378 crash due to bp in-memory modification of nopwrite block

illumos/illumos-gate@b7edcb940884114e61382937505433c4c38c0278
https://github.com/illumos/illumos-gate/commit/b7edcb940884114e61382937505433c4c38c0278

https://www.illumos.org/issues/8378
  The problem is that zfs_get_data() supplies a stale zgd_bp to dmu_sync(), which
  we then nopwrite against.
  zfs_get_data() doesn't hold any DMU-related locks, so after it copies db_blkptr
  to zgd_bp, dbuf_write_ready() could change db_blkptr, and dbuf_write_done()
  could remove the dirty record. dmu_sync() then sees the stale BP and that the
  dbuf is not dirty, so it is eligible for nop-writing.
  The fix is for dmu_sync() to copy db_blkptr to zgd_bp after acquiring the
  db_mtx. We could still see a stale db_blkptr, but if it is stale then the dirty
  record will still exist and thus we won't attempt to nopwrite.

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

MFC after: 2 weeks

6 years ago8378 crash due to bp in-memory modification of nopwrite block
Andriy Gapon [Tue, 8 Aug 2017 10:44:48 +0000 (10:44 +0000)]
8378 crash due to bp in-memory modification of nopwrite block

illumos/illumos-gate@b7edcb940884114e61382937505433c4c38c0278
https://github.com/illumos/illumos-gate/commit/b7edcb940884114e61382937505433c4c38c0278

https://www.illumos.org/issues/8378
  The problem is that zfs_get_data() supplies a stale zgd_bp to dmu_sync(), which
  we then nopwrite against.
  zfs_get_data() doesn't hold any DMU-related locks, so after it copies db_blkptr
  to zgd_bp, dbuf_write_ready() could change db_blkptr, and dbuf_write_done()
  could remove the dirty record. dmu_sync() then sees the stale BP and that the
  dbuf is not dirty, so it is eligible for nop-writing.
  The fix is for dmu_sync() to copy db_blkptr to zgd_bp after acquiring the
  db_mtx. We could still see a stale db_blkptr, but if it is stale then the dirty
  record will still exist and thus we won't attempt to nopwrite.

Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

6 years agoMFV r322221: 7910 l2arc_write_buffers() may write beyond target_sz
Andriy Gapon [Tue, 8 Aug 2017 10:43:41 +0000 (10:43 +0000)]
MFV r322221: 7910 l2arc_write_buffers() may write beyond target_sz

FreeBSD note: the essence of this change was committed to FreeBSD in
r314274.  This commit catches up with differences between what was
committed to FreeBSD and what was committed to OpenZFS, mainly more
logical variable names.

illumos/illumos-gate@16a7e5ac116c85d965007a5f201104b564e82210
https://github.com/illumos/illumos-gate/commit/16a7e5ac116c85d965007a5f201104b564e82210

https://www.illumos.org/issues/7910
  It seems that the change in issue #6950 resurrected the problem that was
  earlier fixed by the change in issue #5219.
  Please also see the following FreeBSD bug report:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216178

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

MFC after: 2 weeks