FreeBSD/FreeBSD.git
Alan Cox [Mon, 2 Oct 2017 07:14:32 +0000 (07:14 +0000)]
When mdstart_swap() accesses a page that is already in the active queue,
mark the page as referenced rather than calling vm_page_activate().  This
allows the page's act_count to grow beyond ACT_INIT and better reflect
its usage.  (See also r324146, which modified a function used by tmpfs,
uiomove_object_page(), to behave in the same way.)

Reviewed by: kib, markj
MFC after: 2 weeks

Wojciech Macek [Mon, 2 Oct 2017 06:05:19 +0000 (06:05 +0000)]
PPC: increase MAX_PICS to 32

Previous value was too low on dual-socket POWER8 system.

Submitted by:          Wojciech Macek <wma@freebsd.org>
Reviewed by:           nwhitehorn
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
Differential revision: https://reviews.freebsd.org/D12540

Ian Lepore [Mon, 2 Oct 2017 02:58:28 +0000 (02:58 +0000)]
Define a single instance of ahci_devclass and reference it from all the
attachment code for various SOCs and busses.  Remove all the static and
should-have-been-static and named-differently instances of it.

This should eliminate the recently-grown build warnings about multiple
definitions when building arm kernels.

Ian Lepore [Mon, 2 Oct 2017 01:03:18 +0000 (01:03 +0000)]
Enhance the interrupt capabilities of ti_pruss driver.

The existing ti_pruss driver for the PRUSS Hardware provided by the AM335x
ARM CPU has basic interrupt capabilities.  This updated driver provides some
more options:

 - Sysctl based configuration for the interrupts (for some examples, see the
   test plan in the phabricator review cited below).

 - A device file (/dev/pruss0.irqN) for each enabled interrupt. A read on
   this file blocks if no irq has happened; otherwise it returns a uint64_t
   timestamp based on nanouptime().

 - Each interrupt device file provides kqueue-based event notification,
   blocking read(), or select().
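
As a rough illustration of the blocking-read interface described above, the
following Python sketch decodes one timestamp from an interrupt device file.
It assumes the value is delivered as a single native-endian 64-bit unsigned
integer; the device path in the comment is a hypothetical instance of
/dev/pruss0.irqN, and the helper name is made up for this example.

```python
import io
import struct

# Assumption: each read yields one uint64_t in native byte order.
TIMESTAMP_FORMAT = "=Q"
TIMESTAMP_SIZE = struct.calcsize(TIMESTAMP_FORMAT)  # 8 bytes


def read_irq_timestamp(dev):
    """Read one raw uint64_t timestamp from an open irq device file.

    On the real device the read would block until an interrupt arrives.
    """
    data = dev.read(TIMESTAMP_SIZE)
    if len(data) != TIMESTAMP_SIZE:
        raise OSError("short read from irq device")
    return struct.unpack(TIMESTAMP_FORMAT, data)[0]


if __name__ == "__main__":
    # Stand-in for: open("/dev/pruss0.irq2", "rb", buffering=0)
    # (hypothetical path; an in-memory buffer is used so this runs anywhere).
    fake_dev = io.BytesIO(struct.pack(TIMESTAMP_FORMAT, 123456789))
    print(read_irq_timestamp(fake_dev))
```

The sketch returns the raw integer and deliberately does not interpret its
units, since the message only says the value is based on nanouptime().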

Submitted by: Manuel Stuhn <freebsdnewbie@freenet.de>
Differential Revision: https://reviews.freebsd.org/D11959

Ian Lepore [Mon, 2 Oct 2017 00:49:33 +0000 (00:49 +0000)]
Allow Raspberry Pi platform and drivers to be configured with upstream DTBs.

 - Added more compatibility strings to drivers not yet converted
 - Added new RPI platform code compatibility string to match the ones used
   upstream
 - Adapted RPI and RPI2 DTS to match the new platform code compatibility
   string

The goal is to use the upstream DTBs as a replacement for our custom one.
This is now possible with these changes.

Additionally, as the RPI firmware automatically chooses the right DTB for
us, this would allow one common armv6 kernel for RPI0 and RPI1
(BCM2835-based), and one common armv7 kernel for RPI2 v1.1 (BCM2836-based),
and RPI2 v1.2 / RPI3 (BCM2837-based).

Submitted by: Sylvain Garrigues <sylgar@gmail.com>
Differential Revision: https://reviews.freebsd.org/D12360

Patrick Kelsey [Sun, 1 Oct 2017 23:37:17 +0000 (23:37 +0000)]
The soisconnected() call removed from syncache_socket() in r307966 was
not extraneous in the TCP Fast Open (TFO) passive-open case.  In the
TFO passive-open case, syncache_socket() is being called during
processing of a TFO SYN bearing a valid cookie, and a call to
soisconnected() is required in order to allow the application to
immediately consume any data delivered in the SYN and to have a chance
to generate response data to accompany the SYN-ACK.  The removal of
this call to soisconnected() effectively converted all TFO passive
opens to having the same RTT cost as a standard 3WHS.

This commit adds a call to soisconnected() to syncache_tfo_expand() so
that it is only in the TFO passive-open path, thereby restoring TFO
passive-open RTT performance and preserving the non-TFO connection-rate
performance gains realized by r307966.

MFC after: 1 week
Sponsored by: Limelight Networks

Julien Charbon [Sun, 1 Oct 2017 21:20:28 +0000 (21:20 +0000)]
Fix an infinite loop in tcp_tw_2msl_scan() when an INP_TIMEWAIT inp has
been destroyed before its tcptw with INVARIANTS undefined.

This is a symmetric change of r307551:

An INP_TIMEWAIT inp should not be destroyed before its tcptw, and INVARIANTS
will catch this case.  If INVARIANTS is undefined it will emit a log(LOG_ERR)
and avoid a hard to debug infinite loop in tcp_tw_2msl_scan().

Reported by: Ben Rubson, hselasky
Submitted by: hselasky
Tested by: Ben Rubson, jch
MFC after: 1 week
Sponsored by: Verisign, inc
Differential Revision: https://reviews.freebsd.org/D12267

Andriy Gapon [Sun, 1 Oct 2017 20:12:30 +0000 (20:12 +0000)]
unbreak kernel builds on sparc64 and powerpc after r324163, ZFS Channel Programs

The custom iscntrl() in ZFS Lua code expects a signed argument, so
remove the harmful cast.

Reported by: ian
MFC after: 5 weeks
X-MFC with: r324163

Andrew Turner [Sun, 1 Oct 2017 19:52:47 +0000 (19:52 +0000)]
To prepare for adding EFI runtime services support on arm64 move the
machine independent parts of the existing code to a new file that can be
shared between amd64 and arm64.

Reviewed by: kib (previous version), imp
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D12434

Konstantin Belousov [Sun, 1 Oct 2017 19:03:21 +0000 (19:03 +0000)]
Fix supposed typo in the include guard symbol name, use full path for
the name.

Sponsored by: The FreeBSD Foundation
MFC after: 1 week

Alan Cox [Sun, 1 Oct 2017 17:04:26 +0000 (17:04 +0000)]
When an I/O error occurs on page out, there is no need to dirty the page,
because it is already dirty.  Instead, assert that the page is dirty.

Reviewed by: kib, markj
MFC after: 1 week

Alexander Motin [Sun, 1 Oct 2017 16:59:02 +0000 (16:59 +0000)]
Align test I/O buffer to page boundary.

This more closely resembles typical kernel behavior, which can be useful
from a benchmarking point of view.
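
For readers unfamiliar with page alignment, rounding an address up to the
next page boundary can be sketched as below. This is only an arithmetic
illustration in Python, not the code from this change; the function name is
made up, and the page size is taken from the host via mmap.PAGESIZE.

```python
import mmap

# Page size of the machine running this sketch (commonly 4096).
PAGE_SIZE = mmap.PAGESIZE


def page_align_up(addr, page_size=PAGE_SIZE):
    """Round addr up to the next multiple of page_size (a power of two)."""
    return (addr + page_size - 1) & ~(page_size - 1)


if __name__ == "__main__":
    # An address one byte past a 4 KiB boundary moves to the next boundary.
    print(hex(page_align_up(0x1001, 4096)))
```

An address that is already aligned is returned unchanged, so the helper can
be applied unconditionally.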

MFC after: 1 week

Andriy Gapon [Sun, 1 Oct 2017 16:51:05 +0000 (16:51 +0000)]
MFV r323794: 8605 zfs channel programs: zfs.exists undocumented and non-working

illumos/illumos-gate@5f39f884e2035d671ec02148fc4d8420c670bcb4
https://github.com/illumos/illumos-gate/commit/5f39f884e2035d671ec02148fc4d8420c670bcb4

https://www.illumos.org/issues/8605
  zfs.exists() in channel programs doesn't return any result, and should have a
  man page entry.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Chris Williamson <chris.williamson@delphix.com>

MFC after: 5 weeks
X-MFC after: r324163

Ian Lepore [Sun, 1 Oct 2017 16:48:36 +0000 (16:48 +0000)]
Work around bcm283x silicon bugs to make i2c repeat-start work for the most
common case where it's needed -- a write followed by a read to the same slave.

The i2c controller in this chip only performs complete transfers, it does
not provide control over start/repeat-start/stop operations on the bus.
Thus, we have gotten a full stop/start sequence rather than a repeat-start
when doing a typical i2c slave access of "write address, read data".  Some
i2c slave devices require a repeat-start to work correctly.

These changes cause the controller to do a repeat-start by pre-staging the
read parameters in the controller registers immediately after the controller
has latched the values for the initial write operation, but before any
bytes are actually written.  With the values pre-staged, when the write
portion of the transfer completes, the state machine in the silicon sees
a new start operation already staged and that causes it to perform a
repeat-start.  The key to tricking the buggy hardware into doing this is
to avoid prefilling any output data in the transmit FIFO so that it is
possible to catch the silicon in the state where transmit values are
latched but the transmit isn't completed yet.

Andriy Gapon [Sun, 1 Oct 2017 16:41:05 +0000 (16:41 +0000)]
MFV r323531: 8521 nvlist memory leak in get_clones_stat() and spa_load_best()

illumos/illumos-gate@7d3000f774e20097a1ee45cbd06d0e38065ddd5a
https://github.com/illumos/illumos-gate/commit/7d3000f774e20097a1ee45cbd06d0e38065ddd5a

https://www.illumos.org/issues/8521
  Yuri reported this to the mailing list:
  doing a `reboot -d` on current illumos-gate HEAD gives the following
  "::findleaks -dv" output:
  findleaks: maximum buffers => 301061
  findleaks: actual buffers => 297587
  findleaks:
  findleaks: potential pointers => 29289774
  findleaks: dismissals => 26242305 (89.5%)
  findleaks: misses => 331153 ( 1.1%)
  findleaks: dups => 2419681 ( 8.2%)
  findleaks: follows => 296635 ( 1.0%)
  findleaks:
  findleaks: peak memory usage => 7353 kB
  findleaks: elapsed CPU time => 1.5 seconds
  findleaks: elapsed wall time => 2.0 seconds
  findleaks:
  CACHE LEAKED BUFCTL CALLER
  ffffff03d222b008 120 ffffff03ef7ceb78 nv_alloc_sys+0x1f
  ffffff03d222a448 123 ffffff03f4150cc8 nv_alloc_sys+0x1f
  ffffff03d222b448 5 ffffff03f28bd598 nv_alloc_sys+0x1f
  ffffff03d222b888 87 ffffff03f28c10f0 nv_alloc_sys+0x1f
  ffffff03d222c008 21 ffffff03f4139310 nv_alloc_sys+0x1f
  ffffff03d222b888 43 ffffff040ef3f3e8 nv_alloc_sys+0x1f
  ffffff03d222c008 120 ffffff03f4591e58 nv_alloc_sys+0x1f
  ffffff03d222b008 121 ffffff03f352c068 nv_alloc_sys+0x1f
  ffffff03d222a448 112 ffffff03f414e5f8 nv_alloc_sys+0x1f
  ffffff03d222b008 119 ffffff03ee92fdc0 nv_alloc_sys+0x1f
  ffffff03d222b888 46 ffffff03f28c1378 nv_alloc_sys+0x1f
  ffffff03d222b448 4 ffffff03f28c7708 nv_alloc_sys+0x1f
  ffffff03d222c008 20 ffffff03f2a6e7e8 nv_alloc_sys+0x1f

Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Yuri Pankov <yuripv@gmx.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>

MFC after: 5 weeks
X-MFC after: r324163

Andriy Gapon [Sun, 1 Oct 2017 16:37:54 +0000 (16:37 +0000)]
revert r324166, it has an unrelated change in it

Andriy Gapon [Sun, 1 Oct 2017 16:34:16 +0000 (16:34 +0000)]
MFV r323531: 8521 nvlist memory leak in get_clones_stat() and spa_load_best()

illumos/illumos-gate@7d3000f774e20097a1ee45cbd06d0e38065ddd5a
https://github.com/illumos/illumos-gate/commit/7d3000f774e20097a1ee45cbd06d0e38065ddd5a

https://www.illumos.org/issues/8521
  Yuri reported this to the mailing list:
  doing a `reboot -d` on current illumos-gate HEAD gives the following
  "::findleaks -dv" output:
  findleaks: maximum buffers => 301061
  findleaks: actual buffers => 297587
  findleaks:
  findleaks: potential pointers => 29289774
  findleaks: dismissals => 26242305 (89.5%)
  findleaks: misses => 331153 ( 1.1%)
  findleaks: dups => 2419681 ( 8.2%)
  findleaks: follows => 296635 ( 1.0%)
  findleaks:
  findleaks: peak memory usage => 7353 kB
  findleaks: elapsed CPU time => 1.5 seconds
  findleaks: elapsed wall time => 2.0 seconds
  findleaks:
  CACHE LEAKED BUFCTL CALLER
  ffffff03d222b008 120 ffffff03ef7ceb78 nv_alloc_sys+0x1f
  ffffff03d222a448 123 ffffff03f4150cc8 nv_alloc_sys+0x1f
  ffffff03d222b448 5 ffffff03f28bd598 nv_alloc_sys+0x1f
  ffffff03d222b888 87 ffffff03f28c10f0 nv_alloc_sys+0x1f
  ffffff03d222c008 21 ffffff03f4139310 nv_alloc_sys+0x1f
  ffffff03d222b888 43 ffffff040ef3f3e8 nv_alloc_sys+0x1f
  ffffff03d222c008 120 ffffff03f4591e58 nv_alloc_sys+0x1f
  ffffff03d222b008 121 ffffff03f352c068 nv_alloc_sys+0x1f
  ffffff03d222a448 112 ffffff03f414e5f8 nv_alloc_sys+0x1f
  ffffff03d222b008 119 ffffff03ee92fdc0 nv_alloc_sys+0x1f
  ffffff03d222b888 46 ffffff03f28c1378 nv_alloc_sys+0x1f
  ffffff03d222b448 4 ffffff03f28c7708 nv_alloc_sys+0x1f
  ffffff03d222c008 20 ffffff03f2a6e7e8 nv_alloc_sys+0x1f

Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Yuri Pankov <yuripv@gmx.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Pavel Zakharov <pavel.zakharov@delphix.com>

MFC after: 5 weeks
X-MFC after: r324163

Andriy Gapon [Sun, 1 Oct 2017 16:25:14 +0000 (16:25 +0000)]
fix up r324163, MFV of r323530,r323533,r323534, 7431 ZFS Channel Programs

Add several new files to the files enabled by ZFS kernel option.

MFC after: 5 weeks
X-MFC with: r324163

Andriy Gapon [Sun, 1 Oct 2017 16:11:07 +0000 (16:11 +0000)]
MFV r323530,r323533,r323534: 7431 ZFS Channel Programs, and followups

7431 ZFS Channel Programs

illumos/illumos-gate@dfc115332c94a2f62058ac7f2bce7631fbd20b3d
https://github.com/illumos/illumos-gate/commit/dfc115332c94a2f62058ac7f2bce7631fbd20b3d

https://www.illumos.org/issues/7431
  ZFS channel programs (ZCP) adds support for performing compound ZFS
  administrative actions via Lua scripts in a sandboxed environment (with time
  and memory limits).
  This initial commit includes both base support for running ZCP scripts, and a
  small initial library of API calls which support getting properties and
  listing, destroying, and promoting datasets.
  Testing: in addition to the included unit tests, channel programs have been in
  use at Delphix for several months for batch destroying filesystems. The
  dsl_destroy_snaps_nvl() call has also been replaced with

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: Chris Williamson <chris.williamson@delphix.com>

8552 ZFS LUA code uses floating point math

illumos/illumos-gate@916c8d881190bd2c3ca20d9fca919aecff504435
https://github.com/illumos/illumos-gate/commit/916c8d881190bd2c3ca20d9fca919aecff504435

https://www.illumos.org/issues/8552
  In the LUA interpreter used by "zfs program", the lua format() function
  accidentally includes support for '%f' and friends, which can cause compilation
  problems when building on platforms that don't support floating-point math in
  the kernel (e.g. sparc). Support for '%f' and friends (%f %e %E %g %G) should be
  removed, since there's no way to supply a floating-point value anyway (all
  numbers in ZFS LUA are int64_t's).

Reviewed by: Yuri Pankov <yuripv@gmx.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

8590 memory leak in dsl_destroy_snapshots_nvl()

illumos/illumos-gate@e6ab4525d156c82445c116ecf6b2b874d5e9009d
https://github.com/illumos/illumos-gate/commit/e6ab4525d156c82445c116ecf6b2b874d5e9009d

https://www.illumos.org/issues/8590
  In dsl_destroy_snapshots_nvl(), "snaps_normalized" is not freed after it is
  added to "arg".

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Matthew Ahrens <mahrens@delphix.com>

FreeBSD notes:
- zfs-program.8 manual page is taken almost as is from the vendor repository,
  no FreeBSD-ification done
- fixed multiple instances of NULL being used where an integer is expected
- replaced ETIME and ECHRNG with ETIMEDOUT and EDOM respectively

This commit adds a modified version of Lua 5.2.4 under
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lua, mirroring the
upstream.  See README.zfs in that directory for the description of Lua
customizations.
See zfs-program.8 on how to use the new feature.

MFC after: 5 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D12528

Scott Long [Sun, 1 Oct 2017 15:35:21 +0000 (15:35 +0000)]
Improve the debug parsing to allow flags to be added and subtracted
from the existing set.
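
The idea of adding and subtracting flags relative to an existing set can be
sketched as follows. This Python model is an assumption-laden illustration,
not the driver's parser: the flag names, bit values, the "+name"/"-name"
syntax, and the rule that an unsigned first token replaces the set are all
hypothetical.

```python
# Hypothetical flag names and bit values, for illustration only.
FLAGS = {"info": 0x01, "fault": 0x02, "event": 0x04, "trace": 0x08}


def apply_debug_spec(current, spec):
    """Return the new flag mask after applying a comma-separated spec.

    "+name" adds a flag to the existing set and "-name" removes it.
    Assumed behaviour: if the first token has no sign, the spec replaces
    the existing set instead of modifying it.
    """
    tokens = [t.strip() for t in spec.split(",") if t.strip()]
    mask = current
    for i, tok in enumerate(tokens):
        if tok[0] in "+-":
            op, name = tok[0], tok[1:]
        else:
            op, name = "=", tok
        if name not in FLAGS:
            raise ValueError("unknown debug flag: %r" % name)
        bit = FLAGS[name]
        if op == "+":
            mask |= bit
        elif op == "-":
            mask &= ~bit
        else:
            if i == 0:
                mask = 0  # unsigned first token: start from an empty set
            mask |= bit
    return mask


if __name__ == "__main__":
    # Start with info|fault, add trace, drop fault.
    print(hex(apply_debug_spec(0x03, "+trace,-fault")))
```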

Submitted by: rea@freebsd.org

Andriy Voskoboinyk [Sun, 1 Oct 2017 12:54:40 +0000 (12:54 +0000)]
Mark libifconfig as private library in src.libnames.mk (completes r305700)

Konstantin Belousov [Sun, 1 Oct 2017 11:17:30 +0000 (11:17 +0000)]
Improve smb(4) devfs interactions.

Use make_dev_s(9) to create device, since the device ioctl interface
needs to access si_drv1 to get softc pointer.

Remove the common but non-functional attempt to prevent parallel
accesses by file descriptors by blocking more than one open.  Threads
in one process, forked siblings, and file descriptors passed over unix
domain sockets can all execute parallel requests once one fd is
opened.  Since the ioctl handler uses smbus_request_bus() to take bus
ownership, the correct mechanism already establishes exclusive
access.

Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks

Alexander Motin [Sun, 1 Oct 2017 09:48:31 +0000 (09:48 +0000)]
Add initial support for Address Lookup Table (A-LUT).

When enabled by EEPROM, use it to relax translation address/size alignment
requirements for BAR2 window by 128 or 256 times.

MFC after: 1 week
Sponsored by: iXsystems, Inc.

Martin Matuska [Sun, 1 Oct 2017 00:40:23 +0000 (00:40 +0000)]
MFV r324145,324147:
Sync libarchive with vendor.

Relevant vendor changes:
  PR #905: Support for Zstandard read and write filters
  PR #922: Avoid overflow when reading corrupt cpio archive
  Issue #935: heap-based buffer overflow in xml_data (CVE-2017-14166)
  OSS-Fuzz 2936: Place a limit on the mtree line length
  OSS-Fuzz 2394: Ensure that the ZIP AES extension header is large enough
  OSS-Fuzz 573: Read off-by-one error in RAR archives (CVE-2017-14502)

MFC after: 1 week
Security: CVE-2017-14166, CVE-2017-14502

Mark Johnston [Sat, 30 Sep 2017 23:41:28 +0000 (23:41 +0000)]
Have uiomove_object_page() keep accessed pages in the active queue.

Previously, uiomove_object_page() would maintain LRU by requeuing the
accessed page. This involves acquiring one of the heavily contended page
queue locks. Moreover, it is unnecessarily expensive for pages in the
active queue.

As of r254304 the page daemon continually performs a slow scan of the
active queue, with the effect that unreferenced pages are gradually
moved to the inactive queue, from which they can be reclaimed. Prior to
that revision, the active queue was scanned only during shortages of
free and inactive pages, meaning that unreferenced pages could get
"stuck" in the queue. Thus, tmpfs was required to use the inactive queue
and requeue pages in order to maintain LRU. Now that this is no longer
the case, tmpfs I/O operations can use the active queue and avoid the
page queue locks in most cases, instead setting PGA_REFERENCED on
referenced pages to provide pseudo-LRU.

Reviewed by: alc (previous version)
MFC after: 2 weeks

Martin Matuska [Sat, 30 Sep 2017 23:33:19 +0000 (23:33 +0000)]
Update vendor/libarchive to git 92366744a52f3fa83c3899e375e415a5080a05f2

Relevant vendor changes:
  PR #905: Support for Zstandard read and write filters
  PR #922: Avoid overflow when reading corrupt cpio archive
  Issue #935: heap-based buffer overflow in xml_data (CVE-2017-14166)
  OSS-Fuzz 2936: Place a limit on the mtree line length
  OSS-Fuzz 2394: Ensure that the ZIP AES extension header is large enough
  OSS-Fuzz 573: Read off-by-one error in RAR archives (CVE-2017-14502)

Security: CVE-2017-14166, CVE-2017-14502

Andriy Voskoboinyk [Sat, 30 Sep 2017 21:00:46 +0000 (21:00 +0000)]
uath(4): fix variable types, add missing checks for descriptor / command
header structure fields.

Reported by: hselasky
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D11786

Enji Cooper [Sat, 30 Sep 2017 21:00:08 +0000 (21:00 +0000)]
Adjust r322633 to only apply to libexec/rtld-elf, and not usr.bin/ldd,
when running build32/install32

This unbreaks installing usr.bin/ldd as ldd32 when NO_RTLD is defined.

MFC after:      1 week
MFC with:       r322633

Jung-uk Kim [Sat, 30 Sep 2017 20:28:50 +0000 (20:28 +0000)]
Revert r324109.  This commit broke a number of systems.

Reported by: lwhsu, kib
Requested by: ngie

Mateusz Guzik [Sat, 30 Sep 2017 18:23:45 +0000 (18:23 +0000)]
tmpfs: skip zero-sized page count updates

Such updates constituted the vast majority of modifications, especially
in tmpfs_reg_resize.

For the case where the page count did not change and the size grew, we only
need to update tn_size. Use this fact to avoid vm object lock/relock.

MFC after: 1 week

Andreas Tobler [Sat, 30 Sep 2017 17:51:10 +0000 (17:51 +0000)]
Initialize mdsize to make gcc happy again. This fixes buildworld on powerpc.

Reviewed by: ian@

Alexander Motin [Sat, 30 Sep 2017 13:17:31 +0000 (13:17 +0000)]
Add sysctl/tunable for maximal request time.

MFC after: 1 week

Michael Tuexen [Sat, 30 Sep 2017 12:30:05 +0000 (12:30 +0000)]
Fix reporting of probing size. This bug was introduced in r324119.

MFC after: 4 weeks

Michael Tuexen [Sat, 30 Sep 2017 11:45:33 +0000 (11:45 +0000)]
Add SCTP and TCP as protocols for sending probe packets.

MFC after: 4 weeks

Michael Tuexen [Sat, 30 Sep 2017 11:40:18 +0000 (11:40 +0000)]
* Update function definitions.
* Ensure that the datalen always describes the length after the IPv6
  header consistently, no matter which protocol is used for probes.
* Document that the default length is 20, not 12.
* Don't send information in probe packets which is not needed or
  even checked when the responses are processed.
* Address CID 978587.

This is mainly a cleanup preparing the addition of SCTP and TCP
as possible probe packet protocols.

MFC after: 4 weeks

Jared McNeill [Sat, 30 Sep 2017 10:35:44 +0000 (10:35 +0000)]
Disable/enable CSUM_UDP and CSUM_TCP along with CSUM_IP

Submitted by: guyyur@gmail.com
Differential Revision: https://reviews.freebsd.org/D12536

Jared McNeill [Sat, 30 Sep 2017 10:34:07 +0000 (10:34 +0000)]
Fix if_awg tx dma status reg offsets.

Submitted by: guyyur@gmail.com
Differential Revision: https://reviews.freebsd.org/D12535

Konstantin Belousov [Sat, 30 Sep 2017 10:03:42 +0000 (10:03 +0000)]
Update cpucontrol(8).

Mention new -n flag.
Remove optional -h from the operation list lines, -h would cause the
utility to exit without performing the action.
Explain the default path behavior, list default path.
Correct example of update performed from the non-default path,
it needs -n and the trailing slash is redundant.
Remove useless BUGS section.

Sponsored by: The FreeBSD Foundation
MFC after: 1 week

Konstantin Belousov [Sat, 30 Sep 2017 09:59:32 +0000 (09:59 +0000)]
Allow disabling the default microcode update search path with the new
'-n' option.

Look for updates in the default locations only after user-supplied
locations are tried.

If newer microcode files are put into a non-standard path, both measures
avoid the situation where an older update is loaded from the default
path first, and then a second update is applied from the non-standard
path.  Applying intermediate updates might be undesirable.
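
The resulting lookup order can be modelled with a short Python sketch. It is
not the cpucontrol source: the function name is invented, and the default
directory shown is an assumption standing in for whatever default path the
utility is built with.

```python
# Assumed default data directory; treat this value as a placeholder.
DEFAULT_DATADIR = "/usr/local/share/cpucontrol"


def microcode_search_path(user_dirs, no_default=False):
    """Return directories in the order they would be searched.

    User-supplied directories come first. The default directory is tried
    last, and is left out entirely when no_default is set (modelling -n).
    """
    path = list(user_dirs)
    if not no_default:
        path.append(DEFAULT_DATADIR)
    return path


if __name__ == "__main__":
    print(microcode_search_path(["/tmp/ucode"]))
    print(microcode_search_path(["/tmp/ucode"], no_default=True))
```

With this ordering, a newer file in a user-supplied directory is found
before an older one in the default location.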

Sponsored by: The FreeBSD Foundation
MFC after: 1 week

Alan Somers [Fri, 29 Sep 2017 23:47:23 +0000 (23:47 +0000)]
Fix Makefile entries from r323275

Reported by: Vladimir Zakharov <zakharov.vv@gmail.com>
Reviewed by: ngie
MFC after: 3 weeks
X-MFC-With: 323275

Rick Macklem [Fri, 29 Sep 2017 23:13:01 +0000 (23:13 +0000)]
Add support for Flex File Layout to the pNFS client structures.

This patch modifies the pNFS client layout and deviceinfo structures
to add fields and unions for the Flex File Layout. Until a future
commit adds Flex File layout support, these new fields are not used.
This patch should not affect the "pnfs" option for File Layout.

Jung-uk Kim [Fri, 29 Sep 2017 23:02:49 +0000 (23:02 +0000)]
Merge ACPICA 20170929.

Ian Lepore [Fri, 29 Sep 2017 22:21:42 +0000 (22:21 +0000)]
Remove spurious $flags; it's a paste-o from copying the line from rc.subr.
Also, add a comment documenting the args passed to mount_md().

Ian Lepore [Fri, 29 Sep 2017 22:13:26 +0000 (22:13 +0000)]
Enhance mdmfs(8) to work with tmpfs(5).

Existing scripts and associated config such as rc.initdiskless, rc.d/var,
and others, use mdmfs to create memory filesystems. That program accepts a
size argument which allows SI suffixes and treats an unsuffixed number as a
count of 512 byte sectors. That makes it difficult to convert existing
scripts to use tmpfs instead of mdmfs, because tmpfs treats unsuffixed
numbers as a count of bytes. The script logic to deal with existing user
config that might include suffixed and unsuffixed numbers is... unpleasant.

Also, there is no guarantee that tmpfs will be available. It is sometimes
configured out of small-resource embedded systems to save memory and flash
storage space.

These changes enhance mdmfs(8) so that it accepts two new values for the
'md-device' arg: 'tmpfs' and 'auto'. With tmpfs, the program always uses
tmpfs(5) (and fails if it's not available). With 'auto' the program prefers
tmpfs, but falls back to using md(4) if tmpfs isn't available. It also
handles the -s <size> argument so that the mdconfig interpretation of
unsuffixed numbers applies when tmpfs is used as well, so that existing user
config keeps working after a switch to tmpfs.
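
The size conversion this implies can be sketched in Python. This is an
illustration of the rule stated above (an unsuffixed number counts 512-byte
sectors), not the mdmfs implementation; the suffix table and its binary
multipliers are assumptions, and the function name is made up.

```python
SECTOR_SIZE = 512  # an unsuffixed size counts 512-byte sectors

# Assumed suffix table; the multipliers are taken to be binary multiples.
SUFFIXES = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3, "t": 1024 ** 4}


def md_size_to_bytes(size):
    """Convert an md-style size string to a byte count usable for tmpfs."""
    s = size.strip().lower()
    if not s:
        raise ValueError("empty size")
    if s[-1] in SUFFIXES:
        return int(s[:-1]) * SUFFIXES[s[-1]]
    return int(s) * SECTOR_SIZE


if __name__ == "__main__":
    # "2048" means 2048 sectors, i.e. 1 MiB, not 2048 bytes.
    print(md_size_to_bytes("2048"))
    print(md_size_to_bytes("32m"))
```

Passing the converted byte count to tmpfs keeps an existing setting such as
"2048" meaning the same amount of memory under either backend.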

A new rc setting, mfs_type, is added to etc/defaults/rc.conf to let users
force the use of tmpfs or md; the default value is "auto".

Differential Revision: https://reviews.freebsd.org/D12301

Conrad Meyer [Fri, 29 Sep 2017 19:56:09 +0000 (19:56 +0000)]
aesni(4): Fix GCC build

The GCC xmmintrin.h header brokenly includes mm_malloc.h unconditionally.
(The Clang version of xmmintrin.h only includes mm_malloc.h if not compiling
in standalone mode.)

Hack around GCC's broken header by defining the include guard macro ahead of
including xmmintrin.h.

Reported by: lwhsu, jhb
Tested by: lwhsu
Sponsored by: Dell EMC Isilon

Jung-uk Kim [Fri, 29 Sep 2017 17:08:30 +0000 (17:08 +0000)]
Import ACPICA 20170929.

Bryan Drewery [Fri, 29 Sep 2017 16:30:50 +0000 (16:30 +0000)]
__setrunelocale: Fix asprintf(3) failure not returning an error.

Also fix the style of the asprintf(3) call in __collate_load_tables_l().
Both of these lines were modified away from snprintf(3) during the
import from DragonFly/Illumos.

Reviewed by: jilles (briefly over shoulder)
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon

Conrad Meyer [Fri, 29 Sep 2017 15:53:26 +0000 (15:53 +0000)]
netsmb: Fix buggy/racy smb_strdupin()

smb_strdupin() tried to roll a copyin() based strlen to allocate a buffer
and then blindly copyin that size.  Of course, a malicious user program
could simultaneously manipulate the buffer, resulting in a non-terminated
string being copied.

Later assumptions in the code rely upon the string being nul-terminated.

Just use copyinstr() and drop the racy sizing.
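
The race can be illustrated with a small user-space simulation in Python.
Nothing here is kernel code: user memory is modelled as a bytearray, the
concurrent writer is modelled as a callback that runs between the two
passes, and both function names are invented for the example.

```python
def racy_strdupin(user_mem, mutate_between_passes=None):
    """Two-pass copy: measure the string first, then copy that many bytes."""
    # Pass 1: find the terminating NUL to size the buffer.
    length = user_mem.index(0)
    # A concurrent writer may change the memory before the second pass.
    if mutate_between_passes is not None:
        mutate_between_passes(user_mem)
    # Pass 2: blindly copy length + 1 bytes without rechecking for NUL.
    return bytes(user_mem[:length + 1])


def copyinstr_like(user_mem, maxlen):
    """Single-pass copy that stops at the NUL it actually copied."""
    out = bytearray()
    for b in user_mem[:maxlen]:
        out.append(b)
        if b == 0:
            return bytes(out)
    raise OSError("no NUL terminator within maxlen bytes")


if __name__ == "__main__":
    mem = bytearray(b"abc\0")

    def overwrite_terminator(m):
        m[3] = ord("X")  # the writer replaces the NUL after it was measured

    print(racy_strdupin(mem, overwrite_terminator))      # not NUL-terminated
    print(copyinstr_like(bytearray(b"abc\0zzz"), 16))    # always terminated
```

Because the single-pass version checks each byte as it is copied, its result
is terminated regardless of what the writer does, or the copy fails.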

PR: 222687
Reported by: Meng Xu <meng.xu AT gatech.edu>
Security: possible local DoS
Sponsored by: Dell EMC Isilon

Baptiste Daroussin [Fri, 29 Sep 2017 07:44:48 +0000 (07:44 +0000)]
man(1): silence the output of mandoc when testing

This reduces the spam a user may face when mandoc tries to
figure out whether it can render a manpage or must fall back on groff(1)

Reported by: bdrewery
MFC after: 3 days

Wojciech Macek [Fri, 29 Sep 2017 06:36:19 +0000 (06:36 +0000)]
Compile loader as Little-Endian on PPC64/POWER8

  Add flag to the makefile to allow loader compilation as
  Little-Endian 32-bit executable.
  Usage:

  make WITH_LOADER_FORCE_LE=yes -C sys/boot all

Submitted by:          Wojciech Macek <wma@freebsd.org>
Reviewed by:           imp, nwhitehorn
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
Differential revision: https://reviews.freebsd.org/D12421

Andrey V. Elsukov [Fri, 29 Sep 2017 06:24:45 +0000 (06:24 +0000)]
Some mbuf related fixes in icmp_error()

* check mbuf length before doing mtod() and accessing the IP header;
* update oip pointer and all depending pointers after m_pullup();
* remove extra checks and extra parentheses, wrap long lines;

PR: 222670
Reported by: Prabhakar Lakhera
MFC after: 1 week

Scott Long [Fri, 29 Sep 2017 04:52:15 +0000 (04:52 +0000)]
Convert sysctl sbuf usage to use a fully dynamic sbuf.  This isn't strictly
needed, but it silences an erroneous Coverity warning and makes the code a
little more logically consistent.  Also mark the sysctl as MPSAFE.

Sponsored by: Netflix

Kevin Lo [Fri, 29 Sep 2017 01:19:22 +0000 (01:19 +0000)]
Add ThinkPad USB 3.0 Ethernet Adapter.

Submitted by: jh

Rick Macklem [Thu, 28 Sep 2017 23:05:08 +0000 (23:05 +0000)]
Add the NFS client state flag that enables Flexible File Layout.

This patch adds a NFSSTA_FLEXFILE flag that will be used to enable
Flexible File Layout for the NFSv4.1 pNFS client. It is not yet
used, but will be after a future commit adds Flex File Layout support.

6 years agoChange nfsv4_getipaddr() and nfsrpc_fillsa() to not use sockaddr_storage.
Rick Macklem [Thu, 28 Sep 2017 22:33:01 +0000 (22:33 +0000)]
Change nfsv4_getipaddr() and nfsrpc_fillsa() to not use sockaddr_storage.

This patch changes nfsv4_getipaddr() and nfsrpc_fillsa() to use
a sockaddr_in * and sockaddr_in6 * instead of sockaddr_storage, to
avoid allocating the latter on the stack. It also moves the nfsrpc_fillsa()
call to after the completion of parsing of the DeviceInfo reply from
the server. This patch is in preparation for addition of Flex File
Layout support in a future commit.
It only affects the "pnfs" NFSv4.1 client mount option and should not
have changed its semantics.

6 years agoMake this compile if NO_SYSCTL_DESCR is defined.
Nick Hibma [Thu, 28 Sep 2017 19:57:46 +0000 (19:57 +0000)]
Make this compile if NO_SYSCTL_DESCR is defined.

Defining a variable with the description and then only using it in the
SYSCTL declaration led to an unused variable warning, because the SYSCTL
macro discards the passed value via __DESCR when NO_SYSCTL_DESCR is defined.

6 years agoMake this compile with DEVICE_POLLING set.
Nick Hibma [Thu, 28 Sep 2017 19:33:36 +0000 (19:33 +0000)]
Make this compile with DEVICE_POLLING set.

smc_poll had the wrong prototype. It returns 0 as it does not check
anything but submits a taskqueue.

Reviewed by: benno
MFC after: 2 weeks

6 years agoOptimize vm_object_page_remove() by eliminating pointless calls to
Alan Cox [Thu, 28 Sep 2017 17:55:41 +0000 (17:55 +0000)]
Optimize vm_object_page_remove() by eliminating pointless calls to
pmap_remove_all().  If the object to which a page belongs has no
references, then that page cannot possibly be mapped.

Reviewed by: kib
MFC after: 1 week

6 years agoLike ZFS, disable cache flush after the first ENOTSUP error.
Alexander Motin [Thu, 28 Sep 2017 15:58:41 +0000 (15:58 +0000)]
Like ZFS, disable cache flush after the first ENOTSUP error.

MFC after: 1 week

6 years agoTypo in filename in comment.
Nick Hibma [Thu, 28 Sep 2017 12:43:25 +0000 (12:43 +0000)]
Typo in filename in comment.

6 years agoCorrection after r323873: #include <sys/lock.h> in addition to <sys/rmlock.h>
Eugene Grosbein [Thu, 28 Sep 2017 11:26:37 +0000 (11:26 +0000)]
Correction after r323873: #include <sys/lock.h> in addition to <sys/rmlock.h>

PR: 220076
Approved by: mav (mentor)
MFC after: 3 days

6 years agoA different fix for the issue from r323722.
Konstantin Belousov [Thu, 28 Sep 2017 09:01:28 +0000 (09:01 +0000)]
A different fix for the issue from r323722.

Split the handlers for pop of invalid selectors from the trap frame
into usermode and kernel variants.  Usermode handler is kept as is, it
restores the already loaded parts of the trap frame and jumps to set
up a signal delivery to the user process.

The new kernel part of the handler emulates IRET's treatment of segments
that would violate access rights.  It loads the NULL selector into the
segment register whose load caused the fault, and then continues the
return to the interrupted kernel code.  Since invalid selectors in the
segment registers in kernel mode can only exist while the kernel is still
entering or exiting from userspace, we only zero invalid userspace
selectors.  If userspace tries to use the segment register, it gets a
signal, as if the processor segment descriptor cache was reloaded.

Reported by: Maxime Villard <max@m00nbsd.net>
Suggested and reviewed by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

6 years agoRestore a part of r323722.
Konstantin Belousov [Thu, 28 Sep 2017 08:46:15 +0000 (08:46 +0000)]
Restore a part of r323722.

Do not return from interrupt using the POP_FRAME;iret instruction
sequence, always jump to doreti.

The user segment selectors saved on the stack might become invalid
because userspace manipulated the LDT in a parallel thread.  trap() is
aware of this issue, but it is only prepared to handle it at the iret and
segment register load operations in the doreti path.

Also remove POP_FRAME macro because it is no longer used.

Reviewed by: bde, jhb (as part of r323722)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

6 years agoRevert r323722. A better fix will be committed shortly, as well as
Konstantin Belousov [Thu, 28 Sep 2017 08:38:24 +0000 (08:38 +0000)]
Revert r323722.  A better fix will be committed shortly, as well as
some still useful bits of the reverted revision.

The problem with the committed fix is that there are still issues with
returning from an NMI when the NMI interrupted the kernel at a moment when
the kernel segment selectors were not yet loaded into the registers.  If
this happens, the NMI return would lose the userspace selectors
because r323722 does not reload segment registers on return to kernel
mode.

Fixing the problem is complicated.  Since an alternative approach to
handling the original bug exists, it makes sense to stop adding more
complexity.

Discussed with: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

6 years agohyperv/hn: Unbreak i386 building.
Sepherosa Ziehau [Thu, 28 Sep 2017 07:02:56 +0000 (07:02 +0000)]
hyperv/hn: Unbreak i386 building.

Reported by: cy
MFC after: 1 week
Sponsored by: Microsoft

6 years agoTweak performance of nda completions
Warner Losh [Thu, 28 Sep 2017 01:27:00 +0000 (01:27 +0000)]
Tweak performance of nda completions

Use xpt_done_direct in preference to xpt_done when completing a
successful I/O. Continue to use xpt_done when there's an error, or for
completion of the submission of a CCB. This eliminates a context
switch to the cam_doneq thread.

Sponsored by: Netflix
Suggested by: scottl@

6 years agoFix a memory leak that occurred in the pNFS client.
Rick Macklem [Wed, 27 Sep 2017 23:23:41 +0000 (23:23 +0000)]
Fix a memory leak that occurred in the pNFS client.

When a "pnfs" NFSv4.1 mount was unmounted, it didn't free up the layouts
and deviceinfo structures. This leak only affects "pnfs" mounts and only
when the mount is unmounted.
Found while testing the pNFS Flexible File layout client code.

MFC after: 2 weeks

6 years agoUse UMA_ALIGNOF() for name cache UMA zones.
John Baldwin [Wed, 27 Sep 2017 23:18:57 +0000 (23:18 +0000)]
Use UMA_ALIGNOF() for name cache UMA zones.

This fixes kernel crashes due to misaligned accesses to the 64-bit
time_t embedded in struct namecache_ts in MIPS n32 kernels.

MFC after: 1 week
Sponsored by: DARPA / AFRL

6 years agoAdd UMA_ALIGNOF().
John Baldwin [Wed, 27 Sep 2017 23:15:33 +0000 (23:15 +0000)]
Add UMA_ALIGNOF().

This is a wrapper around _Alignof() that sets the alignment for a zone
to the alignment required by a given type.  This allows the compiler to
determine the proper alignment rather than having the programmer try to
guess.

Discussed on: arch@
MFC after: 1 week
Sponsored by: DARPA / AFRL

6 years agobhnd: Add support for supplying bus I/O callbacks when initializing an EROM
Landon J. Fuller [Wed, 27 Sep 2017 19:48:34 +0000 (19:48 +0000)]
bhnd: Add support for supplying bus I/O callbacks when initializing an EROM
parser.

This allows us to use the EROM parser API in cases where the standard bus
space I/O APIs are unsuitable. In particular, this will allow us to parse
the device enumeration table directly from bhndb(4) drivers, prior to
full attach and configuration of the bridge.

Approved by: adrian (mentor)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12510

6 years agobhnd: Implement bhnd(4) platform device registration.
Landon J. Fuller [Wed, 27 Sep 2017 19:44:23 +0000 (19:44 +0000)]
bhnd: Implement bhnd(4) platform device registration.

Add bhnd(4) API for explicitly registering BHND platform devices (ChipCommon,
PMU, NVRAM, etc) with the bus, rather than walking the newbus hierarchy to
discover platform devices. These devices are now also refcounted; attempting
to deregister an actively used platform device will return EBUSY.

This resolves a lock ordering incompatibility with bwn(4)'s firmware loading
threads; previously it was necessary to acquire Giant to protect newbus access
when locating and querying the NVRAM device.

Approved by: adrian (mentor)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12392

6 years agoSince the human readable name is actually ignored, and not matching a
Warner Losh [Wed, 27 Sep 2017 19:22:10 +0000 (19:22 +0000)]
Since the human readable name is actually ignored, and not matching a
'human' pnp string, change it to #, the name reserved for fields that
are ignored.

6 years agoImprove description of the PNP string a bit.
Warner Losh [Wed, 27 Sep 2017 19:21:52 +0000 (19:21 +0000)]
Improve description of the PNP string a bit.

6 years agoUnrevert r324059
Conrad Meyer [Wed, 27 Sep 2017 19:14:00 +0000 (19:14 +0000)]
Unrevert r324059

With a colon and bogus name ("#") added to appease the simplistic parser
used in kldxref.

Sponsored by: Dell EMC Isilon

6 years agoUse C99 initializers for DTrace provider methods.
Mark Johnston [Wed, 27 Sep 2017 17:46:38 +0000 (17:46 +0000)]
Use C99 initializers for DTrace provider methods.

This makes the definitions easier to read and more cscope-friendly.

MFC after: 1 week

6 years agoTx Ring Shadow Consumer Index Register needs to be cleared prior
David C Somayajulu [Wed, 27 Sep 2017 17:46:11 +0000 (17:46 +0000)]
Tx Ring Shadow Consumer Index Register needs to be cleared prior
to passing its physical address to the FW during Tx Create Context.

MFC after: 3 days

6 years agoAdd check to avoid raw inode iblocks fields overflow in case of huge_file feature.
Fedor Uporov [Wed, 27 Sep 2017 16:12:13 +0000 (16:12 +0000)]
Add check to avoid raw inode iblocks fields overflow in case of huge_file feature.
Use the Linux logic for now.

Reviewed by:    pfg (mentor)
Approved by:    pfg (mentor)
MFC after:      2 weeks
Differential Revision: https://reviews.freebsd.org/D12131

6 years agoRemove PNP metadata from drm2 drivers until kldxref problem is resolved
Conrad Meyer [Wed, 27 Sep 2017 14:59:18 +0000 (14:59 +0000)]
Remove PNP metadata from drm2 drivers until kldxref problem is resolved

Reported by: np
Sponsored by: Dell EMC Isilon

6 years agoRemove unused function.
Michael Tuexen [Wed, 27 Sep 2017 13:05:23 +0000 (13:05 +0000)]
Remove unused function.

MFC after: 1 week

6 years agovfs_export: Simplify vfs_export_lookup
Emmanuel Vadot [Wed, 27 Sep 2017 09:39:16 +0000 (09:39 +0000)]
vfs_export: Simplify vfs_export_lookup

If the filesystem is not exported, return NULL directly.
If no address is given and the filesystem is exported with a default
export, return that export directly; if it has no default export,
return NULL directly.

Reviewed by: kib, bapt
MFC after: 1 week
Sponsored by: Gandi.net
Differential Revision: https://reviews.freebsd.org/D12505

6 years agokernel: Bump __FreeBSD_version for the removal of M_HASHTYPE_RSS_UDP_IPV4_EX
Sepherosa Ziehau [Wed, 27 Sep 2017 06:33:55 +0000 (06:33 +0000)]
kernel: Bump __FreeBSD_version for the removal of M_HASHTYPE_RSS_UDP_IPV4_EX

Sponsored by: Microsoft

6 years agombuf: Remove UDP_IPV4_EX, which was never defined.
Sepherosa Ziehau [Wed, 27 Sep 2017 06:31:35 +0000 (06:31 +0000)]
mbuf: Remove UDP_IPV4_EX, which was never defined.

Add a comment to explain the IPV6_EX suffix.  The confusion about
these RSS hash types probably stems from the fact that they were
never widely implemented in hardware.

Reviewed by: rwatson
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D12453

6 years agoixl: Fix mbuf hash type settings.
Sepherosa Ziehau [Wed, 27 Sep 2017 05:59:54 +0000 (05:59 +0000)]
ixl: Fix mbuf hash type settings.

IPV6_EXs in RSS never mean fragment.  They mean:
"- Home address from the home address option in the IPv6 destination
   options header.  If the extension header is not present, use the
   Source IPv6 Address.
 - IPv6 address that is contained in the Routing-Header-Type-2 from
   the associated extension header.  If the extension header is not
   present, use the Destination IPv6 Address."

UDP_IPV4_EX is an invalid RSS hash type, which will be removed.

Quoted from:
https://docs.microsoft.com/en-us/windows-hardware/drivers/network/rss-hashing-types#ndishashipv6ex

Reviewed by: erj
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D12450

6 years agotcp: Don't "negotiate" MSS.
Sepherosa Ziehau [Wed, 27 Sep 2017 05:52:37 +0000 (05:52 +0000)]
tcp: Don't "negotiate" MSS.

_NO_ OSes actually "negotiate" MSS.

RFC 879:
"... This Maximum Segment Size (MSS) announcement (often mistakenly
called a negotiation) ..."

This negotiation behaviour was introduced 11 years ago by r159955
without any explanation about why FreeBSD had to "negotiate" MSS:

    In syncache_respond() do not reply with a MSS that is larger than what
    the peer announced to us but make it at least tcp_minmss in size.

    Sponsored by:   TCP/IP Optimization Fundraise 2005

The tcp_minmss behaviour is still kept.

Syncookie fix was prodded by tuexen, who also helped to test this
patch w/ packetdrill.

Reviewed by: tuexen, karels, bz (previous version)
MFC after: 2 weeks
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D12430

6 years agohyperv/hn: Fix UDP checksum offload issue in Azure.
Sepherosa Ziehau [Wed, 27 Sep 2017 05:44:50 +0000 (05:44 +0000)]
hyperv/hn: Fix UDP checksum offload issue in Azure.

UDP checksum offload does not work in Azure if both of the following
conditions are met:
- sizeof(IP hdr + UDP hdr + payload) > 1420.
- IP_DF is not set in IP hdr.

Use software checksum for UDP datagrams falling into this category.

Add two tunables to disable UDP/IPv4 and UDP/IPv6 checksum offload, in
case something unexpected happens.

MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D12429

6 years agohyperv/hn: Set tcp header offset for CSUM/LSO offloading.
Sepherosa Ziehau [Wed, 27 Sep 2017 04:42:40 +0000 (04:42 +0000)]
hyperv/hn: Set tcp header offset for CSUM/LSO offloading.

No observable effect; better safe than sorry.

MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D12417

6 years agosysctl: remove target buffer read/write checks prior to calling the handler
Mateusz Guzik [Wed, 27 Sep 2017 01:31:52 +0000 (01:31 +0000)]
sysctl: remove target buffer read/write checks prior to calling the handler

Said checks were inherently racy anyway as jokers could unmap target areas
before the handler got around to accessing them.

This saves time by avoiding locking the address space.

MFC after: 1 week

6 years agoAnnotate sysctlmemlock with __exclusive_cache_line.
Mateusz Guzik [Wed, 27 Sep 2017 01:27:43 +0000 (01:27 +0000)]
Annotate sysctlmemlock with __exclusive_cache_line.

MFC after: 1 week

6 years agoRemove manpage entries about crshared(9)
Mateusz Guzik [Wed, 27 Sep 2017 01:12:47 +0000 (01:12 +0000)]
Remove manpage entries about crshared(9)

The function itself was removed years ago in r272546

Submitted by: Paulm <paulm tetrardus.net>
MFC after: 2 weeks

6 years agoWhack procctl(8)
Mateusz Guzik [Wed, 27 Sep 2017 01:03:00 +0000 (01:03 +0000)]
Whack procctl(8)

It was supposed to provide a recovery mechanism against bugs in procfs's
long deprecated tracing capabilities.

Remove the tool as a prerequisite to axing the kernel side.

The tracing facility to use is ptrace(2).

MFC after: 2 weeks

6 years agomtx: drop the tid argument from _mtx_lock_sleep
Mateusz Guzik [Wed, 27 Sep 2017 00:57:05 +0000 (00:57 +0000)]
mtx: drop the tid argument from _mtx_lock_sleep

tid must be equal to curthread and the target routine was already reading
it anyway, which is not a problem. Not passing it as a parameter allows for
a little bit shorter code in callers.

MFC after: 1 week

6 years agoAdd major and minor version arguments to nfscl_reqstart().
Rick Macklem [Tue, 26 Sep 2017 23:42:44 +0000 (23:42 +0000)]
Add major and minor version arguments to nfscl_reqstart().

This patch adds "vers" and "minorvers" arguments to nfscl_reqstart().
The patch always passes them in as "0" and that implies no change
in semantics. These arguments will be used by a future commit that
adds support for the Flexible File Layout.

6 years agoDon't defer wakeup()s for completed journal workitems.
John Baldwin [Tue, 26 Sep 2017 23:24:15 +0000 (23:24 +0000)]
Don't defer wakeup()s for completed journal workitems.

Normally wakeup()s are performed for completed softupdates work items
in workitem_free() before the underlying memory is free()'d.
complete_jseg() was clearing the "wakeup needed" flag in work items to
defer the wakeup until the end of each loop iteration.  However, this
resulted in the item being free'd before its address was used with
wakeup().  As a result, another part of the kernel could allocate this
memory from malloc() and use it as a wait channel for a different
"event" with a different lock.  This triggered an assertion failure
when the lock passed to sleepq_add() did not match the existing lock
associated with the sleep queue.  Fix this by removing the code to
defer the wakeup in complete_jseg() allowing the wakeup to occur
slightly earlier in workitem_free() before free() is called.

The main reason I can think of for deferring a wakeup() would be to
avoid waking up a waiter while holding a lock that the waiter would
need.  However, no locks are dropped in between the wakeup() in
workitem_free() and the end of the loop in complete_jseg() as far as I
can tell.

In general I think it is not safe to do a wakeup() after free() as one
cannot control how other parts of the kernel that might reuse the
address for a different wait channel will handle spurious wakeups.

Reported by: pho
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D12494

6 years agoAdd PNP metadata to more drivers
Conrad Meyer [Tue, 26 Sep 2017 23:23:58 +0000 (23:23 +0000)]
Add PNP metadata to more drivers

GPUs: radeonkms, i915kms
NICs: if_em, if_igb, if_bnxt

This metadata isn't used yet, but it will be handy to have later to
implement automatic module loading.

Reviewed by: imp, mmacy
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12488

6 years agoaesni(4): Add support for x86 SHA intrinsics
Conrad Meyer [Tue, 26 Sep 2017 23:12:32 +0000 (23:12 +0000)]
aesni(4): Add support for x86 SHA intrinsics

Some x86 class CPUs have accelerated intrinsics for SHA1 and SHA256.
Provide this functionality on CPUs that support it.

This implements CRYPTO_SHA1, CRYPTO_SHA1_HMAC, and CRYPTO_SHA2_256_HMAC.

Correctness: The cryptotest.py suite in tests/sys/opencrypto has been
enhanced to verify SHA1 and SHA256 HMAC using standard NIST test vectors.
The test passes on this driver.  Additionally, jhb's cryptocheck tool has
been used to compare various random inputs against OpenSSL.  This test also
passes.

Rough performance averages on AMD Ryzen 1950X (4kB buffer):
aesni:      SHA1: ~8300 Mb/s    SHA256: ~8000 Mb/s
cryptosoft: SHA1: ~1800 Mb/s    SHA256: ~1800 Mb/s

So ~4.4-4.6x speedup depending on algorithm choice.  This is consistent with
the results the Linux folks saw for 4kB buffers.

The driver borrows SHA update code from sys/crypto sha1 and sha256.  The
intrinsic step function comes from Intel under a 3-clause BSDL.[0]  The
intel_sha_extensions_sha<foo>_intrinsic.c files were renamed and lightly
modified (added const, resolved a warning or two; included the sha_sse
header to declare the functions).

[0]: https://software.intel.com/en-us/articles/intel-sha-extensions-implementations

Reviewed by: jhb
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12452

6 years agoFix regression from r323855. The EXIT trap now isn't cleared, so upon
Gleb Smirnoff [Tue, 26 Sep 2017 21:54:19 +0000 (21:54 +0000)]
Fix regression from r323855.  The EXIT trap now isn't cleared, so upon
exit it tried to unmount an already unmounted partition, resulting in failure.

6 years agoFix deletion of all multicast addresses
David C Somayajulu [Tue, 26 Sep 2017 20:53:25 +0000 (20:53 +0000)]
Fix deletion of all multicast addresses

Submitted by: Anand.Khoje@cavium.com
MFC after: 5 days

6 years agoa10_gpio: Enable all needed clocks
Emmanuel Vadot [Tue, 26 Sep 2017 20:23:09 +0000 (20:23 +0000)]
a10_gpio: Enable all needed clocks

Do not enable only the first clock, enable them all.

6 years agoa10_ehci: Enable all clocks and reset
Emmanuel Vadot [Tue, 26 Sep 2017 19:21:43 +0000 (19:21 +0000)]
a10_ehci: Enable all clocks and reset

a10_ehci can have multiple clocks and resets; enable them all instead of
only the first one.

6 years agoaw_usbphy: Only reroute OTG for phy0
Emmanuel Vadot [Tue, 26 Sep 2017 19:20:50 +0000 (19:20 +0000)]
aw_usbphy: Only reroute OTG for phy0

We only need to route the OTG port to host mode on phy0, and only if no
VBUS is present on the port; otherwise leave the port in peripheral mode.