FreeBSD/FreeBSD.git
6 years agoMake this compile with DEVICE_POLLING set.
n_hibma [Thu, 28 Sep 2017 19:33:36 +0000 (19:33 +0000)]
Make this compile with DEVICE_POLLING set.

smc_poll had the wrong prototype. It returns 0 as it does not check
anything but submits a taskqueue.

Reviewed by: benno
MFC after: 2 weeks

6 years agoOptimize vm_object_page_remove() by eliminating pointless calls to
alc [Thu, 28 Sep 2017 17:55:41 +0000 (17:55 +0000)]
Optimize vm_object_page_remove() by eliminating pointless calls to
pmap_remove_all().  If the object to which a page belongs has no
references, then that page cannot possibly be mapped.

Reviewed by: kib
MFC after: 1 week

6 years agoLike ZFS, disable cache flush after the first ENOTSUP error.
mav [Thu, 28 Sep 2017 15:58:41 +0000 (15:58 +0000)]
Like ZFS, disable cache flush after the first ENOTSUP error.

MFC after: 1 week

6 years agoTypo in filename in comment.
n_hibma [Thu, 28 Sep 2017 12:43:25 +0000 (12:43 +0000)]
Typo in filename in comment.

6 years agoCorrection after r323873: #include <sys/lock.h> in addition to <sys/rmlock.h>
eugen [Thu, 28 Sep 2017 11:26:37 +0000 (11:26 +0000)]
Correction after r323873: #include <sys/lock.h> in addition to <sys/rmlock.h>

PR: 220076
Approved by: mav (mentor)
MFC after: 3 days

6 years agoA different fix for the issue from r323722.
kib [Thu, 28 Sep 2017 09:01:28 +0000 (09:01 +0000)]
A different fix for the issue from r323722.

Split the handlers for pop of invalid selectors from the trap frame
into usermode and kernel variants.  The usermode handler is kept as is:
it restores the already loaded parts of the trap frame and jumps to set
up a signal delivery to the user process.

The new kernel part of the handler emulates the IRET treatment of
segments which would violate access rights.  It loads the NULL selector
into the segment register whose load caused the fault, and then
continues the return to the interrupted kernel code.  Since invalid
selectors in the segment registers in kernel mode can only exist while
the kernel is still entering or exiting from userspace, we only zero
invalid userspace selectors.  If userspace tries to use the segment
register, it gets a signal, as if the processor segment descriptor
cache was reloaded.

Reported by: Maxime Villard <max@m00nbsd.net>
Suggested and reviewed by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

6 years agoRestore a part of r323722.
kib [Thu, 28 Sep 2017 08:46:15 +0000 (08:46 +0000)]
Restore a part of r323722.

Do not return from interrupt using the POP_FRAME;iret instruction
sequence, always jump to doreti.

The user segment selectors saved on the stack might become invalid
because userspace manipulated the LDT in a parallel thread.  trap() is
aware of this issue, but it is only prepared to handle it at the iret
and segment register load operations in the doreti path.

Also remove POP_FRAME macro because it is no longer used.

Reviewed by: bde, jhb (as part of r323722)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

6 years agoRevert r323722. A better fix will be committed shortly, as well as
kib [Thu, 28 Sep 2017 08:38:24 +0000 (08:38 +0000)]
Revert r323722.  A better fix will be committed shortly, as well as
some still useful bits of the reverted revision.

The problem with the committed fix is that there are still issues with
returning from NMI, when the NMI interrupted the kernel at a moment when
the kernel segment selectors were not yet loaded into the registers.  If
this happens, the NMI return would lose the userspace selectors
because r323722 does not reload segment registers on return to kernel
mode.

Fixing the problem is complicated.  Since an alternative approach to
handling the original bug exists, it makes sense to stop adding more
complexity.

Discussed with: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week

6 years agohyperv/hn: Unbreak i386 building.
sephe [Thu, 28 Sep 2017 07:02:56 +0000 (07:02 +0000)]
hyperv/hn: Unbreak i386 building.

Reported by: cy
MFC after: 1 week
Sponsored by: Microsoft

6 years agoTweak performance of nda completions
imp [Thu, 28 Sep 2017 01:27:00 +0000 (01:27 +0000)]
Tweak performance of nda completions

Use xpt_done_direct in preference to xpt_done when completing a
successful I/O. Continue to use xpt_done when there's an error, or for
completion of the submission of a CCB. This eliminates a context
switch to the cam_doneq thread.

Sponsored by: Netflix
Suggested by: scottl@

6 years agoFix a memory leak that occurred in the pNFS client.
rmacklem [Wed, 27 Sep 2017 23:23:41 +0000 (23:23 +0000)]
Fix a memory leak that occurred in the pNFS client.

When a "pnfs" NFSv4.1 mount was unmounted, it didn't free up the layouts
and deviceinfo structures. This leak only affects "pnfs" mounts and only
when the mount is unmounted.
Found while testing the pNFS Flexible File layout client code.

MFC after: 2 weeks

6 years agoUse UMA_ALIGNOF() for name cache UMA zones.
jhb [Wed, 27 Sep 2017 23:18:57 +0000 (23:18 +0000)]
Use UMA_ALIGNOF() for name cache UMA zones.

This fixes kernel crashes due to misaligned accesses to the 64-bit
time_t embedded in struct namecache_ts in MIPS n32 kernels.

MFC after: 1 week
Sponsored by: DARPA / AFRL

6 years agoAdd UMA_ALIGNOF().
jhb [Wed, 27 Sep 2017 23:15:33 +0000 (23:15 +0000)]
Add UMA_ALIGNOF().

This is a wrapper around _Alignof() that sets the alignment for a zone
to the alignment required by a given type.  This allows the compiler to
determine the proper alignment rather than having the programmer try to
guess.

Discussed on: arch@
MFC after: 1 week
Sponsored by: DARPA / AFRL
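As an illustration of the idea, here is a minimal userland sketch, assuming the zone alignment is expressed as a mask (alignment minus one).  ALIGN_MASK_OF, struct ts_entry and is_aligned() are hypothetical names made up for this example; they are not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for UMA_ALIGNOF(): a type's alignment as a mask. */
#define ALIGN_MASK_OF(type)	(_Alignof(type) - 1)

/* Illustrative record with an embedded 64-bit timestamp. */
struct ts_entry {
	int32_t	flags;
	int64_t	stamp;		/* stands in for a 64-bit time_t */
};

/* Return nonzero if addr satisfies the given alignment mask. */
static int
is_aligned(uintptr_t addr, size_t mask)
{
	/* A valid mask is a power of two minus one. */
	assert((mask & (mask + 1)) == 0);
	return ((addr & mask) == 0);
}
```

The compiler derives the mask from the widest member of the record, so a later change to the structure layout does not require anyone to revisit a hand-picked alignment constant.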

6 years agobhnd: Add support for supplying bus I/O callbacks when initializing an EROM
landonf [Wed, 27 Sep 2017 19:48:34 +0000 (19:48 +0000)]
bhnd: Add support for supplying bus I/O callbacks when initializing an EROM
parser.

This allows us to use the EROM parser API in cases where the standard bus
space I/O APIs are unsuitable. In particular, this will allow us to parse
the device enumeration table directly from bhndb(4) drivers, prior to
full attach and configuration of the bridge.

Approved by: adrian (mentor)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12510

6 years agobhnd: Implement bhnd(4) platform device registration.
landonf [Wed, 27 Sep 2017 19:44:23 +0000 (19:44 +0000)]
bhnd: Implement bhnd(4) platform device registration.

Add bhnd(4) API for explicitly registering BHND platform devices (ChipCommon,
PMU, NVRAM, etc) with the bus, rather than walking the newbus hierarchy to
discover platform devices. These devices are now also refcounted; attempting
to deregister an actively used platform device will return EBUSY.

This resolves a lock ordering incompatibility with bwn(4)'s firmware loading
threads; previously it was necessary to acquire Giant to protect newbus access
when locating and querying the NVRAM device.

Approved by: adrian (mentor)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12392
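The refcounting rule described above can be sketched as follows.  This is a simplified, single-threaded illustration; struct platform_dev and the pdev_*() helpers are hypothetical and are not the bhnd(4) API, and a real implementation would need locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical registry entry; not the bhnd(4) data structure. */
struct platform_dev {
	const char	*name;
	unsigned int	 refs;		/* number of active users */
	int		 registered;
};

static void
pdev_register(struct platform_dev *pd, const char *name)
{
	pd->name = name;
	pd->refs = 0;
	pd->registered = 1;
}

/* Take a reference; fails if the device is not registered. */
static int
pdev_retain(struct platform_dev *pd)
{
	if (!pd->registered)
		return (ENXIO);
	pd->refs++;
	return (0);
}

/* Drop a reference previously taken with pdev_retain(). */
static void
pdev_release(struct platform_dev *pd)
{
	assert(pd->refs > 0);
	pd->refs--;
}

/* Deregistration is refused with EBUSY while references are held. */
static int
pdev_deregister(struct platform_dev *pd)
{
	if (!pd->registered)
		return (ENOENT);
	if (pd->refs > 0)
		return (EBUSY);
	pd->registered = 0;
	return (0);
}
```

A caller that retains the device, attempts to deregister it, and then releases it would see EBUSY first and success only after the release.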

6 years agoSince the human readable name is actually ignored, and not matching a
imp [Wed, 27 Sep 2017 19:22:10 +0000 (19:22 +0000)]
Since the human readable name is actually ignored, and not matching a
'human' pnp string, change it to #, the name reserved for fields that
are ignored.

6 years agoImprove description of the PNP string a bit.
imp [Wed, 27 Sep 2017 19:21:52 +0000 (19:21 +0000)]
Improve description of the PNP string a bit.

6 years agoUnrevert r324059
cem [Wed, 27 Sep 2017 19:14:00 +0000 (19:14 +0000)]
Unrevert r324059

With a colon and bogus name ("#") added to appease the simplistic parser
used in kldxref.

Sponsored by: Dell EMC Isilon

6 years agoUse C99 initializers for DTrace provider methods.
markj [Wed, 27 Sep 2017 17:46:38 +0000 (17:46 +0000)]
Use C99 initializers for DTrace provider methods.

This makes the definitions easier to read and more cscope-friendly.

MFC after: 1 week

6 years agoTx Ring Shadow Consumer Index Register needs to be cleared prior
davidcs [Wed, 27 Sep 2017 17:46:11 +0000 (17:46 +0000)]
Tx Ring Shadow Consumer Index Register needs to be cleared prior
to passing its physical address to the FW during Tx Create Context.

MFC after: 3 days

6 years agoAdd check to avoid raw inode iblocks fields overflow in case of huge_file feature.
fsu [Wed, 27 Sep 2017 16:12:13 +0000 (16:12 +0000)]
Add check to avoid raw inode iblocks fields overflow in case of huge_file feature.
Use the Linux logic for now.

Reviewed by:    pfg (mentor)
Approved by:    pfg (mentor)
MFC after:      2 weeks
Differential Revision: https://reviews.freebsd.org/D12131

6 years agoRemove PNP metadata from drm2 drivers until kldxref problem is resolved
cem [Wed, 27 Sep 2017 14:59:18 +0000 (14:59 +0000)]
Remove PNP metadata from drm2 drivers until kldxref problem is resolved

Reported by: np
Sponsored by: Dell EMC Isilon

6 years agoRemove unused function.
tuexen [Wed, 27 Sep 2017 13:05:23 +0000 (13:05 +0000)]
Remove unused function.

MFC after: 1 week

6 years agovfs_export: Simplify vfs_export_lookup
manu [Wed, 27 Sep 2017 09:39:16 +0000 (09:39 +0000)]
vfs_export: Simplify vfs_export_lookup

If the filesystem is not exported, return NULL directly.
If no address is given and the filesystem is exported with a default
export, return that directly; if it doesn't have a default one, return
NULL directly.

Reviewed by: kib, bapt
MFC after: 1 week
Sponsored by: Gandi.net
Differential Revision: https://reviews.freebsd.org/D12505
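The early returns described above might look roughly like this.  It is only a sketch of the control flow: the flag names, struct export_table, struct export_cred and the match callback are hypothetical, and the per-address search is deliberately left to the caller-supplied callback rather than modelled here.

```c
#include <stddef.h>

#define EXP_EXPORTED	0x1	/* hypothetical: filesystem is exported */
#define EXP_HAS_DEFAULT	0x2	/* hypothetical: a default export exists */

struct export_cred {
	int	anon_uid;
};

struct export_table {
	int			flags;
	struct export_cred	defcred;	/* default export */
};

/* Hypothetical per-address matcher supplied by the caller. */
typedef const struct export_cred *(*addr_match_fn)(
    const struct export_table *, const void *);

static const struct export_cred *
export_lookup(const struct export_table *tbl, const void *addr,
    addr_match_fn match)
{
	/* Not exported at all: nothing to find. */
	if (tbl == NULL || (tbl->flags & EXP_EXPORTED) == 0)
		return (NULL);

	/* No address given: only the default export can apply. */
	if (addr == NULL) {
		if ((tbl->flags & EXP_HAS_DEFAULT) != 0)
			return (&tbl->defcred);
		return (NULL);
	}

	/* Address given: defer to the (hypothetical) matcher. */
	return (match != NULL ? match(tbl, addr) : NULL);
}
```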

6 years agokernel: Bump __FreeBSD_version for the removal of M_HASHTYPE_RSS_UDP_IPV4_EX
sephe [Wed, 27 Sep 2017 06:33:55 +0000 (06:33 +0000)]
kernel: Bump __FreeBSD_version for the removal of M_HASHTYPE_RSS_UDP_IPV4_EX

Sponsored by: Microsoft

6 years agombuf: Remove UDP_IPV4_EX, which was never defined.
sephe [Wed, 27 Sep 2017 06:31:35 +0000 (06:31 +0000)]
mbuf: Remove UDP_IPV4_EX, which was never defined.

Add comment to explain the IPV6_EX suffix.  The confusion about
these RSS hash types probably stems from the fact that they were
never widely implemented by hardware.

Reviewed by: rwatson
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D12453

6 years agoixl: Fix mbuf hash type settings.
sephe [Wed, 27 Sep 2017 05:59:54 +0000 (05:59 +0000)]
ixl: Fix mbuf hash type settings.

IPV6_EXs in RSS never mean fragment.  They mean:
"- Home address from the home address option in the IPv6 destination
   options header.  If the extension header is not present, use the
   Source IPv6 Address.
 - IPv6 address that is contained in the Routing-Header-Type-2 from
   the associated extension header.  If the extension header is not
   present, use the Destination IPv6 Address."

UDP_IPV4_EX is an invalid RSS hash type, which will be removed.

Quoted from:
https://docs.microsoft.com/en-us/windows-hardware/drivers/network/rss-hashing-types#ndishashipv6ex

Reviewed by: erj
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D12450

6 years agotcp: Don't "negotiate" MSS.
sephe [Wed, 27 Sep 2017 05:52:37 +0000 (05:52 +0000)]
tcp: Don't "negotiate" MSS.

_NO_ OSes actually "negotiate" MSS.

RFC 879:
"... This Maximum Segment Size (MSS) announcement (often mistakenly
called a negotiation) ..."

This negotiation behaviour was introduced 11 years ago by r159955
without any explanation about why FreeBSD had to "negotiate" MSS:

    In syncache_respond() do not reply with a MSS that is larger than what
    the peer announced to us but make it at least tcp_minmss in size.

    Sponsored by:   TCP/IP Optimization Fundraise 2005

The tcp_minmss behaviour is still kept.

Syncookie fix was prodded by tuexen, who also helped to test this
patch w/ packetdrill.

Reviewed by: tuexen, karels, bz (previous version)
MFC after: 2 weeks
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D12430
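To make the difference concrete, here is a small sketch contrasting the two behaviours.  The "negotiated" variant follows the quoted r159955 description; the "announced" variant is an assumption about the new behaviour (announce our own MSS, keep the tcp_minmss floor).  The function names are hypothetical and this is not the syncache code.

```c
#include <stdint.h>

static uint16_t
min_u16(uint16_t a, uint16_t b)
{
	return (a < b ? a : b);
}

static uint16_t
max_u16(uint16_t a, uint16_t b)
{
	return (a > b ? a : b);
}

/*
 * Old behaviour as described by r159955: never reply with more than the
 * peer announced, but at least minmss.  A peer_mss of 0 is taken to mean
 * "no announcement seen" (an assumption of this sketch).
 */
static uint16_t
mss_reply_negotiated(uint16_t our_mss, uint16_t peer_mss, uint16_t minmss)
{
	uint16_t mss = our_mss;

	if (peer_mss != 0)
		mss = min_u16(peer_mss, mss);
	return (max_u16(mss, minmss));
}

/* Assumed new behaviour: announce our own MSS, keeping the minmss floor. */
static uint16_t
mss_reply_announced(uint16_t our_mss, uint16_t minmss)
{
	return (max_u16(our_mss, minmss));
}
```

With our MSS at 1460, a peer announcement of 536 and a floor of 216, the first function yields 536 while the second yields 1460.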

6 years agohyperv/hn: Fix UDP checksum offload issue in Azure.
sephe [Wed, 27 Sep 2017 05:44:50 +0000 (05:44 +0000)]
hyperv/hn: Fix UDP checksum offload issue in Azure.

UDP checksum offload does not work in Azure if following conditions are
met:
- sizeof(IP hdr + UDP hdr + payload) > 1420.
- IP_DF is not set in IP hdr

Use software checksum for UDP datagrams falling into this category.

Add two tunables to disable UDP/IPv4 and UDP/IPv6 checksum offload, in
case something unexpected happened.

MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D12429
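The two conditions above amount to a simple predicate.  A minimal sketch, assuming the total datagram length and the IPv4 flags/fragment-offset field are available in host byte order; udp_needs_sw_csum() and the SKETCH_* constants are hypothetical names, not the hn(4) driver code.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_IP_DF		0x4000	/* IPv4 "don't fragment" flag bit */
#define SKETCH_UDP_CSUM_LIMIT	1420	/* size threshold from the text above */

/*
 * Return true when a UDP datagram falls into the problematic category:
 * IP header + UDP header + payload larger than 1420 bytes and DF clear.
 */
static bool
udp_needs_sw_csum(uint16_t ip_total_len, uint16_t ip_off)
{
	return (ip_total_len > SKETCH_UDP_CSUM_LIMIT &&
	    (ip_off & SKETCH_IP_DF) == 0);
}
```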

6 years agohyperv/hn: Set tcp header offset for CSUM/LSO offloading.
sephe [Wed, 27 Sep 2017 04:42:40 +0000 (04:42 +0000)]
hyperv/hn: Set tcp header offset for CSUM/LSO offloading.

No observable effect; better safe than sorry.

MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D12417

6 years agosysctl: remove target buffer read/write checks prior to calling the handler
mjg [Wed, 27 Sep 2017 01:31:52 +0000 (01:31 +0000)]
sysctl: remove target buffer read/write checks prior to calling the handler

Said checks were inherently racy anyway as jokers could unmap target areas
before the handler got around to accessing them.

This saves time by avoiding locking the address space.

MFC after: 1 week

6 years agoAnnotate sysctlmemlock with __exclusive_cache_line.
mjg [Wed, 27 Sep 2017 01:27:43 +0000 (01:27 +0000)]
Annotate sysctlmemlock with __exclusive_cache_line.

MFC after: 1 week

6 years agoRemove manpage entries about crshared(9)
mjg [Wed, 27 Sep 2017 01:12:47 +0000 (01:12 +0000)]
Remove manpage entries about crshared(9)

The function itself was removed years ago in r272546

Submitted by: Paulm <paulm tetrardus.net>
MFC after: 2 weeks

6 years agoWhack procctl(8)
mjg [Wed, 27 Sep 2017 01:03:00 +0000 (01:03 +0000)]
Whack procctl(8)

It was supposed to provide a recovery mechanism against bugs in procfs's
long deprecated tracing capabilities.

Remove the tool as a prerequisite to axing the kernel side.

The tracing facility to use is ptrace(2).

MFC after: 2 weeks

6 years agomtx: drop the tid argument from _mtx_lock_sleep
mjg [Wed, 27 Sep 2017 00:57:05 +0000 (00:57 +0000)]
mtx: drop the tid argument from _mtx_lock_sleep

tid must be equal to curthread and the target routine was already reading
it anyway, which is not a problem. Not passing it as a parameter allows for
a little bit shorter code in callers.

MFC after: 1 week

6 years agoAdd major and minor version arguments to nfscl_reqstart().
rmacklem [Tue, 26 Sep 2017 23:42:44 +0000 (23:42 +0000)]
Add major and minor version arguments to nfscl_reqstart().

This patch adds "vers" and "minorvers" arguments to nfscl_reqstart().
The patch always passes them in as "0" and that implies no change
in semantics. These arguments will be used by a future commit that
adds support for the Flexible File Layout.

6 years agoDon't defer wakeup()s for completed journal workitems.
jhb [Tue, 26 Sep 2017 23:24:15 +0000 (23:24 +0000)]
Don't defer wakeup()s for completed journal workitems.

Normally wakeup()s are performed for completed softupdates work items
in workitem_free() before the underlying memory is free()'d.
complete_jseg() was clearing the "wakeup needed" flag in work items to
defer the wakeup until the end of each loop iteration.  However, this
resulted in the item being free'd before its address was used with
wakeup().  As a result, another part of the kernel could allocate this
memory from malloc() and use it as a wait channel for a different
"event" with a different lock.  This triggered an assertion failure
when the lock passed to sleepq_add() did not match the existing lock
associated with the sleep queue.  Fix this by removing the code to
defer the wakeup in complete_jseg() allowing the wakeup to occur
slightly earlier in workitem_free() before free() is called.

The main reason I can think of for deferring a wakeup() would be to
avoid waking up a waiter while holding a lock that the waiter would
need.  However, no locks are dropped in between the wakeup() in
workitem_free() and the end of the loop in complete_jseg() as far as I
can tell.

In general I think it is not safe to do a wakeup() after free() as one
cannot control how other parts of the kernel that might reuse the
address for a different wait channel will handle spurious wakeups.

Reported by: pho
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D12494

6 years agoAdd PNP metadata to more drivers
cem [Tue, 26 Sep 2017 23:23:58 +0000 (23:23 +0000)]
Add PNP metadata to more drivers

GPUs: radeonkms, i915kms
NICs: if_em, if_igb, if_bnxt

This metadata isn't used yet, but it will be handy to have later to
implement automatic module loading.

Reviewed by: imp, mmacy
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12488

6 years agoaesni(4): Add support for x86 SHA intrinsics
cem [Tue, 26 Sep 2017 23:12:32 +0000 (23:12 +0000)]
aesni(4): Add support for x86 SHA intrinsics

Some x86 class CPUs have accelerated intrinsics for SHA1 and SHA256.
Provide this functionality on CPUs that support it.

This implements CRYPTO_SHA1, CRYPTO_SHA1_HMAC, and CRYPTO_SHA2_256_HMAC.

Correctness: The cryptotest.py suite in tests/sys/opencrypto has been
enhanced to verify SHA1 and SHA256 HMAC using standard NIST test vectors.
The test passes on this driver.  Additionally, jhb's cryptocheck tool has
been used to compare various random inputs against OpenSSL.  This test also
passes.

Rough performance averages on AMD Ryzen 1950X (4kB buffer):
aesni:      SHA1: ~8300 Mb/s    SHA256: ~8000 Mb/s
cryptosoft: SHA1: ~1800 Mb/s    SHA256: ~1800 Mb/s

So ~4.4-4.6x speedup depending on algorithm choice.  This is consistent with
the results the Linux folks saw for 4kB buffers.

The driver borrows SHA update code from sys/crypto sha1 and sha256.  The
intrinsic step function comes from Intel under a 3-clause BSDL.[0]  The
intel_sha_extensions_sha<foo>_intrinsic.c files were renamed and lightly
modified (added const, resolved a warning or two; included the sha_sse
header to declare the functions).

[0]: https://software.intel.com/en-us/articles/intel-sha-extensions-implementations

Reviewed by: jhb
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12452

6 years agoFix regression from r323855. The EXIT trap now isn't cleared, so upon
glebius [Tue, 26 Sep 2017 21:54:19 +0000 (21:54 +0000)]
Fix regression from r323855.  The EXIT trap now isn't cleared, so upon
exit it tried to unmount already unmounted partition, resulting in failure.

6 years agoFix delete all multicast addresses
davidcs [Tue, 26 Sep 2017 20:53:25 +0000 (20:53 +0000)]
Fix delete all multicast addresses

Submitted by: Anand.Khoje@cavium.com
MFC after: 5 days

6 years agoa10_gpio: Enable all needed clocks
manu [Tue, 26 Sep 2017 20:23:09 +0000 (20:23 +0000)]
a10_gpio: Enable all needed clocks

Do not enable only the first clock, enable them all.

6 years agoa10_ehci: Enable all clocks and reset
manu [Tue, 26 Sep 2017 19:21:43 +0000 (19:21 +0000)]
a10_ehci: Enable all clocks and reset

a10_ehci can have multiple clocks and resets; enable them all instead of
only the first one.

6 years agoaw_usbphy: Only reroute OTG for phy0
manu [Tue, 26 Sep 2017 19:20:50 +0000 (19:20 +0000)]
aw_usbphy: Only reroute OTG for phy0

We only need to route the OTG port to host mode on phy0, and only if no
VBUS is present on the port; otherwise leave the port in peripheral mode.

6 years agoaw_usbphy: Fix write of unknown register
manu [Tue, 26 Sep 2017 19:19:44 +0000 (19:19 +0000)]
aw_usbphy: Fix write of unknown register

Some SoCs require a write to an unknown register to work correctly.
This write should be in the pmu region, not in the phy ctrl one.

Reported by: Mark Millard (markmi@dsl-only.net)

6 years agoopencrypto: Use C99 initializers for auth_hash instances
cem [Tue, 26 Sep 2017 17:52:52 +0000 (17:52 +0000)]
opencrypto: Use C99 initializers for auth_hash instances

A misordering in the Via padlock driver really strongly suggested that these
should use C99 named initializers.

No functional change.

Sponsored by: Dell EMC Isilon
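A short sketch of why named initializers help here.  struct hash_desc and its values are hypothetical and only loosely modelled on an auth_hash-style descriptor; with positional initialization, swapping two integer fields compiles silently, while designated initializers tie each value to its field regardless of order.

```c
#include <stdint.h>

/* Hypothetical descriptor, loosely modelled on an auth_hash-style table. */
struct hash_desc {
	int		 type;
	const char	*name;
	uint16_t	 keysize;
	uint16_t	 hashsize;
	uint16_t	 blocksize;
};

/* Positional: correctness depends on remembering the field order. */
static const struct hash_desc positional = {
	1, "example-hmac", 64, 32, 64
};

/* Designated: order is irrelevant and every value is labelled. */
static const struct hash_desc designated = {
	.type = 1,
	.name = "example-hmac",
	.blocksize = 64,
	.hashsize = 32,
	.keysize = 64,
};
```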

6 years agoopencrypto: Loosen restriction on HMAC key sizes
cem [Tue, 26 Sep 2017 16:18:10 +0000 (16:18 +0000)]
opencrypto: Loosen restriction on HMAC key sizes

Theoretically, HMACs do not actually have any limit on key sizes.
Transforms should compact input keys larger than the HMAC block size by
using the transform (hash) on the input key.

(Short input keys are padded out with zeros to the HMAC block size.)
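
The key handling described in the two paragraphs above can be sketched as follows.  This is an illustration only: hmac_normalize_key() is a hypothetical helper, and toy_digest() is a deliberately non-cryptographic stand-in (FNV-1a) used so the example stays short; a real transform would plug in SHA1, SHA256 and so on.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Digest callback: hashes len bytes of in and writes the digest to out. */
typedef void (*digest_fn)(const uint8_t *in, size_t len, uint8_t *out);

#define TOY_DIGEST_LEN	4

/* Toy 4-byte digest (FNV-1a).  NOT cryptographic; illustration only. */
static void
toy_digest(const uint8_t *in, size_t len, uint8_t *out)
{
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= in[i];
		h *= 16777619u;
	}
	for (i = 0; i < TOY_DIGEST_LEN; i++)
		out[i] = (uint8_t)(h >> (8 * i));
}

/*
 * Fill block[0..blocksize) with the normalized HMAC key: keys longer
 * than the block size are replaced by their digest, shorter keys are
 * copied, and the remainder is zero padded.  The digest output must
 * fit within blocksize bytes.
 */
static void
hmac_normalize_key(const uint8_t *key, size_t keylen, uint8_t *block,
    size_t blocksize, digest_fn digest)
{
	memset(block, 0, blocksize);
	if (keylen > blocksize)
		digest(key, keylen, block);
	else if (keylen > 0)
		memcpy(block, key, keylen);
}
```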

Still, not all FreeBSD crypto drivers that provide HMAC functionality
handle longer-than-blocksize keys appropriately, so enforce a "maximum" key
length in the crypto API for auth_hashes that previously expressed a
requirement.  (The "maximum" is the size of a single HMAC block for the
given transform.)  Unconstrained auth_hashes are left as-is.

I believe the previous hardcoded sizes were committed in the original
import of opencrypto from OpenBSD and are due to specific protocol
details of IPSec.  Note that none of the previous sizes actually matched
the appropriate HMAC block size.

The previous hardcoded sizes made the SHA tests in cryptotest.py
useless for testing FreeBSD crypto drivers; none of the NIST-KAT example
inputs had keys sized to the previous expectations.

The following drivers were audited to check that they handled keys up to
the block size of the HMAC safely:

  Software HMAC:
    * padlock(4)
    * cesa
    * glxsb
    * safe(4)
    * ubsec(4)

  Hardware accelerated HMAC:
    * ccr(4)
    * hifn(4)
    * sec(4) (Only supports up to 64 byte keys despite claiming to
      support SHA2 HMACs, but validates input key sizes)
    * cryptocteon (MIPS)
    * nlmsec (MIPS)
    * rmisec (MIPS) (Amusingly, does not appear to use key material at
      all -- presumed broken)

Reviewed by: jhb (previous version), rlibby (previous version)
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12437

6 years agofix r324011, MFV of r323535, 8585 improve batching done in zil_commit()
avg [Tue, 26 Sep 2017 15:38:16 +0000 (15:38 +0000)]
fix r324011, MFV of r323535, 8585 improve batching done in zil_commit()

I managed to commit an older version of the change.
Plus, even the latest version was not ready for userland compilation.

Reported by: "O. Hartmann" <ohartmann@walstatt.org>, cy
MFC after: 1 week
X-MFC with: r324011

6 years agomountd: Avoid memory leak by freeing dp_dirp
manu [Tue, 26 Sep 2017 12:15:13 +0000 (12:15 +0000)]
mountd: Avoid memory leak by freeing dp_dirp

Introduced in r324007, the data allocated by strdup was never freed.
While here, remove cast to caddr_t when freeing dp.

Reported by: bde
MFC after: 1 week
X MFC With: r324007

6 years agocalendar: replace strcpy/strcat with asprintf
bapt [Tue, 26 Sep 2017 11:16:33 +0000 (11:16 +0000)]
calendar: replace strcpy/strcat with asprintf

6 years agomountd: Remove unneeded cast
manu [Tue, 26 Sep 2017 11:11:17 +0000 (11:11 +0000)]
mountd: Remove unneeded cast

Reported by: kib
MFC after: 1 week
X MFC With: r324007

6 years agoMFV r323535: 8585 improve batching done in zil_commit()
avg [Tue, 26 Sep 2017 11:04:08 +0000 (11:04 +0000)]
MFV r323535: 8585 improve batching done in zil_commit()

FreeBSD notes:
- this MFV reverts FreeBSD commit r314549 to make the merge easier
- at present our emulation of cv_timedwait_hires is rather poor,
  so I elected to use cv_timedwait_sbt directly
Please see the differential revision for details.
Unfortunately, I did not get any positive reviews, so there could be
bugs in the FreeBSD-specific piece of the merge.
Hence, the long MFC timeout.

illumos/illumos-gate@1271e4b10dfaaed576c08a812f466f6e81370e5e
https://github.com/illumos/illumos-gate/commit/1271e4b10dfaaed576c08a812f466f6e81370e5e

https://www.illumos.org/issues/8585
  The current implementation of zil_commit() can introduce significant
  latency, beyond what is inherent due to the latency of the underlying
  storage. The additional latency comes from two main problems:
  1. When there's outstanding ZIL blocks being written (i.e. there's
      already a "writer thread" in progress), then any new calls to
      zil_commit() will block waiting for the currently outstanding ZIL
      blocks to complete. The blocks written for each "writer thread" is
      coined a "batch", and there can only ever be a single "batch" being
      written at a time. When a batch is being written, any new ZIL
      transactions will have to wait for the next batch to be written,
      which won't occur until the current batch finishes.
      As a result, the underlying storage may not be used as efficiently
      as possible. While "new" threads enter zil_commit() and are blocked
      waiting for the next batch, it's possible that the underlying
      storage isn't fully utilized by the current batch of ZIL blocks. In
      that case, it'd be better to allow these new threads to generate
      (and issue) a new ZIL block, such that it could be serviced by the
      underlying storage concurrently with the other ZIL blocks that are
      being serviced.
  2. Any call to zil_commit() must wait for all ZIL blocks in its "batch"
      to complete, prior to zil_commit() returning. The size of any given
      batch is proportional to the number of ZIL transactions in the queue
      at the time that the batch starts processing the queue; which
      doesn't occur until the previous batch completes. Thus, if there's a
      lot of transactions in the queue, the batch could be composed of
      many ZIL blocks, and each call to zil_commit() will have to wait for
      all of these writes to complete (even if the thread calling
      zil_commit() only cared about one of the transactions in the batch).

Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Prakash Surya <prakash.surya@delphix.com>

MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D12355

6 years agomountd: Replace malloc+strcpy with strdup
manu [Tue, 26 Sep 2017 09:18:18 +0000 (09:18 +0000)]
mountd: Replace malloc+strcpy with strdup

Reviewed by: bapt
MFC after: 1 week
Sponsored by: Gandi.net
Differential Revision: https://reviews.freebsd.org/D12503

6 years agoRemove empty lines for consistency with other entries
bapt [Tue, 26 Sep 2017 05:47:33 +0000 (05:47 +0000)]
Remove empty lines for consistency with other entries

6 years agoDo not actually install unneeded alias for man
bapt [Tue, 26 Sep 2017 05:46:10 +0000 (05:46 +0000)]
Do not actually install unneeded alias for man

6 years agoRemove unneeded locales and alias man directories
bapt [Tue, 26 Sep 2017 05:43:55 +0000 (05:43 +0000)]
Remove unneeded locales and alias man directories

In base, locales (and encoding) specific directories are not used
by any tool. Just remove them.

While here also remove the cat page directory for openssl

6 years agoDo not print error when running make delete-old on system
bapt [Tue, 26 Sep 2017 05:33:15 +0000 (05:33 +0000)]
Do not print error when running make delete-old on system
without catpages directories

6 years agocrypto(9): Use a more specific error code when a capable driver is not found
cem [Tue, 26 Sep 2017 01:31:49 +0000 (01:31 +0000)]
crypto(9): Use a more specific error code when a capable driver is not found

When crypto_newsession() is given a request for an unsupported capability,
raise a more specific error than EINVAL.

This allows cryptotest.py to skip some HMAC tests that a driver does not
support.

Reviewed by: jhb, rlibby
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12451

6 years agoFix the return value from _Unwind_Backtrace() on arm.
ian [Mon, 25 Sep 2017 23:50:10 +0000 (23:50 +0000)]
Fix the return value from _Unwind_Backtrace() on arm.

If unwinding stops due to hitting the end of the call chain, the return
value is supposed to be _URC_END_OF_STACK; other values indicate internal
errors.  The return value from get_eit_entry() is now returned without
translating it to _URC_FAILURE, so that callers can see _URC_END_OF_STACK
when it happens.
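
The effect of the change can be sketched with a toy unwinder loop.  The enum values and function names below are hypothetical stand-ins for the _URC_* codes and for get_eit_entry(); the sketch only contrasts translating every lookup error into a generic failure with passing the lookup result through.

```c
/* Hypothetical result codes mirroring the roles of the _URC_* values. */
typedef enum {
	RC_OK,
	RC_END_OF_STACK,
	RC_FAILURE
} unwind_rc;

/* Toy lookup: frames below nframes exist, anything past them does not. */
static unwind_rc
lookup_entry(int frame, int nframes)
{
	return (frame < nframes ? RC_OK : RC_END_OF_STACK);
}

/* Old behaviour: every lookup error is reported as a generic failure. */
static unwind_rc
backtrace_translating(int nframes)
{
	int frame;

	for (frame = 0;; frame++) {
		if (lookup_entry(frame, nframes) != RC_OK)
			return (RC_FAILURE);
	}
}

/* New behaviour: the lookup result is returned untranslated. */
static unwind_rc
backtrace_passthrough(int nframes)
{
	unwind_rc rc;
	int frame;

	for (frame = 0;; frame++) {
		rc = lookup_entry(frame, nframes);
		if (rc != RC_OK)
			return (rc);
	}
}
```

With the pass-through variant a caller can tell a normal end of the call chain apart from an internal error, which the translating variant hides.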

6 years agoFix handling of uncaught exceptions in a std::terminate() handler on arm.
ian [Mon, 25 Sep 2017 23:24:41 +0000 (23:24 +0000)]
Fix handling of uncaught exceptions in a std::terminate() handler on arm.

When raising an exception, the unwinder searches for a catch handler and if
none is found it should invoke std::terminate() with the uncaught exception
as the "current" exception.  Before this change, the terminate handler was
invoked with no exception as current (abi::__cxa_current_exception_type()
returned NULL), because the return value from the unwinder indicated an
internal failure in unwinding.  It turns out that was because all errors
from get_eit_entry() were translated to _URC_FAILURE.  Now the error is
returned untranslated, which allows _URC_END_OF_STACK to percolate upwards
to throw_exception() in libcxxrt.  When it sees that return status it
properly calls std::terminate() with the uncaught exception installed
as the current exception, allowing custom terminate handlers to work
with it.

6 years agoRemove the cat pages directory now that catman(1) is gone
bapt [Mon, 25 Sep 2017 21:23:49 +0000 (21:23 +0000)]
Remove the cat pages directory now that catman(1) is gone

6 years agoClose a memory leak when using zpool_read_all_labels
asomers [Mon, 25 Sep 2017 20:44:40 +0000 (20:44 +0000)]
Close a memory leak when using zpool_read_all_labels

MFC after: 3 weeks
X-MFC-With: 322854
Sponsored by: Spectra Logic Corp

6 years agoLog signal number passed to PT_STEP requests in KTR_PTRACE traces.
jhb [Mon, 25 Sep 2017 20:38:55 +0000 (20:38 +0000)]
Log signal number passed to PT_STEP requests in KTR_PTRACE traces.

MFC after: 1 week

6 years agoUse tmpfs_print for tmpfs FIFOs.
jhb [Mon, 25 Sep 2017 20:26:16 +0000 (20:26 +0000)]
Use tmpfs_print for tmpfs FIFOs.

Reviewed by: kib (part of a larger patch)

6 years agolibefi: efipart_floppy() should not pass acpi pointer if the HID test fails
tsoome [Mon, 25 Sep 2017 19:49:56 +0000 (19:49 +0000)]
libefi: efipart_floppy() should not pass acpi pointer if the HID test fails

The current efipart_floppy() implementation is leaking the acpi pointer.

6 years agocapsicum_helpers: Add SEEK to default stdio rights set
cem [Mon, 25 Sep 2017 19:33:32 +0000 (19:33 +0000)]
capsicum_helpers: Add SEEK to default stdio rights set

PR: 219173
Sponsored by: Dell EMC Isilon

6 years agoUse nstosbt() instead of multiplying by SBT_1NS to avoid roundoff errors.
ian [Mon, 25 Sep 2017 15:03:27 +0000 (15:03 +0000)]
Use nstosbt() instead of multiplying by SBT_1NS to avoid roundoff errors.

Differential Revision: https://reviews.freebsd.org/D11779
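The roundoff in question comes from the 32.32 fixed-point format: one nanosecond is about 4.29 units, so an integer "one nanosecond" constant truncates to 4.  A minimal sketch of the difference, using hypothetical names (sbt_t, SBT_ONE_NS, ns_to_sbt_*) and a simple scale-then-divide conversion that is not claimed to be the actual nstosbt() implementation:

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t sbt_t;		/* 32.32 fixed-point seconds */

#define SBT_ONE_SEC	((sbt_t)1 << 32)
#define SBT_ONE_NS	(SBT_ONE_SEC / 1000000000)	/* truncates to 4 */

/* Naive conversion: the truncated constant loses about 7% of the value. */
static sbt_t
ns_to_sbt_naive(int64_t ns)
{
	return (ns * SBT_ONE_NS);
}

/*
 * Scale first, divide last.  Valid for 0 <= ns < 2^31 so that the
 * shift cannot overflow 64 bits.
 */
static sbt_t
ns_to_sbt_scaled(int64_t ns)
{
	assert(ns >= 0 && ns < ((int64_t)1 << 31));
	return ((sbt_t)(((uint64_t)ns << 32) / 1000000000u));
}
```

For one second's worth of nanoseconds the naive version yields 4000000000 units, while the scaled version yields exactly 2^32.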

6 years agoFix gcc compilation issues in the mvneta driver
mw [Mon, 25 Sep 2017 02:06:51 +0000 (02:06 +0000)]
Fix gcc compilation issues in the mvneta driver

Compiling the mvneta driver with gcc unveiled two issues that
required fixing.

Reported by: andrew
Obtained from: Semihalf

6 years agoChange vm_page_try_to_free() to require a managed page. Essentially,
alc [Sun, 24 Sep 2017 23:35:01 +0000 (23:35 +0000)]
Change vm_page_try_to_free() to require a managed page.  Essentially,
vm_page_try_to_free() is testing conditions, like clean versus dirty,
that only vary in managed pages.

Suggested by: kib
Reviewed by: markj
X-MFC after: never

6 years agoModernize the use of vm_page_unwire(). Since r288122, vm_page_unwire()
alc [Sun, 24 Sep 2017 22:29:11 +0000 (22:29 +0000)]
Modernize the use of vm_page_unwire().  Since r288122, vm_page_unwire()
has returned TRUE when the wire count transitions to zero, eliminating
the need for callers to inspect the page's wire count.

MFC after: 1 week
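The caller-side simplification can be sketched like this.  struct page and the helper names are hypothetical and carry none of the VM system's locking or queue handling; the point is only that the boolean result replaces a separate inspection of the wire count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical page with just a wire count. */
struct page {
	unsigned int	wire_count;
};

/* Drop one wiring; return true when the count transitions to zero. */
static bool
page_unwire(struct page *p)
{
	assert(p->wire_count > 0);
	p->wire_count--;
	return (p->wire_count == 0);
}

/* Older caller style: unwire, then inspect the count separately. */
static bool
release_old_style(struct page *p)
{
	(void)page_unwire(p);
	return (p->wire_count == 0);
}

/* Modernized caller style: rely on the return value. */
static bool
release_new_style(struct page *p)
{
	return (page_unwire(p));
}
```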

6 years agoSmall style(9) issue: spaces vs TAB.
pfg [Sun, 24 Sep 2017 20:57:03 +0000 (20:57 +0000)]
Small style(9) issue: spaces vs TAB.

6 years agoChange a panic to an error return.
rmacklem [Sun, 24 Sep 2017 20:05:48 +0000 (20:05 +0000)]
Change a panic to an error return.

There was a panic() in the NFS server's write operation that didn't
need to be a panic() and could just be an error return.
This patch makes that change.
Found by code inspection during development of the pNFS service.

MFC after: 2 weeks

6 years agog_resize_provider_event: Do not invoke orphan method twice
cem [Sun, 24 Sep 2017 19:59:26 +0000 (19:59 +0000)]
g_resize_provider_event: Do not invoke orphan method twice

Like r266444, g_resize_provider_event can attempt to orphan an already
orphaned geom_dev consumer.  This will cause a panic in g_dev_orphan.  Apply
the same fix as was applied to g_orphan_register.

Reviewed by: ae
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12469

6 years agoRemove 0 filling from nfsm_uiombuflist().
rmacklem [Sun, 24 Sep 2017 19:43:31 +0000 (19:43 +0000)]
Remove 0 filling from nfsm_uiombuflist().

nfsm_uiombuflist() zero filled the mbuf list to a multiple of 4 bytes
as required for XDR. Unfortunately, that modified the mbuf list after
it had been m_copym()'d, which was broken. This patch removes the zero filling code.
Since nfsm_uiombuflist() is not yet used in head/current, this has no
effect on users.
The function will be used by a future commit of code that adds Flex
File Layout support.
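
As an illustration of the XDR rule mentioned above, the sketch below computes the zero padding needed to reach a multiple of 4 bytes. It builds a new bytes object instead of touching the input, which loosely mirrors why padding a shared copy in place is a problem. The helper names xdr_pad_len and xdr_pad are hypothetical and are not part of the NFS code.

```python
XDR_UNIT = 4  # XDR encodes data in units of 4 bytes


def xdr_pad_len(length):
    """Return how many zero bytes bring `length` up to a multiple of 4."""
    return (XDR_UNIT - length % XDR_UNIT) % XDR_UNIT


def xdr_pad(data):
    """Return a padded copy of `data`; the caller's buffer is left as is."""
    return data + b"\x00" * xdr_pad_len(len(data))


payload = b"abcde"
padded = xdr_pad(payload)
print(len(payload), len(padded))
```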

6 years agoOptimize vm_page_try_to_free(). Specifically, the call to pmap_remove_all()
alc [Sun, 24 Sep 2017 16:50:10 +0000 (16:50 +0000)]
Optimize vm_page_try_to_free().  Specifically, the call to pmap_remove_all()
can be avoided when the page's containing object has a reference count of
zero.  (If the object has a reference count of zero, then none of its pages
can possibly be mapped.)

Address nearby style issues in vm_page_try_to_free(), and change its
return type to "bool".

Reviewed by: kib, markj
MFC after: 1 week

6 years agoAdd myself to the calendar.freebsd
fsu [Sun, 24 Sep 2017 14:36:01 +0000 (14:36 +0000)]
Add myself to the calendar.freebsd

Reviewed by:    pfg (mentor)
Approved by:    pfg (mentor)

6 years agoFix packages with interactive post install scripts.
imp [Sun, 24 Sep 2017 14:22:36 +0000 (14:22 +0000)]
Fix packages with interactive post install scripts.

Tell pkg(8) we're running non-interactively so that packages with
interactive post install scripts don't hang.

Submitted by: Guido van Rooij

6 years agoRemove the VIRT kernel config, it's now useable through GENERIC.
andrew [Sun, 24 Sep 2017 13:28:24 +0000 (13:28 +0000)]
Remove the VIRT kernel config, it's now useable through GENERIC.

Sponsored by: DARPA, AFRL

6 years agoAdd the ability to report and set debug flags as text strings instead of
scottl [Sun, 24 Sep 2017 13:14:50 +0000 (13:14 +0000)]
Add the ability to report and set debug flags as text strings instead of
just integer flags.  Report both for convenience.

Submitted by: Eygene Ryabinkin (manpage)
Sponsored by: Netflix

6 years agoAdd i.MX6 and Xilinx to GENERIC.
andrew [Sun, 24 Sep 2017 09:33:08 +0000 (09:33 +0000)]
Add i.MX6 and Xilinx to GENERIC.

Merge in the missing devices from the IMX6 and ZEDBOARD kernel configs. The
Freescale sdma device has been renamed to fslsdma to mark it as a platform
specific driver.

Reviewed by: ian
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D11564

6 years agoRename sdhci_cam_start_slot() into sdhci_start_slot()
kibab [Sun, 24 Sep 2017 09:05:35 +0000 (09:05 +0000)]
Rename sdhci_cam_start_slot() into sdhci_start_slot()

This change allows SDHCI drivers to just call sdhci_start_slot()
without having to think about which stack handles the operation.

As a side effect, this will also fix MMCCAM with sdhci_acpi driver.

Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D12471

6 years agoDon't display empty error context.
imp [Sun, 24 Sep 2017 05:04:06 +0000 (05:04 +0000)]
Don't display empty error context.

Context extraction didn't handle this case and showed uninitialized memory.

Obtained from: OpenBSD lib.c 1.21
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D12379

6 years agoFix %c for floating values that become 0 when coerced to int.
imp [Sun, 24 Sep 2017 05:04:02 +0000 (05:04 +0000)]
Fix %c for floating values that become 0 when coerced to int.

Obtained from: OpenBSD run.c 1.36 (From Jeremy Devenport)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D12379

6 years agoFix uninitialized variable
imp [Sun, 24 Sep 2017 05:03:57 +0000 (05:03 +0000)]
Fix uninitialized variable

echo | awk 'BEGIN {i=$1; print i}' prints a boatload of stack
garbage. NUL terminate the memory returned from malloc to prevent it.

Obtained from: OpenBSD run.c 1.40
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D12379

6 years agoFix typo from r323945.
cy [Sun, 24 Sep 2017 03:33:26 +0000 (03:33 +0000)]
Fix typo from r323945.

Reported by: Gary Jennejohn <gljennjohn@gmail.com>
Point hat to: cy (me)

6 years agoSince the page "frame" doesn't belong to a vm object, it can't be paged
alc [Sun, 24 Sep 2017 02:50:59 +0000 (02:50 +0000)]
Since the page "frame" doesn't belong to a vm object, it can't be paged
out.  Since it can't be paged out, it is never actually enqueued in a
paging queue.  Nonetheless, passing PQ_INACTIVE to vm_page_unwire()
creates the appearance that the page "frame" is being enqueued in the
inactive queue.  As of r288122, we can avoid this false impression by
passing PQ_NONE.

MFC after: 1 week

6 years agoConvert some idioms over to py3k-compatible idioms
ngie [Sun, 24 Sep 2017 00:14:48 +0000 (00:14 +0000)]
Convert some idioms over to py3k-compatible idioms

- Import print_function from __future__ and use print(..) instead of `print ..`.
- Use repr instead of backticks when the object needs to be dumped, unless
  print(..) can do it lazily. Use str instead of backticks as appropriate
  for simplification reasons.

This doesn't fully convert these modules over to py3k. It just gets over some of
the trivial compatibility hurdles.
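
A minimal sketch of the two idioms, using made-up values rather than anything from the converted modules. The describe() helper is hypothetical. On Python 2, `from __future__ import print_function` would have to be the first statement of the module for the print(..) calls to behave as shown.

```python
def describe(obj):
    """Build a debug string with repr(); Python 2 code often wrote `obj`."""
    return "value=" + repr(obj)


# print is called as a function, so the same line parses on Python 3
# and on Python 2 with print_function imported from __future__.
print(describe("a b"))

# str() replaces backticks where a plain string conversion is enough.
print("count: " + str(3))
```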

6 years agoAdd myself as src committer.
fsu [Sat, 23 Sep 2017 19:49:12 +0000 (19:49 +0000)]
Add myself as src committer.

Approved by:    pfg (mentor)

6 years agoddb(4): Add 'show badstacks' command to show witness badstacks
cem [Sat, 23 Sep 2017 17:48:49 +0000 (17:48 +0000)]
ddb(4): Add 'show badstacks' command to show witness badstacks

Add a DDB command that mirrors sysctl debug.witness.badstacks.

Reapply r323935 after fixing a trivial deficiency.  I forgot to compile with
WITNESS enabled.  Thanks emaste@ for fixing the build while I was asleep.

Reported by: rstone
Reviewed by: rstone (previous version)
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12468

6 years agobnxt: Choose better HW LRO defaults for performance
shurd [Sat, 23 Sep 2017 16:59:37 +0000 (16:59 +0000)]
bnxt: Choose better HW LRO defaults for performance

1) Choose correct Firmware options for HW LRO for best performance
2) Delete TBD and other comments which are not required.
3) Added sysctl interface to enable / disable / modify different factors
   of HW LRO.
4) Disabled HW LRO by default to avoid issues with packet forwarding

This allows much better control over the LRO configuration via sysctls, and
uses much better defaults.  Hardware LRO can now be enabled/disabled
independently from the software LRO, and the tuning parameters are exposed.

manpage updates coming soon.

Submitted by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Reviewed by: shurd, sbruno
Approved by: sbruno (mentor)
Sponsored by: Broadcom Limited
Differential Revision: https://reviews.freebsd.org/D12223

6 years agoHave ifmp_ring_enqueue() abdicate instead of switch to a consumer
shurd [Sat, 23 Sep 2017 16:46:30 +0000 (16:46 +0000)]
Have ifmp_ring_enqueue() abdicate instead of switch to a consumer

Move TX out of the enqueue() path. As a result, we need
to have ifmp_ring_check_drainage() pick up from the abdicate state.

We also need to either enqueue the TX task, or check drainage
after calling ifmp_ring_enqueue() to ensure it's sent.

This change results in a 30% small packet forwarding improvement.

Reviewed by: olivier, sbruno
Approved by: sbruno (mentor)
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D12439

6 years agoAfter r317886, support for TFTP and NFS can be enabled simultaneously.
oshogbo [Sat, 23 Sep 2017 12:44:42 +0000 (12:44 +0000)]
After r317886, support for TFTP and NFS can be enabled simultaneously.

The cleanup of this distinction was done in r318988, but this Makefile
was omitted.

Submitted by: kczekirda@

6 years agoRevert r323935 as it broke the build
emaste [Sat, 23 Sep 2017 12:35:46 +0000 (12:35 +0000)]
Revert r323935 as it broke the build

subr_witness.c:2577:4: error: use of undeclared identifier 'req'
                        req->oldidx = 0;
                        ^

6 years agoGarbage collect unused fields
scottl [Sat, 23 Sep 2017 08:26:42 +0000 (08:26 +0000)]
Garbage collect unused fields

Sponsored by: Netflix

6 years agoCorrect two misspellings. Also align */.
cy [Sat, 23 Sep 2017 06:00:17 +0000 (06:00 +0000)]
Correct two misspellings. Also align */.

6 years agoMake struct grouptask gt_name member a char array
shurd [Sat, 23 Sep 2017 01:39:16 +0000 (01:39 +0000)]
Make struct grouptask gt_name member a char array

Previously, it was just a pointer which was copied, but
some callers pass in a stack variable which will go out of scope.
Add GROUPTASK_NAMELEN macro (32) and snprintf() the name into it,
using "grouptask" if name is NULL. We can now safely include
gtask->gt_name in console messages.

Reviewed by: sbruno
Approved by: sbruno (mentor)
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D12449

6 years agoMake the rx budget a tunable
shurd [Sat, 23 Sep 2017 01:37:01 +0000 (01:37 +0000)]
Make the rx budget a tunable

This allows tuning the rx budget for special load profiles
as well as more easily testing to determine sane defaults.

Reviewed by: sbruno
Approved by: sbruno (mentor)
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D12445

6 years agoChain mbufs before passing to if_input()
shurd [Sat, 23 Sep 2017 01:35:14 +0000 (01:35 +0000)]
Chain mbufs before passing to if_input()

Build a list of mbufs to pass to if_input() after LRO. Results in
12% small packet forwarding rate improvement.

Reviewed by: sbruno
Approved by: sbruno (mentor)
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D12444

6 years agoSome small packet performance improvements
shurd [Sat, 23 Sep 2017 01:33:20 +0000 (01:33 +0000)]
Some small packet performance improvements

If the packet is smaller than MTU, disable the TSO flags.
Move TCP header parsing inside the IS_TSO?() test.
Add a new IFLIB_NEED_ZERO_CSUM flag to indicate the checksums need to be zeroed before TX.

Reviewed by: sbruno
Approved by: sbruno (mentor)
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D12442

6 years agoddb(4): Add 'show badstacks' command to show witness badstacks
cem [Fri, 22 Sep 2017 20:01:12 +0000 (20:01 +0000)]
ddb(4): Add 'show badstacks' command to show witness badstacks

Add a DDB command that mirrors sysctl debug.witness.badstacks.

Reported by: rstone
Reviewed by: rstone
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12468