FreeBSD/FreeBSD.git
asomers [Tue, 25 Jun 2019 17:24:43 +0000 (17:24 +0000)]
fusefs: rewrite vop_getpages and vop_putpages

Use the standard facilities for getpages and putpages instead of bespoke
implementations that don't work well with the writeback cache.  This has
several corollaries:

* Change the way we handle short reads _again_.  vfs_bio_getpages doesn't
  provide any way to handle unexpected short reads.  Plus, I found some more
  lock-order problems.  So now when the short read is detected we'll just
  clear the vnode's attribute cache, forcing the file size to be requeried
  the next time it's needed.  VOP_GETPAGES doesn't have any way to indicate
  a short read to the "caller", so we just bzero the rest of the page
  whenever a short read happens.

* Change the way we decide when to set the FUSE_WRITE_CACHE bit.  We now set
  it for clustered writes even when the writeback cache is not in use.

Sponsored by:   The FreeBSD Foundation

asomers [Tue, 25 Jun 2019 16:49:20 +0000 (16:49 +0000)]
fusefs: fix multiple issues with the io tests

* During TearDown, close the test file before the backing file.  That way
  the backing file artifact will have the correct contents after the test
  completes.  It doesn't matter when running in Kyua, but it may when
  running the test manually.
* Add a closeopen operation that mimics what FSX does with the "-c" option.
* Skip mmap-related tests when vfs.fusefs.data_cache_mode == 0

Sponsored by: The FreeBSD Foundation

asomers [Mon, 24 Jun 2019 20:08:28 +0000 (20:08 +0000)]
fusefs: refine the short read fix from r349332

b_fsprivate1 needs to be initialized even for write operations, probably
because a buffer can be used to read, write, and read again with the final
read serviced by cache.

Sponsored by: The FreeBSD Foundation

asomers [Mon, 24 Jun 2019 17:05:31 +0000 (17:05 +0000)]
fusefs: improve the short read fix from r349279

VOP_GETPAGES intentionally tries to read beyond EOF, so fuse_read_biobackend
can't rely on bp->b_resid > 0 indicating a short read.  And adjusting
bp->b_count after a short read seems to cause some sort of resource leak.
Instead, store the shortfall in the bp->b_fsprivate1 field.

Sponsored by: The FreeBSD Foundation

asomers [Fri, 21 Jun 2019 23:29:29 +0000 (23:29 +0000)]
fusefs: fix corruption on short reads caused by r349279

Even if a short read is caused by EOF, it's still necessary to bzero the
remaining buffer, because that buffer could become valid as a result of a
future ftruncate or pwrite operation.
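
The zero-fill described above can be sketched in userland C.  This is only
an illustration, not the fusefs code: zero_short_read() and its arguments
are hypothetical names, and memset() stands in for the kernel's bzero().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: a read returned only `nread` of the `bufsize`
 * bytes requested.  Clear the tail so that stale data can never surface
 * if the file later grows (e.g. after ftruncate or pwrite).
 * Returns the shortfall, i.e. the number of bytes that were zeroed.
 */
static size_t
zero_short_read(void *buf, size_t bufsize, size_t nread)
{
	size_t shortfall;

	assert(nread <= bufsize);
	shortfall = bufsize - nread;
	if (shortfall > 0)
		memset((char *)buf + nread, 0, shortfall);
	return (shortfall);
}
```

The returned shortfall is the kind of value a caller could record
separately instead of adjusting the buffer's byte count.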

Reported by: fsx
Sponsored by: The FreeBSD Foundation

asomers [Fri, 21 Jun 2019 21:44:31 +0000 (21:44 +0000)]
fusefs: correctly handle short reads

A fuse server may return a short read for three reasons:

* The file is opened with FOPEN_DIRECT_IO.  In this case, the short read
  should be returned directly to userland.  We already handled this case
  correctly.

* The file was truncated server-side, and the read hit EOF.  In this case,
  the kernel should update the file size.  Fixed in the case of VOP_READ.
  Fixing this for VOP_GETPAGES is TODO.

* The file is opened in writeback mode, there are dirty buffers past what
  the server thinks is the file's EOF, and the read hit what the server
  thinks is the file's EOF.  In this case, the client is trying to read a
  hole, and should zero-fill it.  We already handled this case, and I added
  a test for it.

Sponsored by: The FreeBSD Foundation

asomers [Fri, 21 Jun 2019 04:57:23 +0000 (04:57 +0000)]
fusefs: raise protocol level to 7.23

None of the new features are implemented yet.  This commit just adds the new
protocol definitions and adds backwards-compatibility code for pre 7.23
servers.

Sponsored by: The FreeBSD Foundation

asomers [Fri, 21 Jun 2019 04:37:11 +0000 (04:37 +0000)]
fusefs: update tests after r349260

r349260 removed some Linuxisms from the FUSE protocol header file in favor
of standard C99 types.  This change follows suit in the tests.

Sponsored by: The FreeBSD Foundation

asomers [Fri, 21 Jun 2019 03:17:27 +0000 (03:17 +0000)]
fusefs: use standard integer types in fuse_kernel.h

This is a merge of Linux revision 4c82456eeb4da081dd63dc69e91aa6deabd29e03.
No functional change.

Sponsored by: The FreeBSD Foundation

asomers [Fri, 21 Jun 2019 03:04:56 +0000 (03:04 +0000)]
fusefs: raise the protocol level to 7.21

Jumping from protocol 7.15 to 7.21 adds several new features.  While they're
all potentially useful, they're also all optional, and I'm not implementing
any right now because my highest priority lies in a later version.

Sponsored by: The FreeBSD Foundation

asomers [Fri, 21 Jun 2019 02:55:43 +0000 (02:55 +0000)]
fusefs: diff reduction of fuse_kernel.h vs the upstream version

fuse_kernel.h is based on Linux's fuse.h.  In r349250 I modified
fuse_kernel.h by generating a diff of two versions of Linux's fuse.h and
applying it to our tree.  patch succeeded, but it put one chunk in the wrong
location.  This commit fixes that.  No functional changes.

Sponsored by: The FreeBSD Foundation

asomers [Thu, 20 Jun 2019 23:32:25 +0000 (23:32 +0000)]
fusefs: raise protocol level to 7.15

This protocol level adds two new features: the ability for the server to
store or retrieve data into/from the client's cache.  But the messages
aren't defined soundly since they identify the file only by its inode,
without the generation number.  So it's possible for them to modify the
wrong file's cache.  Also, I don't know of any file systems in ports that
use these messages.  So I'm not implementing them.  I did add a (disabled)
test for the store message, however.

Sponsored by: The FreeBSD Foundation

asomers [Thu, 20 Jun 2019 23:12:19 +0000 (23:12 +0000)]
fusefs: trivially raise protocol level to 7.14

The only new feature is splice(2) support on /dev/fuse, which FreeBSD can't
support.

Sponsored by: The FreeBSD Foundation

asomers [Thu, 20 Jun 2019 22:21:42 +0000 (22:21 +0000)]
fusefs: attempt to support servers as old as protocol 7.4

Previously we allowed servers as old as 7.1 to connect (there never was a
7.0).  However, we wrongly assumed a few things about protocols older than
7.8.  This commit attempts to support servers as old as 7.4 but no older.  I
added no new tests because I'm not sure there actually _are_ any servers
this old in the wild.

Sponsored by: The FreeBSD Foundation

asomers [Thu, 20 Jun 2019 21:29:28 +0000 (21:29 +0000)]
fusefs: raise protocol level to 7.13

This protocol version adds one new feature: the ability for the server to
set the maximum number of background requests and a "congestion threshold"
with ill-defined properties.  I don't know of any fuse file systems in ports
that use this feature, so I'm not implementing it.

Sponsored by: The FreeBSD Foundation

asomers [Thu, 20 Jun 2019 17:08:21 +0000 (17:08 +0000)]
fusefs: implement VOP_BMAP

If the fuse daemon supports FUSE_BMAP, then use that for the block mapping.
Otherwise, use the same technique used by vop_stdbmap.  Report large values
for runp and runb in order to maximize read clustering and minimize upcalls,
even if we don't know the true layout.

The major result of this change is that sequential reads to FUSE files will
now usually happen 128KB at a time instead of 64KB.
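
A rough sketch of the fallback mapping, under stated assumptions: the
logical block is converted to device blocks by a fixed ratio, and the run
lengths are clamped only by the file's extent and a caller-chosen maximum.
sketch_bmap(), struct bmap_result, SKETCH_DEV_BSIZE and max_run are
hypothetical names for illustration, not the fusefs implementation.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_DEV_BSIZE 512	/* device block size assumed by this sketch */

struct bmap_result {
	int64_t	bn;	/* block number in device-block units */
	int	runp;	/* blocks assumed contiguous after lbn */
	int	runb;	/* blocks assumed contiguous before lbn */
};

/*
 * Map logical block `lbn` of a file that has `nblocks` logical blocks of
 * `bsize` bytes each.  The true layout is unknown, so report runs as
 * large as the file and `max_run` allow.
 */
static struct bmap_result
sketch_bmap(int64_t lbn, int64_t nblocks, int bsize, int max_run)
{
	struct bmap_result r;
	int64_t after;

	assert(lbn >= 0 && lbn < nblocks);
	assert(bsize > 0 && bsize % SKETCH_DEV_BSIZE == 0);
	after = nblocks - 1 - lbn;	/* blocks that follow lbn */
	r.bn = lbn * (bsize / SKETCH_DEV_BSIZE);
	r.runp = (int)(after < max_run ? after : max_run);
	r.runb = (int)(lbn < max_run ? lbn : max_run);
	return (r);
}
```

With 64 KB blocks and max_run set to 1, a read in the middle of the file
reports one extra block in each direction, so a clustered read could cover
two blocks (128 KB) in this model.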

Sponsored by: The FreeBSD Foundation

asomers [Thu, 20 Jun 2019 15:56:08 +0000 (15:56 +0000)]
MFHead @349234

Sponsored by: The FreeBSD Foundation

asomers [Thu, 20 Jun 2019 14:40:36 +0000 (14:40 +0000)]
VOP_BMAP(9): fix typo in the copyright header

Reported by: rgrimes
MFC after: 2 weeks
MFC-With: 349230
Sponsored by: The FreeBSD Foundation

asomers [Thu, 20 Jun 2019 14:35:28 +0000 (14:35 +0000)]
#include <sys/types.h> from sys/filio.h

This fixes world build after r349231

Reported by: Jenkins
MFC after: 2 weeks
MFC-With: 349231
Sponsored by: The FreeBSD Foundation

asomers [Thu, 20 Jun 2019 14:13:10 +0000 (14:13 +0000)]
Add FIOBMAP2 ioctl

This ioctl exposes VOP_BMAP information to userland. It can be used by
programs like fragmentation analyzers and optimized cp implementations. But
I'm using it to test fusefs's VOP_BMAP implementation. The "2" in the name
distinguishes it from the similar but incompatible FIBMAP ioctls in NetBSD
and Linux.  FIOBMAP2 differs from FIBMAP in that it uses a 64-bit block
number instead of 32-bit, and it also returns runp and runb.

Reviewed by: mckusick
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20705

asomers [Thu, 20 Jun 2019 13:59:46 +0000 (13:59 +0000)]
Add a VOP_BMAP(9) man page

Reviewed by: mckusick
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20704

antoine [Thu, 20 Jun 2019 13:24:58 +0000 (13:24 +0000)]
Add head(1) to native-xtools so that it can be used in qemu-user jails

tuexen [Thu, 20 Jun 2019 12:38:41 +0000 (12:38 +0000)]
The variable names in the description of the port number usage are
inconsistent. This patch fixes that and improves the precision of
the description.
Thanks to Tom Marcoen for reporting the issue and providing an
initial patch, on which this change is based.

PR: 237723
Reviewed by: bcr@
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D20708

lwhsu [Thu, 20 Jun 2019 07:17:16 +0000 (07:17 +0000)]
Finish re-adding Big5 in r317204, which was reverting r315568.  This commit
reverts r315569.

Reported by: Ting-Wei Lan <lantw44 gmail com>
Discussed with: kevlo
MFC after: 3 days
Sponsored by: The FreeBSD Foundation

mav [Thu, 20 Jun 2019 01:15:33 +0000 (01:15 +0000)]
Add wakeup_any(), cheaper wakeup_one() for taskqueue(9).

wakeup_one() and the underlying sleepq_signal() spend additional time trying
to be fair, waking the thread with the highest priority that has slept the
longest.  But in the case of taskqueue there are many absolutely identical
threads, and any fairness between them is quite pointless.  It makes things
even worse, since round-robin wakeups not only make previous CPU affinity in
the scheduler quite useless, but also hide CPU bottlenecks from the user,
because a sequential workload with one request at a time looks evenly
distributed between multiple threads.

This change adds a new SLEEPQ_UNFAIR flag to sleepq_signal(), making it wake
up the thread that went to sleep last but is no longer in a context switch
(to avoid immediate spinning on the thread lock).  On top of that, a new
wakeup_any() function is added, equivalent to wakeup_one() but setting the
flag.  Finally, taskqueue(9) is switched to wakeup_any() to wake up its
threads.
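
The two selection policies can be contrasted with a small userland model.
Everything here is hypothetical (struct sleeper, pick_fair(),
pick_unfair()); it is not the sleepqueue code, and the fallback in
pick_unfair() is an arbitrary choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

struct sleeper {
	int		prio;		/* lower value = higher priority (assumed) */
	unsigned long	slept_at;	/* tick at which the thread went to sleep */
	int		switching;	/* nonzero while still switching out */
};

/* Fair policy: highest priority first, ties broken by longest sleep. */
static int
pick_fair(const struct sleeper *q, size_t n)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (best < 0 || q[i].prio < q[best].prio ||
		    (q[i].prio == q[best].prio &&
		    q[i].slept_at < q[best].slept_at))
			best = (int)i;
	}
	return (best);
}

/*
 * Unfair policy: the most recent sleeper that has finished switching
 * out.  If every candidate is still switching, use the fair choice.
 */
static int
pick_unfair(const struct sleeper *q, size_t n)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (q[i].switching)
			continue;
		if (best < 0 || q[i].slept_at > q[best].slept_at)
			best = (int)i;
	}
	return (best >= 0 ? best : pick_fair(q, n));
}
```

Under a one-request-at-a-time workload the unfair policy keeps choosing
the same recently run thread, which is what preserves CPU affinity in
this model.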

As a result, on a 72-core Xeon v4 machine a sequential ZFS write to 12 ZVOLs
with 16KB block size spends 34% less time in wakeup_any() and descendants
than it was spending in wakeup_one(), and total write throughput increased
by ~10% with the same CPU usage as before.

Reviewed by: markj, mmacy
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D20669

markj [Wed, 19 Jun 2019 21:36:00 +0000 (21:36 +0000)]
Group vm_page_activate()'s definition with other related functions.

No functional change intended.

MFC after: 3 days

mmacy [Wed, 19 Jun 2019 21:10:13 +0000 (21:10 +0000)]
Tell loader to ignore newer features enabled on the root pool.

There are many new features in ZoF. Most, if not all, do not affect read-only usage.
Encryption in particular is enabled at the pool level but used at the dataset level.
The loader obviously will not be able to boot if the boot dataset is encrypted, but
should not care if some other dataset in the root pool is encrypted.

Reviewed by: allanjude
MFC after: 1 week

bdrewery [Wed, 19 Jun 2019 19:19:37 +0000 (19:19 +0000)]
Follow-up r349065: Fix .TARGET flag ambiguity with PROGS which broke MK_TESTS.

X-MFC-With: r349065
Sponsored by: DellEMC

bcran [Wed, 19 Jun 2019 18:47:44 +0000 (18:47 +0000)]
efinet: Defer exclusively opening the network handles

Don't commit to exclusive access to the network device handle by
efinet until the loader has decided to load something through the
network. This allows for the possibility of other users of the
network device.

Submitted by: scottph
Reviewed by: tsoome, emaste
Tested by:  tsoome, bcran
Differential Revision: https://reviews.freebsd.org/D20642

markj [Wed, 19 Jun 2019 16:09:20 +0000 (16:09 +0000)]
Make zlib encoding messages idempotent.

Otherwise duplicate messages can trigger a reinitialization of the
compression stream while the update thread is running.  Also ensure
that the stream is initialized before the update thread may attempt
to use it.

PR: 238333
Reviewed by: cem, rgrimes
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20673

mav [Wed, 19 Jun 2019 15:36:02 +0000 (15:36 +0000)]
Use sbuf_cat() in GEOM confxml generation.

When it comes to megabytes of text, the difference between sbuf_printf() and
sbuf_cat() becomes substantial.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.

jtl [Wed, 19 Jun 2019 13:55:00 +0000 (13:55 +0000)]
Add the ability to limit how much the code will fragment the RACK send map
in response to SACKs. The default behavior is unchanged; however, the limit
can be activated by changing the new net.inet.tcp.rack.split_limit sysctl.

Submitted by: Peter Lei <peterlei@netflix.com>
Reported by: jtl
Reviewed by: lstewart (earlier version)
Security: CVE-2019-5599

mav [Wed, 19 Jun 2019 13:30:50 +0000 (13:30 +0000)]
Fix typo in r349178.

Reported by: ae
MFC after: 1 week

luporl [Wed, 19 Jun 2019 11:37:43 +0000 (11:37 +0000)]
[PPC] Fix loader input with newer QEMU versions

At least since version 4.0.0, QEMU became bug-compatible with PowerVM's
vty, by inserting a \0 after every \r. As this confuses loader's
interpreter and as a \0 coming from the console doesn't seem reasonable,
it's now being filtered at OFW console input.
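
The filtering idea can be shown with a few lines of portable C.  This is a
sketch only: strip_nul() is a hypothetical name and operates on a plain
byte buffer rather than on the OFW console path.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy `n` bytes from `src` to `dst`, dropping every NUL byte.
 * `dst` must have room for `n` bytes.  Returns the bytes written.
 */
static size_t
strip_nul(char *dst, const char *src, size_t n)
{
	size_t out = 0;

	assert(dst != NULL && src != NULL);
	for (size_t i = 0; i < n; i++) {
		if (src[i] != '\0')
			dst[out++] = src[i];
	}
	return (out);
}
```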

Reviewed by: jhibbits
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20676

sevan [Wed, 19 Jun 2019 11:22:09 +0000 (11:22 +0000)]
Whitespace

zec [Wed, 19 Jun 2019 08:49:24 +0000 (08:49 +0000)]
V_ip6_forwarding and V_ipforwarding have been defined in ip6_var.h /
ip_var.h since at least 2008, so make use of those definitions here.

MFC after: 3 days

zec [Wed, 19 Jun 2019 08:39:19 +0000 (08:39 +0000)]
Evaluating htons() at compile time is more efficient than doing ntohs()
at runtime.  This change removes a dependency on a barrel shifter pass
before branch resolution, while reducing the instruction stream size
by 9 bytes on amd64.
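
The pattern can be illustrated as follows.  The commit does not say which
constant is involved, so the IPv4 EtherType below is an assumption chosen
for the example, and both function names are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

#define SKETCH_ETHERTYPE_IP 0x0800	/* example constant, assumed */

/* Swaps the value taken from the wire on every call. */
static int
is_ip_runtime(uint16_t type_net)
{
	return (ntohs(type_net) == SKETCH_ETHERTYPE_IP);
}

/*
 * Swaps the constant instead.  An optimizing compiler can usually fold
 * htons() of a constant, leaving a plain 16-bit compare at run time.
 */
static int
is_ip_folded(uint16_t type_net)
{
	return (type_net == htons(SKETCH_ETHERTYPE_IP));
}
```

Both functions return the same answer for every input; only the side of
the comparison that gets byte-swapped differs.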

MFC after: 3 days

scottl [Wed, 19 Jun 2019 06:41:07 +0000 (06:41 +0000)]
Implement VT-d capability detection on chipsets that have multiple
translation units with differing capabilities

From the author via Bugzilla:
---
When an attempt is made to passthrough a PCI device to a bhyve VM
(causing initialisation of IOMMU) on certain Intel chipsets using
VT-d the PCI bus stops working entirely. This issue occurs on the
E3-1275 v5 processor on C236 chipset and has also been encountered
by others on the forums with different hardware in the Skylake
series.

The chipset has two VT-d translation units. The issue is caused by
an attempt to use the VT-d device-IOTLB capability that is
supported by only the first unit for devices attached to the
second unit which lacks that capability. Only the capabilities of
the first unit are checked and are assumed to be the same for all
units.

Attached is a patch to rectify this issue by determining which
unit is responsible for the device being added to a domain and
then checking that unit's device-IOTLB capability. In addition to
this a few fixes have been made to other instances where the first
unit's capabilities are assumed for all units for domains they
share. In these cases a mutual set of capabilities is determined.
The patch should hopefully fix any bugs for current/future
hardware with multiple translation units supporting different
capabilities.

A description is on the forums at
https://forums.freebsd.org/threads/pci-passthrough-bhyve-usb-xhci.65235
The thread includes observations by other users of the bug
occurring, and description as well as confirmation of the fix.
I'd also like to thank Ordoban for their help.

---
Personally tested on a Skylake laptop, Skylake Xeon server, and
a Xeon-D-1541, passing through XHCI and NVMe functions.  Passthru
is hit-or-miss to the point of being unusable without this
patch.
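
The "mutual set of capabilities" mentioned in the report amounts to an
intersection of per-unit capability bits.  A minimal sketch, assuming
capabilities are a bitmask; the flag names and mutual_caps() are
hypothetical and do not correspond to the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CAP_DEVICE_IOTLB	(1u << 0)	/* hypothetical flag */
#define SKETCH_CAP_OTHER	(1u << 1)	/* hypothetical flag */

/*
 * Return only the capabilities supported by every translation unit
 * that serves a shared domain.
 */
static uint32_t
mutual_caps(const uint32_t *unit_caps, size_t nunits)
{
	uint32_t caps = ~(uint32_t)0;

	assert(nunits > 0);
	for (size_t i = 0; i < nunits; i++)
		caps &= unit_caps[i];
	return (caps);
}
```

For a single device, the same idea reduces to checking the bitmask of the
one unit responsible for that device rather than that of the first unit.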

PR: 229852
Submitted by: callum@aitchison.org
MFC after: 1 week

alc [Wed, 19 Jun 2019 03:33:00 +0000 (03:33 +0000)]
Correct an error in r349122.  pmap_unwire() should update the pmap's wired
count, not its resident count.

X-MFC with: r349122

bdrewery [Tue, 18 Jun 2019 22:00:38 +0000 (22:00 +0000)]
Rework r349061: Don't apply guessed dependencies if there is a custom target.

This is still targeting bin/sh cyclic dependency issues.  Only apply
guessed dependencies that are explicitly set for an object (which
gnu/lib/cc/cc_tools needs) and if no custom target exists with its
own dependencies.

This was manifesting as a missing yacc.h in usr.bin/mkesdb_static when
built without -j (or -B). No actual yacc.h dependency ordering was
defined but with -j it got lucky and built fine.

Before r349061 the behavior was different for META_MODE but that logic
difference isn't needed.

X-MFC-With: r349061
Sponsored by: DellEMC

mav [Tue, 18 Jun 2019 21:05:10 +0000 (21:05 +0000)]
Optimize kern.geom.conf* sysctls.

On large systems those sysctls may generate megabytes of output.  Before
this change the sbuf(9) code grew the buffer by 4KB at a time, many times
over, generating tons of TLB shootdowns.  Unfortunately the existing
sbuf_new_for_sysctl() mechanism, which is supposed to help with this issue,
is not applicable here, since all the sbuf writes are done in a different
kernel thread.

This change improves the situation in two ways:
 - on the first sysctl call, which provides no output buffer, it sets a
special sbuf drain function that just counts the data and so needs no big
buffer;
 - on the second sysctl call it uses the size saved by the previous call as
the initial buffer size, so that in most cases there will be no
reallocation, unless the GEOM topology changed significantly.
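
A userland analogy of the count-first, allocate-once approach, using
snprintf() with a NULL buffer as the "counting drain".  This does not use
the sbuf(9) API; render_two_pass() and the output format are made up for
the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Render `nitems` lines into a freshly allocated string.
 * Pass 1 only counts bytes; pass 2 writes into a buffer of exactly the
 * counted size, so the buffer is never resized.
 */
static char *
render_two_pass(int nitems, size_t *lenp)
{
	size_t need = 0, off = 0;
	char *buf;
	int i;

	for (i = 0; i < nitems; i++)	/* pass 1: count, store nothing */
		need += (size_t)snprintf(NULL, 0, "<item id=\"%d\"/>\n", i);

	buf = malloc(need + 1);
	if (buf == NULL)
		return (NULL);
	buf[0] = '\0';

	for (i = 0; i < nitems; i++)	/* pass 2: write into the buffer */
		off += (size_t)snprintf(buf + off, need + 1 - off,
		    "<item id=\"%d\"/>\n", i);

	assert(off == need);
	*lenp = off;
	return (buf);
}
```

The sketch recounts on every call; the change described above instead
reuses the size saved by the previous call, which is cheaper but can be
too small if the data grew in between.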

MFC after: 1 week
Sponsored by: iXsystems, Inc.

sevan [Tue, 18 Jun 2019 21:02:40 +0000 (21:02 +0000)]
Mark NetBSD branch points
NetBSD 7.0 was a separate branch, subsequent 8.x releases did not emerge from
this branch.
Clean up minor visual nits, centre OpenBSD listing on the B, DragonFly
listings on the y.

cem [Tue, 18 Jun 2019 18:50:58 +0000 (18:50 +0000)]
random(4): Fix a regression in short AES mode reads

In r349154, random device reads of size < 16 bytes (AES block size) were
accidentally broken to loop forever.  Correct the loop condition for small
reads.
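
The commit does not show the faulty loop, so the following is only a
sketch of a block-based read loop that makes progress for any request
size, including requests shorter than one block.  gen_block() is a
stand-in generator and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BLOCK 16		/* AES block size in bytes */

/* Stand-in for the cipher: fill one block with the counter value. */
static void
gen_block(unsigned char out[SKETCH_BLOCK], unsigned *ctr)
{
	memset(out, (int)((*ctr)++ & 0xff), SKETCH_BLOCK);
}

/*
 * Produce `len` bytes.  Each iteration generates a whole block but
 * copies out at most the bytes still owed, so `done` always advances
 * and the loop ends even when len < SKETCH_BLOCK.
 */
static size_t
read_random_sketch(unsigned char *dst, size_t len)
{
	unsigned char block[SKETCH_BLOCK];
	unsigned ctr = 1;
	size_t done = 0, chunk;

	assert(dst != NULL || len == 0);
	while (done < len) {
		gen_block(block, &ctr);
		chunk = len - done;
		if (chunk > SKETCH_BLOCK)
			chunk = SKETCH_BLOCK;
		memcpy(dst + done, block, chunk);
		done += chunk;
	}
	return (done);
}
```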

Reported by: pho
Reviewed by: delphij
Approved by: secteam(delphij)
Differential Revision: https://reviews.freebsd.org/D20686

vmaffione [Tue, 18 Jun 2019 17:51:30 +0000 (17:51 +0000)]
bhyve: vtnet: fix locking on receive

vsc_rx_ready and the RX virtqueue are protected by the rx_mtx lock.
However, pci_vtnet_ping_rxq() (currently called only once after each
device reset) accesses those without acquiring the lock.

Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20609

ian [Tue, 18 Jun 2019 17:05:05 +0000 (17:05 +0000)]
Handle labels specified with hints even on FDT systems.  Hints are the
easiest thing for a user to control (via loader.conf or kenv+kldload), so
handle them in addition to any label specified via the FDT data.

emaste [Tue, 18 Jun 2019 14:13:52 +0000 (14:13 +0000)]
Remove sys/capability.h for the third time

In all supported (and most unsupported) FreeBSD versions the appropriate
header for Capsicum is sys/capsicum.h.  Software including sys/capability.h
is most likely looking for Linux capabilities based on the withdrawn
POSIX.1e draft.

This header was previously removed in r334929 and r340156, but reverted
each time due to ports failures.  These issues have now (broadly) been
addressed.

PR: 228878 [exp-run]
Submitted by: eadler (r334929)
Relnotes: Yes
Sponsored by: The FreeBSD Foundation

ian [Tue, 18 Jun 2019 04:32:19 +0000 (04:32 +0000)]
Add a pwmc(4) manpage.

ian [Tue, 18 Jun 2019 02:27:30 +0000 (02:27 +0000)]
Oops, it seems I left out the word 'cycle', fix it.

Reported by: rpokala@

ian [Tue, 18 Jun 2019 01:15:00 +0000 (01:15 +0000)]
Rearrange the argument checking and processing so that enable and disable
can be combined with configuring the period and duty cycle (the same ioctl
sets all 3 values at once, so there's no reason to require the user to run
the program twice to get all 3 things set).

ian [Tue, 18 Jun 2019 00:17:10 +0000 (00:17 +0000)]
Explain the relationship between PWM hardware channels being controlled and
pwmc(4) device filenames.  Also, use uppercase PWM when the term is being
used as an acronym, and expand the acronym where it's first used.

ian [Tue, 18 Jun 2019 00:11:00 +0000 (00:11 +0000)]
Remove everything related to channels from the pwmc public interface, now
that there is a pwmc(4) instance per channel and the channel number is
maintained as a driver ivar rather than being passed in from userland.

asomers [Mon, 17 Jun 2019 23:34:11 +0000 (23:34 +0000)]
fusefs: multiple fixes related to the write cache

* Don't always write the last page synchronously.  That's not actually
  required.  It was probably just masking another bug that I fixed later,
  possibly in r349021.

* Enable the NotifyWriteback tests now that Writeback cache is working.

* Add a test to ensure that the write cache isn't flushed synchronously when
  in writeback mode.

Sponsored by: The FreeBSD Foundation

takawata [Mon, 17 Jun 2019 23:03:30 +0000 (23:03 +0000)]
Add ACPI support for USB driver.
This adds the ACPI device path to devinfo(8) output and shows the values
of _UPC (USB port capabilities) and _PLD (physical location of device)
when hw.usb.debug >= 1.

Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D20630

gjb [Mon, 17 Jun 2019 22:53:39 +0000 (22:53 +0000)]
Fix passing ${CONF_FILES} (which contains MAKE_CONF and
SRC_CONF, __MAKE_CONF and SRCCONF, respectively) through
to arm_install_base() and chroot_arm_build_release().
This prevents failures when the target image is intended
to be built with make.conf(5) and src.conf(5) overrides,
which are correctly handled for non-embedded image builds.

Reported and tested by: Daniel Engberg
PR: 238615
MFC after: 3 days
Sponsored by: The FreeBSD Foundation

asomers [Mon, 17 Jun 2019 22:13:59 +0000 (22:13 +0000)]
fusefs: run the Io tests with various combinations of mount options

Sponsored by: The FreeBSD Foundation

asomers [Mon, 17 Jun 2019 22:01:23 +0000 (22:01 +0000)]
fusefs: use cluster_read for more readahead

fusefs will now use cluster_read.  This allows readahead of more than one
cache block.  However, it won't yet actually cluster the reads because that
requires VOP_BMAP, which fusefs does not yet implement.

Sponsored by: The FreeBSD Foundation

sevan [Mon, 17 Jun 2019 21:46:13 +0000 (21:46 +0000)]
Add NetBSD 8.1 & DragonFly BSD 5.6

sevan [Mon, 17 Jun 2019 21:38:33 +0000 (21:38 +0000)]
Fix tab

cem [Mon, 17 Jun 2019 20:29:13 +0000 (20:29 +0000)]
random(4): Fortuna: allow increased concurrency

Add experimental feature to increase concurrency in Fortuna.  As this
diverges slightly from canonical Fortuna, and due to the security
sensitivity of random(4), it is off by default.  To enable it, set the
tunable kern.random.fortuna.concurrent_read="1".  The rest of this commit
message describes the behavior when enabled.

Readers continue to update shared Fortuna state under global mutex, as they
do in the status quo implementation of the algorithm, but shift the actual
PRF generation out from under the global lock.  This massively reduces the
CPU time readers spend holding the global lock, allowing for increased
concurrency on SMP systems and less bullying of the harvestq kthread.

It is somewhat of a deviation from FS&K.  I think the primary difference is
that the specific sequence of AES keys will differ if READ_RANDOM_UIO is
accessed concurrently (as the 2nd thread to take the mutex will no longer
receive a key derived from rekeying the first thread).  However, I believe
the goals of rekeying AES are maintained: trivially, we continue to rekey
every 1MB for the statistical property; and each consumer gets a
forward-secret, independent AES key for their PRF.

Since Chacha doesn't need to rekey for sequences of any length, this change
makes no difference to the sequence of Chacha keys and PRF generated when
Chacha is used in place of AES.

On a GENERIC 4-thread VM (so, INVARIANTS/WITNESS, numbers not necessarily
representative), 3x concurrent AES performance jumped from ~55 MiB/s per
thread to ~197 MB/s per thread.  Concurrent Chacha20 at 3 threads went from
roughly ~113 MB/s per thread to ~430 MB/s per thread.

Prior to this change, the system was extremely unresponsive with 3-4
concurrent random readers; each thread had high variance in latency and
throughput, depending on who got lucky and won the lock.  "rand_harvestq"
thread CPU use was high (double digits), seemingly due to spinning on the
global lock.

After the change, concurrent random readers and the system in general are
much more responsive, and rand_harvestq CPU use dropped to basically zero.

Tests are added to the devrandom suite to ensure the uint128_add64 primitive
utilized by unlocked read functions conforms to specification.

Reviewed by: markm
Approved by: secteam(delphij)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D20313

cy [Mon, 17 Jun 2019 20:11:02 +0000 (20:11 +0000)]
Allow the hostapd program to be specified. This allows users to use
hostapd from ports instead of the one in base. The default is the hostapd
in base.

PR: 238571
MFC after: 1 week

cy [Mon, 17 Jun 2019 20:10:55 +0000 (20:10 +0000)]
Make ipf_objbytes a constant. ipf_objbytes is a table of internal data
structures that are saved across reboots by ipfs(8). The table is not
changed at runtime.

MFC after: 3 days

delphij [Mon, 17 Jun 2019 19:49:08 +0000 (19:49 +0000)]
Separate kernel crc32() implementation to its own header (gsb_crc32.h) and
rename the source to gsb_crc32.c.

This is a prerequisite of unifying kernel zlib instances.

PR: 229763
Submitted by: Yoshihiro Ota <ota at j.email.ne.jp>
Differential Revision: https://reviews.freebsd.org/D20193

zeising [Mon, 17 Jun 2019 17:35:55 +0000 (17:35 +0000)]
pci.4: Use plural configuration registers

It is customary to use the plural when talking about PCI configuration registers.

Reported by: scottl
MFC after: 2 weeks
X-MFC-with: r349133

asomers [Mon, 17 Jun 2019 17:17:01 +0000 (17:17 +0000)]
fusefs: skip the Write.mmap test when mmap is not available

fusefs does not allow mmap when data caching is disabled.

Sponsored by: The FreeBSD Foundation

markj [Mon, 17 Jun 2019 16:57:44 +0000 (16:57 +0000)]
Add some missing MLINKs for tree(3).

MFC after: 3 days

asomers [Mon, 17 Jun 2019 16:56:51 +0000 (16:56 +0000)]
fusefs: implement non-clustered readahead

fusefs will now read ahead at most one cache block at a time (usually 64
KB).  Clustered reads are still TODO.  Individual file systems may disable
read ahead by setting fuse_init_out.max_readahead=0 during initialization.

Sponsored by: The FreeBSD Foundation

zeising [Mon, 17 Jun 2019 16:54:51 +0000 (16:54 +0000)]
pci.4: wordsmith and add missing words

Add missing words after PCI in the description of the PCIOCWRITE and
PCIOCATTACHED ioctls.
Use singular in PCIOCREAD, we only read one register at a time.

Reviewed by: bcr, bjk, rgrimes, cem
MFC after: 2 weeks
X-MFC-with: r349133
Differential Revision: https://reviews.freebsd.org/D20671

ian [Mon, 17 Jun 2019 16:50:58 +0000 (16:50 +0000)]
Put periods at the ends of argument descriptions.  Explain the relationship
between the period and duty arguments.

4 years agoFollow changes in the pwmc(4) driver in relation to device filenames.
ian [Mon, 17 Jun 2019 16:43:33 +0000 (16:43 +0000)]
Follow changes in the pwmc(4) driver in relation to device filenames.

The driver now names its cdev nodes pwmcX.Y where X is unit number and
Y is the channel within that unit.  Change the default device name from
pwmc0 to pwmc0.0.  The driver now puts cdev files and label aliases in
the /dev/pwm directory, so allow the user to provide unqualified names
with -f and automatically prepend the /dev/pwm part for them.

Update the examples in the manpage to show the new device name format
and location within /dev/pwm.
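
The name handling described above can be sketched roughly as follows.  This
is a minimal illustration, not the code from the commit: the helper name
qualify_pwm_name() and the rule "a leading slash means already qualified"
are assumptions made for the example.

```c
#include <stdio.h>
#include <string.h>

#define PWM_DEV_DIR "/dev/pwm/"	/* directory named in the commit message */

/*
 * Hypothetical helper: turn a user-supplied -f argument into a full path.
 * Names that already start with '/' are treated as fully qualified and are
 * copied unchanged; anything else gets the /dev/pwm/ prefix.
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
static int
qualify_pwm_name(const char *name, char *buf, size_t buflen)
{
	int n;

	if (name[0] == '/')
		n = snprintf(buf, buflen, "%s", name);
	else
		n = snprintf(buf, buflen, "%s%s", PWM_DEV_DIR, name);
	return ((n < 0 || (size_t)n >= buflen) ? -1 : 0);
}
```

With this sketch, both a base name such as pwmc0.0 and a label name resolve
to a path under /dev/pwm, while a full path is passed through untouched.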

4 years agoPut the pwmc cdev filenames under the pwm directory along with any label
ian [Mon, 17 Jun 2019 16:26:43 +0000 (16:26 +0000)]
Put the pwmc cdev filenames under the pwm directory along with any label
names.  I.e., everything related to pwm now goes in /dev/pwm.  This will
make it easier for userland tools to turn an unqualified name into a fully
qualified pathname, whether it's the base pwmcX.Y name or a label name.

4 years agorandom(4): Generalize algorithm-independent APIs
cem [Mon, 17 Jun 2019 15:09:12 +0000 (15:09 +0000)]
random(4): Generalize algorithm-independent APIs

At a basic level, remove assumptions about the underlying algorithm (such as
output block size and reseeding requirements) from the algorithm-independent
logic in randomdev.c.  Chacha20 does not have many of the restrictions that
AES-ICM does as a PRF (Pseudo-Random Function), because it has a cipher
block size of 512 bits.  The motivation is that by generalizing the API,
Chacha is not penalized by the limitations of AES.

In READ_RANDOM_UIO, first attempt to NOWAIT allocate a large enough buffer
for the entire user request, or the maximal input we'll accept between
signal checking, whichever is smaller.  The idea is that the implementation
of any randomdev algorithm is then free to divide up large requests in
whatever fashion it sees fit.

As part of this, two responsibilities from the "algorithm-generic" randomdev
code are pushed down into the Fortuna ra_read implementation (and any other
future or out-of-tree ra_read implementations):

  1. If an algorithm needs to rekey every N bytes, it is responsible for
  handling that in ra_read(). (I.e., Fortuna's 1MB rekey interval for AES
  block generation.)

  2. If an algorithm uses a block cipher that doesn't tolerate partial-block
  requests (again, e.g., AES), it is also responsible for handling that in
  ra_read().

Several APIs are changed from u_int buffer length to the more canonical
size_t.  Several APIs are changed from taking a blockcount to a bytecount,
to permit PRFs like Chacha20 to directly generate quantities of output that
are not multiples of RANDOM_BLOCKSIZE (AES block size).
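
As a rough illustration of responsibility 2 and the switch to byte counts,
a read routine for a block-based generator might handle a trailing partial
block like this.  Everything here is a toy stand-in: toy_gen_block() and
toy_read() are hypothetical names, the "keystream" is a counter pattern
rather than a cipher, and rekeying is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCKSIZE 16	/* AES block size in bytes; stands in for RANDOM_BLOCKSIZE */

/*
 * Hypothetical stand-in for a block cipher keystream: fills exactly one
 * block with a counter-derived pattern.  NOT cryptographic.
 */
static void
toy_gen_block(uint64_t *counter, uint8_t block[BLOCKSIZE])
{
	for (size_t i = 0; i < BLOCKSIZE; i++)
		block[i] = (uint8_t)(*counter * BLOCKSIZE + i);
	(*counter)++;
}

/*
 * Byte-count read: whole blocks go straight into the caller's buffer; a
 * trailing partial block is generated into a scratch buffer and only the
 * requested bytes are copied out.  Returns the number of blocks consumed.
 */
static size_t
toy_read(uint64_t *counter, uint8_t *buf, size_t bytecount)
{
	uint8_t scratch[BLOCKSIZE];
	size_t blocks = 0;

	while (bytecount >= BLOCKSIZE) {
		toy_gen_block(counter, buf);
		buf += BLOCKSIZE;
		bytecount -= BLOCKSIZE;
		blocks++;
	}
	if (bytecount > 0) {
		toy_gen_block(counter, scratch);
		memcpy(buf, scratch, bytecount);
		memset(scratch, 0, sizeof(scratch));	/* discard unused output */
		blocks++;
	}
	return (blocks);
}
```

A generator whose output is not tied to a small block size could skip the
scratch-buffer step entirely, which is the flexibility the byte-count
interface is meant to allow.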

The Fortuna algorithm is changed to NOT rekey every 1MiB when in Chacha20
mode (kern.random.use_chacha20_cipher="1").  This is explicitly supported by
the math in FS&K §9.4 (Ferguson, Schneier, and Kohno; "Cryptography
Engineering"), as well as by their conclusion: "If we had a block cipher
with a 256-bit [or greater] block size, then the collisions would not
have been an issue at all."

For now, continue to break up reads into PAGE_SIZE chunks, as they were
before.  So, no functional change, mostly.

Reviewed by: markm
Approved by: secteam(delphij)
Differential Revision: https://reviews.freebsd.org/D20312

4 years agorandom(4): Add regression tests for uint128 implementation, Chacha CTR
cem [Mon, 17 Jun 2019 14:59:45 +0000 (14:59 +0000)]
random(4): Add regression tests for uint128 implementation, Chacha CTR

Add some basic regression tests to verify behavior of both uint128
implementations at typical boundary conditions, to run on all architectures.

Test uint128 increment behavior of Chacha in keystream mode, as used by
'kern.random.use_chacha20_cipher=1' (r344913) to verify assumptions at edge
cases.  These assumptions are critical to the safety of using Chacha as a
PRF in Fortuna (as implemented).

(Chacha's use in arc4random is safe regardless of these tests, as it is
limited to far less than 4 billion blocks of output in that API.)
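
For readers unfamiliar with the boundary conditions involved, here is a
minimal sketch of a two-limb 128-bit increment.  The struct layout and the
name u128_increment() are assumptions for illustration and are not taken
from the tree; the point is only the carry from the low 64 bits into the
high 64 bits.

```c
#include <stdint.h>

/* Hypothetical two-limb 128-bit counter for platforms without a native type. */
struct u128 {
	uint64_t lo;
	uint64_t hi;
};

/* Add one, carrying from the low limb into the high limb on wraparound. */
static void
u128_increment(struct u128 *v)
{
	if (++v->lo == 0)
		v->hi++;
}
```

The interesting cases are a low limb of 2^32 - 1 (no carry out of the low
limb), a low limb of 2^64 - 1 (carry into the high limb), and both limbs at
their maximum (wrap to zero).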

Reviewed by: markm
Approved by: secteam(gordon)
Differential Revision: https://reviews.freebsd.org/D20392

4 years agofusefs: rename the ReadCacheable.default_readahead test
asomers [Mon, 17 Jun 2019 14:42:27 +0000 (14:42 +0000)]
fusefs: rename the ReadCacheable.default_readahead test

The test didn't actually have anything to do with readahead.  Rename it to
"ReadCacheable.cache_block"

Sponsored by: The FreeBSD Foundation

4 years agoMFV r349134:
mm [Mon, 17 Jun 2019 11:46:37 +0000 (11:46 +0000)]
MFV r349134:
Sync libarchive with vendor.

Relevant vendor changes:
  PR #1212: RAR5 reader - window_mask was not updated correctly
            (OSS-Fuzz 15278)
  OSS-Fuzz 15120: RAR reader - extend use after free bugfix

MFC after: 1 week (together with r348993)

4 years agoUpdate vendor/libarchive/dist to git 809f0dc32fff7434aef45a7c688fa285c7208af7
mm [Mon, 17 Jun 2019 11:29:32 +0000 (11:29 +0000)]
Update vendor/libarchive/dist to git 809f0dc32fff7434aef45a7c688fa285c7208af7

Relevant vendor changes:
  PR #1212: RAR5 reader - window_mask was not updated correctly
            (OSS-Fuzz 15278)
  OSS-Fuzz 15120: RAR reader - extend use after free bugfix
  Add HAVE_UNLINKAT to config_freebsd.h

4 years agopci(4): Document PCIOCATTACHED
zeising [Mon, 17 Jun 2019 05:41:47 +0000 (05:41 +0000)]
pci(4): Document PCIOCATTACHED

Document the PCIOCATTACHED ioctl(2) in the pci(4) manual.
PCIOCATTACHED is used to query if a driver has attached to a PCI device.

Reviewed by: bcr, imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D20652

4 years agoAdd back a const qualifier I somehow fumbled away between test-building
ian [Mon, 17 Jun 2019 03:48:44 +0000 (03:48 +0000)]
Add back a const qualifier I somehow fumbled away between test-building
and committing recent changes.

4 years agoImplement the ofw_bus_get_node method in aw_pwm(4) so that ofw_pwmbus can
ian [Mon, 17 Jun 2019 03:40:00 +0000 (03:40 +0000)]
Implement the ofw_bus_get_node method in aw_pwm(4) so that ofw_pwmbus can
find its metadata for instantiating children.

4 years agoAdd ofw_pwmbus to enumerate pwmbus devices on systems configured with fdt
ian [Mon, 17 Jun 2019 03:32:05 +0000 (03:32 +0000)]
Add ofw_pwmbus to enumerate pwmbus devices on systems configured with fdt
data.  Also, add fdt support to pwmc.

4 years agoEliminate a redundant call to pmap_invalidate_page() from
alc [Mon, 17 Jun 2019 01:58:25 +0000 (01:58 +0000)]
Eliminate a redundant call to pmap_invalidate_page() from
pmap_ts_referenced().

MFC after: 14 days
Differential Revision: https://reviews.freebsd.org/D12725

4 years agoThree changes to arm64's pmap_unwire():
alc [Sun, 16 Jun 2019 22:13:27 +0000 (22:13 +0000)]
Three changes to arm64's pmap_unwire():

Implement wiring changes on superpage mappings.  Previously, a superpage
mapping was unconditionally demoted by pmap_unwire(), even if the wiring
change applied to the entire superpage mapping.

Rewrite a comment to use the arm64 names for bits in a page table entry.
Previously, the bits were referred to by their x86 names.

Use atomic_"op"_64() instead of atomic_"op"_long() to update a page table
entry in order to match the prevailing style in this file.

MFC after: 10 days

4 years agoFix bug on newbus device deletion: we should delete the child's devinfo
nwhitehorn [Sun, 16 Jun 2019 21:56:45 +0000 (21:56 +0000)]
Fix bug on newbus device deletion: we should delete the child's devinfo
on deletion, not the parent's.

MFC after: 3 weeks

4 years agoRemove tabs from BSD.var.dist
antoine [Sun, 16 Jun 2019 20:01:45 +0000 (20:01 +0000)]
Remove tabs from BSD.var.dist

Reported by: zeising

4 years agoRework pwmbus and pwmc so that each child will handle a single PWM channel.
ian [Sun, 16 Jun 2019 19:44:42 +0000 (19:44 +0000)]
Rework pwmbus and pwmc so that each child will handle a single PWM channel.

Previously, there was a pwmc instance for each instance of pwm hardware
regardless of how many pwm channels that hardware supported.  Now there
will be a pwmc instance for each channel when the hardware supports
multiple channels.  With a separate instance for each channel, we can have
"named channels" in userland by making devfs alias entries in /dev/pwm.

These changes add support for ivars to pwmbus, and use an ivar to track the
channel number for each child.  It also adds support for hinted children.

In pwmc, the driver checks for a label hint, and if present, it's used to
create an alias for the cdev in /dev/pwm.  It's not anticipated that hints
will be heavily used, but it's easy to do and allows quick ad-hoc creation
of named channels from userland by using kenv to create hint.pwmc.N.label=
hints.  Upcoming changes will add FDT support, and most labels will
probably be specified that way.

4 years agoIn iostat(8) output, skip the decimal point and the fractional part
trasz [Sun, 16 Jun 2019 17:32:05 +0000 (17:32 +0000)]
In iostat(8) output, skip the decimal point and the fractional part
for tps >= 100 and MB/s >= 1000, to prevent them from widening too much.
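
The formatting rule can be sketched as below.  This is not the iostat(8)
source: format_rate() is a hypothetical helper, and the single fractional
digit used below the threshold is an assumption chosen for the example.

```c
#include <stdio.h>

/*
 * Format a rate for a narrow column: keep a fractional part below the
 * threshold, drop the decimal point and fraction at or above it.
 * The precision used here is illustrative, not iostat's actual format.
 */
static int
format_rate(char *buf, size_t buflen, double value, double threshold)
{
	if (value >= threshold)
		return (snprintf(buf, buflen, "%.0f", value));
	return (snprintf(buf, buflen, "%.1f", value));
}
```

Calling it with a threshold of 100 for tps and 1000 for MB/s reproduces the
two cut-offs named in the commit message.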

MFC after: 2 weeks

4 years agoThree enhancements to arm64's pmap_protect():
alc [Sun, 16 Jun 2019 16:45:01 +0000 (16:45 +0000)]
Three enhancements to arm64's pmap_protect():

Implement protection changes on superpage mappings.  Previously, a superpage
mapping was unconditionally demoted by pmap_protect(), even if the
protection change applied to the entire superpage mapping.

Precompute the bit mask describing the protection changes rather than
recomputing it for every page table entry that is changed.

Skip page table entries that already have the requested protection changes
in place.
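
The second and third enhancements follow a common pattern, sketched here in
userland C.  The bit positions, the names TOY_PTE_RO and TOY_PTE_XN, and
the function toy_protect() are hypothetical; a real pmap would update
entries atomically and invalidate the TLB, both of which are omitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical permission bits; these are NOT the arm64 PTE encodings. */
#define TOY_PTE_RO	(UINT64_C(1) << 7)	/* read-only */
#define TOY_PTE_XN	(UINT64_C(1) << 54)	/* execute-never */

/*
 * Apply a protection change to n entries.  The mask is computed once,
 * before the loop, and entries that already carry every requested bit are
 * skipped.  Returns how many entries were actually modified.
 */
static size_t
toy_protect(uint64_t *ptes, size_t n, int remove_write, int remove_exec)
{
	uint64_t nbits = 0;
	size_t changed = 0;

	if (remove_write)
		nbits |= TOY_PTE_RO;
	if (remove_exec)
		nbits |= TOY_PTE_XN;

	for (size_t i = 0; i < n; i++) {
		if ((ptes[i] & nbits) == nbits)
			continue;	/* requested protection already in place */
		ptes[i] |= nbits;
		changed++;
	}
	return (changed);
}
```

Skipping unchanged entries matters mostly because each real modification
would otherwise cost an atomic update and a TLB invalidation.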

Reviewed by: andrew, kib
MFC after: 10 days
Differential Revision: https://reviews.freebsd.org/D20657

4 years agoIn detach(), call bus_generic_detach() before deleting the iicbus child.
ian [Sun, 16 Jun 2019 16:02:50 +0000 (16:02 +0000)]
In detach(), call bus_generic_detach() before deleting the iicbus child.
This gives the bus and its children the chance to return EBUSY to abort
the detach if they're in the middle of doing some IO.

4 years agoRename pwmbus.h to ofw_pwm.h, because after all the recent changes, there
ian [Sun, 16 Jun 2019 15:56:59 +0000 (15:56 +0000)]
Rename pwmbus.h to ofw_pwm.h, because after all the recent changes, there
is nothing left in the file that relates to pwmbus at all.  It just contains
prototypes for the functions implemented in dev/pwm/ofw_pwm.c, so name it
accordingly and fix the include protect wrappers to match.

A new pwmbus.h will be coming along in a future commit.

4 years agovtfontcvt: correct typo in hex parsing update
emaste [Sun, 16 Jun 2019 15:14:49 +0000 (15:14 +0000)]
vtfontcvt: correct typo in hex parsing update

PR: 205707
Submitted by: Dmitry Wagin
MFC with: 349100
Event: Berlin Devsummit 2019

4 years agovtfontcvt: improve .bdf validation
emaste [Sun, 16 Jun 2019 13:51:45 +0000 (13:51 +0000)]
vtfontcvt: improve .bdf validation

Previously if we had a BBX entry that had invalid values (e.g. bounding
box outside of font bounding box) and failed sscanf (e.g., because it
had fewer than four values) we skipped the BBX value validation and then
triggered an assertion failure.
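
One way to avoid that class of bug is to reject a short parse outright and
validate only fully parsed values.  The sketch below is an assumption about
the general approach, not the vtfontcvt code: parse_bbx() is a hypothetical
helper and the specific bounds checks are illustrative.

```c
#include <stdio.h>

/*
 * Parse a BDF "BBX w h xoff yoff" line and check it against the font
 * bounding box.  Validation runs only after sscanf() has filled all four
 * fields.  Returns 0 if the glyph box is usable, -1 otherwise.
 */
static int
parse_bbx(const char *line, int font_w, int font_h,
    int *w, int *h, int *xoff, int *yoff)
{
	if (sscanf(line, "BBX %d %d %d %d", w, h, xoff, yoff) != 4)
		return (-1);	/* too few values: reject the entry */
	if (*w < 0 || *h < 0 || *w > font_w || *h > font_h)
		return (-1);	/* glyph box does not fit the font box */
	return (0);
}
```

The caller can then report a malformed font instead of continuing with
partially initialized dimensions.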

Reported by: afl
MFC with: r349100
Event: Berlin Devsummit 2019
Sponsored by: The FreeBSD Foundation

4 years agovtfontcvt: improve .bdf verification
emaste [Sun, 16 Jun 2019 13:35:53 +0000 (13:35 +0000)]
vtfontcvt: improve .bdf verification

Previously we would crash if the BBX y-offset was outside of the font
bounding box.

Reported by: afl
MFC with: r349100
Event: Berlin Devsummit 2019
Sponsored by: The FreeBSD Foundation

4 years agoallow vt(4) fonts to be built from .bdf files
emaste [Sun, 16 Jun 2019 12:44:49 +0000 (12:44 +0000)]
allow vt(4) fonts to be built from .bdf files

vtfontcvt(8) can convert both .bdf and .hex inputs to binary vt(4) .fnt
files.

Event: Berlin Devsummit 2019
Sponsored by: The FreeBSD Foundation

4 years agovtfontcvt: initialize another variable to quiet GCC warning
emaste [Sun, 16 Jun 2019 12:26:46 +0000 (12:26 +0000)]
vtfontcvt: initialize another variable to quiet GCC warning

I believe this case could be triggered by a broken .bdf font.

PR: 205707
Reported by: ci.freebsd.org
MFC with: 349100
Event: Berlin Devsummit 2019
Sponsored by: The FreeBSD Foundation

4 years agoDifferentiate package versions for ALPHA/BETA/PRERELEASE/RC phases.
rene [Sun, 16 Jun 2019 11:53:22 +0000 (11:53 +0000)]
Differentiate package versions for ALPHA/BETA/PRERELEASE/RC phases.

Currently ALPHA packages are treated as CURRENT or STABLE versions,
resulting in e.g. 13.0.s20190615125609. This version number is indeed
different from the next version number but ALPHA2 would be nicer IMO.

For the BETA, PRERELEASE and RC phases the packages are versioned the
same as for releases, so 11.3-BETA1 is 11.3 and so is 11.3-RC1, meaning
that pkg cannot easily upgrade from the former to the next. This happened
on my Raspberry Pi which runs pkgbase.

Submitted by: rene
Approved by: manu
Event: Berlin hackathon 2019
Differential Revision: https://reviews.freebsd.org/D20651

4 years agovtfontcvt: initialize bbwbytes to avoid GCC 4.2.1 uninitialized warning
emaste [Sun, 16 Jun 2019 10:43:18 +0000 (10:43 +0000)]
vtfontcvt: initialize bbwbytes to avoid GCC 4.2.1 uninitialized warning

PR: 205707
MFC with: 349100
Event: Berlin Devsummit 2019
Sponsored by: The FreeBSD Foundation

4 years agovtfontcvt: improve BDF and hex font parsing
emaste [Sun, 16 Jun 2019 09:17:26 +0000 (09:17 +0000)]
vtfontcvt: improve BDF and hex font parsing

Support larger font sizes.

PR: 205707
Submitted by: Dmitry Wagin (original version)
MFC after: 2 weeks
Event: Berlin Devsummit 2019
Differential Revision: https://reviews.freebsd.org/D20650

4 years agosymlinkat(2) is not covered.
bdrewery [Sun, 16 Jun 2019 05:12:17 +0000 (05:12 +0000)]
symlinkat(2) is not covered.

4 years agoAdd macOS-like three finger drag trackpad gesture to psm(4)
philip [Sun, 16 Jun 2019 03:06:05 +0000 (03:06 +0000)]
Add macOS-like three finger drag trackpad gesture to psm(4)

Submitted by: Yan Ka Chiu <nyan@myuji.xyz>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D20648

4 years agoBuild SoC-specific modules with GENERIC for the SoCs that have them.
ian [Sun, 16 Jun 2019 01:23:45 +0000 (01:23 +0000)]
Build SoC-specific modules with GENERIC for the SoCs that have them.

4 years agoAdd module makefiles for Texas Instruments ARM SoCs.
ian [Sun, 16 Jun 2019 01:22:44 +0000 (01:22 +0000)]
Add module makefiles for Texas Instruments ARM SoCs.

The natural place to look for them based on how other SoCs are organized
would be sys/modules/ti, but that's already taken.  Drop a clue into
modules/ti/Makefile directing people to modules/arm_ti if they're looking
for ARM modules.