Currently, every buffer cached in the L2ARC is accompanied by a 240-byte
header in memory, leading to very high memory consumption when using very
large cache devices. These changes significantly reduce this overhead.
           Trunk   Optimized
        +-------+-----------+
L1-only | 176 B |   176 B   | (same)
        +-------+-----------+
L1 & L2 | 240 B |   208 B   | (saved 32 bytes)
        +-------+-----------+
L2-only | 240 B |   128 B   | (saved 112 bytes)
        +-------+-----------+
For an average blocksize of 8KB, this means that for the L2ARC, the ratio
of metadata to data has gone down from about 2.92% to 1.56%. For a
'storage optimized' EC2 instance with 1600GB of SSD and 60GB of RAM, this
means that we expect a completely full L2ARC to use (1600 GB * 0.0156) /
60GB = 41% of the available memory, down from 78%.
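The arithmetic above can be re-derived with a short sketch. It assumes the
L2-only header sizes of 240 and 128 bytes, an 8 KB average block, and the
1600 GB / 60 GB instance from the text; the function and constant names are
made up for illustration. The results differ from the quoted figures only
by rounding.

```python
# Back-of-the-envelope check of the L2ARC header overhead.
BLOCK_SIZE = 8 * 1024   # average block size in bytes (8 KB)
HDR_TRUNK = 240         # L2-only header size before the change, in bytes
HDR_OPTIMIZED = 128     # L2-only header size after the change, in bytes
L2ARC_GB = 1600         # cache device capacity
RAM_GB = 60             # system memory


def header_ratio(header_bytes, block_bytes=BLOCK_SIZE):
    """Bytes of in-memory header per byte of data cached in the L2ARC."""
    return header_bytes / block_bytes


def ram_fraction(header_bytes, l2arc_gb=L2ARC_GB, ram_gb=RAM_GB):
    """Fraction of RAM taken by headers when the L2ARC is completely full."""
    return l2arc_gb * header_ratio(header_bytes) / ram_gb


for name, hdr in (("trunk", HDR_TRUNK), ("optimized", HDR_OPTIMIZED)):
    print(f"{name}: ratio {header_ratio(hdr):.2%}, "
          f"RAM used {ram_fraction(hdr):.1%}")
```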
bz [Mon, 10 Aug 2015 10:29:32 +0000 (10:29 +0000)]
Rather than hardcoding a string and limiting the comparison to its
characters, use the defined constant so that a future change to it
does not break the comparison.
mav [Sun, 9 Aug 2015 20:23:35 +0000 (20:23 +0000)]
MFV 286550: 5694 traverse_prefetcher does not prefetch enough
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Author: George Wilson <george.wilson@delphix.com>
mav [Sun, 9 Aug 2015 20:08:38 +0000 (20:08 +0000)]
MFV 286548:
5693 ztest fails in dbuf_verify: buf[i] == 0, due to dedup and bp_override
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Matthew Ahrens <mahrens@delphix.com>
mav [Sun, 9 Aug 2015 19:35:39 +0000 (19:35 +0000)]
MFV 286544:
5630 stale bonus buffer in recycled dnode_t leads to data corruption
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Will Andrews <will@freebsd.org>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Justin T. Gibbs <justing@spectralogic.com>
mav [Sun, 9 Aug 2015 19:29:10 +0000 (19:29 +0000)]
MFV 286542: 5592 NULL pointer dereference in dsl_prop_notify_all_cb()
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Will Andrews <will@freebsd.org>
Approved by: Robert Mustacchi <rm@joyent.com>
mav [Sun, 9 Aug 2015 19:26:21 +0000 (19:26 +0000)]
MFV 286540: 5531 NULL pointer dereference in dsl_prop_get_ds()
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan McDonald <danmcd@omniti.com>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Justin T. Gibbs <justing@spectralogic.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Robert Mustacchi <rm@fingolfin.org>
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Rich Lowe <richlowe@richlowe.net>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Justin T. Gibbs <justing@spectralogic.com>
imp [Sun, 9 Aug 2015 18:15:33 +0000 (18:15 +0000)]
cmp and cp are used by the Kerberos install, so they need to be included
in ITOOLS. They are tiny enough that I'm not making them conditional: the
minuscule savings in disk space aren't worth the obfuscation of
Makefile.inc1.
alc [Sun, 9 Aug 2015 07:45:15 +0000 (07:45 +0000)]
Revise the text about the atomicity of the defined operations across
multiple processors. In particular, clearly state that the operations
are always atomic when they are applied to the default memory type
that is used by the kernel (and applications).
peter [Sun, 9 Aug 2015 05:54:53 +0000 (05:54 +0000)]
Move the USE_PREAD configuration knob out of the middle of the autoconf
generated ones. It is easy to mistake it for an option that has gone away
when it's actually a control that was explicitly turned on for FreeBSD.
peter [Sun, 9 Aug 2015 05:22:53 +0000 (05:22 +0000)]
Update svnlite from 1.8.10 to 1.8.14. This is mostly for client-side bug
fixes and quality of life improvements.
While there are security issues in this time frame that affect usage as a
server (eg: linked into apache), this isn't possible here.
ian [Sat, 8 Aug 2015 20:11:47 +0000 (20:11 +0000)]
Provide the tty-layer mutex when initializing the pps api. This allows
time_pps_fetch() to be used in blocking mode.
Also, don't init the pps api for system devices (consoles) that provide a
custom attach routine. The device may actually be a keyboard or other non-
tty device. If it wants to do pps processing (unlikely) it must handle
everything for itself. (In reality, only a Sun keyboard uses a custom
attach routine, and it doesn't make a good pps device.)
melifaro [Sat, 8 Aug 2015 18:14:59 +0000 (18:14 +0000)]
MFP r274295:
* Move interface route cleanup to route.c:rt_flushifroutes()
* Convert most of "for (fibnum = 0; fibnum < rt_numfibs; fibnum++)" users
to use the new rt_foreach_fib() instead of hand-rolling loops.
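The shape of that conversion can be shown with a toy model. This is only a
language-neutral sketch in Python: the real rt_foreach_fib() signature is
not shown in this log, so the callback form, foreach_fib(),
flush_if_routes() and RT_NUMFIBS below are all assumptions made for
illustration.

```python
# Toy model: one shared iterator replaces open-coded per-FIB loops.
RT_NUMFIBS = 4  # hypothetical number of routing tables (FIBs)


def foreach_fib(callback, arg, numfibs=RT_NUMFIBS):
    """Call callback(fibnum, arg) once for every FIB, in ascending order."""
    for fibnum in range(numfibs):
        callback(fibnum, arg)


flushed = []  # records the (fibnum, interface) pairs that were visited


def flush_if_routes(fibnum, ifname):
    """Stand-in for per-FIB work, such as dropping routes via an interface."""
    flushed.append((fibnum, ifname))


# Callers pass the per-FIB work instead of writing the loop themselves.
foreach_fib(flush_if_routes, "em0")
```

Keeping the iteration in one place means a later change to how FIBs are
enumerated touches a single function rather than every caller.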
melifaro [Sat, 8 Aug 2015 17:48:54 +0000 (17:48 +0000)]
MFP r274553:
* Move lle creation/deletion from lla_lookup to separate functions:
lla_lookup(LLE_CREATE) -> lla_create
lla_lookup(LLE_DELETE) -> lla_delete
lla_create now returns with LLE_EXCLUSIVE lock for lle.
* Provide typedefs for new/existing lltable callbacks.
melifaro [Sat, 8 Aug 2015 15:58:35 +0000 (15:58 +0000)]
Simplify ip[6] simploop:
Do not pass 'dst' sockaddr to ip[6]_mloopback:
- We have explicit check for AF_INET in ip_output()
- We assume ip header inside passed mbuf in ip_mloopback
- We assume ip6 header inside passed mbuf in ip6_mloopback
mav [Sat, 8 Aug 2015 11:48:11 +0000 (11:48 +0000)]
Disable 32-bit PIO for 6Gbit/s Intel SATA controllers.
For some reason 32-bit PIO writes are not working on 6Gbit/s Intel SATA
ports, while 16/32-bit PIO reads and 16-bit PIO writes are working fine.
3Gbit/s ports on the same controllers do not have this problem.
Work around this by disabling 32-bit PIO for all Intel controllers that may
have 6Gbit/s ports. It halves PIO performance from 6MB/s to 3MB/s, but
who cares about the speed of such a rare and slow mode, which is also
highly discouraged by the SATA specifications?
trasz [Sat, 8 Aug 2015 10:38:37 +0000 (10:38 +0000)]
Fix the interaction between libedit initialization and Capsicum
in units(1). The most visible effect is the removal of libedit warnings
about being unable to open the termcap database.
Reviewed by: eadler@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3322
pjd [Sat, 8 Aug 2015 09:51:38 +0000 (09:51 +0000)]
Enable BIO_DELETE passthru in GELI, so TRIM/UNMAP can work as expected when
GELI is used on an SSD or inside a virtual machine, so that the guest can
tell the host that it is no longer using some of the storage.
Enabling BIO_DELETE passthru comes with a small security consequence: an
attacker can tell how much space is really being used on the encrypted
device and then has less data to analyse. This is why the -T option can be
given to the init subcommand to turn off this behaviour, and the -t/-T
options of the configure subcommand can be used to adjust this setting later.
PR: 198863
Submitted by: Matthew D. Fuller fullermd at over-yonder dot net
This commit also includes a fix from Fabian Keil freebsd-listen at
fabiankeil.de for 'configure' on onetime providers which is not strictly
related, but is entangled in the same code, so would cause conflicts if
separated out.
jch [Sat, 8 Aug 2015 08:40:36 +0000 (08:40 +0000)]
Fix a kernel assertion issue introduced with r286227:
Avoid too strict INP_INFO_RLOCK_ASSERT checks due to
tcp_notify() being called from in6_pcbnotify().
Reported by: Larry Rosenman <ler@lerctr.org>
Submitted by: markj, jch
ian [Fri, 7 Aug 2015 23:31:31 +0000 (23:31 +0000)]
Only process the PPS event types currently enabled in pps_params.mode.
This makes the PPS API behave correctly, but isn't ideal -- we still end
up capturing PPS data for non-enabled edges, we just don't process the
data into an event that becomes visible outside of kern_tc. That's because
the event type isn't passed to pps_capture(), so it can't do the filtering.
Any solution for capture filtering is going to require touching every driver.
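The behaviour described above (every edge is still captured, but only
enabled event types become visible) can be modelled as follows. This is a
sketch, not the kern_tc code: the PpsState class and its methods are
hypothetical, and the flag values should be treated as illustrative even
though the names follow the PPS API.

```python
# Model: capture is unfiltered, event processing honours the enabled mode.
PPS_CAPTUREASSERT = 0x01  # illustrative flag value for the assert edge
PPS_CAPTURECLEAR = 0x02   # illustrative flag value for the clear edge


class PpsState:
    def __init__(self, mode):
        self.mode = mode    # bit mask of enabled event types
        self.captured = 0   # edges captured, regardless of type
        self.events = []    # events that became visible to consumers

    def capture(self):
        # The event type is not passed in, so no filtering is possible here.
        self.captured += 1

    def event(self, event_type):
        # Drop event types that are not enabled in the mode mask.
        if not (event_type & self.mode):
            return
        self.events.append(event_type)


pps = PpsState(mode=PPS_CAPTUREASSERT)
for edge in (PPS_CAPTUREASSERT, PPS_CAPTURECLEAR, PPS_CAPTUREASSERT):
    pps.capture()    # data is captured for every edge
    pps.event(edge)  # but only enabled edges are processed into events
```

In this model three edges are captured while only the two assert edges are
reported, which is the "correct but not ideal" state the message describes.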
markj [Fri, 7 Aug 2015 19:56:22 +0000 (19:56 +0000)]
- Use an explicit "depends_on module kernel" guard in DTrace libraries that
reference types defined in the kernel. Otherwise dtrace(1) expects to find
CTF definitions for all referenced types, which is not very reasonable
when it is being used in a build environment. This was previously worked
around by adding "-x nolibs" to dtrace -h or -G invocations, but as of
r283025, dtrace(1) actually handles dependencies properly, so this is no
longer necessary.
- Remove "pragma ident" directives from DTrace libraries, as they're being
phased out upstream as well.
Submitted by: Krister Johansen <Krister.Johansen@isilon.com> [1]
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
M libdtrace/io.d
M libdtrace/ip.d
M libdtrace/nfs.d
M libdtrace/nfssrv.d
M libdtrace/psinfo.d
M libdtrace/regs_x86.d
M libdtrace/sched.d
M libdtrace/siftr.d
M libdtrace/tcp.d
M libdtrace/udp.d
mav [Fri, 7 Aug 2015 14:38:26 +0000 (14:38 +0000)]
Add unmapped I/O support to ata(4) driver.
The main problem was PIO mode support, which requires a KVA mapping.
Handle that case using the recently added pmap_quick_enter_page(9) KPI,
mapping data pages to KVA one at a time.
glebius [Fri, 7 Aug 2015 11:43:14 +0000 (11:43 +0000)]
Change KPI of how device drivers that provide wireless connectivity interact
with the net80211 stack.
Historical background: originally wireless devices created an interface,
just like Ethernet devices do. The name of the interface matched the name
of the driver that created it. Later, the wlan(4) layer was introduced, and
the wlanX interfaces became the actual interfaces, leaving the original
ones as "a parent interface" of wlanX. Kernelwise, the KPI between the
net80211 layer and a driver became a mix of methods that pass a pointer to
struct ifnet as an identifier and methods that pass a pointer to struct
ieee80211com. From the user's point of view, the parent interface just
hangs on in the ifconfig list, and the user can't do anything useful with it.
Now, the struct ifnet goes away. The struct ieee80211com is the only
KPI between a device driver and net80211. Details:
- The struct ieee80211com is embedded into the driver's softc.
- Packets are sent via the new ic_transmit method, which is very much like
the previous if_transmit.
- Bringing the parent up/down is done via the new ic_parent method, which
notifies the driver about any changes: the number of wlan(4) interfaces and
the number of them in promisc or allmulti state.
- Device-specific ioctls (if any) are received via the new ic_ioctl method.
- Packet/error accounting is done by the stack. In certain cases, when a
driver experiences errors and cannot attribute them to any specific
interface, the driver updates the ic_oerrors or ic_ierrors counters.
Details on interface configuration with new world order:
- The sequence of commands needed to bring up wireless DOESN'T change.
- /etc/rc.conf parameters DON'T change.
- List of devices that can be used to create wlan(4) interfaces is
now provided by net.wlan.devices sysctl.
Most drivers in this change were converted by me, except for wpi(4),
which was done by Andriy Voskoboinyk. Big thanks to Kevin Lo for testing
changes to at least 8 drivers. Thanks to Olivier Cochard, gjb@, mmoll@,
op@ and lev@, who also participated in testing. Details here:
https://wiki.freebsd.org/projects/ifnet/net80211
Still, the ndis, wtap, mwl, ipw, bwn, wi, upgt and uath drivers were not
tested. Changes to mwl, ipw, bwn, wi, upgt are trivial and the chance
of problems is low. The wtap driver wasn't compilable even before this
change. But the ndis driver is complex, and it is likely to be broken by
this commit. Help with testing and debugging it is appreciated.
kib [Fri, 7 Aug 2015 08:13:34 +0000 (08:13 +0000)]
The condition to use direct processing for the unmapped bio is
reverted. We can do direct processing when g_io_check() does not need
to perform transient remapping of the bio, otherwise the thread has to
sleep.
Reviewed by: mav (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kib [Fri, 7 Aug 2015 05:59:58 +0000 (05:59 +0000)]
Remove unused i386 header privatespace.h. For the native kernel, its
use was removed in r173592 (Nov 2007), yet Xen PV bits continued
referencing the privatespace structure, and were removed in r282274
(Apr 2015).
Discussed with: jhb
Sponsored by: The FreeBSD Foundation