mav [Tue, 22 Oct 2013 08:22:19 +0000 (08:22 +0000)]
Merge GEOM direct dispatch changes from the projects/camlock branch.
When safety requirements are met, it allows to avoid passing I/O requests
to GEOM g_up/g_down thread, executing them directly in the caller context.
That allows to avoid CPU bottlenecks in g_up/g_down threads, plus avoid
several context switches per I/O.
The defined now safety requirements are:
- caller should not hold any locks and should be reenterable;
- callee should not depend on GEOM dual-threaded concurency semantics;
- on the way down, if request is unmapped while callee doesn't support it,
the context should be sleepable;
- kernel thread stack usage should be below 50%.
To keep compatibility with GEOM classes not meeting above requirements
new provider and consumer flags added:
- G_CF_DIRECT_SEND -- consumer code meets caller requirements (request);
- G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done);
- G_PF_DIRECT_SEND -- provider code meets caller requirements (done);
- G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request).
Capable GEOM class can set them, allowing direct dispatch in cases where
it is safe. If any of requirements are not met, request is queued to
g_up or g_down thread same as before.
Such GEOM classes were reviewed and updated to support direct dispatch:
CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID, STRIPE,
VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI (LABEL,
MAP, FLASHMAP, etc).
To declare direct completion capability disk(9) KPI got new flag equivalent
to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION. da(4) and ada(4) disk
drivers got it set now thanks to earlier CAM locking work.
This change more then twice increases peak block storage performance on
systems with manu CPUs, together with earlier CAM locking changes reaching
more then 1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to
256 user-level threads).
bz [Tue, 22 Oct 2013 00:50:53 +0000 (00:50 +0000)]
Make netback compile without INET support in the kernel.
This shuld have been a problem since r230587. Not exactly sure why it
was not detected the last weeks with the tinderbox. I would assume
r255744 is what started to cause it.
sbruno [Mon, 21 Oct 2013 22:55:56 +0000 (22:55 +0000)]
Resolve clang warning about a use of syslog. By using a proper enforcement
of string format in a call so syslog
/usr/src/gnu/lib/libssp/../../../contrib/gcclibs/libssp/ssp.c:137:23:
warning: format string is not a string literal (potentially insecure)
[-Wformat-security]
syslog (LOG_CRIT, msg1);
^~~~
brooks [Mon, 21 Oct 2013 22:43:38 +0000 (22:43 +0000)]
Remove the isf(4) driver. It was created by accident and is subset of
the cfi(4) driver. It remained in the tree longer than would be ideal
due to the time required to bring cfi(4) to feature parity.
brooks [Mon, 21 Oct 2013 21:13:01 +0000 (21:13 +0000)]
MFP4: 223121 (FDT infrastructure portion)
Implement support for interrupt-parent nodes in simplebus. The current
implementation requires that device declarations have an interrupt-parent
node and that it point to a device that has registered itself as a
interrupt controller in fdt_ic_list_head and implements the fdt_ic
interface.
kib [Mon, 21 Oct 2013 16:46:12 +0000 (16:46 +0000)]
Add a resource limit for the total number of kqueues available to the
user. Kqueue now saves the ucred of the allocating thread, to
correctly decrement the counter on close.
Under some specific and not real-world use scenario for kqueue, it is
possible for the kqueues to consume memory proportional to the square
of the number of the filedescriptors available to the process. Limit
allows administrator to prevent the abuse.
This is kernel-mode side of the change, with the user-mode enabling
commit following.
Reported and tested by: pho
Discussed with: jmg
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
kib [Mon, 21 Oct 2013 16:44:53 +0000 (16:44 +0000)]
Add a resource limit for the total number of kqueues available to the
user. Kqueue now saves the ucred of the allocating thread, to
correctly decrement the counter on close.
Under some specific and not real-world use scenario for kqueue, it is
possible for the kqueues to consume memory proportional to the square
of the number of the filedescriptors available to the process. Limit
allows administrator to prevent the abuse.
This is kernel-mode side of the change, with the user-mode enabling
commit following.
Reported and tested by: pho
Discussed with: jmg
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
kib [Mon, 21 Oct 2013 16:22:51 +0000 (16:22 +0000)]
Reset function on SandyBridge holds the gt_lock for the whole duration
already. Also, according to the specs, GDRST register is not in the
power well, so the forcewake for reset status read is excessive for
this reason.
Use plain register read for waiting of the reset completion
notification, to avoid gt_lock recursion. Linux upstream did the
similar change, but their code was already restructured.
Reported by: ray
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
mav [Mon, 21 Oct 2013 12:00:26 +0000 (12:00 +0000)]
Merge CAM locking changes from the projects/camlock branch to radically
reduce lock congestion and improve SMP scalability of the SCSI/ATA stack,
preparing the ground for the coming next GEOM direct dispatch support.
Replace big per-SIM locks with bunch of smaller ones:
- per-LUN locks to protect device and peripheral drivers state;
- per-target locks to protect list of LUNs on target;
- per-bus locks to protect reference counting;
- per-send queue locks to protect queue of CCBs to be sent;
- per-done queue locks to protect queue of completed CCBs;
- remaining per-SIM locks now protect only HBA driver internals.
While holding LUN lock it is allowed (while not recommended for performance
reasons) to take SIM lock. The opposite acquisition order is forbidden.
All the other locks are leaf locks, that can be taken anywhere, but should
not be cascaded. Many functions, such as: xpt_action(), xpt_done(),
xpt_async(), xpt_create_path(), etc. are no longer require (but allow) SIM
lock to be held.
To keep compatibility and solve cases where SIM lock can't be dropped, all
xpt_async() calls in addition to xpt_done() calls are queued to completion
threads for async processing in clean environment without SIM lock held.
Instead of single CAM SWI thread, used for commands completion processing
before, use multiple (depending on number of CPUs) threads. Load balanced
between them using "hash" of the device B:T:L address.
HBA drivers that can drop SIM lock during completion processing and have
sufficient number of completion threads to efficiently scale to multiple
CPUs can use new function xpt_done_direct() to avoid extra context switch.
Make ahci(4) driver to use this mechanism depending on hardware setup.
bdrewery [Mon, 21 Oct 2013 10:09:48 +0000 (10:09 +0000)]
Fix 'make delete-old-libs' and 'make check-libs' to delete .debug
files created by WITH_DEBUG_FILES. Also cleanup .symbols files from
the period between r244236 when .symbols were supported and r251512
when they were renamed to .debug.
Only propose to delete a .debug file if the corresponding library
itself was deleted already.
Reported by: des
Reviewed by: emaste (earlier version)
Approved by: bapt
MFC after: 3 days
mav [Mon, 21 Oct 2013 06:44:55 +0000 (06:44 +0000)]
MFprojects/camlock r256619:
Restore BIO_UNMAPPED and BIO_TRANSIENT_MAPPING in biodonne() when unmapping
temporary mapped buffer. That fixes double unmap if biodone() called twice
for the same BIO (but with different done methods).
Move mapping removal before calling bio_done() method. I believe that it is
very wrong to do anything to BIO after reporting completion. kib@ thinks
it was done for some forgotten now case when bio_done() method needed mapped
buffer. But 1) if BIO was sent as unmapped, then IMO done() should be called
in the same way; 2) IMO there is no guatantee that buffer will be mapped at
this point at all, for example, if all underlying stack supports unmapped
I/O, so bio_done() handler can not expect that.
mav [Mon, 21 Oct 2013 06:04:39 +0000 (06:04 +0000)]
Partial MFproject/camlock r256671:
Fix several target mode SIMs to not blindly clear ccb_h.flags field of
ATIO CCBs. Not all CCB flags there belong to them.
glebius [Mon, 21 Oct 2013 05:14:00 +0000 (05:14 +0000)]
Provide a working example line for an interface with 1 address running
with CARP.
Currently, we've got a problem that interface isn't IFF_UP at the time
we assign it a redundant address, and the latter gets stuck in INIT state.
Additional SIOCSIFFLAGS from ifconfig(8) kicks it to a working state.
A proper fix is kernel side and appeared to be non-trivial, not to be
checked in before 10.0-RELEASE.
Submitted by: Ole Myhre <ole.myhre dataoppdrag.no>
markj [Mon, 21 Oct 2013 04:15:55 +0000 (04:15 +0000)]
When fetching function arguments out of a frame on amd64, explicitly select
the register based on the argument index rather than relying on the fields
in struct reg to be in the right order. This assumption is incorrect on
FreeBSD and generally led to bogus argument values for the sixth argument
of PID and USDT probes; the first five are passed directly to dtrace_probe()
via the fasttrap trap handler and so were correctly handled.
mckusick [Mon, 21 Oct 2013 00:28:02 +0000 (00:28 +0000)]
Restructuring of the soft updates code to set it up so that the
single kernel-wide soft update lock can be replaced with a
per-filesystem soft-updates lock. This per-filesystem lock will
allow each filesystem to have its own soft-updates flushing thread
rather than being limited to a single soft-updates flushing thread
for the entire kernel.
Move soft update variables out of the ufsmount structure and into
their own mount_softdeps structure referenced by ufsmount field
um_softdep. Eventually the per-filesystem lock will be in this
structure. For now there is simply a pointer to the kernel-wide
soft updates lock.
Change all instances of ACQUIRE_LOCK and FREE_LOCK to pass the lock
pointer in the mount_softdeps structure instead of a pointer to the
kernel-wide soft-updates lock.
Replace the five hash tables used by soft updates with per-filesystem
copies of these tables allocated in the mount_softdeps structure.
Several functions that flush dependencies when too many are allocated
in the kernel used to operate across all filesystems. They are now
parameterized to flush dependencies from a specified filesystem.
For now, we stick with the round-robin flushing strategy when the
kernel as a whole has too many dependencies allocated.
While there are many lines of changes, there should be no functional
change in the operation of soft updates.
Tested by: Peter Holm and Scott Long
Sponsored by: Netflix
mckusick [Sun, 20 Oct 2013 22:21:01 +0000 (22:21 +0000)]
Fourth of several cleanups to soft dependency implementation.
Add KASSERTS that soft dependency functions only get called
for filesystems running with soft dependencies. Calling these
functions when soft updates are not compiled into the system
become panic's.
No functional change.
Tested by: Peter Holm and Scott Long
Sponsored by: Netflix
mckusick [Sun, 20 Oct 2013 21:11:40 +0000 (21:11 +0000)]
Third of several cleanups to soft dependency implementation.
Ensure that softdep_unmount() and softdep_setup_sbupdate()
only get called for filesystems running with soft dependencies.
No functional change.
Tested by: Peter Holm and Scott Long
Sponsored by: Netflix
ian [Sun, 20 Oct 2013 21:07:38 +0000 (21:07 +0000)]
Add a driver for the Freescale Fast Ethernet Controller found on various
Freescale SoCs including the i.MX series. This also works for the newer
SoCs with the ENET gigabit controller, but doesn't use any of the new
hardware features other than enabling gigabit speed.
mckusick [Sun, 20 Oct 2013 20:41:38 +0000 (20:41 +0000)]
First of several cleanups to soft dependency implementation.
Convert three functions exported from ffs_softdep.c to static
functions as they are not used outside of ffs_softdep.c.
No functional change.
Tested by: Peter Holm and Scott Long
Sponsored by: Netflix
jilles [Sun, 20 Oct 2013 20:10:31 +0000 (20:10 +0000)]
pathchk: Ensure bytes >= 128 are considered non-portable characters.
This was not broken on architectures such as ARM where char is unsigned.
Also, remove the first non-portable character from the output. POSIX does
not require this, and printing the first byte may yield an invalid byte
sequence with UTF-8.
nwhitehorn [Sun, 20 Oct 2013 18:40:55 +0000 (18:40 +0000)]
Since the PS3 port was committed, the AIM nexus device works perfectly fine
on all PowerPC platforms, whether or not they have Open Firmware. Remove
some more duplication and have there be only one nexus driver.
nwhitehorn [Sun, 20 Oct 2013 18:38:19 +0000 (18:38 +0000)]
Some nexus devices add wildcard children. Since fdtbus_probe returned
BUS_PROBE_DEFAULT, it would attach to them all, producing both many
fdtbus instances and preventing other devices from attaching. Instead
return BUS_PROBE_NOWILDCARD, which exists for exactly this purpose.
nwhitehorn [Sun, 20 Oct 2013 16:37:03 +0000 (16:37 +0000)]
Replace the two almost-exactly-identical AIM and Book-E clock.c
implementations with a single one after the application of a very small
amount of #ifdef.
nwhitehorn [Sun, 20 Oct 2013 16:14:03 +0000 (16:14 +0000)]
Unify the AIM and Book-E vm_machdep.c implementations, which previously
differed only with respect to the AIM version not following style(9) and
some additional features for 64-bit systems and machines with direct maps
in the AIM implementation that are no-ops on Book-E (at least for now).
andrew [Sun, 20 Oct 2013 15:13:32 +0000 (15:13 +0000)]
Merge from projects/arm_eabi_vfp r255380:
Fix the VCVT instruction. It must round towards zero when converting from
a floating-point to an integer value. This was not the case causing issues
when printing certain values.
There is a VCVTR instruction that will round depending on the current
rounding mode. We don't yet support this instruction, or setting the
rounding mode.
cperciva [Sat, 19 Oct 2013 21:37:06 +0000 (21:37 +0000)]
Add support for "first boot" rc.d scripts. [1]
These scripts, containing
# KEYWORD: firstboot
will only be run if a sentinel file (default: /firstboot, configurable
via the rc.conf ${firstboot_sentinel} variable) exists; this sentinel
file will be deleted at the end of the boot process.
Scripts can request that the system reboot after the first boot by
creating the file ${firstboot_sentinel}-reboot.
This functionality is expected to be useful for embedded systems and
virtual machine images, where it may be desirable to
(a) download and install updates which became available between when
the image was created and when it was "turned on";
(b) download and install packages which may be newer than those
which were available when the image was created;
(c) install packages which run binaries during their install process,
bypassing the problem of cross-architecture installs;
(d) resize filesystems to match the disk onto which a VM image was
installed;
(e) perform initialization tasks relevant to cloud systems (e.g.,
Amazon's Elastic Compute Cloud);
and likely to perform many other one-time initialization functions.
Document this new functionality in rc.conf(5) and rc(8). [2]
jmg [Sat, 19 Oct 2013 18:51:06 +0000 (18:51 +0000)]
Enable the automatic creation of a certificate (if one does not exists)
and enable the usage by sendmail if sendmail is enabled. Include and
document knobs to disable this feature and also set the Common Name of
the certificate created.
As the certificate is signed w/ a discarded key, it only helps prevent
Eve, but not Malory from knowing the contents of the emails.
This means that new installs (and people that use the updated freebsd.mc
file) will automaticly have STARTTLS enabled allowing incoming email to
be encrypted in most cases.
Reviewed by: gshapiro
MFC after: 3 days
Security: Yes, please.
des [Sat, 19 Oct 2013 09:59:11 +0000 (09:59 +0000)]
Do not error out when adding an interface to a group to which it
already belongs or removing it from a group to which it does not
belong. This makes it possible to include group memberships in
ifconfig_foo0 in rc.conf without fear of breaking "service netif
restart foo0".
rpaulo [Sat, 19 Oct 2013 06:51:34 +0000 (06:51 +0000)]
Plug atf-run into the 'test' target.
If atf-run exists in ATF_PREFIX and if ALLOW_DEPRECATED_ATF_TOOLS has
been set to yes, the test target is automatically defined to use it
for the execution of test programs.
Submitted by: Julio Merino jmmv google.com
MFC after: 2 weeks
rpaulo [Sat, 19 Oct 2013 06:50:56 +0000 (06:50 +0000)]
Add the automatic generation of Kyuafile files.
These files are generated from bsd.test.mk because kyua is able to run
test programs implemented using different libraries/frameworks. In
order to make this possible, this change also extends the various
*.test.mk file to explicitly indicate the interface of every test
program.
Submitted by: Julio Merino jmmv google.com
MFC after: 2 weeks
rpaulo [Sat, 19 Oct 2013 06:50:17 +0000 (06:50 +0000)]
Add the automatic generation of Atffile files.
These are only used by the deprecated atf-run and atf-report tools.
Generating them is easy and provides a mechanism for people to
experiment with these tools if they wish.
But, because these tools and files are deprecated, doing this only
happens if the user has explicitly set ALLOW_DEPRECATED_ATF_TOOLS
to yes.
Submitted by: Julio Merino jmmv google.com
MFC after: 2 weeks
rpaulo [Sat, 19 Oct 2013 06:48:49 +0000 (06:48 +0000)]
Clearly split the logic to build ATF and plain tests apart.
This change introduces a new plain.test.mk file that provides the build
infrastructure to build test programs that don't use any framework.
Most of the code previously in bsd.test.mk moves to plain.test.mk and
atf.test.mk is extended with the missing pieces.
In doing so, this change pushes all test program building logic to the
various *.test.mk files instead of trying to reuse some tiny bits.
In fact, this attempt to reuse some definitions makes the code harder
to read and harder to extend.
The clear benefit of this is that the interface of bsd.test.mk is now
clearly delimited.
Submitted by: Julio Merino jmmv google.com
MFC after: 2 weeks
rrs [Sat, 19 Oct 2013 06:47:02 +0000 (06:47 +0000)]
Corrects the Kirkwood dreamplug to use
the right register for power managment. It
was incorrectly using the clock register
which also caused the status to be the
opposite of what it is supposed to be.
1 - its disabled
0 - its enabled
Per kirkwood spec FSS_88F6180_9x_6281_OpenSource.pdf
Add an option ATSE_CFI_HACK to allow memory mapped CFI devices to have
their address range allocated sharable so that atse(4) can find it's
Ethernet address in the expected location.
We intend to remove this hack once the BERI platform has a loader.
Add atse(4), a driver for the Altera Triple Speed Ethernet MegaCore.
The current driver support gigabit Ethernet speeds only and works with
the MegaCore only in the internal FIFO configuration in the soon to be
open sourced BERI CPU configuration.
Submitted by: bz
MFC after: 3 days
Sponsored by: DARPA/AFRL
hselasky [Fri, 18 Oct 2013 17:38:57 +0000 (17:38 +0000)]
Improve XHCI stability. When a command timeout happens, the command
should be aborted else the command queue can stop. Refer to section
"4.6.1.2" of the XHCI specification.
jilles [Fri, 18 Oct 2013 12:35:12 +0000 (12:35 +0000)]
sh: Remove one syscall when waiting for a foreground job.
The getpgrp() call is unnecessary: if there is no job control then the
result was not used at all and if there is job control then we are not a
subshell and our process group ID is equal to our process ID (rootpid).
trasz [Fri, 18 Oct 2013 09:14:19 +0000 (09:14 +0000)]
Make geom_label(4) resize-aware. This fixes a situation when "gpart resize"
would resize a partition, but label providers - e.g. /dev/gptid/XXX - would
stay the same size.