Implement 'fast path' for the vm page fault handler. Or, it could be
called a scalable path. When several preconditions hold, the vm
object lock for the object containing the faulted page is taken in
read mode, instead of write, which allows parallel fault processing
in the region.
Namely, the fast path is taken when the faulted page already exists
and does not need copy on write, is already fully valid, and not busy.
For technical reasons, fast path is avoided when the fault is the
first write on the vnode object, or when the fault is for wiring or
debugger read or write.
On the fast path, pmap_enter(9) is passed the PMAP_ENTER_NOSLEEP flag,
since the object lock is held. Pmap might fail to create the entry, in
which case we fall back to the slow path.
Reviewed by: alc
Tested by: pho (previous version)
Hardware provided and hosted by: The FreeBSD Foundation and
Sentex Data Communications
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
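The preconditions listed in the message above can be read as a single
predicate. The following is only an illustrative sketch: struct
fault_state, its fields, and can_use_fast_path() are hypothetical names
invented here, not the kernel's actual data structures or code.

```c
#include <stdbool.h>

/* Hypothetical summary of one page fault; not a kernel structure. */
struct fault_state {
	bool page_exists;	/* faulted page is already in the object */
	bool needs_cow;		/* fault requires copy on write */
	bool fully_valid;	/* page contents are fully valid */
	bool busy;		/* page is busy */
	bool first_vnode_write;	/* first write on the vnode object */
	bool wiring;		/* fault is for wiring */
	bool debugger_access;	/* debugger read or write */
};

/* True only when every precondition named in the commit message holds. */
static bool
can_use_fast_path(const struct fault_state *f)
{
	if (!f->page_exists || f->needs_cow || !f->fully_valid || f->busy)
		return (false);
	if (f->first_vnode_write || f->wiring || f->debugger_access)
		return (false);
	return (true);
}
```

When the predicate is false, or when the non-sleeping pmap_enter(9)
call fails, the handler would take the slow path with the object lock
held in write mode.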
Alan Somers [Thu, 14 Aug 2014 22:33:56 +0000 (22:33 +0000)]
Convert devd's client socket to type SOCK_SEQPACKET.
This change consists of two merges from projects/zfsd/head along with the
addition of an ATF test case for the new functionality.
sbin/devd/tests/Makefile
sbin/devd/tests/client_test.c
Add ATF test cases for reading events from both devd socket types.
r266519:
sbin/devd/devd.8
sbin/devd/devd.cc
Create a new socket, of type SOCK_SEQPACKET, for communicating with
clients. SOCK_SEQPACKET sockets preserve record boundaries,
simplifying code in the client. The old SOCK_STREAM socket is retained
for backwards-compatibility with existing clients.
r269993:
sbin/devd/devd.8
Fix grammar bug.
CR: https://reviews.freebsd.org/rS266519
MFC after: 5 days
Sponsored by: Spectra Logic
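The record-boundary property that motivates the change can be shown
without devd itself. The sketch below uses a local socketpair(2)
instead of devd's socket, and first_record_length() is a hypothetical
helper written for this illustration. It sends two records and shows
that a single recv(2) returns exactly the first one, however large the
receive buffer is.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <string.h>
#include <unistd.h>

/*
 * Send rec1 and rec2 over a SOCK_SEQPACKET socketpair, then read once.
 * Returns the byte count of that single recv(), or -1 on error.
 * With SOCK_SEQPACKET the count equals strlen(rec1); a SOCK_STREAM
 * socket could instead return both records merged into one read.
 */
static ssize_t
first_record_length(const char *rec1, const char *rec2, char *buf,
    size_t buflen)
{
	int sv[2];
	ssize_t n;

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) == -1)
		return (-1);
	if (send(sv[0], rec1, strlen(rec1), 0) == -1 ||
	    send(sv[0], rec2, strlen(rec2), 0) == -1)
		n = -1;
	else
		n = recv(sv[1], buf, buflen, 0);
	close(sv[0]);
	close(sv[1]);
	return (n);
}
```

A client therefore does not need to scan the byte stream for event
terminators. Each successful recv(2) yields one whole event.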
Gleb Smirnoff [Thu, 14 Aug 2014 18:57:46 +0000 (18:57 +0000)]
- Count global pf(4) statistics in counter(9).
- Do not count global number of states and of src_nodes,
use uma_zone_get_cur() to obtain values.
- Struct pf_status becomes merely an ioctl API structure,
and moves to netpfil/pf/pf.h with its constants.
- V_pf_status is now of type struct pf_kstatus.
Warner Losh [Thu, 14 Aug 2014 16:17:23 +0000 (16:17 +0000)]
Only install the boot loader if it actually exists. This is a stop-gap
change, since larger changes to use geom more exclusively to create
partitions are in the works.
Warner Losh [Thu, 14 Aug 2014 16:01:51 +0000 (16:01 +0000)]
ins is set but never used when we're not doing software single
stepping. Only set it when we are doing that, bending style(9) rules
a little to avoid even worse #ifdef soup.
Warner Losh [Thu, 14 Aug 2014 16:01:46 +0000 (16:01 +0000)]
Disable all inline warnings on gcc >= 4.3. Not sure exactly where the
cutover is, but we need better tools to cope with inline tuning per
compiler version than we have. This is a quick bandaid until such
tools are around.
Warner Losh [Thu, 14 Aug 2014 16:01:38 +0000 (16:01 +0000)]
Delete pp_isadma. It isn't used, and the code that used it has been
commented out (temporarily) since 1998 when this driver hit the
tree. Also, no need to compute the ethernet header and then never use
it.
Warner Losh [Thu, 14 Aug 2014 16:01:33 +0000 (16:01 +0000)]
Streamline format extensions. Either the compiler supports them, and
we enable them and format warnings, or it doesn't, and we disable
format warnings because the kernel uses the extensions pervasively.
Alan Somers [Thu, 14 Aug 2014 14:59:40 +0000 (14:59 +0000)]
Skip pgrep-j and pkill-j if jail or jls is not installed.
Even though jail is part of the base system, it can be disabled by src.conf
settings. Therefore, it should be listed as a required program for tests
that use it.
CR: D603
MFC after: 3 days
Sponsored by: Spectra Logic
Ed Maste [Thu, 14 Aug 2014 13:45:02 +0000 (13:45 +0000)]
Fix euro symbol in copied keymaps
These were copied from share/syscons/keymaps/??.iso.kbd. They were
not actually ISO 8859-1 as assumed. When interpreted as Unicode they
ended up with the generic currency sign (U+00A4) instead of the euro
(U+20AC).
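The message above does not say which encoding the files really used.
Assuming it was ISO 8859-15, the sketch below shows why byte 0xA4
decodes to U+00A4 when treated as ISO 8859-1 and to U+20AC otherwise.
Both function names are hypothetical.

```c
#include <stdint.h>

/* ISO 8859-1 maps every byte to the code point of the same value. */
static uint32_t
latin1_to_unicode(unsigned char b)
{
	return (b);
}

/* ISO 8859-15 differs from ISO 8859-1 in exactly eight positions. */
static uint32_t
latin9_to_unicode(unsigned char b)
{
	switch (b) {
	case 0xA4: return (0x20AC);	/* EURO SIGN */
	case 0xA6: return (0x0160);	/* S WITH CARON */
	case 0xA8: return (0x0161);	/* s WITH CARON */
	case 0xB4: return (0x017D);	/* Z WITH CARON */
	case 0xB8: return (0x017E);	/* z WITH CARON */
	case 0xBC: return (0x0152);	/* LIGATURE OE */
	case 0xBD: return (0x0153);	/* ligature oe */
	case 0xBE: return (0x0178);	/* Y WITH DIAERESIS */
	default:   return (b);
	}
}
```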
Xin LI [Thu, 14 Aug 2014 05:31:39 +0000 (05:31 +0000)]
Add a new loader tunable, vm.kmem_zmax which allows a system administrator
to limit the maximum allocation size that malloc(9) would consider using
the UMA cache allocator as backend.
Xin LI [Thu, 14 Aug 2014 05:13:24 +0000 (05:13 +0000)]
Re-instate UMA cached backend for 4K - 64K allocations. New consumers
like geli(4) use malloc(9) to allocate temporary buffers that get
freed shortly afterwards, causing frequent TLB shootdowns, as observed
in an hwpmc-supported flame graph.
Warner Losh [Thu, 14 Aug 2014 04:21:31 +0000 (04:21 +0000)]
Add AIC to at91sam9260 support, now that it is needed for multipass to
work. This gets my AT91SAM9260-based boards almost booting with
current in multipass. The MCI driver is broken, but it was equally
broken before multipass.
Warner Losh [Thu, 14 Aug 2014 04:20:13 +0000 (04:20 +0000)]
From https://sourceware.org/ml/newlib/2014/msg00113.html
By Richard Earnshaw at ARM
>
>GCC has for a number of years provided a set of pre-defined macros for
>use with determining the ISA and features of the target during
>pre-processing. However, the design was always somewhat cumbersome in
>that each new architecture revision created a new define and then
>removed the previous one. This meant that it was necessary to keep
>updating the support code simply to recognise a new architecture being
>added.
>
>The ACLE specification (ARM C Language Extensions)
>(http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.set.swdev/index.html)
>provides a much more suitable interface and GCC has supported this
>since gcc-4.8.
>
>This patch makes use of the ACLE pre-defines to map to the internal
>feature definitions. To support older versions of GCC a compatibility
>header is provided that maps the traditional pre-defines onto the new
>ACLE ones.
Stop using __FreeBSD_ARCH_armv6__ and switch to __ARM_ARCH >= 6 in the
couple of places in tree. clang already implements ACLE. Add a define
that says we implement version 1.1, even though the implementation
isn't quite complete.
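A minimal sketch of the kind of test this switches to. __ARM_ARCH is
the ACLE pre-define named above; HAVE_ARMV6_FEATURES and
have_armv6_features() are hypothetical names used only for this
illustration.

```c
/*
 * One numeric comparison covers ARMv6 and every later revision, so no
 * per-architecture macro list needs updating. On non-ARM targets
 * __ARM_ARCH is not defined and the test is false.
 */
#if defined(__ARM_ARCH) && __ARM_ARCH >= 6
#define HAVE_ARMV6_FEATURES 1
#else
#define HAVE_ARMV6_FEATURES 0
#endif

/* Returns 1 when compiled for ARMv6 or newer, 0 otherwise. */
static int
have_armv6_features(void)
{
	return (HAVE_ARMV6_FEATURES);
}
```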
Pedro F. Giffuni [Wed, 13 Aug 2014 21:18:31 +0000 (21:18 +0000)]
Use "NO NAME" as the default unnamed label.
Microsoft recommends avoiding the use of spaces in the
string structures for FAT. Unfortunately they do just
that by default in the case of unlabeled filesystems.
Follow the default MS behavior to avoid confusion in
common tools like file(1). This was actually the
default behavior before r203868.
Obtained from: NetBSD (CVS rev. 1.39)
MFC after: 3 days
Dimitry Andric [Wed, 13 Aug 2014 16:42:44 +0000 (16:42 +0000)]
Supplement r259111 by also using correct casts in gcc's emmintrin.h for
the first argument of the following builtin function:
* __builtin_ia32_psrlqi128() takes __v2di instead of __v4si
This should fix the following errors when building the graphics/webp
port with base gcc:
lossless_sse2.c:403: error: incompatible type for argument 1 of '__builtin_ia32_psrlqi128'
lossless_sse2.c:404: error: incompatible type for argument 1 of '__builtin_ia32_psrlqi128'
Reported by: Jos Chrispijn <ports@webrz.net>
MFC after: 3 days
Michael Tuexen [Wed, 13 Aug 2014 15:50:16 +0000 (15:50 +0000)]
Add support for the SCTP_PR_STREAM_STATUS and SCTP_PR_ASSOC_STATUS
socket options. This includes managing the corresponding stat counters.
Add the SCTP_DETAILED_STR_STATS kernel option to control per policy
counters on every stream. The default is off and only an aggregated
counter is available. This is sufficient for the RTCWeb use case.
Add a knob LIBPTHREAD_BIGSTACK_MAIN, which instructs libthr to leave
the whole RLIMIT_STACK-sized region of the kernel-allocated stack as
the stack of the main thread.
By default, the main thread stack is clamped at 2MB (4MB on 64bit
ABIs) and the rest is used for other threads' stack allocation. Since
there is no programmatic way to adjust the size of the main thread
stack (pthread_attr_setstacksize() is too late), the knob allows the
user to manage the main stack size for both single-threaded and
multi-threaded processes with the rlimit.
Reported by: "Ivan A. Kosarev" <ivan@ivan-labs.com>
Tested by: dim
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
If vm_page_grab() allocates a new page, the page is not inserted into
a page queue even when the allocation is not wired. It is the
responsibility of the vm_page_grab() caller to ensure that the page
does not end up on the vm_object queue without also being on a
pagedaemon queue, which would effectively create an unpageable unwired
page.
In exec_map_first_page() and vm_imgact_hold_page(), activate the page
immediately after unbusying it, to avoid a leak.
In uiomove_object_page(), deactivate the page before the object is
unlocked. There is no leak, since the page is deactivated after
uiomove_fromphys() finished. But allowing a non-queued non-wired page
in the unlocked object queue makes it impossible to assert that a leak
does not happen in other places.
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Enji Cooper [Wed, 13 Aug 2014 04:56:27 +0000 (04:56 +0000)]
Integrate lib/libutil into the build/kyua
Remove the .t wrappers
Rename all of the TAP test applications from test-<test> to
<test>_test to match the convention described in the TestSuite
wiki page
humanize_number_test.c:
- Fix -Wformat warnings with counter variables
- Fix minor style(9) issues:
-- Header sorting
-- Variable declaration alignment/sorting in main(..)
-- Fit the lines in <80 columns
- Fix an off by one index error in the testcase output [*]
- Remove unnecessary `extern char * optarg;` (this is already provided by
unistd.h)
Rui Paulo [Wed, 13 Aug 2014 01:27:51 +0000 (01:27 +0000)]
Make sure the DTrace header files are built before depend and before
the build starts.
This adds a new variable DHDRS that contains a list of all DTrace
header files. Then, we use the beforedepend hook to make sure the
header files are built.
Introduce a beforebuild dependency (from projects/bmake) based on
feedback from Simon J. Gerraty. This lets us generate the header
files without running make depend.
Enji Cooper [Tue, 12 Aug 2014 17:51:26 +0000 (17:51 +0000)]
Complete the usr.bin/yacc kyua integration work I originally
submitted via r268811
- Install the Kyuafile by adding FILES to FILESGROUPS
- Run the testcases with an unprivileged user
Some of the testcases depend upon behavior that's broken when
run as root on FreeBSD because of how permissions are treated
with access(2) vs eaccess(2), open(2), etc
- Simplify the test driver to just inspect the exit code from
run_test because it now exits with 0 if successful and exits
with !0 if unsuccessful
- Don't do ad hoc temporary directory creation/deletion; let Kyua
handle that
- Add entries for files removed in r268811 to
OptionalObsoleteFiles.inc
Many compilers may optimize away the overflow check `msg + l < msg',
where `msg' is a pointer and `l' is an integer, because pointer
overflow is undefined behavior in C.
Use a safe precondition test `l >= eom - msg' instead.
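A minimal sketch of the safe form. The names msg, eom and l follow the
message above; fits_in_buffer() is a hypothetical helper, and it
assumes msg and eom point into the same buffer with msg <= eom.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 1 if l passes the commit's test, that is if l < eom - msg,
 * and 0 if l >= eom - msg. The comparison is done on lengths, so no
 * out-of-range pointer such as msg + l is ever formed.
 */
static int
fits_in_buffer(const unsigned char *msg, const unsigned char *eom, size_t l)
{
	assert(msg <= eom);
	/* eom - msg is non-negative here, so the cast is lossless. */
	return (l < (size_t)(eom - msg));
}
```

Unlike `msg + l < msg', this comparison has defined behavior for every
value of l, so a compiler cannot legally remove it.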
- Fix radix tree memory leakage when unloading modules using radix
trees. This happens because the logic inserting items into the radix
tree is allocating empty radix levels, when index zero does not
contain any items.
- Add proper error case handling, so that the radix tree does not end
up in a bad state, if memory cannot be allocated during insertion of
an item.
- Add check for inserting NULL items into the radix tree.
- Add check for radix tree getting too big.
Revision r269457 removed the Giant around mount and unmount code, but
r269533, which was tested before r269457 was committed, implicitly
relied on the Giant to protect the manipulations of the softdepmounts
list. Use softdep global lock consistently to guarantee the list
structure now.
Insert the new struct mount_softdeps into the softdepmounts only after
it is sufficiently initialized, to prevent softdep_speedup() from
accessing bare memory. Similarly, remove struct mount_softdeps for
the unmounted filesystem from the tailq before destroying structure
rwlock.
Reported and tested by: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Add sysctl and loader tunable kern.geom.part.mbr.enforce_chs that is set
by default. It can be used to disable the automatic alignment to CHS
geometry that GEOM_PART_MBR performs.
Alan Cox [Mon, 11 Aug 2014 17:45:41 +0000 (17:45 +0000)]
Change {_,}pmap_allocpte() so that they look for the flag PMAP_ENTER_NOSLEEP
instead of M_NOWAIT/M_WAITOK when deciding whether to sleep on page table
page allocation. (The same functions in the i386/xen and mips pmap
implementations already use PMAP_ENTER_NOSLEEP.)
Glen Barber [Mon, 11 Aug 2014 16:31:28 +0000 (16:31 +0000)]
In arm/release.sh, continue if 'xdev-links' target fails
where the target is not valid (stable/10), instead of doing
per-branch evaluation on if xdev-links needs to be invoked.
Roger Pau Monné [Mon, 11 Aug 2014 15:37:02 +0000 (15:37 +0000)]
blkfront: add support for unmapped IO
Using unmapped IO is really beneficial when running inside of a VM,
since it avoids IPIs to other vCPUs in order to invalidate the
mappings.
This patch adds unmapped IO support to blkfront. The following test
results have been obtained when running on a Xen host without HAP:
PVHVM
3165.84 real 6354.17 user 4483.32 sys
PVHVM with unmapped IO
2099.46 real 4624.52 user 2967.38 sys
This is because when running using shadow page tables TLB flushes and
range invalidations are much more expensive, so using unmapped IO
provides a very important performance boost.
Warner Losh [Mon, 11 Aug 2014 14:50:49 +0000 (14:50 +0000)]
Remove dependence on source tree options. Move all kernel module
options into kern.opts.mk and change all the places where we use
src.opts.mk to pull in the options. Conditionally define SYSDIR and
use SYSDIR/conf/kern.opts.mk instead of a CURDIR path. Replace all
instances of CURDIR/../../etc with SYSDIR, but only in the affected
files.
As a special compatibility hack, include bsd.own.mk at the top of
kern.opts.mk to allow the bare build of sys/modules to work on older
systems. If the defaults ever change between 9.x, 10.x and current for
these options, however, you'll wind up with the host OS' defaults
rather than the -current defaults. This hack will be removed when
we no longer need to support this build scenario.
Fix too long (seed length >12 chars) challenge handling.
1) " ext" length should be included into OPIE_CHALLENGE_MAX (as all places
of opie code expect that).
2) Overflow check in challenge.c is off by 1 even with corrected
OPIE_CHALLENGE_MAX.
3) When fallback to randomchallenge() happens and rval is 0 (i.e.
challenge is too long), its value should be set to error state too.
To demonstrate the bug, run opiepasswd with valid seed:
opiepasswd -s 1234567890123456
and notice that it falls back to randomchallenge() (i.e. no 1234567890123456 in the prompt).