Gleb Smirnoff [Fri, 22 Nov 2013 19:22:26 +0000 (19:22 +0000)]
The DIOCKILLSRCNODES operation was implemented with O(m*n) complexity,
where "m" is number of source nodes and "n" is number of states. Thus,
on a heavily loaded router its processing consumed a lot of CPU time.
Reimplement it with O(m+n) complexity. We first scan through source
nodes and disconnect matching ones, putting them on the freelist and
marking with a cookie value in their expire field. Then we scan through
the states, detecting references to source nodes with a cookie, and
disconnect them as well. Then the freelist is passed to pf_free_src_nodes().
In collaboration with: Kajetan Staszkiewicz <kajetan.staszkiewicz innogames.de>
PR: kern/176763
Sponsored by: InnoGames GmbH
Sponsored by: Nginx, Inc.
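The two-pass scheme described above can be sketched as follows. This is only an illustration under simplifying assumptions: the struct layouts, the singly linked table, the KILL_COOKIE value and the function names are hypothetical stand-ins, not pf's actual data structures or API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for pf's source node and state. */
struct src_node {
	int              addr;    /* key matched by the kill request */
	unsigned         expire;  /* reused to carry the cookie mark */
	struct src_node *next;    /* link in the table or the freelist */
};

struct state {
	struct src_node *src;     /* reference to a source node, or NULL */
};

#define KILL_COOKIE 1u        /* assumed marker value, not pf's real one */

/*
 * Pass 1, O(m): unlink every node matching addr from *table, mark it
 * with the cookie and push it onto *freelist. Returns the number of
 * nodes unlinked.
 */
size_t
unlink_matching(struct src_node **table, int addr, struct src_node **freelist)
{
	size_t killed = 0;
	struct src_node **pp = table;

	while (*pp != NULL) {
		struct src_node *sn = *pp;

		if (sn->addr == addr) {
			*pp = sn->next;           /* disconnect from the table */
			sn->expire = KILL_COOKIE; /* mark for pass 2 */
			sn->next = *freelist;
			*freelist = sn;
			killed++;
		} else {
			pp = &sn->next;
		}
	}
	return (killed);
}

/*
 * Pass 2, O(n): drop every state reference that points at a marked
 * node. Returns the number of references dropped.
 */
size_t
detach_states(struct state *states, size_t n)
{
	size_t detached = 0;

	for (size_t i = 0; i < n; i++) {
		if (states[i].src != NULL &&
		    states[i].src->expire == KILL_COOKIE) {
			states[i].src = NULL;
			detached++;
		}
	}
	return (detached);
}
```

Each list is walked exactly once, so the cost is m + n instead of one state scan per matching source node. After both passes the freelist holds nodes that nothing references any more and can be freed in one go.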
Gleb Smirnoff [Fri, 22 Nov 2013 19:16:34 +0000 (19:16 +0000)]
To support upcoming changes, change the internal API for source node handling:
- Remove pf_remove_src_node().
- Introduce pf_unlink_src_node() and pf_unlink_src_node_locked().
These functions do not free a node; they just disconnect
it from storage.
- New function pf_free_src_nodes() works on a list of previously
disconnected nodes and frees them.
- Utilize new API in pf_purge_expired_src_nodes().
In collaboration with: Kajetan Staszkiewicz <kajetan.staszkiewicz innogames.de>
Sponsored by: InnoGames GmbH
Sponsored by: Nginx, Inc.
Neel Natu [Fri, 22 Nov 2013 18:57:22 +0000 (18:57 +0000)]
Eliminate redundant information about the host cpu in bhyve's KTR trace points.
This is always tracked by ktr(4) and can be displayed using the "-c" option
of ktrdump(8).
Dimitry Andric [Fri, 22 Nov 2013 17:54:53 +0000 (17:54 +0000)]
Revert r258455 for now, as it apparently causes miscompilation in some
situations. Until this is fully resolved, the X.org workaround in ports
still needs to take place.
Luigi Rizzo [Fri, 22 Nov 2013 04:57:50 +0000 (04:57 +0000)]
make ipfw_check_packet() and ipfw_check_frame() public,
so they can be used in the userspace version of ipfw/dummynet
(normally using netmap for the I/O path).
This is the first of a few commits to ease compiling the
ipfw kernel code in userspace.
Devin Teske [Fri, 22 Nov 2013 00:32:32 +0000 (00:32 +0000)]
Improve network device scanning in the netdev module. First, make it use the
`device.subr' framework (improving performance and reducing sub-shells). Next
improve the `device.subr' framework itself. Make use of the `flags' device
struct member for network interfaces to indicate if an interface is Active,
Wired Ethernet, or 802.11 Wireless. Functions have been added to make checks
against the `flags' bit-field quick and efficient. Last, add a function for
rescanning the network to update the device registers. Remove an unnecessary
local (ifn) while we're here (use already provided local `if').
Brooks Davis [Fri, 22 Nov 2013 00:06:11 +0000 (00:06 +0000)]
Fix mergemaster -U by forcing FreeBSD 9 compatibility in mtree when mtree is
nmtree.
The mtree output used by mergemaster in this case was clearly not meant for
computer consumption and an approach based on -f <file1> -f <file2> would
probably be a better idea, but this is a minimal change.
However, the GOT references must all be resolved at dlopen() time, and so this
approach cannot be used with lazy dynamic linking (e.g. using RTLD_LAZY), which
usually populates the PLT with stubs that perform the actual resolving.
This patch changes X86TargetLowering::LowerCall() to skip tail call
optimization, if the called function is a global or external symbol.
This fixes problems with loading X.org driver modules, which could occur
when X.org was compiled on i386 with tailcall optimization on, for which
ports r312583 was committed as a workaround. After this change, the
workaround can be removed.
Marcel Moolenaar [Thu, 21 Nov 2013 22:02:59 +0000 (22:02 +0000)]
Have the GPT probe return a lower priority when the MBR is not a PMBR
The purpose of the PMBR is to have the disk appear in use to GPT
unaware utilities (like fdisk). However, if the PMBR has been changed
by a GPT unaware utility then we must assume that this was deliberate
(as it involved removal of the special slice) and we should not treat
the unmodified GPT-specific sectors as being valid. By lowering the
probe priority in that case, the MBR scheme will take precedence and
the kernel will end up using the MBR and not the GPT. We will still
use the GPT if the kernel does not support the MBR scheme.
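A minimal sketch of the decision described above, assuming the standard MBR layout (four 16-byte partition entries starting at byte 446, with the type byte at offset 4 of each entry, and 0xEE as the protective type). The priority constants and function names are hypothetical and are not the values or interfaces GEOM uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MBR_SIZE        512
#define MBR_PART_OFFSET 446   /* start of the four partition entries */
#define MBR_PART_SIZE   16
#define MBR_PART_COUNT  4
#define MBR_TYPE_OFFSET 4     /* partition type byte within an entry */
#define MBR_TYPE_PMBR   0xEE  /* GPT protective partition type */

/* Hypothetical priorities; the larger value wins the probe. */
#define PRI_NORMAL 0
#define PRI_LOW    (-10)

/* True when any MBR entry still carries the protective type. */
bool
mbr_is_pmbr(const unsigned char mbr[MBR_SIZE])
{
	for (size_t i = 0; i < MBR_PART_COUNT; i++) {
		const unsigned char *ent =
		    mbr + MBR_PART_OFFSET + i * MBR_PART_SIZE;

		if (ent[MBR_TYPE_OFFSET] == MBR_TYPE_PMBR)
			return (true);
	}
	return (false);
}

/* Lower the GPT probe priority when the PMBR has been replaced. */
int
gpt_probe_priority(const unsigned char mbr[MBR_SIZE])
{
	return (mbr_is_pmbr(mbr) ? PRI_NORMAL : PRI_LOW);
}
```

With a lowered GPT priority, an MBR scheme that probes at its normal priority takes precedence; if no MBR scheme is compiled in, the GPT probe is the only candidate and still wins.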
Devin Teske [Thu, 21 Nov 2013 19:43:45 +0000 (19:43 +0000)]
f_die() (see `bsdconfig includes -dF die') uses a dialog box (and has been
documented as such; I just forgot). These utilities are command-line only
and as such should stick to either using f_die without arguments or printf.
Brooks Davis [Thu, 21 Nov 2013 19:29:41 +0000 (19:29 +0000)]
Sync with NetBSD. The functional change is to make the output when
comparing a directory to an mtree file more compatible with fmtree when
FreeBSD 9 compatibility mode is on. This output is clearly intended for
humans not computers, but some tools such as mergemaster's -U option rely
on it.
Pedro F. Giffuni [Thu, 21 Nov 2013 16:38:57 +0000 (16:38 +0000)]
gcc: another round of merges from the gcc pre-43 branch.
Bring the following revisions from the gcc43 branch[1]:
118360, 118361, 118363, 118576, 119820,
123906, 125246, and 125721.
They all have in common that they were merged long ago
into Apple's gcc and should help improve the general
quality of the compiler and make it easier to bring
new features from Apple's gcc42.
For details please review the additions to the files:
gcc/ChangeLog.gcc43
gcc/cp/ChangeLog.gcc43 (new, adds previous revisions)
Nathan Whitehorn [Thu, 21 Nov 2013 15:41:52 +0000 (15:41 +0000)]
For PCI<->PCI bridges, #address-cells may be 3. Allow this when parsing the
ibm,dma-window properties. This is especially a concern when
#ibm,dma-address-cells is not specified and we have to use the regular
#address-cells property.
Ed Maste [Thu, 21 Nov 2013 14:12:36 +0000 (14:12 +0000)]
libexecinfo: Include terminating null in byte count
Otherwise, a formatted string with a strlen equal to the remaining
buffer space would have the last character omitted (because vsnprintf
always null-terminates), and later the assert in backtrace_symbols_fmt
would fail.
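The off-by-one described above follows from the way vsnprintf() reports its result: the return value is the length of the formatted string without the terminating null, so the output needs that length plus one byte. A small sketch, where format_fits() is a hypothetical helper and not part of libexecinfo:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format into buf (capacity cap bytes) and report
 * whether the whole string, including the terminating null, fit.
 */
int
format_fits(char *buf, size_t cap, const char *fmt, ...)
{
	va_list ap;
	int len;

	va_start(ap, fmt);
	len = vsnprintf(buf, cap, fmt, ap);
	va_end(ap);

	/* len excludes the null, so the output needs len + 1 bytes. */
	return (len >= 0 && (size_t)len + 1 <= cap);
}
```

With a 4-byte buffer, "abc" fits, while "abcd" has a strlen equal to the buffer space and is stored as "abc": the last character is dropped to make room for the null.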
Olivier Houchard [Wed, 20 Nov 2013 23:06:54 +0000 (23:06 +0000)]
In pmap_unmapdev(), remember the size, and use that as an argument to
kva_free(), or we'd end up always passing it a size of 0, and for some
strange reason it doesn't seem to like it.
John-Mark Gurney [Wed, 20 Nov 2013 20:25:27 +0000 (20:25 +0000)]
flag that the aesni driver is sync... This means we don't waste a
context switch just to call the done callback... On my machine, this
improves geli/gzero decrypt performance by ~27% from 550MB/sec to
~700MB/sec...
Redo r258088 to avoid relying on signed arithmetic overflow, since
the compiler treats this as undefined behaviour. Instead, ensure
that the sum of uio_offset and uio_resid is below OFF_MAX using the
operation which cannot overflow.
Reported and tested by: pho
Discussed with: bde
Approved by: des (pseudofs maintainer)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
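The usual way to bound a sum without computing it is to move one operand to the other side of the comparison. A sketch, assuming both values are non-negative; SKETCH_OFF_MAX and range_overflows() are illustrative names, and the exact comparison used in the commit may differ:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SKETCH_OFF_MAX LLONG_MAX  /* stand-in for the kernel's OFF_MAX */

/*
 * True when offset + resid would exceed SKETCH_OFF_MAX.
 * Both arguments must be non-negative.
 */
bool
range_overflows(long long offset, long long resid)
{
	assert(offset >= 0 && resid >= 0);
	/* The subtraction cannot overflow because resid is non-negative. */
	return (offset > SKETCH_OFF_MAX - resid);
}
```

Writing the test as offset + resid < 0 instead would rely on signed wraparound, which the compiler is free to assume never happens and may therefore optimize away.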
Ian Lepore [Wed, 20 Nov 2013 15:53:50 +0000 (15:53 +0000)]
Call cpu_setup() immediately after the page tables are installed. This
enables data cache and other chip-specific features. It was previously
done via an early SYSINIT, but it was being done after pmap and vm setup,
and those setups need to use mutexes. On some modern ARM platforms,
the ldrex/strex instructions that implement mutexes require the data cache
to be enabled.
A nice side effect of enabling caching earlier is that it eliminates the
multi-second pause that used to happen early in boot while physical memory
and pmap and vm were being set up. On boards with 1 GB or more of ram
this pause was very noticeable, sometimes 5-6 seconds.
Split raw reading/programming into smaller chunks to avoid allocating too
large a chunk of kernel memory. Validate size of data. Add error handling to
avoid calling copyout() when data has not been read correctly.
Andriy Gapon [Wed, 20 Nov 2013 10:41:10 +0000 (10:41 +0000)]
4101 metaslab_debug should allow for fine-grained control
4102 space_maps should store more information about themselves
4103 space map object blocksize should be increased
4104 ::spa_space no longer works
4105 removing a mirrored log device results in a leaked object
4106 asynchronously load metaslab
The VM map code performs clipping when a map entry covers a region which is
larger than the operational region. If the op region size is zero,
clipping would create a zero-sized map entry. The result is that the vm
map splay starts behaving inconsistently, sometimes returning the
zero-sized entry, sometimes the next (or previous) entry.
One step further, it could result in e.g. vm_map_wire() setting
MAP_ENTRY_IN_TRANSITION on the zero-sized entry, but failing to clear
it in the done part. The vm_map_delete() then hangs forever waiting
for the flag removal.
Check for zero-length requests and act as if they always succeed,
without performing any action on the address space.
Revert to using int for the page counts. In vn_io_fault(), the i/o
is chunked to pieces limited by integer io_hold_cnt tunable, while
vm_fault_quick_hold_pages() takes integer max_count as the upper bound.
Rearrange the checks to correctly handle overflowing address arithmetic.
Justin Hibbits [Wed, 20 Nov 2013 01:42:29 +0000 (01:42 +0000)]
Use 'int' to store the return value of getopt(), rather than char.
On some architectures (powerpc), char is unsigned by default, which means
comparisons against -1 always fail, so the programs get stuck in an
infinite loop.
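The failure mode can be reproduced without getopt() itself. In this sketch the function names are hypothetical; the first function models a platform where plain char is unsigned, the second models the fixed code:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Store a getopt()-style result in an unsigned char, as happens when
 * plain char is unsigned, then test it against the -1 sentinel.
 */
bool
sees_end_as_unsigned_char(int getopt_result)
{
	unsigned char ch = (unsigned char)getopt_result;
	int promoted = ch;            /* always in the range 0..255 */

	return (promoted == -1);      /* never true */
}

/* Store the result in an int, which preserves the sentinel. */
bool
sees_end_as_int(int getopt_result)
{
	int ch = getopt_result;

	return (ch == -1);
}
```

Passing -1 to the first function yields 255 after the conversion, so a loop of the form while ((ch = getopt(...)) != -1) never terminates; with int the sentinel is detected.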
Zbigniew Bodek [Tue, 19 Nov 2013 23:37:50 +0000 (23:37 +0000)]
Apply access flags for managed and unmanaged pages properly on ARMv6/v7
When entering a mapping via pmap_enter() unmanaged pages ought to be
naturally excluded from the "modified" and "referenced" emulation.
RW permission should be granted implicitly when requested,
otherwise an unmanaged page will not recover from the permission fault
since there will be no PV entry to indicate that the page can be written.
In addition, only managed pages that participate in "modified"
emulation need to be marked as "dirty" and "writeable" when entered
with RW permissions. Likewise with "referenced" flag for managed pages.
Unmanaged ones however should not be marked as such.
Zbigniew Bodek [Tue, 19 Nov 2013 23:31:39 +0000 (23:31 +0000)]
Avoid clearing EXEC permission bit when setting the page RW on ARMv6/v7
When emulating the modified bit, the executable attribute was cleared by
mistake when calling pmap_set_prot(). This was not a problem before the
changes to ref/mod emulation since all the pages were created RW based
on the "prot" argument in pmap_enter(). Now however not all pages are RW
and the RW permission can be cleared in the process.
Add a "resize" verb to gmirror(8) and the matching functionality to
geom_mirror(4). Now it is easy to expand the size of the mirror when all its
components are replaced. Also add a g_resize method to the geom_mirror class.
It writes updated metadata to the new last sector when the parent provider is
resized.
Ian Lepore [Tue, 19 Nov 2013 22:14:35 +0000 (22:14 +0000)]
Bugfixes... the host capabilities from FDT data are stored in host.caps, not
host.host_ocr, examine the correct field when setting up the hardware. Also,
the offset for the capabilities register should be 0x140, not 0x240.
Submitted by: Ilya Bakulin <ilya@bakulin.de>
Pointy hat to: me
Andriy Gapon [Tue, 19 Nov 2013 18:43:47 +0000 (18:43 +0000)]
zfs page_busy: fix the boundaries of the cleared range
This is a fix for a regression introduced in r246293.
vm_page_clear_dirty expects the range to have DEV_BSIZE aligned boundaries,
otherwise it extends them. Thus it can happen that the whole page is
marked clean while actually having some small dirty region(s).
This commit makes the range properly aligned and ensures that only
the clean data is marked as such.
It would be interesting to evaluate how much benefit clearing with DEV_BSIZE
granularity produces. Perhaps instead we should clear the whole page
when it is completely overwritten and not bother clearing any bits
if only a portion of a page is written.
Reported by: George Hartzell <hartzell@alerce.com>,
Richard Todd <rmtodd@servalan.servalan.com>
Tested by: George Hartzell <hartzell@alerce.com>
Reviewed by: kib
MFC after: 5 days
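One way to keep the cleared range from growing past the written bytes is to round its start up and its end down to block boundaries before clearing. The sketch below illustrates that idea only; SKETCH_BSIZE and aligned_clean_range() are hypothetical names, and the actual ZFS change may compute the range differently.

```c
#include <assert.h>

#define SKETCH_BSIZE 512  /* stand-in for DEV_BSIZE */

/*
 * Shrink the byte range [off, off + len) within a page to the largest
 * SKETCH_BSIZE-aligned range it fully covers. The result is written to
 * *start and *end; the range is empty when *start >= *end.
 */
void
aligned_clean_range(int off, int len, int *start, int *end)
{
	assert(off >= 0 && len >= 0);
	/* Round the start up to the next block boundary. */
	*start = (off + SKETCH_BSIZE - 1) / SKETCH_BSIZE * SKETCH_BSIZE;
	/* Round the end down to the previous block boundary. */
	*end = (off + len) / SKETCH_BSIZE * SKETCH_BSIZE;
}
```

A write of 1000 bytes at offset 100 fully covers only the block [512, 1024), so only that block may be marked clean. A write that covers no complete block produces an empty range and clears nothing.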
Andriy Gapon [Tue, 19 Nov 2013 18:35:38 +0000 (18:35 +0000)]
fsx: add an option to randomly call msync(MS_INVALIDATE)
This call should be a sufficiently close approximation of what happens
when a filesystem is unmounted and remounted. To be more specific, it
should test that the data that was in the page cache is the same data
that ends up on stable storage or in a filesystem's internal cache,
if any.
This will catch the cases where a page with modified data is marked as
a clean page for whatever reason.
While there, make logging of the special events (open+close before
plus invalidation now) more generic and slightly better than the previous
hack.
Andriy Gapon [Tue, 19 Nov 2013 18:35:01 +0000 (18:35 +0000)]
fsx: new option to disable msync(MS_SYNC) after each write via mmaped region
This option should be useful for testing whether a filesystem uses the
unified buffer / page cache, or whether a filesystem's emulation of the
unified cache works as expected.
This should be the case for e.g. ZFS.
Dimitry Andric [Tue, 19 Nov 2013 17:53:19 +0000 (17:53 +0000)]
Pull in r191896 from upstream llvm trunk:
CaptureTracking: Plug a loophole in the "too many uses" heuristic.
The heuristic was added to avoid spending too much compile time, but a
specially crafted test case (PR17461, PR16474) with many uses on a
select or bitcast instruction can still trigger the slow case. Add a
check for that case.
This only affects compile time, don't have a good way to test it.
This fixes the excessive compile time spent on a specific file of the
graphics/rawtherapee port.
Bryan Drewery [Tue, 19 Nov 2013 15:35:26 +0000 (15:35 +0000)]
Support SNI in libfetch
SNI is Server Name Indication, a TLS extension that
indicates the host that is being connected to at the start of the
handshake. It allows virtual hosts to be used over HTTPS.