Split the SLB mirror cache into two kinds of object, one for kernel maps
which are similar to the previous ones, and one for user maps, which
are arrays of pointers into the SLB tree. This changes makes user SLB
updates atomic, closing a window for memory corruption. While here,
rearrange the allocation functions to make context switches faster.
Replace the SLB backing store splay tree used on 64-bit PowerPC AIM
hardware with a lockless sparse tree design. This marginally improves
the performance of PMAP and allows copyin()/copyout() to run without
acquiring locks when used on wired mappings.
marius [Wed, 15 Sep 2010 21:44:31 +0000 (21:44 +0000)]
Add a VIS-based block copy function for SPARC64 V and later, which
additionally takes advantage of the prefetch cache of these CPUs.
Unlike the uncommitted US-III version, which provide no measurable
speedup or even resulted in a slight slowdown on certain CPUs models
compared to using the US-I version with these, the SPARC64 version
actually results in a slight improvement.
Don't suggest using bwn for the bcm4306 cards in the list. The
bcm4306 cards are ambiguous. BCM4306 rev 2 requires bwi. BCM4306 rev
3 will work with either. Since we can't easily determine which is
which, just remove them.
marius [Wed, 15 Sep 2010 17:11:15 +0000 (17:11 +0000)]
Sync with other platforms:
- make dflt_lock() always panic,
- add kludge to use contigmalloc() when the alignment is larger than the size
and print a diagnostic when we didn't satisfy the alignment.
marius [Wed, 15 Sep 2010 15:18:41 +0000 (15:18 +0000)]
- Update the comment in swi_vm() regarding busdma bounce buffers; it's
unlikely that support for these ever will be implemented on sparc64 as
the IOMMUs are able to translate to up to the maximum physical address
supported by the respective machine, bypassing the IOMMU is affected
by hardware errata and being able to support DMA engines which cannot
do at least 32-bit DMA does not justify the costs.
- The page zeroing in uma_small_alloc() may use the VIS-based block zero
function so take advantage of it.
Fix bogus busying mechanism from cdevsw callbacks:
- D_TRACKCLOSE may be used there as d_close() are expected to match up
d_open() calls
- Replace the hand-crafted counter and flag with the
device_busy()/device_unbusy() proper usage.
Sponsored by: Sandvine Incorporated
Reported by: Mark Johnston <mjohnston at sandvine dot com>
Tested by: Mark Johnston
Reviewed by: emaste
devfs_delete() now recursively removes empty parent directories unless
the DEVFS_DEL_NORECURSE flag is specified. devfs_delete() can't be
called anymore with a parent directory vnode lock held because the
possible parent directory deletion needs to lock the vnode. Thus we
unlock the parent directory vnode in devfs_remove() before calling
devfs_delete().
Call devfs_populate_vp() from devfs_symlink() and devfs_vptocnp() as now
directories can get removed.
Add a check for DE_DOOMED flag to devfs_populate_vp() because
devfs_delete() drops dm_lock before the VI_DOOMED vnode flag gets set.
This ensures that devfs_populate_vp() returns an error for directories
which are in progress of deletion.
Reviewed by: kib
Discussed on: freebsd-current (mostly silence)
zfs vn_has_cached_data: take into account v_object->cache != NULL
This mirrors code in tmpfs.
This changge shouldn't affect much read path, it may cause unnecessary
vm_page_lookup calls in the case where v_object has no active or inactive
pages but has some cache pages. I believe this situation to be non-essential.
In write path this change should allow us to properly detect the above
case and free a cache page when we write to a range that corresponds to it.
If this situation is undetected then we could have a discrepancy between
data in page cache and in ARC or on disk.
This change allows us to re-enable vn_has_cached_data() check in zfs_write.
NOTE: strictly speaking resident_page_count and cache fields of v_object
should be exmined under VM_OBJECT_LOCK, but for this particular usage
we may get away with it.
andre [Wed, 15 Sep 2010 10:39:30 +0000 (10:39 +0000)]
Change the default MSS for IPv4 and IPv6 TCP connections from an
artificial power-of-2 rounded number to their real values specified
in RFC879 and RFC2460.
From the history and existing comments it appears that the rounded
numbers were intended to be advantageous for the kernel and mbuf
system. However this hasn't been the case at for at least a long
time. The mbuf clusters used in tcp_output() have enough space
to hold the larger real value for the default MSS for both IPv4 and
IPv6. Note that the default MSS is only used when path MTU discovery
is disabled.
Update and expand related comments.
Reviewed by: lsteward (including some word-smithing)
MFC after: 2 weeks
tmpfs, zfs + sendfile: mark page bits as valid after populating it with data
Otherwise, adding insult to injury, in addition to double-caching of data
we would always copy the data into a vnode's vm object page from backend.
This is specific to sendfile case only (VOP_READ with UIO_NOCOPY).
sys/pcpu.h: remove a workaround for a fixed ld bug
The workaround was incorrectly documented as having something to do with
set_pcpu section's progbits, but in fact it was for incorrect placement
of __start_set_pcpu because of the bug in ld.
The bug was fixed in r210245, see commit message for details.
A side-effect of the workaround was that a zero-size set_pcpu section was
produced for modules, source code of which included pcpu.h but didn't
actually define any dynamic per-cpu variables.
This commit should remove the side-effect.
The same workaround is present sys/net/vnet.h, has an analogous side-effect
and can be removed as well.
An UPDATING entry that warns about a need for recent ld is following.
add code to support stack unwinding when thread exits. note that only
defer-mode cancellation works, asynchrnous mode does not work because
it lacks of libuwind's support. stack unwinding is not enabled unless
LIBTHR_UNWIND_STACK is defined in Makefile.
Introduce inheritance into the PowerPC MMU kobj interface.
include/mmuvar.h - Change the MMU_DEF macro to also create the class
definition as well as define the DATA_SET. Add a macro, MMU_DEF_INHERIT,
which has an extra parameter specifying the MMU class to inherit methods
from. Update the comments at the start of the header file to describe the
new macros.
marius [Tue, 14 Sep 2010 20:41:06 +0000 (20:41 +0000)]
Use saner nsegments and maxsegsz parameters when creating certain DMA tags;
tags for 1-byte allocations cannot possibly be split across 2 segments and
maxsegsz must not exceed maxsize.
marius [Tue, 14 Sep 2010 20:31:09 +0000 (20:31 +0000)]
Remove a KASSERT which will also trigger for perfectly valid combinations
of small maxsize and "large" (including BUS_SPACE_UNRESTRICTED) nsegments
parameters. Generally using a presz of 0 (which indeed might indicate the
use of bogus parameters for DMA tag creation) is not fatal, it just means
that no additional DVMA space will be preallocated.
All gpart(8) subcommands apart from the 'bootcode' subcommand handle
given geom/provider names with and without /dev/ prefix. Teach the
'bootcode' subcommand to handle /dev/<foo> names as well.
Introduce special G_VAL_OPTIONAL define, which when given in value field
tells geom(8) to ignore it when it is not given and don't try to obtain
default value.
Make kern_tc.c provide minimum frequency of tc_ticktock() calls, required
to handle current timecounter wraps. Make kern_clocksource.c to honor that
requirement, scheduling sleeps on first CPU for no more then specified
period. Allow other CPUs to sleep up to 1/4 second (for any case).
Resurrect PSIM support by moving the cacheline size-detection warning
printf outside of the MMU-disabled region. A call into OpenFirmware
with the MMU off resulted in an internal PSIM assert.
Avoid repeatedly spamming the console while a timed out command is waiting
to complete. Instead, print one message after the timeout period expires,
and one more when (if) the command eventually completes.
Do not explicitly enable interrupts in smp_init_secondary() because it
renders any spinlock protected code after that point to run with
interrupts enabled. This is because the processor is executing in the
context of idlethread whose 'md_spinlock_count' is already set to 1.
Instead just let sched_throw() re-enable interrupts when it releases
the spinlock.
The original powerpc commit log for r212559 is available here:
http://svn.freebsd.org/viewvc/base?view=revision&revision=212559
Fix a missing set of parantheses that could cause recent versions of libthr
to crash deferencing a NULL pointer to the user context on powerpc64
systems with COMPAT_FREEBSD32 defined.
Fix segment:offset calculation of interrupt vector for relocated video BIOS
when the original offset is bigger than size of one page. X86BIOS macros
cannot be used here because it is assumed address is only linear in a page.
Split $ipv6_prefer into $ip6addrctl_policy and $ipv6_activate_all_interfaces.
The $ip6addrctl_policy is a variable to choose a pre-defined address
selection policy set by ip6addrctl(8).
The keyword "ipv4_prefer" sets IPv4-preferred one described in Section 10.3,
the keyword "ipv6_prefer" sets IPv6-preferred one in Section 2.1 in RFC 3484,
respectively. When "AUTO" is specified, it attempts to read
/etc/ip6addrctl.conf first. If it is found, it reads and installs it as
a policy table. If not, either of the two pre-defined policy tables is
chosen automatically according to $ipv6_activate_all_interfaces.
When $ipv6_activate_all_interfaces=NO, interfaces which have no corresponding
$ifconfig_IF_ipv6 is marked as IFDISABLED for security reason.
The default values are ip6addrctl_policy=AUTO and
ipv6_activate_all_interfaces=NO.
Revert r212370, as it causes a LOR on powerpc. powerpc does a few
unexpected things in copyout(9) and so wiring the user buffer is not
sufficient to perform a copyout(9) while holding a random mutex.
Allow a kernel config to specify a set but empty value via
'makeoptions OPTION=' for consistency with the make commandline.
Previously 'makeoptions WERROR=' would result in a syntax error; now
it produces the same effect as 'makeoptions WERROR'. Both forms now
result in 'WERROR=' in the generated Makefile.
Bump __FreeBSD_version to reflect the userland DTrace changes.
Sponsored by: The FreeBSD Foundation
> Description of fields to fill in above: 76 columns --|
> PR: If a GNATS PR is affected by the change.
> Submitted by: If someone else sent in the change.
> Reviewed by: If someone else reviewed your modification.
> Approved by: If you needed approval for this commit.
> Obtained from: If the change is from a third party.
> MFC after: N [day[s]|week[s]|month[s]]. Request a reminder email.
> Security: Vulnerability reference (one per line) or description.
> Empty fields above will be automatically removed.
Sponsored by: The FreeBSD Foundation
> Description of fields to fill in above: 76 columns --|
> PR: If a GNATS PR is affected by the change.
> Submitted by: If someone else sent in the change.
> Reviewed by: If someone else reviewed your modification.
> Approved by: If you needed approval for this commit.
> Obtained from: If the change is from a third party.
> MFC after: N [day[s]|week[s]|month[s]]. Request a reminder email.
> Security: Vulnerability reference (one per line) or description.
> Empty fields above will be automatically removed.
Fix a subtle bug uncovered by the recent one-shot timer import in which
any spin locks acquired between the enabling of interrupts in
machdep_ap_bootstrap() and the invocation of the scheduler would fail to
have interrupts disabled due to the fake spinlock already held by the
idle thread. sched_throw(NULL) will enable interrupts by itself when
exiting this spinlock, so just let it do that and don't enable interrupts
here.
Add G_TYPE_MULTI flag, which when set for the given option, will
allow the option to be specified multiple times. This will help to
implement things like passing multiple keyfiles to geli(8) instead of
cat(1)ing them all into stdin and reading from there using one '-k -'
option.
- Remove gc_argname field. It was introduced for gpart(8), but if I
understand everything correctly, we don't really need it.
- Provide default numeric value as strings. This allows to simplify
a lot of code.
- Bump version number.
- Remove sync from msgrng_send, sync needs to be called just once before
sending.
- Fix retry logic - don't reload registers when retrying in message_send,
also fix check for send pending fail.
- remove unused message_send_block_fast()
- merge message_receive_fast() to message_receive
- style(9) fixes, and comments
- rge and nlge updated for the sys/mips/rmi/msgring.h changes