Rick Macklem [Sat, 11 Jun 2011 21:14:22 +0000 (21:14 +0000)]
Make three one line changes to the rc scripts so that
they work with the new NFS client being the default,
since the new NFS client's module name is nfscl and
not nfsclient.
Assert that page is VPO_BUSY or page owner object is locked in
vm_page_undirty(). The assert is not precise due to VPO_BUSY owner
to tracked, so assertion does not catch the case when VPO_BUSY is
owned by other thread.
Joel Dahl [Sat, 11 Jun 2011 09:08:46 +0000 (09:08 +0000)]
Enable sound support by default on i386 and amd64.
The generic sound driver has been added, along with enough
device-specific drivers to support the most common audio
chipsets.
We've discussed enabling it from time to time over the years
and we've received numerous requests from users, so we decided
that shipping 9.0 with working audio by default would be the
best thing to do.
Bug reports should be sent to the multimedia@ mailing list, as
usual.
Justin T. Gibbs [Sat, 11 Jun 2011 04:59:01 +0000 (04:59 +0000)]
Monitor and emit events for XenStore changes to XenBus trees
of the devices we manage. These changes can be due to writes
we make ourselves or due to changes made by the control domain.
The goal of these changes is to insure that all state transitions
can be detected regardless of their source and to allow common
device policies (e.g. "onlined" backend devices) to be centralized
in the XenBus bus code.
sys/xen/xenbus/xenbusvar.h:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbus_if.m:
Add a new method for XenBus drivers "localend_changed".
This method is invoked whenever a write is detected to
a device's XenBus tree. The default implementation of
this method is a no-op.
sys/xen/xenbus/xenbus_if.m:
sys/dev/xen/netfront/netfront.c:
sys/dev/xen/blkfront/blkfront.c:
sys/dev/xen/blkback/blkback.c:
Change the signature of the "otherend_changed" method.
This notification cannot fail, so it should return void.
sys/xen/xenbus/xenbusb_back.c:
Add "online" device handling to the XenBus Back Bus
support code. An online backend device remains active
after a front-end detaches as a reconnect is expected
to occur in the near future.
sys/xen/interface/io/xenbus.h:
Add comment block further explaining the meaning and
driver responsibilities associated with the XenBus
Closed state.
sys/xen/xenbus/xenbusb.c:
sys/xen/xenbus/xenbusb.h:
sys/xen/xenbus/xenbusb_back.c:
sys/xen/xenbus/xenbusb_front.c:
sys/xen/xenbus/xenbusb_if.m:
o Register a XenStore watch against the local XenBus tree
for all devices.
o Cache the string length of the path to our local tree.
o Allow the xenbus front and back drivers to hook/filter both
local and otherend watch processing.
o Update the device ivar version of "state" when we detect
a XenStore update of that node.
sys/dev/xen/control/control.c:
sys/xen/xenbus/xenbus.c:
sys/xen/xenbus/xenbusb.c:
sys/xen/xenbus/xenbusb.h:
sys/xen/xenbus/xenbusvar.h:
sys/xen/xenstore/xenstorevar.h:
Allow clients of the XenStore watch mechanism to attach
a single uintptr_t worth of client data to the watch.
This removes the need to carefully place client watch
data within enclosing objects so that a cast or offsetof
calculation can be used to convert from watch to enclosing
object.
Jeff Roberson [Fri, 10 Jun 2011 22:48:35 +0000 (22:48 +0000)]
Implement fully asynchronous partial truncation with softupdates journaling
to resolve errors which can cause corruption on recovery with the old
synchronous mechanism.
- Append partial truncation freework structures to indirdeps while
truncation is proceeding. These prevent new block pointers from
becoming valid until truncation completes and serialize truncations.
- On completion of a partial truncate journal work waits for zeroed
pointers to hit indirects.
- softdep_journal_freeblocks() handles last frag allocation and last
block zeroing.
- vtruncbuf/ffs_page_remove moved into softdep_*_freeblocks() so it
is only implemented in one place.
- Block allocation failure handling moved up one level so it does not
proceed with buf locks held. This permits us to do more extensive
reclaims when filesystem space is exhausted.
- softdep_sync_metadata() is broken into two parts, the first executes
once at the start of ffs_syncvnode() and flushes truncations and
inode dependencies. The second is called on each locked buf. This
eliminates excessive looping and rollbacks.
- Improve the mechanism in process_worklist_item() that handles
acquiring vnode locks for handle_workitem_remove() so that it works
more generally and does not loop excessively over the same worklist
items on each call.
- Don't corrupt directories by zeroing the tail in fsck. This is only
done for regular files.
- Push a fsync complete record for files that need it so the checker
knows a truncation in the journal is no longer valid.
Jeff Roberson [Fri, 10 Jun 2011 22:18:25 +0000 (22:18 +0000)]
- If the fsync in ufs_direnter fails SUJ can later panic because we have
partially added a name. Allow ufs_direnter() to continue in the
hopes that it is a transient error. If it is not, the directory
is corrupted already from IO errors and writing this new block
is not likely to make things worse.
Attilio Rao [Fri, 10 Jun 2011 20:23:56 +0000 (20:23 +0000)]
- Fix races on detach handling of AAC_IFFLAGS_* mask
- Fix races on setting AAC_AIFFLAGS_ALLOCFIBS
- Remove some unused AAC_IFFLAGS_* bits.
Please note that the kthread still makes a difference between the
total mask and AAC_AIFFLAGS_ALLOCFIBS because more flags may be
added in the future to aifflags.
Justin T. Gibbs [Fri, 10 Jun 2011 20:10:30 +0000 (20:10 +0000)]
Remove C constructs that are incompatible with C++ from various
OpenSolaris and ZFS header files. These changes are sufficient
to allow a C++ program to use the libzfs library.
Note: The majority of these files already included 'extern "C"'
declarations, so the intention of providing C++ compatibility
already existed even if it wasn't provided.
cddl/compat/opensolaris/include/assert.h:
Wrap our compatibility assert implementation in
'extern "C"'. Since this is a compatibility header
I matched the Solaris style of doing this explicitly
rather than rely on FreeBSD's __BEGIN/END_DECLS macro.
sys/cddl/compat/opensolaris/sys/kstat.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/ddt.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h:
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h:
Rename parameters in function declarations that conflict
with C++ keywords. This was the solution preferred by
members of the Illumos community.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_ioctl.h:
In C, nested structures are visible in the global namespace,
but in C++, they take on the namespace of the structure in
which they are contained. Flatten nested structure
definitions within struct zfs_cmd so these structures are
visible in the global namespace when compiled in both
languages.
Jilles Tjoelker [Fri, 10 Jun 2011 13:47:11 +0000 (13:47 +0000)]
skel/.shrc: Improve commented CDPATH example for POSIX requirements.
POSIX says an empty entry in CDPATH shall not result in the new directory
being printed, while any non-empty entry shall result in the new directory
being printed, including ".". Therefore, the value of CDPATH should almost
always start with a colon, not dot and colon.
Our sh does not print the name for empty entries as well as "." entries.
Jilles Tjoelker [Thu, 9 Jun 2011 23:12:23 +0000 (23:12 +0000)]
sh: Do parameter expansion before printing PS4 (set -x).
The function name expandstr() and the general idea of doing this kind of
expansion by treating the text as a here document without end marker is from
dash.
All variants of parameter expansion and arithmetic expansion also work (the
latter is not required by POSIX but it does not take extra code and many
other shells also allow it).
Command substitution is prevented because I think it causes too much code to
be re-entered (for example creating an unbounded recursion of trace lines).
Unfortunately, our LINENO is somewhat crude, otherwise PS4='$LINENO+ ' would
be quite useful.
Add -Wa,-many to CFLAGS on PowerPC. This aids in building a kernel using
clang, which would otherwise complain about some 64-bit bridge mode
instructions.
Alexander Motin [Thu, 9 Jun 2011 16:30:13 +0000 (16:30 +0000)]
Intel NM10 chipset's SATA controller has same PCI ID and revision as ICH7's,
but has only 2 SATA ports instead of 4. The worst part is that SStatus and
SError registers for missing ports are not implemented and return wrong
values (0xffffffff), that caused infinite reset loop.
Just ignore that SError value while I found no better way to identify them.
Jung-uk Kim [Wed, 8 Jun 2011 23:44:59 +0000 (23:44 +0000)]
Tidy up r222866.
- Re-add accidentally removed atomic op. for sysctl(9) handler.
- Remove a period(`.') at the end of a debugging message.
- Consistently spell "low" for "TSC-low" timecounter throughout.
- Major reorganization of mbuf handling throughout the driver to
increase robustness (no more calls to panic(9)) and simplify
code.
- Allocate RX/TX data structures as a single buffer rather than
an array of 4KB pages to simplify code.
- Fixed LRO (aka TPA) code. Removed kernel module parameter and
support enabling disabling LRO through ifconfig(8) command line.
LRO is still disabled by default but should be enabled for best
performance on an endpoint device.
- Fixed statistcs code and removed kernel module parameter (stats
should just work).
- Added many software counters to help identify the cause of some
performance issues.
- Streamlined adapter internal init/stop code paths.
- Fiddled with debug code (adding some here, removing some there).
- Continued style(9) adjustments.
Jung-uk Kim [Wed, 8 Jun 2011 20:08:06 +0000 (20:08 +0000)]
Increase quality of TSC (or TSC-low) timecounter to 1000 if it is P-state
invariant. For SMP case (TSC-low), it also has to pass SMP synchronization
test and the CPU vendor/model has to be white-listed explicitly. Currently,
all Intel CPUs and single-socket AMD Family 15h processors are listed here.
Jung-uk Kim [Wed, 8 Jun 2011 19:38:31 +0000 (19:38 +0000)]
Introduce low-resolution TSC timecounter "TSC-low". It replaces the normal
TSC timecounter if TSC frequency is higher than ~4.29 MHz (or 2^32-1 Hz) or
multiple CPUs are present. The "TSC-low" frequency is always lower than a
preset maximum value and derived from TSC frequency (by being halved until
it becomes lower than the maximum). Note the maximum value for SMP case is
significantly lower than UP case because we want to reduce (rare but known)
"temporal anomalies" caused by non-serialized RDTSC instruction. Normally,
it is still higher than "ACPI-fast" timecounter frequency (which was default
timecounter hardware for long time until r222222) to be useful.
Attilio Rao [Wed, 8 Jun 2011 19:28:59 +0000 (19:28 +0000)]
In the current code, a double panic condition may lead to dumps
interleaving.
Signal dumping to happen only for the first panic which should be the
most important.
Sponsored by: Sandvine Incorporated
Submitted by: Nima Misaghian (nmisaghian AT sandvine DOT com)
MFC after: 2 weeks
Andreas Tobler [Wed, 8 Jun 2011 16:00:30 +0000 (16:00 +0000)]
- Improve error handling.
- Add retry loops in the i2c read/write functions.
- Combied the ADC channel selection and readout of the value into
one iicbus_transfer to avoid possible races.
Compile RTLD with global dot symbols on 64-bit PowerPC, as a crutch for
GDB's ability to locate r_debug_state (which is actually the only function
that need be compiled this way).
Bjoern A. Zeeb [Wed, 8 Jun 2011 10:59:36 +0000 (10:59 +0000)]
Add the missing call to ip6_ipsec_filtertunnel() to be able to control
whether decapsulated IPsec packets will be passed to pfil again depending
on the setting of the net.ip6.ipsec6.filtertunnel sysctl.
Andriy Gapon [Wed, 8 Jun 2011 08:12:15 +0000 (08:12 +0000)]
remove code for dynamic offlining/onlining of CPUs on x86
The code has definitely been broken for SCHED_ULE, which is a default
scheduler. It may have been broken for SCHED_4BSD in more subtle ways,
e.g. with manually configured CPU affinities and for interrupt devilery
purposes.
We still provide a way to disable individual CPUs or all hyperthreading
"twin" CPUs before SMP startup. See the UPDATING entry for details.
Interaction between building CPU topology and disabling CPUs still
remains fuzzy: topology is first built using all availble CPUs and then
the disabled CPUs should be "subtracted" from it. That doesn't work
well if the resulting topology becomes non-uniform.
This work is done in cooperation with Attilio Rao who in addition to
reviewing also provided parts of code.
Ruslan Ermilov [Wed, 8 Jun 2011 08:08:42 +0000 (08:08 +0000)]
Pull up all vendor changes to mdoc(7).
This also replaces the local fix in r219209 that made .Ac emit
ASCII angle quotes with an official fix. In the official fix,
ASCII quotes are output when using the .Aq, .Ao and .Ac calls,
but only when nested into the .An macro.
Marius Strobl [Tue, 7 Jun 2011 23:15:21 +0000 (23:15 +0000)]
- For the case when tl1_align(_trap) is used to call rsf_fatal via
RSF_FATAL we need to switch to alternate globals for KSTACK_CHECK just
like tl1_data_excptn(_trap) does. This is more or less cosmetic because
in case RSF_FATAL is called we're already heading south.
- Correct an END().
- Read the window state from the correct register for a CATR().
Bjoern A. Zeeb [Tue, 7 Jun 2011 19:39:34 +0000 (19:39 +0000)]
For the moment document the possible problem introduced with dynamic address
family detection in world, mostly noticed by ifconfig(8), when running with
an old kernel.
Reported by: Andrzej Tobola (ato iem.pw.edu.pl)
Reported by: gcooper
Xin LI [Tue, 7 Jun 2011 18:48:49 +0000 (18:48 +0000)]
Add a special mount option "failok" to indicate that the administrator wants
the system to proceed to boot without bailing out into single user mode,
even when the file system can not be successfully mounted.
This option is implemented in mount(8) and not passed into kernel.
Marius Strobl [Tue, 7 Jun 2011 17:33:39 +0000 (17:33 +0000)]
Adapt CATR() to r222813. This is somewhat tricky as we can't afford using
more than three temporary register in several places CATR() is used so
this code trades instructions in for registers. Actually, this still isn't
sufficient and CATR() has the side-effect of clobbering %y. Luckily, with
the current uses of CATR() this either doesn't matter or we are able to
(save and) restore it.
Now that there's only one use of AND() and TEST() left inline these.
Alexander Motin [Tue, 7 Jun 2011 17:01:52 +0000 (17:01 +0000)]
Make automatic hw.snd.default_unit choice a bit more intelligent. Instead
of just setting it to the first registered device, reevaluate it for each
device registered, trying to choose best candidate, unless one was forced.
For now use such preference order: play&rec, play, rec.
As side effect, this should workaround the situation when HDMI audio output
of the video card, usually not connected to anything, becomes default, that
requires manual user intervention to make sound working. If at some point
this won't be enough, we can try to fetch some additional priority flags
from the device driver.
Do not use LCM from stripesize and user specified alignment value.
When user wants have specific alignment - do what user wants.
Use stripesize as alignment value in case, when some of gpart's
arguments are ommitted for automatic calculation.
Adrian Chadd [Tue, 7 Jun 2011 09:03:28 +0000 (09:03 +0000)]
Flesh out a new HAL method to fetch the radar PHY error frame information.
For the AR5211/AR5212, this is apparently a one byte pulse duration
counter value. It is only coded up here for the AR5212 as I don't have
any AR5211-series hardware to test it on.
This information was extracted from the Madwifi DFS branch along with
some local additions.
Please note - all this does is extract out the radar event duration,
it in no way reflects the presence of a radar. Further code is needed
to take a set of radar events and filter them to extract out correct
radar pulse trains (and ignore other events.)
Attilio Rao [Tue, 7 Jun 2011 08:46:13 +0000 (08:46 +0000)]
etire the cpumask_t type and replace it with cpuset_t usage.
This is intended to fix the bug where cpu mask objects are
capped to 32. MAXCPU, then, can now arbitrarely bumped to whatever
value. Anyway, as long as several structures in the kernel are
statically allocated and sized as MAXCPU, it is suggested to keep it
as low as possible for the time being.
Technical notes on this commit itself:
- More functions to handle with cpuset_t objects are introduced.
The most notable are cpusetobj_ffs() (which calculates a ffs(3)
for a cpuset_t object), cpusetobj_strprint() (which prepares a string
representing a cpuset_t object) and cpusetobj_strscan() (which
creates a valid cpuset_t starting from a string representation).
- pc_cpumask and pc_other_cpus are target to be removed soon.
With the moving from cpumask_t to cpuset_t they are now inefficient
and not really useful. Anyway, for the time being, please note that
access to pcpu datas is protected by sched_pin() in order to avoid
migrating the CPU while reading more than one (possible) word
- Please note that size of cpuset_t objects may differ between kernel
and userland. While this is not directly related to the patch itself,
it is good to understand that concept and possibly use the patch
as a reference on how to deal with cpuset_t objects in userland, when
accessing kernland members.
- KTR_CPUMASK is changed and now is represented through a string, to be
set as the example reported in NOTES.
Please additively note that no MAXCPU is bumped in this patch, but
private testing has been done until to MAXCPU=128 on a real 8x8x2(htt)
machine (amd64).
Please note that the FreeBSD version is not yet bumped because of
the upcoming pcpu changes. However, note that this patch is not
targeted for MFC.
People to thank for the time spent on this patch:
- sbruno, pluknet and Nicholas Esborn (nick AT desert DOT net) tested
several revision of the patches and really helped in improving
stability of this work.
- marius fixed several bugs in the sparc64 implementation and reviewed
patches related to ktr.
- jeff and jhb discussed the basic approach followed.
- kib and marcel made targeted review on some specific part of the
patch.
- marius, art, nwhitehorn and andreast reviewed MD specific part of
the patch.
- marius, andreast, gonzo, nwhitehorn and jceel tested MD specific
implementations of the patch.
- Other people have made contributions on other patches that have been
already committed and have been listed separately.
Companies that should be mentioned for having participated at several
degrees:
- Yahoo! for having offered the machines used for testing on big
count of CPUs.
- The FreeBSD Foundation for having sponsored my devsummit attendance,
which has been instrumental.
- Sandvine for having offered offices and infrastructure during
development.
(I really hope I didn't forget anyone, if it happened I apologize in
advance).
Sync ng_nat with recent (r222806) ipfw_nat changes:
Make a behaviour of the libalias based in-kernel NAT a bit closer to
how natd(8) does work. natd(8) drops packets only when libalias returns
PKT_ALIAS_IGNORED and "deny_incoming" option is set, but ipfw_nat
always did drop packets that were not aliased, even if they should
not be aliased and just are going through.
Also add SCTP support: mark response packets to skip firewall processing.
Make a behaviour of the libalias based in-kernel NAT a bit closer to
how natd(8) does work. natd(8) drops packets only when libalias returns
PKT_ALIAS_IGNORED and "deny_incoming" option is set, but ipfw_nat
always did drop packets that were not aliased, even if they should
not be aliased and just are going through.
PR: kern/122109, kern/129093, kern/157379
Submitted by: Alexander V. Chernikov (previous version)
MFC after: 1 month
Andriy Gapon [Tue, 7 Jun 2011 06:18:02 +0000 (06:18 +0000)]
amdsbwd: update to support SB8xx southbridges
Many thanks to Tino <tinotom@gmail.com> for drawing my attention to
this, for doing a lot of testing and providing great feedback.
Many thanks to AMD for continuing to release public specifications for
their chipsets.
Set pca.p_bufr to NULL when we haven't allocated a buffer.
Otherwise, p_bufr is set to garbage on the stack, and if that garbage
happens to be non-NULL, and the TOLOG or TOCONS flag is set, putbuf()
will get called and attempt to fill the non-existent buffer.
This is really only relevant for tprintf() (and only when the priority is
not -1), but set it in uprintf() and ttyprintf() for completeness.
The next step, to avoid log buffer scrambling, would be to add the
PRINTF_BUFR_SIZE code to tprintf(), but this should prevent panics.
Fix making kernel dumps from the debugger by creating a command
for it. Do not not expect a developer to call doadump(). Calling
doadump does not necessarily work when it's declared static. Nor
does it necessarily do what was intended in the context of text
dumps. The dump command always creates a core dump.
Move printing of error messages from doadump to the dump command,
now that we don't have to worry about being called from DDB.
Add ia64_sync_icache() and use it to make the I-cache coherent
after loading the kernel's text segment. The kernel will do the
same for loaded modules, so don't worry about that.
Jung-uk Kim [Mon, 6 Jun 2011 23:03:37 +0000 (23:03 +0000)]
Validate INT 15h and 16h vectors more strictly. Traditionally these entry
points are fixed addresses and (U)EFI CSM specification also mandated that.
Unfortunately, (U)EFI CSM specification does not specifically mention this
is to call service routine via interrupt vector table or to jump directly
to the entry point. As a result, some CSM seems to install two routines
and acts differently, depending on how it was executed, unfortunately.
When INT 15h is used, it calls a function pointer (which is probably a UEFI
service function). When it jumps directly to the entry point, it executes
a simple and traditional INT 15h service routine. Therefore, actually there
are two possible fixes, i. e., this fix or jumping directly to the fixed
entry point. However, we chose this fix because a) keyboard typematic
support via BIOS is becoming extremely rarer and b) we cannot support random
service routine installed by a firmware or a boot loader. This should fix
Lenovo X220 laptop, specifically.