Don't allocate new unnecessary pages when devstat_alloc() looses the
run for re-acuiring the lock, but recheck if new pages are allocatable
from the pool and free the previously allocated ones.
Tested by: pho, Giovanni Trematerra
<giovanni dot trematerra at gmail dot com>
Change the default transport protocol for use by the Mount protocol
from UDP to TCP, so that it is consistent with TCP for NFS, which
became the default at r176198. Without this change, doing an NFS mount
against a server that only supports UDP would result in an unusable
mount point if a transport protocol option wasn't specified for the
mount.
Purge namecache for the file system being rolled back, so it doesn't point at
invalid vnodes after the rollback resulting in EIO errors when trying to access
files which are in the namecache.
kan [Thu, 17 Sep 2009 13:21:53 +0000 (13:21 +0000)]
Make libc.a provide __stack_chk_fail_local weak alias. This is
needed to satisfy static libraries that are compiled with -fpic
and linked into static binary afterwards. Several libraries in
gcc are examples of such static libs.
- Implement MSI support (MSIX support was already there)
- Use a table to drive MSI/MSIX exceptions
- Pre-calculate the command address instead of wasting cycles doing the
calculation on every i/o.
Make MSI and PERFORMANT interrupts work correctly. Only require the minimum
number of MSIX interrupts that are needed, and don't strictly check for 4.
Enable enough interrupt mask bits so that the controller will generate
interrupts in PERFORMANT mode. This fixes the hang-on-boot issues that
people were seeing with newer controllers.
Increase CISS_MAX_PHYSTGT to 256 so that it matches what the controller might
give us. Without this, certain data structures get sized incorrectly, leading
to a panic on certain cards that want to use high-value target numbers.
When checking traffic endpoint's adresses families in key_spdadd(),
compare them together instead of comparing each one with respective
tunnel endpoint.
ed [Wed, 16 Sep 2009 07:01:11 +0000 (07:01 +0000)]
Extend the keyboard character size to 24 bits.
Because we use an int to store keyboard chacacters and their flags, we
can easily store the flags in the top byte instead of the second byte.
This means it's a lot easier to make Unicode work.
The only change that still needs to be made, is that keyent_t's map is
extended to u_int.
Add the ability to see TCP timers via netstat -x. This can be a useful
feature when you have a seemingly stuck socket and want to figure
out why it has not been closed yet.
No plans to MFC this, as it changes the netstat sysctl ABI.
EV_RECEIPT is useful to disambiguating error conditions when multiple
events structures are passed to kevent(2). The error code is returned
in the data field and EV_ERROR is set.
When the EV_DISPATCH flag is used the event source will be disabled
immediately after the delivery of an event. This is similar to the
EV_ONESHOT flag but it doesn't delete the event.
Add user events support to kernel events which are not associated with any
kernel mechanism but are triggered by user level code. This is useful for
adding user level events to an event handler that may also be monitoring
kernel events.
The touch event filter is called when a kernel event data is possibly
updated. There are two hook points. First, during a kevent() system
call. Second, when an event has been triggered.
andre [Tue, 15 Sep 2009 22:23:45 +0000 (22:23 +0000)]
-Put the optimized soreceive_stream() under a compile time option called
TCP_SORECEIVE_STREAM for the time being.
Requested by: brooks
Once compiled in make it easily switchable for testers by using a tuneable
net.inet.tcp.soreceive_stream
and a corresponding read-only sysctl to report the current state.
Suggested by: rwatson
MFC after: 2 days
-This line, and those below, will be ignored--
> Description of fields to fill in above: 76 columns --|
> PR: If a GNATS PR is affected by the change.
> Submitted by: If someone else sent in the change.
> Reviewed by: If someone else reviewed your modification.
> Approved by: If you needed approval for this commit.
> Obtained from: If the change is from a third party.
> MFC after: N [day[s]|week[s]|month[s]]. Request a reminder email.
> Security: Vulnerability reference (one per line) or description.
> Empty fields above will be automatically removed.
M sys/conf/options
M sys/kern/uipc_socket.c
M sys/netinet/tcp_subr.c
M sys/netinet/tcp_usrreq.c
Reverting the previous change for now. Some users reports the patch
fixes their issues but one reports a failure in NFS ROOT. Revert
the change for now pending further investigation.
Add two test cases from PR 130504.
An additional one coming from http://www.research.att.com/~gsf/testregex/
was not added; at some point the entire AT&T regression test harness
should be imported here.
But that would also mean commitment to fix the uncovered errors.
Self pointing routes are installed for configured interface addresses
and address aliases. After an interface is brought down and brought
back up again, those self pointing routes disappeared. This patch
ensures after an interface is brought back up, the loopback routes
are reinstalled properly.
Use explicit int values for the device states in order to allow,
if necessary, in the future, adds of new states without breaking ABI
between revisions.
Fix sched_switch_migrate():
- In 8.x and above the run-queue locks are nomore shared even in the
HTT case, so remove the special case.
- The deadlock explained in the removed comment here is still possible
even with different locks, with the contribution of tdq_lock_pair().
An explanation is here:
(hypotesis: a thread needs to migrate on another CPU, thread1 is doing
sched_switch_migrate() and thread2 is the one handling the sched_switch()
request or in other words, thread1 is the thread that needs to migrate
and thread2 is a thread that is going to be preempted, most likely an
idle thread. Also, 'old' is referred to the context (in terms of
run-queue and CPU) thread1 is leaving and 'new' is referred to the
context thread1 is going into. Finally, thread3 is doing tdq_idletd()
or sched_balance() and definitively doing tdq_lock_pair())
* thread1 blocks its td_lock. Now td_lock is 'blocked'
* thread1 drops its old runqueue lock
* thread1 acquires the new runqueue lock
* thread1 adds itself to the new runqueue and sends an IPI_PREEMPT
through tdq_notify() to the new CPU
* thread1 drops the new lock
* thread3, scanning the runqueues, locks the old lock
* thread2 received the IPI_PREEMPT and does thread_lock() with td_lock
pointing to the new runqueue
* thread3 wants to acquire the new runqueue lock, but it can't because
it is held by thread2 so it spins
* thread1 wants to acquire old lock, but as long as it is held by
thread3 it can't
* thread2 going further, at some point wants to switchin in thread1,
but it will wait forever because thread1->td_lock is in blocked state
This deadlock has been manifested mostly on 7.x and reported several time
on mailing lists under the voice 'spinlock held too long'.
Many thanks to des@ for having worked hard on producing suitable textdumps
and Jeff for help on the comment wording.
Reviewed by: jeff
Reported by: des, others
Tested by: des, Giovanni Trematerra
<giovanni dot trematerra at gmail dot com>
(STABLE_7 based version)
Accomodate old style XPT_IMMED_NOTIFY and XPT_NOTIFY_ACK so that
we at least don't panic.
We don't really support dual role mode (INITIATOR/TARGET) any more. We
should but it's broken and will take a fair amount of effort to fix
and correctly manage both initiator and target roles sharing the port
database. So, for now, disallow it.
The bootp code installs an interface address and the nfs client
module tries to install the same address again. This extra code
is removed, which was discovered by the removal of a call to
in_ifscrub() in r196714. This call to in_ifscrub is put back here
because the SIOCAIFADDR command can be used to change the prefix
length of an existing alias.
Previously local end of point-to-point interface is not reachable
within the system that owns the interface. Packets destined to
the local end point leak to the wire towards the default gateway
if one exists. This behavior is changed as part of the L2/L3
rewrite efforts. The local end point is now reachable within the
system. The inpcb code needs to consider this fact during the
address selection process.
- Mount ZFS snapshots with MNT_IGNORE flag, so they are not visible in regular
df(1) and mount(8) output. This is a bit smilar to OpenSolaris and follows
ZFS route of not listing snapshots by default with 'zfs list' command.
- Add UPDATING entry to note that ZFS snapshots are no longer visible in
mount(8) and df(1) output by default.
Modify mount(8) to skip MNT_IGNORE file systems by default, just like df(1)
does. This is not POLA violation, because there is no single file system in the
base that use MNT_IGNORE currently, although ZFS snapshots will be mounted with
MNT_IGNORE after next commit.
Protect cross-script invocation by checking that the target script exists.
This allows pruning of rc.d scripts without getting too many ugly boottime
error messages.
John Baldwin suggested that 'stolen memory' only happens in the case of
i810 and therefore is useful info there. Aperture size and stolen memory
are now printed on one line.
ed [Sun, 13 Sep 2009 18:45:59 +0000 (18:45 +0000)]
Make sure we never place the cursor outside the screen.
For some vague reason, it may be possible that scp->cursor_pos exceeds
scp->ysize * scp->xsize. This means that teken_set_cursor() may get
called with an invalid position. Just ignore the old cursor position in
this case.
Reported by: Paul B. Mahol <onemda gmail com>
MFC after: 1 month
Fixes two bugs:
1) A lock issue, if we ever had to try again
we would double lock the INP lock.
2) We were allowing (at wrap) associd 0... which really
we cannot allow since 0 normally means in most socket
API calls that we are wishing to effect something on
the INP not TCB.
Work-around READDIRPLUS problem with .zfs/ and .zfs/snapshot/ directories
by just returning EOPNOTSUPP. This will allow NFS server to fall back to
regular READDIR.
Note that converting inode number to snapshot's vnode is expensive operation.
Snapshots are stored in AVL tree, but based on their names, not inode numbers,
so to convert inode to snapshot vnode we have to interate over all snalshots.
This is not a problem in OpenSolaris, because in their READDIRPLUS
implementation they use VOP_LOOKUP() on d_name, instead of VFS_VGET() on
d_fileno as we do.
PR: kern/125149
Reported by: Weldon Godfrey <wgodfrey@ena.com>
Analysis by: Jaakko Heinonen <jh@saunalahti.fi>
MFC after: 3 days
marius [Sun, 13 Sep 2009 14:47:31 +0000 (14:47 +0000)]
Factor out the duplicated macro for the device type used in the
OFW device tree for PCI bridges and add a new one for PCI Express.
While at it, take advantage of the former for the rman(9) work-
around in jbusppm(4).
There is a bug where mze_insert() can trigger an assert() of inserting
the same entry twice. This bug is not fixed yet, but leads to situation
where when try to access corrupted directory the kernel will panic.
Until the bug is properly fixed, try to recover from it and log that it
happened.
Reported by: marck
OpenSolaris bug: 6709336
MFC after: 3 days
The following changes are added because of
network_ipv6->rc.d/netif integration:
- $ipv6_enable is now obsolete. Instead, IPv6 is enabled by
default if the kernel supports it, and $ipv6_network_interfaces
is "none" by default. If you want to use IPv6, define
$ipv6_network_interfaces and $ifconfig_xxx_ipv6.
An interface which is in $network_interfaces and not in
$ipv6_network_interfaces will be marked as "inet6
-auto_linklocal ifdisabled" (see ifconfig(8)).
- $ipv6_ifconfig_xxx is renamed to ifconfig_xxx_ipv6 for
consistency with other address families. The old variables
still work but can be removed in the future. Note that
ipv6_ifconfig_xxx="..." should be replaced with
ifconfig_xxx_ipv6="inet6 ...".
- Receiving ICMPv6 Router Advertisement is not automatically
enabled even if there is no manual configuration of IPv6 in
rc.conf. If you want it, define
ifconfig_xxx_ipv6="inet6 ... accept_rtadv".
- The rc.d/ip6addrctl now chooses address selection policy based
on $ipv6_prefer, not $ipv6_enable. The default is
ipv6_prefer=NO.
- $router* and $ipv6_router* are replaced with $routed_* and
$route6d_* for consistency. The old variables still work but
can be removed in the future.
Add an extension of set_rcvar(), a new function set_rcvar_obsolete(),
and $desc.
The set_rcvar_obsolete() is for displaying an obsolete variable
and the new one. More specifically, a warning is displayed when
a variable is removed or changed in the source tree and the user
still defines the old one.
$router* and $ipv6_router* are replaced with $routed_* and
$route6d_* for consistency. The old variables still work but
can be removed in the future.
- Add rc.d/stf and rc.d/faith for stf(4) and faith(4).
- Remove rc.d/auto_linklocal and rc.d/network_ipv6.
- Move rc.d/sysctl to just before FILESYSTEMS because rc.d/netif
depends on some sysctl variables.
Improve flexibility of receiving Router Advertisement and
automatic link-local address configuration:
- Convert a sysctl net.inet6.ip6.accept_rtadv to one for the
default value of a per-IF flag ND6_IFF_ACCEPT_RTADV, not a
global knob. The default value of the sysctl is 0.
- Add a new per-IF flag ND6_IFF_AUTO_LINKLOCAL and convert a
sysctl net.inet6.ip6.auto_linklocal to one for its default
value. The default value of the sysctl is 1.
- Make ND6_IFF_IFDISABLED more robust. It can be used to disable
IPv6 functionality of an interface now.
- Receiving RA is allowed if ip6_forwarding==0 *and*
ND6_IFF_ACCEPT_RTADV is set on that interface. The former
condition will be revisited later to support a "host + router" box
like IPv6 CPE router. The current behavior is compatible with
the older releases of FreeBSD.
- The ifconfig(8) now supports these ND6 flags as well as "nud",
"prefer_source", and "disabled" in ndp(8). The ndp(8) now
supports "auto_linklocal".
Discussed with: bz and jinmei
Reviewed by: bz
MFC after: 3 days
luigi [Sat, 12 Sep 2009 21:44:34 +0000 (21:44 +0000)]
Make sure callouts are not processed one tick late.
The problem was introduced in SVN 180608/ rev 1.114 and affects
all users of callout_reset() (including select, usleep, setitimer).
A better fix probably involves replicating 'ticks' in the
struct callout_cpu; this commit is just a temporary thing so that
we can MFC it after a suitable test time and RE approval.
Don't allow joins w/o source on an existing group.
This is almost always pilot error.
We don't need to check for group filter UNDEFINED state at t1,
because we only ever allocate filters with their groups, so we
unconditionally reject such calls with EINVAL.
Trying to change the active filter mode w/o going through IP_MSFILTER
is also disallowed.
Deals with the case described in PR 137164 upfront, cumulative
with the fix in svn rev 197132 which only calls imo_match_source()
if the source address family was not unspecified.
- Protect reclaim with z_teardown_inactive_lock.
- Be prepared for dbuf to disappear in zfs_reclaim_complete() and check if
z_dbuf field is NULL - this might happen in case of rollback or forced
unmount between zfs_freebsd_reclaim() and zfs_reclaim_complete().
- On forced unmount wait for all znodes to be destroyed - destruction can be
done asynchronously via zfs_reclaim_complete().
Tighten input checking in inp_join_group():
* Don't try to use the source address, when its family is unspecified.
* If we get a join without a source, on an existing inclusive
mode group, this is an error, as it would change the filter mode.
Fix a problem with the handling of in_mfilter for new memberships:
* Do not rely on imf being NULL; it is explicitly initialized to a
non-NULL pointer when constructing a membership.
* Explicitly initialize *imf to EX mode when the source address
is unspecified.
This fixes a problem with in_mfilter slot recycling in the join path.
PR: 138690
Submitted by: Stef Walter
MFC after: 5 days
Tighten up the check for race in zfs_zget() - ZTOV(zp) can not only contain
NULL, but also can point to dead vnode, take that into account.
PR: kern/132068
Reported by: Edward Fisk" <7ogcg7g02@sneakemail.com>, kris
Fix based on patch from: Jaakko Heinonen <jh@saunalahti.fi>
MFC after: 1 week