Do not embed struct ucred into larger netcred parent structures.
Credential might need to hang around longer than its parent and be used
outside of mnt_explock scope controlling netcred lifetime. Use separate
reference-counted ucred allocated separately instead.
While there, extend mnt_explock coverage in vfs_stdexpcheck and clean-up
some unused declarations in new NFS code.
Reported by: John Hickey
PR: kern/133439
Reviewed by: dfr, kib
Ed Schouten [Sat, 9 May 2009 15:09:40 +0000 (15:09 +0000)]
Add support for old TTY ioctls to kdump.
At first I allowed ioctl_compat.h to be included, but it just returned
an empty file. I had to do this, to keep kdump happy. I really want to
raise a compiler error when including this header, so now it will just
throw an error if you don't set COMPAT_43TTY.
Alan Cox [Sat, 9 May 2009 08:30:44 +0000 (08:30 +0000)]
Fix a race involving vnode_pager_input_smlfs(). Specifically, in the case
that vnode_pager_input_smlfs() zeroes the page, it should not mark the page
as valid until after the page is zeroed. Otherwise, the page could be
mapped for read access (e.g., by vm_map_pmap_enter()) before the page is
zeroed. Reviewed by: tegge
Eliminate gratuitous clearing of the page's dirty mask by
vnode_pager_input_smlfs(). Instead, assert that the page is clean.
Reviewed by: tegge
Eliminate some blank lines.
Eliminate pointless calls to pmap_clear_modify() and vm_page_undirty() from
vnode_pager_input_old(). The page is not mapped. Therefore, it cannot have
any page table entries that are modified.
Eliminate an incorrect comment from vnode_pager_generic_getpages().
John Baldwin [Sat, 9 May 2009 05:07:36 +0000 (05:07 +0000)]
Convert IPFW_DEFAULT_TO_ACCEPT into a loader tunable
'net.inet.ip.fw.default_to_accept'. The current value can also be queried
via a read-only sysctl of the same name.
Kip Macy [Sat, 9 May 2009 01:45:55 +0000 (01:45 +0000)]
- rename atomic.S and crc32.c to avoid collisions when linking zfs in to the kernel
- update Makefile
- ifdef out acl_{alloc, free}, they aren't used by zfs and conflict with existing in-kernel routines
- Fixed incorrect packet length problem caused be earlier change to
support ZERO_COPY_SOCKETS.
- Created #define for context initialization retry count.
Ed Schouten [Fri, 8 May 2009 20:06:37 +0000 (20:06 +0000)]
Burn TTY ioctl bridges in compat layers.
I really don't want any pieces of code to include ioctl_compat.h, so let
the ibcs2 and svr4 compat leave sgtty alone. If they want to support
sgtty, they should emulate it on top of termios, not sgtty.
The code has been marked with BURN_BRIDGES for a long time. ibcs2 and
svr4 are not really popular pieces of code anyway.
Marko Zec [Fri, 8 May 2009 14:11:06 +0000 (14:11 +0000)]
Introduce a new virtualization container, provisionally named vprocg, to hold
virtualized instances of hostname and domainname, as well as a new top-level
virtualization struct vimage, which holds pointers to struct vnet and struct
vprocg. Struct vprocg is likely to become replaced in the near future with
a new jail management API import.
As a consequence of this change, change struct ucred to point to a struct
vimage, instead of directly pointing to a vnet.
Change the internal buffer used to store input lines from a static buffer
to a dynamically allocated one in order to support input lines of
arbitrary length.
Tim Kientzle [Thu, 7 May 2009 23:01:03 +0000 (23:01 +0000)]
Partially revert r191171, which went too far in trying
to eliminate some duplicated code. In particular,
archive_read_open_filename() has different close
handling than archive_read_open_fd(), so delegating
the former to the latter in the degenerate case
(a NULL filename is treated as stdin) broke reading
from pipelines. In particular, this fixes occasional
port failures that were seen when using "gunzip | tar"
pipelines under /bin/csh.
Thanks to Alexey Shuvaev for reporting this failure and
patiently helping me to track down the cause.
Kip Macy [Thu, 7 May 2009 20:28:06 +0000 (20:28 +0000)]
Asynchronously release vnodes to avoid blocking on range locks when calling back in to zfs.
This is based on a fix that went in to opensolaris on March 9th. However, it uses a dedicated
thread instead of a Solaris' taskq to avoid doing a blocking memory allocation with the vnode
interlock held.
This fixes a long-time deadlock in ZFS. This is not, strictly speaking, an LOR. The spa_zio
thread releases a vnode, this calls in to vn_reclaim which in turn needs to acquire range locks
to sync dirty data out to disk. The range locks are already held by a user-level process waiting
on a condition variable that it the process is waiting on a spa_zio thread to signal it on. The
process could not be signalled because the spa_zio thread could not proceed.
The nature of this problem was not apparent due to ZFS locks opting out of witness which meant
that DDB did not know about the locks that were held by ZFS.
Jamie Gritton [Thu, 7 May 2009 18:36:47 +0000 (18:36 +0000)]
Move the per-prison Linux MIB from a private one-off pointer to the new
OSD-based jail extensions. This allows the Linux MIB to accessed via
jail_set and jail_get, and serves as a demonstration of adding jail support
to a module.
Ed Schouten [Thu, 7 May 2009 17:39:23 +0000 (17:39 +0000)]
If we have a regular rint handler, never go into rint_bypass mode.
It turns out if we called cfmakeraw() on a TTY with only a rint handler
in place, it could inject data into the TTY, even though it should be
redirected. Always take a look at the hooks before looking at the
termios flags.
Dmitry Chagin [Thu, 7 May 2009 14:24:50 +0000 (14:24 +0000)]
Linux exports HZ value to user space via AT_CLKTCK auxiliary vector entry,
which is available for Glibc as sysconf(_SC_CLK_TCK). If AT_CLKTCK entry is
not exported, Glibc uses 100.
linux_times() shall use the value that is exported to user space.
Ed Schouten [Thu, 7 May 2009 13:49:48 +0000 (13:49 +0000)]
Add tcsetsid(3).
The entire world seems to use the non-standard TIOCSCTTY ioctl to make a
TTY a controlling terminal of a session. Even though tcsetsid(3) is also
non-standard, I think it's a lot better to use in our own source code,
mainly because it's similar to tcsetpgrp(), tcgetpgrp() and tcgetsid().
I stole the idea from QNX. They do it the other way around; their
TIOCSCTTY is just a wrapper around tcsetsid(). tcsetsid() then calls
into an IPC framework.
Andrew Thompson [Thu, 7 May 2009 02:15:58 +0000 (02:15 +0000)]
- Fix the u3g port detection where it would not calculate the correct number of
ports when multiple interfaces are present.
- Claim all interfaces regardless of how many are attached
Sam Leffler [Thu, 7 May 2009 00:35:32 +0000 (00:35 +0000)]
optimize ath_tx_findrix: there's no need to walk the rates table as
sc_rixmap is an inverse map
NB: could eliminate the check for an invalid rate by filling in 0 for
invalid entries but the rate control modules use it to identify
bogus rates so leave it for now
Sam Leffler [Wed, 6 May 2009 23:49:55 +0000 (23:49 +0000)]
o cleanup checks for which vap combinations are permitted and what to
use for ic_opmode
o fixes the case where creating ahdemo+wds vaps caused ic_opmode to be
set to hostap
Ulf Lilleengen [Wed, 6 May 2009 19:34:32 +0000 (19:34 +0000)]
- Split up the BIO queue into a queue for new and one for completed requests.
This is necessary for two reasons:
1) In order to avoid collisions with the use of a BIOs flags set by a consumer
or a provider
2) Because GV_BIO_DONE was used to mark a BIO as done, not enough flags was
available, so the consumer flags of a BIO had to be misused in order to
support enough flags. The new queue makes it possible to recycle the
GV_BIO_DONE flag into GV_BIO_GROW.
As a consequence, gvinum will now work with any other GEOM class under it or
on top of it.
- Use bio_pflags for storing internal flags on downgoing BIOs, as the requests
appear to come from a consumer of a gvinum volume. Use bio_cflags only for
cloned BIOs.
- Move gv_post_bio to be used internally for maintenance requests.
- Remove some cases where flags where set without need.
Ulf Lilleengen [Wed, 6 May 2009 18:21:48 +0000 (18:21 +0000)]
- Split the queue mutex into one for the event queue and one for the BIO queue,
as they do not really relate and to prepare for an additional queue to be
covered by the BIO queue mutex.
- Implement wrappers for fetching the next element from the event queue as well
as for putting a new element into the BIO queue.
Silence unsolicited spam printed out when KTR_MLD happens to be
in KTR_COMPILE mask. Compiling KTR trace points in does not necessarily
mean enabling them, use proper check against ktr_mask instead.
Andrew Thompson [Tue, 5 May 2009 15:36:23 +0000 (15:36 +0000)]
Revert part of r191494 which used the udev state to mark suspending, this needs
to be set via two variables (peer_suspended and self_suspended) and can not be
merged into one.
Marko Zec [Tue, 5 May 2009 10:56:12 +0000 (10:56 +0000)]
Change the curvnet variable from a global const struct vnet *,
previously always pointing to the default vnet context, to a
dynamically changing thread-local one. The currvnet context
should be set on entry to networking code via CURVNET_SET() macros,
and reverted to previous state via CURVNET_RESTORE(). Recursions
on curvnet are permitted, though strongly discuouraged.
This change should have no functional impact on nooptions VIMAGE
kernel builds, where CURVNET_* macros expand to whitespace.
The curthread->td_vnet (aka curvnet) variable's purpose is to be an
indicator of the vnet context in which the current network-related
operation takes place, in case we cannot deduce the current vnet
context from any other source, such as by looking at mbuf's
m->m_pkthdr.rcvif->if_vnet, sockets's so->so_vnet etc. Moreover, so
far curvnet has turned out to be an invaluable consistency checking
aid: it helps to catch cases when sockets, ifnets or any other
vnet-aware structures may have leaked from one vnet to another.
The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros
was a result of an empirical iterative process, whith an aim to
reduce recursions on CURVNET_SET() to a minimum, while still reducing
the scope of CURVNET_SET() to networking only operations - the
alternative would be calling CURVNET_SET() on each system call entry.
In general, curvnet has to be set in three typicall cases: when
processing socket-related requests from userspace or from within the
kernel; when processing inbound traffic flowing from device drivers
to upper layers of the networking stack, and when executing
timer-driven networking functions.
This change also introduces a DDB subcommand to show the list of all
vnet instances.
Alexander Motin [Tue, 5 May 2009 01:13:20 +0000 (01:13 +0000)]
Do not try to initialize LAPIC timer if we are not going to use it.
It solves assertion, when kernel built with INVARIANTS configured
to use i8254 timer.
John Baldwin [Mon, 4 May 2009 20:25:56 +0000 (20:25 +0000)]
Always compute the root of the kernel source tree and explicitly pass it
to module builds. This avoids having to have the module builds walk up
the tree to find the kernel sources. It also allows a kernel + module
build to succeed when a new level of module subdirectories is added without
requiring that the /usr/share/mk/bsd.kmod.mk file on the machine be patched.