John Baldwin [Fri, 19 Aug 2011 21:28:40 +0000 (21:28 +0000)]
Walk the zombproc list as well as the allproc list when enumerating threads
and processes in a kernel image. This allows examination of threads that
have exited or are in the late stages of exiting.
Marius Strobl [Fri, 19 Aug 2011 19:12:58 +0000 (19:12 +0000)]
r221812 reveals that at least some Broadcom PHYs default to being not only
isolated but also powered down after a reset and while they just work fine
[sic] when both is the case they don't if they are only deisolate but still
powered down. So in order to put PHYs in an overall normal operation mode
for the common case, ensure in mii_phy_reset() that they are not powered
down after a reset. Unfortunately, this only helps in case of BCM5421,
while BCM5709S apparently only work when they remain isolated and powered
down after a reset. So don't call mii_phy_reset() in brgphy_reset() and
implement the reset locally leaving the problematic bits alone. Effectively
this bypasses r221812 for brgphy(4).
Thanks to Justin Hibbits for doing a binary search in order to identify
the problematic commit.
PR: 157405, 158156
Reviewed by: yongari (mii_phy_reset() part)
Approved by: re (kib)
MFC after: 3 days
The decimal() function was changed in r217808 to take the
maximum value instead of number of bits. But for case when
limitation is not needed it erroneously skips conversion to
number and always returns zero. So, don't skip conversion
for case when limitation is not needed.
Robert Watson [Fri, 19 Aug 2011 08:29:10 +0000 (08:29 +0000)]
r222015 introduced a new assertion that the size of a fixed-length sbuf
buffer is greater than 1. This triggered panics in at least one spot in
the kernel (the MAC Framework) which passes non-negative, rather than >1
buffer sizes based on the size of a user buffer passed into a system
call. While 0-size buffers aren't particularly useful, they also aren't
strictly incorrect, so loosen the assertion.
Discussed with: phk (fears I might be EDOOFUS but willing to go along)
Spotted by: pho + stress2
Approved by: re (kib)
Ensure that process descriptors work as expected. We should be able to:
- pdfork(), like regular fork(), but producing a process descriptor
- pdgetpid() to convert a PD into a PID
- pdkill() to send signals to a process identified by a PD
Approved by: re (kib), mentor (rwatson)
Sponsored by: Google Inc
A "process descriptor" file descriptor is used to manage processes
without using the PID namespace. This is required for Capsicum's
Capability Mode, where the PID namespace is unavailable.
New system calls pdfork(2) and pdkill(2) offer the functional equivalents
of fork(2) and kill(2). pdgetpid(2) allows querying the PID of the remote
process for debugging purposes. The currently-unimplemented pdwait(2) will,
in the future, allow querying rusage/exit status. In the interim, poll(2)
may be used to check (and wait for) process termination.
When a process is referenced by a process descriptor, it does not issue
SIGCHLD to the parent, making it suitable for use in libraries---a common
scenario when using library compartmentalisation from within large
applications (such as web browsers). Some observers may note a similarity
to Mach task ports; process descriptors provide a subset of this behaviour,
but in a UNIX style.
This feature is enabled by "options PROCDESC", but as with several other
Capsicum kernel features, is not enabled by default in GENERIC 9.0.
Reviewed by: jhb, kib
Approved by: re (kib), mentor (rwatson)
Sponsored by: Google Inc
John Baldwin [Thu, 18 Aug 2011 22:20:45 +0000 (22:20 +0000)]
One of the general principles of the sysctl(3) API is that a user can
query the needed size for a sysctl result by passing in a NULL old
pointer and a valid oldsize. The kern.proc.args sysctl handler broke
this assumption by not calling SYSCTL_OUT() if the old pointer was
NULL.
Nathan Whitehorn [Thu, 18 Aug 2011 16:00:32 +0000 (16:00 +0000)]
Fix a bug that prevented docsinstall from being able to use DNS in most
cases and provide a better error handling mechanism during package
installation.
Alexander Motin [Tue, 16 Aug 2011 21:51:29 +0000 (21:51 +0000)]
Always check current HPET counter value after comparator programming to
avoid lost timer interrupts. Previous optimization attempt doing it only
for intervals less then 5000 ticks (~300us) reported to be unreliable by
some people. Probably because of some heavy SMI code on their boards.
Introduce additional safety interval of 128 counter ticks (~9us) between
programmed comparator and counter values to cover different cases of
delayed write found on some chipsets.
Michael Tuexen [Tue, 16 Aug 2011 21:04:18 +0000 (21:04 +0000)]
Fix the handling of [gs]etsockopt() unconnected 1-to-1 style sockets.
While there:
* Fix a locking issue in setsockopt() of SCTP_CMT_ON_OFF.
* Fix a bug in setsockopt() of SCTP_DEFAULT_PRINFO, where the pr_value
was ignored.
Do not return success and a string "unknown" when vn_fullpath() was unable
to resolve the path of the text vnode of the process. The behaviour is
very confusing for any consumer of the procfs, in particular, java.
Reported and tested by: bf
MFC after: 2 weeks
Approved by: re (bz)
Add the fo_chown and fo_chmod methods to struct fileops and use them
to implement fchown(2) and fchmod(2) support for several file types
that previously lacked it. Add MAC entries for chown/chmod done on
posix shared memory and (old) in-kernel posix semaphores.
Based on the submission by: glebius
Reviewed by: rwatson
Approved by: re (bz)
r224086 added "goto out"-style error handling to nfssvc_nfsd(), in order
to reliably call NFSEXITCODE() before returning. Our Capsicum changes,
based on the old "return (error)" model, did not merge nicely.
Approved by: re (kib), mentor (rwatson)
Sponsored by: Google Inc
Nathan Whitehorn [Mon, 15 Aug 2011 13:33:14 +0000 (13:33 +0000)]
Use a maximum of -O on PowerPC kernels instead of -O2 to prevent a rare
bug that could cause intermittent memory corruption on PowerPC SMP
systems using non-debug kernels. This is a temporary change until the
real problem is fixed.
Robert Watson [Mon, 15 Aug 2011 07:30:48 +0000 (07:30 +0000)]
Bump __FreeBSD_version to reflect the availability of capabilities, but
also capability-related changes to fget(9). This is likely not part of
a formal KPI, but the nvidia driver (at least) uses it.
Mention /dev/{stdin,stdout,stderr} breakage that appears in certain
kernel revisions as best avoided!
Michael Tuexen [Sun, 14 Aug 2011 20:55:32 +0000 (20:55 +0000)]
Add support for the spp_dscp field in the SCTP_PEER_ADDR_PARAMS
socket option. Backwards compatibility is provided by still
supporting the spp_ipv4_tos field.
Jilles Tjoelker [Sun, 14 Aug 2011 13:37:38 +0000 (13:37 +0000)]
tail: Fix crash if -F'ed file's filesystem disappears.
If tail notices that a file it is following no longer exists (because stat()
fails), it will output any final lines and then close the file. If the read
operation also causes an error, such as when the filesystem is forcefully
unmounted, it closes the file as well, leading to fclose(NULL) and a
segmentation fault.
Matt Jacob [Sat, 13 Aug 2011 23:34:17 +0000 (23:34 +0000)]
Most of these changes to isp are to allow for isp.ko unloading.
We also revive loop down freezes. We also externaliz within isp
isp_prt_endcmd so something outside the core module can print
something about a command completing. Also some work in progress to
assist in handling timed out commands better.
Martin Matuska [Sat, 13 Aug 2011 21:35:22 +0000 (21:35 +0000)]
zfs_ioctl.c: improve code readability in zfs_ioc_dataset_list_next()
zvol.c: fix calling of dmu_objset_prefetch() in zvol_create_minors()
by passing full instead of relative dataset name and prefetching all
visible datasets to be processed later instead of just the pool name
Reviewed by: pjd
Approved by: re (kib)
MFC after: 1 week
> Reviewed by: If someone else reviewed your modification.
> Approved by: If you needed approval for this commit.
> Obtained from: If the change is from a third party.
> MFC after: N [day[s]|week[s]|month[s]]. Request a reminder email.
> Security: Vulnerability reference (one per line) or description.
> Empty fields above will be automatically removed.
M opensolaris/uts/common/fs/zfs/zfs_ioctl.c
M opensolaris/uts/common/fs/zfs/zvol.c
Robert Watson [Sat, 13 Aug 2011 16:03:40 +0000 (16:03 +0000)]
When falloc() was broken into separate falloc_noinstall() and finstall(),
a bug was introduced in kern_openat() such that the error from the vnode
open operation was overwritten before it was passed as an argument to
dupfdopen(). This broke operations on /dev/{stdin,stdout,stderr}. Fix
by preserving the original error number across finstall() so that it is
still available.
Robert Watson [Sat, 13 Aug 2011 13:11:28 +0000 (13:11 +0000)]
Bump __FreeBSD_version to reflect the availability of capabilities, but
also capability-related changes to fget(9). This is likely not part of
a formal KPI, but the nvidia driver (at least) uses it.
Robert Watson [Sat, 13 Aug 2011 12:14:40 +0000 (12:14 +0000)]
Regenerate system call files following r224812 changes to capabilities.conf.
A no-op for non-Capsicum kernels; for Capsicum kernels, completes the
enabling of fooat(2) system calls using capabilities. With this change,
and subject to bug fixes, Capsicum capability support is now complete for
9.0.
Approved by: re (kib)
Submitted by: jonathan
Sponsored by: Google Inc
Martin Matuska [Sat, 13 Aug 2011 10:58:53 +0000 (10:58 +0000)]
Fix race between dmu_objset_prefetch() invoked from
zfs_ioc_dataset_list_next() and dsl_dir_destroy_check() indirectly
invoked from dmu_recv_existing_end() via dsl_dataset_destroy() by not
prefetching temporary clones, as these count as always inconsistent.
In addition, do not prefetch hidden datasets at all as we are not
going to process these later.
Filed as Illumos Bug #1346
PR: kern/157728
Tested by: Borja Marcos <borjam@sarenet.es>, mm
Reviewed by: pjd
Approved by: re (kib)
MFC after: 1 week
Allow openat(2), fstatat(2), etc. in capability mode.
namei() and lookup() can now perform "strictly relative" lookups.
Such lookups, performed when in capability mode or when looking up
relative to a directory capability, enforce two policies:
- absolute paths are disallowed (including symlinks to absolute paths)
- paths containing '..' components are disallowed
These constraints make it safe to enable openat() and friends.
These system calls are instrumental in supporting Capsicum
components such as the capability-mode-aware runtime linker.
Finally, adjust comments in capabilities.conf to reflect the actual state
of the world (e.g. shm_open(2) already has the appropriate constraints,
getdents(2) already requires CAP_SEEK).
Approved by: re (bz), mentor (rwatson)
Sponsored by: Google Inc.
Allow Capsicum capabilities to delegate constrained
access to file system subtrees to sandboxed processes.
- Use of absolute paths and '..' are limited in capability mode.
- Use of absolute paths and '..' are limited when looking up relative
to a capability.
- When a name lookup is performed, identify what operation is to be
performed (such as CAP_MKDIR) as well as check for CAP_LOOKUP.
With these constraints, openat() and friends are now safe in capability
mode, and can then be used by code such as the capability-mode runtime
linker.
Approved by: re (bz), mentor (rwatson)
Sponsored by: Google Inc
Matt Jacob [Fri, 12 Aug 2011 20:09:38 +0000 (20:09 +0000)]
Fixes for sure bus reference miscounting and potential device and
target reference miscounts. It also adds a helper function to get
the current reference counts for components of cam_path for debug
aid. One minor style(9) change.
This patch does three things:
- puts capability rights in a more pleasing declaration order
- changes mask values to match the new declaration order
- declare new rights which will be used soon (e.g. CAP_LOOKUP, CAP_MKDIR)
Approved by: re (kib), mentor (rwatson)
Sponsored by: Google Inc
This commit adds regression testing for openat(), fstatat(), etc. with
capability scoping ("strict relative" lookup), which applies:
- in capability mode
- when performing any *at() lookup relative to a capability
These tests will fail until the *at() code is committed; on my local
instance, with the *at() changes, they all pass.
Approved by: re (kib), mentor (rwatson)
Sponsored by: Google Inc
Eliminate the zfsdev_state_lock entirely and replace it with the
spa_namespace_lock. This fixes LOR between the spa_namespace_lock and
spa_config lock. LOR can cause deadlock on vdevs removal/insertion.
Since kern_openat() now uses falloc_noinstall() and finstall() separately,
there are cases where we could get to cleanup code without ever creating
a file descriptor. In those cases, we should not call fdclose() on FD -1.
Approved by: re (kib), mentor (rwatson)
Sponsored by: Google Inc
Robert Watson [Thu, 11 Aug 2011 12:30:23 +0000 (12:30 +0000)]
Second-to-last commit implementing Capsicum capabilities in the FreeBSD
kernel for FreeBSD 9.0:
Add a new capability mask argument to fget(9) and friends, allowing system
call code to declare what capabilities are required when an integer file
descriptor is converted into an in-kernel struct file *. With options
CAPABILITIES compiled into the kernel, this enforces capability
protection; without, this change is effectively a no-op.
Some cases require special handling, such as mmap(2), which must preserve
information about the maximum rights at the time of mapping in the memory
map so that they can later be enforced in mprotect(2) -- this is done by
narrowing the rights in the existing max_protection field used for similar
purposes with file permissions.
In namei(9), we assert that the code is not reached from within capability
mode, as we're not yet ready to enforce namespace capabilities there.
This will follow in a later commit.
Update two capability names: CAP_EVENT and CAP_KEVENT become
CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they
represent.
Approved by: re (bz)
Submitted by: jonathan
Sponsored by: Google Inc
Use synchronous device destruction instead of asynchronous, so that a new
device having the same name like a previous one is not created before the old
one is gone. This fixes some panics due to asserts in the devfs code which
were added recently.
Ruslan Ermilov [Thu, 11 Aug 2011 10:29:10 +0000 (10:29 +0000)]
- Merged awk upstream that includes a fix for a bug exposed by kmod_syms.mk.
- Provide a build aid for those who already have a buggy awk(1) installed.
Marius Strobl [Wed, 10 Aug 2011 19:12:21 +0000 (19:12 +0000)]
Sync makefs(8) ISO 9660 support with NetBSD:
o cd9960 -> cd9660
o Move inclusion of sys/endian.h from cd9660_eltorito.c to cd9660.h
since actual user is not cd9660_eltorito.c but iso.h and
cd9660_eltorito.h.
Actually, include order/place of sys/endian.h doesn't matter on
netbsd since it is always included by sys/types.h but it's not
true on other system. This should fix cross build breakage on
freebsd introduced by rev. 1.16 of cd9660_eltorito.c.
Problem reported and fix suggested on twitter.
o Fix fd leaks in error cases. Found by cppcheck.
o RRIP RE length should be 4, not 0
o Apply fixes for PR bin/44114 (makefs(8) -t cd9660 -o rockridge creates
corrupted cd9660fs), iso9660_rrip.c part:
- cd9660_rrip_finalize_node() should check rr_real_parent in node->parent,
not in node itself in RRIP_PL case
- cd9660_rrip_initialize_node() should update only node passed as arg
so handle RRIP_PL in DOTDOT case
Fixes malformed dotdot entries in deep (more than 8 level) directories
moved into .rr_moved dir.
Should be pulled up to netbsd-5.
(no official ISO has such deep dirs, but cobalt restorecd is affected)
Reviewed by: mm
Approved by: re (kib)
Obtained from: NetBSD
MFC after: 3 days
Marius Strobl [Wed, 10 Aug 2011 19:05:22 +0000 (19:05 +0000)]
o Improve 224494:
- Ignore some more internal SAS device status change events.
- Correct inverted Bus and TargetID arguments in a warning.
o Add a warning for MPI_EVENT_SAS_DISCOVERY_ERROR events, which can help
identifying broken disks.
Submitted by: Andrew Boyer
Approved by: re (kib)
Committed from: Chaos Communication Camp 2011
- Move the PG_UNMANAGED flag from m->flags to m->oflags, renaming the flag
to VPO_UNMANAGED (and also making the flag protected by the vm object
lock, instead of vm page queue lock).
- Mark the fake pages with both PG_FICTITIOUS (as it is now) and
VPO_UNMANAGED. As a consequence, pmap code now can use use just
VPO_UNMANAGED to decide whether the page is unmanaged.
Reviewed by: alc
Tested by: pho (x86, previous version), marius (sparc64),
marcel (arm, ia64, powerpc), ray (mips)
Sponsored by: The FreeBSD Foundation
Approved by: re (bz)
John Baldwin [Tue, 9 Aug 2011 15:29:58 +0000 (15:29 +0000)]
Merge 220876, 220877, and 221537 from the new NFS client to the old:
Allow the NFS client to use a max file size larger than 1TB for v3 mounts.
It now allows files up to OFF_MAX subject to whatever limit the server
advertises.
Alexander Motin [Tue, 9 Aug 2011 08:11:26 +0000 (08:11 +0000)]
Do not block zero report ID. It is correct value for devices with single
ID. This fixes USB_SET_IMMED call (synchronous operation) of the uhid(4)
driver on devices with single report ID.
Dimitry Andric [Mon, 8 Aug 2011 20:53:04 +0000 (20:53 +0000)]
Fix buffer overflow in sys/boot/common/util.c's printf(), when printing
large (>= 10^10) numbers. In theory, 20 characaters should be enough,
but bump the buffer to 32 characters, so we have some room for the
future.
When setting a fixed channel on adapters with 11n support the scan
channel list ends up with 2 entries, the HT and the legacy channel.
The scan itself is currently always done at legacy rates so we end
up receiving scan results for legacy networks on the HT channel and
erroneously assigning the BSS to the 11n channel. As the channel's
capabilities are used to setup the adapter we might end up with
non-working settings and/or firmware crashes.
Fix this by ensuring that scan results received on a HT channel
are only assigned to that channel if the htcap IE is available,
else use the legacy channel equivalent.
Tested by: Pawel Worach, Raoul Megelas, Maciej Milewski,
Andrei <az at azsupport dot com>
Approved by: re (kib)
Adrian Chadd [Mon, 8 Aug 2011 16:22:42 +0000 (16:22 +0000)]
Introduce some more DFS related hooks, inspired both by local work
and the Atheros reference code.
The radar detection code needs to know what the current DFS domain is.
Since net80211 doesn't currently know this information, it's extracted
from the HAL regulatory domain information.
The specifics:
* add a new ath_dfs API hook, ath_dfs_init_radar_filters(), which
updates the radar filters whenever the regulatory domain changes.
* add HAL_DFS_DOMAIN which describes the currently configured DFS domain .
* add a new HAL internal variable which tracks the currently configured
HAL DFS domain.
* add a new HAL capability, HAL_CAP_DFS_DMN, which returns the currently
configured HAL DFS domain setting.
* update the HAL DFS domain setting whenever the channel setting is
updated.
Since this isn't currently used by any radar code, these should all
be no-ops for existing users.