Instead of incomplete handling of read(2)/write(2) return values that
does not fit into registers, declare that we do not support this case
using CTASSERT(), and remove endianess-unsafe code to split return value
into td_retval.
While there, change the style of the sysctl debug.iosize_max_clamp
definition.
Andreas Tobler [Sun, 4 Mar 2012 11:55:28 +0000 (11:55 +0000)]
Restore proper dot symbol creation for assembly files in the kernel build case.
Without this patch we were not able to see the assembly function.
Only the function descriptor was visible.
- Distinguish between user-land and kernel when creating the ENTRY() point of
assembly source.
- Make the ENTRY() macro more readable, replace the .align directive with the
gas platform independant .p2align directive.
- Create an END()macro for later use to provide traceback tables on powerpc64.
Andreas Tobler [Sun, 4 Mar 2012 08:43:33 +0000 (08:43 +0000)]
Add support for PWM controlled fans. I found these fans on my PowerMac9,1.
These fans are not located under the same node as the the RPM controlled ones,
So I had to adapt the current source to parse and fill the properties correctly.
To control the fans we can set the PWM ratio via sysctl between 20 and 100%.
Adrian Chadd [Sun, 4 Mar 2012 05:52:26 +0000 (05:52 +0000)]
* Introduce new flag for QoS control field;
* Change in mesh_input to validate that QoS is set and Mesh Control field
is present, also both bytes of the QoS are read;
* Moved defragmentation in mesh_input before we try to forward packet as
inferred from amendment spec, because Mesh Control field only present in first
fragment;
* Changed in ieee80211_encap to set QoS subtype and Mesh Control field present,
only first fragment have Mesh Control field present bit equal to 1;
Adrian Chadd [Sun, 4 Mar 2012 05:49:39 +0000 (05:49 +0000)]
* Added IEEE80211_ACTION_CAT_MESH in ieee80211.h as specified amendment spec;
* Moved old categories as specified by D4.0 to be action fields of MESH category
as specified in amendment spec;
* Modified functions to use MESH category and its action fields:
+ ieee80211_send_action_register
+ ieee80211_send_action
+ ieee80211_recv_action_register
+ieee80211_recv_action;
* Modified ieee80211_hwmp_init and hwmp_send_action so they uses correct
action fields as specified in amendment spec;
* Modified ieee80211_parse_action so that it verifies MESH frames.
* Change Mesh Link Metric to use one information element as amendment spec.
Draft 4.0 defined two different information elements for request and response.
Juli Mallett [Sun, 4 Mar 2012 05:19:55 +0000 (05:19 +0000)]
Fix tls base computation with COMPAT_FREEBSD32 on n64 kernels. The previous
version was missing an else and would always use the n64 TP_OFFSET. Eliminate
some duplication of logic here.
It may be worth getting rid of some of the ifdefs and introducing gratuitous
SV_ILP32 runtime checks on n64 kernels without COMPAT_FREEBSD32 and on o32
kernels, similarly to how PowerPC works.
Dimitry Andric [Sat, 3 Mar 2012 23:49:53 +0000 (23:49 +0000)]
Revert r232473. I have been convinced by Doug Barton and Bjoern Zeeb
that it is better to error out when people attempt to build using the
wrong bsd.*.mk files, than to silently ignore the problem.
This means, that after this commit, if you want to build kernel modules
by hand (or via a port) from a head source tree, you *must* make sure
the files in /usr/share/mk are in sync with that tree. If that isn't
possible, for example when you are running on an older FreeBSD branch,
you can:
- Run "make buildenv" from your head source tree, to have the correct
environment setup. (It's advisable to have run "make buildworld", or
at a minimum "make toolchain" first.)
- Alternatively, set MAKESYSPATH to the share/mk directory under your
head source tree. If your build tools are too old, other problems may
still occur.
- Alternatively, use "make -m" and specify the share/mk directory under
your head source tree. Again, build tools that are too old may still
result in trouble.
Juli Mallett [Sat, 3 Mar 2012 21:39:12 +0000 (21:39 +0000)]
On MIPS, _ALIGN always aligns to 8 bytes, even for 32-bit binaries. This might
not be ideal, but is the ABI we've shipped so far. Fix macros which reflect
the results of _ALIGN on 32-bit MIPS to use the right alignment.
This fixes sendmsg under COMPAT_FREEBSD32 on n64 MIPS kernels.
Dimitry Andric [Sat, 3 Mar 2012 18:58:15 +0000 (18:58 +0000)]
After r232322, it turned out many people (and some ports) are building
kernel modules using their old installed /usr/share/mk/bsd.*.mk files,
instead of the updated ones in their source tree. This leads to errors
like:
Obviously, these errors will go away after a "make installworld", or
alternatively, by using "make buildenv" before attempting to manually
build modules.
However, since it is apparently an expected use case to build using old
.mk files, change the way we test for clang, so it also works when the
MK_CLANG_IS_CC macro doesn't exist.
Note the conditional expressions are becoming rather unreadable now, but
I will attempt to fix that on a followup commit.
John Baldwin [Sat, 3 Mar 2012 18:08:57 +0000 (18:08 +0000)]
Expand the set of APIs available for locating PCI capabilities:
- pci_find_extcap() is repurposed to be used for fetching PCI-express
extended capabilities (PCIZ_* constants in <dev/pci/pcireg.h>).
- pci_find_htcap() can be used to locate a specific HyperTransport
capability (PCIM_HTCAP_* constants in <dev/pci/pcireg.h>).
- Cache the starting location of the PCI-express capability for PCI-express
devices in PCI device ivars.
Rick Macklem [Sat, 3 Mar 2012 16:13:20 +0000 (16:13 +0000)]
The name caching changes of r230394 exposed an intermittent bug
in the new NFS server for NFSv4, where it would report ENOENT
when the file actually existed on the server. This turned out
to be caused by not initializing ni_topdir before calling lookup()
and there was a rare case where the value on the stack location
assigned to ni_topdir happened to be a pointer to a ".." entry,
such that "dp == ndp->ni_topdir" succeeded in lookup().
This patch initializes ni_topdir to fix the problem.
John Baldwin [Sat, 3 Mar 2012 14:25:36 +0000 (14:25 +0000)]
Update the pci_get_vpd_readonly() wrapper to use 'vptr' instead of
'identptr' for its last parameter to match the default implementation
as well as the method definition in pci_if.m.
John Baldwin [Sat, 3 Mar 2012 14:23:54 +0000 (14:23 +0000)]
Expand and reorganize the pci(9) manpage a bit:
- Document the following routines: pci_alloc_msi(), pci_alloc_msix(),
pci_find_cap(), pci_get_max_read_req(), pci_get_vpd_ident(),
pci_get_vpd_readonly(), pci_msi_count(), pci_msix_count(),
pci_pending_msix(), pci_release_msi(), pci_remap_msix(), and
pci_set_max_read_req().
- Group the functions into five sub-sections: raw configuration access,
locating devices, device information, device configuration, and
message signaled interrupts.
- Discourage use of pci_disable_io() and pci_enable_io() in device drivers.
The PCI bus driver handles this automatically as resources are activated.
Alexander Motin [Sat, 3 Mar 2012 11:50:48 +0000 (11:50 +0000)]
Fix bug of r232207, when cpu_search() could prefer CPU group with best
load, but with no CPU matching given limitations. It caused kernel panics
in some cases when thread was bound to specific CPUs with cpuset(1).
Juli Mallett [Sat, 3 Mar 2012 08:19:18 +0000 (08:19 +0000)]
o) Add COMPAT_FREEBSD32 support for MIPS kernels using the n64 ABI with userlands
using the o32 ABI. This mostly follows nwhitehorn's lead in implementing
COMPAT_FREEBSD32 on powerpc64.
o) Add a new type to the freebsd32 compat layer, time32_t, which is time_t in the
32-bit ABI being used. Since the MIPS port is relatively-new, even the 32-bit
ABIs use a 64-bit time_t.
o) Because time{spec,val}32 has the same size and layout as time{spec,val} on MIPS
with 32-bit compatibility, then, disable some code which assumes otherwise
wrongly when built for MIPS. A more general macro to check in this case would
seem like a good idea eventually. If someone adds support for using n32
userland with n64 kernels on MIPS, then they will have to add a variety of
flags related to each piece of the ABI that can vary. That's probably the
right time to generalize further.
o) Add MIPS to the list of architectures which use PAD64_REQUIRED in the
freebsd32 compat code. Probably this should be generalized at some point.
Make sure that the USB system suspend event is executed synchronously
and not asynchronously. This fixes problems related to USB system
suspend and resume. It is assumed that we are always allowed to sleep
from the device_suspend() method.
Rick Macklem [Sat, 3 Mar 2012 01:06:54 +0000 (01:06 +0000)]
Post r230394, the Lookup RPC counts for both NFS clients increased
significantly. Upon investigation this was caused by name cache
misses for lookups of "..". For name cache entries for non-".."
directories, the cache entry serves double duty. It maps both the
named directory plus ".." for the parent of the directory. As such,
two ctime values (one for each of the directory and its parent) need
to be saved in the name cache entry.
This patch adds an entry for ctime of the parent directory to the
name cache. It also adds an additional uma zone for large entries
with this time value, in order to minimize memory wastage.
As well, it fixes a couple of cases where the mtime of the parent
directory was being saved instead of ctime for positive name cache
entries. With this patch, Lookup RPC counts return to values similar
to pre-r230394 kernels.
Fix a problem that was causing the mpt(4) driver to attach to MegaRAID
cards that should be handled by the mfi(4) driver.
The root of the problem is that the mpt(4) driver was masking off the
bottom bit of the PCI device ID when deciding which cards to attach to.
It appears that a number of the mpt(4) Fibre Channel cards had a LAN
variant whose PCI device ID was just one bit off from the FC card's device
ID. The FC cards were even and the LAN cards were odd.
The problem was that this pattern wasn't carried over on the SAS and
parallel SCSI mpt(4) cards. Luckily the SAS and parallel SCSI PCI device
IDs were either even numbers, or they would get masked to a supported
adjacent PCI device ID, and everything worked well.
Now LSI is using some of the odd-numbered PCI device IDs between the 3Gb
SAS device IDs for their new MegaRAID cards. This is causing the mpt(4)
driver to attach to the RAID cards instead of the mfi(4) driver.
The solution is to stop masking off the bottom bit of the device ID, and
explicitly list the PCI device IDs of all supported cards.
This change should be a no-op for mpt(4) hardware. The only intended
functional change is that for the 929X, the is_fc variable gets set. It
wasn't being set previously, but needs to be because the 929X is a Fibre
Channel card.
Reported by: Kashyap Desai <Kashyap.Desai@lsi.com>
MFC After: 3 days
Juli Mallett [Fri, 2 Mar 2012 21:46:31 +0000 (21:46 +0000)]
When creating a handle for a subregion, be sure to actually math out the new
handle address, where we're using handles as raw addresses.
This fixes devices with subregions on Octeon PCI specifically, and likely also on
MIPS more generally, where there isn't another bus_space in use that was doing the
math already.
John Baldwin [Fri, 2 Mar 2012 20:38:04 +0000 (20:38 +0000)]
- Add a bus_dma tag to each PCI bus that is a child of a Host-PCI bridge.
The tag enforces a single restriction that all DMA transactions must not
cross a 4GB boundary. Note that while this restriction technically only
applies to PCI-express, this change applies it to all PCI devices as it
is simpler to implement that way and errs on the side of caution.
- Add a softc structure for PCI bus devices to hold the bus_dma tag and
a new pci_attach_common() routine that performs actions common to the
attach phase of all PCI bus drivers. Right now this only consists of
a bootverbose printf and the allocate of a bus_dma tag if necessary.
- Adjust all PCI bus drivers to allocate a PCI bus softc and to call
pci_attach_common() from their attach routines.
Juli Mallett [Fri, 2 Mar 2012 20:34:15 +0000 (20:34 +0000)]
Unbreak SMP on stock Octeon systems -- copy the core_mask from bootinfo into
sysinfo. This should have been done as part of replacing bootinfo with sysinfo.
John Baldwin [Fri, 2 Mar 2012 18:55:19 +0000 (18:55 +0000)]
Similar to the fixes in 226967 and 226987, purge any name cache entries
associated with the previous vnode (if any) associated with the target of
a rename(). Otherwise, a lookup of the target pathname concurrent with a
rename() could re-add a name cache entry after the namei(RENAME) lookup
in kern_renameat() had purged the target pathname.
Ruslan Ermilov [Fri, 2 Mar 2012 14:05:50 +0000 (14:05 +0000)]
Removed excessive _seekdir() call in closedir(). This saves one lseek()
syscall. Before r5958, seekdir() was called for its side effect of
freeing memory allocated by opendir() for rewinddir(), but that revision
added _reclaim_telldir() that frees all memory allocated by telldir()
calls, making this call redundant.
This introduces a slight change. If an application duplicated the descriptor
obtained through dirfd(), it can no longer rely on file position to be
reset to the start of file after a call to closedir(). It's believed to
be safe because neither POSIX, nor any other OS I've tested (NetBSD, Linux,
OS X) rewind the file offset pointer on closedir().
Ruslan Ermilov [Fri, 2 Mar 2012 10:03:38 +0000 (10:03 +0000)]
Finally removed the stat() and fstat() calls from the opendir() code.
They were made excessive in r205424 by opening with O_DIRECTORY.
Also eliminated the fcntl() call used to set FD_CLOEXEC by opening
with O_CLOEXEC.
(fdopendir() still checks that the passed descriptor is a directory,
and sets FD_CLOEXEC on it.)
Hiroki Sato [Fri, 2 Mar 2012 07:23:28 +0000 (07:23 +0000)]
Allow to configure net.inet6.ip6.{accept_rtadv,no_radr} by the loader tunables
as well because they have to be configured before interface initialization for
AF_INET6.
Adrian Chadd [Fri, 2 Mar 2012 02:53:43 +0000 (02:53 +0000)]
Attempt to catch scan cancellations at exactly the wrong time from occuring.
The scan code unlocks the comlock and calls into the driver. It then
assumes the state hasn't changed from underneath it.
Although I haven't seen this particular condition trigger, I'd like to
be informed if I or anyone else sees it.
What I'm thinking may occur:
* A cancellation comes in during the scan_end call;
* the cancel flag is set;
* but it's never checked, so scandone isn't updated;
* .. and the interface stays in the STA power save mode.
Davide Italiano [Thu, 1 Mar 2012 21:23:26 +0000 (21:23 +0000)]
- Add support for the Intel Sandy Bridge microarchitecture (both core and uncore counting events)
- New manpages with event lists.
- Add MSRs for the Intel Sandy Bridge microarchitecture
John Baldwin [Thu, 1 Mar 2012 20:36:50 +0000 (20:36 +0000)]
Update the documentation on pci_get/set_powerstate(). These methods are
not ACPI-specific at all, but deal with PCI power states. Also,
pci_set_powerstate() fails with EOPNOTSUPP if a request is made that the
underlying device does not support rather than falling back to somehow
setting D0.
John Baldwin [Thu, 1 Mar 2012 20:20:55 +0000 (20:20 +0000)]
Add pci_save_state() and pci_restore_state() wrappers around
pci_cfg_save() and pci_cfg_restore() for device drivers to use when
saving and restoring state (e.g. to handle device-specific resets).
John Baldwin [Thu, 1 Mar 2012 19:58:34 +0000 (19:58 +0000)]
- Change contigmalloc() to use the vm_paddr_t type instead of an unsigned
long for specifying a boundary constraint.
- Change bus_dma tags to use bus_addr_t instead of bus_size_t for boundary
constraints.
These allow boundary constraints to be fully expressed for cases where
sizeof(bus_addr_t) != sizeof(bus_size_t). Specifically, it allows a
driver to properly specify a 4GB boundary in a PAE kernel.
Note that this cannot be safely MFC'd without a lot of compat shims due
to KBI changes, so I do not intend to merge it.
Kirk McKusick [Thu, 1 Mar 2012 18:45:25 +0000 (18:45 +0000)]
This change avoids a kernel deadlock on "snaplk" when using
snapshots on UFS filesystems running with journaled soft updates.
This is the first of several bugs that need to be fixed before
removing the restriction added in -r230250 to prevent the use
of snapshots on filesystems running with journaled soft updates.
The deadlock occurs when holding the snapshot lock (snaplk)
and then trying to flush an inode via ffs_update(). We become
blocked by another process trying to flush a different inode
contained in the same inode block that we need. It holds the
inode block for which we are waiting locked. When it tries to
write the inode block, it gets blocked waiting for the our
snaplk when it calls ffs_copyonwrite() to see if the inode
block needs to be copied in our snapshot.
The most obvious place that this deadlock arises is in the
ffs_copyonwrite() routine when it updates critical metadata
in a snapshot and tries to write it out before proceeding.
The fix here is to write the data and indirect block pointer
for the snapshot, but to skip the call to ffs_update() to
write the snapshot inode. To ensure that we will never have
to update a pointer in the inode itself, the ffs_snapshot()
routine that creates the snapshot has to ensure that all the
direct blocks are allocated as part of the creation of the
snapshot.
A less obvious place that this deadlock occurs is when we hold
the snaplk because we are deleting a snapshot. In the course of
doing the deletion, we need to allocate various soft update
dependency structures and allocate some journal space. If we
hit a resource limit while doing this we decrease the resources
in use by flushing out an existing dirty file to get it to give
up the soft dependency resources that it holds. The flush can
cause an ffs_update() to be done on the inode for the file that
we have selected to flush resulting in the same deadlock as
described above when the inode that we have chosen to flush
resides in the same inode block as the snapshot inode that we hold.
The fix is to defer cleaning up any time that the inode on which
we are operating is a snapshot.
Help and review by: Jeff Roberson
Tested by: Peter Holm
MFC (to 9 only) after: 2 weeks
Dimitry Andric [Wed, 29 Feb 2012 22:58:51 +0000 (22:58 +0000)]
Add a WITH_CLANG_IS_CC option for src.conf(5), disabled by default, that
installs clang as /usr/bin/cc, /usr/bin/c++ and /usr/bin/cpp.
Note this does *not* disable building and installing gcc, which will
still be available as /usr/bin/gcc, /usr/bin/g++ and /usr/bin/gcpp. If
you want to disable gcc completely, you must use WITHOUT_GCC.
Andrew Thompson [Wed, 29 Feb 2012 22:41:40 +0000 (22:41 +0000)]
Correct the description for CTLFLAG_TUN and CTLFLAG_RDTUN, the declaring of a
system tunable has never been implemented. This flag is only used by sysctl(8)
to provide a helpful error message.
Olivier Houchard [Wed, 29 Feb 2012 22:35:09 +0000 (22:35 +0000)]
Use srandom() to init the PRNG, not srand(), since we use random().
This is harmless because srandom() is called somewhere else, with time(NULL)
as a seed, but this is more correct.
Obtained from: https://bitbucket.org/mux/csup
Pointyhat to: not mux, somebody else
Mikolaj Golub [Wed, 29 Feb 2012 21:38:31 +0000 (21:38 +0000)]
Introduce VOP_UNP_BIND(), VOP_UNP_CONNECT(), and VOP_UNP_DETACH()
operations for setting and accessing vnode's v_socket field.
The operations are necessary to implement proper unix socket handling
on layered file systems like nullfs(5).
This change fixes the long standing issue with nullfs(5) being in that
unix sockets did not work between lower and upper layers: if we bound
to a socket on the lower layer we could connect only to the lower
path; if we bound to the upper layer we could connect only to the
upper path. The new behavior is one can connect to both the lower and
the upper paths regardless what layer path one binds to.
Justin T. Gibbs [Wed, 29 Feb 2012 17:47:01 +0000 (17:47 +0000)]
blkif interface comment cleanups. No functional changes
sys/xen/interface/io/blkif.h:
o Insert space in "Red Hat".
o Fix typo "discard-aligment" -> "discard-alignment"
o Fix typo "unamp" -> "unmap"
o Fix typo "formated" -> "formatted"
o Clarify the text for "params".
o Clarify the text for "sector-size".
o Clarify the text for "max-requests" in the backend section.