For some file types, select code registers two selfd structures. E.g.,
for socket, when specified POLLIN|POLLOUT in events, you would have one
selfd registered for receiving socket buffer, and one for sending. Now,
if both events are not ready to fire at the time of the initial scan,
but are simultaneously ready after the sleep, pollrescan() would iterate
over the pollfd struct twice. Since both times revents is not zero,
returned value would be off by one.
Fix this by recalculating the return value in pollout().
Martin Matuska [Sat, 28 Aug 2010 09:24:11 +0000 (09:24 +0000)]
Import changes from OpenSolaris that provide
- better ACL caching and speedup of ACL permission checks
- faster handling of stat()
- lowered mutex contention in the read/writer lock (rrwlock)
- several related bugfixes
Detailed information (OpenSolaris onnv changesets and Bug IDs):
9749:105f407a2680 6802734 Support for Access Based Enumeration (not used on FreeBSD) 6844861 inconsistent xattr readdir behavior with too-small buffer
9866:ddc5f1d8eb4e 6848431 zfs with rstchown=0 or file_chown_self privilege allows user to "take" ownership
9981:b4907297e740 6775100 stat() performance on files on zfs should be improved 6827779 rrwlock is overly protective of its counters
10143:d2d432dfe597 6857433 memory leaks found at: zfs_acl_alloc/zfs_acl_node_alloc 6860318 truncate() on zfsroot succeeds when file has a component of its path set without access permission
10232:f37b85f7e03e 6865875 zfs sometimes incorrectly giving search access to a dir
10250:b179ceb34b62 6867395 zpool_upgrade_007_pos testcase panic'd with BAD TRAP: type=e (#pf Page fault)
10269:2788675568fd 6868276 zfs_rezget() can be hazardous when znode has a cached ACL
There is a bug in vfs_allocate_syncvnode() failure handling in mount code.
Actually it is hard to properly handle such a failure, especially in MNT_UPDATE
case. The only reason for the vfs_allocate_syncvnode() function to fail is
getnewvnode() failure. Fortunately it is impossible for current implementation
of getnewvnode() to fail, so we can assert this and make
vfs_allocate_syncvnode() void. This in turn free us from handling its failures
in the mount code.
Correct offset conversion to little endian. It was implemented in version 2,
but because of a bug it was a no-op, so we were still using offsets in native
byte order for the host. Do it properly this time, bump version to 4 and set
the G_ELI_FLAG_NATIVE_BYTE_ORDER flag when version is under 4.
Rui Paulo [Sat, 28 Aug 2010 08:03:29 +0000 (08:03 +0000)]
Register an interrupt vector for DTrace return probes. There is some
code missing in lapic to make sure that we don't overwrite this entry,
but this will be done on a sequent commit.
Alexander Motin [Sat, 28 Aug 2010 07:24:45 +0000 (07:24 +0000)]
MFata(4):
Add Intel Cougar Point PCH SATA Controller DeviceIDs. Correct some existing
entries for Intel Ibex Peak (5 Series/3400 Series) PCH SATA controllers.
Pyun YongHyeon [Sat, 28 Aug 2010 00:34:22 +0000 (00:34 +0000)]
Do not allocate multicast array memory in multicast filter
configuration function. For failed memory allocations, em(4)/lem(4)
called panic(9) which is not acceptable on production box.
igb(4)/ixgb(4)/ix(4) allocated the required memory in stack which
consumed 768 bytes of stack memory which looks too big.
To address these issues, allocate multicast array memory in device
attach time and make multicast configuration success under any
conditions. This change also removes the excessive use of memory in
stack.
Pyun YongHyeon [Sat, 28 Aug 2010 00:16:49 +0000 (00:16 +0000)]
If em(4) failed to allocate RX buffers, do not call panic(9).
Just showing some buffer allocation error is more appropriate
action for drivers. This should fix occasional panic reported on
em(4) when driver encountered resource shortage.
Jayachandran C. [Fri, 27 Aug 2010 19:53:57 +0000 (19:53 +0000)]
Revamp XLR interrupt handling, the previous scheme does not work well on
SMP.
We used to route all PIC based interrupts to cpu 0, and used the per-CPU
interrupt mask to enable/disable interrupts. But the interrupt threads can
run on any cpu on SMP, and the interrupt thread will re-enable the interrupts
on the CPU it runs on when it is done, and not on cpu0 where the PIC will
still send interrupts to.
The fix is move the disable/enable for PIC based interrupts to PIC, we will
ack on PIC only when the interrupt thread is done, and we do not use the
per-CPU interrupt mask.
The changes also introduce a way for subsystems to add a function that
will be called to clear the interrupt on the subsystem. Currently This is
used by the PCI/PCIe for doing additional work during the interrupt
handling.
- Run hooks in background - don't block waiting for them to finish.
- Keep all hooks we're running in a global list, so we can report when
they finish and also report when they are running for too long.
MFC after: 2 weeks
Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com
Jaakko Heinonen [Fri, 27 Aug 2010 11:08:11 +0000 (11:08 +0000)]
Don't attempt to write label with GEOM_BSD based method if the class is
not available. This improves error reporting when bsdlabel(8) is unable
to open a device for writing. If GEOM_BSD was unavailable, only a rather
obscure error message "Class not found" was printed.
- When VFS_VGET() is not supported, switch to VOP_LOOKUP().
- We are fine by only share-locking the vnode.
- Remove assertion that doesn't hold for ZFS where we cross mount points
boundaries by going into .zfs/snapshot/<name>/.
- Check the result of malloc(M_NOWAIT) in replay_alloc(). The caller
(replay_alloc()) knows how to handle replay_alloc() failure.
- Eliminate 'freed_one' variable, it is not needed - when no entry is found
rce will be NULL.
- Add locking assertions where we expect a rc_lock to be held.
Warner Losh [Thu, 26 Aug 2010 15:49:52 +0000 (15:49 +0000)]
Make sure TARGET_ABI is defined. TARGET_ABI will die a horrible death
after we get all of TBEMD merged back into head, and make mips64 imply
n64, so don't bother to make this 100% pretty. You'll have to settle
for only 64% pretty.
Jung-uk Kim [Wed, 25 Aug 2010 22:09:02 +0000 (22:09 +0000)]
Add an experimental feature to shadow video BIOS. Long ago, this trick was
supported by many BIOSes to improve performance of VESA BIOS calls for real
mode OSes but it is not our intention here. However, this may help some
platforms where the video ROMs are inaccessible after suspend, for example.
Note it may consume up to 64K bytes of contiguous memory depending on video
controller model when it is enabled. This feature can be disabled by
setting zero to 'debug.vesa.shadow_rom' loader tunable via loader(8) or
loader.conf(5). The default is 1 (enabled), for now.
Revert r210194, adding a comment explaining why calls to chgproccnt()
in unionfs are actually needed. I have a better fix in trasz_hrl p4 branch,
but now is not a good moment to commit it.
Jung-uk Kim [Wed, 25 Aug 2010 21:13:23 +0000 (21:13 +0000)]
Increase maximum number of page table entries per VM86 context from 8 to 24
pages, yet again. Now we can allocate a whole segment, which is required
for shadowing option ROM images, for example.
Jung-uk Kim [Wed, 25 Aug 2010 20:52:40 +0000 (20:52 +0000)]
Check opcode for short jump as well. Some option ROMs do short jumps
(e.g., some NVIDIA video cards) and we were not able to do POST while
resuming because we only honored long jump.
John Baldwin [Wed, 25 Aug 2010 19:12:05 +0000 (19:12 +0000)]
Intel QPI chipsets actually provide two extra "non-core" PCI buses that
provide PCI devices for various hardware such as memory controllers, etc.
These PCI buses are not enumerated via ACPI however. Add qpi(4) psuedo
bus and Host-PCI bridge drivers to enumerate these buses. Currently the
driver uses the CPU ID to determine the bridges' presence.
In collaboration with: Joseph Golio @ Isilon Systems
MFC after: 2 weeks
Brian Somers [Wed, 25 Aug 2010 18:09:51 +0000 (18:09 +0000)]
If we read zero bytes from the directory, early out with ENOENT
rather than forging ahead and interpreting garbage buffer content
and dirent structures.
This change backs out r211684 which was essentially a no-op.
Jayachandran C. [Wed, 25 Aug 2010 13:37:55 +0000 (13:37 +0000)]
Provide timecounter based on XLR PIC timer.
- Use timer 7 in XLR PIC as a 32 counter
- provide pic_init_timer(), pic_set_timer(), pic_timer_count32() and
pic_timer_count() PIC timer operations.
- register this timer as platform_timecounter on rmi platform.
Jayachandran C. [Wed, 25 Aug 2010 12:10:20 +0000 (12:10 +0000)]
XLR PIC code update.
- Fix a bug in xlr_pic_init (use irq in PIC_IRQ_IS_EDGE_TRIGGERED)
- use new macro PIC_INTR_TO_IRQ() and PIC_IRT_x() in xlr_pic_init
Jayachandran C. [Wed, 25 Aug 2010 11:49:48 +0000 (11:49 +0000)]
XLR PIC code update and style(9) fixes.
- style(9) fixes to mips/rmi platform files
- update pic.h to add pic_setup_intr() and use pic_setup_intr() for setting
up interrupts which are routed thru PIC.
- remove rmi_spin_mutex_safe and haslock, and make sure that the functions
are called only after mutexes are available.
Jayachandran C. [Wed, 25 Aug 2010 09:53:00 +0000 (09:53 +0000)]
Rename on_chip.c to fmn.c, as the file has just the fast messaging network
code. The iodi.c has the bus for SoC devices, so the name on_chip.c is
misleading.
Jayachandran C. [Wed, 25 Aug 2010 08:48:54 +0000 (08:48 +0000)]
RMI XLR platform code clean-up.
- move PIC code to xlr_machdep.c
- move fast message ring code completely to on_chip.c
- move memory initialization to a new function xlr_mem_init()
- style fixes
- Change the threshold from 'running next scrub the <value+1>th day after the
last one' to 'running next scrub the <value>th day after the last one'.
- Improve wording.
David Xu [Wed, 25 Aug 2010 03:14:32 +0000 (03:14 +0000)]
If a thread is removed from umtxq while sleeping, reset error code
to zero, this gives userland a better indication that a thread needn't
to be cancelled.
Warner Losh [Wed, 25 Aug 2010 02:03:48 +0000 (02:03 +0000)]
Prodded by Yongari, add support for Holtek HT80232. Add the device
ID, plus the ability to force '16-bit mode' which really means NE-2000
mode. Other open source drivers suggest that the Holtek misbehaves if
you allow the 8-bit probe. Also, all of the PCI chips emulate
NE-2000ish cards, so always force 16-bit mode for memory transfers.