Craig Rodrigues [Sat, 18 Nov 2006 18:22:11 +0000 (18:22 +0000)]
Previously, the mount_ext2fs binary listed the acceptable mount
options for ext2fs. Now that we use nmount() directly from the mount
binary to access ext2fs filesystems, add the list of acceptable mount
options to ext2_ops, so that vfs_filteropts() will accept
options like "noatime" for ext2fs.
PR: 105483
Noticed by: Dr. Markus Waldeck <waldeck gmx de>
MFC after: 1 month
Add debuging printfs to syscalls that do not contain it yet. In
sethostname do not print the hostname because it would require to copyin
the string. Sethostname is not very frequently used.
Scott Long [Sat, 18 Nov 2006 07:33:53 +0000 (07:33 +0000)]
Change the internal API for polled commands. Calling mfi_polled_command
after calling mfi_mapcmd is no longer needed, so long as the MFI_CMD_POLLED
flag is set. This change eliminates the possibility of a polled command
getting posted twice to the driver. This is turn fixes panics on shutdown
when INVARIANTS is set.
Matt Jacob [Sat, 18 Nov 2006 03:53:16 +0000 (03:53 +0000)]
Make the SAN login/logout stuff more common between different chipsets
and provied an isp_control entry point so that the outer layers can
do PLOGI/LOGO explicitly. Add MS IOCB support. This completes the cycle
for base support for SMI-S.
Jung-uk Kim [Fri, 17 Nov 2006 20:43:01 +0000 (20:43 +0000)]
Fix msgsnd(3)/msgrcv(3) deadlock under heavy resource pressure by timing out
msgsnd and rechecking resources. This problem was found while I was running
Linux Test Project test suite (test cases: msgctl08, msgctl09).
Change `msgwait' to `msgsnd' and `msgrcv' to distinguish its sleeping
conditions. Few cosmetic changes to debugging messages.
John Baldwin [Fri, 17 Nov 2006 20:27:01 +0000 (20:27 +0000)]
Add support for 8 byte hardware watches in long mode. Kernel hardware
watches support 8 byte watches. For userland, we disallow 8 byte watches
for 32-bit tasks.
John Baldwin [Fri, 17 Nov 2006 19:20:32 +0000 (19:20 +0000)]
- Add macro constants for the various fields in %dr7 and use them in place
of various scattered magic values.
- Pretty print the address of hardware watchpoints in 'show watch' rather
than just displaying hex.
- Expand address field width on amd64 for 64-bit pointers.
John Baldwin [Fri, 17 Nov 2006 16:41:03 +0000 (16:41 +0000)]
Trim some noise from bootverbose:
- Drop the printf in intr_machdep.c when we assign an interrupt souce to
a CPU. Each source already has a more detailed printf.
- Don't output a line for each ioapic pin showing its initial state, this
has outlived its usefulness.
- When an APIC enumerator sets the bus, polarity, or trigger mode of an
ioapic pin, just return success without printing anything if the new
value matches the current one.
Matt Jacob [Thu, 16 Nov 2006 23:47:16 +0000 (23:47 +0000)]
Finally fix local command responses to set residual correctly.
This allows us to play nicely on SANs when we have target mode
enabled in f/w but have neither the scsi_targbh enabled or
scsi_targ with a target enabled.
Ken Smith [Thu, 16 Nov 2006 23:09:35 +0000 (23:09 +0000)]
Move the documentation to its own separate disc to make room for more
packages on disc2. This will also let users decide if they want to
have a CD of the docs at all - unless they're disconnected from the
net they will probably find the Web site more useful.
Mohan Srinivasan [Thu, 16 Nov 2006 23:03:46 +0000 (23:03 +0000)]
vfs_hash_insert() vputs() the losing vnode before returning, in the event of
a race where a duplicate vnode is entered into the vfs hash. nfs_nget() shouldn't
be releasing the vnode in that case.
Mohan Srinivasan [Thu, 16 Nov 2006 23:02:37 +0000 (23:02 +0000)]
Fix to readdir+ reply handling. When inserting an entry into the namecache,
initialize the nfsnode's ctime. Otherwise a subsequent lookup purges the
just entered namecache entry.
Kip Macy [Thu, 16 Nov 2006 07:50:33 +0000 (07:50 +0000)]
Resize the hash table upwards if the number of collision entries is greater than 1/4 of the
total
it is possible that the resize threshold should be revised upwards
Approved by: scottl (standing in for mentor rwatson)
Scott Long [Thu, 16 Nov 2006 06:28:54 +0000 (06:28 +0000)]
Due to an incorrect macro, it appears that this driver has always been
accidentally truncating off the VLAN tag field in the TX descriptor. Fix
this by splitting up the vlan_tag and flags fields into separate fields,
and handling them appropriately.
John Polstra [Thu, 16 Nov 2006 04:04:07 +0000 (04:04 +0000)]
In bce_start_locked, check the used_tx_bd count rather than the
descriptor's mbuf pointer to see if the transmit ring is full. The
mbuf pointer is set only in the last descriptor of a
multi-descriptor packet. By relying on the mbuf pointers of the
earlier descriptors, the driver would sometimes overwrite a
descriptor belonging to a packet that wasn't completed yet. Also,
tx_chain_prod wasn't updated inside the loop, causing the wrong
descriptor to be checked after the first iteration. The upshot of
all this was the loss of some transmitted packets at medium to high
packet rates.
In bce_tx_encap, remove a couple of old statements that shuffled
around the tx_mbuf_map pointers. These now correspond 1-to-1 with
the transmit descriptors, and they are not supposed to be changed.
Change sleepq_add(9) argument from 'struct mtx *' to 'struct lock_object *',
which allows to use it with different kinds of locks. For example it allows
to implement Solaris conditions variables which will be used in ZFS port on
top of sx(9) locks.
John Baldwin [Wed, 15 Nov 2006 20:44:07 +0000 (20:44 +0000)]
Adjust assertions to allow for magical properties of the 'lbolt' wait
channel for tsleep():
- Allow tsleep() on &lbolt without Giant with a timeout 0 since &lbolt has
an implied timeout.
- If &lbolt is used with msleep() pass NULL to sleepq_add() for the lock
object. Unlike other sleepq channels, &lbolt doesn't have an associated
owning lock.
John Baldwin [Wed, 15 Nov 2006 20:04:57 +0000 (20:04 +0000)]
Add MSI support to em(4), bce(4), and mpt(4). For now, we only support
devices that support a maximum of 1 message, and we use that 1 message
instead of the INTx rid 0 IRQ with the same interrupt handler, etc.
Robert Watson [Wed, 15 Nov 2006 12:43:45 +0000 (12:43 +0000)]
Add a short regression test to try to exercise races in the non-atomic
nature of implied connect via sendto(). Oddly, uipc_usrreq.c implements
this for stream sockets, but doesn't set the flag in its protocol
definition so that it can actually be used. As such, the stream test is
implemented but doesn't run for now.
Group pid and parent are shared in a case of CLONE_THREAD not CLONE_VM.
This fix lets clone02 LTP test pass with 2.6 emulation. In reality 99%
of the cases are that CLONE_VM and CLONE_THREAD are both set so it
seemed to work.
In rev 1.188 of linux_misc.c the added check for valid options ommited
__WCLONE. This fixes it thus fixing skype/teamspeak to not keep zombies
after exit.
Submitted by: rdivacky
Reported by: Bakul Shah (bakul at bitblocks com)
Tim Kientzle [Wed, 15 Nov 2006 05:33:38 +0000 (05:33 +0000)]
Add archive_write_open_filename()/archive_read_open_filename() as
synonyms for archive_write_open_file()/archive_read_open_file().
The new names are much clearer.
Tim Kientzle [Wed, 15 Nov 2006 05:14:20 +0000 (05:14 +0000)]
Change the internal API for writing data to an entry; make the
internal format-specific functions return the same as the public
function, so that the public API layer doesn't have to guess the
correct return value. This addresses an obscure problem that occurs
when someone tries to write more data than the size of the entry (as
indicated in the entry header). In this case, the return value from
archive_write_data() was incorrect, reflecting the requested write
rather than the amount actually written.
Doug Ambrisko [Tue, 14 Nov 2006 16:48:00 +0000 (16:48 +0000)]
- Add in FreeBSD native ioctl that models the Linux version.
- Add a translation so the Linux ioctl's don't conflict with
the FreeBSD definition.
- Assume Linux 32bit emulation on amd64.
This was tested on i386 and amd64 with the 32bit Linux MegaCli.
Eventually we should do a 32bit native FreeBSD translation app.
Matt Jacob [Tue, 14 Nov 2006 08:45:48 +0000 (08:45 +0000)]
Push things closer to path failover by implementing loop down and
gone device timers and zombie state entries. There are tunables
that can be used to select a number of parameters.
loop_down_limit - how long to wait for loop to come back up before
declaring
all devices dead (default 300 seconds)
gone_device_time- how long to wait for a device that has appeared
to leave the loop or fabric to reappear (default 30 seconds)
Internal tunables include (which should be externalized):
quick_boot_time- how long to wait when booting for loop to come up
change_is_bad- whether or not to accept devices with the same
WWNN/WWPN that reappear at a different PortID as being the 'same'
device.
Keen students of some of the subtle issues here will ask how
one can keep devices from being re-accepted at all (the answer
is to set a gone_device_time to zero- that effectively would
be the same thing).
John Baldwin [Mon, 13 Nov 2006 22:23:34 +0000 (22:23 +0000)]
MD support for PCI Message Signalled Interrupts on amd64 and i386:
- Add a new apic_alloc_vectors() method to the local APIC support code
to allocate N contiguous IDT vectors (aligned on a M >= N boundary).
This function is used to allocate IDT vectors for a group of MSI
messages.
- Add MSI and MSI-X PICs. The PIC code here provides methods to manage
edge-triggered MSI messages as x86 interrupt sources. In addition to
the PIC methods, msi.c also includes methods to allocate and release
MSI and MSI-X messages. For x86, we allow for up to 128 different
MSI IRQs starting at IRQ 256 (IRQs 0-15 are reserved for ISA IRQs,
16-254 for APIC PCI IRQs, and IRQ 255 is reserved).
- Add pcib_(alloc|release)_msi[x]() methods to the MD x86 PCI bridge
drivers to bubble the request up to the nexus driver.
- Add pcib_(alloc|release)_msi[x]() methods to the x86 nexus drivers that
ask the MSI PIC code to allocate resources and IDT vectors.
John Baldwin [Mon, 13 Nov 2006 21:47:30 +0000 (21:47 +0000)]
First cut at MI support for PCI Message Signalled Interrupts (MSI):
- Add 3 new functions to the pci_if interface along with suitable wrappers
to provide the device driver visible API:
- pci_alloc_msi(dev, int *count) backed by PCI_ALLOC_MSI(). '*count'
here is an in and out parameter. The driver stores the desired number
of messages in '*count' before calling the function. On success,
'*count' holds the number of messages allocated to the device. Also on
success, the driver can access the messages as SYS_RES_IRQ resources
starting at rid 1. Note that the legacy INTx interrupt resource will
not be available when using MSI. Note that this function will allocate
either MSI or MSI-X messages depending on the devices capabilities and
the 'hw.pci.enable_msix' and 'hw.pci.enable_msi' tunables. Also note
that the driver should activate the memory resource that holds the
MSI-X table and pending bit array (PBA) before calling this function
if the device supports MSI-X.
- pci_release_msi(dev) backed by PCI_RELEASE_MSI(). This function
releases the messages allocated for this device. All of the
SYS_RES_IRQ resources need to be released for this function to succeed.
- pci_msi_count(dev) backed by PCI_MSI_COUNT(). This function returns
the maximum number of MSI or MSI-X messages supported by this device.
MSI-X is preferred if present, but this function will honor the
'hw.pci.enable_msix' and 'hw.pci.enable_msi' tunables. This function
should return the largest value that pci_alloc_msi() can return
(assuming the MD code is able to allocate sufficient backing resources
for all of the messages).
- Add default implementations for these 3 methods to the pci_driver generic
PCI bus driver. (The various other PCI bus drivers such as for ACPI and
OFW will inherit these default implementations.) This default
implementation depends on 4 new pcib_if methods that bubble up through
the PCI bridges to the MD code to allocate IRQ values and perform any
needed MD setup code needed:
- PCIB_ALLOC_MSI() attempts to allocate a group of MSI messages.
- PCIB_RELEASE_MSI() releases a group of MSI messages.
- PCIB_ALLOC_MSIX() attempts to allocate a single MSI-X message.
- PCIB_RELEASE_MSIX() releases a single MSI-X message.
- Add default implementations for these 4 methods that just pass the
request up to the parent bus's parent bridge driver and use the
default implementation in the various MI PCI bridge drivers.
- Add MI functions for use by MD code when managing MSI and MSI-X
interrupts:
- pci_enable_msi(dev, address, data) programs the MSI capability address
and data registers for a group of MSI messages
- pci_enable_msix(dev, index, address, data) initializes a single MSI-X
message in the MSI-X table
- pci_mask_msix(dev, index) masks a single MSI-X message
- pci_unmask_msix(dev, index) unmasks a single MSI-X message
- pci_pending_msix(dev, index) returns true if the specified MSI-X
message is currently pending
- Save the MSI capability address and data registers in the pci_cfgreg
block in a PCI devices ivars and restore the values when a device is
resumed. Note that the MSI-X table is not currently restored during
resume.
- Add constants for MSI-X register offsets and fields.
- Record interesting data about any MSI-X capability blocks we come
across in the pci_cfgreg block in the ivars for PCI devices.
Tested on: em (i386, MSI), bce (amd64/i386, MSI), mpt (amd64, MSI-X)
Reviewed by: scottl, grehan, jfv
MFC after: 2 months
John Baldwin [Mon, 13 Nov 2006 21:14:54 +0000 (21:14 +0000)]
Various fixes:
- Remove an extra entry from the array for 0x0f prefixed instruction groups.
This fixes decoding of instructions where the second opcode >= 0x80.
- Add support for the 64-bit immediate mov instructions.
- When short_addr is enabled, don't parse the modr/m byte for a 16-bit
address, but as a 32-bit address.
- Support %rip relative addressing.
- Don't print a displacement of 0 if there is a base or index register.