kib [Fri, 18 Jan 2008 18:07:04 +0000 (18:07 +0000)]
In rev. 1.153, one place that converts a minor number to a unit
was missed. As a result, pty_create_slave() may index outside the bounds
of names[], creating wrong slave tty names.
Tested by: kensmith
Reviewed by: jhb
MFC after: 3 days
ambrisko [Fri, 18 Jan 2008 16:31:24 +0000 (16:31 +0000)]
First real attempt at proper locking. The locking is a little complicated
since the command and data being built to be sent to or read
from the HW live in the softc. Commands are later run via an_setdef etc.
In the ioctl path various references are kept to the data stored in
the softc, so it needs to be protected. Think of the command
in the softc as a global variable, since it essentially is one. Because
locking wasn't done in this type of context, the commands would get corrupted.
Thanks to avatar@ for catching some lock issues and dhw@ for testing.
Things are a lot more stable except for the MPI-350 cards. My an(4)
remote laptop stays on the network now.
The driver should be changed so that it uses private memory that is passed
to the functions that talk to the card. Then only those functions would
really need to grab locks.
rwatson [Fri, 18 Jan 2008 12:19:50 +0000 (12:19 +0000)]
In tcp_ctloutput(), don't hold the inpcb lock over sooptcopyin(), rather,
drop the lock and then re-acquire it, revalidating TCP connection state
assumptions when we do so. This avoids a potential lock order reversal
(and potential deadlock, although none have been reported) due to the
inpcb lock being held over a page fault.
kib [Fri, 18 Jan 2008 12:09:54 +0000 (12:09 +0000)]
udf_vget() shall vgone() the vnode when the file_entry cannot be allocated
or read from the volume. Otherwise, a half-constructed vnode could be found
later and cause a panic when accessed.
yongari [Fri, 18 Jan 2008 08:32:08 +0000 (08:32 +0000)]
Use m_collapse(9) to collapse mbuf chains instead of relying on
the shortest possible chain of mbufs from m_defrag(9). What we want is
chains of mbufs that can be safely stored to a Tx descriptor which
can have up to STGE_MAXTXSEGS mbufs. The ethernet controller does
not need Tx buffers aligned on a 32-bit boundary, so the use of
m_defrag(9) was a waste of time.
kientzle [Fri, 18 Jan 2008 05:48:50 +0000 (05:48 +0000)]
The previous commit caused the archive_write_disk interface to
start obeying filesize limits; this test wasn't properly setting
file sizes before trying to write file data.
kientzle [Fri, 18 Jan 2008 05:05:58 +0000 (05:05 +0000)]
Issues with hardlinks in newc-format files prompted me to
write a new test to exercise the hardlink strategies used
by different archive formats (tar, old cpio, new cpio).
This uncovered two problems, both fixed by this commit:
1) Enforce file size when writing files to disk.
2) When restoring hardlink entries, if they have data associated, go
ahead and open the file so we can write the data.
In particular, this fixes bsdtar/bsdcpio extraction of new cpio
formats where the "original" is empty and the subsequent "hardlink"
entry actually carries the data. It also provides correct behavior
for old cpio archives where hardlinked entries have their bodies
stored multiple times in the archive; the last body should always be
the one that ends up in the final file. The new pax format also
permits (but does not require) hardlinks to carry file data; again,
the last contents should always win.
Note that with any of these, a size of zero on a hardlink simply means
that the hardlink carries no data; it does not mean that the file has
zero size. A non-zero size on a hardlink does provide the file size.
Thanks to: John Baldwin, for reminding me about this long-standing bug
and sending me a simple example archive that prompted this test case
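The hardlink size rule described above can be sketched roughly as follows. This is a minimal model, not libarchive code: struct restored_file, apply_hardlink_entry() and the body_id field are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a file being restored from an archive. */
struct restored_file {
	int64_t	size;		/* current size on disk */
	int	body_id;	/* which archive entry's body the file holds */
};

/*
 * Apply one hardlink entry to an already-created file.
 * A size of zero means "this entry carries no data", not "truncate".
 */
static void
apply_hardlink_entry(struct restored_file *f, int64_t entry_size, int entry_id)
{
	assert(entry_size >= 0);
	if (entry_size == 0)
		return;			/* link only; keep existing contents */
	f->size = entry_size;		/* a non-zero size is the file size */
	f->body_id = entry_id;		/* the last body written wins */
}
```

With this model, an empty "original" followed by a hardlink entry of 100 bytes ends up as a 100-byte file, and a later zero-size hardlink leaves it untouched.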
thompsa [Fri, 18 Jan 2008 00:19:10 +0000 (00:19 +0000)]
IEEE 802.1D-2004 states that frames containing any of the group MAC Addresses
specified in Table 7-10 in their destination address field shall not be relayed
by the Bridge. Add a check in bridge_forward() to adhere to this.
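A check of this kind might look like the sketch below. It assumes that Table 7-10 covers the range 01:80:C2:00:00:00 through 01:80:C2:00:00:0F; the helper name is_reserved_group_addr() is hypothetical and is not taken from the bridge code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define	ETHER_ADDR_LEN	6

/*
 * Assumption: the reserved group addresses are 01:80:C2:00:00:00
 * through 01:80:C2:00:00:0F. The function name is hypothetical.
 */
static bool
is_reserved_group_addr(const uint8_t dst[ETHER_ADDR_LEN])
{
	static const uint8_t prefix[5] = { 0x01, 0x80, 0xc2, 0x00, 0x00 };

	/* The first five octets must match; the last must be 0x00..0x0f. */
	return (memcmp(dst, prefix, sizeof(prefix)) == 0 && dst[5] <= 0x0f);
}
```

A forwarding path would drop the frame when this returns true for the destination address instead of relaying it.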
jhb [Thu, 17 Jan 2008 23:37:47 +0000 (23:37 +0000)]
- Retire npe_defrag(), gem_defrag(), msk_defrag(), nfe_defrag(), and
re_defrag() and use m_collapse() instead.
- Replace a reference to ath_defrag() in a comment in if_wpi.c with
m_collapse().
jhb [Thu, 17 Jan 2008 21:43:12 +0000 (21:43 +0000)]
Add a new 'add-kld <kld>' command to kgdb to make it easier to analyze
crash dumps with kernel modules. The command is basically a wrapper
around add-symbol-file except that it uses the kernel linker data
structures and the ELF section headers of the kld to calculate the
section addresses add-symbol-file needs.
The 'kld' parameter may either be an absolute path or a relative path.
kgdb looks for the kld in several locations checking for variants with
".symbols" or ".debug" suffixes in each location. The first location it
tries is just opening the specified path (this handles absolute paths and
looks for the kld relative to the current directory otherwise). Next
it tries to find the module in the same directory of the kernel image
being used. If that fails it extracts the kern.module_path from the
kernel being debugged and looks in each of those paths.
The upshot is that for the common cases of debugging /boot/kernel/kernel
where the module is in either /boot/kernel or /boot/modules one can merely
do 'add-kld foo.ko'.
kmacy [Thu, 17 Jan 2008 21:25:58 +0000 (21:25 +0000)]
- remove bogus_imm counter
- disable pcpu cluster cache by default until reference counting is handled
correctly for held clusters - can be re-enabled by sysctl
alc [Thu, 17 Jan 2008 18:25:52 +0000 (18:25 +0000)]
Retire PMAP_DIAGNOSTIC. Any useful diagnostics that were conditionally
compiled under PMAP_DIAGNOSTIC are now KASSERT()s. (Note: The kernel
option DIAGNOSTIC still disables inlining of certain pmap functions.)
Eliminate dead code from pmap_enter(). This code implemented an assertion.
On i386, an equivalent check is already implemented. However, on amd64,
a small change is required to implement an equivalent check.
Eliminate \n from a nearby panic string.
Use KASSERT() to reimplement pmap_copy()'s two assertions.
bde [Thu, 17 Jan 2008 17:02:11 +0000 (17:02 +0000)]
Add a macro STRICT_ASSIGN() to help avoid the compiler bug that
assignments and casts don't clip extra precision, if any. The
implementation is to assign to a temporary volatile variable and read
the result back to assign to the original lvalue.
lib/msun currently has 2 different hard-coded hacks to avoid the problem
in just a few places and needs it in a few more places. One variant
uses volatile for the original lvalue. This works but is slower than
necessary. Another temporarily casts the lvalue to volatile. This
broke with gcc-4.2.1 or earlier (gcc now stores to the lvalue but
doesn't load from it).
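The idea behind STRICT_ASSIGN() can be sketched as below. This is an illustration of the technique described above (store through a volatile temporary, then read back); the exact macro in lib/msun may differ, and add_clipped() is a hypothetical helper.

```c
#include <assert.h>

/*
 * Sketch only: force the value through a volatile temporary of the
 * target type so that any extra precision is discarded, then assign
 * the clipped value to the real lvalue.
 */
#define	STRICT_ASSIGN(type, lval, rval) do {	\
	volatile type strict_tmp_ = (rval);	\
	(lval) = strict_tmp_;			\
} while (0)

/* Hypothetical helper: add two floats and clip the sum to float. */
static float
add_clipped(float a, float b)
{
	float sum;

	STRICT_ASSIGN(float, sum, a + b);
	return (sum);
}
```

Only the temporary is volatile, so the original lvalue can still live in a register, which is the point of avoiding the "make the lvalue volatile" variant.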
bde [Thu, 17 Jan 2008 16:39:07 +0000 (16:39 +0000)]
Add an alternative view of the bits in an 80-bit long double (64+16
instead of 32+32+15+1) on all arches that have such long doubles (amd64,
ia64 and i386). Large objects should be accessed in large units,
and the 32+32+15+1[+padding] decomposition asks for almost the opposite
of that, sometimes resulting in very slow accesses depending on how
well the compiler ignores what we ask for and converts to the best
units for the given machine. E.g., on Athlons, there is a 10-20 cycle
penalty for accessing the middle 32-bit word immediately after an
80-bit store.
Whether actually using the alternative view is better is very machine-
dependent. A 32+32+16 view is probably best with old 32-bit systems
and gcc through 4.2.1. The compiler should mostly avoid the view and
generate best accesses, but gcc-4.2.1 is far from doing that. I think
64+16 is best for now. Similarly for doubles -- they should be using
64+0 especially on 64-bit machines, but fdlibm uses 32+32 extensively
for them. Fortunately, in 64-bit mode for doubles, gcc already ignores
the 32+32-bit view and generates best accesses in many cases.
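A 64+16 view could be sketched as a union like the one below. This assumes a little-endian machine with the x87 80-bit format (LDBL_MANT_DIG == 64); the union and field names are illustrative and simplified compared to the real header.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

/*
 * Assumption: little-endian x87 extended format. The significand
 * occupies the low 64 bits and the sign plus 15-bit exponent the
 * next 16 bits; any remaining bytes are padding.
 */
union ld80_view {
	long double	e;
	struct {
		uint64_t	man;		/* 64-bit significand */
		uint16_t	expsign;	/* sign bit + 15-bit exponent */
	} xbits;
};

/* Fetch sign and exponent with a single 16-bit access. */
static uint16_t
ld80_expsign(long double x)
{
	union ld80_view u;

	u.e = x;
	return (u.xbits.expsign);
}

/* Fetch the whole significand with a single 64-bit access. */
static uint64_t
ld80_man(long double x)
{
	union ld80_view u;

	u.e = x;
	return (u.xbits.man);
}
```

On such a machine, 1.0L has an exponent field of 0x3fff and only the explicit integer bit set in the significand.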
bde [Thu, 17 Jan 2008 13:12:46 +0000 (13:12 +0000)]
Translate from the i386. All FP constants and operations are evaluated
in the range and precision of their type(s) on amd64, but FLT_EVAL_METHOD
said that they were evaluated in the "interesting" (buggy) i387 methods.
float_t was broken compatibly with FLT_EVAL_METHOD.
These definitions seem to be broken on powerpc and possibly on arm.
float_t is float on powerpc with gcc [-notraditional] according to
glibc, and FLT_EVAL_METHOD is marked with XXX on arm.
jhb [Wed, 16 Jan 2008 18:47:07 +0000 (18:47 +0000)]
Add a header containing constants for the various HPET registers and their
fields and update the code to match. The PR served more as an inspiration
than providing the actual diffs.
remko [Wed, 16 Jan 2008 13:54:40 +0000 (13:54 +0000)]
Don't accidentally remove a filesocket which is still in use. This causes
problems when the DRM driver is loaded and the AIXGL extension is loaded:
the AIXGL driver requests a drm_close, which causes the radeon
driver to fail while starting X windows.
PR: kern/114688
Submitted by: vehemens <vehemens at verizon dot net>
Prodded by: Robert Noland
Approved by: imp (mentor, a while ago already), anholt
MFC After: 1 week
keramida [Wed, 16 Jan 2008 06:59:22 +0000 (06:59 +0000)]
Document that loader(8) stops reading `loader.conf' when it
encounters a syntax error, and add a tip about adding first
the `vital' options and then experimental ones.
PR: docs/119658
Submitted by: Julian Stacey, jhs at berklix.org
njl [Wed, 16 Jan 2008 01:05:21 +0000 (01:05 +0000)]
Remove duplicate cpufreq levels, i.e. ones that are within 25 MHz of each
other. The first one survives, the rest are removed. So far, it appears
only some acpi_perf(4) BIOS tables have these invalid states, but address
this in the core to be sure to handle other potential driver data.
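The duplicate removal can be sketched as follows. This assumes the levels are sorted by frequency and reads "within 25 MHz" as a difference of less than 25; dedup_levels() and the threshold macro name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define	FREQ_DUP_THRESHOLD_MHZ	25	/* hypothetical name; value from the text */

/*
 * Drop every level that is within the threshold of the last level kept.
 * Assumes freqs[] is sorted. Compacts the array in place and returns
 * the new number of levels.
 */
static size_t
dedup_levels(int *freqs, size_t n)
{
	size_t i, kept;
	int diff;

	if (n == 0)
		return (0);
	kept = 1;			/* the first level always survives */
	for (i = 1; i < n; i++) {
		diff = freqs[kept - 1] - freqs[i];
		if (diff < 0)
			diff = -diff;
		if (diff < FREQ_DUP_THRESHOLD_MHZ)
			continue;	/* duplicate of the last kept level */
		freqs[kept++] = freqs[i];
	}
	return (kept);
}
```

For example, the levels 2000, 1990, 1800, 1790, 1780, 1000 reduce to 2000, 1800, 1000.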
kmacy [Wed, 16 Jan 2008 00:28:30 +0000 (00:28 +0000)]
Fix mbuf leak caused by freeing packet zone clusters but not their associated mbufs
- Track packet zone mbufs separately from other mbufs
- free packet zone buffers via m_free rather than trying to manage the refcount
as with clusters - its refcount and management seems to be "special"
jhb [Tue, 15 Jan 2008 21:40:46 +0000 (21:40 +0000)]
Don't cache the new-bus name of a PCI device in the PCI conf structure,
but reread it from the device_t every time the device list is fetched.
Previously the device name in pciconf -l would not be updated when a driver
was unloaded or if a device was detached and attached to a different
driver.
gallatin [Tue, 15 Jan 2008 20:34:49 +0000 (20:34 +0000)]
Add optional support to mxge for MSI-X interrupts and multiple receive
queues (which we call slices). The NIC will steer traffic into up to
hw.mxge.max_slices different receive rings based on a configurable
hash type (hw.mxge.rss_hash_type).
Currently the driver defaults to using a single slice, so the default
behavior is unchanged. Also, transmit from non-zero slices is
disabled currently.
jhb [Tue, 15 Jan 2008 18:50:47 +0000 (18:50 +0000)]
Fix a few minor issues based on a bug report and reading over the HPET
spec:
- Use read/modify/write cycles to enable and disable the HPET instead of
writing 0 to reserved bits.
- Shutdown the HPET during suspend as encouraged by the spec.
- Fail to attach to an HPET with a period of zero.
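The read/modify/write change can be illustrated with the sketch below. It models the general configuration register as a plain memory word and assumes the enable bit is bit 0; the macro and function names are illustrative, not the driver's.

```c
#include <assert.h>
#include <stdint.h>

#define	HPET_CNF_ENABLE	0x00000001u	/* assumed: enable is bit 0 */

/* Enable the counter without disturbing any other (reserved) bits. */
static void
hpet_enable(volatile uint32_t *config_reg)
{
	uint32_t val;

	val = *config_reg;		/* read */
	val |= HPET_CNF_ENABLE;		/* modify only the enable bit */
	*config_reg = val;		/* write */
}

/* Disable the counter, again preserving every other bit. */
static void
hpet_disable(volatile uint32_t *config_reg)
{
	uint32_t val;

	val = *config_reg;
	val &= ~HPET_CNF_ENABLE;
	*config_reg = val;
}
```

Writing a constant 0 or 1 to the register instead would clear whatever the firmware left in the reserved bits, which is what the first item avoids.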
kientzle [Tue, 15 Jan 2008 16:27:15 +0000 (16:27 +0000)]
Handle Zip archives that are "multi-part archives with only
one part" by simply ignoring the marker at the beginning
of the file. (Zip archivers reserve four bytes at the beginning
of each part of a multi-part archive; if the archive happens to
require only one part, those four bytes get filled with a placeholder
that can be ignored.)
Thanks to: Marius Nuennerich,
for pointing me to a Zip archive that libarchive couldn't handle
MFC after: 7 days
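Skipping the marker might look like the sketch below. It assumes the single-part placeholder is the four bytes "PK00"; the function name is hypothetical and this is not the libarchive implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the number of leading bytes to skip before the first real
 * Zip header: 4 if the buffer starts with the assumed single-part
 * placeholder "PK00", otherwise 0.
 */
static size_t
zip_skip_single_part_marker(const unsigned char *buf, size_t len)
{
	if (len >= 4 && memcmp(buf, "PK00", 4) == 0)
		return (4);
	return (0);
}
```

A reader would advance by the returned count and then parse the local file header as usual.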
jhb [Tue, 15 Jan 2008 15:36:23 +0000 (15:36 +0000)]
Put back the openpty(3) and ptsname(3) fixes but don't disable ptsname(3)
on pts(4) devices this time. This fixes the issues while leaving pts(4)
enabled on HEAD.
gallatin [Tue, 15 Jan 2008 13:29:32 +0000 (13:29 +0000)]
Update to firmware version 1.4.29 from 1.4.25. Relevant changes include:
- Fix a bug introduced in 1.4.20 where a speculative read by the processor in the
write-only doorbell region would cause a target-abort (as opposed to simply
returning random data). This could manifest itself as NMI or machine freeze
depending on how the BIOS/OS/chipset configuration handles target-abort.
- Add support for new revisions of -R cards (with AEL1002/AEL1010 xaui->xfi)
- Increase an internal timing (dispatch engine): fix possible spurious reset
(seen on very few cards).
jeff [Tue, 15 Jan 2008 09:03:09 +0000 (09:03 +0000)]
- When executing the 'tryself' branch in sched_pickcpu() look at the
lowest priority on the queue for the current cpu vs curthread's
priority. In the case that curthread is waking up many threads of a
lower priority as would happen with a turnstile_broadcast() or wakeup()
of many threads this prevents them from all ending up on the current cpu.
- In sched_add() make the relationship between a scheduled ithread and
the current cpu advisory rather than strict. Only give the ithread
affinity for the current cpu if it's actually being scheduled from
a hardware interrupt. This prevents it from migrating when it simply
blocks on a lock.
kmacy [Tue, 15 Jan 2008 08:08:09 +0000 (08:08 +0000)]
- Simplify mb_free_ext_fast
- increase asserts for mbuf accounting
- track outstanding mbufs (maps very closely to leaked)
- actually only create one thread per port if !multiq
Oddly enough this fixes the use after free
- move txq_segs to stack in t3_encap
- add checks that pidx doesn't move past cidx
- simplify mbuf free logic in collapse mbufs routine
das [Tue, 15 Jan 2008 07:40:30 +0000 (07:40 +0000)]
Fix some bugs in wall(1):
- Handle wrapping correctly when \r appears in the input, and don't
remove the \r from the output.
- For lines longer than 79 characters, don't drop every 80th character.
- Style: Braces around compound while statement.
mpp [Tue, 15 Jan 2008 06:33:20 +0000 (06:33 +0000)]
Quotacheck may possibly skip quota accounting for up to 2 files
on a filesystem if the quota data files reside on a different
filesystem (e.g. the userquota=/somepath,groupquota=/somepath2
options are specified in /etc/fstab to place the quota files
somewhere other than the default location).
Fix quotacheck to only skip accounting if the quota data file
actually resides on the filesystem being checked.
kmacy [Tue, 15 Jan 2008 03:27:42 +0000 (03:27 +0000)]
- move WR_LEN into cxgb_adapter.h and add PIO_LEN to make intent clearer
- move cxgb_tx_common into cxgb_multiq.c and rename to cxgb_tx
- move cxgb_tx_common dependencies
- further simplify cxgb_dequeue_packet for the non-multiqueue case
- only launch one service thread per port in the non-multiq case
- remove dead cleaning code from cxgb_sge.c
- simplify PIO case substantially by returning directly from mbuf collapse
and just using m_copydata
- remove gratuitous m_gethdr in the rx path
- clarify freeing of mbufs in collapse
yongari [Tue, 15 Jan 2008 01:10:31 +0000 (01:10 +0000)]
Overhaul re(4).
o Increased number of Rx/Tx descriptors to 256 for 8169 GigEs
because it's hard to push the hardware to the limit with default
64 descriptors.
TSO requires large number of Tx descriptors to pass a full sized
TCP segment(65535 bytes IP packet) to hardware. Previously it
consumed 32 Tx descriptors, assuming MCLBYTES DMA segment size,
to send the TCP segment which means re(4) couldn't queue more
than two full sized IP packets.
For 8139C+ it still uses 64 Rx/Tx descriptors due to its hardware
limitations. With these changes there is a (very) small waste of
memory for 8139C+ users but I don't think it would affect 8139C+
users in most cases.
o Various bus_dma(9) fixes.
- The hardware supports DAC so allow 64bit DMA operations.
- Removed BUS_DMA_ALLOC_NOW flag.
- Increased DMA segment size to 4096 from MCLBYTES because TSO
consumes too many descriptors with MCLBYTES DMA segment size.
- Tx/Rx side bus_dmamap_load_mbuf_sg(9) support. With these
changes the code is more readable than previous one and got a
(slightly) better performance as it doesn't need to pass/
decode arguments to/from callback function.
- Removed unnecessary callback function re_dmamap_desc() and
nuked rl_dmaload_arg structure which was used in the callback.
- Additional protection for DMA map load failure. In case of
failure reuse current map instead of returning a bogus DMA
map.
- Deferred DMA map unloading/sync operation for maximum
performance until we really need to load new DMA map. If we
happen to reuse current map(e.g. input error) there is no need
to sync/unload/load again.
- The number of allowable Tx DMA segments for an mbuf chain is
now 32 instead of a magic nseg value. If too few Tx descriptors
are available to send a highly fragmented mbuf chain, an
optimized re_defrag() is called to collapse the mbuf
chain, which is supposed to be much faster than m_defrag(9).
re_defrag() was borrowed from ath(4).
- Separated Rx/Tx DMA tag from a common DMA tag such that Rx DMA
tag correctly uses DMA maps that were created with DMA alignment
restriction(8bytes alignments). Tx DMA tag does not have such
alignment limitation.
- Added additional sanity checks for DMA ring map load failure.
- Added additional spare Rx DMA map for graceful handling of Rx
DMA map load failure.
- Fixed misused bus_dmamap_sync(9) and added missing
bus_dmamap_sync(9) in re_encap()/re_txeof()/re_rxeof().
o Enabled TSO again as re(4) now has a reasonable number of Tx
descriptors.
o Don't touch DMA address of a Tx descriptor in re_txeof(). It's
not needed.
o Fix incorrect update of if_ierrors counter. For Rx buffer
shortage it should update if_qdrops as the buffer is reused.
o Added checks for unsupported H/W revisions and return ENXIO for
such hardware. This is required to remove resource allocation
code in re_probe as other drivers do in device probe routine.
o Modified descriptor index manipulation macros as it's now possible
to have different number of descriptors for Rx/Tx.
o In re_start, to save a lock operation, use IFQ_DRV_IS_EMPTY before
trying to invoke IFQ_DRV_DEQUEUE. Also don't blindly call re_encap
since we already know the number of available Tx descriptors in
advance.
o Removed RL_TX_DESC_THLD which was used to reserve RL_TX_DESC_THLD
descriptors in Tx path. There is no such limitation mentioned in the
8139C+/8169/8110/8168/8101/8111 datasheet and it seems to work ok
without reserving RL_TX_DESC_THLD descriptors.
o Fix a comment for RL_GTXSTART. The register is an 8-bit register.
o Added comments for 8169/8139C+ hardware restrictions on descriptors.
o Removed forward declaration for "struct rl_softc", it's not needed.
o Added a new structure rl_txdesc for Tx descriptor managements and
a structure rl_rxdesc for Rx descriptor managements.
o Removed unused member variable rl_intlock in driver softc. There are
still several unused member variables which are supposed to be used
to access hardware statistics counters. But it seems that accessing
hardware counters was not implemented yet.
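The descriptor arithmetic behind the TSO and DMA segment size items above can be checked with a small sketch. It assumes MCLBYTES is 2048 and that every DMA segment is filled completely (the best case); the function names are hypothetical.

```c
#include <assert.h>

#define	MCLBYTES_ASSUMED	2048u	/* assumption: usual cluster size */
#define	TSO_MAX_LEN		65535u	/* full-sized IP packet, from the text */

/* Tx descriptors needed for one packet, one descriptor per DMA segment. */
static unsigned
segs_needed(unsigned pkt_len, unsigned seg_size)
{
	assert(seg_size > 0);
	return ((pkt_len + seg_size - 1) / seg_size);	/* round up */
}

/* How many such packets fit in a ring of ndesc descriptors. */
static unsigned
packets_queueable(unsigned ndesc, unsigned pkt_len, unsigned seg_size)
{
	return (ndesc / segs_needed(pkt_len, seg_size));
}
```

Under these assumptions a full-sized segment needs 32 descriptors with MCLBYTES segments, so a 64-entry ring holds two of them; with 4096-byte segments it needs 16, and a 256-entry ring holds 16.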
jhb [Mon, 14 Jan 2008 23:49:56 +0000 (23:49 +0000)]
Update the manpage for openpty(3) to account for the recent fixes.
Specifically, remove the BUGS section and note that openpty(3) now always
does the various security-related steps. Also, update the error return
value section. The PR below is for the original bug rather than the doc
updates.
peter [Mon, 14 Jan 2008 22:53:01 +0000 (22:53 +0000)]
Update the KVA_PAGES comments for the effect that PAE has on it. It
becomes a unit size of 2MB instead of 4MB and must be a multiple of 8 to
get a valid KERNBASE.
alc [Mon, 14 Jan 2008 21:25:06 +0000 (21:25 +0000)]
Make pmap_is_prefaultable() more TLB friendly. Specifically, make it use
the kernel's direct map instead of the pmap's recursive mapping to access
the lowest level in the page table. The direct map is preferable for two
reasons: (1) The TLB is more likely to hold the required direct mapping
because pmap_enter() has already used the direct map to access a nearby
PTE and (2) loading a direct mapping into the TLB involves walking only 2
or 3 levels of the page table instead of 4.
das [Mon, 14 Jan 2008 09:21:34 +0000 (09:21 +0000)]
Changing 'r' to a size_t in the previous commit turned quicksort
into slowsort for some sequences because different parts of the
code used 'r' to store two different things, one of which was
signed. Clean things up by splitting 'r' into two variables, and
use a more meaningful name.
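The signedness problem can be reproduced in isolation with the sketch below. It is not the qsort code; cmp_int() and the two helpers are hypothetical and only show what happens when a comparison result is stored in a size_t.

```c
#include <assert.h>
#include <stddef.h>

/* Ordinary three-way comparison returning -1, 0 or 1. */
static int
cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a;
	int y = *(const int *)b;

	return ((x > y) - (x < y));
}

/* Buggy pattern: a negative result becomes a huge unsigned value. */
static int
le_with_size_t(int a, int b)
{
	size_t r = (size_t)cmp_int(&a, &b);

	return (r <= 0);	/* true only when the result is exactly 0 */
}

/* Fixed pattern: keep the comparison result in a signed variable. */
static int
le_with_int(int a, int b)
{
	int cmp_result = cmp_int(&a, &b);

	return (cmp_result <= 0);
}
```

With the unsigned variable, "less than" is silently treated as "greater than", so partitioning decisions go the wrong way for some inputs even though the result is still sorted eventually.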
yongari [Mon, 14 Jan 2008 07:16:48 +0000 (07:16 +0000)]
Implement WOL capability.
- Turn on WOL bits in suspend/shutdown method.
- WOL is disabled in resume routine as WOL can interfere with normal
Rx operation.
- Move stge_reset() to stge_init_locked() as resetting hardware
clears configured Rx information which in turn results in
non-working Rx module after suspend/shutdown operation.
das [Mon, 14 Jan 2008 02:18:00 +0000 (02:18 +0000)]
Tests for lrintl() and llrintl(). I didn't add anything specially
tailored for the long double format; instead, I just modified the existing
tests to test lrintl() and llrintl() as well.
kientzle [Sun, 13 Jan 2008 23:50:30 +0000 (23:50 +0000)]
Since the tar bidder can never get called more than once, it
doesn't need to compensate for this situation.
While here, fix a minor longstanding bug that empty tar archives
(which begin with at least 512 zero bytes) never properly reported
their format. In particular, this fixes the output of:
bsdtar tvvf /dev/zero
And, of course, a new test to verify that libarchive correctly
recognizes the format of such files.