Implements gjournal support. If file system has gjournal support enabled
and -p flag was given perform fast file system checking (bascially only
garbage collecting of orphaned objects).
Rename bread() to blread() and bwrite() to blwrite() as we now link to
the libufs library, which also implement functions with that names.
Add -J flag to both newfs(8) and tunefs(8) which allows to enable gjournal
support.
I left -j flag for UFS journal implementation which we may gain at some
point.
Add gjournal specific code to the UFS file system:
- Add FS_GJOURNAL flag which enables gjournal support on a file system.
- Add cg_unrefs field to the cylinder group structure which holds
number of unreferenced (orphaned) inodes in the given cylinder group.
- Add fs_unrefs field to the super block structure which holds
total number of unreferenced (orphaned) inodes.
- When file or a directory is orphaned (last reference is removed, but
object is still open), increase fs_unrefs and cg_unrefs fields,
which is a hint for fsck in which cylinder groups looks for such
(orphaned) objects.
- When file is last closed, decrease {fs,cg}_unrefs fields.
- Add VV_DELETED vnode flag which points at orphaned objects.
Add MNT_GJOURNAL flag which indicates, that file system has gjournal
support enabled.
Add mnt_gjprovider field which keeps gjournal provider's name on which
file system is placed on. This allows to not place file system on gjournal
directly and allows gjournal class to pair gjournal provider with file
system.
Add gjournal GEOM class (kernel side), which implements block level
journaling and can be tought about marking file system as clean before
doing journal switch, which easly allows to add journaling to file
systems that don't have this feature.
Add a new I/O request - BIO_FLUSH, which basically tells providers below to
flush their caches. For now will mostly be used by disks to flush their
write cache.
Mohan Srinivasan [Tue, 31 Oct 2006 20:25:37 +0000 (20:25 +0000)]
Make EWOULDBLOCK a recoverable error so that the request is retransmitted.
This bug results in data corruption with NFS/TCP. Writes are silently dropped
on EWOULDBLOCK (because socket send buffer is full and sockbuf timer fires).
John Baldwin [Tue, 31 Oct 2006 17:21:14 +0000 (17:21 +0000)]
Allocate receive and transmit data structures during attach() and free them
during detach() similar to other NIC drivers rather than allocating them
during init() and freeing them during stop():
- Move creation of tx bus_dma tag amd maps and tx_buffer_area from
em_setup_transmit_structures() to em_allocate_transmit_structures().
- Call em_allocate_xxx_structures() in em_attach().
- Only call em_free_xxx_structures() in em_detach().
- Change em_setup_xxx_structures() to free any existing tx or rx buffers
and in the case of rx repopulate the ring with newer buffers.
John Baldwin [Tue, 31 Oct 2006 17:05:02 +0000 (17:05 +0000)]
- Use callout_init_mtx() to close various callout-related races.
- Drain the two timers in detach.
- Check IFF_DRV_RUNNING in the link task and bail w/o doing anything if
it is clear.
Gleb Smirnoff [Tue, 31 Oct 2006 16:19:21 +0000 (16:19 +0000)]
Rework the transmit register handling. In em_encap() store index of
the EOP descriptor in the first descriptor of the packet. And then
in em_txeof() search for DD bits set only in the EOP descriptors,
embedding the cleanup of all packet's descriptors into inner loop.
This change is important for future chips, where DD bit is going
to be set only on the EOP descriptors.
Gleb Smirnoff [Tue, 31 Oct 2006 15:00:14 +0000 (15:00 +0000)]
Merge new vendor release - 6.2.9.
Details:
o if_em.c changes:
- Added several new PCI ids.
- Check em_check_phy_reset_block() before doing SIOCSIFMEDIA ioctl.
- Don't touch TARC registers, they are now handled in shared
code in if_em_hw.c.
- Move RDH and RDT setting to the end of
em_initialize_receive_unit().
- Declare em_read_pcie_cap_reg(), now empty.
o if_em_hw.c dropped in from vendor, then restored rev. 1.15.
o if_em_hw.h dropped in from vendor, then modified:
- Added RX overrun interrupt flag to interrupt enable mask.
- Remove declarations of em_io_read(), em_io_write().
Hartmut Brandt [Tue, 31 Oct 2006 10:23:28 +0000 (10:23 +0000)]
Bind to INADDR_ANY in the default configuration. This makes bsnmpd(1)
automatically work on multi-homed hosts and without explicite specification
of the hostname in the config file.
Hartmut Brandt [Tue, 31 Oct 2006 10:09:10 +0000 (10:09 +0000)]
Define a base OID for the FreeBSD version as returned in sysObjectID
by bsnmpd(1). The actual OID is formed by appending the release numbers
to this base OID.
Hartmut Brandt [Tue, 31 Oct 2006 09:00:35 +0000 (09:00 +0000)]
Vendor patch: synthesize the initial value for sysObjectId from the value
of uname -r in FreeBSD. This value can be overwritten in the configuration
file.
Matt Jacob [Tue, 31 Oct 2006 05:53:29 +0000 (05:53 +0000)]
The first of 3 major steps to move the CAM layer forward to using
the CAM_NEW_TRAN_CODE that has been in the tree for some years now.
This first step consists solely of adding to or correcting
CAM_NEW_TRAN_CODE pieces in the kernel source tree such
that a both a GENERIC (at least on i386) and a LINT build
with CAM_NEW_TRAN_CODE as an option will compile correctly
and run (at least with some the h/w I have).
After a short settle time, the other pieces (making
CAM_NEW_TRAN_CODE the default and updating libcam
and camcontrol) will be brought in.
This will be an incompatible change in that the size of structures
related to XPT_PATH_INQ and XPT_{GET,SET}_TRAN_SETTINGS change
in both size and content. However, basic system operation and
basic system utilities work well enough with this change.
Reviewed by: freebsd-scsi and specific stakeholders
Xin LI [Tue, 31 Oct 2006 02:22:36 +0000 (02:22 +0000)]
Correct a security issue introduced in previous commit:
instead of removing the file and issue a warning about
the removal, do not do any operation at all in case -P
is specified when the dinode has hard links.
With -f and -P specified together, we assume that the
user wants rm to overwrite the contents of the file
and remove it (destroy the contents of file but leave
its hard links as is).
The reason of doing it this way is that, in case where
a hard link is created by a malicious user (currently
this is permitted even if the user has no access to the
file). Losing the link can potentially mean that the
actual owner would lose control completely to the user
who wants to obtain access in a future day.
Markus Brueffer [Tue, 31 Oct 2006 00:26:58 +0000 (00:26 +0000)]
- Add a 'verbose' switch -v
- Only dump items that are being used for padding when being verbose. This
brings bthidcontrol in line with the behaviour of usbhidctl(1).
- Update the manpage accordingly
Warner Losh [Mon, 30 Oct 2006 22:46:33 +0000 (22:46 +0000)]
Assign start to the value we were able to allocate and use that to
write out the BAR. Otherwise, we were trying to shift a 32-bit
quantity on 32-bit platforms. Also, 'start' check sanity to where it
is known.
Marius Strobl [Mon, 30 Oct 2006 21:50:11 +0000 (21:50 +0000)]
In the replacement text of the __bswapN_const() macros encapsulate the
argument in parentheses so these macros are safe to use and invocations
with an expression as the argument like __bswap32_const(42 << 23 | 13)
work as expected. Additionally, mask all the individually shifted bytes
as appropriate so the bytes which exceed the width of the respective
__bswapN_const() macro in invocations like __bswap16_const(0xdead600d)
are ignored like it's the case with the corresponding __bswapN_var()
function.
Julian Elischer [Mon, 30 Oct 2006 19:50:01 +0000 (19:50 +0000)]
Add configuration stubs for adding package derived files to the various
sample configurations.
Submitted by Jeremie Le Hen and tested by Jean Milanez Melo.
Warner Losh [Mon, 30 Oct 2006 19:18:46 +0000 (19:18 +0000)]
More fully support 64-bit bars. Prior to this commit, we supported
only those bars that had addresses assigned by the BIOS and where the
bridges were properly programmed. Now even unprogrammed ones work.
This was needed for sun4v. We still only implement up to 2GB memory
ranges, even for 64-bit bars. PCI standards at least through 2.2 say
that this is the max (or 1GB is, I only know it is < 32bits).
o Always define pci_addr_t as uint64_t. A pci address is always 64-bits,
but some hosts can't address all of them.
o Preserve the upper half of the 64-bit word during resource probing.
o Test to make sure that 64-bit values can fit in a u_long (true on some
platforms, but not others). Don't use those that can't.
o minor pedantry about data sizes.
o Better bridge resource reporting in bootverbose case.
o Minor formatting changes to cope with different data types on different
platforms.
Submitted by: jmg, with many changes by me to fully support 64-bit
addresses.
Driver for some ASUS desktop motherboard extras.
Though it is named after overclocking tool for ASUS motherboards,
it is not capable to change clock ratio or CPU core voltage.
This driver exports Templature, Power output voltage, Fan RPM under
dev.acpi_aiboost.0.*.
Descriptions for these values are set to sysctl describe, which can be
get by sysctl -d.
Xin LI [Mon, 30 Oct 2006 03:32:09 +0000 (03:32 +0000)]
Be more reasonable when overwrite mode is specified while there
is hard links. Overwritting when links > 1 would cause data
loss, which is usually undesired.
Inspired by: discussion on -hackers@
Suggested by: elessar at bsdforen de
Obtained from: OpenBSD
Marius Strobl [Mon, 30 Oct 2006 01:58:50 +0000 (01:58 +0000)]
Forced commit to denote that the third item of the previous commit
message should have read:
- Remove the hw.dc_quick SYSCTL, which allowed to turn off the above
mentioned optimization, as like the equivalent and already removed
hw.sis_quick it existed for testing purposes only.
Marius Strobl [Sun, 29 Oct 2006 20:24:27 +0000 (20:24 +0000)]
- Wrap code optimized for architectures without alignment constraints
in #ifdef __NO_STRICT_ALIGNMENT rather than #ifdef __i386__. This
means that amd64 now also uses the optimized code. [1]
While at it, fix a nearby style(9) bug.
- Remove the hw.dc_quick SYSCTL, which allowed to turn off the above
mentioned optimization, as like the equivalent and already removed
- In dc_setcfg() suppress printing a warning when forcing the receiver
and transceiver to idle state times out for chips where the status
bits in question just never change (observed in detail with DM9102A)
and therefore the warning would be highly likely false positive. [2]
- In dc_ifmedia_sts() add a missing DC_UNLOCK().
Marius Strobl [Sun, 29 Oct 2006 20:19:41 +0000 (20:19 +0000)]
Wrap code optimized for architectures without alignment constraints
in #ifdef __NO_STRICT_ALIGNMENT rather than #if defined(__i386__) ||
defined(__amd64__). Currently this change is cosmetic only though.
While at it, fix a nearby style(9) bug and remove a no longer used
header.
Ruslan Ermilov [Sun, 29 Oct 2006 14:50:58 +0000 (14:50 +0000)]
Because the BTX mini-kernel now uses flat memory mode and clients
are no longer limited to a virtual address space of 16 megabytes,
only mask high two bits of a virtual address. This allows to load
larger kernels (up to 1 gigabyte). Not masking addresses at all
was a bad idea on machines with less than >3G of memory -- kernels
are linked at 0xc0xxxxxx, and that would attempt to load a kernel
at above 3G. By masking only two highest bits we stay within the
safe limits while still allowing to boot larger kernels.
(This is a safer reimplmentation of sys/boot/i386/boot2/boot.2.c
rev. 1.71.)
Backout the linux aio stuff. Several problems where identified and the
dynamic nature (if no native aio code is available, the linux part
returns ENOSYS because of missing requisites) should be solved differently
than it is.
All this will be done in P4.
Not included in this commit is a backout of the changes to the native aio
code (removing static in some places). Those changes (and some more) will
also be needed when the reworked linux aio stuff will reenter the tree.
Bruce Evans [Sun, 29 Oct 2006 09:48:44 +0000 (09:48 +0000)]
Removed some SMP ifdefs so that using the TSC as a cputime clock is
not completely decided at config time. Just don't default to using
the TSC if there are multiple active CPUs. Also, don't default to
using the TSC if it is broken. SMP ifdefs are still used to disallow
using perfmon since perfmon is always broken if SMP is just configured.
This only helps much for SMP kernels running on 1 CPU. The overheads
for using the i8254 cputime clock were a bit too high on 486/33's, and
now on multi-GHz CPUs they are usually in the 99-99.9% range. Switching
from the old default of an i8254 clock to the TSC works poorly because
the overheads are not recalibrated.
Use the same condition for declaring perfmon stuff as for using it.
Call vfs_setdirty_locked_object() from vfs_busy_pages() instead of
vfs_setdirty(), thereby eliminating a second acquisition and release
of the same vm object lock.
Alan Cox [Sat, 28 Oct 2006 19:16:57 +0000 (19:16 +0000)]
In bufdone_finish() restrict the acquisition and release of the page
queues lock to BIO_READ operations. Recent changes to the implementation
of the per-page flags have eliminated the need for the page queues lock
in the other cases.
Bruce Evans [Sat, 28 Oct 2006 13:12:06 +0000 (13:12 +0000)]
In the userland .mcount():
- Don't use a frame pointer. Our callers need a frame pointer, but we
could only use one to support things that aren't supported. (These
things are:
- profiling of profiling
- debugging of profiling. The core ENTRY() macro doesn't support
forcing a frame pointer for debugging, so don't do more here.)
- Ensure that we are in the text section and have normal alignment.
- Use the normal syntax for `.type'.
Bruce Evans [Sat, 28 Oct 2006 11:03:03 +0000 (11:03 +0000)]
i386/include/profile.h:
Fixed a syntax error for the (!__KERNEL && !__GNUCLIKE_ASM) case in
rev.1.36. Apparently, this case has never been reached even by lint.
Submitted by: stefanf
{amd64,i386}/include/profile.h:
In case the above case is actually reached, break it properly by
providing null support that will fail at link time instead of a stub
that gives wrong (null) profiling at runtime.
MFP4:
Rename MAX_SAMPLE_RATES macro to OSS_MAX_SAMPLE_RATES. The old
macro clashed with those used in other applications and libaries
(ex: RtAudio). 4Front responded by updating their spec, so we
will follow suit.
Bruce Evans [Sat, 28 Oct 2006 07:59:11 +0000 (07:59 +0000)]
In MCOUNT_OVERHEAD(label), actually use the `label' parameter. We were
still using the global label named "profil", and this worked accidentally
because all callers use the same name.
Bruce Evans [Sat, 28 Oct 2006 06:38:51 +0000 (06:38 +0000)]
Cleaned up includes. <machine/profile.h> was unused. <machine/timerreg.h>
was only used in the GUPROF case, so the messes to get its i386 prerequisites
included shouldn't have been needed.
Fixed some style bugs. Quote #error contents, and don't repeat an #error
directive on amd64.
Bruce Evans [Sat, 28 Oct 2006 06:04:29 +0000 (06:04 +0000)]
Removed all traces of HIDENAME() in amd64 and i386 kernel code. Using
this used to be slightly cleaner than using ifdefs in a few places to
support both a.out and elf, but using it now just causes messes and
unportabilities. It seems to be impossible to implement the elf
HIDENAME() portably in cpp (since token pasting of "." and <name> is
invalid).
*/prof_machdep.c:
- Removed all uses of CNAME(). CNAME() is easy enough to use in pure
asm code, but using it in inline asm requires messy quoting. The
core pure asm code has been hacked on more and all uses of CNAME() in
it have already gone away. Just assume the elf convention here too.
- Removed now-uneeded include of <machine/asmacros.h>.
- Removed the workaround for a namespace conflict with this include.
Jack F Vogel [Sat, 28 Oct 2006 00:47:55 +0000 (00:47 +0000)]
This is the merge of the Intel 6.2.9 driver. It provides all new shared code,
new device support, and it is hoped a more stable driver for 6.2. RELEASE.
This checkin was discussed and approved today by RE, scottl, jhb, and pdeuskar
Bruce Evans [Fri, 27 Oct 2006 14:17:50 +0000 (14:17 +0000)]
Don't call mexitcount or provide a stub mexitcount to call when
profiling is configured but high resolution profiling is not configured.
Only functions in *.[Ss] called the stub, so efficiency was not
significantly affected.
Oleg Bulyzhin [Fri, 27 Oct 2006 13:05:37 +0000 (13:05 +0000)]
- Convert
net.inet.ip.dummynet.curr_time
net.inet.ip.dummynet.searches
net.inet.ip.dummynet.search_steps
to SYSCTL_LONG nodes. It will prevent frequent wrap around on 64bit archs.
- Implement simple mechanics for dummynet(4) internal time correction.
Under certain circumstances (system high load, dummynet lock contention, etc)
dummynet's tick counter can be significantly slower than it should be.
(I've observed up to 25% difference on one of my production servers).
Since this counter used for packet scheduling, it's accuracy is vital for
precise bandwidth limitation.
Introduce new sysctl nodes:
net.inet.ip.dummynet.
tick_lost - number of ticks coalesced by taskqueue thread.
tick_adjustment - number of time corrections done.
tick_diff - adjusted vs non-adjusted tick counter difference
tick_delta - last vs 'standard' tick differnece (usec).
tick_delta_sum - accumulated (and not corrected yet) time
difference (usec).
John Birrell [Thu, 26 Oct 2006 22:11:35 +0000 (22:11 +0000)]
Remove the KSE option now that it's in DEFAULTS on these arches/machines.
The 'nooption' kernel config entry has to be used to turn KSE off now.
This isn't my preferred way of dealing with this, but I'll defer to
scottl's experience with the io/mem kernel option change and the grief
experienced over that.
John Birrell [Thu, 26 Oct 2006 22:05:25 +0000 (22:05 +0000)]
Add 'options KSE' to the kernel config DEFAULTS on all arches/machines
except sun4v.
This change makes the transition from a default to an option more
transparent and is an attempt to head off all the compliants that are
likely from people who don't read UPDATING, based on experience with
the io/mem change.