Enhance the explanation of using filesystem-specific mount options
in /etc/fstab. We do support passing special options on a per
filesystem type basis, like `-u UID -g GID' for mount_msdosfs, but
the syntax of these options in fstab is non-obvious and a lot of
users have asked about it.
PR: docs/128816
Submitted by: Roland Smith, rsmith at xs4all dot nl
MFC after: 2 days
Marcel Moolenaar [Sat, 22 Nov 2008 21:22:53 +0000 (21:22 +0000)]
Cast to uintptr_t before casting to void*. This allows the
QUICC backend to be built on LP64 platforms. This makes it
possible to include the QUICC backend in the kernel module.
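The cast described above can be sketched as follows. This is a minimal illustration, assuming the value being converted is a 32-bit integer; the helper names are hypothetical and do not come from the QUICC backend.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not taken from the driver.
 * On LP64 a pointer is 64 bits wide, so casting a 32-bit integer
 * straight to void * is a size-mismatched conversion that compilers
 * warn about. Going through uintptr_t widens the value first.
 */
static void *
ex_addr_to_ptr(uint32_t addr)
{
	return ((void *)(uintptr_t)addr);
}

/* The reverse direction narrows through uintptr_t as well. */
static uint32_t
ex_ptr_to_addr(void *p)
{
	return ((uint32_t)(uintptr_t)p);
}
```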
IFp4: Don't rely on disk IDs and always use vdev guids, which means always
looking up components by reading metadata. This might be slower when there is a
large number of disks in the system, but is definitely more reliable.
IFp4: Finish implementation of chflags(2) for ZFS. While doing this I found
that zfs_access() can only handle VREAD, VWRITE and VEXEC, for the rest we need
to use vaccess(9).
Busy ufs filesystem around block of code that does ".." lookup. Since
mnt_lock is before lock of any vnode on the mp, it uses LK_NOWAIT. Since
MNTK_UNMOUNT may be transient, pdp lock is dropped when vfs_busy()
failed, and operation is retried after some time. This way, ffs_vget()
is not called on the mp that may be in the process of being destroyed by
unmount.
Check for the VI_DOOMED flag on pdp after its lock is reacquired, to
better detect some situations where directory containing ".."
entry is removed during the lookup.
Add sv_flags field to struct sysentvec with the intention of providing a
description of the ABI of the currently executing image. Change some places to
test the flags instead of explicitly comparing with the addresses of known
sysentvec structures to determine ABI features.
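A rough sketch of the difference between the two styles of test. The structure, flag names and values below are hypothetical stand-ins chosen for illustration, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the real names and values are not given here. */
#define EX_SV_ILP32	0x0100u
#define EX_SV_LP64	0x0200u

/* Cut-down stand-in for struct sysentvec. */
struct ex_sysentvec {
	const char	*sv_name;
	unsigned int	 sv_flags;
};

static const struct ex_sysentvec ex_native = { "native64", EX_SV_LP64 };
static const struct ex_sysentvec ex_compat32 = { "compat32", EX_SV_ILP32 };
static const struct ex_sysentvec ex_other32 = { "other32", EX_SV_ILP32 };

/* Old style: every known 32-bit vector has to be listed by address. */
static bool
ex_is_32bit_by_address(const struct ex_sysentvec *sv)
{
	return (sv == &ex_compat32);
}

/* New style: the vector describes its own ABI through its flags. */
static bool
ex_is_32bit_by_flags(const struct ex_sysentvec *sv)
{
	return ((sv->sv_flags & EX_SV_ILP32) != 0);
}
```

With the flag test, a newly added 32-bit vector such as ex_other32 is recognized without touching the callers.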
Kip Macy [Sat, 22 Nov 2008 08:05:05 +0000 (08:05 +0000)]
- enable multiple transmit queues
- invert sense of hw.cxgb.singleq tunable to hw.cxgb.multiq
- don't wake up transmitting thread by default
- add per tx queue ifaltq to handle ALTQ
- remove several unused functions in cxgb_multiq.c
- add several sysctls: multiq_tx_enable, coalesce_tx_enable,
and wakeup_tx_thread
- this obsoletes the hw.cxgb.snd_queue_len as ifq is replaced
by a buf_ring
Kip Macy [Sat, 22 Nov 2008 05:55:56 +0000 (05:55 +0000)]
- bump __FreeBSD version to reflect added buf_ring, memory barriers,
and ifnet functions
- add memory barriers to <machine/atomic.h>
- update drivers to only conditionally define their own
- add lockless producer / consumer ring buffer
- remove ring buffer implementation from cxgb and update its callers
- add if_transmit(struct ifnet *ifp, struct mbuf *m) to ifnet to
allow drivers to efficiently manage multiple hardware queues
(i.e. not serialize all packets through one ifq)
- expose if_qflush to allow drivers to flush any driver managed queues
This work was supported by Bitgravity Inc. and Chelsio Inc.
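For readers unfamiliar with the idea, a lockless producer / consumer ring can be sketched as below. This is a deliberately simplified single-producer, single-consumer version written with C11 atomics in userland; it is not the buf_ring API, and all names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_RING_SIZE	8	/* must be a power of two */

struct ex_ring {
	_Atomic size_t	 head;	/* next slot to write, owned by producer */
	_Atomic size_t	 tail;	/* next slot to read, owned by consumer */
	void		*slots[EX_RING_SIZE];
};

/* Producer side: returns false when the ring is full. */
static bool
ex_ring_enqueue(struct ex_ring *r, void *item)
{
	size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail == EX_RING_SIZE)
		return (false);
	r->slots[head & (EX_RING_SIZE - 1)] = item;
	/* Release: the slot write must be visible before the new head. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return (true);
}

/* Consumer side: returns NULL when the ring is empty. */
static void *
ex_ring_dequeue(struct ex_ring *r)
{
	size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
	void *item;

	if (head == tail)
		return (NULL);
	item = r->slots[tail & (EX_RING_SIZE - 1)];
	/* Release: the slot read must finish before the slot is reusable. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return (item);
}
```

The acquire/release pairs play the role of the memory barriers mentioned above: no lock is taken, but each side only observes an index after the data it guards.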
Several small additions to the Chelsio 10G driver.
1) Fix a bug in dealing with the Alerus 1006 PHY which prevented the
device from ever coming back up once it had been set to down.
2) Add a kernel tunable (hw.cxgb.snd_queue_len) which makes it possible
to give the device more than IFQ_MAXLEN entries in its send queue. The
default remains 50.
3) Add code to place the card's identification and serial number into
its description (%desc) so that users can tell which card they have
installed.
Warner Losh [Fri, 21 Nov 2008 03:03:57 +0000 (03:03 +0000)]
Create a /dev/cardbus%d.cis, to be compatible with older versions of
the software. This is a trivial amount of code to keep wireless
monitoring software working... I plan on removing it in 9.0.
Marius Strobl [Thu, 20 Nov 2008 18:44:09 +0000 (18:44 +0000)]
- According to OpenSolaris, CDMA flushing/syncing for Tomatillos
and XMITS has to be basically done in the same manner as for
the Sabres, i.e. only for devices behind PCI-PCI-bridges and
after a PIO read on the far side of the farthest PCI-PCI-bridge.
Given that the Tomatillo documentation mentions no difference
to the Schizo bridges in this regard and this is also still
part of the procedure described in the Schizo documentation, this
seems about right, so adjust accordingly (the unconditional
CDMA flushing/syncing previously done was based on how Linux
behaves).
- Implement CDMA flushing/syncing for Schizo version >= 5,
which requires the workaround described in Schizo Errata I-23.
According to Schizo Errata I-13 it's just unusable with
version < 5 though. [1]
- Don't register the Schizo streaming buffer for now until its
usage is sorted out according to the errata.
- Register our interrupt filters with the revived INTR_FAST so
these interrupts can even interrupt filters of device
drivers as necessary.
- Remove the comment regarding lack of newbus'ified bus_dma(9)
as being able to associate a DMA tag with a device would
allow implementing CDMA flushing/syncing in bus_dmamap_sync(9)
but that would totally kill performance. Given that for devices
not behind a PCI-PCI bridge the host-to-PCI bridges also only
do CDMA flushing/syncing based on interrupts there's no
additional disadvantage for polling(4) callbacks in the case
schizo(4) has to do the CDMA flushing/syncing but rather a
general problem.
Luigi Rizzo [Thu, 20 Nov 2008 14:57:09 +0000 (14:57 +0000)]
As reported in kern/118222, pxeboot in RELENG_7 (and presumably
above) exhibits some misbehaviours on machines with AMD64 CPUs,
which at least in some cases I have tracked down to a heap overflow.
It is unclear whether it depends on the CPU or on the pxe bios
itself which may use more memory on AMD machines.
Noticeably a pxeboot compiled from 6.x sources works fine on all
machines I have tried so far, while a pxeboot compiled from 7.x
sources does not.
This patch is a first step in reducing the amount of memory used
while processing the configuration files read by the loader at boot
(some of them are quite large, 1700+ lines), and it does so by:
+ moving a buffer to static memory instead of allocating in the heap;
+ skipping empty lines;
+ reducing the amount of memory used for line descriptors;
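The first two measures in the list above can be sketched like this. It is a userland illustration under stated assumptions: the function name, the buffer size and the rule that whitespace-only lines count as empty are all hypothetical, and the loader's real parser is not shown.

```c
#include <stddef.h>
#include <string.h>

#define EX_LINE_MAX	256	/* assumed upper bound on a line */

/*
 * Walk a configuration text and count the lines worth keeping.
 * The line buffer has static storage, so nothing is taken from the
 * heap, and empty lines are dropped before any descriptor would be
 * allocated for them. Lines longer than the buffer are truncated.
 */
static int
ex_count_config_lines(const char *text)
{
	static char line[EX_LINE_MAX];
	int kept = 0;

	while (*text != '\0') {
		size_t len = strcspn(text, "\n");
		size_t copy = len < EX_LINE_MAX - 1 ? len : EX_LINE_MAX - 1;

		memcpy(line, text, copy);
		line[copy] = '\0';
		text += len;
		if (*text == '\n')
			text++;
		if (line[strspn(line, " \t\r")] == '\0')
			continue;	/* empty or whitespace-only line */
		kept++;
	}
	return (kept);
}
```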
Unfortunately there are several changes between 6.x and above,
affecting the compiler, the loader code itself, and libstand,
and it is not so straightforward to
These changes fix the behaviour on one motherboard with a
single-core AMD cpu, but are still not enough e.g. on an Asus
M2N-VM (with a dual-core CPU).
I need to investigate the problem a bit more before figuring
out what should be committed to RELENG_7
Warner Losh [Thu, 20 Nov 2008 08:32:19 +0000 (08:32 +0000)]
damn. Always do make depend. Forgot to recompile main because of it,
so the changes for the struct cis -> struct tuple_list didn't get
made. They have been now.
Warner Losh [Thu, 20 Nov 2008 08:30:15 +0000 (08:30 +0000)]
Fix check for link target so we don't print cardbus CIS information twice.
Also, eliminate some magic constants and replace them with values from cis.h.
Warner Losh [Thu, 20 Nov 2008 08:20:53 +0000 (08:20 +0000)]
Restore now-useless ioctl as a roadmap. The original dumpcis code
assumed it had to toggle between attribute and common memory in the
cards. The kernel is supposed to cope with that automatically and
give us a tuple list. However, there are a number of details of how
that happens that are currently, ummm, magical and/or not implemented
for 16-bit PC Cards that have CIS_LONGLINK_C tuples in them (e.g., mix
both attribute memory and common memory). Also, CIS_LONGLINK_A
entries might not be handled completely correctly either, since there
can be gaps in the attribute vs common stuff.
All this will need to be corrected in the kernel. Once it is
corrected, dumpcis can be made even simpler in some ways, a little
more complicated in others once an API for presentation of CIS to
userland in these weird cases is settled upon.
Warner Losh [Thu, 20 Nov 2008 08:12:26 +0000 (08:12 +0000)]
The original programs that this code was lifted from (pccardd and
pccardc) parsed data to make decisions about stuff related to card
configuration.
The purely CIS dumping aspect of this program obviates the need for
such parsing. Save some space and don't parse the data anymore for
configuration purposes. Just parse it to print an interpretation of
it.
Marius Strobl [Wed, 19 Nov 2008 22:12:32 +0000 (22:12 +0000)]
Use the interrupt level right below PIL_FAST for executing interrupt
filters instead of PIL_FAST and allow special filters and handlers
for interrupts which need to be able to interrupt even filters, e.g.
bus error interrupts, to be registered with the revived INTR_FAST
at PIL_FAST.
Marius Strobl [Wed, 19 Nov 2008 22:09:03 +0000 (22:09 +0000)]
Given that the buffer dcons_crom(4) exposes is used for both input
and output, set BUS_DMA_COHERENT when creating the DMA map used for
loading the buffer. As a side-effect this solves locking issues on
sparc64 when dcons(4) calls bus_dmamap_sync(9) while in an interrupt
filter, which are executed in a critical section, and iommu(4) has
to use a sleep lock when taking advantage of the streaming buffer.
Reported and tested by: kensmith
Approved by: simokawa
Ed Schouten [Wed, 19 Nov 2008 21:07:33 +0000 (21:07 +0000)]
Make nmdm(4) use MPSAFE callouts.
For some reason the nmdm(4) driver doesn't use CALLOUT_MPSAFE, even
though we live in the MPSAFE TTY era. Add the CALLOUT_MPSAFE flags.
System survives.
Rafal Jaworowski [Wed, 19 Nov 2008 17:34:28 +0000 (17:34 +0000)]
Initial storage functionality for U-Boot support library.
- Only non-sliced bsdlabel style partitioning is currently supported (but provisions
are made towards GPT support, which should follow soon)
- Enable storage support in loader on ARM
Doug Rabson [Wed, 19 Nov 2008 16:39:01 +0000 (16:39 +0000)]
Add a GPT-aware variant of zfsboot which should be used in a similar manner
to gptboot, i.e. installed in a freebsd-boot partition using /sbin/gpart or
/sbin/gpt.
Tweak the /boot/loader ZFS support so that it can find ZFS pools that are
contained in GPT partitions.
Doug Rabson [Wed, 19 Nov 2008 16:04:07 +0000 (16:04 +0000)]
If we free the GPT partition list in bd_open_gpt() because of an error, don't
try to free it again in bd_closedisk(). While I'm here, fix a DEBUG print.
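One common way to avoid this kind of double free is to clear the pointer on the error path so that the later cleanup sees nothing left to release. The sketch below assumes that approach; the commit does not say how the fix was made, and the structure and function names are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the loader's per-disk state. */
struct ex_disk {
	int	*gpt_parts;
	int	 nparts;
};

/* Open path: on error the list is freed here and the pointer cleared. */
static int
ex_open_gpt(struct ex_disk *d, int simulate_error)
{
	d->gpt_parts = calloc(4, sizeof(*d->gpt_parts));
	if (d->gpt_parts == NULL)
		return (-1);
	d->nparts = 4;
	if (simulate_error) {
		free(d->gpt_parts);
		d->gpt_parts = NULL;	/* so the close path won't free it again */
		d->nparts = 0;
		return (-1);
	}
	return (0);
}

/* Close path: free(NULL) is a no-op, so this is safe after a failed open. */
static void
ex_close_disk(struct ex_disk *d)
{
	free(d->gpt_parts);
	d->gpt_parts = NULL;
	d->nparts = 0;
}
```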
Marko Zec [Wed, 19 Nov 2008 09:39:34 +0000 (09:39 +0000)]
Change the initialization methodology for global variables scheduled
for virtualization.
Instead of initializing the affected global variables at instantiation,
assign initial values to them in initializer functions. As a rule,
initialization at instantiation for such variables should never be
introduced again from now on. Furthermore, enclose all instantiations
of such global variables in #ifdef VIMAGE_GLOBALS blocks.
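A minimal sketch of the pattern, with hypothetical variable and function names. VIMAGE_GLOBALS is defined locally here only so the fragment compiles on its own.

```c
/* Assumed to be set by the build; defined here so the sketch stands alone. */
#ifndef VIMAGE_GLOBALS
#define VIMAGE_GLOBALS	1
#endif

/*
 * Before the change a variable would have been written as
 *	static int ex_default_ttl = 64;
 * i.e. initialized at instantiation.
 */
#ifdef VIMAGE_GLOBALS
static int ex_default_ttl;
static int ex_forwarding;
#endif

/* After the change: initial values are assigned in an initializer function. */
static void
ex_subsystem_init(void)
{
	ex_default_ttl = 64;
	ex_forwarding = 0;
}
```

Because the values are assigned by a function rather than by the linker, the same initializer could later be run once per container instance.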
Essentially, this change should have zero functional impact. In the next
phase of merging network stack virtualization infrastructure from
p4/vimage branch, the new initialization methodology will allow us to
switch between using global variables and their counterparts residing in
virtualization containers with minimum code churn, and in the long run
allow us to initialize multiple instances of such container structures.
Discussed at: devsummit Strassburg
Reviewed by: bz, julian
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation
Bugfix for Linux USB compat layer. Do not free non-generic FIFOs when
doing an alternate setting.
Cleanup USB IOCTL and USB reference handling.
Fix a corner case where USB-FS was left initialised after
setting a new configuration or alternate setting.
src/sys/dev/usb2/core/usb2_hub.c
Improvement: Check all USB HUB ports by default at least one time.
src/sys/dev/usb2/core/usb2_request.c
Bugfix: Make sure destination ASCII string is properly zero terminated
in all cases.
Improvement: Skip invalid characters instead of replacing with a dot.
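The two string-handling points above can be sketched together. This assumes that "invalid" means anything outside printable ASCII, which the log does not spell out; the function name is hypothetical.

```c
#include <stddef.h>

/*
 * Copy src into dst, skipping bytes outside printable ASCII
 * (0x20..0x7e, an assumed definition of "valid") and always
 * terminating dst with a zero byte when dstlen is non-zero.
 * Returns the number of characters stored, not counting the terminator.
 */
static size_t
ex_copy_ascii(char *dst, size_t dstlen, const unsigned char *src,
    size_t srclen)
{
	size_t i, n = 0;

	if (dstlen == 0)
		return (0);
	for (i = 0; i < srclen && n < dstlen - 1; i++) {
		if (src[i] < 0x20 || src[i] > 0x7e)
			continue;	/* skip instead of writing a dot */
		dst[n++] = (char)src[i];
	}
	dst[n] = '\0';
	return (n);
}
```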
John Baldwin [Tue, 18 Nov 2008 23:19:43 +0000 (23:19 +0000)]
- Fix a typo in a comment.
- Whitespace fix.
- Remove #if 0'd BSD 4.x code for flushing busy buffers from a mountpoint
during an unmount. FreeBSD uses vflush() for this.
John Baldwin [Tue, 18 Nov 2008 23:18:37 +0000 (23:18 +0000)]
When looking up the vnode for the device to mount the filesystem on,
ask NDINIT to return a locked vnode instead of letting it drop the
lock and return a referenced vnode and then relock the vnode a few
lines down. This matches the behavior of other filesystem mount routines.
John Baldwin [Tue, 18 Nov 2008 21:01:54 +0000 (21:01 +0000)]
Allow device hints to wire the unit numbers of devices.
- An "at" hint now reserves a device name.
- A new BUS_HINT_DEVICE_UNIT method is added to the bus interface. When
determining the unit number of a device, this method is invoked to
let the bus driver specify the unit of a device given a specific
devclass. This is the only way a device can be given a name reserved
via an "at" hint.
- Implement BUS_HINT_DEVICE_UNIT() for the acpi(4) and isa(4) bus drivers.
Both of these busses implement this by comparing the resources for a
given hint device with the resources enumerated by ACPI/PnPBIOS and
wire a unit if the hint resources are a subset of the "real" resources.
- Use bus_hinted_children() for adding hinted devices on isa(4) busses
now instead of doing it by hand.
- Remove the unit kludging from sio(4) as it is no longer necessary.
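The subset comparison used for wiring can be illustrated roughly as follows. The resource representation, the exact-match rule on type and start address, and the choice to reject an empty hint set are all assumptions made for this sketch; none of the names come from the bus code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified resource description. */
struct ex_res {
	int		type;	/* e.g. 1 = I/O port, 2 = IRQ (made-up values) */
	unsigned long	start;
};

/* True if one hinted resource appears among the enumerated ones. */
static bool
ex_res_present(const struct ex_res *hint, const struct ex_res *real,
    size_t nreal)
{
	size_t i;

	for (i = 0; i < nreal; i++)
		if (real[i].type == hint->type && real[i].start == hint->start)
			return (true);
	return (false);
}

/*
 * Wire the unit only if every hinted resource is found in the "real"
 * set, i.e. the hints are a subset of what the firmware enumerated.
 * An empty hint set is rejected here (an assumption of this sketch).
 */
static bool
ex_hints_match(const struct ex_res *hints, size_t nhints,
    const struct ex_res *real, size_t nreal)
{
	size_t i;

	if (nhints == 0)
		return (false);
	for (i = 0; i < nhints; i++)
		if (!ex_res_present(&hints[i], real, nreal))
			return (false);
	return (true);
}
```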
Alexander Motin [Tue, 18 Nov 2008 13:24:38 +0000 (13:24 +0000)]
Set of powerd enhancements:
1. Make it more SMP polite. Previous version uses average CPU load that
often leads to load underestimation. It makes powerd with default
configuration unusable on systems with more than 2 CPUs. I propose to use
summary load instead of average one. IMO this is the best we can do without
specially tuned scheduler. Also as soon as measuring total load on SMP
systems is more useful than total idle, I have switched to it.
2. Make powerd's operation independent of the number and size of frequency
levels. I have added an internal frequency counter which is translated into
real frequencies only at the last stage and only as closely as possible. Some
systems may have only several power levels, while others have many of them, so
adaptation time with the previous approach was completely different.
3. As part of the previous item I have changed adaptive mode to raise frequency
on demand up to 2 times and lower it by 1/8 per time interval.
4. For desktop (AC-powered) systems I have added one more mode - "hiadaptive".
It raises frequency twice as fast, drops it 4 times slower, prefers twice
lower CPU load and has additional delay before leaving the highest frequency
after the period of maximum load. This mode was specially made to improve
interactivity of the systems where operation capabilities are more
significant than power consumption, but keeping maximum frequency all the
time is not needed.
5. I have reduced the default polling interval from 1/2 to 1/4 of a second.
It is not so important for the algorithm math now, but gives better system
interactivity.
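Items 1 and 3 above can be illustrated with a small sketch. It shows why an average hides a single busy CPU while a summed load does not, and one possible reading of the adaptive step (at most doubling on the way up, dropping by 1/8 on the way down). The thresholds and function names are hypothetical and are not taken from powerd.

```c
#include <stddef.h>

/* Hypothetical thresholds, in percent of one CPU. */
#define EX_LOAD_HIGH	75
#define EX_LOAD_LOW	50

/* Summary load: the sum of per-CPU loads. */
static int
ex_total_load(const int *cpu_load, size_t ncpus)
{
	size_t i;
	int sum = 0;

	for (i = 0; i < ncpus; i++)
		sum += cpu_load[i];
	return (sum);
}

/* Average load, shown only for comparison with the old behaviour. */
static int
ex_average_load(const int *cpu_load, size_t ncpus)
{
	return (ncpus == 0 ? 0 : ex_total_load(cpu_load, ncpus) / (int)ncpus);
}

/* One adaptation step on the internal frequency counter. */
static int
ex_next_freq(int freq, int load, int fmin, int fmax)
{
	if (load > EX_LOAD_HIGH) {
		freq *= 2;		/* raise, at most doubling */
		if (freq > fmax)
			freq = fmax;
	} else if (load < EX_LOAD_LOW) {
		freq -= freq / 8;	/* lower by 1/8 per interval */
		if (freq < fmin)
			freq = fmin;
	}
	return (freq);
}
```

With one of four CPUs fully busy, the average is 25 and would not cross the high threshold, while the summed load of 100 does.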