kib [Fri, 19 Nov 2010 21:17:34 +0000 (21:17 +0000)]
Remove prtactive variable and related printf()s in the vop_inactive
and vop_reclaim() methods. They seems to be unused, and the reported
situation is normal for the forced unmount.
MFC after: 1 week
X-MFC-note: keep prtactive symbol in vfs_subr.c
attilio [Fri, 19 Nov 2010 19:43:56 +0000 (19:43 +0000)]
Scan the list in reverse order for the shutdown handlers of loaded modules.
This way, when there is a dependency between two modules, the handler of the
latter probed runs first.
This is a similar approach as the modules are unloaded in the same
linkerfile.
Sponsored by: Sandvine Incorporated
Submitted by: Nima Misaghian <nmisaghian at sandvine dot com>
MFC after: 1 week
jhb [Fri, 19 Nov 2010 17:57:50 +0000 (17:57 +0000)]
Set the POSIX semaphore capability when the semaphore module is enabled.
This is ignored in HEAD where semaphores are marked as always enabled in
<unistd.h>.
jhb [Fri, 19 Nov 2010 17:56:16 +0000 (17:56 +0000)]
Set various POSIX capability sysctls to the version of the API that is
supported rather than 1. They are supposed to return a suitable value
for sysconf(3). While here, make the fsync sysctl match <unistd.h>.
cperciva [Fri, 19 Nov 2010 15:12:19 +0000 (15:12 +0000)]
Make pmap_release consistent with pmap_pinit with respect to unpinning
pages. The pinning of NPGPTD pages is #if 0ed out in pmap_pinit (I'm
not quite sure why...) and this commit adds a corresponding #if 0 in
pmap_release to avoid unpinning those pages.
Some versions of Xen seem to silently ignore requests to unpin pages
which were never pinned in the first place, but some return an error
(causing FreeBSD to panic) prior to this commit.
avg [Fri, 19 Nov 2010 15:07:36 +0000 (15:07 +0000)]
specialreg.h: add definitions for MPERF/APERF pair of MSRs
These MSRs can be used to determine actual (average) performance as
compared to a maximum defined performance.
Availability of these MSRs is indicated by bit0 in CPUID.6.ECX on both
Intel and AMD processors.
It seems that this MSR has been available in a range of AMD processors
families for quite a while now.
Note1: not all AMD MSRs that are found in amd64 specialreg.h are also in
the i386 version.
Note2: perhaps some additional name component is needed to distinguish
AMD-specific MSRs.
jilles [Fri, 19 Nov 2010 12:56:13 +0000 (12:56 +0000)]
sh: Add printf builtin.
This was removed in 2001 but I think it is appropriate to add it back:
* I do not want to encourage people to write fragile and non-portable echo
commands by making printf much slower than echo.
* Recent versions of Autoconf use it a lot.
* Almost no software still wants to support systems that do not have
printf(1) at all.
* In many other shells printf is already a builtin.
Side effect: printf is now always the builtin version (which behaves
identically to /usr/bin/printf) and cannot be overridden via PATH (except
via the undocumented %builtin mechanism).
Code size increases about 5K on i386. Embedded folks might want to replace
/usr/bin/printf with a hard link to /usr/bin/alias.
rstone [Fri, 19 Nov 2010 03:47:10 +0000 (03:47 +0000)]
When doing a camcontrol rescan all or a camcontrol reset all, use the wildcard
path id for enumerating the available busses. Previously camcontrol was
implicitly passing 0 as the first path id, which meant that if bus 0 was not
present camcontrol would fail with EINVAL instead of rescanning/resetting any
busses that were present.
rstone [Thu, 18 Nov 2010 23:46:55 +0000 (23:46 +0000)]
When netstat was run with -i/-I and -w1 to produce running counters, the idrop
field printed an absolute value rather than the delta from the last value
kib [Thu, 18 Nov 2010 21:09:02 +0000 (21:09 +0000)]
vm_pageout_flush() might cache the pages that finished write to the
backing storage. Such pages might be then reused, racing with the
assert in vm_object_page_collect_flush() that verified that dirty
pages from the run (most likely, pages with VM_PAGER_AGAIN status) are
write-protected still. In fact, the page indexes for the pages that
were removed from the object page list should be ignored by
vm_object_page_clean().
Return the length of successfully written run from vm_pageout_flush(),
that is, the count of pages between requested page and first page
after requested with status VM_PAGER_AGAIN. Supply the requested page
index in the array to vm_pageout_flush(). Use the returned run length
to forward the index of next page to clean in vm_object_page_clean().
cperciva [Thu, 18 Nov 2010 21:02:40 +0000 (21:02 +0000)]
Don't KASSERT in pmap_release that
xpmap_ptom(VM_PAGE_TO_PHYS(m)) == (pmap->pm_pdpt[i] & PG_FRAME)
for i = NPGPTD, since pmap->pm_pdpt[i] is only initialized for
0 <= i < NPGPTD.
This fixes an inevitable panic with XEN && PAE && INVARIANTS when
pmap_release is called (e.g., when /sbin/init is launched).
kib [Thu, 18 Nov 2010 20:46:28 +0000 (20:46 +0000)]
Only increment object generation count when inserting the page into
object page list. The only use of object generation count now is a
restart of the scan in vm_object_page_clean(), which makes sense to do
on the page addition. Page removals do not affect the dirtiness of the
object, as well as manipulations with the shadow chain.
mav [Thu, 18 Nov 2010 19:28:45 +0000 (19:28 +0000)]
Make ATA_CAM wrapper to report SATA power management capabilities to CAM to
make it configure device to initiate transitions if controller configured
to accept them. This makes hint.ata.X.pm_level=1 mode working.
marius [Thu, 18 Nov 2010 17:58:59 +0000 (17:58 +0000)]
Fix a bug introduced with r215298; when atphy_reset() is called from
atphy_attach() the current media has not been set, yet, leading to a
NULL-dereference in atphy_setmedia().
mav [Thu, 18 Nov 2010 13:38:33 +0000 (13:38 +0000)]
If HBA doesn't report user-enabled SATA capabilies (like ATA_CAM wrapper) -
handle all of them as disabled. This was original cause of the problem,
workarounded by r215453.
mav [Thu, 18 Nov 2010 11:58:17 +0000 (11:58 +0000)]
Even if we are skipping SATA hard reset - set power management bits in
SControl register. This should make things consistent and help to avoid
unexpected PHY events that I've noticed in some cases on VIA controllers.
mav [Thu, 18 Nov 2010 08:03:40 +0000 (08:03 +0000)]
Some VIA SATA controllers provide access to non-standard SATA registers via
PCI config space. Use them to implement hot-plug and link speed reporting.
Tested on ASRock PV530 board with VX900 chipset.
andreast [Wed, 17 Nov 2010 19:35:56 +0000 (19:35 +0000)]
Check the real-mode? OF property to find out whether we operate in real or
virtual mode. In virtual mode we have to do memory mapping. On PowerMacs it is
usually false while on pSeries we have found that it is true. The real-mode?
property is not available on sparc64.
yongari [Wed, 17 Nov 2010 18:09:02 +0000 (18:09 +0000)]
MCP55 is the only NVIDIA controller that supports VLAN tag
insertion/stripping and it also supports TSO over VLAN. Implement
TSO over VLAN support for MCP55 controller.
While I'm here clean up SIOCSIFCAP ioctl handler. Since nfe(4)
sets ifp capabilities based on various hardware flags in device
attach, there is no need to check hardware flags again in
SIOCSIFCAP ioctl handler. Also fix a bug which toggled both TX and
RX checksum offloading even if user requested either TX or RX
checksum configuration change.
Tested by: Rob Farmer ( rfarmer <> predatorlabs dot net )
bz [Wed, 17 Nov 2010 10:43:20 +0000 (10:43 +0000)]
Do not initialize flag variables before needed.
Consistently use the LLE_ prefix for lla_lookup() and the ND6_ prefix
for nd6_lookup() even though both are defined the same. Use the right
flag variable when checking each.
bschmidt [Wed, 17 Nov 2010 09:32:39 +0000 (09:32 +0000)]
Fix a panic on i386 for drivers using MmAllocateContiguousMemory()
and MmAllocateContiguousMemorySpecifyCache().
Those two functions take 64-bit variable(s) for their arguments. On i386
that takes additional 32-bit variable per argument. This is required so
that windrv_wrap() can correctly wrap function that miniport driver calls
with stdcall convention. Similar explanation is provided in subr_ndis.c for
other functions.
jkim [Tue, 16 Nov 2010 23:26:02 +0000 (23:26 +0000)]
Restore CR0 after MTRR initialization for correctness sakes. There will be
no noticeable change because we enable caches before we enter here for both
BSP and AP cases. Remove another pointless optimization for CR4.PGE bit
while I am here.
jkim [Tue, 16 Nov 2010 22:44:58 +0000 (22:44 +0000)]
Invalidate TLBs explicitly. r1.4 of sys/i386/i386/i686_mem.c removed this
code but probably it only worked by chance because modifying CR4.PGE bit
causes invlidation of entire TLBs. Since these are very rare events, this
micro-optimization seems useless.
avg [Tue, 16 Nov 2010 15:53:44 +0000 (15:53 +0000)]
zfs+sendfile: populate all requested pages, not just those already cached
kern_sendfile() uses vm_rdwr() to read-ahead blocks of data to populate
page cache. When sendfile stumbles upon a page that is not populated
yet, it sends out all the mbufs that it collected so far. This
resulted in very poor performance with ZFS when file data is not in the
page cache, because ZFS vop_read for UIO_NOCOPY case populated only
those pages that are already in cache, but not valid. Which means that
most of the time it populated only the first requested page in the
described above scenario.
Reported by: Alexander Zagrebin <alexz@visp.ru>
Tested by: Alexander Zagrebin <alexz@visp.ru>,
Artemiev Igor <ai@kliksys.ru>
MFC after: 12 days
avg [Tue, 16 Nov 2010 12:43:45 +0000 (12:43 +0000)]
hwpstate: use CPU_FOREACH when binding to all available processors
Also, add a comment mentioning _PSD - on some systems it's enough to
put one logical CPU into a particular P-state to make other CPUs in
the same domain to enter that P-state.
Also, call sched_unbind() after the loop - sched_bind() automatically
rebinds from previous CPU to a new one, and the new arrangement of code
is safer against early loop exit.
lstewart [Tue, 16 Nov 2010 09:34:31 +0000 (09:34 +0000)]
Make the CC framework more VIMAGE friendly by adding the machinery to allow
vnets to select their own default CC algorithm independent of each other and the
base system. If the base system or a vnet has set a default which gets unloaded,
we reset that netstack's default to NewReno.
Sponsored by: FreeBSD Foundation
Tested by: Mikolaj Golub <to.my.trociny at gmail com>
Reviewed by: bz (briefly)
MFC after: 3 months
lstewart [Tue, 16 Nov 2010 08:43:25 +0000 (08:43 +0000)]
- Querying the default CC algo is more common than setting it and the function
is small, so there is no good reason not to declare the buffer at the top.
lstewart [Tue, 16 Nov 2010 07:57:56 +0000 (07:57 +0000)]
On CC algorithm module unload, we walk the list of active TCP control blocks.
Any found to be using the algorithm that is about to go away are switched back
to NewReno to avoid leaving dangling pointers which would trigger a panic. For
VIMAGE kernels, there is a list per vnet to walk, yet the implementation was
only examining one of the vnet lists.
Fix the implementation of the above feature for VIMAGE kernels by looping
through all active TCP control blocks across all vnets.
Sponsored by: FreeBSD Foundation
Tested by: Mikolaj Golub <to.my.trociny at gmail com>
Reviewed by: bz (briefly)
MFC after: 11 weeks
lstewart [Tue, 16 Nov 2010 07:09:05 +0000 (07:09 +0000)]
cc_init() should only be run once on system boot, but with VIMAGE kernels it
runs on boot and each time a vnet jail is created. Running cc_init() multiple
times results in a panic when attempting to initialise the cc_list lock again,
and so r215166 effectively broke the use of vnet jails.
Switch to using a SYSINIT to run cc_init() on boot. CC algorithm modules loaded
on boot register in the same SI_SUB_PROTO_IFATTACHDOMAIN category as is used in
this patch, so cc_init() is run at SI_ORDER_FIRST to ensure the framework is
initialised before module registration is attempted.
Sponsored by: FreeBSD Foundation
Reported and tested by: Mikolaj Golub <to.my.trociny at gmail com>
MFC after: 11 weeks
X-MFC with: r215166
nwhitehorn [Mon, 15 Nov 2010 22:11:18 +0000 (22:11 +0000)]
Try including Makefile.${TARGET_ARCH} before Makefile.${TARGET_CPUARCH} if
it exists in order to allow arch-specific overrides. This fixes the
binutils (and world) build on powerpc64 after recent TBEMD merges.
marius [Mon, 15 Nov 2010 21:41:45 +0000 (21:41 +0000)]
Return from mii_attach() after calling bus_generic_attach(9) on the device_t
of the MAC driver in order to attach miibus(4) on the first pass instead of
falling through to also calling it on the device_t of miibus(4). The latter
code flow was intended to attach the PHY drivers the same way regardless of
whether it's the first or a repeated pass, modulo the bus_generic_attach()
call in miibus_attach() which shouldn't be there. However, it turned out
that these variants cause miibus(4) to be attached twice under certain
conditions when using MAC drivers as modules.
jhb [Mon, 15 Nov 2010 19:55:19 +0000 (19:55 +0000)]
Don't display option 2 (to toggle ACPI on or off) on x86 machines if the
BIOS does not support ACPI. The other options in the menu retain their
existing numbers, option 2 is simply blanked out (and '2' is ignored).
netchild [Mon, 15 Nov 2010 13:03:35 +0000 (13:03 +0000)]
- print out the PID and program name of the program trying to use an
unsupported futex operation
- for those futex operations which are known to be not supported,
print out which futex operation it is
- shortcut the error return of the unsupported FUTEX_CLOCK_REALTIME in
some cases:
FUTEX_CLOCK_REALTIME can be used to tell linux to use
CLOCK_REALTIME instead of CLOCK_MONOTONIC. FUTEX_CLOCK_REALTIME
however must only be set, if either FUTEX_WAIT_BITSET or
FUTEX_WAIT_REQUEUE_PI are set too. If that's not the case
we can die with ENOSYS right at the beginning.
Submitted by: arundel
Reviewed by: rdivacky (earlier iteration of the patch)
MFC after: 1 week
yongari [Mon, 15 Nov 2010 00:06:19 +0000 (00:06 +0000)]
Add flow control for all re(4) controllers. re(4) controllers do
not provide any MAC configuration interface for resolved flow
control parameters. There is even no register that configures water
mark which will control generation of pause frames.
However enabling flow control surely enhanced performance a lot.
yongari [Sun, 14 Nov 2010 23:37:43 +0000 (23:37 +0000)]
P5N32-SLI PREMIUM from ASUSTeK is known to have MSI/MSI-X issue
such that nfe(4) does not work with MSI-X. When MSI-X support was
introduced, I remember MCP55 controller worked without problems so
the issue could be either PCI bridge or BIOS issue. But I also
noticed snd_hda(4) disabled MSI on all MCP55 chipset so I'm still
not sure this is generic issue of MCP55 chipset. If this was PCI
bridge issue we would have added it to a system wide black-list
table but it's not clear to me at this moment whether it was caused
by either broken BIOS or silicon bug of MCP55 chipset.
To workaround the issue, maintain a MSI/MSI-X black-list table in
driver and lookup base board manufacturer and product name from the
table before attempting to use MSI-X. If driver find an matching
entry, nfe(4) will not use MSI/MSI-X and fall back on traditional
INTx mode. This approach should be the last resort since it relies
on smbios and if another instance of MSI/MSI-X breakage is reported
with different maker/product, we may have to get the PCI bridge
black-listed instead of adding an new entry.
dd [Sun, 14 Nov 2010 23:05:57 +0000 (23:05 +0000)]
Add a special INIT product ID used by some models of the HUAWEI
K3765 datacard. After ejecting this device, it reappears using
the normal K3765 ID. It does not switch automatically
kib [Sun, 14 Nov 2010 21:59:11 +0000 (21:59 +0000)]
Do not use __FreeBSD_version prefix for the special osrel version.
The ports/Mk/bsd.port.mk uses sys/param.h to fetch osrel, and cannot
grok several constants with the prefix.
Reported and tested by: swell.k gmail com
MFC after: 1 week
dim [Sun, 14 Nov 2010 20:40:55 +0000 (20:40 +0000)]
Instead of unconditionally emitting .globl's for the __start_set_xxx and
__stop_set_xxx symbols, only emit them when the set_vnet or set_pcpu
sections are actually defined.