John Baldwin [Mon, 16 Nov 2015 21:36:50 +0000 (21:36 +0000)]
Only use a power of 2 for the number of receive and transmit queues.
Using other values causes VMXNET3_CMD_ENABLE to fail. The Linux
driver also enforces this restriction.
Marius Strobl [Mon, 16 Nov 2015 21:13:57 +0000 (21:13 +0000)]
With r290566 in place it turned out that WOL previously only worked by
accident with RTL8168G and later chips when the interface actually was
brought up. This is due to the fact that with these MAC variants, RXDV
gate needs be disabled for WOL to work. So do just that in re_setwol()
when IFCAP_WOL is requested.
Reported and tested by: dhw
Craig Rodrigues [Mon, 16 Nov 2015 16:48:43 +0000 (16:48 +0000)]
Import ypldap from OpenBSD.
ypldap -- Intended to be a drop-in replacement for ypserv, gluing in a
LDAP directory and thus providing support for users and groups stored in
LDAP for the get{pw,gr}ent family of functions.
The code tracks a counter which is the number of events until the next
sample. On context switch in, it loads the saved counter. On context
switch out, it tries to calculate a new saved counter.
Problems:
1. The saved counter was shared by all threads in a process. However, this
means that all threads would be initially loaded with the same saved
counter. However, that could result in sampling more often than once every
X number of events.
2. The calculation to determine a new saved counter was backwards. It
added when it should have subtracted, and subtracted when it should have
added. Assume a single-threaded process with a reload count of 1000 events.
Assuming the counter on context switch in was 100 and the counter on context
switch out was 50 (meaning the thread has "consumed" 50 more events), the
code would calculate a new saved counter of 150 (instead of the proper 50).
Fix:
1. As soon as the saved counter is used to initialize a monitor for a
thread on context switch in, set the saved counter to the reload count.
That way, subsequent threads to use the saved counter will get the full
reload count, assuring we sample at least once every X number of events
(across all threads).
2. Change the calculation of the saved counter. Due to the change to the
saved counter in #1, we simply need to add (modulo the reload count) the
remaining counter time we retrieve from the CPU when a thread is context
switched out.
Change the driver stats to what they really are: unsigned values.
When pmcstat exits after some samples were dropped, give the user an
idea of how many were lost. (Granted, these are global numbers, but
they may still help quantify the scope of the loss.)
Use explicitly specified ivsize instead of blocksize when we mean IV size.
Set zero ivsize for enc_xform_null and remove special handling from
xform_esp.c.
Rework the test which raises OOM condition. Right now, the code
checks for the swap space consumption plus checks that the amount of
the free pages exceeds some limit, in case pagedeamon did not coped
with the page shortage in one of the late passes. This is wrong
because it does not account for the presence of the reclamaible pages
in the queues which are not selectable for reclaim immediately. E.g.,
on the swap-less systems, large active queue easily triggered OOM.
Instead, only raise OOM when pagedaemon is unable to produce a free
page in several back-to-back passes. Track the failed passes per
pagedaemon thread.
The number of passes to trigger OOM was selected empirically and
tested both on small (32M-64M i386 VM) and large (32G amd64)
configurations. If the specifics of the load require tuning, sysctl
vm.pageout_oom_seq sets the number of back-to-back passes which must
fail before OOM is raised. Each pass takes 1/2 of seconds. Less the
value, more sensible the pagedaemon is to the page shortage.
In future, some heuristic to calculate the value of the tunable might
be designed based on the system configuration and load. But before it
can be done, the i/o system must be fixed to reliably time-out
pagedaemon writes, even if waiting for the memory to proceed. Then,
code can account for the in-flight page-outs and postpone OOM until
all of them finished, which should reduce the need in tuning. Right
now, ignoring the in-flight writes and the counter allows to break
deadlocks due to write path doing sleepable memory allocations.
Reported by: Dmitry Sivachenko, bde, many others
Tested by: pho, bde, tuexen (arm)
Reviewed by: alc
Discussed with: bde, imp
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Do not use vmspace_resident_count() for the OOM process selection.
Residency count track the number of pte entries installed into the
current pmap, which does not reflect the consumption of the physical
memory by the address map. Due to several mechanisms like pv entries
reclamation, copy on write etc. the resident pte entries count may be
much less than the amount of physical memory kept by the process.
Provide the OOM-specific vm_pageout_oom_pagecount() function which
estimates the amount of reclamaible memory which could be stolen if
the process is killed.
Reported and tested by: pho
Reviewed by: alc
Comments text by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
VM daemon works in parallel with the pagedaemon threads, and, among
other actions, swaps out kernel stacks of the processes. On the other
hand, currentl OOM logic which selects a process to kill in the
critical condition, skips process with swapped-out thread. Under some
loads, this results in the big(gest) process being ignored by OOM.
Do not skip a process which has inhibited thread due to the swap-out,
in the OOM selection loop. Note that killing such process requires
the thread stack page-in, but sometimes this is the only way to
recover.
Reported and tested by: pho
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Enji Cooper [Mon, 16 Nov 2015 05:28:14 +0000 (05:28 +0000)]
Port contrib/netbsd-tests/kernel/t_mqueue.c to FreeBSD
- Add missing headers
- Ensure mqueuefs is loaded
- Make sure the mqueuefs path is absolute and relative to /
- Cast the result of mq_open returning -1 to (mqd_t) to mute a compiler
warning
Adrian Chadd [Mon, 16 Nov 2015 04:28:00 +0000 (04:28 +0000)]
Add initial support for the QCA953x ("Honeybee") from Qualcomm Atheros.
The QCA953x SoC is an integrated 2x2 2GHz 11n + MIPS24k core, with
a 5 port FE switch, gige WAN port, and all the same stuff you'd find on
its predecessor - the AR9331.
However, buried deep in here somewhere is also a PCIe EP/RC for various
applications and some other weird bits I don't yet know about.
This is enough to get the reference board up and booting. I haven't yet
had it pass lots of packets - I need to finalise the ethernet switch
bits and the GMAC configuration (ie, how the ethernet ports and switch
are wired up) and I'll bring that in when I commit the base configuration
files to use the thing.
The wifi stuff will come much later. I have to port that support from
Linux ath9k and extend our vendor HAL to support it.
The reference board (AP143) comes with 32MB RAM and 4MB flash, so in order
to use it I need to get USB working fully so I can run root from there.
Thankyou to Qualcomm Atheros for access to the reference design board.
Details:
* Add register definitions from openwrt;
* It looks like a QCA955x but shrunk down to a QCA933x footprint, so
use the QCA955x bits and fix up the clock detection code to do the
QCA953x bits (they're very subtly different);
* Teach GPIO about it;
* Teach EHCI about it;
* Teach if_arge about it;
* Teach the CPU detection code about it.
John-Mark Gurney [Mon, 16 Nov 2015 01:29:58 +0000 (01:29 +0000)]
If you backup a large file that is mostly holes, previously we'd issue
a seek for every block... For large (Exabyte sized files) this would
issue lots of unneeded seeks, so combine them...
Thanks for the work Jan, sorry took so long to commit... And an item
completed off the IdeasPage!
Enji Cooper [Sun, 15 Nov 2015 18:56:58 +0000 (18:56 +0000)]
Disable -Wformat with scanfloat_test when compiling with gcc to avoid a
"use of assignment suppression and length modifier together in scanf format"
warning on line 90 (it's intentional)
Rework locale-links to not make symlinks on directories but symlinks on files
The goal here is to make the upgrade seamless for users
Add aliases for zh_HK
Remove bad symlinks created by previous bad upgrade procedure.
Complete ObsoleteFiles.inc with more locales that have been removed
Enji Cooper [Sun, 15 Nov 2015 04:50:08 +0000 (04:50 +0000)]
Polish up iswctype_test
- Split up the testcases into C locale and ja_JP.eucJP testcases.
- Avoid a segfault in the event that setlocale fails, similar to r290843
- Replace `sizeof(x) / sizeof(*x)` pattern with `nitems(x)`
Enji Cooper [Sun, 15 Nov 2015 04:33:14 +0000 (04:33 +0000)]
Polish up the tests a bit more after projects/collation was merged to head
Provide more meaningful diagnostic messages if LC_CTYPE can't be set properly
instead of segfaulting, because setlocale returns NULL and strcmp(NULL, b) will
always segfault
Split up the testcases so one failing (in this case en_US.ISO8859-15) won't
cause the rest of the testcases to be skipped
Optimizations to the way hwpmc gathers user callchains
Changes to the code to gather user stacks:
* Delay setting pmc_cpumask until we actually have the stack.
* When recording user stack traces, only walk the portion of the ring
that should have samples for us.
Currently, there is a single pm_stalled flag that tracks whether a
performance monitor was "stalled" due to insufficent ring buffer
space for samples. However, because the same performance monitor
can run on multiple processes or threads at the same time, a single
pm_stalled flag that impacts them all seems insufficient.
In particular, you can hit corner cases where the code fails to stop
performance monitors during a context switch out, because it thinks
the performance monitor is already stopped. However, in reality,
it may be that only the monitor running on a different CPU was stalled.
This patch attempts to fix that behavior by tracking on a per-CPU basis
whether a PM desires to run and whether it is "stalled". This lets the
code make better decisions about when to stop PMs and when to try to
restart them. Ideally, we should avoid the case where the code fails
to stop a PM during a context switch out.
Randall Stewart [Fri, 13 Nov 2015 22:51:35 +0000 (22:51 +0000)]
This fixes several places where callout_stops return is examined. The
new return codes of -1 were mistakenly being considered "true". Callout_stop
now returns -1 to indicate the callout had either already completed or
was not running and 0 to indicate it could not be stopped. Also update
the manual page to make it more consistent no non-zero in the callout_stop
or callout_reset descriptions.
MFC after: 1 Month with associated callout change.