Optimizations to the way hwpmc gathers user callchains
Changes to the code to gather user stacks:
* Delay setting pmc_cpumask until we actually have the stack.
* When recording user stack traces, only walk the portion of the ring
that should have samples for us.
Currently, there is a single pm_stalled flag that tracks whether a
performance monitor was "stalled" due to insufficient ring buffer
space for samples. However, because the same performance monitor
can run on multiple processes or threads at the same time, a single
pm_stalled flag that impacts them all seems insufficient.
In particular, you can hit corner cases where the code fails to stop
performance monitors during a context switch out, because it thinks
the performance monitor is already stopped. However, in reality,
it may be that only the monitor running on a different CPU was stalled.
This patch attempts to fix that behavior by tracking on a per-CPU basis
whether a PM desires to run and whether it is "stalled". This lets the
code make better decisions about when to stop PMs and when to try to
restart them. Ideally, we should avoid the case where the code fails
to stop a PM during a context switch out.
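The per-CPU bookkeeping described above might look roughly like the following sketch. It is not the hwpmc code: the structure, the PM_MAXCPU constant, and both helpers are hypothetical names chosen for illustration. It only shows how tracking "wants to run" and "stalled" per CPU lets a context switch out decide per CPU.

```c
#include <assert.h>
#include <stdbool.h>

#define PM_MAXCPU 4	/* hypothetical CPU count for this sketch */

/* Hypothetical per-PM state: one pair of flags per CPU. */
struct pm_state {
	bool runnable[PM_MAXCPU];	/* PM desires to run on this CPU */
	bool stalled[PM_MAXCPU];	/* PM stalled on this CPU (ring full) */
};

/* The ring buffer on 'cpu' filled up: stall the PM on that CPU only. */
static inline void
pm_mark_stalled(struct pm_state *pm, int cpu)
{
	assert(cpu >= 0 && cpu < PM_MAXCPU);
	pm->stalled[cpu] = true;
}

/* On context switch out: does the PM on 'cpu' still need to be stopped? */
static inline bool
pm_needs_stop(const struct pm_state *pm, int cpu)
{
	assert(cpu >= 0 && cpu < PM_MAXCPU);
	return (pm->runnable[cpu] && !pm->stalled[cpu]);
}
```

With a single shared flag, a stall on one CPU would make pm_needs_stop() false everywhere; with per-CPU flags, a stall on CPU 1 leaves the answer for CPU 0 unchanged.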
Randall Stewart [Fri, 13 Nov 2015 22:51:35 +0000 (22:51 +0000)]
This fixes several places where the return value of callout_stop() is
examined. The new return code of -1 was mistakenly being considered "true".
callout_stop() now returns -1 to indicate the callout had either already
completed or was not running, and 0 to indicate it could not be stopped.
Also update the manual page to be more consistent about non-zero return
values in the callout_stop and callout_reset descriptions.
MFC after: 1 Month with associated callout change.
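A small sketch of the kind of mistake described above. The enum names and both helpers are hypothetical, and the positive "stopped" value is an assumption of this sketch; only the -1 and 0 meanings come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Return codes as described above; STOP_STOPPED is an assumed value. */
enum {
	STOP_NOT_RUNNING = -1,	/* already completed or was not running */
	STOP_FAILED = 0,	/* could not be stopped */
	STOP_STOPPED = 1	/* assumption: the callout was stopped */
};

/* Problematic pattern: any non-zero value, including -1, is "true". */
static inline bool
stopped_truthy(int rc)
{
	return (rc != 0);
}

/* Stricter pattern: only a positive return counts as "stopped". */
static inline bool
stopped_checked(int rc)
{
	return (rc > 0);
}
```

The two helpers disagree only for -1, which is exactly the case the commit message calls out.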
Brad Davis [Fri, 13 Nov 2015 17:25:20 +0000 (17:25 +0000)]
Fix a few files that were being incorrectly installed as one file. This
was caused by the nvi upgrade fallout in r281994. Add the missing
directories back to the mtree and add a distrib-cleanup target to
retroactively remove the files that should have been directories.
Warner Losh [Fri, 13 Nov 2015 15:36:40 +0000 (15:36 +0000)]
Add support for the Zybo and similar boards to ZEDBOARD kernel.
Zybo needs its own DTB and has a different PHY, so add it to
the base kernel. Details on building bootable SD images at
http://www.thomasskibo.com/zedbsd/
Allow admins to specify a regex which is applied (in the negative) to the
output from df, similar to what security/200.chkmounts does. This can be
useful to avoid listing automounted ZFS snapshots, for instance.
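Applying a regex "in the negative" means keeping only the lines that do not match, as grep -v does. The sketch below illustrates that with POSIX regcomp()/regexec(); the helper name and the example pattern are hypothetical, and keeping every line when the pattern fails to compile is a policy chosen for this sketch, not something the commit states.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if 'line' should still be listed, i.e. it does NOT match
 * the exclusion pattern. A pattern that fails to compile hides nothing
 * (assumption of this sketch).
 */
static inline bool
df_line_kept(const char *pattern, const char *line)
{
	regex_t re;
	bool kept;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return (true);
	kept = (regexec(&re, line, 0, NULL, 0) == REG_NOMATCH);
	regfree(&re);
	return (kept);
}
```

For example, a pattern such as "/\.zfs/snapshot/" would drop df lines for automounted ZFS snapshots while leaving ordinary mount points in the report.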
Bryan Drewery [Fri, 13 Nov 2015 01:47:56 +0000 (01:47 +0000)]
META MODE: Don't double stage SYMLINKS for libraries.
meta.stage.mk is handling ${SYMLINKS:T} for stage_libs already. The logic in
bsd.sys.mk to handle ${SYMLINKS} was brought in r247817 when it was moved out
of bsd.prog.mk and bsd.lib.mk into bsd.sys.mk. The logic previously was
limited to bsd.prog.mk.
This fixes a race, seen easily in lib/libthr, where libpthread_p.a is created
by both stage_libs and stage_symlinks resulting in 'ln: File exists'.
1) Remove my overcomplicated error fallback and just return the error
immediately as the old code does, now for append modes too.
A real use case for such a fallback is impossible (unless specially crafted).
2) Remove a now unneeded include I forgot to remove in previous commits.
John Baldwin [Thu, 12 Nov 2015 22:00:59 +0000 (22:00 +0000)]
Export various helper variables describing the layout and size of
certain kernel structures for use by debuggers. This mostly aids
in examining cores from a kernel without debug symbols as a debugger
can infer these values if debug symbols are available.
One set of variables describes the layout of 'struct linker_file' to
walk the list of loaded kernel modules.
A second set of variables describes the layout of 'struct proc' and
'struct thread' to walk the list of processes in the kernel and the
threads in each process.
The 'pcb_size' variable is used to index into the stoppcbs[] array.
The 'vm_maxuser_address' is used to distinguish kernel virtual addresses
from user addresses. This doesn't have to be perfect, and
'vm_maxuser_address' is a cheap and simple way to differentiate kernel
pointers from simple values like TIDs and PIDs.
While here, annotate the fields in struct pcb used by kgdb on amd64
and i386 to note that their ABI should be preserved. Annotations for
other platforms will be added in the future.
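The 'vm_maxuser_address' heuristic mentioned above can be sketched as a single comparison. The helper name is hypothetical, and treating the boundary value itself as a kernel address is an assumption of this sketch; a debugger would read the real boundary from the exported variable in the core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Heuristic only: values at or above the user address limit are treated
 * as kernel pointers; small integers such as PIDs and TIDs fall below it.
 */
static inline bool
looks_like_kernel_pointer(uint64_t value, uint64_t vm_maxuser_address)
{
	return (value >= vm_maxuser_address);
}
```

The check is deliberately cheap: it does not prove that a value is a valid pointer, only that it cannot be a user address or a small identifier.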
There is no need for the upstream and downstream addresses to be
different for the NTB configs. Switch to a single set of addresses. It
is still possible to configure them differently using a module parameter
override, however (CEM: tunable).
Authored by: Dave Jiang <dave.jiang@intel.com>
Reviewed by: Allen Hubbe <Allen.Hubbe@emc.com>
Reviewed by: Jon Mason <jdmason@kudzu.us>
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Fix integer to pointer of different size conversion warnings when
using GCC for 32-bit platforms. The integer size in this case is
hardcoded 64-bit while the pointer size is 32-bit.
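The commit message does not show how the warnings were fixed. One common way to silence this class of warning, sketched here with hypothetical helper names, is to cast through uintptr_t so that the change of width is explicit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a 64-bit integer that holds an address into a pointer. Going
 * through uintptr_t makes the narrowing explicit on 32-bit platforms,
 * so the compiler sees a same-size integer-to-pointer cast.
 */
static inline void *
u64_to_ptr(uint64_t value)
{
	return ((void *)(uintptr_t)value);
}

/* The reverse direction, widening through uintptr_t as well. */
static inline uint64_t
ptr_to_u64(const void *ptr)
{
	return ((uint64_t)(uintptr_t)ptr);
}
```

On a 32-bit platform the upper 32 bits of the integer are discarded by the first cast, so this is only correct when the value really holds an address for that platform.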
Build fixes:
- Add some missing I/O functions for platforms other than i386 and amd64.
- Stub ioremap() to NULL using a macro to ensure that memory attributes
are not referenced on platforms where they do not exist.
- Add more header files to linux/list.h to resolve driver compilation
issues on Sparc64 and PowerPC platforms.
Warner Losh [Thu, 12 Nov 2015 05:53:32 +0000 (05:53 +0000)]
Make the slice names for root configurable. For embedded platforms, we
need s1 to be a FAT partition, s2 to be the config partition and s3
and s4 to be the ping-pong upgrade partitions.
NANO_SLICE_ROOT defaults to s1
NANO_SLICE_ALTROOT defaults to s2
NANO_SLICE_CFG defaults to s3
NANO_SLICE_DATA defaults to s4
All can be overridden in the config file. Some basic sanity checking
is in place, but is no substitute for being careful.
Edwin Groothuis [Thu, 12 Nov 2015 03:25:04 +0000 (03:25 +0000)]
MFV of 290695,tzdata2015g
Update to tzdata2015g:
Turkey's 2015 fall-back transition is scheduled for Nov. 8, not Oct. 25.
Norfolk moves from +1130 to +1100 on 2015-10-04 at 02:00 local time.
Fiji's 2016 fall-back transition is scheduled for January 17, not 24.
Fort Nelson, British Columbia will not fall back on 2015-11-01. It has
effectively been on MST (-0700) since it advanced its clocks on 2015-03-08.
New zone America/Fort_Nelson.
Edwin Groothuis [Thu, 12 Nov 2015 03:23:58 +0000 (03:23 +0000)]
Vendor import of tzdata2015g:
Update to tzdata2015g:
Turkey's 2015 fall-back transition is scheduled for Nov. 8, not Oct. 25.
Norfolk moves from +1130 to +1100 on 2015-10-04 at 02:00 local time.
Fiji's 2016 fall-back transition is scheduled for January 17, not 24.
Fort Nelson, British Columbia will not fall back on 2015-11-01. It has
effectively been on MST (-0700) since it advanced its clocks on 2015-03-08.
New zone America/Fort_Nelson.
Warner Losh [Thu, 12 Nov 2015 00:26:47 +0000 (00:26 +0000)]
Revisit this old board with 64MB of RAM. Comment out usb entirely,
since it isn't used for my application. Add back the md device since
it's needed for NanoBSD support. Add in many of the small memory
footprint options from the access points.
With these changes we go from having ~8MB to having ~20MB free,
though free + inactive only goes from ~35MB to ~42MB. We can
also boot a nanobsd image mostly (I had to hand tweak what was
built to represent the final goal).
Move the FDT stuff to the top. We're almost ready to pull the trigger
on moving over to FDT, but something in the MCI driver is freaking out
when we do and that needs fixing first.
Bryan Drewery [Wed, 11 Nov 2015 23:52:08 +0000 (23:52 +0000)]
Move META MODE's HOST_CC/CXX/CPP setting to local.meta.sys.mk, which
centralizes the handling of CC and HOST_CC.
This fixes a bug with WITH_CCACHE_BUILD when using MACHINE=host since
CC is overridden in local.init.mk via src.opts.mk long before bsd.compiler.mk
is included.
Originally the ccache implementation was placed in local.init.mk but moved
to bsd.compiler.mk as it seemed more proper and avoided other ordering
issues.
Conrad Meyer [Wed, 11 Nov 2015 18:56:21 +0000 (18:56 +0000)]
if_ntb: MFV c92ba3c5: invalid buf pointer in multi-MW setups
Order of operations issue with the QP Num and MW count, which would
result in the receive buffer pointer being invalid if there is more
than one MW. Corrected with parentheses to enforce the proper order of
operations.
Reported by: John I. Kading <John.Kading@gd-ms.com>
Reported by: Conrad Meyer <cem@FreeBSD.org>
Authored by: Jon Mason <jdmason@kudzu.us>
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
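The commit message above does not quote the expression that was fixed. The sketch below assumes the offset has the form rx_size * qp_num / mw_count, which is an assumption of this example, and uses hypothetical function names to show how parentheses change the result under integer division.

```c
#include <assert.h>
#include <stddef.h>

/* Without parentheses C evaluates (rx_size * qp_num) / mw_count. */
static inline size_t
rx_offset_unparenthesized(size_t rx_size, unsigned qp_num, unsigned mw_count)
{
	assert(mw_count > 0);
	return (rx_size * qp_num / mw_count);
}

/* With parentheses the integer division happens first. */
static inline size_t
rx_offset_parenthesized(size_t rx_size, unsigned qp_num, unsigned mw_count)
{
	assert(mw_count > 0);
	return (rx_size * (qp_num / mw_count));
}
```

With one MW the two forms agree, which is consistent with the problem only showing up in multi-MW setups.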
Conrad Meyer [Wed, 11 Nov 2015 18:54:49 +0000 (18:54 +0000)]
NTB: Skip db_valid validation writing DB link bit
In ntb_poll_link, we are intentionally writing the link bit, which is
absent from db_valid_mask. Don't panic on a kassert when we do so.
The Linux version of this (dual BSD/GPL) driver has the db_valid_mask
assertions in callers of db_iowrite() rather than db_iowrite() itself;
it skips the assertions in the equivalent of ntb_poll_link(). Rather
than duplicating the assertions in every caller, add a db_iowrite_raw()
that doesn't check and use it from ntb_poll_link().
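The checked/raw split described above can be sketched as a thin validating wrapper around an unchecked write. The structure and function names here are hypothetical stand-ins, not the driver's db_iowrite() and db_iowrite_raw(), and a plain struct field stands in for the hardware register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a doorbell register and its valid-bit mask. */
struct db_sketch {
	uint64_t reg;
	uint64_t valid_mask;
};

/* Unchecked write: stores the value without validating any bits. */
static inline void
db_write_raw(struct db_sketch *db, uint64_t val)
{
	db->reg = val;
}

/* Checked write: asserts that only bits inside valid_mask are set. */
static inline void
db_write(struct db_sketch *db, uint64_t val)
{
	assert((val & ~db->valid_mask) == 0);
	db_write_raw(db, val);
}
```

Ordinary callers keep the assertion by using the checked function, while the one caller that intentionally writes a bit outside the mask uses the raw variant.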
Bryan Drewery [Wed, 11 Nov 2015 18:45:48 +0000 (18:45 +0000)]
Use explicit filename when creating locale symlinks to avoid creating a
directory symlink when the target directory does not exist. This will
cause an error instead of a broken setup.
Now that we have mandoc, we can leave $Mdocdate$ tags as-is. Unfortunately,
there is (currently) no way to make Subversion generate correct $Mdocdate$
tags, but perhaps we can teach mandoc to read Subversion's %d format.
Alexander Motin [Wed, 11 Nov 2015 13:18:38 +0000 (13:18 +0000)]
Modify target port groups logic in CTL.
- Introduce the "ha_shared" port option, which, when set to "on", moves the
port into a separate port group shared between HA nodes. This allows
better handling of cases where iSCSI portals are bound to a CARP address
that can dynamically move between nodes. Some initiators (at least VMware)
don't detect that after an iSCSI reconnect they have attached to a different
SCSI port from a different port group, which totally breaks ALUA status parsing.
In theory, I believe, it should be enough to have different iSCSI portal
group tags on different nodes to make initiators detect this condition,
but it seems like VMware ignores those values, and even full LUN retaste
forced by UA does not help.
- Make CTL report up to three port groups: 1 -- non-HA mode or ports
with "ha_shared" option set, 2 -- HA node 1, 3 -- HA node 2.
- Report Transitioning state for all port groups when the HA interlink is
connected, but neither node is primary for the LUN.
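The port group numbering listed above can be written as a small mapping. This is an illustrative sketch, not CTL code: the function name and the convention that ha_node is 0 in non-HA mode and 1 or 2 otherwise are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Port group numbering as listed in the commit message:
 *   1: non-HA mode, or a port with "ha_shared" set
 *   2: HA node 1
 *   3: HA node 2
 * ha_node is 0 for non-HA mode, otherwise 1 or 2 (sketch convention).
 */
static inline int
port_group_id(int ha_node, bool ha_shared)
{
	assert(ha_node >= 0 && ha_node <= 2);
	if (ha_node == 0 || ha_shared)
		return (1);
	return (ha_node + 1);
}
```

A port with "ha_shared" set reports group 1 on either node, so an initiator that reconnects to the other node still sees the same port group.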
Randall Stewart [Tue, 10 Nov 2015 14:49:32 +0000 (14:49 +0000)]
Add a new async_drain to the callout system. It is not used so far, but
TCP should use it in its cleanup of the IN-PCB (coming shortly).
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D4076
Randall Stewart [Tue, 10 Nov 2015 14:14:41 +0000 (14:14 +0000)]
Add a kernel test framework. The callout_test is a demonstration and will only
work with the upcoming async-drain functionality. Tests can be added
to the tests directory and then the framework can be used to launch
those tests.
Josh Paetzel [Tue, 10 Nov 2015 14:14:32 +0000 (14:14 +0000)]
Fix a bug in the CPU % limiting code
If you attempt to set a pcpu limit that is higher than
110% using rctl (for instance, you want a jail to be
able to use 2 cores on your system so you set pcpu to
200%) the thing you are trying to limit becomes unthrottled.
Svatopluk Kraus [Tue, 10 Nov 2015 13:20:21 +0000 (13:20 +0000)]
Fix cp15 PAR definition and function. While here, add cp15 ATS1CPW
function which checks an address for privileged (PL1) write access.
The function is inlined, so it adds no cost, but it makes the set of
functions for checking privileged access complete.
Add mlx5 and mlx5en driver(s) for ConnectX-4 and ConnectX-4LX cards
from Mellanox Technologies. The current driver supports Ethernet
speeds up to and including 100 GBit/s. InfiniBand support will be
done later.
The code added is not compiled by default; enabling it will be done in a
separate commit.
Michal Meloun [Tue, 10 Nov 2015 11:45:41 +0000 (11:45 +0000)]
ARM: Improve robustness of locore_v6.S and fix errors.
- boot page table is not allocated in data section, so must be
cleared before use
- map only one section (1 MB) for SOCDEV mapping (*)
- DSB must be used to ensure that TLB operations have finished
- Invalidate BTB when appropriate
Enji Cooper [Tue, 10 Nov 2015 10:59:40 +0000 (10:59 +0000)]
- Move the testing entries up for netbsd-tests/pjdfstest
- Add pjd to contrib/pjdfstest
- Add atf to the list; add jmmv
- Add tests
- Add share/mk/*.test.mk
Update the wsp driver to support newer touch pads, such as those found in
the MacBookPro11,4 and MacBook12,1. This update adds support for the
force touch parameter.
Return "US-ASCII" instead of "POSIX" for the "C" and "POSIX" locales,
as was done in previous versions of the locales. Returning
"POSIX" causes too much fallout.