.\" Copyright (c) 2003-2007 Joseph Koshy
.\" Copyright (c) 2007 The FreeBSD Foundation
.\" All rights reserved.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" This software is provided by Joseph Koshy ``as is'' and
.\" any express or implied warranties, including, but not limited to, the
.\" implied warranties of merchantability and fitness for a particular purpose
.\" are disclaimed. in no event shall Joseph Koshy be liable
.\" for any direct, indirect, incidental, special, exemplary, or consequential
.\" damages (including, but not limited to, procurement of substitute goods
.\" or services; loss of use, data, or profits; or business interruption)
.\" however caused and on any theory of liability, whether in contract, strict
.\" liability, or tort (including negligence or otherwise) arising in any way
.\" out of the use of this software, even if advised of the possibility of
.Nd "performance measurement with performance monitoring hardware"
.Op Fl M Ar mapfilename
.Op Fl O Ar logfilename
.Op Fl P Ar event-spec
.Op Fl R Ar logfilename
.Op Fl S Ar event-spec
.Op Fl o Ar outputfile
.Op Fl p Ar event-spec
.Op Fl s Ar event-spec
.Op Fl t Ar process-spec
.Op Fl z Ar graphdepth
.Op Ar command Op Ar args
utility measures system performance using the facilities provided by
utility can measure both hardware events seen by the system as a
whole, and those seen when a specified set of processes is executing
If a specific set of processes is being targeted (for example,
option is specified, or if a command line is specified using
then measurement occurs until
exits, or until all target processes specified by the
options exit, or until the
utility is interrupted by the user.
If a specific set of processes is not targeted for measurement, then
will perform system-wide measurements until interrupted by the
can mix allocations of system-mode and process-mode PMCs, of both
counting and sampling flavors.
The values of all counting PMCs are printed in human-readable form
at regular intervals by
The output of sampling PMCs may be configured to go to a log file for
subsequent offline analysis, or, at the expense of greater
overhead, may be configured to be printed in text form on the fly.
Hardware events to measure are specified to
using event specifier strings
The syntax of these event specifiers is machine dependent and is
A process-mode PMC may be configured to be inheritable by the target
process' current and future children.
The following options are available:
.Bl -tag -width indent
Toggle between showing cumulative or incremental counts for
subsequent counting mode PMCs specified on the command line.
The default is to show incremental counts.
Create files with per-program samples in the directory named
The default is to create these files in the current directory.
Toggle showing per-process counts at the time a tracked process
exits for subsequent process-mode PMCs specified on the command line.
This option is useful for mapping the performance characteristics of a
complex pipeline of processes when used in conjunction with the
The default is not to enable per-process tracking.
Print callchain information to file
this information is sent to the output file specified by the
.It Fl M Ar mapfilename
Write the mapping between executable objects encountered in the event
log and the abbreviated pathnames used for
If this option is not specified, mapping information is not written.
in which case this mapping information is sent to the output
file configured by the
Toggle capturing callchain information for subsequent sampling PMCs.
The default is for sampling PMCs to capture callchain information.
.It Fl O Ar logfilename
Send logging output to file
.Ar hostname Ns : Ns Ar port ,
does not start with a
will open a network socket to host
option is not specified and one of the logging options is requested,
will print a textual form of the logged events to the configured
.It Fl P Ar event-spec
Allocate a process mode sampling PMC measuring hardware events
.It Fl R Ar logfilename
Perform offline analysis using sampling data in file
.It Fl S Ar event-spec
Allocate a system mode sampling PMC measuring hardware events
Toggle logging the incremental counts seen by the threads of a
tracked process each time they are scheduled on a CPU.
This is an experimental feature intended to help analyse the
dynamic behaviour of processes in the system.
It may incur substantial overhead if enabled.
The default is for this feature to be disabled.
Set the CPUs for subsequent system mode PMCs specified on the
is a comma-separated list of CPU numbers, or the literal
The default is to allocate system mode PMCs on all active CPUs in
Toggle between process mode PMCs measuring events for the target
process' current and future children or only measuring events for
The default is to measure events for the target process alone.
Produce profiles in a format compatible with
A separate profile file is generated for each executable object
Profile files are placed in sub-directories named by their PMC
.It Fl k Ar kerneldir
Set the pathname of the kernel directory to argument
This directory specifies where
should look for the kernel and its modules.
Set the default sampling rate for subsequent sampling mode
PMCs specified on the command line.
The default is to configure PMCs to sample the CPU's instruction
pointer every 65536 events.
.It Fl o Ar outputfile
Send counter readings and textual representations of logged data
The default is to send output to
when collecting live data and to
when processing a pre-existing logfile.
.It Fl p Ar event-spec
Allocate a process mode counting PMC measuring hardware events
Set the top of the filesystem hierarchy under which executables
are located to argument
.It Fl s Ar event-spec
Allocate a system mode counting PMC measuring hardware events
.It Fl t Ar process-spec
Attach process mode PMCs to the processes named by argument
may be a non-negative integer denoting a specific process id, or a
regular expression for selecting processes based on their command names.
Print the values of all counting mode PMCs every
may be a fractional value.
The default interval is 5 seconds.
.It Fl z Ar graphdepth
When printing system-wide callgraphs, limit callgraphs to the depth
specified by argument
is specified, it is executed using
To perform system-wide statistical sampling on an AMD Athlon CPU with
samples taken every 32768 instruction retirals and data being sampled
.Dl "pmcstat -O sample.stat -n 32768 -S k7-retired-instructions"
and measure the number of data cache misses suffered
by it and its children every 12 seconds on an AMD Athlon, use:
.Dl "pmcstat -d -w 12 -p k7-dc-misses mozilla"
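.Pp
As a variation on the previous example, the same measurement could
report cumulative rather than incremental counts.
This is a sketch that assumes the
.Fl C
toggle is placed before the
.Fl p
option it is meant to affect, since it applies to subsequent counting
mode PMCs:
.Dl "pmcstat -C -d -w 12 -p k7-dc-misses mozilla"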
To measure processor instructions retired for all processes named
.Dl "pmcstat -t '^emacs$' -p instructions"
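.Pp
Per-process counts at process exit can be sketched in the same way.
The example below assumes that
.Fl E
and
.Fl d
both precede the
.Fl p
option they apply to, and uses
.Ic make
purely as a placeholder command that spawns child processes:
.Dl "pmcstat -d -E -p instructions make"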
To count instruction TLB misses on CPUs 0 and 2 on an Intel
Pentium Pro/Pentium III SMP system use:
.Dl "pmcstat -c 0,2 -s p6-itlb-miss"
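.Pp
The reporting interval and destination can be changed for the same
measurement.
As an illustration only, the following prints counts every half second
(the interval may be fractional) and sends them to a file; the pathname
.Pa /tmp/itlb.txt
is a hypothetical choice:
.Dl "pmcstat -c 0,2 -w 0.5 -o /tmp/itlb.txt -s p6-itlb-miss"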
To collect profiling information for a specific process with pid 1234
based on instruction cache misses seen by it use:
.Dl "pmcstat -P ic-misses -t 1234 -O /tmp/sample.out"
To perform system-wide sampling on all configured processors
based on processor instructions retired use:
.Dl "pmcstat -S instructions -O /tmp/sample.out"
If callgraph capture is not desired use:
.Dl "pmcstat -N -S instructions -O /tmp/sample.out"
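.Pp
To sample less often than the default of every 65536 events, a
sampling rate can be given before the sampling PMC it applies to.
The rate of 100000 below is an arbitrary value chosen for illustration:
.Dl "pmcstat -n 100000 -S instructions -O /tmp/sample.out"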
To send the generated event log to a remote machine use:
.Dl "pmcstat -S instructions -O remotehost:port"
On the remote machine, the sample log can be collected using
.Dl "nc -l remotehost port > /tmp/sample.out"
compatible profiles from a sample file use:
.Dl "pmcstat -R /tmp/sample.out -g"
To print a system-wide profile with callgraphs to file
.Dl "pmcstat -R /tmp/sample.out -G foo.graph"
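.Pp
Deep callgraphs can be truncated when printing.
As a sketch, with the depth of 8 chosen arbitrarily, the
.Fl z
option could be combined with the previous command:
.Dl "pmcstat -R /tmp/sample.out -z 8 -G foo.graph"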
Due to the limitations of the
compatible profiles generated by the
option do not contain information about calls that cross executable
files are also only meaningful for native executables.
utility first appeared in
.An Joseph Koshy Aq jkoshy@FreeBSD.org
utility cannot yet analyse
logs generated by non-native architectures.