.\" Copyright (c) 2003-2008 Joseph Koshy. All rights reserved.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.Nd library for accessing hardware performance monitoring counters
Intel Pentium PMCs are present in Intel
These PMCs are documented in the
.Rs
.%B "Intel 64 and IA-32 Intel(R) Architectures Software Developer's Manual"
.%T "Volume 3B: System Programming Guide, Part 2"
.%N "Order Number 253669-024US"
.%Q "Intel Corporation"
.Re
These CPUs contain two PMCs, each 40 bits wide.
These PMCs support the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta \&No
.It PMC_CAP_INTERRUPT Ta \&No
.It PMC_CAP_INVERT Ta \&No
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta \&No
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
.El
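.Pp
As an illustration of the 40-bit counter width, the following sketch
computes the number of events between two raw counter readings.
The helper name p5_pmc_delta is hypothetical and is not part of the library.
The sketch assumes that both readings are raw 40-bit values and that the
counter wrapped around at most once between them; whether values returned
by the library are already widened is not stated here.

```c
#include <assert.h>
#include <stdint.h>

/* Width of a Pentium PMC in bits, as stated in the text above. */
#define P5_PMC_WIDTH	40
#define P5_PMC_MASK	((UINT64_C(1) << P5_PMC_WIDTH) - 1)

/*
 * Hypothetical helper (not a libpmc function): events elapsed between
 * two raw readings of a 40-bit counter.  Assumes the counter wrapped
 * around at most once between the two readings.
 */
static uint64_t
p5_pmc_delta(uint64_t before, uint64_t after)
{
	/* Both readings must fit in 40 bits. */
	assert((before & ~P5_PMC_MASK) == 0);
	assert((after & ~P5_PMC_MASK) == 0);

	/* Unsigned subtraction modulo 2^64, then reduced modulo 2^40. */
	return ((after - before) & P5_PMC_MASK);
}
```

Masking the difference to 40 bits handles the wraparound case without
a branch: a reading of 4 taken after a reading of 2^40 - 1 yields 5.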
Event specifiers for Intel Pentium PMCs can have the following common
.Bl -tag -width indent
Count duration (in clocks) of events.
The default is to count events.
Measure events at privilege levels 0, 1 and 2.
Assert the external processor pin associated with a counter on counter
Measure events at privilege level 3.
.El
qualifiers are specified, the default is to enable both.
Some events may only be used on specific counters and some events
are defined only on processors supporting the MMX instruction set.
Note that these PMCs do not have the ability to interrupt the CPU.
.Ss Intel Pentium Event Specifiers
The event specifiers supported by Intel Pentium PMCs are:
.Bl -tag -width indent
.It Li p5-any-segment-register-loaded
The number of writes to any segment register, including the LDTR,
Far control transfers and task switches that involve privilege
level changes will count this event twice.
.It Li p5-bank-conflicts
The number of actual bank conflicts.
The number of taken and not taken branches including branches, jumps, calls,
software interrupts and interrupt returns.
.It Li p5-breakpoint-match-on-dr0-register
The number of matches on the DR0 breakpoint register.
.It Li p5-breakpoint-match-on-dr1-register
The number of matches on the DR1 breakpoint register.
.It Li p5-breakpoint-match-on-dr2-register
The number of matches on the DR2 breakpoint register.
.It Li p5-breakpoint-match-on-dr3-register
The number of matches on the DR3 breakpoint register.
.It Li p5-btb-false-entries
.Pq Event 3AH , Tn Pentium MMX
The number of false entries in the BTB.
This event is only allocated on counter 0.
The number of branches executed that hit in the branch target buffer.
.It Li p5-btb-miss-prediction-on-not-taken-branch
.Pq Event 3AH , Tn Pentium MMX
The number of times the BTB predicted a not-taken branch as taken.
This event is only allocated on counter 1.
.It Li p5-bus-cycle-duration
The number of cycles while a bus cycle was in progress.
.It Li p5-bus-ownership-latency
.Pq Event 2AH , Tn Pentium MMX
The time from bus ownership being requested to ownership being granted.
This event is only allocated on counter 0.
.It Li p5-bus-ownership-transfers
.Pq Event 2AH , Tn Pentium MMX
The number of bus ownership transfers.
This event is only allocated on counter 1.
.It Li p5-bus-utilization-due-to-processor-activity
.Pq Event 2EH , Tn Pentium MMX
The number of clocks the bus is busy due to the processor's own
This event is only allocated on counter 0.
.It Li p5-cache-line-sharing
.Pq Event 2CH , Tn Pentium MMX
The number of shared data lines in L1 cache.
This event is only allocated on counter 1.
.It Li p5-cache-m-state-line-sharing
.Pq Event 2CH , Tn Pentium MMX
The number of hits to an M- state line due to a memory access by
This event is only allocated on counter 0.
.It Li p5-code-cache-miss
The number of instruction reads that miss the internal code cache.
Both cacheable and un-cacheable misses are counted.
The number of instruction reads to both cacheable and un-cacheable regions.
.It Li p5-code-tlb-miss
The number of instruction reads that miss the instruction TLB.
Both cacheable and un-cacheable reads are counted.
.It Li p5-d1-starvation-and-fifo-is-empty
.Pq Event 33H , Tn Pentium MMX
The number of times the D1 stage cannot issue any instructions because
This event is only allocated on counter 0.
.It Li p5-d1-starvation-and-only-one-instruction-in-fifo
.Pq Event 33H , Tn Pentium MMX
The number of times the D1 stage could issue only one instruction
because the FIFO had one instruction ready.
This event is only allocated on counter 1.
.It Li p5-data-cache-lines-written-back
The number of data cache lines that are written back, including
those caused by internal and external snoops.
.It Li p5-data-cache-tlb-miss-stall-duration
.Pq Event 30H , Tn Pentium MMX
The number of clocks the pipeline is stalled due to a data cache
This event is only allocated on counter 1.
The number of memory data reads, counting internal data cache hits and
I/O and data memory accesses due to TLB miss processing are
Split cycle reads are counted individually.
.It Li p5-data-read-miss
The number of memory read accesses that miss the data cache, counting
both cacheable and un-cacheable accesses.
Data accesses that are part of TLB miss processing are not included.
I/O accesses are not included.
.It Li p5-data-read-miss-or-write-miss
The number of data reads and writes that miss the internal data cache,
counting un-cacheable accesses.
Data accesses due to TLB miss processing are not counted.
.It Li p5-data-read-or-write
The number of data reads and writes including internal data cache hits
Data reads due to TLB miss processing are not counted.
.It Li p5-data-tlb-miss
The number of misses to the data cache translation lookaside buffer.
The number of memory data writes, counting internal data cache hits
I/O is not included and split cycle writes are counted individually.
.It Li p5-data-write-miss
The number of memory write accesses that miss the data cache, counting
both cacheable and un-cacheable accesses.
I/O accesses are not counted.
.It Li p5-emms-instructions-executed
.Pq Event 2DH , Tn Pentium MMX
The number of EMMS instructions executed.
This event is only allocated on counter 0.
.It Li p5-external-data-cache-snoop-hits
The number of external snoops to the data cache that hit a valid line,
or the data line fill buffer, or one of the write back buffers.
.It Li p5-external-snoops
The number of external snoop requests accepted, including snoops that
hit in the code cache, the data cache and that hit in neither.
.It Li p5-floating-point-stalls-duration
.Pq Event 32H , Tn Pentium MMX
The number of cycles the pipeline is stalled due to a floating point
This event is only allocated on counter 0.
The number of floating point adds, subtracts, multiplies, divides and
Transcendental instructions trigger this event multiple times.
Instructions generating divide-by-zero, negative square root, special
operand and stack exceptions are not counted.
Integer multiply instructions that use the x87 FPU are counted.
.It Li p5-full-write-buffer-stall-duration-while-executing-mmx-instructions
.Pq Event 3BH , Tn Pentium MMX
The number of clocks the pipeline has stalled due to full write
buffers when executing MMX instructions.
This event is only allocated on counter 0.
.It Li p5-hardware-interrupts
The number of taken INTR and NMI interrupts.
.It Li p5-instructions-executed
The number of instructions executed.
Repeat prefixed instructions are counted only once.
The HLT instruction is counted only once, irrespective of the number
of cycles spent in the halted state.
All hardware and software exceptions are counted as instructions, and
fault handler invocations are also counted as instructions.
.It Li p5-instructions-executed-v-pipe
The number of instructions that executed in the V pipe.
.It Li p5-io-read-or-write-cycle
The number of bus cycles directed to I/O space.
.It Li p5-locked-bus-cycle
The number of locked bus cycles that occur on account of the lock
prefixes, LOCK instructions, page table updates and descriptor table
.It Li p5-memory-accesses-in-both-pipes
The number of data memory reads or writes that are paired in both pipes.
.It Li p5-misaligned-data-memory-or-io-references
The number of memory or I/O reads or writes that are not aligned on
2- and 4-byte accesses are counted as misaligned if they cross a 4
.It Li p5-misaligned-data-memory-reference-on-mmx-instructions
.Pq Event 36H , Tn Pentium MMX
The number of misaligned data memory references when executing MMX
This event is only allocated on counter 0.
.It Li p5-mispredicted-or-unpredicted-returns
.Pq Event 37H , Tn Pentium MMX
The number of returns predicted incorrectly or not at all, only
counting RET instructions.
This event is only allocated on counter 0.
.It Li p5-mmx-instruction-data-read-misses
.Pq Event 31H , Tn Pentium MMX
The number of MMX instruction data read misses.
This event is only allocated on counter 1.
.It Li p5-mmx-instruction-data-reads
.Pq Event 31H , Tn Pentium MMX
The number of MMX instruction data reads.
This event is only allocated on counter 0.
.It Li p5-mmx-instruction-data-write-misses
.Pq Event 34H , Tn Pentium MMX
The number of data write misses caused by MMX instructions.
This event is only allocated on counter 1.
.It Li p5-mmx-instruction-data-writes
.Pq Event 34H , Tn Pentium MMX
The number of data writes caused by MMX instructions.
This event is only allocated on counter 0.
.It Li p5-mmx-instructions-executed-u-pipe
.Pq Event 2BH , Tn Pentium MMX
The number of MMX instructions executed in the U pipe.
This event is only allocated on counter 0.
.It Li p5-mmx-instructions-executed-v-pipe
.Pq Event 2BH , Tn Pentium MMX
The number of MMX instructions executed in the V pipe.
This event is only allocated on counter 1.
.It Li p5-mmx-multiply-unit-interlock
.Pq Event 38H , Tn Pentium MMX
The number of clocks the pipeline is stalled because the destination
of a prior MMX multiply is not ready.
This event is only allocated on counter 0.
.It Li p5-movd-movq-store-stall-due-to-previous-mmx-operation
.Pq Event 38H , Tn Pentium MMX
The number of clocks a MOVD/MOVQ instruction stalled in the D2 stage
of the pipeline due to a previous MMX instruction.
This event is only allocated on counter 1.
.It Li p5-noncacheable-memory-reads
The number of bus cycles for non-cacheable instruction or data reads,
including cycles caused by TLB misses.
.It Li p5-number-of-cycles-not-in-halt-state
.Pq Event 30H , Tn Pentium MMX
The number of cycles the processor is not idle due to the HLT
This event is only allocated on counter 0.
.It Li p5-pipeline-agi-stalls
The number of address generation interlock stalls.
An AGI that occurs in both the U and V pipelines in the same clock
signals the event twice.
.It Li p5-pipeline-flushes
The number of pipeline flushes that occur.
Pipeline flushes are caused by branch mispredicts, exceptions,
interrupts, some segment register loads, and BTB misses.
Prefetch queue flushes due to serializing instructions are not
.It Li p5-pipeline-flushes-due-to-wrong-branch-predictions
.Pq Event 35H , Tn Pentium MMX
The number of pipeline flushes due to wrong branch predictions
resolved in either the E- or WB- stage of the pipeline.
This event is only allocated on counter 0.
.It Li p5-pipeline-flushes-due-to-wrong-branch-predictions-resolved-in-wb-stage
.Pq Event 35H , Tn Pentium MMX
The number of pipeline flushes due to wrong branch predictions
resolved in the WB- stage of the pipeline.
This event is only allocated on counter 1.
.It Li p5-pipeline-stall-for-mmx-instruction-data-memory-reads
.Pq Event 36H , Tn Pentium MMX
The number of clocks during pipeline stalls caused by waiting MMX data
This event is only allocated on counter 1.
.It Li p5-predicted-returns
.Pq Event 37H , Tn Pentium MMX
The number of predicted returns, whether correct or incorrect.
This counter only counts RET instructions.
This event is only allocated on counter 1.
.Pq Event 39H , Tn Pentium MMX
The number of RET instructions executed.
This event is only allocated on counter 0.
.It Li p5-saturating-mmx-instructions-executed
.Pq Event 2FH , Tn Pentium MMX
The number of saturating MMX instructions executed.
This event is only allocated on counter 0.
.It Li p5-saturations-performed
.Pq Event 2FH , Tn Pentium MMX
The number of saturating MMX instructions executed when at least one
of its results was actually saturated.
This event is only allocated on counter 1.
.It Li p5-stall-on-mmx-instruction-write-to-e-o-m-state-line
.Pq Event 3BH , Tn Pentium MMX
The number of clocks during stalls on MMX instructions writing to
E- or M- state cache lines.
This event is only allocated on counter 1.
.It Li p5-stall-on-write-to-an-e-or-m-state-line
The number of stalls on a write to an exclusive or modified data cache
.It Li p5-taken-branch-or-btb-hit
The number of events that may cause a hit in the BTB, namely either
taken branches or BTB hits.
.It Li p5-taken-branches
.Pq Event 32H , Tn Pentium MMX
The number of taken branches.
This event is only allocated on counter 1.
.It Li p5-transitions-between-mmx-and-fp-instructions
.Pq Event 2DH , Tn Pentium MMX
The number of transitions between MMX and floating-point instructions
This event is only allocated on counter 1.
.It Li p5-waiting-for-data-memory-read-stall-duration
The number of clocks the pipeline was stalled waiting for data
Data TLB miss processing is included in this count.
.It Li p5-write-buffer-full-stall-duration
The number of clocks while the pipeline was stalled due to write
.It Li p5-write-hit-to-m-or-e-state-lines
The number of writes that hit exclusive or modified lines in the data
.It Li p5-writes-to-noncacheable-memory
.Pq Event 2EH , Tn Pentium MMX
The number of writes to non-cacheable memory, including write cycles
caused by TLB misses and I/O writes.
This event is only allocated on counter 1.
.El
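.Pp
Several Pentium MMX events in the list above share an event code and are
distinguished only by the counter they are allocated on.
The following sketch shows one way a program might check whether two such
events can be counted at the same time on the two available PMCs.
The names p5_event, p5_events, p5_find and p5_can_pair are hypothetical and
are not part of the library, and the table holds only a few entries copied
from the list above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A few entries copied from the event list above; not a complete table. */
struct p5_event {
	const char *name;
	int code;	/* event code, e.g. 0x2B for "Event 2BH" */
	int counter;	/* the only counter the event is allocated on */
};

static const struct p5_event p5_events[] = {
	{ "p5-mmx-instructions-executed-u-pipe",	0x2B, 0 },
	{ "p5-mmx-instructions-executed-v-pipe",	0x2B, 1 },
	{ "p5-mmx-instruction-data-reads",		0x31, 0 },
	{ "p5-mmx-instruction-data-read-misses",	0x31, 1 },
	{ "p5-mispredicted-or-unpredicted-returns",	0x37, 0 },
	{ "p5-predicted-returns",			0x37, 1 },
};

/* Hypothetical helper: look up an event by name, or return NULL. */
static const struct p5_event *
p5_find(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(p5_events) / sizeof(p5_events[0]); i++)
		if (strcmp(p5_events[i].name, name) == 0)
			return (&p5_events[i]);
	return (NULL);
}

/*
 * Hypothetical helper: return 1 if both events are in the table and
 * are allocated on different counters, so the two PMCs can count them
 * at the same time; return 0 otherwise.
 */
static int
p5_can_pair(const char *a, const char *b)
{
	const struct p5_event *ea = p5_find(a);
	const struct p5_event *eb = p5_find(b);

	if (ea == NULL || eb == NULL)
		return (0);
	return (ea->counter != eb->counter);
}
```

With this table, the U pipe and V pipe MMX instruction events can be paired,
while two events that are both restricted to counter 0 cannot.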
.Ss Event Name Aliases
The following table shows the mapping between the PMC-independent
and the underlying hardware events used.
.Bl -column "branch-mispredicts" "Description"
.It Em Alias Ta Em Event
.It Li branches Ta Li p5-taken-branches
.It Li branch-mispredicts Ta Li (unsupported)
.It Li dc-misses Ta Li p5-data-read-miss-or-write-miss
.It Li ic-misses Ta Li p5-code-cache-miss
.It Li instructions Ta Li p5-instructions-executed
.It Li interrupts Ta Li p5-hardware-interrupts
.It Li unhalted-cycles Ta Li p5-number-of-cycles-not-in-halt-state
.El
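.Pp
The alias table above can be expressed as a simple lookup.
The following is a minimal sketch only; the names p5_aliases and
p5_resolve_alias are hypothetical, and the sketch does not describe how the
library itself resolves aliases.
An unsupported alias is represented by a NULL event name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Alias-to-event pairs copied from the table above. */
static const struct {
	const char *alias;
	const char *event;	/* NULL when the alias is unsupported */
} p5_aliases[] = {
	{ "branches",		"p5-taken-branches" },
	{ "branch-mispredicts",	NULL },
	{ "dc-misses",		"p5-data-read-miss-or-write-miss" },
	{ "ic-misses",		"p5-code-cache-miss" },
	{ "instructions",	"p5-instructions-executed" },
	{ "interrupts",		"p5-hardware-interrupts" },
	{ "unhalted-cycles",	"p5-number-of-cycles-not-in-halt-state" },
};

/*
 * Hypothetical helper (not a libpmc function): map an alias to its
 * Pentium event name.  Returns NULL for unsupported or unknown aliases.
 */
static const char *
p5_resolve_alias(const char *alias)
{
	size_t i;

	assert(alias != NULL);
	for (i = 0; i < sizeof(p5_aliases) / sizeof(p5_aliases[0]); i++)
		if (strcmp(p5_aliases[i].alias, alias) == 0)
			return (p5_aliases[i].event);
	return (NULL);
}
```

A caller that needs to tell an unsupported alias apart from an unknown one
would need a separate return value; this sketch treats both the same way.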
library first appeared in
library was written by
.An Joseph Koshy Aq Mt jkoshy@FreeBSD.org .