.\" Copyright (c) 2014 Hiren Panchasara <hiren@FreeBSD.org>
.\" All rights reserved.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.Dt PMC.ATOMSILVERMONT 3
.Nm pmc.atomsilvermont
.Nd measurement events for
CPUs contain PMCs conforming to version 3 of the
performance measurement architecture.
These CPUs contain two classes of PMCs:
.Bl -tag -width "Li PMC_CLASS_IAP"
Fixed-function counters that count only one hardware event per counter.
Programmable counters that may be configured to count one of a defined
set of hardware events.
The number of PMCs available in each class and their widths need to be
determined at run time by calling
Intel Atom Silvermont PMCs are documented in
.%B "Intel 64 and IA-32 Intel(R) Architecture Software Developer's Manual"
.%T "Combined Volumes"
.%N "Order Number 325462-050US"
.%Q "Intel Corporation"
.Ss ATOM SILVERMONT FIXED FUNCTION PMCS
These PMCs and their supported events are documented in
.Ss ATOM SILVERMONT PROGRAMMABLE PMCS
The programmable PMCs support the following capabilities:
.Bl -column "PMC_CAP_INTERRUPT" "Support"
.It Em Capability Ta Em Support
.It PMC_CAP_CASCADE Ta \&No
.It PMC_CAP_EDGE Ta Yes
.It PMC_CAP_INTERRUPT Ta Yes
.It PMC_CAP_INVERT Ta Yes
.It PMC_CAP_READ Ta Yes
.It PMC_CAP_PRECISE Ta \&No
.It PMC_CAP_SYSTEM Ta Yes
.It PMC_CAP_TAGGING Ta \&No
.It PMC_CAP_THRESHOLD Ta Yes
.It PMC_CAP_USER Ta Yes
.It PMC_CAP_WRITE Ta Yes
Event specifiers for these PMCs support the following common
.Bl -tag -width indent
Count matching events seen on any logical processor in a package.
.It Li cmask= Ns Ar value
Configure the PMC to increment only if the number of configured
events measured in a cycle is greater than or equal to
Configure the PMC to count the number of de-asserted to asserted
transitions of the conditions expressed by the other qualifiers.
If specified, the counter will increment only once whenever a
condition becomes true, irrespective of the number of clocks during
which the condition remains true.
Invert the sense of comparison when the
qualifier is present, making the counter increment when the number of
events per cycle is less than the value specified by the
Configure the PMC to count events happening at processor privilege
Configure the PMC to count events occurring at privilege levels 1, 2
qualifiers are specified, the default is to enable both.
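.Pp
As an illustration of how the common qualifiers above correspond to hardware
state, the following C sketch packs an event code, a unit mask and a set of
qualifiers into the bit layout of an
.Li IA32_PERFEVTSELx
register as defined for version 3 of the architectural performance
monitoring facility.
The function
.Fn evtsel_encode
and the
.Li Q_*
flag names are hypothetical and are not part of
.Xr pmc 3 ;
the kernel driver performs its own encoding, which may differ in detail.
.Bd -literal -offset indent
```c
#include <stdint.h>

/* Hypothetical flags describing the qualifiers of one event specifier. */
#define Q_USER		(1u << 0)	/* count at user privilege levels */
#define Q_KERNEL	(1u << 1)	/* count at privilege level 0 */
#define Q_EDGE		(1u << 2)	/* count de-asserted to asserted transitions */
#define Q_INVERT	(1u << 3)	/* invert the counter mask comparison */
#define Q_ANYTHREAD	(1u << 4)	/* count on any logical processor */

/* Bit positions in IA32_PERFEVTSELx, architectural perfmon version 3. */
#define EVTSEL_USR	(UINT32_C(1) << 16)
#define EVTSEL_OS	(UINT32_C(1) << 17)
#define EVTSEL_EDGE	(UINT32_C(1) << 18)
#define EVTSEL_ANY	(UINT32_C(1) << 21)
#define EVTSEL_INV	(UINT32_C(1) << 23)

/*
 * Pack an event code, unit mask, qualifier flags and counter mask into
 * an event-select value.  The enable bit (bit 22) is left clear.
 */
uint32_t
evtsel_encode(uint8_t event, uint8_t umask, unsigned int quals, uint8_t cmask)
{
	uint32_t v;

	/* Event select occupies bits 0-7, the unit mask bits 8-15. */
	v = (uint32_t)event | ((uint32_t)umask << 8);

	/* With no privilege qualifier given, count at all privilege levels. */
	if ((quals & (Q_USER | Q_KERNEL)) == 0)
		quals |= Q_USER | Q_KERNEL;

	if (quals & Q_USER)
		v |= EVTSEL_USR;
	if (quals & Q_KERNEL)
		v |= EVTSEL_OS;
	if (quals & Q_EDGE)
		v |= EVTSEL_EDGE;
	if (quals & Q_ANYTHREAD)
		v |= EVTSEL_ANY;
	if (quals & Q_INVERT)
		v |= EVTSEL_INV;

	/* The counter mask (threshold) occupies bits 24-31. */
	v |= (uint32_t)cmask << 24;
	return (v);
}
```
.Ed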
Events that require core-specificity to be specified use a
.Dq Li core= Ns Ar core ,
.Bl -tag -width indent
Measure event conditions on all cores.
Measure event conditions on this core.
Events that require an agent qualifier to be specified use an
.Dq Li agent= Ns Ar agent ,
.Bl -tag -width indent
Measure events associated with this bus agent.
Measure events caused by any bus agent.
Events that require a hardware prefetch qualifier to be specified use an
.Dq Li prefetch= Ns Ar prefetch ,
.Bl -tag -width "exclude"
Include all prefetches.
Only count hardware prefetches.
Exclude hardware prefetches.
Events that require a cache coherence qualifier to be specified use an
.Dq Li cachestate= Ns Ar state ,
contains one or more of the following letters:
.Bl -tag -width indent
Count cache lines in the exclusive state.
Count cache lines in the invalid state.
Count cache lines in the modified state.
Count cache lines in the shared state.
Events that require a snoop response qualifier to be specified use an
.Dq Li snoopresponse= Ns Ar response ,
comprises the following keywords separated by
.Bl -tag -width indent
Measure CLEAN responses.
Measure HIT responses.
Measure HITM responses.
The default is to measure all the above responses.
Events that require a snoop type qualifier use an additional qualifier
.Dq Li snooptype= Ns Ar type ,
comprises one of the following keywords:
.Bl -tag -width indent
Measure CMP2I snoops.
Measure CMP2S snoops.
The default is to measure both snoops.
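.Pp
Qualifiers are appended to an event name to form an event specifier string.
The following C sketch assembles such a string, assuming the
comma-separated form accepted by
.Xr pmc_allocate 3 .
The helper
.Fn evspec_build
is hypothetical and not part of
.Xr pmc 3 ;
it only formats text and does not check that a qualifier is valid for
the chosen event.
.Bd -literal -offset indent
```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write "event,qual1,qual2,..." into buf.
 * Returns 0 on success, or -1 if buf is too small or formatting fails.
 */
int
evspec_build(char *buf, size_t len, const char *event,
    const char *const *quals, size_t nquals)
{
	size_t i, used;
	int n;

	/* Start with the event name itself. */
	n = snprintf(buf, len, "%s", event);
	if (n < 0 || (size_t)n >= len)
		return (-1);
	used = (size_t)n;

	/* Append each qualifier, preceded by a comma. */
	for (i = 0; i < nquals; i++) {
		n = snprintf(buf + used, len - used, ",%s", quals[i]);
		if (n < 0 || (size_t)n >= len - used)
			return (-1);
		used += (size_t)n;
	}
	return (0);
}
```
.Ed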
.Ss Event Specifiers (Programmable PMCs)
Atom Silvermont programmable PMCs support the following events:
.Bl -tag -width indent
.It Li REHABQ.LD_BLOCK_ST_FORWARD
.Pq Event 03H , Umask 01H
The number of retired loads that were
prohibited from receiving forwarded data from the store
because of address mismatch.
.It Li REHABQ.LD_BLOCK_STD_NOTREADY
.Pq Event 03H , Umask 02H
The cases where a forward was technically possible,
but did not occur because the store data was not available
.It Li REHABQ.ST_SPLITS
.Pq Event 03H , Umask 04H
The number of retired stores that experienced
cache line boundary splits.
.It Li REHABQ.LD_SPLITS
.Pq Event 03H , Umask 08H
The number of retired loads that experienced
cache line boundary splits.
.Pq Event 03H , Umask 10H
The number of retired memory operations with lock semantics.
These are either implicit locked instructions such as the
XCHG instruction or instructions with an explicit LOCK
.It Li REHABQ.STA_FULL
.Pq Event 03H , Umask 20H
The number of retired stores that are delayed
because there is not a store address buffer available.
.Pq Event 03H , Umask 40H
The number of load uops reissued from Rehabq.
.Pq Event 03H , Umask 80H
The number of store uops reissued from Rehabq.
.It Li MEM_UOPS_RETIRED.L1_MISS_LOADS
.Pq Event 04H , Umask 01H
The number of load ops retired that miss in L1
Note that prefetch misses will not be counted.
.It Li MEM_UOPS_RETIRED.L2_HIT_LOADS
.Pq Event 04H , Umask 02H
The number of load micro-ops retired that hit L2.
.It Li MEM_UOPS_RETIRED.L2_MISS_LOADS
.Pq Event 04H , Umask 04H
The number of load micro-ops retired that missed L2.
.It Li MEM_UOPS_RETIRED.DTLB_MISS_LOADS
.Pq Event 04H , Umask 08H
The number of load ops retired that had a DTLB miss.
.It Li MEM_UOPS_RETIRED.UTLB_MISS
.Pq Event 04H , Umask 10H
The number of load ops retired that had a UTLB miss.
.It Li MEM_UOPS_RETIRED.HITM
.Pq Event 04H , Umask 20H
The number of load ops retired that got data
from the other core or from the other module.
.It Li MEM_UOPS_RETIRED.ALL_LOADS
.Pq Event 04H , Umask 40H
The number of load ops retired.
.It Li MEM_UOP_RETIRED.ALL_STORES
.Pq Event 04H , Umask 80H
The number of store ops retired.
.It Li PAGE_WALKS.D_SIDE_CYCLES
.Pq Event 05H , Umask 01H
Every cycle when a D-side (walks due to a load) page walk is in progress.
Page walk duration divided by number of page walks is the average duration of
Edge trigger bit must be cleared.
Set Edge to count the number of page walks.
.It Li PAGE_WALKS.I_SIDE_CYCLES
.Pq Event 05H , Umask 02H
Every cycle when an I-side (walks due to an instruction fetch) page walk is in
Page walk duration divided by number of page walks is the average duration of
.It Li PAGE_WALKS.WALKS
.Pq Event 05H , Umask 03H
The number of times a data (D) page walk or an instruction (I) page walk is
completed or started.
Since a page walk implies a TLB miss, the number of TLB misses can be counted
by counting the number of page walks.
.It Li LONGEST_LAT_CACHE.MISS
.Pq Event 2EH , Umask 41H
The total number of L2 cache references and the number of L2 cache misses
L3 is not supported in Silvermont microarchitecture.
.It Li LONGEST_LAT_CACHE.REFERENCE
.Pq Event 2EH , Umask 4FH
The number of requests originating from the core that
reference a cache line in the L2 cache.
L3 is not supported in Silvermont microarchitecture.
.It Li L2_REJECT_XQ.ALL
.Pq Event 30H , Umask 00H
The number of demand and prefetch
transactions that the L2 XQ rejects due to a full or nearly full
condition which likely indicates back pressure from the IDI link.
The XQ may reject transactions from the L2Q (non-cacheable
requests), BBS (L2 misses) and WOB (L2 write-back victims).
.It Li CORE_REJECT_L2Q.ALL
.Pq Event 31H , Umask 00H
The number of demand and L1 prefetcher
requests rejected by the L2Q due to a full or nearly full condition which
likely indicates back pressure from L2Q.
It also counts requests that would have gone directly to the XQ, but are
rejected due to a full or nearly full condition, indicating back pressure from
The L2Q may also reject transactions from a core to ensure fairness between
cores, or to delay a core's dirty eviction when the address conflicts with incoming
(Note that L2 prefetcher requests that are dropped are not counted by this
.It Li CPU_CLK_UNHALTED.CORE_P
.Pq Event 3CH , Umask 00H
The number of core cycles while the core is not in a halt state.
The core enters the halt state when it is running the HLT instruction.
In mobile systems the core frequency may change from time to time.
For this reason this event may have a changing ratio with regard to time.
.It Li CPU_CLK_UNHALTED.REF_P
.Pq Event 3CH , Umask 01H
The number of reference cycles that the core is not in a halt state.
The core enters the halt state when it is running the HLT instruction.
In mobile systems the core frequency may change from time to time.
This event is not affected by core frequency changes but counts as if the core
is running at the maximum frequency all the time.
.Pq Event 80H , Umask 01H
The number of instruction fetches from the instruction cache.
.Pq Event 80H , Umask 02H
The number of instruction fetches that miss the instruction cache or produce
This includes uncacheable fetches.
An instruction fetch miss is counted only once and not once for every cycle
.It Li ICACHE.ACCESSES
.Pq Event 80H , Umask 03H
The number of instruction fetches, including uncacheable fetches.
.It Li NIP_STALL.ICACHE_MISS
.Pq Event B6H , Umask 04H
The number of cycles the NIP stalls because of an icache miss.
This is a cumulative count of cycles the NIP stalled for all
.It Li OFFCORE_RESPONSE_0
.Pq Event B7H , Umask 01H
Requires MSR_OFFCORE_RESP0 to specify request type and response.
.It Li OFFCORE_RESPONSE_1
.Pq Event B7H , Umask 02H
Requires MSR_OFFCORE_RESP1 to specify request type and response.
.It Li INST_RETIRED.ANY_P
.Pq Event C0H , Umask 00H
The number of instructions that retire execution.
For instructions that consist of multiple micro-ops, this event counts the
retirement of the last micro-op of the instruction.
The counter continues counting during hardware interrupts, traps, and inside
.It Li UOPS_RETIRED.MS
.Pq Event C2H , Umask 01H
The number of micro-ops retired that were supplied from MSROM.
.It Li UOPS_RETIRED.ALL
.Pq Event C2H , Umask 10H
The number of micro-ops retired.
.It Li MACHINE_CLEARS.SMC
.Pq Event C3H , Umask 01H
The number of times that a program writes to a code section.
Self-modifying code causes a severe penalty in all Intel
architecture processors.
.It Li MACHINE_CLEARS.MEMORY_ORDERING
.Pq Event C3H , Umask 02H
The number of times that the pipeline was cleared due to memory
.It Li MACHINE_CLEARS.FP_ASSIST
.Pq Event C3H , Umask 04H
The number of times that the pipeline stalled due to FP operations
.It Li MACHINE_CLEARS.ALL
.Pq Event C3H , Umask 08H
The number of times that the pipeline stalled due to any cause
(including SMC, MO, FP assist, etc.).
.It Li BR_INST_RETIRED.ALL_BRANCHES
.Pq Event C4H , Umask 00H
The number of branch instructions retired.
.It Li BR_INST_RETIRED.JCC
.Pq Event C4H , Umask 7EH
The number of branch instructions retired that were conditional
.It Li BR_INST_RETIRED.FAR_BRANCH
.Pq Event C4H , Umask BFH
The number of far branch instructions retired.
.It Li BR_INST_RETIRED.NON_RETURN_IND
.Pq Event C4H , Umask EBH
The number of branch instructions retired that were near indirect
call or near indirect jmp.
.It Li BR_INST_RETIRED.RETURN
.Pq Event C4H , Umask F7H
The number of near RET branch instructions retired.
.It Li BR_INST_RETIRED.CALL
.Pq Event C4H , Umask F9H
The number of near CALL branch instructions retired.
.It Li BR_INST_RETIRED.IND_CALL
.Pq Event C4H , Umask FBH
The number of near indirect CALL branch instructions retired.
.It Li BR_INST_RETIRED.REL_CALL
.Pq Event C4H , Umask FDH
The number of near relative CALL branch instructions retired.
.It Li BR_INST_RETIRED.TAKEN_JCC
.Pq Event C4H , Umask FEH
The number of branch instructions retired that were conditional
jumps and predicted taken.
.It Li BR_MISP_RETIRED.ALL_BRANCHES
.Pq Event C5H , Umask 00H
The number of mispredicted branch instructions retired.
.It Li BR_MISP_RETIRED.JCC
.Pq Event C5H , Umask 7EH
The number of mispredicted branch instructions retired that were
.It Li BR_MISP_RETIRED.FAR
.Pq Event C5H , Umask BFH
The number of mispredicted far branch instructions retired.
.It Li BR_MISP_RETIRED.NON_RETURN_IND
.Pq Event C5H , Umask EBH
The number of mispredicted branch instructions retired that were
near indirect call or near indirect jmp.
.It Li BR_MISP_RETIRED.RETURN
.Pq Event C5H , Umask F7H
The number of mispredicted near RET branch instructions retired.
.It Li BR_MISP_RETIRED.CALL
.Pq Event C5H , Umask F9H
The number of mispredicted near CALL branch instructions retired.
.It Li BR_MISP_RETIRED.IND_CALL
.Pq Event C5H , Umask FBH
The number of mispredicted near indirect CALL branch instructions
.It Li BR_MISP_RETIRED.REL_CALL
.Pq Event C5H , Umask FDH
The number of mispredicted near relative CALL branch instructions
.It Li BR_MISP_RETIRED.TAKEN_JCC
.Pq Event C5H , Umask FEH
The number of mispredicted branch instructions retired that were
conditional jumps and predicted taken.
.It Li NO_ALLOC_CYCLES.ROB_FULL
.Pq Event CAH , Umask 01H
The number of cycles when no uops are allocated and the ROB is full
(less than 2 entries available).
.It Li NO_ALLOC_CYCLES.RAT_STALL
.Pq Event CAH , Umask 20H
The number of cycles when no uops are allocated and a RAT stall is
.It Li NO_ALLOC_CYCLES.ALL
.Pq Event CAH , Umask 3FH
The number of cycles when the front-end does not provide any
instructions to be allocated for any reason.
.It Li NO_ALLOC_CYCLES.NOT_DELIVERED
.Pq Event CAH , Umask 50H
The number of cycles when the front-end does not provide any
instructions to be allocated but the back end is not stalled.
.It Li RS_FULL_STALL.MEC
.Pq Event CBH , Umask 01H
The number of cycles the allocation pipeline stalled because
the RS for the MEC cluster is full.
.It Li RS_FULL_STALL.ALL
.Pq Event CBH , Umask 1FH
The number of cycles that the allocation pipeline stalled because
any one of the RS is full.
.It Li CYCLES_DIV_BUSY.ANY
.Pq Event CDH , Umask 01H
The number of cycles the divider is busy.
.Pq Event E6H , Umask 01H
The number of baclears for any type of branch.
.It Li BACLEARS.RETURN
.Pq Event E6H , Umask 08H
The number of baclears for return branches.
.Pq Event E6H , Umask 10H
The number of baclears for conditional branches.
.It Li MS_DECODED.MS_ENTRY
.Pq Event E7H , Umask 01H
The number of times the MSROM starts a flow of UOPS.
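.Pp
Several of the events above are meant to be combined into ratios, such as the
average page walk duration described under the page walk events.
The following C sketch shows one way to compute such ratios from raw counter
values, for example values obtained with
.Xr pmc_read 3 .
The function names are hypothetical, and the branch misprediction ratio is a
commonly used derived metric rather than one defined by this manual page.
.Bd -literal -offset indent
```c
#include <stdint.h>

/* Divide two counter readings, returning 0.0 when the divisor is zero. */
double
counter_ratio(uint64_t numerator, uint64_t denominator)
{
	if (denominator == 0)
		return (0.0);
	return ((double)numerator / (double)denominator);
}

/*
 * Average page walk duration in cycles: a page walk cycle count (edge
 * qualifier clear) divided by a page walk count (edge qualifier set).
 */
double
avg_page_walk_cycles(uint64_t walk_cycles, uint64_t walks)
{
	return (counter_ratio(walk_cycles, walks));
}

/*
 * Fraction of retired branches that were mispredicted, for example
 * BR_MISP_RETIRED.ALL_BRANCHES over BR_INST_RETIRED.ALL_BRANCHES.
 */
double
branch_mispredict_ratio(uint64_t mispredicted, uint64_t retired)
{
	return (counter_ratio(mispredicted, retired));
}
```
.Ed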
library first appeared in
library was written by
.Aq jkoshy@FreeBSD.org .
The support for the Atom Silvermont
microarchitecture was written by
.An "Hiren Panchasara"
.Aq hiren@FreeBSD.org .