.\" Copyright (c) 2003-2005 Joseph Koshy
.\" All rights reserved.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.Dd September 28, 2005
.Nd "Hardware Performance Monitoring Counter support"
.Cd "options HWPMC_HOOKS"
Additionally, for i386 systems:
driver virtualizes the hardware performance monitoring facilities in
modern CPUs and provides support for using these facilities from
The driver supports multi-processor systems.
PMCs are allocated using the
.Dv PMC_OP_PMCALLOCATE
.Dv PMC_OP_PMCALLOCATE
request will return an integer handle (typically a small integer) to
the requesting process.
Subsequent operations on the allocated PMC use this handle to denote
A process that has successfully allocated a PMC is termed an
PMCs may be allocated to operate in process-private or in system-wide
.Bl -tag -width ".Em Process-private"
.It Em Process-private
In process-private mode, a PMC is active only when a thread belonging
to a process it is attached to is scheduled on a CPU.
In system-wide mode, a PMC operates independently of processes and
measures hardware events for the system as a whole.
driver supports the use of hardware PMCs for counting or for
.Bl -tag -width ".Em Counting"
In counting modes, the PMCs count hardware events.
These counts are retrievable using the
system call on all architectures, though some architectures like the
i386 and amd64 offer faster methods of reading these counts.
In sampling modes, PMCs are configured to sample the CPU
instruction pointer after a configurable number of hardware events
These instruction pointer samples are directed to a log file for
These modes of operation are orthogonal; a PMC may be configured to
operate in one of four modes:
.Bl -tag -width indent
.It Process-private, counting
These PMCs count hardware events whenever a thread in their attached process is
These PMCs normally count from zero, but the initial count may be
Applications can read the value of the PMC at any time using the
.It Process-private, sampling
These PMCs sample the target process's instruction pointer after they
have seen the configured number of hardware events.
The PMCs only count events when a thread belonging to their attached
The desired frequency of sampling is set using the
operation prior to starting the PMC.
Log files are configured using the
.Dv PMC_OP_CONFIGURELOG
.It System-wide, counting
These PMCs count hardware events seen by them independent of the
processes that are executing.
The current count on these PMCs can be read using the
These PMCs normally count from zero, but the initial count may be
.It System-wide, sampling
These PMCs will periodically sample the instruction pointer of the CPU
they are allocated on, and will write the sample to a log for further
The desired frequency of sampling is set using the
operation prior to starting the PMC.
Log files are configured using the
.Dv PMC_OP_CONFIGURELOG
System-wide statistical sampling can only be enabled by a process with
super-user privileges.
Processes are allowed to allocate as many PMCs as the hardware and
current operating conditions permit.
Processes may mix allocations of system-wide and process-private
Multiple processes may concurrently use the facilities
Allocated PMCs are started using the
operation, and stopped using the
Stopping and starting a PMC is permitted at any time the owner process
has a valid handle to the PMC.
Process-private PMCs need to be attached to a target process before
Attaching a process to a PMC is done using the
An already attached PMC may be detached from its target process
operation on an as yet unattached PMC will cause it to be attached
to its owner process.
The following rules determine whether a given process may attach
a PMC to another target process:
A non-jailed process with super-user privileges is allowed to attach
to any other process in the system.
Other processes are only allowed to attach to targets that they would
be able to attach to for debugging (as determined by
PMCs are released using
.Dv PMC_OP_PMCRELEASE .
.Dv PMC_OP_PMCRELEASE
operation the handle to the PMC will become invalid.
.Dv PMC_OP_PMCALLOCATE
operation supports the following flags that modify the behavior
.Bl -tag -width indent
.It Dv PMC_F_DESCENDANTS
This modifier is valid only for a PMC being allocated in process-private
It signifies that the PMC will track hardware events for its
target process and the target's current and future descendants.
This modifier is valid only for a PMC being allocated in system-wide
It signifies that the PMC's sampling interrupt is to be used to drive
.It Dv PMC_F_LOG_PROCCSW
This modifier is valid only for a PMC being allocated in process-private
When this modifier is present, at every context switch,
will log a record containing the number of hardware events
seen by the target process when it was scheduled on the CPU.
.It Dv PMC_F_LOG_PROCEXIT
This modifier is valid only for a PMC being allocated in process-private
With this modifier present,
will maintain per-process counts for each target process attached to
At process exit time, a record containing the target process' PID and
the accumulated per-process count for that process will be written to the
.Dv PMC_F_LOG_PROCEXIT
.Dv PMC_F_LOG_PROCCSW
may be used in combination with modifier
.Dv PMC_F_DESCENDANTS
to track the behaviour of complex pipelines of processes.
.Dv PMC_F_LOG_PROCEXIT
.Dv PMC_F_LOG_PROCCSW
cannot be started until their owner process has configured a log file.
driver may deliver signals to processes that have allocated PMCs:
.Bl -tag -width ".Dv SIGBUS"
operation was attempted on a process-private PMC that does not have
attached target processes.
driver is being unloaded from the kernel.
The recommended way for application programs to use the facilities of
driver is using the API provided by the
driver operates using a system call number that is dynamically
allotted to it when it is loaded into the kernel.
driver supports the following operations:
.Bl -tag -width indent
.It Dv PMC_OP_CONFIGURELOG
Configure a log file for sampling mode PMCs.
.It Dv PMC_OP_FLUSHLOG
Transfer buffered log data inside
to a configured output file.
This operation returns to the caller after the write operation
.It Dv PMC_OP_GETCPUINFO
Retrieve information about the number of CPUs on the system and
the number of hardware performance monitoring counters available per-CPU.
.It Dv PMC_OP_GETDRIVERSTATS
Retrieve module statistics (for analyzing the behavior of
.It Dv PMC_OP_GETMODULEVERSION
Retrieve the version number of the API.
.It Dv PMC_OP_GETPMCINFO
Retrieve information about the current state of the PMCs on a
.It Dv PMC_OP_PMCADMIN
Set the administrative state (i.e., whether enabled or disabled) for
the hardware PMCs managed by the
.It Dv PMC_OP_PMCALLOCATE
Allocate and configure a PMC.
On successful allocation, a handle to the PMC (a small integer)
.It Dv PMC_OP_PMCATTACH
Attach a process mode PMC to a target process.
The PMC will be active whenever a thread in the target process is
.Dv PMC_F_DESCENDANTS
flag had been specified at PMC allocation time, then the PMC is
attached to all current and future descendants of the target process.
.It Dv PMC_OP_PMCDETACH
Detach a PMC from its target process.
.It Dv PMC_OP_PMCRELEASE
Read and write a PMC.
This operation is valid only for PMCs configured in counting modes.
.It Dv PMC_OP_PMCSETCOUNT
Set the initial count (for counting mode PMCs) or the desired sampling
rate (for sampling mode PMCs).
.It Dv PMC_OP_PMCSTART
.It Dv PMC_OP_PMCSTOP
.It Dv PMC_OP_WRITELOG
Insert a timestamped user record into the log file.
.Ss i386 Specific API
Some i386 family CPUs support the RDPMC instruction which allows a
user process to read a PMC value without needing to invoke a
On such CPUs, the machine address associated with an allocated PMC is
retrievable using the
.Dv PMC_OP_PMCX86GETMSR
.Bl -tag -width indent
.It Dv PMC_OP_PMCX86GETMSR
Retrieve the MSR (machine specific register) number associated with
the given PMC handle.
The PMC needs to be in process-private mode and allocated without the
.Dv PMC_F_DESCENDANTS
modifier flag, and should be attached only to its owner process at the
.Ss amd64 Specific API
AMD64 CPUs support the RDPMC instruction which allows a
user process to read a PMC value without needing to invoke a
The machine address associated with an allocated PMC is
retrievable using the
.Dv PMC_OP_PMCX86GETMSR
.Bl -tag -width indent
.It Dv PMC_OP_PMCX86GETMSR
Retrieve the MSR (machine specific register) number associated with
the given PMC handle.
The PMC needs to be in process-private mode and allocated without the
.Dv PMC_F_DESCENDANTS
modifier flag, and should be attached only to its owner process at the
.Sh SYSCTL VARIABLES AND LOADER TUNABLES
is influenced by the following
.Bl -tag -width indent
.It Va kern.hwpmc.debugflags Pq string, read-write
(Only available if the
driver was compiled with
Control the verbosity of debug messages from the
.It Va kern.hwpmc.hashsize Pq integer, read-only
The number of rows in the hash tables used to keep track of owner and
.It Va kern.hwpmc.logbuffersize Pq integer, read-only
The size in kilobytes of each log buffer used by
The default buffer size is 4KB.
.It Va kern.hwpmc.mtxpoolsize Pq integer, read-only
The size of the spin mutex pool used by the PMC driver.
.It Va kern.hwpmc.nbuffers Pq integer, read-only
The number of log buffers used by
.It Va kern.hwpmc.nsamples Pq integer, read-only
The number of entries in the per-CPU ring buffer used during sampling.
.It Va security.bsd.unprivileged_syspmcs Pq boolean, read-write
If set to non-zero, allow unprivileged processes to allocate system-wide
The default value is 0.
.It Va security.bsd.unprivileged_proc_debug Pq boolean, read-write
driver will only allow privileged processes to attach PMCs to other
These variables may be set in the kernel environment using
.Sh SECURITY CONSIDERATIONS
PMCs may be used to monitor the actual behaviour of the system on hardware.
In situations where this constitutes an undesirable information leak,
the following options are available:
.Va security.bsd.unprivileged_syspmcs
This ensures that unprivileged processes cannot allocate system-wide
PMCs and thus cannot observe the hardware behavior of the system
This tunable may also be set at boot time using
driver into the kernel.
.Va security.bsd.unprivileged_proc_debug
This will ensure that an unprivileged process cannot attach a PMC
to any process other than itself and thus cannot observe the hardware
behavior of other processes with the same credentials.
System administrators should note that on IA-32 platforms
makes the content of the IA-32 TSC counter available to all processes
via the RDTSC instruction.
.Sh IMPLEMENTATION NOTES
The kernel driver requires all physical CPUs in an SMP system to have
identical performance monitoring counter hardware.
Historically, on the x86 architecture,
has permitted user processes running at a processor CPL of 3 to
read the TSC using the RDTSC instruction.
driver preserves these semantics.
.Ss Intel P4/HTT Handling
On CPUs with HTT support, Intel P4 PMCs are capable of qualifying
only a subset of hardware events on a per-logical CPU basis.
Consequently, if HTT is enabled on a system with Intel Pentium 4
driver will reject allocation requests for process-private PMCs that
request counting of hardware events that cannot be counted separately
for each logical CPU.
.Ss Intel Pentium-Pro Handling
Writing a value to the PMC MSRs found in Intel Pentium-Pro style PMCs
.Tn "Intel Pentium Pro" ,
processors) will replicate bit 31 of the
value being written into the upper 8 bits of the MSR,
bringing down the usable width of these PMCs to 31 bits.
For process-virtual PMCs, the
driver implements a workaround in software and makes the corrected 64
bit count available via the
Processes that intend to use RDPMC instructions directly or
that intend to write values larger than 2^31 into these PMCs with
need to be aware of this hardware limitation.
.It "hwpmc: [class/npmc/capabilities]..."
Announce the presence of
with capabilities described by bit string
.It "hwpmc: kernel version (0x%x) does not match module version (0x%x)."
The module loading process failed because a version mismatch was detected
between the currently executing kernel and the module being loaded.
.It "hwpmc: this kernel has not been compiled with 'options HWPMC_HOOKS'."
The module loading process failed because the currently executing kernel
was not configured with the required configuration option
.It "hwpmc: tunable hashsize=%d must be greater than zero."
A zero or negative value was supplied for tunable
.Va kern.hwpmc.hashsize .
.It "hwpmc: tunable logbuffersize=%d must be greater than zero."
A zero or negative value was supplied for tunable
.Va kern.hwpmc.logbuffersize .
.It "hwpmc: tunable nlogbuffers=%d must be greater than zero."
A zero or negative value was supplied for tunable
.Va kern.hwpmc.nlogbuffers .
.It "hwpmc: tunable nsamples=%d out of range."
The value for tunable
.Va kern.hwpmc.nsamples
was negative or greater than 65535.
The API and ABI documented in this manual page may change in
The recommended method of accessing this driver is using the
A command issued to the
driver may fail with the following errors:
.Dv PMC_OP_CONFIGURELOG
operation was requested while an existing log was active.
A DISABLE operation was requested using the
request for a set of hardware resources currently in use for
process-private PMCs.
operation was requested on an active system mode PMC.
operation was requested for a target process that already had another
PMC using the same hardware resources attached to it.
request writing a new value was issued on a PMC that was active.
.Dv PMC_OP_PMCSETCOUNT
request was issued on a PMC that was active.
operation was requested without a log file being configured for a
.Dv PMC_F_LOG_PROCCSW
.Dv PMC_F_LOG_PROCEXIT
request was reissued for a target process that already is the target
A bad address was passed in to the driver.
A process specified an invalid PMC handle.
An invalid CPU number was passed in for a
.Dv PMC_OP_GETPMCINFO
An invalid CPU number was passed in for a
An invalid operation request was passed in for a
An invalid PMC ID was passed in for a
A suitable PMC matching the parameters passed in to a
.Dv PMC_OP_PMCALLOCATE
request could not be allocated.
An invalid PMC mode was requested during a
.Dv PMC_OP_PMCALLOCATE
An invalid CPU number was specified during a
.Dv PMC_OP_PMCALLOCATE
request for a process-private PMC.
request for a system-wide PMC.
.Dv PMC_OP_PMCALLOCATE
request contained unknown flags.
A PMC allocated for system-wide operation was specified with a
request specified an illegal process ID.
request was issued for a PMC not attached to the target process.
request contained illegal flags.
.Dv PMC_OP_PMCX86GETMSR
operation was requested for a PMC not in process-virtual mode, or
for a PMC that is not solely attached to its owner process, or for
a PMC that was allocated with flag
.Dv PMC_F_DESCENDANTS .
(On Intel Pentium 4 CPUs with HTT support)
An allocation request for
a process-private PMC was issued for an event that does not support
counting on a per-logical CPU basis.
The system was not able to allocate kernel memory.
.Dv PMC_OP_PMCX86GETMSR
operation was requested for hardware that does not support reading
PMCs directly with the RDPMC instruction.
.Dv PMC_OP_GETPMCINFO
operation was requested for a disabled CPU.
A system-wide PMC on a disabled CPU was requested to be allocated with
.Dv PMC_OP_PMCALLOCATE .
request was issued for a system-wide PMC that was allocated on a
currently disabled CPU.
.Dv PMC_OP_PMCALLOCATE
request was issued for PMC capabilities not supported
by the specified PMC class.
A sampling mode PMC was requested on a CPU lacking an APIC.
request was issued by a process without super-user
privilege or by a jailed super-user process.
operation was issued for a target process that the current process
does not have permission to attach to.
(i386 and amd64 architectures)
operation was issued on a PMC whose MSR has been retrieved using
.Dv PMC_OP_PMCX86GETMSR .
A process issued a PMC operation request without having allocated any
A process issued a PMC operation request after the PMC was detached
from all of its target processes.
request specified a non-existent process ID.
The target process for a
operation is not being monitored by the
driver first appeared in
The driver samples the state of the kernel's logical processor support
at the time of initialization (i.e., at module load time).
On CPUs supporting logical processors, the driver could misbehave if
logical processors are subsequently enabled or disabled while the
On the i386 architecture, the driver requires that the local APIC on the
CPU be enabled for sampling mode to be supported.
Many single-processor motherboards keep the APIC disabled in BIOS; on
will not support sampling PMCs.