.\" Copyright (c) 2003-2005 Joseph Koshy
.\" All rights reserved.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.Nd Hardware performance monitoring counter support
driver virtualizes the hardware performance monitoring facilities in
modern CPUs and provides support for using these facilities from
The driver supports multi-processor systems.
PMCs are allocated using the
.Ic PMC_OP_PMCALLOCATE
.Ic PMC_OP_PMCALLOCATE
request will return an integer handle (typically a small integer) to
the requesting process.
Subsequent operations on the allocated PMC use this handle to denote
A process that has successfully allocated a PMC is termed an
PMCs may be allocated to operate in process-private or in system-wide
.Bl -hang -width "XXXXXXXXXXXXXXX"
.It Em Process-private
In process-private mode, a PMC is active only when a thread belonging
to a process it is attached to is scheduled on a CPU.
In system-wide mode, a PMC operates independently of processes and
measures hardware events for the system as a whole.
driver supports the use of hardware PMCs for counting or for
.Bl -hang -width "XXXXXXXXX"
In counting modes, the PMCs count hardware events.
These counts are retrievable using the
system call on all architectures, though some architectures such as
x86 and amd64 offer faster methods of reading these counts.
In sampling modes, the PMCs are configured to sample the CPU
instruction pointer after a configurable number of hardware events
These instruction pointer samples are directed to a log file for
These modes of operation are orthogonal; a PMC may be configured to
operate in one of four modes:
.Bl -tag -width indent
.It Process-private, counting
These PMCs count hardware events whenever a thread in their attached process is
These PMCs normally count from zero, but the initial count may be
Applications can read the value of the PMC at any time using the
.It Process-private, sampling
These PMCs sample the target process's instruction pointer after they
have seen the configured number of hardware events.
The PMCs only count events when a thread belonging to their attached
The desired frequency of sampling is set using the
operation prior to starting the PMC.
Log files are configured using the
.Ic PMC_OP_CONFIGURELOG
.It System-wide, counting
These PMCs count hardware events independently of the
processes that are executing.
The current count on these PMCs can be read using the
These PMCs normally count from zero, but the initial count may be
.It System-wide, sampling
These PMCs periodically sample the instruction pointer of the CPU
they are allocated on, and write the sample to a log for further
The desired frequency of sampling is set using the
operation prior to starting the PMC.
Log files are configured using the
.Ic PMC_OP_CONFIGURELOG
System-wide statistical sampling can only be enabled by a process with
super-user privileges.
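.Pp
The four modes listed above are the product of two independent choices,
scope and kind.
The following Python sketch is an illustration only, not driver code;
the names Scope, Kind, all_modes, describe and requires_superuser are
hypothetical and do not correspond to identifiers in the driver.

```python
from enum import Enum
from itertools import product


class Scope(Enum):
    PROCESS_PRIVATE = "process-private"
    SYSTEM_WIDE = "system-wide"


class Kind(Enum):
    COUNTING = "counting"
    SAMPLING = "sampling"


def all_modes():
    """Return every (scope, kind) pair; the two axes are independent."""
    return list(product(Scope, Kind))


def describe(scope, kind):
    """Name a mode the way the list in this manual does."""
    return f"{scope.value.capitalize()}, {kind.value}"


def requires_superuser(scope, kind):
    """System-wide sampling is the one mode described as super-user only."""
    return scope is Scope.SYSTEM_WIDE and kind is Kind.SAMPLING


for scope, kind in all_modes():
    print(describe(scope, kind), requires_superuser(scope, kind))
```

Enumerating the product makes it easy to check that a tool handles all
four combinations rather than only the common ones.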
Processes are allowed to allocate as many PMCs as the hardware and
current operating conditions permit.
Processes may mix allocations of system-wide and process-private
Multiple processes may concurrently use the facilities
Allocated PMCs are started using the
operation, and stopped using the
Stopping and starting a PMC is permitted at any time the owner process
has a valid handle to the PMC.
Process-private PMCs need to be attached to a target process before
Attaching a process to a PMC is done using the
An already attached PMC may be detached from its target process
operation on an as yet unattached PMC will cause it to be attached
to its owner process.
The following rules determine whether a given process may attach
a PMC to another target process:
A non-jailed process with super-user privileges is allowed to attach
to any other process in the system.
Other processes are only allowed to attach to targets that they would
be able to attach to for debugging (as determined by
PMCs are released using
.Ic PMC_OP_PMCRELEASE .
.Ic PMC_OP_PMCRELEASE
operation, the handle to the PMC becomes invalid.
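.Pp
The handle life cycle described above (allocate, start and stop while the
handle is valid, release) can be pictured with a toy model.
This Python sketch is not the driver's implementation; the class PmcTable,
the exception InvalidHandle, and the choice to number handles from zero
are assumptions made for illustration.

```python
class InvalidHandle(Exception):
    """Raised when an operation names a handle that is not allocated."""


class PmcTable:
    """Toy model of the PMC handles held by one owner process."""

    def __init__(self):
        self._running = {}       # handle -> True while the PMC is started
        self._next_handle = 0    # assumption: handles are small integers from 0

    def allocate(self):
        handle = self._next_handle
        self._next_handle += 1
        self._running[handle] = False
        return handle

    def _check(self, handle):
        if handle not in self._running:
            raise InvalidHandle(handle)

    def start(self, handle):
        self._check(handle)
        self._running[handle] = True

    def stop(self, handle):
        self._check(handle)
        self._running[handle] = False

    def release(self, handle):
        self._check(handle)
        del self._running[handle]   # the handle is invalid from here on


table = PmcTable()
h = table.allocate()
table.start(h)
table.stop(h)
table.release(h)
try:
    table.start(h)
except InvalidHandle:
    print("handle is no longer valid after release")
```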
.Ic PMC_OP_PMCALLOCATE
operation supports the following flags that modify the behavior
.Bl -tag -width indent -compact
.It Dv PMC_F_DESCENDANTS
This modifier is valid only for a PMC being allocated in process-private
It signifies that the PMC will track hardware events for its
target process and the target's current and future descendants.
This modifier is valid only for a PMC being allocated in system-wide
It signifies that the PMC's sampling interrupt is to be used to drive
.It Dv PMC_F_LOG_PROCCSW
This modifier is valid only for a PMC being allocated in process-private
When this modifier is present, at every process context switch,
will append a record containing the count of the hardware events
seen by the process to the configured log file.
.It Dv PMC_F_LOG_PROCEXIT
This modifier is valid only for a PMC being allocated in process-private
With this modifier present,
will maintain per-process counts for each target process attached to
At process exit time, a record containing the target process's pid and
the accumulated per-process count for that process will be written to the
.Dv PMC_F_LOG_PROCEXIT
.Dv PMC_F_LOG_PROCCSW
may be used in combination with the modifier
.Dv PMC_F_DESCENDANTS
to track the behavior of complex pipelines of processes.
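.Pp
The mode restrictions on the allocation flags can be expressed as a simple
bit-mask check.
The Python sketch below is illustrative only: the numeric flag values are
made up for the example (the real values come from the system headers),
the function check_flags is hypothetical, and EINVAL is used as a stand-in
error code.
Only the three flags named in this section are modeled.

```python
import errno

# Hypothetical bit values chosen for this sketch only.
PMC_F_DESCENDANTS = 0x1
PMC_F_LOG_PROCCSW = 0x2
PMC_F_LOG_PROCEXIT = 0x4

# Flags this section describes as valid only in process-private mode.
PROCESS_PRIVATE_ONLY = PMC_F_DESCENDANTS | PMC_F_LOG_PROCCSW | PMC_F_LOG_PROCEXIT
KNOWN_FLAGS = PROCESS_PRIVATE_ONLY


def check_flags(flags, process_private):
    """Return 0 if the flags suit the requested mode, else errno.EINVAL."""
    if flags & ~KNOWN_FLAGS:
        return errno.EINVAL      # unknown flag bits were supplied
    if not process_private and flags & PROCESS_PRIVATE_ONLY:
        return errno.EINVAL      # process-private modifier on a system-wide PMC
    return 0


# Logging at exit combined with descendant tracking, process-private mode.
print(check_flags(PMC_F_LOG_PROCEXIT | PMC_F_DESCENDANTS, process_private=True))
# The same flags on a system-wide PMC are rejected in this model.
print(check_flags(PMC_F_LOG_PROCEXIT | PMC_F_DESCENDANTS, process_private=False))
```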
driver may deliver signals to processes that have allocated PMCs:
.Bl -tag -width "XXXXXXXX" -compact
operation was attempted on a process-private PMC that does not have
attached target processes.
driver is being unloaded from the kernel.
The recommended way for application programs to use the facilities of
driver is to use the API provided by the library
driver operates using a system call number that is dynamically
allotted to it when it is loaded into the kernel.
driver supports the following operations:
.Bl -tag -width indent
.It Ic PMC_OP_CONFIGURELOG
Configure a log file for sampling mode PMCs.
.It Ic PMC_OP_FLUSHLOG
Transfer buffered log data inside
to a configured output file.
This operation returns to the caller after the write operation
.It Ic PMC_OP_GETCPUINFO
Retrieve information about the number of CPUs on the system and
the number of hardware performance monitoring counters available per CPU.
.It Ic PMC_OP_GETDRIVERSTATS
Retrieve module statistics (for analyzing the behavior of
.It Ic PMC_OP_GETMODULEVERSION
Retrieve the version number of the API.
.It Ic PMC_OP_GETPMCINFO
Retrieve information about the current state of the PMCs on a
.It Ic PMC_OP_PMCADMIN
Set the administrative state (i.e., whether enabled or disabled) for
the hardware PMCs managed by the
.It Ic PMC_OP_PMCALLOCATE
Allocate and configure a PMC.
On successful allocation, a handle to the PMC (a small integer)
.It Ic PMC_OP_PMCATTACH
Attach a process-private PMC to a target process.
The PMC will be active whenever a thread in the target process is
.Dv PMC_F_DESCENDANTS
flag was specified at PMC allocation time, then the PMC is
attached to all current and future descendants of the target process.
.It Ic PMC_OP_PMCDETACH
Detach a PMC from its target process.
.It Ic PMC_OP_PMCRELEASE
Read and write a PMC.
This operation is valid only for PMCs configured in counting modes.
.It Ic PMC_OP_PMCSETCOUNT
Set the initial count (for counting mode PMCs) or the desired sampling
rate (for sampling mode PMCs).
.It Ic PMC_OP_PMCSTART
.It Ic PMC_OP_PMCSTOP
.It Ic PMC_OP_WRITELOG
Insert a timestamped user record into the log file.
.Ss i386 SPECIFIC API
Some i386 family CPUs support the RDPMC instruction, which allows a
user process to read a PMC value without needing to invoke a
On such CPUs, the machine address associated with an allocated PMC is
retrievable using the
.Ic PMC_OP_PMCX86GETMSR
.Bl -tag -width indent
.It Ic PMC_OP_PMCX86GETMSR
Retrieve the MSR (machine-specific register) number associated with
the given PMC handle.
The PMC needs to be in process-private mode and allocated without the
.Dv PMC_F_DESCENDANTS
modifier flag, and should be attached only to its owner process at the
.Ss amd64 SPECIFIC API
AMD64 CPUs support the RDPMC instruction, which allows a
user process to read a PMC value without needing to invoke a
The machine address associated with an allocated PMC is
retrievable using the
.Ic PMC_OP_PMCX86GETMSR
.Bl -tag -width indent
.It Ic PMC_OP_PMCX86GETMSR
Retrieve the MSR (machine-specific register) number associated with
the given PMC handle.
The PMC needs to be in process-private mode and allocated without the
.Dv PMC_F_DESCENDANTS
modifier flag, and should be attached only to its owner process at the
is influenced by the following
.Bl -tag -width indent
.It Va kern.hwpmc.debugflags Pq string, read-write
(Only available if the
driver was compiled with
Control the verbosity of debug messages from the
.It Va kern.hwpmc.hashsize Pq integer, read-only
The number of rows in the hash tables used to keep track of owner and
.It Va kern.hwpmc.logbuffersize Pq integer, read-only
The size in kilobytes of each log buffer used by
The default buffer size is 4KB.
.It Va kern.hwpmc.mtxpoolsize Pq integer, read-only
The size of the spin mutex pool used by the PMC driver.
.It Va kern.hwpmc.nbuffers Pq integer, read-only
The number of log buffers used by
.It Va kern.hwpmc.nsamples Pq integer, read-only
The number of entries in the per-CPU ring buffer used during sampling.
.It Va security.bsd.unprivileged_syspmcs Pq boolean, read-write
If set to non-zero, allow unprivileged processes to allocate system-wide
The default value is 0.
.It Va security.bsd.unprivileged_proc_debug Pq boolean, read-write
driver will only allow privileged processes to attach PMCs to other
These variables may be set in the kernel environment using
.Sh SECURITY CONSIDERATIONS
PMCs may be used to monitor the actual behavior of the system on the hardware.
In situations where this constitutes an undesirable information leak,
the following options are available:
.Va "security.bsd.unprivileged_syspmcs"
This ensures that unprivileged processes cannot allocate system-wide
PMCs and thus cannot observe the hardware behavior of the system
This tunable may also be set at boot time using
driver into the kernel.
.Va "security.bsd.unprivileged_proc_debug"
This ensures that an unprivileged process cannot attach a PMC
to any process other than itself and thus cannot observe the hardware
behavior of other processes with the same credentials.
System administrators should note that on IA-32 platforms
makes the content of the IA-32 TSC available to all processes
via the RDTSC instruction.
.Sh IMPLEMENTATION NOTES
The kernel driver requires all physical CPUs in an SMP system to have
identical performance monitoring counter hardware.
.Ss i386 TSC Handling
Historically, on the x86 architecture,
has permitted user processes running at a processor CPL of 3 to
read the TSC using the RDTSC instruction.
driver preserves these semantics.
.Ss Intel P4/HTT Handling
On CPUs with HTT support, Intel P4 PMCs are capable of qualifying
only a subset of hardware events on a per-logical-CPU basis.
Consequently, if HTT is enabled on a system with Intel Pentium 4
driver will reject allocation requests for process-private PMCs that
request counting of hardware events that cannot be counted separately
for each logical CPU.
.Ss Intel Pentium-Pro Handling
Writing a value to the PMC MSRs found in Intel Pentium-Pro style PMCs
.Tn "Intel Pentium Pro" ,
processors) replicates bit 31 of the
value being written into the upper 8 bits of the MSR,
reducing the usable width of these PMCs to 31 bits.
For process-virtual PMCs, the
driver implements a workaround in software and makes the corrected
64-bit count available via the
Processes that intend to use RDPMC instructions directly or
that intend to write values larger than 2^31 into these PMCs with
need to be aware of this hardware limitation.
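.Pp
The effect of the bit 31 replication can be worked through numerically.
The Python sketch below models only the write behavior described above and
is not taken from the driver; the function name ppro_write is hypothetical,
and the 40-bit MSR width is an assumption chosen so that the upper 8 bits
are bits 32 through 39.

```python
MSR_WIDTH = 40        # assumption: total width of the counter MSR in bits
WRITABLE_BITS = 32    # low-order bits taken from the value being written


def ppro_write(value):
    """Model the contents of the MSR after `value` is written to it."""
    low = value & ((1 << WRITABLE_BITS) - 1)
    if low & (1 << 31):
        # Bit 31 is replicated into the upper 8 bits of the MSR.
        upper = ((1 << (MSR_WIDTH - WRITABLE_BITS)) - 1) << WRITABLE_BITS
        return low | upper
    return low


# Values below 2^31 are stored unchanged.
print(hex(ppro_write(0x7FFFFFFF)))   # 0x7fffffff
# Setting bit 31 also sets bits 32-39, so the stored value differs.
print(hex(ppro_write(0x80000000)))   # 0xff80000000
```

In this model only values that fit in 31 bits read back as written, which
matches the usable width stated above.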
.It hwpmc: tunable hashsize=%d must be greater than zero.
A zero or negative value was supplied for tunable
.Va kern.hwpmc.hashsize .
.It hwpmc: tunable logbuffersize=%d must be greater than zero.
A zero or negative value was supplied for tunable
.Va kern.hwpmc.logbuffersize .
.It hwpmc: tunable nlogbuffers=%d must be greater than zero.
A zero or negative value was supplied for tunable
.Va kern.hwpmc.nlogbuffers .
.It hwpmc: tunable nsamples=%d out of range.
The value for tunable
.Va kern.hwpmc.nsamples
was negative or greater than 65535.
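.Pp
The range checks behind these diagnostics can be sketched as follows.
This Python model is illustrative only and is not the driver's code; the
function check_tunables is hypothetical.
It treats zero as invalid for the first three tunables because the message
text requires a value greater than zero, and it applies the stated
0 to 65535 range to nsamples.

```python
def check_tunables(tunables):
    """Return the diagnostic strings for any out-of-range tunable values."""
    messages = []
    for name in ("hashsize", "logbuffersize", "nlogbuffers"):
        value = tunables.get(name)
        if value is not None and value <= 0:
            messages.append(
                f"hwpmc: tunable {name}={value} must be greater than zero.")
    nsamples = tunables.get("nsamples")
    if nsamples is not None and (nsamples < 0 or nsamples > 65535):
        messages.append(f"hwpmc: tunable nsamples={nsamples} out of range.")
    return messages


for line in check_tunables({"hashsize": -1, "nsamples": 70000}):
    print(line)
```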
A command issued to the
driver may fail with the following errors:
operation was requested while an existing log was active.
operation was requested using the
request for a set of hardware resources currently in use for
process-private PMCs.
operation was requested on an active system-wide PMC.
operation was requested for a target process that already had another
PMC using the same hardware resources attached to it.
request writing a new value was issued on a PMC that was active.
.Ic PMC_OP_PMCSETCOUNT
request was issued on a PMC that was active.
request was reissued for a target process that is already the target
A bad address was passed to the driver.
A process specified an invalid PMC handle.
An invalid CPU number was passed in for an
.Ic PMC_OP_GETPMCINFO
An invalid CPU number was passed in for an
An invalid operation request was passed in for an
An invalid PMC ID was passed in for an
A suitable PMC matching the parameters passed to a
.Ic PMC_OP_PMCALLOCATE
request could not be allocated.
An invalid PMC mode was requested during a
.Ic PMC_OP_PMCALLOCATE
An invalid CPU number was specified during a
.Ic PMC_OP_PMCALLOCATE
request for a process-private PMC.
request for a system-wide PMC.
.Ic PMC_OP_PMCALLOCATE
request contained unknown flags.
A PMC allocated for system-wide operation was specified with a
request specified an illegal process ID.
request was issued for a PMC not attached to the target process.
request contained illegal flags.
.Ic PMC_OP_PMCX86GETMSR
operation was requested for a PMC not in process-virtual mode, or
for a PMC that is not solely attached to its owner process, or for
a PMC that was allocated with flag
.Dv PMC_F_DESCENDANTS .
(On Intel Pentium 4 CPUs with HTT support) An allocation request for
a process-private PMC was issued for an event that does not support
counting on a per-logical-CPU basis.
The system was not able to allocate kernel memory.
(i386 architectures) A
.Ic PMC_OP_PMCX86GETMSR
operation was requested for hardware that does not support reading
PMCs directly with the RDPMC instruction.
operation was requested for a disabled CPU.
A system-wide PMC on a disabled CPU was requested to be allocated with
.Ic PMC_OP_PMCALLOCATE .
request was issued for a system-wide PMC that was allocated on a
currently disabled CPU.
request was issued by a process without super-user
privilege or by a jailed super-user process.
operation was issued for a target process that the current process
does not have permission to attach to.
.Pq "i386 and amd64 architectures"
operation was issued on a PMC whose MSR has been retrieved using
.Ic PMC_OP_PMCX86GETMSR .
A process issued a PMC operation request without having allocated any
A process issued a PMC operation request after the PMC was detached
from all of its target processes.
request specified a non-existent process ID.
The target process for a
operation is not being monitored by the
The driver samples the state of the kernel's logical processor support
at the time of initialization (i.e., at module load time).
On CPUs supporting logical processors, the driver could misbehave if
logical processors are subsequently enabled or disabled while the