.\" Copyright (c) 1998, 1999 Kenneth D. Merry.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. The name of the author may not be used to endorse or promote products
.\"    derived from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.Sh NAME
.Nm devstat ,
.Nm devstat_end_transaction ,
.Nm devstat_end_transaction_bio ,
.Nm devstat_end_transaction_bio_bt ,
.Nm devstat_new_entry ,
.Nm devstat_remove_entry ,
.Nm devstat_start_transaction ,
.Nm devstat_start_transaction_bio
.Nd kernel interface for keeping device statistics
.Sh SYNOPSIS
.In sys/devicestat.h
.Ft struct devstat *
.Fo devstat_new_entry
.Fa "const void *dev_name"
.Fa "int unit_number"
.Fa "uint32_t block_size"
.Fa "devstat_support_flags flags"
.Fa "devstat_type_flags device_type"
.Fa "devstat_priority priority"
.Fc
.Ft void
.Fn devstat_remove_entry "struct devstat *ds"
.Ft void
.Fo devstat_start_transaction
.Fa "struct devstat *ds"
.Fa "const struct bintime *now"
.Fc
.Ft void
.Fo devstat_start_transaction_bio
.Fa "struct devstat *ds"
.Fa "struct bio *bp"
.Fc
.Ft void
.Fo devstat_end_transaction
.Fa "struct devstat *ds"
.Fa "uint32_t bytes"
.Fa "devstat_tag_type tag_type"
.Fa "devstat_trans_flags flags"
.Fa "const struct bintime *now"
.Fa "const struct bintime *then"
.Fc
.Ft void
.Fo devstat_end_transaction_bio
.Fa "struct devstat *ds"
.Fa "const struct bio *bp"
.Fc
.Ft void
.Fo devstat_end_transaction_bio_bt
.Fa "struct devstat *ds"
.Fa "const struct bio *bp"
.Fa "const struct bintime *now"
.Fc
.Sh DESCRIPTION
The devstat subsystem is an interface for recording device
statistics, as its name implies.
The idea is to keep reasonably detailed
statistics while using a minimum of CPU time to record them.
Thus, no statistical calculations are actually performed in the kernel
portion of the
.Nm devstat
code.
Instead, that is left for user programs to handle.
.Pp
The historical and antiquated
.Nm devstat
model assumed a single active IO operation per device, which is not accurate
for most disk-like drivers in the 2000s and beyond.
New consumers of the interface should almost certainly use only the "bio"
variants of the start and end transaction routines.
.Pp
.Fn devstat_new_entry
allocates and initializes a
.Vt devstat
structure and returns a pointer to it.
.Fn devstat_new_entry
takes several arguments:
.Bl -tag -width device_type
.It Fa dev_name
The device name, e.g., da, cd, sa.
.It Fa unit_number
The device unit number.
.It Fa block_size
Block size of the device, if supported.
If the device does not support a
block size, or if the block size is unknown at the time the device is added
to the
.Nm devstat
list, it should be set to 0.
.It Fa flags
Flags indicating operations supported or not supported by the device.
See below for details.
.It Fa device_type
The device type.
This is broken into three sections: base device type
(e.g., direct access, CDROM, sequential access), interface type (IDE, SCSI
or other) and a pass-through flag to indicate pass-through devices.
See below for a complete list of types.
.It Fa priority
The device priority.
The priority is used to determine how devices are
sorted within the
.Nm devstat
list of devices.
Devices are sorted first by priority (highest to lowest),
and then by attach order.
See below for a complete list of available
priorities.
.El
.Pp
.Fn devstat_remove_entry
removes a device from the
.Nm devstat
subsystem.
It takes the devstat structure for the device in question as
an argument.
The
.Nm devstat
generation number is incremented and the number of devices is decremented.
.Pp
.Fn devstat_start_transaction
registers the start of a transaction with the
.Nm devstat
subsystem.
Optionally, if the caller already has a
.Fn binuptime
value available, it may be passed in
.Fa now .
Usually the caller can just pass
.Dv NULL
for
.Fa now ,
and the routine will gather the current
.Fn binuptime
itself.
The busy count is incremented with each transaction start.
When a device goes from idle to busy, the system uptime is recorded in the
.Va busy_from
field of the
.Vt devstat
structure.
.Pp
.Fn devstat_start_transaction_bio
records the
.Fn binuptime
in the provided bio's
.Va bio_t0
field and then invokes
.Fn devstat_start_transaction .
.Pp
.Fn devstat_end_transaction
registers the end of a transaction with the
.Nm devstat
subsystem.
It takes six arguments:
.Bl -tag -width tag_type
.It Fa ds
The
.Vt devstat
structure for the device in question.
.It Fa bytes
The number of bytes transferred in this transaction.
.It Fa tag_type
Transaction tag type.
See below for tag types.
.It Fa flags
Transaction flags indicating whether the transaction was a read, write, or
whether no data was transferred.
.It Fa now
The
.Fn binuptime
at the end of the transaction, or
.Dv NULL .
.It Fa then
The
.Fn binuptime
at the beginning of the transaction, or
.Dv NULL .
.El
.Pp
If
.Fa now
is
.Dv NULL ,
it collects the current time from
.Fn binuptime .
If
.Fa then
is
.Dv NULL ,
the operation is not tracked in the
.Vt devstat
.Va duration
table.
.Pp
.Fn devstat_end_transaction_bio
is a thin wrapper for
.Fn devstat_end_transaction_bio_bt
with a
.Dv NULL
.Fa now
parameter.
.Pp
.Fn devstat_end_transaction_bio_bt
is a wrapper for
.Fn devstat_end_transaction
which pulls all needed information from a
.Vt "struct bio"
prepared by
.Fn devstat_start_transaction_bio .
The bio must be ready for
.Fn biodone
(i.e.,
.Va bio_bcount
and
.Va bio_resid
must be correctly initialized).
.Pp
The
.Vt devstat
structure is composed of the following fields:
.Bl -tag -width dev_creation_time
.It Va sequence0 ,
.It Va sequence1
An implementation detail used to gather consistent snapshots of device
statistics.
.It Va start_count
Number of operations started.
.It Va end_count
Number of operations completed.
The busy count
can be calculated by subtracting
.Va end_count
from
.Va start_count .
(The
.Va sequence0
and
.Va sequence1
fields are used to get a consistent snapshot.)
This is the current number of outstanding transactions for the device.
This should never go below zero, and on an idle device it should be zero.
If either one of these conditions is not true, it indicates a problem.
.Pp
There should be one and only one
transaction start event and one transaction end event for each transaction.
.It Va dev_links
Each
.Vt devstat
structure is placed in a linked list when it is registered.
The
.Va dev_links
field contains a pointer to the next entry in the list of
.Nm devstat
devices.
.It Va device_number
The device number is a unique identifier for each device.
The device
number is incremented for each new device that is registered.
The device
number is currently only a 32-bit integer, but it could be enlarged if
someone has a system with more than four billion device arrival events.
.It Va device_name
The device name is a text string given by the registering driver to
identify itself.
(E.g.,
.Dq da ,
.Dq cd ,
.Dq sa ,
etc.)
.It Va unit_number
The unit number identifies the particular instance of the peripheral driver
in question.
.It Va bytes
This array contains the number of bytes that have been read (index
.Dv DEVSTAT_READ ) ,
written (index
.Dv DEVSTAT_WRITE ) ,
freed or erased (index
.Dv DEVSTAT_FREE ) ,
or other (index
.Dv DEVSTAT_NO_DATA ) .
All values are unsigned 64-bit integers.
.It Va operations
This array contains the number of operations of a given type that have been
performed.
The indices are identical to those for
.Va bytes .
.Dv DEVSTAT_NO_DATA
or "other" represents the number of transactions to the device which are
neither reads, writes, nor frees.
For instance, SCSI
drivers often send a test unit ready command to
SCSI devices.
The test unit ready command does not read or write any data.
It merely causes the device to return its status.
.It Va duration
This array contains the total bintime corresponding to completed operations of
a given type.
The indices are identical to those for
.Va bytes .
(Operations that complete using the historical
.Fn devstat_end_transaction
API and do not provide a non-NULL
.Fa then
are not accounted for.)
.It Va busy_time
This is the amount of time that the device busy count has been greater than
zero.
This is only updated when the busy count returns to zero.
.It Va creation_time
This is the time, as reported by
.Fn binuptime ,
that the device was registered.
.It Va block_size
This is the block size of the device, if the device has a block size.
.It Va tag_types
This is an array of counters to record the number of various tag types that
are sent to a device.
See below for a list of tag types.
.It Va busy_from
If the device is not busy, this was the time that a transaction last completed.
If the device is busy, this is the most recent of either the time that the
device became busy, or the time that the last transaction completed.
.It Va flags
These flags indicate which statistics measurements are supported by a
device.
These flags are primarily intended to serve as an aid
to userland programs that decipher the statistics.
.It Va device_type
This is the device type.
It consists of three parts: the device type
(e.g., direct access, CDROM, sequential access), the interface (IDE,
SCSI or other) and whether or not the device in question is a pass-through
device.
See below for a complete list of device types.
.It Va priority
This is the priority.
This is the first parameter used to determine where
to insert a device in the
.Nm devstat
list.
The second parameter is attach order.
See below for a list of available priorities.
.It Va id
Identification for GEOM nodes.
.El
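.Pp
The relationship between the counter fields described above can be
illustrated with a small userland model.
This is only a sketch and not the kernel implementation: the
.Vt model_devstat
structure and the
.Fn model_start ,
.Fn model_end ,
and
.Fn model_busy_count
helpers are hypothetical names introduced here, and locking, timing, and the
sequence numbers are deliberately left out.

```c
#include <stdint.h>

/* Transaction categories, using the values listed in this page. */
enum {
	DEVSTAT_NO_DATA = 0x00,
	DEVSTAT_READ = 0x01,
	DEVSTAT_WRITE = 0x02,
	DEVSTAT_FREE = 0x03,
	DEVSTAT_N_TRANS_FLAGS = 4
};

/* Hypothetical userland stand-in for a few struct devstat counters. */
struct model_devstat {
	unsigned int start_count;	/* operations started */
	unsigned int end_count;		/* operations completed */
	uint64_t bytes[DEVSTAT_N_TRANS_FLAGS];
	uint64_t operations[DEVSTAT_N_TRANS_FLAGS];
};

/* Record the start of one transaction. */
static void
model_start(struct model_devstat *ds)
{
	ds->start_count++;
}

/* Record the end of one transaction of the given category. */
static void
model_end(struct model_devstat *ds, int type, uint32_t nbytes)
{
	ds->bytes[type] += nbytes;
	ds->operations[type]++;
	ds->end_count++;
}

/*
 * Busy count = start_count - end_count.
 * Returns -1 if more transactions ended than started, which the
 * text above describes as an indication of a problem.
 */
static long
model_busy_count(const struct model_devstat *ds)
{
	if (ds->end_count > ds->start_count)
		return (-1);
	return ((long)(ds->start_count - ds->end_count));
}
```

In this model an idle device is one for which
.Fn model_busy_count
returns zero, matching the expectation stated for the real counters.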
.Pp
Each device is given a device type.
Pass-through devices have the same underlying device type and interface as the
device they provide an interface for, but they also have the pass-through flag
set.
The base device types are identical to the
SCSI
device type numbers, so with
SCSI
peripherals, the device type returned from an inquiry is usually ORed with the
SCSI
interface type and the pass-through flag if appropriate.
The device type
flags are as follows:
.Bd -literal -offset indent
typedef enum {
	DEVSTAT_TYPE_DIRECT	= 0x000,
	DEVSTAT_TYPE_SEQUENTIAL	= 0x001,
	DEVSTAT_TYPE_PRINTER	= 0x002,
	DEVSTAT_TYPE_PROCESSOR	= 0x003,
	DEVSTAT_TYPE_WORM	= 0x004,
	DEVSTAT_TYPE_CDROM	= 0x005,
	DEVSTAT_TYPE_SCANNER	= 0x006,
	DEVSTAT_TYPE_OPTICAL	= 0x007,
	DEVSTAT_TYPE_CHANGER	= 0x008,
	DEVSTAT_TYPE_COMM	= 0x009,
	DEVSTAT_TYPE_ASC0	= 0x00a,
	DEVSTAT_TYPE_ASC1	= 0x00b,
	DEVSTAT_TYPE_STORARRAY	= 0x00c,
	DEVSTAT_TYPE_ENCLOSURE	= 0x00d,
	DEVSTAT_TYPE_FLOPPY	= 0x00e,
	DEVSTAT_TYPE_MASK	= 0x00f,
	DEVSTAT_TYPE_IF_SCSI	= 0x010,
	DEVSTAT_TYPE_IF_IDE	= 0x020,
	DEVSTAT_TYPE_IF_OTHER	= 0x030,
	DEVSTAT_TYPE_IF_MASK	= 0x0f0,
	DEVSTAT_TYPE_PASS	= 0x100
} devstat_type_flags;
.Ed
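.Pp
As an illustration of how the three parts of a device type combine, the
following userland sketch composes a type value with bitwise OR and takes it
apart again with the two masks.
The helper names
.Fn type_compose ,
.Fn type_base ,
.Fn type_interface ,
and
.Fn type_is_pass
are hypothetical and are not part of the kernel interface; only the
constants are taken from the list above.

```c
#include <stdbool.h>

/* Subset of the devstat_type_flags values listed in this page. */
enum {
	DEVSTAT_TYPE_DIRECT = 0x000,
	DEVSTAT_TYPE_CDROM = 0x005,
	DEVSTAT_TYPE_MASK = 0x00f,
	DEVSTAT_TYPE_IF_SCSI = 0x010,
	DEVSTAT_TYPE_IF_IDE = 0x020,
	DEVSTAT_TYPE_IF_OTHER = 0x030,
	DEVSTAT_TYPE_IF_MASK = 0x0f0,
	DEVSTAT_TYPE_PASS = 0x100
};

/* Combine base type, interface type and the pass-through flag. */
static int
type_compose(int base, int iface, bool pass)
{
	int t = (base & DEVSTAT_TYPE_MASK) | (iface & DEVSTAT_TYPE_IF_MASK);

	if (pass)
		t |= DEVSTAT_TYPE_PASS;
	return (t);
}

/* Extract the base device type (low four bits). */
static int
type_base(int t)
{
	return (t & DEVSTAT_TYPE_MASK);
}

/* Extract the interface type (second group of four bits). */
static int
type_interface(int t)
{
	return (t & DEVSTAT_TYPE_IF_MASK);
}

/* Report whether the pass-through flag is set. */
static bool
type_is_pass(int t)
{
	return ((t & DEVSTAT_TYPE_PASS) != 0);
}
```

For example, a SCSI CDROM seen through a pass-through driver would, under
this sketch, be composed from
.Dv DEVSTAT_TYPE_CDROM ,
.Dv DEVSTAT_TYPE_IF_SCSI ,
and the pass-through flag.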
.Pp
Devices have a priority associated with them, which controls roughly where
they are placed in the
.Nm devstat
list.
The priorities are as follows:
.Bd -literal -offset indent
typedef enum {
	DEVSTAT_PRIORITY_MIN	= 0x000,
	DEVSTAT_PRIORITY_OTHER	= 0x020,
	DEVSTAT_PRIORITY_PASS	= 0x030,
	DEVSTAT_PRIORITY_FD	= 0x040,
	DEVSTAT_PRIORITY_WFD	= 0x050,
	DEVSTAT_PRIORITY_TAPE	= 0x060,
	DEVSTAT_PRIORITY_CD	= 0x090,
	DEVSTAT_PRIORITY_DISK	= 0x110,
	DEVSTAT_PRIORITY_ARRAY	= 0x120,
	DEVSTAT_PRIORITY_MAX	= 0xfff
} devstat_priority;
.Ed
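.Pp
The ordering rule (highest priority first, then attach order) can be sketched
in userland as a sorted insertion into a singly linked list.
The
.Vt model_dev
structure and
.Fn model_insert
are hypothetical names used only for this illustration; the kernel keeps its
own list and this is not its code.

```c
#include <stddef.h>

/* Hypothetical list node standing in for a registered device. */
struct model_dev {
	const char *name;
	int priority;
	struct model_dev *next;
};

/*
 * Insert dev so that the list stays sorted from highest to lowest
 * priority.  Entries of equal priority are skipped over, so a newly
 * attached device lands after earlier ones with the same priority,
 * which preserves attach order.
 */
static void
model_insert(struct model_dev **head, struct model_dev *dev)
{
	struct model_dev **pp = head;

	while (*pp != NULL && (*pp)->priority >= dev->priority)
		pp = &(*pp)->next;
	dev->next = *pp;
	*pp = dev;
}
```

With the priorities listed above, two disks attached around a CD drive and a
pass-through device would end up ordered as disk, disk, CD, pass-through.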
.Pp
Each device has associated with it flags to indicate what operations are
supported or not supported.
The
.Va devstat_support_flags
values are as follows:
.Bl -tag -width DEVSTAT_NO_ORDERED_TAGS
.It DEVSTAT_ALL_SUPPORTED
Every statistic type is supported by the device.
.It DEVSTAT_NO_BLOCKSIZE
This device does not have a blocksize.
.It DEVSTAT_NO_ORDERED_TAGS
This device does not support ordered tags.
.It DEVSTAT_BS_UNAVAILABLE
This device supports a blocksize, but it is currently unavailable.
This
flag is most often used with removable media drives.
.El
.Pp
Transactions to a device fall into one of four categories, which are
represented in the
.Fa flags
passed into
.Fn devstat_end_transaction .
The transaction types are as follows:
.Bd -literal -offset indent
typedef enum {
	DEVSTAT_NO_DATA	= 0x00,
	DEVSTAT_READ	= 0x01,
	DEVSTAT_WRITE	= 0x02,
	DEVSTAT_FREE	= 0x03
} devstat_trans_flags;
#define DEVSTAT_N_TRANS_FLAGS	4
.Ed
.Pp
.Dv DEVSTAT_NO_DATA
denotes transactions to the device which are neither
reads, writes, nor frees.
For instance, SCSI
drivers often send a test unit ready command to
SCSI devices.
The test unit ready command does not read or write any data.
It merely causes the device to return its status.
.Pp
There are four possible values for the
.Fa tag_type
argument to
.Fn devstat_end_transaction :
.Bl -tag -width DEVSTAT_TAG_ORDERED
.It DEVSTAT_TAG_SIMPLE
The transaction had a simple tag.
.It DEVSTAT_TAG_HEAD
The transaction had a head of queue tag.
.It DEVSTAT_TAG_ORDERED
The transaction had an ordered tag.
.It DEVSTAT_TAG_NONE
The device does not support tags.
.El
.Pp
The tag type values correspond to the lower four bits of the
SCSI tag definitions.
In CAM, for instance, the
.Va tag_action
from the CCB is ANDed with 0xf to determine the tag type to pass in to
.Fn devstat_end_transaction .
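.Pp
Taking the lower four bits of a tag action can be sketched as a simple mask
operation.
The function name
.Fn tag_type_from_action
is hypothetical, and the numeric values assumed here (0 through 3 for the
tag types, and 0x20, 0x21, and 0x22 for the SCSI simple, head of queue, and
ordered queue tag codes in the usage note) are assumptions made for
illustration; they should be checked against the system headers.

```c
/* Assumed numeric values for the four tag types named above. */
enum {
	DEVSTAT_TAG_SIMPLE = 0x00,
	DEVSTAT_TAG_HEAD = 0x01,
	DEVSTAT_TAG_ORDERED = 0x02,
	DEVSTAT_TAG_NONE = 0x03
};

/*
 * Keep only the lower four bits of a tag action, as the text above
 * describes for CAM.
 */
static int
tag_type_from_action(unsigned int tag_action)
{
	return ((int)(tag_action & 0xf));
}
```

Under these assumptions, a tag action of 0x22 maps to
.Dv DEVSTAT_TAG_ORDERED .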
.Pp
There is a macro,
.Dv DEVSTAT_VERSION ,
that is defined in
.In sys/devicestat.h .
This is the current version of the
.Nm devstat
subsystem, and it should be incremented each time a change is made that
would require recompilation of userland programs that access
.Nm devstat
statistics.
Userland programs use this version, via the
.Va kern.devstat.version
sysctl
variable to determine whether they are in sync with the kernel
.Nm devstat
structures.
.Sh HISTORY
The
.Nm devstat
statistics system appeared in
.Fx 3.0 .
.Sh AUTHORS
.An Kenneth Merry Aq Mt ken@FreeBSD.org
.Sh BUGS
There may be a need for
.Fn spl
protection around some of the
.Nm devstat
list manipulation code to ensure, for example, that the list of devices
is not changed while someone is fetching the