2 QoS Management in OpenSM
4 ==============================================================================
6 ==============================================================================
9 2. Full QoS Policy File
10 3. Simplified QoS Policy Definition
11 4. Policy File Syntax Guidelines
12 5. Examples of Full Policy File
13 6. Simplified QoS Policy - Details and Examples
14 7. SL2VL Mapping and VL Arbitration
17 ==============================================================================
19 ==============================================================================
21 When QoS in OpenSM is enabled (-Q or --qos), OpenSM looks for a QoS policy file.
22 The default name of the OpenSM QoS policy file is
23 /usr/local/etc/opensm/qos-policy.conf. The default may be changed by using the
24 -Y or --qos_policy_file option with OpenSM.
26 During fabric initialization and at every heavy sweep, OpenSM parses the QoS
27 policy file, applies its settings to the discovered fabric elements, and
28 enforces the provided policy on client requests. The overall flow for such requests is:
30 - The request is matched against the defined matching rules such that the
31 QoS Level definition is found.
32 - Given the QoS Level, a path search is performed with the
33 restrictions imposed by that level.
35 There are two ways to define QoS policy:
36 - Full policy, where the policy file syntax gives an administrator
37 various ways to match PathRecord/MultiPathRecord (PR/MPR) requests and
38 enforce various QoS constraints on the requested PR/MPR.
39 - Simplified QoS policy definition, where an administrator can
40 match PR/MPR requests by various ULPs and applications running on top of these ULPs.
43 While the full policy syntax is very flexible, in many cases the simplified
44 policy definition would be sufficient.
47 ==============================================================================
48 2. Full QoS Policy File
49 ==============================================================================
51 The QoS policy file has the following sections:
53 I) Port Groups (denoted by port-groups).
54 This section defines zero or more port groups that can be referred to later by
55 matching rules (see below). A port group lists ports by:
57 - Port name, which is a combination of NodeDescription and IB port number
58 - PKey, which means that all the ports in the subnet that belong to
59 a partition with the given PKey belong to this port group
60 - Partition name, which means that all the ports in the subnet that belong
61 to a partition with the given name belong to this port group
62 - Node type, where possible node types are: CA, SWITCH, ROUTER, ALL, and SELF (the node that runs the SM)
65 II) QoS Setup (denoted by qos-setup).
66 This section describes how to set up SL2VL and VL Arbitration tables on
67 various nodes in the fabric.
68 However, this is not supported in OpenSM currently.
69 SL2VL and VLArb tables should be configured in the OpenSM options file
70 (default location - /usr/local/etc/opensm/opensm.conf).
72 III) QoS Levels (denoted by qos-levels).
73 Each QoS Level defines a Service Level (SL) and a few optional fields:
78 When a path search is performed, it honors the restrictions that
79 these QoS Level parameters impose.
80 One QoS level that is mandatory to define is the DEFAULT QoS level. It is
81 applied to a PR/MPR query that does not match any existing match rule.
82 Similar to any other QoS Level, it can also be explicitly referred to by any match rule.
85 IV) QoS Matching Rules (denoted by qos-match-rules).
86 Each PathRecord/MultiPathRecord query that OpenSM receives is matched against
87 the set of matching rules. Rules are scanned in order of appearance in the QoS
88 policy file, so the first match takes precedence.
89 Each rule names the QoS level that will be applied to the matching query.
90 The default QoS level is applied to a query that does not match any rule.
91 Queries can be matched by:
92 - Source port group (whether a source port is a member of a specified group)
93 - Destination port group (same as above, only for destination port)
97 To match a certain matching rule, a PR/MPR query has to match ALL the rule's
98 criteria. However, not all the fields of the PR/MPR query have to appear in the matching rule.
100 For instance, if the rule has a single criterion - Service ID, it will match
101 any query that has this Service ID, regardless of the rest of the query fields.
102 However, if a certain query has only Service ID (which means that this is the
103 only bit in the PR/MPR component mask that is on), it will not match any rule
104 that has other matching criteria besides Service ID.
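
The matching semantics described above can be sketched in Python. This is an
illustrative model only, not OpenSM's implementation: the dictionary-based
representation of rules and queries and the function names are assumptions
made for this example, and a query dictionary holds only the fields whose
component mask bits are set.

```python
# Illustrative model of QoS match-rule semantics; this is NOT OpenSM code.
# The rule/query layout and all names below are hypothetical.

def rule_matches(criteria, query):
    """A query matches only if it satisfies ALL of the rule's criteria."""
    for field, allowed_values in criteria.items():
        if field not in query:            # bit not set in the component mask
            return False
        if query[field] not in allowed_values:
            return False
    return True


def find_qos_level(rules, query, default="DEFAULT"):
    """Scan rules in file order; the first match takes precedence."""
    for rule in rules:
        if rule_matches(rule["criteria"], query):
            return rule["qos_level"]
    return default


rules = [
    {"criteria": {"service_id": {0x6234}, "pkey": {0x0ABC}}, "qos_level": "Both"},
    {"criteria": {"service_id": {0x6234}}, "qos_level": "ServiceOnly"},
]

# Only the Service ID is present: the first rule needs a PKey too, so it is skipped.
print(find_qos_level(rules, {"service_id": 0x6234}))
# Both fields are present: the first rule in file order wins.
print(find_qos_level(rules, {"service_id": 0x6234, "pkey": 0x0ABC}))
# Nothing matches: the DEFAULT QoS level applies.
print(find_qos_level(rules, {"pkey": 0x7FFF}))
```

Query fields that a rule does not mention are simply ignored by that rule,
which is why a rule with a single Service ID criterion matches any query
carrying that Service ID.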
107 ==============================================================================
108 3. Simplified QoS Policy Definition
109 ==============================================================================
111 Simplified QoS policy definition comprises a single section denoted by
112 qos-ulps. Similar to the full QoS policy, it has a list of match rules and
113 their QoS Level, but in this case a match rule has only one criterion - its
114 goal is to match a certain ULP (or a certain application on top of this ULP)
115 PR/MPR request, and QoS Level has only one constraint - Service Level (SL).
116 The simplified policy section may appear in the policy file in combination with
117 the full policy, or as a stand-alone policy definition.
118 See more details and the list of match rule criteria below.
121 ==============================================================================
122 4. Policy File Syntax Guidelines
123 ==============================================================================
125 - Empty lines are ignored.
126 - Leading and trailing blanks are ignored, so
127 the indentation in the example is just for better readability.
128 - Comments start with the pound sign (#) and end at EOL.
129 - Any keyword should be the first non-blank in the line, unless it's a comment.
131 - Keywords that denote section/subsection start have matching closing keywords.
133 - Having a QoS Level named "DEFAULT" is a must - it is applied to PR/MPR
134 requests that didn't match any of the matching rules.
135 - Any section/subsection of the policy file is optional.
138 ==============================================================================
139 5. Examples of Full Policy File
140 ==============================================================================
142 As mentioned earlier, any section of the policy file is optional, and
143 the only mandatory part of the policy file is a default QoS Level.
144 Here's an example of the shortest policy file:
153 The port groups section is missing because there are no match rules, which means
154 that port groups are not referred to anywhere, and there is no need to define
155 them. And since this policy file doesn't have any matching rules, a PR/MPR query
156 won't match any rule, and OpenSM will enforce the default QoS level.
157 Essentially, the above example is equivalent to not having a QoS policy file at all.
160 The following example shows all the possible options and keywords in the
161 policy file and their syntax:
164 # See the comments in the following example.
165 # They explain different keywords and their meaning.
169 port-group # using port GUIDs
171 # "use" is just a description that is used for logging
172 # Other than that, it is just a comment
174 port-guid: 0x10000000000001, 0x10000000000005-0x1000000000FFFA
175 port-guid: 0x1000000000FFFF
179 name: Virtual Servers
180 # The syntax of the port name is as follows:
181 # "node_description/Pnum".
182 # node_description is compared to the NodeDescription of the node,
183 # and "Pnum" is a port number on that node.
184 port-name: vs1 HCA-1/P1, vs2 HCA-1/P1
187 # using partitions defined in the partition policy
194 # using node types: CA, ROUTER, SWITCH, SELF (for node that runs SM)
195 # or ALL (for all the nodes in the subnet)
204 # This section of the policy file describes how to set up SL2VL and VL
205 # Arbitration tables on various nodes in the fabric.
206 # However, this is not supported in OpenSM currently - the section is
207 # parsed and ignored. SL2VL and VLArb tables should be configured in the
208 # OpenSM options file (by default - /usr/local/etc/opensm/opensm.conf).
213 # Having a QoS Level named "DEFAULT" is a must - it is applied to
214 # PR/MPR requests that didn't match any of the matching rules.
217 use: default QoS Level
221 # the whole set: SL, MTU-Limit, Rate-Limit, PKey, Packet Lifetime
233 # Match rules are scanned in order of their appearance in the policy file.
234 # First matched rule takes precedence.
237 # matching by single criterion: QoS class
241 # Name of qos-level to apply to the matching PR/MPR
242 qos-level-name: WholeSet
245 # show matching by destination group and service id
249 service-id: 0x10000000000001, 0x10000000000008-0x10000000000FFF
250 qos-level-name: WholeSet
255 use: match by source group only
256 qos-level-name: DEFAULT
260 use: match by all parameters
262 source: Virtual Servers
264 service-id: 0x0000000000010000-0x000000000001FFFF
266 qos-level-name: WholeSet
272 ==============================================================================
273 6. Simplified QoS Policy - Details and Examples
274 ==============================================================================
276 Simplified QoS policy match rules are tailored for matching ULPs (or some
277 application on top of a ULP) PR/MPR requests. This section has a list of
278 per-ULP (or per-application) match rules and the SL that should be enforced
279 on the matched PR/MPR query.
282 - Default match rule that is applied to PR/MPR query that didn't match any
283 of the other match rules
285 - SDP application with a specific target TCP/IP port range
286 - SRP with a specific target IB port GUID
289 - iSER application with a specific target TCP/IP port range
290 - IPoIB with a default PKey
291 - IPoIB with a specific PKey
292 - any ULP/application with a specific Service ID in the PR/MPR query
293 - any ULP/application with a specific PKey in the PR/MPR query
294 - any ULP/application with a specific target IB port GUID in the PR/MPR query
296 Since any section of the policy file is optional, as long as the basic rules of
297 the file are kept (such as not referring to a nonexistent port group, having a
298 default QoS Level, etc.), the simplified policy section (qos-ulps) can serve
299 as a complete QoS policy file.
300 The shortest policy file in this case would be as follows:
303 default : 0 #default SL
306 It is equivalent to the previous example of the shortest policy file, and it
307 is also equivalent to not having a policy file at all.
309 Below is an example of simplified QoS policy with all the possible keywords:
312 default : 0 # default SL
313 sdp, port-num 30000 : 0 # SL for application running on top
314 # of SDP when a destination
315 # TCP/IP port is 30000
316 sdp, port-num 10000-20000 : 0
317 sdp : 1 # default SL for any other
318 # application running on top of SDP
319 rds : 2 # SL for RDS traffic
320 iser, port-num 900 : 0 # SL for iSER with a specific target
322 iser : 3 # default SL for iSER
323 ipoib, pkey 0x0001 : 0 # SL for IPoIB on partition with
325 ipoib : 4 # default IPoIB partition,
327 any, service-id 0x6234 : 6 # match any PR/MPR query with a
328 # specific Service ID
329 any, pkey 0x0ABC : 6 # match any PR/MPR query with a
331 srp, target-port-guid 0x1234 : 5 # SRP when SRP Target is located on
332 # a specified IB port GUID
333 any, target-port-guid 0x0ABC-0xFFFFF : 6 # match any PR/MPR query with
334 # a specific target port GUID
338 Similar to the full policy definition, matching of PR/MPR queries is done in
339 order of appearance in the QoS policy file, so that the first match takes
340 precedence, except for the "default" rule, which is applied only if the query
341 didn't match any other rule.
343 All other sections of the QoS policy file take precedence over the qos-ulps
344 section. That is, if a policy file has both qos-match-rules and qos-ulps
345 sections, then any query is matched first against the rules in the
346 qos-match-rules section, and only if there was no match, the query is matched
347 against the rules in qos-ulps section.
349 Note that some of these match rules may overlap, so in order to use the
350 simplified QoS definition effectively, it is important to understand how each
351 of the ULPs is matched:
354 IPoIB query is matched by PKey. Default PKey for IPoIB partition is 0x7fff, so
355 the following three match rules are equivalent:
358 ipoib, pkey 0x7fff : <SL>
359 any, pkey 0x7fff : <SL>
362 SDP PR query is matched by Service ID. The Service-ID for SDP is
363 0x000000000001PPPP, where PPPP are 4 hex digits holding the remote TCP/IP Port
364 Number to connect to. The following two match rules are equivalent:
367 any, service-id 0x0000000000010000-0x000000000001ffff : <SL>
370 Similar to SDP, RDS PR query is matched by Service ID. The Service ID for RDS
371 is 0x000000000106PPPP, where PPPP are 4 hex digits holding the remote TCP/IP
372 Port Number to connect to. Default port number for RDS is 0x48CA, which makes
373 a default Service-ID 0x00000000010648CA. The following two match rules are equivalent:
377 any, service-id 0x00000000010648CA : <SL>
380 Similar to RDS, iSER query is matched by Service ID, where the Service ID
381 is also 0x000000000106PPPP. Default port number for iSER is 0x0CBC, which makes
382 a default Service-ID 0x0000000001060CBC. The following two match rules are equivalent:
386 any, service-id 0x0000000001060CBC : <SL>
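
The Service ID layouts given above for SDP, RDS, and iSER can be checked with
a small Python sketch. The constant and function names are illustrative
assumptions, not OpenSM identifiers; the prefixes and default ports are the
ones stated in the text.

```python
# Service ID layouts as described in the text; all names are hypothetical.
SDP_PREFIX = 0x0000000000010000        # SDP:        0x000000000001PPPP
RDS_ISER_PREFIX = 0x0000000001060000   # RDS / iSER: 0x000000000106PPPP


def service_id(prefix, tcp_port):
    """Place a 16-bit TCP/IP port number into the low 4 hex digits."""
    if not 0 <= tcp_port <= 0xFFFF:
        raise ValueError("TCP/IP port must fit in 16 bits")
    return prefix | tcp_port


def fmt(sid):
    """Format a Service ID as a 64-bit hexadecimal string."""
    return "0x%016X" % sid


print(fmt(service_id(SDP_PREFIX, 30000)))        # SDP, target port 30000
print(fmt(service_id(RDS_ISER_PREFIX, 0x48CA)))  # RDS default port
print(fmt(service_id(RDS_ISER_PREFIX, 0x0CBC)))  # iSER default port
```

Because the port occupies the low 16 bits, the full range of SDP Service IDs
is 0x0000000000010000-0x000000000001ffff, which is the range used in the
equivalent "any, service-id" rule for SDP.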
389 Service ID for SRP varies from storage vendor to vendor, thus SRP query is
390 matched by the target IB port GUID. The following two match rules are equivalent:
393 srp, target-port-guid 0x1234 : <SL>
394 any, target-port-guid 0x1234 : <SL>
396 Note that any of the above ULPs might contain a target port GUID in the PR
397 query, so in order for these queries not to be recognized by the QoS manager
398 as SRP, the SRP match rule (or any match rule that refers to the target port
399 GUID only) should be placed at the end of the qos-ulps match rules.
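
The ordering concern above, together with the first-match and "default" rule
behavior of the qos-ulps section, can be illustrated with a short Python
model. It is a sketch under simplifying assumptions, not OpenSM code: each
rule is a hypothetical (field, low, high, SL) tuple, and PKey membership bits
are ignored.

```python
# Illustrative model of qos-ulps rule resolution; names and layout are hypothetical.

def resolve_sl(ulp_rules, query):
    """Return the SL of the first matching rule, or the default SL if none match."""
    default_sl = None
    for field, low, high, sl in ulp_rules:
        if field == "default":
            default_sl = sl               # applied only if nothing else matches
            continue
        value = query.get(field)
        if value is not None and low <= value <= high:
            return sl
    return default_sl


SRP_RULE = ("target_port_guid", 0x1234, 0x1234, 5)
SDP_RULE = ("service_id", 0x0000000000010000, 0x000000000001FFFF, 1)
DEFAULT_RULE = ("default", None, None, 0)

srp_last = [DEFAULT_RULE, SDP_RULE, SRP_RULE]
srp_first = [DEFAULT_RULE, SRP_RULE, SDP_RULE]

# An SDP query that also carries the target port GUID.
sdp_query = {"service_id": 0x0000000000017530, "target_port_guid": 0x1234}

print(resolve_sl(srp_last, sdp_query))    # SDP rule wins: SL 1
print(resolve_sl(srp_first, sdp_query))   # treated as SRP: SL 5
print(resolve_sl(srp_last, {}))           # no match: default SL 0
```

With the SRP rule placed last, the SDP query gets the SDP SL; with the SRP
rule placed first, the same query is matched by target port GUID and receives
the SRP SL instead.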
402 The SL for MPI is manually configured by the MPI admin. OpenSM does not force any SL
403 on the MPI traffic, which is why it is the only ULP that does not appear in
404 the qos-ulps section.
407 ==============================================================================
408 7. SL2VL Mapping and VL Arbitration
409 ==============================================================================
411 The OpenSM cached options file has a set of QoS-related configuration parameters
412 that are used to configure SL2VL mapping and VL arbitration on IB ports.
413 These parameters are:
414 - Max VLs: the maximum number of VLs that will be on the subnet.
415 - High limit: the limit of the High Priority component of the VL Arbitration table.
417 - VLArb low table: Low priority VL Arbitration table (IBA 7.6.9) template.
418 - VLArb high table: High priority VL Arbitration table (IBA 7.6.9) template.
419 - SL2VL: SL2VL Mapping table (IBA 7.6.6) template. It is a list of VLs
420 corresponding to SLs 0-15 (note that VL15 used here means that this SL is dropped).
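
As a rough illustration of the SL2VL template format, the following Python
sketch turns a comma-separated list of 16 VLs into an SL-to-VL map. The
function name is hypothetical and the parsing rules are an assumption based
only on the description above; OpenSM's own parser may behave differently.

```python
# Hypothetical helper for reading an SL2VL template string; not OpenSM code.

def parse_sl2vl(text):
    """Map SL 0-15 to a VL; None marks an SL that is dropped (VL15)."""
    vls = [int(item) for item in text.split(",")]
    if len(vls) != 16:
        raise ValueError("expected 16 VLs, one per SL 0-15")
    mapping = {}
    for sl, vl in enumerate(vls):
        if not 0 <= vl <= 15:
            raise ValueError("VL out of range for SL %d: %d" % (sl, vl))
        mapping[sl] = None if vl == 15 else vl
    return mapping


sl2vl = parse_sl2vl("0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7")
print(sl2vl[3])    # SL3 is carried on VL3
print(sl2vl[15])   # SL15 is carried on VL7

dropped = parse_sl2vl("0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15")
print(dropped[15])  # None: SL15 is dropped
```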
422 There are separate QoS configuration parameter sets for various target types:
423 CAs, routers, switch external ports, and switch's enhanced port 0. The names
424 of such parameters are prefixed by the "qos_<type>_" string. Here is a full list
425 of the currently supported sets:
427 qos_ca_ - QoS configuration parameter set for CAs.
428 qos_rtr_ - parameter set for routers.
429 qos_sw0_ - parameter set for switches' port 0.
430 qos_swe_ - parameter set for switches' external ports.
432 Here is an example of typical default values for CAs and switches' external
433 ports (hard-coded in OpenSM initialization):
437 qos_ca_vlarb_high 0:4,1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0,9:0,10:0,11:0,12:0,13:0,14:0
438 qos_ca_vlarb_low 0:0,1:4,2:4,3:4,4:4,5:4,6:4,7:4,8:4,9:4,10:4,11:4,12:4,13:4,14:4
439 qos_ca_sl2vl 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
443 qos_swe_vlarb_high 0:4,1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0,9:0,10:0,11:0,12:0,13:0,14:0
444 qos_swe_vlarb_low 0:0,1:4,2:4,3:4,4:4,5:4,6:4,7:4,8:4,9:4,10:4,11:4,12:4,13:4,14:4
445 qos_swe_sl2vl 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
447 VL arbitration tables (both high and low) are lists of VL/Weight pairs.
448 Each list entry contains a VL number (values from 0-14), and a weighting value
449 (values 0-255), indicating the number of 64 byte units (credits) which may be
450 transmitted from that VL when its turn in the arbitration occurs. A weight
451 of 0 indicates that this entry should be skipped. If a list entry is
452 programmed for VL15 or for a VL that is not supported or is not currently
453 configured by the port, the port may either skip that entry or send from any
454 supported VL for that entry.
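
The VL/Weight list format described above can be read with a few lines of
Python. This is a minimal sketch, not OpenSM's parser: the function names are
hypothetical, and rejecting VLs outside 0-14 is a choice made for this
example.

```python
# Hypothetical reader for a VLArb table string such as "0:4,1:0,2:0".

def parse_vlarb(table):
    """Parse 'VL:weight,VL:weight,...' into a list of (vl, weight) pairs."""
    entries = []
    for item in table.split(","):
        vl_text, weight_text = item.split(":")
        vl, weight = int(vl_text), int(weight_text)
        if not 0 <= vl <= 14:
            raise ValueError("VL out of range 0-14: %d" % vl)
        if not 0 <= weight <= 255:
            raise ValueError("weight out of range 0-255: %d" % weight)
        entries.append((vl, weight))
    return entries


def active_entries(entries):
    """Entries with a weight of 0 are skipped during arbitration."""
    return [(vl, weight) for vl, weight in entries if weight != 0]


low = parse_vlarb("0:0,1:64,2:128,3:192,4:0,5:64,6:64,7:64")
for vl, weight in active_entries(low):
    # Each credit is a 64-byte unit.
    print("VL%d: up to %d bytes per turn" % (vl, weight * 64))
```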
456 Note that the same VL may be listed multiple times in the High or Low
457 priority arbitration table and, further, may be listed in both tables.
459 The limit of high-priority VLArb table (qos_<type>_high_limit) indicates the
460 number of high-priority packets that can be transmitted without an opportunity
461 to send a low-priority packet. Specifically, the number of bytes that can be
462 sent is high_limit times 4K bytes.
464 A high_limit value of 255 indicates that the byte limit is unbounded.
465 Note: if the 255 value is used, the low priority VLs may be starved.
466 A value of 0 indicates that only a single packet from the high-priority table
467 may be sent before an opportunity is given to the low-priority table.
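
The high_limit rules above amount to a small calculation, sketched here in
Python. The function name is hypothetical, and returning None for the
unbounded case is a convention of this example only.

```python
# Hypothetical helper that interprets a qos_<type>_high_limit value.

def high_limit_bytes(high_limit):
    """Bytes of high-priority traffic allowed before low priority gets a turn.

    Returns None when the limit is unbounded (255). A value of 0 yields 0,
    which per the text means a single high-priority packet may be sent.
    """
    if not 0 <= high_limit <= 255:
        raise ValueError("high_limit must be in the range 0-255")
    if high_limit == 255:
        return None
    return high_limit * 4096


print(high_limit_bytes(6))     # 24576 bytes, i.e. 24KB
print(high_limit_bytes(0))     # 0: one packet, then low priority gets a turn
print(high_limit_bytes(255))   # None: unbounded, low-priority VLs may starve
```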
469 Keep in mind that ports usually transmit packets of size equal to MTU.
470 For instance, for 4KB MTU a single packet will require 64 credits, so in order
471 to achieve effective VL arbitration for packets of 4KB MTU, the weighting
472 values for each VL should be multiples of 64.
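
The relationship between MTU and weights can be worked out as follows. The
Python sketch below uses hypothetical names and relies only on the figures
given above: 64-byte credits and a maximum weight of 255.

```python
# Hypothetical helpers relating MTU size to VLArb weights.
CREDIT_BYTES = 64    # one credit is a 64-byte unit
MAX_WEIGHT = 255     # largest weight a table entry can hold


def credits_per_packet(mtu_bytes):
    """Credits needed to send one MTU-sized packet (rounded up)."""
    return -(-mtu_bytes // CREDIT_BYTES)


def packet_aligned_weights(mtu_bytes):
    """Weights that correspond to a whole number of MTU-sized packets."""
    step = credits_per_packet(mtu_bytes)
    return list(range(step, MAX_WEIGHT + 1, step))


print(credits_per_packet(4096))       # 64
print(packet_aligned_weights(4096))   # [64, 128, 192]
print(packet_aligned_weights(2048))   # multiples of 32 up to 224
```

For a 4KB MTU this leaves only three usable packet-aligned weights per table
entry, since a fourth packet would need a weight of 256.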
474 Below is an example of SL2VL and VL Arbitration configuration on a subnet:
478 qos_ca_vlarb_high 0:4
479 qos_ca_vlarb_low 0:0,1:64,2:128,3:192,4:0,5:64,6:64,7:64
480 qos_ca_sl2vl 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
484 qos_swe_vlarb_high 0:4
485 qos_swe_vlarb_low 0:0,1:64,2:128,3:192,4:0,5:64,6:64,7:64
486 qos_swe_sl2vl 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
488 In this example, there are 8 VLs configured on the subnet: VL0 to VL7. VL0 is
489 defined as a high priority VL, and it is limited to 6 x 4KB = 24KB in a single
490 transmission burst. Such a configuration would suit a VL that needs low latency
491 and uses a small MTU when transmitting packets. The rest of the VLs are defined as low
492 priority VLs with different weights, while VL4 is effectively turned off.
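
To get a feel for the low-priority weights in this example, the Python sketch
below computes each VL's relative share. It rests on a simplifying assumption
that is not stated in the text: that when every low-priority VL always has
data to send, bandwidth is divided roughly in proportion to the weights.

```python
# Weights copied from the qos_ca_vlarb_low line of the example above.
low_weights = {0: 0, 1: 64, 2: 128, 3: 192, 4: 0, 5: 64, 6: 64, 7: 64}

total = sum(low_weights.values())

# Assumed model: share of low-priority bandwidth is weight / total weight.
shares = {vl: weight / total for vl, weight in low_weights.items()}

for vl in sorted(shares):
    if low_weights[vl] == 0:
        print("VL%d: skipped in the low-priority table" % vl)
    else:
        print("VL%d: about %.1f%% of low-priority bandwidth" % (vl, 100 * shares[vl]))
```

Under this assumption VL3 gets about a third of the low-priority bandwidth,
VL2 about two ninths, and VL1, VL5, VL6, and VL7 about one ninth each. VL0
has a weight of 0 in the low table because it is served from the
high-priority table instead.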