DNSOP Working Group                                    Paul Vixie, ISC
INTERNET-DRAFT                                        Akira Kato, WIDE
<draft-ietf-dnsop-respsize-02.txt>                           July 2005

                        DNS Response Size Issues

Status of this Memo

By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Copyright Notice

Copyright (C) The Internet Society (2005). All Rights Reserved.

Abstract

With a mandated default minimum maximum message size of 512 octets,
the DNS protocol presents some special problems for zones wishing to
expose a moderate or high number of authority servers (NS RRs). This
document explains the operational issues caused by, or related to,
this response size limit.

Expires December 2005                                         [Page 1]

INTERNET-DRAFT                  July 2005                     RESPSIZE

1 - Introduction and Overview

1.1. The DNS standard (see [RFC1035 4.2.1]) limits message size to 512
octets. Even though this limitation was due to the required minimum UDP
reassembly limit for IPv4, it is a hard DNS protocol limit and is not
implicitly relaxed by changes in transport, for example to IPv6.

1.2. The EDNS0 standard (see [RFC2671 2.3, 4.5]) permits larger
responses by mutual agreement of the requestor and responder. However,
deployment of EDNS0 cannot be expected to reach every Internet resolver
in the short or medium term. The 512 octet message size limit remains
in practical effect at this time.

1.3. Since DNS responses include a copy of the request, the space
available for response data is somewhat less than the full 512 octets.
For negative responses, there is rarely a space constraint. For
positive and delegation responses, though, every octet must be carefully
and sparingly allocated. This document specifically addresses
delegation response sizes.

2 - Delegation Details

2.1. A delegation response will include the following elements:

   Header Section:      fixed length (12 octets)
   Question Section:    original query (name, class, type)
   Answer Section:      (empty)
   Authority Section:   NS RRset (nameserver names)
   Additional Section:  A and AAAA RRsets (nameserver addresses)

2.2. If the total response size would exceed 512 octets, and if the data
that would not fit belonged in the question, answer, or authority
section, then the TC bit will be set (indicating truncation). This may
cause the requestor to retry using TCP, depending on what information
was desired and what information was omitted. If a retry using TCP is
needed, the total cost of the transaction is much higher. (See [RFC1123
6.1.3.2] for details on the protocol requirement that UDP be attempted
before falling back to TCP.)

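As an illustration of 2.2, the following minimal sketch shows how a
requestor might test the TC bit in the fixed 12-octet header before
deciding whether a TCP retry is worth considering. The function names
and the sample header values are assumptions made for this example and
are not part of any particular resolver.

```python
import struct

TC_BIT = 0x0200  # truncation flag within the 16-bit header flags word


def parse_header(message: bytes) -> dict:
    """Decode the fixed 12-octet DNS header (RFC 1035, section 4.1.1)."""
    if len(message) < 12:
        raise ValueError("DNS message shorter than the 12-octet header")
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", message[:12])
    return {
        "id": ident,
        "qr": bool(flags & 0x8000),
        "tc": bool(flags & TC_BIT),
        "qdcount": qd,
        "ancount": an,
        "nscount": ns,
        "arcount": ar,
    }


def is_truncated(message: bytes) -> bool:
    """True if the responder set TC, i.e. required data did not fit."""
    return parse_header(message)["tc"]


# Hypothetical headers: a truncated referral and a complete one.
truncated = struct.pack("!6H", 0x1234, 0x8200, 1, 0, 13, 0)
complete = struct.pack("!6H", 0x1234, 0x8000, 1, 0, 13, 13)

print(is_truncated(truncated), is_truncated(complete))
```

Whether a set TC bit should actually trigger a TCP retry depends, as
noted above, on what information the requestor wanted.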
2.3. RRsets are never sent partially unless truncation occurs, in which
case the final apparent RRset in the final nonempty section must be
considered "possibly damaged". With or without truncation, the glue
present in the additional data section should be considered "possibly
incomplete", and requestors should be prepared to re-query for any
damaged or missing RRsets. For multi-transport name or mail services,
this can mean querying for an IPv6 (AAAA) RRset even when an IPv4 (A)
RRset is present.

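The re-query decision in 2.3 can be sketched as follows. This is only
an illustration: the function name, the (owner, type) tuple
representation, and the example.net names are all hypothetical, and a
real requestor would apply its own policy about which missing RRsets
are worth fetching.

```python
def missing_glue(ns_names, additional):
    """List (nameserver, rrtype) pairs absent from the additional section.

    ns_names:   nameserver names taken from the NS RRset
    additional: (owner, rrtype) pairs seen in the additional section
    """
    have = {}
    for owner, rrtype in additional:
        have.setdefault(owner.lower().rstrip("."), set()).add(rrtype)

    todo = []
    for ns in ns_names:
        present = have.get(ns.lower().rstrip("."), set())
        for rrtype in ("A", "AAAA"):
            if rrtype not in present:
                todo.append((ns, rrtype))
    return todo


# Hypothetical referral: ns2 arrived with an A RRset but no AAAA RRset.
ns_names = ["ns1.example.net.", "ns2.example.net."]
additional = [
    ("ns1.example.net.", "A"),
    ("ns1.example.net.", "AAAA"),
    ("ns2.example.net.", "A"),
]
print(missing_glue(ns_names, additional))
```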
2.4. DNS label compression allows a domain name to be instantiated only
once per DNS message, and then referenced with a two-octet "pointer"
from other locations in that same DNS message. If all nameserver names
in a message are similar (for example, all ending in
".ROOT-SERVERS.NET"), then more space will be available for
uncompressible data (such as nameserver addresses).

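The saving described in 2.4 can be made concrete with a small encoder.
The sketch below is a simplified illustration, not a complete DNS name
encoder: it writes names back to back into one buffer, lowercases them
for matching, and measures pointer offsets from the start of that
buffer rather than from the start of a real message. The function name
is an assumption made for this example.

```python
def encode_names(names, compress=True):
    """Encode names in wire format, optionally with compression pointers."""
    out = bytearray()
    seen = {}  # suffix -> offset of its first occurrence in the buffer
    for name in names:
        labels = name.rstrip(".").lower().split(".")
        for i, label in enumerate(labels):
            suffix = ".".join(labels[i:])
            if compress and suffix in seen:
                # Two-octet pointer: top two bits set, 14-bit offset.
                out += (0xC000 | seen[suffix]).to_bytes(2, "big")
                break
            seen.setdefault(suffix, len(out))
            raw = label.encode("ascii")
            out.append(len(raw))
            out += raw
        else:
            out.append(0)  # root label ends a name written in full
    return bytes(out)


servers = [letter + ".ROOT-SERVERS.NET." for letter in "ABCDEFGHIJKLM"]
plain = encode_names(servers, compress=False)
packed = encode_names(servers)
print(len(plain), len(packed))  # prints "260 68"
```

With a shared parent, each name after the first costs four octets (a
one-character label plus a pointer) instead of twenty.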
2.5. The query name can be as long as 255 characters of presentation
data, which can be up to 256 octets of network data. In this worst case
scenario, the question section will be 260 octets in size, which would
leave only 240 octets for the authority and additional sections (after
deducting 12 octets for the fixed length header).

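The arithmetic in 2.5 can be checked with a few lines. This sketch
simply restates the figures given above (a 256-octet query name, plus
two octets each for type and class); the constant and function names
are assumptions made for illustration.

```python
MAX_MESSAGE = 512  # message size limit discussed in section 1.1
HEADER = 12        # fixed-length header
QTYPE = 2          # question type field
QCLASS = 2         # question class field


def space_after_question(qname_octets: int) -> int:
    """Octets left for the authority and additional sections."""
    question = qname_octets + QTYPE + QCLASS
    return MAX_MESSAGE - HEADER - question


print(space_after_question(256))  # worst case from 2.5: prints 240
print(space_after_question(64))   # a 64-octet query name: prints 432
```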
2.6. Average and maximum question section sizes can be predicted by the
zone owner, since they will know what names actually exist, and can
measure which ones are queried for most often. For cost and performance
reasons, the majority of requests should be satisfied without
truncation.

2.7. Requestors who deliberately send large queries to force truncation
are only increasing their own costs, and cannot effectively attack the
resources of an authority server since the requestor would have to retry
using TCP to complete the attack. An attack that always used TCP would
have a lower cost.

2.8. The minimum useful number of address records is two, since with
only one address, the probability that it would refer to an unreachable
server is too high. Truncation which occurs after two address records
have been added to the additional data section is therefore less
operationally significant than truncation which occurs earlier.

2.9. The best case is no truncation. This is because many requestors
will retry using TCP by reflex, or will automatically re-query for
RRsets that are "possibly truncated", without considering whether the
omitted data was actually necessary.

2.10. Each added NS RR for a zone will add a minimum of between 16 and
44 octets to every untruncated referral or negative response from the
zone's authority servers (16 octets for an NS RR, 16 octets for an A RR,
and 28 octets for an AAAA RR), in addition to whatever space is taken by
the nameserver name (NS NSDNAME and A/AAAA owner name).

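One way to arrive at the per-record figures in 2.10 is sketched below.
It assumes that every owner name compresses to a two-octet pointer and
that the NSDNAME compresses to a one-character label plus a pointer;
those assumptions, and the names used, are illustrative only.

```python
POINTER = 2                  # compressed owner name
FIXED_FIELDS = 2 + 2 + 4 + 2  # TYPE, CLASS, TTL, RDLENGTH


def rr_size(rdata_octets: int, owner_octets: int = POINTER) -> int:
    """Wire size of one resource record."""
    return owner_octets + FIXED_FIELDS + rdata_octets


A_RR = rr_size(4)                    # IPv4 address RDATA
AAAA_RR = rr_size(16)                # IPv6 address RDATA
NS_RR_MIN = rr_size(1 + 1 + POINTER)  # length octet, one letter, pointer

print(NS_RR_MIN, A_RR, AAAA_RR)  # prints "16 16 28"
```

A nameserver name that shares no suffix with earlier names in the
message costs more than this minimum, which is the "whatever space is
taken by the nameserver name" noted above.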
3.1. An instrumented protocol trace of a best case delegation response
follows. Note that 13 servers are named, and 13 addresses are given.
This query was artificially designed to exactly reach the 512 octet
limit.

   ;; flags: qr rd; QUERY: 1, ANS: 0, AUTH: 13, ADDIT: 13
   ;; [23456789.123456789.123456789.\
        123456789.123456789.123456789.com A IN]        ;; @80

   ;; AUTHORITY SECTION:
   com.                 86400 NS  E.GTLD-SERVERS.NET.  ;; @112
   com.                 86400 NS  F.GTLD-SERVERS.NET.  ;; @128
   com.                 86400 NS  G.GTLD-SERVERS.NET.  ;; @144
   com.                 86400 NS  H.GTLD-SERVERS.NET.  ;; @160
   com.                 86400 NS  I.GTLD-SERVERS.NET.  ;; @176
   com.                 86400 NS  J.GTLD-SERVERS.NET.  ;; @192
   com.                 86400 NS  K.GTLD-SERVERS.NET.  ;; @208
   com.                 86400 NS  L.GTLD-SERVERS.NET.  ;; @224
   com.                 86400 NS  M.GTLD-SERVERS.NET.  ;; @240
   com.                 86400 NS  A.GTLD-SERVERS.NET.  ;; @256
   com.                 86400 NS  B.GTLD-SERVERS.NET.  ;; @272
   com.                 86400 NS  C.GTLD-SERVERS.NET.  ;; @288
   com.                 86400 NS  D.GTLD-SERVERS.NET.  ;; @304

   ;; ADDITIONAL SECTION:
   A.GTLD-SERVERS.NET.  86400 A   192.5.6.30           ;; @320
   B.GTLD-SERVERS.NET.  86400 A   192.33.14.30         ;; @336
   C.GTLD-SERVERS.NET.  86400 A   192.26.92.30         ;; @352
   D.GTLD-SERVERS.NET.  86400 A   192.31.80.30         ;; @368
   E.GTLD-SERVERS.NET.  86400 A   192.12.94.30         ;; @384
   F.GTLD-SERVERS.NET.  86400 A   192.35.51.30         ;; @400
   G.GTLD-SERVERS.NET.  86400 A   192.42.93.30         ;; @416
   H.GTLD-SERVERS.NET.  86400 A   192.54.112.30        ;; @432
   I.GTLD-SERVERS.NET.  86400 A   192.43.172.30        ;; @448
   J.GTLD-SERVERS.NET.  86400 A   192.48.79.30         ;; @464
   K.GTLD-SERVERS.NET.  86400 A   192.52.178.30        ;; @480
   L.GTLD-SERVERS.NET.  86400 A   192.41.162.30        ;; @496
   M.GTLD-SERVERS.NET.  86400 A   192.55.83.30         ;; @512

   ;; MSG SIZE  sent: 80  rcvd: 512

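The "@" offsets in the trace can be reproduced by hand. The sketch
below is an illustrative check only. It reads the first query label as
the eight characters "23456789", and it assumes that every owner name
and every nameserver name after the first is compressed with a
two-octet pointer; the variable names are hypothetical.

```python
def wire_len(name: str) -> int:
    """Uncompressed wire length: a length octet per label, plus root."""
    labels = name.rstrip(".").split(".")
    return sum(1 + len(label) for label in labels) + 1


HEADER = 12   # fixed header
FIXED = 10    # TYPE, CLASS, TTL, RDLENGTH of each resource record
POINTER = 2   # compression pointer

qname = "23456789." + "123456789." * 5 + "com"
question_end = HEADER + wire_len(qname) + 4  # plus QTYPE and QCLASS

# First NS RR carries E.GTLD-SERVERS.NET in full; the other twelve
# carry a one-letter label followed by a pointer to GTLD-SERVERS.NET.
first_ns = POINTER + FIXED + wire_len("E.GTLD-SERVERS.NET.")
other_ns = POINTER + FIXED + (1 + 1 + POINTER)
authority_end = question_end + first_ns + 12 * other_ns

a_rr = POINTER + FIXED + 4  # pointer owner name and IPv4 address
total = authority_end + 13 * a_rr

print(question_end, question_end + first_ns, authority_end, total)
```

The four values printed correspond to the @80, @112, @304 and @512
markers in the trace.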
3.2. For longer query names, the number of address records supplied will
be lower. Furthermore, it is only by using a common parent name (which
is GTLD-SERVERS.NET in this example) that all 13 addresses are able to
fit. The following output from a response simulator demonstrates these
properties:

   % perl respsize.pl a.dns.br b.dns.br c.dns.br d.dns.br
   a.dns.br requires 10 bytes
   b.dns.br requires 4 bytes
   c.dns.br requires 4 bytes
   d.dns.br requires 4 bytes
   # of NS: 4
   For maximum size query (255 byte):
   if only A is considered:     # of A is 4 (green)
   if A and AAAA are considered: # of A+AAAA is 3 (yellow)
   if prefer_glue A is assumed: # of A is 4, # of AAAA is 3 (yellow)
   For average size query (64 byte):
   if only A is considered:     # of A is 4 (green)
   if A and AAAA are considered: # of A+AAAA is 4 (green)
   if prefer_glue A is assumed: # of A is 4, # of AAAA is 4 (green)

   % perl respsize.pl ns-ext.isc.org ns.psg.com ns.ripe.net ns.eu.int
   ns-ext.isc.org requires 16 bytes
   ns.psg.com requires 12 bytes
   ns.ripe.net requires 13 bytes
   ns.eu.int requires 11 bytes
   # of NS: 4
   For maximum size query (255 byte):
   if only A is considered:     # of A is 4 (green)
   if A and AAAA are considered: # of A+AAAA is 3 (yellow)
   if prefer_glue A is assumed: # of A is 4, # of AAAA is 2 (yellow)
   For average size query (64 byte):
   if only A is considered:     # of A is 4 (green)
   if A and AAAA are considered: # of A+AAAA is 4 (green)
   if prefer_glue A is assumed: # of A is 4, # of AAAA is 4 (green)

(Note: The response simulator program is shown in Section 5.)

Here we use the term "green" if all address records could fit, "yellow"
if two or more could fit, "orange" if only one could fit, or "red" if
none could fit. It's clear that without a common parent for nameserver
names, much space would be lost. For these examples we use an
average/common name size of 15 octets, befitting our assumption of
GTLD-SERVERS.NET as our common parent name.

We're assuming an average query name size of 64 since that is the
typical average maximum size seen in trace data at the time of this
writing. If Internationalized Domain Names (IDN) or any other technology
that results in larger query names is deployed significantly in advance
of EDNS, then new measurements and new estimates will have to be made.

4.1. The current practice of giving all nameserver names a common parent
(such as GTLD-SERVERS.NET or ROOT-SERVERS.NET) saves space in DNS
responses and allows for more nameservers to be enumerated than would
otherwise be possible. (Note that in this case it is wise to serve the
common parent domain's zone from the same servers that are named within
it, in order to limit external dependencies when all your eggs are in a
single basket.)

4.2. Thirteen (13) seems to be the effective maximum number of
nameserver names usable in traditional (non-extended) DNS, assuming a
common parent domain name, given that response truncation is
undesirable as an average case, and assuming mostly IPv4-only
reachability (only A RRs exist, not AAAA RRs).

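The figure of thirteen can be related to the query name length with a
short calculation. The sketch below is illustrative only: it borrows
the record sizes from the trace in 3.1 (32 octets for the first NS RR,
16 for each further NS RR, 16 for each A RR), so it assumes
single-letter names under a parent the length of GTLD-SERVERS.NET, and
the function name is hypothetical.

```python
MAX_MESSAGE = 512
HEADER = 12
QFIXED = 4     # QTYPE and QCLASS
NS_FIRST = 32  # first NS RR, nameserver name written in full
NS_OTHER = 16  # later NS RRs, name compressed against the first
A_RR = 16      # one A RR with a compressed owner name


def max_qname_octets(n_servers: int) -> int:
    """Longest query name (wire octets) that still lets every server
    have an NS RR and an A RR in an untruncated 512-octet response."""
    authority = NS_FIRST + (n_servers - 1) * NS_OTHER
    additional = n_servers * A_RR
    return MAX_MESSAGE - HEADER - QFIXED - authority - additional


for n in (13, 14, 15):
    print(n, max_qname_octets(n))
```

Under these assumptions thirteen servers leave room for a 64-octet
query name, fourteen leave 32 octets, and fifteen leave none.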
4.3. Adding two to five IPv6 nameserver address records (AAAA RRs) to a
prototypical delegation that currently contains thirteen (13) IPv4
nameserver addresses (A RRs) for thirteen (13) nameserver names under a
common parent would not have a significant negative operational impact
on the domain name system.

5 - Source Code

#!/usr/bin/perl
#
# SYNOPSIS
# respsize.pl [ -z zone ] fqdn_ns1 fqdn_ns2 ...
#       if all queries are assumed to have zone suffix, such as "jp" in
#       JP TLD servers, specify it in -z option
#
use strict;
use Getopt::Std;

my ($sz_msg) = (512);
my ($sz_header, $sz_ptr, $sz_rr_a, $sz_rr_aaaa) = (12, 2, 16, 28);
my ($sz_type, $sz_class, $sz_ttl, $sz_rdlen) = (2, 2, 4, 2);
my (%namedb, $name, %opts);
my ($nssect, $n_ns) = (0, 0);

getopts('z:', \%opts);
if (defined($opts{'z'})) {
        server_name_len($opts{'z'});    # just register it
}

# Accumulate the size of the authority section, one NS RR per server.
foreach $name (@ARGV) {
        my $len;
        $n_ns++;
        $len = server_name_len($name);
        print "$name requires $len bytes\n";
        $nssect += $sz_ptr + $sz_type + $sz_class + $sz_ttl + $sz_rdlen + $len;
}
print "# of NS: $n_ns\n";
arsect(255, $nssect, $n_ns, "maximum");
arsect(64, $nssect, $n_ns, "average");

# Octets needed for a name, allowing for compression against any
# suffix already registered in %namedb.
sub server_name_len {
        my ($name) = @_;
        my (@labels, $len, $n, $suffix);

        $name =~ tr/A-Z/a-z/;
        @labels = split(/\./, $name);   # literal dot, not "any character"
        $len = length(join('.', @labels)) + 2;
        for ($n = 0; $#labels >= 0; $n++, shift @labels) {
                $suffix = join('.', @labels);
                return length($name) - length($suffix) + $sz_ptr
                        if (defined($namedb{$suffix}));
                $namedb{$suffix} = 1;
        }
        return $len;
}

# Report how many address records fit after the question and NS RRset.
sub arsect {
        my ($sz_query, $nssect, $n_ns, $cond) = @_;
        my ($space, $n_a, $n_a_aaaa, $n_p_aaaa, $ansect);
        $ansect = $sz_query + 1 + $sz_type + $sz_class;
        $space = $sz_msg - $sz_header - $ansect - $nssect;
        $n_a = atmost(int($space / $sz_rr_a), $n_ns);
        $n_a_aaaa = atmost(int($space / ($sz_rr_a + $sz_rr_aaaa)), $n_ns);
        $n_p_aaaa = atmost(int(($space - $sz_rr_a * $n_ns) / $sz_rr_aaaa), $n_ns);
        printf "For %s size query (%d byte):\n", $cond, $sz_query;
        printf "if only A is considered:     ";
        printf "# of A is %d (%s)\n", $n_a, judge($n_a, $n_ns);
        printf "if A and AAAA are considered: ";
        printf "# of A+AAAA is %d (%s)\n", $n_a_aaaa, judge($n_a_aaaa, $n_ns);
        printf "if prefer_glue A is assumed: ";
        printf "# of A is %d, # of AAAA is %d (%s)\n",
                $n_a, $n_p_aaaa, judge($n_p_aaaa, $n_ns);
}

sub judge {
        my ($n, $n_ns) = @_;
        return "green" if ($n >= $n_ns);
        return "yellow" if ($n >= 2);
        return "orange" if ($n == 1);
        return "red";
}

# Clamp $n to the range 0 .. $max.
sub atmost {
        my ($n, $max) = @_;
        return 0 if ($n < 0);
        return $max if ($n > $max);
        return $n;
}

Security Considerations

The recommendations contained in this document have no known security
implications.

IANA Considerations

This document does not call for changes or additions to any IANA
registry.

Copyright (C) The Internet Society (2005). This document is subject to
the rights, licenses and restrictions contained in BCP 78, and except as
set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR
IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Redwood City, CA 94063

University of Tokyo, Information Technology Center
Tokyo 113-8658, JAPAN