   7 Network Working Group                                           A. Kumar
   8 Request for Comments: 1536                                     J. Postel
   9 Category: Informational                                        C. Neuman
  10                                                                      ISI
  11                                                                P. Danzig
  12                                                                S. Miller
  13                                                                      USC
  14                                                             October 1993
  15
  16
  17           Common DNS Implementation Errors and Suggested Fixes
  18
  19 Status of this Memo
  20
  21    This memo provides information for the Internet community.  It does
  22    not specify an Internet standard.  Distribution of this memo is
  23    unlimited.
  24
  25 Abstract
  26
  27    This memo describes common errors seen in DNS implementations and
  28    suggests some fixes. Where applicable, violations of recommendations
  29    from STD 13, RFC 1034 and STD 13, RFC 1035 are mentioned. The memo
  30    also describes, where relevant, the algorithms followed in BIND
  31    (versions 4.8.3 and 4.9 which the authors referred to) to serve as an
  32    example.
  33
  34 Introduction
  35
   The last few years have seen a virtual explosion of DNS traffic on
   the NSFnet backbone. Various DNS implementations, and various
   versions of these implementations, interact with each other and
   produce huge amounts of unnecessary traffic. Researchers all over
   the Internet are attempting to document the nature of these
   interactions and the symptomatic traffic patterns, and to devise
   remedies for the sick pieces of software.
  43
  44    This draft is an attempt to document fixes for known DNS problems so
  45    people know what problems to watch out for and how to repair broken
  46    software.
  47
  48 1. Fast Retransmissions
  49
   DNS implements the classic request-response scheme of client-server
   interaction. UDP is, therefore, the chosen protocol for
   communication, though TCP is used for zone transfers. The onus of
   requerying, in case no response is seen in a "reasonable" period of
   time, lies with the client. Although RFC 1034 and RFC 1035 do not
   recommend any retransmission policy, RFC 1035 does recommend that
   resolvers cycle through a list of servers. Both name servers and
   stub resolvers should, therefore, implement some kind of
   retransmission policy based on round-trip time estimates of the name
   servers. The client should back off exponentially, probably up to a
   maximum timeout value.
  69
  70    However, clients might not implement either of the two. They might
  71    not wait a sufficient amount of time before retransmitting or they
  72    might not back-off their inter-query times sufficiently.
  73
  74    Thus, what the server would see will be a series of queries from the
  75    same querying entity, spaced very close together. Of course, a
  76    correctly implemented server discards all duplicate queries but the
  77    queries contribute to wide-area traffic, nevertheless.
  78
  79    We classify a retransmission of a query as a pure Fast retry timeout
  80    problem when a series of query packets meet the following conditions.
  81
  82       a. Query packets are seen within a time less than a "reasonable
  83          waiting period" of each other.
  84
  85       b. No response to the original query was seen i.e., we see two or
  86          more queries, back to back.
  87
  88       c. The query packets share the same query identifier.
  89
  90       d. The server eventually responds to the query.
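
   The four conditions above can be checked mechanically against a
   packet trace. The following Python sketch is only an illustration:
   the Packet tuple, the function name and the 4 second default for the
   "reasonable waiting period" are assumptions made for the example,
   not part of any implementation discussed in this memo.

```python
from collections import namedtuple

# One observed packet between a single client and a single server.
# kind is "query" or "response" (hypothetical trace format).
Packet = namedtuple("Packet", "time kind query_id")


def fast_retransmitted_ids(trace, waiting_period=4.0):
    """Return the query identifiers that meet conditions a to d.

    trace must be sorted by time.
    """
    pending = {}      # query_id -> (time of latest unanswered query, fast?)
    confirmed = set()
    for pkt in trace:
        if pkt.kind == "query":
            if pkt.query_id in pending:
                # Conditions b and c: same identifier, no response yet.
                last_time, fast = pending[pkt.query_id]
                # Condition a: repeated sooner than the waiting period.
                fast = fast or (pkt.time - last_time < waiting_period)
            else:
                fast = False
            pending[pkt.query_id] = (pkt.time, fast)
        else:
            # Condition d: the server eventually responds.
            _, fast = pending.pop(pkt.query_id, (None, False))
            if fast:
                confirmed.add(pkt.query_id)
    return confirmed
```

   A query that is repeated quickly but never answered is not reported
   by this sketch, since it fails condition d and belongs to the server
   failure problem described in section 4.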
  91
  92 A GOOD IMPLEMENTATION:
  93
  94    BIND (we looked at versions 4.8.3 and 4.9) implements a good
  95    retransmission algorithm which solves or limits all of these
  96    problems.  The Berkeley stub-resolver queries servers at an interval
  97    that starts at the greater of 4 seconds and 5 seconds divided by the
  98    number of servers the resolver queries. The resolver cycles through
  99    servers and at the end of a cycle, backs off the time out
 100    exponentially.
 101
 102    The Berkeley full-service resolver (built in with the program
 103    "named") starts with a time-out equal to the greater of 4 seconds and
 104    two times the round-trip time estimate of the server.  The time-out
 105    is backed off with each cycle, exponentially, to a ceiling value of
 106    45 seconds.
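
   The two schedules described above can be sketched as follows. This
   is a minimal illustration in Python, not BIND's code: the function
   names, the doubling factor and the number of cycles are assumptions
   made for the example. Only the 4 second floor, the 5 second base
   and the 45 second ceiling come from the text.

```python
def stub_timeouts(num_servers, cycles=4, base=5.0, floor=4.0):
    """Per-query time-out used by a stub resolver in each cycle.

    The first value is the greater of `floor` and `base` divided by the
    number of servers; it is doubled at the end of every cycle
    (doubling is an assumption, the text only says "exponentially").
    """
    first = max(floor, base / num_servers)
    return [first * (2 ** cycle) for cycle in range(cycles)]


def server_timeouts(rtt_estimate, cycles=6, floor=4.0, ceiling=45.0):
    """Time-out used by a full-service resolver in each cycle.

    Starts at the greater of `floor` and twice the round-trip estimate
    and backs off exponentially up to `ceiling`.
    """
    first = max(floor, 2.0 * rtt_estimate)
    return [min(ceiling, first * (2 ** cycle)) for cycle in range(cycles)]
```

   With three servers, stub_timeouts(3) starts at 4 seconds, because
   5/3 seconds is below the floor.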
 107
 119 FIXES:
 120
 121       a. Estimate round-trip times or set a reasonably high initial
 122          time-out.
 123
 124       b. Back-off timeout periods exponentially.
 125
 126       c. Yet another fundamental though difficult fix is to send the
 127          client an acknowledgement of a query, with a round-trip time
 128          estimate.
 129
   Since UDP is used, no response is expected by the client until the
   query is complete.  Thus, the client is less likely to have
   information about previous packets on which to base its back-off
   time, unless it maintains state across queries so that subsequent
   queries to the same server use information from previous queries.
   Unfortunately, such estimates are likely to be inaccurate for
   chained requests, since the variance is likely to be high.
 137
 138    The fix chosen in the ARDP library used by Prospero is that the
 139    server will send an initial acknowledgement to the client in those
 140    cases where the server expects the query to take a long time (as
 141    might be the case for chained queries).  This initial acknowledgement
 142    can include an expected time to wait before retrying.
 143
   This fix is more difficult since it requires that the client software
   also be trained to expect the acknowledgement packet. This, in an
   internet of millions of hosts, is at best a hard problem.
 147
 148 2. Recursion Bugs
 149
 150    When a server receives a client request, it first looks up its zone
 151    data and the cache to check if the query can be answered. If the
 152    answer is unavailable in either place, the server seeks names of
 153    servers that are more likely to have the information, in its cache or
 154    zone data. It then does one of two things. If the client desires the
 155    server to recurse and the server architecture allows recursion, the
 156    server chains this request to these known servers closest to the
 157    queried name. If the client doesn't seek recursion or if the server
 158    cannot handle recursion, it returns the list of name servers to the
 159    client assuming the client knows what to do with these records.
 160
 161    The client queries this new list of name servers to get either the
 162    answer, or names of another set of name servers to query. This
 163    process repeats until the client is satisfied. Servers might also go
 164    through this chaining process if the server returns a CNAME record
 165    for the queried name. Some servers reprocess this name to try and get
 166    the desired record type.
 167
 175    However, in certain cases, this chain of events may not be good. For
 176    example, a broken or malicious name server might list itself as one
 177    of the name servers to query again. The unsuspecting client resends
 178    the same query to the same server.
 179
 180    In another situation, more difficult to detect, a set of servers
 181    might form a loop wherein A refers to B and B refers to A. This loop
 182    might involve more than two servers.
 183
 184    Yet another error is where the client does not know how to process
 185    the list of name servers returned, and requeries the same server
 186    since that is one (of the few) servers it knows.
 187
 188    We, therefore, classify recursion bugs into three distinct
 189    categories:
 190
 191       a. Ignored referral: Client did not know how to handle NS records
 192          in the AUTHORITY section.
 193
 194       b. Too many referrals: Client called on a server too many times,
 195          beyond a "reasonable" number, with same query. This is
 196          different from a Fast retransmission problem and a Server
 197          Failure detection problem in that a response is seen for every
 198          query.  Also, the identifiers are always different. It implies
 199          client is in a loop and should have detected that and broken
 200          it.  (RFC 1035 mentions that client should not recurse beyond
 201          a certain depth.)
 202
      c. Malicious Server: a server refers to itself in the authority
         section. If a server does not have an answer now, it is very
         unlikely to do any better the next time you query it,
         especially when it claims to be authoritative over a domain.
 207
 208       RFC 1034 warns against such situations, on page 35.
 209
 210       "Bound the amount of work (packets sent, parallel processes
 211        started) so that a request can't get into an infinite loop or
 212        start off a chain reaction of requests or queries with other
 213        implementations EVEN IF SOMEONE HAS INCORRECTLY CONFIGURED
 214        SOME DATA."
 215
 216 A GOOD IMPLEMENTATION:
 217
 218    BIND fixes at least one of these problems. It places an upper limit
 219    on the number of recursive queries it will make, to answer a
 220    question.  It chases a maximum of 20 referral links and 8 canonical
 221    name translations.
 222
 231 FIXES:
 232
      a. Set an upper limit on the number of referral links and CNAME
         links you are willing to chase.

         Note that this is not guaranteed to break only recursion loops.
         It could, in a rare case, prematurely prune a very long search
         path.  We know, however, with high probability, that if the
         number of links crosses a certain threshold (two times the
         depth of the DNS tree), it is a recursion problem.
 241
 242       b. Watch out for self-referring servers. Avoid them whenever
 243          possible.
 244
 245       c. Make sure you never pass off an authority NS record with your
 246          own name on it!
 247
 248       d. Fix clients to accept iterative answers from servers not built
 249          to provide recursion. Such clients should either be happy with
 250          the non-authoritative answer or be willing to chase the
 251          referral links themselves.
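
   Fixes a and b can be combined in the loop that chases referrals. The
   Python sketch below is an illustration under stated assumptions: the
   ask() callback, its three reply kinds and the ResolutionError class
   are hypothetical and stand in for real network code. The limits of
   20 referrals and 8 CNAME links are the values quoted above for BIND.

```python
MAX_REFERRALS = 20   # referral links chased at most
MAX_CNAMES = 8       # canonical name translations chased at most


class ResolutionError(Exception):
    """Raised when the search is abandoned."""


def resolve(name, start_server, ask):
    """Chase referrals and CNAMEs, bounding the amount of work.

    ask(server, name) is a hypothetical callback returning one of
    ("answer", value), ("cname", target) or ("referral", [servers]).
    """
    server = start_server
    referrals = cnames = 0
    while True:
        kind, data = ask(server, name)
        if kind == "answer":
            return data
        if kind == "cname":
            cnames += 1
            if cnames > MAX_CNAMES:
                raise ResolutionError("too many CNAME links")
            name = data          # restart the search with the new name
            continue
        # Referral: never go back to the server that just referred us.
        candidates = [s for s in data if s != server]
        if not candidates:
            raise ResolutionError("server %s refers only to itself" % server)
        referrals += 1
        if referrals > MAX_REFERRALS:
            raise ResolutionError("too many referrals")
        server = candidates[0]
```

   A loop in which A refers to B and B refers to A is not detected
   directly here; it is broken by the referral limit, as fix a
   suggests.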
 252
 253 3. Zero Answer Bugs:
 254
 255    Name servers sometimes return an authoritative NOERROR with no
 256    ANSWER, AUTHORITY or ADDITIONAL records. This happens when the
 257    queried name is valid but it does not have a record of the desired
 258    type. Of course, the server has authority over the domain.
 259
 260    However, once again, some implementations of resolvers do not
 261    interpret this kind of a response reasonably. They always expect an
 262    answer record when they see an authoritative NOERROR. These entities
 263    continue to resend their queries, possibly endlessly.
 264
A GOOD IMPLEMENTATION:

   The BIND resolver code does not query a server more than 3 times. If
   it is unable to get an answer from 4 servers, querying them three
   times each, it returns an error.
 270
 271    Of course, it treats a zero-answer response the way it should be
 272    treated; with respect!
 273
 274 FIXES:
 275
 276       a. Set an upper limit on the number of retransmissions for a given
 277          query, at the very least.
 278
 287       b. Fix resolvers to interpret such a response as an authoritative
 288          statement of non-existence of the record type for the given
 289          name.
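
   The interpretation asked for in fix b can be sketched as a small
   classification of responses. This Python fragment is illustrative
   only: the function names and the result labels are assumptions. The
   RCODE values 0 (no error) and 3 (name error) are those of RFC 1035.
   The name error case is the one discussed in section 6.

```python
NOERROR = 0    # RCODE: no error condition
NXDOMAIN = 3   # RCODE: name error


def classify(rcode, authoritative, answer_count):
    """Classify a response by RCODE, AA bit and number of answers."""
    if rcode == NOERROR and answer_count > 0:
        return "answer"
    if rcode == NOERROR and authoritative:
        # Zero answers from an authoritative server: the name exists
        # but has no record of the requested type.
        return "no-data"
    if rcode == NXDOMAIN and authoritative:
        return "name-error"
    return "other"   # referral, server failure, and so on


def should_requery(rcode, authoritative, answer_count):
    """Only responses that settle nothing justify another query."""
    return classify(rcode, authoritative, answer_count) == "other"
```

   A resolver written this way treats an authoritative NOERROR with no
   records as a final statement and does not resend the query.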
 290
 291 4. Inability to detect server failure:
 292
 293    Servers in the internet are not very reliable (they go down every
 294    once in a while) and resolvers are expected to adapt to the changed
 295    scenario by not querying the server for a while. Thus, when a server
 296    does not respond to a query, resolvers should try another server.
 297    Also, non-stub resolvers should update their round trip time estimate
 298    for the server to a large value so that server is not tried again
 299    before other, faster servers.
 300
 301    Stub resolvers, however, cycle through a fixed set of servers and if,
 302    unfortunately, a server is down while others do not respond for other
 303    reasons (high load, recursive resolution of query is taking more time
 304    than the resolver's time-out, ....), the resolver queries the dead
 305    server again! In fact, some resolvers might not set an upper limit on
 306    the number of query retransmissions they will send and continue to
 307    query dead servers indefinitely.
 308
   Name servers running system or chained queries might also suffer from
   the same problem. They store names of servers they should query for a
   given domain. They cycle through these names and, in case none of
   them answers, hit each one more than once. It is, once again,
   important that there be an upper limit on the number of
   retransmissions, to prevent network overload.
 315
 316    This behavior is clearly in violation of the dictum in RFC 1035 (page
 317    46)
 318
 319       "If a resolver gets a server error or other bizarre response
 320        from a name server, it should remove it from SLIST, and may
 321        wish to schedule an immediate transmission to the next
 322        candidate server address."
 323
 324    Removal from SLIST implies that the server is not queried again for
 325    some time.
 326
 327    Correctly implemented full-service resolvers should, as pointed out
 328    before, update round trip time values for servers that do not respond
 329    and query them only after other, good servers. Full-service resolvers
 330    might, however, not follow any of these common sense directives. They
 331    query dead servers, and they query them endlessly.
 332
 343 A GOOD IMPLEMENTATION:
 344
 345    BIND places an upper limit on the number of times it queries a
 346    server.  Both the stub-resolver and the full-service resolver code do
 347    this.  Also, since the full-service resolver estimates round-trip
 348    times and sorts name server addresses by these estimates, it does not
 349    query a dead server again, until and unless all the other servers in
 350    the list are dead too!  Further, BIND implements exponential back-off
 351    too.
 352
 353 FIXES:
 354
 355       a. Set an upper limit on number of retransmissions.
 356
 357       b. Measure round-trip time from servers (some estimate is better
 358          than none). Treat no response as a "very large" round-trip
 359          time.
 360
 361       c. Maintain a weighted rtt estimate and decay the "large" value
 362          slowly, with time, so that the server is eventually tested
 363          again, but not after an indefinitely long period.
 364
 365       d. Follow an exponential back-off scheme so that even if you do
 366          not restrict the number of queries, you do not overload the
 367          net excessively.
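
   Fixes b and c can be sketched as a small table of round-trip
   estimates. The Python class below is an illustration only: the class
   name, the 30 second penalty, the smoothing weight and the decay
   factor are assumptions chosen for the example and are not taken from
   BIND.

```python
class ServerStats:
    """Per-server round-trip estimates used to order candidate servers."""

    def __init__(self, penalty=30.0, alpha=0.3, decay=0.9):
        self.penalty = penalty   # "very large" RTT charged for a time-out
        self.alpha = alpha       # weight given to each new sample
        self.decay = decay       # factor applied to penalised estimates
        self.rtt = {}            # server -> estimate in seconds
        self.penalised = set()   # servers whose last query timed out

    def record_reply(self, server, sample):
        """Fold a measured round-trip time into the weighted estimate."""
        old = self.rtt.get(server)
        if old is None or server in self.penalised:
            self.rtt[server] = sample
        else:
            self.rtt[server] = (1 - self.alpha) * old + self.alpha * sample
        self.penalised.discard(server)

    def record_timeout(self, server):
        """Treat no response as a very large round-trip time."""
        self.rtt[server] = max(self.rtt.get(server, 0.0), self.penalty)
        self.penalised.add(server)

    def age(self):
        """Call periodically: slowly decay the penalised estimates so
        that a failed server is eventually tried again."""
        for server in self.penalised:
            self.rtt[server] *= self.decay

    def ordered(self, servers):
        """Servers sorted by estimate; unknown servers are tried first."""
        return sorted(servers, key=lambda s: self.rtt.get(s, 0.0))
```

   A server that timed out sorts behind the responsive ones until
   enough calls to age() have reduced its estimate, at which point it
   is tested again.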
 368
 369 5. Cache Leaks:
 370
 371    Every resource record returned by a server is cached for TTL seconds,
 372    where the TTL value is returned with the RR. Full-service (or stub)
 373    resolvers cache the RR and answer any queries based on this cached
 374    information, in the future, until the TTL expires. After that, one
 375    more query to the wide-area network gets the RR in cache again.
 376
 377    Full-service resolvers might not implement this caching mechanism
 378    well. They might impose a limit on the cache size or might not
 379    interpret the TTL value correctly. In either case, queries repeated
 380    within a TTL period of a RR constitute a cache leak.
 381
 382 A GOOD/BAD IMPLEMENTATION:
 383
 384    BIND has no restriction on the cache size and the size is governed by
 385    the limits on the virtual address space of the machine it is running
 386    on. BIND caches RRs for the duration of the TTL returned with each
 387    record.
 388
   It does not, however, follow the RFCs with respect to interpretation
   of a 0 TTL value. If a record has a TTL value of 0 seconds, BIND uses
   the minimum TTL value for that zone, from the SOA record, and caches
   the record for that duration. Though this saves some traffic on the
   wide-area network, it is not correct behavior.
 402
 403 FIXES:
 404
 405       a. Look over your caching mechanism to ensure TTLs are interpreted
 406          correctly.
 407
 408       b. Do not restrict cache sizes (come on, memory is cheap!).
 409          Expired entries are reclaimed periodically, anyway. Of course,
 410          the cache size is bound to have some physical limit. But, when
 411          possible, this limit should be large (run your name server on
 412          a machine with a large amount of physical memory).
 413
 414       c. Possibly, a mechanism is needed to flush the cache, when it is
 415          known or even suspected that the information has changed.
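
   A cache that follows fixes a and c might look like the Python sketch
   below. It is a minimal illustration: the class name, the key format
   and the injectable clock are assumptions made for the example. A
   TTL of zero is not cached, since RFC 1035 states that such a record
   can only be used for the transaction in progress.

```python
import time


class RRCache:
    """Resource record cache that honours the TTL of each entry."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock     # injectable so the cache can be tested
        self.entries = {}      # key -> (value, expiry time)

    def put(self, key, value, ttl):
        """Cache value for ttl seconds; a zero TTL is never cached."""
        if ttl <= 0:
            return
        self.entries[key] = (value, self.clock() + ttl)

    def get(self, key):
        """Return the cached value, or None if absent or expired."""
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, expires = entry
        if self.clock() >= expires:
            del self.entries[key]
            return None
        return value

    def flush(self, key=None):
        """Drop one entry, or the whole cache when no key is given."""
        if key is None:
            self.entries.clear()
        else:
            self.entries.pop(key, None)
```

   Any query answered from get() within the TTL stays off the wide-area
   network; a query that is sent anyway in that period is a cache leak
   in the sense described above.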
 416
 417 6. Name Error Bugs:
 418
   This bug is very similar to the Zero Answer bug. In the absence of
   negative caching, an authoritative NXDOMAIN is returned when the
   server authoritative for the domain knows the queried name to be
   bad. This authoritative NXDOMAIN response is usually accompanied by
   the SOA record for the domain, in the authority section.
 424
 425    Resolvers should recognize that the name they queried for was a bad
 426    name and should stop querying further.
 427
 428    Some resolvers might, however, not interpret this correctly and
 429    continue to query servers, expecting an answer record.
 430
 431    Some applications, in fact, prompt NXDOMAIN answers! When given a
 432    perfectly good name to resolve, they append the local domain to it
 433    e.g., an application in the domain "foo.bar.com", when trying to
 434    resolve the name "usc.edu" first tries "usc.edu.foo.bar.com", then
 435    "usc.edu.bar.com" and finally the good name "usc.edu". This causes at
 436    least two queries that return NXDOMAIN, for every good query. The
 437    problem is aggravated since the negative answers from the previous
 438    queries are not cached.  When the same name is sought again, the
 439    process repeats.
 440
 441    Some DNS resolver implementations suffer from this problem, too. They
 442    append successive sub-parts of the local domain using an implicit
 443    searchlist mechanism, when certain conditions are satisfied and try
 444    the original name, only when this first set of iterations fails. This
 445    behavior recently caused pandemonium in the Internet when the domain
 446    "edu.com" was registered and a wildcard "CNAME" record placed at the
 455    top level. All machines from "com" domains trying to connect to hosts
 456    in the "edu" domain ended up with connections to the local machine in
 457    the "edu.com" domain!
 458
 459 GOOD/BAD IMPLEMENTATIONS:
 460
 461    Some local versions of BIND already implement negative caching. They
 462    typically cache negative answers with a very small TTL, sufficient to
 463    answer a burst of queries spaced close together, as is typically
 464    seen.
 465
 466    The next official public release of BIND (4.9.2) will have negative
 467    caching as an ifdef'd feature.
 468
 469    The BIND resolver appends local domain to the given name, when one of
 470    two conditions is met:
 471
 472       i.  The name has no periods and the flag RES_DEFNAME is set.
 473       ii. There is no trailing period and the flag RES_DNSRCH is set.
 474
 475    The flags RES_DEFNAME and RES_DNSRCH are default resolver options, in
 476    BIND, but can be changed at compile time.
 477
 478    Only if the name, so generated, returns an NXDOMAIN is the original
 479    name tried as a Fully Qualified Domain Name. And only if it contains
 480    at least one period.
 481
 482 FIXES:
 483
 484       a. Fix the resolver code.
 485
      b. Negative Caching. Negative caching servers will restrict the
         traffic seen on the wide-area network, even if they do not
         curb it altogether.
 489
 490       c. Applications and resolvers should not append the local domain to
 491          names they seek to resolve, as far as possible. Names
 492          interspersed with periods should be treated as Fully Qualified
 493          Domain Names.
 494
         In other words, use searchlists only when explicitly specified.
         No implicit searchlists should be used. A name that contains
         any dots should first be tried as a FQDN and, if that fails,
         with the local domain name (or searchlist, if specified)
         appended. A name containing no dots can have the searchlist
         appended right away but, once again, no implicit searchlists
         should be used.
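
   The ordering recommended in fix c can be sketched as a function that
   lists the names to try. This Python fragment is an illustration: the
   function name is hypothetical, and trying the bare name last for a
   name without dots is an assumption of the example. The searchlist
   argument is used only when the caller supplies it explicitly.

```python
def candidate_names(name, searchlist=()):
    """Return the fully qualified names to try, in order."""
    if name.endswith("."):
        return [name]                    # already fully qualified
    absolute = name + "."
    appended = [name + "." + domain.rstrip(".") + "."
                for domain in searchlist]
    if "." in name:
        # A name with dots is first tried as it stands.
        return [absolute] + appended
    # A name without dots can use the explicit searchlist right away.
    return appended + [absolute]
```

   For the example used earlier, a resolver in "foo.bar.com" with that
   domain as its explicit searchlist tries "usc.edu." first, so a good
   name no longer costs any NXDOMAIN answers.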
 501
 511    Associated with the name error bug is another problem where a server
 512    might return an authoritative NXDOMAIN, although the name is valid. A
 513    secondary server, on start-up, reads the zone information from the
 514    primary, through a zone transfer. While it is in the process of
 515    loading the zones, it does not have information about them, although
 516    it is authoritative for them.  Thus, any query for a name in that
 517    domain is answered with an NXDOMAIN response code. This problem might
 518    not be disastrous were it not for negative caching servers that cache
 519    this answer and so propagate incorrect information over the internet.
 520
 521 BAD IMPLEMENTATION:
 522
 523    BIND apparently suffers from this problem.
 524
   Also, a new name added to the primary database will take a while to
   propagate to the secondaries. Until that time, they will return
   NXDOMAIN answers for a good name. Negative caching servers store
   this answer, too, and aggravate the problem further. This is
   probably a more general DNS problem, but it is apparently more
   harmful in this situation.
 531
 532 FIX:
 533
 534       a. Servers should start answering only after loading all the zone
 535          data. A failed server is better than a server handing out
 536          incorrect information.
 537
      b. Cache negative answers for a very short time, sufficient only
         to ward off a burst of requests for the same bad name. This
         time could be related to the round-trip time of the server
         from which the negative answer was received. Alternatively, a
         statistical measure of the amount of time for which queries
         for such names are received could be used. The minimum TTL
         value from the SOA record is not advisable, since such values
         tend to be quite large.
 546
 547       c. A "PUSH" (or, at least, a "NOTIFY") mechanism should be allowed
 548          and implemented, to allow the primary server to inform
 549          secondaries that the database has been modified since it last
 550          transferred zone data.  To alleviate the problem of "too many
 551          zone transfers" that this might cause, Incremental Zone
 552          Transfers should also be part of DNS.  Also, the primary should
 553          not NOTIFY/PUSH with every update but bunch a good number
 554          together.
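
   Fix b, in its round-trip time variant, can be sketched as follows.
   This Python class is an illustration only: the class name, the
   multiplier of 10 and the 30 second ceiling are assumptions made for
   the example and would need tuning against real traffic.

```python
import time


class NegativeCache:
    """Remembers bad names just long enough to absorb a burst."""

    def __init__(self, rtt_multiplier=10.0, max_lifetime=30.0,
                 clock=time.monotonic):
        self.rtt_multiplier = rtt_multiplier
        self.max_lifetime = max_lifetime
        self.clock = clock      # injectable so the cache can be tested
        self.expires = {}       # lower-cased name -> expiry time

    def remember(self, name, server_rtt):
        """Record an NXDOMAIN for a time tied to the server's RTT."""
        lifetime = min(self.max_lifetime, self.rtt_multiplier * server_rtt)
        self.expires[name.lower()] = self.clock() + lifetime

    def is_known_bad(self, name):
        """True while a recent negative answer for name is still held."""
        key = name.lower()
        expiry = self.expires.get(key)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self.expires[key]
            return False
        return True
```

   Because entries live for seconds rather than for the SOA minimum, an
   NXDOMAIN handed out by a secondary that is still loading its zones
   is forgotten quickly.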
 555
 567 7. Format Errors:
 568
 569    Some resolvers issue query packets that do not necessarily conform to
 570    standards as laid out in the relevant RFCs. This unnecessarily
 571    increases net traffic and wastes server time.
 572
 573 FIXES:
 574
 575       a. Fix resolvers.
 576
      b. Each resolver should verify the format of packets before
         sending them out, using a mechanism outside of the resolver.
         This is, obviously, needed only if fix a cannot be followed.
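
   A check of the kind suggested in fix b might look like the Python
   sketch below. It is an illustration and far from a complete
   validator: the function name is hypothetical, and expecting exactly
   one question per query is an assumption of the example. The 12
   octet header, the QR bit, the 63 octet label limit and the 255
   octet name limit are taken from RFC 1035.

```python
import struct


def check_query(packet):
    """Return a list of problems found in a raw DNS query packet."""
    if len(packet) < 12:
        return ["shorter than the 12-octet header"]
    problems = []
    _ident, flags, qdcount, _an, _ns, _ar = struct.unpack("!6H", packet[:12])
    if flags & 0x8000:
        problems.append("QR bit set: this is a response, not a query")
    if qdcount != 1:
        problems.append("QDCOUNT is %d, expected 1" % qdcount)
    # Walk the labels of QNAME, which starts right after the header.
    pos = 12
    name_length = 0
    while True:
        if pos >= len(packet):
            problems.append("QNAME runs past the end of the packet")
            return problems
        length = packet[pos]
        if length == 0:          # root label ends the name
            pos += 1
            name_length += 1
            break
        if length > 63:
            problems.append("label longer than 63 octets")
            return problems
        pos += 1 + length
        name_length += 1 + length
    if name_length > 255:
        problems.append("name longer than 255 octets")
    if pos + 4 > len(packet):
        problems.append("QTYPE or QCLASS missing")
    return problems
```

   An empty list means that none of these particular checks failed; it
   does not prove that the packet conforms to the RFCs in every
   respect.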
 580
 581 References
 582
   [1] Mockapetris, P., "Domain Names - Concepts and Facilities", STD 13,
       RFC 1034, USC/Information Sciences Institute, November 1987.

   [2] Mockapetris, P., "Domain Names - Implementation and
       Specification", STD 13, RFC 1035, USC/Information Sciences
       Institute, November 1987.
 589
 590    [3] Partridge, C., "Mail Routing and the Domain System", STD 14, RFC
 591        974, CSNET CIC BBN, January 1986.
 592
 593    [4] Gavron, E., "A Security Problem and Proposed Correction With
 594        Widely Deployed DNS Software", RFC 1535, ACES Research Inc.,
 595        October 1993.
 596
 597    [5] Beertema, P., "Common DNS Data File Configuration Errors", RFC
 598        1537, CWI, October 1993.
 599
 600 Security Considerations
 601
 602    Security issues are not discussed in this memo.
 603
 623 Authors' Addresses
 624
 625    Anant Kumar
 626    USC Information Sciences Institute
 627    4676 Admiralty Way
 628    Marina Del Rey CA 90292-6695
 629
 630    Phone:(310) 822-1511
 631    FAX:  (310) 823-6741
 632    EMail: anant@isi.edu
 633
 634
 635    Jon Postel
 636    USC Information Sciences Institute
 637    4676 Admiralty Way
 638    Marina Del Rey CA 90292-6695
 639
 640    Phone:(310) 822-1511
 641    FAX:  (310) 823-6714
 642    EMail: postel@isi.edu
 643
 644
 645    Cliff Neuman
 646    USC Information Sciences Institute
 647    4676 Admiralty Way
 648    Marina Del Rey CA 90292-6695
 649
 650    Phone:(310) 822-1511
 651    FAX:  (310) 823-6714
 652    EMail: bcn@isi.edu
 653
 654
 655    Peter Danzig
 656    Computer Science Department
 657    University of Southern California
 658    University Park
 659
 660    EMail: danzig@caldera.usc.edu
 661
 662
 663    Steve Miller
 664    Computer Science Department
 665    University of Southern California
 666    University Park
 667    Los Angeles CA 90089
 668
 669    EMail: smiller@caldera.usc.edu