                Description of Dynamic Update and T_UNSPEC Code



                         Added by Mike Schwartz
          University of Washington Computer Science Department
                                  11/86
                       schwartz@cs.washington.edu



I have incorporated 2 new features into BIND:
        1. Code to allow (unauthenticated) dynamic updates: surrounded by
           #ifdef ALLOW_UPDATES
        2. Code to allow data of unspecified type: surrounded by
           #ifdef ALLOW_T_UNSPEC

Note that you can have one or the other or both (or neither) of these
modifications running, by appropriately modifying the makefiles.  Also,
the external interface isn't changed (other than being extended), i.e.,
a BIND server that allows dynamic updates and/or T_UNSPEC data can
still talk to a 'vanilla' server using the 'vanilla' operations.

The description that follows is broken into 3 parts: a functional
description of the dynamic update facility, a functional description of
the T_UNSPEC facility, and a discussion of the implementation of
dynamic updates.  The implementation description is mostly intended for
those who want to make future enhancements (especially the addition of
a good authentication mechanism).  If you make enhancements, I would be
interested in hearing about them.




                        1. Dynamic Update Facility

I added this code in conjunction with my research into naming in large
heterogeneous systems.  For the purposes of this research, I ignored
security issues.  In other words, no authentication/authorization
mechanism exists to control updates.  Authentication will hopefully be
addressed at some future point (although probably not by me). In the
meantime, BIND Internet name servers (as opposed to "private" name
server networks operating with their own port numbers, as I use in my
research) should be compiled *without* -DALLOW_UPDATES, so that the
integrity of the Internet name database won't be compromised by this
code.


There are 5 different dynamic update interfaces:
        UPDATEA  - add a resource record
        UPDATED  - delete a specific resource record
        UPDATEDA - delete all named resource records
        UPDATEM  - modify a specific resource record
        UPDATEMA - modify all named resource records

These all work through the normal resolver interface, i.e., these
interfaces are opcodes, and the data in the buffers passed to
res_mkquery must conform to what is expected for the particular
operation (see the #ifdef ALLOW_UPDATES extensions to nstest.c for
example usage).

UPDATEM is logically equivalent to an UPDATED followed by an UPDATEA,
except that the updates occur atomically at the primary server (as
usual with Domain servers, secondaries may become temporarily
inconsistent).  The difference between UPDATED and UPDATEDA is that the
latter allows you to delete all RRs associated with a name; similarly
for UPDATEM and UPDATEMA.  The reason for the UPDATE{D,M}A interfaces
is two-fold:

        1. Sometimes you want to delete/modify some data, but you know you'll
           only have a single RR for that data; in such a case, it's more
           convenient to delete/modify the RR by just giving the name;
           otherwise, you would have to first look it up, and then
           delete/modify it.

        2. It is sometimes useful to be able to delete/modify multiple RRs
           this way, since one can then perform the operation atomically.
           Otherwise, one would have to delete/modify the RRs one-by-one.

One additional point to note about UPDATEMA is that it will return a
success status if there were *zero* or more RRs associated with the given
name (and the RR add succeeds), whereas UPDATEM, UPDATED, and UPDATEDA
will return a success status if there were *one* or more RRs associated
with the given name.  The reason for the difference is to handle the
(probably common) case where what you want to do is set a particular
name to contain a single RR, irrespective of whether or not it was
already set.
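The status rules above can be modelled in a few lines of C.  This is
only a sketch under stated assumptions: toy_name, toy_update and the
OP_* constants are hypothetical names, RR data is reduced to a short
string, the opcode values are not the wire values, and a specific-RR
operation is assumed to fail when that RR is absent.  It is not the
code in BIND; see nstest.c for real usage.

```c
#include <string.h>

#define MAX_RRS 8
#define RR_LEN  32

/* Hypothetical stand-ins for the five update opcodes (not wire values). */
enum toy_op { OP_UPDATEA, OP_UPDATED, OP_UPDATEDA, OP_UPDATEM, OP_UPDATEMA };

/* One name and the RRs stored under it (RR data reduced to a string). */
struct toy_name {
    int  count;
    char rr[MAX_RRS][RR_LEN];
};

/* Index of the RR equal to data, or -1 if the name has no such RR. */
static int toy_find(const struct toy_name *n, const char *data)
{
    for (int i = 0; i < n->count; i++)
        if (strcmp(n->rr[i], data) == 0)
            return i;
    return -1;
}

/* Append an RR; returns 1 on success, 0 if it does not fit. */
static int toy_add(struct toy_name *n, const char *data)
{
    if (n->count >= MAX_RRS || strlen(data) >= RR_LEN)
        return 0;
    strcpy(n->rr[n->count++], data);
    return 1;
}

/* Remove RR i by moving the last RR into its slot. */
static void toy_remove(struct toy_name *n, int i)
{
    n->count--;
    if (i != n->count)
        memcpy(n->rr[i], n->rr[n->count], RR_LEN);
}

/* Returns 1 for a success status, 0 for failure.
 * old_rr: RR to delete or replace; new_rr: RR to add (unused ones may be NULL). */
int toy_update(struct toy_name *n, enum toy_op op,
               const char *old_rr, const char *new_rr)
{
    int i;

    switch (op) {
    case OP_UPDATEA:
        return toy_add(n, new_rr);
    case OP_UPDATED:                 /* needs the specific RR to exist */
        if ((i = toy_find(n, old_rr)) < 0)
            return 0;
        toy_remove(n, i);
        return 1;
    case OP_UPDATEDA:                /* needs one or more RRs */
        if (n->count == 0)
            return 0;
        n->count = 0;
        return 1;
    case OP_UPDATEM:                 /* delete + add as one step */
        if (strlen(new_rr) >= RR_LEN || (i = toy_find(n, old_rr)) < 0)
            return 0;
        toy_remove(n, i);
        return toy_add(n, new_rr);
    case OP_UPDATEMA:                /* succeeds with zero or more RRs */
        if (strlen(new_rr) >= RR_LEN)
            return 0;
        n->count = 0;
        return toy_add(n, new_rr);
    }
    return 0;
}
```

With an empty toy_name, OP_UPDATEDA reports failure while OP_UPDATEMA
succeeds and leaves exactly one RR, which is the "set this name to a
single RR" case described above.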




                        2. T_UNSPEC Facility

Type T_UNSPEC allows you to store data whose layout BIND doesn't
understand.  Data of this type is not marshalled (i.e., converted
between host and network representation, as is done, for example, with
Internet addresses) by BIND, so it is up to the client to make sure
things work out ok w.r.t. heterogeneous data representations.  The way
I use this type is to have the client marshal data, store it, retrieve
it, and demarshal it.  This way I can store arbitrary data in BIND
without having to add new code for each specific type.
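As an illustration of the client-side work this implies, the following
sketch marshals a small record into a byte-order-independent buffer and
back.  The printer_info struct, its fields, and both function names are
hypothetical; a real client defines whatever layout suits its data, and
the resulting buffer is what would be stored as T_UNSPEC data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical client record; BIND never looks inside it. */
struct printer_info {
    uint32_t pages_per_minute;
    uint16_t tray_count;
    char     location[16];
};

#define PRINTER_INFO_WIRE_LEN (4 + 2 + 16)

/* Marshal into a fixed layout with big-endian integers.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t printer_info_marshal(const struct printer_info *p,
                            unsigned char *buf, size_t buflen)
{
    if (buflen < PRINTER_INFO_WIRE_LEN)
        return 0;
    buf[0] = (unsigned char)(p->pages_per_minute >> 24);
    buf[1] = (unsigned char)(p->pages_per_minute >> 16);
    buf[2] = (unsigned char)(p->pages_per_minute >> 8);
    buf[3] = (unsigned char)(p->pages_per_minute);
    buf[4] = (unsigned char)(p->tray_count >> 8);
    buf[5] = (unsigned char)(p->tray_count);
    memcpy(buf + 6, p->location, sizeof p->location);
    return PRINTER_INFO_WIRE_LEN;
}

/* Demarshal a buffer produced by printer_info_marshal.
 * Returns 1 on success, 0 if the length is wrong. */
int printer_info_demarshal(const unsigned char *buf, size_t len,
                           struct printer_info *p)
{
    if (len != PRINTER_INFO_WIRE_LEN)
        return 0;
    p->pages_per_minute = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                          ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
    p->tray_count = (uint16_t)((buf[4] << 8) | buf[5]);
    memcpy(p->location, buf + 6, sizeof p->location);
    return 1;
}
```

Writing the integers byte by byte, rather than copying the struct,
avoids depending on the host's byte order or struct padding.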

T_UNSPEC data is dumped in an ASCII-encoded, checksummed format so
that, although it's not human-readable, it at least doesn't fill the
dump file with unprintable characters.
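To show the idea of a printable, checksummed dump, here is a toy
encoding: two hex digits per byte, then ':' and a one-byte additive
checksum.  This is an assumption made for illustration only and is NOT
the format BIND writes to its dump file; toy_dump_encode and
toy_dump_decode are hypothetical names.

```c
#include <stddef.h>
#include <stdio.h>

/* Encode len bytes as lowercase hex, then ':' and a 2-digit checksum.
 * out must hold 2*len + 4 chars.  Returns the text length (without the
 * terminating NUL), or 0 if out is too small. */
size_t toy_dump_encode(const unsigned char *data, size_t len,
                       char *out, size_t outlen)
{
    unsigned sum = 0;
    size_t i;

    if (outlen < 2 * len + 4)
        return 0;
    for (i = 0; i < len; i++) {
        sprintf(out + 2 * i, "%02x", (unsigned)data[i]);
        sum = (sum + data[i]) & 0xff;
    }
    sprintf(out + 2 * len, ":%02x", sum);
    return 2 * len + 3;
}

/* Value of one lowercase hex digit, or -1 (assumes ASCII). */
static int hexval(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode text produced by toy_dump_encode.  Returns the number of
 * bytes recovered, or -1 on malformed input or checksum mismatch. */
long toy_dump_decode(const char *text, unsigned char *data, size_t datalen)
{
    unsigned sum = 0;
    size_t n = 0;
    int hi, lo;

    while (*text != ':') {
        if (n >= datalen)
            return -1;
        if ((hi = hexval(text[0])) < 0 || (lo = hexval(text[1])) < 0)
            return -1;
        data[n] = (unsigned char)(hi * 16 + lo);
        sum = (sum + data[n]) & 0xff;
        n++;
        text += 2;
    }
    if ((hi = hexval(text[1])) < 0 || (lo = hexval(text[2])) < 0 ||
        text[3] != '\0')
        return -1;
    return ((unsigned)(hi * 16 + lo) == sum) ? (long)n : -1;
}
```

The checksum lets a loader reject an entry that was damaged or
hand-edited, which matters because the text is not meant to be read or
changed by people.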

Type T_UNSPEC is important for my research environment, where
potentially lots of people want to store data in the name service, and
each person's data looks different.  Instead of having BIND understand
the format of each of their data types, the clients define marshalling
routines and pass buffers of marshalled data to BIND; BIND never tries
to demarshal the data.  It just holds on to it and gives it back to
the client when the client requests it, and the client must then
demarshal it.

The Xerox Network System's name service (the Clearinghouse) works this
way.  The reason 'vanilla' BIND understands the format of all the data
it holds is probably that BIND is tailored for a very specific
application, and wants to make sure the data it holds makes sense (and,
for some types, BIND needs to take additional action depending on the
data's semantics).  For more general purpose name services (like the
Clearinghouse and my usage of BIND), this approach is less tractable.

See the #ifdef ALLOW_T_UNSPEC extensions to nstest.c for example usage of
this type.






                3. Dynamic Update Implementation Description

This section is divided into 3 subsections: General Discussion,
Miscellaneous Points, and Known Defects.




                3.1 General Discussion

The basic scheme is this: When an update message arrives, a call is
made to InitDynUpdate, which first looks up the SOA record for the zone
the update affects.  If this is the primary server for that zone, we do
the update and then update the zone serial number (so that secondaries
will refresh later).  If this is a secondary server, we forward the
update to the primary, and if that's successful, we update our copy
afterwards.  If it's neither, we refuse the update.  (One might think
to try to propagate the update to an authoritative server; I figured
that updates will probably be most likely within an administrative
domain anyway; this could be changed if someone has strong feelings
about it).
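The three-way decision can be sketched as follows.  Everything here is
a simplified stand-in: toy_server, toy_handle_update and the enum names
are hypothetical, "applying" an update is reduced to bumping a counter,
forwarding is modelled as a direct call on the primary, and the
secondary is assumed to leave its own serial alone.  The real logic
lives in InitDynUpdate() and doupdate().

```c
#include <stddef.h>

/* Hypothetical roles a server can have for the affected zone. */
enum zone_role { ROLE_PRIMARY, ROLE_SECONDARY, ROLE_NONE };

/* Hypothetical outcomes of an update request. */
enum update_result { UPD_OK, UPD_REFUSED, UPD_FORWARD_FAILED };

struct toy_server {
    enum zone_role     role;
    unsigned long      serial;    /* zone serial number (SOA) */
    int                rr_count;  /* stand-in for the zone's data */
    int                up;        /* nonzero if the server is reachable */
    struct toy_server *primary;   /* the zone's primary; used by secondaries */
};

enum update_result toy_handle_update(struct toy_server *s)
{
    switch (s->role) {
    case ROLE_PRIMARY:
        s->rr_count++;            /* do the update */
        s->serial++;              /* so that secondaries refresh later */
        return UPD_OK;
    case ROLE_SECONDARY:
        /* Forward to the primary first; fail if it is down. */
        if (s->primary == NULL || !s->primary->up ||
            toy_handle_update(s->primary) != UPD_OK)
            return UPD_FORWARD_FAILED;
        s->rr_count++;            /* primary accepted: update our copy */
        return UPD_OK;
    case ROLE_NONE:
    default:
        return UPD_REFUSED;       /* neither primary nor secondary */
    }
}
```

Note how a secondary changes nothing locally unless the primary has
already accepted the update, so a failed forward leaves both copies as
they were.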

Note that this mechanism disallows updates when the primary is
down, preserving the Domain scheme's consistency requirements,
but making the primary a critical point for updates.  This seemed
reasonable to me because
        1. Alternative schemes must deal with potentially complex
           situations involving merging of inconsistent secondary
           updates
        2. Updates are presumed to be rare relative to read accesses,
           so this increased restrictiveness for updates over reads is
           probably not critical

I have placed comments throughout the code, so it shouldn't be
too hard to see what I did.  The majority of the processing is in
doupdate() and InitDynUpdate().  Also, I added a field to the zone
struct, to keep track of when zones get updated, so that only changed
zones get checkpointed.
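The "only changed zones get checkpointed" idea might look like the
sketch below.  The struct, its updated field, and toy_checkpoint are
hypothetical; in particular, the real field in the zone struct may
record a time rather than a simple flag.

```c
#include <stddef.h>

/* Hypothetical per-zone bookkeeping. */
struct toy_zone_state {
    const char *origin;
    int         updated;   /* set when a dynamic update touches the zone */
};

/* Dump every zone whose flag is set and clear the flag.
 * dump is called once per changed zone; pass NULL to only count.
 * Returns the number of zones that were checkpointed. */
size_t toy_checkpoint(struct toy_zone_state *zones, size_t nzones,
                      void (*dump)(const struct toy_zone_state *))
{
    size_t dumped = 0;

    for (size_t i = 0; i < nzones; i++) {
        if (!zones[i].updated)
            continue;              /* unchanged: keep the file on disk */
        if (dump != NULL)
            dump(&zones[i]);
        zones[i].updated = 0;
        dumped++;
    }
    return dumped;
}
```

A second call with no intervening updates dumps nothing, which is what
keeps periodic maintenance cheap on a mostly read-only server.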





                3.2 Miscellaneous Points

I use ns_maint to call zonedump() if the database changes, to
provide a checkpointing mechanism.  I use the zone refresh times to
set up ns_maint interrupts if there are either secondaries or
primaries.  Hence, if there is a secondary, this interrupt can cause
zoneref (as before), and if there is a primary, this interrupt can
cause doadump.  I also checkpoint if needed before shutting down.

You can force a server to checkpoint any changed zones by sending the
maint signal (SIGALRM) to the process.  Otherwise it just checkpoints
during maint interrupts, or when being shut down (with SIGTERM).
Sending it the dump signal causes the database to be dumped into the
(single) dump file, but doesn't checkpoint (i.e., update the boot
files).  Note that the boot files will be overwritten with checkpoint
files, so if you want to preserve the comments, you should keep copies
of the original boot files separate from the versions that are actually
used.

I disallow T_SOA updates, for several reasons:
        - T_SOA deletes at the primary won't be discovered by the secondaries
          until they try to request them at maint time, which will cause
          a failure
        - the corresponding NS record would have to be deleted at the same
          time (atomically) to avoid various problems
        - T_SOA updates would have to be done in the right order, or else
          the primary and secondaries will be out-of-sync for that zone.
My feeling is that changing the zone topology is a weighty enough thing
to do that it should involve changing the load file and reloading all
affected servers.

There are a lot of places where bind exits due to catastrophic failures
(mainly malloc failures).  I don't try to dump the database in these
places because it's probably inconsistent anyway.  It's probably better
to depend on the most recent dump.





                3.3 Known Defects

1. I put the following comment in nlookup (db_lookup.c):

        Note: at this point, if np->n_data is NULL, we could be in one
        of two situations: Either we have come across a name for which
        all the RRs have been (dynamically) deleted, or else we have
        come across a name which has no RRs associated with it because
        it is just a place holder (e.g., EDU).  In the former case, we
        would like to delete the namebuf, since it is no longer of use,
        but in the latter case we need to hold on to it, so future
        lookups that depend on it don't fail.  The only way I can see
        of doing this is to always leave the namebufs around (although
        then the memory usage continues to grow whenever names are
        added, and can never shrink back down completely when all their
        associated RRs are deleted).

   Thus, there is a problem that the memory usage will keep growing for
   the situation described.  You might just choose to ignore this
   problem (since I don't see any good way out), since things probably
   won't grow fast anyway (how many names are created and then deleted
   during a single server incarnation, after all?)

   The problem is that one can't delete old namebufs because one would
   want to do it from db_update, but db_update calls nlookup to do the
   actual work, and can't do it there, since we need to maintain place
   holders.  One could make db_update not call nlookup, so we know it's
   ok to delete the namebuf (since we know the call is part of a delete
   call); but then there is code with a lot of overlapping functionality
   in the 2 routines.

   This also causes another problem:  If you create a name and then do
   UPDATEDA, all its RRs get deleted, but the name remains; then, if you
   do a lookup on that name later, the name is found in the hash table,
   but no RRs are found for it.  The server then forwards the query to
   itself (for some reason), and then somehow decides there is no such
   domain, and then returns (with the correct answer, but after going
   through extra work).  But the name remains, and each time it is looked
   up, we go through these same steps.  This should be fixed, but I don't
   have time right now (and the right answer seems to come back anyway,
   so it's good enough for now).

2. There are 2 problems that crop up when you store data (other than
   T_SOA and T_NS records) in the root:
   a. Can't get primary to doaxfr RRs other than SOA and NS to
      secondary.
   b. Upon checkpoint (zonedump), this data sometimes comes out after other
      data in the root, so that (since the SOA and NS records have null
      names), they will get interpreted as being records under the
      other names upon the next boot up.  For example, if you have a
      T_A record called ABC, the checkpoint may look like:
         $ORIGIN .
         ABC     IN      A       128.95.1.3
         99999999        IN      NS      UW-BORNEO.
         IN      SOA     UW-BORNEO. SCHWARTZ.CS.WASHINGTON.EDU.
         ( 50 3600 300 3600000 3600 )
      Then when booting up the next time, the SOA and NS records get
      interpreted as being called "ABC" rather than the null root
      name.
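The misreading in 2b follows from the master file rule that a record
line beginning with a blank belongs to the last stated owner.  The
sketch below tracks that owner line by line.  The names owner_tracker,
owner_tracker_init and owner_for_line are hypothetical, and starting
with the origin as the current owner is a simplifying assumption, not a
statement about how BIND's loader is written.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define OWNER_MAX 64

/* Remembers the last stated owner while record lines are read in order. */
struct owner_tracker {
    char last[OWNER_MAX];
};

/* Start with the origin as the current owner (an assumption of this sketch). */
void owner_tracker_init(struct owner_tracker *t, const char *origin)
{
    strncpy(t->last, origin, OWNER_MAX - 1);
    t->last[OWNER_MAX - 1] = '\0';
}

/* Returns the owner that applies to line and updates the tracker.
 * A line that starts with whitespace inherits the previous owner;
 * otherwise its first field becomes the new owner. */
const char *owner_for_line(struct owner_tracker *t, const char *line)
{
    if (line[0] != '\0' && !isspace((unsigned char)line[0])) {
        size_t n = strcspn(line, " \t");
        if (n >= OWNER_MAX)
            n = OWNER_MAX - 1;
        memcpy(t->last, line, n);
        t->last[n] = '\0';
    }
    return t->last;
}
```

Fed the ABC line first, the tracker reports "ABC" as the owner of the
following blank-led NS line; fed the NS line first, it reports the
root.  The order in which the checkpoint writes records therefore
decides how they are read back.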

3. The secondary server caches the T_A RR for the primary, and hence when
   it tries to ns_forw an update, it won't find the address of the primary
   using nslookup unless that T_A RR is *also* stored in the main hashtable
   (by putting it in a named.db file as well as the named.ca file).