docs/design.rst

   1 .. _design:
   2
   3 Linker Design
   4 =============
   5
   6 Introduction
   7 ------------
   8
   9 lld is a new generation of linker.  It is not "section" based like traditional
  10 linkers which mostly just interlace sections from multiple object files into the
  11 output file.  Instead, lld is based on "Atoms".  Traditional section based
  12 linking work well for simple linking, but their model makes advanced linking
  13 features difficult to implement.  Features like dead code stripping, reordering
  14 functions for locality, and C++ coalescing require the linker to work at a finer
  15 grain.
  16
  17 An atom is an indivisible chunk of code or data.  An atom has a set of
  18 attributes, such as: name, scope, content-type, alignment, etc.  An atom also
  19 has a list of References.  A Reference contains: a kind, an optional offset, an
  20 optional addend, and an optional target atom.
  21
  22 The Atom model allows the linker to use standard graph theory models for linking
  23 data structures.  Each atom is a node, and each Reference is an edge.  The
  24 feature of dead code stripping is implemented by following edges to mark all
  25 live atoms, and then delete the non-live atoms.
  26
  27
  28 Atom Model
  29 ----------
  30
  31 An atom is an indivisible chunk of code or data.  Typically each user written
  32 function or global variable is an atom.  In addition, the compiler may emit
  33 other atoms, such as for literal c-strings or floating point constants, or for
  34 runtime data structures like dwarf unwind info or pointers to initializers.
  35
  36 A simple "hello world" object file would be modeled like this:
  37
  38 .. image:: hello.png
  39
  40 There are three atoms: main, a proxy for printf, and an anonymous atom
  41 containing the c-string literal "hello world".  The Atom "main" has two
  42 references. One is the call site for the call to printf, and the other is a
  43 reference for the instruction that loads the address of the c-string literal.
  44
  45 There are only four different types of atoms:
  46
  47         * DefinedAtom
  48                 95% of all atoms.  This is a chunk of code or data
  49
  50         * UndefinedAtom
  51            This is a place holder in object files for a reference to some atom
  52            outside the translation unit.During core linking it is usually replaced
  53            by (coalesced into) another Atom.
  54
  55         * SharedLibraryAtom
  56                 If a required symbol name turns out to be defined in a dynamic shared
  57                 library (and not some object file).  A SharedLibraryAtom is the
  58                 placeholder Atom used to represent that fact.
  59
  60                 It is similar to an UndefinedAtom, but it also tracks information
  61                 about the associated shared library.
  62
  63         * AbsoluteAtom
  64                 This is for embedded support where some stuff is implemented in ROM at
  65                 some fixed address.  This atom has no content.  It is just an address
  66                 that the Writer needs to fix up any references to point to.
  67
  68
  69 File Model
  70 ----------
  71
  72 The linker views the input files as basically containers of Atoms and
  73 References, and just a few attributes of their own.  The linker works with three
  74 kinds of files: object files, static libraries, and dynamic shared libraries.
  75 Each kind of file has reader object which presents the file in the model
  76 expected by the linker.
  77
  78 Object File
  79 ~~~~~~~~~~~
  80
  81 An object file is just a container of atoms.  When linking an object file, a
  82 reader is instantiated which parses the object file and instantiates a set of
  83 atoms representing all content in the .o file.  The linker adds all those atoms
  84 to a master graph.
  85
  86 Static Library (Archive)
  87 ~~~~~~~~~~~~~~~~~~~~~~~~
  88
  89 This is the traditional unix static archive which is just a collection of object
  90 files with a "table of contents". When linking with a static library, by default
  91 nothing is added to the master graph of atoms. Instead, if after merging all
  92 atoms from object files into a master graph, if any "undefined" atoms are left
  93 remaining in the master graph, the linker reads the table of contents for each
  94 static library to see if any have the needed definitions. If so, the set of
  95 atoms from the specified object file in the static library is added to the
  96 master graph of atoms.
  97
  98 Dynamic Library (Shared Object)
  99 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 100
 101 Dynamic libraries are different than object files and static libraries in that
 102 they don't directly add any content.  Their purpose is to check at build time
 103 that the remaining undefined references can be resolved at runtime, and provide
 104 a list of dynamic libraries (SO_NEEDED) that will be needed at runtime.  The way
 105 this is modeled in the linker is that a dynamic library contributes no atoms to
 106 the initial graph of atoms.  Instead, (like static libraries) if there are
 107 "undefined" atoms in the master graph of all atoms, then each dynamic library is
 108 checked to see if exports the required symbol. If so, a "shared library" atom is
 109 instantiated by the by the reader which the linker uses to replace the
 110 "undefined" atom.
 111
 112 Linking Steps
 113 -------------
 114
 115 Through the use of abstract Atoms, the core of linking is architecture
 116 independent and file format independent.  All command line parsing is factored
 117 out into a separate "options" abstraction which enables the linker to be driven
 118 with different command line sets.
 119
 120 The overall steps in linking are:
 121
 122   #. Command line processing
 123
 124   #. Parsing input files
 125
 126   #. Resolving
 127
 128   #. Passes/Optimizations
 129
 130   #. Generate output file
 131
 132 The Resolving and Passes steps are done purely on the master graph of atoms, so
 133 they have no notion of file formats such as mach-o or ELF.
 134
 135
 136 Input Files
 137 ~~~~~~~~~~~
 138
 139 Existing developer tools using different file formats for object files.
 140 A goal of lld is to be file format independent.  This is done
 141 through a plug-in model for reading object files. The lld::Reader is the base
 142 class for all object file readers.  A Reader follows the factory method pattern.
 143 A Reader instantiates an lld::File object (which is a graph of Atoms) from a
 144 given object file (on disk or in-memory).
 145
 146 Every Reader subclass defines its own "options" class (for instance the mach-o
 147 Reader defines the class ReaderOptionsMachO).  This options class is the
 148 one-and-only way to control how the Reader operates when parsing an input file
 149 into an Atom graph.  For instance, you may want the Reader to only accept
 150 certain architectures.  The options class can be instantiated from command
 151 line options, or it can be subclassed and the ivars programmatically set.
 152
 153 ELF Section Groups
 154 ~~~~~~~~~~~~~~~~~~
 155 Reference : `ELF Section Groups <http://mentorembedded.github.io/cxx-abi/abi/prop-72-comdat.html>`_
 156
 157 C++ has many situations where the compiler may need to emit code or data,
 158 but may not be able to identify a unique compilation unit where it should be
 159 emitted. The approach chosen by the C++ ABI group to deal with this problem, is
 160 to allow the compiler to emit the required information in multiple compilation
 161 units, in a form which allows the linker to remove all but one copy. This is
 162 essentially the feature called COMDAT in several existing implementations.
 163
 164 The COMDAT sections in ELF are modeled by using '.group' sections in the input
 165 files. Each '.group' section is associated with a signature. The '.group'
 166 section has a list of members that are part of the the '.group' which the linker
 167 selects to appear in the input file(Whichever .group section appeared first
 168 in the link). References to any of the '.group' members can also appear from
 169 outside the '.group'.
 170
 171 In lld the the '.group' sections with COMDAT are identified by contentType(
 172 typeGroupComdat). The '.group' members are identified by using
 173 **kindGroupChild** references.
 174
 175 The point to be noted here is the 'group child' members would need to be emitted
 176 in the output file **iff** the group was selected by the resolver.
 177
 178 This is modeled in lld by removing the 'group child' members from the
 179 definedAtom List.
 180
 181 Any reference to the group-child from **outside the group** is referenced using
 182 a 'undefined' atom.
 183
 184 Resolving
 185 ~~~~~~~~~
 186
 187 The resolving step takes all the atoms' graphs from each object file and
 188 combines them into one master object graph.  Unfortunately, it is not as simple
 189 as appending the atom list from each file into one big list.  There are many
 190 cases where atoms need to be coalesced.  That is, two or more atoms need to be
 191 coalesced into one atom.  This is necessary to support: C language "tentative
 192 definitions", C++ weak symbols for templates and inlines defined in headers,
 193 replacing undefined atoms with actual definition atoms, and for merging copies
 194 of constants like c-strings and floating point constants.
 195
 196 The linker support coalescing by-name and by-content. By-name is used for
 197 tentative definitions and weak symbols.  By-content is used for constant data
 198 that can be merged.
 199
 200 The resolving process maintains some global linking "state", including a "symbol
 201 table" which is a map from llvm::StringRef to lld::Atom*.  With these data
 202 structures, the linker iterates all atoms in all input files. For each atom, it
 203 checks if the atom is named and has a global or hidden scope.  If so, the atom
 204 is added to the symbol table map.  If there already is a matching atom in that
 205 table, that means the current atom needs to be coalesced with the found atom, or
 206 it is a multiple definition error.
 207
 208 When all initial input file atoms have been processed by the resolver, a scan is
 209 made to see if there are any undefined atoms in the graph.  If there are, the
 210 linker scans all libraries (both static and dynamic) looking for definitions to
 211 replace the undefined atoms.  It is an error if any undefined atoms are left
 212 remaining.
 213
 214 Dead code stripping (if requested) is done at the end of resolving.  The linker
 215 does a simple mark-and-sweep. It starts with "root" atoms (like "main" in a main
 216 executable) and follows each references and marks each Atom that it visits as
 217 "live".  When done, all atoms not marked "live" are removed.
 218
 219 The result of the Resolving phase is the creation of an lld::File object.  The
 220 goal is that the lld::File model is **the** internal representation
 221 throughout the linker. The file readers parse (mach-o, ELF, COFF) into an
 222 lld::File.  The file writers (mach-o, ELF, COFF) taken an lld::File and produce
 223 their file kind, and every Pass only operates on an lld::File.  This is not only
 224 a simpler, consistent model, but it enables the state of the linker to be dumped
 225 at any point in the link for testing purposes.
 226
 227
 228 Passes
 229 ~~~~~~
 230
 231 The Passes step is an open ended set of routines that each get a change to
 232 modify or enhance the current lld::File object. Some example Passes are:
 233
 234   * stub (PLT) generation
 235
 236   * GOT instantiation
 237
 238   * order_file optimization
 239
 240   * branch island generation
 241
 242   * branch shim generation
 243
 244   * Objective-C optimizations (Darwin specific)
 245
 246   * TLV instantiation (Darwin specific)
 247
 248   * DTrace probe processing (Darwin specific)
 249
 250   * compact unwind encoding (Darwin specific)
 251
 252
 253 Some of these passes are specific to Darwin's runtime environments.  But many of
 254 the passes are applicable to any OS (such as generating branch island for out of
 255 range branch instructions).
 256
 257 The general structure of a pass is to iterate through the atoms in the current
 258 lld::File object, inspecting each atom and doing something.  For instance, the
 259 stub pass, looks for call sites to shared library atoms (e.g. call to printf).
 260 It then instantiates a "stub" atom (PLT entry) and a "lazy pointer" atom for
 261 each proxy atom needed, and these new atoms are added to the current lld::File
 262 object.  Next, all the noted call sites to shared library atoms have their
 263 References altered to point to the stub atom instead of the shared library atom.
 264
 265
 266 Generate Output File
 267 ~~~~~~~~~~~~~~~~~~~~
 268
 269 Once the passes are done, the output file writer is given current lld::File
 270 object.  The writer's job is to create the executable content file wrapper and
 271 place the content of the atoms into it.
 272
 273 lld uses a plug-in model for writing output files. All concrete writers (e.g.
 274 ELF, mach-o, etc) are subclasses of the lld::Writer class.
 275
 276 Unlike the Reader class which has just one method to instantiate an lld::File,
 277 the Writer class has multiple methods.  The crucial method is to generate the
 278 output file, but there are also methods which allow the Writer to contribute
 279 Atoms to the resolver and specify passes to run.
 280
 281 An example of contributing
 282 atoms is that if the Writer knows a main executable is being linked and such
 283 an executable requires a specially named entry point (e.g. "_main"), the Writer
 284 can add an UndefinedAtom with that special name to the resolver.  This will
 285 cause the resolver to issue an error if that symbol is not defined.
 286
 287 Sometimes a Writer supports lazily created symbols, such as names for the start
 288 of sections. To support this, the Writer can create a File object which vends
 289 no initial atoms, but does lazily supply atoms by name as needed.
 290
 291 Every Writer subclass defines its own "options" class (for instance the mach-o
 292 Writer defines the class WriterOptionsMachO).  This options class is the
 293 one-and-only way to control how the Writer operates when producing an output
 294 file from an Atom graph.  For instance, you may want the Writer to optimize
 295 the output for certain OS versions, or strip local symbols, etc. The options
 296 class can be instantiated from command line options, or it can be subclassed
 297 and the ivars programmatically set.
 298
 299
 300 lld::File representations
 301 -------------------------
 302
 303 Just as LLVM has three representations of its IR model, lld has three
 304 representations of its File/Atom/Reference model:
 305
 306  * In memory, abstract C++ classes (lld::Atom, lld::Reference, and lld::File).
 307
 308  * textual (in YAML)
 309
 310  * binary format ("native")
 311
 312 Binary File Format
 313 ~~~~~~~~~~~~~~~~~~
 314
 315 In theory, lld::File objects could be written to disk in an existing Object File
 316 format standard (e.g. ELF).  Instead we choose to define a new binary file
 317 format. There are two main reasons for this: fidelity and performance.  In order
 318 for lld to work as a linker on all platforms, its internal model must be rich
 319 enough to model all CPU and OS linking features.  But if we choose an existing
 320 Object File format as the lld binary format, that means an on going need to
 321 retrofit each platform specific feature needed from alternate platforms into the
 322 existing Object File format.  Having our own "native" binary format side steps
 323 that issue.  We still need to be able to binary encode all the features, but
 324 once the in-memory model can represent the feature, it is straight forward to
 325 binary encode it.
 326
 327 The reason to use a binary file format at all, instead of a textual file format,
 328 is speed.  You want the binary format to be as fast as possible to read into the
 329 in-memory model. Given that we control the in-memory model and the binary
 330 format, the obvious way to make reading super fast it to make the file format be
 331 basically just an array of atoms.  The reader just mmaps in the file and looks
 332 at the header to see how many atoms there are and instantiate that many atom
 333 objects with the atom attribute information coming from that array.  The trick
 334 is designing this in a way that can be extended as the Atom mode evolves and new
 335 attributes are added.
 336
 337 The native object file format starts with a header that lists how many "chunks"
 338 are in the file.  A chunk is an array of "ivar data".  The native file reader
 339 instantiates an array of Atom objects (with one large malloc call).  Each atom
 340 contains just a pointer to its vtable and a pointer to its ivar data.  All
 341 methods on lld::Atom are virtual, so all the method implementations return
 342 values based on the ivar data to which it has a pointer.  If a new linking
 343 features is added which requires a change to the lld::Atom model, a new native
 344 reader class (e.g. version 2) is defined which knows how to read the new feature
 345 information from the new ivar data.  The old reader class (e.g. version 1) is
 346 updated to do its best to model (the lack of the new feature) given the old ivar
 347 data in existing native object files.
 348
 349 With this model for the native file format, files can be read and turned
 350 into the in-memory graph of lld::Atoms with just a few memory allocations.
 351 And the format can easily adapt over time to new features.
 352
 353 The binary file format follows the ReaderWriter patterns used in lld. The lld
 354 library comes with the classes: ReaderNative and WriterNative.  So, switching
 355 between file formats is as easy as switching which Reader subclass is used.
 356
 357
 358 Textual representations in YAML
 359 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 360
 361 In designing a textual format we want something easy for humans to read and easy
 362 for the linker to parse.  Since an atom has lots of attributes most of which are
 363 usually just the default, we should define default values for every attribute so
 364 that those can be omitted from the text representation.  Here is the atoms for a
 365 simple hello world program expressed in YAML::
 366
 367   target-triple:   x86_64-apple-darwin11
 368
 369   atoms:
 370       - name:    _main
 371         scope:   global
 372         type:    code
 373         content: [ 55, 48, 89, e5, 48, 8d, 3d, 00, 00, 00, 00, 30, c0, e8, 00, 00,
 374                    00, 00, 31, c0, 5d, c3 ]
 375         fixups:
 376         - offset: 07
 377           kind:   pcrel32
 378           target: 2
 379         - offset: 0E
 380           kind:   call32
 381           target: _fprintf
 382
 383       - type:    c-string
 384         content: [ 73, 5A, 00 ]
 385
 386   ...
 387
 388 The biggest use for the textual format will be writing test cases.  Writing test
 389 cases in C is problematic because the compiler may vary its output over time for
 390 its own optimization reasons which my inadvertently disable or break the linker
 391 feature trying to be tested. By writing test cases in the linkers own textual
 392 format, we can exactly specify every attribute of every atom and thus target
 393 specific linker logic.
 394
 395 The textual/YAML format follows the ReaderWriter patterns used in lld. The lld
 396 library comes with the classes: ReaderYAML and WriterYAML.
 397
 398
 399 Testing
 400 -------
 401
 402 The lld project contains a test suite which is being built up as new code is
 403 added to lld.  All new lld functionality should have a tests added to the test
 404 suite.  The test suite is `lit <http://llvm.org/cmds/lit.html/>`_ driven.  Each
 405 test is a text file with comments telling lit how to run the test and check the
 406 result To facilitate testing, the lld project builds a tool called lld-core.
 407 This tool reads a YAML file (default from stdin), parses it into one or more
 408 lld::File objects in memory and then feeds those lld::File objects to the
 409 resolver phase.  The output of the resolver is written as a native object file.
 410 It is then read back in using the native object file reader and then pass to the
 411 YAML writer.  This round-about path means that all three representations
 412 (in-memory, binary, and text) are exercised, and any new feature has to work in
 413 all the representations to pass the test.
 414
 415
 416 Resolver testing
 417 ~~~~~~~~~~~~~~~~
 418
 419 Basic testing is the "core linking" or resolving phase.  That is where the
 420 linker merges object files.  All test cases are written in YAML.  One feature of
 421 YAML is that it allows multiple "documents" to be encoding in one YAML stream.
 422 That means one text file can appear to the linker as multiple .o files - the
 423 normal case for the linker.
 424
 425 Here is a simple example of a core linking test case. It checks that an
 426 undefined atom from one file will be replaced by a definition from another
 427 file::
 428
 429   # RUN: lld-core %s | FileCheck %s
 430
 431   #
 432   # Test that undefined atoms are replaced with defined atoms.
 433   #
 434
 435   ---
 436   atoms:
 437       - name:              foo
 438         definition:        undefined
 439   ---
 440   atoms:
 441       - name:              foo
 442         scope:             global
 443         type:              code
 444   ...
 445
 446   # CHECK:       name:       foo
 447   # CHECK:       scope:      global
 448   # CHECK:       type:       code
 449   # CHECK-NOT:   name:       foo
 450   # CHECK:       ...
 451
 452
 453 Passes testing
 454 ~~~~~~~~~~~~~~
 455
 456 Since Passes just operate on an lld::File object, the lld-core tool has the
 457 option to run a particular pass (after resolving).  Thus, you can write a YAML
 458 test case with carefully crafted input to exercise areas of a Pass and the check
 459 the resulting lld::File object as represented in YAML.
 460
 461
 462 Design Issues
 463 -------------
 464
 465 There are a number of open issues in the design of lld.  The plan is to wait and
 466 make these design decisions when we need to.
 467
 468
 469 Debug Info
 470 ~~~~~~~~~~
 471
 472 Currently, the lld model says nothing about debug info.  But the most popular
 473 debug format is DWARF and there is some impedance mismatch with the lld model
 474 and DWARF.  In lld there are just Atoms and only Atoms that need to be in a
 475 special section at runtime have an associated section.  Also, Atoms do not have
 476 addresses.  The way DWARF is spec'ed different parts of DWARF are supposed to go
 477 into specially named sections and the DWARF references function code by address.
 478
 479 CPU and OS specific functionality
 480 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 481
 482 Currently, lld has an abstract "Platform" that deals with any CPU or OS specific
 483 differences in linking.  We just keep adding virtual methods to the base
 484 Platform class as we find linking areas that might need customization.  At some
 485 point we'll need to structure this better.
 486
 487
 488 File Attributes
 489 ~~~~~~~~~~~~~~~
 490
 491 Currently, lld::File just has a path and a way to iterate its atoms. We will
 492 need to add more attributes on a File.  For example, some equivalent to the
 493 target triple.  There is also a number of cached or computed attributes that
 494 could make various Passes more efficient.  For instance, on Darwin there are a
 495 number of Objective-C optimizations that can be done by a Pass.  But it would
 496 improve the plain C case if the Objective-C optimization Pass did not have to
 497 scan all atoms looking for any Objective-C data structures.  This could be done
 498 if the lld::File object had an attribute that said if the file had any
 499 Objective-C data in it. The Resolving phase would then be required to "merge"
 500 that attribute as object files are added.