contrib/llvm/tools/lld/docs/design.rst

   1 .. _design:
   2
   3 Linker Design
   4 =============
   5
   6 Note: this document discuss Mach-O port of LLD. For ELF and COFF,
   7 see :doc:`index`.
   8
   9 Introduction
  10 ------------
  11
  12 lld is a new generation of linker.  It is not "section" based like traditional
  13 linkers which mostly just interlace sections from multiple object files into the
  14 output file.  Instead, lld is based on "Atoms".  Traditional section based
  15 linking work well for simple linking, but their model makes advanced linking
  16 features difficult to implement.  Features like dead code stripping, reordering
  17 functions for locality, and C++ coalescing require the linker to work at a finer
  18 grain.
  19
  20 An atom is an indivisible chunk of code or data.  An atom has a set of
  21 attributes, such as: name, scope, content-type, alignment, etc.  An atom also
  22 has a list of References.  A Reference contains: a kind, an optional offset, an
  23 optional addend, and an optional target atom.
  24
  25 The Atom model allows the linker to use standard graph theory models for linking
  26 data structures.  Each atom is a node, and each Reference is an edge.  The
  27 feature of dead code stripping is implemented by following edges to mark all
  28 live atoms, and then delete the non-live atoms.
  29
  30
  31 Atom Model
  32 ----------
  33
  34 An atom is an indivisible chunk of code or data.  Typically each user written
  35 function or global variable is an atom.  In addition, the compiler may emit
  36 other atoms, such as for literal c-strings or floating point constants, or for
  37 runtime data structures like dwarf unwind info or pointers to initializers.
  38
  39 A simple "hello world" object file would be modeled like this:
  40
  41 .. image:: hello.png
  42
  43 There are three atoms: main, a proxy for printf, and an anonymous atom
  44 containing the c-string literal "hello world".  The Atom "main" has two
  45 references. One is the call site for the call to printf, and the other is a
  46 reference for the instruction that loads the address of the c-string literal.
  47
  48 There are only four different types of atoms:
  49
  50         * DefinedAtom
  51                 95% of all atoms.  This is a chunk of code or data
  52
  53         * UndefinedAtom
  54            This is a place holder in object files for a reference to some atom
  55            outside the translation unit.During core linking it is usually replaced
  56            by (coalesced into) another Atom.
  57
  58         * SharedLibraryAtom
  59                 If a required symbol name turns out to be defined in a dynamic shared
  60                 library (and not some object file).  A SharedLibraryAtom is the
  61                 placeholder Atom used to represent that fact.
  62
  63                 It is similar to an UndefinedAtom, but it also tracks information
  64                 about the associated shared library.
  65
  66         * AbsoluteAtom
  67                 This is for embedded support where some stuff is implemented in ROM at
  68                 some fixed address.  This atom has no content.  It is just an address
  69                 that the Writer needs to fix up any references to point to.
  70
  71
  72 File Model
  73 ----------
  74
  75 The linker views the input files as basically containers of Atoms and
  76 References, and just a few attributes of their own.  The linker works with three
  77 kinds of files: object files, static libraries, and dynamic shared libraries.
  78 Each kind of file has reader object which presents the file in the model
  79 expected by the linker.
  80
  81 Object File
  82 ~~~~~~~~~~~
  83
  84 An object file is just a container of atoms.  When linking an object file, a
  85 reader is instantiated which parses the object file and instantiates a set of
  86 atoms representing all content in the .o file.  The linker adds all those atoms
  87 to a master graph.
  88
  89 Static Library (Archive)
  90 ~~~~~~~~~~~~~~~~~~~~~~~~
  91
  92 This is the traditional unix static archive which is just a collection of object
  93 files with a "table of contents". When linking with a static library, by default
  94 nothing is added to the master graph of atoms. Instead, if after merging all
  95 atoms from object files into a master graph, if any "undefined" atoms are left
  96 remaining in the master graph, the linker reads the table of contents for each
  97 static library to see if any have the needed definitions. If so, the set of
  98 atoms from the specified object file in the static library is added to the
  99 master graph of atoms.
 100
 101 Dynamic Library (Shared Object)
 102 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 103
 104 Dynamic libraries are different than object files and static libraries in that
 105 they don't directly add any content.  Their purpose is to check at build time
 106 that the remaining undefined references can be resolved at runtime, and provide
 107 a list of dynamic libraries (SO_NEEDED) that will be needed at runtime.  The way
 108 this is modeled in the linker is that a dynamic library contributes no atoms to
 109 the initial graph of atoms.  Instead, (like static libraries) if there are
 110 "undefined" atoms in the master graph of all atoms, then each dynamic library is
 111 checked to see if exports the required symbol. If so, a "shared library" atom is
 112 instantiated by the by the reader which the linker uses to replace the
 113 "undefined" atom.
 114
 115 Linking Steps
 116 -------------
 117
 118 Through the use of abstract Atoms, the core of linking is architecture
 119 independent and file format independent.  All command line parsing is factored
 120 out into a separate "options" abstraction which enables the linker to be driven
 121 with different command line sets.
 122
 123 The overall steps in linking are:
 124
 125   #. Command line processing
 126
 127   #. Parsing input files
 128
 129   #. Resolving
 130
 131   #. Passes/Optimizations
 132
 133   #. Generate output file
 134
 135 The Resolving and Passes steps are done purely on the master graph of atoms, so
 136 they have no notion of file formats such as mach-o or ELF.
 137
 138
 139 Input Files
 140 ~~~~~~~~~~~
 141
 142 Existing developer tools using different file formats for object files.
 143 A goal of lld is to be file format independent.  This is done
 144 through a plug-in model for reading object files. The lld::Reader is the base
 145 class for all object file readers.  A Reader follows the factory method pattern.
 146 A Reader instantiates an lld::File object (which is a graph of Atoms) from a
 147 given object file (on disk or in-memory).
 148
 149 Every Reader subclass defines its own "options" class (for instance the mach-o
 150 Reader defines the class ReaderOptionsMachO).  This options class is the
 151 one-and-only way to control how the Reader operates when parsing an input file
 152 into an Atom graph.  For instance, you may want the Reader to only accept
 153 certain architectures.  The options class can be instantiated from command
 154 line options, or it can be subclassed and the ivars programmatically set.
 155
 156 Resolving
 157 ~~~~~~~~~
 158
 159 The resolving step takes all the atoms' graphs from each object file and
 160 combines them into one master object graph.  Unfortunately, it is not as simple
 161 as appending the atom list from each file into one big list.  There are many
 162 cases where atoms need to be coalesced.  That is, two or more atoms need to be
 163 coalesced into one atom.  This is necessary to support: C language "tentative
 164 definitions", C++ weak symbols for templates and inlines defined in headers,
 165 replacing undefined atoms with actual definition atoms, and for merging copies
 166 of constants like c-strings and floating point constants.
 167
 168 The linker support coalescing by-name and by-content. By-name is used for
 169 tentative definitions and weak symbols.  By-content is used for constant data
 170 that can be merged.
 171
 172 The resolving process maintains some global linking "state", including a "symbol
 173 table" which is a map from llvm::StringRef to lld::Atom*.  With these data
 174 structures, the linker iterates all atoms in all input files. For each atom, it
 175 checks if the atom is named and has a global or hidden scope.  If so, the atom
 176 is added to the symbol table map.  If there already is a matching atom in that
 177 table, that means the current atom needs to be coalesced with the found atom, or
 178 it is a multiple definition error.
 179
 180 When all initial input file atoms have been processed by the resolver, a scan is
 181 made to see if there are any undefined atoms in the graph.  If there are, the
 182 linker scans all libraries (both static and dynamic) looking for definitions to
 183 replace the undefined atoms.  It is an error if any undefined atoms are left
 184 remaining.
 185
 186 Dead code stripping (if requested) is done at the end of resolving.  The linker
 187 does a simple mark-and-sweep. It starts with "root" atoms (like "main" in a main
 188 executable) and follows each references and marks each Atom that it visits as
 189 "live".  When done, all atoms not marked "live" are removed.
 190
 191 The result of the Resolving phase is the creation of an lld::File object.  The
 192 goal is that the lld::File model is **the** internal representation
 193 throughout the linker. The file readers parse (mach-o, ELF, COFF) into an
 194 lld::File.  The file writers (mach-o, ELF, COFF) taken an lld::File and produce
 195 their file kind, and every Pass only operates on an lld::File.  This is not only
 196 a simpler, consistent model, but it enables the state of the linker to be dumped
 197 at any point in the link for testing purposes.
 198
 199
 200 Passes
 201 ~~~~~~
 202
 203 The Passes step is an open ended set of routines that each get a change to
 204 modify or enhance the current lld::File object. Some example Passes are:
 205
 206   * stub (PLT) generation
 207
 208   * GOT instantiation
 209
 210   * order_file optimization
 211
 212   * branch island generation
 213
 214   * branch shim generation
 215
 216   * Objective-C optimizations (Darwin specific)
 217
 218   * TLV instantiation (Darwin specific)
 219
 220   * DTrace probe processing (Darwin specific)
 221
 222   * compact unwind encoding (Darwin specific)
 223
 224
 225 Some of these passes are specific to Darwin's runtime environments.  But many of
 226 the passes are applicable to any OS (such as generating branch island for out of
 227 range branch instructions).
 228
 229 The general structure of a pass is to iterate through the atoms in the current
 230 lld::File object, inspecting each atom and doing something.  For instance, the
 231 stub pass, looks for call sites to shared library atoms (e.g. call to printf).
 232 It then instantiates a "stub" atom (PLT entry) and a "lazy pointer" atom for
 233 each proxy atom needed, and these new atoms are added to the current lld::File
 234 object.  Next, all the noted call sites to shared library atoms have their
 235 References altered to point to the stub atom instead of the shared library atom.
 236
 237
 238 Generate Output File
 239 ~~~~~~~~~~~~~~~~~~~~
 240
 241 Once the passes are done, the output file writer is given current lld::File
 242 object.  The writer's job is to create the executable content file wrapper and
 243 place the content of the atoms into it.
 244
 245 lld uses a plug-in model for writing output files. All concrete writers (e.g.
 246 ELF, mach-o, etc) are subclasses of the lld::Writer class.
 247
 248 Unlike the Reader class which has just one method to instantiate an lld::File,
 249 the Writer class has multiple methods.  The crucial method is to generate the
 250 output file, but there are also methods which allow the Writer to contribute
 251 Atoms to the resolver and specify passes to run.
 252
 253 An example of contributing
 254 atoms is that if the Writer knows a main executable is being linked and such
 255 an executable requires a specially named entry point (e.g. "_main"), the Writer
 256 can add an UndefinedAtom with that special name to the resolver.  This will
 257 cause the resolver to issue an error if that symbol is not defined.
 258
 259 Sometimes a Writer supports lazily created symbols, such as names for the start
 260 of sections. To support this, the Writer can create a File object which vends
 261 no initial atoms, but does lazily supply atoms by name as needed.
 262
 263 Every Writer subclass defines its own "options" class (for instance the mach-o
 264 Writer defines the class WriterOptionsMachO).  This options class is the
 265 one-and-only way to control how the Writer operates when producing an output
 266 file from an Atom graph.  For instance, you may want the Writer to optimize
 267 the output for certain OS versions, or strip local symbols, etc. The options
 268 class can be instantiated from command line options, or it can be subclassed
 269 and the ivars programmatically set.
 270
 271
 272 lld::File representations
 273 -------------------------
 274
 275 Just as LLVM has three representations of its IR model, lld has two
 276 representations of its File/Atom/Reference model:
 277
 278  * In memory, abstract C++ classes (lld::Atom, lld::Reference, and lld::File).
 279
 280  * textual (in YAML)
 281
 282
 283 Textual representations in YAML
 284 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 285
 286 In designing a textual format we want something easy for humans to read and easy
 287 for the linker to parse.  Since an atom has lots of attributes most of which are
 288 usually just the default, we should define default values for every attribute so
 289 that those can be omitted from the text representation.  Here is the atoms for a
 290 simple hello world program expressed in YAML::
 291
 292   target-triple:   x86_64-apple-darwin11
 293
 294   atoms:
 295       - name:    _main
 296         scope:   global
 297         type:    code
 298         content: [ 55, 48, 89, e5, 48, 8d, 3d, 00, 00, 00, 00, 30, c0, e8, 00, 00,
 299                    00, 00, 31, c0, 5d, c3 ]
 300         fixups:
 301         - offset: 07
 302           kind:   pcrel32
 303           target: 2
 304         - offset: 0E
 305           kind:   call32
 306           target: _fprintf
 307
 308       - type:    c-string
 309         content: [ 73, 5A, 00 ]
 310
 311   ...
 312
 313 The biggest use for the textual format will be writing test cases.  Writing test
 314 cases in C is problematic because the compiler may vary its output over time for
 315 its own optimization reasons which my inadvertently disable or break the linker
 316 feature trying to be tested. By writing test cases in the linkers own textual
 317 format, we can exactly specify every attribute of every atom and thus target
 318 specific linker logic.
 319
 320 The textual/YAML format follows the ReaderWriter patterns used in lld. The lld
 321 library comes with the classes: ReaderYAML and WriterYAML.
 322
 323
 324 Testing
 325 -------
 326
 327 The lld project contains a test suite which is being built up as new code is
 328 added to lld.  All new lld functionality should have a tests added to the test
 329 suite.  The test suite is `lit <http://llvm.org/cmds/lit.html/>`_ driven.  Each
 330 test is a text file with comments telling lit how to run the test and check the
 331 result To facilitate testing, the lld project builds a tool called lld-core.
 332 This tool reads a YAML file (default from stdin), parses it into one or more
 333 lld::File objects in memory and then feeds those lld::File objects to the
 334 resolver phase.
 335
 336
 337 Resolver testing
 338 ~~~~~~~~~~~~~~~~
 339
 340 Basic testing is the "core linking" or resolving phase.  That is where the
 341 linker merges object files.  All test cases are written in YAML.  One feature of
 342 YAML is that it allows multiple "documents" to be encoding in one YAML stream.
 343 That means one text file can appear to the linker as multiple .o files - the
 344 normal case for the linker.
 345
 346 Here is a simple example of a core linking test case. It checks that an
 347 undefined atom from one file will be replaced by a definition from another
 348 file::
 349
 350   # RUN: lld-core %s | FileCheck %s
 351
 352   #
 353   # Test that undefined atoms are replaced with defined atoms.
 354   #
 355
 356   ---
 357   atoms:
 358       - name:              foo
 359         definition:        undefined
 360   ---
 361   atoms:
 362       - name:              foo
 363         scope:             global
 364         type:              code
 365   ...
 366
 367   # CHECK:       name:       foo
 368   # CHECK:       scope:      global
 369   # CHECK:       type:       code
 370   # CHECK-NOT:   name:       foo
 371   # CHECK:       ...
 372
 373
 374 Passes testing
 375 ~~~~~~~~~~~~~~
 376
 377 Since Passes just operate on an lld::File object, the lld-core tool has the
 378 option to run a particular pass (after resolving).  Thus, you can write a YAML
 379 test case with carefully crafted input to exercise areas of a Pass and the check
 380 the resulting lld::File object as represented in YAML.
 381
 382
 383 Design Issues
 384 -------------
 385
 386 There are a number of open issues in the design of lld.  The plan is to wait and
 387 make these design decisions when we need to.
 388
 389
 390 Debug Info
 391 ~~~~~~~~~~
 392
 393 Currently, the lld model says nothing about debug info.  But the most popular
 394 debug format is DWARF and there is some impedance mismatch with the lld model
 395 and DWARF.  In lld there are just Atoms and only Atoms that need to be in a
 396 special section at runtime have an associated section.  Also, Atoms do not have
 397 addresses.  The way DWARF is spec'ed different parts of DWARF are supposed to go
 398 into specially named sections and the DWARF references function code by address.
 399
 400 CPU and OS specific functionality
 401 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 402
 403 Currently, lld has an abstract "Platform" that deals with any CPU or OS specific
 404 differences in linking.  We just keep adding virtual methods to the base
 405 Platform class as we find linking areas that might need customization.  At some
 406 point we'll need to structure this better.
 407
 408
 409 File Attributes
 410 ~~~~~~~~~~~~~~~
 411
 412 Currently, lld::File just has a path and a way to iterate its atoms. We will
 413 need to add more attributes on a File.  For example, some equivalent to the
 414 target triple.  There is also a number of cached or computed attributes that
 415 could make various Passes more efficient.  For instance, on Darwin there are a
 416 number of Objective-C optimizations that can be done by a Pass.  But it would
 417 improve the plain C case if the Objective-C optimization Pass did not have to
 418 scan all atoms looking for any Objective-C data structures.  This could be done
 419 if the lld::File object had an attribute that said if the file had any
 420 Objective-C data in it. The Resolving phase would then be required to "merge"
 421 that attribute as object files are added.