1 This file describes the design, layouts, and file formats of a
2 libsvn_fs_fs repository.
7 In FSFS, each committed revision is represented as an immutable file
8 containing the new node-revisions, contents, and changed-path
9 information for the revision, plus a second, changeable file
10 containing the revision properties.
12 In contrast to the BDB back end, the contents of recent revision of
13 files are stored as deltas against earlier revisions, instead of the
14 other way around. This is less efficient for common-case checkouts,
15 but brings greater simplicity and robustness, as well as the
16 flexibility to make commits work without write access to existing
17 revisions. Skip-deltas and delta combination mitigate the checkout
20 In-progress transactions are represented with a prototype rev file
21 containing only the new text representations of files (appended to as
22 changed file contents come in), along with a separate file for each
23 node-revision, directory representation, or property representation
24 which has been changed or added in the transaction. During the final
25 stage of the commit, these separate files are marshalled onto the end
26 of the prototype rev file to form the immutable revision file.
28 Layout of the FS directory
29 --------------------------
31 The layout of the FS directory (the "db" subdirectory of the
34 revs/ Subdirectory containing revs
35 <shard>/ Shard directory, if sharding is in use (see below)
36 <revnum> File containing rev <revnum>
37 <shard>.pack/ Pack directory, if the repo has been packed (see below)
38 pack Pack file, if the repository has been packed (see below)
39 manifest Pack manifest file, if a pack file exists (see below)
40 revprops/ Subdirectory containing rev-props
41 <shard>/ Shard directory, if sharding is in use (see below)
42 <revnum> File containing rev-props for <revnum>
43 <shard>.pack/ Pack directory, if the repo has been packed (see below)
44 <rev>.<count> Pack file, if the repository has been packed (see below)
45 manifest Pack manifest file, if a pack file exists (see below)
46 revprops.db SQLite database of the packed revision properties
47 transactions/ Subdirectory containing transactions
48 <txnid>.txn/ Directory containing transaction <txnid>
49 txn-protorevs/ Subdirectory containing transaction proto-revision files
50 <txnid>.rev Proto-revision file for transaction <txnid>
51 <txnid>.rev-lock Write lock for proto-rev file
52 txn-current File containing the next transaction key
53 locks/ Subdirectory containing locks
54 <partial-digest>/ Subdirectory named for first 3 letters of an MD5 digest
55 <digest> File containing locks/children for path with <digest>
56 node-origins/ Lazy cache of origin noderevs for nodes
57 <partial-nodeid> File containing noderev ID of origins of nodes
58 current File specifying current revision and next node/copy id
59 fs-type File identifying this filesystem as an FSFS filesystem
60 write-lock Empty file, locked to serialise writers
61 txn-current-lock Empty file, locked to serialise 'txn-current'
62 uuid File containing the UUID of the repository
63 format File containing the format number of this filesystem
64 fsfs.conf Configuration file
65 min-unpacked-rev File containing the oldest revision not in a pack file
66 min-unpacked-revprop File containing the oldest revision of unpacked revprop
67 rep-cache.db SQLite database mapping rep checksums to locations
69 Files in the revprops directory are in the hash dump format used by
72 The format of the "current" file is:
74 * Format 3 and above: a single line of the form
75 "<youngest-revision>\n" giving the youngest revision for the
78 * Format 2 and below: a single line of the form "<youngest-revision>
79 <next-node-id> <next-copy-id>\n" giving the youngest revision, the
80 next unique node-ID, and the next unique copy-ID for the
83 The "write-lock" file is an empty file which is locked before the
84 final stage of a commit and unlocked after the new "current" file has
85 been moved into place to indicate that a new revision is present. It
86 is also locked during a revprop propchange while the revprop file is
87 read in, mutated, and written out again. Note that readers are never
88 blocked by any operation - writers must ensure that the filesystem is
89 always in a consistent state.
91 The "txn-current" file is a file with a single line of text that
92 contains only a base-36 number. The current value will be used in the
93 next transaction name, along with the revision number the transaction
94 is based on. This sequence number ensures that transaction names are
95 not reused, even if the transaction is aborted and a new transaction
96 based on the same revision is begun. The only operation that FSFS
97 performs on this file is "get and increment"; the "txn-current-lock"
98 file is locked during this operation.
100 "fsfs.conf" is a configuration file in the standard Subversion/Python
101 config format. It is automatically generated when you create a new
102 repository; read the generated file for details on what it controls.
104 When representation sharing is enabled, the filesystem tracks
105 representation checksum and location mappings using a SQLite database in
106 "rep-cache.db". The database has a single table, which stores the sha1
107 hash text as the primary key, mapped to the representation revision, offset,
108 size and expanded size. This file is only consulted during writes and never
109 during reads. Consequently, it is not required, and may be removed at an
110 abritrary time, with the subsequent loss of rep-sharing capabilities for
111 revisions written thereafter.
116 The "format" file defines what features are permitted within the
117 filesystem, and indicates changes that are not backward-compatible.
118 It serves the same purpose as the repository file of the same name.
120 The filesystem format file was introduced in Subversion 1.2, and so
121 will not be present if the repository was created with an older
122 version of Subversion. An absent format file should be interpreted as
123 indicating a format 1 filesystem.
125 The format file is a single line of the form "<format number>\n",
126 followed by any number of lines specifying 'format options' -
127 additional information about the filesystem's format. Each format
128 option line is of the form "<option>\n" or "<option> <parameters>\n".
130 Clients should raise an error if they encounter an option not
131 permitted by the format number in use.
135 Format 1, understood by Subversion 1.1+
136 Format 2, understood by Subversion 1.4+
137 Format 3, understood by Subversion 1.5+
138 Format 4, understood by Subversion 1.6+
139 Format 5, understood by Subversion 1.7-dev, never released
140 Format 6, understood by Subversion 1.8
142 The differences between the formats are:
144 Delta representation in revision files
145 Format 1: svndiff0 only
146 Formats 2+: svndiff0 or svndiff1
149 Formats 1-2: none permitted
150 Format 3+: "layout" option
152 Transaction name reuse
153 Formats 1-2: transaction names may be reused
154 Format 3+: transaction names generated using txn-current file
156 Location of proto-rev file and its lock
157 Formats 1-2: transactions/<txnid>/rev and
158 transactions/<txnid>/rev-lock.
159 Format 3+: txn-protorevs/<txnid>.rev and
160 txn-protorevs/<txnid>.rev-lock.
162 Node-ID and copy-ID generation
163 Formats 1-2: Node-IDs and copy-IDs are guaranteed to form a
164 monotonically increasing base36 sequence using the "current"
166 Format 3+: Node-IDs and copy-IDs use the new revision number to
167 ensure uniqueness and the "current" file just contains the
171 Format 1-2: minfo-here and minfo-count node-revision fields are not
172 stored. svn_fs_get_mergeinfo returns an error.
173 Format 3+: minfo-here and minfo-count node-revision fields are
174 maintained. svn_fs_get_mergeinfo works.
176 Revision changed paths list:
177 Format 1-3: Does not contain the node's kind.
178 Format 4+: Contains the node's kind.
181 Format 4: Applied to revision data only.
182 Format 5: Revprops would be packed independently of revision data.
183 Format 6+: Applied equally to revision data and revprop data
184 (i.e. same min packed revision)
186 # Incomplete list. See SVN_FS_FS__MIN_*_FORMAT
189 Filesystem format options
190 -------------------------
192 Currently, the only recognised format option is "layout", which
193 specifies the paths that will be used to store the revision files and
194 revision property files.
196 The "layout" option is followed by the name of the filesystem layout
197 and any required parameters. The default layout, if no "layout"
198 keyword is specified, is the 'linear' layout.
200 The known layouts, and the parameters they require, are as follows:
203 Revision files and rev-prop files are named after the revision they
204 represent, and are placed directly in the revs/ and revprops/
205 directories. r1234 will be represented by the revision file
206 revs/1234 and the rev-prop file revprops/1234.
208 "sharded <max-files-per-directory>"
209 Revision files and rev-prop files are named after the revision they
210 represent, and are placed in a subdirectory of the revs/ and
211 revprops/ directories named according to the 'shard' they belong to.
213 Shards are numbered from zero and contain between one and the
214 maximum number of files per directory specified in the layout's
217 For the "sharded 1000" layout, r1234 will be represented by the
218 revision file revs/1/1234 and rev-prop file revprops/1/1234. The
219 revs/0/ directory will contain revisions 0-999, revs/1/ will contain
220 1000-1999, and so on.
225 A filesystem can optionally be "packed" to conserve space on disk. The
226 packing process concatenates all the revision files in each full shard to
227 create pack files. A manifest file is also created for each shard which
228 records the indexes of the corresponding revision files in the pack file.
229 In addition, the original shard is removed, and reads are redirected to the
232 The manifest file consists of a list of offsets, one for each revision in the
233 pack file. The offsets are stored as ASCII decimal, and separated by a newline
236 Packing revision properties (format 5: SQLite)
237 ---------------------------
239 This was supported by 1.7-dev builds but never included in a blessed release.
241 See r1143829 of this file:
242 http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_fs_fs/structure?view=markup&pathrev=1143829
245 Packing revision properties (format 6+)
246 ---------------------------
248 Similarly to the revision data, packing will concatenate multiple
249 revprops into a single file. Since they are mutable data, we put an
250 upper limit to the size of these files: We will concatenate the data
251 up to the limit and then use a new file for the following revisions.
253 The limit can be set and changed at will in the configuration file.
254 It is 64kB by default. Because a pack file must contain at least one
255 complete property list, files containing just one revision may exceed
258 Furthermore, pack files can be compressed which saves about 75% of
259 disk space. A configuration file flag enables the compression; it is
260 off by default and may be switched on and off at will. The pack size
261 limit is always applied to the uncompressed data. For this reason,
262 the default is 256kB while compression has been enabled.
264 Files are named after their start revision as "<rev>.<counter>" where
265 counter will be increased whenever we rewrite a pack file due to a
266 revprop change. The manifest file contains the list of pack file
267 names, one line for each revision.
269 Many tools track repository global data in revision properties at
270 revision 0. To minimize I/O overhead for those applications, we
271 will never pack that revision, i.e. its data is always being kept
276 Top level: <packed container>
278 We always apply data compression to the pack file - using the
279 SVN_DELTA_COMPRESSION_LEVEL_NONE level if compression is disabled.
280 (Note that compression at SVN_DELTA_COMPRESSION_LEVEL_NONE is not
281 a no-op stream transformation although most of the data will remain
284 container := header '\n' (revprops)+
285 header := start_rev '\n' rev_count '\n' (size '\n')+
287 All numbers in the header are given as ASCII decimals. rev_count
288 is the number of revisions packed into this container. There must
289 be exactly as many "size" and serialized "revprops". The "size"
290 values in the list are the length in bytes of the serialized
291 revprops of the respective revision.
293 Writing to packed revprops
295 The old pack file is being read and the new revprops serialized.
296 If they fit into the same pack file, a temp file with the new
297 content gets written and moved into place just like an non-packed
298 revprop file would. No name change or manifest update required.
300 If they don't fit into the same pack file, i.e. exceed the pack
301 size limit, the pack will be split into 2 or 3 new packs just
302 before and / or after the modified revision.
304 In the current implementation, they will never be merged again.
305 To minimize fragmentation, the initial packing process will only
306 use about 90% of the limit, i.e. leave some room for growth.
308 When a pack file gets split, its counter is being increased
309 creating a new file and leaving the old content in place and
310 available for concurrent readers. Only after the new manifest
311 file got moved into place, will the old pack files be deleted.
313 Write access to revprops is being serialized by the global
314 filesystem write lock. We only need to build a few retries into
315 the reader code to gracefully handle manifest changes and pack
322 A node-rev ID consists of the following three fields:
324 node_revision_id ::= node_id '.' copy_id '.' txn_id
326 At this level, the form of the ID is the same as for BDB - see the
327 section called "ID's" in <../libsvn_fs_base/notes/structure>.
329 In order to support efficient lookup of node-revisions by their IDs
330 and to simplify the allocation of fresh node-IDs during a transaction,
331 we treat the fields of a node-rev ID in new and interesting ways.
333 Within a new transaction:
335 New node-revision IDs assigned within a transaction have a txn-id
336 field of the form "t<txnid>".
338 When a new node-id or copy-id is assigned in a transaction, the ID
339 used is a "_" followed by a base36 number unique to the transaction.
343 Within a revision file, node-revs have a txn-id field of the form
344 "r<rev>/<offset>", to support easy lookup. The <offset> is the (ASCII
345 decimal) number of bytes from the start of the revision file to the
346 start of the node-rev.
348 During the final phase of a commit, node-revision IDs are rewritten
349 to have repository-wide unique node-ID and copy-ID fields, and to have
350 "r<rev>/<offset>" txn-id fields.
352 In Format 3 and above, this uniqueness is done by changing a temporary
353 id of "_<base36>" to "<base36>-<rev>". Note that this means that the
354 originating revision of a line of history or a copy can be determined
355 by looking at the node ID.
357 In Format 2 and below, the "current" file contains global base36
358 node-ID and copy-ID counters; during the commit, the counter value is
359 added to the transaction-specific base36 ID, and the value in
360 "current" is adjusted.
362 (It is legal for Format 3 repositories to contain Format 2-style IDs;
363 this just prevents I/O-less node-origin-rev lookup for those nodes.)
365 The temporary assignment of node-ID and copy-ID fields has
366 implications for svn_fs_compare_ids and svn_fs_check_related. The ID
367 _1.0.t1 is not related to the ID _1.0.t2 even though they have the
368 same node-ID, because temporary node-IDs are restricted in scope to
369 the transactions they belong to.
371 There is a lazily created cache mapping from node-IDs to the full
372 node-revision ID where they are created. This is in the node-origins
373 directory; the file name is the node-ID without its last character (or
374 "0" for single-character node IDs) and the contents is a serialized
375 hash mapping from node-ID to node-revision ID. This cache is only
376 used for node-IDs of the pre-Format 3 style.
378 Copy-IDs and copy roots
379 -----------------------
381 Copy-IDs are assigned in the same manner as they are in the BDB
384 * A node-rev resulting from a creation operation (with no copy
385 history) receives the copy-ID of its parent directory.
387 * A node-rev resulting from a copy operation receives a fresh
388 copy-ID, as one would expect.
390 * A node-rev resulting from a modification operation receives a
391 copy-ID depending on whether its predecessor derives from a
392 copy operation or whether it derives from a creation operation
393 with no intervening copies:
395 - If the predecessor does not derive from a copy, the new
396 node-rev receives the copy-ID of its parent directory. If the
397 node-rev is being modified through its created-path, this will
398 be the same copy-ID as the predecessor node-rev has; however,
399 if the node-rev is being modified through a copied ancestor
400 directory (i.e. we are performing a "lazy copy"), this will be
403 - If the predecessor derives from a copy and the node-rev is
404 being modified through its created-path, the new node-rev
405 receives the copy-ID of the predecessor.
407 - If the predecessor derives from a copy and the node-rev is not
408 being modified through its created path, the new node-rev
409 receives a fresh copy-ID. This is called a "soft copy"
410 operation, as distinct from a "true copy" operation which was
411 actually requested through the svn_fs interface. Soft copies
412 exist to ensure that the same <node-ID,copy-ID> pair is not
413 used twice within a transaction.
415 Unlike the BDB implementation, we do not have a "copies" table.
416 Instead, each node-revision record contains a "copyroot" field
417 identifying the node-rev resulting from the true copy operation most
418 proximal to the node-rev. If the node-rev does not itself derive from
419 a copy operation, then the copyroot field identifies the copy of an
420 ancestor directory; if no ancestor directories derive from a copy
421 operation, then the copyroot field identifies the root directory of
427 A revision file contains a concatenation of various kinds of data:
429 * Text and property representations
431 * The changed-path data
432 * Two offsets at the very end
434 A representation begins with a line containing either "PLAIN\n" or
435 "DELTA\n" or "DELTA <rev> <offset> <length>\n", where <rev>, <offset>,
436 and <length> give the location of the delta base of the representation
437 and the amount of data it contains (not counting the header or
438 trailer). If no base location is given for a delta, the base is the
439 empty stream. After the initial line comes raw svndiff data, followed
440 by a cosmetic trailer "ENDREP\n".
442 If the representation is for the text contents of a directory node,
443 the expanded contents are in hash dump format mapping entry names to
444 "<type> <id>" pairs, where <type> is "file" or "dir" and <id> gives
445 the ID of the child node-rev.
447 If a representation is for a property list, the expanded contents are
448 in the form of a dumped hash map mapping property names to property
451 The marshalling syntax for node-revs is a series of fields terminated
452 by a blank line. Fields have the syntax "<name>: <value>\n", where
453 <name> is a symbolic field name (each symbolic name is used only once
454 in a given node-rev) and <value> is the value data. Unrecognized
455 fields are ignored, for extensibility. The following fields are
458 id The ID of the node-rev
460 pred The ID of the predecessor node-rev
461 count Count of node-revs since the base of the node
462 text "<rev> <offset> <length> <size> <digest>" for text rep
463 props "<rev> <offset> <length> <size> <digest>" for props rep
464 <rev> and <offset> give location of rep
465 <length> gives length of rep, sans header and trailer
466 <size> gives size of expanded rep; may be 0 if equal
468 <digest> gives hex MD5 digest of expanded rep
469 ### in formats >=4, also present:
470 <sha1-digest> gives hex SHA1 digest of expanded rep
471 <uniquifier> see representation_t->uniquifier in fs.h
472 cpath FS pathname node was created at
473 copyfrom "<rev> <path>" of copyfrom data
474 copyroot "<rev> <created-path>" of the root of this copy
475 minfo-cnt The number of nodes under (and including) this node
476 which have svn:mergeinfo.
477 minfo-here Exists if this node itself has svn:mergeinfo.
479 The predecessor of a node-rev crosses both soft and true copies;
480 together with the count field, it allows efficient determination of
481 the base for skip-deltas. The first node-rev of a node contains no
482 "pred" field. A node-revision with no properties may omit the "props"
483 field. A node-revision with no contents (a zero-length file or an
484 empty directory) may omit the "text" field. In a node-revision
485 resulting from a true copy operation, the "copyfrom" field gives the
486 copyfrom data. The "copyroot" field identifies the root node-revision
487 of the copy; it may be omitted if the node-rev is its own copy root
488 (as is the case for node-revs with copy history, and for the root node
489 of revision 0). Copy roots are identified by revision and
490 created-path, not by node-rev ID, because a copy root may be a
491 node-rev which exists later on within the same revision file, meaning
492 its offset is not yet known.
494 The changed-path data is represented as a series of changed-path
495 items, each consisting of two lines. The first line has the format
496 "<id> <action> <text-mod> <prop-mod> <path>\n", where <id> is the
497 node-rev ID of the new node-rev, <action> is "add", "delete",
498 "replace", or "modify", <text-mod> and <prop-mod> are "true" or
499 "false" indicating whether the text and/or properties changed, and
500 <path> is the changed pathname. For deletes, <id> is the node-rev ID
501 of the deleted node-rev, and <text-mod> and <prop-mod> are always
502 "false". The second line has the format "<rev> <path>\n" containing
503 the node-rev's copyfrom information if it has any; if it does not, the
504 second line is blank.
506 Starting with FS format 4, <action> may contain the kind ("file" or
507 "dir") of the node, after a hyphen; for example, an added directory
508 may be represented as "add-dir".
510 At the very end of a rev file is a pair of lines containing
511 "\n<root-offset> <cp-offset>\n", where <root-offset> is the offset of
512 the root directory node revision and <cp-offset> is the offset of the
515 All numbers in the rev file format are unsigned and are represented as
521 A transaction directory has the following layout:
523 props Transaction props
524 next-ids Next temporary node-ID and copy-ID
525 changes Changed-path information so far
526 node.<nid>.<cid> New node-rev data for node
527 node.<nid>.<cid>.props Props for new node-rev, if changed
528 node.<nid>.<cid>.children Directory contents for node-rev
529 <sha1> Text representation of that sha1
531 In FS formats 1 and 2, it also contains:
533 rev Prototype rev file with new text reps
534 rev-lock Lockfile for writing to the above
536 In newer formats, these files are in the txn-protorevs/ directory.
538 The prototype rev file is used to store the text representations as
539 they are received from the client. To ensure that only one client is
540 writing to the file at a given time, the "rev-lock" file is locked for
541 the duration of each write.
543 The two kinds of props files are all in hash dump format. The "props"
544 file will always be present. The "node.<nid>.<cid>.props" file will
545 only be present if the node-rev properties have been changed.
547 The <sha1> files have been introduced in FS format 6. Their content
548 is that of text rep references: "<rev> <offset> <length> <size> <digest>"
549 They will be written for text reps in the current transaction and be
550 used to eliminate duplicate reps within that transaction.
552 The "next-ids" file contains a single line "<next-temp-node-id>
553 <next-temp-copy-id>\n" giving the next temporary node-ID and copy-ID
554 assignments (without the leading underscores). The next node-ID is
555 also used as a uniquifier for representations which may share the same
558 The "children" file for a node-revision begins with a copy of the hash
559 dump representation of the directory entries from the old node-rev (or
560 a dump of the empty hash for new directories), and then an incremental
561 hash dump entry for each change made to the directory.
563 The "changes" file contains changed-path entries in the same form as
564 the changed-path entries in a rev file, except that <id> and <action>
565 may both be "reset" (in which case <text-mod> and <prop-mod> are both
566 always "false") to indicate that all changes to a path should be
567 considered undone. Reset entries are only used during the final merge
568 phase of a transaction. Actions in the "changes" file always contain
569 a node kind, even if the FS format is older than format 4.
571 The node-rev files have the same format as node-revs in a revision
572 file, except that the "text" and "props" fields are augmented as
575 * The "props" field may have the value "-1" if properties have
576 been changed and are contained in a "props" file within the
577 node-rev subdirectory.
579 * For directory node-revs, the "text" field may have the value
580 "-1" if entries have been changed and are contained in a
581 "contents" file in the node-rev subdirectory.
583 * For the directory node-rev representing the root of the
584 transaction, the "is-fresh-txn-root" field indicates that it has
585 not been made mutable yet (see Issue #2608).
587 * For file node-revs, the "text" field may have the value "-1
588 <offset> <length> <size> <digest>" if the text representation is
589 within the prototype rev file.
591 * The "copyroot" field may have the value "-1 <created-path>" if the
592 copy root of the node-rev is part of the transaction in process.
597 Locks in FSFS are stored in serialized hash format in files whose
598 names are MD5 digests of the FS path which the lock is associated
599 with. For the purposes of keeping directory inode usage down, these
600 digest files live in subdirectories of the main lock directory whose
601 names are the first 3 characters of the digest filename.
603 Also stored in the digest file for a given FS path are pointers to
604 other digest files which contain information associated with other FS
605 paths that are beneath our path (an immediate child thereof, or a
606 grandchild, or a great-grandchild, ...).
608 To answer the question, "Does path FOO have a lock associated with
609 it?", one need only generate the MD5 digest of FOO's
610 absolute-in-the-FS path (say, 3b1b011fed614a263986b5c4869604e8), look
611 for a file located like so:
613 /path/to/repos/locks/3b1/3b1b011fed614a263986b5c4869604e8
615 And then see if that file contains lock information.
617 To inquire about locks on children of the path FOO, you would
618 reference the same path as above, but look for a list of children in
619 that file (instead of lock information). Children are listed as MD5
620 digests, too, so you would simply iterate over those digests and
621 consult the files they reference for lock information.