REPOSURGEON(1)                 Development Tools                REPOSURGEON(1)

NAME

       reposurgeon - surgical operations on repositories

SYNOPSIS

       reposurgeon [command...]

DESCRIPTION

12       The purpose of reposurgeon is to enable risky operations that VCSes
13       (version-control systems) don't want to let you do, such as (a) editing
14       past comments and metadata, (b) excising commits, (c) coalescing and
15       splitting commits, (d) removing files and subtrees from repo history,
16       (e) merging or grafting two or more repos, and (f) cutting a repo in
17       two by cutting a parent-child link, preserving the branch structure of
18       both child repos.
19
       A major use of reposurgeon is to help a human operator perform
       higher-quality conversions among version-control systems than can be
       achieved with fully automated converters.
23
24       The original motivation for reposurgeon was to clean up artifacts
25       created by repository conversions. It was foreseen that the tool would
26       also have applications when code needs to be removed from repositories
27       for legal or policy reasons.
28
29       To keep reposurgeon simple and flexible, it normally does not do its
30       own repository reading and writing. Instead, it relies on being able to
31       parse and emit the command streams created by git-fast-export and read
32       by git-fast-import. This means that it can be used on any
33       version-control system that has both fast-export and fast-import
34       utilities. The git-import stream format also implicitly defines a
35       common language of primitive operations for reposurgeon to speak.
36
       Fully supported systems (those for which reposurgeon can both read and
       write repositories) include git, hg, bzr, svn, darcs, bk, RCS, and SRC.
       For a complete list, with dependencies and technical notes, type
       “prefer” at the reposurgeon prompt.
41
42       Writing to the file-oriented systems RCS and SRC is done via rcs-fast-
43       import(1) and has some serious limitations because those systems cannot
44       represent all the metadata in a git-fast-export stream. Consult that
45       tool's documentation for details and partial workarounds.
46
47       Writing Subversion repositories also has some significant limitations,
48       discussed in the section on Working With Subversion.
49
       Fossil repository files can be read in using the --format=fossil option
       of the read command and written out with the --format=fossil option of
       the write command. Ignore patterns are not translated in either
       direction.
53
       CVS is supported for read only, not write. For CVS, reposurgeon must be
       run from within a repository directory (one with a CVSROOT
       subdirectory). Each module becomes a subdirectory in the reposurgeon
       representation of the change history.
58
59       In order to deal with version-control systems that do not have
60       fast-export equivalents, reposurgeon can also host extractor code that
61       reads repositories directly. For each version-control system supported
62       through an extractor, reposurgeon uses a small amount of knowledge
63       about the system's command-line tools to (in effect) replay repository
64       history into an input stream internally. Repositories under systems
65       supported through extractors can be read by reposurgeon, but not
66       modified by it. In particular, reposurgeon can be used to move a
67       repository history from any VCS supported by an extractor to any VCS
68       supported by a normal importer/exporter pair.
69
70       Mercurial repository reading is implemented with an extractor class;
71       writing is handled with the stock "hg fastimport" command. A test
72       extractor exists for git, but is normally disabled in favor of the
73       regular exporter.
74
75       For guidance on the pragmatics of repository conversion, see the DVCS
76       Migration HOWTO[1].
77

SAFETY WARNINGS

79       reposurgeon is a sharp enough tool to cut you. It takes care not to
80       ever write a repository in an actually inconsistent state, and will
81       terminate with an error message rather than proceed when its internal
82       data structures are confused. However, there are lots of things you can
83       do with it - like altering stored commit timestamps so they no longer
84       match the commit sequence - that are likely to cause havoc after you're
85       done. Proceed with caution and check your work.
86
87       Also note that, if your DVCS does the usual thing of making commit IDs
88       a cryptographic hash of content and parent links, editing a
89       publicly-accessible repository with this tool would be a bad idea. All
90       of the surgical operations in reposurgeon will modify the hash chains.
91
92       Please also see the notes on system-specific issues under the section
93       called “LIMITATIONS AND GUARANTEES”.
94

OPERATION

       The program can be run in one of two modes, either as an interactive
       command interpreter or in batch mode to execute commands given as
       arguments on the reposurgeon invocation line. The only differences
       between these modes are (1) the interactive one begins by turning on
       the 'verbose 1' option, (2) in batch mode all errors (including
       normally recoverable errors in selection-set syntax) are fatal, and (3)
       each command-line argument beginning with “--” has that prefix stripped
       off (which, in particular, means that --help and --version will work as
       expected). Also, in interactive mode, Ctrl-P and Ctrl-N scroll through
       your command history, and tab completion of both command keywords and
       name arguments is available wherever that makes semantic sense.
108
109       A git-fast-import stream consists of a sequence of commands which must
110       be executed in the specified sequence to build the repo; to avoid
111       confusion with reposurgeon commands we will refer to the stream
112       commands as events in this documentation. These events are implicitly
113       numbered from 1 upwards. Most commands require specifying a selection
114       of event sequence numbers so reposurgeon will know which events to
115       modify or delete.
116
117       For all the details of event types and semantics, see the git-fast-
118       import(1) manual page; the rest of this paragraph is a quick start for
119       the impatient. Most events in a stream are commits describing revision
120       states of the repository; these group together under a single change
121       comment one or more fileops (file operations), which usually point to
122       blobs that are revision states of individual files. A fileop may also
123       be a delete operation indicating that a specified previously-existing
124       file was deleted as part of the version commit; there are a couple of
125       other special fileop types of lesser importance.
126
127       Commands to reposurgeon consist of a command keyword, sometimes
128       preceded by a selection set, sometimes followed by whitespace-separated
129       arguments. It is often possible to omit the selection-set argument and
130       have it default to something reasonable.
131
132       Here are some motivating examples. The commands will be explained in
133       more detail after the description of selection syntax.
134
           :15 edit               ;; edit the object associated with mark :15

           edit                   ;; edit all editable objects

           29..71 list            ;; list summary index of events 29..71

           236..$ list            ;; List events from 236 to the last

           <#523> inspect         ;; Look for commit #523; they are numbered
                                  ;; 1-origin from the beginning of the repository.

           <2317> inspect         ;; Look for a tag with the name 2317, a tip commit
                                  ;; of a branch named 2317, or a commit with legacy ID
                                  ;; 2317. Inspect what is found. A plain number is
                                  ;; probably a legacy ID inherited from a Subversion
                                  ;; revision number.

           /regression/ list      ;; list all commits and tags with comments or
                                  ;; committer headers or author headers containing
                                  ;; the string "regression"

           1..:97 & =T delete     ;; delete tags from event 1 to mark 97

           [Makefile] inspect     ;; Inspect all commits with a file op touching Makefile
                                  ;; and all blobs referred to in a fileop
                                  ;; touching Makefile.

           :46 tip                ;; Display the branch tip that owns commit :46.

           @dsc(:55) list         ;; Display all commits with ancestry tracing to :55

           @min([.gitignore]) remove .gitignore delete
                                  ;; Remove the first .gitignore fileop in the repo.
168
   SELECTION SYNTAX
       A selection set is ordered: any given element may occur only once,
       and the set is ordered by when its members were first added.
172
173       The selection-set specification syntax is an expression-oriented
174       minilanguage. The most basic term in this language is a location. The
175       following sorts of primitive locations are supported:
176
177       event numbers
178           A plain numeric literal is interpreted as a 1-origin event-sequence
179           number.
180
181       marks
182           A numeric literal preceded by a colon is interpreted as a mark; see
183           the import stream format documentation for explanation of the
184           semantics of marks.
185
186       tag and branch names
187           The basename of a branch (including branches in the refs/tags
188           namespace) refers to its tip commit. The name of a tag is
189           equivalent to its mark (that of the tag itself, not the commit it
190           refers to). Tag and branch locations are bracketed with < > (angle
191           brackets) to distinguish them from command keywords.
192
193       legacy IDs
           If the contents of name brackets (< >) do not match a tag or
           branch name, the interpreter next searches legacy IDs of commits.
           This is especially useful when you have imported a Subversion dump;
           it means that commits made from it can be referred to by their
           corresponding Subversion revision numbers.
199
200       commit numbers
201           A numeric literal within name brackets (< >) preceded by # is
202           interpreted as a 1-origin commit-sequence number.
203
204       reset@ names
205           A name with the prefix 'reset@' refers to the latest reset with a
206           basename matching the part after the @. Usually there is only one
207           such reset.
208
209       $
210           Refers to the last event.
211
212       These may be grouped into sets in the following ways:
213
214       ranges
215           A range is two locations separated by "..", and is the set of
216           events beginning at the left-hand location and ending at the
217           right-hand location (inclusive).
218
219       lists
220           Comma-separated lists of locations and ranges are accepted, with
221           the obvious meaning.
222
223       There are some other ways to construct event sets:
224
225       visibility sets
226           A visibility set is an expression specifying a set of event types.
227           It will consist of a leading equal sign, followed by type letters.
228           These are the type letters:
229
                    ┌──┬─────────────────────┬───────────────────────┐
                    │B │        blobs        │  Most default         │
                    │  │                     │  selection sets       │
                    │  │                     │  exclude blobs; they  │
                    │  │                     │  have to be           │
                    │  │                     │  manipulated through  │
                    │  │                     │  the commits they     │
                    │  │                     │  are attached to.     │
                    ├──┼─────────────────────┼───────────────────────┤
                    │C │       commits       │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │D │ all-delete commits  │ These are artifacts   │
                    │  │                     │ produced by some      │
                    │  │                     │ older                 │
                    │  │                     │ repository-conversion │
                    │  │                     │ tools.                │
                    ├──┼─────────────────────┼───────────────────────┤
                    │H │  head (branch tip)  │                       │
                    │  │  commits            │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │O │    orphaned         │                       │
                    │  │    (parentless)     │                       │
                    │  │    commits          │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │U │ commits with        │                       │
                    │  │ callouts as parents │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │Z │   commits with no   │                       │
                    │  │   fileops           │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │M │   merge             │                       │
                    │  │   (multi-parent)    │                       │
                    │  │   commits           │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │F │ fork (multi-child)  │                       │
                    │  │ commits             │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │L │ commits with        │                       │
                    │  │ unclean multi-line  │                       │
                    │  │ comments (without a │                       │
                    │  │ separating empty    │                       │
                    │  │ line after the      │                       │
                    │  │ first)              │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │I │ commits for which   │                       │
                    │  │ metadata cannot be  │                       │
                    │  │ decoded to UTF-8    │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │T │        tags         │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │R │       resets        │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │P │     Passthrough     │ All event types       │
                    │  │                     │ simply passed         │
                    │  │                     │ through, including    │
                    │  │                     │ comments, progress    │
                    │  │                     │ commands, and         │
                    │  │                     │ checkpoint commands.  │
                    ├──┼─────────────────────┼───────────────────────┤
                    │N │     Legacy IDs      │ Any string matching a │
                    │  │                     │ cookie (legacy-ID)    │
                    │  │                     │ format.               │
                    └──┴─────────────────────┴───────────────────────┘
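
           To make a few of these type letters concrete, here is a rough
           Python sketch that classifies a toy event list. The record
           layout, the names EVENTS, type_letters and visibility, and the
           assumption that several letters select the union of their types
           are all invented for this illustration; only the letters B, C,
           O, M, Z, F, T and R are modeled.

```python
# Toy event list; the record layout is invented for this illustration.
EVENTS = [
    {"id": 1, "kind": "blob"},
    {"id": 2, "kind": "commit", "parents": [], "fileops": ["M README"]},
    {"id": 3, "kind": "commit", "parents": [2], "fileops": []},
    {"id": 4, "kind": "commit", "parents": [2], "fileops": ["M a.c"]},
    {"id": 5, "kind": "commit", "parents": [3, 4], "fileops": ["M a.c"]},
    {"id": 6, "kind": "tag", "target": 5},
]


def type_letters(event, events):
    """Return the set of type letters that apply to one toy event."""
    simple = {"blob": "B", "tag": "T", "reset": "R"}
    if event["kind"] in simple:
        return {simple[event["kind"]]}
    if event["kind"] != "commit":
        return set()  # other event kinds are not modeled here
    letters = {"C"}
    if not event["parents"]:
        letters.add("O")  # orphaned (parentless)
    if len(event["parents"]) > 1:
        letters.add("M")  # merge (multi-parent)
    if not event["fileops"]:
        letters.add("Z")  # no fileops
    children = sum(
        1 for other in events
        if other["kind"] == "commit" and event["id"] in other["parents"]
    )
    if children > 1:
        letters.add("F")  # fork (multi-child)
    return letters


def visibility(spec, events):
    """Select event ids for a spec such as '=T' (assumed union of letters)."""
    wanted = set(spec[1:])  # drop the leading '='
    return [e["id"] for e in events if type_letters(e, events) & wanted]


print(visibility("=M", EVENTS))
print(visibility("=BT", EVENTS))
```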
293
294       references
295           A reference name (bracketed by angle brackets) resolves to a single
296           object, either a commit or tag.
297
                      ┌──────────────┬────────────────────────────┐
                      │    type      │       interpretation       │
                      ├──────────────┼────────────────────────────┤
                      │  tag name    │  annotated tag with that   │
                      │              │  name                      │
                      ├──────────────┼────────────────────────────┤
                      │ branch name  │   the branch tip commit    │
                      ├──────────────┼────────────────────────────┤
                      │  legacy ID   │ commit with that legacy ID │
                      ├──────────────┼────────────────────────────┤
                      │assigned name │    name equated to a       │
                      │              │    selection by assign     │
                      └──────────────┴────────────────────────────┘
311           Note that if an annotated tag and a branch have the same name foo,
312           <foo> will resolve to the tag rather than the branch tip commit.
313
314       dates and action stamps
315           A date or action stamp in angle brackets resolves to a selection
316           set of all matching commits.
317
                ┌───────────────────────────┬───────────────────────────┐
                │           type            │      interpretation       │
                ├───────────────────────────┼───────────────────────────┤
                │    RFC3339 timestamp      │  commit or tag with that  │
                │                           │  time/date                │
                ├───────────────────────────┼───────────────────────────┤
                │    action stamp           │ commits or tags with that │
                │    (timestamp!email)      │ timestamp and author (or  │
                │                           │ committer if no author).  │
                ├───────────────────────────┼───────────────────────────┤
                │yyyy-mm-dd part of RFC3339 │ all commits and tags with │
                │timestamp                  │ that date                 │
                └───────────────────────────┴───────────────────────────┘
331           To refine the match to a single commit, use a 1-origin index suffix
332           separated by '#'. Thus "<2000-02-06T09:35:10Z>" can match multiple
333           commits, but "<2000-02-06T09:35:10Z#2>" matches only the second in
334           the set.
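
           The matching and index-suffix rules can be pictured with a small
           Python sketch. It is only a model: match_stamp is a hypothetical
           name, commits are represented as (event number, timestamp)
           pairs, action stamps are not handled, and applying the '#'
           suffix to a date-only selector is an assumption of this sketch.

```python
def match_stamp(selector, commits):
    """Return event numbers whose timestamp matches the selector.

    selector is the text between the angle brackets, for example
    '2000-02-06T09:35:10Z', '2000-02-06', or '2000-02-06T09:35:10Z#2'.
    """
    stamp, _, index = selector.partition("#")
    if "T" in stamp:
        # Full timestamp: require an exact match.
        hits = [num for num, when in commits if when == stamp]
    else:
        # Date only: compare against the yyyy-mm-dd part.
        hits = [num for num, when in commits if when[:10] == stamp]
    if index:
        position = int(index) - 1  # the suffix is 1-origin
        return hits[position:position + 1]
    return hits


COMMITS = [
    (3, "2000-02-06T09:35:10Z"),
    (7, "2000-02-06T09:35:10Z"),
    (9, "2000-02-06T18:00:00Z"),
    (12, "2000-02-07T08:00:00Z"),
]

print(match_stamp("2000-02-06T09:35:10Z", COMMITS))
print(match_stamp("2000-02-06T09:35:10Z#2", COMMITS))
```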
335
336       text search
337           A text search expression is a Python regular expression surrounded
338           by forward slashes (to embed a forward slash in it, use a Python
339           string escape such as \x2f).
340
341           A text search normally matches against the comment fields of
342           commits and annotated tags, or against their author/committer
343           names, or against the names of tags; also the text of passthrough
344           objects.
345
346           The scope of a text search can be changed with qualifier letters
347           after the trailing slash. These are as follows:
348
                          ┌───────┬───────────────────────────┐
                          │letter │      interpretation       │
                          ├───────┼───────────────────────────┤
                          │  a    │   author name in commit   │
                          ├───────┼───────────────────────────┤
                          │  b    │ branch name in commit;    │
                          │       │ also matches blobs        │
                          │       │ referenced by commits on  │
                          │       │ matching branches, and    │
                          │       │ tags which point to       │
                          │       │ commits on matching       │
                          │       │ branches.                 │
                          ├───────┼───────────────────────────┤
                          │  c    │ comment text of commit or │
                          │       │ tag                       │
                          ├───────┼───────────────────────────┤
                          │  r    │  committish reference in  │
                          │       │  tag or reset             │
                          ├───────┼───────────────────────────┤
                          │  p    │    text in passthrough    │
                          ├───────┼───────────────────────────┤
                          │  t    │       tagger in tag       │
                          ├───────┼───────────────────────────┤
                          │  n    │        name of tag        │
                          ├───────┼───────────────────────────┤
                          │  B    │       blob content        │
                          └───────┴───────────────────────────┘
376           Multiple qualifier letters can add more search scopes.
377
378           (The “b” qualifier replaces the branchset syntax in earlier
379           versions of reposurgeon.)
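
           As a rough model of how qualifier letters widen or narrow a
           search, the following Python sketch applies a regular expression
           to chosen fields of toy events. The names FIELDS and text_search
           and the event layout are hypothetical; only the a, b and c
           qualifiers are modeled, and the default scope used when no
           qualifier is given is not reproduced.

```python
import re

# Qualifier letter -> field of the toy event record (illustrative only).
FIELDS = {"a": "author", "b": "branch", "c": "comment"}


def text_search(pattern, qualifiers, events):
    """Return event numbers where the pattern matches any selected field."""
    regex = re.compile(pattern)
    hits = []
    for number, event in events:
        for letter in qualifiers:
            value = event.get(FIELDS[letter])
            if value is not None and regex.search(value):
                hits.append(number)
                break  # one matching scope is enough
    return hits


EVENTS = [
    (2, {"author": "Fred Foonly", "comment": "Fix regression in parser",
         "branch": "refs/heads/master"}),
    (3, {"author": "Jane Roe", "comment": "Add tests",
         "branch": "refs/heads/regression-tests"}),
    (4, {"author": "Jane Roe", "comment": "Update docs",
         "branch": "refs/heads/master"}),
]

print(text_search("regression", "c", EVENTS))
print(text_search("regression", "cb", EVENTS))
```

           Adding a second letter can only add matches, which is the sense
           in which multiple qualifiers "add more search scopes".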
380
381       paths
           A "path expression" enclosed in square brackets resolves to the set
           of all commits and blobs related to a path matching the given
           expression. The path expression itself is either a path literal or
           a regular expression surrounded by slashes. Immediately after the
           trailing / of a path regexp you can put any number of the following
           characters which act as flags: 'a', 'c', 'D', 'M', 'R', 'C', 'N'.
388
389           By default, a path is related to a commit if the latter has a
390           fileop that touches that file path - modifies that change it,
391           deletes that remove it, renames and copies that have it as a source
392           or target. When the 'c' flag is in use the meaning changes: the
393           paths related to a commit become all paths that would be present in
394           a checkout for that commit.
395
396           A path literal matches a commit if and only if the path literal is
397           exactly one of the paths related to the commit (no prefix or suffix
398           operation is done). In particular a path literal won't match if it
399           corresponds to a directory in the chosen repository.
400
401           A regular expression matches a commit if it matches any path
402           related to the commit anywhere in the path. You can use '^' or '$'
403           if you want the expression to only match at the beginning or end of
404           paths. When the 'a' flag is in use, the path expression selects
405           commits whose every path matches the regular expression. This is
406           not always a subset of commits selected without the 'a' flag
407           because it also selects commits with no related paths (e.g. empty
408           commits, deletealls and commits with empty trees). If you want to
409           avoid those, you can use e.g. '[/regex/] & [/regex/a]'.
410
           The flags 'D', 'M', 'R', 'C', 'N' restrict match checking to the
           corresponding fileop types. Note that this means an 'a' match is
           easier (not harder) to achieve. These are no-ops when used with
           'c'.
415
           A path literal or regular expression matches a blob if it matches
           any path that appeared in a modification fileop that referred to
           that blob. To select purely matching blobs or matching commits,
           compose a path expression with =B or =C.
420
421           If you need to embed '[^/]' into your regular expression (e.g. to
422           express "all characters but a slash") you can use a Python string
423           escape such as \x2f.
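
           The difference between a path literal, a plain regular
           expression, and a regular expression with the 'a' flag can be
           sketched in Python as follows. The helper names and the toy
           mapping from commits to related paths are hypothetical, and the
           'c' and fileop-type flags are not modeled.

```python
import re


def literal_match(path, related):
    """A literal matches only if it is exactly one of the related paths."""
    return path in related


def regex_match(pattern, related, all_paths=False):
    """Plain match: any related path matches anywhere in the path.

    With all_paths=True (modeling the 'a' flag) every related path must
    match; a commit with no related paths therefore matches trivially.
    """
    regex = re.compile(pattern)
    if all_paths:
        return all(regex.search(path) for path in related)
    return any(regex.search(path) for path in related)


# Toy commits mapped to the paths their fileops touch.
RELATED = {
    1: ["Makefile", "src/main.c"],
    2: ["src/util.c", "src/util.h"],
    3: [],  # an empty commit
}

plain = [c for c, paths in RELATED.items() if regex_match(r"^src/", paths)]
every = [c for c, paths in RELATED.items()
         if regex_match(r"^src/", paths, all_paths=True)]
both = [c for c in plain if c in every]  # models '[/regex/] & [/regex/a]'

print(plain, every, both)
```

           In this toy data the empty commit is selected only by the
           all-paths form, which is why intersecting the two forms is
           suggested when such commits are unwanted.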
424
425       function calls
426           The expression language has named special functions. The sequence
427           for a named function is “@” followed by a function name, followed
428           by an argument in parentheses. Presently the following functions
429           are defined:
430
                           ┌─────┬────────────────────────────┐
                           │name │       interpretation       │
                           ├─────┼────────────────────────────┤
                           │min  │    minimum member of a     │
                           │     │    selection set           │
                           ├─────┼────────────────────────────┤
                           │max  │    maximum member of a     │
                           │     │    selection set           │
                           ├─────┼────────────────────────────┤
                           │amp  │ nonempty selection set     │
                           │     │ becomes all objects, empty │
                           │     │ set is returned empty      │
                           ├─────┼────────────────────────────┤
                           │par  │ all parents of commits in  │
                           │     │ the argument set           │
                           ├─────┼────────────────────────────┤
                           │chn  │ all children of commits in │
                           │     │ the argument set           │
                           ├─────┼────────────────────────────┤
                           │dsc  │ all commits descended from │
                           │     │ the argument set (argument │
                           │     │ set included)              │
                           ├─────┼────────────────────────────┤
                           │anc  │ all commits from which the │
                           │     │ argument set is descended  │
                           │     │ (argument set included)    │
                           ├─────┼────────────────────────────┤
                           │pre  │ events before the argument │
                           │     │ set; empty if the argument │
                           │     │ set includes the first     │
                           │     │ event.                     │
                           ├─────┼────────────────────────────┤
                           │suc  │ events after the argument  │
                           │     │ set; empty if the argument │
                           │     │ set includes the last      │
                           │     │ event.                     │
                           ├─────┼────────────────────────────┤
                           │srt  │  sort the argument set by  │
                           │     │  event number.             │
                           └─────┴────────────────────────────┘
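
           The ancestry functions are easiest to picture on a small commit
           graph. The Python sketch below models par, chn, dsc and anc over
           a toy parent map; the data layout and the helper closure are
           invented for illustration and say nothing about how reposurgeon
           implements these functions.

```python
# Toy commit graph: commit -> list of parents (5 is a merge of 3 and 4).
PARENTS = {1: [], 2: [1], 3: [2], 4: [2], 5: [3, 4]}


def par(selection):
    """All parents of commits in the argument set."""
    return {p for commit in selection for p in PARENTS[commit]}


def chn(selection):
    """All children of commits in the argument set."""
    return {c for c, parents in PARENTS.items() if selection & set(parents)}


def closure(selection, step):
    """Repeat a one-step relation until no new commits appear."""
    result = set(selection)
    frontier = set(selection)
    while frontier:
        frontier = step(frontier) - result
        result |= frontier
    return result


def dsc(selection):
    """Commits descended from the argument set, argument set included."""
    return closure(selection, chn)


def anc(selection):
    """Commits the argument set descends from, argument set included."""
    return closure(selection, par)


print(sorted(dsc({3})), sorted(anc({3})))
```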
472
473       Set expressions may be combined with the operators | and &; these are,
474       respectively, set union and intersection. The | has lower precedence
475       than intersection, but you may use parentheses '(' and ')' to group
476       expressions in case there is ambiguity (this replaces the curly
477       brackets used in older versions of the syntax).
478
479       Any set operation may be followed by '?' to add the set members'
480       neighbors and referents. This extends the set to include the parents
481       and children of all commits in the set, and the referents of any tags
482       and resets in the set. Each blob reference in the set is replaced by
483       all commit events that refer to it. The '?' can be repeated to extend
484       the neighborhood depth. The result of a '?' extension is sorted so the
485       result is in ascending order.
486
487       Do set negation with prefix ~; it has higher precedence than & and |
488       but lower than ?
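
       The precedence of | and & happens to match that of Python's set
       operators, so the grouping rules can be tried out directly. In
       this sketch the event numbers are made up, negation is modeled as
       the complement against all events (an assumption), and the '?'
       operator is not modeled.

```python
all_events = set(range(1, 11))
tags = {8, 9, 10}
early = set(range(1, 9))
matches = {2, 9}

# Intersection binds tighter than union: matches | (early & tags).
ungrouped = matches | early & tags
# Parentheses force the union to be evaluated first.
grouped = (matches | early) & tags
# Negation modeled as everything that is not in the set.
not_tags = all_events - tags

print(sorted(ungrouped), sorted(grouped), sorted(not_tags))
```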
489
   IMPORT AND EXPORT
       reposurgeon can hold multiple repository states in core. Each has a
       name. At any given time, one may be selected for editing. Commands in
       this group import repositories, export them, and manipulate the in-core
       list and the selection.
495
496       read [--format=fossil] [directory|-|<infile]
           With a directory-name argument, this command attempts to read in
           the contents of a repository in any supported version-control
           system under that directory; read with no arguments does this in
           the current directory. If input is redirected from a plain file, it
           will be read in as a fast-import stream or Subversion dumpfile.
           With an argument of “-”, this command reads a fast-import stream or
           Subversion dumpfile from standard input (this will be useful in
           filters constructed with command-line arguments).
505
           If the content is a fast-import stream, any "cvs-revision"
           property on a commit is taken to be a newline-separated list of CVS
           revision cookies pointing to the commit, and used for reference
           lifting.

           If the content is a fast-import stream, any "legacy-id" property
           on a commit is taken to be a legacy ID token pointing to the
           commit, and used for reference lifting.
514
515           If the read location is a git repository and contains a
516           .git/cvsauthors file (such as is left in place by git cvsimport -A)
517           that file will be read in as if it had been given to the authors
518           read command.
519
520           If the read location is a directory, and its repository
521           subdirectory has a file named legacy-map, that file will be read as
522           though passed to a legacy read command.
523
           If the read location is a file and the --format=fossil option is
           used, the file is interpreted as a Fossil repository.

           The --preserve option is interpreted in a way that depends on the
           type of the incoming repository or stream. Presently it only
           affects the processing of Subversion repositories; see the section
           called “WORKING WITH SUBVERSION” for details.
531
532           The just-read-in repo is added to the list of loaded repositories
533           and becomes the current one, selected for surgery. If it was read
534           from a plain file and the file name ends with one of the extensions
535           .fi or .svn, that extension is removed from the load list name.
536
537           Note: this command does not take a selection set.
538
539       write [--legacy] [--format=fossil] [--noincremental] [--callout]
540       [>outfile|-]
541           Dump selected events as a fast-import stream representing the
542           edited repository; the default selection set is all events. Where
543           to dump to is standard output if there is no argument or the
544           argument is '-', or the target of an output redirect.
545
546           Alternatively, if there is no redirect and the argument names a
547           directory, the repository is rebuilt into that directory, with any
548           selection set being ignored; if that target directory is nonempty
549           its contents are backed up to a save directory.
550
551           If the write location is a file and the --format=fossil option is
552           used, the file is written in Fossil repository format.
553
554           With the --legacy option, the Legacy-ID of each commit is appended
555           to its commit comment at write time. This option is mainly useful
556           for debugging conversion edge cases.
557
558           If you specify a partial selection set such that some commits are
559           included but their parents are not, the output will include
560           incremental dump cookies for each branch with an origin outside the
561           selection set, just before the first reference to that branch in a
562           commit. An incremental dump cookie looks like "refs/heads/foo^0"
563           and is a clue to export-stream loaders that the branch should be
564           glued to the tip of a pre-existing branch of the same name. The
565           --noincremental option suppresses this behavior.
566
567           When you specify a partial selection set, including a commit object
568           forces the inclusion of every blob to which it refers and every tag
569           that refers to it.
570
571           Specifying a partial selection may cause a situation in which some
572           parent marks in merges don't correspond to commits present in the
573           dump. When this happens and the --callout option was specified, the
574           write code replaces the merge mark with a callout, the action stamp
575           of the parent commit; otherwise the parent mark is omitted.
576           Importers will fail when reading a stream dump with callouts; such
577           a dump is intended to be used by the graft command.
578
579           Specifying a write selection set with gaps in it is allowed but
580           unlikely to lead to good results if it is loaded by an importer.
581
582           Property extensions will be omitted from the output if the
583           importer for the preferred repository type cannot digest them.
584
585           Note: to examine small groups of commits without the progress
586           meter, use inspect.
587
588       choose [reponame]
589           Choose a named repo on which to operate. The name of a repo is
590           normally the basename of the directory or file it was loaded from,
591           but repos loaded from standard input are "unnamed".  reposurgeon
592           will add a disambiguating suffix if there have been multiple reads
593           from the same source.
594
595           With no argument, lists the names of the currently stored
596           repositories and their load times. The second column is '*' for the
597           currently selected repository, '-' for others.
598
599       drop [reponame]
600           Drop a repo named by the argument from reposurgeon's list, freeing
601           the memory used for its metadata and deleting on-disk blobs. With
602           no argument, drops the currently chosen repo.
603
604       rename reponame
605           Rename the currently chosen repo; requires an argument. Won't do it
606           if there is already one by the new name.
607
608   REBUILDS IN PLACE
609       reposurgeon can rebuild an altered repository in place. Untracked files
610       are normally saved and restored when the contents of the new repository
611       are checked out (but see the documentation of the “preserve” command for
612       a caveat).
613
614       rebuild [directory]
615           Rebuild a repository from the state held by reposurgeon. This
616           command does not take a selection set.
617
618           The single argument, if present, specifies the target directory in
619           which to do the rebuild; if the repository read was from a repo
620           directory (and not a git-import stream), it defaults to that
621           directory. If the target directory is nonempty its contents are
622           backed up to a save directory. Files and directories on the
623           repository's preserve list are copied back from the backup
624           directory after repo rebuild. The default preserve list depends on
625           the repository type, and can be displayed with the stats command.
626
627           If reposurgeon has a nonempty legacy map, it will be written to a
628           file named legacy-map in the repository subdirectory as though by a
629           legacy write command. (This will normally be the case for
630           Subversion and CVS conversions.)
631
632       preserve [file...]
633           Add (presumably untracked) files or directories to the repo's list
634           of paths to be restored from the backup directory after a rebuild.
635           Each argument, if any, is interpreted as a pathname. The current
636           preserve list is displayed afterwards.
637
638           It is only necessary to use this feature if your version-control
639           system lacks a command to list files under version control. Under
640           systems with such a command (which include git and hg), all files
641           that are neither beneath the repository dot directory nor under
642           reposurgeon temporary directories are preserved automatically.
643
644       unpreserve [file...]
645           Remove (presumably untracked) files or directories from the repo's
646           list of paths to be restored from the backup directory after a
647           rebuild. Each argument, if any, is interpreted as a pathname. The
648           current preserve list is displayed afterwards.
649
650   TIMEQUAKES AND TIMEBUMPS
651       Modifying a repository so every commit in it has a unique timestamp is
652       often a useful thing to do, so that every commit has a unique action
653       stamp that can be referred to in surgical commands.
654
655       timequake
656           Attempt to hack committer and author time stamps in the selection
657           set (defaulting to all commits in the repository) to be unique.
658           Works by identifying collisions between parent and child, then
659           incrementing child timestamps so they no longer coincide. Won't
660           touch commits with multiple parents.
661
662           Because commits are checked in ascending order, this logic will
663           normally do the right thing on chains of three or more commits with
664           identical timestamps.
665
666           Any timestamp collisions left after this operation are probably
667           cross-branch and have to be individually dealt with using
668           'timebump' commands.
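
              The collision-bumping idea can be sketched in Python. This is an
              illustrative model, not reposurgeon's actual code: the commit
              representation (dicts with "time" and "parents" keys, parents
              given as list indices) and the function name are hypothetical.
              It also assumes that a bumped child is placed one second after
              its parent's adjusted time, which is one way to make chains of
              identical stamps come out unique.

```python
def timequake(commits):
    """Bump single-parent commits whose timestamp collides with the parent's.

    `commits` is a list in ascending event order, so every parent index is
    assumed to be smaller than the index of its child.
    Returns the number of commits that were bumped.
    """
    original = [commit["time"] for commit in commits]
    bumped = 0
    for index, commit in enumerate(commits):
        parents = commit["parents"]
        if len(parents) != 1:
            continue  # roots and merge commits are left alone
        parent = parents[0]
        if original[index] == original[parent]:
            # Place the child one second after the (possibly bumped) parent.
            commit["time"] = commits[parent]["time"] + 1
            bumped += 1
    return bumped


chain = [
    {"time": 100, "parents": []},
    {"time": 100, "parents": [0]},
    {"time": 100, "parents": [1]},
]
timequake(chain)
print([commit["time"] for commit in chain])
```

              Because the scan runs in ascending order, each commit in a chain
              sees its parent's already-adjusted stamp, which is why a run of
              three identical timestamps ends up strictly increasing.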
669
670       timebump [seconds]
671           Bump the committer and author timestamps of commits in the
672           selection set (defaulting to empty) by one second. With a following
673           integer argument, bump by that many seconds. Argument may be negative.
674
675       Those of you twitchy about "rewriting history" should bear in mind that
676       the commit stamps in many older repositories were never very reliable
677       to begin with.
678
679       CVS in particular is notorious for shipping client-side timestamps with
680       timezone and DST issues (as opposed to UTC) that don't necessarily
681       compare well with stamps from different clients of the same CVS server.
682       Thus, inducing a timequake in a CVS repo seldom produces effects
683       anywhere near as large as the measurement noise of the repository's
684       own timestamps.
685
686       Subversion was somewhat better about this, as commits were stamped at
687       the server, but older Subversion repositories often have sections that
688       predate the era of ubiquitous NTP time.
689
690   INFORMATION AND REPORTS
691       Commands in this group report information about the selected
692       repository.
693
694       The output of these commands can individually be redirected to a named
695       output file. Where indicated in the syntax, you can prefix the output
696       filename with “>” and give it as a following argument. If you use “>>”
697       the file is opened for append rather than write.
698
699       list [>outfile]
700           This is the main command for identifying the events you want to
701           modify. It lists commits in the selection set by event sequence
702           number with summary information. The first column is raw event
703           numbers, the second a timestamp in local time. If the repository
704           has legacy IDs, they will be displayed in the third column. The
705           leading portion of the comment follows.
706
707       stamp [>outfile]
708           Alternative form of listing that displays full action stamps,
709           usable as references in selections. Supports > redirection.
710
711       tip [>outfile]
712           Display the branch tip names associated with commits in the
713           selection set. These will not necessarily be the same as their
714           branch fields (which will often be tag names if the repo contains
715           either annotated or lightweight tags).
716
717           If a commit is at a branch tip, its tip is its branch name. If it
718           has only one child, its tip is the child's tip. If it has multiple
719           children, then if there is a child with a matching branch name its
720           tip is the child's tip. Otherwise this function throws a
721           recoverable error.
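
              The tip-resolution rules above can be modeled as a short Python
              sketch. The data layout (a dict of child lists and a dict of
              branch names), the function name, and the Recoverable exception
              class are hypothetical, and "at a branch tip" is assumed to mean
              "has no children".

```python
class Recoverable(Exception):
    """Stand-in for a recoverable error."""


def tip(commit, children, branch):
    """Follow child links from `commit` until a branch tip is reached.

    children: dict mapping a commit id to a list of child commit ids.
    branch:   dict mapping a commit id to its branch field.
    """
    while True:
        kids = children.get(commit, [])
        if not kids:
            return branch[commit]          # at a tip: use its branch name
        if len(kids) == 1:
            commit = kids[0]               # single child: inherit its tip
            continue
        matching = [kid for kid in kids if branch[kid] == branch[commit]]
        if not matching:
            raise Recoverable("cannot resolve tip of %s" % commit)
        commit = matching[0]               # child with the same branch name


children = {"a": ["b"], "b": ["c", "d"], "c": [], "d": []}
branch = {
    "a": "refs/heads/master",
    "b": "refs/heads/master",
    "c": "refs/heads/master",
    "d": "refs/heads/dev",
}
print(tip("a", children, branch))
```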
722
723       tags [>outfile]
724           Display tags and resets: three fields, an event number and a type
725           and a name. Branch tip commits associated with tags are also
726           displayed with the type field 'commit'. Supports > redirection.
727
728       stats [repo-name...] [>outfile]
729           Report size statistics and import/export method information about
730           named repositories, or with no argument the currently chosen
731           repository.
732
733       count [>outfile]
734           Report a count of items in the selection set. Default set is
735           everything in the currently-selected repo. Supports > redirection.
736
737       inspect [>outfile]
738           Dump a fast-import stream representing selected events to standard
739           output. Just like a write, except (1) the progress meter is
740           disabled, and (2) there is an identifying header before each event
741           dump.
742
743       graph [>outfile]
744           Emit a visualization of the commit graph in the DOT markup language
745           used by the graphviz tool suite. This can be fed as input to the
746           main graphviz rendering program dot(1), which will yield a viewable
747           image. Supports > redirection.
748
749           You may find a script like this useful:
750
751               graph $1 >/tmp/foo$$
752               shell dot </tmp/foo$$ -Tpng | display -; rm /tmp/foo$$
753
754           You can substitute in your own preferred image viewer, of course.
755
756       sizes [>outfile]
757           Print a report on data volume per branch; takes a selection set,
758           defaulting to all events. The numbers tally the size of
759           uncompressed blobs, commit and tag comments, and other metadata
760           strings (a blob is counted each time a commit points at it).
761
762           The numbers are not an exact measure of storage size: they are
763           intended mainly as a way to get information on how to efficiently
764           partition a repository that has become large enough to be unwieldy.
765
766           Supports > redirection.
767
768       lint [>outfile]
769           Look for DAG and metadata configurations that may indicate a
770           problem. Presently checks for: (1) mid-branch deletes, (2)
771           disconnected commits, (3) parentless commits, (4) the existence of
772           multiple roots, (5) committer and author IDs that don't look
773           well-formed as DVCS IDs, (6) multiple child links with identical
774           branch labels descending from the same commit, (7) time and
775           action-stamp collisions.
776
777           Options to issue only partial reports are supported; "lint
778           --options" or "lint -?" lists them.
779
780           The options and output format of this command are unstable; they
781           may change without notice as more sanity checks are added.
782
783       when timespec
784           Interconvert between git timestamps (integer Unix time plus TZ) and
785           RFC3339 format. Takes one argument, autodetects the format. Useful
786           when eyeballing export streams. Also accepts any other supported
787           date format and converts to RFC3339.
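
              As a rough illustration of the interconversion, the following
              Python sketch autodetects which of the two formats it was given.
              It is not reposurgeon's implementation: the function name is
              borrowed from the command for readability, only the UTC ("Z")
              form of RFC3339 is handled, the git zone offset is ignored when
              rendering, and the other supported date formats are not
              attempted.

```python
import re
from datetime import datetime, timezone

# A git timestamp: integer Unix time, a space, and a +hhmm/-hhmm zone offset.
GIT_STAMP = re.compile(r"^(\d+) [+-]\d{4}$")
RFC3339_UTC = "%Y-%m-%dT%H:%M:%SZ"


def when(timespec):
    """Convert a git timestamp to RFC3339 (UTC), or the reverse."""
    match = GIT_STAMP.match(timespec)
    if match:
        moment = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
        return moment.strftime(RFC3339_UTC)
    # Otherwise assume an RFC3339 stamp in UTC; strptime raises ValueError
    # for anything else.
    moment = datetime.strptime(timespec, RFC3339_UTC)
    moment = moment.replace(tzinfo=timezone.utc)
    return "%d +0000" % int(moment.timestamp())


print(when("1000000000 -0500"))       # 2001-09-09T01:46:40Z
print(when("1970-01-01T00:00:00Z"))   # 0 +0000
```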
788
789   SURGICAL OPERATIONS
790       These are the operations the rest of reposurgeon is designed to
791       support.
792
793       squash [policy...]
794           Combine or delete commits in a selection set of events. The default
795           selection set for this command is empty. Has no effect on events
796           other than commits unless the --delete policy is selected; see the
797           'delete' command for discussion.
798
799           Normally, when a commit is squashed, its file operation list (and
800           any associated blob references) gets either prepended to the
801           beginning of the operation list of each of the commit's children or
802           appended to the operation list of each of the commit's parents.
803           Then children of a deleted commit get it removed from their parent
804           set and its parents added to their parent set.
805
806           The analogous operation is performed on commit comments, so no
807           comment text is ever outright discarded. Exception: comments
808           consisting of "*** empty log message ***", as generated by CVS,
809           are ignored.
810
811           The default is to squash forward, modifying children; but see the
812           list of policy modifiers below for how to change this.
813
814               Warning
815               It is easy to get the bounds of a squash command wrong, with
816               confusing and destructive results. Beware thinking you can
817               squash on a selection set to merge all commits except the last
818               one into the last one; what you will actually do is to merge
819               all of them to the first commit after the selected set.
820           Normally, any tag pointing to a combined commit will also be pushed
821           forward. But see the list of policy modifiers below for how to
822           change this.
823
824           Following all operation moves, every one of the altered file
825           operation lists is reduced to a shortest normalized form. The
826           normalized form detects various combinations of modification,
827           deletion, and renaming and simplifies the operation sequence as
828           much as it can without losing any information.
829
830           After canonicalization, a file op list may still end up containing
831           multiple M operations on the same file. Normally the tool utters a
832           warning when this occurs but does not try to resolve it.
833
834           The following modifiers change these policies:
835
836           --delete
837               Simply discards all file ops and tags associated with deleted
838               commit(s).
839
840           --coalesce
841               Discard all M operations (and associated blobs) except the
842               last.
843
844           --pushback
845               Append fileops to parents, rather than prepending to children.
846
847           --pushforward
848               Prepend fileops to children. This is the default; it can be
849               specified in a lift script for explicitness about intentions.
850
851           --tagforward
852               With the "--tagforward" modifier, any tag on the deleted commit
853               is pushed forward to the first child rather than being deleted.
854               This is the default; it can be specified for explicitness.
855
856           --tagback
857               With the "--tagback" modifier, any tag on the deleted commit is
858               pushed backward to the first parent rather than being deleted.
859
860           --quiet
861               Suppresses warning messages about deletion of commits with
862               non-delete fileops.
863
864           --complain
865               The opposite of quiet. Can be specified for explicitness.
866
867           --empty-only
868               Complain if a squash operation modifies a nonempty comment.
869
870           Under any of these policies except “--delete”, deleting a commit
871           that has children does not back out the changes made by that
872           commit, as they will still be present in the blobs attached to
873           versions past the end of the deletion set. All a delete does when
874           the commit has children is lose the metadata information about when
875           and by whom those changes were actually made; after the delete any
876           such changes will be attributed to the first undeleted children of
877           the deleted commits. It is expected that this command will be
878           useful mainly for removing commits mechanically generated by
879           repository converters such as cvs2svn.
880
881       delete [policy...]
882           Delete a selection set of events. The default selection set for
883           this command is empty. On a set of commits, this is equivalent to a
884           squash with the --delete flag. It unconditionally deletes tags,
885           resets, and passthroughs; blobs can be removed only as a side
886           effect of deleting every commit that points at them.
887
888       divide parent [child]
889           Attempt to partition a repo by cutting the parent-child link
890           between two specified commits (they must be adjacent). Does not
891           take a general selection set. It is only necessary to specify the
892           parent commit, unless it has multiple children in which case the
893           child commit must follow (separate it with a comma).
894
895           If the repo was named 'foo', you will normally end up with two
896           repos named 'foo-early' and 'foo-late' (option and feature events
897           at the beginning of the early segment will be duplicated onto the
898           beginning of the late one). But if the commit graph would remain
899           connected through another path after the cut, the behavior changes.
900           In this case, if the parent and child were on the same branch
901           'qux', the branch segments are renamed 'qux-early' and 'qux-late'
902           but the repo is not divided.
903
904       expunge [--notagify] [path | /regexp/]...
905           Expunge files from the selected portion of the repo history; the
906           default is the entire history. The arguments to this command may be
907           paths or Python regular expressions matching paths (regexps must be
908           marked by being surrounded with //).
909
910           All filemodify (M) operations and delete (D) operations involving a
911           matched file in the selected set of events are disconnected from
912           the repo and put in a removal set. Renames are followed as the tool
913           walks forward in the selection set; each triggers a warning
914           message. If a selected file is a copy (C) target, the copy will be
915           deleted and a warning message issued. If a selected file is a copy
916           source, the copy target will be added to the list of paths to be
917           deleted and a warning issued.
918
919           After file expunges have been performed, any commits with no
920           remaining file operations will be removed, and any tags pointing to
921           them. By default each deleted commit is replaced with a tag of the
922           form 'emptycommit-ident' on the preceding commit unless
923           “--notagify” is specified as an argument. Commits with deleted
924           fileops pointing both in and outside the path set are not deleted,
925           but are cloned into the removal set.
926
927           The removal set is not discarded. It is assembled into a new
928           repository named after the old one with the suffix "-expunges"
929           added. Thus, this command can be used to carve a repository into
930           sections by file path matches.
931
932       tagify [--canonicalize] [--tipdeletes] [--tagify-merges]
933           Search for empty commits and turn them into tags. Takes an optional
934           selection set argument defaulting to all commits. For each commit
935           in the selection set, turn it into a tag with the same message and
936           author information if it has no fileops. By default merge commits
937           are not considered, even if they have no fileops (thus no tree
938           differences with their first parent). To change that, use the
939           --tagify-merges option.
940
941           The name of the generated tag will be 'emptycommit-ident', where
942           ident is generated from the legacy ID of the deleted commit, or
943           from its mark, or from its index in the repository, with a
944           disambiguation suffix if needed.
945
946           With the --canonicalize option, tagify tries harder to detect trivial
947           commits by first ensuring that all fileops of selected commits will
948           have an actual effect when processed by fast-import.
949
950           With the --tipdeletes option, tagify also considers branch tips
951           with only deleteall fileops to be candidates for tagification. The
952           corresponding tags get names of the form 'tipdelete-branchname'
953           rather than the default 'emptycommit-ident'.
954
955           With the --tagify-merges option, tagify also tagifies merge commits
956           that have no fileops. When this is done the merge link is moved to
957           the tagified commit's parent.
958
959       coalesce [--debug|--changelog] [timefuzz]
960           Scan the selection set for runs of commits with identical comments
961           close to each other in time (this is a common form of scar tissue
962           in repository up-conversions from older file-oriented
963           version-control systems). Merge these cliques by deleting all but
964           the last commit, in order; fileops from the deleted commits are
965           pushed forward to that last one.
966
967           The optional argument, if present, is a maximum time
968           separation in seconds; the default is 90 seconds.
969
970           The default selection set for this command is =C, all commits.
971           Occasionally you may want to restrict it, for example to avoid
972           coalescing unrelated cliques of "*** empty log message ***" commits
973           from CVS lifts.
974
975           With the --debug option, show messages about mismatches.
976
977           With the --changelog option, any commit with a comment containing
978           the string 'empty log message' (such as is generated by CVS) and
979           containing exactly one file operation modifying a path ending in
980           ChangeLog is treated specially. Such ChangeLog commits are
981           considered to match any commit before them by content, and will
982           coalesce with it if the committer matches and the commit separation
983           is small enough. This option handles a convention used by Free
984           Software Foundation projects.
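
              The basic clique detection (leaving aside --changelog) might look
              like the Python sketch below. It is a simplified model under
              stated assumptions: commits are (timestamp, comment) pairs in
              event order, "close in time" is taken to mean that each commit
              lies within timefuzz seconds of the previous one in the run, and
              committer matching is not modeled. The function name is
              hypothetical.

```python
def find_cliques(commits, timefuzz=90):
    """Return runs of adjacent commits with identical comments.

    commits: list of (timestamp, comment) tuples in event order.
    Each returned clique is a list of indices into `commits`; only runs
    of two or more commits are reported.
    """
    cliques = []
    run = [0] if commits else []
    for index in range(1, len(commits)):
        prev_time, prev_comment = commits[index - 1]
        time, comment = commits[index]
        if comment == prev_comment and abs(time - prev_time) <= timefuzz:
            run.append(index)
        else:
            if len(run) > 1:
                cliques.append(run)
            run = [index]
    if len(run) > 1:
        cliques.append(run)
    return cliques


history = [
    (0, "fix"), (30, "fix"), (100, "fix"),
    (500, "fix"), (510, "other"), (520, "other"),
]
print(find_cliques(history))   # [[0, 1, 2], [4, 5]]
```

              In each clique, every commit but the last would then be deleted
              and its fileops pushed forward to the last one.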
985
986       split {at|by} item
987           The first argument is required to be a commit location; the second
988           is a preposition which indicates which splitting method to use. If
989           the preposition is 'at', then the third argument must be an integer
990           1-origin index of a file operation within the commit. If it is
991           'by', then the third argument must be a pathname to be
992           prefix-matched (the pathname match is done first).
993
994           The commit is copied and inserted into a new position in the event
995           sequence, immediately following itself; the duplicate becomes the
996           child of the original, and replaces it as parent of the original's
997           children. Commit metadata is duplicated; the new commit then gets a
998           new mark. If the new commit has a legacy ID, the suffix '.split' is
999           appended to it.
1000
1001           Finally, some file operations - starting at the one matched or
1002           indexed by the split argument - are moved forward from the original
1003           commit into the new one. Legal indices are 2-n, where n is the
1004           number of file operations in the original commit.
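
              For the 'at' form, the effect on the file operation list can be
              sketched as follows. This Python fragment is illustrative only:
              the function name and the representation of fileops as strings
              are hypothetical, and the 'by' form is not modeled.

```python
def split_at(fileops, index):
    """Partition a commit's fileops at a 1-origin index.

    Returns (kept, moved): the ops that stay in the original commit and
    the ops moved forward into the new child commit.
    """
    count = len(fileops)
    if not 2 <= index <= count:
        raise ValueError("legal indices are 2-%d, got %d" % (count, index))
    return fileops[:index - 1], fileops[index - 1:]


ops = ["M 100644 :1 README", "M 100644 :2 Makefile", "D old.c"]
kept, moved = split_at(ops, 2)
print(kept)    # ['M 100644 :1 README']
print(moved)   # ['M 100644 :2 Makefile', 'D old.c']
```

              Index 1 is rejected because it would leave the original commit
              with no file operations at all.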
1005
1006       add {D path | M perm mark path | R source target | C source target}
1007           To a specified commit, add a specified fileop.
1008
1009           For a D operation to be valid there must be an M operation for the
1010           path in the commit's ancestry. For an M operation to be valid, the
1011           'perm' part must be a token ending with 755 or 644 and the 'mark'
1012           must refer to a blob that precedes the commit location. For an R or
1013           C operation to be valid, there must be an M operation for the
1014           source in the commit's ancestry.
1015
1016       remove [index | path | deletes] [to commit]
1017           From a specified commit, remove a specified fileop. The op must be
1018           one of (a) the keyword “deletes”, (b) a file path, (c) a file path
1019           preceded by an op type set (some subset of the letters DMRCN), or
1020           (d) a 1-origin numeric index. The “deletes” keyword selects all D
1021           fileops in the commit; the others select one each.
1022
1023           If the “to” clause is present, the removed op is appended to the
1024           commit specified by the following singleton selection set. This
1025           option cannot be combined with “deletes”.
1026
1027           Note that this command does not attempt to scavenge blobs even if
1028           the deleted fileop might be the only reference to them. This
1029           behavior may change in a future release.
1030
1031       blob
1032           Create a blob at mark :1 after renumbering other marks starting
1033           from :2. Data is taken from stdin, which may be a here-doc. This
1034           can be used with the add command to patch synthetic data into a
1035           repository.
1036
1037       renumber
1038           Renumber the marks in a repository, from :1 up to :<n> where <n> is
1039           the count of the last mark. Just in case an importer ever cares
1040           about mark ordering or gaps in the sequence.
1041
1042           A side effect of this command is to clean up stray "done"
1043           passthroughs that may have entered the repository via graft
1044           operations. After a renumber, the repository will have at most one
1045           "done" and it will be at the end of the events.
1046
1047       dedup
1048           Deduplicate blobs in the selection set. If multiple blobs in the
1049           selection set have the same SHA1, throw away all but the first, and
1050           change fileops referencing them to instead reference the (kept)
1051           first blob.
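
              A minimal Python sketch of this deduplication follows. The data
              model is hypothetical (blobs as a dict from mark to bytes in
              event order, fileops as dicts carrying a "mark" key), and the
              function name simply echoes the command.

```python
import hashlib


def dedup(blobs, fileops):
    """Drop blobs whose SHA1 duplicates an earlier blob.

    blobs:   dict mapping mark -> content bytes, in event order.
    fileops: list of dicts, each with a "mark" key naming a blob.
    Returns a dict mapping each dropped mark to the mark that was kept.
    """
    first_by_digest = {}
    replaced = {}
    for mark, data in list(blobs.items()):
        digest = hashlib.sha1(data).hexdigest()
        if digest in first_by_digest:
            replaced[mark] = first_by_digest[digest]
            del blobs[mark]                 # throw away the later duplicate
        else:
            first_by_digest[digest] = mark
    for op in fileops:
        # Point references to dropped blobs at the kept first blob.
        op["mark"] = replaced.get(op["mark"], op["mark"])
    return replaced


blobs = {":1": b"hello\n", ":2": b"world\n", ":3": b"hello\n"}
fileops = [{"path": "a.txt", "mark": ":1"}, {"path": "b.txt", "mark": ":3"}]
print(dedup(blobs, fileops))   # {':3': ':1'}
```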
1052
1053       msgout [>outfile]
1054           Emit a file of messages in RFC2822 format representing the contents
1055           of repository metadata. Takes a selection set; members of the set
1056           other than commits, annotated tags, and passthroughs are ignored
1057           (that is, presently, blobs and resets).
1058
1059           The output from this command can optionally be redirected to a
1060           named output file. Prefix the filename with “>” and give it as a
1061           following argument.
1062
1063           May have an option --filter, followed by = and a /-enclosed regular
1064           expression. If this is given, only headers with names matching it
1065           are emitted. In this context the name of the header includes its
1066           trailing colon.
1067
1068       msgin [--create] [--empty-only] [<infile] [--changed >outfile]
1069           Accept a file of messages in RFC2822 format representing the
1070           contents of the metadata in selected commits and annotated tags.
1071           Takes no selection set. If there is an argument it will be taken as
1072           the name of a message file to read from; if there is no argument, or
1073           the argument is '-', reads from standard input. Supports < redirection.
1074
1075           Users should be aware that modifying an Event-Number or Event-Mark
1076           field will change which event the update from that message is
1077           applied to. This is unlikely to have good results.
1078
1079           The header CheckText, if present, is examined to see if the comment
1080           text of the associated event begins with it. If not, the item
1081           modification is aborted. This helps ensure that you are landing
1082           updates on the events you intend.
1083
1084           If the “--create” modifier is present, new tags and commits will be
1085           appended to the repository. In this case it is an error for a tag
1086           name to match any existing tag name. Commit objects are created with
1087           no fileops. If Committer-Date or Tagger-Date fields are not present
1088           they are filled in with the time at which this command is executed.
1089           If Committer or Tagger fields are not present, reposurgeon will
1090           attempt to deduce the user's git-style identity and fill it in. If
1091           a singleton commit set was specified for commit creations, the new
1092           commits are made children of that commit.
1093
1094           Otherwise, if the Event-Number and Event-Mark fields are absent,
1095           the msgin logic will attempt to match the commit or tag first by
1096           Legacy-ID, then by a unique committer ID and timestamp pair.
1097
1098           If output is redirected and the modifier “--changed” appears, a
1099           minimal set of modifications actually made is written to the output
1100           file in a form that can be fed back in. Supports > redirection.
1101
1102           If the option “--empty-only” is given, this command will throw a
1103           recoverable error if it tries to alter a message body that is
1104           neither empty nor consists of the CVS empty-comment marker.
1105
1106       setfield attribute value
1107           In the selected objects (defaulting to none) set every instance of
1108           a named field to a string value. The string may be quoted to
1109           include whitespace, and use backslash escapes interpreted by the
1110           Python string-escape codec, such as \n and \t.
1111
1112           Attempts to set nonexistent attributes are ignored. Valid values
1113           for the attribute are internal Python field names; in particular,
1114           for commits, “comment” and “branch” are legal. Consult the source
1115           code for other interesting values.
1116
1117           The special fieldnames 'author', 'commitdate' and 'authdate' apply
1118           only to commits in the range. The latter two set attribution
1119           dates. The former sets the author's name and email address
1120           (assuming the value can be parsed for both), copying the committer
1121           timestamp. The author's timezone may be deduced from the email
1122           address.
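
               The backslash-escape handling described for setfield values can
               be approximated in current Python. This is only a sketch: the
               string-escape codec belongs to Python 2, so the example assumes
               the Python 3 unicode_escape codec as the nearest equivalent, and
               decode_escapes is a hypothetical helper name, not part of
               reposurgeon.

```python
import codecs


def decode_escapes(value: str) -> str:
    """Interpret backslash escapes such as \\n and \\t in an argument string."""
    # unicode_escape decodes bytes, so this sketch assumes ASCII-only input.
    return codecs.decode(value.encode("ascii"), "unicode_escape")


# A raw string stands in for what the user typed on the command line.
print(repr(decode_escapes(r"First line\n\tindented second line")))
```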
1123
1124       setperm 100644|100755|120000 path...
1125           For the selected objects (defaulting to none) take the first
1126           argument as an octal literal describing permissions. All subsequent
1127           arguments are paths. For each M fileop in the selection set and
1128           exactly matching one of the paths, patch the permission field to
1129           the first argument value.
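
               The patching rule for setperm can be pictured as follows. The
               tuple layout (op, mode, ref, path) and the function name are
               assumptions made for illustration; reposurgeon's internal
               representation of fileops may differ.

```python
VALID_MODES = {"100644", "100755", "120000"}


def set_permissions(fileops, mode, paths):
    """Return a copy of fileops with the mode of matching M ops replaced.

    Each fileop is modeled as a tuple (op, mode, ref, path); this layout
    is hypothetical and chosen only to keep the sketch short.
    """
    if mode not in VALID_MODES:
        raise ValueError("unsupported mode: " + mode)
    wanted = set(paths)
    patched = []
    for op, old_mode, ref, path in fileops:
        # Only M fileops whose path matches exactly are touched.
        if op == "M" and path in wanted:
            patched.append((op, mode, ref, path))
        else:
            patched.append((op, old_mode, ref, path))
    return patched


ops = [("M", "100644", ":1", "bin/run.sh"), ("M", "100644", ":2", "README")]
print(set_permissions(ops, "100755", ["bin/run.sh"]))
```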
1130
1131       append [--rstrip] [>text]
1132           Append text to the comments of commits and tags in the specified
1133           selection set. The text is the first token of the command and may
1134           be a quoted string. C-style escape sequences in the string are
1135           interpreted using Python's string-escape codec.
1136
1137           If the option --rstrip is given, the comment is right-stripped
1138           before the new text is appended.
1139
1140       filter [--shell|--regex|--replace|--dedos]
1141           Run blobs, commit comments, or tag comments in the selection set
1142           through the filter specified on the command line.
1143
1144           In any mode other than --dedos, attempting to specify a selection
1145           set including both blobs and non-blobs (that is, commits or tags)
1146           throws an error. Inline content in commits is filtered when the
1147           selection set contains (only) blobs and the commit is within the
1148           range bounded by the earliest and latest blob in the specification.
1149
1150           When filtering blobs, if the command line contains the magic cookie
1151           '%PATHS%' it is replaced with a space-separated list of all paths
1152           that reference the blob.
1153
1154           With --shell, the remainder of the line specifies a filter as a
1155           shell command. Each blob or comment is presented to the filter on
1156           standard input; the content is replaced with whatever the filter
1157           emits to standard output.
1158
1159           With --regex, the remainder of the line is expected to be a Python
1160           regular expression substitution written as /from/to/, with from and
1161           to passed as arguments to the standard re.sub() function, which is
1162           applied to modify the content. Any non-space character will work as
1163           a delimiter in place of the /; this makes it easier to use / in
1164           patterns. Ordinarily only the first such substitution is performed;
1165           putting 'g' after the final delimiter replaces globally, and a
1166           numeric literal gives the maximum number of substitutions to
1167           perform. Other flags available restrict substitution scope - 'c'
1168           for comment text only, 'C' for committer name only, 'a' for author
1169           names only. Note that parsing of a --regex argument will be
1170           confused by any substring consisting of whitespace followed by #;
1171           use "\s" rather than whitespace to avoid this.
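
               The substitution syntax described for --regex maps onto
               re.sub() roughly as sketched here. The helper names are
               hypothetical, escaped delimiters are not handled, and the
               scope flags ('c', 'C', 'a') are ignored; treat this as an
               illustration of the count semantics, not as reposurgeon's
               parser.

```python
import re


def parse_substitution(spec):
    """Split a /from/to/flags spec into (pattern, replacement, count).

    count follows the re.sub() convention: 0 means replace every match.
    """
    delim = spec[0]                 # any non-space character may delimit
    parts = spec[1:].split(delim)
    if len(parts) != 3:
        raise ValueError("expected <d>from<d>to<d>[flags]")
    pattern, replacement, flags = parts
    digits = "".join(ch for ch in flags if ch.isdigit())
    if "g" in flags:
        count = 0                   # global replacement
    elif digits:
        count = int(digits)         # explicit maximum
    else:
        count = 1                   # default: first match only
    return pattern, replacement, count


def apply_substitution(spec, text):
    pattern, replacement, count = parse_substitution(spec)
    return re.sub(pattern, replacement, text, count=count)


print(apply_substitution("/foo/bar/", "foo foo foo"))
print(apply_substitution("#a/b#c#g", "a/b a/b"))
```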
1172
1173           With --replace, the behavior is like --regex but the expressions
1174           are not interpreted as regular expressions. (This is slightly
1175           faster.)
1176
1177           With --dedos, DOS/Windows-style \r\n line terminators are replaced
1178           with \n.
1179
1180       transcode codec
1181           Transcode blobs, commit comments and committer/author names, or tag
1182           comments and tag committer names in the selection set to UTF-8 from
1183           the character encoding specified on the command line.
1184
1185           Attempting to specify a selection set including both blobs and
1186           non-blobs (that is, commits or tags) throws an error. Inline
1187           content in commits is filtered when the selection set contains
1188           (only) blobs and the commit is within the range bounded by the
1189           earliest and latest blob in the specification.
1190
1191           The encoding argument must name one of the codecs known to the
1192           Python standard codecs library. In particular, 'latin1' is a valid
1193           codec name.
1194
1195           Errors in this command are fatal, because an error may leave
1196           repository objects in a damaged state.
1197
1198           The theory behind the design of this command is that the repository
1199           might contain a mixture of encodings used to enter commit metadata
1200           by different people at different times. After using =I to identify
1201           metadata containing non-Unicode high bytes in text, a human must
1202           use context to identify which particular encodings were used in
1203           particular event spans and compose appropriate transcode commands
1204           to fix them up.
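
               The workflow above (find text with undecodable high bytes,
               then transcode it from a codec chosen by a human) can be
               sketched with the standard codecs machinery. The function
               names are hypothetical, and the test for "needs transcoding"
               is an assumption: it simply checks whether the bytes are
               already valid UTF-8.

```python
import codecs


def needs_transcoding(raw: bytes) -> bool:
    """True if raw contains bytes that do not decode as UTF-8."""
    try:
        raw.decode("utf-8")
        return False
    except UnicodeDecodeError:
        return True


def transcode(raw: bytes, codec: str) -> bytes:
    """Re-encode raw from the named codec to UTF-8."""
    codecs.lookup(codec)            # raises LookupError for an unknown codec
    return raw.decode(codec).encode("utf-8")


comment = "Zoë fixed the parser".encode("latin1")
if needs_transcoding(comment):
    comment = transcode(comment, "latin1")
print(comment.decode("utf-8"))
```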
1205
1206       edit
1207           Report the selection set of events to a tempfile as msgout does,
1208           call an editor on it, and update from the result as msgin does. If
1209           you do not specify an editor name as second argument, it will be
1210           taken from the $EDITOR variable in your environment. If $EDITOR is
1211           not set, /usr/bin/editor will be used as a fallback if it exists as
1212           a symlink to your default editor, as is the case on Debian, Ubuntu
1213           and their derivatives.
1214
1215           Normally this command ignores blobs because msgout does. However,
1216           if you specify a selection set consisting of a single blob, your
1217           editor will be called directly on the blob file.
1218
1219           Supports < and > redirection.
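
               The editor lookup order described for edit (explicit argument,
               then $EDITOR, then /usr/bin/editor if present) can be sketched
               as below. choose_editor and its parameters are hypothetical
               names used only for this illustration.

```python
import os


def choose_editor(explicit=None, environ=None, fallback="/usr/bin/editor"):
    """Pick an editor: explicit name, then $EDITOR, then the fallback path."""
    environ = os.environ if environ is None else environ
    if explicit:
        return explicit
    if environ.get("EDITOR"):
        return environ["EDITOR"]
    if os.path.exists(fallback):
        return fallback
    return None                      # nothing usable was found


print(choose_editor(environ={"EDITOR": "vi"}))
```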
1220
1221       timeoffset offset [timezone]
1222           Apply a time offset to all time/date stamps in the selected set. An
1223           offset argument is required; it may be in the form [+-]ss,
1224           [+-]mm:ss or [+-]hh:mm:ss. The leading sign is required to
1225           distinguish it from a selection expression.
1226
1227           Optionally you may also specify another argument in the form
1228           [+-]hhmm, a timezone literal to apply. To apply a timezone without
1229           an offset, use an offset literal of +0 or -0.
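
               A minimal sketch of how the offset and timezone literals
               accepted by timeoffset could be parsed. The function names
               are assumptions; only the literal forms come from the text
               above.

```python
import re


def parse_offset(literal):
    """Convert [+-]ss, [+-]mm:ss or [+-]hh:mm:ss to signed seconds."""
    match = re.fullmatch(r"([+-])(\d+(?::\d+){0,2})", literal)
    if not match:
        raise ValueError("offset needs a leading sign: " + literal)
    sign = -1 if match.group(1) == "-" else 1
    seconds = 0
    for field in match.group(2).split(":"):
        seconds = seconds * 60 + int(field)
    return sign * seconds


def parse_timezone(literal):
    """Validate a [+-]hhmm timezone literal and return it unchanged."""
    if not re.fullmatch(r"[+-]\d{4}", literal):
        raise ValueError("timezone must look like +hhmm or -hhmm")
    return literal


print(parse_offset("+1:00:00"), parse_offset("-1:30"), parse_timezone("+0530"))
```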
1230
1231       unite [--prune] reponame...
1232           Unite repositories. Name any number of loaded repositories; they
1233           will be united into one union repo and removed from the load list.
1234           The union repo will be selected.
1235
1236           The root of each repo (other than the oldest repo) will be grafted
1237           as a child to the last commit in the dump with a preceding commit
1238           date. This will produce a union repository with one branch for each
1239           part. Running last to first, duplicate tag and branch names will be
1240           disambiguated using the source repository name (thus, recent
1241           duplicates will get priority over older ones). After all grafts,
1242           marks will be renumbered.
1243
1244           The name of the new repo will be the names of all parts
1245           concatenated, separated by '+'. It will have no source directory or
1246           preferred system type.
1247
1248           With the option --prune, at each join D operations for every
1249           ancestral file existing will be prepended to the root commit, then
1250           it will be canonicalized using the rules for squashing; the effect
1251           will be that only files with properly matching M, R, and C
1252           operations in the root survive.
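
               Two of the rules given for unite lend themselves to a short
               sketch: the union name, and the choice of graft point as the
               last commit with an earlier date. The data layout (a
               date-sorted list of (date, ident) tuples), the function names,
               and the reading of "preceding" as strictly earlier are all
               assumptions.

```python
from bisect import bisect_left


def union_name(names):
    """Name of the union repo: all part names joined with '+'."""
    return "+".join(names)


def graft_parent(commits, root_date):
    """Return the ident of the last commit dated before root_date, or None.

    commits is a list of (date, ident) tuples sorted by date.
    """
    dates = [date for date, _ in commits]
    index = bisect_left(dates, root_date)   # first commit not earlier
    return commits[index - 1][1] if index > 0 else None


older = [(100, "a"), (200, "b"), (300, "c")]
print(union_name(["foo", "bar"]), graft_parent(older, 250))
```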
1253
1254       graft [--prune] reponame
1255           For when unite doesn't give you enough control. This command may
1256           have either of two forms, selected by the size of the selection
1257           set. The first argument is always required to be the name of a
1258           loaded repo.
1259
1260           If the selection set is of size 1, it must identify a single commit
1261           in the currently chosen repo; in this case the named repo's root
1262           will become a child of the specified commit. If the selection set
1263           is empty, the named repo must contain one or more callouts matching
1264           commits in the currently chosen repo.
1265
1266           Labels and branches in the named repo are prefixed with its name;
1267           then it is grafted to the selected one. Any other callouts in the
1268           named repo are also resolved in the context of the currently chosen
1269           one. Finally, the named repo is removed from the load list.
1270
1271           With the option --prune, prepend a deleteall operation into the
1272           root of the grafted repository.
1273
1274       path [source] rename [--force] [target]
1275           Rename a path in every fileop of every selected commit. The default
1276           selection set is all commits. The first argument is interpreted as
1277           a Python regular expression to match against paths; the second may
1278           contain back-reference syntax.
1279
1280           Ordinarily, if the target path already exists in the fileops, or is
1281           visible in the ancestry of the commit, this command throws an
1282           error. With the --force option, these checks are skipped.
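
               The rename rule can be illustrated with re.sub() and a
               collision check. This sketch assumes substitution semantics
               (rather than a full-path match) and models "already exists"
               as membership in a caller-supplied set; both are assumptions,
               as is the function name.

```python
import re


def rename_paths(paths, source, target, existing=(), force=False):
    """Apply re.sub(source, target) to each path, refusing collisions."""
    taken = set(existing)
    renamed = []
    for path in paths:
        new_path = re.sub(source, target, path)
        if new_path != path and new_path in taken and not force:
            raise ValueError("target path already exists: " + new_path)
        renamed.append(new_path)
    return renamed


# \1 is the back-reference syntax mentioned above.
print(rename_paths(["src/main.c", "doc/README"], r"^src/(.*)", r"lib/\1"))
```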
1283
1284       paths [{sub|sup}] [dirname] [>outfile]
1285           Takes a selection set. Without a modifier, list all paths touched
1286           by fileops in the selection set (which defaults to the entire
1287           repo). This reporting variant does >-redirection.
1288
1289           With the 'sub' modifier, take a second argument that is a directory
1290           name and prepend it to every path. With the 'sup' modifier, strip
1291           any directory argument from the start of the path if it appears
1292           there; with no argument, strip the first directory component from
1293           every path.
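
               The 'sub' and 'sup' transformations amount to simple prefix
               manipulation. The following sketch uses hypothetical function
               names and assumes '/'-separated paths.

```python
def paths_sub(paths, dirname):
    """Prepend dirname to every path."""
    prefix = dirname.rstrip("/") + "/"
    return [prefix + path for path in paths]


def paths_sup(paths, dirname=None):
    """Strip dirname from the start of each path, or the first component."""
    result = []
    for path in paths:
        if dirname is None:
            # No argument: drop the first directory component, if any.
            _head, sep, tail = path.partition("/")
            result.append(tail if sep else path)
        else:
            prefix = dirname.rstrip("/") + "/"
            result.append(path[len(prefix):] if path.startswith(prefix) else path)
    return result


print(paths_sub(["a/b.c"], "top"))
print(paths_sup(["top/a/b.c", "README"]))
```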
1294
1295       merge
1296           Create a merge link. Takes a selection set argument, ignoring all
1297           but the lowest (source) and highest (target) members. Creates a
1298           merge link from the highest member (child) to the lowest (parent).
1299
1300       unmerge
1301           Linearize a commit. Takes a selection set argument, which must
1302           resolve to a single commit, and removes all its parents except for
1303           the first.
1304
1305           It is equivalent to reparent --rebase first_parent,commit, where
1306           commit is the same selection set as used with unmerge and
1307           first_parent is a set resolving to commit's first parent (see the
1308           reparent command below).
1309
1310           The main benefit of unmerge is that you don't have to find and
1311           specify the first parent yourself, saving time and avoiding errors
1312           when nearby surgery would make a manual first parent argument
1313           stale.
1314
1315       reparent [options...] [policy]
1316           Changes the parent list of a commit. Takes a selection set, zero or
1317           more option arguments, and an optional policy argument.
1318
1319           Selection set:
1320               The selection set must resolve to one or more commits. The
1321               selected commit with the highest event number (not necessarily
1322               the last one selected) is the commit to modify. The remainder
1323               of the selected commits, if any, become its parents: the
1324               selected commit with the lowest event number (which is not
1325               necessarily the first one selected) becomes the first parent,
1326               the selected commit with second lowest event number becomes the
1327               second parent, and so on. All original parent links are
1328               removed. Examples:
1329
1330                   # this makes 17 the parent of 33
1331                   17,33 reparent
1332
1333                   # this also makes 17 the parent of 33
1334                   33,17 reparent
1335
1336                   # this makes 33 a root (parentless) commit
1337                   33 reparent
1338
1339                   # this makes 33 an octopus merge commit.  its first parent
1340                   # is commit 15, second parent is 17, and third parent is 22
1341                   22,33,15,17 reparent
1342
1343           Options:
1344
1345               --use-order
1346                   Use the selection order to determine which selected commit
1347                   is the commit to modify and which are the parents (and if
1348                   there are multiple parents, their order). The last selected
1349                   commit (not necessarily the one with the highest event
1350                   number) is the commit to modify, the first selected commit
1351                   (not necessarily the one with the lowest event number)
1352                   becomes the first parent, the second selected commit
1353                   becomes the second parent, and so on. Examples:
1354
1355                       # this makes 33 the parent of 17
1356                       33,17 reparent --use-order
1357
1358                       # this makes 17 an octopus merge commit.  its first parent
1359                       # is commit 22, second parent is 33, and third parent is 15
1360                       22,33,15,17 reparent --use-order
1361
1362                   Warning: with this option, it is possible to produce a
1363                   repository graph in which children precede parents. This
1364                   will produce a fatal error when the repository state is
1365                   written out, so don't do that.
1366
1367           Policy:
1368               By default, the manifest of the reparented commit is computed
1369               before modifying it; a deleteall and some fileops are prepended
1370               so that the manifest stays unchanged even when the first parent
1371               has been changed. This behavior can be changed by specifying a
1372               policy flag:
1373
1374               --rebase
1375                   Inhibits the default behavior: no deleteall is issued and
1376                   the tree contents of all descendants can be modified as a
1377                   result.
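
               The way reparent picks the commit to modify and orders its
               parents, with and without --use-order, can be summarized in a
               few lines. reparent_plan is a hypothetical name; the event
               numbers are those used in the examples above.

```python
def reparent_plan(selection, use_order=False):
    """Return (child, parents) for a reparent over the given event numbers."""
    if not selection:
        raise ValueError("selection must contain at least one commit")
    # Default: sort by event number. --use-order: keep selection order.
    ordered = list(selection) if use_order else sorted(selection)
    return ordered[-1], ordered[:-1]


print(reparent_plan([22, 33, 15, 17]))
print(reparent_plan([22, 33, 15, 17], use_order=True))
```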
1378
1379       reorder [--quiet]
1380           Re-order a contiguous range of commits.
1381
1382           Older revision control systems tracked change history on a per-file
1383           basis, rather than as a series of atomic changesets, which often
1384           made it difficult to determine the relationships between changes.
1385           Some tools which convert a history from one revision control system
1386           to another attempt to infer changesets by comparing each file's
1387           commit comment and time-stamp against those of nearby commits, but
1388           such inference is a heuristic and can easily fail. In the best
1389           case, when inference fails, a range of commits in the resulting
1390           conversion which should have been coalesced into a single changeset
1391           instead ends up as a contiguous range of separate commits. This
1392           situation typically can be repaired easily enough with the coalesce
1393           or squash commands. However, in the worst case, numerous commits
1394           from several different topics, each of which should have been one
1395           or more distinct changesets, may end up interleaved in an
1396           apparently chaotic fashion. To deal with such cases, the commits
1397           need to be re-ordered, so that those pertaining to each particular
1398           topic are clumped together, and then possibly squashed into one or
1399           more changesets pertaining to each topic. This command, reorder,
1400           can help with the first task; the squash command with the second.
1401
1402           Selected commits are re-arranged in the order specified; for
1403           instance: ":7,:5,:9,:3 reorder". The specified commit range must be
1404           contiguous; each commit must be accounted for after re-ordering.
1405           Thus, for example, ':5' cannot be omitted from ":7,:5,:9,:3
1406           reorder". (To drop a commit, use the delete or squash command.) The
1407           selected commits must represent a linear history; however, the
1408           lowest numbered commit being re-ordered may have multiple parents,
1409           and the highest numbered may have multiple children.
1410
1411           Re-ordered commits and their immediate descendants are inspected
1412           for rudimentary fileops inconsistencies. Warns if re-ordering
1413           results in a commit trying to delete, rename, or copy a file before
1414           it was ever created. Likewise, warns if all of a commit's fileops
1415           become no-ops after re-ordering. Other fileops inconsistencies may
1416           arise from re-ordering, both within the range of affected commits
1417           and beyond; for instance, moving a commit which renames a file
1418           ahead of a commit which references the original name. Such
1419           anomalies can be discovered via manual inspection and repaired with
1420           the add and remove (and possibly path) commands. Warnings can be
1421           suppressed with --quiet.
1422
1423           In addition to adjusting their parent/child relationships,
1424           re-ordering commits also re-orders the underlying events since
1425           ancestors must appear before descendants, and blobs must appear
1426           before commits which reference them. This means that events within
1427           the specified range will have different event numbers after the
1428           operation.
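
               The contiguity requirement for reorder can be checked as
               sketched here, using a plain list of marks to stand in for a
               linear history. The function name and the list model are
               assumptions; fileop consistency checks are not modeled.

```python
def reorder_history(history, requested):
    """Return history with the contiguous slice covered by requested re-ordered.

    history is the list of commit marks in current order; requested is the
    new order for a contiguous run of those commits.
    """
    if len(set(requested)) != len(requested):
        raise ValueError("duplicate commit in selection")
    positions = sorted(history.index(mark) for mark in requested)
    if positions != list(range(positions[0], positions[-1] + 1)):
        raise ValueError("selected commits are not contiguous")
    start = positions[0]
    return history[:start] + list(requested) + history[start + len(requested):]


history = [":1", ":3", ":5", ":7", ":9", ":11"]
print(reorder_history(history, [":7", ":5", ":9", ":3"]))
```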
1429
1430       branch branchname {rename|delete} [arg]
1431           Rename or delete a branch (and any associated resets). First
1432           argument must be an existing branch name; second argument must be one
1433           of the verbs 'rename' or 'delete'. The branchname may use backslash
1434           escapes interpreted by the Python string-escape codec, such as \s.
1435
1436           For a 'rename', the third argument may be any token that is a
1437           syntactically valid branch name (but not the name of an existing
1438           branch).
1439
1440           For either name, if it does not contain a '/' the prefix
1441           'refs/heads' is prepended.
1442
1443       tag tagname {create|move|rename|delete} [arg]
1444           Create, move, rename, or delete a tag.
1445
1446           Creation is a special case. First argument is a name, which must
1447           not be an existing tag. Takes a singleton event second argument
1448           which must point to a commit. A tag object pointing to the commit
1449           is created and inserted just after the last tag in the repo (or
1450           just after the last commit if there are no tags). The tagger,
1451           committish, and comment fields are copied from the commit's
1452           committer, mark, and comment fields.
1453
1454           Otherwise, first argument must be an existing tag name; second
1455           argument must be one of the verbs “move”, “rename”, or “delete”.
1456
1457           For a “move”, a third argument must be a singleton selection set.
1458           For a “rename”, the third argument may be any token that is a
1459           syntactically valid tag name (but not the name of an existing tag).
1460           For a “delete”, no third argument is required.
1461
1462           The name portion of a delete may be a regexp wrapped in //; if so,
1463           all objects of the specified type with names matching the regexp
1464           are deleted. This is useful for mass deletion of junk tags such as
1465           CVS branch-root tags.
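
               Mass deletion by regexp could work along these lines. The
               sketch assumes search semantics (a match anywhere in the
               name); whether reposurgeon anchors the match is not stated
               here, and tags_to_delete and the sample tag names are
               hypothetical.

```python
import re


def tags_to_delete(name, tag_names):
    """Return the tags selected by a literal name or a /regexp/ argument."""
    if len(name) >= 2 and name.startswith("/") and name.endswith("/"):
        pattern = re.compile(name[1:-1])
        return [tag for tag in tag_names if pattern.search(tag)]
    return [tag for tag in tag_names if tag == name]


tags = ["v1.0", "trunk-root", "dev-root"]
print(tags_to_delete("/-root$/", tags))
```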
1466
1467           The tagname may use backslash escapes interpreted by the Python
1468           string-escape codec, such as \s.
1469
1470           The behavior of this command is complex because features which
1471           present as tags may be any of three things: (1) True tag objects,
1472           (2) lightweight tags, actually sequences of commits with a common
1473           branchname beginning with “refs/tags” - in this case the tag is
1474           considered to point to the last commit in the sequence, (3) Reset
1475           objects. These may occur in combination; in fact, stream exporters
1476           from systems with annotation tags commonly express each of these as
1477           a true tag object (1) pointing at the tip commit of a sequence (2)
1478           in which the basename of the common branch field is identical to
1479           the tag name. An exporter that generates lightweight-tagged commit
1480           sequences (2) may or may not generate resets pointing at their tip
1481           commits.
1482
1483           This command tries to handle all combinations in a natural way by
1484           doing up to three operations on any true tag, commit sequence, and
1485           reset matching the source name. In a rename, all are renamed
1486           together. In a delete, any matching tag or reset is deleted; then
1487           matching branch fields are changed to match the branch of the
1488           unique descendent of the tagged commit, if there is one. When a tag
1489           is moved, no branch fields are changed and a warning is issued.
1490
1491           Attempts to delete a lightweight tag may fail with the message
1492           “couldn't determine a unique successor”. When this happens, the tag
1493           is on a commit with multiple children that have different branch
1494           labels. There is a hole in the specification of git fast-import
1495           streams that leaves it uncertain how branch labels can be safely
1496           reassigned in this case; rather than do something risky,
1497           reposurgeon throws a recoverable error.
1498
1499       reset resetname {create|move|rename|delete} [arg]
1500           Create, move, rename, or delete a reset. Create is a special case;
1501           it requires a singleton selection which is the associated commit for
1502           the reset, takes as a first argument the name of the reset (which
1503           must not exist), and ends with the keyword create.
1504
1505           In the other modes, the first argument must match an existing reset
1506           name; second argument must be one of the verbs “move”, “rename”, or
1507           “delete”.
1508
1509           The reset name may use backslash escapes interpreted by the Python
1510           string-escape codec, such as \s.
1511
1512           For a “move”, a third argument must be a singleton selection set.
1513           For a “rename”, the third argument may be any token that
1514           matches a syntactically valid reset name (but not the name of an
1515           existing reset). For a “delete”, no third argument is required.
1516
1517           For either name, if it does not contain a “/” the prefix “heads/”
1518           is prepended. If it does not begin with “refs/”, “refs/” is
1519           prepended.
1520
1521           An argument matches a reset's name if it is either the entire
1522           reference (refs/heads/FOO or refs/tags/FOO for some value of
1523           FOO) or the basename (e.g. FOO), or a suffix of the form heads/FOO
1524           or tags/FOO. An unqualified basename is assumed to refer to a head.
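
               The prefixing and matching rules for reset names reduce to a
               small normalization step. This is a sketch under the
               assumption that matching is just equality after
               normalization; the function names are hypothetical.

```python
def normalize_reset_name(name):
    """Expand FOO, heads/FOO or tags/FOO to a full refs/... reference."""
    if "/" not in name:
        name = "heads/" + name      # a bare basename refers to a head
    if not name.startswith("refs/"):
        name = "refs/" + name
    return name


def matches_reset(argument, reset_ref):
    """True if argument names reset_ref under the rules above."""
    return normalize_reset_name(argument) == reset_ref


print(normalize_reset_name("FOO"), normalize_reset_name("tags/FOO"))
```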
1525
1526           When a reset is renamed, commit branch fields matching the reset are
1527           renamed with it to match. When a reset is deleted, matching branch
1528           fields are changed to match the branch of the unique descendent of
1529           the tip commit of the associated branch, if there is one. When a
1530           reset is moved, no branch fields are changed.
1531
1532       debranch source-branch [target-branch]
1533           Takes one or two arguments which must be the names of source and
1534           target branches; if the second (target) argument is omitted it
1535           defaults to refs/heads/master. Any trailing segment of a branch
1536           name is accepted as a synonym for it; thus master is the same as
1537           refs/heads/master. Does not take a selection set.
1538
1539           The history of the source branch is merged into the history of the
1540           target branch, becoming the history of a subdirectory with the name
1541           of the source branch. Any resets of the source branch are removed.
1542
1543       strip [blobs|reduce]
1544           Reduce the selected repository to make it a more tractable test
1545           case. Use this when reporting bugs.
1546
1547           With the modifier 'blobs', replace each blob in the repository with
1548           a small, self-identifying stub, leaving all metadata and DAG
1549           topology intact. This is useful when you are reporting a bug, for
1550           reducing large repositories to test cases of manageable size.
1551
1552           A selection set is effective only with the 'blobs' option,
1553           defaulting to all blobs. The 'reduce' mode always acts on the
1554           entire repository.
1555
1556           With the modifier 'reduce', perform a topological reduction that
1557           throws out uninteresting commits. If a commit has all file
1558           modifications (no deletions or copies or renames) and has exactly
1559           one ancestor and one descendant, then it may be boring. To be fully
1560           boring, it must also not be referred to by any tag or reset.
1561           Interesting commits are not boring, or have a non-boring parent or
1562           non-boring child.
1563
1564           With no modifiers, this command strips blobs.
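
               The "boring commit" test used by the 'reduce' modifier can be
               written out as a predicate. The Commit class below is a
               hypothetical stand-in for real repository objects, and "one
               ancestor and one descendant" is read here as exactly one
               parent and one child, which is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    ident: str
    ops: list                                   # fileop letters, e.g. ["M"]
    parents: list = field(default_factory=list)
    children: list = field(default_factory=list)
    referenced: bool = False                    # target of a tag or reset


def is_boring(commit):
    """Only modifications, one parent, one child, no tag or reset."""
    return (all(op == "M" for op in commit.ops)
            and len(commit.parents) == 1
            and len(commit.children) == 1
            and not commit.referenced)


def interesting(commits):
    """Idents of commits that are non-boring or adjacent to a non-boring one."""
    by_id = {c.ident: c for c in commits}
    keep = []
    for c in commits:
        neighbors = [by_id[i] for i in c.parents + c.children]
        if not is_boring(c) or any(not is_boring(n) for n in neighbors):
            keep.append(c.ident)
    return keep


chain = [
    Commit("a", ["M"], [], ["b"]),
    Commit("b", ["M"], ["a"], ["c"]),
    Commit("c", ["M"], ["b"], ["d"]),
    Commit("d", ["M"], ["c"], ["e"]),
    Commit("e", ["M"], ["d"], []),
]
print(interesting(chain))
```

               In this linear chain only the middle commit is dropped: the
               root and tip are non-boring, and their neighbors are kept
               because they touch a non-boring commit.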
1565
1566       ignores [rename] [translate] [defaults]
1567           Intelligent handling of ignore-pattern files. This command fails if
1568           no repository has been selected or no preferred write type has been
1569           set for the repository. It does not take a selection set.
1570
1571           If the rename modifier is present, this command attempts to rename
1572           all ignore-pattern files to whatever is appropriate for the
1573           preferred type - e.g. .gitignore for git, .hgignore for hg, etc.
1574           This option does not cause any translation of the ignore files it
1575           renames.
1576
1577           If the translate modifier is present, syntax translation of each
1578           ignore file is attempted. At present, the only transformation the
1579           code knows is to prepend a 'syntax: glob' header if the preferred
1580           type is hg.
1581
1582           If the defaults modifier is present, the command attempts to
1583           prepend the preferred type's default patterns to all ignore files.
1584           If no ignore file is created by the first commit, that commit will
1585           be modified to create one containing the defaults. This command
1586           will error out on preferred types that have no default ignore
1587           patterns (git and hg, in particular). It will also error out when
1588           it knows the import tool has already set default patterns.
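
               The rename and translate modifiers can be pictured with the
               following sketch. Only the git and hg file names and the
               'syntax: glob' header come from the text above; the function
               names, and the choice to skip the header when it is already
               present, are assumptions.

```python
IGNORE_NAMES = {"git": ".gitignore", "hg": ".hgignore"}


def ignore_filename(preferred):
    """Name of the ignore-pattern file for the preferred type."""
    return IGNORE_NAMES[preferred]


def translate_ignore(content, preferred):
    """Prepend the hg glob-syntax header when the preferred type is hg."""
    header = "syntax: glob\n"
    if preferred == "hg" and not content.startswith(header):
        return header + content
    return content


print(ignore_filename("hg"))
print(translate_ignore("*.o\n*.pyc\n", "hg"), end="")
```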
1589
1590       attribution [selection] {show | set | delete | prepend | append} [args]
1591           Inspect, modify, add, and remove commit and tag attributions.
1592
1593           Attributions upon which to operate are selected in much the same
1594           way as events are selected, as described in SELECTION SYNTAX.
1595           selection is an expression composed of 1-origin
1596           attribution-sequence numbers, '$' for last attribution, '..'
1597           ranges, comma-separated items, '(...)' grouping, set operations '|'
1598           union, '&' intersection, and '~' negation, and function calls
1599           @min(), @max(), @amp(), @pre(), @suc(), @srt(). Attributions can
1600           also be selected by visibility set '=C' for committers, '=A' for
1601           authors, and '=T' for taggers. Finally, /regex/ will attempt to
1602           match the Python regular expression regex against an attribution
1603           name and email address; '/n' limits the match to only the name, and
1604           '/e' to only the email address.
1605
1606           With the exception of show, all actions require an explicit event
1607           selection upon which to operate. Available actions are:
1608
1609           [selection] [show] [>file]
1610               Inspect the selected attributions of the specified events
1611               (commits and tags). The show keyword is optional. If no
1612               attribution selection expression is given, defaults to all
1613               attributions. If no event selection is specified, defaults to
1614               all events. Supports > redirection.
1615
1616           selection set name [email] [date]
1617           selection set [name] email [date]
1618           selection set [name] [email] date
1619               Assign name, email, date to the selected attributions. As a
1620               convenience, if only some fields need to be changed, the others
1621               can be omitted. Arguments name, email, and date can be given in
1622               any order.
1623
1624           [selection] delete
1625               Delete the selected attributions. As a convenience, deletes all
1626               authors if selection is not given. It is an error to delete the
1627               mandatory committer and tagger attributions of commit and tag
1628               events, respectively.
1629
1630           [selection] prepend name [email] [date]
1631           [selection] prepend [name] email [date]
1632               Insert a new attribution before the first attribution named by
1633               selection. The new attribution has the same type (committer,
1634               author, or tagger) as the one before which it is being
1635               inserted. Arguments name, email, and date can be given in any
1636               order.
1637
1638               If name is omitted, an attempt is made to infer it from email
1639               by trying to match email against an existing attribution of the
1640               event, with preference given to the attribution before which
1641               the new attribution is being inserted. Similarly, email is
1642               inferred from an existing matching name. Likewise, for date.
1643
1644               As a convenience, if selection is empty or not specified a new
1645               author is prepended to the author list.
1646
1647               It is presently an error to insert a new committer or tagger
1648               attribution. To change a committer or tagger, use set instead.
1649
1650           [selection] append name [email] [date]
1651           [selection] append [name] email [date]
1652               Insert a new attribution after the last attribution named by
1653               selection. The new attribution has the same type (committer,
1654               author, or tagger) as the one after which it is being inserted.
1655               Arguments name, email, and date can be given in any order.
1656
1657               If name is omitted, an attempt is made to infer it from email
1658               by trying to match email against an existing attribution of the
1659               event, with preference given to the attribution after which the
1660               new attribution is being inserted. Similarly, email is inferred
1661               from an existing matching name. Likewise, for date.
1662
1663               As a convenience, if selection is empty or not specified a new
1664               author is appended to the author list.
1665
1666               It is presently an error to insert a new committer or tagger
1667               attribution. To change a committer or tagger, use set instead.
1668
1669   REFERENCE LIFTING
1670       This group of commands is meant for fixing up references in commits
1671       that are in the format of older version control systems. The general
1672       workflow is this: first, go over the comment history and change all
1673       old-fashioned commit references into machine-parseable cookies. Then,
1674       automatically turn the machine-parseable cookie into action stamps. The
1675       point of dividing the process this way is that the first part is hard
1676       for a machine to get right, while the second part is prone to errors
1677       when a human does it.
1678
1679       A Subversion cookie is a comment substring of the form [[SVN:ddddd]]
1680       (example: [[SVN:2355]]) with the revision read directly via the
1681       Subversion exporter, deduced from git-svn metadata, or matching a
1682       $Revision$ header embedded in blob data for the filename.
1683
1684       A CVS cookie is a comment substring of the form
1685       [[CVS:filename:revision]] (example: [[CVS:src/README:1.23]]) with the
1686       revision matching a CVS $Id$ or $Revision$ header embedded in blob data
1687       for the filename.
1688
1689       A mark cookie is of the form [[:dddd]] and is simply a reference to the
1690       specified mark. You may want to hand-patch this in when one of the
1691       previous forms is inconvenient.
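
The three cookie forms are regular enough to be recognized mechanically. The
following is a minimal Python sketch, not reposurgeon's own code: the patterns
and the name find_cookies are assumptions derived only from the forms
described above.

```python
import re

# Hypothetical patterns derived from the cookie forms described above;
# reposurgeon's real recognizers may differ.
SVN_COOKIE = re.compile(r"\[\[SVN:(\d+)\]\]")
CVS_COOKIE = re.compile(r"\[\[CVS:([^:\]]+):([0-9.]+)\]\]")
MARK_COOKIE = re.compile(r"\[\[(:\d+)\]\]")


def find_cookies(comment):
    """Return a list of (kind, payload) tuples for cookies in a comment."""
    found = []
    for match in SVN_COOKIE.finditer(comment):
        found.append(("SVN", match.group(1)))
    for match in CVS_COOKIE.finditer(comment):
        found.append(("CVS", (match.group(1), match.group(2))))
    for match in MARK_COOKIE.finditer(comment):
        found.append(("mark", match.group(1)))
    return found
```

Results are grouped by kind (Subversion, then CVS, then mark) rather than by
position in the comment.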
1692
1693       An action stamp is an RFC3339 timestamp, followed by a '!', followed by
1694       an author email address (author is preferred rather than committer
1695       because that timestamp is not changed when a patch is replayed on to a
1696       branch, but the code to make a stamp for a commit will fall back to the
1697       committer if no author field is present). It attempts to refer to a
1698       commit without being VCS-specific. Thus, instead of "commit 304a53c2"
1699       or "r2355", you would write "2011-10-25T15:11:09Z!fred@foonly.com".
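
As an illustration of the stamp layout, here is a small Python sketch that
builds and splits action stamps. It assumes the timestamp is always written in
the UTC "Z" form shown in the example above; the function names make_stamp and
parse_stamp are hypothetical and not part of reposurgeon.

```python
from datetime import datetime, timezone


def make_stamp(when, email):
    """Format an aware datetime and an email address as an action stamp."""
    utc = when.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%SZ") + "!" + email


def parse_stamp(stamp):
    """Split an action stamp into an aware UTC datetime and an email address."""
    timestamp, email = stamp.split("!", 1)
    # Assumption: only the "Z" (UTC) form of RFC3339 is handled here.
    when = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return when.replace(tzinfo=timezone.utc), email
```

Splitting on the first '!' is safe because an RFC3339 timestamp never contains
that character.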
1700
1701       The following git aliases allow git to work directly with action
1702       stamps. Append them to your ~/.gitconfig; if you already have an [alias]
1703       section, leave off the first line.
1704
1705
1706           [alias]
1707                # git stamp <commit-ish> - print a reposurgeon-style action stamp
1708                stamp = show -s --format='%cI!%ce'
1709
1710                # git scommit <stamp> <rev-list-args> - list most recent commit that matches <stamp>.
1711                # Must also specify a branch to search or --all, after these arguments.
1712                scommit = "!f(){ d=${1%%!*}; a=${1##*!}; arg=\"--until=$d -1\"; if [ $a != $1 ]; then arg=\"$arg --committer=$a\"; fi; shift; git rev-list $arg ${1:+\"$@\"}; }; f"
1713
1714                # git scommits <stamp> <rev-list-args> - as above, but list all matching commits.
1715                scommits = "!f(){ d=${1%%!*}; a=${1##*!}; arg=\"--until=$d --after $d\"; if [ $a != $1 ]; then arg=\"$arg --committer=$a\"; fi; shift; git rev-list $arg ${1:+\"$@\"}; }; f"
1716
1717                # git smaster <stamp> - list most recent commit on master that matches <stamp>.
1718                smaster = "!f(){ git scommit \"$1\" master --first-parent; }; f"
1719                smasters = "!f(){ git scommits \"$1\" master --first-parent; }; f"
1720
1721                # git shs <stamp> - show the commits on master that match <stamp>.
1722                shs = "!f(){ stamp=$(git smasters $1); shift; git show ${stamp:?not found} $*; }; f"
1723
1724                # git slog <stamp> <log-args> - start git log at <stamp> on master
1725                slog = "!f(){ stamp=$(git smaster $1); shift; git log ${stamp:?not found} $*; }; f"
1726
1727                # git sco <stamp> - check out most recent commit on master that matches <stamp>.
1728                sco = "!f(){ stamp=$(git smaster $1); shift; git checkout ${stamp:?not found} $*; }; f"
1729
1730
1731       There is a rare case in which an action stamp will not refer uniquely
1732       to one commit. It is theoretically possible that the same author might
1733       check in revisions on different branches within the one-second
1734       resolution of the timestamps in a fast-import stream. There is nothing
1735       to be done about this; tools using action stamps need to be aware of
1736       the possibility and throw a warning when it occurs.
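
A tool that consumes action stamps might detect the ambiguous case along the
following lines. This is only a sketch: the (commit_id, stamp) pair layout,
the function name index_by_stamp, and the warning text are all assumptions
made for illustration.

```python
from collections import defaultdict
import warnings


def index_by_stamp(commits):
    """Map each action stamp to the list of commit IDs that carry it.

    commits is an iterable of (commit_id, stamp) pairs (a hypothetical
    layout). A warning is issued for every stamp shared by several commits.
    """
    index = defaultdict(list)
    for commit_id, stamp in commits:
        index[stamp].append(commit_id)
    for stamp, ids in index.items():
        if len(ids) > 1:
            warnings.warn("action stamp %s matches %d commits" % (stamp, len(ids)))
    return dict(index)
```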
1737
1738       In order to support reference lifting, reposurgeon internally builds a
1739       legacy-reference map that associates revision identifiers in older
1740       version-control systems with commits. The contents of this map come
1741       from three places: (1) cvs2svn:rev properties if the repository was
1742       read from a Subversion dump stream, (2) $Id$ and $Revision$ headers in
1743       repository files, and (3) the .git/cvs-revisions file created by git
1744       cvsimport.
1745
1746       The detailed sequence for lifting possible references is this: first,
1747       find possible CVS and Subversion references with the references or =N
1748       visibility set; then replace them with equivalent cookies; then run
1749       references lift to turn the cookies into action stamps (using the
1750       information in the legacy-reference map) without having to do the
1751       lookup by hand.
1752
1753       references [list|edit|lift] [>outfile]
1754           With the modifier 'list', list commit and tag comments for strings
1755           that might be CVS- or Subversion-style revision identifiers. This
1756           will be useful when you want to replace them with equivalent
1757           cookies that can automatically be translated into VCS-independent
1758           action stamps. This reporting command supports >-redirection. It is
1759           equivalent to '=N list'.
1760
1761           With the modifier 'edit', edit the set where revision IDs are
1762           found. This version of the command supports < and > redirection.
1763           This is equivalent to '=N edit'.
1764
1765           With the modifier 'lift', attempt to resolve Subversion and CVS
1766           cookies in comments into action stamps using the legacy map. An
1767           action stamp is a timestamp/email/sequence-number combination
1768           uniquely identifying the commit associated with that blob, as
1769           described in the section called “TRANSLATION STYLE”.
1770
1771           It is not guaranteed that every such reference will be resolved, or
1772           even that any at all will be. Normally all references in history
1773           from a Subversion repository will resolve, but CVS references are
1774           less likely to be resolvable.
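
The lift step can be pictured as a lookup-and-replace over each comment. The
Python sketch below is an illustration under stated assumptions, not the
program's implementation: it assumes the legacy map is a dictionary keyed by
the text between the double brackets (for example "SVN:2355"), and the name
lift_cookies is hypothetical. Cookies with no map entry are left untouched,
which mirrors the caveat that not every reference will resolve.

```python
import re

# Matches Subversion and CVS cookies; group 1 is the text inside the brackets.
COOKIE = re.compile(r"\[\[((?:SVN|CVS):[^\]]+)\]\]")


def lift_cookies(comment, legacy_map):
    """Replace resolvable cookies with action stamps; leave the rest alone."""
    def resolve(match):
        stamp = legacy_map.get(match.group(1))
        return match.group(0) if stamp is None else stamp

    return COOKIE.sub(resolve, comment)
```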
1775
1776   CHANGELOGS
1777       CVS and Subversion do not have separate notions of committer and
1778       author for changesets; when lifted to a VCS that does, like git, their
1779       one author field is used for both.
1780
1781       However, if the project used the FSF ChangeLog convention, many
1782       changesets will include a ChangeLog modification listing an author for
1783       the commit. In the common case that the changeset was derived from a
1784       patch and committed by a project maintainer, but the ChangeLog entry
1785       names the actual author, this information can be recovered.
1786
1787       Use the "changelogs" command. This takes neither arguments nor a
1788       selection set. It mines the ChangeLog files for authorship data.
1789
1790       It assumes such files have the basename 'ChangeLog', and that they are
1791       in the format used by FSF projects: entry header lines begin with
1792       YYYY-MM-DD and are followed by a fullname/address. When a ChangeLog
1793       file modification is found in a clique, the entry header at or before
1794       the section changed since its last revision is parsed and the address
1795       is inserted as the commit author.
1796
1797       If the entry header contains an email address but no name, a name will
1798       be filled in if possible by looking for the address in author map
1799       entries.
1800
1801       In accordance with FSF policy for ChangeLogs, any date in an
1802       attribution header is discarded and the committer date is used.
1803       However, if the name is an author-map alias with an associated
1804       timezone, that zone is used.
1805
1806       The command reports statistics on how many commits were altered.
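
To make the assumed header format concrete, here is a minimal Python sketch of
recognizing an entry header line. The regular expression and the name
parse_entry_header are assumptions for illustration; the parser inside
reposurgeon may accept more variants.

```python
import re

# Assumed shape of an FSF-style entry header: "YYYY-MM-DD  Full Name  <address>".
HEADER = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+(.*?)\s*<([^<>\s]+)>\s*$")


def parse_entry_header(line):
    """Return (name, address) for an entry header line, or None otherwise.

    The date is matched but deliberately dropped, mirroring the rule above
    that the committer date is used instead.
    """
    match = HEADER.match(line)
    if match is None:
        return None
    return match.group(2), match.group(3)
```

A header with an address but no name yields an empty name, which is the case
where a lookup in the author map would be needed.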
1807
1808   RELEASE TARBALLS
1809       When converting a legacy repository, it sometimes happens that there
1810       are archived releases of the project surviving from before the date of
1811       the repository's initial commit. It may be desirable to insert those
1812       releases at the front of the repository history.
1813
1814       To do this, use the "incorporate" command. This command takes a
1815       single argument naming a tarball, the content of which is to be
1816       inserted as a commit. It may be a gzipped or bzipped tarball. The
1817       initial segment of each path is assumed to be a version directory and
1818       stripped off. The number of segments stripped off can be set with the
1819       option --strip=n, n defaulting to 1.
1820
1821       Takes a singleton selection set. Normally inserts before that commit;
1822       with the option --after, inserts after it. The default selection set is
1823       the very first commit of the repository.
1824
1825       The option --date can be used to set the commit date. It takes an
1826       argument, which is expected to be an RFC3339 timestamp.
1827
1828       The generated commit has a committer field (the invoking user) and gets
1829       as its commit date the modification time of the newest file in the
1830       tarball (not the mod time of the tarball itself). No author field is
1831       generated. A comment recording the tarball name is generated.
1832
1833       Note that the import stream generated by this command is - while
1834       correct - not optimal, and may in particular contain duplicate blobs.
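
Two of the rules above, path stripping and the choice of commit date, can be
sketched with Python's standard tarfile module. The helper names and the
in-memory demo tarball (member names and modification times included) are made
up for illustration and do not come from reposurgeon.

```python
import io
import tarfile


def strip_segments(path, n=1):
    """Drop the first n slash-separated segments of a member path."""
    return "/".join(path.split("/")[n:])


def newest_mtime(tarball):
    """Return the newest modification time among regular files in a tarball."""
    return max(m.mtime for m in tarball.getmembers() if m.isfile())


def demo_tarball():
    """Build a tiny in-memory tarball with made-up names and mtimes."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, mtime in (("foo-1.0/README", 100), ("foo-1.0/src/main.c", 200)):
            info = tarfile.TarInfo(name)
            info.mtime = mtime
            tar.addfile(info)  # zero-length regular file
    buffer.seek(0)
    return buffer


with tarfile.open(fileobj=demo_tarball(), mode="r") as tar:
    paths = [strip_segments(m.name) for m in tar.getmembers()]
    commit_time = newest_mtime(tar)
```

With the default of one stripped segment, the version directory foo-1.0
disappears from every path, and the commit time is taken from the newest
member rather than from the archive file.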
1835
1836   VARIABLES, MACROS AND EXTENSIONS
1837       Occasionally you will need to issue a large number of complex surgical
1838       commands of very similar form, and it's convenient to be able to
1839       package that form so you don't need to do a lot of error-prone typing.
1840       For those occasions, reposurgeon supports simple forms of named
1841       variables and macro expansion.
1842
1843       assign [name]
1844           Compute a leading selection set and assign it to a symbolic name.
1845           It is an error to assign to a name that is already assigned, or to
1846           any existing branch name. Assignments may be cleared by sequence
1847           mutations (though not ordinary deletions); you will see a warning
1848           when this occurs.
1849
1850           With no selection set and no name, list all assignments.
1851
1852           If the option --singleton is given, the assignment will throw an
1853           error if the selection set is not a singleton.
1854
1855           Use this to optimize out location and selection computations that
1856           would otherwise be performed repeatedly, e.g. in macro calls.
1857
1858       unassign name
1859           Unassign a symbolic name. Throws an error if the name is not
1860           assigned.
1861
1862       names [>outfile]
1863           List the names of all known branches and tags. Tells you what
1864           things are legal within angle brackets and parentheses.
1865
1866       define name body
1867           Define a macro. The first whitespace-separated token is the name;
1868           the remainder of the line is the body, unless it is “{”, which
1869           begins a multi-line macro terminated by a line beginning with “}”.
1870
1871           A later “do” call can invoke this macro.
1872
1873           The command “define” by itself without a name or body produces a
1874           macro list.
1875
1876       do name arguments...
1877           Expand and perform a macro. The first whitespace-separated token is
1878           the name of the macro to be called; remaining tokens replace {0},
1879           {1}... in the macro definition. Tokens may contain whitespace if
1880           they are string-quoted; string quotes are stripped. Macros can call
1881           macros.
1882
1883           If the macro expansion does not itself begin with a selection set,
1884           whatever set was specified before the "do" keyword is available to
1885           the command generated by the expansion.
1886
1887       undefine name
1888           Undefine the named macro.
1889
1890       Here's an example to illustrate how you might use this. In CVS
1891       repositories of projects that use the GNU ChangeLog convention, a very
1892       common pre-conversion artifact is a commit with the comment "*** empty
1893       log message ***" that modifies only a ChangeLog entry explaining the
1894       commit immediately previous to it. The following
1895
1896           define changelog <{0}> & /empty log message/ squash --pushback
1897           do changelog 2012-08-14T21:51:35Z
1898           do changelog 2012-08-08T22:52:14Z
1899           do changelog 2012-08-07T04:48:26Z
1900           do changelog 2012-08-08T07:19:09Z
1901           do changelog 2012-07-28T18:40:10Z
1902
1903       is equivalent to the more verbose
1904
1905           <2012-08-14T21:51:35Z> & /empty log message/ squash --pushback
1906           <2012-08-08T22:52:14Z> & /empty log message/ squash --pushback
1907           <2012-08-07T04:48:26Z> & /empty log message/ squash --pushback
1908           <2012-08-08T07:19:09Z> & /empty log message/ squash --pushback
1909           <2012-07-28T18:40:10Z> & /empty log message/ squash --pushback
1910
1911       but you are less likely to make difficult-to-notice errors typing the
1912       first version.
1913
1914       (Also note how the text regexp acts as a failsafe against the
1915       possibility of typing a wrong date that doesn't refer to a commit with
1916       an empty comment. This was a real-world example from the CVS-to-git
1917       conversion of groff.)
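
The expansion performed by "do" can be modelled in a few lines of Python. This
is a sketch of the substitution rule only: the name expand_macro is
hypothetical, shlex is used here as a stand-in for the interpreter's own
string-quote handling, and leaving an unfilled placeholder untouched is an
assumption.

```python
import re
import shlex


def expand_macro(body, call_line):
    """Expand {0}, {1}, ... in a macro body with the tokens of a call line."""
    # shlex.split honours string quotes and strips them, so a quoted token
    # may contain whitespace.
    args = shlex.split(call_line)

    def substitute(match):
        index = int(match.group(1))
        # Assumption: a placeholder with no matching argument is left as-is.
        return args[index] if index < len(args) else match.group(0)

    return re.sub(r"\{(\d+)\}", substitute, body)
```

Applied to the changelog macro body and the argument 2012-08-14T21:51:35Z,
this yields the first line of the verbose form shown above.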
1918
1919       script filename [arg...]
1920           Takes a filename and optional following arguments. Reads each line
1921           from the file and executes it as a command.
1922
1923           During execution of the script, the script name replaces the string
1924           $0 and the optional following arguments (if any) replace the
1925           strings $1, $2 ... $n in the script text. This is done before
1926           tokenization, so the $1 in a string like “foo$1bar” will be
1927           expanded. Additionally, $$ is expanded to the current process ID
1928           (which may be useful for scripts that use tempfiles).
1929
1930           Within scripts (and only within scripts) reposurgeon accepts a
1931           slightly extended syntax: First, a backslash ending a line signals
1932           that the command continues on the next line. Any number of
1933           consecutive lines thus escaped are concatenated, without the ending
1934           backslashes, prior to evaluation. Second, a command that takes an
1935           input filename argument can instead take literal following data in
1936           the syntax of a shell here-document. That is: if the filename is
1937           replaced by "<<EOF", all following lines in the script up to a
1938           terminating line consisting only of "EOF" will be read, placed in a
1939           temporary file, and that file fed to the command and afterwards
1940           deleted. EOF may be replaced by any string. Backslashes have no
1941           special meaning while reading a here-document.
1942
1943           Scripts may have comments. Any line beginning with a '#' is
1944           ignored. If a line has a trailing portion that begins with one or
1945           more whitespace characters followed by '#', that trailing portion
1946           is ignored.
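
The argument substitution and line-continuation rules for scripts can be
sketched as follows. This is an illustrative Python model, not the
interpreter's code: the function names are hypothetical, and leaving a
reference such as $3 untouched when fewer arguments were supplied is an
assumption. Here-documents and comment stripping are not modelled.

```python
import os
import re


def substitute_arguments(line, script_name, args):
    """Replace $$, $0 and $1..$n in a raw script line before tokenization."""
    def replace(match):
        token = match.group(1)
        if token == "$":
            return str(os.getpid())
        index = int(token)
        if index == 0:
            return script_name
        # Assumption: references beyond the supplied arguments are left as-is.
        return args[index - 1] if index <= len(args) else match.group(0)

    return re.sub(r"\$(\$|\d+)", replace, line)


def join_continuations(lines):
    """Concatenate backslash-terminated lines, dropping the backslashes."""
    joined, pending = [], ""
    for line in lines:
        if line.endswith("\\"):
            pending += line[:-1]
        else:
            joined.append(pending + line)
            pending = ""
    if pending:
        joined.append(pending)
    return joined
```

Because substitution works on the raw text, the $1 in foo$1bar is replaced
even though it sits in the middle of a token.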
1947
1948   ARTIFACT REMOVAL
1949       Some commands automate fixing various kinds of artifacts associated
1950       with repository conversions from older systems.
1951
1952       authors [read|write] [<filename] [>filename]
1953           Apply or dump author-map information for the specified selection
1954           set, defaulting to all events.
1955
1956           Lifts from CVS and Subversion may have only usernames local to the
1957           repository host in committer and author IDs. DVCSes want email
1958           addresses (net-wide identifiers) and complete names. To supply the
1959           map from one to the other, an authors file is expected to consist
1960           of lines each beginning with a local user ID, followed by a '='
1961           (possibly surrounded by whitespace) followed by a full name and
1962           email address, optionally followed by a timezone offset field.
1963           Thus:
1964
1965               ferd = Ferd J. Foonly <foonly@foo.com> America/New_York
1966
1967           An authors file may also contain lines of this form
1968
1969               + Ferd J. Foonly <foonly@foobar.com> America/Los_Angeles
1970
1971           These are interpreted as aliases for the last preceding '=' entry
1972           that may appear in ChangeLog files. When such an alias is matched
1973           on a ChangeLog attribution line, the author attribution for the
1974           commit is mapped to the basename, but the timezone is used as is.
1975           This accommodates people with past addresses (possibly at
1976           different locations), unifying such aliases in metadata so searches
1977           and statistical aggregation will work better.
1978
1979           An authors file may have comment lines beginning with '#'; these
1980           are ignored.
1981
1982           When an authors file is applied, email addresses in committer and
1983           author metadata for which the local ID matches between < and @ are
1984           replaced according to the mapping (this handles git-svn lifts).
1985           Alternatively, if the local ID is the entire address, this is also
1986           considered a match (this handles what git-cvsimport and cvs2git
1987           do). If a timezone was specified in the map entry, that person's
1988           author and committer dates are mapped to it.
1989
1990           With the 'read' modifier, or no modifier, apply author mapping data
1991           (from standard input or a <-redirected file). May be useful if you
1992           are editing a repo or dump created by cvs2git or by git-svn invoked
1993           without -A.
1994
1995           With the 'write' modifier, write a mapping file that could be
1996           interpreted by authors read, with entries for each unique
1997           committer, author, and tagger (to standard output or a >-redirected
1998           mapping file). This may be helpful as a start on building an
1999           authors file, though each part to the right of an equals sign will
2000           need editing.
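
The authors-file format and the matching rule described above can be modelled
in Python. The sketch below is an illustration, not reposurgeon's parser: the
regular expressions are inferred from the two example lines, the names
parse_authors and map_address are hypothetical, and timezone handling is
reduced to carrying the field along unchanged.

```python
import re

# Line shapes assumed from the two examples above; the timezone is optional.
ENTRY = re.compile(r"^(\S+?)\s*=\s*(.*?)\s*<([^<>]+)>\s*(\S+)?$")
ALIAS = re.compile(r"^\+\s*(.*?)\s*<([^<>]+)>\s*(\S+)?$")


def parse_authors(text):
    """Return (entries, aliases) parsed from authors-file text.

    entries: local ID -> (full name, email, timezone or None)
    aliases: alias email -> (local ID of the preceding '=' entry, timezone)
    """
    entries, aliases, last_id = {}, {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no mapping
        entry = ENTRY.match(line)
        if entry:
            last_id = entry.group(1)
            entries[last_id] = entry.group(2, 3, 4)
            continue
        alias = ALIAS.match(line)
        if alias and last_id is not None:
            aliases[alias.group(2)] = (last_id, alias.group(3))
    return entries, aliases


def map_address(address, entries):
    """Return (name, email) for an address whose local ID is in entries.

    The local ID is everything before '@', or the whole address when there
    is no '@'. Unmatched addresses return None.
    """
    local_id = address.split("@", 1)[0]
    if local_id in entries:
        name, email, _zone = entries[local_id]
        return name, email
    return None
```

The same lookup covers both cases named above: an address of the form
ferd@something, as left behind by git-svn, and a bare ferd, as left behind by
git-cvsimport and cvs2git.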
2001
2002       branchify [path-set]
2003           Specify the list of directories to be treated as potential branches
2004           (to become tags if there are no modifications after the creation
2005           copies) when analyzing a Subversion repo. This list is ignored when
2006           the --nobranch read option is used. It defaults to the 'standard
2007           layout' set of directories, plus any unrecognized directories in
2008           the repository root.
2009
2010           With no arguments, displays the current branchification set.
2011
2012           An asterisk at the end of a path in the set means 'all immediate
2013           subdirectories of this path, unless they are part of another
2014           (longer) path in the branchify set'.
2015
2016           Note that the branchify set is a property of the reposurgeon
2017           interpreter, not of any individual repository, and will persist
2018           across Subversion dumpfile reads. This may lead to unexpected
2019           results if you forget to re-set it.
2020
2021       branchify_map [/regex/branch/...]
2022           Specify the list of regular expressions used for mapping the svn
2023           branches that are detected by branchify. If none of the expressions
2024           match, the default behaviour applies: a branch is mapped to the name
2025           of its last directory, except for trunk and “*”, which are mapped to
2026           master and root.
2027
2028           With no arguments the current regex replacement pairs are shown.
2029           Passing 'reset' will clear the mapping.
2030
2031           The branchify command will match each branch name against regex1
2032           and, if it matches, rewrite its branch name to branch1. If not, it
2033           will try regex2 and so forth until it either finds a matching regex
2034           or there are no regexes left. The regular expressions should be in
2035           Python's[2] format. The branch name can use backreferences (see
2036           the re.sub function in the Python documentation).
2037
2038           Note that the regular expressions are appended to 'refs/' without
2039           either the needed 'heads/' or 'tags/'. This allows for choosing the
2040           right kind of branch type.
2041
2042           While the syntax template above uses slashes, any first character
2043           will be used as a delimiter (and you will need to use a different
2044           one in the common case that the paths contain slashes).
2045
2046           You must give this command before the Subversion repository read it
2047           is supposed to affect!
2048
2049           Note that the branchify_map set is a property of the reposurgeon
2050           interpreter, not of any individual repository, and will persist
2051           across Subversion dumpfile or repository reads. This may lead to
2052           unexpected results if you forget to re-set it.
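
The first-match-wins rewriting can be illustrated with Python's re module,
which is the regex dialect the text above names. This is a sketch under
assumptions: the function names, the example patterns, and the exact form of
the branch paths being matched are hypothetical, and returning None stands in
for falling back to the default behaviour.

```python
import re


def parse_branch_map(spec):
    """Split a '/regex/branch/' argument into (compiled regex, replacement).

    The first character is taken as the delimiter, whatever it is.
    """
    delimiter = spec[0]
    parts = spec.split(delimiter)
    if len(parts) != 4 or parts[0] or parts[3]:
        raise ValueError("expected %sregex%sbranch%s" % ((delimiter,) * 3))
    return re.compile(parts[1]), parts[2]


def map_branch(path, mappings):
    """Return the rewritten ref for a branch path, or None if nothing matches."""
    for regex, branch in mappings:
        # subn reports whether a substitution happened; backreferences such
        # as \1 in the replacement are expanded as in re.sub.
        result, count = regex.subn(branch, path, count=1)
        if count:
            return "refs/" + result
    return None
```

Using ':' as the delimiter, as in ":^branches/release-(.*)$:tags/\1:", avoids
having to escape the slashes in the paths, and the replacement supplies the
heads/ or tags/ component itself.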
2053
2054   EXAMINING TREE STATES
2055       manifest [regular expression] [>outfile]
2056           Takes an optional selection set argument defaulting to all commits,
2057           and an optional Python regular expression. For each commit in the
2058           selection set, print the mapping of all paths in that commit tree
2059           to the corresponding blob marks, mirroring what files would be
2060           created in a checkout of the commit. If a regular expression is
2061           given, only print "path -> mark" lines for paths matching it. This
2062           command supports > redirection.
2063
2064       checkout directory
2065           Takes a selection set which must resolve to a single commit, and a
2066           second argument. The second argument is interpreted as a directory
2067           name. The state of the code tree at that commit is materialized
2068           beneath the directory.
2069
2070       diff [>outfile]
2071           Display the difference between commits. Takes a selection-set
2072           argument which must resolve to exactly two commits. Supports output
2073           redirection.
2074
2075   HOUSEKEEPING
2076       These are backed up by the following housekeeping commands, none of
2077       which take a selection set:
2078
2079       help
2080           Get help on the interpreter commands. Optionally follow with
2081           whitespace and a command name; with no argument, lists all
2082           commands. '?' also invokes this.
2083
2084       shell
2085           Execute the shell command given in the remainder of the line. '!'
2086           also invokes this.
2087
2088       prefer [repotype]
2089           With no arguments, describe capabilities of all supported systems.
2090           With an argument (which must be the name of a supported system)
2091           this has two effects:
2092
2093           First, if there are multiple repositories in a directory you do a
2094           read on, reposurgeon will read the preferred one (otherwise it will
2095           complain that it can't choose among them).
2096
2097           Secondly, this will change reposurgeon's preferred type for output.
2098           This means that when you do a write to a directory, it will build
2099           a repo of the preferred type rather than its original type (if it
2100           had one).
2101
2102           If no preferred type has been explicitly selected, reading in a
2103           repository (but not a fast-import stream) will implicitly set the
2104           preferred type to the type of that repository.
2105
2106           In older versions of reposurgeon this command changed the type of
2107           the selected repository, if there is one. That behavior interacted
2108           badly with attempts to interpret legacy IDs and has been removed.
2109
2110       sourcetype [repotype]
2111           Report (with no arguments) or select (with one argument) the
2112           current repository's source type. This type is normally set at
2113           repository-read time, but may remain unset if the source was a
2114           stream file.
2115
2116           The source type affects the interpretation of legacy IDs (for
2117           purposes of the =N visibility set and the 'references' command) by
2118           controlling the regular expressions used to recognize them. If no
2119           preferred output type has been set, it may also change the output
2120           format of stream files made from the repository.
2121
2122           The source type is reliably set whenever a live repository is read,
2123           or when a Subversion stream or Fossil dump is interpreted, but not
2124           necessarily for other stream files. Streams generated by
2125           cvs-fast-export(1) using the --reposurgeon option are detected as
2126           CVS. In some other cases, the source system is detected from the
2127           presence of magic $-headers in contents blobs.
2128
2129   INSTRUMENTATION
2130       A few commands have been implemented primarily for debugging and
2131       regression-testing purposes, but may be useful in unusual
2132       circumstances.
2133
2134       The output of most of these commands can individually be redirected to
2135       a named output file. Where indicated in the syntax, you can prefix the
2136       output filename with “>” and give it as a following argument.
2137
2138       index [>outfile]
2139           Display four columns of info on objects in the selection set: their
2140           number, their type, the associated mark (or '-' if no mark) and a
2141           summary field varying by type. For a branch or tag it's the
2142           reference; for a commit it's the commit branch; for a blob it's the
2143           repository path of the file in the blob.
2144
2145           The default selection set for this command is =CTRU, all objects
2146           except blobs.
2147
2148       resolve [label-text...]
2149           Does nothing but resolve a selection-set expression and echo the
2150           resulting event-number set to standard output. The remainder of the
2151           line after the command is used as a label for the output.
2152
2153           Implemented mainly for regression testing, but may be useful for
2154           exploring the selection-set language.
2155
2156       attribution selection resolve [>outfile] [label-text...]
2157           Does nothing but resolve an attribution selection-set expression
2158           for the selected events and echo the resulting attribution-number
2159           set to standard output. The remainder of the line after the command
2160           is used as a label for the output.
2161
2162           Implemented mainly for regression testing, but may be useful for
2163           exploring the selection-set language.
2164
2165       verbose [n]
2166           'verbose 1' enables the progress meter and messages, 'verbose 0'
2167           disables them. Higher levels of verbosity are available but
2168           intended for developers only.
2169
2170       quiet [on | off]
2171           Without an argument, this command requests a report of the quiet
2172           boolean; with the argument 'on' or 'off' it is changed. When quiet
2173           is on, time-varying report fields which would otherwise cause
2174           spurious failures in regression testing are suppressed.
2175
2176       relax
2177           Normally, a command error aborts the execution of an enclosing
2178           script. The relax command suppresses this behavior. It is useful
2179           when writing regression tests that exercise failure cases.
2180
2181       print output-text...
2182           Does nothing but ship its argument line to standard output. Useful
2183           in regression tests.
2184
2185       echo [number]
2186           'echo 1' causes each reposurgeon command to be echoed to standard
2187           output just before its output. This can be useful in constructing
2188           regression tests that are easily checked by eyeball.
2189
2190       version [version...]
2191           With no argument, display the program version and the list of VCSes
2192           directly supported. With argument, declare the major version
2193           (single digit) or full version (major.minor) under which the
2194           enclosing script was developed. The program will error out if the
2195           major version has changed (which means the surgical language is not
2196           backwards compatible).
2197
2198           It is good practice to start your lift script with a version
2199           requirement, especially if you are going to archive it for later
2200           reference.
2201
2202       prompt [format...]
2203           Set the command prompt format to the value of the command line;
2204           with an empty command line, display it. The prompt format is
2205           evaluated in Python after each command with the following
2206           dictionary substitutions:
2207
2208           chosen
2209               The name of the selected repository, or None if none is
2210               currently selected.
2211
2212           Thus, one useful format might be 'rs[%(chosen)s]%% '.
2213
2214           More format items may be added in the future. The default prompt
2215           corresponds to the format 'reposurgeon%% '. The format line is
2216           evaluated with shell quoting of tokens, so that spaces can be
2217           included.
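
The format strings above follow Python's %-style dictionary substitution,
where %% stands for a literal percent sign. A minimal sketch (the function
name render_prompt and the repository name groff are made up for
illustration):

```python
def render_prompt(fmt, chosen):
    """Render a prompt format using Python %-style dictionary substitution."""
    return fmt % {"chosen": chosen}


selected = render_prompt("rs[%(chosen)s]%% ", "groff")
default = render_prompt("reposurgeon%% ", None)
```

Here selected is "rs[groff]% " and default is "reposurgeon% ".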
2218
2219       history
2220           List the commands you have entered this session.
2221
2222       legacy [read|write] [<filename] [>filename]
2223           Apply or list legacy-reference information. Does not take a
2224           selection set. The 'read' variant reads from standard input or a
2225           <-redirected filename; the 'write' variant writes to standard
2226           output or a >-redirected filename.
2227
2228           A legacy-reference file maps reference cookies to (committer,
2229           commit-date, sequence-number) triples; these in turn (should)
2230           uniquely identify a commit. The format is two whitespace-separated
2231           fields: the cookie followed by an action stamp identifying the
2232           commit.
2233
2234           It should not normally be necessary to use this command. The legacy
2235           map is automatically preserved through repository reads and
2236           rebuilds, being stored in the file legacy-map under the repository
2237           subdirectory.
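
The two-field layout of a legacy-reference file is simple to read and write.
The following Python sketch assumes one mapping per line and a tab between the
fields on output; whether cookies are stored with or without their double
brackets is not stated above, so the sample keys used with it are
hypothetical, as are the function names.

```python
def parse_legacy_map(text):
    """Parse legacy-reference text into a dict of cookie -> action stamp."""
    mapping = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split from the right: an action stamp contains no whitespace, so
        # the last field is always the stamp. A line with only one field
        # raises ValueError.
        cookie, stamp = line.rsplit(None, 1)
        mapping[cookie] = stamp
    return mapping


def dump_legacy_map(mapping):
    """Serialize a cookie -> stamp dict, one sorted entry per line."""
    return "".join("%s\t%s\n" % item for item in sorted(mapping.items()))
```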
2238
2239       set [option]
2240           Turn on an option flag. With no arguments, list all options.
2241
2242           Most options are described in conjunction with the specific
2243           operations that they modify. One of general interest is
2244           “compressblobs”; this enables compression on the blob files in the
2245           internal representation reposurgeon uses for editing repositories.
2246           With this option, reading and writing of repositories is slower,
2247           but editing a repository requires less (sometimes much less) disk
2248           space.
2249
2250       clear [option]
2251           Turn off an option flag. With no arguments, list all options.
2252
2253       profile
2254           Enable profiling. Profile statistics are dumped to the path given
2255           as argument. Must be one of the initial command-line arguments, and
2256           gathers statistics only on code executed via '-'.
2257
2258       timing
2259           Display statistics on phase timing and memory usage in repository
2260           analysis. Mainly of interest to developers trying to speed up the
2261           program.
2262
2263       exit
2264           Exit, reporting the time. Included here because, while EOT will
2265           also cleanly exit the interpreter, this command reports elapsed
2266           time since start.
2267

WORKING WITH MERCURIAL

2269       reposurgeon uses a built-in extractor class to perform extractions from
2270       Mercurial repositories.
2271
2272       Mercurial branches are exported as branches in the exported repository
2273       and tags are exported as tags. By default, bookmarks are ignored. You
2274       can specify explicit handling for bookmarks by setting
2275       reposurgeon.bookmarks in your .hg/hgrc. Set the value to the prefix
2276       that reposurgeon should use for bookmarks.
2277
2278       For example, if your bookmarks represent branches, put this at the
2279       bottom of your .hg/hgrc:
2280
2281           [reposurgeon]
2282           bookmarks=heads/
2283
2284       If you do that, it's your responsibility to ensure that branch names do
2285       not conflict with bookmark names. You can add a prefix like
2286       bookmarks=heads/feature- to disambiguate as necessary.
2287

WORKING WITH SUBVERSION

2289       reposurgeon can read Subversion dumpfiles or edit a Subversion
2290       repository (and you must point it at a repository, not a checkout
2291       directory).
2292
2293   READING SUBVERSION REPOSITORIES
2294       Certain optional modifiers on the read command change its behavior when
2295       reading Subversion repositories:
2296
2297       --nobranch
2298           Suppress branch analysis.
2299
2300       --preserve
2301           Never discard metadata. In particular, preserve branch-creation
2302           commits (and their metadata) in full rather than turning commits
2303           for empty branches into bare gitspace resets. Also, preserve
2304           branches and tags with following tip deletes rather than nuking
2305           them; the tip deletes become tags.
2306
2307       --ignore-properties
2308           Suppress read-time warnings about discarded property settings.
2309
2310       --user-ignores
2311           Don't generate .gitignore files from svn:ignore properties.
2312           Instead, just pass through .gitignore files found in the history.
2313
2314       --use-uuid
2315           If the --use-uuid read option is set, the repository's UUID will be
2316           used as the hostname when faking up email addresses, a la git-svn.
2317           Otherwise, addresses will be generated the way git cvs-import does
2318           it, simply copying the username into the address field.
2319
2320       --noignores
2321           Do not fill in an equivalent of default Subversion ignore patterns.
2322
2323       These modifiers can go anywhere in any order on the read command line
2324       after the read verb. They must be whitespace-separated.
2325
2326       It is also possible to embed a magic comment in a Subversion stream
2327       file to set these options. Prefix a space-separated list of them with
2328       the magic comment " # reposurgeon-read-options:"; the leading space is
2329       required. This may be useful when synthesizing test loads; in
2330       particular, a stream file that does not set up a standard
2331       trunk/branches/tags directory layout can use this to perform a mapping
2332       of all commits onto the master branch that the git importer will
2333       accept.
2334
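           As an illustration of the magic-comment convention, the following
           sketch collects options from any line that begins with the magic
           prefix. Scanning every line of the stream is an assumption made
           for simplicity, and the function name is hypothetical.

```python
# Sketch: pick read options out of a stream file's magic comment.
# The leading space in MAGIC is required, as the text above says.
MAGIC = " # reposurgeon-read-options:"


def read_options_from_stream(lines):
    """Return the space-separated options that follow the magic comment."""
    options = []
    for line in lines:
        if line.startswith(MAGIC):
            options.extend(line[len(MAGIC):].split())
    return options


stream = [
    " # reposurgeon-read-options: --nobranch --ignore-properties\n",
    "SVN-fs-dump-format-version: 2\n",
]
print(read_options_from_stream(stream))
```
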
2335       Here are the rules used for mapping subdirectories in a Subversion
2336       repository to branches:
2337
2338        1. At any given time there is a set of eligible paths and path
2339           wildcards which declare potential branches. See the documentation
2340           of the branchify command for how to alter this set, which initially
2341           consists of {trunk, tags/*, branches/*, and '*'}.
2342
2343        2. A repository is considered "flat" if it has no directory that
2344           matches a path or path wildcard in the branchify set. All commits
2345           in a flat repository are assigned to branch master, and what would
2346           have been branch structure becomes directory structure. In this
2347           case, we're done; all the other rules apply to non-flat repos.
2348
2349           If you give the option --nobranch when reading a Subversion
2350           repository, branch analysis is skipped and the repository is
2351           treated as though flat (left as a linear sequence of commits on
2352           refs/heads/master). This may be useful if your repository
2353           configuration is highly unusual and you need to do your own branch
2354           surgery. Note that this option will disable partitioning of mixed
2355           commits.
2356
2357        3. If "trunk" is eligible, it always becomes the master branch.
2358
2359        4. If an element of the branchify set ends with *, each immediate
2360           subdirectory of it is considered a potential branch. If '*' is in
2361           the branchify set (which is true by default) all top-level
2362           directories other than /trunk, /tags, and /branches are also
2363           considered potential branches.
2364
2365        5. Files in the top-level directory are assigned to a synthetic branch
2366           named 'root'.
2367
2368        6. Each potential branch is checked to see if it has commits on it
2369           after the initial creation or copy. If there are such commits, it
2370           becomes a branch. If not, it may become a tag in order to preserve
2371           the commit metadata (see the description of the --preserve option
2372           below). In all cases, the name of any created tag or branch is the
2373           basename of the directory.
2374
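           Rules 1 through 5 can be approximated in a few lines. This is a
           simplified sketch under stated assumptions, not reposurgeon's
           actual analysis: it looks only at file paths, ignores the flat
           and --nobranch cases, and does not perform the rule 6 check for
           later commits. The function names are hypothetical.

```python
# Sketch: map a Subversion file path to a potential branch directory.
# Simplified; rule 6 (commits after creation) is not modeled.
BRANCHIFY = ["trunk", "tags/*", "branches/*", "*"]
RESERVED = {"trunk", "tags", "branches"}


def potential_branch(path, branchify=BRANCHIFY):
    """Return the candidate branch directory for a file path, or None."""
    parts = path.strip("/").split("/")
    if len(parts) == 1:
        return "root"  # rule 5: file in the top-level directory
    for pattern in branchify:
        if pattern == "*":
            continue  # top-level wildcard is handled last
        if pattern.endswith("/*"):
            prefix = pattern[:-2].split("/")
            n = len(prefix)
            if parts[:n] == prefix and len(parts) > n + 1:
                return "/".join(parts[:n + 1])  # rule 4: immediate subdirectory
        else:
            exact = pattern.split("/")
            if parts[:len(exact)] == exact:
                return pattern  # exact entry such as "trunk"
    if "*" in branchify and parts[0] not in RESERVED:
        return parts[0]  # rule 4: other top-level directories
    return None


def git_branch_name(branch_dir):
    """Rule 3: trunk becomes master; otherwise use the directory basename."""
    return "master" if branch_dir == "trunk" else branch_dir.rsplit("/", 1)[-1]


for p in ["trunk/src/main.c", "branches/stable/README", "vendor/lib/x.c", "README"]:
    print(p, "->", potential_branch(p))
```
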
2375       Branch-creation operations with no following commits are treated
2376       differently depending on whether or not the --preserve option is on. If
2377       it is off (the default) the branch creation becomes an empty gitspace
2378       branch represented by a reset operation; any comment on the commit is
2379       issued with a warning. If --preserve is on, the comment metadata is
2380       preserved in an empty commit attached to the branchpoint.
2381
2382       Otherwise, each commit that only creates or deletes directories (in
2383       particular, copy commits for tags and branches, and commits that only
2384       change properties) will be transformed into a tag named after the tag
2385       or branch, containing the date/author/comment metadata from the commit.
2386
2387       Subversion branch deletions are turned into deletealls, clearing the
2388       fileset of the import-stream branch. When a branch finishes with a
2389       deleteall at its tip, the deleteall is transformed into a tag. This
2390       rule cleans up after aborted branch renames.
2391
2392       Occasionally (and usually by mistake) a branchy Subversion repository
2393       will contain revisions that touch multiple branches. These are handled
2394       by partitioning them into multiple import-stream commits, one on each
2395       affected branch. The Legacy-ID of such a split commit will have a
2396       pseudo-decimal part - for example, if Subversion revision 2317 touches
2397       three branches, the three generated commits will have IDs 2317.1,
2398       2317.2, and 2317.3.
2399
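           The numbering scheme can be sketched as follows. Which branch
           receives which suffix is not specified above, so the ordering
           used here (the order of the input list) is an assumption, as are
           the function and branch names.

```python
# Sketch: pseudo-decimal legacy IDs for a revision split across branches.
# The assignment order of suffixes to branches is an assumption.
def split_legacy_ids(revision, branches):
    """Return {branch: 'REV.N'} with N counting from 1."""
    return {
        branch: "%d.%d" % (revision, index)
        for index, branch in enumerate(branches, start=1)
    }


print(split_legacy_ids(2317, ["trunk", "branches/a", "branches/b"]))
```
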
2400       The svn:executable and svn:special properties are translated into
2401       permission settings in the input stream; svn:executable becomes 100755
2402       and svn:special becomes 120000 (indicating a symlink; the blob contents
2403       will be the path to which the symlink should resolve).
2404
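           A sketch of that property-to-mode translation follows. The
           default of 100644 for ordinary files and the precedence of
           svn:special over svn:executable are assumptions not stated
           above; the function name is hypothetical.

```python
# Sketch: translate Subversion node properties into an import-stream mode.
# Default mode and precedence are assumptions, not documented behavior.
def stream_mode(props):
    """Return the file mode string for a dict of Subversion properties."""
    if "svn:special" in props:
        return "120000"  # symlink; blob content is the link target
    if "svn:executable" in props:
        return "100755"
    return "100644"  # assumed default for ordinary files


print(stream_mode({"svn:executable": "*"}))
print(stream_mode({"svn:special": "*"}))
```
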
2405       Any cvs2svn:rev properties generated by cvs2svn are incorporated into
2406       the internal map used for reference-lifting, then discarded.
2407
2408       Normally, per-directory svn:ignore properties become .gitignore files.
2409       Actual .gitignore files in a Subversion directory are presumed to have
2410       been created by git-svn users separately from native Subversion ignore
2411       properties and discarded with a warning. It is up to the user to merge
2412       the content of such files into the target repository by hand. But this
2413       behavior is inverted by the --user-ignores option; if that is on,
2414       .gitignore files are passed through and Subversion svn:ignore
2415       properties are discarded.
2416
2417       (Regardless of the setting of the --user-ignores option, .cvsignore
2418       files found in Subversion repositories always become .gitignores in the
2419       translation. The assumption is that these date from before a CVS-to-SVN
2420       lift and should be preserved to affect behavior when browsing that
2421       section of the repository.)
2422
2423       svn:mergeinfo properties are interpreted. Any svn:mergeinfo property on
2424       a revision A with a merge source range ending in revision B produces a
2425       merge link such that B becomes a parent of A.
2426
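           To illustrate the rule, the sketch below extracts the ending
           revision of each merge source range from an svn:mergeinfo value.
           It assumes the usual "path:ranges" layout with comma-separated
           ranges and an optional trailing '*'; how reposurgeon itself
           selects among ranges is not described here, and the function
           name is hypothetical.

```python
# Sketch: find the ending revision of each merge source range in an
# svn:mergeinfo property value. Each returned revision B would become
# a parent of the revision A that carries the property.
def merge_source_ends(mergeinfo):
    """Return a list of (source_path, ending_revision) pairs."""
    ends = []
    for line in mergeinfo.splitlines():
        if not line.strip():
            continue
        path, _, ranges = line.rpartition(":")
        for item in ranges.split(","):
            item = item.strip().rstrip("*")  # '*' marks a non-inheritable range
            ends.append((path, int(item.split("-")[-1])))
    return ends


print(merge_source_ends("/branches/feature:120-135\n/trunk:90"))
```
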
2427       All other Subversion properties are discarded. (This may change in a
2428       future release.) The property for which this is most likely to cause
2429       semantic problems is svn:eol-style. However, since property-change-only
2430       commits get turned into annotated tags, the translated tags will retain
2431       information about setting changes.
2432
2433       The sub-second resolution on Subversion commit dates is discarded; Git
2434       wants integer timestamps only.
2435
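           The truncation amounts to dropping the fractional part of the
           Subversion date before converting to an integer timestamp. A
           sketch, assuming the usual Subversion date layout with six
           fractional digits and a trailing Z; the function name is
           hypothetical:

```python
# Sketch: reduce a Subversion commit date to whole seconds since the epoch.
import calendar
from datetime import datetime


def svn_date_to_epoch(stamp):
    """Parse an svn:date value (UTC) and discard the sub-second part."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    # timetuple() carries no microseconds; timegm() treats the tuple as UTC.
    return calendar.timegm(parsed.timetuple())


print(svn_date_to_epoch("2010-11-07T16:33:54.123456Z"))  # 1289147634
```
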
2436       Because fast-import format cannot represent an empty directory, empty
2437       directories in Subversion repositories will be lost in translation.
2438
2439       Normally, Subversion local usernames are mapped in the style of git
2440       cvs-import; thus user "foo" becomes "foo <foo>", which is sufficient to
2441       pacify git and other systems that require email addresses. With the
2442       --use-uuid read option, usernames are mapped in the git-svn style, with
2443       the repository's UUID used as a fake domain in the email address. Both
2444       forms can be remapped to real addresses using the authors read command.
2445
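           The two mapping styles can be sketched like this. The UUID shown
           is a made-up value and the function name is hypothetical; only
           the resulting shapes follow the description above.

```python
# Sketch: fake up an attribution from a Subversion local username.
# The UUID below is an invented example value.
def fake_attribution(username, uuid=None):
    """Return 'name <address>' in cvs-import style, or git-svn style with a UUID."""
    if uuid is None:
        return "%s <%s>" % (username, username)
    return "%s <%s@%s>" % (username, username, uuid)


print(fake_attribution("foo"))
print(fake_attribution("foo", "3a1c2d4e-0000-0000-0000-000000000000"))
```
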
2446       Reading a Subversion stream enables writing of the legacy map as
2447       'legacy' passthroughs when the repo is written to a stream file.
2448
2449       reposurgeon tries hard to silently do the right thing, but there are
2450       Subversion edge cases in which it emits warnings because a human may
2451       need to intervene and perform fixups by hand. Here are the less obvious
2452       messages it may emit:
2453
2454       user-generated .gitignore
2455           This message means reposurgeon has found a .gitignore file in
2456           the Subversion repository it is analyzing. This probably happened
2457           because somebody was using git-svn as a live gateway, and created
2458           ignores which may or may not be congruent with those in the
2459           generated .gitignore files that the Subversion ignore properties
2460           will be translated into. You'll need to make a policy decision
2461           about which set of ignores to use in the conversion, and possibly
2462           set the --user-ignores option on read to pass through user-created
2463           .gitignore files; in that case this warning will not be emitted.
2464
2465       can't connect nonempty branch XXXX to origin
2466           This is a serious error.  reposurgeon has been unable to find a
2467           link from a specified branch to the trunk (master) branch. The
2468           commit graph will not be fully connected and will need manual
2469           repair.
2470
2471       permission information may be lost
2472           A Subversion node change on a file sets or clears properties, but
2473           no ancestor can be found for this file. Executable or symlink
2474           position may be set wrongly on later revisions of this file.
2475           Subversion user-defined properties may also be scrambled or lost.
2476           Usually this error can be ignored.
2477
2478       properties set
2479           reposurgeon has detected a setting of a user-defined property, or
2480           the Subversion property svn:externals. These properties cannot be
2481           expressed in an import stream; the user is notified in case this is
2482           a showstopper for the conversion or some corrective action is
2483           required, but normally this error can be ignored. This warning is
2484           suppressed by the --ignore-properties option.
2485
2486       branch links detected by file ops only
2487           Branch links are normally deduced by examining Subversion directory
2488           copy operations. A common user error (making a branch with a
2489           non-Subversion directory copy and then doing an svn add on the
2490           contents) can defeat this. While reposurgeon should detect and cope
2491           with most such copies correctly, you should examine the commit
2492           graph to check that the branch is rooted at the correct place.
2493
2494       could not tagify root commit
2495           The earliest commit in your Subversion repository has file
2496           operations, rather than being a pure directory creation. This
2497           probably means your Subversion dump file is malformed, or you may
2498           have attempted to lift from an incremental dump. Proceed with
2499           caution.
2500
2501       deleting parentless tip delete
2502           This message may be triggered by a Subversion branch move followed
2503           by a re-creation under the source name. Check near the indicated
2504           revision to make sure the renamed branch is connected to master.
2505
2506       mid-branch deleteall
2507           A deleteall operation has been found in the middle of a branch
2508           history. This usually indicates that a Subversion tag or branch was
2509           created by mistake, and someone later tried to undo the error by
2510           deleting the tag/branch directory before recreating it with a copy
2511           operation. Examine the topology near the deleteall closely; it may
2512           need hand-hacking. It is fairly likely that both (a) the
2513           reposurgeon translation will be different from what other
2514           translators (such as git-svn) produce, and (b) it will not be
2515           immediately obvious which is right.
2516
2517       lookback for XXX failed, not making branch link
2518           Branch analysis failed, probably due to a set of file copies that
2519           reposurgeon thought it should interpret as a botched branch
2520           creation but couldn't deduce a history for. This is a warning;
2521           check how the directory XXX is converted; it may need post-editing
2522           into a branch.
2523
2524   WRITING SUBVERSION REPOSITORIES
2525       reposurgeon has support for writing Subversion repositories. Due to
2526       mismatches between the ontology of Subversion and that of git import
2527       streams, this support has some significant limitations and bugs.
2528
2529       In summary, Subversion repository histories do not round-trip through
2530       reposurgeon editing. File content changes are preserved but some
2531       metadata is unavoidably lost. Furthermore, writing out a DVCS history
2532       in Subversion also loses significant portions of its metadata. Details
2533       follow.
2534
2535       Writing a Subversion repository or dump stream discards author
2536       information, the committer's name, and the hostname part of the commit
2537       address; only the commit timestamp and the local part of the
2538       committer's email address are preserved, the latter becoming the
2539       Subversion author field. However, reading a Subversion repository and
2540       writing it out again will preserve the author fields.
2541
2542       Import-stream timestamps have 1-second granularity. The sub-second
2543       parts of Subversion commit timestamps will be lost on their way through
2544       reposurgeon.
2545
2546       Empty directories aren't represented in import streams. Consequently,
2547       reading and writing Subversion repositories preserves file content, but
2548       not empty directories. It is also not guaranteed that, after editing a
2549       Subversion repository, the sequence of directory creations and
2550       deletions relative to other operations will be identical; the only
2551       guarantee is that enclosing directories will be created before any
2552       files in them are.
2553
2554       When reading a Subversion repository, reposurgeon discards the special
2555       directory-copy nodes associated with branch creations. These can't be
2556       recreated if and when the repository is written back out to Subversion;
2557       rather, each branch copy node from the original translates into a
2558       branch creation plus the first set of file modifications on the branch.
2559
2560       When reading a Subversion repository, reposurgeon also automatically
2561       breaks apart mixed-branch commits. These are not re-united if the
2562       repository is written back out.
2563
2564       When writing to a Subversion repository, all lightweight tags become
2565       Subversion tag copies with empty log comments, named for the tag
2566       basename. The committer name and timestamp are copied from the commit
2567       the tag points to. The distinction between heads and tags is lost.
2568
2569       Because of the preceding two points, it is not guaranteed that even
2570       revision numbers will be stable when a Subversion repository is read in
2571       and then written out!
2572
2573       Subversion repositories are always written with a standard
2574       (trunk/tags/branches) layout. Thus, a repository with a nonstandard
2575       shape that has been analyzed by reposurgeon won't be written out with
2576       the same shape.
2577
2578       When writing a Subversion repository, branch merges are translated into
2579       svn:mergeinfo properties in the simplest possible way - as an
2580       svn:mergeinfo property of the translated merge commit listing the merge
2581       source revisions.
2582
2583       Subversion has a concept of "flows"; that is, named segments of history
2584       corresponding to files or directories that are created when the path is
2585       added, cloned when the path is copied, and deleted when the path is
2586       deleted. This information is not preserved in import streams or the
2587       internal representation that reposurgeon uses. Thus, after editing, the
2588       flow boundaries of a Subversion history may be arbitrarily changed.
2589

IGNORE PATTERNS

2591       reposurgeon recognizes how supported VCSes represent file ignores (CVS
2592       .cvsignore files lurking untranslated in older Subversion repositories,
2593       Subversion ignore properties, .gitignore/.hgignore/.bzrignore files in
2594       other systems) and moves ignore declarations among these containers on
2595       repo input and output. This will be sufficient if the ignore patterns
2596       are exact filenames.
2597
2598       Translation may not, however, be perfect when the ignore patterns are
2599       Unix glob patterns or regular expressions. This compatibility table
2600       describes which patterns will translate; “plain” indicates a plain
2601       filename with no glob or regexp syntax or negation.
2602
2603       RCS has no ignore files or patterns and is therefore not included in
2604       the table.
2605
2606┌─────────────┬───────────────┬──────────────┬───────────────────┬───────────────────┬─────────────────────┬──────────────┬────────────┬────────────┐
2607│             │   from CVS    │   from svn   │     from git      │     from hg       │      from bzr       │     from     │  from SRC  │  from bk   │
2608│             │               │              │                   │                   │                     │    darcs     │            │            │
2609├─────────────┼───────────────┼──────────────┼───────────────────┼───────────────────┼─────────────────────┼──────────────┼────────────┼────────────┤
2610│             │               │              │                   │                   │                     │              │            │            │
2611│        to   │         all   │         all  │        all        │           all     │         all         │        plain │        all │        all │
2612│        CVS  │               │              │        except     │                   │         except      │              │            │            │
2613│             │               │              │        !-prefixed │                   │         RE:-        │              │            │            │
2614│             │               │              │        but        │                   │         and         │              │            │            │
2615│             │               │              │        nonempty   │                   │         !-prefixed  │              │            │            │
2616├─────────────┼───────────────┼──────────────┼───────────────────┼───────────────────┼─────────────────────┼──────────────┼────────────┼────────────┤
2617│             │  all except   │              │                   │                   │                     │              │            │            │
2618│        to   │  !            │         all  │        all except │           all     │         all except  │        plain │        all │        all │
2619│        svn  │               │              │        !-prefixed │                   │         RE:- and    │              │            │            │
2620│             │               │              │                   │                   │         !-prefixed  │              │            │            │
2621├─────────────┼───────────────┼──────────────┼───────────────────┼───────────────────┼─────────────────────┼──────────────┼────────────┼────────────┤
2622│             │               │              │                   │                   │                     │              │            │            │
2623│        to   │         all   │         all  │           all     │        all        │        all except   │        plain │        all │        all │
2624│        git  │               │              │                   │        except     │        RE:-prefixed │              │            │            │
2625│             │               │              │                   │        !-prefixed │                     │              │            │            │
2626├─────────────┼───────────────┼──────────────┼───────────────────┼───────────────────┼─────────────────────┼──────────────┼────────────┼────────────┤
2627│             │               │              │                   │                   │                     │              │            │            │
2628│        to   │        all    │         all  │        all except │           all     │         all except  │        plain │        all │        all │
2629│        hg   │        except │              │        !-prefixed │                   │         RE:- and    │              │            │            │
2630│             │        !      │              │                   │                   │         !-prefixed  │              │            │            │
2631├─────────────┼───────────────┼──────────────┼───────────────────┼───────────────────┼─────────────────────┼──────────────┼────────────┼────────────┤
2632│             │               │              │                   │                   │                     │              │            │            │
2633│        to   │         all   │         all  │           all     │           all     │            all      │        plain │        all │        all │
2634│        bzr  │               │              │                   │                   │                     │              │            │            │
2635├─────────────┼───────────────┼──────────────┼───────────────────┼───────────────────┼─────────────────────┼──────────────┼────────────┼────────────┤
2636│             │               │              │                   │                   │                     │              │            │            │
2637│       to    │        plain  │        plain │          plain    │          plain    │           plain     │         all  │        all │        all │
2638│       darcs │               │              │                   │                   │                     │              │            │            │
2639├─────────────┼───────────────┼──────────────┼───────────────────┼───────────────────┼─────────────────────┼──────────────┼────────────┼────────────┤
2640│             │               │              │                   │                   │                     │              │            │            │
2641│        to   │        all    │         all  │        all except │           all     │         all except  │        plain │        all │        all │
2642│        SRC  │        except │              │        !-prefixed │                   │         RE:- and    │              │            │            │
2643│             │        !      │              │                   │                   │         !-prefixed  │              │            │            │
2644└─────────────┴───────────────┴──────────────┴───────────────────┴───────────────────┴─────────────────────┴──────────────┴────────────┴────────────┘
2645
2646       The hg row and column of the table describe compatibility with hg's
2647       glob syntax rather than its default regular-expression syntax. When
2648       writing to an hg repository from any other kind, reposurgeon prepends
2649       to the output .hgignore a "syntax: glob" line.
2650

TRANSLATION STYLE

2652       After converting a CVS, SVN, or BitKeeper repository, check for and
2653       remove $-cookies in the head revision(s) of the files. The full
2654       Subversion set is $Date:, $Revision:, $Author:, $HeadURL:, and $Id:. CVS
2655       uses $Author:, $Date:, $Header:, $Id:, $Log:, $Revision:, also (rarely)
2656       $Locker:, $Name:, $RCSfile:, $Source:, and $State:.
2657
2658       When you need to specify a commit, use the action-stamp format that
2659       references lift generates when it can resolve an SVN or CVS reference
2660       in a comment. It is best that you not vary from this format, even in
2661       trivial ways like omitting the 'Z' or changing the 'T' or '!' or ':'.
2662       Making action stamps uniform and machine-parseable will have good
2663       consequences for future repository-browsing tools.
2664
2665       Sometimes, in converting a repository, you may need to insert an
2666       explanatory comment - for example, if metadata is garbled or
2667       missing and you need to point to that fact. It's helpful for
2668       repository-browsing tools if there is a uniform syntax for this that is
2669       highly unlikely to show up in repository comments. We recommend
2670       enclosing translation notes in [[ ]]. This has the advantage of being
2671       visually similar to the [ ] traditionally used for editorial comments
2672       in text.
2673
2674       It is good practice to include, in the comment for the root commit of
2675       the repository, a note dating and attributing the conversion work and
2676       explaining these conventions. Example:
2677
2678       [[This repository was converted from Subversion to git on 2011-10-24 by
2679       Eric S. Raymond <esr@thyrsus.com>. Here and elsewhere, conversion notes
2680       are enclosed in double square brackets. Junk commits generated by
2681       cvs2svn have been removed, commit references have been mapped into a
2682       uniform VCS-independent syntax, and some comments edited into
2683       summary-plus-continuation form.]]
2684
2685       It is also good practice to include a generated tag at the point of
2686       conversion. E.g.:
2687
2688           msgin --create <<EOF
2689           Tag-Name: git-conversion
2690
2691           Marks the spot at which this repository was converted from Subversion to git.
2692           EOF
2693

ADVANCED EXAMPLES

2695           define lastchange {
2696           @max(=B & [/ChangeLog/] & /{0}/B)? list
2697           }
2698
2699       List the last commit that refers to a ChangeLog file containing a
2700       specified string. (The trick here is that ? extends the singleton set
2701       consisting of the last eligible ChangeLog blob to its set of referring
2702       commits, and list only notices the commits.)
2703

STREAM SYNTAX EXTENSIONS

2705       The event-stream parser in “reposurgeon” supports some extended syntax.
2706       Exporters designed to work with “reposurgeon” may have a --reposurgeon
2707       option that enables emission of extended syntax; notably, this is true
2708       of cvs-fast-export(1). The remainder of this section describes these
2709       syntax extensions. The properties they set are (usually) preserved and
2710       re-output when the stream file is written.
2711
2712       The token “#reposurgeon” at the start of a comment line in a
2713       fast-import stream signals reposurgeon that the remainder is an
2714       extension command to be interpreted by “reposurgeon”.
2715
2716       One such extension command is implemented: #sourcetype, which behaves
2717       identically to the reposurgeon sourcetype command. An exporter for a
2718       version-control system named “frobozz” could, for example, say
2719
2720           #reposurgeon sourcetype frobozz
2721
2722       Within a commit, a magic comment of the form “#legacy-id” declares a
2723       legacy ID from the stream file's source version-control system.
2724
2725       Also accepted is the bzr syntax for setting per-commit properties.
2726       While parsing commit syntax, a line beginning with the token “property”
2727       must continue with a whitespace-separated property-name token. If it is
2728       then followed by a newline it is taken to set that boolean-valued
2729       property to true. Otherwise it must be followed by a numeric token
2730       specifying a data length, a space, following data (which may contain
2731       newlines) and a terminating newline. For example:
2732
2733           commit refs/heads/master
2734           mark :1
2735           committer Eric S. Raymond <esr@thyrsus.com> 1289147634 -0500
2736           data 16
2737           Example commit.
2738
2739           property legacy-id 2 r1
2740           M 644 inline README
2741
2742       Unlike other extensions, bzr properties are only preserved on stream
2743       output if the preferred type is bzr, because any importer other than
2744       bzr's will choke on them.
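
       The property syntax described above can be read with a short routine
       like the one below. It is a sketch under stated assumptions, not the
       parser reposurgeon uses: the name read_property is hypothetical, the
       data length is assumed to be counted in bytes, and names and values
       are assumed to be UTF-8.

```python
import io


def read_property(stream):
    """Parse one 'property' line from a binary stream positioned at its start.

    Returns (name, True) for a boolean property, else (name, value).
    Assumption: the length token counts bytes, and text is UTF-8.
    """
    line = stream.readline()
    if not line.startswith(b"property "):
        raise ValueError("not a property line")
    body = line[len(b"property "):]
    if b" " not in body:
        # Name followed directly by a newline: a boolean property set to true.
        return body.rstrip(b"\n").decode("utf-8"), True
    name, rest = body.split(b" ", 1)
    length_token, data = rest.split(b" ", 1)
    length = int(length_token.decode("ascii"))
    # The data may contain newlines, so readline() may have stopped early;
    # keep reading until the data plus its terminating newline are in hand.
    while len(data) < length + 1:
        chunk = stream.read(length + 1 - len(data))
        if not chunk:
            raise ValueError("truncated property data")
        data += chunk
    if len(data) != length + 1 or data[length:] != b"\n":
        raise ValueError("property data does not match its declared length")
    return name.decode("utf-8"), data[:length].decode("utf-8")


stream = io.BytesIO(b"property legacy-id 2 r1\nM 644 inline README\n")
print(read_property(stream))
```

       After the call, the stream is positioned at the start of the next
       line, so ordinary commit parsing can resume from there.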
2745

INCOMPATIBLE LANGUAGE CHANGES

2747       In versions before 3.23, “prefer” changed the repository type as well
2748       as the preferred output format.
2749
2750       In versions before 3.0, the general command syntax put the command verb
2751       first, then the selection set (if any) then modifiers (VSO). It has
2752       changed to optional selection set first, then command verb, then
2753       modifiers (SVO). The change made parsing simpler, allowed abolishing
2754       some noise keywords, and recapitulates a successful design pattern in
2755       some other Unix tools - notably sed(1).
2756
2757       In versions before 3.0, path expressions matched only commits, rather
2758       than commits together with their associated blobs. The names of the “a”
2759       and “c” flags were different.
2760
2761       In reposurgeon versions before 3.0, the delete command had the
2762       semantics of squash; also, the policy flags did not require a “--”
2763       prefix. The “--delete” flag was named “obliterate”.
2764
2765       In reposurgeon versions before 3.0, read and write optionally took file
2766       arguments rather than requiring redirects (and the write command never
2767       wrote into directories). This was changed in order to allow these
2768       commands to have modifiers. These modifiers replaced several global
2769       options that no longer exist.
2770
2771       In reposurgeon versions before 3.0, the earliest factor in a unite
2772       command always kept its tag and branch names unaltered. The new rule
2773       for resolving name conflicts, giving priority to the latest factor,
2774       produces more natural behavior when uniting two repositories end to
2775       end; the master branch of the second (later) one keeps its name.
2776
2777       In reposurgeon versions before 3.0, the tagify command expected
2778       policies as trailing arguments to alter its behaviour. The new syntax
2779       uses similarly named options with leading dashes, which can appear
2780       anywhere after the tagify command.
2781
2782       In versions before 2.9, the syntax of "authors", "legacy", "list", and
2783       what are now "msg{in|out}" was different (and "legacy" was "fossils").
2784       They took plain filename arguments rather than using redirects < and >.
2785
2786       In versions before 4.0, msgin and msgout were named mailbox_in and
2787       mailbox_out.
2788

LIMITATIONS AND GUARANTEES

2790       Guarantee: In DVCses that use commit hashes, editing with reposurgeon
2791       never changes the hash of a commit object unless (a) you edit the
2792       commit, or (b) it is a descendant of an edited commit in a VCS that
2793       includes parent hashes in the input of a child object's hash (git and
2794       hg both do this).
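
       Why editing one commit changes the hashes of its descendants can be
       seen in a toy model. The sketch below is illustrative only: commit_id
       is a hypothetical function, and the bytes it hashes are not the real
       object format of git or hg. It only shares the property that a
       child's hash input includes its parents' hashes.

```python
import hashlib


def commit_id(message, parent_ids=()):
    """Toy commit hash: covers the parents' hashes, then the commit message."""
    h = hashlib.sha1()
    for parent in parent_ids:
        h.update(b"parent " + parent.encode("ascii") + b"\n")
    h.update(message.encode("utf-8"))
    return h.hexdigest()


root = commit_id("first commit")
child = commit_id("second commit", [root])

# Editing the root's comment changes its hash, and therefore the child's,
# even though the child's own message is untouched.
edited_root = commit_id("first commit, reworded")
rehashed_child = commit_id("second commit", [edited_root])
print(child != rehashed_child)
```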
2795
2796       Guarantee: reposurgeon only requires main memory proportional to the
2797       size of a repository's metadata history, not its entire content
2798       history. (Exception: the data from inline content is held in memory.)
2799
2800       Guarantee: In the worst case, reposurgeon makes its own copy of every
2801       content blob in the repository's history and thus uses intermediate
2802       disk space approximately equal to the size of a repository's content
2803       history. However, when the repository to be edited is presented as a
2804       stream file, reposurgeon requires no or only very little extra disk
2805       space to represent it; the internal representation of content blobs is
2806       a (seek-offset, length) pair pointing into the stream file.
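
       The (seek-offset, length) representation can be illustrated with a
       small indexer over a stream file. This is a sketch, not reposurgeon's
       code: index_blobs and blob_content are hypothetical names, and only
       the counted-byte form of the data command is handled (the delimited
       form is ignored as a simplifying assumption).

```python
import io


def index_blobs(stream):
    """Map each blob's mark to an (offset, length) pair in a seekable stream."""
    index = {}
    mark = None
    in_blob = False
    while True:
        line = stream.readline()
        if not line:
            break
        if line == b"blob\n":
            in_blob, mark = True, None
        elif in_blob and line.startswith(b"mark "):
            mark = line[5:].strip().decode("ascii")
        elif line.startswith(b"data "):
            length = int(line[5:].decode("ascii"))
            offset = stream.tell()
            if in_blob:
                # Record where the content lives instead of copying it.
                index[mark] = (offset, length)
                in_blob = False
            # Skip the payload so its bytes are never parsed as commands.
            stream.seek(offset + length)
    return index


def blob_content(stream, ref):
    """Fetch blob content on demand from its (offset, length) pair."""
    offset, length = ref
    stream.seek(offset)
    return stream.read(length)


demo = io.BytesIO(b"blob\nmark :1\ndata 6\nhello\n")
refs = index_blobs(demo)
print(refs, blob_content(demo, refs[":1"]))
```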
2807
2808       Guarantee: reposurgeon never modifies the contents of a repository it
2809       reads, nor deletes any repository. The results of surgery are always
2810       expressed in a new repository.
2811
2812       Guarantee: Any line in a fast-import stream that is not a part of a
2813       command reposurgeon parses and understands will be passed through
2814       unaltered. At present the set of potential passthroughs is known to
2815       include the progress, options, and checkpoint commands as well as
2816       comments led by #.
2817
2818       Guarantee: All reposurgeon operations either preserve all repository
2819       state they are not explicitly told to modify or warn you when they
2820       cannot do so.
2821
2822       Guarantee: reposurgeon handles the bzr commit-properties extension,
2823       correctly passing through property items including those with embedded
2824       newlines. (Such properties are also editable in the message-box
2825       format.)
2826
2827       Limitation: Because reposurgeon relies on other programs to generate
2828       and interpret the fast-import command stream, it is subject to bugs in
2829       those programs.
2830
2831       Limitation: bzr suffers from deep confusion over whether its unit of
2832       work is a repository or a floating branch that might have been cloned
2833       from a repo or created from scratch, and might or might not be destined
2834       to be merged to a repo one day. Its exporter only works on branches,
2835       but its importer creates repos. Thus, a rebuild operation will produce
2836       a subdirectory structure that differs from what you expect. Look for
2837       your content under the subdirectory 'trunk'.
2838
2839       Limitation: under git, signed tags are imported verbatim. However, any
2840       operation that modifies any commit upstream of the target of the tag
2841       will invalidate it.
2842
2843       Limitation: Stock git (at least as of version 1.7.3.2) will choke on
2844       property extension commands. Accordingly, reposurgeon omits them when
2845       rebuilding a repo with git type.
2846
2847       Limitation: Converting an hg repo that uses bookmarks (not branches) to
2848       git can lose information; the branch ref that git assigns to each
2849       commit may not be the same as the hg bookmark that was active when the
2850       commit was originally made under hg. Unfortunately, this is a real
2851       ontological mismatch, not a problem that can be fixed by cleverness in
2852       reposurgeon.
2853
2854       Limitation: Converting an hg repo that uses branches to git can lose
2855       information because git does not store an explicit branch as part of
2856       commit metadata, but colors commits with branch or tag names on the fly
2857       using a specific coloring algorithm, which might not match the explicit
2858       branch assignments to commits in the original hg repo. Reposurgeon
2859       preserves the hg branch information when reading an hg repo, so it is
2860       available from within reposurgeon itself, but there is no way to
2861       preserve it if the repo is written to git.
2862
2863       Limitation: While the Subversion read-side support is in good shape,
2864       the write-side support is more of a sketch or proof-of-concept than a
2865       robust implementation; it only works on very simple cases and does not
2866       round-trip. It may improve in future releases.
2867
2868       Limitation: Not all BitKeeper versions have the fast-import and
2869       fast-export commands that reposurgeon requires. They are present in
2870       versions back to the 7.3 open-source release.
2871
2872       Limitation: reposurgeon may misbehave under a filesystem which smashes
2873       case in filenames, or which nominally preserves case but maps names
2874       differing only by case to the same filesystem node (Mac OS X behaves
2875       like this by default). Problems will arise if any two paths in a repo
2876       differ by case only. To avoid the problem on a Mac, do all your surgery
2877       on an HFS+ file system formatted with case sensitivity specifically
2878       enabled.
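
       Before working on such a filesystem, it may help to check whether a
       repository contains paths that differ only by case. The following is
       a rough sketch; case_collisions is a hypothetical helper, and
       str.casefold() only approximates the folding a real filesystem
       applies (it ignores Unicode normalization, for instance).

```python
from collections import defaultdict


def case_collisions(paths):
    """Return groups of distinct paths that are equal after case folding."""
    groups = defaultdict(set)
    for path in paths:
        groups[path.casefold()].add(path)
    # Only groups with more than one distinct spelling are a problem.
    return [sorted(group) for group in groups.values() if len(group) > 1]


print(case_collisions(["README", "Readme", "src/main.c"]))
```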
2879
2880       Limitation: If whitespace followed by # appears in a string or regexp
2881       command argument, it will be misinterpreted as the beginning of a
2882       line-ending comment and screw up parsing.
2883
2884       Guarantee: As version-control systems add support for the fast-import
2885       format, their repositories will become editable by reposurgeon.
2886
2887       Limitations described above are unlikely to change. Do "help bugs" at
2888       the reposurgeon prompt to see up-to-date information on reposurgeon
2889       bugs and internal problems that are expected to be fixed in some future
2890       release.
2891

REQUIREMENTS

2893       reposurgeon relies on importers and exporters associated with the VCSes
2894       it supports.
2895
2896       git
2897           Core git supports both export and import.
2898
2899       bzr
2900           Requires bzr plus the bzr-fast-import plugin.
2901
2902       hg
2903           Requires core hg, the hg-fastimport plugin, and the third-party
2904           hg-fast-export.py script.
2905
2906       svn
2907           Stock Subversion commands support export and import.
2908
2909       darcs
2910           Stock darcs commands support export and import.
2911
2912       CVS
2913           Requires cvs-fast-export. Note that the quality of CVS lifts may be
2914           poor, with individual lifts requiring serious hand-hacking. This is
2915           due to inherent problems with CVS's file-oriented model.
2916
2917       RCS
2918           Requires cvs-fast-export (yes, that's not a typo; cvs-fast-export
2919           handles RCS collections as well). The caveat for CVS applies.
2920

CANONICALIZATION RULES

2922       It is expected that reposurgeon will be extended with more deletion
2923       policies. Policy authors may need to know more about how a commit's
2924       file operation sequence is reduced to normal form after operations from
2925       deleted commits are prepended to it.
2926
2927       Recall that each commit has a list of file operations, each an M
2928       (modify), D (delete), R (rename), C (copy), or 'deleteall' (delete all
2929       files). Only M operations have associated blobs. Normally there is only
2930       one M operation per individual file in a commit's operation list.
2931
2932       To understand how the reduction process works, it's enough to
2933       understand the case where all the operations in the list are working on
2934       the same file. Sublists of operations referring to different files
2935       don't affect each other and reducing them can be thought of as separate
2936       operations. Also, a "deleteall" acts as a D for everything and cancels
2937       all operations before it in the list.
2938
2939       The reduction process walks through the list from the beginning looking
2940       for adjacent pairs of operations it can compose. The following table
2941       describes all possible cases and all but one of the reductions.
2942
2943              ┌──────────────────────────┬────────────────────────────┐
2944              │        M + D → D         │                            │
2945              │                          │        If a file is        │
2946              │                          │        modified then       │
2947              │                          │        deleted, the result │
2948              │                          │        is as though it had │
2949              │                          │        been deleted. If    │
2950              │                          │        the M was the only  │
2951              │                          │        modify for the      │
2952              │                          │        file, it's removed  │
2953              │                          │        too.                │
2954              ├──────────────────────────┼────────────────────────────┤
2955              │M a + R a b → R a b + M b │                            │
2956              │                          │        The purpose of this │
2957              │                          │        transformation is   │
2958              │                          │        to push renames     │
2959              │                          │        toward the          │
2960              │                          │        beginning of the    │
2961              │                          │        list, where they    │
2962              │                          │        may become adjacent │
2963              │                          │        to another R or C   │
2964              │                          │        they can be         │
2965              │                          │        composed with. If   │
2966              │                          │        the M is the only   │
2967              │                          │        modify operation    │
2968              │                          │        for this file, the  │
2969              │                          │        rename is dropped.  │
2970              ├──────────────────────────┼────────────────────────────┤
2971              │       M a + C a b        │                            │
2972              │                          │        No reduction.       │
2973              ├──────────────────────────┼────────────────────────────┤
2974              │  M b + R a b → nothing   │                            │
2975              │                          │        Should be           │
2976              │                          │        impossible, and may │
2977              │                          │        indicate repository │
2978              │                          │        corruption.         │
2979              ├──────────────────────────┼────────────────────────────┤
2980              │  M b + C a b → nothing   │                            │
2981              │                          │        The copy undoes the │
2982              │                          │        modification.       │
2983              ├──────────────────────────┼────────────────────────────┤
2984              │        D + M → M         │                            │
2985              │                          │        If a file is        │
2986              │                          │        deleted and         │
2987              │                          │        modified, the       │
2988              │                          │        result is as though │
2989              │                          │        the deletion had    │
2990              │                          │        not taken place     │
2991              │                          │        (because M          │
2992              │                          │        operations store    │
2993              │                          │        entire files, not   │
2994              │                          │        deltas).            │
2995              ├──────────────────────────┼────────────────────────────┤
2996              │       D + {D|R|C}        │                            │
2997              │                          │        These cases should  │
2998              │                          │        be impossible and   │
2999              │                          │        would suggest the   │
3000              │                          │        repository has been │
3001              │                          │        corrupted.          │
3002              ├──────────────────────────┼────────────────────────────┤
3003              │       R a b + D a        │                            │
3004              │                          │        Should never        │
3005              │                          │        happen, and is      │
3006              │                          │        another case that   │
3007              │                          │        would suggest       │
3008              │                          │        repository          │
3009              │                          │        corruption.         │
3010              ├──────────────────────────┼────────────────────────────┤
3011              │    R a b + D b → D a     │                            │
3012              │                          │        The delete removes  │
3013              │                          │        the just-renamed    │
3014              │                          │        file.               │
3015              ├──────────────────────────┼────────────────────────────┤
3016              │        {R|C} + M         │                            │
3017              │                          │        No reduction.       │
3018              ├──────────────────────────┼────────────────────────────┤
3019              │  R a b + R b c → R a c   │                            │
3020              │                          │        The b terms have to │
3021              │                          │        match for these     │
3022              │                          │        operations to have  │
3023              │                          │        made sense when     │
3024              │                          │        they lived in       │
3025              │                          │        separate commits;   │
3026              │                          │        if they don't, it   │
3027              │                          │        indicates           │
3028              │                          │        repository          │
3029              │                          │        corruption.         │
3030              ├──────────────────────────┼────────────────────────────┤
3031              │      R a b + C b c       │                            │
3032              │                          │        No reduction.       │
3033              ├──────────────────────────┼────────────────────────────┤
3034              │   C a b + D a → R a b    │                            │
3035              │                          │        Copy followed by    │
3036              │                          │        delete of the       │
3037              │                          │        source is a rename. │
3038              ├──────────────────────────┼────────────────────────────┤
3039              │  C a b + D b → nothing   │                            │
3040              │                          │        This delete undoes  │
3041              │                          │        the copy.           │
3042              ├──────────────────────────┼────────────────────────────┤
3043              │      C a b + R a c       │                            │
3044              │                          │        No reduction.       │
3045              ├──────────────────────────┼────────────────────────────┤
3046              │  C a b + R b c → C a c   │                            │
3047              │                          │        Copy followed by a  │
3048              │                          │        rename of the       │
3049              │                          │        target reduces to a │
3050              │                          │        single copy.        │
3051              ├──────────────────────────┼────────────────────────────┤
3052              │          C + C           │                            │
3053              │                          │        No reduction.       │
3054              └──────────────────────────┴────────────────────────────┘
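
       Some rows of the table can be sketched in code. The example below is
       an illustration, not reposurgeon's implementation: it covers only the
       rows whose result is spelled out in the table, omits the rows marked
       impossible or "nothing" for an M, and ignores the notes about a sole
       modify operation. The tuple representation and the names compose and
       reduce_ops are hypothetical; an M here carries only its path.

```python
def compose(first, second):
    """Return the replacement for an adjacent pair, or None if none applies.

    Operations are tuples: ("M", path), ("D", path),
    ("R", source, target) or ("C", source, target).
    """
    k1, k2 = first[0], second[0]
    if k1 == "M" and k2 == "D" and first[1] == second[1]:
        return [second]                          # M + D -> D
    if k1 == "D" and k2 == "M" and first[1] == second[1]:
        return [second]                          # D + M -> M
    if k1 == "M" and k2 == "R" and first[1] == second[1]:
        return [second, ("M", second[2])]        # M a + R a b -> R a b + M b
    if k1 == "R" and k2 == "D" and first[2] == second[1]:
        return [("D", first[1])]                 # R a b + D b -> D a
    if k1 == "R" and k2 == "R" and first[2] == second[1]:
        return [("R", first[1], second[2])]      # R a b + R b c -> R a c
    if k1 == "C" and k2 == "D" and first[1] == second[1]:
        return [("R", first[1], first[2])]       # C a b + D a -> R a b
    if k1 == "C" and k2 == "D" and first[2] == second[1]:
        return []                                # C a b + D b -> nothing
    if k1 == "C" and k2 == "R" and first[2] == second[1]:
        return [("C", first[1], second[2])]      # C a b + R b c -> C a c
    return None


def reduce_ops(ops):
    """Walk the list from the beginning, composing adjacent pairs."""
    ops = list(ops)
    i = 0
    while i + 1 < len(ops):
        result = compose(ops[i], ops[i + 1])
        if result is None:
            i += 1
        else:
            ops[i:i + 2] = result
            # Step back: the new operation may compose with its predecessor.
            i = max(i - 1, 0)
    return ops


print(reduce_ops([("M", "a"), ("R", "a", "b"), ("R", "b", "c")]))
```

       In the printed example the rename is pushed ahead of the modify, where
       it becomes adjacent to the second rename and the two compose.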
3055

CRASH RECOVERY

3057       This section will become relevant only if reposurgeon or something
3058       underneath it in the software and hardware stack crashes while in the
3059       middle of writing out a repository, in particular if the target
3060       directory of the rebuild is your current directory.
3061
3062       The tool has two conflicting objectives. On the one hand, we never want
3063       to risk clobbering a pre-existing repo. On the other hand, we want to
3064       be able to run this tool in a directory with a repo and modify it in
3065       place.
3066
3067       We resolve this dilemma by playing a game of three-directory monte.
3068
3069        1. First, we build the repo in a freshly-created staging directory. If
3070           your target directory is named /path/to/foo, the staging directory
3071           will be a peer named /path/to/foo-stageNNNN, where NNNN is a cookie
3072           derived from reposurgeon's process ID.
3073
3074        2. We then make an empty backup directory. This directory will be
3075           named /path/to/foo.~N~, where N is incremented so as not to
3076           conflict with any existing backup directories.  reposurgeon never,
3077           under any circumstances, ever deletes a backup directory.
3078
3079           So far, all operations are safe; the worst that can happen up to
3080           this point if the process gets interrupted is that the staging and
3081           backup directories get left behind.
3082
3083        3. The critical region begins. We first move everything in the target
3084           directory to the backup directory.
3085
3086        4. Then we move everything in the staging directory to the target.
3087
3088        5. We finish off by restoring untracked files in the target directory
3089           from the backup directory. That ends the critical region.
3090
3091       During the critical region, all signals that can be ignored are
3092       ignored.
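
       The directory shuffle above can be sketched as follows. This is an
       outline under stated assumptions rather than the tool's actual code:
       all function names are hypothetical, the staging cookie is assumed to
       be the process ID itself, the caller is assumed to supply the list of
       untracked top-level files, restoring is assumed to mean copying, and
       the signal handling of the critical region is left out.

```python
import os
import shutil


def staging_dir(target):
    """Name a peer staging directory with a cookie based on the process ID."""
    return "%s-stage%d" % (target, os.getpid())


def next_backup_dir(target):
    """Pick the first /path/to/foo.~N~ name that does not already exist."""
    n = 1
    while os.path.exists("%s.~%d~" % (target, n)):
        n += 1
    return "%s.~%d~" % (target, n)


def move_contents(src, dst):
    """Move every entry of directory src into directory dst."""
    for name in os.listdir(src):
        shutil.move(os.path.join(src, name), os.path.join(dst, name))


def swap_in(target, staging, untracked):
    """Replace target's contents with staging's, keeping a full backup."""
    backup = next_backup_dir(target)
    os.mkdir(backup)                    # step 2: still safe to interrupt
    move_contents(target, backup)       # step 3: critical region begins
    move_contents(staging, target)      # step 4
    for name in untracked:              # step 5: restore untracked files
        src = os.path.join(backup, name)
        dst = os.path.join(target, name)
        if os.path.isfile(src) and not os.path.exists(dst):
            shutil.copy2(src, dst)      # copy, so the backup stays complete
    return backup
```

       The backup directory is returned and never removed, matching the rule
       that backups are not deleted.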
3093

ERROR RETURNS

3095       Returns 1 on fatal error, 0 otherwise. In batch mode all errors are
3096       fatal.
3097

SEE ALSO

3099       bzr(1), cvs(1), darcs(1), git(1), hg(1), rcs(1), svn(1), bk(1).
3100

AUTHOR

3102       Eric S. Raymond <esr@thyrsus.com>; project page at
3103       http://www.catb.org/~esr/reposurgeon.
3104

NOTES

3106        1. DVCS Migration HOWTO
3107           http://www.catb.org/esr/dvcs-migration-guide.html
3108
3109        2. Python's
3110           http://docs.python.org/2/library/re.html
3111
3112
3113
3114reposurgeon                       01/30/2020                    REPOSURGEON(1)