REPOSURGEON(1)                 Development Tools                REPOSURGEON(1)

NAME

       reposurgeon - surgical operations on repositories

SYNOPSIS

       reposurgeon [command...]

DESCRIPTION

       The purpose of reposurgeon is to enable risky operations that VCSes
       (version-control systems) don't want to let you do, such as (a) editing
       past comments and metadata, (b) excising commits, (c) coalescing and
       splitting commits, (d) removing files and subtrees from repo history,
       (e) merging or grafting two or more repos, and (f) cutting a repo in
       two by cutting a parent-child link, preserving the branch structure of
       both child repos.

       A major use of reposurgeon is to help a human operator perform
       higher-quality conversions among version-control systems than can be
       achieved with fully automated converters.

       The original motivation for reposurgeon was to clean up artifacts
       created by repository conversions. It was foreseen that the tool would
       also have applications when code needs to be removed from repositories
       for legal or policy reasons.

       To keep reposurgeon simple and flexible, it normally does not do its
       own repository reading and writing. Instead, it relies on being able to
       parse and emit the command streams created by git-fast-export and read
       by git-fast-import. This means that it can be used on any
       version-control system that has both fast-export and fast-import
       utilities. The fast-import stream format also implicitly defines a
       common language of primitive operations for reposurgeon to speak.

       Fully supported systems (those for which reposurgeon can both read and
       write repositories) include git, hg, bzr, svn, darcs, bk, RCS, and SRC.
       For a complete list, with dependencies and technical notes, type
       "prefer" at the reposurgeon prompt.

       Writing to the file-oriented systems RCS and SRC is done via
       rcs-fast-import(1) and has some serious limitations because those
       systems cannot represent all the metadata in a git-fast-export stream.
       Consult that tool's documentation for details and partial workarounds.

       Writing Subversion repositories also has some significant limitations,
       discussed in the section on Working With Subversion.

       Fossil repository files can be read in using the --format=fossil option
       of the read command and written out with the --format=fossil option of
       the write command. Ignore patterns are not translated in either
       direction.

       CVS is supported for read only, not write. For CVS, reposurgeon must be
       run from within a repository directory (one with a CVSROOT
       subdirectory). Each module becomes a subdirectory in the reposurgeon
       representation of the change history.

       In order to deal with version-control systems that do not have
       fast-export equivalents, reposurgeon can also host extractor code that
       reads repositories directly. For each version-control system supported
       through an extractor, reposurgeon uses a small amount of knowledge
       about the system's command-line tools to (in effect) replay repository
       history into an input stream internally. Repositories under systems
       supported through extractors can be read by reposurgeon, but not
       modified by it. In particular, reposurgeon can be used to move a
       repository history from any VCS supported by an extractor to any VCS
       supported by a normal importer/exporter pair.

       Mercurial repository reading is implemented with an extractor class;
       writing is handled with the stock "hg fastimport" command. A test
       extractor exists for git, but is normally disabled in favor of the
       regular exporter.

       For guidance on the pragmatics of repository conversion, see the DVCS
       Migration HOWTO[1].

SAFETY WARNINGS

       reposurgeon is a sharp enough tool to cut you. It takes care not to
       ever write a repository in an actually inconsistent state, and will
       terminate with an error message rather than proceed when its internal
       data structures are confused. However, there are lots of things you can
       do with it - like altering stored commit timestamps so they no longer
       match the commit sequence - that are likely to cause havoc after you're
       done. Proceed with caution and check your work.

       Also note that, if your DVCS does the usual thing of making commit IDs
       a cryptographic hash of content and parent links, editing a
       publicly-accessible repository with this tool would be a bad idea. All
       of the surgical operations in reposurgeon will modify the hash chains.

       Please also see the notes on system-specific issues under the section
       called “LIMITATIONS AND GUARANTEES”.

OPERATION

       The program can be run in one of two modes, either as an interactive
       command interpreter or in batch mode to execute commands given as
       arguments on the reposurgeon invocation line. The only differences
       between these modes are (1) the interactive one begins by turning on
       the 'verbose 1' option, (2) in batch mode all errors (including
       normally recoverable errors in selection-set syntax) are fatal, and (3)
       each command-line argument beginning with “--” has that prefix
       stripped off (which, in particular, means that --help and --version
       will work as expected). Also, in interactive mode, Ctrl-P and Ctrl-N
       scroll through your command history, and tab completion of both
       command keywords and name arguments (wherever that makes semantic
       sense) is available.

       A git-fast-import stream consists of a sequence of commands which must
       be executed in the specified sequence to build the repo; to avoid
       confusion with reposurgeon commands we will refer to the stream
       commands as events in this documentation. These events are implicitly
       numbered from 1 upwards. Most commands require specifying a selection
       of event sequence numbers so reposurgeon will know which events to
       modify or delete.

       For all the details of event types and semantics, see the
       git-fast-import(1) manual page; the rest of this paragraph is a quick
       start for the impatient. Most events in a stream are commits describing
       revision states of the repository; these group together under a single
       change comment one or more fileops (file operations), which usually
       point to blobs that are revision states of individual files. A fileop
       may also be a delete operation indicating that a specified
       previously-existing file was deleted as part of the version commit;
       there are a couple of other special fileop types of lesser importance.

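       As a rough illustration of the implicit numbering, the following
       Python sketch walks a fast-import stream and numbers the events it
       recognizes from 1. It is not reposurgeon's parser: the name
       index_events is hypothetical, only the counted-byte form of the data
       command is assumed, and passthrough commands are ignored, so its
       numbering need not agree with reposurgeon's on a real stream.

```python
import io

# Keywords that open the event types this sketch recognizes.
EVENT_KEYWORDS = (b"blob", b"commit ", b"tag ", b"reset ")

def index_events(stream):
    """Map 1-origin sequence numbers to event kinds (hypothetical helper)."""
    kinds = []
    while True:
        line = stream.readline()
        if not line:
            break
        if line.startswith(b"data "):
            # Assumes "data <count>"; skip the payload so that its text
            # cannot be mistaken for a stream command.
            stream.read(int(line[5:]))
        elif line.startswith(EVENT_KEYWORDS):
            kinds.append(line.split()[0].decode())
    return dict(enumerate(kinds, start=1))

# A tiny hand-written stream: one blob, one commit with one fileop, one tag.
SAMPLE = (
    b"blob\nmark :1\ndata 6\nhello\n\n"
    b"commit refs/heads/master\nmark :2\n"
    b"committer A <a@example.com> 0 +0000\n"
    b"data 4\ninit\nM 100644 :1 README\n\n"
    b"tag v1\nfrom :2\ntagger A <a@example.com> 0 +0000\n"
    b"data 3\nv1\n"
)

print(index_events(io.BytesIO(SAMPLE)))
```

       The fileop line (M 100644 :1 README) is part of the commit event and
       does not get a number of its own.
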
       Commands to reposurgeon consist of a command keyword, sometimes
       preceded by a selection set, sometimes followed by whitespace-separated
       arguments. It is often possible to omit the selection-set argument and
       have it default to something reasonable.

       Here are some motivating examples. The commands will be explained in
       more detail after the description of selection syntax.

           :15 edit               ;; edit the object associated with mark :15

           edit                   ;; edit all editable objects

           29..71 list            ;; list summary index of events 29..71

           236..$ list            ;; List events from 236 to the last

           <#523> inspect         ;; Look for commit #523; they are numbered
                                  ;; 1-origin from the beginning of the repository.

           <2317> inspect         ;; Look for a tag with the name 2317, a tip commit
                                  ;; of a branch named 2317, or a commit with legacy ID
                                  ;; 2317. Inspect what is found. A plain number is
                                  ;; probably a legacy ID inherited from a Subversion
                                  ;; revision number.

           /regression/ list      ;; list all commits and tags with comments or
                                  ;; committer headers or author headers containing
                                  ;; the string "regression"

           1..:97 & =T delete     ;; delete tags from event 1 to mark 97

           [Makefile] inspect     ;; Inspect all commits with a file op touching Makefile
                                  ;; and all blobs referred to in a fileop
                                  ;; touching Makefile.

           :46 tip                ;; Display the branch tip that owns commit :46.

           @dsc(:55) list         ;; Display all commits with ancestry tracing to :55

           @min([.gitignore]) remove .gitignore delete
                                  ;; Remove the first .gitignore fileop in the repo.

   SELECTION SYNTAX
       The selection-set specification syntax is an expression-oriented
       minilanguage. The most basic term in this language is a location. The
       following sorts of primitive locations are supported:

       event numbers
           A plain numeric literal is interpreted as a 1-origin event-sequence
           number.

       marks
           A numeric literal preceded by a colon is interpreted as a mark; see
           the import stream format documentation for explanation of the
           semantics of marks.

       tag and branch names
           The basename of a branch (including branches in the refs/tags
           namespace) refers to its tip commit. The name of a tag is
           equivalent to its mark (that of the tag itself, not the commit it
           refers to). Tag and branch locations are bracketed with < > (angle
           brackets) to distinguish them from command keywords.

       legacy IDs
           If the contents of name brackets (< >) do not match a tag or
           branch name, the interpreter next searches legacy IDs of commits.
           This is especially useful when you have imported a Subversion dump;
           it means that commits made from it can be referred to by their
           corresponding Subversion revision numbers.

       commit numbers
           A numeric literal within name brackets (< >) preceded by # is
           interpreted as a 1-origin commit-sequence number.

       $
           Refers to the last event.

       These may be grouped into sets in the following ways:

       ranges
           A range is two locations separated by "..", and is the set of
           events beginning at the left-hand location and ending at the
           right-hand location (inclusive).

       lists
           Comma-separated lists of locations and ranges are accepted, with
           the obvious meaning.

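       The range and list rules can be sketched in a few lines of Python.
       This is only an illustration: the helper name expand is hypothetical,
       and it handles nothing but plain event numbers and "$", not marks,
       names, or any other kind of location.

```python
def expand(spec, last):
    """Expand a spec such as "29..31,40,42..$" into sorted event numbers.

    Hypothetical helper: `last` is the number of the final event, which
    is what "$" stands for.
    """
    def location(token):
        return last if token == "$" else int(token)

    selected = set()
    for part in spec.split(","):
        if ".." in part:
            low, high = part.split("..")
            # A range includes both of its endpoints.
            selected.update(range(location(low), location(high) + 1))
        else:
            selected.add(location(part))
    return sorted(selected)

print(expand("29..31,40,42..$", last=44))
```
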
       There are some other ways to construct event sets:

       visibility sets
           A visibility set is an expression specifying a set of event types.
           It will consist of a leading equal sign, followed by type letters.
           These are the type letters:

                    ┌──┬─────────────────────┬───────────────────────┐
                    │B │        blobs        │  Most default         │
                    │  │                     │  selection sets       │
                    │  │                     │  exclude blobs; they  │
                    │  │                     │  have to be           │
                    │  │                     │  manipulated through  │
                    │  │                     │  the commits they     │
                    │  │                     │  are attached to.     │
                    ├──┼─────────────────────┼───────────────────────┤
                    │C │       commits       │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │D │ all-delete commits  │ These are artifacts   │
                    │  │                     │ produced by some      │
                    │  │                     │ older                 │
                    │  │                     │ repository-conversion │
                    │  │                     │ tools.                │
                    ├──┼─────────────────────┼───────────────────────┤
                    │H │  head (branch tip)  │                       │
                    │  │  commits            │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │O │    orphaned         │                       │
                    │  │    (parentless)     │                       │
                    │  │    commits          │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │U │ commits with        │                       │
                    │  │ callouts as parents │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │Z │   commits with no   │                       │
                    │  │   fileops           │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │M │   merge             │                       │
                    │  │   (multi-parent)    │                       │
                    │  │   commits           │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │F │ fork (multi-child)  │                       │
                    │  │ commits             │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │L │ commits with        │                       │
                    │  │ unclean multi-line  │                       │
                    │  │ comments (without a │                       │
                    │  │ separating empty    │                       │
                    │  │ line after the      │                       │
                    │  │ first)              │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │I │ commits for which   │                       │
                    │  │ metadata cannot be  │                       │
                    │  │ decoded to UTF-8    │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │T │        tags         │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │R │       resets        │                       │
                    ├──┼─────────────────────┼───────────────────────┤
                    │P │     Passthrough     │ All event types       │
                    │  │                     │ simply passed         │
                    │  │                     │ through, including    │
                    │  │                     │ comments, progress    │
                    │  │                     │ commands, and         │
                    │  │                     │ checkpoint commands.  │
                    ├──┼─────────────────────┼───────────────────────┤
                    │N │     Legacy IDs      │ Any string matching a │
                    │  │                     │ cookie (legacy-ID)    │
                    │  │                     │ format.               │
                    └──┴─────────────────────┴───────────────────────┘

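           To make a few of the structural letters concrete, here is a
           Python sketch that classifies commits in a toy history. The
           data layout and the names type_letters and visible are
           hypothetical, only C, O, M, F, and Z are modeled, and treating
           several letters as the union of their types is an assumption.

```python
# Toy history: commit id -> list of parent ids (hypothetical data).
PARENTS = {1: [], 2: [1], 3: [2], 4: [2], 5: [3, 4]}
# Commit id -> list of fileops; commit 3 has none.
FILEOPS = {1: ["M README"], 2: ["M Makefile"], 3: [],
           4: ["D README"], 5: ["M Makefile"]}

def type_letters(commit):
    """Return the subset of C, O, M, F, Z that applies to a commit."""
    child_count = sum(commit in ps for ps in PARENTS.values())
    letters = {"C"}
    if not PARENTS[commit]:
        letters.add("O")          # orphaned (parentless)
    if len(PARENTS[commit]) > 1:
        letters.add("M")          # merge (multi-parent)
    if child_count > 1:
        letters.add("F")          # fork (multi-child)
    if not FILEOPS[commit]:
        letters.add("Z")          # no fileops
    return letters

def visible(spec):
    """Select commits having any of the letters after the '='."""
    wanted = set(spec.lstrip("="))
    return [c for c in sorted(PARENTS) if wanted & type_letters(c)]

print(visible("=OM"), visible("=F"), visible("=Z"))
```
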
       references
           A reference name (bracketed by angle brackets) resolves to a single
           object, either a commit or tag.

                      ┌──────────────┬────────────────────────────┐
                      │     type     │       interpretation       │
                      ├──────────────┼────────────────────────────┤
                      │  tag name    │  annotated tag with that   │
                      │              │  name                      │
                      ├──────────────┼────────────────────────────┤
                      │ branch name  │   the branch tip commit    │
                      ├──────────────┼────────────────────────────┤
                      │  legacy ID   │ commit with that legacy ID │
                      ├──────────────┼────────────────────────────┤
                      │assigned name │    name equated to a       │
                      │              │    selection by assign     │
                      └──────────────┴────────────────────────────┘
           Note that if an annotated tag and a branch have the same name foo,
           <foo> will resolve to the tag rather than the branch tip commit.

       dates and action stamps
           A date or action stamp in angle brackets resolves to a selection
           set of all matching commits.

                ┌───────────────────────────┬───────────────────────────┐
                │           type            │      interpretation       │
                ├───────────────────────────┼───────────────────────────┤
                │    RFC3339 timestamp      │  commit or tag with that  │
                │                           │  time/date                │
                ├───────────────────────────┼───────────────────────────┤
                │    action stamp           │ commits or tags with that │
                │    (timestamp!email)      │ timestamp and author (or  │
                │                           │ committer if no author).  │
                ├───────────────────────────┼───────────────────────────┤
                │yyyy-mm-dd part of RFC3339 │ all commits and tags with │
                │timestamp                  │ that date                 │
                └───────────────────────────┴───────────────────────────┘
           To refine the match to a single commit, use a 1-origin index suffix
           separated by '#'. Thus "<2000-02-06T09:35:10Z>" can match multiple
           commits, but "<2000-02-06T09:35:10Z#2>" matches only the second in
           the set.

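           The '#' refinement can be pictured with a small Python sketch.
           The function name select_by_stamp and the (event number,
           timestamp) pairs are hypothetical, and action stamps are left
           out; only full timestamps and bare dates are handled.

```python
def select_by_stamp(commits, reference):
    """Resolve "<stamp>" or "<stamp#n>" (brackets already removed).

    `commits` is a list of (event_number, rfc3339_timestamp) pairs in
    event order; this layout is an assumption made for the sketch.
    """
    stamp, _, index = reference.partition("#")
    matches = [number for number, when in commits
               if when == stamp or when[:10] == stamp]
    if index:
        # The suffix is a 1-origin index into the matching set.
        return matches[int(index) - 1:int(index)]
    return matches

COMMITS = [
    (3, "2000-02-06T09:35:10Z"),
    (5, "2000-02-06T09:35:10Z"),
    (8, "2000-02-07T00:00:00Z"),
]
print(select_by_stamp(COMMITS, "2000-02-06T09:35:10Z"))
print(select_by_stamp(COMMITS, "2000-02-06T09:35:10Z#2"))
```
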
       text search
           A text search expression is a Python regular expression surrounded
           by forward slashes (to embed a forward slash in it, use a Python
           string escape such as \x2f).

           A text search normally matches against the comment fields of
           commits and annotated tags, or against their author/committer
           names, or against the names of tags; also the text of passthrough
           objects.

           The scope of a text search can be changed with qualifier letters
           after the trailing slash. These are as follows:

                          ┌───────┬───────────────────────────┐
                          │letter │      interpretation       │
                          ├───────┼───────────────────────────┤
                          │  a    │   author name in commit   │
                          ├───────┼───────────────────────────┤
                          │  b    │ branch name in commit;    │
                          │       │ also matches blobs        │
                          │       │ referenced by commits on  │
                          │       │ matching branches, and    │
                          │       │ tags which point to       │
                          │       │ commits on matching       │
                          │       │ branches.                 │
                          ├───────┼───────────────────────────┤
                          │  c    │ comment text of commit or │
                          │       │ tag                       │
                          ├───────┼───────────────────────────┤
                          │  r    │  committish reference in  │
                          │       │  tag or reset             │
                          ├───────┼───────────────────────────┤
                          │  p    │    text in passthrough    │
                          ├───────┼───────────────────────────┤
                          │  t    │       tagger in tag       │
                          ├───────┼───────────────────────────┤
                          │  n    │        name of tag        │
                          ├───────┼───────────────────────────┤
                          │  B    │       blob content        │
                          └───────┴───────────────────────────┘
           Multiple qualifier letters can add more search scopes.

           (The “b” qualifier replaces the branchset syntax in earlier
           versions of reposurgeon.)

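           The slash-delimited form and the \x2f escape can be tried out
           with Python's re module. The helper parse_search below is
           hypothetical; it only separates the pattern from the qualifier
           letters and says nothing about how reposurgeon applies them.

```python
import re

def parse_search(expression):
    """Split "/regex/letters" into a compiled pattern and a letter set."""
    end = expression.rindex("/")
    if not expression.startswith("/") or end == 0:
        raise ValueError("expected /regex/ with optional qualifier letters")
    return re.compile(expression[1:end]), set(expression[end + 1:])

# \x2f stands for "/" inside the pattern, so the delimiters stay unambiguous.
pattern, scopes = parse_search(r"/usr\x2fbin/c")
print(scopes, bool(pattern.search("moved helper to usr/bin")))
```
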
       paths
           A "path expression" enclosed in square brackets resolves to the set
           of all commits and blobs related to a path matching the given
           expression. The path expression itself is either a path literal or
           a regular expression surrounded by slashes. Immediately after the
           trailing / of a path regexp you can put any number of the following
           characters which act as flags: 'a', 'c', 'D', 'M', 'R', 'C', 'N'.

           By default, a path is related to a commit if the latter has a
           fileop that touches that file path - modifies that change it,
           deletes that remove it, renames and copies that have it as a source
           or target. When the 'c' flag is in use the meaning changes: the
           paths related to a commit become all paths that would be present in
           a checkout for that commit.

           A path literal matches a commit if and only if the path literal is
           exactly one of the paths related to the commit (no prefix or suffix
           operation is done). In particular a path literal won't match if it
           corresponds to a directory in the chosen repository.

           A regular expression matches a commit if it matches any path
           related to the commit anywhere in the path. You can use '^' or '$'
           if you want the expression to only match at the beginning or end of
           paths. When the 'a' flag is in use, the path expression selects
           commits whose every path matches the regular expression. This is
           not always a subset of commits selected without the 'a' flag
           because it also selects commits with no related paths (e.g. empty
           commits, deletealls and commits with empty trees). If you want to
           avoid those, you can use e.g. '[/regex/] & [/regex/a]'.

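           The difference between the default match and the 'a' flag is
           the difference between "any" and "all", including the vacuous
           case. The Python sketch below shows this on hypothetical path
           lists; path_match is an illustrative name, not reposurgeon
           code.

```python
import re

def path_match(related_paths, regex, every=False):
    """Default: some related path matches. every=True models the 'a' flag."""
    hits = [re.search(regex, path) is not None for path in related_paths]
    return all(hits) if every else any(hits)

docs_only = ["doc/intro.txt", "doc/usage.txt"]
mixed = ["doc/intro.txt", "src/main.c"]
no_paths = []                      # e.g. an empty commit

for paths in (docs_only, mixed, no_paths):
    plain = path_match(paths, r"^doc/")
    flagged = path_match(paths, r"^doc/", every=True)
    # The last column models '[/regex/] & [/regex/a]'.
    print(paths, plain, flagged, plain and flagged)
```

           The empty list matches with every=True but not without it, which
           is why intersecting the two forms filters such commits out.
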
           The flags 'D', 'M', 'R', 'C', 'N' restrict match checking to the
           corresponding fileop types. Note that this means an 'a' match is
           easier (not harder) to achieve. These are no-ops when used with
           'c'.

           A path literal or regular expression matches a blob if it matches
           any path that appeared in a modification fileop that referred to
           that blob. To select purely matching blobs or matching commits,
           compose a path expression with =B or =C.

           If you need to embed a slash in your regular expression (for
           example, to write '[^/]', meaning "all characters but a slash")
           you can use a Python string escape such as \x2f.

       function calls
           The expression language has named special functions. The sequence
           for a named function is “@” followed by a function name, followed
           by an argument in parentheses. Presently the following functions
           are defined:

                           ┌─────┬────────────────────────────┐
                           │name │       interpretation       │
                           ├─────┼────────────────────────────┤
                           │min  │    minimum member of a     │
                           │     │    selection set           │
                           ├─────┼────────────────────────────┤
                           │max  │    maximum member of a     │
                           │     │    selection set           │
                           ├─────┼────────────────────────────┤
                           │amp  │ nonempty selection set     │
                           │     │ becomes all objects, empty │
                           │     │ set is returned empty      │
                           ├─────┼────────────────────────────┤
                           │par  │ all parents of commits in  │
                           │     │ the argument set           │
                           ├─────┼────────────────────────────┤
                           │chn  │ all children of commits in │
                           │     │ the argument set           │
                           ├─────┼────────────────────────────┤
                           │dsc  │ all commits descended from │
                           │     │ the argument set (argument │
                           │     │ set included)              │
                           ├─────┼────────────────────────────┤
                           │anc  │ all commits from which the │
                           │     │ argument set is descended  │
                           │     │ (argument set included)    │
                           ├─────┼────────────────────────────┤
                           │pre  │ events before the argument │
                           │     │ set; empty if the argument │
                           │     │ set includes the first     │
                           │     │ event.                     │
                           ├─────┼────────────────────────────┤
                           │suc  │ events after the argument  │
                           │     │ set; empty if the argument │
                           │     │ set includes the last      │
                           │     │ event.                     │
                           ├─────┼────────────────────────────┤
                           │srt  │  sort the argument set by  │
                           │     │  event number.             │
                           └─────┴────────────────────────────┘

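           The four ancestry functions can be modeled on a parent map. The
           Python sketch below is a plain graph traversal written for this
           page, using a hypothetical five-commit history; it is not taken
           from reposurgeon.

```python
# Toy history: commit id -> list of parent ids (hypothetical data).
PARENTS = {1: [], 2: [1], 3: [2], 4: [2], 5: [3, 4]}

def par(selection):
    """All parents of commits in the selection."""
    return {p for c in selection for p in PARENTS[c]}

def chn(selection):
    """All children of commits in the selection."""
    return {c for c, ps in PARENTS.items() if selection & set(ps)}

def closure(selection, step):
    """Apply `step` repeatedly, keeping the starting set in the result."""
    seen = set(selection)
    frontier = set(selection)
    while frontier:
        frontier = step(frontier) - seen
        seen |= frontier
    return seen

def anc(selection):
    return closure(selection, par)

def dsc(selection):
    return closure(selection, chn)

print(sorted(par({5})), sorted(chn({2})), sorted(anc({3})), sorted(dsc({3})))
```
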
       Set expressions may be combined with the operators | and &; these are,
       respectively, set union and intersection. The | operator has lower
       precedence than &, but you may use parentheses '(' and ')' to group
       expressions in case there is ambiguity (this replaces the curly
       brackets used in older versions of the syntax).

       Any set operation may be followed by '?' to add the set members'
       neighbors and referents. This extends the set to include the parents
       and children of all commits in the set, and the referents of any tags
       and resets in the set. Each blob reference in the set is replaced by
       all commit events that refer to it. The '?' can be repeated to extend
       the neighborhood depth.

       Do set negation with prefix ~; it has higher precedence than & and |
       but lower than ?.

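       Python's built-in sets use the same relative precedence for | and &,
       so the grouping rule can be checked directly. The event numbers below
       are hypothetical, and since Python sets have no prefix ~, negation is
       written as a difference from the set of all events.

```python
EVENTS = set(range(1, 11))         # hypothetical repository with 10 events
tags = {2, 9}
first_half = set(range(1, 6))
merges = {7}

# "merges | first_half & tags" groups as "merges | (first_half & tags)".
loose = merges | first_half & tags
# Parentheses force the union to happen first.
grouped = (merges | first_half) & tags
# "~tags": every event that is not in the set.
not_tags = EVENTS - tags

print(sorted(loose), sorted(grouped), sorted(not_tags))
```
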
   IMPORT AND EXPORT
       reposurgeon can hold multiple repository states in core. Each has a
       name. At any given time, one may be selected for editing. Commands in
       this group import repositories, export them, and manipulate the in-core
       list and the selection.

       read [--format=fossil] [directory|-|<infile]
           With a directory-name argument, this command attempts to read in
           the contents of a repository in any supported version-control
           system under that directory; read with no arguments does this in
           the current directory. If input is redirected from a plain file,
           it will be read in as a fast-import stream or Subversion dumpfile.
           With an argument of “-”, this command reads a fast-import stream or
           Subversion dumpfile from standard input (this will be useful in
           filters constructed with command-line arguments).

           If the content is a fast-import stream, any "cvs-revision"
           property on a commit is taken to be a newline-separated list of CVS
           revision cookies pointing to the commit, and used for reference
           lifting.

           If the content is a fast-import stream, any "legacy-id" property
           on a commit is taken to be a legacy ID token pointing to the
           commit, and used for reference lifting.

           If the read location is a git repository and contains a
           .git/cvsauthors file (such as is left in place by git cvsimport -A)
           that file will be read in as if it had been given to the authors
           read command.

           If the read location is a directory, and its repository
           subdirectory has a file named legacy-map, that file will be read as
           though passed to a legacy read command.

           If the read location is a file and the --format=fossil option is
           used, the file is interpreted as a Fossil repository.

           The just-read-in repo is added to the list of loaded repositories
           and becomes the current one, selected for surgery. If it was read
           from a plain file and the file name ends with one of the extensions
           .fi or .svn, that extension is removed from the load list name.

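           The naming rule can be sketched as follows, combining it with
           the statement under choose that a repo is normally named after
           the basename of its source. The function load_name is
           hypothetical and ignores the "unnamed" and disambiguating-suffix
           cases.

```python
import os

def load_name(source):
    """Guess the load-list name for a directory or stream file path."""
    name = os.path.basename(source.rstrip("/"))
    root, extension = os.path.splitext(name)
    # Only the two extensions named in the text are removed.
    return root if extension in (".fi", ".svn") else name

for source in ("dumps/project.fi", "project.svn", "/home/me/project/", "notes.txt"):
    print(source, "->", load_name(source))
```
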
           Note: this command does not take a selection set.

525       write [--legacy] [--format=fossil] [--noincremental] [--callout]
526       [>outfile|-]
527           Dump selected events as a fast-import stream representing the
528           edited repository; the default selection set is all events. Where
529           to dump to is standard output if there is no argument or the
530           argument is '-', or the target of an output redirect.
531
532           Alternatively, if there is no redirect and the argument names a
533           directory, the repository is rebuilt into that directory, with any
534           selection set being ignored; if that target directory is nonempty
535           its contents are backed up to a save directory.
536
537           If the write location is a file and the --format=fossil is used,
538           the file is written in Fossil repository format.
539
540           With the --legacy option, the Legacy-ID of each commit is appended
541           to its commit comment at write time. This option is mainly useful
542           for debugging conversion edge cases.
543
544           If you specify a partial selection set such that some commits are
545           included but their parents are not, the output will include
546           incremental dump cookies for each branch with an origin outside the
547           selection set, just before the first reference to that branch in a
548           commit. An incremental dump cookie looks like "refs/heads/foo^0"
549           and is a clue to export-stream loaders that the branch should be
550           glued to the tip of a pre-existing branch of the same name. The
551           --noincremental option suppresses this behavior.
552
553           When you specify a partial selection set, including a commit object
554           forces the inclusion of every blob to which it refers and every tag
555           that refers to it.
556
557           Specifying a partial selection may cause a situation in which some
558           parent marks in merges don't correspond to commits present in the
559           dump. When this happens and the --callout option was specified,
560           the write code replaces the merge mark with a callout, the action
561           stamp of the parent commit; otherwise the parent mark is omitted.
562           Importers will fail when reading a stream dump with callouts; such
563           a dump is intended to be used by the graft command.
564
565           Specifying a write selection set with gaps in it is allowed but
566           unlikely to lead to good results if it is loaded by an importer.
567
568           Property extensions will be omitted from the output if the
569           importer for the preferred repository type cannot digest them.
570
571           Note: to examine small groups of commits without the progress
572           meter, use inspect.
573
574       choose [reponame]
575           Choose a named repo on which to operate. The name of a repo is
576           normally the basename of the directory or file it was loaded from,
577           but repos loaded from standard input are "unnamed".  reposurgeon
578           will add a disambiguating suffix if there have been multiple reads
579           from the same source.
580
581           With no argument, lists the names of the currently stored
582           repositories and their load times. The second column is '*' for the
583           currently selected repository, '-' for others.
584
585       drop [reponame]
586           Drop a repo named by the argument from reposurgeon's list, freeing
587           the memory used for its metadata and deleting on-disk blobs. With
588           no argument, drops the currently chosen repo.
589
590       rename reponame
591           Rename the currently chosen repo; requires an argument. Won't do it
592           if there is already one by the new name.
593
594   REBUILDS IN PLACE
595       reposurgeon can rebuild an altered repository in place. Untracked files
596       are normally saved and restored when the contents of the new repository
597       are checked out (but see the documentation of the “preserve” command for
598       a caveat).
599
600       rebuild [directory]
601           Rebuild a repository from the state held by reposurgeon. This
602           command does not take a selection set.
603
604           The single argument, if present, specifies the target directory in
605           which to do the rebuild; if the repository read was from a repo
606           directory (and not a git-import stream), it defaults to that
607           directory. If the target directory is nonempty its contents are
608           backed up to a save directory. Files and directories on the
609           repository's preserve list are copied back from the backup
610           directory after repo rebuild. The default preserve list depends on
611           the repository type, and can be displayed with the stats command.
612
613           If reposurgeon has a nonempty legacy map, it will be written to a
614           file named legacy-map in the repository subdirectory as though by a
615           legacy write command. (This will normally be the case for
616           Subversion and CVS conversions.)
617
618       preserve [file...]
619           Add (presumably untracked) files or directories to the repo's list
620           of paths to be restored from the backup directory after a rebuild.
621           Each argument, if any, is interpreted as a pathname. The current
622           preserve list is displayed afterwards.
623
624           It is only necessary to use this feature if your version-control
625           system lacks a command to list files under version control. Under
626           systems with such a command (which include git and hg), all files
627           that are neither beneath the repository dot directory nor under
628           reposurgeon temporary directories are preserved automatically.
629
630       unpreserve [file...]
631           Remove (presumably untracked) files or directories from the repo's
632           list of paths to be restored from the backup directory after a
633           rebuild. Each argument, if any, is interpreted as a pathname. The
634           current preserve list is displayed afterwards.
635
636   INFORMATION AND REPORTS
637       Commands in this group report information about the selected
638       repository.
639
640       The output of these commands can individually be redirected to a named
641       output file. Where indicated in the syntax, you can prefix the output
642       filename with “>” and give it as a following argument. If you use “>>”
643       the file is opened for append rather than write.
644
645       list [>outfile]
646           This is the main command for identifying the events you want to
647           modify. It lists commits in the selection set by event sequence
648           number with summary information. The first column is raw event
649           numbers, the second a timestamp in local time. If the repository
650           has legacy IDs, they will be displayed in the third column. The
651           leading portion of the comment follows.
652
653       stamp [>outfile]
654           Alternative form of listing that displays full action stamps,
655           usable as references in selections. Supports > redirection.
656
657       tip [>outfile]
658           Display the branch tip names associated with commits in the
659           selection set. These will not necessarily be the same as their
660           branch fields (which will often be tag names if the repo contains
661           either annotated or lightweight tags).
662
663           If a commit is at a branch tip, its tip is its branch name. If it
664           has only one child, its tip is the child's tip. If it has multiple
665           children, then if there is a child with a matching branch name its
666           tip is the child's tip. Otherwise this function throws a
667           recoverable error.
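
           The tip rule above can be sketched in Python as follows. The
           dictionary layout (a "branch" string and a "children" list per
           commit) and the TipError name are hypothetical, and "at a branch
           tip" is taken here to mean "has no children", which is an
           assumption rather than something the text states.

```python
class TipError(Exception):
    """Stands in for the recoverable error mentioned in the text."""

def tip(commits, ident):
    """Return the branch tip name associated with commit `ident`."""
    commit = commits[ident]
    children = commit["children"]
    if not children:
        # Assumed meaning of "at a branch tip": no children.
        return commit["branch"]
    if len(children) == 1:
        return tip(commits, children[0])
    # Multiple children: follow one whose branch name matches ours.
    for child in children:
        if commits[child]["branch"] == commit["branch"]:
            return tip(commits, child)
    raise TipError("no child of %s shares its branch name" % ident)
```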
668
669       tags [>outfile]
670           Display tags and resets: three fields, an event number and a type
671           and a name. Branch tip commits associated with tags are also
672           displayed with the type field 'commit'. Supports > redirection.
673
674       stats [repo-name...] [>outfile]
675           Report size statistics and import/export method information about
676           named repositories, or with no argument the currently chosen
677           repository.
678
679       count [>outfile]
680           Report a count of items in the selection set. Default set is
681           everything in the currently-selected repo. Supports > redirection.
682
683       inspect [>outfile]
684           Dump a fast-import stream representing selected events to standard
685           output. Just like a write, except (1) the progress meter is
686           disabled, and (2) there is an identifying header before each event
687           dump.
688
689       graph [>outfile]
690           Emit a visualization of the commit graph in the DOT markup language
691           used by the graphviz tool suite. This can be fed as input to the
692           main graphviz rendering program dot(1), which will yield a viewable
693           image. Supports > redirection.
694
695           You may find a script like this useful:
696
697               graph $1 >/tmp/foo$$
698               shell dot </tmp/foo$$ -Tpng | display -; rm /tmp/foo$$
699
700           You can substitute in your own preferred image viewer, of course.
701
702       sizes [>outfile]
703           Print a report on data volume per branch; takes a selection set,
704           defaulting to all events. The numbers tally the size of
705           uncompressed blobs, commit and tag comments, and other metadata
706           strings (a blob is counted each time a commit points at it).
707
708           The numbers are not an exact measure of storage size: they are
709           intended mainly as a way to get information on how to efficiently
710           partition a repository that has become large enough to be unwieldy.
711
712           Supports > redirection.
713
714       lint [>outfile]
715           Look for DAG and metadata configurations that may indicate a
716           problem. Presently checks for: (1) mid-branch deletes, (2)
717           disconnected commits, (3) parentless commits, (4) the existence of
718           multiple roots, (5) committer and author IDs that don't look
719           well-formed as DVCS IDs, (6) multiple child links with identical
720           branch labels descending from the same commit, (7) time and
721           action-stamp collisions.
722
723           Options to issue only partial reports are supported; "lint
724           --options" or "lint -?" lists them.
725
726           The options and output format of this command are unstable; they
727           may change without notice as more sanity checks are added.
728
729       when >timespec
730           Interconvert between git timestamps (integer Unix time plus TZ) and
731           RFC3339 format. Takes one argument, autodetects the format. Useful
732           when eyeballing export streams. Also accepts any other supported
733           date format and converts to RFC3339.
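
           As an illustration of the two formats involved, here is a small
           Python sketch of the interconversion. It is not the tool's code:
           the function names are hypothetical, only these two formats are
           handled, and rendering the RFC3339 form in UTC with a trailing
           "Z" (so that the input timezone offset is discarded) is an
           assumption.

```python
from datetime import datetime, timezone

RFC3339_UTC = "%Y-%m-%dT%H:%M:%SZ"

def git_to_rfc3339(stamp):
    """Convert 'seconds +hhmm' to an RFC3339 string in UTC."""
    seconds, _offset = stamp.split()
    # The offset does not change the instant, so it is ignored here.
    moment = datetime.fromtimestamp(int(seconds), tz=timezone.utc)
    return moment.strftime(RFC3339_UTC)

def rfc3339_to_git(stamp):
    """Convert an RFC3339 UTC string to 'seconds +0000'."""
    moment = datetime.strptime(stamp, RFC3339_UTC)
    moment = moment.replace(tzinfo=timezone.utc)
    return "%d +0000" % int(moment.timestamp())

def when(stamp):
    """Autodetect the format: RFC3339 stamps contain a 'T' separator."""
    if "T" in stamp:
        return rfc3339_to_git(stamp)
    return git_to_rfc3339(stamp)
```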
734
735   SURGICAL OPERATIONS
736       These are the operations the rest of reposurgeon is designed to
737       support.
738
739       squash [policy...]
740           Combine or delete commits in a selection set of events. The default
741           selection set for this command is empty. Has no effect on events
742           other than commits unless the --delete policy is selected; see the
743           'delete' command for discussion.
744
745           Normally, when a commit is squashed, its file operation list (and
746           any associated blob references) gets either prepended to the
747           beginning of the operation list of each of the commit's children or
748           appended to the operation list of each of the commit's parents.
749           Then children of a deleted commit get it removed from their parent
750           set and its parents added to their parent set.
751
752           The default is to squash forward, modifying children; but see the
753           list of policy modifiers below for how to change this.
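
           The default squash-forward behavior can be modeled roughly as
           below. This Python sketch uses a hypothetical dictionary layout
           ("parents", "children", and "fileops" per commit) and ignores
           tags and the later normalization of operation lists; it only
           shows how fileops and parent links move.

```python
def _append_unique(target, items):
    """Append each item to target unless it is already present."""
    for item in items:
        if item not in target:
            target.append(item)

def squash_forward(commits, victim_id):
    """Remove one commit, pushing its fileops onto each of its children."""
    victim = commits.pop(victim_id)
    for child_id in victim["children"]:
        child = commits[child_id]
        # Prepend the victim's fileops to the child's operation list.
        child["fileops"] = list(victim["fileops"]) + child["fileops"]
        # Replace the victim in the parent set with the victim's parents.
        new_parents = []
        for parent_id in child["parents"]:
            if parent_id == victim_id:
                _append_unique(new_parents, victim["parents"])
            else:
                _append_unique(new_parents, [parent_id])
        child["parents"] = new_parents
    for parent_id in victim["parents"]:
        parent = commits[parent_id]
        kids = [k for k in parent["children"] if k != victim_id]
        _append_unique(kids, victim["children"])
        parent["children"] = kids
```

           Note that the fileops land in the child, which is why squashing
           a selection merges it into the first commit after the selected
           set.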
754
755               Warning
756               It is easy to get the bounds of a squash command wrong, with
757               confusing and destructive results. Beware thinking you can
758               squash on a selection set to merge all commits except the last
759               one into the last one; what you will actually do is to merge
760               all of them to the first commit after the selected set.

761           Normally, any tag pointing to a combined commit will also be pushed
762           forward. But see the list of policy modifiers below for how to
763           change this.
764
765           Following all operation moves, every one of the altered file
766           operation lists is reduced to a shortest normalized form. The
767           normalized form detects various combinations of modification,
768           deletion, and renaming and simplifies the operation sequence as
769           much as it can without losing any information.
770
771           After canonicalization, a file op list may still end up containing
772           multiple M operations on the same file. Normally the tool utters a
773           warning when this occurs but does not try to resolve it.
774
775           The following modifiers change these policies:
776
777           --delete
778               Simply discards all file ops and tags associated with deleted
779               commit(s).
780
781           --coalesce
782               Discard all M operations (and associated blobs) except the
783               last.
784
785           --pushback
786               Append fileops to parents, rather than prepending to children.
787
788           --pushforward
789               Prepend fileops to children. This is the default; it can be
790               specified in a lift script for explicitness about intentions.
791
792           --tagforward
793               With the "--tagforward" modifier, any tag on the deleted commit
794               is pushed forward to the first child rather than being deleted.
795               This is the default; it can be specified for explicitness.
796
797           --tagback
798               With the "--tagback" modifier, any tag on the deleted commit is
799               pushed backward to the first parent rather than being deleted.
800
801           --quiet
802               Suppresses warning messages about deletion of commits with
803               non-delete fileops.
804
805           --complain
806               The opposite of quiet. Can be specified for explicitness.
807
808           Under any of these policies except “--delete”, deleting a commit
809           that has children does not back out the changes made by that
810           commit, as they will still be present in the blobs attached to
811           versions past the end of the deletion set. All a delete does when
812           the commit has children is lose the metadata information about when
813           and by whom those changes were actually made; after the delete any
814           such changes will be attributed to the first undeleted children of
815           the deleted commits. It is expected that this command will be
816           useful mainly for removing commits mechanically generated by
817           repository converters such as cvs2svn.
818
819       delete [policy...]
820           Delete a selection set of events. The default selection set for
821           this command is empty. On a set of commits, this is equivalent to a
822           squash with the --delete flag. It unconditionally deletes tags,
823           resets, and passthroughs; blobs can be removed only as a side
824           effect of deleting every commit that points at them.
825
826       divide parent [child]
827           Attempt to partition a repo by cutting the parent-child link
828           between two specified commits (they must be adjacent). Does not
829           take a general selection set. It is only necessary to specify the
830           parent commit, unless it has multiple children in which case the
831           child commit must follow (separate it with a comma).
832
833           If the repo was named 'foo', you will normally end up with two
834           repos named 'foo-early' and 'foo-late' (option and feature events
835           at the beginning of the early segment will be duplicated onto the
836           beginning of the late one). But if the commit graph would remain
837           connected through another path after the cut, the behavior changes.
838           In this case, if the parent and child were on the same branch
839           'qux', the branch segments are renamed 'qux-early' and 'qux-late'
840           but the repo is not divided.
841
842       expunge [--notagify] [path | /regexp/]...
843           Expunge files from the selected portion of the repo history; the
844           default is the entire history. The arguments to this command may be
845           paths or Python regular expressions matching paths (regexps must be
846           marked by being surrounded with //).
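
           A rough Python sketch of how such arguments might be told apart
           and applied. The helper names are hypothetical, and two details
           are assumptions not confirmed by the text: plain paths are
           compared for exact equality, and regexps are applied with
           re.search() rather than being anchored.

```python
import re

def compile_specs(args):
    """Turn expunge-style arguments into a list of path predicates."""
    matchers = []
    for arg in args:
        if len(arg) >= 2 and arg.startswith("/") and arg.endswith("/"):
            # Surrounded by slashes: treat the interior as a regexp.
            matchers.append(re.compile(arg[1:-1]).search)
        else:
            # Otherwise treat the argument as a literal path.
            matchers.append(lambda path, literal=arg: path == literal)
    return matchers

def is_expunged(path, matchers):
    """True if any predicate matches the path."""
    return any(matcher(path) for matcher in matchers)
```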
847
848           All filemodify (M) operations and delete (D) operations involving a
849           matched file in the selected set of events are disconnected from
850           the repo and put in a removal set. Renames are followed as the tool
851           walks forward in the selection set; each triggers a warning
852           message. If a selected file is a copy (C) target, the copy will be
853           deleted and a warning message issued. If a selected file is a copy
854           source, the copy target will be added to the list of paths to be
855           deleted and a warning issued.
856
857           After file expunges have been performed, any commits with no
858           remaining file operations will be removed, and any tags pointing to
859           them. By default each deleted commit is replaced with a tag of the
860           form 'emptycommit-ident' on the preceding commit unless --notagify
861           is specified as an argument. Commits with deleted fileops pointing
862           both in and outside the path set are not deleted, but are cloned
863           into the removal set.
864
865           The removal set is not discarded. It is assembled into a new
866           repository named after the old one with the suffix "-expunges"
867           added. Thus, this command can be used to carve a repository into
868           sections by file path matches.
869
870       tagify [--canonicalize] [--tipdeletes] [--tagify-merges]
871           Search for empty commits and turn them into tags. Takes an optional
872           selection set argument defaulting to all commits. For each commit
873           in the selection set, turn it into a tag with the same message and
874           author information if it has no fileops. By default merge commits
875           are not considered, even if they have no fileops (thus no tree
876           differences with their first parent). To change that, use the
877           --tagify-merges option.
878
879           The name of the generated tag will be 'emptycommit-ident', where
880           ident is generated from the legacy ID of the deleted commit, or
881           from its mark, or from its index in the repository, with a
882           disambiguation suffix if needed.
883
884           With the --canonicalize option, tagify tries harder to detect
885           trivial commits by first ensuring that all fileops of selected
886           commits will have an actual effect when processed by fast-import.
887
888           With the --tipdeletes option, tagify also considers branch tips
889           with only deleteall fileops to be candidates for tagification. The
890           corresponding tags get names of the form 'tipdelete-branchname'
891           rather than the default 'emptycommit-ident'.
892
893           With the --tagify-merges option, tagify also tagifies merge commits
894           that have no fileops. When this is done the merge link is moved to
895           the tagified commit's parent.
896
897       coalesce [--debug|--changelog] [timefuzz]
898           Scan the selection set for runs of commits with identical comments
899           close to each other in time (this is a common form of scar tissue
900           in repository up-conversions from older file-oriented
901           version-control systems). Merge these cliques by deleting all but
902           the last commit, in order; fileops from the deleted commits are
903           pushed forward to that last one.
904
905           The optional timefuzz argument, if present, is a maximum time
906           separation in seconds; the default is 90 seconds.
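
           The clique detection can be pictured with the following Python
           sketch. It is illustrative only: the function name find_cliques
           and the (timestamp, comment) tuple layout are hypothetical, and
           measuring the separation between adjacent commits (inclusive of
           the limit) is an assumption about what "close to each other in
           time" means.

```python
def find_cliques(commits, timefuzz=90):
    """Group runs of commits with identical comments close in time.

    `commits` is a list of (timestamp_seconds, comment) in event order.
    Returns lists of indices; runs of length one are not reported.
    """
    cliques = []
    run = []
    for index, (stamp, comment) in enumerate(commits):
        if run:
            prev_stamp, prev_comment = commits[run[-1]]
            if comment == prev_comment and abs(stamp - prev_stamp) <= timefuzz:
                run.append(index)
                continue
            if len(run) > 1:
                cliques.append(run)
        run = [index]
    if len(run) > 1:
        cliques.append(run)
    return cliques
```

           Within each reported run, every commit but the last would be
           deleted and its fileops pushed forward to the last one.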
907
908           The default selection set for this command is =C, all commits.
909           Occasionally you may want to restrict it, for example to avoid
910           coalescing unrelated cliques of "*** empty log message ***" commits
911           from CVS lifts.
912
913           With the --debug option, show messages about mismatches.
914
915           With the --changelog option, any commit with a comment containing
916           the string 'empty log message' (such as is generated by CVS) and
917           containing exactly one file operation modifying a path ending in
918           ChangeLog is treated specially. Such ChangeLog commits are
919           considered to match any commit before them by content, and will
920           coalesce with it if the committer matches and the commit separation
921           is small enough. This option handles a convention used by Free
922           Software Foundation projects.
923
924       split {at|by} item
925           The first argument is required to be a commit location; the second
926           is a preposition which indicates which splitting method to use. If
927           the preposition is 'at', then the third argument must be an integer
928           1-origin index of a file operation within the commit. If it is
929           'by', then the third argument must be a pathname to be
930           prefix-matched (pathname match is done first).
931
932           The commit is copied and inserted into a new position in the event
933           sequence, immediately following itself; the duplicate becomes the
934           child of the original, and replaces it as parent of the original's
935           children. Commit metadata is duplicated; the new commit then gets a
936           new mark. If the new commit has a legacy ID, the suffix '.split' is
937           appended to it.
938
939           Finally, some file operations - starting at the one matched or
940           indexed by the split argument - are moved forward from the original
941           commit into the new one. Legal indices are 2-n, where n is the
942           number of file operations in the original commit.
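
           For the 'at' form, the division of the operation list can be
           sketched in Python like this. The function name split_fileops is
           hypothetical; the index is 1-origin and restricted to 2 through
           n, as stated above.

```python
def split_fileops(fileops, index):
    """Split a fileop list at a 1-origin index.

    Returns (kept, moved): `kept` stays in the original commit and
    `moved`, starting at the indexed operation, goes to the new commit.
    """
    count = len(fileops)
    if not 2 <= index <= count:
        raise ValueError("legal indices are 2-%d" % count)
    return fileops[: index - 1], fileops[index - 1 :]
```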
943
944       add {D path | M perm mark path | R source target | C source target}
945           To a specified commit, add a specified fileop.
946
947           For a D operation to be valid there must be an M operation for the
948           path in the commit's ancestry. For an M operation to be valid, the
949           'perm' part must be a token ending with 755 or 644 and the 'mark'
950           must refer to a blob that precedes the commit location. For an R or
951           C operation to be valid, there must be an M operation for the
952           source in the commit's ancestry.
953
954       remove [index | path | deletes] [to commit]
955           From a specified commit, remove a specified fileop. The op must be
956           one of (a) the keyword “deletes”, (b) a file path, (c) a file path
957           preceded by an op type set (some subset of the letters DMRCN), or
958           (d) a 1-origin numeric index. The “deletes” keyword selects all D
959           fileops in the commit; the others select one each.
960
961           If the “to” clause is present, the removed op is appended to the
962           commit specified by the following singleton selection set. This
963           option cannot be combined with “deletes”.
964
965           Note that this command does not attempt to scavenge blobs even if
966           the deleted fileop might be the only reference to them. This
967           behavior may change in a future release.
968
969       blob
970           Create a blob at mark :1 after renumbering other marks starting
971           from :2. Data is taken from stdin, which may be a here-doc. This
972           can be used with the add command to patch synthetic data into a
973           repository.
974
975       renumber
976           Renumber the marks in a repository, from :1 up to :<n> where <n> is
977           the count of the last mark. Just in case an importer ever cares
978           about mark ordering or gaps in the sequence.
979
980           A side effect of this command is to clean up stray "done"
981           passthroughs that may have entered the repository via graft
982           operations. After a renumber, the repository will have at most one
983           "done" and it will be at the end of the events.
984
985       mailbox_out [>outfile]
986           Emit a mailbox file of messages in RFC822 format representing the
987           contents of repository metadata. Takes a selection set; members of
988           the set other than commits, annotated tags, and passthroughs are
989           ignored (that is, presently, blobs and resets).
990
991           The output from this command can optionally be redirected to a
992           named output file. Prefix the filename with “>” and give it as a
993           following argument.
994
995           May have an option --filter, followed by = and a /-enclosed regular
996           expression. If this is given, only headers with names matching it
997           are emitted. In this context the name of the header includes its
998           trailing colon.
999
1000       mailbox_in [<infile] [--changed >outfile]
1001           Accept a mailbox file of messages in RFC822 format representing the
1002           contents of the metadata in selected commits and annotated tags.
1003           Takes no selection set. If there is an argument it will be taken as
1004           the name of a mailbox file to read from; with no argument, or an
1005           argument of '-', it reads from standard input.
1006
1007           Users should be aware that modifying an Event-Number or Event-Mark
1008           field will change which event the update from that message is
1009           applied to. This is unlikely to have good results.
1010
1011           If the Event-Number and Event-Mark fields are absent, the
1012           mailbox_in logic will attempt to match the commit or tag first by
1013           Legacy-ID, then by a unique committer ID and timestamp pair.
1014
1015           If output is redirected and the modifier “--changed” appears, a
1016           minimal set of modifications actually made is written to the output
1017           file in a form that can be fed back in.
1018
1019       setfield attribute value
1020           In the selected objects (defaulting to none) set every instance of
1021           a named field to a string value. The string may be quoted to
1022           include whitespace, and use backslash escapes interpreted by the
1023           Python string-escape codec, such as \n and \t.
1024
1025           Attempts to set nonexistent attributes are ignored. Valid values
1026           for the attribute are internal Python field names; in particular,
1027           for commits, “comment” and “branch” are legal. Consult the source
1028           code for other interesting values.
1029
1030       append [--rstrip] [>text]
1031           Append text to the comments of commits and tags in the specified
1032           selection set. The text is the first token of the command and may
1033           be a quoted string. C-style escape sequences in the string are
1034           interpreted using Python's string-escape codec.
1035
1036           If the option --rstrip is given, the comment is right-stripped
1037           before the new text is appended.
1038
1039       filter [--shell|--regex|--replace|--dedos]
1040           Run blobs, commit comments, or tag comments in the selection set
1041           through the filter specified on the command line.
1042
1043           In any mode other than --dedos, attempting to specify a selection
1044           set including both blobs and non-blobs (that is, commits or tags)
1045           throws an error. Inline content in commits is filtered when the
1046           selection set contains (only) blobs and the commit is within the
1047           range bounded by the earliest and latest blob in the specification.
1048
1049           When filtering blobs, if the command line contains the magic cookie
1050           '%PATHS%' it is replaced with a space-separated list of all paths
1051           that reference the blob.
1052
1053           With --shell, the remainder of the line specifies a filter as a
1054           shell command. Each blob or comment is presented to the filter on
1055           standard input; the content is replaced with whatever the filter
1056           emits to standard output.
1057
1058           With --regex, the remainder of the line is expected to be a Python
1059           regular expression substitution written as /from/to/ with from and
1060           to being passed as arguments to the standard re.sub() function and
1061           it applied to modify the content. Actually, any non-space character
1062           will work as a delimiter in place of the /; this makes it easier to
1063           use / in patterns. Ordinarily only the first such substitution is
1064           performed; putting 'g' after the slash replaces globally, and a
1065           numeric literal gives the maximum number of substitutions to
1066           perform. Other flags available restrict substitution scope - 'c'
1067           for comment text only, 'C' for committer name only, 'a' for author
1068           names only. Note that parsing of a --regex argument will be
1069           confused by any substring consisting of whitespace followed by #;
1070           use "\s" rather than whitespace to avoid this.
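
           The substitution count rules map naturally onto the count
           argument of re.sub(). The Python sketch below illustrates that
           mapping under simplifying assumptions: the function name
           regex_filter is hypothetical, the scope flags 'c', 'C', and 'a'
           are ignored, and delimiters escaped inside the pattern are not
           supported.

```python
import re

def regex_filter(spec, text):
    """Apply a <delim>from<delim>to<delim>[flags] substitution to text."""
    delimiter = spec[0]
    parts = spec[1:].split(delimiter)
    if len(parts) != 3:
        raise ValueError("expected <delim>from<delim>to<delim>[flags]")
    pattern, replacement, flags = parts
    if "g" in flags:
        count = 0  # re.sub() treats a count of 0 as "replace all"
    else:
        digits = "".join(ch for ch in flags if ch.isdigit())
        count = int(digits) if digits else 1  # default: first match only
    return re.sub(pattern, replacement, text, count=count)
```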
1071
1072           With --replace, the behavior is like --regex but the expressions
1073           are not interpreted as regular expressions. (This is slightly
1074           faster.)
1075
1076           With --dedos, DOS/Windows-style \r\n line terminators are replaced
1077           with \n.
1078
1079       transcode codec
1080           Transcode blobs, commit comments and committer/author names, or tag
1081           comments and tag committer names in the selection set to UTF-8 from
1082           the character encoding specified on the command line.
1083
1084           Attempting to specify a selection set including both blobs and
1085           non-blobs (that is, commits or tags) throws an error. Inline
1086           content in commits is filtered when the selection set contains
1087           (only) blobs and the commit is within the range bounded by the
1088           earliest and latest blob in the specification.
1089
1090           The encoding argument must name one of the codecs known to the
1091           Python standard codecs library. In particular, 'latin-1' is a valid
1092           codec name.
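
           In Python terms, the conversion amounts to a decode followed by
           an encode. A minimal sketch, with a hypothetical function name;
           it operates on a single byte string rather than on repository
           objects:

```python
import codecs

def transcode(data, codec):
    """Re-encode bytes from the named codec to UTF-8."""
    # codecs.lookup() raises LookupError if the codec name is unknown.
    codecs.lookup(codec)
    return data.decode(codec).encode("utf-8")
```

           For example, the Latin-1 byte 0xE9 (e with acute accent) becomes
           the two-byte UTF-8 sequence 0xC3 0xA9.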
1093
1094           Errors in this command are fatal, because an error may leave
1095           repository objects in a damaged state.
1096
1097           The theory behind the design of this command is that the repository
1098           might contain a mixture of encodings used to enter commit metadata
1099           by different people at different times. After using =I to identify
1100           metadata containing non-Unicode high bytes in text, a human must
1101           use context to identify which particular encodings were used in
1102           particular event spans and compose appropriate transcode commands
1103           to fix them up.
1104
1105       edit
1106           Report the selection set of events to a tempfile as mailbox_out
1107           does, call an editor on it, and update from the result as
1108           mailbox_in does. If you do not specify an editor name as second
1109           argument, it will be taken from the $EDITOR variable in your
1110           environment.
1111
1112           Normally this command ignores blobs because mailbox_out does.
1113           However, if you specify a selection set consisting of a single
1114           blob, your editor will be called directly on the blob file.
1115
1116       timeoffset offset [timezone]
1117           Apply a time offset to all time/date stamps in the selected set. An
1118           offset argument is required; it may be in the form [+-]ss,
1119           [+-]mm:ss or [+-]hh:mm:ss. The leading sign is required to
1120           distinguish it from a selection expression.
1121
1122           Optionally you may also specify another argument in the form
1123           [+-]hhmm, a timezone literal to apply. To apply a timezone without
1124           an offset, use an offset literal of +0 or -0.
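
           The three offset forms can be reduced to a signed count of
           seconds. The following is a minimal sketch of such a parser,
           not reposurgeon's own code; the name parse_offset is
           hypothetical.

```python
import re

# Optional hh: and mm: fields, mandatory sign and seconds field.
_OFFSET_RE = re.compile(r"^([+-])(?:(?:(\d+):)?(\d+):)?(\d+)$")

def parse_offset(text):
    """Convert [+-]ss, [+-]mm:ss or [+-]hh:mm:ss into signed seconds."""
    m = _OFFSET_RE.match(text)
    if m is None:
        raise ValueError("offset needs a leading sign: %r" % text)
    sign, hh, mm, ss = m.groups()
    seconds = int(hh or 0) * 3600 + int(mm or 0) * 60 + int(ss)
    return -seconds if sign == "-" else seconds
```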
1125
1126       unite [--prune] reponame...
1127           Unite repositories. Name any number of loaded repositories; they
1128           will be united into one union repo and removed from the load list.
1129           The union repo will be selected.
1130
1131           The root of each repo (other than the oldest repo) will be grafted
1132           as a child to the last commit in the dump with a preceding commit
1133           date. This will produce a union repository with one branch for each
1134           part. Running last to first, duplicate tag and branch names will be
1135           disambiguated using the source repository name (thus, recent
1136           duplicates will get priority over older ones). After all grafts,
1137           marks will be renumbered.
1138
1139           The name of the new repo will be the names of all parts
1140           concatenated, separated by '+'. It will have no source directory or
1141           preferred system type.
1142
           With the option --prune, at each join D operations for every
           ancestral file existing will be prepended to the root commit; then
           it will be canonicalized using the rules for squashing. The effect
           will be that only files with properly matching M, R, and C
           operations in the root survive.
1148
1149       graft [--prune] reponame
1150           For when unite doesn't give you enough control. This command may
1151           have either of two forms, selected by the size of the selection
1152           set. The first argument is always required to be the name of a
1153           loaded repo.
1154
           If the selection set is of size 1, it must identify a single commit
           in the currently chosen repo; in this case the named repo's root
           will become a child of the specified commit. If the selection set
           is empty, the named repo must contain one or more callouts matching
           commits in the currently chosen repo.
1160
1161           Labels and branches in the named repo are prefixed with its name;
1162           then it is grafted to the selected one. Any other callouts in the
1163           named repo are also resolved in the context of the currently chosen
1164           one. Finally, the named repo is removed from the load list.
1165
1166           With the option --prune, prepend a deleteall operation into the
1167           root of the grafted repository.
1168
1169       path [source] rename [--force] [target]
1170           Rename a path in every fileop of every selected commit. The default
1171           selection set is all commits. The first argument is interpreted as
1172           a Python regular expression to match against paths; the second may
1173           contain back-reference syntax.
1174
1175           Ordinarily, if the target path already exists in the fileops, or is
1176           visible in the ancestry of the commit, this command throws an
1177           error. With the --force option, these checks are skipped.
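
           The effect of a regular-expression rename with a back-reference
           can be previewed in plain Python. This sketch assumes
           substitution in the style of re.sub and checks collisions only
           against the list it is given; the function name rename_all is
           hypothetical, and reposurgeon's real checks also look at the
           commit's ancestry.

```python
import re

def rename_all(paths, source, target, force=False):
    """Apply a regex rename to each path, refusing collisions unless forced."""
    existing = set(paths)
    renamed = []
    for path in paths:
        new = re.sub(source, target, path)
        if new != path and new in existing and not force:
            raise ValueError("target path already exists: %s" % new)
        renamed.append(new)
    return renamed
```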
1178
1179       paths [{sub|sup}] [dirname] [>outfile]
1180           Takes a selection set. Without a modifier, list all paths touched
1181           by fileops in the selection set (which defaults to the entire
1182           repo). This reporting variant does >-redirection.
1183
1184           With the 'sub' modifier, take a second argument that is a directory
1185           name and prepend it to every path. With the 'sup' modifier, strip
1186           any directory argument from the start of the path if it appears
1187           there; with no argument, strip the first directory component from
1188           every path.
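
           As a rough illustration of the two modifiers, the path
           transformations might look like this in Python. The function
           names sub_path and sup_path are hypothetical and the edge-case
           handling is an assumption.

```python
def sub_path(path, dirname):
    """Prepend a directory name to a path ('sub' modifier)."""
    return dirname.rstrip("/") + "/" + path

def sup_path(path, dirname=None):
    """Strip a leading directory from a path ('sup' modifier).

    With no dirname, drop the first directory component, if there is one.
    """
    if dirname is None:
        head, sep, tail = path.partition("/")
        return tail if sep else path
    prefix = dirname.rstrip("/") + "/"
    return path[len(prefix):] if path.startswith(prefix) else path
```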
1189
1190       merge
1191           Create a merge link. Takes a selection set argument, ignoring all
1192           but the lowest (source) and highest (target) members. Creates a
1193           merge link from the highest member (child) to the lowest (parent).
1194
1195       unmerge
1196           Linearize a commit. Takes a selection set argument, which must
1197           resolve to a single commit, and removes all its parents except for
1198           the first.
1199
           It is equivalent to reparent first_parent,commit rebase, where
           commit is the same selection set as used with unmerge and
           first_parent is a set resolving commit's first parent (see the
           reparent command below).
1204
           The main benefit of unmerge is that you don't have to find and
           specify the first parent yourself, saving time and avoiding errors
           when nearby surgery would make a manual first parent argument
           stale.
1209
1210       reparent [options...] [policy]
1211           Changes the parent list of a commit. Takes a selection set, zero or
1212           more option arguments, and an optional policy argument.
1213
1214           Selection set:
1215               The selection set must resolve to one or more commits. The
1216               selected commit with the highest event number (not necessarily
1217               the last one selected) is the commit to modify. The remainder
1218               of the selected commits, if any, become its parents: the
1219               selected commit with the lowest event number (which is not
1220               necessarily the first one selected) becomes the first parent,
1221               the selected commit with second lowest event number becomes the
1222               second parent, and so on. All original parent links are
1223               removed. Examples:
1224
1225                   # this makes 17 the parent of 33
1226                   17,33 reparent
1227
1228                   # this also makes 17 the parent of 33
1229                   33,17 reparent
1230
1231                   # this makes 33 a root (parentless) commit
1232                   33 reparent
1233
1234                   # this makes 33 an octopus merge commit.  its first parent
1235                   # is commit 15, second parent is 17, and third parent is 22
1236                   22,33,15,17 reparent
1237
1238           Options:
1239
1240               --use-order
1241                   Use the selection order to determine which selected commit
1242                   is the commit to modify and which are the parents (and if
1243                   there are multiple parents, their order). The last selected
1244                   commit (not necessarily the one with the highest event
1245                   number) is the commit to modify, the first selected commit
1246                   (not necessarily the one with the lowest event number)
1247                   becomes the first parent, the second selected commit
1248                   becomes the second parent, and so on. Examples:
1249
1250                       # this makes 33 the parent of 17
1251                       33|17 reparent --use-order
1252
1253                       # this makes 17 an octopus merge commit.  its first parent
1254                       # is commit 22, second parent is 33, and third parent is 15
1255                       22,33,15|17 reparent --use-order
1256
1257                   Because ancestor commit events must appear before their
1258                   descendants, giving a commit with a low event number a
1259                   parent with a high event number triggers a re-sort of the
1260                   events. A re-sort assigns different event numbers to some
1261                   or all of the events. Re-sorting only works if the
1262                   reparenting does not introduce any cycles. To swap the
1263                   order of two commits that have an ancestor–descendant
1264                   relationship without introducing a cycle during the
1265                   process, you must reparent the descendant commit first.
1266
1267           Policy:
1268               By default, the manifest of the reparented commit is computed
1269               before modifying it; a deleteall and some fileops are prepended
1270               so that the manifest stays unchanged even when the first parent
1271               has been changed. This behavior can be changed by specifying a
1272               policy:
1273
1274               rebase
1275                   Inhibits the default behavior—no deleteall is issued and
1276                   the tree contents of all descendents can be modified as a
1277                   result.
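
           The selection rules above, with and without --use-order, can be
           summarized in a few lines of Python. This is only a model of
           the documented behavior; the name plan_reparent is hypothetical
           and event numbers stand in for commits.

```python
def plan_reparent(selection, use_order=False):
    """Return (commit_to_modify, parents) for event numbers in selection order.

    Without use_order the selection is sorted by event number first, so the
    highest-numbered commit is modified and the rest become its parents.
    """
    if not selection:
        raise ValueError("selection must contain at least one commit")
    ordered = list(selection) if use_order else sorted(selection)
    return ordered[-1], ordered[:-1]
```

           For instance, plan_reparent([22, 33, 15, 17]) yields commit 33
           with parents 15, 17 and 22, while the same list with
           use_order=True yields commit 17 with parents 22, 33 and 15.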
1278
1279       branch branchname... {rename|delete} [arg]
           Rename or delete a branch (and any associated resets). First
           argument must be an existing branch name; second argument must be
           one of the verbs 'rename' or 'delete'.
1283
1284           For a 'rename', the third argument may be any token that is a
1285           syntactically valid branch name (but not the name of an existing
1286           branch). For a 'delete', no third argument is required.
1287
1288           For either name, if it does not contain a '/' the prefix
1289           'refs/heads' is prepended.
1290
       tag tagname... {move|rename|delete} [arg]
1292           Create, move, rename, or delete a tag.
1293
           Creation is a special case. First argument is a name, which must
1295           not be an existing tag. Takes a singleton event second argument
1296           which must point to a commit. A tag object pointing to the commit
1297           is created and inserted just after the last tag in the repo (or
1298           just after the last commit if there are no tags). The tagger,
1299           committish, and comment fields are copied from the commit's
1300           committer, mark, and comment fields.
1301
1302           Otherwise, first argument must be an existing tag name; second
1303           argument must be one of the verbs “move”, “rename”, or “delete”.
1304
1305           For a “move”, a third argument must be a singleton selection set.
1306           For a “rename”, the third argument may be any token that is a
1307           syntactically valid tag name (but not the name of an existing tag).
1308           For a “delete”, no third argument is required.
1309
1310           The behavior of this command is complex because features which
1311           present as tags may be any of three things: (1) True tag objects,
1312           (2) lightweight tags, actually sequences of commits with a common
1313           branchname beginning with “refs/tags” - in this case the tag is
1314           considered to point to the last commit in the sequence, (3) Reset
1315           objects. These may occur in combination; in fact, stream exporters
1316           from systems with annotation tags commonly express each of these as
1317           a true tag object (1) pointing at the tip commit of a sequence (2)
1318           in which the basename of the common branch field is identical to
1319           the tag name. An exporter that generates lightweight-tagged commit
1320           sequences (2) may or may not generate resets pointing at their tip
1321           commits.
1322
1323           This command tries to handle all combinations in a natural way by
1324           doing up to three operations on any true tag, commit sequence, and
1325           reset matching the source name. In a rename, all are renamed
1326           together. In a delete, any matching tag or reset is deleted; then
1327           matching branch fields are changed to match the branch of the
1328           unique descendent of the tagged commit, if there is one. When a tag
1329           is moved, no branch fields are changed and a warning is issued.
1330
1331           Attempts to delete a lightweight tag may fail with the message
1332           “couldn't determine a unique successor”. When this happens, the tag
1333           is on a commit with multiple children that have different branch
1334           labels. There is a hole in the specification of git fast-import
1335           streams that leaves it uncertain how branch labels can be safely
1336           reassigned in this case; rather than do something risky,
1337           reposurgeon throws a recoverable error.
1338
       reset resetname... {move|rename|delete} [arg]
1340           Move, rename, or delete a reset. First argument must match an
1341           existing reset name; second argument must be one of the verbs
1342           “move”, “rename”, or “delete”.
1343
           For a “move”, a third argument must be a singleton selection set.
           For a “rename”, the third argument may be any token that matches a
           syntactically valid reset name (but not the name of an existing
           reset). For a “delete”, no third argument is required.
1348
1349           For either name, if it does not contain a “/” the prefix “heads/”
1350           is prepended. If it does not begin with “refs/”, “refs/” is
1351           prepended.
1352
           An argument matches a reset's name if it is either the entire
           reference (refs/heads/FOO or refs/tags/FOO for some value of FOO),
           the basename (e.g. FOO), or a suffix of the form heads/FOO or
           tags/FOO. An unqualified basename is assumed to refer to a head.
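
           The prefixing and matching rules for reset names amount to a
           small normalization step. A minimal sketch, with hypothetical
           function names, might read:

```python
def canonical_reset_name(name):
    """Expand a short reset name to a full reference."""
    if "/" not in name:
        # An unqualified basename is assumed to refer to a head.
        name = "heads/" + name
    if not name.startswith("refs/"):
        name = "refs/" + name
    return name

def matches_reset(argument, reference):
    """True if a command argument names the given full reference."""
    return canonical_reset_name(argument) == reference
```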
1357
           When a reset is renamed, commit branch fields matching the reset are
1359           renamed with it to match. When a reset is deleted, matching branch
1360           fields are changed to match the branch of the unique descendent of
1361           the tip commit of the associated branch, if there is one. When a
1362           reset is moved, no branch fields are changed.
1363
       debranch source-branch... [target-branch]
1365           Takes one or two arguments which must be the names of source and
1366           target branches; if the second (target) argument is omitted it
1367           defaults to refs/heads/master. Any trailing segment of a branch
1368           name is accepted as a synonym for it; thus master is the same as
1369           refs/heads/master. Does not take a selection set.
1370
1371           The history of the source branch is merged into the history of the
1372           target branch, becoming the history of a subdirectory with the name
1373           of the source branch. Any resets of the source branch are removed.
1374
       strip [blobs|reduce]
1376           Reduce the selected repository to make it a more tractable test
1377           case. Use this when reporting bugs.
1378
1379           With the modifier 'blobs', replace each blob in the repository with
1380           a small, self-identifying stub, leaving all metadata and DAG
1381           topology intact. This is useful when you are reporting a bug, for
1382           reducing large repositories to test cases of manageable size.
1383
1384           A selection set is effective only with the 'blobs' option,
1385           defaulting to all blobs. The 'reduce' mode always acts on the
1386           entire repository.
1387
1388           With the modifier 'reduce', perform a topological reduction that
1389           throws out uninteresting commits. If a commit has all file
1390           modifications (no deletions or copies or renames) and has exactly
1391           one ancestor and one descendant, then it may be boring. To be fully
1392           boring, it must also not be referred to by any tag or reset.
1393           Interesting commits are not boring, or have a non-boring parent or
1394           non-boring child.
1395
1396           With no modifiers, this command strips blobs.
1397
       ignores [rename] [translate] [defaults]
1399           Intelligent handling of ignore-pattern files. This command fails if
1400           no repository has been selected or no preferred write type has been
1401           set for the repository. It does not take a selection set.
1402
1403           If the rename modifier is present, this command attempts to rename
1404           all ignore-pattern files to whatever is appropriate for the
1405           preferred type - e.g. .gitignore for git, .hgignore for hg, etc.
1406           This option does not cause any translation of the ignore files it
1407           renames.
1408
1409           If the translate modifier is present, syntax translation of each
1410           ignore file is attempted. At present, the only transformation the
1411           code knows is to prepend a 'syntax: glob' header if the preferred
1412           type is hg.
1413
1414           If the defaults modifier is present, the command attempts to
1415           prepend these default patterns to all ignore files. If no ignore
1416           file is created by the first commit, it will be modified to create
1417           one containing the defaults. This command will error out on prefer
1418           types that have no default ignore patterns (git and hg, in
1419           particular). It will also error out when it knows the import tool
1420           has already set default patterns.
1421
1422   REFERENCE LIFTING
1423       This group of commands is meant for fixing up references in commits
1424       that are in the format of older version control systems. The general
1425       workflow is this: first, go over the comment history and change all
1426       old-fashioned commit references into machine-parseable cookies. Then,
       automatically turn the machine-parseable cookies into action stamps. The
1428       point of dividing the process this way is that the first part is hard
1429       for a machine to get right, while the second part is prone to errors
1430       when a human does it.
1431
       A Subversion cookie is a comment substring of the form [[SVN:ddddd]]
       (example: [[SVN:2355]]) with the revision read directly via the
       Subversion exporter, deduced from git-svn metadata, or matching a
       $Revision$ header embedded in blob data for the filename.
1436
       A CVS cookie is a comment substring of the form
       [[CVS:filename:revision]] (example: [[CVS:src/README:1.23]]) with the
       revision matching a CVS $Id$ or $Revision$ header embedded in blob data
       for the filename.
1441
       A mark cookie is of the form [[:dddd]] and is simply a reference to the
       specified mark. You may want to hand-patch this in when one of the
       previous forms is inconvenient.
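
       The three cookie forms are regular enough to be picked out of a
       comment with simple patterns. The following Python sketch is an
       illustration only; the pattern and function names are hypothetical,
       and the patterns are inferred from the examples above rather than
       taken from reposurgeon.

```python
import re

SVN_COOKIE = re.compile(r"\[\[SVN:(\d+)\]\]")
CVS_COOKIE = re.compile(r"\[\[CVS:([^:\]]+):([0-9.]+)\]\]")
MARK_COOKIE = re.compile(r"\[\[(:\d+)\]\]")

def find_cookies(comment):
    """Return (kind, value) pairs, grouped by kind: SVN, then CVS, then mark."""
    found = [("SVN", m.group(1)) for m in SVN_COOKIE.finditer(comment)]
    found += [("CVS", m.groups()) for m in CVS_COOKIE.finditer(comment)]
    found += [("mark", m.group(1)) for m in MARK_COOKIE.finditer(comment)]
    return found
```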
1445
1446       An action stamp is an RFC3339 timestamp, followed by a '!', followed by
1447       an author email address (author rather than committer because that
1448       timestamp is not changed when a patch is replayed on to a branch). It
       attempts to refer to a commit without being VCS-specific. Thus, instead
       of "commit 304a53c2" or "r2355", you would write
       "2011-10-25T15:11:09Z!fred@foonly.com".
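
       Building and splitting an action stamp is straightforward. This
       Python sketch assumes the timestamp is normalized to UTC and written
       with a trailing Z, as in the example above; the function names are
       hypothetical.

```python
from datetime import datetime, timezone

def action_stamp(when, email):
    """Format a timezone-aware datetime and an email as an action stamp."""
    utc = when.astimezone(timezone.utc)  # assumption: stamps are kept in UTC
    return utc.strftime("%Y-%m-%dT%H:%M:%SZ") + "!" + email

def split_stamp(stamp):
    """Split an action stamp into its timestamp and email parts."""
    date, _, email = stamp.partition("!")
    return date, email
```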
1452
       The following git aliases allow git to work directly with action
       stamps. Append them to your ~/.gitconfig; if you already have an
       [alias] section, leave off the first line.
1456
1457
1458           [alias]
1459                # git stamp <commit-ish> - print a reposurgeon-style action stamp
1460                stamp = show -s --format='%cI!%ce'
1461
1462                # git scommit <stamp> <rev-list-args> - list most recent commit that matches <stamp>.
1463                # Must also specify a branch to search or --all, after these arguments.
1464                scommit = "!f(){ d=${1%%!*}; a=${1##*!}; arg=\"--until=$d -1\"; if [ $a != $1 ]; then arg=\"$arg --committer=$a\"; fi; shift; git rev-list $arg ${1:+\"$@\"}; }; f"
1465
1466                # git scommits <stamp> <rev-list-args> - as above, but list all matching commits.
1467                scommits = "!f(){ d=${1%%!*}; a=${1##*!}; arg=\"--until=$d --after $d\"; if [ $a != $1 ]; then arg=\"$arg --committer=$a\"; fi; shift; git rev-list $arg ${1:+\"$@\"}; }; f"
1468
1469                # git smaster <stamp> - list most recent commit on master that matches <stamp>.
1470                smaster = "!f(){ git scommit \"$1\" master --first-parent; }; f"
1471                smasters = "!f(){ git scommits \"$1\" master --first-parent; }; f"
1472
1473                # git shs <stamp> - show the commits on master that match <stamp>.
1474                shs = "!f(){ stamp=$(git smasters $1); shift; git show ${stamp:?not found} $*; }; f"
1475
1476                # git slog <stamp> <log-args> - start git log at <stamp> on master
1477                slog = "!f(){ stamp=$(git smaster $1); shift; git log ${stamp:?not found} $*; }; f"
1478
1479                # git sco <stamp> - check out most recent commit on master that matches <stamp>.
1480                sco = "!f(){ stamp=$(git smaster $1); shift; git checkout ${stamp:?not found} $*; }; f"
1481
1482
1483       There is a rare case in which an action stamp will not refer uniquely
1484       to one commit. It is theoretically possible that the same author might
1485       check in revisions on different branches within the one-second
1486       resolution of the timestamps in a fast-import stream. There is nothing
1487       to be done about this; tools using action stamps need to be aware of
1488       the possibility and throw a warning when it occurs.
1489
1490       In order to support reference lifting, reposurgeon internally builds a
1491       legacy-reference map that associates revision identifiers in older
       version-control systems with commits. The contents of this map come
1493       from three places: (1) cvs2svn:rev properties if the repository was
1494       read from a Subversion dump stream, (2) $Id$ and $Revision$ headers in
1495       repository files, and (3) the .git/cvs-revisions created by git
1496       cvsimport.
1497
1498       The detailed sequence for lifting possible references is this: first,
1499       find possible CVS and Subversion references with the references or =N
1500       visibility set; then replace them with equivalent cookies; then run
1501       references lift to turn the cookies into action stamps (using the
1502       information in the legacy-reference map) without having to do the
1503       lookup by hand.
1504
1505       references [list|edit|lift] [>outfile]
1506           With the modifier 'list', list commit and tag comments for strings
1507           that might be CVS- or Subversion-style revision identifiers. This
1508           will be useful when you want to replace them with equivalent
1509           cookies that can automatically be translated into VCS-independent
1510           action stamps. This reporting command supports >-redirection. It is
1511           equivalent to '=N list'.
1512
1513           With the modifier 'edit', edit the set where revision IDs are
1514           found. This is equivalent to '=N edit'.
1515
1516           With the modifier "lift", attempt to resolve Subversion and CVS
1517           cookies in comments into action stamps using the legacy map. An
1518           action stamp is a timestamp/email/sequence-number combination
1519           uniquely identifying the commit associated with that blob, as
1520           described in the section called “TRANSLATION STYLE”.
1521
1522           It is not guaranteed that every such reference will be resolved, or
1523           even that any at all will be. Normally all references in history
1524           from a Subversion repository will resolve, but CVS references are
1525           less likely to be resolvable.
1526
1527   VARIABLES, MACROS AND EXTENSIONS
1528       Occasionally you will need to issue a large number of complex surgical
1529       commands of very similar form, and it's convenient to be able to
1530       package that form so you don't need to do a lot of error-prone typing.
1531       For those occasions, reposurgeon supports simple forms of named
1532       variables and macro expansion.
1533
1534       assign [name]
1535           Compute a leading selection set and assign it to a symbolic name.
1536           It is an error to assign to a name that is already assigned, or to
1537           any existing branch name. Assignments may be cleared by sequence
1538           mutations (though not ordinary deletions); you will see a warning
1539           when this occurs.
1540
           With no selection set and no name, list all assignments.
1542
1543           Use this to optimize out location and selection computations that
1544           would otherwise be performed repeatedly, e.g. in macro calls.
1545
1546       unassign [name]
1547           Unassign a symbolic name. Throws an error if the name is not
1548           assigned.
1549
1550       names [>outfile]
1551           List the names of all known branches and tags. Tells you what
1552           things are legal within angle brackets and parentheses.
1553
1554       define name body
1555           Define a macro. The first whitespace-separated token is the name;
1556           the remainder of the line is the body, unless it is “{”, which
1557           begins a multi-line macro terminated by a line beginning with “}”.
1558
1559           A later “do” call can invoke this macro.
1560
1561           The command “define” by itself without a name or body produces a
1562           macro list.
1563
1564       do name arguments...
1565           Expand and perform a macro. The first whitespace-separated token is
1566           the name of the macro to be called; remaining tokens replace {0},
1567           {1}... in the macro definition (the conventions used are those of
1568           the Python format method). Tokens may contain whitespace if they
1569           are string-quoted; string quotes are stripped. Macros can call
1570           macros.
1571
1572           If the macro expansion does not itself begin with a selection set,
1573           whatever set was specified before the "do" keyword is available to
1574           the command generated by the expansion.
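
           Because the placeholders follow the conventions of the Python
           format method, an expansion can be previewed in plain Python.
           This is a sketch under the assumption that tokens are split
           shell-style; the function name expand_macro and the dictionary
           of macros are hypothetical.

```python
import shlex

def expand_macro(macros, call):
    """Expand a 'name arg...' call line against a dict of macro bodies."""
    tokens = shlex.split(call)  # string quotes are stripped here
    name, args = tokens[0], tokens[1:]
    return macros[name].format(*args)
```

           A body containing literal braces would need them doubled, as
           the format method requires.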
1575
       undefine name
1577           Undefine the named macro.
1578
1579       Here's an example to illustrate how you might use this. In CVS
1580       repositories of projects that use the GNU ChangeLog convention, a very
1581       common pre-conversion artifact is a commit with the comment "***empty
1582       log message***" that modifies only a ChangeLog entry explaining the
1583       commit immediately previous to it. The following
1584
1585           define changelog <{0}> & /empty log message/ squash --pushback
1586           do changelog 2012-08-14T21:51:35Z
1587           do changelog 2012-08-08T22:52:14Z
1588           do changelog 2012-08-07T04:48:26Z
1589           do changelog 2012-08-08T07:19:09Z
1590           do changelog 2012-07-28T18:40:10Z
1591
1592       is equivalent to the more verbose
1593
1594           <2012-08-14T21:51:35Z> & /empty log message/ squash --pushback
1595           <2012-08-08T22:52:14Z> & /empty log message/ squash --pushback
1596           <2012-08-07T04:48:26Z> & /empty log message/ squash --pushback
1597           <2012-08-08T07:19:09Z> & /empty log message/ squash --pushback
1598           <2012-07-28T18:40:10Z> & /empty log message/ squash --pushback
1599
1600       but you are less likely to make difficult-to-notice errors typing the
1601       first version.
1602
1603       (Also note how the text regexp acts as a failsafe against the
1604       possibility of typing a wrong date that doesn't refer to a commit with
1605       an empty comment. This was a real-world example from the CVS-to-git
1606       conversion of groff.)
1607
1608       When even a macro is not enough, you can write and call custom Python
1609       extensions.
1610
1611       exec name
1612           Execute custom code from standard input (normally a file via <
1613           redirection). Use this to set up custom extension functions for
1614           later eval calls. The code has full access to all internal data
1615           structures. Functions defined are accessible to later eval calls.
1616
1617           This can be called in a script with the extension code in a
1618           here-doc.
1619
1620       eval function-name
1621           Evaluate a line of code in the current interpreter context.
1622           Typically this will be a call to a function defined by a previous
1623           exec. The variables _repository and _selection will have the
1624           obvious values. Note that _selection will be a list of integers,
1625           not objects.
1626
1627       script filename [arg...]
1628           Takes a filename and optional following arguments. Reads each line
1629           from the file and executes it as a command.
1630
1631           During execution of the script, the script name replaces the string
1632           $0 and the optional following arguments (if any) replace the
1633           strings $1, $2 ... $n in the script text. This is done before
1634           tokenization, so the $1 in a string like “foo$1bar” will be
1635           expanded. Additionally, $$ is expanded to the current process ID
1636           (which may be useful for scripts that use tempfiles).
1637
1638           Within scripts (and only within scripts) reposurgeon accepts a
1639           slightly extended syntax: First, a backslash ending a line signals
1640           that the command continues on the next line. Any number of
1641           consecutive lines thus escaped are concatenated, without the ending
1642           backslashes, prior to evaluation. Second, a command that takes an
1643           input filename argument can instead take literal following data in
1644           the syntax of a shell here-document. That is: if the filename is
1645           replaced by "<<EOF", all following lines in the script up to a
1646           terminating line consisting only of "EOF" will be read, placed in a
1647           temporary file, and that file fed to the command and afterwards
1648           deleted. EOF may be replaced by any string. Backslashes have no
1649           special meaning while reading a here-document.
1650
1651           Scripts may have comments. Any line beginning with a '#' is
           ignored. If a line has a trailing portion that begins with one or
1653           more whitespace characters followed by '#', that trailing portion
1654           is ignored.
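
           The argument substitution and line-continuation rules can be
           modeled in a few lines of Python. This is a simplified sketch,
           not reposurgeon's implementation; the function names are
           hypothetical and here-documents and comments are not handled.

```python
import os

def substitute_args(line, script_name, args):
    """Expand $$, $0 and $1..$n textually, before any tokenization."""
    line = line.replace("$$", str(os.getpid()))
    # Replace higher-numbered arguments first so $1 does not clobber $10.
    for i in range(len(args), 0, -1):
        line = line.replace("$%d" % i, args[i - 1])
    return line.replace("$0", script_name)

def join_continuations(lines):
    """Merge each line ending in a backslash with the line that follows."""
    merged, pending = [], ""
    for line in lines:
        if line.endswith("\\"):
            pending += line[:-1]  # drop the trailing backslash
        else:
            merged.append(pending + line)
            pending = ""
    if pending:
        merged.append(pending)
    return merged
```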
1655
1656   ARTIFACT REMOVAL
1657       Some commands automate fixing various kinds of artifacts associated
1658       with repository conversions from older systems.
1659
1660       authors [read|write] [<filename] [>filename]
1661           Apply or dump author-map information for the specified selection
1662           set, defaulting to all events.
1663
1664           Lifts from CVS and Subversion may have only usernames local to the
1665           repository host in committer and author IDs. DVCSes want email
1666           addresses (net-wide identifiers) and complete names. To supply the
1667           map from one to the other, an authors file is expected to consist
1668           of lines each beginning with a local user ID, followed by a '='
1669           (possibly surrounded by whitespace) followed by a full name and
1670           email address, optionally followed by a timezone offset field.
1671           Thus:
1672
1673               ferd = Ferd J. Foonly <foonly@foo.com> -0500
1674
1675           An authors file may have comment lines beginning with '#'; these
1676           are ignored.
1677
1678           When an authors file is applied, email addresses in committer and
1679           author metadata for which the local ID matches between < and @ are
1680           replaced according to the mapping (this handles git-svn lifts).
1681           Alternatively, if the local ID is the entire address, this is also
1682           considered a match (this handles what git-cvsimport and cvs2git do).
1683
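The file format and matching rule described above can be sketched in Python as follows. This is only an approximation under stated assumptions: the names ENTRY, parse_authors and remap are hypothetical, and the regular expression is one plausible reading of the line format, not the parser reposurgeon actually uses.

```python
import re

# Assumed line shape: local-id = Full Name <email> [timezone-offset]
ENTRY = re.compile(r"^\s*([^=\s]+)\s*=\s*(.*?)\s*<([^>]*)>\s*(\S+)?\s*$")

def parse_authors(text):
    """Build {local_id: (full_name, email, timezone_or_None)} from file text."""
    mapping = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank lines and '#' comment lines are skipped
        m = ENTRY.match(line)
        if m:
            local, name, email, tz = m.groups()
            mapping[local] = (name, email, tz)
    return mapping

def remap(name, email, mapping):
    """Replace an identity when its local ID is the part before '@' (git-svn
    style) or the whole address (git-cvsimport / cvs2git style)."""
    local = email.split("@", 1)[0]
    if local in mapping:
        new_name, new_email, _tz = mapping[local]
        return new_name, new_email
    return name, email
```

For example, with the sample line shown earlier, both "ferd@<some-uuid>" and a bare "ferd" address would be rewritten to Ferd J. Foonly <foonly@foo.com>.
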
1684           With the 'read' modifier, or no modifier, apply author mapping data
1685           (from standard input or a <-redirected file). May be useful if you
1686           are editing a repo or dump created by cvs2git or by git-svn invoked
1687           without -A.
1688
1689           With the 'write' modifier, write a mapping file that could be
1690           interpreted by authors read, with entries for each unique
1691           committer, author, and tagger (to standard output or a >-redirected
1692           mapping file). This may be helpful as a start on building an
1693           authors file, though each part to the right of an equals sign will
1694           need editing.
1695
1696       branchify [path-set]
1697           Specify the list of directories to be treated as potential branches
1698           (to become tags if there are no modifications after the creation
1699           copies) when analyzing a Subversion repo. This list is ignored when
1700           the --nobranch read option is used. It defaults to the 'standard
1701           layout' set of directories, plus any unrecognized directories in
1702           the repository root.
1703
1704           With no arguments, displays the current branchification set.
1705
1706           An asterisk at the end of a path in the set means 'all immediate
1707           subdirectories of this path, unless they are part of another
1708           (longer) path in the branchify set'.
1709
1710           Note that the branchify set is a property of the reposurgeon
1711           interpreter, not of any individual repository, and will persist
1712           across Subversion dumpfile reads. This may lead to unexpected
1713           results if you forget to re-set it.
1714
1715       branchify_map [/regex/branch/...]
1716           Specify the list of regular expressions used for mapping the svn
1717           branches that are detected by branchify. If none of the expressions
1718           match, the default behaviour applies. This maps a branch to the name
1719           of the last directory, except for trunk and “*” which are mapped to
1720           master and root.
1721
1722           With no arguments the current regex replacement pairs are shown.
1723           Passing 'reset' will clear the mapping.
1724
1725           The branchify command will match each branch name against regex1
1726           and, if it matches, rewrite the branch name to branch1. If not, it
1727           will try regex2 and so forth until it either finds a matching regex
1728           or there are no regexes left. The regular expressions should be in
1729           Python's[2] format. The branch name can use backreferences (see
1730           the re.sub function in the Python documentation).
1731
1732           Note that the regular expressions are appended to 'refs/' without
1733           either the needed 'heads/' or 'tags/'. This allows for choosing the
1734           right kind of branch type.
1735
1736           While the syntax template above uses slashes, any first character
1737           will be used as a delimiter (and you will need to use a different
1738           one in the common case that the paths contain slashes).
1739
1740           Note that the branchify_map set is a property of the reposurgeon
1741           interpreter, not of any individual repository, and will persist
1742           across Subversion dumpfile reads. This may lead to unexpected
1743           results if you forget to re-set it.
1744
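The first-match-wins rewriting described for branchify_map can be sketched in Python, the regex dialect the text names. The helper names parse_pair and map_branch are hypothetical, the sample patterns are invented for illustration, and the use of re.search (rather than anchoring at the start) and the required trailing delimiter are assumptions.

```python
import re

def parse_pair(spec):
    """Split a '<d>regex<d>branch<d>' argument on its first character."""
    delim = spec[0]
    parts = spec.split(delim)
    if len(parts) != 4 or parts[0] or parts[3]:
        raise ValueError("expected <d>regex<d>branch<d>: %r" % spec)
    return parts[1], parts[2]

def map_branch(path, pairs):
    """Try each (regex, replacement) in order; the first regex that matches
    decides the name, which is appended to 'refs/'. Returns None when no
    pattern matches, meaning the default mapping should apply."""
    for regex, replacement in pairs:
        if re.search(regex, path):
            return "refs/" + re.sub(regex, replacement, path)
    return None

# ':' is used as the delimiter because the paths contain slashes.
pairs = [
    parse_pair(r":^branches/release-(\d+)$:heads/rel/\1:"),
    parse_pair(r":^tags/(.*)$:tags/\1:"),
]
```

Here the replacement text supplies the 'heads/' or 'tags/' component itself, which is how the kind of reference is chosen.
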
1745   EXAMINING TREE STATES
1746       manifest [regular expression] [>outfile]
1747           Takes an optional selection set argument defaulting to all commits,
1748           and an optional Python regular expression. For each commit in the
1749           selection set, print the mapping of all paths in that commit tree
1750           to the corresponding blob marks, mirroring what files would be
1751           created in a checkout of the commit. If a regular expression is
1752           given, only print "path -> mark" lines for paths matching it. This
1753           command supports > redirection.
1754
1755       checkout directory
1756           Takes a selection set which must resolve to a single commit, and a
1757           second argument. The second argument is interpreted as a directory
1758           name. The state of the code tree at that commit is materialized
1759           beneath the directory.
1760
1761       diff [>outfile]
1762           Display the difference between commits. Takes a selection-set
1763           argument which must resolve to exactly two commits. Supports output
1764           redirection.
1765
1766   HOUSEKEEPING
1767       These are backed up by the following housekeeping commands, none of
1768       which take a selection set:
1769
1770       help
1771           Get help on the interpreter commands. Optionally follow with
1772           whitespace and a command name; with no argument, lists all
1773           commands. '?' also invokes this.
1774
1775       shell
1776           Execute the shell command given in the remainder of the line. '!'
1777           also invokes this.
1778
1779       prefer [repotype]
1780           With no arguments, describe capabilities of all supported systems.
1781           With an argument (which must be the name of a supported system)
1782           this has two effects:
1783
1784           First, if there are multiple repositories in a directory you do a
1785           read on, reposurgeon will read the preferred one (otherwise it will
1786           complain that it can't choose among them).
1787
1788           Secondly, this will change reposurgeon's preferred type for output.
1789           This means that if you do a write to a directory, it will build a
1790           repo of the preferred type rather than its original type (if it
1791           had one).
1792
1793           If no preferred type has been explicitly selected, reading in a
1794           repository (but not a fast-import stream) will implicitly set the
1795           preferred type to the type of that repository.
1796
1797           In older versions of reposurgeon this command changed the type of
1798           the selected repository, if there is one. That behavior interacted
1799           badly with attempts to interpret legacy IDs and has been removed.
1800
1801       sourcetype [repotype]
1802           Report (with no arguments) or select (with one argument) the
1803           current repository's source type. This type is normally set at
1804           repository-read time, but may remain unset if the source was a
1805           stream file.
1806
1807           The source type affects the interpretation of legacy IDs (for
1808           purposes of the =N visibility set and the 'references' command) by
1809           controlling the regular expressions used to recognize them. If no
1810           preferred output type has been set, it may also change the output
1811           format of stream files made from the repository.
1812
1813           The source type is reliably set whenever a live repository is read,
1814           or when a Subversion stream or Fossil dump is interpreted, but not
1815           necessarily when other stream files are read. Streams generated by
1816           cvs-fast-export(1) using the --reposurgeon option are detected as
1817           CVS. In some other cases, the source system is detected from the
1818           presence of magic $-headers in content blobs.
1819
1820   INSTRUMENTATION
1821       A few commands have been implemented primarily for debugging and
1822       regression-testing purposes, but may be useful in unusual
1823       circumstances.
1824
1825       The output of most of these commands can individually be redirected to
1826       a named output file. Where indicated in the syntax, you can prefix the
1827       output filename with “>” and give it as a following argument.
1828
1829       index [>outfile]
1830           Display four columns of info on objects in the selection set: their
1831           number, their type, the associated mark (or '-' if no mark) and a
1832           summary field varying by type. For a branch or tag it's the
1833           reference; for a commit it's the commit branch; for a blob it's the
1834           repository path of the file in the blob.
1835
1836           The default selection set for this command is =CTRU, all objects
1837           except blobs.
1838
1839       resolve [label-text...]
1840           Does nothing but resolve a selection-set expression and echo the
1841           resulting event-number set to standard output. The remainder of the
1842           line after the command is used as a label for the output.
1843
1844           Implemented mainly for regression testing, but may be useful for
1845           exploring the selection-set language.
1846
1847       verbose [n]
1848           'verbose 1' enables the progress meter and messages, 'verbose 0'
1849           disables them. Higher levels of verbosity are available but
1850           intended for developers only.
1851
1852       quiet [on | off]
1853           Without an argument, this command requests a report of the quiet
1854           boolean; with the argument 'on' or 'off' it is changed. When quiet
1855           is on, time-varying report fields which would otherwise cause
1856           spurious failures in regression testing are suppressed.
1857
1858       print output-text...
1859           Does nothing but ship its argument line to standard output. Useful
1860           in regression tests.
1861
1862       echo [number]
1863           'echo 1' causes each reposurgeon command to be echoed to standard
1864           output just before its output. This can be useful in constructing
1865           regression tests that are easily checked by eyeball.
1866
1867       version [version...]
1868           With no argument, display the program version and the list of VCSes
1869           directly supported. With argument, declare the major version
1870           (single digit) or full version (major.minor) under which the
1871           enclosing script was developed. The program will error out if the
1872           major version has changed (which means the surgical language is not
1873           backwards compatible).
1874
1875           It is good practice to start your lift script with a version
1876           requirement, especially if you are going to archive it for later
1877           reference.
1878
1879       prompt [format...]
1880           Set the command prompt format to the value of the command line;
1881           with an empty command line, display it. The prompt format is
1882           evaluated in Python after each command with the following
1883           dictionary substitutions:
1884
1885           chosen
1886               The name of the selected repository, or None if none is
1887               currently selected.
1888
1889           Thus, one useful format might be 'rs[%(chosen)s]%% '.
1890
1891           More format items may be added in the future. The default prompt
1892           corresponds to the format 'reposurgeon%% '. The format line is
1893           evaluated with shell quoting of tokens, so that spaces can be
1894           included.
1895
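Since the prompt format uses Python dictionary substitution, its behavior can be previewed with ordinary %-formatting. The function name render_prompt and the repository name "myrepo" are hypothetical; only the 'chosen' key comes from the text above.

```python
def render_prompt(fmt, chosen=None):
    """Expand a prompt format the way Python %-formatting with a dictionary
    would; '%%' yields a literal percent sign."""
    return fmt % {"chosen": chosen}

print(render_prompt("rs[%(chosen)s]%% ", "myrepo"))
print(render_prompt("reposurgeon%% "))
```

The doubled percent sign is needed because a single '%' would be read as the start of a format item.
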
1896       history
1897           List the commands you have entered this session.
1898
1899       legacy [read|write] [<filename] [>filename]
1900           Apply or list legacy-reference information. Does not take a
1901           selection set. The 'read' variant reads from standard input or a
1902           <-redirected filename; the 'write' variant writes to standard
1903           output or a >-redirected filename.
1904
1905           A legacy-reference file maps reference cookies to (committer,
1906           commit-date, sequence-number) triples; these in turn (should)
1907           uniquely identify a commit. The format is two whitespace-separated
1908           fields: the cookie followed by an action stamp identifying the
1909           commit.
1910
1911           It should not normally be necessary to use this command. The legacy
1912           map is automatically preserved through repository reads and
1913           rebuilds, being stored in the file legacy-map under the repository
1914           subdirectory.
1915
1916       set [option]
1917           Turn on an option flag. With no arguments, list all options.
1918
1919           Most options are described in conjunction with the specific
1920           operations that they modify. One of general interest is
1921           “compressblobs”; this enables compression on the blob files in the
1922           internal representation reposurgeon uses for editing repositories.
1923           With this option, reading and writing of repositories is slower,
1924           but editing a repository requires less (sometimes much less) disk
1925           space.
1926
1927       clear [option]
1928           Turn off an option flag. With no arguments, list all options.
1929
1930       profile
1931           Enable profiling. Profile statistics are dumped to the path given
1932           as argument. Must be one of the initial command-line arguments, and
1933           gathers statistics only on code executed via '-'.
1934
1935       timing
1936           Display statistics on phase timing in repository analysis. Mainly
1937           of interest to developers trying to speed up the program.
1938
1939       exit
1940           Exit, reporting the time. Included here because, while EOT will
1941           also cleanly exit the interpreter, this command reports elapsed
1942           time since start.
1943

WORKING WITH MERCURIAL

1945       reposurgeon uses a built-in extractor class to perform extractions from
1946       Mercurial repositories.
1947
1948       Mercurial branches are exported as branches in the exported repository
1949       and tags are exported as tags. By default, bookmarks are ignored. You
1950       can specify explicit handling for bookmarks by setting
1951       reposurgeon.bookmarks in your .hg/hgrc. Set the value to the prefix
1952       that reposurgeon should use for bookmarks.
1953
1954       For example, if your bookmarks represent branches, put this at the
1955       bottom of your .hg/hgrc:
1956
1957           [reposurgeon]
1958           bookmarks=heads/
1959
1960       If you do that, it's your responsibility to ensure that branch names do
1961       not conflict with bookmark names. You can add a prefix like
1962       bookmarks=heads/feature- to disambiguate as necessary.
1963

WORKING WITH SUBVERSION

1965       reposurgeon can read Subversion dumpfiles or edit a Subversion
1966       repository (and you must point it at a repository, not a checkout
1967       directory). The reposurgeon distribution includes a script named
1968       “repotool” that you can use to make and then incrementally update a
1969       local mirror of a remote repository for editing or conversion purposes.
1970
1971   READING SUBVERSION REPOSITORIES
1972       Certain optional modifiers on the read command change its behavior when
1973       reading Subversion repositories:
1974
1975       --nobranch
1976           Suppress branch analysis.
1977
1978       --ignore-properties
1979           Suppress read-time warnings about discarded property settings.
1980
1981       --user-ignores
1982           Don't generate .gitignore files from svn:ignore properties.
1983           Instead, just pass through .gitignore files found in the history.
1984
1985       --use-uuid
1986           If the --use-uuid read option is set, the repository's UUID will be
1987           used as the hostname when faking up email addresses, a la git-svn.
1988           Otherwise, addresses will be generated the way git-cvsimport does
1989           it, simply copying the username into the address field.
1990
1991       These modifiers can go anywhere in any order on the read command line
1992       after the read verb. They must be whitespace-separated.
1993
1994       Here are the rules used for mapping subdirectories in a Subversion
1995       repository to branches:
1996
1997        1. At any given time there is a set of eligible paths and path
1998           wildcards which declare potential branches. See the documentation
1999           of the branchify command for how to alter this set, which
2000           initially consists of {trunk, tags/*, branches/*, and '*'}.
2001
2002        2. A repository is considered "flat" if it has no directory that
2003           matches a path or path wildcard in the branchify set. All commits
2004           in a flat repository are assigned to branch master, and what would
2005           have been branch structure becomes directory structure. In this
2006           case, we're done; all the other rules apply to non-flat repos.
2007
2008           If you give the option --nobranch when reading a Subversion
2009           repository, branch analysis is skipped and the repository is
2010           treated as though flat (left as a linear sequence of commits on
2011           refs/heads/master). This may be useful if your repository
2012           configuration is highly unusual and you need to do your own branch
2013           surgery. Note that this option will disable partitioning of mixed
2014           commits.
2015
2016        3. If "trunk" is eligible, it always becomes the master branch.
2017
2018        4. If an element of the branchify set ends with *, each immediate
2019           subdirectory of it is considered a potential branch. If '*' is in
2020           the branchify set (which is true by default) all top-level
2021           directories other than /trunk, /tags, and /branches are also
2022           considered potential branches.
2023
2024        5. Each potential branch is checked to see if it has commits on it
2025           after the initial creation or copy. If there are such commits, it
2026           becomes a branch. If not, it becomes a tag in order to preserve the
2027           commit metadata. (In all cases, the name of the tag or branch is
2028           the basename of the directory.)
2029
2030        6. Files in the top-level directory are assigned to a synthetic branch
2031           named 'root'.
2032
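Rules 3, 4 and 6, together with the naming part of rule 5, can be approximated by the following Python sketch. It is a simplified model under stated assumptions, not reposurgeon's analyzer: the function names are hypothetical, the input is taken to be a file path, the longest matching branchify entry wins, and the branch-versus-tag decision in rule 5 (which needs commit history) and the flat-repository case in rule 2 are not modeled.

```python
DEFAULT_BRANCHIFY = ("trunk", "tags/*", "branches/*", "*")
STANDARD_TOP = {"trunk", "tags", "branches"}

def potential_branch(file_path, branchify=DEFAULT_BRANCHIFY):
    """Return the potential-branch directory containing file_path, 'root'
    for a top-level file, or None if no branchify entry applies."""
    parts = file_path.strip("/").split("/")
    if len(parts) == 1:
        return "root"                    # rule 6: file in the top level
    best = None
    for entry in branchify:
        eparts = entry.split("/")
        if eparts[-1] == "*":
            prefix = eparts[:-1]
            if not prefix and parts[0] in STANDARD_TOP:
                continue                 # rule 4: bare '*' skips these
            depth = len(prefix) + 1      # an immediate subdirectory
        else:
            prefix = eparts
            depth = len(prefix)
        # The file must lie strictly below the candidate directory.
        if parts[:len(prefix)] == prefix and len(parts) > depth:
            if best is None or depth > len(best):
                best = parts[:depth]     # assumed: longest match wins
    return "/".join(best) if best else None

def branch_name(directory):
    """Rule 3: trunk becomes master; rule 5: otherwise use the basename."""
    return "master" if directory == "trunk" else directory.split("/")[-1]
```

Under these assumptions, trunk/src/a.c lands on master, branches/foo/x.c on foo, and a top-level README on root.
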
2033       Each commit that only creates or deletes directories (in particular,
2034       copy commits for tags and branches, and commits that only change
2035       properties) will be transformed into a tag named after the branch,
2036       containing the date/author/comment metadata from the commit. While this
2037       produces a desirable result for tags, non-tag branches (including
2038       trunk) will also get root tags this way. This apparent misfeature has
2039       been accepted so that reposurgeon will never destroy human-generated
2040       metadata that might have value; it is left up to the user to manually
2041       remove unwanted tags.
2042
2043       Subversion branch deletions are turned into deletealls, clearing the
2044       fileset of the import-stream branch. When a branch finishes with a
2045       deleteall at its tip, the deleteall is transformed into a tag. This
2046       rule cleans up after aborted branch renames.
2047
2048       Occasionally (and usually by mistake) a branchy Subversion repository
2049       will contain revisions that touch multiple branches. These are handled
2050       by partitioning them into multiple import-stream commits, one on each
2051       affected branch. The Legacy-ID of such a split commit will have a
2052       pseudo-decimal part - for example, if Subversion revision 2317 touches
2053       three branches, the three generated commits will have IDs 2317.1,
2054       2317.2, and 2317.3.
2055
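The numbering scheme for split commits can be expressed in a few lines of Python. The function name is hypothetical, and two points are assumptions: that an unsplit commit keeps the bare revision number as its ID, and that suffixes count up from 1. Which branch receives which suffix is not specified above.

```python
def split_legacy_ids(revision, branch_count):
    """Legacy IDs for a Subversion revision that touches branch_count
    branches (assumed: unsplit commits keep the bare revision number)."""
    if branch_count <= 1:
        return [str(revision)]
    return ["%d.%d" % (revision, i) for i in range(1, branch_count + 1)]
```
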
2056       The svn:executable and svn:special properties are translated into
2057       permission settings in the input stream; svn:executable becomes 100755
2058       and svn:special becomes 120000 (indicating a symlink; the blob contents
2059       will be the path to which the symlink should resolve).
2060
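A minimal Python sketch of this property-to-mode translation follows. The function name is hypothetical; 100644 is the usual import-stream mode for an ordinary file, and giving svn:special precedence when both properties are present is an assumption.

```python
def file_mode(properties):
    """Map Subversion node properties to an import-stream file mode."""
    if "svn:special" in properties:
        return "120000"   # symlink; blob content is the link target
    if "svn:executable" in properties:
        return "100755"   # executable regular file
    return "100644"       # ordinary regular file
```
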
2061       Any cvs2svn:rev properties generated by cvs2svn are incorporated into
2062       the internal map used for reference-lifting, then discarded.
2063
2064       Normally, per-directory svn:ignore properties become .gitignore files.
2065       Actual .gitignore files in a Subversion directory are presumed to have
2066       been created by git-svn users separately from native Subversion ignore
2067       properties and discarded with a warning. It is up to the user to merge
2068       the content of such files into the target repository by hand. But this
2069       behavior is inverted by the --user-ignores option; if that is on,
2070       .gitignore files are passed through and Subversion svn:ignore
2071       properties are discarded.
2072
2073       (Regardless of the setting of the --user-ignores option, .cvsignore
2074       files found in Subversion repositories always become .gitignores in the
2075       translation. The assumption is that these date from before a CVS-to-SVN
2076       lift and should be preserved to affect behavior when browsing that
2077       section of the repository.)
2078
2079       svn:mergeinfo properties are interpreted. Any svn:mergeinfo property on
2080       a revision A with a merge source range ending in revision B produces a
2081       merge link such that B becomes a parent of A.
2082
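To make the mergeinfo rule concrete, the Python sketch below extracts the end revision of each merge source range from a property value. It assumes the usual svn:mergeinfo text layout (one "/path:revlist" entry per line, ranges written as N-M, a trailing '*' marking non-inheritable ranges) and treats a single revision as a range of one. The function name is hypothetical, and resolving a (path, revision) pair to an actual parent commit is left out.

```python
def merge_sources(mergeinfo):
    """Return (source_path, end_revision) for every range in an
    svn:mergeinfo value; each end revision is a candidate parent B."""
    result = []
    for line in mergeinfo.splitlines():
        if ":" not in line:
            continue
        path, revlist = line.rsplit(":", 1)
        for item in revlist.split(","):
            item = item.strip().rstrip("*")   # drop non-inheritable mark
            if item:
                result.append((path, int(item.split("-")[-1])))
    return result
```
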
2083       All other Subversion properties are discarded. (This may change in a
2084       future release.) The property for which this is most likely to cause
2085       semantic problems is svn:eol-style. However, since property-change-only
2086       commits get turned into annotated tags, the translated tags will retain
2087       information about setting changes.
2088
2089       The sub-second resolution on Subversion commit dates is discarded; Git
2090       wants integer timestamps only.
2091
2092       Because fast-import format cannot represent an empty directory, empty
2093       directories in Subversion repositories will be lost in translation.
2094
2095       Normally, Subversion local usernames are mapped in the style of
2096       git-cvsimport; thus user "foo" becomes "foo <foo>", which is sufficient
2097       to pacify git and other systems that require email addresses. With the
2098       --use-uuid read option, usernames are mapped in the git-svn style, with
2099       the repository's UUID used as a fake domain in the email address. Both
2100       forms can be remapped to real addresses using the authors read command.
2101
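The two username mappings can be sketched in Python as below. The function name and the sample UUID are hypothetical; the output shapes follow the "foo <foo>" form given above and the git-svn convention of user@UUID.

```python
def fake_identity(username, uuid=None):
    """Build a placeholder 'name <email>' identity from a local username.
    With a repository UUID, use it as a fake domain (git-svn style)."""
    if uuid is not None:
        return "%s <%s@%s>" % (username, username, uuid)
    return "%s <%s>" % (username, username)
```
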
2102       Reading a Subversion stream enables writing of the legacy map as
2103       'legacy' passthroughs when the repo is written to a stream file.
2104
2105       reposurgeon tries hard to silently do the right thing, but there are
2106       Subversion edge cases in which it emits warnings because a human may
2107       need to intervene and perform fixups by hand. Here are the less obvious
2108       messages it may emit:
2109
2110       user-generated .gitignore
2111           This message means reposurgeon has found a .gitignore file in
2112           the Subversion repository it is analyzing. This probably happened
2113           because somebody was using git-svn as a live gateway, and created
2114           ignores which may or may not be congruent with those in the
2115           generated .gitignore files that the Subversion ignore properties
2116           will be translated into. You'll need to make a policy decision
2117           about which set of ignores to use in the conversion, and possibly
2118           set the --user-ignores option on read to pass through user-created
2119           .gitignore files; in that case this warning will not be emitted.
2120
2121       can't connect nonempty branch XXXX to origin
2122           This is a serious error.  reposurgeon has been unable to find a
2123           link from a specified branch to the trunk (master) branch. The
2124           commit graph will not be fully connected and will need manual
2125           repair.
2126
2127       permission information may be lost
2128           A Subversion node change on a file sets or clears properties, but
2129           no ancestor can be found for this file. Executable or symlink
2130           status may be set wrongly on later revisions of this file.
2131           Subversion user-defined properties may also be scrambled or lost.
2132           Usually this error can be ignored.
2133
2134       properties set
2135           reposurgeon has detected a setting of a user-defined property, or
2136           the Subversion property svn:externals. These properties cannot be
2137           expressed in an import stream; the user is notified in case this is
2138           a showstopper for the conversion or some corrective action is
2139           required, but normally this error can be ignored. This warning is
2140           suppressed by the --ignore-properties option.
2141
2142       branch links detected by file ops only
2143           Branch links are normally deduced by examining Subversion directory
2144           copy operations. A common user error (making a branch with a
2145           non-Subversion directory copy and then doing an svn add on the
2146           contents) can defeat this. While reposurgeon should detect and cope
2147           with most such copies correctly, you should examine the commit
2148           graph to check that the branch is rooted at the correct place.
2149
2150       could not tagify root commit
2151           The earliest commit in your Subversion repository has file
2152           operations, rather than being a pure directory creation. This
2153           probably means your Subversion dump file is malformed, or you may
2154           have attempted to lift from an incremental dump. Proceed with
2155           caution.
2156
2157       deleting parentless tip delete
2158           This message may be triggered by a Subversion branch move followed
2159           by a re-creation under the source name. Check near the indicated
2160           revision to make sure the renamed branch is connected to master.
2161
2162       mid-branch deleteall
2163           A deleteall operation has been found in the middle of a branch
2164           history. This usually indicates that a Subversion tag or branch was
2165           created by mistake, and someone later tried to undo the error by
2166           deleting the tag/branch directory before recreating it with a copy
2167           operation. Examine the topology near the deleteall closely; it may
2168           need hand-hacking. It is fairly likely that both (a) the
2169           reposurgeon translation will be different from what other
2170           translators (such as git-svn) produce, and (b) it will not be
2171           immediately obvious which is right.
2172
2173       lookback for XXX failed, not making branch link
2174           Branch analysis failed, probably due to a set of file copies that
2175           reposurgeon thought it should interpret as a botched branch
2176           creation but couldn't deduce a history for. This is a warning;
2177           check how the directory XXX is converted; it may need post-editing
2178           into a branch.
2179
2180   WRITING SUBVERSION REPOSITORIES
2181       reposurgeon has support for writing Subversion repositories. Due to
2182       mismatches between the ontology of Subversion and that of git import
2183       streams, this support has some significant limitations and bugs.
2184
2185       In summary, Subversion repository histories do not round-trip through
2186       reposurgeon editing. File content changes are preserved but some
2187       metadata is unavoidably lost. Furthermore, writing out a DVCS history
2188       in Subversion also loses significant portions of its metadata. Details
2189       follow.
2190
2191       Writing a Subversion repository or dump stream discards author
2192       information, the committer's name, and the hostname part of the commit
2193       address; only the commit timestamp and the local part of the
2194       committer's email address are preserved, the latter becoming the
2195       Subversion author field. However, reading a Subversion repository and
2196       writing it out again will preserve the author fields.
2197
2198       Import-stream timestamps have 1-second granularity. The sub-second
2199       parts of Subversion commit timestamps will be lost on their way through
2200       reposurgeon.
2201
2202       Empty directories aren't represented in import streams. Consequently,
2203       reading and writing Subversion repositories preserves file content, but
2204       not empty directories. It is also not guaranteed that, after editing a
2205       Subversion repository, the sequence of directory creations and
2206       deletions relative to other operations will be identical; the only
2207       guarantee is that enclosing directories will be created before any
2208       files in them are.
2209
2210       When reading a Subversion repository, reposurgeon discards the special
2211       directory-copy nodes associated with branch creations. These can't be
2212       recreated if and when the repository is written back out to Subversion;
2213       rather, each branch copy node from the original translates into a
2214       branch creation plus the first set of file modifications on the branch.
2215
2216       When reading a Subversion repository, reposurgeon also automatically
2217       breaks apart mixed-branch commits. These are not re-united if the
2218       repository is written back out.
2219
2220       When writing to a Subversion repository, all lightweight tags become
2221       Subversion tag copies with empty log comments, named for the tag
2222       basename. The committer name and timestamp are copied from the commit
2223       the tag points to. The distinction between heads and tags is lost.
2224
2225       Because of the preceding two points, it is not guaranteed that even
2226       revision numbers will be stable when a Subversion repository is read in
2227       and then written out!
2228
2229       Subversion repositories are always written with a standard
2230       (trunk/tags/branches) layout. Thus, a repository with a nonstandard
2231       shape that has been analyzed by reposurgeon won't be written out with
2232       the same shape.
2233
2234       When writing a Subversion repository, branch merges are translated into
2235       svn:mergeinfo properties in the simplest possible way - as an
2236       svn:mergeinfo property of the translated merge commit listing the merge
2237       source revisions.
2238
2239       Subversion has a concept of "flows"; that is, named segments of history
2240       corresponding to files or directories that are created when the path is
2241       added, cloned when the path is copied, and deleted when the path is
2242       deleted. This information is not preserved in import streams or the
2243       internal representation that reposurgeon uses. Thus, after editing, the
2244       flow boundaries of a Subversion history may be arbitrarily changed.
2245

IGNORE PATTERNS

2247       reposurgeon recognizes how supported VCSes represent file ignores (CVS
2248       .cvsignore files lurking untranslated in older Subversion repositories,
2249       Subversion ignore properties, .gitignore/.hgignore/.bzrignore files in
2250       other systems) and moves ignore declarations among these containers on
2251       repo input and output. This will be sufficient if the ignore patterns
2252       are exact filenames.
2253
2254       Translation may not, however, be perfect when the ignore patterns are
2255       Unix glob patterns or regular expressions. This compatibility table
2256       describes which patterns will translate; “plain” indicates a plain
2257       filename with no glob or regexp syntax or negation.
2258
2259       RCS has no ignore files or patterns and is therefore not included in
2260       the table.
2261
┌──────────┬──────────────┬──────────┬───────────────────────┬───────────────────────┬─────────────────────────┬────────────┬──────────┬─────────┐
│          │ from CVS     │ from svn │ from git              │ from hg               │ from bzr                │ from darcs │ from SRC │ from bk │
├──────────┼──────────────┼──────────┼───────────────────────┼───────────────────────┼─────────────────────────┼────────────┼──────────┼─────────┤
│ to CVS   │ all          │ all      │ all except !-prefixed │ all                   │ all except RE:- and     │ plain      │ all      │ all     │
│          │              │          │ but nonempty          │                       │ !-prefixed              │            │          │         │
├──────────┼──────────────┼──────────┼───────────────────────┼───────────────────────┼─────────────────────────┼────────────┼──────────┼─────────┤
│ to svn   │ all except ! │ all      │ all except !-prefixed │ all                   │ all except RE:- and     │ plain      │ all      │ all     │
│          │              │          │                       │                       │ !-prefixed              │            │          │         │
├──────────┼──────────────┼──────────┼───────────────────────┼───────────────────────┼─────────────────────────┼────────────┼──────────┼─────────┤
│ to git   │ all          │ all      │ all                   │ all except !-prefixed │ all except RE:-prefixed │ plain      │ all      │ all     │
├──────────┼──────────────┼──────────┼───────────────────────┼───────────────────────┼─────────────────────────┼────────────┼──────────┼─────────┤
│ to hg    │ all except ! │ all      │ all except !-prefixed │ all                   │ all except RE:- and     │ plain      │ all      │ all     │
│          │              │          │                       │                       │ !-prefixed              │            │          │         │
├──────────┼──────────────┼──────────┼───────────────────────┼───────────────────────┼─────────────────────────┼────────────┼──────────┼─────────┤
│ to bzr   │ all          │ all      │ all                   │ all                   │ all                     │ plain      │ all      │ all     │
├──────────┼──────────────┼──────────┼───────────────────────┼───────────────────────┼─────────────────────────┼────────────┼──────────┼─────────┤
│ to darcs │ plain        │ plain    │ plain                 │ plain                 │ plain                   │ all        │ all      │ all     │
├──────────┼──────────────┼──────────┼───────────────────────┼───────────────────────┼─────────────────────────┼────────────┼──────────┼─────────┤
│ to SRC   │ all except ! │ all      │ all except !-prefixed │ all                   │ all except RE:- and     │ plain      │ all      │ all     │
│          │              │          │                       │                       │ !-prefixed              │            │          │         │
└──────────┴──────────────┴──────────┴───────────────────────┴───────────────────────┴─────────────────────────┴────────────┴──────────┴─────────┘
2301
       The hg rows and columns of the table describe compatibility with hg's
       glob syntax rather than its default regular-expression syntax. When
       writing to an hg repository from any other kind, reposurgeon prepends
       a "syntax: glob" line to the output .hgignore.
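
       As an illustration of the git-to-hg cell of the table, here is a
       minimal Python sketch. The function name and the choice to report
       skipped patterns are assumptions made for this example; it is not
       reposurgeon's own translation code.

```python
def gitignore_to_hgignore(lines):
    """Translate .gitignore patterns into .hgignore text (illustrative only).

    Follows the table above: everything translates except !-prefixed
    (negated) patterns, which are returned separately as untranslatable.
    """
    kept, skipped = [], []
    for line in lines:
        if line.startswith("!"):
            skipped.append(line)   # negation has no glob-syntax equivalent here
        else:
            kept.append(line)
    # The "syntax: glob" header switches hg away from its regexp default.
    text = "syntax: glob\n" + "".join(pattern + "\n" for pattern in kept)
    return text, skipped
```

       Calling it on ["*.o", "!keep.o", "build"] yields an .hgignore body
       holding the two translatable patterns and reports "!keep.o" as
       skipped.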
2306

TRANSLATION STYLE

       After converting a CVS, SVN, or BitKeeper repository, check for and
       remove $-cookies in the head revision(s) of the files. The full
       Subversion set is $Date:, $Revision:, $Author:, $HeadURL:, and $Id:.
       CVS uses $Author:, $Date:, $Header:, $Id:, $Log:, $Revision:, also
       (rarely) $Locker:, $Name:, $RCSfile:, $Source:, and $State:.
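
       One way to check for leftover $-cookies is a small scanner like the
       Python sketch below. The keyword list is taken from the paragraph
       above; the function name and the regular expression are assumptions
       for illustration, and the pattern may need tuning for a particular
       code base.

```python
import re

# Keyword names listed in the text for Subversion and CVS.
KEYWORDS = ("Author", "Date", "Header", "HeadURL", "Id", "Log", "Revision",
            "Locker", "Name", "RCSfile", "Source", "State")

# Matches both the collapsed form ($Id$) and the expanded form ($Id: ... $).
COOKIE_RE = re.compile(r"\$(" + "|".join(KEYWORDS) + r")(?::[^$\n]*)?\$")

def find_cookies(text):
    """Return the keyword names of all $-cookies found in text, in order."""
    return [match.group(1) for match in COOKIE_RE.finditer(text)]
```

       Running find_cookies over each file in the head revision gives a
       quick list of places that still need hand cleanup.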
2313
2314       When you need to specify a commit, use the action-stamp format that
2315       references lift generates when it can resolve an SVN or CVS reference
2316       in a comment. It is best that you not vary from this format, even in
2317       trivial ways like omitting the 'Z' or changing the 'T' or '!' or ':'.
2318       Making action stamps uniform and machine-parseable will have good
2319       consequences for future repository-browsing tools.
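
       A conversion script can guard against accidental variation with a
       check like the Python sketch below. It assumes an action stamp has
       the shape of a UTC timestamp, a '!', and an email address, which is
       inferred from the punctuation named above; consult the definition
       of action stamps elsewhere in this manual before relying on it.

```python
import re

# Assumed shape: YYYY-MM-DDThh:mm:ssZ!user@host (an illustration, not a spec).
ACTION_STAMP_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z![^\s!@]+@[^\s!@]+$"
)

def is_action_stamp(text):
    """Return True if text matches the assumed action-stamp shape."""
    return ACTION_STAMP_RE.match(text) is not None
```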
2320
2321       Sometimes, in converting a repository, you may need to insert an
2322       explanatory comment - for example, if metadata has been garbled or
2323       missing and you need to point to that fact. It's helpful for
2324       repository-browsing tools if there is a uniform syntax for this that is
2325       highly unlikely to show up in repository comments. We recommend
2326       enclosing translation notes in [[ ]]. This has the advantage of being
2327       visually similar to the [ ] traditionally used for editorial comments
2328       in text.
2329
2330       It is good practice to include, in the comment for the root commit of
2331       the repository, a note dating and attributing the conversion work and
2332       explaining these conventions. Example:
2333
2334       [[This repository was converted from Subversion to git on 2011-10-24 by
2335       Eric S. Raymond <esr@thyrsus.com>. Here and elsewhere, conversion notes
2336       are enclosed in double square brackets. Junk commits generated by
2337       cvs2svn have been removed, commit references have been mapped into a
2338       uniform VCS-independent syntax, and some comments edited into
2339       summary-plus-continuation form.]]
2340
       It is also good practice to include a generated tag at the point of
       conversion. For example:

           mailbox_in --create <<EOF
           Tag-Name: git-conversion

           Marks the spot at which this repository was converted from Subversion to git.
           EOF
2349

ADVANCED EXAMPLES

           define lastchange {
           @max(=B & [/ChangeLog/] & /{0}/B)? list
           }

       List the last commit that refers to a ChangeLog file containing a
       specified string. (The trick here is that ? extends the singleton set
       consisting of the last eligible ChangeLog blob to its set of referring
       commits, and list only notices the commits.)
2359

STREAM SYNTAX EXTENSIONS

2361       The event-stream parser in “reposurgeon” supports some extended syntax.
2362       Exporters designed to work with “reposurgeon” may have a --reposurgeon
2363       option that enables emission of extended syntax; notably, this is true
2364       of cvs-fast-export(1). The remainder of this section describes these
2365       syntax extensions. The properties they set are (usually) preserved and
2366       re-output when the stream file is written.
2367
2368       The token “#reposurgeon” at the start of a comment line in a
2369       fast-import stream signals reposurgeon that the remainder is an
2370       extension command to be interpreted by “reposurgeon”.
2371
2372       One such extension command is implemented: #sourcetype, which behaves
2373       identically to the reposurgeon sourcetype command. An exporter for a
2374       version-control system named “frobozz” could, for example, say
2375
           #reposurgeon sourcetype frobozz
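
       For exporter authors, recognizing such lines can be sketched in
       Python as follows. The function name is hypothetical and the code
       only illustrates the convention described above; it is not the
       parser reposurgeon uses.

```python
def extension_commands(stream_text):
    """Collect '#reposurgeon' extension commands from a fast-import stream.

    Returns a list of token lists, one per extension line, for example
    [["sourcetype", "frobozz"]]. Ordinary '#' comments are ignored.
    """
    prefix = "#reposurgeon"
    found = []
    for line in stream_text.splitlines():
        if line.startswith(prefix + " "):
            # Everything after the token is the extension command.
            found.append(line[len(prefix):].split())
    return found
```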
2377
2378       Within a commit, a magic comment of the form “#legacy-id” declares a
2379       legacy ID from the stream file's source version-control system.
2380
       Also accepted is the bzr syntax for setting per-commit properties.
       While parsing commit syntax, a line beginning with the token “property”
       must continue with a whitespace-separated property-name token. If it is
       then followed by a newline it is taken to set that boolean-valued
       property to true. Otherwise it must be followed by a numeric token
       specifying a data length, a space, following data (which may contain
       newlines) and a terminating newline. For example:
2388
           commit refs/heads/master
           mark :1
           committer Eric S. Raymond <esr@thyrsus.com> 1289147634 -0500
           data 16
           Example commit.

           property legacy-id 2 r1
           M 644 inline README
2397
2398       Unlike other extensions, bzr properties are only preserved on stream
2399       output if the preferred type is bzr, because any importer other than
2400       bzr's will choke on them.
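
       The property syntax described above can be read with a few lines of
       Python. This is a minimal sketch under stated assumptions: tokens
       are separated by single spaces, the length counts bytes, and the
       function name and return shape are invented for the example.

```python
def parse_property(stream, pos=0):
    """Parse one 'property' line from a bytes stream starting at pos.

    Returns (name, value, next_pos). value is True for the boolean form,
    otherwise the decoded data, which may contain embedded newlines.
    """
    end = stream.index(b"\n", pos)
    fields = stream[pos:end].split(b" ", 3)
    if fields[0] != b"property" or len(fields) < 2:
        raise ValueError("not a property line")
    name = fields[1].decode("utf-8")
    if len(fields) == 2:
        return name, True, end + 1          # boolean-valued property
    length = int(fields[2])
    # Data begins right after "property NAME LENGTH ".
    start = pos + len(b"property ") + len(fields[1]) + 1 + len(fields[2]) + 1
    value = stream[start:start + length]
    if stream[start + length:start + length + 1] != b"\n":
        raise ValueError("missing terminating newline")
    return name, value.decode("utf-8"), start + length + 1
```

       Because the data is sliced by its declared length rather than split
       on newlines, values with embedded newlines come through intact.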
2401

INCOMPATIBLE LANGUAGE CHANGES

2403       In versions before 3.23, “prefer” changed the repository type as well
2404       as the preferred output format.
2405
2406       In versions before 3.0, the general command syntax put the command verb
2407       first, then the selection set (if any) then modifiers (VSO). It has
2408       changed to optional selection set first, then command verb, then
2409       modifiers (SVO). The change made parsing simpler, allowed abolishing
2410       some noise keywords, and recapitulates a successful design pattern in
2411       some other Unix tools - notably sed(1).
2412
2413       In versions before 3.0, path expressions only matched commits, not
2414       commits and the associated blobs as well. The names of the “a” and “c”
2415       flags were different.
2416
2417       In reposurgeon versions before 3.0, the delete command had the
2418       semantics of squash; also, the policy flags did not require a “--”
2419       prefix. The “--delete” flag was named “obliterate”.
2420
2421       In reposurgeon versions before 3.0, read and write optionally took file
2422       arguments rather than requiring redirects (and the write command never
2423       wrote into directories). This was changed in order to allow these
2424       commands to have modifiers. These modifiers replaced several global
2425       options that no longer exist.
2426
2427       In reposurgeon versions before 3.0, the earliest factor in a unite
2428       command always kept its tag and branch names unaltered. The new rule
2429       for resolving name conflicts, giving priority to the latest factor,
2430       produces more natural behavior when uniting two repositories end to
2431       end; the master branch of the second (later) one keeps its name.
2432
       In reposurgeon versions before 3.0, the tagify command expected
       policies as trailing arguments to alter its behaviour. The new syntax
       uses similarly named options with leading dashes, which can appear
       anywhere after the tagify command.
2437
       In versions before 2.9, the syntax of "authors", "legacy", "list", and
       "mailbox_{in|out}" was different (and "legacy" was "fossils"). They
       took plain filename arguments rather than using redirect < and >.
2441

LIMITATIONS AND GUARANTEES

       Guarantee: In DVCSes that use commit hashes, editing with reposurgeon
       never changes the hash of a commit object unless (a) you edit the
       commit, or (b) it is a descendant of an edited commit in a VCS that
       includes parent hashes in the input of a child object's hash (git and
       hg both do this).
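
       Case (b) can be seen in a toy model. The Python sketch below hashes
       each commit together with its parent's ID; the hashing scheme is a
       deliberately simplified assumption and is not the object format of
       git or hg.

```python
import hashlib

def commit_id(message, parent_ids):
    """Toy commit hash: covers the parent IDs as well as the message."""
    digest = hashlib.sha1()
    for parent in parent_ids:
        digest.update(b"parent " + parent.encode("utf-8") + b"\n")
    digest.update(message.encode("utf-8"))
    return digest.hexdigest()

def chain_ids(messages):
    """Hash a linear history, feeding each ID into the next commit."""
    ids, parents = [], []
    for message in messages:
        current = commit_id(message, parents)
        ids.append(current)
        parents = [current]
    return ids
```

       Editing the second of three messages leaves the first ID alone but
       changes the second and third, which is the behavior the guarantee
       describes.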
2448
2449       Guarantee: reposurgeon only requires main memory proportional to the
2450       size of a repository's metadata history, not its entire content
2451       history. (Exception: the data from inline content is held in memory.)
2452
2453       Guarantee: In the worst case, reposurgeon makes its own copy of every
2454       content blob in the repository's history and thus uses intermediate
2455       disk space approximately equal to the size of a repository's content
2456       history. However, when the repository to be edited is presented as a
2457       stream file, reposurgeon requires no or only very little extra disk
2458       space to represent it; the internal representation of content blobs is
2459       a (seek-offset, length) pair pointing into the stream file.
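
       The (seek-offset, length) idea can be sketched in Python as below.
       The class and function names are hypothetical; the point is only
       that a blob's content stays in the stream file until something
       asks for it.

```python
import os
import tempfile

class BlobRef:
    """Hypothetical stand-in for a blob held as (seek-offset, length)."""

    def __init__(self, path, offset, length):
        self.path, self.offset, self.length = path, offset, length

    def read(self):
        # Content is fetched from the stream file on demand.
        with open(self.path, "rb") as stream:
            stream.seek(self.offset)
            return stream.read(self.length)

def demo():
    """Write a tiny stream file and read its blob back through a BlobRef."""
    payload = b"blob\nmark :1\ndata 6\nhello\n"
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(payload)
        path = handle.name
    try:
        ref = BlobRef(path, payload.index(b"hello"), 6)
        return ref.read()
    finally:
        os.unlink(path)
```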
2460
2461       Guarantee: reposurgeon never modifies the contents of a repository it
2462       reads, nor deletes any repository. The results of surgery are always
2463       expressed in a new repository.
2464
       Guarantee: Any line in a fast-import stream that is not a part of a
       command reposurgeon parses and understands will be passed through
       unaltered. At present the set of potential passthroughs is known to
       include the progress, options, and checkpoint commands as well as
       comments led by #.
2470
2471       Guarantee: All reposurgeon operations either preserve all repository
2472       state they are not explicitly told to modify or warn you when they
2473       cannot do so.
2474
2475       Guarantee: reposurgeon handles the bzr commit-properties extension,
2476       correctly passing through property items including those with embedded
2477       newlines. (Such properties are also editable in the mailbox format.)
2478
2479       Limitation: Because reposurgeon relies on other programs to generate
2480       and interpret the fast-import command stream, it is subject to bugs in
2481       those programs.
2482
2483       Limitation: bzr suffers from deep confusion over whether its unit of
2484       work is a repository or a floating branch that might have been cloned
2485       from a repo or created from scratch, and might or might not be destined
2486       to be merged to a repo one day. Its exporter only works on branches,
2487       but its importer creates repos. Thus, a rebuild operation will produce
2488       a subdirectory structure that differs from what you expect. Look for
2489       your content under the subdirectory 'trunk'.
2490
2491       Limitation: under git, signed tags are imported verbatim. However, any
2492       operation that modifies any commit upstream of the target of the tag
2493       will invalidate it.
2494
2495       Limitation: Stock git (at least as of version 1.7.3.2) will choke on
2496       property extension commands. Accordingly, reposurgeon omits them when
2497       rebuilding a repo with git type.
2498
2499       Limitation: Converting an hg repo that uses bookmarks (not branches) to
2500       git can lose information; the branch ref that git assigns to each
2501       commit may not be the same as the hg bookmark that was active when the
2502       commit was originally made under hg. Unfortunately, this is a real
2503       ontological mismatch, not a problem that can be fixed by cleverness in
2504       reposurgeon.
2505
2506       Limitation: Converting an hg repo that uses branches to git can lose
2507       information because git does not store an explicit branch as part of
2508       commit metadata, but colors commits with branch or tag names on the fly
2509       using a specific coloring algorithm, which might not match the explicit
2510       branch assignments to commits in the original hg repo. Reposurgeon
2511       preserves the hg branch information when reading an hg repo, so it is
2512       available from within reposurgeon itself, but there is no way to
2513       preserve it if the repo is written to git.
2514
2515       Limitation: While the Subversion read-side support is in good shape,
2516       the write-side support is more of a sketch or proof-of-concept than a
2517       robust implementation; it only works on very simple cases and does not
2518       round-trip. It may improve in future releases.
2519
       Limitation: Not all BitKeeper versions have the fast-import and
       fast-export commands that reposurgeon requires. They are present back
       to the 7.3 open-source version.
2523
2524       Limitation: reposurgeon may misbehave under a filesystem which smashes
2525       case in filenames, or which nominally preserves case but maps names
2526       differing only by case to the same filesystem node (Mac OS X behaves
2527       like this by default). Problems will arise if any two paths in a repo
2528       differ by case only. To avoid the problem on a Mac, do all your surgery
2529       on an HFS+ file system formatted with case sensitivity specifically
2530       enabled.
2531
2532       Limitation: If whitespace followed by # appears in a string or regexp
2533       command argument, it will be misinterpreted as the beginning of a
2534       line-ending comment and screw up parsing.
2535
2536       Guarantee: As version-control systems add support for the fast-import
2537       format, their repositories will become editable by reposurgeon.
2538

REQUIREMENTS

2540       reposurgeon relies on importers and exporters associated with the VCSes
2541       it supports.
2542
2543       git
2544           Core git supports both export and import.
2545
2546       bzr
2547           Requires bzr plus the bzr-fast-import plugin.
2548
2549       hg
2550           Requires core hg, the hg-fastimport plugin, and the third-party
2551           hg-fast-export.py script.
2552
2553       svn
2554           Stock Subversion commands support export and import.
2555
2556       darcs
2557           Stock darcs commands support export and import.
2558
2559       CVS
2560           Requires cvs-fast-export. Note that the quality of CVS lifts may be
2561           poor, with individual lifts requiring serious hand-hacking. This is
2562           due to inherent problems with CVS's file-oriented model.
2563
2564       RCS
2565           Requires cvs-fast-export (yes, that's not a typo; cvs-fast-export
2566           handles RCS collections as well). The caveat for CVS applies.
2567

CANONICALIZATION RULES

2569       It is expected that reposurgeon will be extended with more deletion
2570       policies. Policy authors may need to know more about how a commit's
2571       file operation sequence is reduced to normal form after operations from
2572       deleted commits are prepended to it.
2573
       Recall that each commit has a list of file operations, each an M
       (modify), D (delete), R (rename), C (copy), or 'deleteall' (delete all
       files). Only M operations have associated blobs. Normally there is only
       one M operation per individual file in a commit's operation list.

       To understand how the reduction process works, it's enough to
       understand the case where all the operations in the list are working
       on the same file. Sublists of operations referring to different files
       don't affect each other and reducing them can be thought of as separate
       operations. Also, a "deleteall" acts as a D for everything and cancels
       all operations before it in the list.
2585
2586       The reduction process walks through the list from the beginning looking
2587       for adjacent pairs of operations it can compose. The following table
2588       describes all possible cases and all but one of the reductions.
2589
2590              ┌──────────────────────────┬────────────────────────────┐
2591              │        M + D → D         │                            │
2592              │                          │        If a file is        │
2593              │                          │        modified then       │
2594              │                          │        deleted, the result │
2595              │                          │        is as though it had │
2596              │                          │        been deleted. If    │
2597              │                          │        the M was the only  │
2598              │                          │        modify for the      │
2599              │                          │        file, it's removed  │
2600              │                          │        too.                │
2601              ├──────────────────────────┼────────────────────────────┤
2602              │M a + R a b → R a b + M b │                            │
2603              │                          │        The purpose of this │
2604              │                          │        transformation is   │
2605              │                          │        to push renames     │
2606              │                          │        toward the          │
2607              │                          │        beginning of the    │
2608              │                          │        list, where they    │
2609              │                          │        may become adjacent │
2610              │                          │        to another R or C   │
2611              │                          │        they can be         │
2612              │                          │        composed with. If   │
2613              │                          │        the M is the only   │
2614              │                          │        modify operation    │
2615              │                          │        for this file, the  │
2616              │                          │        rename is dropped.  │
2617              ├──────────────────────────┼────────────────────────────┤
2618              │       M a + C a b        │                            │
2619              │                          │        No reduction.       │
2620              ├──────────────────────────┼────────────────────────────┤
2621              │  M b + R a b → nothing   │                            │
2622              │                          │        Should be           │
2623              │                          │        impossible, and may │
2624              │                          │        indicate repository │
2625              │                          │        corruption.         │
2626              ├──────────────────────────┼────────────────────────────┤
2627              │  M b + C a b → nothing   │                            │
2628              │                          │        The copy undoes the │
2629              │                          │        modification.       │
2630              ├──────────────────────────┼────────────────────────────┤
2631              │        D + M → M         │                            │
2632              │                          │        If a file is        │
2633              │                          │        deleted and         │
2634              │                          │        modified, the       │
2635              │                          │        result is as though │
2636              │                          │        the deletion had    │
2637              │                          │        not taken place     │
2638              │                          │        (because M          │
2639              │                          │        operations store    │
2640              │                          │        entire files, not   │
2641              │                          │        deltas).            │
2642              ├──────────────────────────┼────────────────────────────┤
2643              │       D + {D|R|C}        │                            │
2644              │                          │        These cases should  │
2645              │                          │        be impossible and   │
2646              │                          │        would suggest the   │
2647              │                          │        repository has been │
2648              │                          │        corrupted.          │
2649              ├──────────────────────────┼────────────────────────────┤
2650              │       R a b + D a        │                            │
2651              │                          │        Should never        │
2652              │                          │        happen, and is      │
2653              │                          │        another case that   │
2654              │                          │        would suggest       │
2655              │                          │        repository          │
2656              │                          │        corruption.         │
2657              ├──────────────────────────┼────────────────────────────┤
2658              │    R a b + D b → D a     │                            │
2659              │                          │        The delete removes  │
2660              │                          │        the just-renamed    │
2661              │                          │        file.               │
2662              ├──────────────────────────┼────────────────────────────┤
2663              │        {R|C} + M         │                            │
2664              │                          │        No reduction.       │
2665              ├──────────────────────────┼────────────────────────────┤
2666              │  R a b + R b c → R a c   │                            │
2667              │                          │        The b terms have to │
2668              │                          │        match for these     │
2669              │                          │        operations to have  │
2670              │                          │        made sense when     │
2671              │                          │        they lived in       │
2672              │                          │        separate commits;   │
2673              │                          │        if they don't, it   │
2674              │                          │        indicates           │
2675              │                          │        repository          │
2676              │                          │        corruption.         │
2677              ├──────────────────────────┼────────────────────────────┤
2678              │      R a b + C b c       │                            │
2679              │                          │        No reduction.       │
2680              ├──────────────────────────┼────────────────────────────┤
2681              │   C a b + D a → R a b    │                            │
2682              │                          │        Copy followed by    │
2683              │                          │        delete of the       │
2684              │                          │        source is a rename. │
2685              ├──────────────────────────┼────────────────────────────┤
2686              │  C a b + D b → nothing   │                            │
2687              │                          │        This delete undoes  │
2688              │                          │        the copy.           │
2689              ├──────────────────────────┼────────────────────────────┤
2690              │      C a b + R a c       │                            │
2691              │                          │        No reduction.       │
2692              ├──────────────────────────┼────────────────────────────┤
2693              │  C a b + R b c → C a c   │                            │
2694              │                          │        Copy followed by a  │
2695              │                          │        rename of the       │
2696              │                          │        target reduces to   │
2697              │                          │        single copy.        │
2698              ├──────────────────────────┼────────────────────────────┤
2699              │          C + C           │                            │
2700              │                          │        No reduction.       │
2701              └──────────────────────────┴────────────────────────────┘
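
       The pairwise reduction can be sketched in Python as follows. This
       covers only the table rows that shorten the list; the tuple
       encoding and function names are assumptions for illustration, and
       the M a + R a b reordering, the "only modify" special cases, and
       the corruption cases are left out.

```python
def compose(first, second):
    """Return the replacement for an adjacent pair, or None if no rule fits.

    Operations are tuples: ("M", path), ("D", path),
    ("R", source, target), ("C", source, target).
    """
    k1, k2 = first[0], second[0]
    if k1 == "M" and k2 == "D" and first[1] == second[1]:
        return [second]                          # M + D -> D
    if k1 == "D" and k2 == "M" and first[1] == second[1]:
        return [second]                          # D + M -> M
    if k1 == "R" and k2 == "D" and first[2] == second[1]:
        return [("D", first[1])]                 # R a b + D b -> D a
    if k1 == "R" and k2 == "R" and first[2] == second[1]:
        return [("R", first[1], second[2])]      # R a b + R b c -> R a c
    if k1 == "C" and k2 == "D" and first[1] == second[1]:
        return [("R", first[1], first[2])]       # C a b + D a -> R a b
    if k1 == "C" and k2 == "D" and first[2] == second[1]:
        return []                                # C a b + D b -> nothing
    if k1 == "C" and k2 == "R" and first[2] == second[1]:
        return [("C", first[1], second[2])]      # C a b + R b c -> C a c
    return None

def reduce_ops(ops):
    """Walk the list from the beginning, composing adjacent pairs."""
    ops = list(ops)
    i = 0
    while i + 1 < len(ops):
        result = compose(ops[i], ops[i + 1])
        if result is None:
            i += 1
        else:
            ops[i:i + 2] = result
            i = max(i - 1, 0)    # a new neighbor pair may now be reducible
    return ops
```

       Every successful composition shortens the list, so the walk always
       terminates.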
2702

CRASH RECOVERY

2704       This section will become relevant only if reposurgeon or something
2705       underneath it in the software and hardware stack crashes while in the
2706       middle of writing out a repository, in particular if the target
2707       directory of the rebuild is your current directory.
2708
2709       The tool has two conflicting objectives. On the one hand, we never want
2710       to risk clobbering a pre-existing repo. On the other hand, we want to
2711       be able to run this tool in a directory with a repo and modify it in
2712       place.
2713
2714       We resolve this dilemma by playing a game of three-directory monte.
2715
2716        1. First, we build the repo in a freshly-created staging directory. If
2717           your target directory is named /path/to/foo, the staging directory
2718           will be a peer named /path/to/foo-stageNNNN, where NNNN is a cookie
2719           derived from reposurgeon's process ID.
2720
2721        2. We then make an empty backup directory. This directory will be
2722           named /path/to/foo.~N~, where N is incremented so as not to
2723           conflict with any existing backup directories.  reposurgeon never,
2724           under any circumstances, ever deletes a backup directory.
2725
2726           So far, all operations are safe; the worst that can happen up to
2727           this point if the process gets interrupted is that the staging and
2728           backup directories get left behind.
2729
2730        3. The critical region begins. We first move everything in the target
2731           directory to the backup directory.
2732
2733        4. Then we move everything in the staging directory to the target.
2734
2735        5. We finish off by restoring untracked files in the target directory
2736           from the backup directory. That ends the critical region.
2737
2738       During the critical region, all signals that can be ignored are
2739       ignored.
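
       Steps 2 through 4 can be sketched in Python as below. The function
       names are hypothetical, and the sketch omits the restoration of
       untracked files (step 5) and the signal handling; it illustrates
       the directory shuffle only, not reposurgeon's actual code.

```python
import os
import shutil

def next_backup_dir(target):
    """Pick /path/to/foo.~N~ with the lowest N not already in use."""
    n = 1
    while os.path.exists(f"{target}.~{n}~"):
        n += 1
    return f"{target}.~{n}~"

def swap_in(staging, target):
    """Move target's contents to a fresh backup, then staging's into target."""
    backup = next_backup_dir(target)
    os.mkdir(backup)                      # step 2: empty backup directory
    for name in os.listdir(target):       # step 3: target -> backup
        shutil.move(os.path.join(target, name), os.path.join(backup, name))
    for name in os.listdir(staging):      # step 4: staging -> target
        shutil.move(os.path.join(staging, name), os.path.join(target, name))
    return backup                         # the backup is never deleted
```

       If the process dies between the two loops, the old contents are
       still intact in the backup directory and the new ones in staging.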
2740

ERROR RETURNS

2742       Returns 1 on fatal error, 0 otherwise. In batch mode all errors are
2743       fatal.
2744

SEE ALSO

       bk(1), bzr(1), cvs(1), darcs(1), git(1), hg(1), rcs(1), svn(1).
2747

AUTHOR

2749       Eric S. Raymond <esr@thyrsus.com>; project page at
2750       http://www.catb.org/~esr/reposurgeon.
2751

NOTES

2753        1. DVCS Migration HOWTO
2754           http://www.catb.org/esr/dvcs-migration-guide.html
2755
2756        2. Python's
2757           http://docs.python.org/2/library/re.html
2758
2759
2760
2761reposurgeon                       03/31/2019                    REPOSURGEON(1)