GIT-FILTER-BRANCH(1)              Git Manual              GIT-FILTER-BRANCH(1)

NAME

       git-filter-branch - Rewrite branches

SYNOPSIS

       git filter-branch [--env-filter <command>] [--tree-filter <command>]
               [--index-filter <command>] [--parent-filter <command>]
               [--msg-filter <command>] [--commit-filter <command>]
               [--tag-name-filter <command>] [--subdirectory-filter <directory>]
               [--prune-empty]
               [--original <namespace>] [-d <directory>] [-f | --force]
               [--] [<rev-list options>...]

DESCRIPTION

       Lets you rewrite git revision history by rewriting the branches
       mentioned in the <rev-list options>, applying custom filters on each
       revision. Those filters can modify each tree (e.g. removing a file or
       running a perl rewrite on all files) or information about each commit.
       All other information (including original commit times and merge
       information) is preserved.

       The command rewrites only the positive refs mentioned on the command
       line (e.g. if you pass a..b, only b is rewritten). If you specify no
       filters, the commits are recommitted without any changes, which
       normally has no effect. Such usage is nevertheless permitted, because
       it may be useful in the future for compensating for git bugs.

       NOTE: This command honors .git/info/grafts. If you have any grafts
       defined, running this command will make them permanent.

       WARNING! The rewritten history will have different object names for all
       the objects and will not converge with the original branch. You will
       not be able to easily push and distribute the rewritten branch on top
       of the original branch. Do not use this command unless you understand
       the full implications, and avoid it in any case if a simple single
       commit would suffice to fix your problem. (See the "RECOVERING FROM
       UPSTREAM REBASE" section in git-rebase(1) for further information
       about rewriting published history.)

       Always verify that the rewritten version is correct: the original refs,
       if different from the rewritten ones, will be stored in the namespace
       refs/original/.

       Note that since this operation is very I/O expensive, it might be a
       good idea to place the temporary directory off-disk with the -d
       option, e.g. on tmpfs. The speedup is reportedly very noticeable.

   Filters
       The filters are applied in the order listed below. The <command>
       argument is always evaluated in the shell context using the eval
       command (with the notable exception of the commit filter, for technical
       reasons). Prior to that, the $GIT_COMMIT environment variable is set
       to the id of the commit being rewritten. Also, GIT_AUTHOR_NAME,
       GIT_AUTHOR_EMAIL, GIT_AUTHOR_DATE, GIT_COMMITTER_NAME,
       GIT_COMMITTER_EMAIL, and GIT_COMMITTER_DATE are set according to the
       current commit. The values of these variables after the filters have
       run are used for the new commit. If any evaluation of <command>
       returns a non-zero exit status, the whole operation is aborted.

       A map function is available that takes an "original sha1 id" argument
       and outputs a "rewritten sha1 id" if the commit has already been
       rewritten, and "original sha1 id" otherwise; the map function can
       return several ids on separate lines if your commit filter emitted
       multiple commits.

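       To illustrate the environment that filters run in, the sketch below
       uses $GIT_COMMIT inside a message filter to record each commit's
       original id. The trailer name "Original-commit" and the throwaway
       repository are inventions for this example, not a git convention.

```shell
#!/bin/sh
# Sketch: a filter can read $GIT_COMMIT, the id of the commit being rewritten.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # recent Git only: skips a startup warning
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first"
old=$(git rev-parse HEAD)

# The filter is single-quoted so that $GIT_COMMIT is expanded when the
# filter runs, not when this script is parsed.
git filter-branch --msg-filter '
        cat &&
        echo &&
        echo "Original-commit: $GIT_COMMIT"
' HEAD >/dev/null 2>&1

trailer=$(git log -1 --format=%B HEAD | grep "^Original-commit:")
echo "$trailer"
```
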

OPTIONS

       --env-filter <command>
           This filter may be used if you only need to modify the environment
           in which the commit will be performed. Specifically, you might want
           to rewrite the author/committer name/email/time environment
           variables (see git-commit-tree(1) for details). Do not forget to
           re-export the variables.

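       A minimal sketch of an environment filter, assuming a throwaway
       repository and made-up addresses (old@example.com, new@example.com).
       It rewrites one author's e-mail address and re-exports the variable as
       the text above requires.

```shell
#!/bin/sh
# Sketch: change an author e-mail address with --env-filter.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # recent Git only: skips a startup warning
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Committer"
git config user.email "committer@example.com"

echo one > file.txt
git add file.txt
GIT_AUTHOR_NAME="Old Name" GIT_AUTHOR_EMAIL="old@example.com" \
        git commit -q -m "first"

git filter-branch --env-filter '
        if [ "$GIT_AUTHOR_EMAIL" = "old@example.com" ]; then
                GIT_AUTHOR_EMAIL="new@example.com"
                export GIT_AUTHOR_EMAIL
        fi
' HEAD >/dev/null 2>&1

new_email=$(git log -1 --format=%ae HEAD)
echo "$new_email"
```
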
       --tree-filter <command>
           This is the filter for rewriting the tree and its contents. The
           argument is evaluated in shell with the working directory set to
           the root of the checked out tree. The new tree is then used as-is
           (new files are auto-added, disappeared files are auto-removed -
           neither .gitignore files nor any other ignore rules HAVE ANY
           EFFECT!).

       --index-filter <command>
           This is the filter for rewriting the index. It is similar to the
           tree filter but does not check out the tree, which makes it much
           faster. Frequently used with git rm --cached --ignore-unmatch ...,
           see EXAMPLES below. For hairy cases, see git-update-index(1).

       --parent-filter <command>
           This is the filter for rewriting the commit’s parent list. It will
           receive the parent string on stdin and shall output the new parent
           string on stdout. The parent string is in the format described in
           git-commit-tree(1): empty for the initial commit, "-p parent" for a
           normal commit and "-p parent1 -p parent2 -p parent3 ..." for a
           merge commit.

       --msg-filter <command>
           This is the filter for rewriting the commit messages. The argument
           is evaluated in the shell with the original commit message on
           standard input; its standard output is used as the new commit
           message.

       --commit-filter <command>
           This is the filter for performing the commit. If this filter is
           specified, it will be called instead of the git commit-tree
           command, with arguments of the form "<TREE_ID> [(-p
           <PARENT_COMMIT_ID>)...]" and the log message on stdin. The commit
           id is expected on stdout.

           As a special extension, the commit filter may emit multiple commit
           ids; in that case, the rewritten children of the original commit
           will have all of them as parents.

           You can use the map convenience function in this filter, and other
           convenience functions, too. For example, calling skip_commit "$@"
           will leave out the current commit (but not its changes! If you want
           that, use git rebase instead).

           You can also use git_commit_non_empty_tree "$@" instead of git
           commit-tree "$@" if you don’t wish to keep commits that have a
           single parent and make no change to the tree.

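       The effect of git_commit_non_empty_tree can be seen in a small
       experiment. The sketch below is only an illustration under assumed
       names (a throwaway repository, files a.txt, secret.txt and b.txt): an
       index filter removes secret.txt, which empties the middle commit, and
       the commit filter then drops that commit.

```shell
#!/bin/sh
# Sketch: drop commits that become empty once a file is filtered out.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # recent Git only: skips a startup warning
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo a > a.txt
git add a.txt
git commit -q -m "add a"

echo hush > secret.txt
git add secret.txt
git commit -q -m "add secret"      # touches only the file removed below

echo b > b.txt
git add b.txt
git commit -q -m "add b"

git filter-branch \
        --index-filter 'git rm -q --cached --ignore-unmatch secret.txt' \
        --commit-filter 'git_commit_non_empty_tree "$@"' \
        HEAD >/dev/null 2>&1

count=$(git rev-list --count HEAD)
echo "$count"
```

       With a plain git commit-tree "$@" as the commit filter, the middle
       commit would survive as an empty commit and the count would stay at
       three.
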
       --tag-name-filter <command>
           This is the filter for rewriting tag names. When passed, it will be
           called for every tag ref that points to a rewritten object (or to a
           tag object which points to a rewritten object). The original tag
           name is passed via standard input, and the new tag name is expected
           on standard output.

           The original tags are not deleted, but can be overwritten; use
           "--tag-name-filter cat" to simply update the tags. In this case, be
           very careful and make sure you have the old tags backed up in case
           the conversion goes wrong.

           Nearly proper rewriting of tag objects is supported. If the tag has
           a message attached, a new tag object will be created with the same
           message, author, and timestamp. If the tag has a signature
           attached, the signature will be stripped: it is by definition
           impossible to preserve signatures. The rewriting is only "nearly"
           proper because, ideally, a tag that did not change (it points to
           the same object, has the same name, etc.) should retain its
           signature. That is not the case: signatures are always removed, so
           beware. There is also no support for changing the author or
           timestamp (or the tag message, for that matter). Tags which point
           to other tags will be rewritten to point to the underlying commit.

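       A minimal sketch of "--tag-name-filter cat", assuming a throwaway
       repository with one annotated tag named v1 (all names are made up).
       After the rewrite, the tag should resolve to the rewritten commit
       rather than the original one.

```shell
#!/bin/sh
# Sketch: keep an annotated tag attached to its commit across a rewrite.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # recent Git only: skips a startup warning
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first"
git tag -a v1 -m "release 1"
old=$(git rev-parse HEAD)

# Any filter that changes the commit will do; this one prefixes the subject.
# "-- --all" makes the tag one of the refs being rewritten.
git filter-branch --msg-filter 'sed "1s/^/[imported] /"' \
        --tag-name-filter cat -- --all >/dev/null 2>&1

head_commit=$(git rev-parse HEAD)
tag_target=$(git rev-parse "v1^{commit}")
echo "$tag_target"
```
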
       --subdirectory-filter <directory>
           Only look at the history which touches the given subdirectory. The
           result will contain that directory (and only that) as its project
           root. Implies the section called “Remap to ancestor”.

       --prune-empty
           Some filters generate empty commits that leave the tree untouched.
           This switch allows git-filter-branch to ignore such commits.
           However, it applies only to commits that have exactly one parent,
           so merge points are kept. Also, this option is not compatible with
           --commit-filter; to get the same effect there, use the function
           git_commit_non_empty_tree "$@" instead of the git commit-tree "$@"
           idiom in your commit filter.

       --original <namespace>
           Use this option to set the namespace where the original commits
           will be stored. The default value is refs/original.

       -d <directory>
           Use this option to set the path to the temporary directory used for
           rewriting. When applying a tree filter, the command needs to
           temporarily check out the tree to some directory, which may consume
           considerable space in case of large projects. By default it does
           this in the .git-rewrite/ directory, but you can override that
           choice with this parameter.

       -f, --force
           git filter-branch refuses to start with an existing temporary
           directory or when there are already refs starting with
           refs/original/, unless forced.

       <rev-list options>...
           Arguments for git rev-list. All positive refs included by these
           options are rewritten. You may also specify options such as --all,
           but you must use -- to separate them from the git filter-branch
           options. Implies the section called “Remap to ancestor”.

   Remap to ancestor
       By using git-rev-list(1) arguments, e.g., path limiters, you can limit
       the set of revisions which get rewritten. However, positive refs on
       the command line are treated specially: such limiters cannot exclude
       them. Instead, they are rewritten to point at the nearest ancestor
       that was not excluded.

EXAMPLES

       Suppose you want to remove a file (containing confidential information
       or copyright violation) from all commits:

           git filter-branch --tree-filter 'rm filename' HEAD

       However, if the file is absent from the tree of some commit, a simple
       rm filename will fail for that tree and commit. Thus you may instead
       want to use rm -f filename as the script.

       Using --index-filter with git rm yields a significantly faster version.
       As with rm filename, git rm --cached filename will fail if the file is
       absent from the tree of a commit. If you want to "completely forget" a
       file, it does not matter when it entered history, so we also add
       --ignore-unmatch:

           git filter-branch --index-filter 'git rm --cached --ignore-unmatch filename' HEAD

       The rewritten history is now saved in HEAD.

       To rewrite the repository to look as if foodir/ had been its project
       root, and discard all other history:

           git filter-branch --subdirectory-filter foodir -- --all

       Thus you can, e.g., turn a library subdirectory into a repository of
       its own. Note the -- that separates filter-branch options from revision
       options, and the --all to rewrite all branches and tags.

       To set a commit (which typically is at the tip of another history) to
       be the parent of the current initial commit, in order to paste the
       other history behind the current history:

           git filter-branch --parent-filter 'sed "s/^\$/-p <graft-id>/"' HEAD

       (if the parent string is empty - which happens when we are dealing with
       the initial commit - add <graft-id> as a parent). Note that this
       assumes history with a single root (that is, no merge without common
       ancestors happened). If this is not the case, use:

           git filter-branch --parent-filter \
                   'test $GIT_COMMIT = <commit-id> && echo "-p <graft-id>" || cat' HEAD

       or even simpler:

           # commit_id: the commit to re-parent; graft_id: its new parent
           echo "$commit_id $graft_id" >> .git/info/grafts
           git filter-branch "$graft_id"..HEAD

       To remove commits authored by "Darl McBribe" from the history:

           git filter-branch --commit-filter '
                   if [ "$GIT_AUTHOR_NAME" = "Darl McBribe" ];
                   then
                           skip_commit "$@";
                   else
                           git commit-tree "$@";
                   fi' HEAD

       The function skip_commit is defined as follows:

           skip_commit()
           {
                   shift;
                   while [ -n "$1" ];
                   do
                           shift;
                           map "$1";
                           shift;
                   done;
           }

       The shift magic first throws away the tree id and then the -p
       parameters. Note that this handles merges properly! In case Darl
       committed a merge between P1 and P2, it will be propagated properly and
       all children of the merge will become merge commits with P1,P2 as their
       parents instead of the merge commit.

       You can rewrite the commit log messages using --msg-filter. For
       example, git-svn-id strings in a repository created by git svn can be
       removed this way:

           git filter-branch --msg-filter '
                   sed -e "/^git-svn-id:/d"
           '

       To restrict rewriting to only part of the history, specify a revision
       range in addition to the new branch name. The new branch name will
       point to the top-most revision that a git rev-list of this range will
       print.

       If you need to add Acked-by lines to, say, the last 10 commits (none of
       which is a merge), use this command:

           git filter-branch --msg-filter '
                   cat &&
                   echo "Acked-by: Bugs Bunny <bunny@bugzilla.org>"
           ' HEAD~10..HEAD

       Note that the changes introduced by the commits, and not reverted by
       subsequent commits, will still be in the rewritten branch. If you want
       to throw out changes together with the commits, use the interactive
       mode of git rebase.

       Consider this history:

                D--E--F--G--H
               /     /
           A--B-----C

       To rewrite only commits D,E,F,G,H, but leave A, B and C alone, use:

           git filter-branch ... C..H

       To rewrite commits E,F,G,H, use one of these:

           git filter-branch ... C..H --not D
           git filter-branch ... D..H --not C

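       Because these arguments are handed to git rev-list, you can preview
       which commits a range selects before rewriting anything. The sketch
       below rebuilds the history drawn above in a throwaway repository; the
       helper mk and the branch name topic are hypothetical, introduced only
       for this illustration.

```shell
#!/bin/sh
# Sketch: preview the commits selected by the ranges from the example above.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# mk NAME: commit a file called NAME.txt and print the new commit id.
mk() {
        echo "$1" > "$1.txt" &&
        git add "$1.txt" &&
        git commit -q -m "$1" &&
        git rev-parse HEAD
}

A=$(mk A); B=$(mk B); C=$(mk C)
git checkout -q -b topic "$B"          # side branch forks off at B
D=$(mk D); E=$(mk E)
git merge -q --no-ff -m F "$C"         # F merges C into the side branch
F=$(git rev-parse HEAD)
G=$(mk G); H=$(mk H)

all=$(git rev-list --count "$C".."$H")                  # D, E, F, G, H
trimmed=$(git rev-list --count "$C".."$H" --not "$D")   # E, F, G, H
echo "$all $trimmed"
```

       Dropping --count lists the commit ids themselves, which is a cheap way
       to double-check a range before passing it to git filter-branch.
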
       To move the whole tree into a subdirectory, or remove it from there:

           git filter-branch --index-filter \
                   'git ls-files -s | sed "s-\t\"*-&newsubdir/-" |
                           GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
                                   git update-index --index-info &&
                    mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD

CHECKLIST FOR SHRINKING A REPOSITORY

       git-filter-branch is often used to get rid of a subset of files,
       usually with some combination of --index-filter and
       --subdirectory-filter. People expect the resulting repository to be
       smaller than the original, but you need a few more steps to actually
       make it smaller, because git tries hard not to lose your objects until
       you tell it to. First make sure that:

       ·   You really removed all variants of a filename, if a blob was moved
           over its lifetime.  git log --name-only --follow --all -- filename
           can help you find renames.

       ·   You really filtered all refs: use --tag-name-filter cat -- --all
           when calling git-filter-branch.

       Then there are two ways to get a smaller repository. The safer way is
       to clone, which keeps your original intact.

       ·   Clone it with git clone file:///path/to/repo. The clone will not
           have the removed objects. See git-clone(1). (Note that cloning with
           a plain path just hardlinks everything!)

       If you really don’t want to clone it, for whatever reason, perform the
       following steps instead (in this order). This is a very destructive
       approach, so make a backup or go back to cloning it. You have been
       warned.

       ·   Remove the original refs backed up by git-filter-branch: run git
           for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git
           update-ref -d.

       ·   Expire all reflogs with git reflog expire --expire=now --all.

       ·   Garbage collect all unreferenced objects with git gc --prune=now
           (or if your git-gc is not new enough to support arguments to
           --prune, use git repack -ad; git prune instead).

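       The destructive steps above, put together as a sketch. It runs in a
       throwaway repository so nothing of value is at risk; the file names
       keep.txt and big.bin are made up, and the final check simply asks git
       whether the removed blob still exists in the object database.

```shell
#!/bin/sh
# Sketch: remove a file from history, then actually discard its objects.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # recent Git only: skips a startup warning
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo keep > keep.txt
echo "pretend this is large" > big.bin
git add keep.txt big.bin
git commit -q -m "initial import"
blob=$(git rev-parse HEAD:big.bin)       # id of the blob we want gone

git filter-branch --index-filter \
        'git rm -q --cached --ignore-unmatch big.bin' HEAD >/dev/null 2>&1

# 1. Drop the backup refs written by filter-branch.
git for-each-ref --format="%(refname)" refs/original/ |
        xargs -n 1 git update-ref -d
# 2. Expire all reflogs, which still mention the old commits.
git reflog expire --expire=now --all
# 3. Prune everything that is now unreachable.
git gc --quiet --prune=now

if git cat-file -e "$blob" 2>/dev/null; then
        blob_present=yes
else
        blob_present=no
fi
echo "$blob_present"
```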

AUTHOR

       Written by Petr "Pasky" Baudis <pasky@suse.cz[1]>, and the git list
       <git@vger.kernel.org[2]>

DOCUMENTATION

       Documentation by Petr Baudis and the git list.

GIT

       Part of the git(1) suite

NOTES

        1. pasky@suse.cz
           mailto:pasky@suse.cz

        2. git@vger.kernel.org
           mailto:git@vger.kernel.org

Git 1.7.4.4                       04/11/2011              GIT-FILTER-BRANCH(1)