GIT-FILTER-BRANCH(1)              Git Manual              GIT-FILTER-BRANCH(1)

NAME
git-filter-branch - Rewrite branches

SYNOPSIS
git filter-branch [--env-filter <command>] [--tree-filter <command>]
        [--index-filter <command>] [--parent-filter <command>]
        [--msg-filter <command>] [--commit-filter <command>]
        [--tag-name-filter <command>] [--subdirectory-filter <directory>]
        [--prune-empty]
        [--original <namespace>] [-d <directory>] [-f | --force]
        [--] [<rev-list options>...]

DESCRIPTION
Lets you rewrite git revision history by rewriting the branches
mentioned in the <rev-list options>, applying custom filters on each
revision. Those filters can modify each tree (e.g. removing a file or
running a perl rewrite on all files) or information about each commit.
Otherwise, all information (including original commit times or merge
information) will be preserved.

The command will only rewrite the positive refs mentioned in the
command line (e.g. if you pass a..b, only b will be rewritten). If you
specify no filters, the commits will be recommitted without any
changes, which would normally have no effect. Such usage is permitted
nevertheless, because it may become useful for compensating for git
bugs.

NOTE: This command honors .git/info/grafts. If you have any grafts
defined, running this command will make them permanent.

WARNING! The rewritten history will have different object names for all
the objects and will not converge with the original branch. You will
not be able to easily push and distribute the rewritten branch on top
of the original branch. Please do not use this command if you do not
know the full implications, and avoid using it anyway if a simple
single commit would suffice to fix your problem. (See the "RECOVERING
FROM UPSTREAM REBASE" section in git-rebase(1) for further information
about rewriting published history.)

Always verify that the rewritten version is correct: the original refs,
if different from the rewritten ones, will be stored in the namespace
refs/original/.

Note that since this operation is very I/O expensive, it might be a
good idea to redirect the temporary directory off-disk with the -d
option, e.g. on tmpfs. Reportedly the speedup is very noticeable.

Filters
The filters are applied in the order listed below. The <command>
argument is always evaluated in the shell context using the eval
command (with the notable exception of the commit filter, for technical
reasons). Prior to that, the $GIT_COMMIT environment variable will be
set to contain the id of the commit being rewritten. Also,
GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL, GIT_AUTHOR_DATE, GIT_COMMITTER_NAME,
GIT_COMMITTER_EMAIL, and GIT_COMMITTER_DATE are set according to the
current commit. The values of these variables after the filters have
run are used for the new commit. If any evaluation of <command>
returns a non-zero exit status, the whole operation will be aborted.

A map function is available that takes an "original sha1 id" argument
and outputs a "rewritten sha1 id" if the commit has already been
rewritten, and "original sha1 id" otherwise; the map function can
return several ids on separate lines if your commit filter emitted
multiple commits.

OPTIONS
--env-filter <command>
    This filter may be used if you only need to modify the environment
    in which the commit will be performed. Specifically, you might want
    to rewrite the author/committer name/email/time environment
    variables (see git-commit-tree(1) for details). Do not forget to
    re-export the variables.
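
As an illustration of the environment filter, the following sketch builds a throwaway repository and rewrites the author email of its only commit. The addresses, the file name and the scratch identity are made up for the example. FILTER_BRANCH_SQUELCH_WARNING is an assumption about newer git releases, which pause with a warning before rewriting; older releases should simply ignore it.

```shell
# Only relevant on newer git releases, which otherwise pause with a warning.
export FILTER_BRANCH_SQUELCH_WARNING=1

# Throwaway repository with one commit by a made-up author address.
repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.name "Scratch User"
git config user.email "scratch@example.com"
echo hello > file.txt && git add file.txt
GIT_AUTHOR_EMAIL="old@example.com" git commit -q -m "initial"

# Change the author email and re-export the variable, as the text asks.
git filter-branch --env-filter '
    if [ "$GIT_AUTHOR_EMAIL" = "old@example.com" ]; then
        GIT_AUTHOR_EMAIL="new@example.com"
        export GIT_AUTHOR_EMAIL
    fi' HEAD >/dev/null

new_email=$(git log -1 --format=%ae HEAD)
echo "$new_email"
```

The committer variables can be handled the same way in the same filter.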

--tree-filter <command>
    This is the filter for rewriting the tree and its contents. The
    argument is evaluated in shell with the working directory set to
    the root of the checked out tree. The new tree is then used as-is
    (new files are auto-added, missing files are auto-removed; neither
    .gitignore files nor any other ignore rules HAVE ANY EFFECT!).

--index-filter <command>
    This is the filter for rewriting the index. It is similar to the
    tree filter but does not check out the tree, which makes it much
    faster. Frequently used with git rm --cached --ignore-unmatch ...,
    see EXAMPLES below. For more complicated cases, see
    git-update-index(1).

--parent-filter <command>
    This is the filter for rewriting the commit’s parent list. It will
    receive the parent string on stdin and shall output the new parent
    string on stdout. The parent string is in the format described in
    git-commit-tree(1): empty for the initial commit, "-p parent" for a
    normal commit and "-p parent1 -p parent2 -p parent3 ..." for a
    merge commit.
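
The three shapes of the parent string can be tried out without a repository. In this sketch, add_root_parent is a hypothetical stand-in for a --parent-filter command, and graft0, aaa111 and bbb222 are placeholders rather than real commit ids.

```shell
# Stand-in for a --parent-filter command: reads the parent string on
# stdin and writes the new parent string on stdout.
add_root_parent() {
    sed 's/^$/-p graft0/'
}

root=$(echo "" | add_root_parent)                      # initial commit
normal=$(echo "-p aaa111" | add_root_parent)           # normal commit
merge=$(echo "-p aaa111 -p bbb222" | add_root_parent)  # merge commit
printf '%s\n' "$root" "$normal" "$merge"
```

Only the empty parent string of an initial commit is changed; the other two pass through untouched.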

--msg-filter <command>
    This is the filter for rewriting the commit messages. The argument
    is evaluated in the shell with the original commit message on
    standard input; its standard output is used as the new commit
    message.
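
A message filter is an ordinary stdin-to-stdout command, so it can be tested on its own before a rewrite. The sketch below uses a hypothetical msg_filter function and an arbitrary "[legacy]" tag to prefix the subject line of a sample message.

```shell
# Stand-in for a --msg-filter command: message in on stdin, new message
# out on stdout. Only the first line (the subject) is changed.
msg_filter() {
    sed -e '1s/^/[legacy] /'
}

rewritten=$(printf 'Fix parser\n\nHandle empty input.\n' | msg_filter)
printf '%s\n' "$rewritten"
```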

--commit-filter <command>
    This is the filter for performing the commit. If this filter is
    specified, it will be called instead of the git commit-tree
    command, with arguments of the form "<TREE_ID> [(-p
    <PARENT_COMMIT_ID>)...]" and the log message on stdin. The commit
    id is expected on stdout.

    As a special extension, the commit filter may emit multiple commit
    ids; in that case, the rewritten children of the original commit
    will have all of them as parents.

    You can use the map convenience function in this filter, and other
    convenience functions, too. For example, calling skip_commit "$@"
    will leave out the current commit (but not its changes! If you want
    that, use git rebase instead).

    You can also use git_commit_non_empty_tree "$@" instead of git
    commit-tree "$@" if you don’t wish to keep single-parent commits
    that make no change to the tree.

--tag-name-filter <command>
    This is the filter for rewriting tag names. When passed, it will be
    called for every tag ref that points to a rewritten object (or to a
    tag object which points to a rewritten object). The original tag
    name is passed via standard input, and the new tag name is expected
    on standard output.

    The original tags are not deleted, but can be overwritten; use
    "--tag-name-filter cat" to simply update the tags. In this case, be
    very careful and make sure you have the old tags backed up in case
    the conversion goes wrong.

    Nearly proper rewriting of tag objects is supported. If the tag has
    a message attached, a new tag object will be created with the same
    message, author, and timestamp. If the tag has a signature
    attached, the signature will be stripped. It is by definition
    impossible to preserve signatures. The reason this is "nearly"
    proper is that ideally, if the tag did not change (points to the
    same object, has the same name, etc.), it should retain any
    signature. That is not the case: signatures will always be removed,
    so buyer beware. There is also no support for changing the author or
    timestamp (or the tag message for that matter). Tags which point to
    other tags will be rewritten to point to the underlying commit.
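
The effect of "--tag-name-filter cat" can be observed in a scratch repository. This sketch assumes a lightweight tag named v1.0 and an arbitrary "[archived]" message prefix; both are made up for the example, and FILTER_BRANCH_SQUELCH_WARNING is assumed to matter only on newer git releases.

```shell
export FILTER_BRANCH_SQUELCH_WARNING=1   # only matters on newer git releases

repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.name "Scratch User"
git config user.email "scratch@example.com"
echo hello > file.txt && git add file.txt
git commit -q -m "release"
git tag v1.0                             # lightweight tag on the only commit

# Rewrite the commit message; "cat" keeps every tag name unchanged, so
# v1.0 is updated to point at the rewritten commit.
git filter-branch --msg-filter 'sed "1s/^/[archived] /"' \
    --tag-name-filter cat -- --all >/dev/null

tag_commit=$(git rev-parse "v1.0^{commit}")
head_commit=$(git rev-parse HEAD)
tag_subject=$(git log -1 --format=%s v1.0)
echo "$tag_subject"
```

Without the tag name filter, v1.0 would keep pointing at the original commit.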

--subdirectory-filter <directory>
    Only look at the history which touches the given subdirectory. The
    result will contain that directory (and only that) as its project
    root. Implies the behavior described in the section called “Remap
    to ancestor”.

--prune-empty
    Some kinds of filters will generate empty commits that leave the
    tree untouched. This switch allows git-filter-branch to ignore such
    commits. However, it only applies to commits that have exactly one
    parent, so merge points are kept. Also, this option is not
    compatible with --commit-filter; to get the same effect there, use
    the function git_commit_non_empty_tree "$@" instead of the git
    commit-tree "$@" idiom in your commit filter.
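
A minimal sketch of --prune-empty in a scratch repository: the second commit only adds a file that the index filter removes, so it becomes empty and is dropped. File names and contents are made up; FILTER_BRANCH_SQUELCH_WARNING is assumed to matter only on newer git releases.

```shell
export FILTER_BRANCH_SQUELCH_WARNING=1   # only matters on newer git releases

repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.name "Scratch User"
git config user.email "scratch@example.com"
echo public > README && git add README && git commit -q -m "add README"
echo hunter2 > secret.txt && git add secret.txt
git commit -q -m "add secret only"

before=$(git rev-list --count HEAD)
# Removing secret.txt leaves the second commit without any change.
git filter-branch --prune-empty --index-filter \
    'git rm --cached --ignore-unmatch secret.txt' HEAD >/dev/null
after=$(git rev-list --count HEAD)
echo "$before -> $after"
```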

--original <namespace>
    Use this option to set the namespace where the original commits
    will be stored. The default value is refs/original.

-d <directory>
    Use this option to set the path to the temporary directory used for
    rewriting. When applying a tree filter, the command needs to
    temporarily check out the tree to some directory, which may consume
    considerable space in case of large projects. By default it does
    this in the .git-rewrite/ directory, but you can override that
    choice with this parameter.
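
The -d option might be used as in the following sketch. The scratch location comes from mktemp and is only a placeholder; whether it lives on tmpfs depends on the system. The directory passed to -d must not exist yet, which is why a fresh parent directory is created first.

```shell
export FILTER_BRANCH_SQUELCH_WARNING=1   # only matters on newer git releases

repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.name "Scratch User"
git config user.email "scratch@example.com"
echo public > README && echo hunter2 > secret.txt
git add README secret.txt && git commit -q -m "add files"

# Hypothetical scratch location; point this at a tmpfs mount if available.
scratch=$(mktemp -d)/rewrite

git filter-branch -d "$scratch" --tree-filter 'rm -f secret.txt' HEAD >/dev/null

files=$(git ls-tree --name-only HEAD)
echo "$files"
```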

-f, --force
    git filter-branch refuses to start with an existing temporary
    directory or when there are already refs starting with
    refs/original/, unless forced.

<rev-list options>...
    Arguments for git rev-list. All positive refs included by these
    options are rewritten. You may also specify options such as --all,
    but you must use -- to separate them from the git filter-branch
    options. Implies the behavior described in the section called
    “Remap to ancestor”.

Remap to ancestor
By using git-rev-list(1) arguments, e.g., path limiters, you can limit
the set of revisions which get rewritten. However, positive refs on the
command line are distinguished: we don’t let them be excluded by such
limiters. For this purpose, they are instead rewritten to point at the
nearest ancestor that was not excluded.

EXAMPLES
Suppose you want to remove a file (containing confidential information
or copyright violation) from all commits:

    git filter-branch --tree-filter 'rm filename' HEAD

However, if the file is absent from the tree of some commit, a simple
rm filename will fail for that tree and commit. Thus you may instead
want to use rm -f filename as the script.

Using --index-filter with git rm yields a significantly faster version.
As with rm filename, git rm --cached filename will fail if the
file is absent from the tree of a commit. If you want to "completely
forget" a file, it does not matter when it entered history, so we also
add --ignore-unmatch:

    git filter-branch --index-filter 'git rm --cached --ignore-unmatch filename' HEAD

Now, you will get the rewritten history saved in HEAD.
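
One way to check such a rewrite is sketched below on a scratch repository. The file names are made up, and FILTER_BRANCH_SQUELCH_WARNING is assumed to matter only on newer git releases. After the rewrite, no commit reachable from HEAD should touch the file, while the old branch tip stays reachable under refs/original/.

```shell
export FILTER_BRANCH_SQUELCH_WARNING=1   # only matters on newer git releases

repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.name "Scratch User"
git config user.email "scratch@example.com"
echo public > README && echo hunter2 > secret.txt
git add README secret.txt && git commit -q -m "add files"
echo more >> README && git commit -q -am "update README"

git filter-branch --index-filter \
    'git rm --cached --ignore-unmatch secret.txt' HEAD >/dev/null

# Commits reachable from HEAD that still touch secret.txt.
remaining=$(git log --format=%H HEAD -- secret.txt)
# Backup refs written by filter-branch.
backup=$(git for-each-ref --format='%(refname)' refs/original/)
echo "remaining: ${remaining:-none}"
echo "backup: $backup"
```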

To rewrite the repository to look as if foodir/ had been its project
root, and discard all other history:

    git filter-branch --subdirectory-filter foodir -- --all

Thus you can, e.g., turn a library subdirectory into a repository of
its own. Note the -- that separates filter-branch options from revision
options, and the --all to rewrite all branches and tags.

To set a commit (which typically is at the tip of another history) to
be the parent of the current initial commit, in order to paste the
other history behind the current history:

    git filter-branch --parent-filter 'sed "s/^\$/-p <graft-id>/"' HEAD

(if the parent string is empty - which happens when we are dealing with
the initial commit - add <graft-id> as a parent). Note that this
assumes history with a single root (that is, no merge without common
ancestors happened). If this is not the case, use:

    git filter-branch --parent-filter \
        'test $GIT_COMMIT = <commit-id> && echo "-p <graft-id>" || cat' HEAD

or even simpler:

    echo "<commit-id> <graft-id>" >> .git/info/grafts
    git filter-branch <graft-id>..HEAD

To remove commits authored by "Darl McBribe" from the history:

    git filter-branch --commit-filter '
        if [ "$GIT_AUTHOR_NAME" = "Darl McBribe" ];
        then
            skip_commit "$@";
        else
            git commit-tree "$@";
        fi' HEAD

The function skip_commit is defined as follows:

    skip_commit()
    {
        shift;
        while [ -n "$1" ];
        do
            shift;
            map "$1";
            shift;
        done;
    }

The shift magic first throws away the tree id and then the -p
parameters. Note that this handles merges properly! In case Darl
committed a merge between P1 and P2, it will be propagated properly and
all children of the merge will become merge commits with P1 and P2 as
their parents instead of the merge commit.
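
The shift logic can be traced outside of filter-branch with a stub in place of map. In this sketch the stub, the tree id tree0 and the parent ids p1 and p2 are all placeholders; the real map prints the rewritten id of a commit.

```shell
# Stub: the real map function prints the rewritten id for a commit.
map() { echo "new-$1"; }

skip_commit()
{
    shift;
    while [ -n "$1" ];
    do
        shift;
        map "$1";
        shift;
    done;
}

# Arguments have the form "<TREE_ID> [(-p <PARENT_COMMIT_ID>)...]".
out=$(skip_commit tree0 -p p1 -p p2)
echo "$out"
```

The function emits one mapped id per parent, so the children of a skipped merge inherit both of its parents.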

You can rewrite the commit log messages using --msg-filter. For
example, git-svn-id strings in a repository created by git svn can be
removed this way:

    git filter-branch --msg-filter '
        sed -e "/^git-svn-id:/d"
    '

To restrict rewriting to only part of the history, specify a revision
range in addition to the new branch name. The new branch name will
point to the top-most revision that a git rev-list of this range will
print.

If you need to add Acked-by lines to, say, the last 10 commits (none of
which is a merge), use this command:

    git filter-branch --msg-filter '
        cat &&
        echo "Acked-by: Bugs Bunny <bunny@bugzilla.org>"
    ' HEAD~10..HEAD

NOTE: The changes introduced by the commits, and not reverted by
subsequent commits, will still be in the rewritten branch. If you
want to throw out changes together with the commits, you should use the
interactive mode of git rebase.

Consider this history:

         D--E--F--G--H
        /     /
    A--B-----C

To rewrite only commits D,E,F,G,H, but leave A, B and C alone, use:

    git filter-branch ... C..H

To rewrite commits E,F,G,H, use one of these:

    git filter-branch ... C..H --not D
    git filter-branch ... D..H --not C

To move the whole tree into a subdirectory, or remove it from there:

    git filter-branch --index-filter \
        'git ls-files -s | sed "s-\t\"*-&newsubdir/-" |
            GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
                git update-index --index-info &&
         mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD
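
To see what the sed expression in this filter does, it can be run on a single line in "git ls-files -s" format. The object id and path below are placeholders, and the sketch assumes a sed that understands \t as a TAB (GNU sed does).

```shell
# One made-up index entry: mode, object id, stage, a TAB, then the path.
entry=$(printf '100644 %s 0\t%s' \
    0123456789abcdef0123456789abcdef01234567 src/main.c)

# Same expression as in the filter: insert "newsubdir/" after the TAB
# (and after any opening quote of a quoted path).
moved=$(printf '%s\n' "$entry" | sed "s-\t\"*-&newsubdir/-")
printf '%s\n' "$moved"
```

Only the path column changes; mode, object id and stage are passed through to git update-index as they were.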

CHECKLIST FOR SHRINKING A REPOSITORY
git-filter-branch is often used to get rid of a subset of files,
usually with some combination of --index-filter and
--subdirectory-filter. People expect the resulting repository to be
smaller than the original, but you need a few more steps to actually
make it smaller, because git tries hard not to lose your objects until
you tell it to. First make sure that:

· You really removed all variants of a filename, if a blob was moved
  over its lifetime. git log --name-only --follow --all -- filename
  can help you find renames.

· You really filtered all refs: use --tag-name-filter cat -- --all
  when calling git-filter-branch.

Then there are two ways to get a smaller repository. The safer way is
to clone, which keeps your original intact.

· Clone it with git clone file:///path/to/repo. The clone will not
  have the removed objects. See git-clone(1). (Note that cloning with
  a plain path just hardlinks everything!)

If you really don’t want to clone it, for whatever reason, check the
following points instead (in this order). This is a very destructive
approach, so make a backup or go back to cloning it. You have been
warned.

· Remove the original refs backed up by git-filter-branch: run git
  for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git
  update-ref -d.

· Expire all reflogs with git reflog expire --expire=now --all.

· Garbage collect all unreferenced objects with git gc --prune=now
  (or if your git-gc is not new enough to support arguments to
  --prune, use git repack -ad; git prune instead).
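
The destructive variant of the checklist could be scripted roughly as follows. This is a sketch on a scratch repository with made-up file names, not something to run on a repository you care about. FILTER_BRANCH_SQUELCH_WARNING is assumed to matter only on newer git releases.

```shell
export FILTER_BRANCH_SQUELCH_WARNING=1   # only matters on newer git releases

repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.name "Scratch User"
git config user.email "scratch@example.com"
echo public > README && echo hunter2 > secret.txt
git add README secret.txt && git commit -q -m "add files"
blob=$(git rev-parse HEAD:secret.txt)    # object expected to disappear

git filter-branch --index-filter \
    'git rm --cached --ignore-unmatch secret.txt' \
    --tag-name-filter cat -- --all >/dev/null

# 1. Remove the backup refs written by filter-branch.
git for-each-ref --format="%(refname)" refs/original/ |
    xargs -n 1 git update-ref -d
# 2. Expire all reflogs.
git reflog expire --expire=now --all
# 3. Garbage collect all unreferenced objects.
git gc --quiet --prune=now

left=$(git for-each-ref --format="%(refname)" refs/original/)
if git cat-file -e "$blob" 2>/dev/null; then gone=no; else gone=yes; fi
echo "blob removed: $gone"
```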

AUTHOR
Written by Petr "Pasky" Baudis <pasky@suse.cz>, and the git list
<git@vger.kernel.org>

DOCUMENTATION
Documentation by Petr Baudis and the git list.

GIT
Part of the git(1) suite

Git 1.7.4.4                       04/11/2011              GIT-FILTER-BRANCH(1)