GIT-FILTER-BRANCH(1)          Git Manual          GIT-FILTER-BRANCH(1)

NAME
git-filter-branch - Rewrite branches

SYNOPSIS
git filter-branch [--env-filter <command>] [--tree-filter <command>]
        [--index-filter <command>] [--parent-filter <command>]
        [--msg-filter <command>] [--commit-filter <command>]
        [--tag-name-filter <command>] [--subdirectory-filter <directory>]
        [--prune-empty]
        [--original <namespace>] [-d <directory>] [-f | --force]
        [--] [<rev-list options>...]

DESCRIPTION
Lets you rewrite git revision history by rewriting the branches
mentioned in the <rev-list options>, applying custom filters on each
revision. Those filters can modify each tree (e.g. removing a file or
running a perl rewrite on all files) or information about each commit.
Otherwise, all information (including original commit times or merge
information) will be preserved.

The command will only rewrite the positive refs mentioned in the
command line (e.g. if you pass a..b, only b will be rewritten). If you
specify no filters, the commits will be recommitted without any
changes, which would normally have no effect. Nevertheless, this may be
useful in the future to compensate for some git bugs, so such usage is
permitted.

NOTE: This command honors .git/info/grafts. If you have any grafts
defined, running this command will make them permanent.

WARNING! The rewritten history will have different object names for all
the objects and will not converge with the original branch. You will
not be able to easily push and distribute the rewritten branch on top
of the original branch. Please do not use this command if you do not
know the full implications, and avoid using it anyway if a simple
single commit would suffice to fix your problem. (See the "RECOVERING
FROM UPSTREAM REBASE" section in git-rebase(1) for further information
about rewriting published history.)

Always verify that the rewritten version is correct: The original refs,
if different from the rewritten ones, will be stored in the namespace
refs/original/.

Note that since this operation is very I/O expensive, it might be a
good idea to redirect the temporary directory off-disk with the -d
option, e.g. on tmpfs. Reportedly the speedup is very noticeable.

Filters
The filters are applied in the order listed below. The <command>
argument is always evaluated in the shell context using the eval
command (with the notable exception of the commit filter, for technical
reasons). Prior to that, the $GIT_COMMIT environment variable will be
set to contain the id of the commit being rewritten. Also,
GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL, GIT_AUTHOR_DATE, GIT_COMMITTER_NAME,
GIT_COMMITTER_EMAIL, and GIT_COMMITTER_DATE are set according to the
current commit. The values of these variables after the filters have
run are used for the new commit. If any evaluation of <command>
returns a non-zero exit status, the whole operation will be aborted.

A map function is available that takes an "original sha1 id" argument
and outputs a "rewritten sha1 id" if the commit has already been
rewritten, and "original sha1 id" otherwise; the map function can
return several ids on separate lines if your commit filter emitted
multiple commits.

OPTIONS
--env-filter <command>
    This filter may be used if you only need to modify the environment
    in which the commit will be performed. Specifically, you might want
    to rewrite the author/committer name/email/time environment
    variables (see git-commit(1) for details). Do not forget to
    re-export the variables.

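As a minimal sketch of what an --env-filter command does, the snippet
below mimics the evaluation step outside of git: it sets two of the
variables listed under Filters, evaluates a filter string with eval,
and prints the result. The name, both addresses, and the shell variable
env_filter are hypothetical; in real use the same filter string would
be given as the argument of --env-filter.

```shell
# Stand-ins for the values git filter-branch would take from the commit
# being rewritten (hypothetical identity).
GIT_AUTHOR_NAME='Old Name'
GIT_AUTHOR_EMAIL='old@example.com'
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL

# The filter: replace one author address and re-export the variable.
env_filter='
if [ "$GIT_AUTHOR_EMAIL" = "old@example.com" ]; then
    GIT_AUTHOR_EMAIL="new@example.com"
    export GIT_AUTHOR_EMAIL
fi'

# filter-branch evaluates the filter in its own shell context with eval,
# so assignments made by the filter are still visible afterwards.
eval "$env_filter"
echo "author is now: $GIT_AUTHOR_NAME <$GIT_AUTHOR_EMAIL>"
```

Because the filter runs through eval rather than in a child process,
the changed variable survives the filter and can be used for the new
commit.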
--tree-filter <command>
    This is the filter for rewriting the tree and its contents. The
    argument is evaluated in shell with the working directory set to
    the root of the checked out tree. The new tree is then used as-is
    (new files are auto-added, disappeared files are auto-removed -
    neither .gitignore files nor any other ignore rules HAVE ANY
    EFFECT!).

--index-filter <command>
    This is the filter for rewriting the index. It is similar to the
    tree filter but does not check out the tree, which makes it much
    faster. Frequently used with git rm --cached --ignore-unmatch ...,
    see EXAMPLES below. For hairy cases, see git-update-index(1).

--parent-filter <command>
    This is the filter for rewriting the commit’s parent list. It will
    receive the parent string on stdin and shall output the new parent
    string on stdout. The parent string is in the format described in
    git-commit-tree(1): empty for the initial commit, "-p parent" for a
    normal commit and "-p parent1 -p parent2 -p parent3 ..." for a
    merge commit.

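To make the parent string format concrete, here is a small sketch that
feeds sample parent strings through a filter by hand. The commit ids
and the helper name parent_filter are placeholders invented for
illustration; only the "-p <id>" format comes from the description
above.

```shell
# Hypothetical commit ids used only for illustration.
graft_id=1111111111111111111111111111111111111111
other_id=2222222222222222222222222222222222222222

# A parent filter reads the parent string on stdin and writes the new one
# to stdout. This one gives a root commit (empty parent string) a parent
# and passes every other parent string through unchanged.
parent_filter() {
    sed "s/^\$/-p $graft_id/"
}

root_parents=$(echo "" | parent_filter)
normal_parents=$(echo "-p $other_id" | parent_filter)

echo "root commit:   $root_parents"
echo "normal commit: $normal_parents"
```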
--msg-filter <command>
    This is the filter for rewriting the commit messages. The argument
    is evaluated in the shell with the original commit message on
    standard input; its standard output is used as the new commit
    message.

--commit-filter <command>
    This is the filter for performing the commit. If this filter is
    specified, it will be called instead of the git commit-tree
    command, with arguments of the form
    "<TREE_ID> [-p <PARENT_COMMIT_ID>]..." and the log message on
    stdin. The commit id is expected on stdout.

    As a special extension, the commit filter may emit multiple commit
    ids; in that case, the rewritten children of the original commit
    will have all of them as parents.

    You can use the map convenience function in this filter, and other
    convenience functions, too. For example, calling skip_commit "$@"
    will leave out the current commit (but not its changes! If you want
    that, use git rebase instead).

    You can also use git_commit_non_empty_tree "$@" instead of git
    commit-tree "$@" if you don’t wish to keep commits that have a
    single parent and make no change to the tree.

--tag-name-filter <command>
    This is the filter for rewriting tag names. When passed, it will be
    called for every tag ref that points to a rewritten object (or to a
    tag object which points to a rewritten object). The original tag
    name is passed via standard input, and the new tag name is expected
    on standard output.

    The original tags are not deleted, but can be overwritten; use
    "--tag-name-filter cat" to simply update the tags. In this case, be
    very careful and make sure you have the old tags backed up in case
    the conversion has run afoul.

    Nearly proper rewriting of tag objects is supported. If the tag has
    a message attached, a new tag object will be created with the same
    message, author, and timestamp. If the tag has a signature
    attached, the signature will be stripped. It is by definition
    impossible to preserve signatures. This is only "nearly" proper
    because, ideally, if the tag did not change (points to the same
    object, has the same name, etc.) it should retain any signature.
    That is not the case: signatures will always be removed, buyer
    beware. There is also no support for changing the author or
    timestamp (or the tag message for that matter). Tags which point to
    other tags will be rewritten to point to the underlying commit.

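The stdin/stdout contract of a tag name filter can be tried out without
touching a repository. The sketch below is only an illustration: the
renaming rule (v<version> becomes release-<version>), the tag names,
and the variable tag_name_filter are made up; "cat" is the identity
filter mentioned above.

```shell
# Hypothetical filter: rename tags of the form v<digit>... to
# release-<digit>..., leaving all other names untouched.
tag_name_filter='sed -e "s/^v\([0-9]\)/release-\1/"'

# The old tag name arrives on stdin; the new name is read from stdout.
new_tag=$(echo "v1.7.1" | sh -c "$tag_name_filter")
same_tag=$(echo "nightly" | sh -c "$tag_name_filter")

# With "cat" as the filter, every tag keeps its name.
kept_tag=$(echo "v1.7.1" | sh -c 'cat')

echo "$new_tag $same_tag $kept_tag"
```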
--subdirectory-filter <directory>
    Only look at the history which touches the given subdirectory. The
    result will contain that directory (and only that) as its project
    root. Implies --remap-to-ancestor.

--remap-to-ancestor
    Rewrite refs to the nearest rewritten ancestor instead of ignoring
    them.

    Normally, positive refs on the command line are only changed if the
    commit they point to was rewritten. However, you can limit the
    extent of this rewriting by using git-rev-list(1) arguments, e.g.,
    path limiters. Refs pointing to such excluded commits would then
    normally be ignored. With this option, they are instead rewritten
    to point at the nearest ancestor that was not excluded.

--prune-empty
    Some kinds of filters will generate empty commits that leave the
    tree untouched. This switch allows git-filter-branch to ignore such
    commits. However, it only applies to commits that have exactly one
    parent, so merge points are kept. Also, this option is not
    compatible with --commit-filter; to get the same effect there, use
    the function git_commit_non_empty_tree "$@" instead of the git
    commit-tree "$@" idiom in your commit filter.

--original <namespace>
    Use this option to set the namespace where the original commits
    will be stored. The default value is refs/original.

-d <directory>
    Use this option to set the path to the temporary directory used for
    rewriting. When applying a tree filter, the command needs to
    temporarily check out the tree to some directory, which may consume
    considerable space in case of large projects. By default it does
    this in the .git-rewrite/ directory but you can override that
    choice by this parameter.

-f, --force
    git filter-branch refuses to start with an existing temporary
    directory or when there are already refs starting with
    refs/original/, unless forced.

<rev-list options>...
    Arguments for git rev-list. All positive refs included by these
    options are rewritten. You may also specify options such as --all,
    but you must use -- to separate them from the git filter-branch
    options.

EXAMPLES
Suppose you want to remove a file (containing confidential information
or copyright violation) from all commits:

    git filter-branch --tree-filter 'rm filename' HEAD

However, if the file is absent from the tree of some commit, a simple
rm filename will fail for that tree and commit. Thus you may instead
want to use rm -f filename as the script.

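The difference matters because a filter that returns a non-zero exit
status aborts the whole rewrite. The following sketch, run in a scratch
directory rather than a real checkout, only compares the exit statuses
of the two commands when the file is missing; the variable names are
illustrative.

```shell
# Use a scratch directory that does not contain the file, as would be the
# case for a commit whose tree lacks it.
workdir=$(mktemp -d)

# Plain rm reports an error for a missing file.
plain_status=0
rm "$workdir/filename" 2>/dev/null || plain_status=$?

# rm -f treats a missing file as success.
force_status=0
rm -f "$workdir/filename" || force_status=$?

echo "rm exit status:    $plain_status"
echo "rm -f exit status: $force_status"
rmdir "$workdir"
```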
Using --index-filter with git rm yields a significantly faster version.
As with rm filename, git rm --cached filename will fail if the file is
absent from the tree of a commit. If you want to "completely forget" a
file, it does not matter when it entered history, so we also add
--ignore-unmatch:

    git filter-branch --index-filter 'git rm --cached --ignore-unmatch filename' HEAD

Now, you will get the rewritten history saved in HEAD.

To rewrite the repository to look as if foodir/ had been its project
root, and discard all other history:

    git filter-branch --subdirectory-filter foodir -- --all

Thus you can, e.g., turn a library subdirectory into a repository of
its own. Note the -- that separates filter-branch options from revision
options, and the --all to rewrite all branches and tags.

To set a commit (which typically is at the tip of another history) to
be the parent of the current initial commit, in order to paste the
other history behind the current history:

    git filter-branch --parent-filter 'sed "s/^\$/-p <graft-id>/"' HEAD

(if the parent string is empty - which happens when we are dealing with
the initial commit - add the graft commit as a parent). Note that this
assumes history with a single root (that is, no merge without common
ancestors happened). If this is not the case, use:

    git filter-branch --parent-filter \
            'test $GIT_COMMIT = <commit-id> && echo "-p <graft-id>" || cat' HEAD

or even simpler:

    echo "<commit-id> <graft-id>" >> .git/info/grafts
    git filter-branch <graft-id>..HEAD

To remove commits authored by "Darl McBribe" from the history:

    git filter-branch --commit-filter '
            if [ "$GIT_AUTHOR_NAME" = "Darl McBribe" ];
            then
                    skip_commit "$@";
            else
                    git commit-tree "$@";
            fi' HEAD

The function skip_commit is defined as follows:

    skip_commit()
    {
            shift;
            while [ -n "$1" ];
            do
                    shift;
                    map "$1";
                    shift;
            done;
    }

The shift magic first throws away the tree id and then the -p
parameters. Note that this handles merges properly! In case Darl
committed a merge between P1 and P2, it will be propagated properly and
all children of the merge will become merge commits with P1,P2 as their
parents instead of the merge commit.

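The shift sequence is easier to follow when run by hand. In the sketch
below, map is replaced by a stub that merely prefixes its argument,
since the real map is only available inside git filter-branch; TREE, P1
and P2 stand in for real object ids. skip_commit itself is copied from
the definition above.

```shell
# Stub for illustration only: the real map prints the rewritten id of a
# commit and is provided by git filter-branch.
map() {
    echo "rewritten-$1"
}

skip_commit()
{
        shift;
        while [ -n "$1" ];
        do
                shift;
                map "$1";
                shift;
        done;
}

# Arguments have the same shape as those passed to a commit filter for a
# merge: a tree id followed by "-p <parent>" pairs.
result=$(skip_commit TREE -p P1 -p P2)
echo "$result"
```

The function prints one mapped parent per line, which is how the
skipped merge's children end up with both P1 and P2 as parents.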
You can rewrite the commit log messages using --msg-filter. For
example, git-svn-id strings in a repository created by git svn can be
removed this way:

    git filter-branch --msg-filter '
            sed -e "/^git-svn-id:/d"
    '

To restrict rewriting to only part of the history, specify a revision
range in addition to the new branch name. The new branch name will
point to the top-most revision that a git rev-list of this range will
print.

If you need to add Acked-by lines to, say, the last 10 commits (none of
which is a merge), use this command:

    git filter-branch --msg-filter '
            cat &&
            echo "Acked-by: Bugs Bunny <bunny@bugzilla.org>"
    ' HEAD~10..HEAD

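A message filter can be checked on a sample message before it is let
loose on real history. This sketch pipes a made-up commit message
through the Acked-by filter shown above; the message text and the
variable names are illustrative only.

```shell
# The Acked-by filter, kept in a variable for the experiment.
msg_filter='cat && echo "Acked-by: Bugs Bunny <bunny@bugzilla.org>"'

# Feed a sample commit message on stdin, as filter-branch would, and
# capture what the filter writes to stdout.
new_msg=$(printf 'Fix off-by-one in parser\n' | sh -c "$msg_filter")

printf '%s\n' "$new_msg"
```

The original message passes through cat unchanged and the Acked-by line
is appended after it.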
NOTE the changes introduced by the commits, and which are not reverted
by subsequent commits, will still be in the rewritten branch. If you
want to throw out changes together with the commits, you should use the
interactive mode of git rebase.

Consider this history:

        D--E--F--G--H
       /     /
    A--B-----C

To rewrite only commits D,E,F,G,H, but leave A, B and C alone, use:

    git filter-branch ... C..H

To rewrite commits E,F,G,H, use one of these:

    git filter-branch ... C..H --not D
    git filter-branch ... D..H --not C

To move the whole tree into a subdirectory, or remove it from there:

    git filter-branch --index-filter \
            'git ls-files -s | sed "s-\t\"*-&newsubdir/-" |
                    GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
                            git update-index --index-info &&
             mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD

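The sed expression in that filter operates on lines printed by git
ls-files -s, which have the form "<mode> <object> <stage><TAB><path>".
It matches the tab (plus an opening double quote, if the path is
quoted) and inserts newsubdir/ right after it. The sketch below applies
the same substitution to one hand-written entry; the object id and path
are placeholders, and a literal tab is used instead of \t on the
assumption that not every sed understands that escape.

```shell
# A literal tab character, kept in a variable for readability.
tab=$(printf '\t')

# One sample index entry (hypothetical object id and path).
entry="100644 0123456789abcdef0123456789abcdef01234567 0${tab}src/main.c"

# Same substitution as in the index filter: keep the matched tab (and
# any opening quote) via "&", then add the new directory prefix.
move_into_subdir() {
    sed "s-${tab}\"*-&newsubdir/-"
}

moved_plain=$(printf '%s\n' "$entry" | move_into_subdir)
printf '%s\n' "$moved_plain"
```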
CHECKLIST FOR SHRINKING A REPOSITORY
git-filter-branch is often used to get rid of a subset of files,
usually with some combination of --index-filter and
--subdirectory-filter. People expect the resulting repository to be
smaller than the original, but you need a few more steps to actually
make it smaller, because git tries hard not to lose your objects until
you tell it to. First make sure that:

·   You really removed all variants of a filename, if a blob was moved
    over its lifetime. git log --name-only --follow --all -- filename
    can help you find renames.

·   You really filtered all refs: use --tag-name-filter cat -- --all
    when calling git-filter-branch.

Then there are two ways to get a smaller repository. The safer way is
to clone, which keeps your original intact.

·   Clone it with git clone file:///path/to/repo. The clone will not
    have the removed objects. See git-clone(1). (Note that cloning with
    a plain path just hardlinks everything!)

If you really don’t want to clone it, for whatever reason, check the
following points instead (in this order). This is a very destructive
approach, so make a backup or go back to cloning it. You have been
warned.

·   Remove the original refs backed up by git-filter-branch: say git
    for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git
    update-ref -d.

·   Expire all reflogs with git reflog expire --expire=now --all.

·   Garbage collect all unreferenced objects with git gc --prune=now
    (or if your git-gc is not new enough to support arguments to
    --prune, use git repack -ad; git prune instead).

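The three destructive steps can be collected into one shell function so
that they always run in the stated order. This is only a sketch:
shrink_repo is a hypothetical helper name, and the xargs pipeline is
replaced by a while-read loop as a design choice, so that nothing is
run when no refs/original/ refs exist. Try it on a backup copy first.

```shell
# Sketch only: destructive. Call it as: shrink_repo /path/to/repo
shrink_repo() {
    (
        # Work inside a subshell so the caller's directory is unchanged.
        cd "$1" || exit 1

        # 1. Remove the original refs backed up by git-filter-branch.
        git for-each-ref --format="%(refname)" refs/original/ |
        while read -r ref; do
            git update-ref -d "$ref"
        done

        # 2. Expire all reflogs.
        git reflog expire --expire=now --all

        # 3. Garbage collect all unreferenced objects.
        git gc --prune=now
    )
}
```

Defining the function does not touch any repository; the commands only
run when shrink_repo is called with a path.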
AUTHOR
Written by Petr "Pasky" Baudis <pasky@suse.cz[1]>, and the git list
<git@vger.kernel.org[2]>

DOCUMENTATION
Documentation by Petr Baudis and the git list.

GIT
Part of the git(1) suite

NOTES
 1. pasky@suse.cz
    mailto:pasky@suse.cz

 2. git@vger.kernel.org
    mailto:git@vger.kernel.org

Git 1.7.1                     08/16/2017          GIT-FILTER-BRANCH(1)