History repeats itself, first as tragedy, second as farce.
— Karl Marx
Several days ago, a few coworkers and I ran into a perplexing problem. It can be stated as follows: "Given two Git repositories, source and target, move the contents of a subdirectory a located in source to a subdirectory b located in target, while preserving revision history."
We ended up simply running mv to transfer the contents of a to b which, of course, discarded all revision history. Out of curiosity, I researched whether there was a lossless way to do what we wanted.
The solution was the Git subcommand filter-branch. This tool lets you alter revision history by iterating over revisions and applying filters. A filter is an option passed to filter-branch that applies a rewrite to every revision on a given branch.
Consider the command git filter-branch --subdirectory-filter foo. This steps through all revisions and makes the subdirectory foo the new root of the Git repository.
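To make the effect of --subdirectory-filter concrete, here is a minimal sketch that runs it against a throwaway repository. The directory bar, the file names, and the demo identity are hypothetical and only there for illustration; FILTER_BRANCH_SQUELCH_WARNING silences the deprecation notice that recent Git versions print and should be harmless on older ones.

```shell
set -eu

# Hypothetical identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Recent Git versions warn before running filter-branch; skip that notice.
export FILTER_BRANCH_SQUELCH_WARNING=1

# Build a scratch repository with two subdirectories.
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir foo bar
echo one > foo/file.txt
echo two > bar/other.txt
git add .
git commit -q -m "add foo and bar"

# Rewrite history so that foo becomes the repository root.
git filter-branch --subdirectory-filter foo HEAD

# Only the former contents of foo should remain tracked.
git ls-files
```

After the rewrite, git ls-files should list file.txt at the top level, and bar/other.txt is no longer part of the history.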
For us, the --subdirectory-filter option wasn't going to cut it. Instead we used an option that rewrites each revision by evaluating a user-specified shell script, called --tree-filter.
Let's step through an example of using --tree-filter to solve the problem defined above. For our purposes, we assume the following:
source and target are Git repositories in the same parent directory.
source is on the master branch.
We begin by changing into source:
$ cd source
Next we run the filter-branch command:
$ git filter-branch --tree-filter 'find . ! -name a -type d ! -name . -type d ! -name .. -type d -maxdepth 1 | xargs rm -rf; test -d a && mv a b || echo "Nothing to do"' --prune-empty HEAD
Take a deep breath. Let's break apart this command so we can better digest it.
find . ! -name a -type d ! -name . -type d ! -name .. -type d -maxdepth 1 | xargs rm -rf;: Remove every top-level directory except a. Because each test is combined with -type d, top-level files are left in place.
test -d a && mv a b || echo "Nothing to do": If the directory a exists, rename it to b. This prevents filter-branch from failing on commits that don't contain the directory a.
--prune-empty: Drop empty commits generated by filter-branch. In our case, these are the commits that no longer change anything once the filter has run, such as commits that only touched directories other than a.
For other useful options refer to the appendix.
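The filter body can be tried by hand before handing it to filter-branch. The sketch below runs the same pipeline once against a scratch directory standing in for a single checked-out revision; the directories c and d and all file names are hypothetical. GNU find may print a warning about the position of -maxdepth, which does not change the result.

```shell
set -eu

# A scratch tree that stands in for one checked-out revision.
tree=$(mktemp -d)
cd "$tree"
mkdir a c d
echo keep > a/file.txt
echo drop > c/file.txt
echo top > top-level.txt

# The filter body from the command above, run once by hand.
find . ! -name a -type d ! -name . -type d ! -name .. -type d -maxdepth 1 | xargs rm -rf; test -d a && mv a b || echo "Nothing to do"

# Show what is left.
ls
```

The directories c and d are gone and a has been renamed to b, while top-level.txt survives because the find expression only matches directories.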
Now we change into the target Git repository:
$ cd ../target
We then add ../source as a new remote called source, fetch from this new remote, and merge in source/master:
$ git remote add source ../source
$ git fetch source
$ git merge source/master
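One caveat when reproducing this today: Git 2.9 and later refuse to merge two histories that share no common ancestor unless --allow-unrelated-histories is passed. The following sketch replays the whole migration in two scratch repositories under that assumption. The file names, commit messages, and demo identity are hypothetical, the tree filter is reduced to the rename step for brevity, and both repositories are forced onto a branch named master to match the assumptions above.

```shell
set -eu

# Hypothetical identity and quiet filter-branch, for a clean scratch run.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"

# Scratch source repository with two commits touching a/.
git init -q source
cd source
git symbolic-ref HEAD refs/heads/master
mkdir a
echo v1 > a/file.txt
git add .
git commit -q -m "add a/file.txt"
echo v2 > a/file.txt
git commit -q -am "update a/file.txt"

# Rename a to b in every revision (rename step of the filter only).
git filter-branch --tree-filter 'test -d a && mv a b || echo "Nothing to do"' --prune-empty HEAD

# Scratch target repository with its own unrelated history.
cd "$work"
git init -q target
cd target
git symbolic-ref HEAD refs/heads/master
echo readme > README
git add .
git commit -q -m "initial target commit"

# Add the rewritten source as a remote, fetch it, and merge.
git remote add source ../source
git fetch -q source
git merge -q --allow-unrelated-histories -m "Merge b from source" source/master

# The history of the moved file is still available in target.
git log --oneline -- b/file.txt
```

The final git log should show both source commits for b/file.txt, which is the point of the exercise: the file arrives in target with its history attached.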
There you have it. We have successfully migrated the contents of a subdirectory from one Git repository to a subdirectory in another Git repository. All the while, our revision history has remained intact.
If you're dealing with a repository that has a large revision history, a significant speed-up can be had by using the -d option. This redirects the temporary directory used by filter-branch to a path specified by the user. For example, on many Unix systems you can use