keep - merge two unrelated git repositories




How do you merge two Git repositories? (14)

Consider the following scenario:

I have developed a small experimental project A in its own Git repo. It has now matured, and I'd like A to be part of larger project B, which has its own big repository. I'd now like to add A as a subdirectory of B.

How do I merge A into B, without losing history on any side?


git-subtree is nice, but it is probably not the one you want.

For example, if projectA is the directory created in B, after git subtree,

git log projectA

lists only one commit: the merge. The commits from the merged project are for different paths, so they don't show up.

Greg Hewgill's answer comes closest, although it doesn't actually say how to rewrite the paths.


The solution is surprisingly simple.

(1) In A,

PREFIX=projectA #adjust this

git filter-branch --index-filter '
    git ls-files -s |
    sed "s,\t,&'"$PREFIX"'/," |
    GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info &&
    mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE
' HEAD

Note: This rewrites history, so if you intend to continue using this repo A, you may want to clone (copy) a throwaway copy of it first.

(2) Then in B, run

git pull path/to/A

Voila! You have a projectA directory in B. If you run git log projectA, you will see all commits from A.


In my case, I wanted two subdirectories, projectA and projectB. In that case, I did step (1) to B as well.


A single branch of another repository can be easily placed under a subdirectory retaining its history. For example:

git subtree add --prefix=rails git://github.com/rails/rails.git master

This will appear as a single commit where all files of Rails master branch are added into "rails" directory. However the commit's title contains a reference to the old history tree:

Add 'rails/' from commit <rev>

Where <rev> is a SHA-1 commit hash. You can still see the history, blame some changes.

git log <rev>
git blame <rev> -- README.md

Note that you can't see the directory prefix from here since this is an actual old branch left intact. You should treat this like a usual file move commit: you will need an extra jump when reaching it.

# finishes with all files added at once commit
git log rails/README.md

# then continue from original tree
git log <rev> -- README.md

There are more complex solutions like doing this manually or rewriting the history as described in other answers.

The git-subtree command is a part of official git-contrib, some packet managers install it by default (OS X Homebrew). But you might have to install it by yourself in addition to git.


Here are two possible solutions:

Submodules

Either copy repository A into a separate directory in larger project B, or (perhaps better) clone repository A into a subdirectory in project B. Then use git submodule to make this repository a submodule of a repository B.

This is a good solution for loosely-coupled repositories, where development in repository A continues, and the major portion of development is a separate stand-alone development in A. See also SubmoduleSupport and GitSubmoduleTutorial pages on Git Wiki.

Subtree merge

You can merge repository A into a subdirectory of a project B using the subtree merge strategy. This is described in Subtree Merging and You by Markus Prinz.

git remote add -f Bproject /path/to/B
git merge -s ours --allow-unrelated-histories --no-commit Bproject/master
git read-tree --prefix=dir-B/ -u Bproject/master
git commit -m "Merge B project as our subdirectory"
git pull -s subtree Bproject master

(Option --allow-unrelated-histories is needed for Git >= 2.9.0.)

Or you can use git subtree tool (repository on GitHub) by apenwarr (Avery Pennarun), announced for example in his blog post A new alternative to Git submodules: git subtree.


I think in your case (A is to be part of larger project B) the correct solution would be to use subtree merge.


I had a similar challenge, but in my case, we had developed one version of the codebase in repo A, then cloned that into a new repo, repo B, for the new version of the product. After fixing some bugs in repo A, we needed to FI the changes into repo B. Ended up doing the following:

  1. Adding a remote to repo B that pointed to repo A (git remote add...)
  2. Pulling the current branch (we were not using master for bug fixes) (git pull remoteForRepoA bugFixBranch)
  3. Pushing merges to github

Worked a treat :)


I kept losing history when using merge, so I ended up using rebase since in my case the two repositories are different enough not to end up merging at every commit:

git clone [email protected]/projA.git projA
git clone [email protected]/projB.git projB

cd projB
git remote add projA ../projA/
git fetch projA 
git rebase projA/master HEAD

=> resolve conflicts, then continue, as many times as needed...

git rebase --continue

Doing this leads to one project having all commits from projA followed by commits from projB


I know it's long after the fact, but I wasn't happy with the other answers I found here, so I wrote this:

me=$(basename $0)

TMP=$(mktemp -d /tmp/$me.XXXXXXXX)
echo 
echo "building new repo in $TMP"
echo
sleep 1

set -e

cd $TMP
mkdir new-repo
cd new-repo
    git init
    cd ..

x=0
while [ -n "$1" ]; do
    repo="$1"; shift
    git clone "$repo"
    dirname=$(basename $repo | sed -e 's/\s/-/g')
    if [[ $dirname =~ ^git:.*\.git$ ]]; then
        dirname=$(echo $dirname | sed s/.git$//)
    fi

    cd $dirname
        git remote rm origin
        git filter-branch --tree-filter \
            "(mkdir -p $dirname; find . -maxdepth 1 ! -name . ! -name .git ! -name $dirname -exec mv {} $dirname/ \;)"
        cd ..

    cd new-repo
        git pull --no-commit ../$dirname
        [ $x -gt 0 ] && git commit -m "merge made by $me"
        cd ..

    x=$(( x + 1 ))
done

I've been trying to do the same thing for days, I am using git 2.7.2. Subtree does not preserve the history.

You can use this method if you will not be using the old project again.

I would suggest that you branch B first and work in the branch.

Here are the steps without branching:

cd B

# You are going to merge A into B, so first move all of B's files into a sub dir
mkdir B

# Move all files to B, till there is nothing in the dir but .git and B
git mv <files> B

git add .

git commit -m "Moving content of project B in preparation for merge from A"


# Now merge A into B
git remote add -f A <A repo url>

git merge A/<branch>

mkdir A

# move all the files into subdir A, excluding .git
git mv <files> A

git commit -m "Moved A into subdir"


# Move B's files back to root    
git mv B/* ./

rm -rf B

git commit -m "Reset B to original state"

git push

If you now log any of the files in subdir A you will get the full history

git log --follow A/<file>

This was the post that help me do this:

http://saintgimp.org/2013/01/22/merging-two-git-repositories-into-one-repository-without-losing-file-history/


If both repositories have same kind of files (like two Rails repositories for different projects), you can fetch data of the secondary repository to your current repository:

git fetch git://repository.url/repo.git master:branch_name

and then merge it to current repository:

git merge --allow-unrelated-histories branch_name

If your Git version is smaller than 2.9, remove --allow-unrelated-histories.

After this, conflicts may occur. You can resolve them for example with git mergetool. kdiff3 can be used solely with keyboard, so 5 conflict file takes when reading the code just few minutes.

Remember to finish the merge:

git commit

If you want to put the files from a branch in repo B in a subtree of repo A and also preserve the history, keep reading. (In the example below, I am assuming that we want repo B's master branch merged into repo A's master branch.)

In repo A, first do the following to make repo B available:

git remote add B ../B # Add repo B as a new remote.
git fetch B

Now we create a brand new branch (with only one commit) in repo A that we call new_b_root. The resulting commit will have the files that were committed in the first commit of repo B's master branch but put in a subdirectory called path/to/b-files/.

git checkout --orphan new_b_root master
git rm -rf . # Remove all files.
git cherry-pick -n `git rev-list --max-parents=0 B/master`
mkdir -p path/to/b-files
git mv README path/to/b-files/
git commit --date="$(git log --format='%ai' $(git rev-list --max-parents=0 B/master))"

Explanation: The --orphan option to the checkout command checks out the files from A's master branch but doesn't create any commit. We could have selected any commit because next we clear out all the files anyway. Then, without committing yet (-n), we cherry-pick the first commit from B's master branch. (The cherry-pick preserves the original commit message which a straight checkout doesn't seem to do.) Then we create the subtree where we want to put all files from repo B. We then have to move all files that were introduced in the cherry-pick to the subtree. In the example above, there's only a README file to move. Then we commit our B-repo root commit, and, at the same time, we also preserve the timestamp of the original commit.

Now, we'll create a new B/master branch on top of the newly created new_b_root. We call the new branch b:

git checkout -b b B/master
git rebase -s recursive -Xsubtree=path/to/b-files/ new_b_root

Now, we merge our b branch into A/master:

git checkout master
git merge --allow-unrelated-histories --no-commit b
git commit -m 'Merge repo B into repo A.'

Finally, you can remove the B remote and temporary branches:

git remote remove B
git branch -D new_b_root b

The final graph will have a structure like this:


If you're trying to simply glue two repositories together, submodules and subtree merges are the wrong tool to use because they don't preserve all of the file history (as people have noted on other answers). See this answer here for the simple and correct way to do this.


Merging 2 repos

git clone ssh://<project-repo> project1
cd project1
git remote add -f project2 project2
git merge --allow-unrelated-histories project2/master
git remote rm project2

delete the ref to avoid errors
git update-ref -d refs/remotes/project2/master

Similar to @Smar but uses file system paths, set in PRIMARY and SECONDARY:

PRIMARY=~/Code/project1
SECONDARY=~/Code/project2
cd $PRIMARY
git remote add test $SECONDARY && git fetch test
git merge test/master

Then you manually merge.

(adapted from post by Anar Manafov)


This function will clone remote repo into local repo dir, after merging all commits will be saved, git log will be show the original commits and proper paths:

function git-add-repo
{
    repo="$1"
    dir="$(echo "$2" | sed 's/\/$//')"
    path="$(pwd)"

    tmp="$(mktemp -d)"
    remote="$(echo "$tmp" | sed 's/\///g'| sed 's/\./_/g')"

    git clone "$repo" "$tmp"
    cd "$tmp"

    git filter-branch --index-filter '
        git ls-files -s |
        sed "s,\t,&'"$dir"'/," |
        GIT_INDEX_FILE="$GIT_INDEX_FILE.new" git update-index --index-info &&
        mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
    ' HEAD

    cd "$path"
    git remote add -f "$remote" "file://$tmp/.git"
    git pull "$remote/master"
    git merge --allow-unrelated-histories -m "Merge repo $repo into master" --edit "$remote/master"
    git remote remove "$remote"
    rm -rf "$tmp"
}

How to use:

cd current/package
git-add-repo https://github.com/example/example dir/to/save

If make a little changes you can even move files/dirs of merged repo into different paths, for example:

repo="https://github.com/example/example"
path="$(pwd)"

tmp="$(mktemp -d)"
remote="$(echo "$tmp" | sed 's/\///g' | sed 's/\./_/g')"

git clone "$repo" "$tmp"
cd "$tmp"

GIT_ADD_STORED=""

function git-mv-store
{
    from="$(echo "$1" | sed 's/\./\\./')"
    to="$(echo "$2" | sed 's/\./\\./')"

    GIT_ADD_STORED+='s,\t'"$from"',\t'"$to"',;'
}

# NOTICE! This paths used for example! Use yours instead!
git-mv-store 'public/index.php' 'public/admin.php'
git-mv-store 'public/data' 'public/x/_data'
git-mv-store 'public/.htaccess' '.htaccess'
git-mv-store 'core/config' 'config/config'
git-mv-store 'core/defines.php' 'defines/defines.php'
git-mv-store 'README.md' 'doc/README.md'
git-mv-store '.gitignore' 'unneeded/.gitignore'

git filter-branch --index-filter '
    git ls-files -s |
    sed "'"$GIT_ADD_STORED"'" |
    GIT_INDEX_FILE="$GIT_INDEX_FILE.new" git update-index --index-info &&
    mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
' HEAD

GIT_ADD_STORED=""

cd "$path"
git remote add -f "$remote" "file://$tmp/.git"
git pull "$remote/master"
git merge --allow-unrelated-histories -m "Merge repo $repo into master" --edit "$remote/master"
git remote remove "$remote"
rm -rf "$tmp"

Notices
Paths replaces via sed, so make sure it moved in proper paths after merging.
The --allow-unrelated-histories parameter only exists since git >= 2.9.


To merge a A within B:

1) In the project A

git fast-export --all --date-order > /tmp/ProjectAExport

2) In the project B

git checkout -b projectA
git fast-import --force < /tmp/ProjectAExport

In this branch do all operations you need to do and commit them.

C) Then back to the master and a classical merge between the two branches:

git checkout master
git merge projectA






git-subtree