Wednesday, September 30, 2009

(How not to) Use git to maintain local changes to an upstream codebase

In my day job, we use an open-source content management system called WebGUI. They recently switched to git to manage their source code. "Great!" I thought, "Git's decentralized nature will make it super easy for me to maintain the local additions and modifications we've made to WebGUI over the years while still being able to update to new upstream releases as they come out."

Yeah, right.

I still haven't figured out how to do this correctly or easily (see update below). My first attempt was to branch at the v7.7.20 upstream release tag on the webgui-7.7 branch. My branch is called webgui-7.7-pin and I committed all my local additions and modifications to that branch. I then pushed it out to an EC2 instance so my WebGUI servers running in EC2 can pull their code from it when they launch. I committed a few more changes over time, and I even committed a couple of bug fixes upstream and to my local branch (because I want the bug fixes now, not in the next release). So far so good. But then 7.7.21 came out.

I tried doing (on the webgui-7.7-pin branch):

git rebase v7.7.21

Tags are repository-wide, so git resolves v7.7.21 to the upstream release commit and rebases, or forward-ports, my local commits onto it. Git does this by setting my commits aside, moving the branch from the v7.7.20 tag (the original branch point) to v7.7.21, and then re-applying my commits in order on top of that new base. So I end up with exactly what I want: v7.7.21 with my local changes applied on top. Then I tried to push it to my "cloud" remote (the EC2 instance serving my production servers). The push is rejected because it's not a fast-forward. The re-applied commits are brand-new commits with new IDs, so the branch tip the cloud remote already has is no longer an ancestor of my rebased webgui-7.7-pin branch, and git refuses to replace it.
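
Here is a minimal sketch of that rejection, run in a throwaway sandbox rather than the real WebGUI repository. The directory layout, file names, and commit contents are all made up for illustration; only the branch and tag names come from the story above.

```shell
#!/bin/sh
# Sandbox sketch: paths, file names, and commits are invented for illustration.
set -e
work=$(mktemp -d)

# Stand-in for the "cloud" remote.
git init -q --bare "$work/cloud.git"

# One working repository that plays both upstream and my local clone.
git init -q "$work/site"
cd "$work/site"
git config user.email "me@example.com"
git config user.name "Me"
git symbolic-ref HEAD refs/heads/webgui-7.7
echo "release 20" > core.txt
git add core.txt
git commit -q -m "upstream 7.7.20"
git tag v7.7.20

# Branch at the release tag, add a local change, push it to cloud.
git checkout -q -b webgui-7.7-pin v7.7.20
echo "local tweak" > local.txt
git add local.txt
git commit -q -m "local modification"
git remote add cloud "$work/cloud.git"
git push -q cloud webgui-7.7-pin
old_tip=$(git rev-parse webgui-7.7-pin)   # what the cloud remote now has

# Upstream releases 7.7.21.
git checkout -q webgui-7.7
echo "release 21" > core.txt
git commit -q -am "upstream 7.7.21"
git tag v7.7.21

# Forward-port the local commit onto the new tag.
git checkout -q webgui-7.7-pin
git rebase -q v7.7.21

# A push is a fast-forward only if the remote's tip is an ancestor of mine.
if git merge-base --is-ancestor "$old_tip" webgui-7.7-pin; then
    fast_forward=yes
else
    fast_forward=no
fi

# Try the plain push and record whether the remote accepted it.
if git push -q cloud webgui-7.7-pin 2>/dev/null; then
    pushed=yes
else
    pushed=no
fi
echo "fast-forward: $fast_forward, push accepted: $pushed"
```

The `git merge-base --is-ancestor` check is the same question the remote asks before accepting a push, which is why both answers come out negative after the rebase.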

I haven't found much useful advice on the web. I'm really surprised this isn't a more common workflow pattern for folks using git. But maybe it is and I'm just going about it all wrong. Hopefully someone will smack me with a clue stick soon!

UPDATE: Many thanks to Haarg in #webgui for the clue stick beating! Turns out I was getting too big for my britches and trying to use rebase where a good ol' fashioned merge did the trick nicely. So here's the new workflow:

  1. Upstream releases a new version, tags it v7.7.22 (for example)
  2. I pull the latest changes into my local copy of the upstream branch:

    git checkout webgui-7.7
    git pull origin

  3. I now check out my local modifications branch:

    git checkout webgui-7.7-pin

  4. I then merge the new version tag into my local branch (tags are repository-wide, so git can resolve "v7.7.22" no matter which branch is checked out):

    git merge v7.7.22

  5. Resolve any conflicts the merge created (it happens), commit the result (not necessary if the merge created no conflicts), and then push to my cloud remote:

    git push cloud

  6. This time the push is a fast-forward, and the cloud remote happily accepts it. Hooray!
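
The steps above can be sketched end to end in a sandbox. This is an illustration under stated assumptions, not my production setup: the `merge_release` helper is hypothetical, and the `dev`, `site`, `origin.git`, and `cloud.git` directories, file names, and commits are invented stand-ins for upstream, my clone, and the two remotes.

```shell
#!/bin/sh
# Sandbox sketch: all paths, file names, and commits are invented for illustration.
set -e
work=$(mktemp -d)

# Hypothetical helper wrapping steps 2-5 for one release tag.
merge_release() {
    tag=$1
    git checkout -q webgui-7.7
    git pull -q origin                  # step 2: update the local upstream branch
    git checkout -q webgui-7.7-pin      # step 3: switch to the local-changes branch
    git merge -q --no-edit "$tag"       # step 4: exits non-zero if there are conflicts
    git push -q cloud webgui-7.7-pin    # step 5: a plain push, no --force
}

# Stand-ins for the upstream ("origin") and deployment ("cloud") remotes.
git init -q --bare "$work/origin.git"
git init -q --bare "$work/cloud.git"

# A scratch repository that plays the role of the upstream developers.
git init -q "$work/dev"
cd "$work/dev"
git config user.email "dev@example.com"
git config user.name "Upstream Dev"
git symbolic-ref HEAD refs/heads/webgui-7.7
echo "release 21" > core.txt
git add core.txt
git commit -q -m "upstream 7.7.21"
git tag v7.7.21
git push -q "$work/origin.git" webgui-7.7 v7.7.21

# My clone, with a pin branch created at the release tag and pushed to cloud.
git clone -q -b webgui-7.7 "$work/origin.git" "$work/site"
cd "$work/site"
git config user.email "me@example.com"
git config user.name "Me"
git remote add cloud "$work/cloud.git"
git checkout -q -b webgui-7.7-pin v7.7.21
echo "local tweak" > local.txt
git add local.txt
git commit -q -m "local modification"
git push -q cloud webgui-7.7-pin

# Step 1: upstream publishes and tags a new release.
cd "$work/dev"
echo "release 22" > core.txt
git commit -q -am "upstream 7.7.22"
git tag v7.7.22
git push -q "$work/origin.git" webgui-7.7 v7.7.22

# Steps 2-5 in my clone.
cd "$work/site"
merge_release v7.7.22

# Compare what the cloud remote holds with my local branch.
cloud_tip=$(git --git-dir="$work/cloud.git" rev-parse webgui-7.7-pin)
local_tip=$(git rev-parse webgui-7.7-pin)
echo "cloud: $cloud_tip"
echo "local: $local_tip"
```

Because the merge commit keeps the old branch tip as one of its parents, the cloud remote sees the push as a plain fast-forward. If the merge hits conflicts, `set -e` stops the script at step 4 so they can be resolved and committed by hand before pushing.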

1 comment:

  1. Just to add a bit of clarification on *why* the rebase workflow didn’t work: if you rebase, you basically alter history, and Git by default does not let you push rewritten history to a remote repository (which, from its point of view, is possibly ‘public’ and almost certainly ‘shared’), because then anyone who pulled from that repository (and maybe made their own changes) would have to rewrite their history to match yours.

    That said, if you do want to push a rebased branch, you can still do that with git push --force
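
A small sandbox sketch of what the forced push does, assuming nobody else has pulled the branch. The paths, file names, and commits are invented for illustration; only `git push --force` and the branch names come from the discussion above.

```shell
#!/bin/sh
# Sandbox sketch: paths, file names, and commits are invented for illustration.
set -e
work=$(mktemp -d)
git init -q --bare "$work/cloud.git"
git init -q "$work/site"
cd "$work/site"
git config user.email "me@example.com"
git config user.name "Me"
git symbolic-ref HEAD refs/heads/webgui-7.7
echo "release 20" > core.txt
git add core.txt
git commit -q -m "upstream 7.7.20"

# Local-changes branch, pushed to the stand-in cloud remote.
git checkout -q -b webgui-7.7-pin
echo "local tweak" > local.txt
git add local.txt
git commit -q -m "local modification"
git remote add cloud "$work/cloud.git"
git push -q cloud webgui-7.7-pin
old_tip=$(git rev-parse webgui-7.7-pin)

# New upstream commit, then rebase the pin branch onto it.
git checkout -q webgui-7.7
echo "release 21" > core.txt
git commit -q -am "upstream 7.7.21"
git checkout -q webgui-7.7-pin
git rebase -q webgui-7.7

# Overwrite the remote branch with the rewritten history.
git push -q --force cloud webgui-7.7-pin
new_tip=$(git --git-dir="$work/cloud.git" rev-parse webgui-7.7-pin)

# The old tip, which any earlier clone still has, is no longer in the branch.
if git merge-base --is-ancestor "$old_tip" "$new_tip"; then
    old_kept=yes
else
    old_kept=no
fi
echo "old tip still part of the branch: $old_kept"
```

The remote accepts the push, but the commit it used to point at drops out of the branch history, which is exactly the situation that forces everyone who pulled earlier to rewrite their own history.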