How to squash git commits

Lately I’ve been working on a project that uses git for storing source code. I’ve previously written the fourth edition of Java Network Programming in asciidoc with all files checked into git, but that was a very different experience: single author, no branches, always working against master. In other words, it was much like my experiences with Perforce, Subversion, CVS, and (this is really dating me) RCS.

The new project is more traditional git: many branches, many developers, many forks. Perhaps the git/bitkeeper distributed model makes sense for projects like the Linux kernel where there are many independent repositories on many developers’ machines, none authoritative. For a traditional single team, single repository project, however, git feels far too heavyweight and complex for my tastes. I find it slows me way down. Still, like most developers I’m slowly getting used to it and developing my own small subset of the vast corpus of git functionality that I actually use.

Git is designed to support frequent commits and to pass change requests back and forth as lists of commits, so that the development work is tracked, rather than passing file diffs back and forth like most other systems. What really confuses me, though, is that no one seems to actually use it this way. If you want to submit a patch, you do not in fact send the list of commits that shows the history and ongoing work. Instead you rebase everything against master and send a single commit that squashes all the changes together, which seems to be exactly what git is designed to make unnecessary. In other words, we’re using git as if it were a traditional single-master system such as Subversion. Why? And does any project actually expect developers to send their full list of commits rather than a single squashed commit?

(Side note: Perforce is the best of both worlds here. To my knowledge, Perforce and its clones are the only version control systems that manage to separate out the ongoing work in a change list and the final commit, and show you both depending on what you want to see.)

Regardless of the wisdom of discarding all history before submitting, like removing all the scaffolding before publishing a mathematical proof, it is how almost all git-based projects operate. Like most (all?) operations in git, it is far from obvious how to actually squash a series of commits down so it’s one clean diff with the current master. And also like most operations in git, there are multiple ways to do this. What follows is the approach I’ve found easiest and most reliable:

Update: As of April 1, 2016, this post is out of date for developers working on Github. Just use the new Confirm Squash and Merge button instead.

Assumptions:

  • Working on a branch named feature_branch, not a fork and definitely not master.
  • Github is central. (May not be relevant.)
  • “Head” is origin/master.
  • Only one developer works on a given branch at a time.

There may be other, implicit assumptions about workflow I don’t realize yet. E.g. I don’t know if this works with a non-Github system.
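Two of these assumptions can be checked mechanically before running anything destructive. What follows is only a sketch: the function name check_squash_assumptions is hypothetical (it is not part of git), and it covers just the "not on master" and "origin/master exists" assumptions, not the one-developer-per-branch rule.

```shell
# check_squash_assumptions: hypothetical helper, not a git built-in.
# Returns non-zero if the current checkout does not match the assumptions above.
check_squash_assumptions() {
  # Name of the branch that is currently checked out
  branch=$(git rev-parse --abbrev-ref HEAD) || return 1
  if [ "$branch" = "master" ]; then
    echo "refusing: you are on master, not a feature branch" >&2
    return 1
  fi
  # The squash resets against origin/master, so that ref has to exist locally
  if ! git rev-parse --verify --quiet origin/master >/dev/null; then
    echo "refusing: origin/master not found; try 'git fetch origin'" >&2
    return 1
  fi
  echo "ok: on $branch, will squash against origin/master"
}
```

Run it from inside the working copy, for example as `check_squash_assumptions && echo "safe to continue"`.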

Here’s the short version.

$ git checkout master
$ git fetch
$ git pull
$ git checkout feature_branch
$ git merge master
$ git reset origin/master
$ git add any/untracked/new/files
$ git commit -a -m "Here's what this feature does"
$ git push -f origin feature_branch

Finally, go to the github UI and merge origin/feature_branch into origin/master. Of course, this may change if your team has a different workflow or does not use github.

In more detail:

  1. First make sure master is up-to-date:

    $ git checkout master
    $ git fetch
    $ git pull
    
  2. Now merge master into the feature branch:

    $ git checkout feature_branch
    $ git merge master

    At this point, git may open your editor of choice with a proposed merge commit message. Just save and close it. (If git reports conflicts, resolve and commit them before going on.)

  3. Now here comes the magic. We’re going to throw away all the commits but leave the changes in place. A plain git reset moves the branch pointer to origin/master without touching the files in your working directory:

    $ git reset origin/master

  4. Now you have a bunch of changed but uncommitted files in your local repository. Any new files you added on the branch are now untracked, so add all of them:

    $ git add path/to/untracked/file1 path/to/untracked/file2 ...

  5. Commit the change:

    $ git commit -a -m "Here's what this feature does"

    The one thing you lose in this process is your old commit message, so you need to enter it again.

  6. Finally, force push your new clean commit to the server, overwriting the previous commits:

    $ git push -f origin feature_branch
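To watch the whole sequence work without risking a real project, here is a minimal sketch that runs it in a throwaway repository. Everything in it is made up for the demonstration: the temporary directory, the local bare repository standing in for Github, the file names, and the commit messages. Because the demo creates master itself, it skips the pull in step 1.

```shell
#!/bin/sh
# Throwaway demo: squash three work-in-progress commits into one.
# All names below (directories, files, messages) are hypothetical.
set -e

demo=$(mktemp -d)
git init -q --bare "$demo/origin.git"      # stands in for the central server
git init -q "$demo/work"
cd "$demo/work"
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is master
git config user.email "demo@example.com"
git config user.name "Demo Author"
git remote add origin "$demo/origin.git"

# A master branch with one commit, published to the stand-in server
echo "base" > README
git add README
git commit -q -m "Initial commit"
git push -q origin master

# Three messy commits on the feature branch
git checkout -q -b feature_branch
for n in 1 2 3; do
  echo "step $n" >> feature.txt
  git add feature.txt
  git commit -q -m "WIP $n"
done
git push -q origin feature_branch

# The squash itself, following steps 2 through 6 above
git fetch -q origin
git merge -q master                        # nothing new to merge in this demo
git reset -q origin/master                 # drop the commits, keep the files
git add feature.txt                        # the new file is untracked again
git commit -q -a -m "Add the feature"
git push -q -f origin feature_branch

# Count the commits on feature_branch that are not on origin/master
squashed_count=$(git rev-list --count origin/master..feature_branch)
echo "commits ahead of origin/master: $squashed_count"
```

The last line should report a single commit ahead of origin/master, and feature.txt still holds all three lines written by the discarded commits.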

There are other ways to squash git commits, in particular using the rebase command. However, in my experience rebasing gets very confusing after more than a few commits, especially if you’ve had to merge changes from other developers or branches into your feature_branch in the meantime. Resetting to origin/master effectively does a diff between your local branch and head, which is a lot easier to follow.
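If you want to see that diff before you throw the commits away, a small helper along these lines can print it. This is a sketch, not anything built into git: preview_squash is a hypothetical name, and it assumes you run it on the feature branch after merging master but before the reset.

```shell
# preview_squash [BASE]: hypothetical helper, not a git built-in.
# Shows what the single squashed commit would replace and contain.
preview_squash() {
  base=${1:-origin/master}                 # defaults to the head assumed above
  echo "Commits that will be folded together:"
  git log --oneline "$base..HEAD"
  echo "Files the single squashed commit will touch:"
  git diff --stat "$base" HEAD
}
```

Calling `preview_squash` with no argument compares against origin/master; pass another ref if your team's head lives elsewhere.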

8 Responses to “How to squash git commits”

  1. John Cowan Says:

    The process that $EMPLOYER is now using is “A Successful Git Branching Model”. It’s pretty good so far.

  2. Scott Carpenter Says:

    I’ve been working in a development shop for the past three years and have not experienced this kind of workflow. We just merge branches for the most part. Working on github with pull requests, that can give you a nice view of changes in a merge, but it’s still not necessary to do any of this funny business. I wouldn’t be surprised if some teams and projects do something like this, but I’ve yet to hear of it.

  3. Scott Carpenter Says:

    Right on, John. I’ve seen gitflow or lazier variations on it, for the most part.

  4. abra Says:

    Well, you saved my day. I was looking for exactly this sequence of commands:

    $ git checkout master
    $ git fetch
    $ git pull
    $ git checkout feature_branch
    $ git merge master
    $ git reset origin/master
    $ git add any/untracked/new/files
    $ git commit -a -m "Here's what this feature does"
    $ git push -f origin feature_branch

  5. On Request Says:

    What you are doing seems like you are fighting the tool and the flow. It seems nutty to me. If you told me at an interview what you were doing in git, I’d politely cut the interview short and leave. I think it would kill the deal for most people. It sounds like something from 1990.

    I’d suggest reading about using trunk development with feature flags.

    You should never merge manually into master. It should be backed by a CI system that enforces that the style guide hasn’t been violated, that the code has been properly linted, that all unit, integration, smoke, regression, functional tests have passed. You should not even allow a successful commit into local git that violates the coding style guide.

    Read the manual. Take the time to learn the tools. Don’t rant when the tool is actually doing the right thing and you just don’t get it yet. It’s ok, a lot of us old guys never do get it.

  6. On Request Says:

    if you checked code into master that had 100 commits into it, I swear I would strangle you.

  7. On Request Says:

    you aren’t using tagging at all.

  8. On Request Says:

    you have a mess.