How to squash git commits

Tuesday, November 10th, 2015

Lately I’ve been working on a project that uses git as for storing source code. I’ve previously written the fourth edition of Java Network Programming in asciidoc with all files checked into git, but that was a very different experience: single author, no branches, always working against master. In other words, it was much like my experiences with Perforce, Subversion, CVS, and (this is really dating me) RCS.

The new project is more traditional git: many branches, many developers, many forks. Perhaps the git/bitkeeper distributed model makes sense for projects like the Linux kernel where there are many independent repositories on many developers’ machines, none authoritative. However for a traditional single team, single repository project, git feels far too heavyweight and complex for my tastes. I find it slows me way down. However like most developers I’m slowly getting used to it, and developing my own small subset of the vast corpus of git functionality that I actually use.

Git is designed to support frequent commits, and pass change requests back and forth as lists of commits so the development work is tracked, rather than by passing file diffs back and forth like most other systems. Now what really confuses me though is that no one seems to actually use it this way. if you want to submit a patch, you do not in fact send the list of commits that shows the history and ongoing work. Instead you rebase everything against master and send a single commit that squashes all the changes together, which seems to be exactly what git is designed to make unnecessary. In other words, we’re using git as if it were a traditional single-master system such as Subversion. Why? And does any project actually expect developers to send their full list of commits rather than a single squashed commit?

(Side note: Perforce is the best of both worlds here. To my knowledge, Perforce and its clones are the only version control system that manages to separate out the ongoing work in a change list and the final commit, and show you both depending on what you want to see.)

Regardless of the wisdom of discarding all history before submitting, like removing all the scaffolding before publishing a mathematical proof, it is how almost all git-based projects operate. Like most (all?) operations in git, it is far from obvious how to actually squash a series of commits down so it’s one clean diff with the current master. And also like most operations in git, there are multiple ways to do this. What follows is the approach I’ve found easiest and most reliable:
(more…)