Even though everybody uses git nowadays, people sometimes struggle to use it efficiently. As git commit
is the most important git command, let me present some guidelines which we found useful.
1) Commit frequency
How often should you commit? There’s a simple rule:
Commit early and often
Why? There are several reasons for that:
For your team
Visibility: If your work lives only on your local machine, nobody else can see what you are currently working on. Nor can anybody pick up your task if you get sick or have to stop working on it for any other reason.
Project / feature history: Many small commits with useful commit messages also work as a kind of documentation. Your teammates can get an impression of the work that has been done by looking at the git history.
For yourself
Backup: If you work on a task for several days without pushing to the central repository, you risk losing your work if your PC goes down (assuming you have no other backup system in place).
Safety net: Let’s say you copy and paste into the wrong editor window. How do you recover? Sure, all editors have some “undo” function, but often it only works back to the last time you hit “Save”. Committing early and often gives you a comfortable safety net which allows you to always go back to the latest committed version in case of an accident (or if your brilliant refactoring idea turns out to be not so brilliant).
Status: After pausing for some days or weeks, you will find it easier to remember which tasks have been done and which have not by looking at your git history.
Bad practices
- One single commit at the end of the day: I have sometimes seen people doing one big commit at the end of their work day. This ensures there’s a backup, but you do not get the other benefits mentioned above.
- Local commits without git push: There’s actually no reason why a commit should not be pushed. If you have commits which would put the application in a “not ready yet” state, use feature branches, or – preferably – feature toggles.
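As a rough sketch of the feature-branch approach, the following commands put unfinished work on its own branch and push it right away. Everything here is illustrative: the sandbox directory, the bare repository standing in for the central server, the branch name feature/csv-export and the file name are all assumptions made for the demo.

```shell
# Throwaway sandbox: a bare repository stands in for the central server.
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/central.git"
git init --quiet "$sandbox/work"
cd "$sandbox/work"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$sandbox/central.git"

# Unfinished work goes onto its own branch (hypothetical name) ...
git checkout --quiet -b feature/csv-export
echo "export stub" > export.txt
git add export.txt
git commit --quiet -m "Add stub for CSV export"

# ... and is pushed immediately, so it is backed up and visible to the team.
git push --quiet -u origin feature/csv-export

# List the branches the central repository now knows about.
git ls-remote --heads origin
```

The main branch stays untouched until the feature branch is merged, so pushing early does not put the application into a “not ready yet” state.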
2) How to group changes into commits
If you pay a little attention to how commits are grouped, your commit history becomes easy to read and can even serve as a kind of documentation. To achieve this, it’s important to group related changes together, for example:
- Changes for a bugfix
- Changes for a new feature
- Refactoring
- Changes triggered by a linter (code conventions / formatting)
- Whitespace changes
It’s quite helpful to separate refactoring changes from feature development changes. When browsing through the commit history later on, it’s much easier to understand why a particular change was made.
The same applies to whitespace and code formatting changes. They are usually less interesting and can easily be skipped if they happen in a separate commit. On the other hand, if you group code formatting changes and feature changes together, the diff becomes harder to understand.
Worst of all is grouping changes for two or more unrelated features together. The git diff becomes quite big, and you have to read it line by line to understand which change belongs to which feature.
The Linux kernel project has similar guidelines.
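To illustrate the grouping idea, here is a small sketch in a throwaway repository where a formatting change and a feature change sit in the working tree at the same time and are committed separately. The file names and commit messages are made up for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Demo User"
git config user.email "demo@example.com"
printf 'total=0\n' > report.sh
printf 'name = "demo"\n' > settings.ini
git add report.sh settings.ini
git commit --quiet -m "Add report script and settings"

# Two unrelated edits end up in the working tree:
printf 'total=0\necho "$total"\n' > report.sh   # feature change
printf 'name="demo"\n' > settings.ini           # formatting only

# Stage and commit each concern on its own.
git add settings.ini
git commit --quiet -m "Formatting: remove spaces around '=' in settings.ini"
git add report.sh
git commit --quiet -m "Print the total at the end of the report"

# One line per commit, newest first.
git log --oneline
```

When both kinds of change live in the same file, git add -p lets you pick individual hunks interactively instead of staging the whole file.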
3) When to commit
While you should commit early and often, you should only commit when it makes sense. I’ve seen people abusing version control as a backup solution by committing every x minutes (or hours), regardless of the current state. Sometimes the question also comes up whether committing should be allowed while the code includes known bugs or even compile errors. From my point of view, this is a bad idea. When somebody checks out the “bad” version later on, it’s unclear whether there’s a real bug or the code is just in an “in progress” state. Therefore, we always set these guidelines: whenever you commit,
- there should be no compilation errors
- tests should pass (unit tests etc.)
- linters and other code analyzers should not complain
On the other hand, your feature doesn’t have to be complete (see above).
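One possible way to enforce such guidelines is a pre-commit hook: git runs .git/hooks/pre-commit before each commit and aborts the commit if the hook exits with a non-zero status. The sketch below is only an illustration in a throwaway repository. The check.sh script and its FIXME-BROKEN marker are hypothetical stand-ins for a real compiler, test runner and linter, and for simplicity the check looks at the working tree rather than the staged content.

```shell
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical project check; a real project would call its build,
# tests and linters here.
cat > check.sh <<'EOF'
#!/bin/sh
# Fail while the marker "FIXME-BROKEN" is still present in app.txt.
! grep -q "FIXME-BROKEN" app.txt
EOF
chmod +x check.sh

# Install the hook: a non-zero exit status aborts the commit.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
exec ./check.sh
EOF
chmod +x .git/hooks/pre-commit

# First attempt: the code is in a broken state.
echo "FIXME-BROKEN" > app.txt
git add app.txt check.sh
if git commit --quiet -m "Add app" 2>/dev/null; then first=accepted; else first=rejected; fi

# Second attempt: the problem has been fixed.
echo "working" > app.txt
git add app.txt
if git commit --quiet -m "Add app" 2>/dev/null; then second=accepted; else second=rejected; fi

echo "broken state: $first, fixed state: $second"
```

Hooks are local to each clone, so a team that relies on them needs some way to distribute them; running the same checks on a build server is a common complement.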
4) Commit messages
We prefer commit messages of one or two sentences (roughly, 15 - 100 characters), summarizing what has been changed. Some things you should consider:
- Most git hosting tools like GitHub, Bitbucket etc. have a “commits” page where all commits are listed, showing the most recent commits first. There’s usually one row per commit message, and it displays only the first 60 characters or so. So the key point of your commit message should already be included in the first 60 characters.
- While scanning through the git history, you would usually read only what’s displayed on the “commits” page directly. You would not click on each commit to expand the full message.
- On the other hand, the commit message shouldn’t be too short either. It should give the reader a good idea of what the commit is about.
- It’s quite helpful to have a convention that bugfix commits always start with “Bugfix: ...”, refactorings always start with “Refactoring: ...” etc.
- In case you use a ticket system (Jira, YouTrack, etc.), you should include the ticket number in the message.
- Make sure your message is informative. “Fixed weird bug” only tells that the commit is a bugfix, but which bug? Why is it “weird”?
- Another bad practice is to repeat the code changes, e.g. “Added method 'write()' to class A”. These messages are very hard to understand, particularly if you use generic names like write(), generate(), send() etc. In this case, the commit message is more or less the same as the diff, which makes it less useful.
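Some of these conventions can be checked mechanically. The function below is a minimal sketch, not a complete validator: it only checks the 15 to 100 character range and the presence of a ticket number. The function name check_commit_message and the ticket format (uppercase project key, dash, number, such as SHOP-482) are assumptions for illustration. Whether a message is actually informative still needs a human reader.

```shell
# Returns 0 if the message follows the two mechanical conventions, 1 otherwise.
check_commit_message() {
    msg=$1
    len=${#msg}
    # Roughly 15 to 100 characters, as suggested above.
    if [ "$len" -lt 15 ] || [ "$len" -gt 100 ]; then
        echo "message should be 15-100 characters, got $len" >&2
        return 1
    fi
    # Assumed ticket format: uppercase key, dash, number (e.g. SHOP-482).
    if ! printf '%s\n' "$msg" | grep -Eq '[A-Z]+-[0-9]+'; then
        echo "message should mention a ticket number" >&2
        return 1
    fi
    return 0
}

# Passes: long enough and names a (made-up) ticket.
check_commit_message "Bugfix: SHOP-482 cart total ignored discount codes" && echo "ok"

# Fails: no ticket number, and it does not say which bug was fixed.
check_commit_message "Fixed weird bug" || echo "rejected"
```

A check like this could be called from a commit-msg hook, which git invokes with the path of the file containing the proposed message.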
5) The power of annotate
The recommendations given above give you a nice history you can read through on GitHub’s / Bitbucket’s “commits” page. However, they also let you leverage the power of another – often overlooked – git command: git annotate.
For a given file, annotate basically gives you the most recent commit information for each line. Most editors have some annotate command (GitHub and Bitbucket call it blame).
If you followed the guidelines above, annotate magically gives you a kind of documentation for your source file. For each method or statement, you are able to see:
- Who changed the line the last time?
- When was it changed the last time?
- What was the reason for the change? What was the context of the change?
Particularly for legacy projects, the annotate view can be extremely helpful.
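On the command line, the same view is available through git blame (git annotate shows the same information in a slightly different output format). The following sketch builds a tiny throwaway repository with two commits and then asks which commit last touched a given line. The file name, its contents and the commit messages are invented for the demo.

```shell
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'timeout=30\nretries=3\n' > client.conf
git add client.conf
git commit --quiet -m "Add default client settings"

printf 'timeout=30\nretries=5\n' > client.conf
git commit --quiet -am "Bugfix: retry more often on flaky connections"

# For each line: abbreviated commit, author, date and content.
git blame client.conf

# Find the commit that last touched line 2 and show its message.
last_commit=$(git blame -L 2,2 --porcelain client.conf | head -n 1 | cut -d ' ' -f 1)
git log -1 --format=%s "$last_commit"
```

With informative commit messages, the last command answers the “why” question for that line directly.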