26 March 2023

Jan Koriťák

Hunting down regression using "git bisect"

git

Debugging

regression

Intro

Regression bugs are the worst! 😠 Especially when you've just introduced a huge change and have no clear idea where the regression stems from.

I'll show you a quick and reliable way to uncover regression bugs with a standard Git command that not many people use, which is a pity!

As this article discusses a very specific scenario in a developer's workflow, let me open with a real-world model scenario to give you some context.

You may not even need this article! If you're only here to read about git bisect's API, either jump to the bottom or head over to the Git docs. The point of this article is to discuss a real-life use case rather than duplicate the Git docs.

A real-world scenario

Storytime! 📚

Imagine you're working on a fancy new feature that you and the product team are very excited about. Once deployed, the feature is gonna bring a lot of value to your customers, moving you another tiny bit ahead of the competition. And you've been part of that! Exciting! 👏

You've spent days, maybe even weeks, working on and perfecting the feature. The day finally comes. You receive the last required approval on your polished pull request. Proudly, you hit the "Merge PR" button and calmly watch the CI/CD pipeline take care of the deployment while sipping your cup of victorious coffee. ☕ You pat yourself on the back and, feeling accomplished, head home.

... fast-forward to the next morning. 🔅

You open your laptop to see a bunch of worrying messages from the product folks. Apparently, while shipping the awesome new feature, a bug 🐛 snuck through all the tests and even the code review! Now your customers are complaining that something that used to work no longer does. You gulp and check out the feature branch you hoped to never see again.

The root cause is not trivial and this is the commit history of the branch:

2bd81f7 chore: Initial commit
fd1c7ae feat: storing search state in user settings
03d78e refactor: basic tag autosuggest implementation
a2c6f9b feat: improve tag autosuggest algorithm
c1b2e7d chore: refactor search filter implementation
9e6c1a8 fix: correct fuzzy search implementation for tags
6c7d2b2 feat: add search suggestions to input field
e1f9a4d refactor: search input component reorganization
8a3b0c9 feat: add support for searching by date
7dfe573 test: add search performance testing
4a7b8f1 feat: integrate search input with external API
3a8bcb4 chore: add keyboard shortcuts for search input
b50b832 feat: allow searching within search results
d0e2b8f feat: add support for searching by category
5ab5e15 fix: improve search input accessibility
e7ba51c feat: add option to save search queries
4f1df68 test: add search input validation testing
1a2e82c feat: add ability to search within specific fields
0c38d2a feat: implement search input highlighting
e0ef1c6 feat: add support for searching within attachments
6d7e25d refactor: improve autocomplete for search input
9d7fae9 feat: add support for searching by location
6c7f6b3 test: implement search input throttling testing
3f49862 feat: add search input to mobile interface
1cbb0c2 refactor: implement search input suggestions from user history
7b642f1 chore: improve search input placeholder text
3e9a8f8 feat: add support for searching within shared documents
1d4e4d7 refactor: implement autocomplete for search input filters
f52973a feat: add support for searching within comments
7b64df1 fix: improve search input styling and layout

(Don't over-analyze the commit history, it's ChatGPT-generated). Here's the prompt, for reference.

If only you knew where to start...

Taking a naive approach

If you ask me, that is quite an intimidating number of commits. And if the commits are not single-purpose or close to atomic, the diff is very likely not gonna be small either.

Since we have no idea where the sneaky bug stems from, it's important to realize that we're partially relying on chance to discover it. Therefore, debugging by browsing the branch and inserting a bunch of pseudo-randomly placed console.log or debug statements while clicking through the app would be very ineffective here.

After all, you're an engineer and there must be a systematic approach, right?


Tilting the odds in our favor

As a general rule, if you're relying on chance, you'd better tilt the odds in your favor. How do we do that? We reduce the size of the faulty diff to an absolute minimum.

What's the smallest primitive we can work down to in a Git-versioned repository? You guessed it, it's a commit. In other words, instead of digging through the diff of the whole branch, we want to be digging through the diff of a single commit.

That sounds like less of a headscratcher, right? 🤔

Leveraging "git bisect"

What it is

git bisect is obviously a Git command and does exactly what we defined in the previous section. It helps us reduce the code to dig through by systematically identifying the first bad commit ("bad" is a terminus technicus here) in Git history.

The process happens in a controlled, iterative fashion (similar to a wizard 🧙), using simple interval-halving, a.k.a. bisection.

How does the command work

When you trigger git bisect, the runner requests two inputs.

❌ A commit (hash) that you know is bad - meaning "is broken".
✅ A commit (hash) that you know is good - meaning "works fine".

Once you supply these two interval borders, the runner takes over. Iteratively, it checks out commits and asks you whether things are broken or just fine on that particular commit.

Your only job is to re-run your test scenario, e.g.

Refresh a broken application UI and test the functionality
Re-run the failing test case and check whether it passes
Execute a script and see if it returns 0 this time
... depends on your environment
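If your test scenario can be scripted, Git can also drive the whole loop for you: git bisect run executes a command on every commit it checks out and reads the verdict from the exit code (0 means good, 1 to 127 except 125 means bad, 125 means "skip this commit"). Here's a minimal sketch of such a check script. The file name check.py and the npm commands are assumptions for illustration only, so swap in whatever builds and tests your project.

```python
import subprocess
from typing import Sequence

# Exit codes understood by `git bisect run`.
GOOD = 0    # the commit works
BAD = 1     # the commit is broken (any code 1-127 except 125 counts as bad)
SKIP = 125  # the commit cannot be judged, e.g. it does not even build

# Hypothetical commands, not taken from the article. Replace with your own.
BUILD_CMD = ["npm", "run", "build"]
TEST_CMD = ["npm", "test"]


def verdict(build_ok: bool, test_ok: bool) -> int:
    """Translate the build and test outcome into an exit code for the runner."""
    if not build_ok:
        return SKIP
    return GOOD if test_ok else BAD


def succeeded(cmd: Sequence[str]) -> bool:
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(cmd).returncode == 0


def main() -> int:
    build_ok = succeeded(BUILD_CMD)
    # Only bother running the tests when the build went through.
    test_ok = build_ok and succeeded(TEST_CMD)
    return verdict(build_ok, test_ok)
```

To use it as a script, end the file with sys.exit(main()) (after import sys), mark the interval borders as usual with git bisect start, and then hand over with git bisect run python3 check.py.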

With each step, it's your job to tell the bisect runner, if the commit is good or bad.

That's it! Since bisection is just another name for binary search, you'll locate the broken commit in roughly log2(number_of_commits) steps. 😎

Looking through a history of 8 commits? You'll know the answer in 3 steps.
64 commits? 6 steps!
Even if you're digging through as many as 1024 commits, you'll know in 10 steps.
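If you'd like to double-check these numbers, here's a tiny sketch that computes the worst-case number of good/bad answers for a given number of suspect commits. The function name bisect_steps is made up for this example, and the estimate Git itself prints ("roughly N steps") can differ slightly from this idealized count.

```python
def bisect_steps(candidates: int) -> int:
    """Worst-case number of good/bad answers needed to single out one commit
    among `candidates` suspects, i.e. ceil(log2(candidates))."""
    if candidates < 1:
        raise ValueError("need at least one candidate commit")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1 and avoids
    # floating-point rounding.
    return (candidates - 1).bit_length()


for n in (8, 64, 1024):
    print(f"{n} commits -> {bisect_steps(n)} steps")
```

For 8, 64 and 1024 commits this gives 3, 6 and 10 steps. Doubling the history costs you just one extra step.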

You've probably seen a log2(n) chart, right?

A practical example - Visual

Let's take the series of commits from above. Here's a little animation of how git bisect locates the first bad commit.

Let's take 0c38d2a as the broken commit we want to identify. Below, you can watch how we bisect between 2bd81f7 (good) and 7b64df1 (bad) all the way to the culprit.

📽️ If the embedded animation is too small, feel free to click through this link for a full-screen high-res version! Hopefully, this animation says more than a thousand words.

A practical example - CLI

In case you're a more hands-on type of person, let's also analyze the whole sequence of commands that leads us to the culprit (0c38d2a), for completeness. If you got the idea from the animation, feel free to skip this. It's gonna be very linear.

Let's go over the command sequence. We start off by asserting the borders of the interval by passing a broken commit and a working commit. In this case HEAD is broken, but 2bd81f7 works just fine.

👨‍💻 git bisect start HEAD 2bd81f7

Git acknowledges the interval borders and checks out a new commit - 5ab5e15. It then informs us that the culprit will be identified in roughly 4 steps. Great!

🤖 Bisecting: 14 revisions left to test after this (roughly 4 steps) [5ab5e15] fix: improve search input accessibility

We re-run the target (application, test, script, ...) and assess the commit as good.

👨‍💻 git bisect good

Git acknowledges the good commit and checks out further, to 9d7fae9.

🤖 Bisecting: 7 revisions left to test after this (roughly 3 steps) [9d7fae9] feat: add support for searching by location

We re-run the target (application, test, script, ...) and assess the commit as bad.

👨‍💻 git bisect bad

Git acknowledges the bad commit and checks out further, to 1a2e82c.

🤖 Bisecting: 3 revisions left to test after this (roughly 2 steps) [1a2e82c] feat: add ability to search within specific fields

We re-run the target (application, test, script, ...) and assess the commit as good.

👨‍💻 git bisect good

Git acknowledges the good commit and checks out further, to e0ef1c6.

🤖 Bisecting: 1 revision left to test after this (roughly 1 step) [e0ef1c6] feat: add support for searching within attachments

We re-run the target (application, test, script, ...) and assess the commit as bad.

👨‍💻 git bisect bad

Git acknowledges and checks out the final commit - 0c38d2a.

🤖 Bisecting: 0 revisions left to test after this (roughly 0 steps) [0c38d2a] feat: implement search input highlighting

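We re-run the target one last time and assess this commit as bad, too (git bisect bad). With that, we've evaluated all necessary commits and the process ends: Git reports 0c38d2a as the first bad commit, so now we know which commit we need to dig through! Running git bisect reset afterwards takes you back to where you started.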
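For the curious, the walkthrough above can be replayed with a few lines of plain binary search. This is only a sketch under simplifying assumptions: the history is modelled as a flat list (oldest first), only the commits named in the walkthrough carry their real short hashes, and the "re-run the target" step is replaced by a stand-in function. Git's actual implementation works on the commit graph and picks its midpoints in its own way; for this linear history, simple halving happens to visit the same commits.

```python
from typing import Callable, List, Tuple

# Positions (1 = oldest, 30 = HEAD) of the commits named in the walkthrough.
NAMED = {1: "2bd81f7", 15: "5ab5e15", 18: "1a2e82c", 19: "0c38d2a",
         20: "e0ef1c6", 22: "9d7fae9", 30: "7b64df1"}
# All other commits get a placeholder label such as "commit-7".
HISTORY = [NAMED.get(i, f"commit-{i}") for i in range(1, 31)]


def find_first_bad(history: List[str],
                   is_bad: Callable[[str], bool]) -> Tuple[str, List[str]]:
    """Return the first bad commit and the commits tested along the way.

    Assumes history[0] is good, history[-1] is bad, and every commit after
    the first bad one is bad as well.
    """
    good, bad = 0, len(history) - 1  # indices of the known-good/known-bad borders
    tested = []
    while bad - good > 1:
        mid = (good + bad) // 2
        tested.append(history[mid])
        if is_bad(history[mid]):
            bad = mid    # the culprit is at mid or earlier
        else:
            good = mid   # the culprit comes after mid
    return history[bad], tested


# Stand-in for "re-run the target": everything from 0c38d2a onwards is broken.
culprit_index = HISTORY.index("0c38d2a")
culprit, tested = find_first_bad(
    HISTORY, lambda commit: HISTORY.index(commit) >= culprit_index
)

print("tested:", " -> ".join(tested))
print("first bad commit:", culprit)
```

The tested commits come out as 5ab5e15, 9d7fae9, 1a2e82c, e0ef1c6 and 0c38d2a, the same five verdicts we gave by hand.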

I'll now leave you in peace to debug your broken commit! 😊

Final words

The whole idea of using git bisect for debugging, or rather for tracing regressions, is to isolate the smallest possible piece of code that we can reliably identify as faulty. The smaller the diff, the easier it should be to locate the culprit code.

Now we know that git bisect can take care of this in a mere log2(number_of_commits) steps. Even if we're working with a large sequence of, say, 1024 commits, we can trace the regression bug in a mere 10 steps.

Hope you learned something new today and you'll think twice before trying to debug large branches in the future.