How to Find Bugs With Divide and Conquer
If I could learn only one thing about debugging, I’d learn to divide and conquer.
Divide and Conquer
Divide and conquer is an algorithm design paradigm. It works by recursively breaking a problem into two (or more) smaller problems of a similar nature, solving those, and combining the results. If you can solve the small problems, you can solve the big one too. Classic examples include merge sort, quicksort, and finding the closest pair of points.
But divide and conquer is more than that. It’s a mindset you can use to debug your apps.
Debugging is a tiring task for many developers. You need to know your intended inputs and outputs. You need to understand the code and the flow of the app. Often finding a mistake is much harder than fixing it.
Countless programmers hate debugging for another reason. We prefer making things to fixing them. It’s a lot more satisfying to create new code than to spend half a day hunting for an error and finally changing two lines.
The naive approach to finding bugs is going through the code line by line, logging data, or setting breakpoints. You can use that in trivial apps or if you know almost exactly where the mistake is.
But typically, you need to debug a big application, frequently before you can familiarize yourself with the codebase. Making tons of logs will take forever. You have to do better than that.
Basic Divide and Conquer
The most straightforward way to use divide and conquer in debugging is to split the code path in half and add a log at the midpoint. If the log runs and prints correct data, we can assume that everything up to that point works fine, so the error is in the second half. If not, the error is in the first half.
Next, we take the suspicious half of the code and put another log in the middle. We repeat the procedure until we discover the issue. Each time we cut the remaining code in half, so even a thousand lines need only about ten checks. That’s a lot better than logs all over the code, but we can go a step further.
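The halving procedure can be sketched as a binary search over the steps of a program. This is a minimal illustration, not a general-purpose tool: the pipeline, the expected values, and the names findFirstBadStep and isGoodAfter are all made up for the example. It also assumes that once the data goes wrong it stays wrong, and that the final result is known to be bad. In real debugging, isGoodAfter is you reading a log at that point.

```javascript
// Binary search for the first step whose output is wrong.
// Assumes: the last step's output is known to be bad, and data that
// has gone bad never becomes good again later in the pipeline.
function findFirstBadStep(stepCount, isGoodAfter) {
  let lo = 0;             // earliest step that could be the culprit
  let hi = stepCount - 1; // latest step that could be the culprit
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (isGoodAfter(mid)) {
      lo = mid + 1; // everything up to mid is fine: search the second half
    } else {
      hi = mid;     // already broken at mid: search the first half
    }
  }
  return lo;
}

// Hypothetical five-step pipeline with a bug planted in step 2.
const stages = [
  (s) => s.trim(),
  (s) => s.split(","),
  (parts) => parts.map((p) => Number(p) * 10), // BUG: stray "* 10"
  (nums) => nums.filter((n) => n > 0),
  (nums) => nums.reduce((a, b) => a + b, 0),
];
const input = " 1,2,3 ";
// What each step should produce if everything worked.
const expected = ["1,2,3", ["1", "2", "3"], [1, 2, 3], [1, 2, 3], 6];

// "Put a log after step i": run steps 0..i and compare with the expectation.
function isGoodAfter(i) {
  let value = input;
  for (let step = 0; step <= i; step++) {
    value = stages[step](value);
  }
  return JSON.stringify(value) === JSON.stringify(expected[i]);
}

const culprit = findFirstBadStep(stages.length, isGoodAfter);
console.log("First bad step:", culprit); // First bad step: 2
```

Here two checks are enough to single out step 2 among five steps.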
Educated Guesses and Divide and Conquer
Imagine your app consists of thousands of lines. It’s hard to find the middle, and you’ll need a dozen or more logs to discover the culprit. Now is the time to take a step back and think before adding logs.
Start with answering a few questions and making some hypotheses.
- What broke? Did a string transformation fail, is the returned data of a different type than you expected, or is it something else?
- Do you have some tests or checked parts of the code? If so, the bug probably isn’t there.
- When did the bug occur? Did everything work fine before? What changed? Maybe inputs are different now, or maybe the code worked two commits back.
When you answer these questions, you’re ready to make an educated guess about the nature of the bug.
Every assumption can prove wrong, but your goal isn’t to be correct. It’s to find the most probable culprit so you can pinpoint the suspicious parts of code.
When you have a few hypotheses, zoom in on the most likely and use divide and conquer. If you find the bug, that’s great. If not, proceed to the next hypothesis.
I’ll show you a real-life example of divide-and-conquer debugging. The code below takes an HTML form and creates a PNG image of it. Later, it sends both the image and the form data to the server and redirects to another page.
- toPng takes the HTML element and returns an image (base64-encoded data)
- dataUrltoFile takes base64-encoded data and returns a file
- saveDraft sends the data to the server
- setRedirect redirects to another page
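As a rough illustration of how the four functions described above could fit together, here is a hypothetical sketch. It is not the original code: the function bodies are stand-in stubs so the example runs anywhere, and submitForm, formElement, formAsImage, the file name, and the redirect path are assumptions. Its line numbers do not match the ones mentioned in the walkthrough.

```javascript
// Stand-in stubs (assumptions): the real functions did the work
// described in the list above.
const toPng = async (element) => "data:image/png;base64,AAAA";
const dataUrltoFile = (dataUrl, filename) => ({ name: filename, dataUrl });
const saveDraft = async (payload) => ({ ok: true, received: payload });
const redirects = [];
const setRedirect = (path) => redirects.push(path);

// Hypothetical flow: image -> file -> server -> redirect.
async function submitForm(formElement, formData) {
  const formAsImage = await toPng(formElement);              // HTML element to base64 image
  const formAsFile = dataUrltoFile(formAsImage, "form.png"); // base64 data to file
  const response = await saveDraft({ formData, formAsFile }); // send both to the server
  if (response.ok) {
    setRedirect("/next-page");                               // move on to another page
  }
  return response;
}
```

Each intermediate value (formAsImage, formAsFile, response) is a natural checkpoint for a log when splitting the flow in half.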
The code worked well in Chrome, but I had a problem with Safari. The program didn’t return anything — no data, no error.
My (naive) approach
I logged the response right before the middle of the code, between lines 4 and 5. The log never ran, so the issue was above.
My next step was to log formAsFile between lines 2 and 3. That log didn’t run either.
Now I had my answer: the bug was hidden in the first line. The function toPng never finished.
In the first place, I should have analyzed why I didn’t get an error. I should have gotten one if there were issues with the request to the server. So the most probable place of failure was in lines 1 or 2. A simple log in the second line would suffice to tell me I never got the image out of the HTML.
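A promise that never settles produces exactly this symptom: no data and no error. One way to make such a hang visible is to race the call against a timer. The sketch below is an assumption-laden illustration, not part of the original code: withTimeout and hangingToPng are hypothetical names, and the stub only imitates a conversion that never finishes.

```javascript
// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} did not settle within ${ms} ms`)),
      ms
    );
  });
  // Whichever settles first wins; always clean up the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical stand-in for a conversion that hangs forever.
const hangingToPng = () => new Promise(() => {});

withTimeout(hangingToPng(), 50, "toPng")
  .then((image) => console.log("Got image:", image))
  .catch((error) => console.log("Caught:", error.message));
```

With a guard like this, a silent hang turns into an explicit error that names the suspicious call.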
Divide and conquer is a handy approach to finding bugs. You need it in your toolkit.
And if you’re wondering what the underlying bug was — the HTML element was too big for Safari to convert to the image.