I read Debugging: The 9 Indispensable Rules for Finding Even the Most Elusive Bugs by David J. Agans quite a long time ago, and I am still fascinated by how timeless and universal its rules are. As software engineers (and technicians in general), we struggle every day to find the source of a problem. Without the right attitude, hunting for the cause will take you a lot of time and leave you frustrated and unsuccessful. This book gives you tools (the 9 rules) and stories of how other people used them to find solutions quickly and efficiently.
I think it was one of the most important books in my technical career. I can highly recommend it.
Below I present the 9 rules from the book. The rules themselves are quoted from the book; I've just added short explanations. Trust me, buy and read the book, don't stop after reading this short summary 🙂
- Understand the system – read the docs/manual, it will help you understand the system better.
- Make it fail – if you can’t make it fail while you are ‘looking at it’, you can’t focus on the cause. Make an intermittent bug repeatable.
- Quit thinking and look – you need to see the exact moment/command at which the system fails. Look at the verbose log, open up the library’s source, go as deep as you can and see the cause.
- Divide and conquer – act like a binary search machine when you’re after the bug and keep narrowing the range. Determine which side of the bug you are on. Start with an easy example, something uncomplicated that should work, and isolate the case you’re debugging.
- Change one thing at a time – don’t change many things at once; debugging is like shooting with a sniper rifle, not a shotgun. Systems (including IT systems) are deterministic: the issue in front of you has one cause. Once you solve it, you can move on to the next one (if there is any).
- Keep an audit trail – write down what you did, it will help you understand the flow of the debugging process. It will also let you undo the steps that didn’t help.
- Check the plug – are you 100% sure that this part of the system/code is the problem? Did you try replacing it with ‘working code’?
- Get a fresh view – ask colleagues, make sanity checks, don’t be proud, get some sleep, grab a coffee.
- If you didn’t fix it, it ain’t fixed – if you roll back your fix, does the bug come back? If not, you didn’t fix it.
If you prefer, you can download the poster from David Agans’ site (link to the poster) 🙂
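To make the ‘binary search machine’ from the Divide and conquer rule a bit more concrete, here is a minimal Python sketch. It is my own illustration, not something taken from the book. It assumes you have an ordered list of candidates (builds, commits, config versions) where everything before the culprit is good and everything from the culprit onward is bad. The names `first_bad`, `builds` and `build_is_broken` are made up for this example; in real life the check would be your repeatable failing test.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad(candidates: Sequence[T], is_bad: Callable[[T], bool]) -> T:
    """Return the first candidate for which is_bad() is true.

    Assumption: candidates are ordered good, ..., good, bad, ..., bad,
    and the last candidate is known to be bad.
    """
    if not candidates:
        raise ValueError("nothing to search")
    lo, hi = 0, len(candidates) - 1  # invariant: candidates[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(candidates[mid]):
            hi = mid        # the culprit is at mid or earlier
        else:
            lo = mid + 1    # the culprit is somewhere after mid
    return candidates[lo]


# Hypothetical scenario: 100 builds, the bug sneaked in with build 37.
builds = list(range(1, 101))
checked = []  # audit trail of what we actually tested


def build_is_broken(build: int) -> bool:
    checked.append(build)
    return build >= 37


culprit = first_bad(builds, build_is_broken)
print(f"first broken build: {culprit}, builds tested: {len(checked)}")
```

Each check halves the remaining range, so 100 candidates need at most 7 tests instead of up to 100. Note that it only works if the check is reliable, which is why Make it fail comes before Divide and conquer.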