Debugging Software

Sat 24 September 2016
By Bram

I've always found Rob Pike's debugging story about Ken Thompson very powerful.

Ken would just stand and think, ignoring me and the code we'd just written. After a while I noticed a pattern: Ken would often understand the problem before I would, and would suddenly announce, "I know what's wrong." He was usually correct. -- Rob Pike.

This resonates with me, and I've always regarded debugging tools as overrated. In many cases, you can do without them. My debugging method is typically a liberal use of assert() and fprintf(stderr,..) calls. But even those are not always a substitute for clear and deep thought, as I experienced again today.

Yesterday and today I was chasing a bug in my C code. The code generates the topology and geometry of planets for my Children of Orc game. The visual and logical representations of the planet would not match up, and the Orcs would sink into the planet's surface. I spent a lot of time going over the code, again and again. I even had a working Python version to compare with. But no matter how long I stared at the code, I couldn't find the culprit.*)

I eventually solved the issue far away from the keyboard, at the playground with my toddler son, just by forming a mental picture of the processes and thinking deeply about what could cause the observed behaviour of the code. It turned out to be a case of misused vertex indices. I keep two versions of the planet mesh: one where the polygons share vertices via indexing, and one where the polygons are flattened out and do not share vertices. I was using the indices of the former to index the mesh of the latter. Going over the code and adding printf statements did little good. Stepping away from the keyboard helped a lot. Once back home, finding and fixing the bug took seconds.
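To make the mix-up concrete, here is a minimal sketch of the two mesh representations in C. The type and function names (vec3, indexed_mesh, flat_mesh, indexed_corner, flat_corner, flatten) are made up for illustration and are not taken from the game's code; the layout of three corners per triangle is an assumption as well.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } vec3;

/* Indexed mesh (hypothetical layout): triangles share vertices via an index list. */
typedef struct {
    const vec3 *verts;       /* num_verts unique vertices */
    size_t num_verts;
    const unsigned *indices; /* 3 * num_tris entries, each < num_verts */
    size_t num_tris;
} indexed_mesh;

/* Flattened mesh (hypothetical layout): each triangle owns its three vertices. */
typedef struct {
    vec3 *verts;             /* 3 * num_tris vertices, no index list */
    size_t num_tris;
} flat_mesh;

/* Corner c (0..2) of triangle t, looked up through the index list. */
vec3 indexed_corner(const indexed_mesh *m, size_t t, size_t c)
{
    assert(t < m->num_tris && c < 3);
    unsigned i = m->indices[3 * t + c];
    assert(i < m->num_verts);
    return m->verts[i];
}

/* Corner c (0..2) of triangle t, addressed directly by position. */
vec3 flat_corner(const flat_mesh *m, size_t t, size_t c)
{
    assert(t < m->num_tris && c < 3);
    return m->verts[3 * t + c];
}

/* Fill a flattened mesh from an indexed one; dst->verts must hold 3 * num_tris entries. */
void flatten(const indexed_mesh *src, flat_mesh *dst)
{
    assert(dst->num_tris == src->num_tris);
    for (size_t t = 0; t < src->num_tris; t++)
        for (size_t c = 0; c < 3; c++)
            dst->verts[3 * t + c] = indexed_corner(src, t, c);
}
```

In this sketch the bug would look like flat.verts[indexed.indices[k]] where flat.verts[k] was meant. When every vertex is referenced by at least one triangle, the shared indices are all smaller than the length of the flattened array, so such an access stays in bounds and a plain bounds assert does not fire. Hiding the raw arrays behind accessors like the two above is one way to make the wrong combination harder to write.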

I also learned that in this case, the C code was twice the size of the Python code. Before I ported it, my estimate was that C would require almost 10 times the lines of code. Even though Python is much more concise and powerful than low-level C, in this case it came down to a factor of 2.

*) Comparing the outputs of the Python and C versions was of little use, as the two versions used different Simplex-Noise implementations. Also, the output was binary, not ASCII. This made it hard to understand what specific part of the data was incorrect.
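For the binary half of that problem, a small text dump can help. The sketch below assumes the output is nothing but raw float values in native byte order, which is a guess about the file format and not something stated above; dump_floats_as_text is a made-up helper name. It does nothing about the differing noise implementations.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Read raw floats from `in` (assumed format: native-endian float values only)
 * and write one "index value" line per float to `out`.
 * Returns the number of floats written. */
size_t dump_floats_as_text(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);
    float v;
    size_t n = 0;
    while (fread(&v, sizeof v, 1, in) == 1) {
        /* %.9g prints enough digits to distinguish any two float values. */
        fprintf(out, "%zu %.9g\n", n, (double)v);
        n++;
    }
    return n;
}
```

With one value per line, two dumps can be compared with an ordinary diff tool, and the line number of the first difference points at the offending element.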
