Learn to Read Code

26 Jun 2017

I recently read the blog post Hyperproductive development by Jessica Kerr. This was timely for me because I've recently become (by attrition) one of the most productive developers on my small team. As she states in the article, part of the reason for this is that I've had 7 years at this and now know where many of the bodies are buried. I am either directly responsible or at least had a hand in creating much of the code that afflicts us every day. But I believe there's more to it than simply being the author. There's a skill I learned early on working in a legacy C++ codebase that nobody wanted to touch: reading code.

When I was first hired, all of the "logged in" experience took the user to a bunch of creaky old code that performed its work largely by schlepping data out of the database into a bunch of global char arrays, tweaking it and then siphoning it back into the database. It was nearly unbearable, but to make matters worse the whole thing was the worst example of spaghetti code I'd ever seen. Many functions did multiple different operations by testing a global "job code." This meant that the business logic was cut up into little strips and strewn haphazardly across multiple files. The icing on the cake was that these files often had identically named functions, so I had to check the makefile to determine which one my program linked against.

It was a terrible mess, but looking back, much of what made it feel so bad was my own inexperience rather than the code itself. It was in desperate need of some objects or at least some structs, but by and large it was a comprehensible mess. Clearly someone had had a firm enough grasp on what they were doing that all these plates stayed in the air. What I had to learn quickly, however, was to make do. I couldn't ask the authors because they were long gone (likely cashed out during some acquisition or another). I couldn't even manage to get a debugger on it (presumably this is at least possible, though my Linux-fu has never reached that level). We had one huge advantage, however. We knew the code was horrible and deceptive, but we had the source code. At the end of the day we could trace the execution and figure out where to squash a bug or surgically insert a new feature.

The problems that I face these days are more subtle, and much more insidious. The framework that we use for database access has a convention that for every class X there is an XReader and an XWriter that handle retrieving the object from the database and persisting changes, respectively. Our codebase is punctuated with blocks like the following:

using (var userReader = new UserReader())
using (var addressReader = new AddressReader())
using (var addressWriter = new AddressWriter())
{
    var user = userReader.GetUserById(userId);
    var address = addressReader.GetAddressById(user.AddressId);
    address.Street = newStreet;
    address.City = newCity;
    address.State = newState;
    address.Country = newCountry;
    addressWriter.UpdateAddress(address);
}

I was talking to a coworker about how we should just remove the IDisposable interface from all of those classes, and he was horrified. As he understood it, those classes implemented IDisposable in order to release resources that aren't managed by the C# runtime. He worried that removing it could mean that we never disconnect from the database, or at least hang on to the connection for far too long. What I knew about this code was that even though these classes were marked with the IDisposable interface, if you went up through the type hierarchy, the Dispose method did not contain any actual code!
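To make that concrete, here is a minimal sketch of what such a hierarchy might look like. The base class name DataAccessBase and the GetStreetById method are hypothetical stand-ins, not the framework's real types; the only thing taken from the situation above is that Dispose has an empty body.

```csharp
using System;

// Hypothetical base class standing in for the root of the framework's
// reader/writer hierarchy. The real type names are not shown here.
public abstract class DataAccessBase : IDisposable
{
    // Satisfies the IDisposable contract but releases nothing:
    // no connection is closed and no handle is freed.
    public virtual void Dispose()
    {
    }
}

public class AddressReader : DataAccessBase
{
    // Hypothetical lookup; a real reader would query the database.
    public string GetStreetById(int addressId) => $"Street #{addressId}";
}

public static class EmptyDisposeDemo
{
    public static string Run()
    {
        var reader = new AddressReader();
        using (reader)
        {
            reader.GetStreetById(1);
        }

        // Dispose has already been called at this point, yet nothing
        // was torn down, so the reader behaves exactly as before.
        return reader.GetStreetById(2);
    }
}
```

In a hierarchy shaped like this, the using block is ceremony: it calls a method that does nothing. You can only know that by opening the base class and reading it.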

The difference between me and my coworker is certainly partly due to how long each of us has been working with these tools, but it's also fundamentally about how we read code.

How to Read Code

I would not have noticed this on my own, but when I read a novel I skip over any words that I don't understand. I don't get much from this style of reading, just a vague impression of what the word means. I had never seen anything wrong with this until I borrowed a book from my grandpa. Every once in a while I would come upon a word in the book that was highlighted or underlined. I thought this was strange, so I asked him about it. He told me that every time he ran into a word that he didn't understand, he would highlight it to look it up later. I couldn't believe the simple brilliance. I have so many words in my vocabulary that I only know in one context or another, but could never use myself. I realized then that I was effectively faking it when it came to my vocab.

I believe that many people are the same way when they're coding. When they see something foreign like a using statement, they will do just enough research to satisfy their curiosity. This is where ideas like "using cleans up unmanaged resources" come from. If you dig another layer deeper, what using really does is guarantee that the Dispose method gets called when the block exits (and there are even more caveats than that if you include the possibility of the assignment expression throwing an exception). At the end of the day, the more you know about the code you're working on, the closer you'll be to that hyperproductive ideal. The best way to hamstring your knowledge is to let your understanding stay in a fuzzy, undefined state.
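Here is a small sketch of that deeper layer, assuming a hypothetical Tracked class that does nothing but record whether Dispose ran. The first method uses a using statement; the second is roughly the try/finally that the compiler expands it into. Both throw from inside the block to show that Dispose still runs.

```csharp
using System;

// Hypothetical disposable that only records whether Dispose was called.
public sealed class Tracked : IDisposable
{
    public bool Disposed { get; private set; }

    public void Dispose() => Disposed = true;
}

public static class UsingDemo
{
    // The using statement form.
    public static bool WithUsing()
    {
        var resource = new Tracked();
        try
        {
            using (resource)
            {
                throw new InvalidOperationException("boom");
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed so the resource can be inspected afterwards.
        }
        return resource.Disposed;
    }

    // Roughly what the using statement above expands to.
    public static bool WithTryFinally()
    {
        var resource = new Tracked();
        try
        {
            try
            {
                throw new InvalidOperationException("boom");
            }
            finally
            {
                // Runs whether the body completed or threw.
                if (resource != null)
                {
                    resource.Dispose();
                }
            }
        }
        catch (InvalidOperationException)
        {
        }
        return resource.Disposed;
    }
}
```

Both methods return true. Notice that nothing here mentions unmanaged resources: using only promises a call to Dispose, and what that call does is entirely up to the class. The caveat about the assignment expression also falls out of the expansion. In `using (var x = new X())` the object is created before the try begins, so if the constructor throws there is no object to dispose and the finally block never comes into play.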

Reading Tools

When my grandpa reads his books, he uses a dictionary to look up all the words that he doesn't understand. As programmers, we need to know where to find the resources that we need when we get lost in the code. Here's my list of essential tools:

I hope this has been as helpful to someone out there as my grandfather has been to me. I'm sure some people think this is old news, but there are just as many out there who've never dared to delve into the nitty gritty and truly understand the code they're working with.