The Feudal Workplace

28 Feb 2013

There are really two philosophies that I can see when it comes to writing code that is supposed to endure abusive user interaction: fail-safe and fail-fast. Fail-safe, on the one hand, tries to keep playing at (almost) all costs by avoiding exceptions and HTTP 500s. Fail-fast, on the other hand, tries to avoid any data corruption by stopping the minute it can't guarantee that execution completed successfully.

Which one you pursue is generally a balance based on the requirements of your business. If you are performing surgery with a high-powered laser strapped to a robotic arm, it is probably best that you don't start cutting into a patient when you realize there's a bug in the calculation. If you aren't handling critical information, but your customers will leave at the drop of a hat, you may want to focus on degrading gracefully.

For those of us in the latter group, this can be a huge bummer. We're working in databases we didn't design, in code that we think is rubbish, and we can't even stick an exceptional case in the code without breaking the whole application. Fortunately, you can have your cake and eat it too, but it may require you to look at things a different way.

The Feudal System

The feudal system basically views functions as a large set of hostile fiefdoms that are constantly at war. Because they are at war, they can never trust anything they get from another function and must sanitize the input. While this may sound like a good idea, the failing is that when a feudal function determines there is a hostile input, it saves itself by swallowing errors or ignoring the input. While this technically has some merit, it ends up being very difficult to maintain, as I will now demonstrate.

Imagine we have a function that takes an id and gets something out of the database for us.

Foo *get_foo_by_id(int id) {
   ...
}

Now, we know a little something about the database logic, specifically that we never have an id less than or equal to 0 and can therefore safely ignore the request, returning null.

#define NOT_FOUND NULL

Foo *get_foo_by_id(int id) {
   if (id > 0) {
       ...
   }
   return NOT_FOUND;
}

This is a nice little work-around that might save us some time in the degenerate case, and since we're going to have to return NOT_FOUND in some cases anyway, it's an innocent lie. Or is it? Did you ever stop to wonder why they sent us a -1 in the first place? It sounds like they might have a bug on their hands. In a feudal system, however, the only thing a function is truly required to do is not segfault.

Now we run into a worse situation: we do something that absolutely requires a certain state in a parameter.

void update_contact_info(User *user, const char *contact) {
    ...
}

Now, we need to avoid breaking so we do the following:

void update_contact_info(User *user, const char *contact) {
    if (user != NULL) {
        ...
    }
}

We effectively avoid a segfault here, but now we're really stretching the bounds of what's an acceptable input value. If this were real life (a huge stretch, I know) this would be unacceptable. I have never once been able to convince someone to wash my car without bringing it to their car wash. This idea that we need these checks to safeguard our code comes back to bite us when we end up with strange bugs.

Unfortunately, what started as a defensive programming style has turned into a maintainability problem. Instead of working together and passing information, this function now has a no-op behavior in certain cases. This is a bug waiting to happen.
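To make the danger concrete, here is a minimal sketch of how the silent no-op can hide someone else's bug. The User layout, find_user, and change_email are hypothetical names made up for illustration; only update_contact_info comes from the example above. The lookup has a deliberate off-by-one, so the last user is never found, and the feudal update quietly swallows the resulting NULL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical User type; the original example leaves its definition out. */
typedef struct {
    int id;
    char contact[64];
} User;

/* Feudal version: silently does nothing when handed a NULL user. */
void update_contact_info(User *user, const char *contact) {
    if (user != NULL) {
        strncpy(user->contact, contact, sizeof user->contact - 1);
        user->contact[sizeof user->contact - 1] = '\0';
    }
}

/* Hypothetical lookup with a deliberate bug: the last element is skipped. */
User *find_user(User *users, size_t count, int id) {
    for (size_t i = 0; i + 1 < count; i++) { /* BUG: should be i < count */
        if (users[i].id == id) {
            return &users[i];
        }
    }
    return NULL;
}

/* Hypothetical caller: it has no way to learn that the update was dropped. */
void change_email(User *users, size_t count, int id, const char *email) {
    update_contact_info(find_user(users, count, id), email);
}
```

Asking change_email to update the last user in the array leaves the old contact info in place, and nothing anywhere reports that something went wrong. The real bug is in the lookup, but the symptom shows up much later as stale data.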

Enter the Magna Carta

Just like the real feudal period came to an end, there is a solution to the feudal code that plagues some of our projects: code contracts.

Everybody knows you're supposed to document your code, but in certain circles there seems to be an attitude that a well-named method is worth its weight in comments. Even basic code contracts can be very useful, because people will be aware of what you are expecting, and at the very least you will be able to avoid some bugs.

/* Get a Foo from the database
 * PARAM:
 *     id    - the Foo's id number (> 0)
 * RETURN:
 *     Foo * - if found
 *     NULL  - if not found
 *
 */
Foo *get_foo_by_id(int id);

/* Update the user's contact info
 * PARAM:
 *     user    - the user to update (NON NULL)
 *     contact - the contact information
 */
void update_contact_info(User *user, const char *contact);

This is the most basic form of code contracts and it can be really nice to leave your header files (or interfaces if you're using Java/C#) full of information that a person can read through easily without going into the painful depths of the function implementing the guarantee. I've had people make the statement that if you follow a convention, everyone will be able to understand how the method works, but even in a clear-cut system, communication between developers and systems is a great thing.
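In C, one low-tech way to back up those header comments is the standard assert macro. What follows is only a sketch under assumptions: the Foo layout and the foo_table array are hypothetical stand-ins for whatever the real database access looks like. The point is that the documented precondition (id > 0) is checked loudly instead of being swallowed, while NULL keeps its one honest meaning of "not found".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical Foo type; the original example leaves it undefined. */
typedef struct {
    int id;
    const char *name;
} Foo;

/* Hypothetical in-memory stand-in for the database. */
static Foo foo_table[] = { {1, "first"}, {2, "second"} };

/* Get a Foo from the "database".
 * PARAM:
 *     id    - the Foo's id number (> 0)
 * RETURN:
 *     Foo * - if found
 *     NULL  - if not found
 */
Foo *get_foo_by_id(int id) {
    /* Contract check: a non-positive id is the caller's bug, not a miss. */
    assert(id > 0);

    for (size_t i = 0; i < sizeof foo_table / sizeof foo_table[0]; i++) {
        if (foo_table[i].id == id) {
            return &foo_table[i];
        }
    }
    return NULL; /* a legitimate "not found" */
}
```

Because assert compiles away when NDEBUG is defined, a release build can still behave exactly as before, while debug builds stop right at the caller that broke the contract.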

In other languages, we have other options. In Java, it is possible to use JML (the Java Modeling Language) and its runtime checker to enforce pre- and post-conditions. Classes declare a variety of requirements in a pseudo-Java language. The result is that the documentation is not as clear (I suggest supplemental Javadoc comments), but we are able to verify that the requirements are met and log errors.

public class Foo {
    public int upgradeLevel;
}

public class Foos {
    /**
     * Upgrade the Foo to the specified upgrade level
     * @param foo
     *     the Foo to upgrade (non-null)
     * @param upgradeLevel
     *     the level to upgrade the foo to (> foo.upgradeLevel)
     */
    /*@
     @ normal_behavior
     @     requires
     @         foo != null
     @                 && foo.upgradeLevel < upgradeLevel;
     @     ensures
     @         foo.upgradeLevel == upgradeLevel;
     @ also
     @ exceptional_behavior
     @     requires
     @         foo == null;
     @     signals
     @          (NullPointerException e) true;
     @*/
    public static void upgrade(Foo foo, int upgradeLevel) {
        ...
    }
}

The above code is an example of something similar to the C example I showed earlier. The trick is that the funny /*@@*/ comments allow the JML runtime checker to alert you when you violate a precondition. Not shown are the strange operators and statements you can insert to create conditions that validate almost anything that you could want.

Because JML has a run-time checker that will print a message when the contract is not satisfied, it can provide useful information to the person running the code when it has been pushed outside the behavior the contract defines. This may not be an actual bug in the code, but it is often the symptom of an oversight that should be remedied.

Usher in a New Era

While I've only demonstrated one run-time checker, many different languages have some sort of support for contracts with different behaviors. Depending on the buy-in, you can generally move from logging contract violations, to throwing exceptions in debug builds, to run-time checking in the production build. The nice thing is that most of these solutions use comments, conditional builds, decorators, or other mechanisms that can easily be made innocuous, which makes them much easier to pitch to a conservative audience.
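For the C examples from earlier, that sliding scale could look something like the sketch below. Everything here is an assumption for illustration: the REQUIRE macro, the CONTRACT_FAIL_FAST build flag, the contract_violations counter, and the User layout are all hypothetical names, not part of any library. By default a broken contract is logged and the function still degrades gracefully; building with -DCONTRACT_FAIL_FAST turns the same check into an abort.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counter so violations can be monitored or reported. */
static int contract_violations = 0;

/* Hypothetical build flag: define CONTRACT_FAIL_FAST to abort on violation. */
#ifdef CONTRACT_FAIL_FAST
#define CONTRACT_ON_VIOLATION() abort()
#else
#define CONTRACT_ON_VIOLATION() ((void)0)
#endif

/* Log every broken precondition, then apply the build's policy. */
#define REQUIRE(cond) \
    do { \
        if (!(cond)) { \
            contract_violations++; \
            fprintf(stderr, "%s:%d: contract violated: %s\n", \
                    __FILE__, __LINE__, #cond); \
            CONTRACT_ON_VIOLATION(); \
        } \
    } while (0)

/* Hypothetical User type; the original example leaves it undefined. */
typedef struct {
    char contact[64];
} User;

/* Update the user's contact info
 * PARAM:
 *     user    - the user to update (NON NULL)
 *     contact - the contact information (NON NULL, an added assumption)
 */
void update_contact_info(User *user, const char *contact) {
    REQUIRE(user != NULL);
    REQUIRE(contact != NULL);
    if (user == NULL || contact == NULL) {
        return; /* fail-safe build: still a no-op, but no longer a silent one */
    }
    snprintf(user->contact, sizeof user->contact, "%s", contact);
}
```

The function still refuses to segfault, so the fail-safe crowd stays happy, but every bad call now leaves a trail pointing at the file and line of the check that caught it.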