The Trouble with Tuples

04 Feb 2015

In computer science, a tuple is a fixed-length, ordered list of elements of (potentially) different types. In programming languages that support them, the syntax is often comma-separated values inside parentheses. Here are some examples of tuples in a hypothetical language with their corresponding types:

(42)                   : <int>
("Hello, world", 3.14) : <string, decimal>
(true, true, true)     : <bool, bool, bool>

At first glance, tuples are really helpful, and they are often used to return multiple values from a function, for instance:

fn pair(first: int, second: int): <int, int> {
        return (first, second);
}
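For a concrete instance in a real language, here is a minimal Python sketch. The built-in divmod really does return a two-element tuple; min_max is a hypothetical helper invented for this illustration.

```python
from typing import List, Tuple


def min_max(values: List[int]) -> Tuple[int, int]:
    """Hypothetical helper: return the smallest and largest value as one tuple."""
    return (min(values), max(values))


# The built-in divmod also packs two results (quotient, remainder) into a tuple.
quotient_and_remainder = divmod(17, 5)

# The caller receives both values in a single return.
bounds = min_max([4, 9, 1, 7])
```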

The big problem comes when you have to read values out of a tuple: how do you do it? Dynamic languages have an advantage because they can use the subscript operator; they don't have to type check at compile time.

var tuple = (1, true);
tuple[0]            : int
tuple[1]            : bool
tuple[<expression>] : ? - to be determined at run-time
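Python is one such dynamic language, so the same idea can be sketched there. The names record and pick are hypothetical; the point is only that the element type (and even whether the index is valid) is discovered when the line runs.

```python
record = (1, True)


def pick(index: int):
    """Return the element at a position that is only known at run time."""
    return record[index]


# The type of the result depends on which index was passed in.
first_type = type(pick(0)).__name__
second_type = type(pick(1)).__name__

# An out-of-range index is not caught until the call actually executes.
try:
    pick(2)
    failed = False
except IndexError:
    failed = True
```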

Static languages are up a creek, however, because it's nigh impossible to do the kind of type checking required unless you're really strict about the indexes (requiring them to be constants, for instance). This has given rise to destructuring assignment syntax and named accessors:

var tuple = (1, true);

var (i, b) = tuple;
- or -
var i = tuple.Item1;
var b = tuple.Item2;

i : int
b : bool
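The destructuring form has a close analogue in Python with type hints. This is only a sketch: pair is a hypothetical variable, and the comments about inference assume a static checker such as mypy is run over the file (Python itself does not enforce the annotations).

```python
from typing import Tuple

pair: Tuple[int, bool] = (1, True)

# Destructuring assignment: each name is bound by position.
i, b = pair

# Constant indexes: because the positions are literals, a static checker
# can work out that the first is an int and the second is a bool.
i_again = pair[0]
b_again = pair[1]
```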

Tuples have a major disadvantage, however, when compared to other heterogeneous data types: a lack of description around the individual values. For example, what is the significance of the second bool in <bool, bool, bool>? Well, you'll have to check the documentation because there's no name for that value. This makes it very difficult to read code that uses tuples, and since code is read much more often than it's written, using them tends to be a bad idea.
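To make the contrast concrete, here is a small Python sketch comparing an anonymous three-bool tuple with a record whose positions are named. ValidationResult and its field names are hypothetical, chosen only to show what the labels buy the reader.

```python
from typing import NamedTuple, Tuple

# Hypothetical result of some validation step: which bool means what?
anonymous: Tuple[bool, bool, bool] = (True, False, True)


class ValidationResult(NamedTuple):
    """The same three values, but each position carries a name."""
    has_name: bool
    has_email: bool
    accepted_terms: bool


named = ValidationResult(has_name=True, has_email=False, accepted_terms=True)

# The reader no longer needs the documentation to interpret position 1.
missing_email = not named.has_email
```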

So now that we've decided to abandon tuples because of their warts, are we happy? Well, recently I've noticed that there is a form of tuple, deeply ingrained in many programming languages, that has all of these awkward issues: method parameters. Now that we've looked at the tuple syntax, you can start to see how function calls look suspiciously similar:

add(5, 7)

This method takes two arguments, both ints, but we don't know what names they have, so we just remember which position they're in. This isn't really a problem here since addition is commutative: swapping the arguments gives the same result. Imagine we had the following functions instead:

subtract(7, 5)
doSomethingWithOptions(true, false)
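The risk with calls like these can be sketched in Python. The subtract function and its parameter names are assumptions made for illustration; what matters is that both argument orders are accepted, and only one matches the intent.

```python
def subtract(minuend: int, subtrahend: int) -> int:
    """Hypothetical definition: the first argument minus the second."""
    return minuend - subtrahend


# What the caller meant.
intended = subtract(7, 5)

# The same call with the arguments swapped. Both are ints, so neither the
# language nor a type checker has any reason to complain.
swapped = subtract(5, 7)
```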

These functions are a little bit more dicey. We hope that subtract(7, 5) turns into 7 - 5, but we don't have any labels, and it's hard to tell what the original intention is without looking at the documentation. (This is not a theoretical problem: check the MDN doc for Node.prototype.replaceChild. I always forget the order of the arguments to that function.) Even worse is the doSomethingWithOptions call. Unless you work with this function every day, I would not expect you to know what those two parameters do or what order they have to come in. Can we do any better? I've been thinking more about other function call syntaxes recently and have been inspired by Objective-C's message passing syntax and JavaScript's object notation:

[Screen displayText: "Hello" color: "red"]
Screen.display({ text: "Hello", color: "red" })

The beauty here is that each argument is named at the call site, so you don't have to come back later and try to figure out what each argument corresponds to. The unfortunate consequence is that it's more typing, but most IDEs already autocomplete arguments anyway. When all's said and done, a function call could look like this:

Screen.display(text: "Hello", color: Color.Red)
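Some existing languages already allow something close to this. As a sketch, Python's keyword-only parameters (everything after a bare * in the signature) force callers to label each argument. The display function and Color enum below are hypothetical stand-ins for the Screen API imagined above.

```python
from enum import Enum


class Color(Enum):
    RED = "red"
    BLUE = "blue"


def display(*, text: str, color: Color) -> str:
    """The bare * makes every parameter keyword-only."""
    return f"[{color.value}] {text}"


rendered = display(text="Hello", color=Color.RED)

# Argument order no longer matters once every argument is labeled.
same = display(color=Color.RED, text="Hello")

# display("Hello", Color.RED) would raise TypeError: positional use is rejected.
```

The trade-off is the one already mentioned: call sites get longer, but they can be read without looking up the signature.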

As with most ergonomic problems, I think the inertia of the current way of doing things is too great for a real shift to happen, but it's at least something to think about as more languages adopt tuples at the syntax level.