Making Poirot Fast

02 Feb 2018

As way of introduction, Poirot is a mustache rendering engine for .NET Standard (it descends from SharpStache, my earlier project for .NET Framework 4.5+). I originally created SharpStache with the intent of beating the performance of Nustache, but my new target with Poirot is to beat string.Replace (who says you can't have un-achievable goals?). Here is my (possibly flawed) recollection of the performance work that has spanned two .NET packages.

Parsing Like a Pro

In performance tests on the Nustache package, I found that an amazing amount of time was spent in the package parsing Regexes to parse Mustache. Now, regular expressions are an amazing thing and they are fast, but the tools that most languages call regex only vaguely resemble the concept from computer science. The first problem with regexes is parsing/interpreting. You're inserting a whole other programming language (and a dynamic one at that) in between your template and the metal. This is mitigated somewhat by "compiled regexes" which allow you to spend extra cycles at startup to ensure much faster execution, but that still doesn't protect you from the second problem: backtracking.

If possible, a parser should always be moving forward, but even the most basic operations in regex like the * operator will start going backward if it can't find a match. For example, say you wanted to find the first "{​{" in a string (definitely a thing you would want in a mustache parser), you might be tempted to do something like this in regex: .*{​{ (read: match zero or more (*) of any character (.) and then match two "{"s ({​{)). You would never ship this regex because besides being slow, it would also be wrong. See, you and I think about patterns differently, but regex implementations have a strict order-of-operations. A regex implementation will start at the beginning of the pattern and pump it until it no longer works meaning it will generate code like the following for .*: while (true) { advance(); }. This is because it will continue to try to match any character (.) until it can't any more. This means that in a simple string like "Hello, {​{name}}.", the pattern will first match every character in the string as part of .* before trying to find the first "{". Seeing nothing left of the string (because it already advanced to the end) it will start to backtrack one character at a time checking for "{". First it will check if "." starts with "{", back up and check if "}." starts with "{" until it finally gets to "{name}}.". At this point it will try to find that second "{" and fail again backing up to where we want it matching "Hello, " to .* and "{​{" to {​{. Even if your input string is 10TB long and the first character is "{", it won't go looking for that until it tries every other character first. This is a huge problem which can be solved with "non-greedy" modifiers (change .* to .*?) that behave as we normally expect with respect to performance.

So given that the regex path is mired in problems, how do we proceed? I suggest we do things the old-fashioned way and just iterate our string looking for "{", the code is a lot longer, but the performance of the application isn't dependent on pre-compiling some backtracking-crazed language to do the simple job of finding "{​{" and "}}".

Evaluating expressions

In the original version of SharpStache, after writing the parsing by hand I was so far out in front of the performance problem that I punted on the actual property access and just used reflection. The code was pretty short and to-the-point having only a few special cases to handle things that are IEnumerable and dictionaries. The problem with this is that it's pretty slow. The reflection system is responsible for a lot and constantly querying it with the same questions created some room for improvement. Since I was committed to making something fast instead of something easy or simple, I had to figure out how to avoid those lookups.

Thankfully, C# has a great library to generate new code at compile-time: System.Linq.Expression. This allows you to generate code in basically the same way you think about writing it: one method call or property access at a time. The first time I would see a new type, I would generate a couple of functions. The first one I would generate was the property getter. Rather than lookup which property to get I generated a switch statement with the names of all of the properties on the object and handled each case by accessing the correct property. The pseudo-code looks like this:

public object GetItem(string property)
{
    switch (property)
    {
        case "Property1": return this.Property1;
        case "Property2": return this.Property2;
        default: return null;
    }
}

This means that I don't have to query the reflection APIs for information that's not changing like which properties are on a class. This really helped speed up more complicated patterns where properties were called in a loop or in general because there were generally only a few types passed into the templates.

Making the Parser Skim

Inspired by a technique I'd heard mentioned as part of grep, I realized that for larger templates with more text in the middle, I could skip about half of the letters on average while I was looking for "{​{". The old code looked something like this (simplified):

public int FindBrace(string template, int start)
{
    for (int i = start; i < template.Length - 1; i++)
    {
        if (template[i] == '{'
            && template[i + 1] == '{')
        {
            return i;
        }
    }
    return template.Length;
}

That's much faster than any sort of regex shenanigans, but you can actually do better, because even if you skip every other letter, you'll still see one of the "{"s. This meant I could convert the code to something like this:

public int FindBrace(string template, int start)
{
    for (int i = start + 1; i < template.Length; i += 2)
    {
        if (template[i] == '{')
        {
            if (template[i - 1] == '{')
            {
                return i - 1;
            }
            if (i + 1 < template.Length
                && template[i + 1] == '{')
            {
                return i;
            }
        }
    }
    return template.Length;
}

Now we're only comparing every other character with the additional cost of having to look both ways to find the "{​{". Overall, however it's a pretty big win.

Future Work

I would love to be part of The Fastest Mustache Parser for .NET of All Time ™, so I'm always thinking about other optimizations I could perform. If you find this interesting and would like to help me out, my project is on GitHub at https://github.com/kbsletten/Poirot/. Feel free to open an Issue or send in a Pull Request if you have any great ideas! If you're looking for a Mustache renderer for your .NET Core or .NET Framework project, give Poirot a go and let me know what you think.