Home

A Simple Templating Language for Python

Estimated Reading Time: 6.50 minutes.

Recently, in my own projects, I've been hitting a bunch of cookie-handling problems in several of the main Python web frameworks. They're edge-cases, so it does make sense, but it is annoying as all hell.

So, being the person that I am, I began to undertake writing my own WSGI framework. It isn't overly difficult - the spec exists and is well-understood, so writing something that is pluggable into any major WSGI server is the case of a day's work. Writing something that can do 99% of what you want to do, two days. Something that fits everything you want to do... Infinity.

However, in the course of writing it, I felt the need to quickly spin up a 0-dependency templating engine to include as part of the framework.

I vaguely recalled something in the past about someone abusing Python's string formatter and spec handler for this purpose, which I was already doing for some fancy URL parsing. So, I dove right in to that and came up with something horrifying.

It's been integrated into my framework (which is not yet API-stable) here, under the "Template" class.

This is what a basic template might look like:

My name is {foo:(name, last)}

<ul>
{books:foreach:<li>"{{item[1]}}" by {{item[0]}}</li>}
</ul>

{rockit:if:
<p>ROCK IT!</p>
    :
<p>Don't rock it.</p>
}

It's widespace agnostic, and fairly lisp-y. (Note that in the function call, you can actually put expansions in instead of names and it'll still work! Almost like quasi-quoting).

And the usage of the template is something like:

return Template().format(TEMPLATE_CONTENT, foo=foo, name='eric', last='moe', books={'Author': 'Title'}, rockit=True)

The entire awful system is around 49LOC.


The Class

Our class is going to require two components to it:

Thus, we're going to start off with something like this:

import string

class Template(string.Formatter):
    def format(self, format_string, *args, **kwargs):
        return super(Template, self).format(format_string, *args, **kwargs)

    def format_field(self, value, spec):
        return super(Template, self).format_field(value, spec)

The class should now pretty much do absolutely nothing and be the exact same as Python's normal, and exceedingly powerful, string formatter:

assert(Template("{a} != {b}", a=2, b=1) == '2 != 1')

The format Override

We only want to override the format override so that we can pass command arguments to function calls. That's it.

So, with that in mind...

def format(self, format_string, *args, **kwargs):
    # Expose arguments...
    self.args = args
    self.kwargs = kwargs

    return super(Template, self).format(format_string, *args, **kwargs)

That's it. We don't care about anything else. We're not changing behaviour, we're just adding a side effect.

foreach

One of the simplest commands to add to the formatter, which will give us an idea of the process for the future, is the foreach command.

def format_field(self, value, spec):
    if spec.startswith('foreach'):
        template = spec.partition(':')[-1]
        if type(value) is dict:
            value = value.items()
        return ''.join([Template().format(template, item=item) for item in value])
    else:
        return super(Template, self).format_field(value, spec)

What we're doing here is pretty basic. It's a simple recursion over the arguments, where we ask our own formatter to handle the insides of the formatter.

The key to understanding this is the way the arguments are presented. Basically, the specifier becomes anything from the first ":" to the end of the formatting field:

"{someValue:specifier-is-here:no:matter:what!}"

So you extract the first specifier, and then pass on the arguments, evaluating and concatenating them.

Function Calls

Function calls are a little more involved, but not especially. We'll be handing off most of the effort to shlex, a builtin Python library, because I am far too lazy to actually do any parsing. (argparse might also be a decent alternative, and give you more advanced type handling...)

First, we'll handle the argument-less calls:

elif spec == '!' or spec == '()':
    return value()

Which, in use, will look something like:

{foo:!} or {foo:()}

We let the user either pass in a pair of empty double quotes, or for some magic, a single bang to indicate no arguments. It's an opinionated decision, but one I like. Tweak to your own tastes.

And now, we'll handle the calls that actually require a little bit more intelligence:

elif spec.startswith("(") and spec.endswith(')'):
    # Grab the arguments...
    args = shlex.split(spec[1:-1])

    # Turn the arguments into their equivalents...
    built_args = []
    for arg in args:
        # Allow the user to not realise how args are broken up...
        if arg.endswith(","):
            arg = arg[:-1]

        # Use a template-aware sensible default:
        if arg in self.kwargs:
            built_args.append(self.kwargs[arg])
        else:
            built_args.append('')

    # Call the function:
    return value(*built_args)

There isn't a lot to think about there. We strip the brackets, and hand off the arguments to shlex. This means that quoted arguments are handled correctly, so the user can pass arguments that are split up into the formatter at the top level, and so on. There's a few edge-cases, like pipes being passed around on their own, but that's fine for the most part. Quotes can fix those problems.

We strip hanging commas from arguments, because shlex doesn't split on commas, obviously, but the user is likely to make that mistake, because they think what they're looking at is vaguely Python.

Then we make another opinionated decision: What if the user mentions a variable that doesn't exist? We could raise a KeyError. That would make sense. But this is a template system intended for the web, so passing in an empty string also makes sense.

Finally, we just deliver the payload. Anything goes wrong, and Python will conveniently tell us where and what part of the template the error happened in. So we don't need to catch any exceptions.

We use it somewhat like:

{foo:(var1, var2)}
{foo:(var1 var2)}
{foo:(var1, "var2")}

Mix and match as fits your way of doing things.

If statements

No template language should be considered complete without if statements. I don't care if it can't do function calls. But if you can't do a conditional... I hate you.

Unfortunately, this is where things get a little bit... Unreadable.

We need to handle two cases:

We accomplish the above by some fancy string cutting. It looks fairly unreadable, but it isn't overly complicated code.

It's basically determining lengths and cutting them apart, before returning only the part we're interested in.

Summing Up

With that, we're done!

This awful damn thing works, and is actually rather fast, thanks to being based on part of CPython that is fairly well optimised. It isn't something you'd want to use every day, but it is more than good enough to use for simple tasks here and there.

I haven't, and probably won't, split it out of my main project, but you can find the complete class code for it over here. Obey the license, and you can rip it out and into whatever you feel like.

Update: Swapping Values & Specification

It turns out that a very simple tweak can let us make our simple template language a lot nicer to use.

My name is {foo:(name, last)}

<ul>
{foreach:books:<li>"{{item[1]}}" by {{item[0]}}</li>}
</ul>

{if:rockit:
<p>ROCK IT!</p>
    :
<p>Don't rock it.</p>
}

This is what we're going to accomplish. We're going to make it so that for if/foreach (and any future spec-like syntax), they come before the value, so it reads more like a lisp, and less like... A bad DSL.

def parse(self, format_string):
    t = super(Template, self).parse(format_string)
    t = list(t)

    for cell in t:
        literal_text, field_name, format_spec, conversion = cell
        if format_spec and format_spec.startswith('(') and format_spec.endswith(')'):
            # We don't need to rearrange anything on a function call!
            yield cell
        elif not format_spec:
            # No spec, make no changes!
            yield cell
        else:
            # Invert field/spec, slicing as appropriate...
            field_name = field_name + ":" + format_spec.partition(":")[-1]
            format_spec = format_spec.partition(":")[0]
            yield (literal_text, format_spec, field_name, conversion)

So, to break it down:

Firstly, parse is expected to yield an iterator of literal, name, spec, conversion tuples.

We'll continue doing that where there is no spec, and for function calls, by just yielding the cell.

However, when that isn't the case, we'll slice and append the spec to the name, and swap their positions.

That's really all there is to it, but it is a huge useability improvement as the programmer can now read the code left to write as they expect to be able to.


Comments

Submit comment...

Subscribe to this comment thread.