For one of my courses this Fall, I had to write PBS job scripts for conducting some performance benchmarking experiments on our university’s HPC cluster. The scripts were essentially the same- I had to run a particular program for various combinations of (nodes, cores per node). Each combination required me to write a new script, putting in new filenames for the output, and new directives for allocating nodes and cores. The first time I tried to run these experiments, I thought, “I’ll spend more time doing automation than I’ll save from having automation”. So I wrote 16 scripts manually.

The experiments failed. One of the reasons was misconfigured parameters. The other was psychological- the moment I had decided to write all the scripts manually, I had accepted defeat.

A week later, I ended up having to run the same experiments again. This time, I wanted to do things right. I decided to automate script generation. First, I poked around with m4- I’d used it before for generating GMAT scripts with custom parameters[^1]. But then, I hit a roadblock.

If you want to allocate exactly 3 nodes with 4 cores each using scheduler directives, here’s what you have to write:

	#PBS -l nodes=1:ppn=4+1:ppn=4+1:ppn=4

In general, if I want to allocate n nodes with m cores each, I’ll write-

	#PBS -l nodes=1:ppn=m+1:ppn=m+1:ppn=m... <n> times

Now m4 does, theoretically allow you to write recursive macros which will repeat a string n times. But I really was in no mood for learning the syntax of a new esoteric language simply for automating script generation. And then I realized that I had done loops in templates before- in jinja2![^2] In jinja2, the above statement could be generated as follows:

	#PBS -l nodes={% for node in node_list %}{{ node }}{% endfor %}

With the following argument passed to jinja’s render method:

	node = "1:ppn={}".format(cores)
	args = {
		"node_list": [node + "+" for i in range(processors - 1] + [node],
		...
	}

After hunting around for a bit, I found out how to use jinja on arbitrary template files. What you need is a FileSystemLoader.

Here is a useful snippet to serve as an example- gist.

Now here’s the thing- I ended up running all the experiments 4 more times. But now, all I had to do for new tweaks was change the template- the python would take care of generating all 16 files. This made my life bearable.

When I was a kid, I used to hate stories that had morals in them, mostly because they were artificially constructed for somehow “instilling” that moral in the listener (usually a bored 6 year old). To give an example, there was a story where a monkey refuses to share a fruit containing a poisonous seed, and ends up dead as a result. Another “virtuous” monkey splits the fruit in two and shares it with their friend, causing the poisonous seed to fall off, with that monkey not ending up dead. I could go on, but I really should be writing a B+ tree instead of this blog post, and the least I can do is not spend my time ranting about bad short stories.

Anyway, the story that I have told is 100% organic, and has a real moral (unlike some stories that are “enriched” with morals), and thus I believe I can add some lessons that people can take away.

  1. Automation is worth the effort even if it ends up consuming equal or slightly more time than it saves. This is because it makes your process reproducible, easier to debug, more accurate and less of a psychological burden. It is also a way of having fun.

  2. Jinja2 can be used in generic python scripts, independent of Flask. Here’s an example of how.

  3. Magic is for winners, kids! Don’t do drugs. -Sean Plott aka Day9

If you know a better way of configuring jobs using PBS, reach out!.

  1. I wanted to run an orbit raising simulation for several combinations of Keplerian elements using GMAT. While GMAT can be invoked from the command line, it does not take command-line arguments. So, I had to modify the simulation script every time. Instead, I added macros inside the script for the Keplerian elements, and used it as a template. Then, in the inner loop of my bash script, I used to generate a new script file, pass it to GMAT, run the simulation, rinse and repeat.

  2. Of course, you could also write the whole script as a single python string, and then call .format() on the string, but I did not want to do that, for obvious reasons.