Unleashing dynamic LaTeX: latexmk, gnuplot, R (knitr) and pgfplotstable

One of the nice things about LaTeX is that it lets you separate the content of your documents from their formatting and layout. Need a new column layout? Just add an option to the document class. Your image needs to be wider? Change it, and LaTeX adjusts its position in the document. In other words, you only need to worry about what you’re doing right now and nothing more.

However, this philosophy doesn’t extend to all aspects of LaTeX. What if you’re writing a technical document and a graphic must be regenerated if the data files change? What if those data files should also be summed up in a table and some key values discussed in the document? You’d probably end up re-running your R and gnuplot scripts, and copy-pasting the results. This isn’t a bad thing if you only have to do it once or twice, but it can get annoying quickly.

The solution: again, separate content and presentation. If you have to do some calculations to get the data you need in your document, write those calculations in the document and let LaTeX (with knitr, as we’ll see later) recreate them if needed. If you have some plots generated by gnuplot or tables from some data file, again, let your document typesetting system take care of filling them.

Show me the code!

First of all, a sample of what can be done. I uploaded some sample code, with all the needed files, to a GitHub repository: here’s what the document looks like and here you can take a look at the code used to generate it.

Latexmk, a Make for LaTeX

If we want to separate the presentation (the TeX code) from the content (the data files, plots and calculation results) we will end up with some dependencies. You may think that Make is the solution, as it’s built for this kind of thing. But unless you want to go absolutely crazy, don’t use Make.

The problem with LaTeX is that you’ll probably need multiple compilation runs to get the references, bibliography, etc. right, and that doesn’t sit well with Make. The solution is to use a build tool made specifically for LaTeX, named Latexmk.

Latexmk is a Perl script that knows how to build your LaTeX documents and how many reruns it needs to complete them. It’s also customizable and has some nice features, such as continuous preview and compilation of documents: latexmk detects changes in your document and in the files it depends on, rebuilds them and updates the resulting PDF file in your document viewer. The command-line option is -pvc, and although it usually works out of the box, you may need to change your ~/.latexmkrc file. On OS X with Skim, for example, I had to include the following lines:

$pdf_previewer = 'osascript -e "set theFile to POSIX file \"%S\" as alias" -e "set thePath to POSIX path of theFile" -e "tell application \"Skim\"" -e "open theFile" -e "end tell"';
$pdf_update_method = 4;
$pdf_update_command = '/usr/bin/osascript -e "set theFile to POSIX file \"%S\" as alias" -e "set thePath to POSIX path of theFile" -e "tell application \"Skim\"" -e " set theDocs to get documents whose path is thePath" -e " try" -e " if (count of theDocs) > 0 then revert theDocs" -e " end try" -e " open theFile" -e "end tell"';

You may also want to enable SyncTeX (option -synctex=1) to link your editor and your previewer, so you can select a line in the .tex file and see the corresponding part in the PDF file and vice versa. As there are a lot of possible LaTeX environments, I will not explain here how to configure all of them. However, a quick Google search will show you how to configure SyncTeX and Latexmk continuous preview in most of them.

The other feature of Latexmk we will use is the custom dependency recipes. Basically, they’re chunks of code similar to this one:

# Extensions are given without the leading dot.
add_cus_dep('inputext', 'outputext', 0, 'funcname');
sub funcname {
    return system("build-your-file-command");
}

These are described in more detail in the Latexmk manual (PDF), but basically this tells Latexmk that, if it sees a file.outputext included in your TeX document and a file.inputext exists in the directory, it should call funcname to build it. Latexmk automatically includes that file.inputext in the dependency list for your document and will rebuild the output if the file changes.

Automatic plots and graphs with gnuplot

Now to the actually interesting things. Gnuplot is a great open-source graphing tool. You write some commands in a file, run gnuplot and it outputs a pretty graphic. You can change the output format so that, instead of showing a window, it writes the graph to a .png file. However, if you’re going to include that graphic in a LaTeX file, I would recommend using the LaTeX output driver. To do that, just add these two lines to the beginning of your gnuplot script (changing the size if needed, of course):

set term cairolatex color size 4.7in,3in dashed
set output "filename.tex"

It’s important that the output filename has the same base name as the gnuplot script you’re using, or else Latexmk will not recognize the dependency. You also have to include this code in your ~/.latexmkrc file (the gnuplot extension can be changed to your favorite one):

add_cus_dep('gnuplot', 'tex', 0, 'makegnu2tex');
sub makegnu2tex {
    # $_[0] is the base name of the file, without its extension.
    return system("gnuplot \"$_[0].gnuplot\"");
}

If you ran your gnuplot script now, it would output an eps file and a tex file. The latter is the one you have to \input{} in your document. When Latexmk sees that, it will automatically call gnuplot and build the .tex file containing the plot.
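On the LaTeX side, the plot is pulled in like any other file. The sketch below is a minimal document, assuming a script called throughput.gnuplot (a hypothetical name) that sets its output to "throughput.tex"; the figure label and caption are also just placeholders. The generated file relies on \includegraphics, so graphicx has to be loaded, and a color package is needed when the terminal is configured with the color option.

```latex
\documentclass{article}
\usepackage{graphicx} % the generated file calls \includegraphics
\usepackage{xcolor}   % provides \color for terminals set up with "color"

\begin{document}

\begin{figure}
  \centering
  % Hypothetical file name: assumes throughput.gnuplot sits next to this
  % document and contains: set output "throughput.tex"
  \input{throughput.tex}
  \caption{A plot typeset with the document's own fonts.}
  \label{fig:throughput}
\end{figure}

\end{document}
```

If your pdflatex setup has trouble with the eps file, the cairolatex terminal also accepts a pdf keyword (set term cairolatex pdf ...), which makes gnuplot write a PDF next to the .tex file instead.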

The advantage of using the Cairolatex driver instead of outputting to PDF or PNG files is that the formatting of your plot is consistent with the rest of the document: the font is the same, you can use LaTeX commands in labels (remember to escape the backslashes in double-quoted gnuplot strings) and there are no artifacts from resizing an image file.

An alternative to this could be the use of pgfplots to call Gnuplot, but it can be an annoyance as the plot is rebuilt on each compilation and caching of the result is not straightforward. Another advantage of the Gnuplot script approach is that it allows you to change one line and see and manipulate the plot in a GUI terminal, such as wxt.

Write your calculations, not your results: using R with LaTeX

To be honest, I’m not a great fan of R, but it seems to be the mathematical computation language best integrated with LaTeX. The idea is that you can write R code in your document, and then at build time it will be interpreted and the result will be included in your document. For example, I could have this chunk of code

<<BiasBoxplot, fig.lp='fig:', fig.cap = 'A Boxplot.', echo = FALSE, cache = TRUE, fig.height = 4>>=
results = read.table("data")
time = results$V5
avg_time = mean(time)

bias = results$V6 - results$V1
boxplot(bias)
@

that would output a boxplot image in the LaTeX document. You could also inline the results in your LaTeX text: something like The average time is $\Sexpr{avg_time}$ seconds would appear as The average time is 20 seconds. The nice part of doing things this way is that you don’t change your text when your data changes. You just update the file and rebuild the document, and all the results that come from that data file get updated automatically. This is also great if you keep your document in source control (using Git/Mercurial/SVN/whatever to track your document changes is, by the way, something you should absolutely do if you care about it).
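To see how the pieces fit together, here is a minimal sketch of a complete .Rnw file: ordinary LaTeX with one R chunk and one inline \Sexpr. The numbers are made up and typed directly into the chunk so the example does not depend on any data file; the chunk name setup and the call to round() are my own additions for illustration.

```latex
\documentclass{article}
\begin{document}

<<setup, echo=FALSE>>=
# Hypothetical measurements standing in for a real data file.
time <- c(18.2, 21.5, 20.3)
avg_time <- mean(time)
@

% round() keeps the inlined number short.
The average time is $\Sexpr{round(avg_time, 1)}$ seconds.

\end{document}
```

Each chunk starts with a <<...>>= line and ends with a line containing only @, and anything computed in a chunk is available to every later chunk and \Sexpr in the document.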

To make this work, I use Knitr, an improvement over Sweave that is way easier to use. A few tips you should remember: use echo = FALSE to avoid showing the R code in the document, results = 'hide' to remove the output from code chunks, and cache = TRUE if you have figures in the code and would like to avoid Latexmk looping infinitely while rebuilding your document. All these chunk options are documented here; you should probably read that page. You can also take a look at the demos (especially the minimal one) to see all the features of Knitr and how to use them.

Installing Knitr is not difficult at all if you have R installed. If you don’t, just download it from the official page. I’d also recommend downloading RStudio, which is a nice frontend/IDE. Once you have R, just follow the instructions here to download Knitr (it’s just one command). Installing patchSynctex (just execute install.packages('patchSynctex')) is also recommended if you want to have your PDF viewer synced with your LaTeX code.

With Knitr installed, it’s time to fiddle with Latexmk. However, there’s a caveat: here we can’t use the dependency system of Latexmk. You write a .Rnw document (Knitr doesn’t actually care about the extension, but this is the usual one) and then Knitr ‘weaves’ that document, executing the R code and outputting a .tex document ready to compile. This does not go well with Latexmk, so we have to do a little trick. This is the code you should put in your .latexmkrc file:

@default_files = ("*.Rnw");
$pdf_mode = 1;
$graphicx_opts = "final";
$color_opts = "usenames,dvipsnames";
$pdflatex = '([[ ! "%T" =~ \.Rnw$ ]] || Rscript -e "'.
'library(knitr);' .
'opts_knit\$set(latex.options.graphicx = \"' .
$graphicx_opts .
'\", latex.options.color = \"' .
$color_opts .
'\", concordance = TRUE); ' .
'knit(\'%B.Rnw\');" ' .
') && pdflatex -shell-escape %O %B ' .
'&& ([[ ! "%T" =~ \.Rnw$ ]] || Rscript -e "' .
'library(patchSynctex); ' .
'patchSynctex(\'%B\', verbose = TRUE);")';

Some explanation: the $pdflatex variable is just a shell command that Latexmk will execute to build the PDF file. I have modified it so that it first weaves the document with Knitr if its extension is .Rnw. Then it builds the document normally with pdflatex and, finally, patchSynctex adjusts the SyncTeX files so you can reverse search from your PDF into your .Rnw file. The nice part of this snippet is that, if the file is just a regular .tex, it is built normally without Knitr, so you don’t have to worry about having Knitr and non-Knitr projects on your computer.

The other variables you should check are $graphicx_opts and $color_opts. These control the options for the graphicx and color packages: Knitr includes them just after the \documentclass declaration and LaTeX can fail with an ‘option clash’ error if you have different options configured in your preamble for these packages.

Some caveats of using Knitr: first, the SyncTeX patching is sometimes not totally correct, so your reverse search can be screwed up. I’ve not found a way to solve this. The second caveat is that Latexmk doesn’t rebuild your document when the data files you’re reading change.

Pgfplotstable, simple tables from data files

Final tool for your dynamic documents in LaTeX: Pgfplotstable. Just include \usepackage{pgfplotstable} in your preamble (and \usepackage{booktabs} if you want nicer separators for your tables). Then, a chunk of code like this one

\pgfplotstabletypeset[
    columns/0/.style={column name={Column A}},
    columns/1/.style={column name={Column B}},
    every head row/.style={
        before row=\toprule, % booktabs rules
        after row=\midrule
    },
    every last row/.style={
        after row=\bottomrule
    }
]{datafile.dat} % placeholder name: use your own data file here

will create a table in LaTeX automatically from your data file. And Latexmk will detect the dependency too. Nice, isn’t it?
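If you want to try this in isolation, the following is a minimal sketch of a complete test document. The data file sample.dat, its values and the column titles are all made up for illustration; the file is written out by LaTeX itself through filecontents* so that nothing else is needed to compile it.

```latex
% Hypothetical data file, written by LaTeX itself so this test document
% is self-contained. It is placed before \documentclass so that older
% LaTeX releases accept it too.
\begin{filecontents*}{sample.dat}
1 0.52
2 0.61
3 0.58
\end{filecontents*}

\documentclass{article}
\usepackage{booktabs}      % \toprule, \midrule, \bottomrule
\usepackage{pgfplotstable}

\begin{document}

% The file has no header row, so columns are addressed by index: 0, 1.
\pgfplotstabletypeset[
  columns/0/.style={column name={Run}},
  columns/1/.style={column name={Bias}},
  every head row/.style={before row=\toprule, after row=\midrule},
  every last row/.style={after row=\bottomrule}
]{sample.dat}

\end{document}
```

When the first line of a data file contains only numbers, pgfplotstable treats it as data and names the columns by their index, which is why the columns/0 and columns/1 keys work here.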

These are just some basic ideas for the use of these packages. Reading the documentation is recommended (and using TeX.stackexchange.com too, as always), because you can learn a lot of things that these libraries can do and which I didn’t even mention here. And if you have any question/doubt/encouragement/whatever, you can just drop a comment below.