
Spell Check Your LaTeX Writings Using GNU Aspell

  • Moscow, Russia

Do you use LaTeX for your academic and technical writing? You don’t? Well, you should! It’s the only professional instrument for making properly formatted PDF documents. MS Word and Apple Pages are for non-tech people, while LaTeX is serious. It’s perfect in so many ways, thanks to Donald Knuth (the creator of TeX) and Leslie Lamport (the author of LaTeX), but it lacks one very convenient feature: spell checking. The only solution I’ve found so far that works perfectly for my documents is GNU aspell.

Zero 2 (2010) by Emilis Velyvis

GNU aspell is a command line tool that takes LaTeX source code (indeed, it is code, not “text”) as input and prints a list of the spelling errors it finds. The beauty of it is that it checks only the text, ignoring TeX commands. For example, this is a LaTeX document:

\documentclass{article}
\begin{document}
Hello, \textbf{Yegor}!
\end{document}

If we feed this text to some other spell checker (or to GNU aspell without the option --mode=tex), the word textbf would be reported as a spelling mistake; aspell in TeX mode, however, understands it as a LaTeX command and ignores it. Moreover, aspell can accept the word Yegor, even though it’s not an English word, if the word is listed in a custom dictionary supplied through aspell's --personal option (texsc, described below, exposes it as --pws).
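
The comparison above can be tried directly from the command line. The following is only a sketch: it assumes that aspell and its English dictionary are installed, and the file name article.tex is chosen just for illustration. It saves the sample document and runs aspell's list command (which reads standard input and prints the words it considers misspelled) twice, once in TeX mode and once with filtering turned off:

```shell
# Save the sample document from above (the file name is an arbitrary choice).
cat > article.tex <<'EOF'
\documentclass{article}
\begin{document}
Hello, \textbf{Yegor}!
\end{document}
EOF

# "aspell list" reads stdin and prints suspected misspellings, one per line.
if command -v aspell >/dev/null 2>&1; then
  echo "with --mode=tex:"
  aspell --lang=en --mode=tex list < article.tex \
    || echo "aspell failed (is the English dictionary installed?)" >&2
  echo "with --mode=none:"
  aspell --lang=en --mode=none list < article.tex \
    || echo "aspell failed (is the English dictionary installed?)" >&2
else
  echo "aspell is not installed; skipping the check" >&2
fi
```

The second run should complain about TeX command names that the first run skips; the exact list of words depends on the installed dictionary.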

There are a few other useful features in aspell. Long story short, I decided to create a small wrapper around aspell to simplify its configuration: texsc (stands for “TeX Spell Checking”). It’s a command line tool that you install and then run, for example, like this (you can see how it’s configured in the Makefile of this paper):

$ texsc --pws aspell.en.pws --ignore=code,citep article.tex

There is a list of arguments you can supply to texsc:

  • --pws is the location of a custom dictionary, where each line is a word aspell is supposed to ignore. It’s important to have the first line equal to personal_ws-1.1 en 741 utf-8. Why? I don’t know. But if it contains something else, aspell will just silently ignore the file. Nice, huh?

  • --ignore (you may have many of them) lists the TeX commands whose arguments should be ignored. A good example is the \code{} command, whose argument is almost never an English word. You may also have commands with multiple arguments, in which case you say something like --ignore=code:op, and in the command \code[foo]{bar} both foo and bar will be ignored. The :op suffix means that an optional (o) argument is ignored and then a mandatory one (p). Something like :oppp would tell aspell to ignore one optional and then three mandatory arguments.

  • --min-word-length is the minimum length of a word to pay attention to. I use the number 3, which is also the default value. Shorter words (one or two characters) are not important and don’t need to be spell-checked.
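
To make the options above concrete, here is a minimal sketch that prepares a personal dictionary and runs the check. The file names and the sample text are illustrative only. The last command calls aspell directly with its own --personal and --add-tex-command options, which use the same o/p letters; whether texsc builds exactly this command internally is an assumption, not something stated here.

```shell
# Personal dictionary: the header line goes first, then one word per line.
printf '%s\n' 'personal_ws-1.1 en 741 utf-8' 'Yegor' > aspell.en.pws

# A tiny document that uses a \code command with an optional
# and a mandatory argument (illustrative content only).
cat > article.tex <<'EOF'
\documentclass{article}
\begin{document}
Hello, Yegor! Run \code[foo]{bar} to see the result.
\end{document}
EOF

# texsc with the options described above; --min-word-length is left
# at its default of 3.
if command -v texsc >/dev/null 2>&1; then
  texsc --pws aspell.en.pws --ignore=code:op article.tex || true
fi

# Roughly the same check with plain aspell (assumed equivalent).
# An absolute path is used for the personal dictionary to avoid any
# ambiguity about where aspell looks for it.
if command -v aspell >/dev/null 2>&1; then
  aspell --lang=en --mode=tex \
    --personal="$PWD/aspell.en.pws" \
    --add-tex-command='code op' \
    list < article.tex || true
fi
```

If the dictionary header is malformed, the word Yegor will show up in the output again, which is a quick way to notice that the file was ignored.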

I use texsc in all my LaTeX projects, usually as part of their build cycle, which I automate with GNU make. You can do the same, as it’s open source.
