
Wenxuan

CyberSecurity Researcher at Northwestern University

Experience Using Several Plugins in Complex LaTeX Projects

Whexy / May 06, 2021

This article summarizes some of the trickier LaTeX problems I ran into while writing papers. It covers tips for using latexdiff in multi-file projects and how to resolve latexindent's dependency problems.

Preparation

The Perl that ships with macOS 11 lacks important header files, and Apple has deprecated the scripting-language runtimes bundled with macOS, so many LaTeX-related Perl dependencies cannot be installed against it. Before configuring anything else, install a complete Perl environment.

brew install perl
brew link --overwrite perl

Revision Tool latexdiff

When a paper is sent back for revision, the latexdiff tool can generate a PDF in which the changes are marked up, which makes the reviewers' job easier.

latexdiff is a Perl script that ships with TeX distributions such as TeX Live. If LaTeX was installed through brew, it is already on the PATH. Basic usage is very simple:

latexdiff a.tex b.tex > difference.tex

Multi-file Processing

Papers usually consist of more than one file, and using latexdiff in a multi-file project is troublesome. latexdiff's built-in --flatten option can inline the included files, but if the project uses BibTeX to manage references, the flattened result reports errors.

The blog post "Multiple-file LaTeX diff" provides a Python script, flatten.py, that flattens multiple LaTeX files into one.

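#!/usr/bin/env python3
"""Inline every \\input{...} file and print the flattened document to stdout."""
import os
import re
import sys

# Raw string with escaped braces: '\\input{(.*)}' is rejected by Python 3.7+
# as a bad escape (\i), and the greedy .* could run past the closing brace.
inputPattern = re.compile(r'\\input\{([^}]*)\}')


def flattenLatex(rootFilename):
    # Included files are resolved relative to the directory of the including file.
    dirpath, filename = os.path.split(rootFilename)
    with open(rootFilename, 'r') as fh:
        for line in fh:
            match = inputPattern.search(line)
            if match:
                newFile = match.group(1)
                # \input{intro} and \input{intro.tex} name the same file.
                if not newFile.endswith('.tex'):
                    newFile += '.tex'
                flattenLatex(os.path.join(dirpath, newFile))
            else:
                sys.stdout.write(line)


if __name__ == "__main__":
    flattenLatex(sys.argv[1])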

For example, with the following file directory:

$ tree | grep -e "\.tex"

├── main.tex
│   ├── abstract.tex
│   ├── appendix.tex
│   ├── background.tex
│   ├── conclusion.tex
│   ├── design.tex
│   ├── discuss.tex
│   ├── evaluation.tex
│   ├── implementation.tex
│   ├── introduction.tex
│   └── relatedwork.tex

Running flatten.py main.tex > flatten_main.tex produces the flattened file. Check that this file still compiles before passing it to latexdiff.
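As a rough sketch of the same idea (this is not the script from the blog post), the variant below returns the flattened text as a string instead of printing it, and leaves commented-out \input lines alone. The name flatten_to_string and the demo file names are made up for illustration. Like flatten.py, it assumes each \input sits on its own line and resolves paths relative to the including file.

```python
import os
import re
import tempfile

# Matches \input{name}; the braces are escaped so the pattern is valid
# on current Python 3 releases.
INPUT_RE = re.compile(r'\\input\{([^}]+)\}')


def flatten_to_string(root_filename):
    """Return the text of root_filename with its \\input files inlined."""
    dirpath = os.path.dirname(root_filename)
    pieces = []
    with open(root_filename, 'r', encoding='utf-8') as fh:
        for line in fh:
            match = INPUT_RE.search(line)
            # A line that starts with % is a LaTeX comment: keep it verbatim.
            if match and not line.lstrip().startswith('%'):
                name = match.group(1).strip()
                if not name.endswith('.tex'):
                    name += '.tex'
                # Recurse so that nested \input files are inlined as well.
                pieces.append(flatten_to_string(os.path.join(dirpath, name)))
            else:
                pieces.append(line)
    return ''.join(pieces)


if __name__ == '__main__':
    # Tiny demo project in a temporary directory (hypothetical file names).
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, 'main.tex'), 'w', encoding='utf-8') as f:
            f.write('\\begin{document}\n\\input{intro}\n\\end{document}\n')
        with open(os.path.join(tmp, 'intro.tex'), 'w', encoding='utf-8') as f:
            f.write('Hello from intro.\n')
        print(flatten_to_string(os.path.join(tmp, 'main.tex')), end='')
```

Returning a string rather than writing to stdout makes it easier to check the result in a script before handing it to latexdiff.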

Ignoring Nested Contexts

The file that latexdiff generates usually cannot be compiled directly. latexdiff wraps changed text in its own markup macros, and many LaTeX commands and environments break when that markup is nested inside them: section, subsection, subsubsection, table, cite, and so on. You need to pass options to latexdiff so that it skips them.

The parameters I use are:

latexdiff old.tex new.tex --disable-citation-markup --exclude-textcmd="section,subsection,subsubsection" --config="PICTUREENV=(?:picture|DIFnomarkup|table)[\w\d*@]*"

These options disable citation markup, leave section titles unmarked, and drop the old version of each table instead of marking it up. After testing, this is the minimal set of options with which the diff compiles in my paper repository.
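If you would rather call latexdiff from a script, one possible sketch is to pass the same options as an argument list, which avoids shell quoting problems with the PICTUREENV regular expression. The function names latexdiff_command and write_diff are my own, and the option strings are copied from the command above.

```python
import subprocess

# Options copied from the latexdiff command above. As list items they
# reach latexdiff unchanged, with no shell quoting involved.
LATEXDIFF_OPTIONS = [
    '--disable-citation-markup',
    '--exclude-textcmd=section,subsection,subsubsection',
    r'--config=PICTUREENV=(?:picture|DIFnomarkup|table)[\w\d*@]*',
]


def latexdiff_command(old_tex, new_tex, options=LATEXDIFF_OPTIONS):
    """Build the argument list for one latexdiff run."""
    return ['latexdiff', *options, old_tex, new_tex]


def write_diff(old_tex, new_tex, out_tex):
    """Run latexdiff and write its stdout (the marked-up source) to out_tex."""
    with open(out_tex, 'w', encoding='utf-8') as out:
        subprocess.run(latexdiff_command(old_tex, new_tex),
                       stdout=out, check=True)
```

For example, write_diff('old.tex', 'new.tex', 'diff.tex') should be equivalent to running the command above with its output redirected to diff.tex, assuming latexdiff is on the PATH.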

Automation

When using latexdiff, it's best to keep the intermediate build files of every version. Below is my own Makefile, which automatically generates the comparison result DIFF.pdf.

TARGETS = main

LATEX	= xelatex
BIBTEX	= bibtex

all:    $(TARGETS) debug

$(TARGETS):
	$(LATEX) $@
	-$(BIBTEX) $@ > $(BIBTEX)_out.log
	$(LATEX) $@
	$(LATEX) $@
	$(LATEX) $@

debug:
	-grep Warning *.log

diff:
	rm -rf ./_diff ./_env
	# Checkout Old version
	mkdir _diff
	git archive 5dc07a9 | tar x -C ./_diff
	# Checkout Current version
	mkdir _env
	git archive master | tar x -C ./_env
	# Flatten Old version
	cp ./_diff/main.tex ./_diff/main.tex.bak
	./utils/flatten.py ./_diff/main.tex.bak > ./_diff/main.tex
	cd ./_diff && make
	# Flatten Current version
	cp ./_env/main.tex ./_env/main.tex.bak
	./utils/flatten.py ./_env/main.tex.bak > ./_env/main.tex
	cd ./_env && make
	# LaTeX Diff
	latexdiff _diff/main.tex _env/main.tex --disable-citation-markup --exclude-textcmd="section,subsection,subsubsection" --config="PICTUREENV=(?:picture|DIFnomarkup|table)[\w\d*@]*" > _env/diff.tex
	cp _env/diff.tex _env/main.tex
	cd ./_env && make
	cp ./_env/main.pdf ./DIFF.pdf
	# Cleanup
	rm -rf ./_diff ./_env

clean:
	rm -f images/*.aux images/*.log *.aux *.bbl *.blg *.log *.dvi *.bak *~ $(TARGETS:%=%.pdf)
	rm -f diff*

Source Code Formatting with latexindent

LaTeX source code tends to be messy, full of comments, macros, tables, images, and code. When several people collaborate, large differences in code style appear. latexindent is a tool for formatting LaTeX source code. It gives you clean and tidy source code, which improves both your mood and your efficiency.

latexindent is a Perl script. Use CPAN, the Perl package manager, to install its dependencies.

sudo cpan Log::Log4perl
sudo cpan Log::Dispatch
sudo cpan YAML::Tiny
sudo cpan File::HomeDir
sudo cpan Unicode::GCString
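To confirm that the modules are visible to the Perl on your PATH, a small check like the following may help. It is only a sketch: the helper names are hypothetical, and it relies on the fact that perl -MSome::Module -e 1 exits with a non-zero status when the module cannot be loaded.

```python
import subprocess

# The modules installed with cpan above.
REQUIRED_MODULES = [
    'Log::Log4perl',
    'Log::Dispatch',
    'YAML::Tiny',
    'File::HomeDir',
    'Unicode::GCString',
]


def module_loads(module, perl='perl', run=subprocess.run):
    """Return True if `perl -M<module> -e 1` exits with status 0."""
    result = run([perl, f'-M{module}', '-e', '1'], capture_output=True)
    return result.returncode == 0


def missing_modules(modules=REQUIRED_MODULES, perl='perl', run=subprocess.run):
    """Return the modules that the given perl executable cannot load."""
    return [m for m in modules if not module_loads(m, perl, run)]
```

Calling missing_modules() returns an empty list when everything is installed. Passing perl='/usr/bin/perl' instead checks the system Perl, which is a quick way to see which interpreter actually has the modules.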

After installation, latexindent works normally. The LaTeX Workshop extension for VS Code calls latexindent directly, so pressing the format-document shortcut is all it takes.

© LICENSED UNDER CC BY-NC-SA 4.0