2.1.5 Input and Output Files

If your renderer generates only one file, specifying the output filename is simple: pass the name with the --filename option. If the renderer generates multiple files, things get more complicated. The --filename option can also take multiple names, and it supports templates for building filenames.

Below is a list of all of the options that affect filename generation.

Characters that shouldn’t be used in a filename


Command-Line Options: --bad-filename-chars=string
Config File: [ files ] bad-chars
Default: : #$%^&*!~`"'=?/[]()|<>;\,.
specifies all characters that should not be allowed in a filename. These characters will be replaced by the value in --bad-filename-chars-sub.

String to use in place of invalid characters


Command-Line Options: --bad-filename-chars-sub=string
Config File: [ files ] bad-chars-sub
Default: -
specifies a string to use in place of invalid filename characters (specified by the --bad-filename-chars option).
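
The interaction between these two options can be sketched in Python. This is only an illustration: the function name sanitize is hypothetical, the bad-character set below is a short stand-in rather than the real default, and the text does not say whether the tool collapses runs of substituted characters.

```python
def sanitize(name: str, bad_chars: str = ' :/\\?*"<>|', sub: str = '-') -> str:
    """Replace every character listed in bad_chars with sub.

    bad_chars plays the role of --bad-filename-chars and sub the role of
    --bad-filename-chars-sub. The default set here is an assumption made
    for the example, not the tool's actual default.
    """
    return ''.join(sub if ch in bad_chars else ch for ch in name)


print(sanitize('What is LaTeX?'))
```

Each offending character is replaced one for one, so a name with several bad characters in a row would contain several substitution strings in a row under this sketch.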

Output Directory


Command-Line Options: --dir=directory or -d directory
Config File: [ files ] directory
Default: $jobname
specifies a directory name to use as the output directory.

Escaping characters higher than 7-bit


Command-Line Options: --escape-high-chars
Config File: [ files ] escape-high-chars
Default: False
some output types allow you to represent characters outside the 7-bit (ASCII) range with an alternate representation, which sidesteps file encoding issues. This option indicates that these alternate representations should be used.

Note: The renderer is responsible for doing the translation into the alternate format. This might not be supported by all output types.
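
As one possible illustration of an alternate representation, an HTML or XML renderer could emit numeric character references. The sketch below uses Python's standard xmlcharrefreplace error handler; the function name is hypothetical, and a real renderer may well choose a different scheme.

```python
def escape_high_chars(text: str) -> str:
    """Replace characters outside the 7-bit ASCII range with numeric
    character references such as &#233; (suitable for HTML/XML only)."""
    return text.encode('ascii', 'xmlcharrefreplace').decode('ascii')


print(escape_high_chars('café'))  # caf&#233;
```

The result contains only ASCII characters, so it reads the same regardless of the encoding the file is saved in.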

Template to use for output filenames


Command-Line Options: --filename=string
Config File: [ files ] filename
specifies the templates to use for generating filenames. The filename template is a list of space-separated names. Each name in the list is returned once. An example is shown below.

index.html toc.html file1.html file2.html

If you don’t know how many files you are going to be producing, using static filenames like those in the example above is not practical. For this reason, filenames can also contain variables as described in Python’s string Templates (e.g. $title, $id). These variables come from the namespace created in the renderer and include: $id, the ID (i.e. label) of the item; $title, the title of the item; and $jobname, the basename of the LaTeX file being processed. One special variable is $num. Its value is generated dynamically whenever a filename with $num is requested. Each time a filename with $num is successfully generated, the value of $num is incremented.
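
The behavior of $num can be sketched with Python's string.Template. The class name FilenameGenerator is hypothetical, and the starting value of 1 is an assumption; the text above only states that the counter advances after each successful use.

```python
from string import Template


class FilenameGenerator:
    """Hypothetical helper: fills templates and tracks the $num counter."""

    def __init__(self, start=1):
        # Assumption: the counter starts at 1; the text does not say.
        self.num = start

    def generate(self, template, namespace):
        values = dict(namespace, num=self.num)
        try:
            name = Template(template).substitute(values)
        except KeyError:
            # A variable such as $id or $title is missing: no filename.
            return None
        # Advance $num only if this template actually referenced it.
        uses_num = any(
            (m.group('named') or m.group('braced')) == 'num'
            for m in Template.pattern.finditer(template)
        )
        if uses_num:
            self.num += 1
        return name


gen = FilenameGenerator()
print(gen.generate('$id.html', {'id': 'intro'}))
print(gen.generate('sect$num.html', {}))
print(gen.generate('$title-$num.html', {}))  # no title: nothing generated
print(gen.generate('sect$num.html', {}))
```

In this sketch the failed third request does not consume a number, so the fourth request continues from where the second left off.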

The values of variables can also be modified by a format specified in parentheses after the variable. The format is simply an integer that specifies how wide a field to create for integers (zero-padded) or, for strings, how many space-separated words to limit the name to. The example below shows $num being padded to four places and $title being limited to five words.

sect$num(4).html $title(5).html
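
A minimal Python sketch of how such a format could be applied is shown below. The regular expression and the function name expand are assumptions made for illustration; they are not taken from the tool's source.

```python
import re

# Matches $name optionally followed by an integer in parentheses, e.g. $num(4).
_VAR = re.compile(r'\$(\w+)(?:\((\d+)\))?')


def expand(template, namespace):
    """Substitute variables, honoring the optional (N) format."""
    def repl(match):
        name, width = match.group(1), match.group(2)
        value = namespace[name]
        if width is None:
            return str(value)
        if isinstance(value, int):
            # Integers: zero-pad to a field N characters wide.
            return str(value).zfill(int(width))
        # Strings: keep only the first N space-separated words.
        return ' '.join(str(value).split()[:int(width)])
    return _VAR.sub(repl, template)


print(expand('sect$num(4).html', {'num': 7}))
print(expand('$title(5).html',
             {'title': 'A Very Long Section Title About Output Files'}))
```

Under these assumptions the first call yields sect0007.html and the second keeps only the first five words of the title.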

The list can also contain a wildcard filename, which should be specified last. Once a wildcard name is reached, it is used from that point on to generate all remaining filenames. A wildcard filename contains a comma-separated list of alternatives surrounded by square brackets ([ ]); each alternative stands in for that part of the filename. The alternatives are tried in order until a filename is successfully created (i.e. all variables resolve). For example, the specification below creates three alternatives.

$jobname_[$id, $title, sect$num(4)].html

The code above is expanded to the following possibilities.

$jobname_$id.html
$jobname_$title.html
$jobname_sect$num(4).html

The alternatives are attempted in order until one of them succeeds. For an alternative to succeed, all of the variables referenced in it must be populated. For example, the $id variable will not be populated unless the node has a \label macro pointing to it, and the $title variable will not be populated unless the node has a title associated with it (e.g. section, subsection, etc.). Generally, the last alternative should contain no variables except $num, so that it serves as a fail-safe.
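
The expansion and fallback described above can be sketched in Python as follows. Both function names are hypothetical, the sketch handles a single bracketed group only, and the resolution step uses plain string.Template substitution without the parenthesized format.

```python
import re
from string import Template


def expand_alternatives(spec):
    """Expand one [a, b, c] group into a list of candidate templates."""
    match = re.search(r'\[([^\]]*)\]', spec)
    if match is None:
        return [spec]
    prefix, suffix = spec[:match.start()], spec[match.end():]
    return [prefix + alt.strip() + suffix
            for alt in match.group(1).split(',')]


def first_resolvable(spec, namespace):
    """Return the first candidate whose variables are all populated."""
    for candidate in expand_alternatives(spec):
        try:
            return Template(candidate).substitute(namespace)
        except KeyError:
            continue  # a variable is missing; try the next alternative
    return None


for candidate in expand_alternatives('$jobname_[$id, $title, sect$num(4)].html'):
    print(candidate)

# A node with a title but no label falls through to the second alternative.
print(first_resolvable('out-[$id, $title, sect$num].html',
                       {'title': 'Intro', 'num': 3}))
```

The first loop prints the three candidates listed earlier; the final call skips the $id alternative because no ID is available and resolves the $title one instead.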

Input Encoding


Command-Line Options: --input-encoding=string
Config File: [ files ] input-encoding
Default: utf-8
specifies which encoding the LaTeX source file is in.

Output Encoding


Command-Line Options: --output-encoding=string
Config File: [ files ] output-encoding
Default: utf-8
specifies which encoding the output files should use. Note: This depends on the output format as well. While HTML and XML use encodings, a binary format like MS Word would not.
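
Conceptually, the two encoding options describe a decode step on input and an encode step on output. The Python sketch below illustrates that idea only; the function name transcode is hypothetical, and a real renderer does its processing between the two steps.

```python
def transcode(data: bytes, input_encoding: str = 'utf-8',
              output_encoding: str = 'utf-8') -> bytes:
    """Decode source bytes with the input encoding, then encode the
    resulting text with the output encoding."""
    text = data.decode(input_encoding)
    return text.encode(output_encoding)


latin1_source = 'café'.encode('latin-1')
print(transcode(latin1_source, input_encoding='latin-1'))
```

Here a source file stored as Latin-1 is read correctly only because the input encoding is named explicitly; with the default of utf-8 the decode step would fail on this input.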

Splitting document into multiple files


Command-Line Options: --split-level=integer
Config File: [ files ] split-level
Default: 2
specifies the highest section level that generates a new file. Each section in a LaTeX document has a number associated with its hierarchical level. These levels are -2 for the document, -1 for parts, 0 for chapters, 1 for sections, 2 for subsections, 3 for subsubsections, 4 for paragraphs, and 5 for subparagraphs. A new file will be generated for every section in the hierarchy with a value less than or equal to the value of this option. This means that for the value of 2, files will be generated for the document, parts, chapters, sections, and subsections.
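
The rule can be expressed as a single comparison. In the Python sketch below, the level numbers are the ones listed above, while the dictionary and function names are hypothetical.

```python
# Hierarchical levels as listed in the text above.
LEVELS = {
    'document': -2, 'part': -1, 'chapter': 0, 'section': 1,
    'subsection': 2, 'subsubsection': 3, 'paragraph': 4,
    'subparagraph': 5,
}


def starts_new_file(kind: str, split_level: int = 2) -> bool:
    """True if a sectioning unit of this kind gets its own output file."""
    return LEVELS[kind] <= split_level


for kind in LEVELS:
    print(kind, starts_new_file(kind))
```

With the default of 2, everything from the document down to subsections gets its own file, and subsubsections and below stay inside their parent's file.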