How to Print UTF-8 (Including Chinese) Text

postscript, printing, text processing, unicode

I am trying to print a large quantity (several megabytes) of UTF-8 encoded text which consists of Chinese and Latin characters (and maybe a sprinkling of others). I would like to print it in several columns per page, in a very small, condensed font, preferably with control over line spacing. I'd quite like inter-column lines, but I can live without them. The aim is to print to PDF for transfer, as well as to paper.

I have tried enscript and a2ps, but neither of these supports Unicode.

I have also tried paps, but this produces bitmapped output, which cannot be converted to PDF effectively and also looks terrible.

Is there a modern way to print UTF-8 text like this without resorting to something like constructing it manually in Python?

Best Answer

Cedilla is a text-to-PostScript converter, similar to enscript and a2ps, with good Unicode support but far fewer configuration options. I don't think Cedilla can do multi-column output.

If you want fine control over the formatting, you can use LaTeX. LaTeX's support for going beyond 8 bits is a bit problematic, but tools now exist to typeset Chinese fairly painlessly. Here's some untested code, inspired by "How does one type Chinese in LaTeX?" and "Include data from a .txt" on our sister site about TeX. You can customize the appearance of the text by changing the options passed to \VerbatimInput from the fancyvrb package.

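# Write a LaTeX driver that typesets stuff.txt verbatim in two columns.
cat <<'EOF' >driver.tex
\documentclass[UTF8]{ctexart}
\usepackage{multicol}
\usepackage{fancyvrb}
% Draw a thin rule between the columns.
\setlength\columnseprule{.5pt}
\begin{document}
\begin{multicols}{2}
% Read the UTF-8 text file as-is; extra fancyvrb options go in the brackets.
\VerbatimInput[fontfamily=cmr]{stuff.txt}
\end{multicols}
\end{document}
EOF
# Compile to driver.pdf.
pdflatex driver.tex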
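
To get closer to the small, tightly spaced, multi-column layout asked for in the question, a variant along these lines might work. It is equally untested. The file name driver-small.tex, the three-column count, the 1cm margin, and the font size and line-spacing values are all assumptions to tune, not recommended settings. fontsize and baselinestretch are fancyvrb options; the geometry package is an addition that the driver above does not use.

```shell
# Hypothetical variant of the driver: smaller type, tighter lines, three columns.
# All numeric values below are guesses to adjust for your font and paper size.
cat <<'EOF' >driver-small.tex
\documentclass[UTF8]{ctexart}
% Narrow page margins (assumed value).
\usepackage[margin=1cm]{geometry}
\usepackage{multicol}
\usepackage{fancyvrb}
% Gap between columns and thickness of the inter-column rule.
\setlength\columnsep{12pt}
\setlength\columnseprule{.5pt}
\begin{document}
\begin{multicols}{3}
% fontsize shrinks the text; baselinestretch below 1 tightens line spacing.
\VerbatimInput[fontfamily=cmr,fontsize=\tiny,baselinestretch=0.9]{stuff.txt}
\end{multicols}
\end{document}
EOF
```

Compile it the same way as driver.tex. Note that fancyvrb does not wrap lines, so long input lines may overflow a narrow column; fewer columns or pre-wrapped input may be needed.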