Generic Preprocessor adds extra whitespace

compiling markdown whitespace

Following up on this article, I use GPP to extend the Markdown parser pandoc with some macros. Unfortunately, gpp seems to copy all whitespace from the macro file into the result.

For example, consider file test.md

% Title
% Raphael
% 2012

\lorem \ipsum

with test.gpp

\define{lorem}{Lorem}
\define{ipsum}{ipsum...}

Now, calling gpp -T --include test.gpp test.md yields

<empty line>
% Title
% Raphael
% 2012

Lorem ipsum...

This breaks the metadata extraction of pandoc. The extra linebreak is indeed the one between the definitions; if I use

\define{lorem}{Lorem}@@@
\define{ipsum}{ipsum...}

with the extra option +c "@@@" "\n", the empty line is gone. But this workaround is not only ugly, it also has two fatal flaws.

First, it treats @@@ as a comment indicator in the source file, too. As @@@ is not forbidden in Markdown, that can have unintended consequences when @@@ (or any other chosen delimiter) happens to occur in the source file.

Second, it does not cover whitespace at the beginning of lines, as caused by proper indentation. For example,

\define{lorem}{@@@
  \if{a == a}@@@
    ![some image](test.png)@@@
  \endif@@@
}@@@

will cause all such image tags to be indented by four spaces, causing pandoc to typeset them as code (as specified).

So, short of writing gpp files in one line, or introducing ugly line-end comments and not indenting, what can you do to prevent gpp from plastering superfluous whitespace all over the place?

Best Answer

Assuming all the junk is in the include file, and therefore before the start of the document, you could just post-process it:

test.gpp:

\define{lorem}{Lorem}
\define{ipsum}{ipsum...}
----- cut here ------

Then do:

gpp -T --include test.gpp test.md | sed '1,/----- cut here ------/d'

(Does gpp output to stdout? Otherwise just run sed on the output file.)
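
To try the sed step without having gpp installed, here is a minimal sketch. The function names cut_preamble and simulated_gpp_output are made up for illustration, and the simulated output is an assumption about what gpp emits (blank lines left over from the definitions, followed by the marker line copied through unchanged); it is not captured from a real gpp run.

```shell
#!/bin/sh
# Hypothetical helper: delete everything from line 1 up to and including
# the marker line. The marker text is the last line of test.gpp above.
cut_preamble() {
    sed '1,/^----- cut here ------$/d'
}

# Stand-in for gpp's output. Assumption: the macro definitions leave
# blank lines behind and the marker line passes through verbatim.
simulated_gpp_output() {
    printf '\n\n----- cut here ------\n%% Title\n%% Raphael\n%% 2012\n\nLorem ipsum...\n'
}

# Only the document itself should survive, starting with the title block.
simulated_gpp_output | cut_preamble
```

With a real installation, the pipeline would presumably be the one shown above, with the gpp call in place of simulated_gpp_output. Note that this only removes junk emitted before the marker; it does nothing about indentation that ends up inside expanded macro bodies.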
