Convert Doc/Docx to PDF on Headless Server Without OpenOffice

command line, conversion, file format, pdf, server

On a production web server I have to produce letters based on a template I got in MS-Word binary format. I use PHP and for the search and replace task I found PHPWord, which can handle Docx files, so I converted the template to OpenXML on my local workstation. Unfortunately the output also is Docx.

The goal is to produce a single PDF for the user to download so she can print out a bunch of letters at once very easily.

Now I need to find a way to either:

  • Search and replace text in a PDF file
  • Convert Docx to PDF without loss of formatting
  • Edit the original Doc template without loss of formatting and without using COM
  • Convert Docx to Doc without loss of formatting (which seems nearly impossible, since the template looks good in Word but the way its formatting is done is technically a big pile of…) so I could convert it using wvPDF

What I also don't want to use are web services. I'm aware of PHPLiveDocx, but I don't want to depend on an external service, for performance, availability and security reasons. Buying a piece of software isn't an option in this case either (I can't influence that).

Since this runs on a public-facing web server, I don't want to pull in OpenOffice.org, not even headless, as it would pull in around 160MB of compressed(!) binaries, and best practice is not to load binaries you don't really need on a public-facing server. Using oo.o is a last resort, and I want to make sure I have ruled out every other option first.

The host OS is CentOS 5.5.

Where can I go from here?


Best Answer

To my knowledge there is no application that can do this without some dependency on LibreOffice.

However, you don't need to install the whole office suite if you only perform command-line conversions.

You can check whether the tool unoconv meets your needs. It depends on python and python-uno. The latter will also pull in libreoffice-core as a dependency, but not the whole office suite.
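As a rough sketch of what such a command-line conversion could look like when driven from a script: the snippet below builds an unoconv call using its `-f` (output format) and `-o` (output location) options and runs it. The helper names `build_unoconv_command` and `convert_to_pdf` and the file paths are made up for illustration, and the assumption that the PDF ends up in the output directory under the input's base name should be checked against the unoconv version you install.

```python
import shutil
import subprocess
from pathlib import Path


def build_unoconv_command(docx_path, output_dir):
    """Build the argument list for converting one document to PDF.

    Hypothetical helper: -f selects the output format, -o the output location.
    """
    return ["unoconv", "-f", "pdf", "-o", str(output_dir), str(docx_path)]


def convert_to_pdf(docx_path, output_dir):
    """Run unoconv and return the path where the PDF is expected to appear."""
    if shutil.which("unoconv") is None:
        raise RuntimeError("unoconv was not found on PATH")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_unoconv_command(docx_path, output_dir), check=True)
    # Assumption: unoconv keeps the input's base name and swaps the extension.
    return Path(output_dir) / (Path(docx_path).stem + ".pdf")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces. From PHP the same command could be run with the usual process functions, with each argument escaped.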
