How to extract hocr file from PDF

pdf

I'm creating an OCR-ed PDF through tesseract:

tesseract input.tif out pdf

But I also need the hocr and txt files.
Recent versions of tesseract already solve this, but upgrading would mean compiling both leptonica and tesseract myself, and I'm not entirely comfortable with that.

I can use pdftotext to extract the text file, but I can't find a way to extract the hocr from the PDF.

Best Answer

You can run the following command to create both the PDF and the hocr file in a single run, so there is no need to extract the hocr from the PDF afterwards.

tesseract input.tif out pdf hocr
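
To cover the txt file as well, the command above can be combined with the pdftotext step from the question. The following is only a sketch: ocr_all is a hypothetical helper name, and it assumes your tesseract build accepts several output configs in one call and that pdftotext is on your PATH. The hocr file is expected as out.hocr, though older tesseract releases may use a different extension.

```shell
# Hypothetical helper: one OCR pass for PDF + hocr, then text from the PDF.
# Assumes tesseract accepts multiple output configs and pdftotext is installed.
ocr_all() {
    input=$1   # source image, e.g. input.tif
    base=$2    # output base name, e.g. out

    # Single tesseract run that writes "$base.pdf" and the hocr file
    tesseract "$input" "$base" pdf hocr || return 1

    # Pull the text layer out of the searchable PDF
    pdftotext "$base.pdf" "$base.txt"
}

# Usage: ocr_all input.tif out
```

Running OCR once and deriving the text from the resulting PDF keeps all three files consistent with the same recognition pass.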