I know that "How to convert from text to .pdf" is already well answered elsewhere; I am looking for something more specific:
Using Claws Mail and its RSSyl plug-in to read RSS feeds, I have collected a lot of text files, which I want to convert into .pdf files.
Problem: Every feed has its own folder, but the files inside are merely numbered (1, 2, …, 456). Every file contains a header, followed by the message's content:
Date: Tue, 5 Feb 2013 19:59:53 GMT
From: N/A
Subject: Civilized Discourse Construction Kit
X-RSSyl-URL: http://www.codinghorror.com/blog/2013/02/civilized-discourse-construction-kit.html
Message-ID: <http://www.codinghorror.com/blog/2013/02/civilized-discourse-construction-kit.html>
Content-Type: text/html; charset=UTF-8
<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<base href="http://www.codinghorror.com/blog/2013/02/civilized-discourse-construction-kit.html">
</head><body>
<p>URL: <a href="http://www.codinghorror.com/blog/2013/02/civilized-discourse-construction-kit.html">http://www.codinghorror.com/blog/2013/02/civilized-discourse-construction-kit.html</a></p>
<br>
<!-- RSSyl text start -->
Question: Is there a way to convert each file into a .pdf file and rename it, based upon the name given under Subject? Super-awesome would be converting and renaming this way:
"folder.name"_"date"_"file name"
with each piece of information taken from the header data. As there are a few hundred files, I am looking for a way to batch process them.
The files are HTML formatted, but without a .htm[l] suffix.
Best Answer
If you have a relatively simple file tree where you have only one level of directories, and where each directory contains a list of files but there are no sub directories, you should be able to do something like this (you can paste this directly into your terminal and hit Enter):
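A minimal sketch of such a loop, not necessarily the exact command the answer had in mind. It assumes a POSIX-style shell, GNU date for parsing the Date: header, and a wkhtmltopdf that accepts - to read HTML from standard input; the function names convert_one and convert_tree and the %m%d%y_%H:%M:%S date format are illustrative assumptions:

```shell
# convert_one (hypothetical helper): turn one RSSyl message file into
# <folder>/<folder>_<date>_<subject>.pdf next to the original.
convert_one() {
    local file dir subject header stamp
    file=$1
    dir=$(basename "$(dirname "$file")")
    # First Subject:/Date: line only; drop the field name and any carriage
    # return, and replace slashes, which cannot appear in a file name.
    subject=$(sed -n 's/^Subject: *//p' "$file" | head -n 1 | tr -d '\r' | tr '/' '-')
    header=$(sed -n 's/^Date: *//p' "$file" | head -n 1 | tr -d '\r')
    # Skip files that lack either field.
    [ -n "$subject" ] && [ -n "$header" ] || return 0
    # Assumes GNU date; the result is in the local timezone.
    stamp=$(date -d "$header" +%m%d%y_%H:%M:%S) || return 0
    # Feed only the HTML part (from the first <html line on) to wkhtmltopdf.
    sed -n '/<html/,$p' "$file" |
        wkhtmltopdf -q - "$(dirname "$file")/${dir}_${stamp}_${subject}.pdf"
}

# convert_tree (hypothetical helper): process every file exactly one
# directory level below the given top-level directory.
convert_tree() {
    local file
    for file in "${1:-.}"/*/*; do
        case $file in *.pdf) continue ;; esac
        [ -f "$file" ] || continue
        convert_one "$file"
    done
}
```

After pasting the two functions, running convert_tree . from the top-level directory would process every feed folder. The glob */* is used instead of parsing a file listing, so names with spaces or newlines reach convert_one unchanged.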
NOTES
This solution requires wkhtmltopdf. On Debian-based systems you can install it with: sudo apt-get install wkhtmltopdf
It assumes there are no files in the top-level directory and only the desired HTML files in all subdirectories.
It can deal with file and directory names that contain spaces, newlines and other unorthodox characters.
Given a file dir1/foo with the contents of the example you have posted, it will create a file called dir1/dir1_020513_20:59:53_Civilized Discourse Construction Kit10.pdf
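The time in that name (20:59:53) differs from the header's 19:59:53 GMT, presumably because the date was converted to the local timezone of the machine that ran the command. A minimal sketch of that step, assuming GNU date; header_stamp is a hypothetical helper name, and the %m%d%y_%H:%M:%S format is inferred from the example name rather than stated in the answer:

```shell
# header_stamp (hypothetical name): print the Date: header of a message file
# as MMDDYY_HH:MM:SS in the timezone selected by TZ (local time if TZ is unset).
header_stamp() {
    local header
    # First Date: line only; strip the field name and any carriage return.
    header=$(sed -n 's/^Date: *//p' "$1" | head -n 1 | tr -d '\r')
    [ -n "$header" ] || return 1
    # Assumes GNU date, which parses dates like "Tue, 5 Feb 2013 19:59:53 GMT".
    date -d "$header" +%m%d%y_%H:%M:%S
}
```

With TZ exported as UTC this prints 020513_19:59:53 for the example header; with a UTC+1 zone it prints 020513_20:59:53, matching the name above.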