Command-line-friendly full-text indexing

command line, search

Is there such a thing as a full-text indexing engine that can be queried from the command line and, ideally, wouldn't require using a GUI at all?

I'm especially interested in indexing my ebooks and papers, which are a mixture of PDF, EPUB and a few DjVu files. (Open)Office documents would be nice, but they're much lower on my list.

Best Answer

Have you looked at Lucene or Sphinx? You will need to parse the documents you want to index first, but once that's done, either one can be searched from the CLI.
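To give an idea of what that parsing step might look like, here is a minimal sketch for EPUB only, using nothing but the Python standard library. It assumes an EPUB is a ZIP archive whose content lives in (X)HTML files and simply strips the markup. The names epub_to_text and html_to_text are made up for this example, and files are read in archive-name order rather than reading order, which should not matter for indexing. PDF and DjVu would need an external text extractor and are not covered here.

```python
import zipfile
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(markup):
    """Strip tags from one (X)HTML document and return its text."""
    collector = _TextCollector()
    collector.feed(markup)
    collector.close()  # flush any text still buffered by the parser
    return " ".join(collector.parts)


def epub_to_text(source):
    """Return the plain text of an EPUB given a path or file-like object.

    Assumption: every .xhtml/.html/.htm member of the archive is content
    worth indexing; the OPF spine (reading order) is ignored.
    """
    chunks = []
    with zipfile.ZipFile(source) as archive:
        for name in sorted(archive.namelist()):
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                markup = archive.read(name).decode("utf-8", errors="replace")
                chunks.append(html_to_text(markup))
    return " ".join(chunk for chunk in chunks if chunk)
```

Calling epub_to_text("book.epub") (a hypothetical file name) would return one long string that can then be handed to whichever indexer you choose.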

For Lucene, there is some information available on how to do this.

The Sphinx documentation is a bit more vague, but some is available as well. You can pass structured XML data of your choice to Sphinx via the xmlpipe2 data source.
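As a rough illustration of the xmlpipe2 route, the sketch below renders already-extracted text as an xmlpipe2 document stream. The function name build_xmlpipe2 and the two field names (title, content) are my own choices for the example, not anything Sphinx prescribes; the sphinx:docset, sphinx:schema, sphinx:field and sphinx:document elements are the ones the xmlpipe2 format uses. Document IDs are assumed to be unique positive integers.

```python
from xml.sax.saxutils import escape


def build_xmlpipe2(docs):
    """Render (doc_id, title, content) tuples as an xmlpipe2 docset string.

    Assumption: the index has exactly two full-text fields, "title" and
    "content"; adjust the schema to match your own setup.
    """
    lines = [
        '<?xml version="1.0" encoding="utf-8"?>',
        "<sphinx:docset>",
        "<sphinx:schema>",
        '<sphinx:field name="title"/>',
        '<sphinx:field name="content"/>',
        "</sphinx:schema>",
    ]
    for doc_id, title, content in docs:
        # int() rejects non-numeric IDs before they reach the indexer
        lines.append(f'<sphinx:document id="{int(doc_id)}">')
        # escape() protects &, < and > so the stream stays well-formed XML
        lines.append(f"<title>{escape(title)}</title>")
        lines.append(f"<content>{escape(content)}</content>")
        lines.append("</sphinx:document>")
    lines.append("</sphinx:docset>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample data; in practice the text comes from your parser.
    print(build_xmlpipe2([(1, "Sample paper", "Extracted body text")]), end="")
```

A script like this would typically be named in the xmlpipe_command setting of a source with type = xmlpipe2 in sphinx.conf, so that the indexer reads the documents from the script's standard output.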

Lucene relies on Java, while Sphinx is written in C++ and needs no external dependencies.

Either one is going to require a bit of work to do what you want, but it seems like a totally workable solution.
