I haven't seen an answer that comes close to what Chaitanya wants. If you want to search by filename, a combination of locate, find, ls and grep could be sufficient. But I think Chaitanya wants to search for, say, 'all files created before 2011'. This can be done perfectly well with find, but I can imagine it will take a long time to search through 1TB (this depends more on the number of files than on the total size). To speed this up, I think indexing is inevitable. The problem with locate (which indexes with updatedb) is that it doesn't index creation time.
So what Chaitanya needs is something that indexes the needed attributes of files (file name, file size, creation date, more?), and then something that can search on these attributes. As far as I know, there is no out-of-the-box solution for this on Ubuntu.
An important comment from Chaitanya: "Now the thing is that I am designing a php based web gui...". Because your problem sounds quite specific, maybe you want to build something yourself. Some suggestions:
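As a rough sketch of the indexing idea, assuming Python and SQLite are acceptable for the backend: the snippet below walks a directory tree once, stores each file's name, size and timestamp in a small database, and then answers "before year X" queries from the index instead of the filesystem. The names `build_index`, `files_before` and the `files` table are made up for illustration. It also assumes modification time (`st_mtime`) as a stand-in for creation time, since a true creation time is not portably available on Linux.

```python
import datetime
import os
import sqlite3


def build_index(root, db_path="file_index.db"):
    """Walk `root` once and record path, name, size and mtime of every file."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, name TEXT, size INTEGER, mtime REAL)"
    )
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # ASSUMPTION: mtime is used in place of a real creation time.
            con.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                (full, name, st.st_size, st.st_mtime),
            )
    # An index on mtime keeps date-range queries fast on large tables.
    con.execute("CREATE INDEX IF NOT EXISTS idx_mtime ON files (mtime)")
    con.commit()
    return con


def files_before(con, year):
    """Return indexed paths whose mtime is before 1 January of `year` (local time)."""
    cutoff = datetime.datetime(year, 1, 1).timestamp()
    rows = con.execute(
        "SELECT path FROM files WHERE mtime < ? ORDER BY path", (cutoff,)
    )
    return [row[0] for row in rows]
```

The walk is still slow on a large disk, but it only has to run when the index is refreshed (for example from a cron job); a PHP front end could then query the same SQLite file directly.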
The only viable solution I've found so far:
- Upload the PDF file with the broken fields to https://www.pdfescape.com
- Fill in any field. Without at least one change to the fields, the uploaded file is returned as is rather than being recreated properly in step 3.
- Click on 'Save & Download PDF...' and download it onto your computer
The black form fields should now look normal. Whatever is wrong with them, rewriting the file completely with PDFescape seems to fix it.
Who knows what's wrong with them. Maybe Wizards of the Coast use some outdated Windows-only writer or something. Though it is strange that only the 'Character details' & 'Spellcasting sheet' pages are rendered wrongly...
Best Answer
You can convert the PDF to text and then apply grep on that text:
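A minimal sketch of that two-step approach, assuming the `pdftotext` command-line tool (from poppler-utils) is installed and on the PATH. The helper names `pdf_to_text` and `grep_text` are made up for illustration; the second one just does a grep-like, line-by-line regex search over the extracted text.

```python
import re
import subprocess


def pdf_to_text(pdf_path):
    """Convert a PDF to plain text by calling the pdftotext CLI.

    ASSUMPTION: pdftotext (poppler-utils) is installed.
    """
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],  # "-" writes the text to stdout
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the conversion fails
    )
    return result.stdout


def grep_text(text, pattern, ignore_case=False):
    """Return (line_number, line) pairs for every line matching `pattern`."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if regex.search(line)
    ]
```

Calling `grep_text(pdf_to_text("document.pdf"), "pattern")` would then list the matching lines, where `document.pdf` and `pattern` are placeholders for your own file and search term.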