Short answer:
It's entirely possible that the cache is not comprehensive. If you delete mail and hcache later recomputes the header cache for that mailbox, your stats will no longer include the mail that was deleted.
If you don't have access to the mail logs for your server, do you have access to a filter mechanism, e.g. procmail? You could use that to generate an alternative log for analysis.
Otherwise, can you poll your mailbox with a program that can generate a log of mail received? Something like an offlineimap filter, or fetchmail/retchmail combined with some hashing and caching.
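As a rough illustration of the polling idea, here is a minimal sketch that could be run from cron. It assumes your mail ends up in a local Maildir (for example via offlineimap or fetchmail); the function name and the log path are hypothetical. It hashes each message's header block and appends one log line per message it has not seen before, on the assumption that the header block is not rewritten when a message moves from new/ to cur/.

```shell
#!/bin/sh
# Sketch only: log each message once, keyed by a checksum of its headers.
# Arguments are assumptions: $1 = Maildir path, $2 = log file to append to.
log_new_mail() {
    maildir=$1
    log=$2
    touch "$log"
    for f in "$maildir"/new/* "$maildir"/cur/*; do
        [ -f "$f" ] || continue
        # Hash the header block (everything up to the first blank line) so
        # the key survives the rename from new/ to cur/.
        id=$(sed '/^$/q' "$f" | cksum | awk '{print $1 "-" $2}')
        # Already logged on an earlier run: skip it.
        grep -q "^$id " "$log" && continue
        from=$(sed -n -e '/^$/q' -e 's/^From: *//p' "$f" | head -n 1)
        printf '%s %s %s\n' "$id" "$(date +%Y-%m-%dT%H:%M:%S%z)" "$from" >> "$log"
    done
}
```

Calling `log_new_mail "$HOME/Maildir" "$HOME/.mail-received.log"` periodically would then build up a log that survives later deletions from the mailbox.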
Longer answer:
The cache file is a DBM-style database. Depending on the exact build options for your mutt, it could be one of QDBM, Tokyo Cabinet, GDBM or Berkeley DB (BDB), all of which implement a variation of BDB's API.
I believe it is unlikely that you can reliably read the DB unless you use the right library implementation. ldd tells me my local mutt uses the Tokyo Cabinet implementation:
$ ldd /usr/bin/mutt
…
libtokyocabinet.so.8 => /usr/lib/libtokyocabinet.so.8 (0xb74f2000)
…
You would then need to write a program, using that library, to query the database stored within the cache file. There are bindings for Perl, Ruby, Lua, Java, and of course C.
It would appear that headers are stored as values in the DB, indexed by a CRC. From what I can tell, the CRC is derived from the path to a mailbox, which implies that the stored headers are the headers for all mail in that mailbox. So your program is essentially going to end up with a buffer containing all headers for all mail in a given mailbox. I don't think it will be much more useful than pulling the headers from all mail currently in your mailbox (and given the "short answer" above, not guaranteed to be more reliable).
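For comparison, pulling headers straight from the mail currently in the mailbox needs no DBM library at all. The sketch below assumes a Maildir layout and tallies the From: header per sender; the function name is hypothetical, and an mbox file would need a different way of splitting messages.

```shell
#!/bin/sh
# Sketch only: count messages per From: header in a Maildir ($1).
count_senders() {
    maildir=$1
    for f in "$maildir"/new/* "$maildir"/cur/*; do
        [ -f "$f" ] || continue
        # Print the first From: header and stop at the end of the headers.
        sed -n -e '/^$/q' -e 's/^From: *//p' "$f" | head -n 1
    done | sort | uniq -c | sort -rn
}
```

The output is one line per sender, most frequent first, which is about as much as the header cache could have told you.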
In your .muttrc add the following line:
set display_filter="exec sed -r \"s/^Date:\\s*(([F-Wa-u]{3},\\s*)?[[:digit:]]{1,2}\\s+[A-Sa-y]{3}\\s+[[:digit:]]{4}\\s+[[:digit:]]{1,2}:[[:digit:]]{1,2}(:[[:digit:]]{1,2})?\\s+[+-][[:digit:]]{4})/date +'Date: %a, %d %b %Y %H:%M:%S %z' -d '\\1'/e\""
This will change the Date: header in the message (for display only) to your local timezone if the header contains a valid RFC-formatted date. If the date format is incorrect (we are dealing with untrusted user input, after all), the header is left unchanged. To guard against attempts to inject shell code through the header, the sed pattern implements a whitelist based on RFC 5322 (the RFC that defines the format of the Date: field).
Note that mutt limits the command line to no more than 255 characters, so I shortened the original sed command, which had a stricter whitelist, to fit into 255 bytes. If you plan to do other things with the message, the full sed command, which you can put in a script, is:
sed -r "s/^Date:\s*(((Mon|Tue|Wed|Thu|Fri|Sat|Sun),\s*)?[[:digit:]]{1,2}\s+(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+[[:digit:]]{4}\s+[[:digit:]]{1,2}:[[:digit:]]{1,2}(:[[:digit:]]{1,2})?\s+[+-][[:digit:]]{4})/date +'Date: %a, %d %b %Y %H:%M:%S %z' -d '\1'/e"
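To check the filter outside mutt, the full command can be wrapped in a small script and fed a sample message. This is only a sketch: the function name is hypothetical, and it assumes GNU sed (for -r and the e flag) and GNU date (for -d). The second call shows the whitelist at work: a Date: line that does not match the pattern is passed through untouched and nothing is executed.

```shell
#!/bin/sh
# Sketch only: the full sed command from above wrapped as a stdin filter.
localdate_filter() {
    sed -r "s/^Date:\s*(((Mon|Tue|Wed|Thu|Fri|Sat|Sun),\s*)?[[:digit:]]{1,2}\s+(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+[[:digit:]]{4}\s+[[:digit:]]{1,2}:[[:digit:]]{1,2}(:[[:digit:]]{1,2})?\s+[+-][[:digit:]]{4})/date +'Date: %a, %d %b %Y %H:%M:%S %z' -d '\1'/e"
}

# A valid date is rewritten. With TZ=UTC and the C locale the first line
# should read: Date: Mon, 02 Jan 2006 22:04:05 +0000
printf 'Date: Mon, 02 Jan 2006 15:04:05 -0700\nSubject: test\n\nbody\n' |
    (export TZ=UTC LC_ALL=C; localdate_filter)

# A header that fails the whitelist is printed as-is, not executed.
printf 'Date: `id`\n' | localdate_filter
```

Saved as an executable script (say ~/bin/mutt-localdate, a hypothetical path, with the function body as its only command), it could be referenced from display_filter instead of the inline command.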
Best Answer
Munging the subject is possible with subjectrx, available since mutt 1.8.0. See the Mutt manual, 12. Display Munging.
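As an illustration, a .muttrc line along the following lines should strip a bracketed mailing-list tag from displayed subjects. It is modelled on the example in that manual section as I remember it, so check it against the manual for your version: the first argument is a regular expression, and in the replacement %L and %R stand for the text to the left and right of the match.

```
# Sketch: show "Subject: [mutt-users] This is a test" as "This is a test".
subjectrx '\[[^]]*\]:? *' '%L%R'
```

This only affects how the subject is displayed in mutt; the stored message is not modified.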