One of our customers, who has a somewhat problematic web application, would like us to log all queries for a period of 24 hours.
I don't know how much data this will actually write.
The number of queries in any 24-hour period would be somewhere around 5 million.
I could either allow a couple of GB of ramdisk, or mount an NFS share located on an otherwise unused JBOD.
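For context, enabling the general log and pointing it at the chosen destination can be done at runtime; a minimal sketch (the file path is an assumption, not a real mount point here):

```sql
-- Point the general log at the ramdisk or NFS mount (path is hypothetical)
SET GLOBAL general_log_file = '/mnt/querylog/general.log';
SET GLOBAL log_output = 'FILE';
SET GLOBAL general_log = 'ON';
-- ...and 24 hours later:
SET GLOBAL general_log = 'OFF';
```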
Questions

- What would be the effect if the destination where the log is written were to fill up?
- Would the performance of mysqld be affected in any way (other than the general log write itself) if the (dedicated) NFS share starts to perform slowly due to heavy I/O?
Thanks in advance,
Best Answer
Instead of using the general log, how about going with a query profiler? In fact, you could do query profiling without using any of MySQL's log files and while the queries are still running.
You can use mk-query-digest or pt-query-digest and poll the processlist.
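For example, a single profiling run over the processlist might look like the following (the host DSN, output path, and durations are assumptions; `--processlist`, `--run-time`, `--iterations`, and `--limit` are real pt-query-digest options):

```
# Poll the processlist over one DB connection for ~20 minutes and
# report the 20 worst queries (host/user are placeholders).
pt-query-digest --processlist h=localhost,u=root \
    --run-time 1190s --iterations 1 --limit 20 \
    > /tmp/digest-$(date +%Y%m%d%H%M).txt
```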
I learned how to use mk-query-digest from this youtube video as a replacement for the slow log: http://www.youtube.com/watch?v=GXwg1fiUF68&feature=colike
Here is the script I wrote to run the query digest program:
Make sure
*/20 * * * * /root/QueryDigest/ExecQueryDigest.sh 1190s 144 10.64.95.141
is in the crontab so it runs every 20 minutes. (Each profile covers 20 minutes less 10 seconds, the last 144 copies are kept, and it only runs if a specific DBVIP is present; alter the script to bypass checking for DBVIPs.) The output is a file with the 20 worst-running queries, ranked by (number of times the query was called) × (average seconds per query).
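The "keep the last 144 copies" retention step could be sketched like this (this is a hypothetical illustration, not the author's actual wrapper script; the directory layout and file-name pattern are assumptions):

```shell
#!/bin/sh
# Hypothetical retention helper (names and paths are assumptions, not the
# author's actual script): delete all but the KEEP newest digest reports.
#   prune_old_reports DIR KEEP
prune_old_reports() {
    dir="$1"
    keep="$2"
    # List reports newest-first, skip the first $keep, remove the rest.
    ls -1t "$dir"/digest-*.txt 2>/dev/null | tail -n +"$((keep + 1))" | \
        while read -r f; do
            rm -f "$f"
        done
}

# e.g. keep the last 144 twenty-minute reports, as the cron job above does:
# prune_old_reports /root/QueryDigest/reports 144
```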
Here is sample output of the query-profiling summary from mk-query-digest:
Above this output are histograms of these 20 top worst-performing queries
Example of the first entry's histogram
There is no performance impact in doing this: a single DB connection is maintained to poll the processlist for the duration you specify, and hundreds of queries against the processlist are totally lightweight within the confines of a single DB connection. In light of this, you will not need an NFS share or any special hardware considerations for query performance and analysis.
Give it a Try !!!
UPDATE
mk-query-digest can use either the processlist (via live DB Connection) or tcpdump (via pipe). Here are the options:
--processlist
--tcpdump
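As a sketch of the tcpdump route (the interface and packet count are assumptions; the capture flags follow the usage shown in the pt-query-digest documentation):

```
# Capture MySQL traffic on port 3306, then feed the capture to the digester.
tcpdump -s 65535 -x -nn -q -tttt -i any -c 10000 port 3306 > mysql.tcp.txt
pt-query-digest --type tcpdump mysql.tcp.txt
```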
After 8.5 years, I decided to update this post with a version of my wrapper script using pt-query-digest instead of mk-query-digest: