Instead of using the general log, how about going with a query profiler? In fact, you can do query profiling without touching any of MySQL's log files, and while the queries are still running.
You could use mk-query-digest (or its successor, pt-query-digest) to poll the processlist.
I learned how to use mk-query-digest as a replacement for the slow log from this YouTube video: http://www.youtube.com/watch?v=GXwg1fiUF68&feature=colike
Here is the script I wrote to run the query digest program:
#!/bin/sh
RUNFILE=/tmp/QueriesAreBeingDigested.txt
if [ -f ${RUNFILE} ] ; then exit ; fi
touch ${RUNFILE}
MKDQ=/usr/local/sbin/mk-query-digest
RUNTIME=${1}
COPIES_TO_KEEP=${2}
DBVIP=${3}
WHICH=/usr/bin/which
DATE=`${WHICH} date`
ECHO=`${WHICH} echo`
HEAD=`${WHICH} head`
TAIL=`${WHICH} tail`
AWK=`${WHICH} awk`
SED=`${WHICH} sed`
CAT=`${WHICH} cat`
WC=`${WHICH} wc`
RM=`${WHICH} rm | ${TAIL} -1 | ${AWK} '{print $1}'`
LS=`${WHICH} ls | ${TAIL} -1 | ${AWK} '{print $1}'`
HAS_THE_DBVIP=`/sbin/ip addr show | grep "scope global secondary" | grep -c "${DBVIP}"`
if [ ${HAS_THE_DBVIP} -eq 1 ] ; then exit ; fi # do not run on the host holding the DBVIP
DT=`${DATE} +"%Y%m%d_%H%M%S"`
UNIQUETAG=`${ECHO} ${SSH_CLIENT}_${SSH_CONNECTION}_${DT} | ${SED} 's/\./ /g' | ${SED} 's/ //g'`
cd /root/QueryDigest
OUTFILE=QP_${DT}.txt
HOSTADDR=${DBVIP}
${MKDQ} --processlist h=${HOSTADDR},u=queryprofiler,p=queryprofiler --run-time=${RUNTIME} > ${OUTFILE}
#
# Rotate out Old Copies
#
QPFILES=QPFiles.txt
QPFILES2ZAP=QPFiles2Zap.txt
${LS} QP_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9].txt > ${QPFILES}
LINECOUNT=`${WC} -l < ${QPFILES}`
if [ ${LINECOUNT} -gt ${COPIES_TO_KEEP} ]
then
(( DIFF = LINECOUNT - COPIES_TO_KEEP ))
${HEAD} -${DIFF} < ${QPFILES} > ${QPFILES2ZAP}
for QPFILETOZAP in `${CAT} ${QPFILES2ZAP}`
do
${RM} ${QPFILETOZAP}
done
fi
rm -f ${QPFILES2ZAP}
rm -f ${QPFILES}
rm -f ${RUNFILE}
Make sure
- you have a user called queryprofiler whose password is queryprofiler and who has only the PROCESS privilege
- you put
*/20 * * * * /root/QueryDigest/ExecQueryDigest.sh 1190s 144 10.64.95.141
in the crontab to run every 20 minutes (each profile runs for 20 minutes less 10 seconds, the last 144 copies are kept, and the script exits if this host holds the specified DBVIP [alter the script if you want to bypass the DBVIP check])
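Creating that profiling user might look like this (a sketch; the '%' host wildcard is an assumption, so tighten it to your network):

```sql
-- profiling-only account: PROCESS is a global privilege, hence ON *.*
CREATE USER 'queryprofiler'@'%' IDENTIFIED BY 'queryprofiler';
GRANT PROCESS ON *.* TO 'queryprofiler'@'%';
```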
The output is a file listing the 20 worst-running queries, ranked by total response time (the number of times the query was called X the average seconds per call).
Here is sample output of the query profiling summary from mk-query-digest:
# Rank Query ID Response time Calls R/Call Item
# ==== ================== ================ ======= ========== ====
# 1 0x812D15015AD29D33 336.3867 68.5% 910 0.369656 SELECT mt_entry mt_placement mt_category
# 2 0x99E13015BFF1E75E 25.3594 5.2% 210 0.120759 SELECT mt_entry mt_objecttag
# 3 0x5E994008E9543B29 16.1608 3.3% 46 0.351321 SELECT schedule_occurrence schedule_eventschedule schedule_event schedule_eventtype schedule_event schedule_eventtype schedule_occurrence.start
# 4 0x84DD09F0FC444677 13.3070 2.7% 23 0.578567 SELECT mt_entry
# 5 0x377E0D0898266FDD 12.0870 2.5% 116 0.104199 SELECT polls_pollquestion mt_category
# 6 0x440EBDBCEDB88725 11.5159 2.3% 21 0.548376 SELECT mt_entry
# 7 0x1DC2DFD6B658021F 10.3653 2.1% 54 0.191949 SELECT mt_entry mt_placement mt_category
# 8 0x6C6318E56E149036 8.8294 1.8% 44 0.200667 SELECT schedule_occurrence schedule_eventschedule schedule_event schedule_eventtype schedule_event schedule_eventtype schedule_occurrence.start
# 9 0x392F6DA628C7FEBD 8.5243 1.7% 9 0.947143 SELECT mt_entry mt_objecttag
# 10 0x7DD2B294CFF96961 7.3753 1.5% 70 0.105362 SELECT polls_pollresponse
# 11 0x9B9092194D3910E6 5.8124 1.2% 57 0.101973 SELECT content_specialitem content_basecontentitem advertising_product organizations_neworg content_basecontentitem_item_attributes
# 12 0xA909BF76E7051792 5.6005 1.1% 55 0.101828 SELECT mt_entry mt_objecttag mt_tag
# 13 0xEBE07AC48DB8923E 5.5195 1.1% 54 0.102213 SELECT rssfeeds_contentfeeditem
# 14 0x3E52CF0261A7C3FF 4.4676 0.9% 44 0.101536 SELECT schedule_occurrence schedule_occurrence.start
# 15 0x9D0BCD3F6731195B 4.2804 0.9% 41 0.104401 SELECT mt_entry mt_placement mt_category
# 16 0x7961BD4C76277EB7 4.0143 0.8% 18 0.223014 INSERT UNION UPDATE UNION mt_session
# 17 0xD2F486BA41E7A623 3.1448 0.6% 21 0.149754 SELECT mt_entry mt_placement mt_category mt_objecttag mt_tag
# 18 0x3B9686D98BB8E054 2.9577 0.6% 11 0.268885 SELECT mt_entry mt_objecttag mt_tag
# 19 0xBB2443BF48638319 2.7239 0.6% 9 0.302660 SELECT rssfeeds_contentfeeditem
# 20 0x3D533D57D8B466CC 2.4209 0.5% 15 0.161391 SELECT mt_entry mt_placement mt_category
Above this summary in the output are detailed reports, including histograms, for these 20 worst-performing queries.
Here is the first entry's report:
# Query 1: 0.77 QPS, 0.28x concurrency, ID 0x812D15015AD29D33 at byte 0 __
# This item is included in the report because it matches --limit.
# pct total min max avg 95% stddev median
# Count 36 910
# Exec time 58 336s 101ms 2s 370ms 992ms 230ms 393ms
# Lock time 0 0 0 0 0 0 0 0
# Users 1 mt
# Hosts 905 10.64.95.74:54707 (2), 10.64.95.74:56133 (2), 10.64.95.80:33862 (2)... 901 more
# Databases 1 mt1
# Time range 1321642802 to 1321643988
# bytes 1 1.11M 1.22k 1.41k 1.25k 1.26k 25.66 1.20k
# id 36 9.87G 11.10M 11.11M 11.11M 10.76M 0.12 10.76M
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms ################################################################
# 1s ###
# 10s+
# Tables
# SHOW TABLE STATUS FROM `mt1` LIKE 'mt_entry'\G
# SHOW CREATE TABLE `mt1`.`mt_entry`\G
# SHOW TABLE STATUS FROM `mt1` LIKE 'mt_placement'\G
# SHOW CREATE TABLE `mt1`.`mt_placement`\G
# SHOW TABLE STATUS FROM `mt1` LIKE 'mt_category'\G
# SHOW CREATE TABLE `mt1`.`mt_category`\G
# EXPLAIN
SELECT `mt_entry`.`entry_id`, `mt_entry`.`entry_allow_comments`, `mt_entry`.`entry_allow_pings`, `mt_entry`.`entry_atom_id`, `mt_entry`.`entry_author_id`, `mt_entry`.`entry_authored_on`, `mt_entry`.`entry_basename`, `mt_entry`.`entry_blog_id`, `mt_entry`.`entry_category_id`, `mt_entry`.`entry_class`, `mt_entry`.`entry_comment_count`, `mt_entry`.`entry_convert_breaks`, `mt_entry`.`entry_created_by`, `mt_entry`.`entry_created_on`, `mt_entry`.`entry_excerpt`, `mt_entry`.`entry_keywords`, `mt_entry`.`entry_modified_by`, `mt_entry`.`entry_modified_on`, `mt_entry`.`entry_ping_count`, `mt_entry`.`entry_pinged_urls`, `mt_entry`.`entry_status`, `mt_entry`.`entry_tangent_cache`, `mt_entry`.`entry_template_id`, `mt_entry`.`entry_text`, `mt_entry`.`entry_text_more`, `mt_entry`.`entry_title`, `mt_entry`.`entry_to_ping_urls`, `mt_entry`.`entry_week_number` FROM `mt_entry` INNER JOIN `mt_placement` ON (`mt_entry`.`entry_id` = `mt_placement`.`placement_entry_id`) INNER JOIN `mt_category` ON (`mt_placement`.`placement_category_id` = `mt_category`.`category_id`) WHERE (`mt_entry`.`entry_status` = 2 AND `mt_category`.`category_basename` IN ('business_review' /*... omitted 3 items ...*/ ) AND NOT (`mt_entry`.`entry_id` IN (53441))) ORDER BY `mt_entry`.`entry_authored_on` DESC LIMIT 4\G
There is no meaningful performance impact in doing this: a single DB connection is maintained to poll the processlist for the duration you specify, and hundreds of queries against the processlist are totally lightweight within the confines of that one connection. In light of this, you will not need an NFS share or any special hardware for query performance analysis.
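To see why this is cheap, here is a toy illustration of the polling pattern the digest tool uses internally: one loop, one sample per interval. The get_processlist function below is only a stand-in for running SHOW FULL PROCESSLIST over the tool's single connection; everything else is just bookkeeping.

```shell
#!/bin/sh
# Toy processlist poller: illustrates the polling pattern only.
# get_processlist is a STAND-IN for:  mysql -e 'SHOW FULL PROCESSLIST'
get_processlist() {
  echo "42 appuser 10.0.0.5:33060 mydb Query 3 Sending data SELECT ..."
}

poll_processlist() {
  SAMPLES=${1}       # how many times to poll
  INTERVAL=${2}      # seconds between polls (mk-query-digest default: 0.1)
  COUNT=0
  while [ ${COUNT} -lt ${SAMPLES} ]
  do
    get_processlist > /dev/null   # a real profiler would parse and aggregate here
    COUNT=$(( COUNT + 1 ))
    sleep ${INTERVAL}
  done
  echo "collected ${COUNT} samples"
}

poll_processlist 5 0
```

The point is that the cost per sample is one round trip on an already-open connection, which is why the profiler barely registers on the server.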
Give it a Try !!!
UPDATE
mk-query-digest can use either the processlist (via a live DB connection) or tcpdump output (via a pipe). Here are the relevant options:
--processlist
type: DSN
Poll this DSN's processlist for queries, with "--interval" sleep between. If the connection fails, mk-query-digest tries to reopen it once per second. See also "--mirror".
--interval
type: float; default: .1
How frequently to poll the processlist, in seconds.
--type tcpdump
mk-query-digest does not actually watch the network (i.e. it does NOT "sniff packets"). Instead, it's just parsing the output of tcpdump. You are responsible for generating this output; mk-query-digest does not do it for you. Then you send this to mk-query-digest as you would any log file: as files on the command line or to STDIN.
The parser expects the input to be formatted with the following options: "-x -n -q -tttt". For example, if you want to capture output from your local machine, you can do something like
tcpdump -i eth0 port 3306 -s 65535 -c 1000 -x -n -q -tttt \
| mk-query-digest --type tcpdump
The other tcpdump parameters, such as -s, -c, and -i, are up to you. Just make sure the output looks like this:
2009-04-12 09:50:16.804849 IP 127.0.0.1.42167 > 127.0.0.1.3306: tcp 37
0x0000: 4508 0059 6eb2 4000 4006 cde2 7f00 0001
0x0010: ....
Remember tcpdump has a handy -c option to stop after it captures some number of packets! That's very useful for testing your tcpdump command.
After 8.5 years, I decided to update this post with a version of my wrapper script using pt-query-digest instead of mk-query-digest:
#!/bin/bash
#
# qdlive : Wrapper Around pt-query-digest
# BASE_FOLDER : Target Folder for This Wrapper's Output
# PT_QUERY_DIGEST : Absolute path to the pt-query-digest script
# MYCNF : Absolute path to my.cnf with User Credentials to run pt-query-digest and mysqladmin
# QD_FOLDER : Folder Where pt-query-digest Output are Stored
#
BASE_FOLDER=${HOME}/qdlive
MYCNF=${BASE_FOLDER}/my.cnf
ERRLOG=${BASE_FOLDER}/err.log
PT_QUERY_DIGEST=/usr/bin/pt-query-digest
QD_FOLDER=${BASE_FOLDER}/reports
mkdir -p ${QD_FOLDER}
cd ${BASE_FOLDER}
#
# 1st Parameter is the Number of Minutes for pt-query-digest to run
# 2nd Parameter is the Number of Copies of the pt-query-digest Output to Keep
#
[ "${1}" == "" ] && exit
[ "${2}" == "" ] && exit
[ ! "${3}" == "" ] && exit
OK1=`echo "${1}" | grep -c "^[1-9]$"`
OK2=`echo "${1}" | grep -c "^[1-9][0-9]$"`
OK3=`echo "${1}" | grep -c "^[1-9][0-9][0-9]$"`
(( OK = OK1 + OK2 + OK3 ))
if [ ${OK} -eq 0 ]
then
DT=`date +"%Y-%m-%d %H:%M:%S"`
echo "${DT} : Invalid Minutes (${1})" >> ${ERRLOG}
exit
fi
OK=1
[ ${1} -lt 2 ] && OK=0
[ ${1} -gt 60 ] && OK=0
if [ ${OK} -eq 0 ]
then
DT=`date +"%Y-%m-%d %H:%M:%S"`
echo "${DT} : Invalid Minutes (${1})" >> ${ERRLOG}
exit
fi
RUNTIME_MINUTES=${1}
(( RUNTIME_SECONDS = RUNTIME_MINUTES * 60 ))
OK1=`echo "${2}" | grep -c "^[1-9]$"`
OK2=`echo "${2}" | grep -c "^[1-9][0-9]$"`
OK3=`echo "${2}" | grep -c "^[1-9][0-9][0-9]$"`
(( OK = OK1 + OK2 + OK3 ))
if [ ${OK} -eq 0 ]
then
DT=`date +"%Y-%m-%d %H:%M:%S"`
echo "${DT} : Invalid Copies (${2})" >> ${ERRLOG}
exit
fi
OK=1
[ ${2} -lt 7 ] && OK=0
[ ${2} -gt 300 ] && OK=0
if [ ${OK} -eq 0 ]
then
DT=`date +"%Y-%m-%d %H:%M:%S"`
echo "${DT} : Invalid Copies (${2})" >> ${ERRLOG}
exit
fi
COPIES_TO_KEEP=${2}
#
# Parse my.cnf
#
# [mysql]
# host = ....
# user = ....
# password = ...
#
MYSQL_HOST=`grep ^host ${MYCNF} | awk '{print $3}'`
[ "${MYSQL_HOST}" == "" ] && MYSQL_HOST=localhost
MYSQL_USER=`grep ^user ${MYCNF} | awk '{print $3}'`
MYSQL_PASS=`grep ^password ${MYCNF} | awk '{print $3}'`
MYSQL_PORT=3306
MYSQL_AUTH="--defaults-file=${MYCNF}"
PTOOL_AUTH="-F ${MYCNF}"
#
# Make Sure mysqld is running
#
MYSQL_RUNNING=`mysqladmin ${MYSQL_AUTH} ping 2>&1 | grep -c "mysqld is alive"`
if [ ${MYSQL_RUNNING} -eq 0 ]
then
DT=`date +"%Y-%m-%d %H:%M:%S"`
echo "${DT} : mysqld Not Running" >> ${ERRLOG}
exit
fi
#
# Execute Query Digest
#
DT=`date +"%Y%m%d_%a_%H%M%S"`
QD_OUTPUT=${QD_FOLDER}/qd_${DT}_${RUNTIME_MINUTES}m.rpt
${PT_QUERY_DIGEST} --processlist h=${MYSQL_HOST} ${PTOOL_AUTH} --run-time=${RUNTIME_SECONDS}s >${QD_OUTPUT} 2>&1
#
# Rotate Old Reports
#
COPIES_ON_HAND=`ls ${QD_FOLDER} | grep -c "rpt$"`
if [ ${COPIES_ON_HAND} -gt ${COPIES_TO_KEEP} ]
then
(( COPIES_TO_ZAP = COPIES_ON_HAND - COPIES_TO_KEEP ))
for FIL in `ls ${QD_FOLDER}/*.rpt | head -${COPIES_TO_ZAP}`
do
rm -f ${FIL}
done
fi
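The rotation step depends only on the lexical sort order of the timestamped filenames, so it can be tested in isolation. Here is a standalone sketch of keep-the-newest-N rotation, exercised against throwaway .rpt files (the demo filenames are made up):

```shell
#!/bin/bash
# Keep only the newest N *.rpt files in a folder.
# Relies on the qd_YYYYMMDD_... name format sorting chronologically.
rotate_reports() {
  FOLDER=${1}
  KEEP=${2}
  ON_HAND=$(ls ${FOLDER} | grep -c "rpt$")
  [ ${ON_HAND} -le ${KEEP} ] && return 0
  ZAP=$(( ON_HAND - KEEP ))
  for FIL in $(ls ${FOLDER}/*.rpt | head -${ZAP})
  do
    rm -f ${FIL}
  done
}

# Demo against throwaway files
DEMO=$(mktemp -d)
for D in 01 02 03 04 05 ; do touch ${DEMO}/qd_202401${D}_Mon_000000_5m.rpt ; done
rotate_reports ${DEMO} 3
ls ${DEMO}     # the three newest reports remain
rm -rf ${DEMO}
```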
I have good news and bad news on this one.
GOOD NEWS
You could use the general log as a table you can query
Step 01) Add this to /etc/my.cnf
[mysqld]
general_log
log-output=TABLE
Step 02) service mysql restart
Now mysqld is recording every query in the table mysql.general_log.
Problem: look at the initial layout of mysql.general_log:
mysql> show create table mysql.general_log\G
*************************** 1. row ***************************
Table: general_log
Create Table: CREATE TABLE `general_log` (
`event_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`user_host` mediumtext NOT NULL,
`thread_id` int(11) NOT NULL,
`server_id` int(10) unsigned NOT NULL,
`command_type` varchar(64) NOT NULL,
`argument` mediumtext NOT NULL
) ENGINE=CSV DEFAULT CHARSET=utf8 COMMENT='General log'
1 row in set (0.09 sec)
What good is the general log as a CSV table?
Step 03) Make mysql.general_log a MyISAM table and index it:
SET @old_log_state = @@global.general_log;
SET GLOBAL general_log = 'OFF';
ALTER TABLE mysql.general_log ENGINE = MyISAM;
ALTER TABLE mysql.general_log ADD INDEX (event_time);
SET GLOBAL general_log = @old_log_state;
Now it looks like this:
mysql> show create table general_log\G
*************************** 1. row ***************************
Table: general_log
Create Table: CREATE TABLE `general_log` (
`event_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`user_host` mediumtext NOT NULL,
`thread_id` int(11) NOT NULL,
`server_id` int(10) unsigned NOT NULL,
`command_type` varchar(64) NOT NULL,
`argument` mediumtext NOT NULL,
KEY `event_time` (`event_time`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='General log'
1 row in set (0.00 sec)
mysql>
The user and host values are appended together in the user_host field.
How do you rotate out the general log?
Here is an example of how to blank out mysql.general_log:
SET @old_log_state = @@global.general_log;
SET GLOBAL general_log = 'OFF';
CREATE TABLE mysql.general_log_new LIKE mysql.general_log;
DROP TABLE mysql.general_log;
ALTER TABLE mysql.general_log_new RENAME mysql.general_log;
SET GLOBAL general_log = @old_log_state;
Here is an example of how to keep the last 3 days of entries:
SET @old_log_state = @@global.general_log;
SET GLOBAL general_log = 'OFF';
CREATE TABLE mysql.general_log_new LIKE mysql.general_log;
INSERT INTO mysql.general_log_new
SELECT * FROM mysql.general_log WHERE event_time > NOW() - INTERVAL 3 DAY;
DROP TABLE mysql.general_log;
ALTER TABLE mysql.general_log_new RENAME mysql.general_log;
SET GLOBAL general_log = @old_log_state;
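To run that 3-day purge unattended, you could generate the statements from a small shell function and pipe them to the mysql client. This is a sketch: the function below only emits the SQL (the days parameter is the knob), and piping it to mysql with appropriate credentials is left to your cron entry.

```shell
#!/bin/bash
# Emit the SQL that keeps only the last N days of mysql.general_log.
# Usage sketch: gen_purge_sql 3 | mysql --defaults-file=/root/.my.cnf
gen_purge_sql() {
  DAYS=${1}
  cat <<EOSQL
SET @old_log_state = @@global.general_log;
SET GLOBAL general_log = 'OFF';
CREATE TABLE mysql.general_log_new LIKE mysql.general_log;
INSERT INTO mysql.general_log_new
SELECT * FROM mysql.general_log WHERE event_time > NOW() - INTERVAL ${DAYS} DAY;
DROP TABLE mysql.general_log;
ALTER TABLE mysql.general_log_new RENAME mysql.general_log;
SET GLOBAL general_log = @old_log_state;
EOSQL
}
```

Generating the SQL separately also lets you eyeball exactly what will run before wiring it into cron.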
BAD NEWS
Anything you have already collected in the text-file version of the general log will not come along for the ride; you can only collect new entries going forward.
Best Answer
If you are really interested in monitoring via the general log, there are two major approaches you can try:
APPROACH #1 : Rotating Logs
Let's say you have the general log configured as a text file in my.cnf. You could rotate it with a small script that moves the current file aside and then flushes the logs so mysqld reopens a fresh one.
As for your monitoring, you would scan the text file as you already do. You can always look back at previous copies of general.log as archives.
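Here is a minimal sketch of such a rotation script. The log path, number of copies, and the use of mysqladmin flush-logs are assumptions; adjust them to your setup (FLUSH LOGS is what makes mysqld reopen the general log file on the MySQL releases this post targets).

```shell
#!/bin/bash
# Sketch: rotate a text-file general log, keeping a few numbered copies.
# The log path and copy count are placeholders; MYSQLADMIN can be
# overridden via the environment (e.g. to add credentials).
MYSQLADMIN=${MYSQLADMIN:-mysqladmin}

rotate_general_log() {
  GENLOG=${1}
  COPIES=${2}
  N=${COPIES}
  while [ ${N} -gt 1 ]
  do
    P=$(( N - 1 ))
    [ -f ${GENLOG}.${P} ] && mv ${GENLOG}.${P} ${GENLOG}.${N}
    N=${P}
  done
  [ -f ${GENLOG} ] && mv ${GENLOG} ${GENLOG}.1
  ${MYSQLADMIN} flush-logs   # tell mysqld to reopen a fresh general log
}
```

You would call it from cron, e.g. daily, with something like: rotate_general_log /var/log/mysql/general.log 7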
APPROACH #2 : Make the General Log a MyISAM table
Try setting up the general log as MyISAM.
First, configure the server for logging to tables (log-output=TABLE). MySQL supplies the table mysql.general_log, which starts out as a CSV table. Of course, the general log as a CSV table is hardly useful, so you can do two things: convert it to MyISAM and add an index on event_time, using the ALTER TABLE statements shown earlier.
When the smoke clears, events get recorded in the mysql.general_log table instead of the text file. Naturally, you want to rotate this table as well, using the table-rotation statements shown earlier.
As for your monitoring, you would query the mysql.general_log table. Perhaps you may want to check the last 4 hours.
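A query for the last 4 hours might look like this (a sketch; adjust the interval and column list to taste):

```sql
SELECT event_time, user_host, command_type, argument
FROM mysql.general_log
WHERE event_time > NOW() - INTERVAL 4 HOUR
ORDER BY event_time;
```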
CAVEAT
If you want both the text file and the table, then set log-output to send the general log to both destinations.
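Assuming you want both destinations, the my.cnf fragment would look like this (log-output accepts a comma-separated list):

```ini
[mysqld]
general_log = ON
log-output  = FILE,TABLE
```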
Of course, you would have to implement the rotation of both.
I have written other past posts about mysql.general_log:
Apr 04, 2013 : Stress test MySQL with queries captured with general log in MySQL
Feb 11, 2012 : MySQL general log
Jan 08, 2012 : How to enable MySQL general log?
Jun 30, 2011 : How do I output MySQL logs to syslog?
Give it a Try !!!