MySQL Benchmarking
I would like to evaluate the performance of a different MySQL instance by re-executing real queries from log files. I am aware of tools like mysqlslap, that produces random and autogenerated queries, but I would rather like to test the system with realistic queries.
In order to benchmark the new system, I would ideally collect all queries either using the slow log or the general log and then replay the queries to the new system.
It should be possible to filter queries, for instance I would like to replay SELECT statements only. I am wondering if there exists an actively maintained procedure or a best practice that how to achieve warm-up or benchmark functionality.
Requirements
- Replay (re-execute) real queries collected from a running MySQL instance
- Filter statements by statement type (INSERT, UPDATE, DELETE) and DML
- Get statistics (how long did the query run before, after the migration etc)
Problem: Tools are outdated
Percona offered several tools that seemed to be designed for that purpose. All of them have been removed from the current Percona Toolkit. These tools were:
- pt-log-player
- The replay option for pt-query-digest
- query-playback
Potential solutions
Using a Debian Wheezy docker container gives easy access to an older Percona Toolkit version where the pt-log-player is available. Running a container with docker run -it --network="host" --name wheezy debian:wheezy /bin/bash
and installing the legacy versions inside works:
apt-get update
apt-get install percona-toolkit mysql-client
I also tried percona-playback, but it does not seem to be very actively maintained either. That would be a great tool, but it is hardly documented and I could not find a method for filtering SELECT statements only, as I do not want to run other DML statements. I only saw that Percona Server offers a flag for enabling a read only option, but this is not available with vanilla MySQL or AWS RDS.
However I would rather like to use a procedure that does not rely on unmaintained tools.
Imagined Workflow
- Enable slow query log or general log
- Parse and analyze log file
- Filter relevant queries
- Re-execute queries
- Compare results
Are there any tools how I could achieve this? What is the best way to filter the slow query log? How to replay the queries sanely?
Best Answer
I found a solution for my problem. It is a combination of Percona pt-query-digest and Percona playback.
First, I filter the slow log by database name and exclude all statements that do not start with
select
.As output, I get a new slow log file that only contains the desired queries.
As result, I get an overview of the query execution time and a warmed up database.
I hope this helps someone with a similar problem.
Installing Percona Playback
There is a bug in the current version that requires to change a library path.
Edit the the file CMakeLists.txt (in the directory ~/git/query-playback/percona_playback/mysql_client/CMakeLists.txt ) and replace find_library(MYSQL_LIB"mysqlclient_r" PATH_SUFFIXES "mysql") with find_library(MYSQL_LIB "mysqlclient" PATH_SUFFIXES "mysql") (remove the _r suffix).
Here is a link with some background