Mysql – Replay (re-execute) MySQL SELECT queries from a log file

benchmarklogsMySQLperformanceslow-log

MySQL Benchmarking

I would like to evaluate the performance of a different MySQL instance by re-executing real queries from log files. I am aware of tools like mysqlslap, that produces random and autogenerated queries, but I would rather like to test the system with realistic queries.

In order to benchmark the new system, I would ideally collect all queries either using the slow log or the general log and then replay the queries to the new system.

It should be possible to filter queries, for instance I would like to replay SELECT statements only. I am wondering if there exists an actively maintained procedure or a best practice that how to achieve warm-up or benchmark functionality.

Requirements

  • Replay (re-execute) real queries collected from a running MySQL instance
  • Filter statements by statement type (INSERT, UPDATE, DELETE) and DML
  • Get statistics (how long did the query run before, after the migration etc)

Problem: Tools are outdated

Percona offered several tools that seemed to be designed for that purpose. All of them have been removed from the current Percona Toolkit. These tools were:

  • pt-log-player
  • The replay option for pt-query-digest
  • query-playback

Potential solutions

Using a Debian Wheezy docker container gives easy access to an older Percona Toolkit version where the pt-log-player is available. Running a container with docker run -it --network="host" --name wheezy debian:wheezy /bin/bash and installing the legacy versions inside works:

apt-get update
apt-get install percona-toolkit mysql-client

I also tried percona-playback, but it does not seem to be very actively maintained either. That would be a great tool, but it is hardly documented and I could not find a method for filtering SELECT statements only, as I do not want to run other DML statements. I only saw that Percona Server offers a flag for enabling a read only option, but this is not available with vanilla MySQL or AWS RDS.

However I would rather like to use a procedure that does not rely on unmaintained tools.

Imagined Workflow

  1. Enable slow query log or general log
  2. Parse and analyze log file
  3. Filter relevant queries
  4. Re-execute queries
  5. Compare results

Are there any tools how I could achieve this? What is the best way to filter the slow query log? How to replay the queries sanely?

Best Answer

I found a solution for my problem. It is a combination of Percona pt-query-digest and Percona playback.

First, I filter the slow log by database name and exclude all statements that do not start with select.

cat mysql-slow.log |  pt-query-digest --filter '(($event->{db}) =~ /my_database/) && $event->{arg} =~ m/^select/i' --output slowlog > my_database_selects.log

As output, I get a new slow log file that only contains the desired queries.

percona-playback --mysql-max-retries 1 --mysql-host example.org --mysql-port 3306 --mysql-username warmup --mysql-password S-E-C-R-E-T --ignore-row-result-diffs --query-log-file ~/my_database_selects.log

As result, I get an overview of the query execution time and a warmed up database.

Detailed Report
----------------
SELECTs  : 41666 queries (33379 faster, 8287 slower)
INSERTs  : 0 queries (0 faster, 0 slower)
UPDATEs  : 0 queries (0 faster, 0 slower)
DELETEs  : 0 queries (0 faster, 0 slower)
REPLACEs : 0 queries (0 faster, 0 slower)
DROPs    : 0 queries (0 faster, 0 slower)


Report
------
Executed 41666 queries
Spent 00:00:45.795945 executing queries versus an expected 00:02:07.966942 time.
33379 queries were quicker than expected, 8287 were slower
A total of 0 queries had errors.
Expected 79870 rows, got 79868 (a difference of 2)
Number of queries where number of rows differed: 0.

Average of 265.39 queries per connection (157 connections).

I hope this helps someone with a similar problem.

Installing Percona Playback

There is a bug in the current version that requires to change a library path.

sudo apt-get install libtbb-dev libmysqlclient-dev libboost-program-options-dev libboost-thread-dev libboost-regex-dev libboost-system-dev libboost-chrono-dev pkg-config cmake  libssl-dev
git clone https://github.com/Percona-Lab/query-playback.git
cd query-playback/

Edit the the file CMakeLists.txt (in the directory ~/git/query-playback/percona_playback/mysql_client/CMakeLists.txt ) and replace find_library(MYSQL_LIB"mysqlclient_r" PATH_SUFFIXES "mysql") with find_library(MYSQL_LIB "mysqlclient" PATH_SUFFIXES "mysql") (remove the _r suffix).

mkdir build_dir
cd build_dir
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
make
sudo make install

Here is a link with some background