Db2 – How to determine the root cause of a performance spike during a heavy load on DB2

db2db2-luwload-testingperformanceperformance-testing

I am testing the performance of an application that links to a DB2 database engine at the back end.

Looking at the transaction timer logs, I see that on the 37'th minute into the test run, transactions take a very high time to process. This lasts for maybe a few seconds and then the run proceeds at acceptable speed.

I've done three test runs (restoring the data to initial test data, clearing up everything = basically, ensuring each run has the same starting position) and they all experience the same phenomenon: at the 37'th minute, a performance spike:
enter image description here

My current lead is that I notice, on the second run it's more visible, that every 37-ish minutes, there's a "minispike".

I think what's going on is the DB2 is doing some regular maintenance every 37 minutes, and the first maintenance job is a bit harder, as the machine hasn't cached a required program or…?

I am wondering what advice can I get to investigate this further? Debugging my own application did not show any reason to hang as it does there.

Info:
DB: DB2 10.5.0.7
OS: SLES 12 0
App: Precompiled to bind to DB2 in 64-bit mode

Best Answer

This could be one of the autonomic maintenance processes, e.g. automatic collection of table statistics, although the 37 minute interval looks a bit suspect.

You can start with looking at db2diag.log at around the time of the spike -- most auto maintenance tasks would be logged there.

For a quick look at what thread might be busy, run

db2pd -edus interval=<seconds> -repeat

around the time of the spike, choosing the interval appropriate for the spike width.

Another useful tool is the system stored procedure MONREPORT.DBSUMMARY(<seconds>), which will print the summary of various wait times over the specified period -- run it before, during, and after the spike to compare the numbers.