Which parameters and metrics to look at when CDC based replication is causing high load

change-data-capture, oracle, tungsten-replicator

I'm investigating the reasons why a Tungsten Replicator extractor is causing heavy load on an Oracle 11g database. The replication is set up to use synchronous CDC and there is no unusual load while the extractor is not running, which leads me to believe that the triggers registered by the CDC package are working just fine, without causing any significant load.

When the extractor is running, the load average on the database server can spike as high as 60, though it isn't constantly at that level; the variance most likely tracks the volume of writes.

What configuration parameters and/or metrics of the database should I look into to identify the cause of this load? Tungsten seems to be simply running standard CDC procedures as outlined below.

Initial preparation steps of the extractor:

  1. Get a list of tables to extract from a table prepared by a setup script:

    SELECT * FROM TUNGSTEN_SOURCE_TABLES
    
  2. For each source table, get a list of columns to extract:

    SELECT
      UNIQUE CHANGE_SET_NAME,
      PUB.COLUMN_NAME,
      PUB_ID,
      COL.COLUMN_ID
    FROM TUNGSTEN_PUBLISHED_COLUMNS PUB, ALL_TAB_COLUMNS COL
    WHERE
      SOURCE_SCHEMA_NAME = '{SERVICE_NAME}' AND
      SOURCE_TABLE_NAME = '{TABLE_NAME}' AND
      SOURCE_SCHEMA_NAME = COL.OWNER AND
      SOURCE_TABLE_NAME = COL.TABLE_NAME AND
      PUB.COLUMN_NAME = COL.COLUMN_NAME AND
      CHANGE_SET_NAME = 'TUNGSTEN_CS_{SERVICE_NAME}'
    ORDER BY COL.COLUMN_ID
    
  3. Create a subscription with DBMS_CDC_SUBSCRIBE.CREATE_SUBSCRIPTION
  4. Create a subscriber view for each source table with DBMS_CDC_SUBSCRIBE.SUBSCRIBE
  5. Activate the subscription with DBMS_CDC_SUBSCRIBE.ACTIVATE_SUBSCRIPTION (a PL/SQL sketch of steps 3–5 follows this list)
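
For reference, steps 3–5 are plain DBMS_CDC_SUBSCRIBE calls. Below is a minimal PL/SQL sketch, not Tungsten's actual code; the subscription name, schema, table, column list, and view name are made-up placeholders that follow the naming pattern above:

    BEGIN
      -- Step 3: one subscription per change set (all names below are placeholders)
      DBMS_CDC_SUBSCRIBE.CREATE_SUBSCRIPTION(
        change_set_name   => 'TUNGSTEN_CS_MYSERVICE',
        description       => 'Tungsten extractor subscription',
        subscription_name => 'TUNGSTEN_SUB_MYSERVICE');

      -- Step 4: one subscriber view per source table, limited to the published columns
      DBMS_CDC_SUBSCRIBE.SUBSCRIBE(
        subscription_name => 'TUNGSTEN_SUB_MYSERVICE',
        source_schema     => 'MYSCHEMA',
        source_table      => 'MYTABLE',
        column_list       => 'COLUMN1, COLUMN2',
        subscriber_view   => 'VW_TUNGSTEN_MYSERVICE1');

      -- Step 5: activate once all views are registered; capture starts from here
      DBMS_CDC_SUBSCRIBE.ACTIVATE_SUBSCRIPTION(
        subscription_name => 'TUNGSTEN_SUB_MYSERVICE');
    END;
    /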

Inner loop of the extractor while it's running:

  1. Extend the subscription window with DBMS_CDC_SUBSCRIBE.EXTEND_WINDOW
  2. Select each subscription view to fetch any changes

    -- Receiving changes for the first time
    SELECT
      COLUMN1, COLUMN2, ...,
      CSCN$,
      COMMIT_TIMESTAMP$,
      OPERATION$
    FROM VW_TUNGSTEN_{SERVICE_NAME}{INDEX}
    ORDER BY CSCN$, RSID$
    
    -- Receiving changes after the first window
    SELECT
      COLUMN1, COLUMN2, ...,
      CSCN$,
      COMMIT_TIMESTAMP$,
      OPERATION$
    FROM VW_TUNGSTEN_{SERVICE_NAME}{INDEX}
    WHERE CSCN$ > :lastSCN
    ORDER BY CSCN$, RSID$
    
  3. Purge the subscription window with DBMS_CDC_SUBSCRIBE.PURGE_WINDOW (the whole loop is sketched in PL/SQL below)
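
In PL/SQL terms, one pass of that loop looks roughly like the sketch below; the subscription name is a placeholder, and step 2 is the SELECTs already shown above, executed by the extractor over JDBC:

    BEGIN
      -- Step 1: make the changes captured since the last pass visible
      -- in the subscriber views
      DBMS_CDC_SUBSCRIBE.EXTEND_WINDOW(
        subscription_name => 'TUNGSTEN_SUB_MYSERVICE');

      -- Step 2 happens outside PL/SQL: the extractor runs the SELECTs
      -- shown above against each VW_TUNGSTEN_* subscriber view.

      -- Step 3: drop the processed changes from the window so Oracle
      -- can eventually purge them from the change tables
      DBMS_CDC_SUBSCRIBE.PURGE_WINDOW(
        subscription_name => 'TUNGSTEN_SUB_MYSERVICE');
    END;
    /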

Environment:

  • Oracle Database 11g Release 11.1.0.6.0 – Production
    • According to my client, this box is 5 years old
    • open_mode: read write
    • log_mode: archivelog
    • platform_id: 10
    • platform_name: Linux IA (32-bit)
  • Tungsten Replicator 3.0.0

Best Answer

I wasn't able to tune the Oracle-side parameters because I only had limited access to the database instance, but the database load did improve after stripping down the set of tables and columns being replicated.

1. Determine which tables are getting updated the most.

Try running the extractor for a few hours and let the THL grow. Then, you can count the number of changes logged for each table using a command like:

    ./bin/thl list | gawk -f thl_summary.awk | sort -n -r

Where thl_summary.awk is:

    #!/usr/bin/gawk -f
    # Sum up per-table row counts from `bin/thl list` output.
    BEGIN {
        OFS = "\t"
    }
    # Remember which table the THL event lines that follow belong to.
    match($0, /- TABLE = (.*)/, a) {
        table = a[1]
    }
    # Every "- ROW# = ..." line is one changed row for that table.
    /- ROW# = / {
        rows[table]++
    }
    END {
        # Print "<count><TAB><table>", ready for `sort -n -r`.
        for (k in rows) {
            print rows[k], k
        }
    }

2. Replicate only those tables and columns that are necessary.

Looking at the list of tables, ask yourself if you can exclude any of them.

  • Exclude CLOB columns (a query for spotting them follows this list).
    • They aren't really supported by synchronous CDC.
    • Even if you specify only a subset of the columns in DBMS_CDC_PUBLISH.CREATE_CHANGE_TABLE, each and every change to the source table is still logged to the change table [1]. Excluding CLOB columns at least avoids the costly LOB writes and reads.
  • Exclude all tables that are not really needed on the downstream database.
    • If necessary, add a fallback table with just the subset of columns you really need, even if it means denormalizing a bit.
    • The way Tungsten configures CDC, a single update to a source row generates two rows in the corresponding change table [2]. These extra writes add up quickly on a frequently updated table.
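
A quick way to spot those LOB columns is a dictionary query along these lines. It only reuses TUNGSTEN_PUBLISHED_COLUMNS and ALL_TAB_COLUMNS from the extractor query above, but double-check the column names against your own setup:

    SELECT DISTINCT
      COL.OWNER,
      COL.TABLE_NAME,
      COL.COLUMN_NAME,
      COL.DATA_TYPE
    FROM TUNGSTEN_PUBLISHED_COLUMNS PUB, ALL_TAB_COLUMNS COL
    WHERE
      PUB.SOURCE_SCHEMA_NAME = COL.OWNER AND
      PUB.SOURCE_TABLE_NAME = COL.TABLE_NAME AND
      COL.DATA_TYPE IN ('CLOB', 'NCLOB', 'BLOB')
    ORDER BY COL.OWNER, COL.TABLE_NAME, COL.COLUMN_NAME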

Once you've determined which tables and columns to replicate, you can specify them in a file named tungsten.tables:

    aSubsetOfTheTable<TAB>column1,...,columnN
    wholeTable

Then re-run the setupCDC.sh script after setting the following in setupCDC.conf:

    specific_tables=1

[1] https://community.oracle.com/thread/2288254

[2] Updated rows will generate a record with operation$='UU' ... and all the old values, and another record with operation$='UN' or 'UL' with new values. (source)