How to monitor backup jobs on Oracle 11g

monitoringoracle

Background:

I'm trying to monitor that backups run as expected on several databases. Zenoss is used to monitor most other stuff.

A couple of KPIs are already monitored using Zenoss in combination with script that runs sqlplus on database host.

What I've found so far:

Views in database:

  • V$RMAN_STATUS
  • V$RMAN_BACKUP_JOB_DETAILS;
  • V$RMAN_BACKUP_SUBJOB_DETAILS;

Stop gap solution:

Sending alerts from Enterprise Manager:

  • Setup > Monitoring Templates > Create > Target : some Database > Metric Threshold > Failed Job Count

or

  • Preferences > Rules > Create > Tab "General" – Target Type: Database Instance, Tab "Jobs" – Specific Jobs > Add Specific Jobs: Select the job and Job Status: "Problem"

What I'd like to know:

  • Does this seem reasonable or is there another preferred way to integrate monitoring of Oracle into a third party nms? (Polling is preferred to traps in the current setting. )
  • Pitfalls to be aware of.
  • There seems to be significant differences between the database views I mentioned and what I see in EM backup history. Where should I be looking for the status of past backups?

Best Answer

I might have found a couple of answers on my own. Posting them here in case others need to know:

  • Google (or my google skills) seems to be getting increasingly less reliable.
  • Searching the forums at https://forums.oracle.com/ I found this thread: https://forums.oracle.com/forums/thread.jspa?messageID=9956411, which references http://www.morganslibrary.org/reference/dba_best_practices.html. Here is the code mentioned:

    • SELECT start_time, end_time, input_type, input_type, status FROM v$rman_backup_job_details ORDER BY 1;

      and

    • SELECT vbd.file#, vrbjd.start_time, vrbjd.end_time, vbd.incremental_level, vrbjd.input_type, vrbjd.status FROM v$rman_backup_job_details vrbjd, v$backup_datafile vbd WHERE vbd.completion_time BETWEEN vrbjd.start_time AND vrbjd.end_time AND vrbjd.input_type <> 'ARCHIVELOG' ORDER BY 2,1;

  • As for why the query in Enterprise Manager picked older jobs I learned a nice trick from someone at work: You can read recent queries from the view v$sqlarea. Something like select sql_text from v$sqlarea order by last_active_time desc should do the trick.