Background:
I'm trying to monitor that backups run as expected on several databases. Zenoss is used to monitor most other stuff.
A couple of KPIs are already monitored using Zenoss in combination with script that runs sqlplus on database host.
What I've found so far:
Views in database:
- V$RMAN_STATUS
- V$RMAN_BACKUP_JOB_DETAILS;
- V$RMAN_BACKUP_SUBJOB_DETAILS;
Stop gap solution:
Sending alerts from Enterprise Manager:
- Setup > Monitoring Templates > Create > Target : some Database > Metric Threshold > Failed Job Count
or
- Preferences > Rules > Create > Tab "General" – Target Type: Database Instance, Tab "Jobs" – Specific Jobs > Add Specific Jobs: Select the job and Job Status: "Problem"
What I'd like to know:
- Does this seem reasonable or is there another preferred way to integrate monitoring of Oracle into a third party nms? (Polling is preferred to traps in the current setting. )
- Pitfalls to be aware of.
- There seems to be significant differences between the database views I mentioned and what I see in EM backup history. Where should I be looking for the status of past backups?
Best Answer
I might have found a couple of answers on my own. Posting them here in case others need to know:
Searching the forums at https://forums.oracle.com/ I found this thread: https://forums.oracle.com/forums/thread.jspa?messageID=9956411, which references http://www.morganslibrary.org/reference/dba_best_practices.html. Here is the code mentioned:
SELECT start_time, end_time, input_type, input_type, status FROM v$rman_backup_job_details ORDER BY 1;
and
SELECT vbd.file#, vrbjd.start_time, vrbjd.end_time, vbd.incremental_level, vrbjd.input_type, vrbjd.status FROM v$rman_backup_job_details vrbjd, v$backup_datafile vbd WHERE vbd.completion_time BETWEEN vrbjd.start_time AND vrbjd.end_time AND vrbjd.input_type <> 'ARCHIVELOG' ORDER BY 2,1;
As for why the query in Enterprise Manager picked older jobs I learned a nice trick from someone at work: You can read recent queries from the view v$sqlarea. Something like select sql_text from v$sqlarea order by last_active_time desc should do the trick.