Oracle – Using CRON and RMAN for Backups

cronlinuxoraclerman

We run RMAN backups via CRON jobs, but find the out-of-the-box emails with CRON to almost do what we want. I know we can channel the RMAN output to /dev/null so no email is sent, which for any successful run of RMAN is the desired outcome, but if the RMAN job runs into an error, I would like the entirety of the standard output sent via email to the DBA team for review. I suspect I can wrap my RMAN scripts with a bash shell script that will pipe the standard output to cron only in the event of a failure, but is there a better way to do this?

To summarize:

  1. Can I pipe standard output to an email in the event of an error (and
    ignore the standard output if no error exists) so we can use the
    default mail functionality within cron?
  2. If not, should I just wrap my RMAN backups scripts with logic so
    that it will only return output from the shell script if an error is
    encountered?
  3. Is there another approach that I'm not seeing or able to find online
    that accomplishes this?

My apologies if this has already been asked. I've found a number of one-off questions and scenarios, but nothing that I've felt match up exactly with my desired outcome.

Thanks!

Best Answer

I write a bash script to perform the backups (RMAN, datapump and also xtrabackup and dump for MySQL in a similar way) and then call that from cron. That way you can handle the email as you want.

So a simple crontab line is like this

00 18 * * 0-6 /dba/bin/rman.sh > /tmp/dba_rman_SID.log 2>&1

Then everything is done in the script.

So define some common variables including RMAN e.g.

export RMAN="${ORACLE_HOME}/bin/rman target /"
...
etc etc

Then import a list of emails you want (one per line) from a local file

declare -a EMAILS
readarray -t EMAILS < .emails.dat

Then define a couple of handy procedures that you can use later on for error handling and email

abort_with_error()
{
        echo
        echo "Error :" $1
        echo
        echo "Error Code :" ${ERR}
        echo

        export SUBJECT="Error - ${SUBJECT}"

        send_email

        exit 1
}

send_email()
{
        echo
        echo "Sending logfile via email to the following :"
        echo

        for EMAIL in ${EMAILS[@]}
        do
                echo ${EMAIL}
                echo
        done

        for EMAIL in ${EMAILS[@]}
        do
                case $EMAIL in
                        \#*) ;;
                        *) ${MAILX} -s "${SUBJECT}" ${EMAIL} < ${LOG_FILE} ;;
                esac
        done
}

now the main process which is wrapped up to output it all to a log file. This is a simple example but shows the basic method.

{

    //add what you want your script to do here - can be many things for example:

        ${RMAN}<<EOF
        delete noprompt force obsolete;
        exit;
EOF

        ERR=$?
        if [ ${ERR} -ne 0 ];
        then
                abort_with_error "The RMAN delete obsolete backup step failed"
        fi


        echo "${JOB_NAME} ended at : $(date)"

        send_email

} > ${LOG_FILE} 2>&1

Adjust the abort_with_error and send_email procedures to suit your environment and needs.

This separates the scheduler from the process so no need to change cron if you update your backup process or get a new team member who needs the emails.

This way you get an email if it works and if it doesn't but with Error in the subject so an simple email rule can hide away non-interesting successful backups and highlight failures and a simple count up each morning shows if something didn't email and needs a look. Since the log is held on the server you can find it there (add something to the job to clean the old ones down after a couple of weeks).

Obviously there are a bunch of ways you can implement something like this and you'll need to vary it to suit (and I've cut out * a lot * of stuff from the script for clarity).