Ola Hallengren Backup Solution Error After Moving Files – Job Stops

ola-hallengrensql serversql-server-2012sql-server-agentsqlcmd

I am using Ola Hallengren's Maintenance script, and I am having an issue which seems documented in the FAQ.

For context, there was an issue where the existing SQL Server backups had to be moved to new share, and so the history in MSDB and the current location of the new backups would not match what exists now.

When running the default job that MaintenanceSolution.sql creates, "DatabaseBackup – USER_DATABASES – FULL" (with minor modifications to the parameters it runs on) I see the job now fails because it appears that deleting the files it expects would exist:

EXECUTE @ReturnCode = [master].dbo.xp_delete_file 0, N'\\UNC\Path\Server\Database... Process Exit Code 1. The step failed.

After the first error of this type, the entire job comes to a halt.

In the FAQ https://ola.hallengren.com/frequently-asked-questions.html , I found the following paragraph which addresses the issue I believe I am having:

Why does my job stop after the first error?

This issue is happening because you’re using T-SQL job steps. I recommend that you use CmdExec job steps with sqlcmd and the -b option. Then the job will continue after an error.

You can use the MaintenanceSolution.sql script to create the jobs.

I double checked (because I thought this was the default that the solution configured) and based on what I see in SSMS it is (Type: Operating System (CmdExec)) and found that this was the case.

I also verified that the job is also using the -b option:

sqlcmd -E -S $(ESCAPE_SQUOTE(SRVR)) -d master -Q "EXECUTE [dbo].[DatabaseBackup] @Databases = 'USER_DATABASES', @Directory = N'\\UNC\Path\', @BackupType = 'FULL', @Verify = 'N', @CleanupTime = 750 , @CheckSum = 'Y', @LogToTable = 'Y'" -b

I was quickly running through the stored procedure and found that it simply assigns any error from the xp_delete command to a variable and raises it as an error (completely bog standard sql), so is there anything missing to make this work without modifying his solution?

Thanks!

Best Answer

After further digging (and almost emailing Ola) I found the source of my confusion.

After querying the CommandLog table (@LogToTable = 'Y') found that while the TSQL job steps were reporting the error, the job actually completed, logged the commands to the table as successful (and of course the files existed on disk.)

It looks like not only do TSQL job steps stop reporting on the first error, which is referenced in a different part of the FAQ.

If you are seeing that the SQL Server Agent History reports a failing job, verify that the same failure is reported in the master.dbo.CommandLog table, and ensure you have @LogToTable = 'Y' specified.

Ola's job is behaving as described in the FAQ, and the reference to TSQL job steps was the hint I needed.