SQL Server – How to Set a Restart on a TSQL Job?

jobssql servert-sql

I currently have an important TSQL job that hangs probably once a week.

It moves data from one database to another.

It is on a schedule running every 5 minutes indefinitely throughout the day.

Sporadically, it will hang usually around 1 am in the morning. When I say hang I mean it just gets caught at 1am never finishes its current scheduled run in turn it never moves on to the next scheduled run.

In order to get it to stop I have have to stop the job. Disable the job. Kill the associated SPID. Re-enable and restart the job. Sometimes this gets it running but other times I have to restart the SQL Agent itself.

There is really no rhythm or reason as to why this happens datawise. Sometimes it will go weeks without issue other times it occurs once or twice a week.

I am just wondering if there is possibly a way to have this job restart it self regularly in the morning so it doesn't hang. I am assuming someone else has had an issue like this before. Any direction here is greatly appreciated.

There is no blocking in the database. I am not looking to fix a blocking issue. Looking for a way to determine if the job is hung and restart it or something along those lines.

I have reviewed sp_who2 regularly and never seen any blocking with this particular job.

Best Answer

I have dealt with some long running jobs in the past and the best way I have found to identify them is to run a separate job that checks on that sort of thing. Thomas LaRock put together a nice script that does that, and you can modify it to either email you or if you wanted you could have it just stop the job as well. In my case, the job literally just hangs, but it is not being blocked and in fact in a lot of cases it is doing nothing at all. But since for whatever reason it never stops the job it will never run that next schedule.

HOW TO: FIND CURRENTLY RUNNING LONG SQL AGENT JOBS

I have put his code to use and added an email notification to it for myself, and it works great. As long as you know your threshold's for what is a long running job or even if you just want to look at an individual job, this should get it done for you. It would be easy enough to have your code find that it is running too long and then to within the same script call

exec msdb.dbo.sp_stop_job @job_name 'YourJobHere'

which would then allow the next schedule to pick up where this left off. I have found that I like to see if there is anything else going on as opposed to just stopping the job but it will work either way.

Best of luck.