Sql-server – SQL Server Agent Jobs – Long running job delaying other jobs

sql serversql-server-agent

I have several agent jobs that run on different schedules.

Job 1 runs every 5 minutes, only executing a stored procedure that usually takes less than 30 seconds to run.

This job Selects from table "users" then uses a cursor to loop through the results and insert records into local tables. After the cursor finishes 4 batch calls are made to a remote linked server to insert the records from the local tables to remote tables. Then the stored procedure ends.

The stored procedure sql is wrapped in a Begin, Begin Try, End Try, End statement.

Job 2, runs every hour, it too only executes a stored procedure, however, this job can take 20 minutes to run.

Job 2 performs two different Selects both referencing the "users" table (which Job 1 also selects from) and inserts the results into local tables, but without using a cursor. Then 4 batch calls to a remote linked server to insert the records from the local tables to remote tables. This stored procedure is also wrapped in a Begin, Begin Try, End Try, End.

When I check the Job Activity Monitor log for Job 1, it shows the duration is consistently under 30 seconds, except for the time that it runs on the hour, with Job 2. At this time it shows the duration 20+ minutes – I guess as a result of Job 2, the long running job, running.

Does this mean that Job 1 begins and as it begins Job 2 starts causing Job 1 to not complete until after Job 2 finishes (the reason why SQL agent log shows the duration of Job 1 as 20 minutes, even though this job never takes 20 minutes to actually execute)?

Is there a way that I can keep this long running job isolated so that it doesn't cause other fast running jobs to be delayed?

Best Answer

An easy workaround solution would be to nudge either of the jobs to start earlier/later and to not run together with the other job.

E.g.

    Job 1 Schedule: 00:14, 00:29, 00:44, 00:59, 01:14, ....
    Job 2 Schedule:                        01:00, .......
It's not an issue with the job scheduler, but an issue with the underlying tables that are being accessed. When Job 2 starts, it manages to acquire a lock on the users table (possibly: IX, IU) before the Job 1 is able to acquire its required locks on the same users table. Then Job 1 has to wait.

You are encountering blocking as previously observed by @MaxVernon. You could run the following script at the full hour to see if you can catch the statements involved:

SELECT sp.spid,
       sp.kpid,
       sp.blocked,
       sp.hostprocess,
       sp.waittype,
       sp.lastwaittype,
       DB_NAME(sp.dbid),
       sp.cpu,
       sp.physical_io,
       sp.memusage,
       CASE 
            WHEN sp.blocked = 0 THEN sp.spid
            WHEN sp.blocked != 0 THEN blocked
       END               AS chain,
       dest.text
FROM   sys.sysprocesses  AS sp
       OUTER APPLY sys.dm_exec_sql_text(sp.sql_handle) AS dest
WHERE  spid > 50
       AND spid <> @@spid
       AND (
               spid IN (SELECT blocked
                        FROM   sys.sysprocesses
                        WHERE  blocked != 0
                               AND spid > 50
                               AND spid != @@spid)
               OR blocked != 0
           )

The output will show statements that are blocking or blocked. You will probably find that the statements are from the Job 1 and Job 2.

Possible solutions:

  • Adding a NOLOCK hint can help, but should be avoided if possible.
  • Then you could also consider optimising the code so that the steps in Job 2 run faster.
  • Changing the schedules of the Job 1 and/or Job 2.