Sql-server – Production (20GB+) Database running slowly and timing out

sql serversql-server-2008-r2

I know the title thread is a bit ambiguous (and my first post here!) but I am having a very severe production database problem. I am not a DBA, though I know how to work with queries and SQL Manager. The database is constantly under load, as there's a constant input and output. The problems started about a week ago and I've done everything I know to get it working… from Rebuilding Indexes to Updating Statistics to Shrinking DB. As an example of the recent problem I used to run a daily backup at 5PM and it took 20-25 minutes, now its taking over an hour. Some days the thing will work just fine, for example yesterday we didn't have an issue and today its all over the place. Its a 20+GB Database running in a VM with Windwos Server 2008R2, dual Xeon @ 2.4GHZ and 12GB of RAM plus a partition for OS, one for DB and one for backups. As one of the measures I am taking I created a separate Log partition to move the log from the DB HardDisk to its own separate HDD. Haven't yet done it as I need to detach and take everything offline. I checked the fragmentation for the main table (5+ million entries) and it was over 99%… this was AFTER rebuilding the index.

At this point, I am just stuck. The application users use is timing out, transactions not going in or out (financial institution) sometimes. I am constantly monitoring the performance, and heck even the SQL Server Activity Monitor timed out for me a few moments ago. Any suggestions and what else should I do (aside from separating the log and DB) would be greatly appreciated.

EDIT: Server RAM Usage just in case:
ram [1]

EDIT2: Code used for rebuilding Index

  ALTER INDEX ALL ON DailyTransactions
REBUILD WITH (FILLFACTOR = 80, SORT_IN_TEMPDB = ON,
          STATISTICS_NORECOMPUTE = On);

Best Answer

First you need to figure out what the actual problem is. The sys.dm_os_wait_stats DMV can help with that. Most values in there are accumulative, so you need to capture it a few times over an extended period to see what is actually going on.

One of the things that this DMV can tell you is if you have RAM pressure. The graphic you posted is basically useless as SQL Server uses all memory it can get (if it needs to). So this is only showing that that mechanism is working.

Once you figured out what your biggest problem is, come back here to get more help.

There are a few wait_types that can be considered noise. You can filter many of those out with this query:

SELECT wait_type,
       wait_time_ms / 1000.0 AS wait_in_sec,
       (wait_time_ms - signal_wait_time_ms) / 1000.0 AS resource_wait_sec,
       signal_wait_time_ms / 1000.0 AS signal_sec,
       waiting_tasks_count AS wait_count,
       100.0 * wait_time_ms / SUM (wait_time_ms) OVER() AS percentage
    FROM sys.dm_os_wait_stats
    WHERE wait_type NOT IN (
        'CLR_SEMAPHORE', 'LAZYWRITER_SLEEP', 'RESOURCE_QUEUE', 'SLEEP_TASK',
        'SLEEP_SYSTEMTASK', 'SQLTRACE_BUFFER_FLUSH', 'WAITFOR', 'LOGMGR_QUEUE',
        'CHECKPOINT_QUEUE', 'REQUEST_FOR_DEADLOCK_SEARCH', 'XE_TIMER_EVENT', 'BROKER_TO_FLUSH',
        'BROKER_TASK_STOP', 'CLR_MANUAL_EVENT', 'CLR_AUTO_EVENT', 'DISPATCHER_QUEUE_SEMAPHORE',
        'FT_IFTS_SCHEDULER_IDLE_WAIT', 'XE_DISPATCHER_WAIT', 'XE_DISPATCHER_JOIN', 'BROKER_EVENTHANDLER',
        'TRACEWRITE', 'FT_IFTSHC_MUTEX', 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP')
     AND waiting_tasks_count >0
     AND wait_time_ms >0
  ORDER BY wait_time_ms DESC;

(based on http://www.sqlskills.com/blogs/paul/survey-what-is-the-highest-wait-on-your-system/)

Related Solutions

Sql-server – Partitioning / indexing an extremely large table

My approach is to disable all the nonclustered indexes [...] then build a clustered index for the table using this partition scheme.

Creating a clustered index on a heap automatically rebuilds all nonclustered indexes (even disabled ones). The nonclustered indexes are rebuilt but not partitioned. Assuming the desired end state is a partitioned clustered table with aligned indexes, rebuilding the nonclustered indexes to be non-aligned is entirely wasted effort.

What worries me slightly is that tempdb is now pushing nearly 1 TB and steadily climbing, despite the current table being around half that size. The MS docs I've read suggest the tempdb space usage should be about the size of the final table/clustered index.

The question of sort space is very complex. To understand all the details (including the effect of parallelism) you would need to carefully read an entire series of posts by the SQL Server Query Processing Team. Converting a heap to a partitioned clustered table with parallelism enabled is probably pretty close to worst case.

At it's most basic (neglecting most of the important information in the QP Team's posts), you are asking SQL Server to run a query like:

SELECT *
FROM DailyTable
ORDER BY
    $partition.monthly_on_primary(LoadDate),
    LoadDate,
    SeqNumber;

This query will not execute quickly, regardless of where you choose to write the sort runs that do not fit in memory to. Add to that the work of actually building a complete new copy of the entire data set in separate rowsets, and the work involved in rebuilding the nonclustered indexes pointlessly...

Advice

There are many considerations in getting this change to work efficiently. The important ones are to avoid sorting at all where possible, and to use parallel minimally-logged bulk load wherever possible.

The details of that depend on details not contained in the question, and a full solution is beyond an answer here. Nevertheless, the outline of an approach that has worked well for me personally in the past is:

Extract the existing data using bcp to one file per final partition
Drop the existing table and create the new one
Load the new table using parallel minimally-logged bulk load

The per-partition data extract needs be ordered on (LoadDate, SeqNumber). Ideally, you would avoid a sorting operation. If you have an existing nonclustered index on (LoadDate, SeqNumber) you can extract data in the right order without sorting if you construct the query correctly.

Once per-partition data has been extracted to separate files (this can be done in parallel if your hardware is up to it), the source table can then be dropped, freeing space. A new partitioned heap or clustered table is then created, and bulk loaded with the pre-sorted data, possibly also in parallel.

Done right, the entire process requires no more than 1x data size and achieves the fastest possible data transfer rates in both directions, with the least amount of log use.

SQL Server Disk Space – Considerations for Rebuilding Large Clustered Primary Key Index

The clustered index "is" the table. Rebuilding the index will require enough space to duplicate each existing non-deleted row, page-by-page. Since you don't have Enterprise Edition, you do not have the luxury of rebuilding the index online. This means the table will not be accessible during the rebuild operation.

You'd likely be better off rebuilding by creating a duplicate empty table on a different drive that has plenty of free space with room for foreseeable growth, moving data row-by-row in key order, then renaming the old and new tables. For instance if the original table is Table1 and the new copy is Table2, rename Table1 as Table1-old then rename Table2 as Table1.

Best Answer

Related Solutions

Sql-server – Partitioning / indexing an extremely large table

Advice

SQL Server Disk Space – Considerations for Rebuilding Large Clustered Primary Key Index

Related Question