SQL Server – SELECT query takes more than a minute to retrieve data

performance sql-server sql-server-2008

I have a table with 99 columns; not all of them are needed, but they were there when we started. There is a primary key (Objectuid), which is a GUID, and the clustered index is on the PK.

The table is updated often and has close to 2 million rows. There are no other nonclustered indexes on this table.

I have a view that pulls the necessary fields from the table and does a left outer join with another table that has only 7 rows. It then applies 4 filters in the WHERE clause. It retrieves about half a million records but takes about a minute.

Is this normal? How can I make it run in under five seconds? The server has 64 GB of RAM and 2 processors with 8 cores each, with about 1 TB of hard drive space.

Do you think it is improper indexing? If so, how can I figure out which columns to index to make it run faster? I need to keep in mind that there are inserts, updates, and deletes on a regular basis.

The query is like this:

SELECT a, b, c, d, e, f, g, h, i, j
FROM TBLA
LEFT OUTER JOIN TBLB
    ON TBLA.Objectuid = TBLB.objectuid
WHERE TBLA.a NOT LIKE '11%'
  AND TBLA.b <> '2100'
  AND TBLA.c <> 'alloc'
  AND TBLA.d NOT LIKE 'lm-%'
  AND TBLA.e NOT LIKE 'ap-%'
  AND TBLA.f >= '2014'

TBLA has 99 columns, of which the view uses only 15. TBLA's PK is a clustered GUID. Even when I remove the index, it takes the same amount of time.

Best Answer

Have you checked actual rows vs. estimated rows in the actual execution plan? Also, have you compared the reads from your query against the reads from a table made up of just the fields you are selecting?
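To measure the reads, one approach (a sketch using standard SQL Server session settings; in SSMS, also enable "Include Actual Execution Plan" to see actual vs. estimated rows) is:

```sql
-- Report logical reads per table and CPU/elapsed time for the query.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT a, b, c, d, e, f, g, h, i, j
FROM TBLA LEFT OUTER JOIN TBLB
    ON TBLA.Objectuid = TBLB.objectuid
WHERE TBLA.a NOT LIKE '11%'
  AND TBLA.b <> '2100'
  AND TBLA.c <> 'alloc'
  AND TBLA.d NOT LIKE 'lm-%'
  AND TBLA.e NOT LIKE 'ap-%'
  AND TBLA.f >= '2014';

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
-- The Messages tab shows "logical reads" per table; compare that
-- number before and after any index change.
```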

Without even seeing any of that, I would say you definitely need a nonclustered index. My guess is that one or both of the following is occurring in your query, with regard to the questions above:
1) The actual rows are way off from the estimated rows (because the SQL optimizer has no good estimates of what is actually in your tables).
2) The IO from your query above is at least an order of magnitude higher than if you inserted those same 10 fields into another table and ran the same query (assuming those 10 fields are no larger than your average across all 99 fields).
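One way to sanity-check that gap is to look at how wide your rows actually are (a sketch using the `sys.dm_db_index_physical_stats` DMV; index_id 1 is the clustered index, and the table name is assumed to be in the `dbo` schema):

```sql
-- The wider the average record, the fewer rows fit on each 8 KB page,
-- and the more pages a clustered-index scan has to read.
SELECT avg_record_size_in_bytes,
       page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID('dbo.TBLA'), 1, NULL, 'DETAILED')
WHERE index_level = 0;  -- leaf level of the clustered index
```

If the 10 selected fields account for only a small fraction of `avg_record_size_in_bytes`, most of the scan's IO is wasted on columns the query never returns.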

Even if you don't use those other 89 fields (since your query only selects 10 fields), SQL Server can only read what is on the page -- all 99 fields.

Think of it like this: you have all your data printed out on paper. All your records are striped in one long string; at the end of each record (99 fields), you have some delimiter. Then it starts over again with the next record, again all in one long string printed across these physical pieces of paper. Now if you want to read just 10 of those fields, you still have to look at the whole page and pull out just those 10 fields. This is exactly how SQL Server works. If those 10 fields make up 10% of the actual data length, you are wasting 90% of the paper in this massive book you have printed out. If you create a nonclustered index on just these 10 fields, you save a lot of reading (because now the pages are completely packed with useful data, and the overall size is 10% of the original book).

You will of course want to key the index on whatever limits the most (based on the criteria of your query) and then just throw the other fields into an INCLUDE. Also, if those 7 rows you mention in your other table are static values, I would recommend going even further and doing a filtered index on those 7 values OR a filtered index on one of your WHERE criteria. That will shrink your index down even further.
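As a sketch (column names are taken from the query above; which column leads the key depends on your data's selectivity, and the filtered-index predicate is only an illustration):

```sql
-- A covering nonclustered index: key on the most selective predicate,
-- with the remaining selected columns in the INCLUDE list so the query
-- never has to touch the wide 99-column clustered index.
CREATE NONCLUSTERED INDEX IX_TBLA_f_covering
ON TBLA (f)
INCLUDE (a, b, c, d, e, g, h, i, j);

-- A filtered index shrinks the index further by storing only rows that
-- can ever match; the predicate here mirrors one WHERE criterion.
CREATE NONCLUSTERED INDEX IX_TBLA_f_filtered
ON TBLA (f)
INCLUDE (a, b, c, d, e, g, h, i, j)
WHERE f >= '2014';
```

The clustered key (Objectuid) rides along in every nonclustered index automatically, so the left join back to TBLB still works without listing it.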