The non-clustered index you have tested is not the best for this query. It can be used for the WHERE clause and for doing an index scan instead of a full table scan, but it cannot be used for the GROUP BY.
The best possible index would be a filtered index (SQL Server's term for a partial index, to filter out the unwanted rows from the WHERE clause), with all the columns used in the GROUP BY as key columns, and then INCLUDE all the other columns used in the SELECT:
CREATE INDEX special_ix
ON dbo.Commissions_Output
( company, location, account,
salesroute, employee, producttype,
item, loadjdate, commissionrate )
INCLUDE
( [Extended Sales Price], [Delivered Qty] )
WHERE
( [Extended Sales Price] <> 0 ) ;
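For reference, the shape of query this index is designed to cover would be something like the following. This is a sketch: the original query isn't shown, so the SELECT list and the SUM aggregates are assumptions.

```sql
-- Hypothetical query shape covered by special_ix (aggregates assumed):
-- the filter matches the index's WHERE, the GROUP BY matches the key
-- columns, and the aggregated columns are in the INCLUDE list.
SELECT company, location, account,
       salesroute, employee, producttype,
       item, loadjdate, commissionrate,
       SUM([Extended Sales Price]) AS [Total Sales],
       SUM([Delivered Qty])        AS [Total Qty]
FROM dbo.Commissions_Output
WHERE [Extended Sales Price] <> 0
GROUP BY company, location, account,
         salesroute, employee, producttype,
         item, loadjdate, commissionrate ;
```

Because every column the query touches is in the index (as key or INCLUDE), this can be answered entirely from the index, and the key order lets the GROUP BY be done with a streaming aggregate instead of a sort or hash.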
I was using one of two ways to get a random row:
SELECT * FROM mytable ORDER BY RANDOM();
and
SELECT * FROM mytable LIMIT 1 OFFSET <random number>;
The offset method was faster for low offsets but very slow otherwise: an offset of 1 million took ~600 ms, while 100 million took 60 seconds (the average query time, at the midpoint offset, was about 2.5 minutes). With ORDER BY RANDOM(), every query took ~5 minutes.
I ended up finding a good solution, similar to Mat's comment, but I don't need to keep track of how many rows there are.
I added an integer column called "randval" with a btree index. When a record is saved, I generate a random number between 1 and 2 billion. It took a while to migrate existing data (about half a day on slow hardware), but now random selects are super fast, typically around 1 millisecond using this query:
SELECT * FROM mytable WHERE randval >= <random number> ORDER BY randval LIMIT 1;
With 500 million rows, not every random number maps to a row, so the ORDER BY with LIMIT returns the next closest one, and the index keeps it fast: about a 150,000x speedup over what was an average 2.5-minute query.
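The setup described above can be sketched as follows (PostgreSQL syntax; the table and column names come from the queries shown, but the backfill statement and index name are assumptions):

```sql
-- Add the random-valued column and backfill existing rows with a
-- random integer in [1, 2 billion] (the slow migration step mentioned above).
ALTER TABLE mytable ADD COLUMN randval integer;

UPDATE mytable
SET randval = 1 + floor(random() * 2000000000)::integer;

-- B-tree index so the range scan below is an index seek.
CREATE INDEX mytable_randval_ix ON mytable (randval);

-- Fetching a random row then becomes:
SELECT * FROM mytable
WHERE randval >= floor(random() * 2000000000)::integer
ORDER BY randval
LIMIT 1;
```

New rows need the same treatment on insert, e.g. via a column DEFAULT or a trigger that assigns a fresh random value.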
Databases don't like doing random things, but they sure do well at specific things that just happen to have random values.
Best Answer
Then you can use this one:
SELECT * FROM tablex WHERE Number <> 123 OR VAT <> 15;
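By De Morgan's laws this is the negation of the combined condition, so an equivalent form, which some find more readable, is:

```sql
-- Equivalent: exclude rows where BOTH Number = 123 AND VAT = 15.
SELECT *
FROM tablex
WHERE NOT (Number = 123 AND VAT = 15);
```

Both forms behave identically, including for NULLs, since De Morgan's laws also hold in SQL's three-valued logic (a row with NULL in a tested column is excluded by both).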