Performance difference between < and <=

oracleperformanceselect

I remember that there is a performance issue with SQLs that use "<" against "<=" because of statistics and/or histograms and/or the data itself, but I dont remember exactly the issue.

I know this is very abstract question, but still if you can assists with what exactly the issue here

For example:

select id
from table
where year < 2010

Against

select id
from table
where year <= 2009

Best Answer

Yes, there is a difference, and the issue is that the estimated selectivity (thus cardinality) will be different in the 2 cases.

For a simple example, lets say the year column has years in it from 2000 (low_value) to 2014 (high_value), so there are 15 distinct values (num_distinct), they are evenly distributed, 1000 rows for each, 15000 rows (num_rows) in total in your table.

In the first case, where year < 2010 (limit = 2010), the estimated selectivity will be:

(limit - low_value) / (high_value - low_value) = (2010 - 2000) / (2014 -2000) = 10 / 14

Cardinality = 10/14 * 15000 = 10714

For the second case, where year <= 2009 (limit = 2009), selectivity is calculated as:

(limit - low_value) / (high_value - low_value) + 1/num_distinct = (2009 - 2000) / (2014 - 2000) + 1/15 = 9/14 + 1/15

Cardinality = (9/14 + 1/15) * 15000 = 10642

Not a big difference, but it can still affect the optimizer so that it choses another plan.

Above is explained in detail by Jonathan Lewis in his book "Cost-Based Oracle Fundamentals".

Related Solutions

Mysql – table size above which SELECT performance degrades significantly

Not sure what version of TokuDB you are running, or what MySQL/MariaDB distribution, but there is an index condition pushdown bug that occurs on queries similar to the one you've provided. We are still waiting for a fix, the only current workaround is to disable ICP when running the problematic query.

Try running the query with ICP turned off. To disable it, issue the following command in the session just before running the query:

SET optimizer_switch='index_condition_pushdown=off';

And please report back success/failure, and the particular versions of TokuDB and MySQL/MariaDB.

(full disclosure, I'm an employee of Tokutek, makers of TokuDB)

Sql-server – Poor performance on inner join using dates and between

Much more efficient to do this without having to go back and join to the periods table.

DECLARE @StartDate DATE, @EndDate DATE;

Select @StartDate = Min(StartDate), @EndDate = MAX(EndDate) 
from dbo.PeriodCalendar_Weeks pcw
where (pcw.Year = @Year and pcw.Period < @Period) 
  or  (pcw.Year = @Year and pcw.Period = @Period and pcw.Week <= @Week) 
  or (pcw.Year = @Year -1 and pcw.Period >= @Period);

SELECT 
  WeekEndDate = DATEADD(DAY, 6, DATEADD(WEEK, SalesWeek, @StartDate)), 
  Store, 
  DeliveryChargesTotal = dct
FROM 
(
  SELECT DATEDIFF(DAY, @StartDate, SalesDate)/7, Store, SUM(DeliveryChargesTotal)
  FROM dbo.Daily_GC_Headers
  WHERE SalesDate BETWEEN @StartDate AND @EndDate AND isCanceled = 0
  GROUP BY DATEDIFF(DAY, @StartDate, SalesDate)/7, Store
) AS x (SalesWeek, Store, dct)
ORDER BY WeekEndDate, Store;

A filtered index may help, if many rows exist where isCanceled = 1 (these are just possible suggestions, depending on cardinality of Store, and may not be the most optimal):

CREATE INDEX x ON dbo.Daily_GC_Headers
  (SalesDate) INCLUDE (Store, DeliveryChargesTotal)
  WHERE isCanceled = 0;

If there are very few rows where isCanceled = 1, this may be better:

CREATE INDEX x ON dbo.Daily_GC_Headers
  (SalesDate, IsCanceled) INCLUDE (Store, DeliveryChargesTotal);

Both are worth trying on a test system, as well as moving Store into the key in either case, or moving IsCanceled to the INCLUDE list in the latter case. On my system, I found the best results with everything but the date in the INCLUDE list:

CREATE INDEX x ON dbo.Daily_GC_Headers
  (SalesDate) INCLUDE (Store, IsCanceled, DeliveryChargesTotal);

Again, you will need to test if any of these work out, or if the query above gives a different/better recommendation directly from SQL Server.

Best Answer

Related Solutions

Mysql – table size above which SELECT performance degrades significantly

Sql-server – Poor performance on inner join using dates and between

Related Question