Sql-server – improve this query

performancequery-performancesql-server-2008-r2t-sql

SELECT * 
FROM   SameLogTable
WHERE  ID_Table IN (SELECT ID_Table-1
                    FROM   SameLogTable
                    WHERE  <SameCondition>) OR
       ID_Table IN (SELECT ID_Table
                    FROM   SameLogTable
                    WHERE  <SameCondition>) OR
       ID_Table IN (SELECT ID_Table+1
                    FROM   SameLogTable
                    WHERE  <SameCondition>)

This query runs on a Logging Table, and I want to select particular events, but also the event preceding and following those events.
This solution feels ugly and inefficient, what would be the better way to write the for this?

Example: If I am interested in lines with ID 4, and 23 of a LogTable, I want to get the following result:

ID    Column1    Column2    ...
3     ...        ...
4     ...        ...
5     ...        ...
22    ...        ...
23    ...        ...
24    ...        ...

These are all lines from the Same LogTable, except I specify Lines 4 and 23 using a WHERE , and I want the Select to automatically return rows 3,5 for 4 and rows 22, 24 for 23.

To Summarize the results:

MyQuery:     16s
UNION ALL:    4s
Join:        ~0s

Thanks for the replies!

Best Answer

Assuming that ID is the primary key of the table (or has a unique constraint), you can use this variant:

SELECT t.* 
FROM   SameLogTable AS t
  JOIN ( SELECT id
         FROM   SameLogTable 
         WHERE  <SameCondition>
       ) AS c
       ON t.id = c.id -1
       OR t.id = c.id
       OR t.id = c.id +1 ;

Note: if the id have gaps, which is very probable, the above will not work as expected and neither will your original query. ROW_NUMBER() can help you for that:

; WITH cte AS
( SELECT t.*,
         rn = ROW_NUMBER() OVER (ORDER BY t.id)
  FROM   SameLogTable AS t
) 
SELECT t.*
FROM   cte AS t
  JOIN ( SELECT rn
         FROM   cte 
         WHERE  <SameCondition>
       ) AS c
       ON t.rn = c.rn - 1
       OR t.rn = c.rn
       OR t.rn = c.rn + 1 ;

Related Solutions

Mysql – How to improve this query

First, since you already have a UNIQUE index that contains the user_id, you should be able to get rid of the id field, and use the UNIQUE index as the PRIMARY KEY:

CREATE TABLE `twitter_relationships` (
  `user_id` int(11) NOT NULL,
  `source_twitter_id` bigint(20) NOT NULL,
  `target_twitter_id` bigint(20) NOT NULL,
  `relationship_status` tinyint(1) NOT NULL,
  `status_change_date` int(11) unsigned DEFAULT NULL,
  PRIMARY KEY (`user_id`,`source_twitter_id`,`target_twitter_id`),
  KEY `target_status_and_change_date_index`
    (`user_id`,`target_twitter_id`,`relationship_status`,`status_change_date`),
  KEY `user_id_index` (`user_id`,`status_change_date`)
) ENGINE=InnoDB AUTO_INCREMENT=116597775 DEFAULT CHARSET=latin1
PARTITION BY HASH (user_id) PARTITIONS 1000;

Unfortunately, while this removes an index, it may increase storage requirements, due to the way that InnoDB indexes data. See "How Secondary Indexes Relate to the Clustered Index" in http://dev.mysql.com/doc/refman/5.6/en/innodb-table-and-index.html

Second, while the source_and_target index has two of the three fields in your WHERE clause, MySQL will have to do an additional read to find the relationship_status.

Therefore, to improve performance, create an index that includes all three fields in your WHERE clause:

CREATE INDEX user_source_status ON twitter_relationships
    (`user_id`,`source_twitter_id`,`relationship_status`);

Then, if MySQL doesn't use this index automatically, you can force using it, with:

SELECT target_twitter_id 
 FROM `twitter_relationships` FORCE INDEX (user_source_status)
WHERE (`twitter_relationships`.`user_id` = ? 
   AND `twitter_relationships`.`source_twitter_id` = ?
   AND `twitter_relationships`.`relationship_status` = ?) 
LIMIT ?, ?

Lastly, you're missing the UNSIGNED attribute on the id, user_id, source_twitter_id, and target_twitter_id fields. I'm guessing these fields will never store negative values, so it would make sense to make them UNSIGNED.

Sql-server – the best way to get all data for a date range, plus the last event just before the range

I am going to assume that there isn't an index on the date columns, otherwise I think that the query would have been structured differently. If there is, you can probably find a better performing one than this.

The advantage of this query is that it can get all the data in one scan. The disadvantage is that it has to sort the data and join EventEmployee on the entire table. So as always, test with your own situation. This query also assumes that the MAX date is either unique or that equivalent rows would be acceptable.

USE AdventureWorks2012
GO
;
WITH Base AS (
   SELECT 
      TransactionHistory.*
      ,ProductVendor.BusinessEntityID
      ,MAX(CASE WHEN TransactionDate < '2008-08-01' THEN TransactionDate END) 
           OVER (PARTITION BY ProductVendor.BusinessEntityID) AS PreviousVendorTransaction
      ,COUNT(CASE WHEN TransactionDate >= '2008-08-01' THEN 1 END ) 
           OVER (PARTITION BY ProductVendor.BusinessEntityID) AS VendorAfterCutoff
   FROM
      Production.TransactionHistory
      -- Doesn't make the most sense, but I need a repeating relation
      INNER JOIN Purchasing.ProductVendor
         ON TransactionHistory.ProductID = ProductVendor.ProductID
),
Filtered AS (
   SELECT
      *
   FROM
      Base
   WHERE
      Base.TransactionDate >= '2008-08-01'
      OR (TransactionDate = PreviousVendorTransaction AND VendorAfterCutoff > 0)
)
SELECT DISTINCT
   TransactionID
   ,ProductID
   ,ReferenceOrderID
   ,ReferenceOrderLineID
   ,TransactionDate
   ,TransactionType
   ,Quantity
   ,ActualCost
   ,ModifiedDate
FROM
   Filtered

Edit:

Hmm, I think I may have to take back my comment on structuring it differently if there are indexes. The other suggestions that I have are probably fairly minor.

Make sure the query is using the indexes you're expecting it to. Start and End date to build temp table, end date to drive the previous event loop.
If the query to build the temp table is doing a lookup on the clustered index, it may be better to hold off and do that as part of the main query.
Try using a cte instead of a temp table. I think that a cte might be more competitive with the way that the query is structured below.
If you are returning a lot of events, it might be better to pull out the event table lookup to the main query to give the optimizer the option of doing a merge join.
I don't see a way of optimizing the previous event lookup short of an indexed view.

Here's a query that combines a few of those ideas.

SELECT
    e.[EventID]
INTO #EventTemp
FROM
    [Events] AS e
WHERE
    ( e.[EventStart] >= @StartDate AND e.[EventStart] <= @EndDate )
    OR ( e.[EventEnd] >= @StartDate AND e.[EventEnd] <= @EndDate )

;
WITH PrevEvent AS (
    SELECT
        EmpPrevEvent.[EventID]
    FROM
    (
        SELECT DISTINCT
            ee.[EmployeeID]
        FROM
            #EventTemp
            INNER JOIN [EventEmployee] AS ee ON
                #EventTemp.[EventID] = ee.[EventID]
    ) AS Emp
    CROSS APPLY (
        SELECT TOP 1
            e.[EventID]
        FROM
            [Events] AS e
            INNER JOIN [EventEmployee] AS ee ON
                e.[EventID] = ee.[EventID]
        WHERE
            ee.[EmployeeID] = Emp.[EmployeeID]
            AND e.[EventEnd] < @StartDate
        ORDER BY 
            e.[EventEnd] DESC
    ) AS EmpPrevEvent
)
SELECT
    e.[EventID],
    e.[EventStart],
    e.[EventEnd],
    e.[EventTypeID]
FROM
    [Events] AS e
WHERE
    e.EventID IN (
        SELECT EventID
        FROM #EventTemp
        UNION
        SELECT EventID
        FROM PrevEvent
    )

Best Answer

Related Solutions

Mysql – How to improve this query

Sql-server – the best way to get all data for a date range, plus the last event just before the range

Related Question