For SQL Server, you could do something like this.
DECLARE @t TABLE (Id INT, DateVal DATETIME, ValueVal DECIMAL(18, 9));
INSERT @t ( Id, DateVal, ValueVal )
SELECT Id, DateVal, ValueVal
FROM (
VALUES
(12410, '01/03/2017 12:17', 0.000178),
(12410, '01/10/2017 11:36', 0.000186),
(12410, '01/17/2017 11:27', 0.000189),
(12410, '01/24/2017 13:09', 0.000182),
(12410, '01/31/2017 10:37', 0.000169),
(12410, '02/07/2017 11:03', 0.000214),
(12410, '02/14/2017 11:52', 0.000176),
(12410, '02/21/2017 10:51', 0.000200),
(12410, '02/28/2017 12:29', 0.000194),
(12410, '03/07/2017 08:39', 0.000206)
) x (Id, DateVal, ValueVal);
SELECT DATEPART(MONTH, t.DateVal) AS [DateVal],
MAX(t.ValueVal) AS MaxVal
FROM @t AS t
WHERE t.DateVal >= DATEADD(MONTH, -3, DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0))
GROUP BY DATEPART(MONTH, t.DateVal);
Updated WHERE
clause:
Flattening dates is weird, and my previous query got you today's date minus three months. To get back to the first of three months ago, you have to turn some tricks.
SELECT DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0) AS [First Of This Month],
DATEADD(MONTH, -3, DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0)) AS [First Of Three Months Ago]
You can either take some time trying to understand this, or keep a cheat sheet of how to do it handy so you don't have to remember awful date math ;)
It looks like for your final result set you want one row for each distinct value of clientname
in @NumOfLeads
and @PurchasedLicenses
. If the client is in both tables then populate the numoflicensesrequested
and numlicensespurchased
columns. If not then populate the one column that you have. To me, that business logic sounds straightforward enough that you don't need to split it up into multiple queries. Sample data:
CREATE TABLE #NumOfLeads
(
clientname varchar(500)
, numoflicensesrequested int
, calldate date
, PRIMARY KEY (clientname)
);
CREATE TABLE #PurchasedLicenses
(
clientname varchar(500)
, numoflicensespurchased int
, purchasedate date
, PRIMARY KEY (clientname)
)
Insert Into #NumOfLeads (clientname, numoflicensesrequested, calldate) Values
('Client A', 10, '2017-01-01'), ('Client B', 5, '2017-01-02')
,('Client C', 7, '2017-01-01'), ('Client D', 12, '2017-01-03');
Insert Into #PurchasedLicenses (clientname, numoflicensespurchased, purchasedate) Values
('Client A', 10, '2017-01-10'), ('Client C', 5, '2017-01-15'), ('Client E', 10, '2017-01-15')
,('Client F', 11, '2017-01-15'), ('Client G', 5, '2017-01-22');
This query returns the same results as yours:
SELECT
nol.clientname
, nol.numoflicensesrequested
, COALESCE(pl.numoflicensespurchased, 0) AS numoflicensespurchased
FROM #NumOfLeads nol
LEFT OUTER JOIN #PurchasedLicenses pl ON nol.clientname = pl.clientname
AND pl.purchasedate BETWEEN '2017-01-01' AND '2017-01-31'
WHERE nol.calldate BETWEEN '2017-01-01' AND '2017-01-31'
UNION ALL
SELECT
pl.clientname
, 0
, pl.numoflicensespurchased
FROM #PurchasedLicenses pl
WHERE pl.purchasedate BETWEEN '2017-01-01' AND '2017-01-31'
AND NOT EXISTS (
SELECT 1
FROM #NumOfLeads nol
WHERE nol.clientname = pl.clientname
)
order by clientname asc;
I believe that the query could also be written with a FULL OUTER JOIN
, but I'm not personally a fan of that syntax.
A few things worth mentioning:
Don't use table variables unless you need the functionality that they provide. Table variables don't have statistics so they can cause poor query performance if you use them in the wrong way. Personally, I only consider table variables if I need to persist data through a transaction rollback or if I have code that will run thousands of times per second (or more).
The fact that you're updating by clientname
implies that it's the primary key of the data. Why not define it as the primary key?
For complex queries you can see performance gains by splitting them up into multiple steps by saving intermediate results to temp tables. I would only try that if I couldn't get good enough performance with a single query. For this data and business logic I think that you can just write a single SELECT
query. For other queries which are too complex, try doing inserts into temp tables. That will give you minimal logging and other benefits. Avoid doing updates (when practical), and avoid using MERGE unless you really need its functionality.
Best Answer
Your two queries are fundamentally different. The fast query is right-joining the TourneyPlayers table to Tourney, and then filtering. Based on the order of operations, it is likely seeking records on Tourney first, then seeking records on TourneyPlayers that match the TourneyId using the correct index.
In the slow query, you're inner joining the tables, so the engine is scanning the TourneyPlayers clustered index and seeking the Tourney records using the date field index. It then has to combine these record sets to produce the final result set.
You need to determine which join you require for the relationship between your tables and the result set you need, then from there you can construct your query correctly and assess your indexing strategy for these tables.
For example, do you want to return every Tourney record regardless of whether there is an associated TourneyPlayer record? Do you want every TourneyPlayer record, regardless of whether there is an associated Tourney record? Do you want only those Tourney records that have an associated TourneyPlayer record?
Once you've answered this question, you will have the answer for your join and can proceed from there with query design and index optimisation.