Consider the following table:
CREATE TABLE [xxx].[MyTable](
[period] [date] NOT NULL,
[genusId] [int] NOT NULL,
[subjectId] [int] NOT NULL,
[waitingStageId] [tinyint] NOT NULL,
[value] [tinyint] NULL,
[savedOn] [datetime2](2) NULL,
[savedBy] [nvarchar](50) NULL,
CONSTRAINT [PK_ExpectedActivity] PRIMARY KEY CLUSTERED
(
[period] ASC,
[genusId] ASC,
[subjectId] ASC,
[waitingStageId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
INSERT [xxx].[MyTable] ([period], [genusId], [subjectId], [waitingStageId], [value], [savedOn], [savedBy]) VALUES (CAST(N'2017-05-01' AS Date), 71, 92, 1, 1, CAST(N'2017-05-29T13:45:20.6300000' AS DateTime2), NULL)
GO
INSERT [xxx].[MyTable] ([period], [genusId], [subjectId], [waitingStageId], [value], [savedOn], [savedBy]) VALUES (CAST(N'2017-05-01' AS Date), 71, 96, 1, 1, CAST(N'2017-05-29T13:45:21.2400000' AS DateTime2), NULL)
GO
INSERT [xxx].[MyTable] ([period], [genusId], [subjectId], [waitingStageId], [value], [savedOn], [savedBy]) VALUES (CAST(N'2017-05-01' AS Date), 71, 128, 0, 2, CAST(N'2017-05-29T13:45:21.6500000' AS DateTime2), NULL)
GO
INSERT [xxx].[MyTable] ([period], [genusId], [subjectId], [waitingStageId], [value], [savedOn], [savedBy]) VALUES (CAST(N'2017-05-01' AS Date), 71, 128, 1, 3, CAST(N'2017-05-29T13:45:21.4400000' AS DateTime2), NULL)
GO
INSERT [xxx].[MyTable] ([period], [genusId], [subjectId], [waitingStageId], [value], [savedOn], [savedBy]) VALUES (CAST(N'2017-06-01' AS Date), 73, 35, 0, 1, CAST(N'2017-05-26T20:05:40.2800000' AS DateTime2), NULL)
GO
INSERT [xxx].[MyTable] ([period], [genusId], [subjectId], [waitingStageId], [value], [savedOn], [savedBy]) VALUES (CAST(N'2017-06-01' AS Date), 73, 94, 0, 3, CAST(N'2017-05-26T20:05:40.6700000' AS DateTime2), NULL)
GO
INSERT [xxx].[MyTable] ([period], [genusId], [subjectId], [waitingStageId], [value], [savedOn], [savedBy]) VALUES (CAST(N'2017-06-01' AS Date), 73, 94, 1, 1, CAST(N'2017-05-26T20:05:40.4800000' AS DateTime2), NULL)
GO
INSERT [xxx].[MyTable] ([period], [genusId], [subjectId], [waitingStageId], [value], [savedOn], [savedBy]) VALUES (CAST(N'2017-06-01' AS Date), 74, 23, 0, 6, CAST(N'2017-05-26T20:03:26.5400000' AS DateTime2), NULL)
GO
INSERT [xxx].[MyTable] ([period], [genusId], [subjectId], [waitingStageId], [value], [savedOn], [savedBy]) VALUES (CAST(N'2017-06-01' AS Date), 74, 47, 0, 6, CAST(N'2017-05-26T20:03:26.7700000' AS DateTime2), NULL)
GO
INSERT [xxx].[MyTable] ([period], [genusId], [subjectId], [waitingStageId], [value], [savedOn], [savedBy]) VALUES (CAST(N'2017-06-01' AS Date), 74, 67, 0, 3, CAST(N'2017-05-26T20:03:27.1800000' AS DateTime2), NULL)
GO
I am trying to create a stored procedure that will allow us to duplicate values from the previous month into the current month — BUT ONLY if there isn't a value for the current month already.
DROP TABLE IF EXISTS #PreviousMonth
DECLARE @schemaName AS varchar(5) = 'xxx' -- To be used later as a parameter in stored procedure
DECLARE @lastMonth AS DATE = CAST(DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE())-1, 0) AS DATE) --First day of previous month
DECLARE @currentMonth AS DATE = CAST(DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0) AS DATE) --First day of current month
-- Add rows for existing month into a temporary table
SELECT @currentMonth AS period, genusId, subjectId, waitingStageId, value, GETDATE() AS savedOn, '<Automated by Stored Procedure>' AS savedBy INTO #PreviousMonth FROM [xxx].[ExpectedActivity]
WHERE period in(@lastMonth)
SELECT * INTO xxx.MyTable FROM #PreviousMonth
-- Here is where I got stuck. I tried WHERE caluses and NOT EXISTS but my skills are limited.
This is as far as I managed to get unfortunately. My skill is limited where it gets to do an INSERT with a condition with the NOT EXIST condition.
How can I make sure that I only insert rows in the xxx.MyTable where rows with the same primary key (period AND genusId AND subjectId AND waitingStageId) does not exist already?
Best Answer
Taking what your query says literally, you don't have to worry about what exists in
xxx.myTable
- because your statement will create it, therefore it won't have anything in it (or, if it already exists, your statement will fail). Presumably, that's not really what you were going for.If we assume that
xxx.myTable
does exist, and may already have some rows whose values match your defaults, then this should work.In the
NOT EXISTS
clause, you're checking if the current row from the temp table matches the primary key values for any rows already inxxx.myTable
. If you find a match for the current row, then you won'tSELECT
it (and thus, won'tINSERT
it).You can also write this as what's called an "anti-join":
Here, we're matching all the rows in the temp table with any matching records in
xxx.myTable
. via aLEFT JOIN
. Because of the left join, if there's no matching row inxxx.myTable
, we'll still return columns from that table, but all the values will be NULL. So, if we find a row whereperiod
is NULL inxxx.myTable
, that should mean that there isn't a matchingxxx.myTable
row for our temp table row. So, only those rows (temp table rows with no matchingxxx.myTable
row) will beSELECT
ed, and thus will beINSERT
ed.In most cases, I wouldn't expect dramatically different performance between the two versions; test in your specific case, to see if one seems better than the other.
UPDATE: If you want to use the previous month's data anywhere the key fields don't exists, or where they do exist but the value = 0, you've got a couple of options available.
You could delete all rows in
xxx.myTable
where the value is 0, then do the insert as above.I would be inclined to
UPDATE
the existing rows, and then perform theINSERT
as above. TheUPDATE
statement would be: