I have the following table:
CREATE TABLE [dbo].[Journals](
[Id] [nvarchar](450) NOT NULL,
[CorrelationId] [nvarchar](450) NOT NULL,
[Version] [int] NOT NULL,
[Type] [nvarchar](max) NOT NULL,
[Data] [nvarchar](max) NOT NULL,
[Created] [datetime2](7) NOT NULL,
[Updated] [datetime2](7) NOT NULL,
CONSTRAINT [PK_Journals] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[Journals] ADD DEFAULT (getutcdate()) FOR [Created]
GO
ALTER TABLE [dbo].[Journals] ADD DEFAULT (getutcdate()) FOR [Updated]
GO
One row could look like this:
Id: 1833eac9-3ea7-4613-be26-af3f7589d6f0
CorrelationId: 0HLNKM700ILBN:00000001
Version: 1
Type: Notification
Data: [{"type": "Event_Received","timeStamp": "2019-06-24T12:47:54.3960942Z","data": {"version": 1,"notificationName": "My Notification","receiverName": "John Doe","receiverUserIdentifier": "123456789","receiverEmail": "john.doe@example.org","senderName": "Some Sender","senderEmail": "someone@example.org","contentSubject": "This is my subject","contentBody": "And this is my body","attachmentNames": []},"sensitive": true}, {"type": "Event_ValidationSucceded","timeStamp": "2019-06-24T12:47:54.4167034Z","data": "{}","sensitive": false}, {"type": "Event_PublishStart","timeStamp": "2019-06-24T12:47:54.4691455Z","data": "{}","sensitive": false}, {"type": "Event_PublishSucceeded","timeStamp": "2019-06-24T12:47:54.4740542Z","data": "{}","sensitive": true}]
Created: 2019-06-19 12:29:20.8100000
Updated: 2019-06-19 12:30:51.1123311
I want to fetch all the Ids from the rows that contains array entries where Sensitive = true
So I have tried something like this:
SELECT Id
FROM Journals journals
CROSS APPLY OPENJSON(journals.Data)
WITH (JournalType nvarchar(255) '$.type', Sensitive bit '$.sensitive') as jsonValues
WHERE Sensitive = 1
This works but I get duplicate ids back (since the array contains more than one item where Sensitive = true).
How can I limit the query so that I don't receive duplicate Ids when the array contains more than one item where Sensitive = true?
Wanted output:
1833eac9-3ea7-4613-be26-af3f7589d6f0
Actual output:
1833eac9-3ea7-4613-be26-af3f7589d6f0
1833eac9-3ea7-4613-be26-af3f7589d6f0
I guess that I could add a DISTINCT to SELECT DISTINCT Id but that feels like cheating, or? 😀
Best Answer
You could opt to use
TOP(1)
in theCROSS APPLY
Or, as you mentioned
DISTINCT
Main difference between the two
When using the
TOP(1)
the values will be filtered and the top operator will be applied earlier in the execution planWhen using the
DISTINCT
solution, the filtering and grouping is only done after the JSON function is (cross) appliedWhich is why I would opt for the
TOP(1)
+CROSS APPLY
solution.As a side note, storing unique identifiers as
nvarchar()
is not ideal, especially when using a huge data type likenvarchar(450)
. Consider using the minimal or close to the minimal amount of space needed in your column datatypes & using theuniqueidentifier
datatype.