The collation determines the comparison semantics.
If I try
CREATE TABLE [word](
[id] [int] IDENTITY(0,1) NOT NULL,
[value] [nvarchar](255) COLLATE Latin1_General_100_CI_AS NULL
);
It returns only ἀπὸ. Changing the collation suffix to AI (accent insensitive) returns ἀπό as well.
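A minimal sketch for checking this side by side, assuming a throwaway table variable `@word` (hypothetical, standing in for the `word` table above) that holds both spellings. The only difference between the two queries is the AS versus AI suffix on the collation name.

```sql
-- Hypothetical stand-in for the word table, holding both accent variants.
DECLARE @word TABLE
(
    value nvarchar(255) COLLATE Latin1_General_100_CI_AS NULL
);

INSERT INTO @word (value)
VALUES (N'ἀπὸ'), (N'ἀπό');

-- Accent-sensitive comparison.
SELECT value
FROM @word
WHERE value = N'ἀπὸ' COLLATE Latin1_General_100_CI_AS;

-- Accent-insensitive comparison: same predicate, AI suffix.
SELECT value
FROM @word
WHERE value = N'ἀπὸ' COLLATE Latin1_General_100_CI_AI;
```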
On my install I have tried every collation: 1526 return 1 row (presumably the AS and BIN collations), 1264 return 2 rows (presumably the AI collations) and 1095 return 8 rows.
From a quick look through, this last group appears to include all the SQL collations and the 90 collations, whereas all the 100 ones are in the first two groups, so I presume this is some issue that has been fixed in the 2008 batch of collations. (See What's New in SQL Server 2008 Collations)
Script to try this yourself
DECLARE @Results TABLE
(
    Count INT,
    Collation SYSNAME
)

SET NOCOUNT ON;

DECLARE @N SYSNAME;
DECLARE @C1 AS CURSOR;

-- Iterate over every collation available on this instance
SET @C1 = CURSOR FAST_FORWARD FOR
    SELECT name
    FROM sys.fn_helpcollations();

OPEN @C1;
FETCH NEXT FROM @C1 INTO @N ;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Count the rows that match under the current collation
    INSERT @Results
    EXEC('SELECT COUNT(*), ''' + @N + ''' from word where value = N''ἀπὸ'' COLLATE ' + @N)

    FETCH NEXT FROM @C1 INTO @N ;
END

SELECT *
FROM @Results
ORDER BY Count DESC
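To see the group sizes quoted above rather than one row per collation, a summary query along these lines could be appended. This is a sketch: it has to run in the same batch as the script, because `@Results` is a table variable and goes out of scope when the batch ends. The column alias `Collations` is only illustrative.

```sql
-- Tally how many collations produced each match count.
-- Append to the same batch as the script above.
SELECT [Count],
       COUNT(*) AS Collations
FROM @Results
GROUP BY [Count]
ORDER BY [Count];
```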
Below is one method. In the outer WHERE clause, specify the number of types listed in the IN clause. A composite index on the ParentId and TypeId columns of the Childrens table will help optimize the query.
SELECT *
FROM Parents p
WHERE 2 = ( SELECT COUNT(DISTINCT c.TypeId)
            FROM Childrens c
            WHERE p.Id = c.ParentId
              AND c.TypeId IN ( 1, 2 )
          );

SELECT *
FROM Parents p
WHERE 3 = ( SELECT COUNT(DISTINCT c.TypeId)
            FROM Childrens c
            WHERE p.Id = c.ParentId
              AND c.TypeId IN ( 4, 5, 6 )
          );
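Because the literal on the left has to be kept in step with the IN list by hand, one possible variation is to hold the required types in a table variable and derive the count from it. This is only a sketch under that assumption: `@Types` is a hypothetical name, and the Parents and Childrens tables are the ones from the question.

```sql
-- Hypothetical list of required types; change the VALUES to suit.
DECLARE @Types TABLE (TypeId int PRIMARY KEY);

INSERT INTO @Types (TypeId)
VALUES (4), (5), (6);

-- A parent qualifies when it has at least one child of every listed type.
SELECT p.*
FROM Parents AS p
WHERE ( SELECT COUNT(*) FROM @Types )
    = ( SELECT COUNT(DISTINCT c.TypeId)
        FROM Childrens AS c
        WHERE c.ParentId = p.Id
          AND c.TypeId IN ( SELECT t.TypeId FROM @Types AS t )
      );
```

The primary key on `@Types` guarantees the list has no duplicates, so `COUNT(*)` equals the number of distinct required types.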
EDIT:
The original DDL in the question didn't include constraints or indexes. These aren't relevant for query functionality but are important to compare performance of alternative solutions.
Assuming the IDENTITY columns in both tables are also the primary keys and the tables are related, I added the constraints below.
ALTER TABLE Parents
ADD CONSTRAINT PK_Parents PRIMARY KEY CLUSTERED(Id);
ALTER TABLE Childrens
ADD CONSTRAINT PK_Childrens PRIMARY KEY CLUSTERED(Id);
ALTER TABLE Childrens
ADD CONSTRAINT FK_Childrens_Parents FOREIGN KEY (ParentId) REFERENCES Parents(Id);
I added candidate indexes on the columns used in JOIN or WHERE clauses and specified UNIQUE for those indexes that include the primary key column. Any index that turns out not to be used can be dropped afterward.
CREATE UNIQUE INDEX idx1 ON Parents (Id, GroupId);
CREATE INDEX idx1 ON Childrens (ParentId, TypeId);
CREATE INDEX idx2 ON Childrens (TypeId, ParentId);
Finally, I ran the original query from my answer in SSMS along with the query below, which Tpsamw1 proposed in a comment (but without ORDER BY, to level the playing field). I ran both with SET STATISTICS IO ON and with the actual execution plan included:
SELECT p.Id
, p.GroupId
FROM Parents p
INNER JOIN Childrens c ON c.ParentId = p.Id
WHERE c.TypeId IN ( 4, 5, 6 )
GROUP BY p.Id
, p.GroupId
HAVING COUNT(DISTINCT c.TypeId) = 3;
SSMS reported comparable performance (50% of batch cost for each query), but that doesn't tell the whole story. STATISTICS IO showed a higher number of logical reads for my original query, which I believe is a better indicator of performance. This suggests Tpsamw1's query performs better. Below are the stats:
My original query:
Table 'Childrens'. Scan count 15, logical reads 30, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Parents'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Tpsamw1's query:
Table 'Parents'. Scan count 4, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Childrens'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
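For anyone wanting to reproduce this kind of comparison, a sketch of the test batch might look like the following. It assumes the Parents and Childrens tables, constraints, and indexes described above already exist. The logical read counts appear on the Messages tab in SSMS, and the figures will vary with data volume and plan choice.

```sql
SET STATISTICS IO ON;

-- Correlated subquery version
SELECT *
FROM Parents p
WHERE 3 = ( SELECT COUNT(DISTINCT c.TypeId)
            FROM Childrens c
            WHERE p.Id = c.ParentId
              AND c.TypeId IN ( 4, 5, 6 )
          );

-- JOIN / GROUP BY / HAVING version
SELECT p.Id
     , p.GroupId
FROM Parents p
INNER JOIN Childrens c ON c.ParentId = p.Id
WHERE c.TypeId IN ( 4, 5, 6 )
GROUP BY p.Id
       , p.GroupId
HAVING COUNT(DISTINCT c.TypeId) = 3;

SET STATISTICS IO OFF;
```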
I then loaded additional data by copying the original test data, and updated the statistics:
INSERT INTO Parents SELECT NEWID() FROM Parents WHERE Id = 1;
INSERT INTO Childrens SELECT SCOPE_IDENTITY(), TypeId FROM Childrens WHERE ParentId = 1;
INSERT INTO Parents SELECT NEWID() FROM Parents WHERE Id = 2;
INSERT INTO Childrens SELECT SCOPE_IDENTITY(), TypeId FROM Childrens WHERE ParentId = 2;
INSERT INTO Parents SELECT NEWID() FROM Parents WHERE Id = 3;
INSERT INTO Childrens SELECT SCOPE_IDENTITY(), TypeId FROM Childrens WHERE ParentId = 3;
INSERT INTO Parents SELECT NEWID() FROM Parents WHERE Id = 4;
INSERT INTO Childrens SELECT SCOPE_IDENTITY(), TypeId FROM Childrens WHERE ParentId = 4;
INSERT INTO Parents SELECT NEWID() FROM Parents WHERE Id = 5;
INSERT INTO Childrens SELECT SCOPE_IDENTITY(), TypeId FROM Childrens WHERE ParentId = 5;
GO 1000
UPDATE STATISTICS dbo.Childrens WITH FULLSCAN;
UPDATE STATISTICS dbo.Parents WITH FULLSCAN;
GO
Running the queries again shows that the execution plans for both queries changed. Both queries now yielded the same number of logical reads, indicating that performance is about the same. The only execution plan difference was an inner merge join versus a right outer merge join.
Table 'Parents'. Scan count 1, logical reads 21, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Childrens'. Scan count 1, logical reads 54, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Sample data
Solution
The basic idea here is quite simple:
If @StartHour is less than @EndHour: simply return rows 'between' those two values
Otherwise (the range wraps around):
Return rows less than @EndHour; and
Return rows greater than @StartHour
The logic can be implemented quite naturally as:
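A minimal sketch of that three-branch logic, not the answer's original listing. It assumes a hypothetical table variable `@Test` with a `tinyint` HourPart column, a handful of made-up sample hours, and exclusive bounds. Each branch carries a predicate on the variables alone, so only the branch or branches matching the current parameter values can return rows.

```sql
-- Hypothetical sample data: one row per hour value of interest.
DECLARE @Test TABLE
(
    Id int IDENTITY(1,1) PRIMARY KEY,
    HourPart tinyint NOT NULL
);

INSERT INTO @Test (HourPart)
VALUES (0), (3), (9), (13), (18), (22), (23);

DECLARE @StartHour tinyint = 22,
        @EndHour   tinyint = 4;

-- Branch 1: ordinary range, start before end.
SELECT t.Id, t.HourPart
FROM @Test AS t
WHERE @StartHour <= @EndHour
  AND t.HourPart > @StartHour
  AND t.HourPart < @EndHour

UNION ALL

-- Branch 2: wrapped range, the part before the end hour.
SELECT t.Id, t.HourPart
FROM @Test AS t
WHERE @StartHour > @EndHour
  AND t.HourPart < @EndHour

UNION ALL

-- Branch 3: wrapped range, the part after the start hour.
SELECT t.Id, t.HourPart
FROM @Test AS t
WHERE @StartHour > @EndHour
  AND t.HourPart > @StartHour;
```

With the values shown (22 to 4), only the two wrapped branches contribute rows, giving the hours 0, 3 and 23. The table variable has no index on HourPart, so this illustrates the logic only, not the plan shape discussed below.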
Execution plan
The Filters in this plan are start-up filters. At execution time, either the top branch or the two lower branches will be executed depending on the values of the local variables at that time. Splitting the query into three simple sections allows an index on HourPart to be used effectively.
Try it on the Stack Exchange Data Explorer
Or try the updated version that converts the supplied sample data to have the HourPart column as an indexed computed column.
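As an illustration of what an indexed computed column for the hour could look like, here is a sketch. The table `dbo.HourDemo`, its `EventTime` column, and the index name are all hypothetical and are not taken from the linked sample data.

```sql
-- Hypothetical table: HourPart is derived from EventTime and persisted.
CREATE TABLE dbo.HourDemo
(
    Id        int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    EventTime datetime2(0) NOT NULL,
    HourPart  AS DATEPART(HOUR, EventTime) PERSISTED
);

-- Index the computed column so range predicates on HourPart can seek.
CREATE INDEX IX_HourDemo_HourPart
    ON dbo.HourDemo (HourPart);
```

`DATEPART(HOUR, ...)` on a `datetime2` column is deterministic, which is what allows the computed column to be persisted and indexed.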
If the start and end parameters were intended to be inclusive, simply change the < and > comparison operators to <= and >= respectively.