I'm using these two tables in a SQL Server 2014 database:
CREATE TABLE #FORMULAS
(
[ID] [int] IDENTITY(1,1) NOT NULL,
[DataSourceAddress] [nvarchar](255) NULL,
[Description] [nvarchar](255) NULL
)
CREATE INDEX [IX_FORMULAS]
ON #FORMULAS ([DataSourceAddress] ASC)
CREATE TABLE #PARAMETERS
(
[ID] [int] IDENTITY(1,1) NOT NULL,
[Datetime] DateTime NULL,
[ParameterName] [nvarchar](255) NULL,
[Value] float NULL
)
CREATE INDEX [IX_PARAMETERS]
ON #PARAMETERS([ParameterName] ASC)
INSERT INTO #FORMULAS ([DataSourceAddress], [Description])
VALUES ('(parameter1+parameter2+parameter3)/3', 'Formula description 1'),
('parameter4', 'Formula description 2'),
('(parameter5*1000)', 'Formula description 3');
INSERT INTO #PARAMETERS ([Datetime], [ParameterName], [Value])
VALUES ('2016-01-26 11:40:00.000', 'parameter1', 1),
('2016-01-26 11:40:00.000', 'parameter2', 2),
('2016-01-26 11:40:00.000', 'parameter3', 3),
('2016-01-26 11:40:00.000', 'parameter4', 4),
('2016-01-26 11:40:00.000', 'parameter5', 5),
('2016-01-26 11:40:00.000', 'parameter6', 6),
('2016-01-26 11:44:00.000', 'parameter1', 1),
('2016-01-26 11:44:00.000', 'parameter2', 2),
('2016-01-26 11:44:00.000', 'parameter3', 3),
('2016-01-26 11:44:00.000', 'parameter4', 4),
('2016-01-26 11:44:00.000', 'parameter5', 5),
('2016-01-26 11:44:00.000', 'parameter6', 6);
I want to select the parameters of the table #FORMULAS
that match with the column [ParameterName]
of the table #PARAMETERS
.
For example, take the value '(parameter1+parameter2+parameter3)/3' of the table #FORMULAS
and receive all the rows of the table #PARAMETERS
that contains 'parameter1','parameter2','parameter3'
.
I'm using the following select:
SELECT
A.[Datetime],
A.[ParameterName],
B.[Description],
A.[Value]
FROM
#PARAMETERS A
JOIN
#FORMULAS B ON CHARINDEX(A.[ParameterName], B.[DataSourceAddress]) > 0
The #PARAMETERS
table has more than 160000 rows (with 1174 unique ParameterNames
), #FORMULAS
has 3283 rows. The query takes a lot of time to finish, more than 10 minutes.
Is there a faster way to do the query?
Best Answer
Wildcard matching strings with good performance will always be tricky, but here's a best-effort attempt.
First off, your indexing can be improved. The index on #PARAMETERS only includes the actual
ParameterName
column, which is fine if you need a list of parameters, but whenever you need any other column, the index will be useless to SQL Server, and it'll revert to scanning the entire table instead. I would propose fixing this by changing the index to a clustered index:Clustered indexes include every column in the table, which automatically makes a clustered index a covering indexes.
From here, I've updated your query to do three things:
Here's the resulting query:
I added
WITH (FORCESCAN)
to eliminate key lookups. You can try the query with or without this hint.If I had a lot more time to work on this and it was a super-critical query, I would probably (a) remodel the tables and/or (b) consider storing the data in #FORMULAS as XML data and add XML indexes, which would eliminate the LIKE match.
Footnote: Text matching like you do here (
LIKE
andCHARINDEX
are pretty much the same here) will include formulas with the parameters "parameter10" and "parameter11" even when you're just searching for "parameter1".