There's a hint about this in the documentation for CCI query performance:
Plan for enough memory to create columnstore indexes in parallel
Creating a columnstore index is by default a parallel operation unless memory is constrained. Creating the index in parallel requires more memory than creating the index serially. When there is ample memory, creating a columnstore index takes on the order of 1.5 times as long as building a B-tree on the same columns.
The memory required for creating a columnstore index depends on the number of columns, the number of string columns, the degree of parallelism (DOP), and the characteristics of the data. For example, if your table has fewer than one million rows, SQL Server will use only one thread to create the columnstore index.
That quote suggests that the number of rows in the table matters for DOP but doesn't matter for the memory grant. The memory required also depends on the number of columns, so let's test with a table that has fewer columns to more easily see the expected difference in memory grant. I'm using this table and index definition:
DROP TABLE IF EXISTS dbo.MY_SECOND_FACT_TABLE;
CREATE TABLE dbo.MY_SECOND_FACT_TABLE (
ID BIGINT NOT NULL,
COL1 BIGINT NULL
);
INSERT INTO dbo.MY_SECOND_FACT_TABLE WITH (TABLOCK) (ID, COL1)
SELECT TOP (0) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), NULL
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
OPTION (MAXDOP 1);
CREATE CLUSTERED COLUMNSTORE INDEX MY_SECOND_CCI ON dbo.MY_SECOND_FACT_TABLE WITH (MAXDOP = 1);
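For the tests with rows in the table, only the TOP value in the INSERT and the MAXDOP on the index need to change. For example, a 2 million row test at MAXDOP 2 can be set up with something like this (a sketch of the pattern, after re-creating the empty table as above):
INSERT INTO dbo.MY_SECOND_FACT_TABLE WITH (TABLOCK) (ID, COL1)
SELECT TOP (2000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), NULL
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
OPTION (MAXDOP 1);

CREATE CLUSTERED COLUMNSTORE INDEX MY_SECOND_CCI
ON dbo.MY_SECOND_FACT_TABLE WITH (MAXDOP = 2);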
Below are results from a few tests:
╔══════════════════╦════════════════╦═════════════════╗
║ REQUESTED_MAXDOP ║ NUMBER_OF_ROWS ║ MEMORY_GRANT_KB ║
╠══════════════════╬════════════════╬═════════════════╣
║                1 ║              0 ║           89928 ║
║                2 ║              0 ║           89928 ║
║                1 ║        2000000 ║           89928 ║
║                2 ║        2000000 ║          179896 ║
║                1 ║        5000000 ║           89928 ║
║                2 ║        5000000 ║          179896 ║
╚══════════════════╩════════════════╩═════════════════╝
With 0 rows the DOP is downgraded to 1, but otherwise the number of rows doesn't appear to matter for the memory grant. This is undoubtedly a gross oversimplification, but perhaps SQL Server builds the CCI one rowgroup at a time per DOP and maintains the string dictionaries as it goes along. That sounds fine in practice and means that the memory grant won't scale with the number of rows in the table, but without a check for a very low number of rows SQL Server could request an excessive memory grant. The excessive grant wouldn't matter in most cases because the CREATE INDEX is so fast, but it can matter if SQL Server has to wait on the memory grant.
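One rough way to check whether a session is stuck waiting for a grant (a sketch, not a full monitoring query) is to look for requests that have asked for memory but not yet received it; grant_time stays NULL while the request queues on RESOURCE_SEMAPHORE:
SELECT session_id, requested_memory_kb, required_memory_kb, wait_time_ms, timeout_sec
FROM sys.dm_exec_query_memory_grants
WHERE grant_time IS NULL;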
So now I have an idea of why the memory grant is so large, but what can be done about it? SQL Server 2014 introduced syntax to define an index as part of the table definition, which makes it possible to write a single statement that creates both the table and the index:
INDEX index_name CLUSTERED COLUMNSTORE
Applies to: SQL Server 2014 through SQL Server 2016 and Azure SQL Database.
Specifies to store the entire table in columnar format with a clustered columnstore index. This always includes all columns in the table. The data is not sorted in alphabetical or numeric order since the rows are organized to gain columnstore compression benefits.
For the original table the syntax would look like this:
CREATE TABLE dbo.MY_FIRST_FACT_TABLE (
ID BIGINT NOT NULL,
COL1 BIGINT NULL,
COL2 BIGINT NULL,
COL3 BIGINT NULL,
COL4 BIGINT NULL,
COL5 BIGINT NULL,
COL6 BIGINT NULL,
COL7 BIGINT NULL,
COL8 BIGINT NULL,
COL9 BIGINT NULL,
COL10 BIGINT NULL,
STRING1 VARCHAR(100) NULL,
STRING2 VARCHAR(100) NULL,
STRING3 VARCHAR(100) NULL,
STRING4 VARCHAR(100) NULL,
STRING5 VARCHAR(100) NULL,
INDEX MY_FIRST_CCI CLUSTERED COLUMNSTORE
);
A CREATE TABLE statement does not generate an actual plan, so I can't immediately tell if this reduced the memory grant. I tried creating the table 1000 times while hammering the sys.dm_exec_query_memory_grants DMV but never got any results. That suggests that memory was not granted, but it isn't proof. I also tried extended events with query_memory_grant_usage and didn't get any results for the single CREATE TABLE statement.
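The polling can be done with a tight loop from a second session along these lines (a sketch; the temp table name and loop count are arbitrary):
DROP TABLE IF EXISTS #grants;
SELECT TOP (0) * INTO #grants FROM sys.dm_exec_query_memory_grants;

DECLARE @i INT = 0;
WHILE @i < 100000
BEGIN
    INSERT INTO #grants
    SELECT *
    FROM sys.dm_exec_query_memory_grants
    WHERE session_id <> @@SPID;

    SET @i += 1;
END;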
Of course, the best way to test is to create the CCI under the original workflow conditions with limited available memory. I did that and did not experience any memory timeouts.
I suppose it could be said that defining the CCI in the CREATE TABLE statement for a new table is a best practice. However, if your workload causes this error then changing the CREATE TABLE script probably isn't sufficient to address other problems that you'll run into, such as memory issues when you start loading data into the CCIs.
The first thing that jumps out to me is the difference in table access method on Socios. Both queries use the 79271_79270_Socios index, but they use it in very different ways. The slow query does a seek to find all rows with Tarjeta not equal to ''. The query optimizer expects to read 6496920 rows out of 6498640 total rows in the table. A predicate is applied to filter out rows not equal to '0'. A filter operator is applied to restrict the rows down to those that equal @Tarjeta or @TarjetaAux, which are NVARCHAR(MAX) variables. This strategy seems odd and not a very efficient use of an index. You said that changing the data type helped, so let's do some simple testing to see if SQL Server handles NVARCHAR(MAX) predicates in some special way.
I'm going to assume that the Tarjeta column has a data type of NVARCHAR(50), but I don't think that's particularly important for this demo. Here is my sample data:
CREATE TABLE #demo (col NVARCHAR(50));

-- integer division by 1000 gives about 1000 rows per distinct value
INSERT INTO #demo WITH (TABLOCK)
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) / 1000
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;

CREATE INDEX ix ON #demo (col);
For this query I get a nice, simple index seek that has a low cost and only expects to read and return 1000 rows:
SELECT col
FROM #demo
WHERE col = '123';
This query has the same plan:
SELECT col
FROM #demo
WHERE col <> '0' AND col = '123';
It looks like the optimizer is smart enough to eliminate the col <> '0' filter because it's redundant.
If I cast my filter value to NVARCHAR(MAX), things take an unexpected turn:
SELECT col
FROM #demo
WHERE col = CAST('123' AS NVARCHAR(MAX));
The performance of the query is fine. It still only reads 1000 rows from the index. However, the plan is much more complicated. It looks like some information is missing from the XML and I can't figure out how to properly read it. Within the XML I found this:
<ScalarOperator ScalarString="GetRangeWithMismatchedTypes(CONVERT(nvarchar(max),'123',0),CONVERT(nvarchar(max),'123',0),(62))">
That seems like a hint that SQL Server has to do some kind of conversion to deal with the nvarchar(max) constant value. You can find more information about this in Dynamic Seeks and Hidden Implicit Conversions.
If I make the query a little more complicated I get a similar access pattern to what you experienced with the bad query:
SELECT col
FROM #demo WITH (FORCESEEK)
WHERE col <> '0' AND col = CAST('123' AS NVARCHAR(MAX));
The FORCESEEK hint was an attempt to make the query plan better, but it didn't work. SQL Server will read all of the rows from the index, and this plan has a much higher cost than before. Note that if we prevent SQL Server from applying an index seek on the <> '0' filter we can get a much more reasonable plan:
SELECT col
FROM #demo
WHERE LEFT(col, 50) <> '0' AND col = CAST('123' AS NVARCHAR(MAX));
However, a better solution is to change the data type if possible. When switching to NVARCHAR(50) as you did, I get a nice, simple index seek with this code:
DECLARE @Tarjeta nvarchar(50) = '123';
SELECT col
FROM #demo
WHERE col <> '0' AND col = @Tarjeta;
In conclusion, the unnecessary use of NVARCHAR(MAX) appears to be resulting in extra work in some cases and leading to unnecessarily inefficient plans in others. Based on the information that we have, changing the variable's type seems to be the right preventative step.
Regarding the No Join Predicate warning, you always see that when SQL Server implements part of your query as a cross join. Check out StarJoinInfo in Execution Plans for further information on why that might happen. Sometimes cross joins are good for performance and the right thing to do. Sometimes they may not be ideal but the cross join may be part of a plan that is good enough or the best plan that was found before the optimizer ran out of moves. The query optimizer assigned a very low cost to the cross join based on row estimates. Joining 414 rows to 1 row in a cross join isn't really a big deal, but it could be a problem if those estimates are inaccurate.
It's also hard to say why this plan is suddenly worse than before, especially given what we observed with the simple NVARCHAR(MAX) testing above. Perhaps there was parameter sniffing and SQL Server cached a plan generated from unlucky parameters. It's important to realize that SQL Server doesn't know anything about the values of the local variables in your plan without a RECOMPILE hint. The better query that you posted has hardcoded values, which isn't a fair comparison to the other query, which required SQL Server to guess or to use a cached plan.
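To illustrate with the #demo table from above (just a sketch of the idea), a RECOMPILE hint lets the optimizer see the value of the local variable at compile time, much like the hardcoded version does:
DECLARE @Tarjeta NVARCHAR(50) = '123';

SELECT col
FROM #demo
WHERE col <> '0' AND col = @Tarjeta
OPTION (RECOMPILE);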
Best Answer
This usually results from cardinality estimate issues (though I believe other things can also cause it). You should look at the actual execution plan and see if you have any misestimates by comparing the Estimated Number of Rows to the Actual Number of Rows. If you update your question with the execution plan (you can upload it to Paste The Plan and link it in your question), then we can give you better direction on the source of your excessive memory grant issue.
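In the meantime, a quick way to capture the actual plan is to run the problem query with SET STATISTICS XML turned on and compare the estimated and actual row counts per operator (the SELECT below is only a placeholder for your query):
SET STATISTICS XML ON;

SELECT COUNT(*) FROM sys.objects; -- replace with the query that gets the excessive grant

SET STATISTICS XML OFF;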