SQL Server – No Join Predicate warning

execution-plan, plan-cache, sql-server, sql-server-2016

This morning, our SQL Server 2016 instance started running at a high CPU percentage (90%) when it is normally at about 30%.

Reviewing the plan cache with sp_blitzcache, I saw a new plan for an old query at the top of CPU time.

The full query plan is here.

This plan had a No Join Predicate warning.

(@TokenCentro nvarchar(36), @Tarjeta nvarchar(max), @TarjetaAux nvarchar(max))
SELECT   Socios.id, Socios.id AS IdSocio, Centros.id AS IDCentro, Centros.nombre,
         ISNULL(NombreWEB, Centros.Nombre) AS NombreWeb, Socios.Token, Foto, ISNULL(Socios.IdIdioma, 1) AS IdIdioma,
         AplicacionPropia, ISNULL(Socios.IdConf_Regional, 1) AS IdUnidad, Socios_Asignar_Centro.AplicacionWEB,
         CASE WHEN Socios.Email = '' THEN 1
              ELSE CASE WHEN Socios.Email IS NULL THEN 1
                        ELSE 0
                   END
         END AS ConEmail, Centros.HorarioVisible, Centros.PerfilApp, TactilEntrenoSocio, Centros.ColorKiosco,
         Conf_Regional.PrimerDia, ISNULL(Centros.IdGrupoApp, 0) AS IdGrupoApp, Centros.requiere_LOPD,
         ISNULL(Centros.LOPD, '') AS LOPD, Socios.aceptada_lopd, Centros.IntegracionDieta, Empresas.Chat, Centros.field,
         Centros.valoracionApp, Empresas.p_Premios, Centros.Reservas, ISNULL(CustomCss, '') AS CustomCss,
         Socios_Asignar_Centro.PuedeReservar, zonasHorarias.Zona, zonasHorarias.ZonaIana,
         ISNULL(Socios.TokenExten, '') AS TokenExtern, Centros.field1, Centros.tipoServicioPush,
         Socios.Nombre AS NombreSocio, Socios.Apellidos AS ApellidosSocio, Centros_Opciones.forzarLogin
FROM     Socios
INNER JOIN Socios_Asignar_Centro
ON Socios_Asignar_Centro.IdSocio = Socios.Id
INNER JOIN Conf_Regional
ON Conf_Regional.Id = Socios.IdConf_Regional
INNER JOIN Centros
ON Centros.Id = Socios_Asignar_Centro.IdCentro
   AND Centros.FechaBaja IS NULL
INNER JOIN Empresas
ON Empresas.Id = Centros.IdEmpresa
INNER JOIN zonasHorarias
ON Socios.idzonahoraria = zonasHorarias.id
INNER JOIN Centros_Opciones
ON Centros_Opciones.IdCentro = Centros.Id
INNER JOIN Centros_Tactiles
ON Centros_Tactiles.IdCentro = Centros.Id
WHERE    Socios.tarjeta IS NOT NULL
         AND Socios.tarjeta <> ''
         AND Socios.tarjeta <> '0'
         AND (
                 Socios.Tarjeta = @Tarjeta
                 OR Socios.Tarjeta = @TarjetaAux )
         AND Centros_Tactiles.Token = @TokenCentro
         AND (
                 Socios_Asignar_Centro.FechaBaja IS NULL
                 OR Socios_Asignar_Centro.FechaBaja > CONVERT(DATE, GETUTCDATE()))
ORDER BY IdCentro, IdSocio DESC

After changing the length of the parameters from nvarchar(max) to nvarchar(50), the new plan was normal again. However, I cannot see where the problem is.

Any ideas?

(Screenshot: No Join Predicate warning)

Best Answer

The first thing that jumps out to me is the difference in the table access method on Socios. Both queries use the 79271_79270_Socios index, but they use it in very different ways. The slow query does a seek to find all rows with Tarjeta not equal to ''. The query optimizer expects to read 6496920 rows out of the 6498640 total rows in the table. A residual predicate then filters out rows equal to '0', and a separate Filter operator restricts the rows to those that equal @Tarjeta OR @TarjetaAux, which are NVARCHAR(MAX) variables. This strategy seems odd and not a very efficient use of an index. You said that changing the data type helped, so let's do some simple testing to see if SQL Server handles NVARCHAR(MAX) predicates in some special way.

I'm going to assume that the Tarjeta column has a data type of NVARCHAR(50), but I don't think that's particularly important for this demo. Here is my sample data:

CREATE TABLE #demo (col NVARCHAR(50));

-- Load a few million rows; each value of col repeats 1000 times
INSERT INTO #demo WITH (TABLOCK)
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) / 1000
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;

CREATE INDEX ix ON #demo (col);

For this query I get a nice, simple index seek that has a low cost and only expects to read and return 1000 rows:

SELECT col
FROM #demo 
WHERE col = '123';

(Plan image: simple index seek)

This query has the same plan:

SELECT col
FROM #demo 
WHERE col <> '0' AND col = '123';

It looks like the optimizer is smart enough to eliminate the col <> '0' filter because it's redundant.

If I cast my filter value to NVARCHAR(MAX) things take an unexpected turn:

SELECT col
FROM #demo 
WHERE col = CAST('123' AS NVARCHAR(MAX));

(Plan image: seek with a nested loops join)

The performance of the query is fine. It still reads only 1000 rows from the index. However, the plan is much more complicated. It looks like some information is missing from the XML, and I can't figure out how to properly read it. Within the XML I found this:

<ScalarOperator ScalarString="GetRangeWithMismatchedTypes(CONVERT(nvarchar(max),'123',0),CONVERT(nvarchar(max),'123',0),(62))">

This seems like a hint that SQL Server has to do some kind of conversion to deal with the nvarchar(max) constant value. You can find more information about this in Dynamic Seeks and Hidden Implicit Conversions.
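If you want to look for that yourself, one option (a minimal sketch, reusing the #demo query from above) is to capture the actual plan XML for the statement and search it for GetRangeWithMismatchedTypes:

-- Returns the actual execution plan XML alongside the results;
-- search the plan output for GetRangeWithMismatchedTypes to find
-- the dynamic seek range computation.
SET STATISTICS XML ON;

SELECT col
FROM #demo
WHERE col = CAST('123' AS NVARCHAR(MAX));

SET STATISTICS XML OFF;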

If I make the query a little more complicated I get a similar access pattern to what you experienced with the bad query:

SELECT col
FROM #demo WITH (FORCESEEK)
WHERE col <> '0' AND col = CAST('123' AS NVARCHAR(MAX));

(Plan image: forced seek reading the whole index)

The FORCESEEK hint was an attempt to get a better query plan, but it didn't work. SQL Server will read all of the rows from the index, and this plan has a much higher cost than before. Note that if we prevent SQL Server from using an index seek for the <> '0' filter, we can get a much more reasonable plan:

SELECT col
FROM #demo 
WHERE LEFT(col, 50) <> '0' AND col = CAST('123' AS NVARCHAR(MAX));

(Plan image: simple seek with the LEFT workaround)

However, a better solution is to change the data type if possible. When switching to NVARCHAR(50), as you did, I get a nice, simple index seek with this code:

DECLARE @Tarjeta nvarchar(50) = '123';

SELECT col
FROM #demo 
WHERE col <> '0' AND col = @Tarjeta;

In conclusion, the unnecessary use of NVARCHAR(MAX) appears to result in extra work in some cases and unnecessarily inefficient plans in others. Based on the information that we have, changing the variable's type seems to be the right preventative step.
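If the statement text can't be changed because the application sends it as a parameterized batch, the same fix applies at the parameter declaration. As a sketch against the #demo table (the parameter name here is just illustrative), sizing the parameter in sp_executesql to match the column also produces the simple seek:

-- Declaring the parameter as nvarchar(50) instead of nvarchar(max)
-- in the parameterized call avoids the mismatched-type dynamic seek.
EXEC sys.sp_executesql
    N'SELECT col FROM #demo WHERE col <> N''0'' AND col = @Tarjeta;',
    N'@Tarjeta nvarchar(50)',
    @Tarjeta = N'123';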

Regarding the No Join Predicate warning, you always see that when SQL Server implements part of your query as a cross join. Check out StarJoinInfo in Execution Plans for further information on why that might happen. Sometimes cross joins are good for performance and the right thing to do. Sometimes they may not be ideal but the cross join may be part of a plan that is good enough or the best plan that was found before the optimizer ran out of moves. The query optimizer assigned a very low cost to the cross join based on row estimates. Joining 414 rows to 1 row in a cross join isn't really a big deal, but it could be a problem if those estimates are inaccurate.
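To see the warning in isolation, here is a contrived sketch against the #demo table: the ON clause below only references one side of the join, so the optimizer turns the inner join into a cross join, and the resulting nested loops join carries the No Join Predicate warning:

-- The ON clause compares d1 to itself, so nothing relates d1 to d2;
-- the join degenerates into a cross join and the plan shows the
-- No Join Predicate warning. TOP keeps the result set small.
SELECT TOP (10) d1.col
FROM #demo AS d1
INNER JOIN #demo AS d2
    ON d1.col = d1.col;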

It's also hard to say why this plan was suddenly worse than before, especially given what we observed in the simple NVARCHAR(MAX) testing above. Perhaps there was parameter sniffing and SQL Server cached a plan generated from unlucky parameters. It's important to realize that SQL Server doesn't know anything about the values of the local variables in your plan without a RECOMPILE hint. The better query that you posted has hardcoded values, which isn't a fair comparison to the other query, which required SQL Server to guess or to use a cached plan.
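One way to test that theory (a sketch, again using #demo) is OPTION (RECOMPILE): with it, SQL Server can embed the runtime value of a local variable or parameter into the plan instead of guessing, so you can compare those estimates against the cached plan:

DECLARE @Tarjeta NVARCHAR(50) = N'123';

-- Without RECOMPILE the optimizer estimates from density information;
-- with it, the actual value of @Tarjeta is used for cardinality estimation.
SELECT col
FROM #demo
WHERE col <> N'0' AND col = @Tarjeta
OPTION (RECOMPILE);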