Subqueries for JSON Path in UNION Resulting in Strings Instead of JSON

jsonsql-server-2016subqueryt-sqlunion

Bear with me – the example code has bit going on and I'll do my best to explain it, below.

with ENTITIES as (
select [name] as entityName from (values ('A'), ('B'), ('C')) X([name])
),
PROPERTIES as (
select [propName] as entityName, [propValue] as propertyValue from (values ('A','1'),('A','2'),('B','3'),('UNIVERSAL','999')) Y([propName], [propValue])
),
SUBPROPERTIES as (
select [propValue] as propertyValue, [subPropValue] as subPropertyValue from (values ('1','x'),('1','y'),('2','z'),('999','xyz')) Y([propValue], [subPropValue])
)
--select * from ENTITIES
--select * from PROPERTIES

select (

    select entityName as 'entityName',
    (
        select * from 
            (
                select propertyValue as 'propertyValue',
                (
                    select subPropertyValue from SUBPROPERTIES where SUBPROPERTIES.propertyValue = PROPERTIES.propertyValue
                    FOR JSON PATH, INCLUDE_NULL_VALUES
                ) Y
                from PROPERTIES where PROPERTIES.entityName = ENTITIES.entityName 

-- For testing, comment out from here -------------------------------------------------------------------------------------
                UNION
    
                select propertyValue as 'propertyValue',
                (
                    select subPropertyValue from SUBPROPERTIES where SUBPROPERTIES.propertyValue = PROPERTIES.propertyValue
                    FOR JSON PATH, INCLUDE_NULL_VALUES
                ) Y
                from PROPERTIES where PROPERTIES.entityName = 'UNIVERSAL' 
-- For testing, stop commenting out here ----------------------------------------------------------------------------------

            ) X
        FOR JSON PATH, INCLUDE_NULL_VALUES
    ) as entityProperties
    
    from ENTITIES
    FOR JSON PATH, INCLUDE_NULL_VALUES

) jsondata

I set up 3 CTEs – representing 3 tables that link to each other. An ENTITIES record (like 'A') can have multiple PROPERTIES (like '1' and '2') and each PROPERTIES record can have multiple SUBPROPERTIES (like '1' has sub-properties 'x' and 'y').

Now the main query is simply trying to build a JSON object that will return all ENTITIES together with all of their linked PROPERTIES and the linked SUBPROPERTIES.

So the first select column just gives you the entityName – a top level query.

The next column is a subquery to source all the PROPERTIES values related to the given ENTITIES record. You'll see it returns something with alias 'propertyValue' – of course this will be a set of values – one value for each PROPERTIES record linked to the current ENTITIES record.

The subquery has a second column which is another subquery – to the SUBPROPERTIES table – similar to above.

Now here's the issue – at this middle level (reporting PROPERTIES), I actually want to source the data from a UNION.

(In this example the second query in the union is querying the same source table – so I could avoid a union by changing the WHERE clause to have or PROPERTIES.entityName = 'UNIVERSAL' – but in the real world I'm sourcing data from two different tables, so I want to aggregate with a UNION).

I understand that when using FOR XML PATH with a UNION, you need to wrap all the UNION-ed SELECTs in parentheses and give that an alias (X, in this example) and then SELECT from that – which I have done with select * from. (Ignore that select * is bad practice – that's not the point).

This approach to UNIONs works fine – so long as the queries that have been UNION-ed do not contain subqueries – but as you can see here, I have subqueries.

So – to see what I am trying to illustrate, if you comment out the UNION (and the second select), you will get well structured JSON per the following illustration. I've highlighted where the SUBPROPERTIES are shown – correctly – in structured JSON.

enter image description here

However, when I add the UNION, those subqueries return strings, rather than JSON, per illustration below. I've highlighted where the strings are.

enter image description here

I have tried various approaches to adding a JSON_QUERY function (which is recommended in some cases to parse a string as JSON) but a) failed to resolve the issue, and b) there is no actual need for it for exactly the same data when there is no UNION.

I have also tried various approaches to select * from the lowest level subqueries too – to no avail.

Thanks in advance.

Best Answer

The issue seems to be that SQL is treating the pseudo field from the inner SELECT as text rather than JSON when it has been unioned. When the outer FOR JSON is applied to this pseudo field, it tries to escape the special characters in the text field. See this link for more info.

You can negate this using the JSON_QUERY function, essentially forcing SQL Server to treat the JSON field as JSON instead of text. See this fiddle for a working example.

Query (note the use of JSON_QUERY twice, once for each inner SELECT):

with ENTITIES as (
select [name] as entityName from (values ('A'), ('B'), ('C')) X([name])
),
PROPERTIES as (
select [propName] as entityName, [propValue] as propertyValue from (values ('A','1'),('A','2'),('B','3'),('UNIVERSAL','999')) Y([propName], [propValue])
),
SUBPROPERTIES as (
select [propValue] as propertyValue, [subPropValue] as subPropertyValue from (values ('1','x'),('1','y'),('2','z'),('999','xyz')) Y([propValue], [subPropValue])
)

select entityName as 'entityName',
JSON_QUERY(
    (select propertyValue, JSON_QUERY(Y) AS subPropertyValue from 
        (
            select propertyValue as 'propertyValue',
            (
                select subPropertyValue from SUBPROPERTIES where SUBPROPERTIES.propertyValue = PROPERTIES.propertyValue
                FOR JSON PATH, INCLUDE_NULL_VALUES
            ) Y
            from PROPERTIES where PROPERTIES.entityName = ENTITIES.entityName 

-- For testing, comment out from here -------------------------------------------------------------------------------------
            UNION

            select propertyValue as 'propertyValue',
            (
                select subPropertyValue from SUBPROPERTIES where SUBPROPERTIES.propertyValue = PROPERTIES.propertyValue
                FOR JSON PATH, INCLUDE_NULL_VALUES
            ) Y
            from PROPERTIES where PROPERTIES.entityName = 'UNIVERSAL' 
-- For testing, stop commenting out here ----------------------------------------------------------------------------------

        ) X
    FOR JSON PATH, INCLUDE_NULL_VALUES
)
) as entityProperties
from ENTITIES
FOR JSON PATH, INCLUDE_NULL_VALUES

Results

[
  {
    "entityName": "A",
    "entityProperties": [
      {
        "propertyValue": "1",
        "subPropertyValue": [
          {
            "subPropertyValue": "x"
          },
          {
            "subPropertyValue": "y"
          }
        ]
      },
      {
        "propertyValue": "2",
        "subPropertyValue": [
          {
            "subPropertyValue": "z"
          }
        ]
      },
      {
        "propertyValue": "999",
        "subPropertyValue": [
          {
            "subPropertyValue": "xyz"
          }
        ]
      }
    ]
  },
  {
    "entityName": "B",
    "entityProperties": [
      {
        "propertyValue": "3",
        "subPropertyValue": null
      },
      {
        "propertyValue": "999",
        "subPropertyValue": [
          {
            "subPropertyValue": "xyz"
          }
        ]
      }
    ]
  },
  {
    "entityName": "C",
    "entityProperties": [
      {
        "propertyValue": "999",
        "subPropertyValue": [
          {
            "subPropertyValue": "xyz"
          }
        ]
      }
    ]
  }
]