Why does SQL Server require a LEFT JOIN to retrieve a status that doesn’t exist

database-internalsjoin;querysql serversql-server-2022

I was recently querying our internal database inventory tool for a list of servers, instances and databases, and was adding the corresponding status to each server, instance and database.

Relationship Diagram

Server ˂-- 1 : n --˃ Instance ˂-- 1 : n --˃ Database
   ˄                    ˄                      ˄
   |                    |                      |
   |                  1 : 1                    |
   |                    |                      |
   |                    ˅                      |
   +-- 1 : 1 --˃     Status        ˂-- 1 : 1 --+

Read as:
…a server can have multiple instances
…an instance can have multiple databases
…a server, an instance and a database can have a status

Setup

Status Table

CREATE TABLE [Status]
(
  StatusID int,
  StatusName char(20),
  );

Status Data

INSERT INTO [Status] (StatusID, StatusName)
VALUES
(1,'Productive'),
(2,'Prod ACC'),
(3,'Prod APP'),
(4,'Test'),
(5,'Test ACC'),
(6,'Test APP'),
(7,'OFFLINE'),
(8,'Reserved'),
(9,'Decommisioned');

Server Table

CREATE TABLE [Server]
(
  ServerID int,
  ServerName char(20),
  ServerStatusID int
  );

Server Data

INSERT INTO [Server] (ServerID, ServerName, ServerStatusID)
VALUES
(1,'FirstServer',1),
(2,'SecondServer',2),
(3,'ThirdServer',5),
(4,'FourthServer',8),
(5,'FifthServer',8);

Instance Table

CREATE TABLE [Instance]
(
  InstanceID int,
  ServerID int,
  InstanceName char(30),
  InstanceStatusID int
  );

Instance Data

INSERT INTO [Instance] 
  (InstanceID, ServerID, InstanceName, InstanceStatusID)
VALUES
(1,1,'GENERAL',1),
(2,1,'TAXES',1),
(3,2,'GENERAL',9),
(4,2,'SOCIAL',2),
(5,3,'GENERAL',5),
(6,3,'FBI',8),
(7,5,'COMINGSOON',8);

Database Table

CREATE TABLE [Database]
(
  DatabaseID int,
  InstanceID int,
  DatabaseName char(30),
  DatabaseStatusID int
  );

Database Data

INSERT INTO [Database]
(DatabaseID, InstanceID, DatabaseName, DatabaseStatusID)
VALUES
(1,1,'master',1),
(2,1,'model',1),
(3,1,'msdb',1),
(4,1,'UserDB1',1),
(5,2,'master',1),
(6,2,'model',1),
(7,2,'msdb',1),
(8,2,'TaxesDB',1),
(9,4,'master',2),
(10,4,'model',2),
(11,4,'msdb',2),
(12,4,'HealthCareDB',2),
(13,5,'master',5),
(14,5,'model',5),
(15,5,'msdb',5),
(16,5,'GeneralUserDB',5),
(17,6,'master',8),
(18,6,'model',8),
(19,6,'msdb',8),
(20,6,'CriminalDB',8);

SELECT Statement without Status table involved

The initial SELECT statement involved simply joining the three tables: server, instance, database and was as follows:

-- Simple SELECT to get all information on Servers, Instances and Databases
-- The status of the server, instance or database is not returned
SELECT 
  ServerName, 
  InstanceName,
  DatabaseName 
  FROM [Server] as srv
    LEFT JOIN [Instance] as ins
      ON srv.ServerID = ins.ServerID
    LEFT JOIN [Database] as dbs
      ON ins.InstanceID = dbs.InstanceID;

Results of 1. Statement

PLEASE OBSERVE THAT…

  • there is a server without an instance and database
  • there is an instances without a database
ServerName InstanceName DatabaseName
FirstServer GENERAL master
FirstServer GENERAL model
FirstServer GENERAL msdb
FirstServer GENERAL UserDB1
FirstServer TAXES master
FirstServer TAXES model
FirstServer TAXES msdb
FirstServer TAXES TaxesDB
SecondServer GENERAL null
SecondServer SOCIAL master
SecondServer SOCIAL model
SecondServer SOCIAL msdb
SecondServer SOCIAL HealthCareDB
ThirdServer GENERAL master
ThirdServer GENERAL model
ThirdServer GENERAL msdb
ThirdServer GENERAL GeneralUserDB
ThirdServer FBI master
ThirdServer FBI model
ThirdServer FBI msdb
ThirdServer FBI CriminalDB
FourthServer null null
FifthServer COMINGSOON null

SELECT Statement involving Status table

In the next statement I decide to add the status to each element (server, instance, database) and JOINed each table with the Status table as follows:

-- Advanced SELECT to get all information on Servers, Instances and Databases 
-- including their status
SELECT 
  ServerName, 
  srvst.StatusName,
  InstanceName,
  insst.StatusName,
  DatabaseName,
  dbsst.StatusName
  FROM [Server] as srv
    JOIN [Status] as srvst
      ON srv.ServerStatusID = srvst.StatusID
    LEFT JOIN [Instance] as ins
      ON srv.ServerID = ins.ServerID
    JOIN [Status] as insst
      ON ins.InstanceStatusID = insst.StatusID
    LEFT JOIN [Database] as dbs
      ON ins.InstanceID = dbs.InstanceID
    JOIN [Status] as dbsst
      ON dbs.DatabaseStatusID = dbsst.StatusID
  ;

Results of 2. Statement

To my surprise the server without an instance and database and the server with an instance but without a database were no longer listed:

ServerName StatusName InstanceName StatusName DatabaseName StatusName
FirstServer Productive GENERAL Productive master Productive
FirstServer Productive GENERAL Productive model Productive
FirstServer Productive GENERAL Productive msdb Productive
FirstServer Productive GENERAL Productive UserDB1 Productive
FirstServer Productive TAXES Productive master Productive
FirstServer Productive TAXES Productive model Productive
FirstServer Productive TAXES Productive msdb Productive
FirstServer Productive TAXES Productive TaxesDB Productive
SecondServer Prod ACC SOCIAL Prod ACC master Prod ACC
SecondServer Prod ACC SOCIAL Prod ACC model Prod ACC
SecondServer Prod ACC SOCIAL Prod ACC msdb Prod ACC
SecondServer Prod ACC SOCIAL Prod ACC HealthCareDB Prod ACC
ThirdServer Test ACC GENERAL Test ACC master Test ACC
ThirdServer Test ACC GENERAL Test ACC model Test ACC
ThirdServer Test ACC GENERAL Test ACC msdb Test ACC
ThirdServer Test ACC GENERAL Test ACC GeneralUserDB Test ACC
ThirdServer Test ACC FBI Reserved master Reserved
ThirdServer Test ACC FBI Reserved model Reserved
ThirdServer Test ACC FBI Reserved msdb Reserved
ThirdServer Test ACC FBI Reserved CriminalDB Reserved

Findings / Solution

After checking various options with a trial and error approach I found out that the JOIN on the Status table had to be changed to a LEFT JOIN to allow for the statement to display the server without an instance or a database, and to display the instance without a database:

-- Advanced SELECT to get all information on Servers, Instances and Databases 
-- including their status
SELECT 
  ServerName, 
  srvst.StatusName,
  InstanceName,
  insst.StatusName,
  DatabaseName,
  dbsst.StatusName
  FROM [Server] as srv
    LEFT JOIN [Status] as srvst
      ON srv.ServerStatusID = srvst.StatusID
    LEFT JOIN [Instance] as ins
      ON srv.ServerID = ins.ServerID
    LEFT JOIN [Status] as insst
      ON ins.InstanceStatusID = insst.StatusID
    LEFT JOIN [Database] as dbs
      ON ins.InstanceID = dbs.InstanceID
    LEFT JOIN [Status] as dbsst
      ON dbs.DatabaseStatusID = dbsst.StatusID;

Results of 3. Statement

ServerName StatusName InstanceName StatusName DatabaseName StatusName
FirstServer Productive GENERAL Productive master Productive
FirstServer Productive GENERAL Productive model Productive
FirstServer Productive GENERAL Productive msdb Productive
FirstServer Productive GENERAL Productive UserDB1 Productive
FirstServer Productive TAXES Productive master Productive
FirstServer Productive TAXES Productive model Productive
FirstServer Productive TAXES Productive msdb Productive
FirstServer Productive TAXES Productive TaxesDB Productive
SecondServer Prod ACC GENERAL Decommisioned null null
SecondServer Prod ACC SOCIAL Prod ACC master Prod ACC
SecondServer Prod ACC SOCIAL Prod ACC model Prod ACC
SecondServer Prod ACC SOCIAL Prod ACC msdb Prod ACC
SecondServer Prod ACC SOCIAL Prod ACC HealthCareDB Prod ACC
ThirdServer Test ACC GENERAL Test ACC master Test ACC
ThirdServer Test ACC GENERAL Test ACC model Test ACC
ThirdServer Test ACC GENERAL Test ACC msdb Test ACC
ThirdServer Test ACC GENERAL Test ACC GeneralUserDB Test ACC
ThirdServer Test ACC FBI Reserved master Reserved
ThirdServer Test ACC FBI Reserved model Reserved
ThirdServer Test ACC FBI Reserved msdb Reserved
ThirdServer Test ACC FBI Reserved CriminalDB Reserved
FourthServer Reserved null null null null
FifthServer Reserved COMINGSOON Reserved null null

Reference Material

Here a link to the db<>fiddle to reproduce my findings.

Question

Why does SQL Server require a LEFT JOIN on the Status table for child items that do not exist and for the query to display these items?

Best Answer

You need to nest your joins. Otherwise what is happening is that it's expecting each individual join clause to return a result, but it can't if the previous one didn't return anything.

Essentially, you want the server to take the result of the inner-join of Instance and Status, and left-join all of that back to Server.

SELECT 
  ServerName, 
  srvst.StatusName,
  InstanceName,
  insst.StatusName,
  DatabaseName,
  dbsst.StatusName
FROM Server as srv
  JOIN Status as srvst
    ON srv.ServerStatusID = srvst.StatusID
  LEFT JOIN (
      Instance as ins
      JOIN Status as insst
        ON ins.InstanceStatusID = insst.StatusID
    )
    ON srv.ServerID = ins.ServerID
  LEFT JOIN (
      Database as dbs
      JOIN Status as dbsst
        ON dbs.DatabaseStatusID = dbsst.StatusID
    )
    ON ins.InstanceID = dbs.InstanceID;

In SQL Server, the parenthesis are not essential, they are purely for readability. The key is putting JOIN table2 ON ... inside another join, so it comes out to LEFT JOIN table1 JOIN table2 ON ... ON ... ie the ON clauses are nested.

This is the exact equivalent of using derived subqueries:

SELECT 
  ServerName, 
  srvst.StatusName,
  InstanceName,
  ins.StatusName,
  DatabaseName,
  dbs.StatusName
FROM Server as srv
  JOIN Status as srvst
    ON srv.ServerStatusID = srvst.StatusID
  LEFT JOIN (
      SELECT ins.*, insst.StatusName
      FROM Instance as ins
      JOIN Status as insst
        ON ins.InstanceStatusID = insst.StatusID
    ) ins
    ON srv.ServerID = ins.ServerID
  LEFT JOIN (
      SELECT dbs.*, dbsst.StatusName
      Database as dbs
      JOIN Status as dbsst
        ON dbs.DatabaseStatusID = dbsst.StatusID
    ) dbs
    ON ins.InstanceID = dbs.InstanceID;