Sql-server – Count the nodes in a binary tree structure

recursivesql serversql-server-2012

I need to count the left and right nodes in a binary tree structure (output grouped by joiningDate), given a particular starting node parent id (pid).

The tree is stored in the table shown below:

For example using pid = 4 you would get 2 cid (5 and 11 ) then you would use those as the new pid (5, 11). When cid is null or the full tree has been traversed count all the placement = L and placement = R. Other placements like "M" should be ignored.

Illustration:

Expected output for a selected starting node of 4:

+-----------+-------------+-------+
| placement | joiningDate | Total |
+-----------+-------------+-------+
| L         | 2015-02-02  |     3 |
| R         | 2015-02-02  |     1 |
| L         | 2015-08-21  |     4 |
| L         | 2015-12-12  |     1 |
+-----------+-------------+-------+

Best Answer

This is most easily implemented in SQL Server using a Recursive Common Table Expression.

Table definition

DECLARE @binaryUser AS TABLE
(
    id          integer NOT NULL,
    joiningDate date NOT NULL,
    placement   char(1) NOT NULL,
    pId         integer NOT NULL,
    cId         integer NOT NULL,
    referBy     integer NOT NULL
);

Data

INSERT @binaryUser
    (id, joiningDate, placement, pid, cid, referBy)
VALUES
    (4, '20150202', 'L', 4, 5, 4),
    (6, '20150202', 'R', 5, 8, 4),
    (8, '20150202', 'R', 4, 11, 4),
    (9, '20150202', 'L', 5, 10, 4),
    (25, '20151212', 'L', 8, 9, 4),
    (31, '20150821', 'R', 8, 12, 4),
    (33, '20150821', 'R', 12, 13, 4),
    (36, '20150821', 'R', 9, 14, 4),
    (37, '20150821', 'M', 9, 15, 4),
    (38, '20150821', 'L', 10, 16, 4),
    (39, '20150821', 'M', 4, 17, 4);

Solution

This is presented as a script, but it is trivial to convert it to a stored procedure. The basic idea is to traverse the tree recursively, then count the rows found.

DECLARE @pId integer = 4;

-- Recursive CTE
WITH R AS
(
    -- Anchor
    SELECT 
        BU.joiningDate,
        BU.cId,
        BU.placement
    FROM @binaryUser AS BU
    WHERE
        BU.pId = @pId
        AND BU.placement IN ('L', 'R')

    UNION ALL

    -- Recursive part
    SELECT
        BU.joiningDate,
        BU.cId,
        R.placement
    FROM R
    JOIN @binaryUser AS BU
        ON BU.pId = R.cId
    WHERE
        BU.placement IN ('L', 'R')
)
-- Final groups of nodes found
SELECT
    R.placement,
    R.joiningDate,
    Total = COUNT_BIG(*)
FROM R
GROUP BY
    R.placement,
    R.joiningDate
OPTION (MAXRECURSION 0);

SEDE Demo

Output:

╔═══════════╦═════════════╦═══════╗
║ placement ║ joiningDate ║ Total ║
╠═══════════╬═════════════╬═══════╣
║ L         ║ 2015-02-02  ║     3 ║
║ R         ║ 2015-02-02  ║     1 ║
║ L         ║ 2015-08-21  ║     4 ║
║ L         ║ 2015-12-12  ║     1 ║
╚═══════════╩═════════════╩═══════╝

Related Solutions

Postgresql – Tree structure hierarchical query sorting

Efficiently storing, processing and reporting on hierarchical data of mixed/unknown depth in SQL is quite a complex area: whole books (such as Joe Celko's "Trees and Hierarchies in SQL for Smarties") have been written on the subject.

What you have there is often called a "naive tree" (only the parent relationship is noted) which does not allow for easily performing such a sort, or for queries where you want all data from under a particular node. It is common to de-normalise the structure in these cases by adding a path element to each row, so you can perform the sort you are looking for by sorting by theis field (and you can perform "everything under no matter how deep" queries by using filters such as tree_path LIKE '/path/to/target/node/%' (or tree_path LIKE '/path/to/target/node%' if you want to include the node itself). This avoids any need for recursion or an arbitrarily long collection joins, though of course you have the extra work to do maintaining this path when the tree changes. The book I usually recommend to all DB people (devs and admins alike), "SQL Antipatterns", has a chapter on this which covers the basics of this and other alternatives quite clearly (so you don't need to try consume Calko's tome unless you have some more complex requirements!).

If you can't alter the structure then you might be able to produce the path as an output using a recursive CTE (Common Table Expression) - I'll not go into detail here as there are already many good examples out there (in the documentation and StackExchange questions such as https://stackoverflow.com/questions/3307480/postgresql-recursive-with).

If your tree has a fixed depth (all leaf nodes are the same number of levels below root) then a fixed set of joins will do the trick:

SELECT {stuff}
FROM   source_table t1
JOIN   source_table t2 ON t2.id=t1.parent
JOIN   source_table t3 ON t3.id=t2.parent
JOIN   source_table t4 ON t4.id=t3.parent
ORDER BY t1.name, t2.name, t3.name, t4.name

for three levels below your root(s). If your depth is not fixed then this would not work (well, it could be forced to up to an arbitrary depth but the joins and the ordering clause would be pretty hairy to design and inefficient to run) to you need to use the CTE approach or use a modified data structure.

Sql-server – find total money of parent node of tree according to a particular pattern

Test Data:

memberid    MemberID    joiningposition packagetype
   RPM00000     NULL         Root   free
   RPM71572     RPM00000    Left    Royal
   RPM323768    RPM00000    Right   Royal
   RPM715790    RPM71572    Left    free
   RPM323769    RPM71572    Right   free
   RPM715987    RPM323768   Left    free
   RPM323985    RPM323768   Right   free
   RPM733333    RPM323985   Right   free
   RPM324444    RPM715987   Right   Royal

create a table to store final value

CREATE TABLE [dbo].[Wallatpayout]
(   
    [childid] [varchar](50) NULL,
    [joiningposition] [varchar](50) NULL,
    [DateofJoing] [varchar](50)    NULL,
    [packagetype] [varchar](50) NULL,
    [Total] [int] NULL,
    [FirstPayoutstatus] [varchar](30) NULL,
    [DateofPayout] [datetime]    NULL 
)

create procedure [dbo].[sunnypro] as 
DECLARE @pId varchar(40) = 'RPM00000'; 
Declare @Id int set @Id=(select id from registration where hildid=@pId)      
begin

-- Recursive 
CTE WITH R AS (
    -- Anchor
    SELECT 
        BU.DateofJoing,
        BU.childid,
        BU.joiningposition,
        BU.packagetype
        FROM registration AS BU
        WHERE
        BU.MemberID = @pId and
       BU.joiningposition IN ('Left', 'Right')
        or BU.packagetype in('Royal','Platinum','Majestic')
         and BU.Id>@id
   UNION All
    -- Recursive part
    SELECT      
         BU.DateofJoing,
         BU.childid,
         R.joiningposition,
        BU.packagetype   
    FROM R
    JOIN registration AS BU
        ON BU.MemberID = R.childid
    WHERE
        BU.joiningposition IN ('Left', 'Right') and
      BU.packagetype in('Royal','Platinum','Majestic')
     and BU.Id>@id ) 
INSERT INTO Wallatpayout
           (childid
           ,packagetype
           ,joiningposition
           ,DateofJoing
           ,Total)
-- Final groups of nodes found 
SELECT top 3
R.childid,    R.packagetype,
    R.joiningposition,
    R.DateofJoing,
    Total = COUNT_BIG(*) FROM R where R.packagetype in('Royal','Platinum','Majestic') 
GROUP BY
R.childid,
    R.joiningposition,
    R.DateofJoing,
    R.packagetype
    OPTION (MAXRECURSION 0); 
end

This code is helpful for multi-level marketing, to find all left node and right node by passing particular id or parentid.