Postgresql – How to sort the results of a recursive query in an expanded tree-like fashion

cteorder-bypostgresqlrecursive

Let's assume you have a nodes tables like this:

CREATE TABLE nodes
(
    node serial PRIMARY KEY,
    parent integer NULL REFERENCES nodes(node),
    ts timestamp NOT NULL DEFAULT now()
);

It represents a standard node-like tree structure with root nodes at the top and several child nodes dangling from root nodes or other child nodes.

Let us insert a couple of example values:

INSERT INTO nodes (parent)
VALUES (NULL), (NULL), (NULL), (NULL), (1), (1), (1), (1), (6), (1)
     , (6), (9), (6), (6), (3), (3), (3), (15);

Now I want to retrieve the first 10 root nodes and all their children up to a depth of 4:

WITH RECURSIVE node_rec AS
(
    (SELECT 1 AS depth, * FROM nodes WHERE parent IS NULL LIMIT 10)

    UNION ALL

    SELECT depth + 1, n.*
    FROM nodes AS n JOIN node_rec ON (n.parent = node_rec.node)
    WHERE depth < 4
)
SELECT * FROM node_rec;

This works great and gives me the following result:

 depth | node | parent 
-------+------+--------
     1 |  1   |
     1 |  2   |
     1 |  3   |
     1 |  4   |
     2 |  5   |  1
     2 |  6   |  1
     2 |  7   |  1
     2 |  8   |  1
     2 | 10   |  1
     2 | 15   |  3
     2 | 16   |  3
     2 | 17   |  3
     3 |  9   |  6
     3 | 11   |  6
     3 | 13   |  6
     3 | 14   |  6
     3 | 18   | 15
     4 | 12   |  9

As you might have noticed, there's no ORDER BY clause, so the order is not defined. The order you see here is from root nodes to deeper nodes.

How would I order the results as they would appear in an expanded tree view, as you can see from the example picture below?

Expanded tree view of nodes

I basically want the child nodes to be placed right after their corresponding parent node. If two or more child nodes have the same parent node, I want them to be sorted by their timestamp. Based on the example above, here's the desired output order that I'm trying to achieve:

 depth | node | parent | ts
-------+------+--------+---------
     1 |  1   |        | 2014-01-01 00:00:00
     2 |  5   |     1  | 2014-01-01 00:10:00
     2 |  6   |     1  | 2014-01-01 00:20:00
     3 |  9   |     6  | 2014-01-01 00:25:00
     4 |  12  |     9  | 2014-01-01 00:27:00
     3 |  11  |     6  | 2014-01-01 00:26:00
     3 |  13  |     6  | 2014-01-01 00:30:00
     3 |  14  |     6  | 2014-01-01 00:36:00
     2 |  7   |     1  | 2014-01-01 00:21:00
     2 |  8   |     1  | 2014-01-01 00:22:00
     2 |  10  |     1  | 2014-01-01 00:23:00
     1 |  2   |        | 2014-01-01 00:08:00
     1 |  3   |        | 2014-01-01 00:09:00
     2 |  15  |     3  | 2014-01-01 10:00:00
     3 |  18  |     15 | 2014-01-01 11:05:00
     2 |  16  |     3  | 2014-01-01 11:00:00
     2 |  17  |     3  | 2014-01-01 12:00:00
     1 |  4   |        | 2014-01-01 00:10:00

Best Answer

An array representing the path from the root up to the leaf should achieve the desired sort order:

WITH RECURSIVE node_rec AS (
   (SELECT 1 AS depth, ARRAY[node] AS path, *
    FROM   nodes
    WHERE  parent IS NULL
    LIMIT  10
   )    
    UNION ALL
    SELECT r.depth + 1, r.path || n.node, n.*
    FROM   node_rec r 
    JOIN   nodes    n ON n.parent = r.node
    WHERE  r.depth < 4
)
SELECT *
FROM   node_rec
ORDER  BY path;

If two or more child nodes have the same parent node, I want them to be sorted by their timestamp.

Shift the path by one towards the root and order by that column additionally:

WITH RECURSIVE node_rec AS (
   (SELECT 1 AS depth, ARRAY[node] AS path, *
    FROM   nodes
    WHERE  parent IS NULL
    LIMIT  10
   )    
    UNION ALL
    SELECT r.depth + 1, r.path || n.parent, n.*
    FROM   node_rec r 
    JOIN   nodes    n ON n.parent = r.node
    WHERE  r.depth < 4
)
SELECT *
FROM   node_rec
ORDER  BY path, ts;