How to query for leaf nodes in a hierarchy using a closure table

selectsqlite

I have some hierarchical data (representing musical instruments and playing techniques), where each leaf in the hierarchy is an audio file.

The nesting depth of leaf nodes will vary because some playing techniques will have sub-articulations.

Example paths to leaf nodes could be

/philharmonia/trumpet/mute/muted_trumpet1.wav
/philharmonia/trumpet/legato/attack/legato_trumpet1.wav
/ircam/violin/pizz/violin_pizz1.wav

I have the data stored using a master table and closure table:

sqlite> PRAGMA table_info(as_node);  
0|id|INTEGER|0||1  
1|name|TEXT(1024)|1||0  
2|file|INTEGER|0||0  
3|parent|INTEGER|0||0  

sqlite> PRAGMA table_info(as_node_closure);  
0|id|INTEGER|0||1  
1|ancestor|INTEGER|0||0  
2|descendant|INTEGER|0||0  
3|depth|INTEGER|0||0  

How can I query the database to select all leaf nodes where an ancestor node matches a given string at given nesting level?

For example in plain English: "give me all leaf nodes where ancestor node is 'trumpet' at a nesting level of 2"

With the above data, this should return muted_trumpet1.wav and legato_trumpet1.wav.

UPDATE 1

Some example data:

sqlite> select * from as_node
1|BrassP||
2|4Horns||1
3|FT||2
4|DYN_02||3
5|RR_01||4
6|Mic_CL||5
7|4H_CL_DYN2_FT_01.wav|1|6
8|Mic_FM||5
9|4H_FM_DYN2_FT_01.wav|2|8
10|Mic_RM||5
11|4H_RM_DYN2_FT_01.wav|3|10
12|Mic_SUR||5
13|4H_SURR_DYN2_FT_01.wav|4|12
14|DYN_03||3
15|RR_01||14
16|Mic_CL||15
17|4H_CL_DYN3_FT_01.wav|5|16
...

sqlite> select * from as_node_closure limit 10;
1|1|1|0
2|1|2|1
3|2|2|0
4|1|3|2
5|2|3|1
6|3|3|0
7|1|4|3
8|2|4|2
9|3|4|1
10|4|4|0

UPDATE 2:

This gives me pretty much what I want, but I'm sure it it's not the right way:


SELECT name from as_node WHERE id IN (SELECT descendant FROM as_node_closure WHERE ancestor IN (SELECT id FROM as_node WHERE name = "trumpet") AND descendant IN (SELECT id FROM as_node WHERE name LIKE "%.wav"));

This uses the inner SELECT to get all of the leaf nodes and return only descendants in that list.

Best Answer

I think this is what you want (edited):

-- ml : short name for middle-to-leaf connection
-- rm : short name for root-to-middle connection

SELECT                                     -- show me 
    leaf.*                                 -- all
FROM 
    as_node AS leaf                        -- leaf nodes
  JOIN as_node_closure AS ml               -- which have an ancestor
    ON  ml.descendant = leaf.id            -- 
  JOIN as_node AS middle                   -- (let's call it middle)
    ON  ml.ancestor = middle.id            -- 
    AND middle.name = 'trumpet'            -- with the name 'trumpet'
  JOIN as_node_closure AS rm               -- and this middle
    ON  rm.descendant = ml.ancestor        -- has another ancestor
  JOIN as_node AS root                     -- (which we call root)
    ON  rm.ancestor = root.id              -- which is indeed a root node
    AND root.parent IS NULL                -- as it has no parent
WHERE                                      -- and the distance from root to middle
      rm.depth = 2                         -- is 2 (so middle's absolute level is 2)
  AND leaf.name LIKE '%.wav'
  AND NOT EXISTS                           -- and also the leaf is indeed a leaf node
      ( SELECT *                           -- as it has no children
        FROM as_node AS extra
        WHERE extra.parent = leaf.id
      ) ;

Test at SQL-Fiddle (where 'FT' is used instead of 'trumpet', to match your data.)