Postgresql – Selecting parent and child data

postgresql

I have an application front end which speaks to a PostgreSQL database. I'm trying to find the most efficient way to extract a structure similar to the below.

CREATE TABLE people (
  person_id SERIAL PRIMARY KEY,
  fname     VARCHAR
);

CREATE TABLE items (
  person_id INTEGER REFERENCES people (person_id),
  item_id   SERIAL PRIMARY KEY,
  title     VARCHAR
);

INSERT INTO people (fname) VALUES
  ('Bob'),
  ('Jim'),
  ('Geoff');

INSERT INTO items (person_id, title) VALUES
  (1, 'Cat'),
  (1, 'Dog'),
  (1, 'Monkey'),
  (2, 'Elephant');

My current approach would be to list all of the items, then (within the application) iterate over each and run a SELECT. Something similar to this Pseudo code:

items = db.Query(SELECT * FROM people);
for item in items
   itemsub = db.Query(SELECT * FROM items WHERE id = item.person_id)

There are a number of reasons why I don't like this.

  • It requires a minimum of 1 + N(People) queries to extract a simple data structure. For a page containing 1000 items, this would produce a ton of network traffic.
  • The application has to manually iterate over the list atleast twice more after the database, one to build the better structure and a second to render it.

  • It seems like something a database should be able to do

One other alternative would be to perform a JOIN, but this would result in extra processing to remove duplicates in the application

The resulting structure should be relatively simple:

peopleItems{ 
   array people{ 
             array items{}
         }
}

However, I can't for the life of me work out the best approach.

Best Answer

This is just an idea.

test=# CREATE INDEX items_idx ON items (person_id);
CREATE INDEX

test=# SELECT person_id, fname, 
    (SELECT array_to_string(ARRAY(SELECT title FROM items WHERE items.person_id = people.person_id ), ',')) AS titles
 FROM people;
 person_id | fname |     titles     
-----------+-------+----------------
         1 | Bob   | Cat,Dog,Monkey
         2 | Jim   | Elephant
         3 | Geoff | 
(3 rows)

I don't know what application language you use (PHP, Ruby, or others), but all languages can easily parse this result.

I made dummy data and done EXPLAIN. The result is shown below:

test=# EXPLAIN 
SELECT person_id,
fname, 
(SELECT array_to_string(ARRAY(SELECT title FROM items WHERE items.person_id = people.person_id ), ',')) AS titles
 FROM people;
                                       QUERY PLAN                                     

--------------------------------------------------------------------------------------
---
                                        QUERY PLAN                                    

--------------------------------------------------------------------------------------
-----
 Seq Scan on people  (cost=0.00..984256.31 rows=9990 width=8)
   SubPlan 2
     ->  Result  (cost=98.50..98.51 rows=1 width=0)
           InitPlan 1 (returns $1)
             ->  Bitmap Heap Scan on items  (cost=19.81..98.50 rows=455 width=16)
                   Recheck Cond: (person_id = people.person_id)
                   ->  Bitmap Index Scan on items_idx  (cost=0.00..19.70 rows=455 widt
h=0)
                         Index Cond: (person_id = people.person_id)
(8 rows)

This query uses Bitmap Index scan, so it can be run very fast.

Please don't forget to create index of person_id in the items table.