Postgresql – get the union of the a Postgres fulltext index

full-text-searchpostgresql

I have a fulltext index on a table. Is it possible to retrieve the set of terms used in the index (gist or gin)? With weights if possible?

To clarify:

If I have the following table:

create table "test" (id integer, thing tsvector);

I then make a GIST index on it:

create index thing_index on test using gist (thing);

Then some data:

insert into test (id, thing)
values (1, 'one'),(2, 'two'), (3, 'three'), (4, 'one'), (5, 'two');

The index thing_index is going contain the following mapping:

'one' => {1, 4}
'two' => {2, 5}
'three' => {3}

I want to obtain the following response from the index:

'one',
'two',
'three'

Maybe even with rankings:

'one' => 2
'two' => 2
'three' => 1

I know I can do this myself by scanning and building my own index, but I want to get it out of Postgres if possible.

Best Answer

If I understand your question correctly and it isnt at all clear, you are trying to pull back information as far as what id's are connected with a value. I don't think you can just pull it from the index in PostgreSQL because the index will not contain visibility info, and so you will have a LOT of random IO and waiting for platters to turn.

The query for your test case is:

select thing, array_agg(id) from test group by thing;

Assuming you are on a version high enough to have array_agg.

In my system (9.1) this gives me:

chris=> select thing, array_agg(id) from test group by thing;
   thing  | array_agg 
 ---------+-----------
  'one'   | {1,4}
  'two'   | {2,5}
  'three' | {3}
 (3 rows)

that's what you are looking for, right?