I imported a subset of OpenStreetMap data into a Postgres database using the standard tool (osm2pgsql). I then tried to rebuild the indexes on the table using plain SQL (dropping them, then recreating them), and they end up taking a massive amount of space.
The table size is 9 GB, with an int id, 70 text columns for attributes (for example, the column access_type has 250 different text values and is NULL in 90% of cases), and a possibly large geometry column. If I try:
CREATE INDEX planet_osm_polygon_accesstype_index
ON planet_osm_polygon
USING BTREE (access_type);
I end up with an index that's 1 GB in size for 140k rows, which is massive considering how little data there is. This doesn't seem to happen on other systems, and I tried ANALYZE, then VACUUM.
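As a point of comparison, the on-disk sizes of the index and the table can be checked directly in Postgres with the standard size functions (the index name matches the CREATE INDEX statement above):

```sql
-- Report the on-disk size of the index and of its table.
SELECT pg_size_pretty(pg_relation_size('planet_osm_polygon_accesstype_index')) AS index_size,
       pg_size_pretty(pg_relation_size('planet_osm_polygon'))                  AS table_size;
```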
I don't know much about Postgres, but any hints about how to reduce the index size would be very appreciated.
(OS is Ubuntu 12.04, PostgreSQL is version 9.1)
Best Answer
You seem to expect that rows with NULL values are excluded from a B-tree index automatically, but that's not the case. Those are indexed as well and can be searched for. However, since the column is NULL in 90% of cases, that's hardly useful in your situation. Such common values hardly ever make sense in an index to begin with, be it NULL or any other value. Exclude NULL values from the index with a partial index. It should be much smaller and thus faster.
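A partial index for this case might look as follows (reusing the table and column names from the question; the WHERE clause keeps the 90% of rows that are NULL out of the index entirely):

```sql
-- Replace the full index with a partial one that skips NULL rows.
DROP INDEX IF EXISTS planet_osm_polygon_accesstype_index;

CREATE INDEX planet_osm_polygon_accesstype_index
ON planet_osm_polygon
USING BTREE (access_type)
WHERE access_type IS NOT NULL;
```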
Remember that you may have to include the same WHERE condition in your queries to make Postgres realize it can apply the partial index.
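For example, a query whose predicate implies the index condition can use the partial index. A sketch, assuming the default osm2pgsql schema where the geometry column is named way (adjust to your own columns):

```sql
-- The equality predicate implies access_type IS NOT NULL,
-- so the planner can choose the partial index.
SELECT way, access_type
FROM   planet_osm_polygon
WHERE  access_type = 'private';
```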