PostgreSQL Index Type for Not Equal (<>) WHERE Condition

indexperformancepostgresqlpostgresql-performancewhere

I need to query two huge tables (more than milion records). Besides other conditions, there is also a not equal (<>) where condition on a columns pair of varchar type. It has to exclude one in a million.

Values in the column are unique, same in both tables, so I have to exclude from joining rows from two tables which have identical values in that particular column.

Is a standard b-tree type index suitable, or would some other index type be more suitable for a not equal (<>) where condition?

What about pg-strom extension? Could pg-strom (utilizing GPU) enhance speed of a not equal comparison in the where clause?

Best Answer

What about pg-strom extension? Could pg-strom (utilizing GPU) enhance speed of a not equal comparison in the where clause?

There is no point in using the GPU unless the operations can make use of highly parallel processing. Simple <> is likely not in the use case. That said, Pg 9.6 should make use of simple-parallel index scans.

I need to query two huge tables (more than milion records). Besides other conditions, there is also a not equal (<>) where condition on a columns pair of varchar type. It has to exclude one in a million.

Indexes that exclude one in a million are likely not going to be planned to use an index. We would need a test case here or for you to show the actual query.

Just think of it like this.

  1. Scenario one, you visit a million rows on disk. You need a 999,999 rows.
  2. Scenario two, you visit a million rows in an index -- this isn't free. You need 999,999. After you determine which one row you do not want, you still have to visit the 999,999 rows on disk.

Which would you rather do? If you have to visit (almost) all the rows anyway the index doesn't save you any work.