Are many two column indexes a bad smell

h2index-tuning

I have table that contains failures. (15000 rows + on a abstract embedded system)

A failure has the following fields:

  • id PK BIGINT
  • errorNumber INT
  • source VARCHAR
  • raisedAt BIGINT
  • clearedAt BIGINT
  • cleared (computed column for sorting purposes = clearedat>0) BOOL

My first use case Is that I want to display all failures, the uncleared ones first.

Select * from FAILURES ORDER BY CLEARED asc, raisedAt desc;

Second use case is by a synchronization job. I want to delete all synchronized and cleared failures, there for all failures with an id smaller X and cleared

DELETE FROM FAILURES WHERE ID <? AND CLEARED;

Third use case is that I want to pick a specific uncleared failure.

SELECT * FROM FAILURES WHERE SOURCE=? AND ERRORNUMBER=? AND NOT CLEARED;

This would lead me to the following indexes:

CREATE INDEX CLEARED_RAISEDAT_SORTED_FAILURES ON FAILURE_MANAGER.FAILURES (cleared asc,raisedat desc);
CREATE INDEX CLEARED_ID_FAILURES ON FAILURE_MANAGER.FAILURES (cleared asc,id desc);
CREATE INDEX CLEARED_SOURCE_SORTED_FAILURES ON FAILURE_MANAGER.FAILURES (cleared asc,SOURCE desc,ERRORNUMBER desc);

I sometimes still struggle with index definition. It seems odd to me that I have 3 different indexes, which all only differ in the second column. Relative to my described uses cases, are these indexes feasible or does this create unnecessary overhead? Are there any check up questions that I can ask my self to figure if this might be the correct setting?

Best Answer

I hope for you that most failures are cleared. In that case, when you query for cleared failures, the index will hurt performance. If you'd put that column in the second place, on the other hand, it might be benificial. When you're looking for not cleared - if those are rare - the index will help.