I know this has been asked in the past, but sometimes things change over time and I wanted to double check.
I have a table with about 9 Billion rows. Should I add the index before inserting the data or after. I'm using Aurora. Does it matter if I'm adding more than one index?
Everything, I'm aware of says you should do this after the insert, but one of my colleagues is insisting it's faster to do it on the insert.
Best Answer
Have the primary key in place, but not the secondary keys. When finished loading, add all the secondary keys in a single
ALTER
statement.Caveats:
ALTER
works. Doing multiple alters at once used to be optimal. But with online/instant/etc ALTERs, it is hard to say. Still, the currentALTER .. ADD INDEX..
, even though it comes back instantly, has a lot of work to do afterwards.PRIMARY KEY
in not in place to start with, it will have to reorganize the data. However, this aspect may depend on whether the source data is in PK-order.