MySQL Clustered Index or Non Clustered index? (Database Design)

clustered-indexMySQLnonclustered-indexperformanceperformance-tuning

I know this might be a basic thing for you guys in this corner of SE but I'm having some trouble figuring out the best option for designing my DB.

I am creating a custom app where the user will be entering data into a table(lets call this table "manifest") at least 1 entry every minute, each entry will have data grabbed from the input field and when adding the new row, the current date and time is also being added to a datetime column.

This table will later be browsed by the app users on the front end at least 4 times a week when they come to do some exporting of their previously entered data.

I expect them to a lot of ordering by date, so I was planning to do an index on the date column for the first 8 characters of the MYSQL DATE = 00/00/00 (id leave out the time portion).


From what I know about indexes is that they slow down inserting and updating of data because the index needs to be rebuilt every time new data is entered.

This sounds like it would be inefficient to create an index(other than the primary key[id]) if the users will be inputting data at least once a minute for maybe 2 hours straight(per day) and the index would have to be rebuilt every minute(?) new data comes in.


I know this table would just keep growing because they need to keep track of the records for years on end, that's why I'd want to put an index on the date column to make it future proof in terms of speed. But the fact that they'd be inputting data into it so frequently and indexes would have to be rebuilt every time has me a little lost on how to proceed.

I was thinking a non-clustered index would solve the issue? I've watched several videos explaining the two and it seems that the rebuilding would still need to take place for non-clustered indexes?

Any help and performance tips would be appreciated

Best Answer

About PRIMARY KEYs:

  • Each table must have a PRIMARY KEY.
  • The PK (in MySQL's InnoDB engine) is always "clustered"
  • Indexes other than the PK are non-clustered.
  • The PK is "unique". It uniquely identifies each row.

In your question:

  • One INSERT per minute is very slow. (So, no performance concerns here, regardless of the indexes on the table, regardless of anything else.)
  • "Exporting" rows -- Do you mean "read" (SELECT) or "archive" (SELECT, then DELETE)?? (More details, please.)
  • "first 8 characters of the MYSQL DATE = 00/00/00" -- Are you not storing the DATE part? If not, then make it a TIME column. If you want to "extract" just the time (from a DATETIME, see the TIME() function.
  • "indexes ... slow down inserting and updating of data because the index needs to be rebuilt every time new data is entered." NO, NO, NO. Indexes are not rebuilt except on explicit requests -- which are rarely warranted.
  • Having an index does slow down writes slightly, but the index is likely to speed up reads significantly. So, don't fear having indexes.
  • After 10 years, you will have about 5 million rows. As tables go, this is only medium-sized. Not a problem. However, for accessing individual rows or groups of rows, indexes (whether PK or secondary) are strongly advised.
  • "I'd want to put an index on the date column to make it future proof in terms of speed." -- That does not "future-proof" it. It will help some SELECTs, but not others. Show us the selects so we can recommend a set of indexes that would be optimal for those selects.
  • Later, if you add/drop/change an index (primary or secondary), there will be some amount of effort, possibly involving copying the table over. It is premature to worry about that.