Mysql – Does varchar(n) affect the number of records which can be stored per page even if you store less than n

innodbMySQLperformanceperformance-tuning

I'm struggling to find enough documentation to help me answer my own question.

The common story seems to be that it doesn't make a difference to actual database size when you use a much higher value for n than will ever be used.

As an extreme example:
VARCHAR(5000) but only storing on average a small number of characters such as 4 characters.

That seems to be correct from a few tests I've run.

I've heard people saying that MySQL will take the higher value into account when managing pages meaning a potential for less records per page and that this can affect performance when running queries.

I can't see anything in the documentation to suggest this is actually the case so really looking for some clarity from a more experienced DBA.

So thanks in advance for any help. Please let me know if my question needs to be clearer. My question is, does using a higher value for varchar(n) than will be stored affect the number of records which can be stored per page even if you store less than n?

I'm pretty new to this so if possible I'd also really appreciate a steer on how I can determine how many records are included in pages and then run some benchmarks.

Best Answer

I can only speculate what the people meant. Maybe it's related to how MySQL computes index statistics depending on n. The only case that comes to my mind is whether n is higher than 127 or not.

As you may know InnoDB stores a string with its length. The length is stored in so called offset fields. The offset can be either one or two bytes. In REDUNDANT format if the maximum length is less than 128 bytes the offset field will be one byte. Otherwise - two bytes. In COMPACT and DYNAMIC format the actual field length is taken into account. So, if your field is VARCHAR(5000), but you store a short string like abc the offset field will be one byte.

@jkavalik made a good point about MEMORY tables. VARCHAR(5000) is effectively TEXT i.e. it would need off-index pages to store long values. If a SELECT query ever needs a temporary table it will be of MEMORY type. Because it doesn't support variable length types it's going to be CHAR(5000). So even if you store only short values the temporary table will be stored on disk. It will be a big performance hit.