I recently ran into a problem where a table had a column of type TEXT inadvertently converted to type TINYTEXT (MySQL version 5.6.24). We were able to restore any truncated data, but the confusing thing was was that we only noticed the problem when certain values for the column started appearing to be truncated at seemingly random positions in the text string.
After trying to wrap my head around what happened I finally realized that each value originally >= 256 chars was truncated to a length equal to the difference between its original length and the nearest lower multiple of 256. Eg:
Original Length Truncated Length Difference
257 1 256
350 94 256
597 85 512
I then confirmed this is exactly what happens when you convert a TEXT column to TINYTEXT. It's as if it iteratively removes chunks of 256 bytes until the total length is < 256. But why wouldn't it simply just truncate the value to 255 bytes to begin with? Seems like a more efficient operation, but I have no idea about how the internals work.
I couldn't find an explanation on the MySQL documentation. I mean, either way if you have values that exceed the max length you're going to lose data; obviously one wouldn't do this on purpose, but figuring out what happened would have been a bit less confusing if all of the truncated values had length 255.
EDIT: I confirmed this behavior for both the MyISAM and InnoDB storage engines.
EDIT AGAIN: Code to reproduce scenario:
create temporary table tmp (
text_col text
)
collate = 'utf8_general_ci'
engine = InnoDB;
create temporary table results (
current_value text,
length_text_col int(11),
char_length_text_col int(11),
text_col_type char(8)
)
collate = 'utf8_general_ci'
engine = InnoDB;
insert into tmp
(text_col)
values
(repeat('a', 257)), -- 1 byte/character
(repeat('ア', 86)); -- 3 bytes/character
insert into results
select
text_col as current_value,
length(text_col) as length_text_col,
char_length(text_col) as char_length_text_col,
'TEXT' as text_col_type
from
tmp;
alter table tmp
change column text_col text_col tinytext;
insert into results
select
text_col as current_value,
length(text_col) as length_text_col,
char_length(text_col) as char_length_text_col,
'TINYTEXT' as text_col_type
from
tmp;
select * from results;
The above code will select:
current_value length_text_col char_length_text_col text_col_type
aaaaaaaaaa... 257 257 TEXT
アアアアア.... 258 86 TEXT
a 1 1 TINYTEXT
2 2 TINYTEXT
Best Answer
There is essentially no good reason to ever use
TINYTEXT
. And there are some obscure drawback.If you are using utf8 or utf8mb4, then you may have "multi-byte" characters.
TINYTEXT
is limited to 255 bytes.VARCHAR(255)
, which is similar in many ways, is limited to 255 characters.If your text is any any Asian language, then only about 85 characters can be stored in
TINYTEXT
. ForVARCHAR(255)
, a full 255 characters can be stored.More
I agree that the
ALTER
is messing thing up in a strange way. Here's a shorter example:The
HEX
comes back with the first two bytes of Katakana 'A', which isE382A2
, so I will guess that theALTER
did some kind of modulo -- picking (258 % 256 == 2) bytes to keep.Fails for me on 5.5.43, 5.6.22 and 8.0.2. On the other hand, MariaDB 10.0.28 and 10.3.7 complain about the
ALTER
:File a bug at bugs.mysql.com.