divided linked to some info that explains the basic issue (there's performance differences), but it's not simple enough to say that one's always better than the other. (otherwise, there'd be no reason to have both.) Also, in MyISM, the 64k max size for VARCHAR isn't per field -- it's per record.
Basically, there's 4 ways to store strings in database records:
- fixed length
- C-style strings (marked with a NULL or similar character at the end of the string)
- Pascal style strings (a few bytes to indicate length, then the string)
- Pointers (store the string somewhere else)
MyISM uses something similar to #3 for VARCHAR, and a hybrid approach for TEXT where it stores the beginning of the string in the record, then rest of the string somewhere else.
InnoDB is similar for VARCHAR, but stores the complete TEXT field outside of the record.
With 1&4, the stuff in the record is always the same length, so it's easier to skip over if you don't need the string, but need stuff after it. Both #2 and #3 aren't too bad for short strings ... #2 has to keep looking for the marker, while #3 can skip ahead ... as the strings get longer, #2 gets worse for this particular use case.
If you actually need to read the string, #4 is slower, as you have to read the record, then read the string which might be stored elsewhere on the disk, depending on just how that database handles it. #1 is always pretty straightforward, and again you run into similar issues where for #2 gets worse the longer the string is, while #3 is a little worse than #2 for very small strings, but better as it gets longer.
Then there's storage requirements ... #1 is always a fixed length, so it might have bloat if most strings aren't the max length. #2 has 1 extra byte; #3 typically has 2 extra bytes if max length = 255, 4 extra bytes if a 64k max. #4 has the pointer length, plus the rules for #3 typically.
For the specific implementations within MySQL 5.1, the docs for MyISM state:
- Support for a true VARCHAR type; a VARCHAR column starts with a length stored in one or two bytes.
- Tables with VARCHAR columns may have fixed or dynamic row length.
- The sum of the lengths of the VARCHAR and CHAR columns in a table may be up to 64KB.
While for InnoDB :
- The variable-length part of the record header contains a bit vector for indicating NULL columns. If the number of columns in the index that can be NULL is N, the bit vector occupies CEILING(N/8) bytes. (For example, if there are anywhere from 9 to 15 columns that can be NULL, the bit vector uses two bytes.) Columns that are NULL do not occupy space other than the bit in this vector. The variable-length part of the header also contains the lengths of variable-length columns. Each length takes one or two bytes, depending on the maximum length of the column. If all columns in the index are NOT NULL and have a fixed length, the record header has no variable-length part.
- For each non-NULL variable-length field, the record header contains the length of the column in one or two bytes. Two bytes will only be needed if part of the column is stored externally in overflow pages or the maximum length exceeds 255 bytes and the actual length exceeds 127 bytes. For an externally stored column, the two-byte length indicates the length of the internally stored part plus the 20-byte pointer to the externally stored part. The internal part is 768 bytes, so the length is 768+20. The 20-byte pointer stores the true length of the column.
...
as with so many other things when dealing with databases, if you're not sure what's best for your needs, try benchmarking it with similar data & usage, and see how they behave.
I don't know what the best way necessarily is to store it -- but there's at least a better option than using a varchar(39)
(or varchar(40)
if you needed it signed) ; instead use a decimal(39,0)
. From the mysql docs:
Fixed-Point (Exact-Value) Types
The DECIMAL and NUMERIC types store
exact numeric data values. These types
are used when it is important to
preserve exact precision, for example
with monetary data. In MySQL, NUMERIC
is implemented as DECIMAL, so the
following remarks about DECIMAL apply
equally to NUMERIC.
MySQL 5.1 stores DECIMAL values in
binary format. Before MySQL 5.0.3,
they were stored as strings. See
Section 11.18, “Precision Math”.
In a DECIMAL column declaration, the
precision and scale can be (and
usually is) specified; for example:
salary DECIMAL(5,2)
In this example, 5 is the precision
and 2 is the scale. The precision
represents the number of significant
digits that are stored for values, and
the scale represents the number of
digits that can be stored following
the decimal point.
Standard SQL requires that
DECIMAL(5,2) be able to store any
value with five digits and two
decimals, so values that can be stored
in the salary column range from
-999.99 to 999.99.
In standard SQL, the syntax DECIMAL(M)
is equivalent to DECIMAL(M,0).
Similarly, the syntax DECIMAL is
equivalent to DECIMAL(M,0), where the
implementation is permitted to decide
the value of M. MySQL supports both of
these variant forms of DECIMAL syntax.
The default value of M is 10.
If the scale is 0, DECIMAL values
contain no decimal point or fractional
part.
The maximum number of digits for
DECIMAL is 65, but the actual range
for a given DECIMAL column can be
constrained by the precision or scale
for a given column. When such a column
is assigned a value with more digits
following the decimal point than are
permitted by the specified scale, the
value is converted to that scale. (The
precise behavior is operating
system-specific, but generally the
effect is truncation to the
permissible number of digits.)
It's stored packed, so it'll take up less space than the varchar (18 bytes, if I'm doing my math right), and I'd hope you'd be able to do math on it directly, but I've never tried with that large of a number to see what happens.
Best Answer
This is all IMHO:
•table entry created and updated: date or timestamp? Why? With date isnt it easier to display and sort on the frontend?
Use a timestamp - better for multiple mods in a given day.
•Entry title - VARCHAR or text?
You're not going to lose much with a text over VARCHAR, but I think that VARCHAR is the reasonable choice.
•Entry body - I assume text.
BLOB.
•5 Star rating - INT or Decimal? Why since I have read about both?
How flexible do you want to be - I'd tend to Decimal.
•Phone & Zip - INT or VARCHAR?
VARCHAR - intl codes, extensions... more flexible.
•Event Date - Again, Date or Timestamp
Again you're not losing much adding the time - never throw away information that you may use later.