admin and active is just a bunch of 0's and 1's so can I do something
like this? Since there will be 99% active and only a couple that will
be inactive.
admin - id, user_id
inactive - id, user_id
social - id, user_id
facebook_link, twitter_link
member_forgot - id, user_id, forgot_code
Actually no, I wouldn't do this and keep it in the same table. The flag should be enough to distinguish between these two. The only difference in your queries is a active = true
. Personally, I wouldn't consider this a bad design choice (altough your idea is not wrong, I would not use it in this case with - I'm guessing - a small set of data). Since only on set of data belongs to a single user, JOIN
s would make not that much sense.
I really want to practice splitting it up in different tables even if
it is not a huge database, or is that a bad idea? So if I delete a
user from the users table, it will delete all of the other records in
the other tables associated with that user_id?
Again, I wouldn't do that personally(!). I don't see a benefit of seperating user centric data from the user. If you fear losing the data on deletion, introduce a deleted
flag. This way, you will never lose data and your application just respects the deleted
flag in your queries (this would also apply if you split the data accross tables).
Credit_Cards = 1110 format (visa, mastercard, discover, amex)
Storing credit cards is always very sensitive, according to PCI (as far as i remember) you are not allowed to store the CVV code (hope someone can correct me on that). But since this seems to be a fictional application, that should not be a problem.
You have pretty much 4 choices:
NoSQL - definition Every record is stored as a set of Key/Value pairs. It is very flexible and fast. Not all of the report writers out there support this style of storage. There are many example database implementations of NoSQL. The one that seems to be most popular right now, is MongoDB.
EAV - definition This is where you turn either the whole table or a portion (in another table) on its side. This is a good choice if you already have a relational database in-house that you can't move away from easily. The custom info table example you gave is a good example of an EAV table.
Standard tables with XML columns - Think of this as NoSQL meets relational tables. The data stored in an XML column can be any format that XML supports, including multiple correlated sub-data. For the columns that you know are going to be "regular" columns, they can be built as the appropriate type of column to store the data (LastName, Address, City, State, etc.).
Standard tables with lots of extra columns - You have a relational database, you can't use either XML or EAV, and NoSQL is not an option. Add lots of extra columns of each type. I would guess 30 or more varchar, 30 or more integer, 15 or more numerics. And once you use a column for a value, don't re-use it. And don't delete the column either.
Out of all of these solutions, my own opinion is that you will find either the NoSQL or the EAV approach to be the most successful with the least amount of refactoring your code and your schema.
You will have a situation where you collect data one year, not the next, and then collect it again afterward. Trying to get the older data updated with the correct information is problematic and expensive. Storage is neither.
Best Answer
Size is one consideration. An
int
can hold up to -2,147,483,648 in four bytes. Achar
will need 11 bytes to hold the same value.There are built-in functions to manipulate the various data types.
DATEADD()
andDATEDIFF()
are two examples. This will not be possible with date-stored-as-text. ConstantlyCAST
ing back and forth will not make for efficient processing, or legible code.Automatic validation is another foregone benefit with the all-text approach. You may think that a column contains dates but there will be nothing to stop someone entering the value '2014-13-97'.
Sorting is unlikely to give the intended result with columns which are "really" numbers. For example, if a column contained integers '1' through '100', and the query sorted by this column, one would expect the result to be
However, the actual result is more likely to be
There may be similar concerns for dates, depending on the chosen character representation. Again type casting can cure this with the costs mentioned previously.
Occasionally you will find strings which contain digits only. Examples are national identity numbers, bank account numbers, phone numbers and such like. The common characteristic of such values is that it does not make sense to perform calculations on them. Other things being equal it may be OK to store these as
char()
columns, especially if they have optional embedded alpha characters, but the above considerations still apply.