This is most likely caused by the MERGE operation in SQL Server being single threaded. No matter how many resources you throw at it, it wont run faster which is exactly what you are observing.
The solution is to manually parallelise the query by running multiple copies of the MERGE statement at the same time, each operating on their own subset of the data.
The problem is very common in data warehouses running SQL Server. There are a series of good design pattern in this document: http://technet.microsoft.com/en-us/library/dd425070(v=sql.100).aspx (DISCLAIMER: This is written by yours truly, so I am shamelessly plugging this)
Size is one consideration. An int
can hold up to -2,147,483,648 in four bytes. A char
will need 11 bytes to hold the same value.
There are built-in functions to manipulate the various data types. DATEADD()
and DATEDIFF()
are two examples. This will not be possible with date-stored-as-text. Constantly CAST
ing back and forth will not make for efficient processing, or legible code.
Automatic validation is another foregone benefit with the all-text approach. You may think that a column contains dates but there will be nothing to stop someone entering the value '2014-13-97'.
Sorting is unlikely to give the intended result with columns which are "really" numbers. For example, if a column contained integers '1' through '100', and the query sorted by this column, one would expect the result to be
1, 2 .. 10, 11 .. 20, 21 .. 100
However, the actual result is more likely to be
1, 10, 100, 11, 12, 2, 20, 21 ...
There may be similar concerns for dates, depending on the chosen character representation. Again type casting can cure this with the costs mentioned previously.
Occasionally you will find strings which contain digits only. Examples are national identity numbers, bank account numbers, phone numbers and such like. The common characteristic of such values is that it does not make sense to perform calculations on them. Other things being equal it may be OK to store these as char()
columns, especially if they have optional embedded alpha characters, but the above considerations still apply.
Best Answer
If you have a system with a large number of tags it's probably more efficient management-wise to use a relational system between products and product-tags.
i.e.:
Insert some data into the various tables:
Query the tables:
This allows you to extend the tags at will without needing to redesign the database itself.
Using discrete bit columns in the products table is a strategy that will soon get very tedious. Take for example:
Now, if you want to add products that might be red, you must alter the table via:
You must also modify any code that touches the
dbo.Products
table, to make that code aware of the newIsRed
column.