If you only have a few variables, I would consider keeping separate `boolean` columns.
- Indexing is easy. In particular, indexes on expressions are easy.
- Conditions for queries and partial indexes are easy to write, easy to read, and meaningful.
- A boolean column occupies 1 byte. For only a few variables this occupies the least space.
- Unlike the other options, boolean columns allow `NULL` values for individual bits if you should need that. You can always define columns `NOT NULL` if you don't.
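A minimal PostgreSQL sketch of this approach (table and column names are made up for illustration):

```sql
-- Hypothetical table with three separate boolean flags
CREATE TABLE user_flags (
    user_id   int PRIMARY KEY,
    is_active boolean NOT NULL DEFAULT true,
    is_admin  boolean NOT NULL DEFAULT false,
    is_banned boolean  -- NULL allowed: "unknown" is representable
);

-- Partial index: small, and the predicate reads naturally
CREATE INDEX user_flags_admin_idx ON user_flags (user_id)
WHERE is_admin;

-- Queries are self-documenting
SELECT user_id FROM user_flags WHERE is_active AND NOT is_banned;
```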
Optimizing storage
If you have more than a handful of variables but fewer than 33, an `integer` column may serve you best. (Or a `bigint` for up to 64 variables.)
- Occupies 4 bytes on disk.
- Very fast indexing for exact matches (`=` operator).
- Handling individual values may be slower / less convenient than with `bit string` or `boolean`.
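A sketch of the integer-as-bitmask approach (the bit-to-flag assignments here are an arbitrary convention for illustration):

```sql
-- One integer column packs up to 32 flags
CREATE TABLE user_flags (
    user_id int PRIMARY KEY,
    flags   int NOT NULL DEFAULT 0  -- bit 0 = active, bit 1 = admin, ...
);

-- Set bit 1 (admin) with bitwise OR:
UPDATE user_flags SET flags = flags | 2 WHERE user_id = 1;

-- An exact match can use a plain index on flags:
SELECT * FROM user_flags WHERE flags = 3;  -- active AND admin, nothing else

-- Testing an individual bit cannot use that plain index:
SELECT * FROM user_flags WHERE flags & 2 = 2;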
With even more variables, or if you want to manipulate the values a lot, or if you don't have huge tables and disk space / RAM is no issue, or if you are not sure what to pick, I would consider `bit(n)` or `bit varying(n)`.
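A minimal `bit(n)` sketch (again with made-up names and an arbitrary bit layout):

```sql
CREATE TABLE user_flags (
    user_id int PRIMARY KEY,
    flags   bit(8) NOT NULL DEFAULT B'00000000'
);

-- Bitwise operators work directly on bit strings
UPDATE user_flags SET flags = flags | B'00000010' WHERE user_id = 1;

-- Test an individual bit by masking
SELECT user_id FROM user_flags
WHERE  flags & B'00000010' <> B'00000000';
```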
Examples
For just 3 bits of information, individual `boolean` columns get by with 3 bytes, an `integer` needs 4 bytes, and a `bit string` 6 bytes (5 + 1).
For 32 bits of information, an `integer` still needs 4 bytes, a `bit string` occupies 9 bytes (5 + 4) for the same, and `boolean` columns occupy 32 bytes.
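You can check individual value sizes with `pg_column_size()`; note that for variable-length types the reported in-memory size can differ slightly from the on-disk figure because of header compression:

```sql
SELECT pg_column_size(true)           AS boolean_bytes,  -- 1
       pg_column_size(1::int)         AS integer_bytes,  -- 4
       pg_column_size(B'101'::bit(3)) AS bit3_bytes;
```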
Further reading
Shrinking and Storage
- Enable "Instant File Initialization"
- Disable autoshrink
- Size your storage properly (15% free per drive)
- RAID what? RAID level matters, and SQL Server accesses the files differently: log file access is generally sequential, data access is generally random.
tempdb
Separate data and log files (see RAID, above) and get a good `tempdb` strategy. We have `tempdb` logs on a log drive and our `tempdb` files (1/4 to 1/2 the number of cores) on their own drive. While you're at it, make sure that `tempdb` has an appropriate number of data files, that their initial sizes are exactly the same, and that they have exactly the same growth factor. Also, read up on Trace Flag 1118.
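A hedged T-SQL sketch of that sizing step — the logical file names, sizes, and paths below are placeholders; check your actual names in `sys.master_files` first:

```sql
-- Resize the existing tempdb data file to the agreed size/growth
ALTER DATABASE tempdb MODIFY FILE
    (NAME = tempdev, SIZE = 4096MB, FILEGROWTH = 1024MB);

-- Add further, identically sized data files on the tempdb drive
ALTER DATABASE tempdb ADD FILE
    (NAME = tempdev2, FILENAME = 'T:\tempdb\tempdev2.ndf',
     SIZE = 4096MB, FILEGROWTH = 1024MB);
```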
Schema Changes
If you're going to be making these changes, make sure you have a good, durable backup. What happens in 9 months when someone actually needed those tables?
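For example (the database name and backup path are placeholders):

```sql
-- COPY_ONLY keeps this out of your regular backup chain;
-- CHECKSUM lets you verify the backup afterwards
BACKUP DATABASE [YourDatabase]
TO DISK = N'E:\Backups\YourDatabase_PreChange.bak'
WITH CHECKSUM, COPY_ONLY, STATS = 10;

-- Confirm the backup is readable before you touch the schema
RESTORE VERIFYONLY
FROM DISK = N'E:\Backups\YourDatabase_PreChange.bak';
```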
Can your application actually withstand the changes from `NVARCHAR`, `NCHAR`, and `NTEXT`? While you're at it, `NTEXT` is deprecated, so moving to `TEXT` would not be good. Check out `VARBINARY` or `VARCHAR(MAX)` instead.
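To find the columns you'd need to migrate, a standard catalog query works:

```sql
-- List every column still using a deprecated LOB type
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM   INFORMATION_SCHEMA.COLUMNS
WHERE  DATA_TYPE IN ('text', 'ntext', 'image')
ORDER  BY TABLE_SCHEMA, TABLE_NAME;
```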
You can change the collation all you want, but it won't affect any of the existing data.
Compatibility
Change the db compatibility levels to 100 if you can.
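The change itself is one statement per database (the database name is a placeholder):

```sql
ALTER DATABASE [YourDatabase] SET COMPATIBILITY_LEVEL = 100;
```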
Tuning Options
Check out some of these trace flags. None of them may be appropriate, or some of them might be: "it depends." Also, read about trace flag 4199.
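If you do decide a flag is appropriate, you can turn it on globally and confirm what's active:

```sql
-- Enable trace flag 4199 globally (-1 = global scope; lost on restart
-- unless also set as a -T startup parameter)
DBCC TRACEON (4199, -1);

-- Show all trace flags currently enabled
DBCC TRACESTATUS (-1);
```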
Tooling
I'd also recommend coming into this install with a good tool set. Create a `DBA` database for this toolkit. I'd start with `sp_WhoIsActive` by Adam Machanic and `sp_Blitz` by Brent Ozar. I'd also look for Kendra Little's work on scheduling and storing the results from `sp_WhoIsActive`. Finally, I'd set up a way of capturing waits and queues so that you have a good history of your server's profile.
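Once installed, usage is simple; the destination table name below is a placeholder (create it first — `sp_WhoIsActive` can generate the schema for you via its `@return_schema` option):

```sql
-- Ad hoc: what is running right now?
EXEC dbo.sp_WhoIsActive;

-- Scheduled capture: log snapshots to a table for history
EXEC dbo.sp_WhoIsActive @destination_table = 'DBA.dbo.WhoIsActiveLog';
```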
Look into getting a monitoring solution.
Finally (and maybe firstly), check out Glenn Berry's excellent series on provisioning a new SQL Server Instance.
Part 1, Part 2, and Part 3
Best Answer
Most client/server DBMSes have parameters similar to those, and the database handle is indeed an unknown concept elsewhere. But they might need other parameters; the common way to handle this is with a connection string (or URL) that encodes all those parameters, or with a generic list of key/value pairs.
And for databases that do not use a client/server architecture, it's even more different. If you do not need much concurrency and want to avoid having to install a DB server, you would be likely to choose SQLite, and then the only connection parameter is the database file name.