I use C# Entity Framework Code First.
I know almost nothing about databases, and I don't know whether what I have in mind is common or good practice, so I wanted to ask you guys how I should do it.
Say I have a table of letters and a table of stations.
I am developing an application that sends letters from one station to other stations. Each letter can be sent to several different stations, so it's a many-to-many relationship.
Each letter that is sent is also associated with a source station (a letter cannot be sent to its own source station, etc.).
Each database instance represents a station.
On application startup I know which station is my source station and want to save that information in the database.
How should I save which station is the source station? Should I have a flag in the stations table, set on the one row that is the current instance's station? This sounds bad to me.
Is there a way to have a table that holds only one value? For example, a table called InstanceStation containing a single row with a single column, StationId? Is this good practice?
I tried to be as clear as possible, I hope my situation is clear.
Best Answer
Having a flag on the `Stations` table that indicates which station is local would be exactly how I'd handle this, presuming you don't have a billion stations. I'd likely call the column `IsSourceStation` or something, and make it a `BIT` value that can accept `NULL`. I would mark the local station row as `1`, and leave all other rows as `NULL`, since that won't take any space (see my comments below regarding space).

I'd add a filtered index to the `IsSourceStation` column, filtered as `WHERE IsSourceStation = 1`
. This index will allow extremely fast lookups to determine the name of the local station, if that is required.

Looking for the `Stations` row that corresponds to our "home" station is then a single `SELECT` filtered on `IsSourceStation = 1`; this will be very fast with the index I suggested, regardless of how many rows are in the `Stations` table. To confirm a station is *not* the home station, flip the predicate, remembering that all non-home rows hold `NULL` rather than `0`.
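The original answer's code blocks weren't preserved here; the following is a sketch of what the filtered index and the two lookups might look like, assuming `Stations` has `StationId` and `StationName` columns (those column names are assumptions):

```sql
-- Filtered index: only the single row with IsSourceStation = 1 is indexed,
-- so the index stays tiny no matter how many stations exist.
CREATE NONCLUSTERED INDEX IX_Stations_IsSourceStation
    ON dbo.Stations (StationId)
    INCLUDE (StationName)
    WHERE IsSourceStation = 1;

-- Look up the "home" station (can be answered from the filtered index).
SELECT StationId, StationName
FROM dbo.Stations
WHERE IsSourceStation = 1;

-- Confirm a given station is NOT the home station; non-home rows are NULL,
-- so test for "IS NULL OR = 0" rather than "= 0" alone.
SELECT StationId, StationName
FROM dbo.Stations
WHERE StationId = @StationId
  AND (IsSourceStation IS NULL OR IsSourceStation = 0);
```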
The null bitmap used in SQL Server is a fantastic optimization designed for just this type of situation where very few rows in a nullable column actually contain a value.
My statement above is technically correct in that the null bitmap is used to save space; however, for a table with a single bit column there is no appreciable difference between defining the column as nullable and making it not nullable with a default value of 0. I used the following test bed to determine this on SQL Server 2012:
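The test-bed script itself wasn't preserved; based on the description that follows (two tables, each with one `INT` column and one `BIT` column, nullable in one table and not in the other), it was presumably something along these lines. The column names here are assumptions:

```sql
CREATE TABLE dbo.TestBit
(
    Id     INT NOT NULL,
    BitVal BIT NULL               -- nullable variant
);

CREATE TABLE dbo.TestBitNotNull
(
    Id     INT NOT NULL,
    BitVal BIT NOT NULL
        CONSTRAINT DF_TestBitNotNull_BitVal DEFAULT (0)   -- non-nullable, defaults to 0
);

-- Fill both tables with identical rows; the nullable table stores NULL,
-- the non-nullable table takes the default of 0.
INSERT INTO dbo.TestBit (Id, BitVal)
SELECT TOP (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), NULL
FROM sys.all_objects o1 CROSS JOIN sys.all_objects o2;

INSERT INTO dbo.TestBitNotNull (Id)
SELECT TOP (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects o1 CROSS JOIN sys.all_objects o2;
```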
The above code creates two tables, each with a single `INT` column and a single `BIT` column. The first table allows the `BIT` column to be `NULL`; the second table does not.

I used the following to inspect the actual on-disk data pages for the first page of each table:
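The inspection commands weren't preserved either; `DBCC IND` and `DBCC PAGE` are the usual tools for this, so the original likely resembled the sketch below. The database name is an assumption; page IDs 334 and 342 are the values the answer quotes:

```sql
DBCC TRACEON (3604);   -- route DBCC output to the client session

-- List the pages allocated to each table to find its first data page.
DBCC IND ('TestDb', 'dbo.TestBit', 1);
DBCC IND ('TestDb', 'dbo.TestBitNotNull', 1);

-- Dump the first data page of each table; print option 3 shows
-- per-slot, per-column detail including the row's memory dump.
DBCC PAGE ('TestDb', 1, 334, 3);
DBCC PAGE ('TestDb', 1, 342, 3);
```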
The `DBCC PAGE` command can be used with the last option set to "3" to see the actual column values stored on the page, along with quite a bit of detail about each row "slot". In my two tables above, the first page of each table was page 334 and page 342, respectively.

Comparing the slot 0 output of `DBCC PAGE` for the two tables, the "memory dump" value, which shows the actual row bytes stored on disk in hex format, is precisely the same for the nullable and the non-nullable variant.
Indeed, querying the on-disk size of both tables confirms they are identically sized.
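The sizing query wasn't preserved; one common way to get comparable numbers is to total the reserved pages per table from `sys.dm_db_partition_stats` (the exact query in the original answer may have differed):

```sql
-- Total pages reserved by each test table, converted to KB
-- (each page is 8 KB).
SELECT t.name,
       SUM(ps.reserved_page_count) * 8 AS reserved_kb
FROM sys.tables t
JOIN sys.dm_db_partition_stats ps
    ON ps.object_id = t.object_id
WHERE t.name IN ('TestBit', 'TestBitNotNull')
GROUP BY t.name;
```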
My conclusion in light of the above data is that it is probably easier just to use a non-nullable column with a default value of 0 since that eliminates the potentially problematic null handling required for nullable columns.
Where the null bitmap does help is when you have more than 8 nullable bit fields. If you take my sample tables `TestBit` and `TestBitNotNull` and give them 16 bit fields each, you'll see that at 1,000,000 rows the two tables are no longer the same size.