You were more or less on the right track with (b), but you want to make sure that latitude and longitude are handled independently. What you need are indexes that cover your query parameters. This is what you are looking for:
1. Segregate your retailer addresses (including lat/lon). This could be in a RetailerAddress table, or it could be columns on the RetailOutlet table, whichever makes the most sense to you.
2. Create two indexes, (lat) and (lon), on whichever table contains your retail outlet addresses. If you have a separate table for retail outlet addresses, be sure to include the FK to the retail outlet in these indexes (i.e. (lat, OutletID) and (lon, OutletID)).
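As a sketch, assuming a separate RetailerAddress table keyed by OutletID (all table and column names here are illustrative):

```sql
-- Hypothetical address table; lat/lon could equally be columns on RetailOutlet.
CREATE TABLE RetailerAddress (
    OutletID int NOT NULL
        REFERENCES RetailOutlet (OutletID),
    Lat decimal(9,6) NOT NULL,
    Lon decimal(9,6) NOT NULL,
    -- street, city, etc.
    CONSTRAINT PK_RetailerAddress PRIMARY KEY (OutletID)
);

-- Independent indexes on each coordinate, carrying the FK along.
CREATE INDEX IX_RetailerAddress_Lat ON RetailerAddress (Lat, OutletID);
CREATE INDEX IX_RetailerAddress_Lon ON RetailerAddress (Lon, OutletID);
```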
Now as long as your query starts from the perspective of a particular user, this is all you'll need for a very efficient lookup on lat/lon ranges. If you need to find the proximate retail locations for a bunch of users at one time, then you will want to have similar indexes on the table containing user addresses.
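For a single user, that lookup can be a simple bounding-box range query over the two indexes; something like this sketch (the variable names, table name, and coordinate values are assumptions):

```sql
-- Bounding-box search around one user's location (names are illustrative).
DECLARE @UserLat decimal(9,6) = 40.7128,
        @UserLon decimal(9,6) = -74.0060,
        @Delta   decimal(9,6) = 0.1;  -- search "radius" in degrees

SELECT a.OutletID
FROM RetailerAddress AS a
WHERE a.Lat BETWEEN @UserLat - @Delta AND @UserLat + @Delta
  AND a.Lon BETWEEN @UserLon - @Delta AND @UserLon + @Delta;
```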
A coworker suggested that it is correct, for performance reasons, to store 'description' and 'notes' in a separate table referenced by a foreign key rather than in the table itself, given that some of the records will contain NULL (or an empty string in this case) and the data will presumably be split across pages.
Possibly, depending on RDBMS, the exact data types, and what the performance goal is.
For SQL Server, which it sounds like you're using, you can use sp_tableoption with the large value types out of row option to keep a LOB column (nvarchar(MAX) in this case) in the table you want (logically), while pushing the bulk of the data storage onto separate pages (physically).
This will make accessing the LOB data slower, but will speed up access to the base data alone (probably more so for scans) as the tradeoff. Depending on the average LOB size, this setting may not make a significant difference. You will have to test this for your exact scenario to see if it improves your workload.
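Enabling the option is a one-liner per table; a sketch, with an assumed table name:

```sql
-- Push nvarchar(MAX) / varbinary(MAX) values off the base data pages.
-- Note: existing rows keep their in-row storage until the LOB value
-- is next updated; only new or updated values move out of row.
EXEC sp_tableoption
    @TableNamePattern = N'dbo.RetailOutlet',
    @OptionName       = 'large value types out of row',
    @OptionValue      = 'ON';
```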
In any event, as long as you want to keep the one-to-one relationship, there's no reason to logically separate the values from the base table, because SQL Server gives you a way around the problem, assuming you're using the MAX type variant. (Note: if you're currently using a fixed-length field, you could switch to the MAX type and add a check constraint to limit the length.)
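That note could look like the following sketch (the table and column names, and the 4000-character cap, are assumptions):

```sql
-- Switch a limited-length column to MAX storage...
ALTER TABLE dbo.RetailOutlet
    ALTER COLUMN Notes nvarchar(MAX) NULL;

-- ...while still capping the logical length it may hold.
ALTER TABLE dbo.RetailOutlet
    ADD CONSTRAINT CK_RetailOutlet_Notes_Len
        CHECK (LEN(Notes) <= 4000);
```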
This conversation occurred after we noticed that the integrity of the one-to-one table design had been broken by a duplicate record.
It seems like the design of these structures is smelly. Are you keeping a single table to store all of these fields, which are then referenced from the parent tables?
A better design would be to only separate the tables where there's an actual need (again, as I mentioned, this may not even be required) with a declarative one-to-one relationship, such as repeating the primary key of the parent table as the primary key in the LOB storage table.
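A sketch of that declarative one-to-one relationship, with assumed names:

```sql
-- The LOB table's primary key IS the parent's primary key,
-- so at most one notes row can ever exist per outlet.
CREATE TABLE RetailOutletNotes (
    OutletID    int NOT NULL
        CONSTRAINT PK_RetailOutletNotes PRIMARY KEY
        CONSTRAINT FK_RetailOutletNotes_RetailOutlet
            REFERENCES RetailOutlet (OutletID),
    Description nvarchar(MAX) NULL,
    Notes       nvarchar(MAX) NULL
);
```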
It seems to me that if the developer who designed the column had put it in the source table, all of the code that handles the one-to-one integrity would be unnecessary (and would not have caused this bug).
Quite possibly. The fewer moving parts there are in a system, the more reliable it is likely to be, with fewer bugs arising from complexity.
Best Answer
Yes, duplicated data is stored as separate copies in SQL Server.
To change this behavior, you would need to use the page compression feature: create or rebuild the table's indexes with the (DATA_COMPRESSION = PAGE) option.
It is a great feature that helps to save space. Once you enable it, SQL Server stores repeated values only once per page and points to that single copy behind the scenes (prefix and dictionary compression).
Beware that page compression is not available in every SQL Server edition, and it can add some CPU overhead.
So you might want to make a lookup table if your edition does not allow for page compression
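Enabling it is a rebuild with the compression option; a minimal sketch, with an assumed table name:

```sql
-- Rebuild the table (heap or clustered index) with page compression.
ALTER TABLE dbo.RetailOutlet
    REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Nonclustered indexes are compressed separately.
ALTER INDEX ALL ON dbo.RetailOutlet
    REBUILD WITH (DATA_COMPRESSION = PAGE);
```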