SQL Server – Creating Range for Special Fields

check-constraints, performance, query-performance, sql-server

I have two columns that look like this:

Start_Post  End_Post
----------  --------
102+20.45   153+19.22 
120+21.25   220+25.30
...         ...

And I want to introduce a constraint to each column:

Start_Post
Min Range: 100+50.30
Max Range: 150+20.65
End_Post
Min Range: 150+60.30
Max Range: 500+20.75

All values are in feet and follow a special format. It's a construction convention, and I am not an expert, but to my knowledge 1+00 means 100 feet. Taking the range above, 100+50.30 to 150+20.65 corresponds to a distance of 4970.35 ft: I simply removed the plus signs and subtracted the two numbers (15020.65 - 10050.30). Hope that makes sense.
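The arithmetic described above can be sketched in a few lines of Python. This is only an illustration, assuming (as the question does) that one station equals 100 feet; the names station_to_feet and FEET_PER_STATION are made up for this example.

```python
from decimal import Decimal

# Assumption taken from the question: 1+00 means 100 feet.
FEET_PER_STATION = 100


def station_to_feet(value: str) -> Decimal:
    """Convert a 'station+offset' string such as '100+50.30' to feet."""
    station, offset = value.split("+")
    return int(station) * FEET_PER_STATION + Decimal(offset)


start_min = station_to_feet("100+50.30")  # 10050.30 ft
start_max = station_to_feet("150+20.65")  # 15020.65 ft
span = start_max - start_min
print(span)  # 4970.35
```

Decimal is used instead of float so the two-decimal offsets are subtracted exactly.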

Is there a way to define this constraint whilst maintaining the format?

Best Answer

Assuming the digits to the left of the + are measured in units of 100, you could perhaps create constraints like this:

CREATE TABLE dbo.SurveyData
(
    StartPostStation int NOT NULL
    , StartPostPlus decimal(38,3) NOT NULL
    , EndPostStation int NOT NULL
    , EndPostPlus decimal(38,3) NOT NULL
    , CONSTRAINT StartMin
        CHECK ((StartPostStation * 100 + StartPostPlus) > 100*100+50.3)
    , CONSTRAINT StartMax
        CHECK ((StartPostStation * 100 + StartPostPlus) < 150*100+20.65)
    , CONSTRAINT EndMin
        CHECK ((EndPostStation * 100 + EndPostPlus) > 150*100+60.3)
    , CONSTRAINT EndMax
        CHECK ((EndPostStation * 100 + EndPostPlus) < 500*100+20.75)
);

However, this doesn't allow dynamic constraints: every row in the table is constrained to the same limits, which may in fact be exactly what you need.
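To see how the fixed constraints behave, here is a rough sketch that runs the same CHECK expressions against an in-memory SQLite database from Python. SQLite is only a stand-in for SQL Server here (the dbo schema is dropped and the type names are treated loosely), and try_insert is a helper invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not SQL Server
conn.execute("""
    CREATE TABLE SurveyData
    (
        StartPostStation int NOT NULL
        , StartPostPlus decimal(38,3) NOT NULL
        , EndPostStation int NOT NULL
        , EndPostPlus decimal(38,3) NOT NULL
        , CONSTRAINT StartMin
            CHECK ((StartPostStation * 100 + StartPostPlus) > 100*100+50.3)
        , CONSTRAINT StartMax
            CHECK ((StartPostStation * 100 + StartPostPlus) < 150*100+20.65)
        , CONSTRAINT EndMin
            CHECK ((EndPostStation * 100 + EndPostPlus) > 150*100+60.3)
        , CONSTRAINT EndMax
            CHECK ((EndPostStation * 100 + EndPostPlus) < 500*100+20.75)
    )
""")


def try_insert(row):
    """Return True if the row passes every CHECK constraint, else False."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO SurveyData VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


ok = try_insert((120, 49.2, 175, 80.5))       # 120+49.2 is inside the start range
too_low = try_insert((100, 10.0, 175, 80.5))  # 100+10.0 is below 100+50.30
print(ok, too_low)
```

The second row is rejected by StartMin because 100 * 100 + 10.0 is less than 100 * 100 + 50.3.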

If you need dynamically adjusting constraints for each independent row, you could do this:

CREATE TABLE dbo.SurveyData
(
    StartPostStation int NOT NULL
    , StartPostPlus decimal(38,3) NOT NULL
    , EndPostStation int NOT NULL
    , EndPostPlus decimal(38,3) NOT NULL
    , CONSTRAINT StartMin
        CHECK ((StartPostStation * PostUnits + StartPostPlus) > MinStartPost * PostUnits + MinStartPostPlus)
    , CONSTRAINT StartMax
        CHECK ((StartPostStation * PostUnits + StartPostPlus) < MaxStartPost * PostUnits + MaxStartPostPlus)
    , CONSTRAINT EndMin
        CHECK ((EndPostStation * PostUnits + EndPostPlus) > MinEndPost * PostUnits + MinEndPostPlus)
    , CONSTRAINT EndMax
        CHECK ((EndPostStation * PostUnits + EndPostPlus) < MaxEndPost * PostUnits + MaxEndPostPlus)
    , MinStartPost int NOT NULL
    , MinStartPostPlus decimal(38,3) NOT NULL
    , MaxStartPost int NOT NULL
    , MaxStartPostPlus decimal(38,3) NOT NULL
    , MinEndPost int NOT NULL
    , MinEndPostPlus decimal(38,3) NOT NULL
    , MaxEndPost int NOT NULL
    , MaxEndPostPlus decimal(38,3) NOT NULL
    , PostUnits int NOT NULL
);


INSERT INTO dbo.SurveyData (StartPostStation, StartPostPlus, EndPostStation, EndPostPlus
    , MinStartPost, MinStartPostPlus, MaxStartPost, MaxStartPostPlus
    , MinEndPost, MinEndPostPlus, MaxEndPost, MaxEndPostPlus
    , PostUnits)
VALUES (120, 49.2, 175, 80.5  --measurements
    , 100, 50.3, 150, 20.65   --valid start post range
    , 150, 60.3, 500, 20.75   --valid end post range
    , 100); --units per post

Selecting data would look like this:

SELECT StartPost = CONVERT(varchar(10), sd.StartPostStation) + '+' + CONVERT(varchar(50), sd.StartPostPlus)
    , EndPost = CONVERT(varchar(10), sd.EndPostStation) + '+' + CONVERT(varchar(50), sd.EndPostPlus)
FROM dbo.SurveyData sd;

The results look like this:

+------------+------------+
| StartPost  |  EndPost   |
+------------+------------+
| 120+49.200 | 175+80.500 |
+------------+------------+

The post columns have been split into two so that we can use the correct data types to store numeric data. If we try to store 175+80.50 in a single field, we end up using a varchar(x) column, which allows all kinds of bad data, such as tee+27.-1, that is very difficult to prevent comprehensively. So, we store the station in one column, and the offset from that station in the next column. When presenting this data to humans, on screen or in reports, we'd use the concatenated version shown above.
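If the data arrives in the combined text format, the split can be done before it reaches the table. The following is a minimal sketch of that idea in Python, assuming the offset is always below 100 (at most two digits before the decimal point); split_post, format_post and STATION_RE are hypothetical names for this example only.

```python
import re
from decimal import Decimal

# Assumption: digits, a plus sign, then an offset below 100 with optional decimals.
STATION_RE = re.compile(r"(\d+)\+(\d{1,2}(?:\.\d+)?)")


def split_post(text: str):
    """Split '175+80.50' into (station, offset); reject anything malformed."""
    match = STATION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid post value: {text!r}")
    return int(match.group(1)), Decimal(match.group(2))


def format_post(station: int, plus: Decimal) -> str:
    """Rebuild the display format, with three decimals like the query output."""
    return f"{station}+{plus:.3f}"


station, plus = split_post("175+80.50")
print(format_post(station, plus))  # 175+80.500

try:
    split_post("tee+27.-1")
except ValueError as exc:
    print(exc)
```

With the value split this way, the two parts map directly onto the station and offset columns.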

Related Solutions

SQL Server – Improve delete speed for SQL Server

If you are trying to delete a large number of rows in a single statement, then it is likely you are waiting on log activity. So you can:

Make sure your log is adequately sized so that growth events don't slow you down. With the defaults, your log is probably starting at 1 MB with 10% growth. Growth events are expensive, and if you are logging even 10 GB of deletes, this will destroy performance not just now but also in the future (due to what this does to VLFs).
If you are deleting the whole table, use TRUNCATE or DROP/CREATE.
If you are deleting most of the table, use SELECT INTO to put the data you want to keep into another table, then TRUNCATE, then move the small portion back. (Or just drop the old table, rename the new, and re-apply constraints / permissions etc.)
Minimize the impact of logging in the first place by deleting the data in chunks instead of all at once. See this article. You can also consider switching to simple recovery temporarily, so that you only have to CHECKPOINT to clear the log instead of taking log backups, but you need to be sure to set it back and to take a new full backup to re-initiate the log chain.
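The chunked delete from the last point can be sketched as a simple loop. The example below uses Python with an in-memory SQLite database purely as a stand-in; in SQL Server the same pattern is usually written as a loop around DELETE TOP (n). The Events table, its columns, and delete_in_chunks are all made up for illustration.

```python
import sqlite3


def delete_in_chunks(conn, cutoff, batch_size=1000):
    """Delete rows with id <= cutoff in small batches, one transaction each."""
    total = 0
    while True:
        with conn:  # each batch commits on its own, keeping transactions short
            cur = conn.execute(
                "DELETE FROM Events WHERE rowid IN ("
                "SELECT rowid FROM Events WHERE id <= ? LIMIT ?)",
                (cutoff, batch_size),
            )
        if cur.rowcount == 0:
            break  # nothing left to delete
        total += cur.rowcount
    return total


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO Events (id, payload) VALUES (?, ?)",
    [(i, "x") for i in range(1, 5001)],
)
conn.commit()

deleted = delete_in_chunks(conn, cutoff=4000, batch_size=1000)
remaining = conn.execute("SELECT COUNT(*) FROM Events").fetchone()[0]
print(deleted, remaining)  # 4000 1000
```

The batch size is a tuning knob: smaller batches mean shorter transactions but more round trips.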

SQL Server – Table Structure for Validating Data Using Category and Subcategory Fields

There is a spectrum of strategies you could take, ranging from aggressive normalization at one end to full denormalization at the other. Full denormalization would be equivalent to your second example, where all relevant info simply ends up in the transaction table without references to other tables.

Full Normalization

To normalize completely, you would still want a Categories table, but you would also eliminate the redundant information stored in it. For that you need a CategoryList table and a SubCategoryList table, such as

CategoryList

id    category     
----------------
 1     Food      
 2     Household
 etc...

and

SubCategoryList

id    subcategory     
----------------
 1     Work Lunch    
 2     Fast Food
 3     Grocery Store
 4     Mortgage
 5     Repairs
 etc...

You could then construct your category table from these two tables as

Categories

id    category_id     subcategory_id
----------------------------------
 1        1               1
 2        1               2
 3        1               3
 4        2               4
 5        2               5

NULL subcategories can be handled in either of two ways: 1) place a NULL in the subcategory_id column of the appropriate row of the Categories table, or 2) add a row (id, subcategory) to SubCategoryList whose subcategory field is NULL.

Last but not least, you would add a foreign key reference from your Transactions table to the appropriate id in the Categories table.
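As a rough illustration of the fully normalized layout, the sketch below builds the four tables in an in-memory SQLite database from Python and shows the foreign key rejecting an unknown category. SQLite stands in for SQL Server, the Transactions column name category_ref is an assumption, and NULL subcategories are handled with option 1 (a nullable subcategory_id).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE CategoryList (id INTEGER PRIMARY KEY, category TEXT NOT NULL UNIQUE);
CREATE TABLE SubCategoryList (id INTEGER PRIMARY KEY, subcategory TEXT UNIQUE);
CREATE TABLE Categories (
    id INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL REFERENCES CategoryList(id),
    subcategory_id INTEGER REFERENCES SubCategoryList(id)  -- NULL allowed
);
CREATE TABLE Transactions (
    id INTEGER PRIMARY KEY,
    txdate TEXT NOT NULL,
    amount NUMERIC NOT NULL,
    category_ref INTEGER NOT NULL REFERENCES Categories(id)  -- hypothetical name
);
INSERT INTO CategoryList VALUES (1, 'Food'), (2, 'Household');
INSERT INTO SubCategoryList VALUES (1, 'Work Lunch'), (2, 'Fast Food'),
    (3, 'Grocery Store'), (4, 'Mortgage'), (5, 'Repairs');
INSERT INTO Categories VALUES (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 2, 4), (5, 2, 5);
""")


def add_transaction(txdate, amount, category_ref):
    """Return True if the insert succeeds, False if the foreign key rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Transactions (txdate, amount, category_ref) "
                "VALUES (?, ?, ?)",
                (txdate, amount, category_ref),
            )
        return True
    except sqlite3.IntegrityError:
        return False


valid = add_transaction("2015-06-25", 15.25, 2)     # Food / Fast Food
invalid = add_transaction("2015-06-25", 15.25, 99)  # no such Categories row
print(valid, invalid)
```

Because Transactions points at Categories rather than at the two lists separately, only category and subcategory pairs that actually exist in Categories can be recorded.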

Does it really need to be so normalized?

Well, in my opinion, no, it doesn't. I've heard a quote, though I can't remember who said it, that basically goes "Normalize until it hurts, denormalize until it works." Especially when you don't have a lot of categories, the fully normalized design may be overkill.

It might make more sense to keep the CategoryList and SubCategoryList tables described above to enumerate your types, skip the separate Categories table, and have your Transactions table reference CategoryList and SubCategoryList directly, as

id    txdate    amount   category   subcategory   account   ...
----------------------------------------------------------------
 1    6/25/15   15.25       1            2         cash      ...

This way, you save on storage, and you can update or modify any category or subcategory entry in the list without needing to modify your entire Transactions table. Further, you can allow NULL entries in the subcategory column of the Transactions table, if need be.

Hope this helps!
