In my experience, trying to encode multiple data points into a single column always ends up being more trouble than it's worth. Sure, it seems cool and clever to use BITWISE operators, but there are many things that go wrong and it won't always be efficient to test those bits without cumbersome and unintuitive workarounds. It's the same reason we stay away from storing comma-separated lists, JSON strings etc. in a single column - eventually you care about viewing or filtering on those distinct bits which you now have to extract, sometimes expensively.
With the information I have, my vote is for three separate BIT
columns. They will still collapse to similar storage patterns as a single column with the three bits on/off, and can be made more efficient individually and across the board in several ways, including:
- data compression
- sparse columns
- filtered indexes (e.g.
WHERE allow_returns = 1
)
Someone else advocated for three CHAR(1)
columns. These do not benefit from storage collapse and also require a check constraint, making them less than ideal in my mind.
Now, my answer might change if you say, "well, what if I might add 15 other attributes in the future?" I certainly don't think it's wise to build the columns this way if they're not relatively static - changing the schema (and therefore all of the code and interfaces to it) for every new or changed attribute is going to be a royal pain. So in that case you might want to consider EAV - where the attributes are not part of the metadata but part of the data. There are a lot of objections to EAV, mostly around performance and the difficulty in enforcing constraints (in this case unlikely to be an issue if all of these attributes are either on or off), but it worked quite well for us at my previous job. You might model it like this:
CREATE TABLE dbo.Attributes
(
AttributeID TINYINT PRIMARY KEY,
Name VARCHAR(32) NOT NULL UNIQUE
);
CREATE TABLE dbo.ItemAttributes
(
ItemID INT NOT NULL
FOREIGN KEY REFERENCES dbo.Items(ItemID),
AttributeID TINYINT NOT NULL
FOREIGN KEY REFERENCES dbo.Attributes(AttributeID),
Status BIT NOT NULL,
PRIMARY KEY(ItemID, AttributeID)
);
And again, you can have filtered indexes here to make certain queries much more efficient, such as (imagine the AttributeID for "allow returns" is 10):
CREATE INDEX optAllowReturns ON dbo.ItemAttributes(ItemID)
WHERE AttributeID = 10 AND Status = 1;
If you have certain attributes that are not on/off (for example, three states of manufacture or shipping), you can change the Status
column to:
Value TINYINT NOT NULL
This can double as an on/off value for attributes that are represented that way, and as tri- or more-state value for attributes that require more than simple on/off. You can also reflect which type is which in the metadata of the dbo.Attributes
table.
This type of requirement comes under the banner of "gaps and islands". A popular approach is
WITH T
AS (SELECT *,
DENSE_RANK() OVER (PARTITION BY ItemId ORDER BY DateOfChange) -
DENSE_RANK() OVER (PARTITION BY ItemId, Status ORDER BY DateOfChange) AS Grp
FROM ItemTable)
SELECT ItemId,
Status,
MIN(DateOfChange) AS Start,
MAX(DateOfChange) AS Finish
FROM T
GROUP BY ItemId,
Status,
Grp
ORDER BY Start
Best Answer
OK, try this. I made some assumptions.
With those assumptions in mind I present the following solution. LAG() isn't an option in 2008, so you are left with making a bunch of datasets to realize the output you want. I'm using CTEs to generate the lists of UNITS by INCIDENT and then creating temp tables to work with the timestamps.
I could have done a bunch of nested CTEs but then they'd be executed multiple times; same with doing this with sub queries. Finally, when reading the output for the status changes, if a change is preceeded by a NULL value, it is because the status was skipped. In the case of E14, based off available sample data, there were no status changes from DP.
Hopefully this is what you need or at least enough to get you going.
GENERATE SAMPLE DATA
CODE