Supertype/Subtype
How about looking into the supertype/subtype pattern? Common columns go in a parent table. Each distinct type has its own table with the ID of the parent as its own PK and it contains unique columns not common to all subtypes. You can include a type column in both parent and children tables to ensure each device can't be more than one subtype. Make an FK between the children and the parent on (ItemID, ItemTypeID). You can use FKs to either the supertype or subtype tables to maintain the desired integrity elsewhere. For example, if the ItemID of any type is allowed, create the FK to the parent table. If only SubItemType1 can be referenced, create the FK to that table. I would leave the TypeID out of referencing tables.
Naming
When it comes to naming, you have two choices as I see it (since the third choice of just "ID" is in my mind a strong anti-pattern). Either call the subtype key ItemID like it is in the parent table, or call it the subtype name such as DoohickeyID. After some thought and some experience with this, I advocate calling it DoohickeyID. The reason for this is that even though there could be confusion about the subtype table really in disguise containing Items (rather than Doohickeys), that is a small negative compared to when you create an FK to the Doohickey table and the column names don't match!
To EAV or not to EAV - My experience with an EAV database
If EAV is what you truly have to do, then it's what you have to do. But what if it weren't what you had to do?
I built an EAV database that is in use in a business. Thank God, the set of data is small (though there are dozens of item types) so the performance is not bad. But it would be bad if the database had more than a few thousand items in it! Additionally, the tables are so HARD to query. This experience has led me to really desire to avoid EAV databases in the future if at all possible.
Now, in my database I created a stored procedure that automatically builds PIVOTed views for each and every subtype that exists. I can just query from AutoDoohickey. My metadata about the subtypes has a "ShortName" column containing an object-safe name suitable for use in view names. I even made the views updateable! Unfortunately, you cannot update them on a join, but you CAN insert to them an already-existing row, which will be converted to an UPDATE. Unfortunately, you cannot update only a few columns, because there is no way to indicate to the VIEW which columns you want to update with the INSERT-to-UPDATE conversion process: a NULL value looks like "update this column to NULL" even if you wanted to indicate "Don't update this column at all."
Despite all this decoration to make the EAV database easier to use, I still don't use these views in most normal querying because it is SLOW. Query conditions are not predicate pushed all the way back to the Value
table, so it has to build an intermediate result set of all the items of that view's type before filtering. Ouch. So I have many, many queries with many, many joins, each one going out to get a different value and so on. They perform relatively well, but ouch! Here's an example. The SP that creates this (and its update trigger) is one giant beast, and I'm proud of it, but it is not something you want to ever try to maintain.
CREATE VIEW [dbo].[AutoModule]
AS
--This view is automatically generated by the stored procedure AutoViewCreate
SELECT
ElementID,
ElementTypeID,
Convert(nvarchar(160), [3]) [FullName],
Convert(nvarchar(1024), [435]) [Descr],
Convert(nvarchar(255), [439]) [Comment],
Convert(bit, [438]) [MissionCritical],
Convert(int, [464]) [SupportGroup],
Convert(int, [461]) [SupportHours],
Convert(nvarchar(40), [4]) [Ver],
Convert(bit, [28744]) [UsesJava],
Convert(nvarchar(256), [28745]) [JavaVersions],
Convert(bit, [28746]) [UsesIE],
Convert(nvarchar(256), [28747]) [IEVersions],
Convert(bit, [28748]) [UsesAcrobat],
Convert(nvarchar(256), [28749]) [AcrobatVersions],
Convert(bit, [28794]) [UsesDotNet],
Convert(nvarchar(256), [28795]) [DotNetVersions],
Convert(bit, [512]) [WebApplication],
Convert(nvarchar(10), [433]) [IFAbbrev],
Convert(int, [437]) [DataID],
Convert(nvarchar(1000), [463]) [Notes],
Convert(nvarchar(512), [523]) [DataDescription],
Convert(nvarchar(256), [27991]) [SpecialNote],
Convert(bit, [28932]) [Inactive],
Convert(int, [29992]) [PatchTestedBy]
FROM (
SELECT
E.ElementID + 0 ElementID,
E.ElementTypeID,
V.AttrID,
V.Value
FROM
dbo.Element E
LEFT JOIN dbo.Value V ON E.ElementID = V.ElementID
WHERE
EXISTS (
SELECT *
FROM dbo.LayoutUsage L
WHERE
E.ElementTypeID = L.ElementTypeID
AND L.AttrLayoutID = 7
)
) X
PIVOT (
Max(Value)
FOR AttrID IN ([3], [435], [439], [438], [464], [461], [4], [28744], [28745], [28746], [28747], [28748], [28749], [28794], [28795], [512], [433], [437], [463], [523], [27991], [28932], [29992])
) P;
Here's another type of automatically-generated view created by another stored procedure from special metadata to help find relationships between items that can have multiple paths between them (Specifically: Module->Server, Module->Cluster->Server, Module->DBMS->Server, Module->DBMS->Cluster->Server):
CREATE VIEW [dbo].[Link_Module_Server]
AS
-- This view is automatically generated by the stored procedure LinkViewCreate
SELECT
ModuleID = A.ElementID,
ServerID = B.ElementID
FROM
Element A
INNER JOIN Element B
ON EXISTS (
SELECT *
FROM
dbo.Element R1
WHERE
A.ElementID = R1.ElementID1
AND B.ElementID = R1.ElementID2
AND R1.ElementTypeID = 38
) OR EXISTS (
SELECT *
FROM
dbo.Element R1
INNER JOIN dbo.Element R2 ON R1.ElementID2 = R2.ElementID1
WHERE
A.ElementID = R1.ElementID1
AND R1.ElementTypeID = 40
AND B.ElementID = R2.ElementID2
AND R2.ElementTypeID = 38
) OR EXISTS (
SELECT *
FROM
dbo.Element R1
INNER JOIN dbo.Element R2 ON R1.ElementID2 = R2.ElementID1
WHERE
A.ElementID = R1.ElementID1
AND R1.ElementTypeID = 38
AND B.ElementID = R2.ElementID2
AND R2.ElementTypeID = 3122
) OR EXISTS (
SELECT *
FROM
dbo.Element R1
INNER JOIN dbo.Element R2 ON R1.ElementID2 = R2.ElementID1
INNER JOIN dbo.Element C2 ON R2.ElementID2 = C2.ElementID
INNER JOIN dbo.Element R3 ON R2.ElementID2 = R3.ElementID1
WHERE
A.ElementID = R1.ElementID1
AND R1.ElementTypeID = 40
AND C2.ElementTypeID = 3080
AND R2.ElementTypeID = 38
AND B.ElementID = R3.ElementID2
AND R3.ElementTypeID = 3122
)
WHERE
A.ElementTypeID = 9
AND B.ElementTypeID = 17
The Hybrid Approach
If you MUST have some of the dynamic aspects of an EAV database, you could consider creating the metadata as if you had such a database, but instead actually using the supertype/subtype design pattern. Yes, you would have to create new tables, and add and remove and modify columns. But with the proper pre-processing (like I did with my EAV database's Auto views) you could have real table-like objects to work with. Only, they wouldn't be as gnarly as mine and the query optimizer could predicate push down to base tables (read: perform well with them). There would just be a one join between the supertype table and the subtype table. Your application could be set to read the metadata to discover what it is supposed to do (or it can use the auto-generated views in some cases). This protects your application code from having to be touched extensively just to add or modify things.
Or, if you had a multi-level set of subtypes, just a few joins. By multi-level I mean when some subtypes share common columns, but not all, you could have a subtype table for those that is itself a supertype of a few other tables. For example, if you are storing information about Servers, Routers, and Printers, an intermediate subtype of "IP Device" could make sense.
I will give the caveat that I haven't yet made such a hybrid supertype/subtype EAV-metatable-decorated database like I'm suggesting here yet to try out in the real world. But the problems I've experienced with EAV are not small, and doing something is probably an absolute must if your database is going to be large and you want good performance without some crazy expensive gigantic hardware.
In my opinion, the time spent automating the use/creation/modification of real subtype tables would ultimately be best. Focusing on flexibility driven by data makes the EAV sound so attractive (and believe me I love how when someone asks me for a new attribute on an element type I can add it in about 18 seconds and they can immediately start entering data on the web site). But flexibility can be accomplished in more than one way! Pre-processing is another way to do it. It's such a powerful method that so few people use, giving the benefits of being totally data-driven but the performance of being hard-coded.
(Note: Yes those views really are formatted like that and the PIVOT ones really do have update triggers. :) If someone is really that interested in the awful painful details of the long and complicated UPDATE trigger, let me know and I'll post a sample for you.)
And One More Idea
Put all your data in one table. Give columns generic names and then reuse/abuse them for multiple purposes. Create views over these to give them sensible names. Add columns when a suitable-data-type unused column is not available, and update your views. Despite my length going on about subtype/supertype, this may be the best way.
This is more of a comment, but since this is a lengthy response that's worth mentioning, I'm posting it as an answer.
Your database looks sufficient for your needs. (Well, as best can be assumed from the diagram.) I assume that the UserAccount
table has a one-to-many relationship between the four tables Vehicle
, Book
, VideoGame
and Movie
(from here on, the V-B-V-M tables.) Next, I assume that the Asset
table has a one-to-one relationship with each link between the V-B-V-M table entries. With this information, I would recommend moving the UserAccount_id
from each V-B-V-M table, to the AssetTable
.
Next, it seems that Asset
is always associated with one of the four V-B-V-M tables. Considering this type of setup, I would recommend creating an additional table-- AssetType
.
+-----------+
| AssetType |
+-----------+
| id |
| typeDesc |
+-----------+
Next, update the Asset
table and each V-B-V-M tables to include a AssetType_ID
field. the AssetType
table should include one entry, for each type of asset. In this case, Vehicle
, Book
, Video Game
and Movie
, each with a unique description and ID.
You may want to do this for a couple reasons-- First, you can now query the Asset
table and discern what type of asset is stored in a specific record, without having to link the data to your four other V-B-V-M tables. This will speed up your querying of the data, when all of the information from your four tables isn't needed. Second, this also provides an easy means of linking in a textual description of the type of asset associated with the Asset
entry. Again, without having to link to the V-B-V-M tables to figure that out.
Last, you may want to consider handling the barcode data differently. If barcode is a simple look-up, you can add it to the asset table. However, if barcode values have to be unique, keeping it in a secondary table is necessary because you'd have to put a constraint on the database. You could do this if it was integrated in the Asset
table, however, your vehicle data would cause problems because vehicles do not have barcodes. To my knowledge most, if not all, major database engines won't allow you to define a unique field, that also allows null
values because, then, the data is no longer unique.
You might be inclined to think that you could combine the Barcode
field with the VIN
. Since VINs are unique and Barcodes are unique, and neither should be the same, why can't they be combined into one field on the Asset
table, which allow you to drop the additional Barcode
table. I wouldn't recommend this because I think it gives the data less "focus".
Yes, it is a form of unique object ID but barcodes are barcodes and VINs are VINs-- the concept behind both is completely different and consolidating the two can become confusing with time. Furthermore, if you expand your project in the future, it could cause collisions. Technically, I don't see a big problem with that (in your case) but I'd still highly recommend avoiding that option.
Also, I would also add the assetType_id
to the Barcode
table, for the same reasons you added it to the Asset
table.
Best Answer
LedgerSMB, which is my main project, does both. I don't think we will merge these and the explanation why may give you an answer.
The difference you have to understand is that if you are trying to do both, you are probably trying to do accounting on both. The accounting issues are different and require that different things are tracked. There is room to centralize some information, but others can't be handled as well.
Let's look at the flow items through both systems.
You buy 10 widgets for inventory for $10 each. What you have to track depends on how you value your inventory. If this is FIFO or LIFO, you are going to have to track the financial aspect of the transaction ($100 debit in inventory, $100 credit in payables), and for average cost, you will have to track also the average cost per item. As an item gets sold then you have to record revenue and receivables but you also move inventory value to your cost of goods sold account. Modelling this process really defines a usable inventory system.
An asset control system is different. Suppose you buy 100 desks at $50 each. You then use these for a minimum of five years and depreciate using some method (many methods are acceptable) over that time. This example will use straight-line. To do this properly you also have to track date in service, estimated salvage value (in this case $0 if we intend to donate on disposal), etc. Then each year we will have a transaction which partially moves that $5000 into expenses ($1000 per year using straight-line). After that the asset will be fully depreciated but still in service and we will have no further transactions for the part until disposal. When we dispose even if for $0 on a fully depreciated item, we will still have another financial transaction accounting for this.
I know in LedgerSMB, our asset management side is more complex than our inventory management side.
So your design has a lot less to do with the nature of what is being stored and a lot more to do with the business processes you have to support, the accounting requirements, and so forth. These mean you have to track different information and expect different flows of stuff through your system.
Hope this helps. One thing I would highly recommend is that you do one of two things, however. Either hire a consultant (me! no conflict of interest there ;-) ) or pick up a book on accounting. If you want to support accounting workflows you want to know the basics enough you can have an intelligent conversation with the folks from finance. Domain knowledge here really is important because this is an area that looks really weird to the uninitiated. You don't have to know everything but you should be able to flag issues to talk about with accountants. I got into this through a book I picked up for $2 at a garage sale. It's a vast field though and you want to know what human problems you are solving and what business processes are accepted.