Situation 1:
Your tables have one relationship, not two (for example, a Device belongs to a DeviceType).
So keep only one relationship: the composite one, which includes the Primary Key. The other relationship is redundant once the composite one is defined.
I would also suggest using the same names for related columns:
DeviceCategory Table
CategoryCode | Name | Description
----------------------------------------
WKS | Workstation | Description of what classifies an item as a workstation...
LPT | Laptop | Description of what classifies an item as a laptop...
DeviceType Table
DeviceTypeID | CategoryCode | Manufacturer | Model | IsTrackedInOtherSystemDefault
-----------------------------------------------------------------------------------
1 | WKS | Dell | GX1000 | true
2 | LPT | HP | dv4000 | false
3 | WKS | HP | xx9000 | false
Device Table
DeviceID | SerialNumber | DeviceTypeID | IsTrackedInOtherSystem | CategoryCode
------------------------------------------------------------------------------
1 | I81U812 | 1 | true | WKS
2 | N0S4A2 | 1 | false | WKS
3 | 3BL1NDMIC3 | 2 | false | LPT
So, the design would be:
DeviceCategory
--------------
CategoryCode PK
Name U1
Description
DeviceType
----------
DeviceTypeID PK U1
CategoryCode FK U1
Manufacturer U2
Model U2
IsTrackedInOtherSystemDefault
Device
------
DeviceID PK U1
SerialNumber U2
DeviceTypeID FK1
IsTrackedInOtherSystem
CategoryCode FK1 U1
and for the Computer table:
Computer
--------
DeviceID PK FK1
Hostname U1
IPAddress U2
CategoryCode FK1 CHK
The "additional" UNIQUE keys (the two composite U1 ones) will be needed in most DBMS to enforce the foreign key constraints. I guess this answers your question 2: relationships need indexes to be enforced, so you have to use them. The DBMS will use them not only to enforce integrity but also in your queries and statements, when you join the tables.
The only one that is not needed is the U3 you had in the Computer table.
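A minimal T-SQL sketch of the keys above (the data types, and the category values in the CHECK constraint, are my assumptions, not something you stated):

```sql
CREATE TABLE DeviceCategory (
    CategoryCode varchar(10)  NOT NULL PRIMARY KEY,
    Name         varchar(50)  NOT NULL UNIQUE,      -- U1
    Description  varchar(255) NULL
);

CREATE TABLE DeviceType (
    DeviceTypeID int         NOT NULL,
    CategoryCode varchar(10) NOT NULL
        REFERENCES DeviceCategory (CategoryCode),
    Manufacturer varchar(50) NOT NULL,
    Model        varchar(50) NOT NULL,
    IsTrackedInOtherSystemDefault bit NOT NULL,
    PRIMARY KEY (DeviceTypeID),
    UNIQUE (DeviceTypeID, CategoryCode),            -- U1: target of the composite FK
    UNIQUE (Manufacturer, Model)                    -- U2
);

CREATE TABLE Device (
    DeviceID     int         NOT NULL PRIMARY KEY,
    SerialNumber varchar(50) NOT NULL UNIQUE,       -- U2
    DeviceTypeID int         NOT NULL,
    IsTrackedInOtherSystem bit NOT NULL,
    CategoryCode varchar(10) NOT NULL,
    UNIQUE (DeviceID, CategoryCode),                -- U1: target of Computer's composite FK
    FOREIGN KEY (DeviceTypeID, CategoryCode)        -- FK1: the single composite relationship
        REFERENCES DeviceType (DeviceTypeID, CategoryCode)
);

CREATE TABLE Computer (
    DeviceID     int         NOT NULL PRIMARY KEY,
    Hostname     varchar(63) NOT NULL UNIQUE,       -- U1
    IPAddress    varchar(45) NOT NULL UNIQUE,       -- U2
    CategoryCode varchar(10) NOT NULL
        CHECK (CategoryCode IN ('WKS', 'LPT')),     -- CHK: assumed set of computer categories
    FOREIGN KEY (DeviceID, CategoryCode)            -- FK1
        REFERENCES Device (DeviceID, CategoryCode)
);
```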
About question 3 (the over-engineering part): no, I don't think so, but that's just my opinion. And you haven't told us whether this is homework/an exercise or a real project, whether you will be holding only your family's inventory or a multi-million company's, etc.
Situation 2
I think what you have is fine, and there is no need (and it is not a good idea) to have referential integrity constraints on these columns. This is a default value that is copied into the second table via a stored procedure (I guess during inserts on the second table?) or altered by a user. If you add an FK, won't that deny users the ability to override the default?
The names of the two columns are self-explanatory enough for a DBA to understand the functionality.
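If the copying does happen on insert, one way to do it is a sketch like the following, reusing the table and column names from Situation 1 (the procedure name and parameter list are hypothetical):

```sql
-- Hypothetical insert procedure: copy the type's default
-- unless the caller supplies an explicit value.
CREATE PROCEDURE InsertDevice
    @SerialNumber varchar(50),
    @DeviceTypeID int,
    @IsTrackedInOtherSystem bit = NULL   -- NULL means "use the type's default"
AS
BEGIN
    INSERT INTO Device (SerialNumber, DeviceTypeID, IsTrackedInOtherSystem)
    SELECT @SerialNumber,
           @DeviceTypeID,
           COALESCE(@IsTrackedInOtherSystem, dt.IsTrackedInOtherSystemDefault)
    FROM DeviceType AS dt
    WHERE dt.DeviceTypeID = @DeviceTypeID;
END;
```

This keeps the default in one place (DeviceType) while still letting a user pass an explicit value to override it.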
Dynamic SQL is another option. I would try both this method and @a1ex07's method and see which runs faster for you.
CREATE PROCEDURE WorkAssignmentReport
      @Category varchar(25)
    , @ReportName varchar(25)
    , @Status varchar(25)
    , @PhaseName varchar(25)
    , @Name1 varchar(25)
AS
DECLARE @sql nvarchar(max)
SET @sql = 'SELECT * ' + CHAR(13) +
'FROM [Ticket] ' + CHAR(13) +
'INNER JOIN dbo.[TicketType] ON [Ticket].TicketTypeID = [TicketType].TicketTypeID ' + CHAR(13) +
'INNER JOIN dbo.[Status] ON [Ticket].Stat .... ' + CHAR(13) +
'..... ' + CHAR(13) +
' ' + CHAR(13) +
' ' + CHAR(13) +
'WHERE 1=1 '
IF @Category IS NOT NULL
SET @sql = @sql + CHAR(13) + ' AND [TicketType].Name LIKE @Category '
IF @ReportName IS NOT NULL
SET @sql = @sql + CHAR(13) + ' AND [Ticket].Name LIKE @ReportName '
IF @Status IS NOT NULL
SET @sql = @sql + CHAR(13) + ' AND [Status].StatusName LIKE @Status '
IF @PhaseName IS NOT NULL
SET @sql = @sql + CHAR(13) + ' AND [Phase].PhaseName LIKE @PhaseName '
IF @Name1 IS NOT NULL
SET @sql = @sql + CHAR(13) + ' AND (BA.FullName = @Name1 OR AD.FullName = @Name1) '
EXEC sp_executesql @sql, N'@Category varchar(25), @ReportName varchar(25), @Status varchar(25), @PhaseName varchar(25), @Name1 varchar(25)',
@Category, @ReportName, @Status, @PhaseName, @Name1
Best Answer
Introduction
I do not have the time for extensive testing, but can suggest from where to start.
If you rewrite the query in a more symmetrical manner, to emphasise that both entities are joined to a two-dimensional cross-section of MyJoinTable, you will see that one reasonably efficient way of executing it is first to extract that cross-section and then to join the entities to it. The following equivalent query will serve for illustration purposes (but do not use it!):
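A sketch of what such a rewrite might look like, assuming MyJoinTable carries two foreign keys, Entity1Id and Entity2Id (names of my invention), plus the four date columns filtered by @date1 and @date2:

```sql
SELECT e1.*, e2.*
FROM (
    SELECT Entity1Id, Entity2Id          -- the two-dimensional cross-section
    FROM MyJoinTable
    WHERE StartDate1 <= @date1 AND EndDate1 >= @date1
      AND StartDate2 <= @date2 AND EndDate2 >= @date2
) AS SECT
JOIN Entity1 AS e1 ON e1.Id = SECT.Entity1Id
JOIN Entity2 AS e2 ON e2.Id = SECT.Entity2Id;
```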
This is all the more important since you say that there is at most one matching row in the cross-section.
For this reason, joining the entities in before MyJoinTable is reduced to the required cross-section by @date1 and @date2 is inefficient: the table is likely to have many records for each entity, so the result will have too many rows. The entities are best joined at the end, using their natural key field, Id, which I assume is the clustered index. The solutions below, therefore, propose different ways of calculating this cross-section, corresponding to the SECT subquery of the example above.
Solution I: The easiest
Let us try to add useful indices to the original query. Since MSSQL's composite indices are hierarchical, they are of no use in optimising interval comparisons, so the best we can do with the given structure is to index one of the date fields, making sure to cover all the other fields required from the table:
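With the same assumed column names as above, the single covering index might look like:

```sql
-- Seek on StartDate1; the other three date columns and the entity keys
-- are INCLUDEd so the index covers the query, and the remaining
-- predicates are evaluated as residual predicates on the index rows.
CREATE INDEX SD1 ON MyJoinTable (StartDate1)
    INCLUDE (EndDate1, StartDate2, EndDate2, Entity1Id, Entity2Id);
```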
The query will then be executed in the following order:
1. Using the index SD1, perform an index seek on MyJoinTable to find records by StartDate1, then filter them by StartDate2, EndDate1, and EndDate2 in the residual predicate.
2. Join the entity tables using their natural key, Id.
This method is not very efficient, because only one of the four date predicates is fully optimised by an index; the residual predicates are merely brute-force data grinders.
Solution II: Set intersection
Another symmetrical way to obtain the two-dimensional JOIN cross-section goes through INTERSECTing the results of the four date predicates. With suitable indices, the plan for this query includes only clean index seeks (without residual predicates) and hash match operations.
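A sketch of the INTERSECT form and one possible set of covering indices, under the same assumed column names (the index names other than SD1 are mine):

```sql
SELECT e1.*, e2.*
FROM (
    SELECT Entity1Id, Entity2Id FROM MyJoinTable WHERE StartDate1 <= @date1
    INTERSECT
    SELECT Entity1Id, Entity2Id FROM MyJoinTable WHERE EndDate1   >= @date1
    INTERSECT
    SELECT Entity1Id, Entity2Id FROM MyJoinTable WHERE StartDate2 <= @date2
    INTERSECT
    SELECT Entity1Id, Entity2Id FROM MyJoinTable WHERE EndDate2   >= @date2
) AS SECT
JOIN Entity1 AS e1 ON e1.Id = SECT.Entity1Id
JOIN Entity2 AS e2 ON e2.Id = SECT.Entity2Id;

-- One index per date predicate, each covering the entity keys
-- so every branch is a clean index seek:
CREATE INDEX SD1 ON MyJoinTable (StartDate1) INCLUDE (Entity1Id, Entity2Id);
CREATE INDEX ED1 ON MyJoinTable (EndDate1)   INCLUDE (Entity1Id, Entity2Id);
CREATE INDEX SD2 ON MyJoinTable (StartDate2) INCLUDE (Entity1Id, Entity2Id);
CREATE INDEX ED2 ON MyJoinTable (EndDate2)   INCLUDE (Entity1Id, Entity2Id);
```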
Observe that the entity keys in the query and indices present a fourfold redundancy, which may be removed, at the expense of a more complicated execution plan, by joining the entities in separately, with these indices:
But since, in either case, each of the four date predicates bounds the date on one end only and so does not reduce the amount of data sufficiently, the hash matches may have to spill data into tempdb. If they do, this method is not fit for your environment.
Solution III: A compromise
Now we can merge Solutions I and II to come up with a plan that does not require as much RAM and at the same time is reasonably fast:
Now the date constraints are more efficient at discarding data because they bound the date on both ends. With the indices below:
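Under the same assumptions as before, the merged query and its two supporting indices might be sketched as:

```sql
SELECT e1.*, e2.*
FROM (
    SELECT Entity1Id, Entity2Id FROM MyJoinTable
    WHERE StartDate1 <= @date1 AND EndDate1 >= @date1   -- bounded on both ends
    INTERSECT
    SELECT Entity1Id, Entity2Id FROM MyJoinTable
    WHERE StartDate2 <= @date2 AND EndDate2 >= @date2   -- bounded on both ends
) AS SECT
JOIN Entity1 AS e1 ON e1.Id = SECT.Entity1Id
JOIN Entity2 AS e2 ON e2.Id = SECT.Entity2Id;

-- Each branch seeks on its start date and evaluates the end date
-- as a residual predicate on the covered index rows:
CREATE INDEX D1 ON MyJoinTable (StartDate1)
    INCLUDE (EndDate1, Entity1Id, Entity2Id);
CREATE INDEX D2 ON MyJoinTable (StartDate2)
    INCLUDE (EndDate2, Entity1Id, Entity2Id);
```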
each constraint uses an index seek with a residual predicate, which is better than the single index seek of Solution I and takes less RAM than Solution II. The plan shows three opportunities for parallelisation: one for each date constraint, and one for the INTERSECT operation.
A non-redundant version of this approach would be:
with these indices:
Although it should work better with your data, where the SECT subquery returns at most one row, in my crude tests with randomly generated data it was less efficient, because the server used a hash match instead of a nested loop join with that single row. Do try it on your side.
Solution IV: Optimised structure
It is possible to optimise your query at the expense of introducing a more complicated structure that requires additional maintenance and slows down the modification of data in MyJoinTable. If you are willing to pay the price, store the date ranges as sets of days, and query the relation thus:
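One way to sketch the day-set structure and the query against it, assuming MyJoinTable stores two range ids (Range1Id, Range2Id, names of my invention) instead of raw date pairs:

```sql
-- Hypothetical day-set table: one row per day covered by each range.
CREATE TABLE Ranges (
    RangeId int  NOT NULL,   -- identifies one (StartDate, EndDate) pair
    D       date NOT NULL,   -- a single day inside that range
    PRIMARY KEY (RangeId, D)
);

-- Each date parameter now matches exactly one day row per range,
-- so the interval comparisons become equality seeks:
SELECT e1.*, e2.*
FROM MyJoinTable AS jt
JOIN Ranges  AS r1 ON r1.RangeId = jt.Range1Id AND r1.D = @date1
JOIN Ranges  AS r2 ON r2.RangeId = jt.Range2Id AND r2.D = @date2
JOIN Entity1 AS e1 ON e1.Id = jt.Entity1Id
JOIN Entity2 AS e2 ON e2.Id = jt.Entity2Id;
```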
You will need some testing in order to determine optimal indices, but I think the following should work:
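A guess at the two indices, using the same hypothetical column names:

```sql
-- RD: find the range ids corresponding to a given date.
CREATE INDEX RD ON Ranges (D) INCLUDE (RangeId);

-- RR: find the MyJoinTable record matching a pair of range ids.
CREATE INDEX RR ON MyJoinTable (Range1Id, Range2Id)
    INCLUDE (Entity1Id, Entity2Id);
```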
Index RD will make sure the ranges corresponding to the specified dates are quickly found, and index RR will help to find the record matching these ranges. But then you will have to devise a means of filling the Ranges table and keeping it in sync with MyJoinTable, because doing so by hand is out of the question.