Merge them. Sometimes it's worthwhile to have a separate Addresses
table, but generally it's simplest to make those fields of a single Customers
dimension. Because addresses are nearly unique, you're not going to save many records by hiving them out to their own table.
If your Customers
table is really large, and addresses change relatively often then moving Addresses
out as a Type-4 SCD could be a worthwhile optimization, but a small one.
If there's interest in analyzing customers independently of their specific sales (e.g., "20% of our top-100 customers are international"), you can absolutely add a NumCustomers
dummy column to this Customers
table to turn it into a factless fact table.
Assuming that the app will be treating these various shop-types in the same manner for some operations (operations that do not care what type of shop that it is), then my preferred approach is to use a subclass / inheritance model. I have detailed this out in a couple of other answers here:
The basic concept would be as follows (using Microsoft SQL Server T-SQL semantics).
First you need the main entity table. It represents various "types" so those need to be defined in a lookup table (and the lookup table should have a matching enum
in the app layer). Please note that I am using a TINYINT here which holds values from 0 to 255. Use the smaller numeric type that will meet your needs.
CREATE TABLE dbo.ShopType
(
ShopTypeID TINYINT NOT NULL PRIMARY KEY, -- NO IDENTITY / auto-increment!
ShopType VARCHAR(50) NOT NULL
);
CREATE TABLE dbo.Shop
(
ShopID INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
ShopTypeID TINYINT NOT NULL REFERENCES dbo.ShopType (ShopTypeID), -- FK to ShopType
ShopName NVARCHAR(100) NOT NULL,
VisitorsLastWeek INT NOT NULL DEFAULT (0),
-- {OtherCommonFields},...
CreateDate DATETIME NOT NULL DEFAULT (GETDATE()), -- or GETUTCDATE()
ModifiedDate DATETIME NOT NULL DEFAULT (GETDATE()) -- or GETUTCDATE()
);
Next you create a subclass table for each particular "type". It is sometimes appropriate to share a subclass table for multiple "types", but just be careful when doing that so one type doesn't evolve differently than the other type(s) that might be sharing the same subclass table. Please note that the PK of the subclass tables is the same field as the PK in the parent class (i.e. the [Shop] table) and even FKs to it. The reason is that the row in both tables (e.g. [Shop] and [ShopRepair] that have the same value for [ShopID]
are together a complete entity; the row in the subclass table does not represent anything by itself and doesn't need a separate ID.
CREATE TABLE dbo.ShopFood
(
ShopID INT NOT NULL PRIMARY KEY REFERENCES dbo.Shop(ShopID),
MostPopularItem NVARCHAR(50) NULL,
ModifiedDate DATETIME NOT NULL DEFAULT (GETDATE()) -- or GETUTCDATE()
);
CREATE TABLE dbo.ShopRepair
(
ShopID INT NOT NULL PRIMARY KEY REFERENCES dbo.Shop(ShopID),
CurrentComputerRepairs SMALLINT NOT NULL DEFAULT (0),
RepairEstimatedHours TINYINT NOT NULL DEFAULT (0),
ModifiedDate DATETIME NOT NULL DEFAULT (GETDATE()) -- or GETUTCDATE()
);
CREATE TABLE dbo.ShopRetail
(
ShopID INT NOT NULL PRIMARY KEY REFERENCES dbo.Shop(ShopID),
NextPromotionDate DATETIME NULL,
CurrentWeekSalesEstimate MONEY NULL,
ModifiedDate DATETIME NOT NULL DEFAULT (GETDATE()) -- or GETUTCDATE()
);
To simplify SELECT operations, you can create Views against each type, if you like:
CREATE VIEW dbo.FoodShop AS
SELECT sh.ShopID, sh.ShopTypeID, sf.MostPopularItem
FROM dbo.Shop sh
INNER JOIN dbo.ShopFood sf
ON sf.ShopID = sh.ShopID;
CREATE VIEW dbo.RepairShop AS
SELECT sh.ShopID, sh.ShopTypeID, sr.CurrentComputerRepairs, sr.RepairEstimatedHours
FROM dbo.Shop sh
INNER JOIN dbo.ShopRepair sr
ON sr.ShopID = sh.ShopID;
CREATE VIEW dbo.RetailShop AS
SELECT sh.ShopID, sh.ShopTypeID, sr.NextPromotionDate, sr.CurrentWeekSalesEstimate
FROM dbo.Shop sh
INNER JOIN dbo.ShopRetail sr
ON sr.ShopID = sh.ShopID;
EDIT:
I forgot to add in the "audit" fields of [CreateDate]
and [ModifiedDate]
. I have now added them to the tables above. Please note that:
these fields are not in the [ShopType] table as there really is no purpose here, but that is a bit subjective and some people still like to know at least when the records were created, so maybe [CreateDate]
is fine there
the [CreateDate]
field is not in the subclass tables as the assumption is that those rows are created at the same time as the parent record. But when it comes to updates, those might just touch one of the two tables, depending on which fields are being updated. If using SQL Server, it is possible to create the records for both tables in a single statement even, using the OUTPUT
clause as follows:
CREATE PROCEDURE dbo.Shop_CreateRepairShop
(
@ShopName NVARCHAR(100),
@CurrentComputerRepairs SMALLINT,
@RepairEstimatedHours TINYINT,
@ShopID INT OUTPUT = -1
)
AS
SET NOCOUNT ON;
INSERT INTO dbo.Shop (ShopName, ShopTypeID)
OUTPUT INSERTED.ShopID, @CurrentComputerRepairs, @RepairEstimatedHours
INTO dbo.ShopRepair (ShopID, CurrentComputerRepairs, RepairEstimatedHours)
VALUES (@ShopName, 2); -- assuming ShopTypeID for "Repair Shop" = 2
SET @ShopID = SCOPE_IDENTITY();
Best Answer
It sounds like you want to put aggregation rules (based on anything, doesn't matter) in the database. Nothing particularly wrong with that, unless you're planning to use it to generate SQL dynamically. That's a slippery surface: proceed with caution, because SQL is much easier to debug when you can see it.
Will the users be able to add to your putative Measure table? If not, you might want to consider statically generating the SQL as stored procedures. That would be easier to understand, both for the human being and likely for the query optimizer.
I'm also a little leery of your question because ISTM you're not quite sure yet what the properties of the measures will be. In such a situation, it's often best to wait until a dozen or so use cases develop. At that point, you'll have a concrete set that you can actually use, with identified properties. Then you'll know enough that I expect the question will answer itself. It wouldn't surprise me, though, if that many cases never do develop, in which event the Measure abstraction will have proved more trouble than it would have been worth, and was fortuitously avoided.
HTH.