Sql-server – Database structure, relational or data warehouse

data-warehouse database-design relational-theory sql-server-2008

Good morning,

I'm having some issues with how my database should be laid out and would appreciate some guidance. I have several tables (rainfall data), each containing the following columns:

Date (DateTime), Value_of_rainfall (Float)

Each of these tables is for a specific location.

It has been suggested I create another table with the following columns:

LocationID (tinyInt), LocationName(char(6))

and insert into the first table a new column called LocationID(tinyInt).

My confusion is about the data I have stored in the rainfall data tables. It's been suggested that the data from all locations be held in one table. The statistical analysis I'm looking to carry out is (as far as I have envisioned thus far) very location-specific and won't require querying multiple locations at once. A couple of months' worth of data for one location is nearly 3 million rows, and I'm looking to set up long-running calculations on the data. Would a data warehouse therefore be more appropriate? If so, could someone give me some pointers on how I should lay it out?

Thanks for your time.

Note: I'm using SQL Server 2008

Best Answer

It seems you want to aggregate location-based rainfall statistics over time. A database structure like the one below would let you do that. The 'data source' could be just a filename, or some other indication of where the data came from.

create table DimDataSource (
       DataSourceID      int identity (1,1) not null
      ,DataSourceDesc    nvarchar (100)  -- May need unicode for file names
)
go

alter table DimDataSource
  add constraint PK_DataSource
      primary key clustered (DataSourceID)
go

create table DimLocation (
       LocationID        int identity (1,1) not null
      ,LocationDesc      varchar (50)
)
go

alter table DimLocation
  add constraint PK_Location
      primary key clustered (LocationID)
go

create table DimDate (
       DateID           smalldatetime not null  -- 'Date' is a reserved word
      ,MonthID          int not null
      ,MonthDesc        varchar (15)
      ,QuarterID        int not null
      ,QuarterDesc      varchar (15)
      ,YearID           int not null
)
go

alter table DimDate
  add constraint PK_Date
      primary key clustered (DateID)
go

create table DimTime (
       TimeID           time not null  -- 'Time' is a reserved word
      ,Hour             int not null
)
go

alter table DimTime
  add constraint PK_Time
      primary key clustered (TimeID)
go


-- If the table is <50GB, don't bother with partitioning, but put a clustered
-- index on DateID or LocationID and DateID, depending on how you normally expect
-- to query the data.

create table FactRainfall (
       RainfallID        int identity (1,1) not null -- May need bigint if >2.1B rows.
                                                     -- SSAS likes an identity column for
                                                     -- incremental loads
      ,DataSourceID      int not null
      ,LocationID        int not null
      ,DateID            smalldatetime not null
      ,TimeID            time not null
      ,Rainfall          float
)
go

-- Add foreign keys as necessary
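
As a rough sketch of what those foreign keys might look like, together with the clustered index mentioned in the comment above: all constraint and index names below are made up for illustration, and the choice of (LocationID, DateID, TimeID) as the clustering key assumes queries are mostly per-location, as described in the question.

```sql
-- Hypothetical names throughout; adjust to your own naming convention.

-- Keep the identity column unique without making it the clustering key.
alter table FactRainfall
  add constraint PK_FactRainfall
      primary key nonclustered (RainfallID)
go

-- Cluster on location first, then date/time, assuming most queries
-- read one location over a date range.
create clustered index IX_FactRainfall_Location_Date
    on FactRainfall (LocationID, DateID, TimeID)
go

alter table FactRainfall
  add constraint FK_FactRainfall_DataSource
      foreign key (DataSourceID) references DimDataSource (DataSourceID)
go

alter table FactRainfall
  add constraint FK_FactRainfall_Location
      foreign key (LocationID) references DimLocation (LocationID)
go

alter table FactRainfall
  add constraint FK_FactRainfall_Date
      foreign key (DateID) references DimDate (DateID)
go

alter table FactRainfall
  add constraint FK_FactRainfall_Time
      foreign key (TimeID) references DimTime (TimeID)
go
```

If queries turn out to be mostly by date across all locations, swapping the clustering key to lead with DateID would be the other option the comment describes.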

Populate the dimensions with the appropriate list of locations, date ranges and times of day at the right grain, plus one data source record per file. This structure will also allow you to put a cube over the top, or it can be flattened with a view, which will help people using tools like Excel or stats packages to get and use the data.
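
To make that a little more concrete, here is one possible way to populate DimDate and flatten the star schema with a view. This is only a sketch: the date range, the MonthID/QuarterID encodings, the view name vRainfall and the location 'Site A' are all assumptions chosen for illustration, not part of the original design.

```sql
-- Populate DimDate with one row per day (assumed range: calendar year 2010).
declare @d smalldatetime = '20100101'
while @d < '20110101'
begin
    insert into DimDate (DateID, MonthID, MonthDesc, QuarterID, QuarterDesc, YearID)
    values (@d
           ,year(@d) * 100 + month(@d)                          -- e.g. 201003
           ,datename(month, @d) + ' ' + cast(year(@d) as char(4))
           ,year(@d) * 10 + datepart(quarter, @d)               -- e.g. 20101
           ,'Q' + cast(datepart(quarter, @d) as char(1))
                + ' ' + cast(year(@d) as char(4))
           ,year(@d))
    set @d = dateadd(day, 1, @d)
end
go

-- Flatten the fact table and its dimensions into one view.
create view vRainfall
as
select f.RainfallID
      ,l.LocationDesc
      ,d.DateID
      ,d.MonthID
      ,d.QuarterID
      ,d.YearID
      ,t.TimeID
      ,t.[Hour]
      ,s.DataSourceDesc
      ,f.Rainfall
  from FactRainfall f
  join DimLocation   l on l.LocationID   = f.LocationID
  join DimDate       d on d.DateID       = f.DateID
  join DimTime       t on t.TimeID       = f.TimeID
  join DimDataSource s on s.DataSourceID = f.DataSourceID
go

-- Example: monthly totals for a single (hypothetical) location.
select YearID
      ,MonthID
      ,sum(Rainfall) as TotalRainfall
      ,count(*)      as Readings
  from vRainfall
 where LocationDesc = 'Site A'
 group by YearID, MonthID
 order by YearID, MonthID
go
```

A location-specific analysis then becomes a filter on LocationDesc (or LocationID) rather than a separate table per location.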

Related Solutions

Mysql – Database Design: What is the best structure for storing site offices and locations

Here is some vanilla SQL:

CREATE TABLE RoomTypes
(
 RoomType VARCHAR(12) NOT NULL,
 UNIQUE (RoomType)
);

CREATE TABLE Zones
(
 Zone VARCHAR(10) NOT NULL,
 UNIQUE (Zone)
);

CREATE TABLE Rooms
(
 RoomType VARCHAR(12) NOT NULL,
 Zone VARCHAR(10) NOT NULL, 
 Name VARCHAR(30) NOT NULL, 
 UNIQUE (RoomType, Zone, Name), 
 FOREIGN KEY (RoomType) REFERENCES RoomTypes (RoomType),
 FOREIGN KEY (Zone) REFERENCES Zones (Zone)
);

CREATE TABLE NumberedRooms 
(
 RoomType VARCHAR(12) NOT NULL,
 Zone VARCHAR(10) NOT NULL, 
 Name VARCHAR(30) NOT NULL, 
 CHECK (RoomType = 'Numbered'),
 UNIQUE (Name), 
 UNIQUE (RoomType, Zone, Name),
 FOREIGN KEY (RoomType, Zone, Name) 
    REFERENCES Rooms (RoomType, Zone, Name)
);

CREATE TABLE AncillaryRooms 
(
 RoomType VARCHAR(12) NOT NULL,
 Zone VARCHAR(10) NOT NULL, 
 Name VARCHAR(30) NOT NULL, 
 CHECK (RoomType = 'Ancillary'),
 UNIQUE (Zone, Name), 
 UNIQUE (RoomType, Zone, Name),
 FOREIGN KEY (RoomType, Zone, Name) 
    REFERENCES Rooms (RoomType, Zone, Name)
);

The CHECK constraints will not be tested by MySQL (versions before 8.0.16 parse them but do not enforce them), so do the tests yourself using triggers. Consider adding other tests, e.g. that attribute Name in table NumberedRooms represents an integer.
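
A trigger standing in for the CHECK on NumberedRooms might look something like the sketch below. The trigger name and error messages are invented for illustration, SIGNAL requires MySQL 5.5 or later, and DELIMITER is a mysql client command rather than SQL, so other clients may need a different way of sending the statement.

```sql
DELIMITER //

-- Hypothetical trigger emulating CHECK (RoomType = 'Numbered')
-- plus the suggested test that Name represents an integer.
CREATE TRIGGER NumberedRooms_before_insert
BEFORE INSERT ON NumberedRooms
FOR EACH ROW
BEGIN
  IF NEW.RoomType <> 'Numbered' THEN
    SIGNAL SQLSTATE '45000'
      SET MESSAGE_TEXT = 'RoomType must be Numbered';
  END IF;

  IF NEW.Name NOT REGEXP '^[0-9]+$' THEN
    SIGNAL SQLSTATE '45000'
      SET MESSAGE_TEXT = 'Name must be an integer for numbered rooms';
  END IF;
END//

DELIMITER ;
```

A matching BEFORE UPDATE trigger would be needed to cover updates, and AncillaryRooms would need its own pair testing for 'Ancillary'.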

The idea is that every row in the supertype table Rooms will have exactly one row in the union of AncillaryRooms and NumberedRooms. This is merely implied by the schema rather than enforced, so have 'helper' procs to add rows to both super- and subtype tables as a single operation, and use triggers to ensure it is done.
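
One possible shape for such a helper proc is sketched below. The procedure and parameter names are assumptions, and it relies on the tables using a transactional engine such as InnoDB (which is also what makes the foreign keys effective) and on MySQL 5.5 or later for RESIGNAL.

```sql
DELIMITER //

-- Hypothetical helper: adds a numbered room to the supertype and
-- subtype tables in one transaction, so neither row exists alone.
CREATE PROCEDURE AddNumberedRoom (
  IN p_zone VARCHAR(10),
  IN p_name VARCHAR(30)
)
BEGIN
  -- On any error, undo both inserts and pass the error to the caller.
  DECLARE EXIT HANDLER FOR SQLEXCEPTION
  BEGIN
    ROLLBACK;
    RESIGNAL;
  END;

  START TRANSACTION;

  INSERT INTO Rooms (RoomType, Zone, Name)
  VALUES ('Numbered', p_zone, p_name);

  INSERT INTO NumberedRooms (RoomType, Zone, Name)
  VALUES ('Numbered', p_zone, p_name);

  COMMIT;
END//

DELIMITER ;
```

It would be called as, say, CALL AddNumberedRoom('North', '101'), assuming 'North' already exists in Zones and 'Numbered' in RoomTypes. An equivalent procedure for AncillaryRooms would follow the same pattern.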

Note that NumberedRooms has a simple key on Name alone, whereas AncillaryRooms has a compound key on (Zone, Name). All three (non-lookup) tables have a key on (RoomType, Zone, Name), allowing further subtype tables to reference them and maintain integrity by testing for valid RoomType values.

Relational database: in-RAM partitioning? Theoretical structure discussion

That's a long question.

First off, my current project (I'm the database guy; there are MMO engine experts to deal with the engine) is a form of MMORPG based on an off-the-shelf engine. Volumes would be comparable to "Eve Online" or "World of Tanks".

Now for an orthogonal short answer:

- Separate DB and engine completely
- Don't mix and match because of hardware optimisations
- Hardware: DB and engine servers will be way different specs
- Design your database normally

There is a whole lot more of course, but I'd suggest you're over-thinking the problem and shooting yourself in the foot. I'm simply applying the same techniques to my MMO that I used in investment banking because, IMO, most high-volume systems should converge on a similar architecture.
