Designing for maximum performance for an anticipated large user-base, I would suggest ensuring bookmarks and tags are not duplicated unnecessarily. Using my suggested schema below, http://google.com/
would become a single bookmark row. Users who want a bookmark to Google would have a row in UserBookmarksTags
referencing the Bookmark row and the associated tag row.
I would tend to abstract the UserID away from the Bookmarks table, and use a cross-reference table to determine which users have bookmarks, and what they've tagged each bookmark with.
Using SQL Server, I would do it like this:
USE tempdb;
GO
CREATE TABLE Users
(
UserID INT NOT NULL
CONSTRAINT PK_Users
PRIMARY KEY CLUSTERED
IDENTITY(1,1)
, UserName varchar(255)
);
CREATE TABLE Bookmarks
(
BookmarkID INT NOT NULL
CONSTRAINT PK_Bookmarks
PRIMARY KEY CLUSTERED
IDENTITY(1,1)
, BookmarkName varchar(255)
, BookmarkURL varchar(255)
);
CREATE TABLE Tags
(
TagID INT NOT NULL
CONSTRAINT PK_Tags
PRIMARY KEY CLUSTERED
IDENTITY(1,1)
, TagName varchar(255)
);
If every entry in your system requires both a bookmark and a tag, you could use the following table for maximum performance:
CREATE TABLE UsersBookmarksTags
(
UserID INT NOT NULL
CONSTRAINT FK_UsersBookmarksTags_UserID
FOREIGN KEY REFERENCES Users(UserID)
, TagID INT NOT NULL
CONSTRAINT FK_UsersBookmarksTags_TagID
FOREIGN KEY REFERENCES Tags(TagID)
, BookmarkID INT NOT NULL
CONSTRAINT FK_UsersBooksmarksTags_BookmarkID
FOREIGN KEY REFERENCES Bookmarks(BookmarkID)
, CONSTRAINT PK_UsersBookmarksTags
PRIMARY KEY CLUSTERED
(UserID, TagID, BookmarkID)
);
The above version of the table UsersBookmarksTags
allows you to quickly return a list of bookmarks for a specific user that matches any given tag, or all bookmarks for that user.
Alternately, you could create the cross reference table that allows null entries for both Tags and Bookmarks, as in:
CREATE TABLE UsersBookmarksTags
(
UsersTagsBookmarksID INT NOT NULL
CONSTRAINT PK_UsersBookmarksTags
PRIMARY KEY CLUSTERED
IDENTITY(1,1)
, UserID INT NOT NULL
CONSTRAINT FK_UserBookmarks_UserID
FOREIGN KEY REFERENCES Users(UserID)
, TagID INT NULL
CONSTRAINT FK_UserBookmarks_TagID
FOREIGN KEY REFERENCES Tags(TagID)
, BookmarkID INT NULL
CONSTRAINT FK_UserBooksmarks_BookmarkID
FOREIGN KEY REFERENCES Bookmarks(BookmarkID)
);
This allows rows that have a bookmark without a tag, and also allows tags to be associated to a user without a bookmark.
If there is a requirement to know all the tags a user has added, without those tags necessarily having an associated bookmark, I would consider adding another table for that purpose, as:
CREATE TABLE UserTags
(
UserID INT NOT NULL
CONSTRAINT FK_UserTags_UserID
FOREIGN KEY REFERENCES Users(UserID)
, TagID INT NOT NULL
CONSTRAINT FK_UserTags_TagID
FOREIGN KEY REFERENCES Tags(TagID)
, CONSTRAINT PK_UserTags
PRIMARY KEY CLUSTERED
(UserID, TagID)
);
I'm not sure either of these are really great options. Here's my thoughts:
pk|A ,pk|B ,PK|C ,PK|D ----> JOIN AND INSERT INTO TABLE T(pk,a,b,c,d)
With this scenario you have redundant data. Unless your individual PK tables are just staging tables, I think you could run into some serious issues here, especially with different consumers pulling data from different tables. I wouldn't chance that whatsoever with this design.
Not to mention that if PK:A
, PK:B
, PK:C
, and PK:D
are all 1:1
relationships, I don't see any normalization benefit out of this either.
T (PK,A,B,C,D) DATA SOMETHING like (1,a1,NULL,NULL,NULL) run update statement for all columns B,C,D sequentially and get all the data T(1,a1,b1,c1,d1)
I'm not sure I like this design either, although I'd put it at a higher preference then the former. The reason behind this is because what happens if business dictates that columns A
, B
, C
, and/or D
really aren't NULLable. Now you can run into a big problem of data conflicting with business constraints.
I think the real solution here may be to take a step back and analyze your data modification. Instead of having a step-based data insert, is there no way to do a single insert of all the required and necessary data into this table? In the application, retrieve all of the required data before each item's insert. That way you won't be defying any business rules, having redundant data, or additional operation overhead of the designs above.
In other words, if data columns A
, B
, and C
are required, then you shouldn't be pushing that data out to this table until you have it. Without all of those required columns you wouldn't have compliant data, and it shouldn't live in that table until it is. And if D
is nullable and optional data, then sure that's fine if it doesn't appear until a later time with a different DML operation.
Best Answer
You mention every airport has a manager so the manager table should have AirportID as an FK. The Office table should have AirportId and AirlineID as FKs