If I understand your requirements correctly:
I would just use the natural key, LanguageCode-CultureCode ("en-US", for example). It's small enough. (I'm using the entire "en-US" as the key to differentiate it from "en-GB", for example.)
CREATE TABLE [dbo].[Language](
[Language] [char](2) NOT NULL,
[Culture] [char](2) NOT NULL,
[LanguageCode] AS (([Language]+'-')+[Culture]) PERSISTED NOT NULL,
CONSTRAINT [PK_Language] PRIMARY KEY CLUSTERED
(
[Language] ASC,
[Culture] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
CONSTRAINT [LanguageCode] UNIQUE NONCLUSTERED
(
[LanguageCode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[Language_Text](
[LanguageID] [varchar](5) NOT NULL,
[LanguageCode] [varchar](5) NOT NULL,
[LanguageName] [nvarchar](20) NULL,
CONSTRAINT [PK_Language_Text] PRIMARY KEY CLUSTERED
(
[LanguageID] ASC,
[LanguageCode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Language_Text] WITH CHECK ADD CONSTRAINT [FK_Language_Text_Language] FOREIGN KEY([LanguageCode])
REFERENCES [dbo].[Language] ([LanguageCode])
GO
ALTER TABLE [dbo].[Language_Text] CHECK CONSTRAINT [FK_Language_Text_Language]
GO
This should allow you to get all articles in en or sv, and you can also query for en-US or sv-SE. Presumably, although it wasn't in your example, you could also query for en-CA, fr-CA, en, fr, or CA.
Edit: You're right, my original code had no way to search by Culture. I've revised the schema above. Here's a sample of content:
Language:
Language  Culture  LanguageCode
en        US       en-US
sv        SE       sv-SE
Language_Text:
LanguageID  LanguageCode  LanguageName
en-US       en-US         English
en-US       sv-SE         Engelska
sv-SE       en-US         Swedish
sv-SE       sv-SE         Svenska
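As a rough sketch, the sample rows above could be loaded with something like the following. It assumes the two tables were created exactly as defined earlier and that the server supports multi-row VALUES lists (SQL Server 2008 or later); adjust as needed.

```sql
-- LanguageCode is a persisted computed column, so only Language and
-- Culture are supplied; the code ('en-US', 'sv-SE') is derived from them.
INSERT INTO [dbo].[Language] ([Language], [Culture])
VALUES ('en', 'US'),
       ('sv', 'SE');

-- LanguageID is the language being named; LanguageCode is the language
-- the name is written in (and must exist in dbo.Language via the FK).
INSERT INTO [dbo].[Language_Text] ([LanguageID], [LanguageCode], [LanguageName])
VALUES ('en-US', 'en-US', N'English'),
       ('en-US', 'sv-SE', N'Engelska'),
       ('sv-SE', 'en-US', N'Swedish'),
       ('sv-SE', 'sv-SE', N'Svenska');
```

Note that the Language rows have to be inserted first, since the foreign key on Language_Text.LanguageCode is checked at insert time.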
Searching by Culture (Canada):
SELECT lt.LanguageID, lt.LanguageCode, lt.LanguageName
FROM test.dbo.Language_Text AS lt
INNER JOIN test.dbo.Language AS l ON lt.LanguageID = l.LanguageCode
WHERE l.Culture = 'CA'
Searching by Language (French):
SELECT lt.LanguageID, lt.LanguageCode, lt.LanguageName
FROM test.dbo.Language_Text AS lt
INNER JOIN test.dbo.Language AS l ON lt.LanguageID = l.LanguageCode
WHERE l.Language = 'fr'
Searching by LanguageCode (Swedish):
SELECT LanguageName
FROM [test].[dbo].[Language_Text]
WHERE LanguageID = 'sv-SE'
Size is one consideration, certainly. Consider not only the size of the index in the widgets table, but also that primary keys show up in other tables as foreign keys. In some systems, even short strings take up more space than integers (e.g., in MSSQL a VARCHAR value takes one byte per character plus 2 bytes of overhead, so at only two characters you're already as large as a 4-byte integer).
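To make the size comparison concrete, here is a small T-SQL sketch. The variable names are made up for illustration. DATALENGTH reports only the bytes of the value itself, so the 2-byte per-value overhead that a VARCHAR column carries in the row has to be added on top of what it shows.

```sql
-- Hypothetical values: a natural string key and an integer surrogate key.
DECLARE @code varchar(5) = 'en-US';
DECLARE @id   int        = 1;

SELECT DATALENGTH(@code) AS code_bytes,  -- 5 (data only; add 2 bytes of row overhead for a varchar column)
       DATALENGTH(@id)   AS id_bytes;    -- 4
```

The difference per row is small, but it is repeated in every index and every foreign key column that carries the key.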
Almost all numeric primary keys I have seen are surrogate keys, implemented via serial, sequence, auto_increment, IDENTITY or whatever the database engine's native method is for generating values itself. I suspect this is a leading reason for the ubiquity of the numeric primary key. One advantage to surrogate keys is that they have no business meaning. Since business meaning can change over time, using a key without business meaning helps ensure that the primary key is static.
If your widgets have some sort of industry-standard identifier (like the auto industry's VIN, publishing's ISBN, UPCs and so forth), that's probably the best choice for your primary key. My concern in using widget_name is that attribute's immutability. Will it ever change? How do you know it will never change--did Sales tell you that? :)
The whole surrogate vs. natural key issue is nearly a religious debate, and is sort of tangential to your question. I would say if you have a natural key that is static, minimal, and unique, use it. Otherwise, consider a surrogate key (which is likely going to be numeric).
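The two options might look roughly like this in T-SQL. All table names, column names, and sizes here are hypothetical and only meant to illustrate the trade-off, not to reflect your actual schema.

```sql
-- Option A: a static, minimal, unique natural key (here an ISBN-13)
-- serves directly as the primary key.
CREATE TABLE dbo.book_natural (
    isbn  char(13)      NOT NULL CONSTRAINT PK_book_natural PRIMARY KEY,
    title nvarchar(200) NOT NULL
);

-- Option B: a numeric surrogate key generated by the engine. The
-- business attribute stays unique but is free to change later without
-- touching any foreign keys that reference widget_id.
CREATE TABLE dbo.widget_surrogate (
    widget_id   int IDENTITY(1,1) NOT NULL CONSTRAINT PK_widget_surrogate PRIMARY KEY,
    widget_name nvarchar(100)     NOT NULL CONSTRAINT UQ_widget_surrogate_name UNIQUE
);
```

In option B the UNIQUE constraint still enforces the business rule, so the surrogate key does not come at the cost of allowing duplicate names.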
Best Answer
circle_share1 does not require much in-depth design thought.
circle_share2 requires some actual brain cycles to determine if the primary key will have the desired performance.
Rows in circle_share1 will be written to the disk in the exact order they are inserted, thereby making inserts quicker.
Rows in circle_share2 may be inserted anywhere in the table, necessitating page splits, thereby fragmenting the data and possibly resulting in slower performance.
Neither way is correct in all circumstances; the best choice depends on your data, and on how it will be created in the user and circle tables and inserted into the circle_shareX table.
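If you want to see whether page splits are actually hurting you, one option on SQL Server is to measure fragmentation after loading representative data. The sketch below assumes the table lives in the dbo schema of the current database; it uses the sys.dm_db_index_physical_stats function, which needs appropriate permissions to query.

```sql
-- Report fragmentation for every index on the (assumed) dbo.circle_share2 table.
-- If OBJECT_ID returns NULL (wrong name or schema), the function reports on
-- all objects in the database instead, so check the name first.
SELECT OBJECT_NAME(ips.object_id)       AS table_name,
       i.name                           AS index_name,
       ips.index_type_desc,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.circle_share2'), NULL, NULL, 'LIMITED') AS ips
INNER JOIN sys.indexes AS i
        ON i.object_id = ips.object_id
       AND i.index_id  = ips.index_id;
```

Running the same query against circle_share1 after an identical load gives a direct comparison of the two designs for your own insert pattern.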