Foreign key as part of composite primary key or not

database-designforeign keyprimary-key

I'm designing a relational database. This is what I'm trying to design:

I will have production orders and products.

A production order will have one product, but a product could be on zero or more production orders (In fact, there will be always at least one production order for each product. It has no sense to have a product that it's not going to be produce on database).

In other words, with a production I will produce a product, but I will have a lot of production orders to produce the same product.

We are going to distinguish each production order by its name (alphanumeric) and it must be unique.

I have two questions:

  1. Is the production order's name its primary key? I think I can use a integer as primary key.
  2. Does product's key must be part of production order's primary key? I don't think so but I'm not sure.

By the way, I'm going to use SQL Server.

Best Answer

It depends on the specific scenario.

For example, if you have a many-to-many relationship, then it is common to have a join table between the two tables. The join table will contain a foreign key to each of the primary keys and most of the time, the combination of the two foreign keys will become the primary key:

CREATE TABLE users 
(
    id INT NOT NULL PRIMARY KEY
);
CREATE TABLE roles 
(
    id INT NOT NULL PRIMARY KEY
);
CREATE TABLE users_roles 
(
    user_id INT NOT NULL REFERENCES users(id),
    role_id INT NOT NULL REFERENCES roles(id),
    PRIMARY KEY (user_id, role_id)
);

For the most part, I prefer natural keys rather than surrogate keys - especially if I will never use the surrogate key to ever query the table.

With that being said, to answer your question directly:

Firstly, can a string be a primary key? Though the name may be its natural key, you have to take into consideration the size of the key if you made it the primary key. Not just in your main table, but also in the tables that refer to it. Remember, that in order to create a foreign key you must copy the key value into any dependant tables - this can quickly bloat your database and slow down queries depending on the size and type of data. For instance, short fixed-length text columns can perform better than variable-length text columns and can be indexed easier than longer text values. A good example of this that I use each day is a currency table. My primary key is a 3 character currency code - it is short, of a fixed length, and I am always going to query the table using it even if the table had a surrogate key defined on it.

CREATE TABLE DimCurrency
(
    Code CHAR(3) NOT NULL PRIMARY KEY,
    Name VARCHAR(50) NOT NULL
);
CREATE TABLE FactSales
(
    CurrencyCode CHAR(3) NOT NULL REFERENCES DimCurrency(Code),
    Date DATE NOT NULL,
    SalesAmmount DECIMAL(18,2) NOT NULL
);

Secondly, does the product id need to be part of the productions primary key? Well, this depends on a few factors. If you made the name part of the primary key, and the name was short then potentially, yes it could. But, does any other table have a reference to the productions table? If so, then by adding the product id into the tables primary key you would also need to add it into all of the foreign keys that refer to it. Personally, I don't like this. I am happy to do it if I can guarantee that no table will ever need a foreign key to it (such as a fact table in a dimensional model) but otherwise I think it is wasteful and makes querying your tables over complicated.

You should also consider that if you define further non-clustered indexes on the production table when you have a composite key, the row locator for the non-clustered index is a copy of the primary key itself. This can cause all of your non-clustered indexes to bloat out of control very quickly without your realising it. For reference: Clustered and Nonclustered Indexes Described

I think given your particular scenario, it is perfectly reasonable to add a foreign key in your production table which refers to the id of the product that is produced. As Gulrez Khan stated in his answer, you may want to add a non clustered index on the foreign key - it is not required, but can enhance performance in some queries.

CREATE TABLE production
(
    id INT NOT NULL PRIMARY KEY,
    product_id INT NOT NULL REFERENCES product(id),
    name VARCHAR(50) NOT NULL UNQIUE
);

CREATE INDEX ix_production_product_ncl ON production(product_id);

Remember that all indexes need to be built, and rebuilt when they are modified. The more fields you cram into them, the longer it takes to perform actions against them. Especially clustered indexes, which are generally defined upon the primary key. These indexes physically sort your data at the storage level. If you insert a value that happens to not be at the end of the existing chain of values, you could cause performance problems due to page splits. For me, this is the biggest advantage of a surrogate key over a natural key - the ability to ensure that new data is always appended and not injected into the middle of an existing index.

Finally, if in doubt - try it out! There is no substitution for a good testing strategy. Try querying the data against both structures, look at the query plans, do some capacity planning and choose the one that is right for you.