What could justify a non explicit – non atomic primary key

Architecturedatabase-designprimary-key

I'm working as a "guest" dev at a random company.

I'm working with a table that does not have a primary key by definition(Oracle).

The column named "SOMETHING_ID" is varchar2(10) and constructed at insert time
by the following operation. And it is used as a primary key and referenced by other tables.

'A_STRING' || LPAD(SEQUENCE_NAME.NEXTVAL, 7, '0')

Some examples of the primary key data are

CH-00004321
CH-00004322
CH-00004323

My questions are,

first, in what circumstances would it be justified to use a primary key that is a combination of the type of the record(let's say, CH for Chinese food, In for Indian and MX for Mexican) and a sequence number ?

Second, in what circumstances would it be justified not to have a primary key defined but you have a column that distinctively distinguish each row ?

Third, is my perception correct that I see the combination of the type string(CH, IN, MX) and a sequence number is against the 1st normalisation ?

Fourth, in the table, there is only one "type string" value (let's say CH).
Is it okay to add that string to the pk while it does not seem necessary since there is only one value ? (The nature of the table does not seem to allow any other value for the type string though unlike my example of food. Perhaps you can see the table name as "CHINESE_FOOD", hence no other type string value would seem logical.)

Note – 1. Actual column, sequence names are replaced with arbitrary names.
2. The aim of this software is prediction of the damages caused by natural disasters.

Best Answer

in what circumstances would it be justified to use a primary key that is a combination of the type of the record(let's say, CH for Chinese food, In for Indian and MX for Mexican) and a sequence number ?

Under no circumstances would it be justified. A column containing two different data elements as one value is a violation of first normal form. 1NF requires, among other things, that each column contain a single value. This practice also destroys the guaranteed logical access of every data element by table name, column name, and key value. Now if each data element were in its own column, and the combination of columns made up a key, that is not a problem. Ultimately this depends on the entity type for which the table holds entity occurrences. The key chosen should be something used by people in the real world to identify occurrences. Perhaps this makes sense if we are talking about a menu, and the menu has a section for Chinese food, and then gives each dish a number like #1, #2, etc.

Second, in what circumstances would it be justified not to have a primary key defined but you have a column that distinctively distinguish each row ?

Again, under no circumstances. If there is a column whose values uniquely identify each row that column must be declared as a unique constraint so as to ensure the occurrences entered are not duplicated and thus become inconsistent with the real world entities they represent.

Third, is my perception correct that I see the combination of the type string(CH, IN, MX) and a sequence number is against the 1st normalization ?

Absolutely, as described above.

Fourth, in the table, there is only one "type string" value (let's say CH). Is it okay to add that string to the pk while it does not seem necessary since there is only one value ?

No, for the same reason as this violates 1NF. Again that column contains 2 different values and each column must contain only a single value to be in 1NF.

Fabian Pascal's Practical Database Foundation Series is a great reference to dive more deeply into these fundamentals. Paper #4 addresses keys specifically.