Database Design – Naming Columns in Articles Table

database-design

A question that I ask myself quite often is: If my table is named articles, should I name my column article_number or simply number?

Is there a best practice? Is there a difference if you create tables and columns in a database, or work in an abstract layer like an ORM, where the query would perhaps be

a = Article.get(1);
my_num = a.number;
// or 
my_num2 = a.article_number

Best Answer

SQL employs a concept known as domain name integrity which means that the names of objects have a scope given by their container.

Column names have to be unique, but only within the context of the table that contains the columns. Table names have to be unique, but only within the context of the schema that contains the tables, etc.

When you query columns you need to reference the schema, table and column that you are interested in, unless one or more of these can be inferred. When you write a query, you need to reference the tables in your query by name directly or by using an alias, e.g. Customer.ID or C.ID from Customer C, etc., unless your query is so simple that it only references one table.

There was a time when there was a technical requirement for uniqueness of all column names, which applied to old ISAM databases and to languages like COBOL in the 1960s and 70s. This got dragged along for no good reason into dBase in the 1980s and has stuck as a convention well into the relational and object DBMS eras. Resist this outdated convention.

When the shift from flat file and network databases to relational databases happened in the 1970s and 80s, the idea of joining tables was new. So some people chose the convention that a unique name could be repeated if one column was a reference to another (i.e. foreign keys). This concept is called a "natural join" and a lot of people still advocate for doing this.

I am not a fan of natural joins because it requires you, ultimately, to throw out the concept of domain name integrity and force the whole column reference into the column name.

The issue with natural joins is that you either have to be hypocritical or you have to make your column names long and unreadable. Let me illustrate: It may sound like a good idea that the primary key of the Customer table is CustomerID. Then in your Invoice table, your foreign key to Customer is also called CustomerID. This would be a natural join and it all sounds good so far. Here is the problem. What if your convention is to have a column on every table called LastUpdatedDate? So are you meant to join every table to every other table by LastUpdatedDate? Of course not. This is the absurdity of natural joins. In order to avoid this absurdity you would need to jam the table name into the column name as a prefix. However, if you have multiple schemas in your database, you can't stop there. You also need to add the schema to the column name, and so it goes.

Another place where natural joins break down is when you have multiple relationships between the same two tables. You if you need two references to Employee on your Invoice table (Sold By and Approved By, for example) you can't call them both EmployeeID.