Mysql – Best Practices for Separating Content from Presentation in SQL

best practicesconcatMySQLnumber formatting

A common and useful paradigm of programming is to separate content and presentation. One system deals with building the result, another system deals with outputting it.

You see this with HTML vs CSS. HTML should ideally be clean and without any internal styling (color, position, etc), which should be handled by CSS.

I'm now wondering how this applies to SQL results. I often find myself formatting results in a query:

SELECT 
    CONCAT(first_name, ' ', last_name) AS name,
    DATE_FORMAT(last_edit, '%d/%m/%Y') AS last_edit,
    CONCAT(
        record_type, 
        IF (private = 1, ' (Private)', ' (Public)')
    ) AS record_type
FROM people
ORDER BY updated DESC, LastEdit DESC

Three different scenarios are covered here:

  1. The first column is a simple concatenation.
  2. The second transforms the output into a different format.
  3. The third actually adds content to the result.

I understand that there are many useful functions for transforming data in a query. But from a team development point of view, should I be formatting these results in the presentation layer rather than the logic layer (i.e., in my PHP code instead of the SQL query)?

If so, then how far should I take it? Are columns 1 and 2 acceptable but 3 isn't?

Is there a widely-accepted best practice regarding this?

Best Answer

It depends. There is no one-size-fits-all answer. Yes, a database is a persistent store for data -- focus on that when designing the schema.

Meanwhile, let me point out that

DATE_FORMAT(last_edit, '%d/%m/%Y') AS last_edit,

implies something that needs to be strongly stated: Store the data in a generic format:

  • Store Dates and times in the database in a format suitable to the database. This may require conversion both during input and output. It facilitates filtering and sorting (as in your example, the typos notwithstanding).
  • Store text as UTF-8, not html-entites, latin1, etc. This implies adding entities (or whatever) after SELECTing the data back out.
  • Monetary values stored as DECIMAL; scientific as FLOAT/DOUBLE. etc.
  • JSON can be handy for storing random attributes (think EAV) that don't need to be sorted/searched too much.

But, let's study a common example: Should an "address" be split into address1, address2, city, province, country, postal_code. Versus a directly printable address.

One rule for deciding is whether you will ever need to search/sort on individual pieces of the data. Keep in mind that it is very hard and very inefficient to split apart fields for such purposes.

At the other extreme, would you even think about splitting up an image into pixels? (I hope not.)

Now, back to your question. Your example seems to have reasonable datatypes under the covers. But your question is who does the formatting as it comes out. You have given simple examples where it is quite simple to use SQL to do the formatting. I would probably write it the way you did for a "simple" system.

If there is any chance of changing the formatting (eg, flipping and adding a comma: "last_name, first_name") you should probably leave that for later code.

For something more complex, it is better to postpone formatting.

Some things are messy in SQL, so it is better to leave for later code:

  • Pivoting -- You have 2-dimensional data stored in 1D. (Follow tag [pivot-table])
  • Want to build tabular data but not duplicate the value (vertically) until it changes. This is too cryptic to do in SQL; it is better to do in formatting code.

As for IF (private = 1, ' (Private)', ' (Public)'), consider switching from numeric to ENUM('public', 'private').