Postgresql – New Table vs Adding Columns To Existing Table

database-designpostgresql

I have a table, products with the following columns: id, name
There is another table, countries with these columns: id, name, code
The products have different prices across different countries (same currency is used).

I'm torn between creating a new table to store this, which would have the following columns: product, country, price
and adding new colums to the products table; price_uk, price_kenya, price_sweden, etc, for the different countries.
Which would be the better option, and why?

Best Answer

Instead of arguing normalization rules, I will use predicates and constraints.

[P1] Product number (PRODUCT_ID) named (PRODUCT_NAME) exists.

(c1.1) Product is identified by product number.

(c1.2) For each product that product has exactly one product name; for each product name exactly one product has that product name.

product {PRODUCT_ID, PRODUCT_NAME}  -- P1
    KEY {PRODUCT_ID}                -- c1.1
    KEY {PRODUCT_NAME}              -- c1.2

[P2] Country named (COUNTRY_NAME) with assigned country code (COUNTRY_CODE) and number (COUNTRY_ID) exists.

(c2.1) Country is identified by country number.

(c2.2) For each country, that country is assigned exactly one country name; for each country name, that country name is assigned to exactly one country.

(c2.3) For each country, that country is assigned exactly one country code; for each country code, that country name is assigned to exactly one country.

country {COUNTRY_ID, COUNTRY_NAME, COUNTRY_CODE}  -- P2
    KEY {COUNTRY_ID}                              -- c2.1
    KEY {COUNTRY_NAME}                            -- c2.2
    KEY {COUNTRY_CODE}                            -- c2.3

[P3] Product (PRODUCT_ID) in country (COUNTRY_ID) is assigned price of (AMOUNT) (CURRENCY).

(c3.1) For each country and product, that combination of that country and that product occurs at most once.

(c3.2) For each combination of country and product, that product in that country has exactly one price.

(c3.3) For each price, it is possible that more than one product in a country has that price.

(c3.4) If a product is assigned a price for a country, then that product must exist.

(c3.5) If a product is assigned a price for a country, then that country must exist.

country_price {COUNTRY_ID, PRODUCT_ID, AMOUNT, CURRENCY}  -- P3
          KEY {COUNTRY_ID, PRODUCT_ID}                    -- c3.1

          FK1 {PRODUCT_ID} REFERENCES product {PRODUCT_ID} -- c3.4
          FK2 {COUNTRY_ID} REFERENCES country {COUNTRY_ID} -- c3.5

Note:

[Px]   = predicate  x
(cx.y) = constraint x.y

KEY = PK or AK

PK  = PRIMARY KEY
AKn = ALTERNATE KEY (UNIQUE)
FKn = FOREIGN KEY

So this is basics, as per your first suggestion.

Note that adding pricing for a new country does not involve schema changes, but adding data. What happens to the application code if you were to add a column to the product table instead?

Consider the following questions:

For which countries do we have all product prices defined?
For which countries we do not have any product pricing defined?
Which countries do not have pricing for all products defined -- but do have at least one?
Which product prices are not defined in country X?

These are straightforward queries. An analyst may prepare these queries and the code will not change as you keep adding new countries and products. Now try to formulate these queries for a model where you add columns to the product table. What happens to the query code if you were to add a column to the product table?

Related Solutions

Sql-server – Is adding rows better then adding columns in case their value is not consistent

First, it's about normalization. Today you have max 30 special products, tomorrow (or after 10 years for that matter) you might need another extra 5. You will need to write more code, but you will have greater flexibility after all. At least you won't need to code (or design) 30 columns! You can have a schema like this:

CREATE TABLE Product (ProductId...)
CREATE TABLE Customer (CustomerId...)
CREATE TABLE CustomerProductPrice (ProductId, CustomerId, Price...)

In the third table you can put price for product/customer, but remember to create a unique constraint on foreign key columns like ProductId, CustomerId.

Postgresql – Efficiently comparing prices in different currencies

Here's a different approach for which recomputing the calculated_price is just an optimization, as opposed to being strictly necessary.

Suppose that in the currencies tables, you add another column, last_rate, which contains the exchange rate at the time the calculated_price was last updated, no matter when this happened.

To quickly retrieve a set of products with a price point between, say, 50 USD and 100 USD that include the desired results, you could do something like that:

  SELECT * FROM products
   WHERE calculated_price > 50.0/(:last_rate*
    (SELECT coalesce(max(value/last_rate),1) FROM currencies
      WHERE value>last_rate))
   AND calculated_price < 100.0/ (:last_rate*
    (SELECT coalesce(min(value/last_rate),1) FROM currencies
      WHERE value<last_rate))

where :last_rate contains the EUR/USD exchange rate at the time of the last update. The idea is to increase the interval to take into account the maximum variation of every currency. The increase factors for both ends of the interval are constant between rates updates, so they could be pre-computed.

Since the rates change only slightly over short periods of time, the above query is likely to give a close approximation of the final result. To get the final result, let's filter out the products for which the prices have slipped out of the bounds due to the changes in rates since the last update of calculated_price:

  WITH p AS (
   SELECT * FROM products
   WHERE calculated_price > 50.0/(:last_rate*
    (SELECT coalesce(max(value/last_rate),1) FROM currencies
      WHERE value>last_rate))
   AND calculated_price < 100.0/ (:last_rate*
    (SELECT coalesce(min(value/last_rate),1) FROM currencies
      WHERE value<last_rate))
  )
  SELECT price,c.value FROM p join currencies c on (p.currency=c.id)
     WHERE price/c.value>50/:current_rate
       AND price/c.value<100/:current_rate;

where :current_rate is the more up-to-date rate with EUR for the money choosen by the user.

The efficiency comes from the fact that the range of rates is supposed to be small, the values being close together.

Best Answer

Related Solutions

Sql-server – Is adding rows better then adding columns in case their value is not consistent

Postgresql – Efficiently comparing prices in different currencies

Related Question