Sql-server – Storing daily yield curve data in SQL table

rdbmssql server

I have daily yield curve data for integer maturities that should be stored in a MSSQL server.

Currency.csv

Currently, the data is stored in CSV files per currency in the format:

date, 1y, 2y, 3y, ...
08/12/2020, 0.4, 0.5, 0.6, ...

I am wondering what the best choice for a table would be.

Same Format Table

The current data could be directly translated to a table with columns:

+------------+------+------+------+------+
|    date    |  1y  |  2y  |  3y  |  ... |
+------------+------+------+------+------+
| 08/12/2020 |  0.4 |  0.5 |  0.6 |  ... |
+------------+------+------+------+------+

Alternate Table Design

However, I could also store it like:

+------------+-----------+-----------+--------+
|    date    |  currency |  maturity |  value |
+------------+-----------+-----------+--------+
| 08/12/2020 |  "EUR"    |         1 |    0.4 |
| 08/12/2020 |  "EUR"    |         2 |    0.5 |
| 08/12/2020 |  "EUR"    |         3 |    0.6 |
+------------+-----------+-----------+--------+

Obviously, the second approach would take more space than the first, but I would see the advantage that I am not restricting the value of the maturity in case in the future it should be changed to non-integer values or to maturities that don't have a column yet.

Which of the two table designs would be better? Are there other designs that would fit better?

Best Answer

Your second choice (long) is preferable for the reasons you outline, provided you use the right keys and constraints.

Something like this is probably in the right direction:

CREATE TABLE Currency
(
  CurrencyCd    CHAR(3)      NOT NULL  /* This is the ISO currency code */
 ,CurrencyDesc  VARCHAR(50)  NOT NULL
 ,CONSTRAINT PK_Currency PRIMARY KEY (CurrencyCd)
 ,CONSTRAINT AK_Currency UNIQUE (CurrencyDesc)
 ,CONSTRAINT CK_CurrencyCd_Length CHECK (LEN(CurrencyCd) = 3)
)
GO

/* This is solely to prevent invalid values being added */
CREATE TABLE MaturityAge
(
  MaturityAgeYr  TINYINT  NOT NULL
 ,CONSTRAINT PK_Maturity PRIMARY KEY (MaturityAgeYr)
)
GO

CREATE TABLE CurrencyYield
(
  CurrencyCd     CHAR(3)       NOT NULL
 ,MaturityAgeYr  TINYINT       NOT NULL  /* Possibly consider using months if finer detail possible */
 ,AsOfDt         DATE          NOT NULL
 ,Yield          DECIMAL(9,1)  NOT NULL /* Can increase decimal precision if necessary */
 ,CONSTRAINT FK_CurrencyYield_Of_Currency FOREIGN KEY (CurrencyCd) REFERENCES Currency (CurrencyCd)
 ,CONSTRAINT FK_CurrencyYield_At_Maturity FOREIGN KEY (MaturityAgeYr) REFERENCES MaturityAge (MaturityAgeYr)
 ,CONSTRAINT PK_CurrencyYield PRIMARY KEY (CurrencyCd, AsOfDt, MaturityAgeYr)
)
GO

Here the choice of the composite clustered index form by the primary key (CurrencyCd, AsOfDt, MaturityAgeYr) ensures the data will be (more or less) stored sorted according to those columns. That means any queries of the type WHERE CurrencyCd = 'EUR' will only read the relevant pages.

This choice of clustered index + storing the data in a "long" format will make "widening" the data trivial - so if you need to view the data as you provide in the first example, it's a simple PIVOT.

Other considerations:

  • If the volume of data is very large, but most of the values do not change, you can explore only storing changed values. It takes a little more code to retrieve the data for a given date, but you can create date tables/views to simplify that. If this is the case, let me know in the comments and I'll expand my answer.
  • Over time the table will fragment (a large number of records will not be co-located according to the clustered index), so the clustered index will need to be rebuilt occasionally. There are a number of readily available tools/scripts to make this relatively simple.