Schema for Financial Database

data-warehousedatabase-designschematimestamp

My company is starting a new initiative aimed at building a financial database from scratch.

We would be using it in these ways:

  • Time series analysis of: a company's financial data (ex: IBM's total fixed assets over time), aggregations (ex: total fixed assets for the materials sector over time), etc.
  • Single company snapshot: various data points of a single company
  • Analysis of multiple companies across multiple data fields for a single time frame, usually the current day.
  • Back-testing, rank analysis, data analysis, etc. of ideas and custom factors.

Approximate breadth of data:

  • 3000 companies
  • 3500 data fields (ex: total fixed assets, earnings, etc.)
  • 500 aggregation levels

Periodicity: daily, monthly, quarterly, annual

20 year look-back that would grow over time

Question: In our PostgreSQL database, what schema should we use? Right now I am thinking one time series table per company, per category of data field for the fully normalized DB. For example, one table for, say, all the balance sheet fields for IBM, another table for IBM's cash flow items, etc for all categories of data and for each company. Timestamps as records and data fields as columns/fields. Then for fast queries, create a warehouse and views, etc. that are not fully normalized but optimized for queries for my use-cases listed above. However, if you look at my number of companies and fields above, I will probably end up with more than 200,000 tables for just my base financial data if my tables are pretty wide, which isn't great either. That's a lot of tables, but I don't see another good way to do it.

If there is a better place to ask this question, please let me know.

If you need more information, I am happy to edit my question and add it.

PS – I asked a similar question on the SO Quant site, but didn't get much schema help. Also, non-schema focused answers are okay, but note that I am looking for help with schema design here.

Best Answer

You can achieve this by creating 3 or 4 tables:

  1. Company Information
  2. Fundamental Entries
  3. Fundamental Data
  4. MarketCap Data (Optional)

Company information will contain information about all the companies, assign a Company ID to each company, which would be later used in to join the financial fields. Use some SCD type( slowly changing dimensions) based on your requirements, fields like Ticker changes often. So carefully assign a company ID.

Unpivot and divide your financial data between Entries and values.

In the Fundamental Entries table generate a unique ID (eg ID_FS_ENT) for each company ID and each financial entry (that would be a combination of Period End, Accounting Standard, Fiscal Period, etc.)

In the financial Data table store the financial value along with the ID_FS_ENT, Field Name (that would be shares outstanding, total fixed assets, earnings etc)

If you are also receiving daily market cap for each company then store it separately. Again use the Company ID, add trading date and value to this table.

Hope this helps!

P.S. I am already managing a similar database with more than 100,000 companies for more that 78 economies and around 100 million financial entries.