How best to store complex cost information

database-designeavnosqlschema

I am creating a database to store data related to costs. The cost information can be very varied and I won't be able to fully determine all possible columns currently.

An example piece of data might be the cost of a F-15 jet. So taking this from Wikipedia:

$27.9 million in 1998 dollars.

I will also need to store contextual information, for example:

  • Model Variant
  • Weight
  • Wingspan
  • Length

The model must also be able to store completely different information such as price of a pressurised water reactor used in a nuclear power plant. The cost could be:

€200 million in 2011 euros.

The contextual information for this would be things like:

  • Throughput
  • Max Operating Temperature
  • Weight
  • Country of Origin

From a database design point of view the cost information seems quite easy to capture, all costs will have:

  • Description
  • Economic Conditions
  • Unit
  • Value

The contextual information however can be anything and as such it's hard to design a schema that will easily capture all the relevant data.

The purpose of having this data is to use it to estimate future costs. So all contextual information should be searchable, the user might need all costs from the database that are related to aircrafts or all information on the cost of valves under 100Kg in weight.

So my question is, what is the best way of storing this data? Will I just have to try and create a schema as detailed as possible, will something like Entity-Attribute-Value work best or maybe a NoSQL document store?

Thanks for your help.

Best Answer

After much investigation I think that NoSQL is the best way forward. Having unsearchable JSON fields doesn't appeal and I like the flexibility of schema free design.

I have created an example MongoDB document that covers everything I was trying to do:

{
    Cost_Type : "Actuals",
    Cost_Description : "200mm copper steam valve",
    Cost_UOM : "£K",
    Cost_Value : 1000,
    ECs : 2017,
    Source : {
        URL : ,
        Location_In_Source : "pg 203",
        Document : "The cost of valves over time.pdf",
        }
    Physical_Properties : [
        {
            Property : "Height",
            Property_Value : "200",
            Property_UOM : "mm",
            },
        {   
            Property : "Weight",
            Property_Value : "200",
            Property_UOM : "kg"
            }
    ]
    Categories : [
        {
            id: "Valve",
            Path: ",Components,Pumps & Valves,",
            }
        {
            id: "Aircraft",
            Path : ",Industry Sector,Aviation,Aircraft,"
            }
        {
            id: "France"
            Path : ",Europe,France,"
            }
        ]
    Tags : ["Fuel system", "Hot system"],
}   

The initial fields cover everything that is required for each cost.

The source field can either be an URL or an actual binary document.

The physical properties field holds an array of key value fields with an additional unit of measurement field. if I didn't need the UOM it would be simpler as I could just use a standard dictionary.

The categories are a reference to a mongodb graph collection which will be defined by an administrator to avoid bloat.

Finally the tags field will allow for the user to add any other information they like.

I think this should cover everything I need but now I just need to actually implement it in a MongoDB prototype.