PostgreSQL: translating user-defined calculations into executable calculation in trigger

data-warehousemdxpostgresqlpsql

I am looking to convert user defined machine readable string formulas into a set of queries and calculations, the result of which would then be saved in another non-normal table. The calcs would be executed via a trigger and a generalized/dynamic/flexible calculation function(s) that would execute when a certain table (bbg_pulls in the simple example below) is updated or inserted on.

For example, if the string passed into the calculation function was:

'[bp2][-1]/[bp5]'

Where [bp#] designates the table (ex: bp = bbg_pulls) and id and [+/-#] designate the period lag (day lag in my example) relative to now() for ID just to the left of it.

Then the query would be something like:

select val1/val2 as fin_calc
from 
    (select val from bbg_pulls 
    where 
      id = 2
      and val_date = (select (CURRENT_TIMESTAMP - INTERVAL '1 day')::date)) as val1
    ,(select val from bbg_pulls 
    where 
      id = 5
      and val_date = (select CURRENT_TIMESTAMP::date)) as val2

(Note that all of the above made up and I have no way to test the queries so they may be syntactically off etc. Hopefully you understand my point though.)

My example above only includes division of two items in the same table, but other strings could be much more complicated, and I would need to incorporate things like percentile rank, the basic mathematically operates, absolute value, etc.

We use PostgreSQL 9.3.5. The queries above are dramatically over simplified for clarity.

What is the correct approach here? Warehouse/MDX/cube solution? Or is there something like this out there already?

My current approach/thinking is to go through all of the various calculations we use and define a separate query etc. in psql for each, create some logic to find out which type of query/function is desired and pass in the IDs as parameters to the specific query/function that was identified. In other words, not a generalized calculation generation function that can take any string calculation, but a set predefined functions and queries that all have to be defined separately and identified in the string formula that is passed in. One would exist for each minor difference in calculation, and we have a lot.

If this question is not answerable, please let me know and I'll take it down.

Best Answer

Essentially, you're asking how to implement a DSL that generates dynamic SQL based on that proprietary non-spec'd query string '[bp2][-1]/[bp5]'. That's a massive task. I'm not going to be able to answer this question on this medium, it's far from a Q/A: "it's like how do I do this job." That said, this is how it would look.

  1. Create a specification of what kind of features you intend to support with '[bp2][-1]/[bp5]' and '([bp1]/[wc66])*[wc100]', and the numerous other examples you've given in your questions.
  2. Create a parser. This is an art in and of itself. You going to use regexes, a tree parser, a sax parser? What exactly do you need to meet the specification?
  3. Use an off-the-shelf solution that can translate the parsed structure to SQL. Most languages have something that can do this, like SQL::Abstract. If not, you gotta create it.