I have a bunch of tables that all have a storenumber
, extracteddate
and extractedtime
column.
I'm doing ETL on these tables and I want to put the last extracted date and time for each store in a single table, with an additional column for the input table name. So I've ended up repeating the same upsert a bunch of times and really the only thing that is different is the CTE that I am using for input:
with rows as (
select distinct on(storenumber)
storenumber, extracteddate, extractedtime
from storefinancialdate
)
insert into star.stat_etl (storenumber, filename, last_extracted, last_loaded)
select rows.storenumber,
'storefinancialdate',
(rows.extracteddate + rows.extractedtime::time),
transaction_timestamp()
from rows
on conflict(storenumber, filename) do update
set last_extracted = excluded.last_extracted,
last_loaded = excluded.last_loaded;
with rows as (
select distinct on(storenumber)
storenumber, extracteddate, extractedtime
from storeproduct
)
insert into star.stat_etl (storenumber, filename, last_extracted, last_loaded)
select rows.storenumber,
'storeproduct',
(rows.extracteddate + rows.extractedtime::time),
transaction_timestamp()
from rows
on conflict(storenumber, filename) do update
set last_extracted = excluded.last_extracted,
last_loaded = excluded.last_loaded;
-- repeat a bunch more times...
Is there some way that I can DRY this up? I can't pass a CTE into a function, can I?
(I get that I could call a function on each record of each CTE, one at a time. That would make the code shorter but also much, much slower. So I don't want to do that.)
Best Answer
You could use
UNION
in the CTE. There is still some duplication of code but less than before: