Query optimization help

optimizationoracleoracle-11g

I am fairly new to SQL and am a bit out of my depth with the query I have been working on. I am attempting to modify a query created by someone else to fix an error. the code I have modified is below:

SELECT COUNTY                               COUNTY,
DECODE(RATE,'115A','R','C')             R_C,
SUM(MTH_CHG)                            MO_CHG,
SUM(YTD_CHG)                            YR_CHG,
COUNT(CUST)                             TOTAL_CUST
    FROM RPT_BLOC_VW,
        (SELECT SRAT_CODE               RATE,
        NVL(billed_chg,0) + NVL(adj_chg,0) MTH_CHG,
        0                             YTD_CHG,
        BLOC_CODE,
        FROM RPT_REV_CON_ETL,
        (SELECT DISTINCT cust_code, prem_code AS CUST 
            FROM   rpt_rev_con_etl 
            WHERE  PROCESS_DATE BETWEEN vStartMonth AND vEndMonth 
                   AND UTVSRAT_desc = '115')
        WHERE PROCESS_DATE BETWEEN vStartMonth AND vEndMonth
        AND UTVSRAT_DESC = '115'
    UNION ALL
     SELECT         SRAT_CODE          RATE,
         0                             MTH_CHG,
         NVL(billed_chg,0) + NVL(adj_chg,0) YTD_CHG,
         BLOC_CODE
        FROM RPT_REV_CON_ETL
          WHERE PROCESS_DATE BETWEEN vStartYear AND vEndMonth
        AND DESC = '115'
     ) DATA
     WHERE DATA.BLOC_CODE = RPT_BLOC_VW.BLOC_CODE(+)
 GROUP BY COUNTY, DECODE(RATE,'115A','R','C')
 ORDER BY COUNTY, DECODE(RATE,'115A','R','C') DESC ;

What I am looking for is a way to combine the second and third select statements into one. They have the same From and Where clause so I am fairly certain it should be possible, but I cannot think of a way to do it that doesn't mess up the distinct clause and screw up the count I am trying to get.

I am pretty sure there is something obvious I am overlooking.

-edited to include full query-

Best Answer

With your original code as such:

    SELECT COUNTY                                  COUNTY,
       DECODE(RATE,'115A','R','C')             R_C,
       SUM(MTH_CHG)                            MO_CHG,
       SUM(YTD_CHG)                            YR_CHG,
       COUNT(CUST)                             TOTAL_CUST
    FROM RPT_BLOC_VW,
       (SELECT SRAT_CODE               RATE,
        NVL(billed_chg,0) + NVL(adj_chg,0) MTH_CHG,
         0                             YTD_CHG,
         BLOC_CODE
        FROM RPT_REV_CON_ETL,
          (SELECT DISTINCT
           CUST_CODE,
           PREM_CODE AS CUST
    FROM   RPT_REV_CON_ETL
    WHERE PROCESS_DATE BETWEEN '28-OCT-16' AND '29-NOV-16' AND DESC = '115'
    )
          WHERE PROCESS_DATE BETWEEN '28-OCT-16' AND '29-NOV-16'
        AND DESC = '115'
     ) DATA
     WHERE DATA.BLOC_CODE = RPT_BLOC_VW.BLOC_CODE(+)
 GROUP BY COUNTY, DECODE(RATE,'115A','R','C')
 ORDER BY COUNTY, DECODE(RATE,'115A','R','C') DESC ;

I start by rewriting it to (this helps me think about it a little differently, which is weird, I'm sure, but this is how I approach it)

SELECT county            COUNTY, 
       Decode(rate, '115A', 'R', 'C') R_C, 
       SUM(mth_chg)      MO_CHG, 
       SUM(ytd_chg)      YR_CHG, 
       Count(cust)       TOTAL_CUST 
FROM   rpt_bloc_vw, 
       (SELECT srat_code                            RATE, 
               Nvl(billed_chg, 0) + Nvl(adj_chg, 0) MTH_CHG, 
               0                                    YTD_CHG, 
               bloc_code 
        FROM   rpt_rev_con_etl, 
               (SELECT DISTINCT cust_code, 
                                prem_code AS CUST 
                FROM   rpt_rev_con_etl 
                WHERE  process_date BETWEEN '28-OCT-16' AND '29-NOV-16' 
                       AND desc = '115') 
        WHERE  process_date BETWEEN '28-OCT-16' AND '29-NOV-16' 
               AND desc = '115') DATA 
WHERE  DATA.bloc_code = rpt_bloc_vw.bloc_code(+) 
GROUP  BY county, Decode(rate, '115A', 'R', 'C') 
ORDER  BY county, Decode(rate, '115A', 'R', 'C') DESC; 

If I rewrite your query to use CTEs like so:

WITH DistinctCustomers AS (
    SELECT DISTINCT cust_code, 
                    prem_code AS CUST 
    FROM   rpt_rev_con_etl 
    WHERE  process_date BETWEEN '28-OCT-16' AND '29-NOV-16' AND desc = '115'
),
DATA as (
    SELECT srat_code                            RATE, 
           Nvl(billed_chg, 0) + Nvl(adj_chg, 0) MTH_CHG, 
           0                                    YTD_CHG, 
           bloc_code ,
           cust.CUST
    FROM   rpt_rev_con_etl etl 
    JOIN   DistinctCustomers cust ON cust.cust_code = etl.cust_code
    WHERE  process_date BETWEEN '28-OCT-16' AND '29-NOV-16' AND desc = '115'
)
SELECT county COUNTY, 
       Decode(rate, '115A', 'R', 'C') R_C, 
       SUM(mth_chg) MO_CHG, 
       SUM(ytd_chg) YR_CHG, 
       Count(cust) TOTAL_CUST 
FROM   rpt_bloc_vw rpt
JOIN   DATA d on d.bloc_code = rpt.bloc_code(+) 
GROUP  BY county, Decode(rate, '115A', 'R', 'C') 
ORDER  BY county, Decode(rate, '115A', 'R', 'C') DESC; 

It looks like, if I read this correctly, that you are first getting the distinct customers that had any processes between the date ranges.

Then you go back for the rate codes and charges for those customers. You could probably just do a join there and not need the distinct (which was probably added because it wasn't written correctly, by my eye, but I don't write a lot of Oracle SQL, just MS SQL).

That gives us this:

WITH DATA as (
    SELECT srat_code                            RATE, 
           Nvl(billed_chg, 0) + Nvl(adj_chg, 0) MTH_CHG, 
           0                                    YTD_CHG, 
           bloc_code ,
           prem_code as Cust
    FROM   rpt_rev_con_etl etl 
    WHERE  process_date BETWEEN '28-OCT-16' AND '29-NOV-16' AND desc = '115'
)
SELECT county COUNTY, 
       Decode(rate, '115A', 'R', 'C') R_C, 
       SUM(mth_chg) MO_CHG, 
       SUM(ytd_chg) YR_CHG, 
       Count(cust) TOTAL_CUST 
FROM   rpt_bloc_vw rpt
JOIN   DATA d on d.bloc_code = rpt.bloc_code(+) 
GROUP  BY county, Decode(rate, '115A', 'R', 'C') 
ORDER  BY county, Decode(rate, '115A', 'R', 'C') DESC; 

Since I don't have a table schema, I'm just assuming that what I've done so far is correct. Pretty sure I got this under control, but see the first query to tell me what I did wrong if I got it wrong.

So then this can be reduced even further if we just do some joins across things like so:

SELECT county COUNTY, 
       Decode(etl.srat_code, '115A', 'R', 'C') R_C, 
       SUM(Nvl(etl.billed_chg, 0) + Nvl(etl.adj_chg, 0)) MO_CHG, 
       SUM(0) YR_CHG, 
       Count(etl.prem_code) TOTAL_CUST 
FROM   rpt_bloc_vw rpt
JOIN   rpt_rev_con_etl etl on etl.bloc_code = rpt.bloc_code(+) AND etl.process_date BETWEEN '28-OCT-16' AND '29-NOV-16' AND desc = '115'
GROUP  BY county, Decode(etl.srat_code, '115A', 'R', 'C') 
ORDER  BY county, Decode(etl.srat_code, '115A', 'R', 'C') DESC; 

But I'm pretty sure I got something wrong on that last one, I don't always trust my own reductions. I've tried to show all my steps here starting with the first edit I did to your post on your original SQL code.

Can you tell me how the output of the last SQL I wrote compares to your original query?