Postgresql – Multiple THEN outputs on a PostgresQL query

casefunctionspostgresql

Helllo there,
I've been playing around with a materialised view, and my goal is to pre-process as much information as I can, aggregating the info into this view that would speed up the data analysis.

As a such, I'd like to use a CASE expression with an input condition and two assignments if the condition is satisfied.

Something like

SELECT 
    CASE 
        WHEN t."page" ~* '...' THEN t.type = 'Product', t.something = 'something else here'
END AS "page_type"

Reading through the documentation, but also checking for several examples I couldn't see this doable. Am I missing something?

Assuming is not, the only thing I can think of is using a function that returns an array or something similar.
Would that be an a possible approach? If so, any example I can be pointed out to learn a decent DDL pattern?

Thanks

UPDATE To support @a_horse_with_no_name answer

SELECT * 
FROM "V_2019-06-10_to_2019-06-20_GSC" t,
            (
                SELECT
                    Regexp_replace(t."page", '...', '\1') as node
                    FROM "V_2019-06-10_to_2019-06-20_GSC" i
                    WHERE i.page = t.page
            ) AS "IFs"

The above generates an error in referencing the t table.
If I go for a CROSS JOIN LATERAL, the query works, but then the GROUP BY I have at the end of the outer query cannot dedup anymore the results

UPDATE 2

Here's a simplification of the query. It's of course much more complex than this, but hopefully it suffice.

SELECT -- t."page", 
    "URL".node, t.area, "URL".page_type,
    CASE
        WHEN "URL".page_type = 'ABC' THEN <here I need my value>
        ELSE ''
    END AS "Node_Type",
    FROM "V_2019-06-10_to_2019-06-20_GSC" t
    CROSS JOIN LATERAL (
        SELECT 
            Regexp_replace(t."page", '...', '\1\2\3\4\5\6') as node, 
            CASE 
                WHEN t."page" ~* '...' THEN 'ABC'
                WHEN t."page" ~* 'filter' THEN 'DEF' 
                ELSE 'Other'
            END AS "page_type" 
    ) AS "URL"
    GROUP BY "URL".node, t.area, "URL".page_type
;

A URL processed by this query could look like this /level1/level2/level3/?filter=123

Idealy I'd like to:

  • capture level1/level2/level3 with the Regexp_replace, and I have the
    regexpr working for this.
  • I need to identify once again whether the filter is included in the page to populate the page_type, and the case above should be ok.
  • capturing 123 as part of the outer CASE "Node_Type", but I cannot get this done, hence my first thought of a CASE processing two
    outputs.

Is there an effective way to get this out?

Best Answer

A single case expression can only return a single value (=expression). So yes, you are right you can't do that. If you want to have two output columns, you need two case expressions.

SELECT CASE 
         WHEN t.page ~* '...' THEN 'Product here'
         ... other outputs ....
        END AS page_type, 
        CASE 
         WHEN t.page ~* '...' THEN 'something else'
         ... other outputs ....
        END AS something _else
FROM ...

If you don't want to repeat the condition, you could wrap that into a derived table:

SELECT node, area, page_type, 
       CASE
          WHEN page_type = 'ABC' THEN 'Product'
          WHEN page_type = 'DEF' THEN 'Something else'
          ELSE ''
       END AS "Node_Type"
from (
    SELECT area, 
           Regexp_replace(t."page", '...', '\1\2\3\4\5\6') as node, 
           CASE 
             WHEN t.page ~* '...' THEN 'ABC'
             WHEN t.page ~* 'filter' THEN 'DEF' 
             ELSE 'Other'
           END AS page_type, 
           string_to_array(trim(substr(page, 1, strpos(page, '?')-1),'/'), '/') as levels,
           regexp_replace(page, '(^.*\?filter=)([0-9]+)', '\2', 'g') as filter_value
           length(t.page) as string_length
    FROM "V_2019-06-10_to_2019-06-20_GSC" 
) t;

The expression string_to_array(trim(substr(page, 1, strpos(page, '?')-1),'/'), '/') will return the individual elements (levels) of the URL as array elements. '/level1/level2/level3/?filter=123' will yield and array {level1,level2,level3) and '/level1/level2/level3/level4/?filter=123' will return 4 elements. If you need to access the number of levels in the outer select, you can use e.g. cardinality(levels)

The expression regexp_replace(page, '(^.*\?filter=)([0-9]+)', '\2', 'g') will return the value of the parameter filter as a single expression (It's probably not 100% fool-proof, but URL parsing with regex is not a simple task).