Summation query with complex grouping criteria

duplication oracle

I have a database of coded actions with this structure.

{User, Date, ActionCode}

I need to generate reports based on dates and action codes, and the action codes are not clean because we have to import them from multiple sources.

I can handle most of them (like the average number of times users perform ActionCode 13).

But here's what I'm having trouble with: the average number of times users perform an action that can be identified by any one of the action codes 61, 62, 700, or 701. The same action can also be identified by action codes 84 and 85 together, or by the three action codes 10, 11, and 12 together.

Here is what I have that handles the first set of action codes:

select
  avg(cnt)
from
(
  select
    count(distinct(date)) as cnt
  from
    codes
  where
    actioncode in (61, 62, 700, 701)
  group by user
)

I use distinct date because some users code to multiple places and we get back more than one code for the action. How do I add "or they had both 84 and 85 on the same date"?

Best Answer

One of many possible ways:

SELECT  avg(cnt) AS avg_per_user
FROM   (
   SELECT usr, count(DISTINCT date) AS cnt
   FROM   codes c
   WHERE  actioncode in (61, 62, 700, 701)

   -- 84 counts only if the same user also has 85 on the same date
   OR     actioncode = 84
   AND EXISTS (
      SELECT 1 FROM codes c1
      WHERE  c1.usr = c.usr
      AND    c1.date = c.date
      AND    c1.actioncode = 85
      )

   -- 10 counts only if the same user also has 11 and 12 on the same date
   OR     actioncode = 10
   AND EXISTS (
      SELECT 1 FROM codes c1
      WHERE  c1.usr = c.usr
      AND    c1.date = c.date
      AND    c1.actioncode = 11
      )
   AND EXISTS (
      SELECT 1 FROM codes c1
      WHERE  c1.usr = c.usr
      AND    c1.date = c.date
      AND    c1.actioncode = 12
      )
   GROUP  BY usr
   ) sub

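We don't need parentheses, since operator precedence works in our favor: AND binds more tightly than OR, so each "actioncode = ... AND EXISTS ..." group is evaluated as a unit before the ORs are applied.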
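
If you want to convince yourself of the precedence rule, a throwaway query against Oracle's dual table is enough. This is only a sketch; the literal comparisons are made-up stand-ins for the real predicates:

```sql
-- AND is evaluated before OR, so the condition reads as:
--   1 = 1 OR (1 = 0 AND 1 = 0)
SELECT CASE
          WHEN 1 = 1 OR 1 = 0 AND 1 = 0 THEN 'AND binds first'
          ELSE 'OR binds first'
       END AS precedence
FROM   dual;
-- returns 'AND binds first'
```

If OR were applied first, the condition would read as (1 = 1 OR 1 = 0) AND 1 = 0, which is false. Adding explicit parentheses around each AND group does no harm and may make the intent clearer to later readers.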

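DISTINCT is still necessary only if a user can have more than one qualifying row for the same date, for example a duplicated code, or two of the codes 61, 62, 700, 701 on one day. The cases for (84 + 85) and (10 + 11 + 12) each contribute a single row per user and date, because only the row with code 84 or 10 passes the filter, provided that row is not itself duplicated.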
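
Another of the possible ways, sketched here as an alternative rather than a replacement, is to collapse the data to one row per user and date first and test the code combinations with conditional aggregation. Each qualifying day is then counted exactly once, so DISTINCT on the date is not needed. The column name action_date is an assumption (a hypothetical non-reserved replacement for date), and per_day and per_user are just illustrative aliases:

```sql
SELECT avg(cnt) AS avg_per_user
FROM  (
   SELECT usr, count(*) AS cnt            -- one row per qualifying day
   FROM  (
      SELECT usr, action_date
      FROM   codes
      GROUP  BY usr, action_date
             -- any one of the single-code variants
      HAVING max(CASE WHEN actioncode IN (61, 62, 700, 701) THEN 1 ELSE 0 END) = 1
             -- both 84 and 85 present on that day
          OR count(DISTINCT CASE WHEN actioncode IN (84, 85) THEN actioncode END) = 2
             -- all of 10, 11 and 12 present on that day
          OR count(DISTINCT CASE WHEN actioncode IN (10, 11, 12) THEN actioncode END) = 3
      ) per_day
   GROUP  BY usr
   ) per_user;
```

Like the query above, this averages only over users who have at least one qualifying day. Whether it performs better than the EXISTS version depends on your data and indexes, so test both.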

I am using usr instead of user, since I am not comfortable with abusing reserved words as identifiers. date is a reserved word too, so it isn't a good column name either.
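
For illustration, a table definition that avoids both reserved words might look like the sketch below. The column types, the name action_date, and the sample rows are all assumptions made up for this example, not taken from the question:

```sql
-- Hypothetical definition; adjust the types to your real data.
CREATE TABLE codes (
   usr         VARCHAR2(30) NOT NULL,  -- instead of USER
   action_date DATE         NOT NULL,  -- instead of DATE
   actioncode  NUMBER(5)    NOT NULL
);

-- A few made-up rows to try a query against.
-- alice: 61 on Jan 1 (qualifies), 84 + 85 on Jan 2 (qualifies)
INSERT INTO codes (usr, action_date, actioncode) VALUES ('alice', DATE '2024-01-01', 61);
INSERT INTO codes (usr, action_date, actioncode) VALUES ('alice', DATE '2024-01-02', 84);
INSERT INTO codes (usr, action_date, actioncode) VALUES ('alice', DATE '2024-01-02', 85);
-- bob: 84 alone on Jan 2 (does not qualify), 10 + 11 + 12 on Jan 3 (qualifies)
INSERT INTO codes (usr, action_date, actioncode) VALUES ('bob', DATE '2024-01-02', 84);
INSERT INTO codes (usr, action_date, actioncode) VALUES ('bob', DATE '2024-01-03', 10);
INSERT INTO codes (usr, action_date, actioncode) VALUES ('bob', DATE '2024-01-03', 11);
INSERT INTO codes (usr, action_date, actioncode) VALUES ('bob', DATE '2024-01-03', 12);
```

By the rules in the question, alice has two qualifying days and bob has one, so the average per user for this sample should come out as 1.5.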