PostgreSQL – Optimize Age Range in WHERE Clause

postgresql

I have 2 columns in the database for events.

  1. agelow (int)
  2. agehigh (int)

How it's structured:

  • agehigh is being used for "and under" which is represented by 0 in the column
  • the high age is placed in the agelow column
  • The "and over" is represented by 99 and is not a problem since it has the agehigh and the agelow in the right columns.
  • all ages are represented with 99 in both columns.

I am trying to write a WHERE that can pull all event ages if no ages are passed in but also need to be able to pull events just for one age passed in or for an age range passed in. The age range is making this a big headache.

Example

If I pass in agelow 5 and agehigh 15, I would need to grab all events that are

  1. available for anyone from 5 up to 15 which includes the "5 and under", "5 and up", "15 and under" and all ages.

Here is the WHERE part:

WHERE
((events.age_low >= agelow AND events.age_high <= agehigh) OR
 (events.age_low <= agelow AND events.age_high >= agehigh) OR
 (events.age_low <= agelow /* LOW (99 for all) */ AND events.age_high >= agelow /* LOW (99 for all) */) OR
 (events.age_low <= agelow /* LOW (99 for all) */ AND events.age_high = 99 /* DEFAULT 99 */) OR
 (events.age_low >= agelow /* LOW (99 for all) */ AND events.age_high = 0 /* DEFAULT 0 */) OR 
 (events.age_high >= agehigh /* HIGH (0 for all) */ AND events.age_low <= agehigh /* HIGH (0 for all) */) OR 
 (events.age_low >= agehigh /* HIGH (0 for all) */ AND events.age_high = 0 /* DEFAULT 0 */) OR 
 (events.age_low <= agehigh /* HIGH (0 for all) */ AND events.age_high = 99 /* DEFAULT 99 */) OR  
 (events.age_low = 99 AND events.age_high = 99));

Here is a small selection of the age columns:

CREATE TABLE events(age_low,age_high)
AS VALUES
  (  99 ,  99 ),
  (   3 ,  12 ),
  (   1 ,   6 ),
  (   5 ,   0 ),
  (  99 ,  99 ),
  (  99 ,  99 ),
  (  99 ,  99 ),
  (   5 ,  11 ),
  (   5 ,   0 ),   -- this is 5 and below
  (  16 ,  99 ),   -- this is 16 and up
  (   1 ,   4 ),
  (  13 ,  18 ),
  (  99 ,  99 ),   -- this is all ages
  (   5 ,   0 ),
  (  99 ,  99 );

Best Answer

One option is to wrap greatest and least to assist this. We can use this to iron out the oddities

  1. 5,0 being five and under. That should really be 0,5 if there was any level of sanity.
  2. That 99,99 is all ages. Here if the lower bound is 99 we set that to -1 just be sure it's all ages.

Code,

CREATE FUNCTION normalize_query_range(start int, finish int)
RETURNS int4range
AS
$$
  SELECT int4range(
    CASE WHEN least(start,finish) = 99 THEN -1 ELSE least(start,finish) END
    , greatest(start,finish)
  );
$$ LANGUAGE sql
IMMUTABLE;

Now, we can easily do this using the range operators for @> contains, and && overlaps.

SELECT age_low,age_high
FROM events
WHERE normalize_query_range(age_low,age_high) && int4range(5,15);

And, even better, creating an index will speed this up.

CREATE INDEX ON events
  USING gist(normalize_query_range(age_low, age_high));