How do I create a function which takes indefinite parameters? And then finds all game_id
s in a table where each parameter matches a different row (but same game_id
)?
Example
Table:
create table tags (
tag_id serial primary key,
game_id int, -- references games(game_id),
tag_name text,
tag_value text
)
Sample data:
tag_id | game_id | tag_name | tag_value
--------+---------+-----------+----------------------
55 | 6 | Event | EUR-ASIA Rapid Match
58 | 6 | Round | 5
400 | 38 | Event | EUR-ASIA Rapid Match
403 | 38 | Round | 4
Example request: Let's say I want all game_id
s where
Event (a tag_name) = 'EUR-ASIA Rapid Match' (a tag_value)
AND
Round (a tag_name) = '5' (a tag_value)
A hardcoded solution for that exact scenario only might look like this:
with m1 as (
select game_id from tags
where tag_name = 'Event'
and tag_value = 'EUR-ASIA Rapid Match'
), m2 as (
select game_id from tags
where tag_name = 'Round'
and tag_value = '5'
) select * from m1 intersect select * from m2;
Except I want an indefinite number of tag matches. Can I create a function that takes an arbitrary number of tag names/values and returns a set of game_id
matching all? The call might look this (pseudo-code):
select * from get_games_by_tags('{Event,EUR-ASIA Rapid Match}', ...)
Best Answer
This is a special case of relational-division. Here is an arsenal of query techniques:
The special difficulty of your case is to filter on the combination of two attributes, but the principle is the same.
You can make this fully dynamic with plain SQL, without string concatenation and dynamic SQL:
But performance won't come close to the following solution with dynamic SQL.
For best performance, have this (
UNIQUE
) multicolumn index:Maybe your
PRIMARY KEY
ontags
already spans these columns. For best performance you need index columns in the demonstrated order. Create an additional index if the PK does not match or change the column order of the PK unless you need columns in a different order (too). Related:The basic query technique I chose uses the pattern:
This query is already optimized for performance.
Function
Using a user-defined row type as input like you have in your answer (optional, but convenient for the function design). I chose the name
game_tag
becausetag
felt too generic:Note the subtle differences in syntax for these two row values:
The first one is a string literal for the registered row type
game_tag
, the second is aROW
constructor on two string literals building an anonymous row, short for:Either works for our purpose and gets index support. Just don't confuse the different syntax requirements. Related:
The shortcuts for just 1 or 2 parameters are optional but should further improve performance.
db<>fiddle here
Should be faster by orders of magnitude than what you have in your answer.
Call:
You can also pass an actual array to a
VARIADIC
function. Related: