The approach you're using is unnecessarily complex - and very inefficient. Instead of the first function use:
create or replace function compute_pair_id_value(id bigint, value integer)
returns setof pair_id_value
as $$
SELECT $1, generate_series(0,$2);
$$
language sql;
or better, get rid of it entirely and write the whole operation like this:
-- Sample data creation:
CREATE TABLE my_obj(id bigint, obj_value integer);
insert into my_obj(id,obj_value) VALUES (1712437,2),(17000,5);
-- and the query:
SELECT id, generate_series(0,obj_value) FROM my_obj;
Resulting in:
regress=> SELECT id, generate_series(0,obj_value) FROM my_obj;
id | generate_series
---------+-----------------
1712437 | 0
1712437 | 1
1712437 | 2
17000 | 0
17000 | 1
17000 | 2
17000 | 3
17000 | 4
17000 | 5
(9 rows)
This exploits PostgreSQL's behaviour with set-returning functions called in the SELECT
list. Once PostgreSQL 9.3 comes out it can be replaced with a standards-compliant LATERAL
query.
Since it turns out your question was a simplified version of the real problem, let's tackle that. I'll work with the simplified compute_pair_id_value
above to avoid the hassle of plpython3. Here's how to do what you want:
SELECT (compute_pair_id_value(id,obj_value)).* FROM my_obj;
Result:
regress=> SELECT (compute_pair_id_value(id,obj_value)).* FROM my_obj;
id | value
---------+-------
1712437 | 0
1712437 | 1
1712437 | 2
17000 | 0
17000 | 1
17000 | 2
17000 | 3
17000 | 4
17000 | 5
(9 rows)
but again, be warned that compute_pair_id_value
will be called more than once. This is a limitation of PostgreSQL's query executor that can be avoided in 9.3 with LATERAL
support, but as far as I know you're stuck with it in 9.2 and below. Observe:
create or replace function compute_pair_id_value(id bigint, value integer)
returns setof pair_id_value
as $$
BEGIN
RAISE NOTICE 'compute_pair_id_value(%,%)',id,value;
RETURN QUERY SELECT $1, generate_series(0,$2);
END;
$$
language plpgsql;
output:
regress=> SELECT (compute_pair_id_value(id,obj_value)).* FROM my_obj;
NOTICE: compute_pair_id_value(1712437,2)
NOTICE: compute_pair_id_value(1712437,2)
NOTICE: compute_pair_id_value(17000,5)
NOTICE: compute_pair_id_value(17000,5)
id | value
---------+-------
1712437 | 0
1712437 | 1
1712437 | 2
17000 | 0
17000 | 1
17000 | 2
17000 | 3
17000 | 4
17000 | 5
(9 rows)
See how compute_pair_id_value
is called once per output column?
There is a workaround: Another layer of subquery to unpack the composite type result. See:
regress=> SELECT (val).* FROM (SELECT compute_pair_id_value(id,obj_value) FROM my_obj) x(val);
NOTICE: compute_pair_id_value(1712437,2)
NOTICE: compute_pair_id_value(17000,5)
id | value
---------+-------
1712437 | 0
1712437 | 1
1712437 | 2
17000 | 0
17000 | 1
17000 | 2
17000 | 3
17000 | 4
17000 | 5
(9 rows)
You can use the same technique in your code if you really must LOOP
over the results (it's slow to do that, so avoid it if you can).
Best Answer
Solution
Assuming the data type of your column
tenk1.unique1
isinteger
:Use your actual column type and the corresponding array type.
You get the position, or NULL if the value is not in the MCV list.
The solution is short - unlike the ...
Explanation
The function
array_position()
is defined to take(anyarray, anyelement)
(or(anyarray, anyelement, integer)
for the second variant).The column
pg_stats.most_common_vals
has the polymorphic data typeanyarray
to be able to hold arrays of any data type - for obvious reasons.anyarray
andanyelement
are not allowed as data types for user-created tables. For users, both are polymorphic pseudo-types. (But Postgres can use them in system tables.)Multiple polymorphic variables in the same function have to resolve to the same (or corresponding) data types. The manual:
And:
Bold emphasis mine.
You have found a corner case where function type resolution fails for the combination of the polymorphic
anyarray
with aninteger
- or any non-polymorphic type in second position.1000
in your expressionarray_position(most_common_vals, 1000)
is a numeric constant resolving tointeger
. These would fail in similar fashion:Furthermore, there are no casts defined for
anyarray
, being a pseudo-type in user-land:The workaround is to cast to
text
as stepping stone, since any type can be cast totext
. Then cast tointeger[]
, arriving at the solution above.In closing, I think this is a shortcoming in function type resolution that could be resolved (easily?). But since the data type
anyarray
is not supposed to be used like this in user-land to begin with I doubt that any developer will spend time on it ...