PostgreSQL Casting Error – Troubleshooting Query Failures

hivepostgresqlpostgresql-9.3

I have a query that can run on the same data set, and sometimes it fails and sometimes it succeeds

The query is generated by hive metadata service, and I can't modify it.

This is a simplified version of the query:

select
    "TBLS"."TBL_ID",
    "FILTER0"."PART_ID",
    "TBLS"."TBL_NAME",
    "FILTER0"."PART_KEY_VAL"
from
    "PARTITIONS"
inner join "TBLS" on
    "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"
    and "TBLS"."TBL_NAME" = 'test_table_int'
inner join "PARTITION_KEY_VALS" "FILTER0" on
    "FILTER0"."PART_ID" = "PARTITIONS"."PART_ID"
where
    cast("FILTER0"."PART_KEY_VAL" as decimal(21, 0)) = 1

When I spin up a new database, and populate the relevant tables, this is how the whole data looks like (querying without any filters):

enter image description here

and running the query above will return a single row (the one with PART_KEY_VAL = 1)

the problem starts after I run some automated tests that write to those tables. I couldn't find any pattern, I just run a few complicate tests that write to those tables

Now if I populate those tables again, the data looks similar:

enter image description here

but running the query above will result in:

SQL Error [22P02]: ERROR: invalid input syntax for type numeric: "c"

for some reason, the value "c" is being cast to decimal and it fails, even though the same query on the same data was working earlier

what could be the reason for this behavior?


for reference, here is where the query is generated, but I simplified it a bit above: https://github.com/apache/hive/blob/rel/release-3.1.2/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java#L1289-L1339

Best Answer

the problem is that for some reason postgres stops using the index on the column "TBL_NAME".

when the index is used, the table is filtered first, and the remaining values are all ints.
but when the index is not used, the cast will happen on all the rows, and will fail on the strings.

as a workaround, I set enable_seqscan = false, and that forces postgres to use the index, but that is not a long term solution.

I opened a bug on hive with a more detailed explanation of the problem: https://issues.apache.org/jira/browse/HIVE-23576