We have a huge number of rows(tens of millions), and for each table we have an associated view that is a select with UNION ALL between that table and several others.
Now, we are doing a select from that view with some conditions(where).
On Oracle and SQL Server the performance of the select is reasonable.
We have given a try to Vertica, because it is supposed to be good with selects on large tables, but the performance of this particular select is really bad.
I understand that when you do a union you lose the indexes on columns, because you can't get back to them.
What do you think is the problem with Vertica, is it missing some optimization steps that the other databases do ?
Best Answer
Vertica query performance depends highly on the predicate used in the query .
To get the gist of your performance , try getting the
projection name
of the selected columns of the query you are firing . The columns in theorder by
clause of the projection is very important in deciding the performance of yourselect
. you can get that by runningexplain
on your query .Vertica improves performance by
sorting
the columns ,compressing
viaencoding
them so that they use minimal memory while running .Also run
analyze_histogram(tablename, 100)
on all your tables . this will ensure statistics over complete data sample, not just 10% of data sample which is taken byanalyze_statistics
.Also , since you are doing
union
you should try to keep thesort order
of all theprojections
same as afterunion
it might be meaningless .