I read about pgpool-II here and it states that:
Using the parallel query function, data can be divided among the multiple servers, so that a query can be executed on all the servers concurrently to reduce the overall execution time. Parallel query works the best when searching large-scale data.
Does anyone have experience with this function? Does it offer functionality similar to Greenplum, Aster Data, or Stado/GridSQL? These all can use many computers to process any SQL query in parallel, which is a huge benefit when data-mining a large dataset. I was unable to find any documentation on this on the pgpool-II pages.
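To make concrete the kind of parallelism I mean, here is a minimal scatter-gather sketch. It is not pgpool-II's API or anyone's actual implementation: the shards are in-memory SQLite databases standing in for separate servers, and the `sales` table and the names `make_shards`, `query_shard`, and `parallel_avg` are made up for illustration. Each shard computes a partial aggregate concurrently, and a coordinator combines the partial results.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_shards(rows, n_shards):
    """Split rows round-robin into n_shards lists (stand-ins for separate servers)."""
    return [rows[i::n_shards] for i in range(n_shards)]


def query_shard(shard_rows):
    """Run the partial aggregate on one shard.

    Each call creates and uses its own in-memory SQLite database,
    so no connection is shared between threads.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", shard_rows)
        count, total = conn.execute(
            "SELECT COUNT(*), COALESCE(SUM(amount), 0) FROM sales"
        ).fetchone()
        return count, total
    finally:
        conn.close()


def parallel_avg(rows, n_shards=4):
    """Compute AVG(amount) by combining per-shard COUNT and SUM results."""
    shards = make_shards(rows, n_shards)
    # Scatter: send the same partial query to every shard concurrently.
    with ThreadPoolExecutor(max_workers=n_shards) as pool:
        partials = list(pool.map(query_shard, shards))
    # Gather: an average cannot be averaged again, so combine counts and sums.
    count = sum(c for c, _ in partials)
    total = sum(t for _, t in partials)
    return total / count if count else None


rows = [(i, float(i)) for i in range(1, 101)]
print(parallel_avg(rows))
```

The point of the sketch is that the coordinator has to rewrite the query (here, AVG into COUNT and SUM) before it can merge results, which is the part I would expect a true parallel query engine to handle for arbitrary SQL.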
Best Answer
To my knowledge, it does not allow that level of parallelism. If that's what you are trying to do, the best approach is to go with Postgres-XC or GridSQL. Postgres-XC is more complex, but it is also more flexible.
http://postgres-xc.sourceforge.net/ is the project page.