Do cross-joins get optimised by the client library

optimizationquery

Note This is deliberately database agnostic, I'm interested in how different implementations, er, differ.

I came across Joins are for lazy people on SO, and it's made me question an assumption I'd made.

Joining in the app server can be more
efficient if joining on the database
causes severe redundancy in the result
set sent over the network. Consider
tables A and B, where each row in A is
associated with 20 rows in B, B has
only 100 rows, and we want to fetch
the first 1000 rows from A with
associated rows from B. Joining in the
database will result in 20 * 1000
tuples sent across the network. If the
join is done in the app server (first
fetching the entire B table into
memory), a mere 100 + 1000 rows are
sent across the network.

I kinda assumed that a cross-join would be optimised in the client library, so that less data was sent over the network, and then merged, so that the app didn't need to glue the data together. Am I plain wrong, or does it vary depending on the database?

Best Answer

I'm reasonably confident that there is no optimization in the Oracle client libraries for repeating data. If it is, it's below the SQL*Net trace layer which would surprise me.

There are definitely cases where it could make sense to do the join in the app server rather than in the database. But it would be exceedingly rare that you'd have a table structure like this where you would also want to display the result of joining A to B to the user. If you don't want to display the data to the user, you'd just make sure that the results were being processed in the database so that neither A nor B had to be sent over the network.

I have had one or two cases where I needed to get highly denormalized results to a client where it made sense to use the CURSOR function rather than doing a straight join solely in an effort to minimize the amount of data being sent over the network. Of course, that requires that you're selecting a substantial amount of data from table A (DEPT in this case) so that the duplication of data actually becomes problematic.

SQL> ed
Wrote file afiedt.buf

  1  select dname,
  2         cursor( select ename, empno
  3                   from emp e
  4                  where e.deptno=d.deptno) emps
  5*   from dept d
SQL> /

DNAME          EMPS
-------------- --------------------
ACCOUNTING     CURSOR STATEMENT : 2

CURSOR STATEMENT : 2

ENAME           EMPNO
---------- ----------
CLARK            7782
KING             7839
MILLER           7934

RESEARCH       CURSOR STATEMENT : 2

CURSOR STATEMENT : 2

ENAME           EMPNO
---------- ----------
smith            7369
JONES            7566
SCOTT            7788
ADAMS            7876
FORD             7902

SALES          CURSOR STATEMENT : 2

CURSOR STATEMENT : 2

ENAME           EMPNO
---------- ----------
ALLEN            7499
WARD             7521
MARTIN           7654
BLAKE            7698
TURNER           7844
SM0              7900

6 rows selected.

OPERATIONS     CURSOR STATEMENT : 2

CURSOR STATEMENT : 2

no rows selected