Just off the top of my head, it makes sense that it needs to tablescan those two tables. They are both present in a UNION ALL
. This essentially means all records are in consideration. You then also use a LEFT OUTER JOIN
meaning everything from the left side (your two union'd) tables even if it doesn't match the right side (your Children table). This is telling DB2 that no matter what every possible record in the two tables of your UNION ALL is under consideration.
Now...to fix that problem. I would ask, do you have any indexes over the clazz column in both tables? Since clazz is a predicate of your where clause you may wish to have a index over that field in both tables (MyStuff and MyWarehouseStuff). You don't need an index over the id fields (as long as they are primary keys), because you gain that by default in DB2.
To better verify what DB2 is doing, you should use the visual explain (of course you may have done that). It should show you why it is picking what it is. A few other things to keep in mind when examining the explain, do you run stats? If you don't, you should. Running stats will keep DB2 better informed as to statistics about its tables, data, indexes, etc, and may cause it to choose a better access path. Also, this may seem a silly question, what version and fix pack of DB2 are you on? Occasionally certain versions/fix packs end up buggy and there are performance issues due to poor access path selection by the optimizer. An example of this, if you are on DB2 9.7 then you want to be on fix pack 4 or 5. There is at least one fix pack below those that had serious performance issues due to a bug in the optimizer.
Anyway, check out your indexes first. Adding one to both the tables I mentioned may solve your problem. Worse case, I suppose you could re-write your query to put the clazz predicate up into the sub select "table". That may cause the optimizer to eliminate rows before it unions and joins. Just another thought.
EDIT: Here is a thought on your query
SELECT allStuff.id
FROM (SELECT id, 1 AS clazz_
FROM MyStuff
UNION ALL
SELECT id, 2 AS clazz_
FROM MyWarehousedStuff) allStuff
LEFT OUTER JOIN
Chilren child
ON allStuff.id = child.id
WHERE allStuff.id IN (/*fast sub select */)
AND allStuff.clazz_= ?
You are using a constant basically that from what I understand doesn't
exist in the table. You also are not selecting this, but you have
applied it to every row, and you use it in the WHERE
clause. So I
ask why do this? It really is just saying select every row. You may as
well be doing select *
. This will contribute to the table scan. Plus
you are not displaying this value back. So why use it? Why not rather
do the following:
SELECT allStuff.id
FROM (SELECT id FROM MyStuff
UNION ALL
SELECT id, FROM MyWarehousedStuff) allStuff
LEFT OUTER JOIN
Chilren child
ON allStuff.id = child.id
WHERE allStuff.id IN (/*fast sub select */)
Since you are not selecting the data, unless you oops'd on your query
you are posting, isn't this what you would rather have? It union
(all)'s the ids, joins them to the child table and then returns what
you want?
Although I would also question the point of your LEFT OUTER JOIN
.
Because if your left table (in this case allStuff) has a NULL
for
the id it shouldn't match your "fast sub select", unless that allows
NULL
's as well. (Does it?), so you would be excluding these. I don't
know the context of what you are trying to accomplish so perhaps I am
not speaking to it appropriately, but it makes me question having a
LEFT OUTER JOIN
as well.
Ah ok.
If you are referring to the UNLOAD command, that is only available on DB2 for z/OS. DB2 LUW does not have an UNLOAD command....except if you purchase IBM's separate product Optim High Performance Unload. (I am assuming you are wishing to use that?)
Optim High Performance Unload uses the db2hpu command. And though we do not use the tool where I work, it appears that the syntax you are after is the following:
db2hpu -t 'S.T' -d <database-name> -format ixf -m 'filesystem/messages.MSGS' -o 'filesystem/myfile.IXF'
Best Answer
The IXF format is documented in the manuals. Conceivably, if other methods don't work as you would like them to, you could write a script to extract column definitions.