Parallel INSERT APPEND with NOLOGGING would be the way to do this; then, as with all NOLOGGING operations, take a backup immediately on completion. Mark indexes unusable first, disable constraints, alter the table to NOLOGGING, perform the operation, then re-enable constraints etc.

APPEND causes Oracle to always grab free space above the current high water mark, so it's not efficient at reusing space in the segment, but it avoids fiddling with the freelist and the UNDO overhead. If you have to start again for any reason, TRUNCATE, don't DELETE.
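A rough outline of that sequence might look like this (object names and the degree of parallelism are just placeholders):

alter table target_table nologging;
alter index target_table_ix unusable;
alter table target_table disable constraint target_table_fk;

alter session enable parallel dml;

-- direct-path, parallel load above the high water mark
insert /*+ append parallel(t, 8) */ into target_table t
select /*+ parallel(s, 8) */ * from source_table s;

commit;

-- put everything back, then take the backup straight away
alter index target_table_ix rebuild nologging parallel 8;
alter table target_table enable constraint target_table_fk;
alter table target_table logging;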
As to the incremental commit, it will depend on how your data is segmented: can you easily move, say, a month's worth at a time (e.g. is the partitioning scheme the same in source and target)? Remember that if you need to satisfy some predicate, that will obviously slow you down. Test to make sure the operation isn't going to fail logically (e.g. incompatible datatypes in source and target), then allocate sufficient resources and just go for it in one transaction. Good luck!
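If the partitioning does line up, one way to commit in increments is to load one partition per transaction, along these lines (partition and table names are again placeholders):

-- repeat for each partition, committing after each one
insert /*+ append parallel(t, 8) */ into target_table partition (p_2011_01) t
select /*+ parallel(s, 8) */ * from source_table partition (p_2011_01) s;

commit;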
You could do that in Oracle 11gR2 with expdp and the REMAP_DATA option.

- Create a function in a package that takes a blob as argument, and returns null (or an empty blob perhaps).
- Call expdp as usual, adding:

REMAP_DATA=SCHEMA.TAB.BLOB_COLUMN:SCHEMA.PACKAGE.YOUR_FUNCTION

Short example (schema: mat):
create or replace package remap as
  function null_lob(col blob) return blob;
end remap;
/

create or replace package body remap as
  function null_lob(col blob) return blob is
  begin
    return null;
  end null_lob;
end remap;
/
create table foo(a number, b blob);
-- insert a line with a 10M blob, commit
select a, length(b) from foo;
         A  LENGTH(B)
---------- ----------
         1   10485760
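The 10 MB test row from the comment above can be created with something along these lines (just one of many ways to build a throwaway blob of that size):

declare
  l_blob  blob;
  l_chunk raw(16384) := utl_raw.cast_to_raw(rpad('x', 16384, 'x'));
begin
  insert into foo values (1, empty_blob()) returning b into l_blob;
  -- append 640 * 16384 bytes = 10485760 bytes (10 MB) to the new blob
  for i in 1 .. 640 loop
    dbms_lob.writeappend(l_blob, utl_raw.length(l_chunk), l_chunk);
  end loop;
  commit;
end;
/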
Plain export:
$ expdp mat/*** DIRECTORY=EXP_DIR TABLES=MAT.FOO
...
Total estimation using BLOCKS method: 11.06 MB
...
. . exported "MAT"."FOO" 10.00 MB 1 rows
...
Dump file is ~11M.
With the data remap:
$ expdp mat/*** DIRECTORY=EXP_DIR TABLES=MAT.FOO \
REMAP_DATA=MAT.FOO.B:MAT.remap.null_lob
...
Total estimation using BLOCKS method: 11.06 MB
...
. . exported "MAT"."FOO" 5.398 KB 1 rows
...
Dump file is ~100k; importing that dump does indeed put a null in the column.
Another option would be to use an external table to do the load/unload, with the ORACLE_DATAPUMP driver (I think this was introduced in 10g, see external table concepts).
For the unload, use something like:
create table foo_unload
organization external (
type oracle_datapump
default directory exp_dir
location ('foo.dmp')
)
as select a, empty_blob() b from foo;
(You can drop the external table right after creating it; that doesn't - by default anyway - delete the dump file.)
Then move the foo.dmp file to your target server & directory, and create the external table there:
create table foo_load (a number, b blob)
organization external (
type oracle_datapump
default directory exp_dir
location ('foo.dmp')
);
And finally load your target table (CTAS or truncate ...; insert /*+ append */ ... for instance):
create table foo as select * from foo_load;
Drop the external loader table once that's done.
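If the target table already exists, the truncate ...; insert /*+ append */ ... variant mentioned above would be along these lines (reusing the names from the example):

truncate table foo;

-- direct-path load from the external table
insert /*+ append */ into foo
select * from foo_load;

commit;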
Best Answer
Filtering options in Data Pump still leave something to be desired. Anyway, you could try parameters along the lines of the following:
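For example, in a parameter file (the schema names and the subquery here are assumptions; the idea is to keep only the metadata for SCHEMA1 and SCHEMA2 while exporting everything from SCHEMA3):

SCHEMAS=SCHEMA1,SCHEMA2,SCHEMA3
EXCLUDE=TABLE_DATA:"IN (SELECT table_name FROM dba_tables WHERE owner IN ('SCHEMA1', 'SCHEMA2') UNION SELECT partition_name FROM dba_tab_partitions WHERE table_owner IN ('SCHEMA1', 'SCHEMA2'))"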
This excludes TABLE_DATA for all tables and partitions whose names match the result of the query provided in the exclude part. And that is why this may not be appropriate for you. If you have a table called T1 both in SCHEMA1 and SCHEMA3, the data of SCHEMA3.T1 will be excluded, because the above query returns T1 from SCHEMA1, and Data Pump will ignore TABLE_DATA of all tables called T1. If you have a partitioned table in SCHEMA2, with a partition called P1, the data of the table SCHEMA3.P1 will be excluded. The above may work if your table and partition names are distinct across all referenced schemas.

Or you could dynamically generate the expdp command, or use DBMS_DATAPUMP to filter out all the table data from SCHEMA1 and SCHEMA2.
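A DBMS_DATAPUMP sketch of that second idea (directory, dump file name and the per-table loop are assumptions; the INCLUDE_ROWS data filter set to 0 is what suppresses the row data):

declare
  l_handle number;
  l_state  varchar2(30);
begin
  l_handle := dbms_datapump.open(operation => 'EXPORT', job_mode => 'SCHEMA');
  dbms_datapump.add_file(l_handle, 'schemas_meta_and_s3_data.dmp', 'EXP_DIR');
  dbms_datapump.metadata_filter(l_handle, 'SCHEMA_EXPR',
                                'IN (''SCHEMA1'', ''SCHEMA2'', ''SCHEMA3'')');

  -- suppress row data for every table owned by SCHEMA1 and SCHEMA2
  for t in (select owner, table_name
            from dba_tables
            where owner in ('SCHEMA1', 'SCHEMA2'))
  loop
    dbms_datapump.data_filter(l_handle, 'INCLUDE_ROWS', 0, t.table_name, t.owner);
  end loop;

  dbms_datapump.start_job(l_handle);
  dbms_datapump.wait_for_job(l_handle, l_state);
end;
/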
But I would rather just run 2 separate exports: one for SCHEMA1,SCHEMA2 with CONTENT=METADATA_ONLY, and a default export (CONTENT=ALL) for SCHEMA3.
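Something along these lines (directory and dump file names are just placeholders):

$ expdp system/*** DIRECTORY=EXP_DIR DUMPFILE=s1_s2_meta.dmp \
    SCHEMAS=SCHEMA1,SCHEMA2 CONTENT=METADATA_ONLY

$ expdp system/*** DIRECTORY=EXP_DIR DUMPFILE=s3_full.dmp \
    SCHEMAS=SCHEMA3 CONTENT=ALL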