You could do that in Oracle 11gR2 with expdp and the REMAP_DATA option.
- Create a function in a package that takes a blob as argument and returns null (or an empty blob, perhaps).
- Call expdp as usual, adding:
REMAP_DATA=SCHEMA.TAB.BLOB_COLUMN:SCHEMA.PACKAGE.YOUR_FUNCTION
Short example (schema: mat):
create or replace package remap as
  function null_lob(col blob) return blob;
end remap;
/
create or replace package body remap as
  function null_lob(col blob) return blob is
    eblob blob;  -- unused here; it could be used to return empty_blob() instead of null
  begin
    return null;
  end null_lob;
end remap;
/
create table foo(a number, b blob);
-- insert a line with a 10M blob, commit
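-- For example, one way to do that (just a sketch; any method of loading ~10 MB into the blob works):
declare
  l_blob blob;
begin
  insert into foo values (1, empty_blob()) returning b into l_blob;
  for i in 1 .. 10240 loop  -- 10240 * 1 KB = 10 MB
    dbms_lob.writeappend(l_blob, 1024, utl_raw.cast_to_raw(rpad('x', 1024, 'x')));
  end loop;
  commit;
end;
/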
select a, length(b) from foo;
         A  LENGTH(B)
---------- ----------
         1   10485760
Plain export:
$ expdp mat/*** DIRECTORY=EXP_DIR TABLES=MAT.FOO
...
Total estimation using BLOCKS method: 11.06 MB
...
. . exported "MAT"."FOO" 10.00 MB 1 rows
...
Dump file is ~11M.
With the data remap:
$ expdp mat/*** DIRECTORY=EXP_DIR TABLES=MAT.FOO \
REMAP_DATA=MAT.FOO.B:MAT.remap.null_lob
...
Total estimation using BLOCKS method: 11.06 MB
...
. . exported "MAT"."FOO" 5.398 KB 1 rows
...
Dump file is ~100k, and importing that dump does indeed load a null into the column.
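The import side needs nothing special; a sketch of what it could look like, assuming the table already exists on the target (the dump file name is the expdp default, since DUMPFILE wasn't specified above):
$ impdp mat/*** DIRECTORY=EXP_DIR DUMPFILE=expdat.dmp TABLES=MAT.FOO TABLE_EXISTS_ACTION=TRUNCATE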
Another option would be to use an external table to do the load/unload, with the ORACLE_DATAPUMP driver (I think this was introduced in 10g; see external table concepts).
For the unload, use something like:
create table foo_unload
organization external (
type oracle_datapump
default directory exp_dir
location ('foo.dmp')
)
as select a, empty_blob() b from foo;
(You can drop the external table right after creating it, that doesn't - by default anyway - delete the dump file.)
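For instance:
drop table foo_unload;
-- foo.dmp remains in the exp_dir directory on the server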
Then move the foo.dmp file to your target server & directory, and create the external table there:
create table foo_load (a number, b blob)
organization external (
type oracle_datapump
default directory exp_dir
location ('foo.dmp')
);
And finally load your target table (CTAS, or truncate ...; insert /*+ append */ ..., for instance):
create table foo as select * from foo_load;
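Or, if the target table already exists, the truncate-and-direct-path-insert variant mentioned above:
truncate table foo;
insert /*+ append */ into foo select * from foo_load;
commit;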
Drop the external loader table once that's done.
Best Answer
Yes, stay away from the sp_OA* OLE Automation procedures. If you want / need to handle this purely within SQL Server then you can do so using SQLCLR, which replaces the sp_OA* OLE Automation procedures. And if you aren't using xp_cmdshell for anything else, then there is no need to enable it just for this (although, to be fair, if using SQL Agent, then a CMD step would be able to execute BCP and/or 7zip without the need for enabling xp_cmdshell).

With SQLCLR you can:
- encrypt with a password (see RijndaelManaged and AesManaged). While you did specify AES, the MSDN documentation for AesManaged notes that AES is essentially the Rijndael algorithm with a fixed block size and iteration count.
The difference is if you are using SQL Server 2005, 2008, or 2008 R2. Those versions are limited to using .NET 3.5, and the MSDN documentation for AesManaged (for that version) states that it has a Host Protection Attribute (HPA), which means that any assembly using that class has to be marked as PERMISSION_SET = UNSAFE, whereas RijndaelManaged does not have that issue and can be used in an EXTERNAL_ACCESS assembly. However, it seems that the documentation for AesManaged shows that, starting in .NET Framework version 4.0 (SQL Server 2012, 2014, and 2016 use .NET Framework 4.0 and newer, currently up to 4.6), the note about the HPA has been removed. Even if UNSAFE is acceptable, please note that the specific HPA is MayLeakOnAbort, which means "might leak memory if the operation is terminated".

In this setup, the password is part of the encryption, not the compression. Hence you either have a zip file containing an encrypted file, or you have an encrypted file containing the zip file. The former method gives you a file that can be unzipped by anyone but then only read by someone with the correct password. The latter method requires knowing the password to get the compressed file, and also allows for better compression, since the compression will be done on a text file (with a greater likelihood of compressible patterns).
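As a point of reference for the PERMISSION_SET discussion above, the permission level is declared when the assembly is loaded into the database. A minimal T-SQL sketch (the assembly name and DLL path here are assumptions, not part of the original setup):
CREATE ASSEMBLY [ExportAndCompress]
FROM N'C:\SQLCLR\ExportAndCompress.dll'   -- path is an assumption
WITH PERMISSION_SET = EXTERNAL_ACCESS;    -- or UNSAFE if the pre-.NET 4.0 AesManaged HPA applies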
While I have not implemented the encryption algorithms yet, I have coded both the file export (of any random query) and GZip/GZunzip in SQLCLR, and can state that they do run efficiently, or at least can be made to be efficient (and can also be made to be inefficient). By this I mean I have used them to extract queries producing 5 GB text files and to GZip those 5 GB text files, and memory was never adversely affected since both operations are streamed. Please note that the .NET-provided GZipStream class does not support the Zip64 format, and so is limited to 4 GB of source data.
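Calling such functions from T-SQL is then a single statement per step. Purely as an illustration of the shape (these procedure names and parameters are hypothetical, not the actual SQL# signatures):
EXEC dbo.ExportQueryToFile @Query = N'SELECT * FROM dbo.SomeBigTable', @FilePath = N'D:\Export\data.txt';
EXEC dbo.GZipFile @SourcePath = N'D:\Export\data.txt', @DestinationPath = N'D:\Export\data.txt.gz';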
For anyone who is interested in pre-built SQLCLR functions to do the extract and compression (the ones I mentioned above), they are available in the SQL# library (which I am the author of). Please note that, due to the Zip64 limitation of the GZipStream class mentioned above, I had to incorporate a 3rd-party library that does support Zip64.