You could do that in Oracle 11gR2 with expdp and the REMAP_DATA option.
- Create a function in a package that takes a blob as argument and returns null (or an empty blob, perhaps).
- Call expdp as usual, adding:
REMAP_DATA=SCHEMA.TAB.BLOB_COLUMN:SCHEMA.PACKAGE.YOUR_FUNCTION
Short example (schema: mat):
create or replace package remap as
  function null_lob(col blob) return blob;
end remap;
/
create or replace package body remap as
  function null_lob(col blob) return blob is
    eblob blob;  -- unused here; it could be used to return empty_blob() instead of null
  begin
    return null;
  end null_lob;
end remap;
/
create table foo(a number, b blob);
-- insert a line with a 10M blob, commit
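-- For example, one way to do that (just a sketch; any method of loading ~10 MB into the blob works):
declare
  l_blob blob;
begin
  insert into foo values (1, empty_blob()) returning b into l_blob;
  for i in 1 .. 10240 loop  -- 10240 * 1 KB = 10 MB
    dbms_lob.writeappend(l_blob, 1024, utl_raw.cast_to_raw(rpad('x', 1024, 'x')));
  end loop;
  commit;
end;
/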
select a, length(b) from foo;
         A  LENGTH(B)
---------- ----------
         1   10485760
Plain export:
$ expdp mat/*** DIRECTORY=EXP_DIR TABLES=MAT.FOO
...
Total estimation using BLOCKS method: 11.06 MB
...
. . exported "MAT"."FOO" 10.00 MB 1 rows
...
Dump file is ~11M.
With the data remap:
$ expdp mat/*** DIRECTORY=EXP_DIR TABLES=MAT.FOO \
REMAP_DATA=MAT.FOO.B:MAT.remap.null_lob
...
Total estimation using BLOCKS method: 11.06 MB
...
. . exported "MAT"."FOO" 5.398 KB 1 rows
...
Dump file is ~100k, and importing that dump does indeed load a null into the column.
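The import side needs nothing special; a sketch of what it could look like, assuming the table already exists on the target (the dump file name is the expdp default, since DUMPFILE wasn't specified above):
$ impdp mat/*** DIRECTORY=EXP_DIR DUMPFILE=expdat.dmp TABLES=MAT.FOO TABLE_EXISTS_ACTION=TRUNCATE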
Another option would be to use an external table to do the load/unload, with the ORACLE_DATAPUMP driver (I think this was introduced in 10g; see external table concepts).
For the unload, use something like:
create table foo_unload
organization external (
type oracle_datapump
default directory exp_dir
location ('foo.dmp')
)
as select a, empty_blob() b from foo;
(You can drop the external table right after creating it, that doesn't - by default anyway - delete the dump file.)
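For instance:
drop table foo_unload;
-- foo.dmp remains in the exp_dir directory on the server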
Then move the foo.dmp file to your target server & directory, and create the external table there:
create table foo_load (a number, b blob)
organization external (
type oracle_datapump
default directory exp_dir
location ('foo.dmp')
);
And finally load your target table (CTAS, or truncate ...; insert /*+ append */ ..., for instance):
create table foo as select * from foo_load;
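Or, if the target table already exists, the truncate-and-direct-path-insert variant mentioned above:
truncate table foo;
insert /*+ append */ into foo select * from foo_load;
commit;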
Drop the external loader table once that's done.
Best Answer
Yes, stay away from the sp_OA* OLE Automation procedures. If you want / need to handle this purely within SQL Server then you can do so using SQLCLR, which replaces the sp_OA* OLE Automation procedures. And if you aren't using xp_cmdshell for anything else, then there is no need to enable it just for this (although, to be fair, if using SQL Agent, then a CMD step would be able to execute BCP and/or 7zip without the need for enabling xp_cmdshell).

With SQLCLR you can:
- encrypt with a password (see RijndaelManaged and AesManaged). While you did specify AES, the MSDN documentation for AesManaged notes that AES is essentially the Rijndael algorithm with a fixed block size and iteration count.
The difference is if you are using SQL Server 2005, 2008, or 2008 R2. Those versions are limited to using .NET 3.5, and the MSDN documentation for AesManaged (for that version) states that it has a Host Protection Attribute (HPA), which means that any assembly using that class has to be marked as PERMISSION_SET = UNSAFE, whereas RijndaelManaged does not have that issue and can be used in an EXTERNAL_ACCESS assembly. However, it seems that the documentation for AesManaged shows that, starting in .NET Framework version 4.0 (SQL Server 2012, 2014, and 2016 use .NET Framework 4.0 and newer, currently up to 4.6), the note about the HPA has been removed. Even if UNSAFE is acceptable, please note that the specific HPA is MayLeakOnAbort, which means "might leak memory if the operation is terminated".

In this setup, the password is part of the encryption, not the compression. Hence you either have a zip file containing an encrypted file, or you have an encrypted file containing the zip file. The former method gives you a file that can be unzipped by anyone but then only read by someone with the correct password. The latter method requires knowing the password to get the compressed file, and also allows for better compression, since the compression will be done on a text file (with a greater likelihood of compressible patterns).
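As a point of reference for the PERMISSION_SET discussion above, the permission level is declared when the assembly is loaded into the database. A minimal T-SQL sketch (the assembly name and DLL path here are assumptions, not part of the original setup):
CREATE ASSEMBLY [ExportAndCompress]
FROM N'C:\SQLCLR\ExportAndCompress.dll'   -- path is an assumption
WITH PERMISSION_SET = EXTERNAL_ACCESS;    -- or UNSAFE if the pre-.NET 4.0 AesManaged HPA applies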
While I have not implemented the encryption algorithms yet, I have coded both the file export (of any random query) and GZip/GZunzip in SQLCLR, and can state that they do run efficiently, or at least can be made to be efficient (and can also be made to be inefficient). By this I mean I have used them to extract queries producing 5 GB text files and to GZip those 5 GB text files, and memory was never adversely affected since both operations are streamed. Please note that the .NET-provided GZipStream class does not support the Zip64 format, and so is limited to 4 GB of source data.
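Calling such functions from T-SQL is then a single statement per step. Purely as an illustration of the shape (these procedure names and parameters are hypothetical, not the actual SQL# signatures):
EXEC dbo.ExportQueryToFile @Query = N'SELECT * FROM dbo.SomeBigTable', @FilePath = N'D:\Export\data.txt';
EXEC dbo.GZipFile @SourcePath = N'D:\Export\data.txt', @DestinationPath = N'D:\Export\data.txt.gz';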
For anyone who is interested in pre-built SQLCLR functions to do the extract and compression (the ones I mentioned above), they are available in the SQL# library (which I am the author of). Please note that, due to the Zip64 limitation of the GZipStream class mentioned above, I had to incorporate a 3rd-party library that does support Zip64.