Sql-server – Export all tables in database using SQLCMD

sql-server-2008sqlcmd

I'm looking to export all tables from a SQL Server Standard Edition database with data into CSV files onto a remote linux server. I setup FreeTDS and am able to connect, do SQLCMD or bcp successfully.

I know we can use SQLCMD but I was wondering if there is a way that we can write each table into its individual CSV file using a single connection instead of looping through every table in the database and writing to a CSV file.

Any thoughts if its possible?

Thanks in advance,
simak

Best Answer

I looked into doing this years ago and looked into both BCP and SSIS. Neither one really provided for great flexibility. BCP needed format files to get the text-qualifiers, etc. But this was for a dynamic backup solution for a subset of tables to get archived nightly. SSIS seemed a bit more dynamic (though not fully), but couldn't do a few of the things that BCP could. I needed something that could dynamically accept changes to tables (include new tables, don't require removed tables) AND changes to columns (as I needed to do "SELECT * FROM {SchemaName}.{TableName};") AND dynamic text-qualification AND column headers, etc.

If you have a static set of tables, then SSIS, using a ForEach container, would be one way to do this (again, assuming the desire is to stay away from BCP due to requiring multiple calls).

Using PowerShell would probably also be an option that would allow for more flexibility than SSIS.

However, if you need something fully dynamic and/or need to stick with a single connection per each DB (all tables in that DB) or even a single connection per server (all tables for all databases), then the only way I have ever found to handle this is the one that I created. It is a SQLCLR stored procedure called DB_BulkExport which has parameters for @Query and @FilePath that allow for easily cycling through a list of all databases, then per each one cycling through a list of tables, and building both the query and filename from those pieces. You could write a fairly simple stored proc that uses a cursor to get the Database and Table names and call this proc, hence being a single call to SQLCMD.EXE to call that one proc. Or it can be scheduled via SQL Agent without needing to enable xp_cmdshell.

This procedure -- DB_BulkExport -- is part of the SQL# library (which I am the creator of), though it is only available in the Full (i.e. paid for) version, not in the Free version. I generally do not mention stuff here that is not available in the Free version, but as I said, I am not aware of any other existing means of automating this type of process. Of course, anyone is free to write this themselves, either as a SQLCLR proc as I have done, or as a stand-alone .NET app, or using PowerShell. It is more than a few lines of code in C# and I suspect it is also not overly simplistic in PowerShell either (but I haven't tried and don't have enough experience with PowerShell to really know).

EDIT:
Regarding how I have used the above mentioned stored proc in at least one instance: It was used to export data nightly, for long-term archival, the oldest day from a large table that was itself a common, short-term archival of data from 20 source servers.

It would extract maybe 10 million rows for a given date
The table held 150 - 450 million rows (fluctuation due to several factors)
It would grab all records for the oldest date and export them to a delimited file
1 day was 5 million rows (avg)
Each row was 1240 bytes (avg; 2 URLs plus a few other text fields, etc)
The file was 3 GB - 6 GB depending on the day
It exported to a local temp folder so that it didn't need to go across the network.
The archival process then:
- used the File_GZip function -- which can handle zip64 (i.e. source files over 4 GB) -- to compress the text file down to 1 GB - 2 GB (no, it didn't take much memory; buffer is 8k)
- used File_Copy to copy the .gz file to a networked archive folder
- used File_Delete to remove the local file(s)
- If the above steps completed successfully, the date of the data that was just archived was placed into a queue table for another SQL Agent job to pick up and incrementally (in small batches as the table was constantly being filled from the 20 source servers) delete the records from the archive table.
  
  So to amend the description above of the archival process: it would extract the oldest date that was not in this queue table.
- Once all of the records had been deleted the date being deleted would be removed from the queue table.

Ideally there would be a local drive that would not have pagefile, mdf, or ldf files on it that could be used for the extraction and compression. For the long-term archival process noted above, that server was a dedicated archive that just collected data for a few tables from the 20 source servers. Hence the only querying of the data 99% of the time was the SELECT * FROM ArchiveTable to do the extract; the other 1% of the time it was research.

Related Solutions

Sql-server – SQLCMD Exporting 16.5 Million Rows

You'll probably want to do this with bcp, which is intended for this kind of thing. It would be something like this:

bcp "SELECT ROW_KEY, ISNULL(CARRIER_ID, ''),ISNULL(CONVERT(VARCHAR(19),ORDER_DATETIME,120),''),ISNULL(HEIGHT, ''),ISNULL(RESIDENTIAL_CHARGE, '') FROM MYTABLE" queryout OUTPUT_FILE.txt -c -S <servername> -U <login_name> -P <password>

Replace -c with -w if you're dealing with nvarchar/nchar, and want Unicode output.

Sql-server – SQL Server DR, mirroring or virtualise and mirror

I would implement Log Shipping, which is just a process of taking transaction log backups at one site, and restoring them at another site. This operates at the database level, and it will inherently handle all the schema changes because backups operate at a level above that.

There's really no other sane solution on Standard Edition because the requirement of having an asynchronous process severely limits the available options. (If you're just getting started in this area, don't even think about any kind of SQL Server replication.)

Note also that as far as I can tell (and I could be wrong about this), Windows Server 2012 storage virtualization (aka Storage Spaces) is meant as an enhanced RAID controller given a set of local disks; it's not for replicating virtual machines across data centers. I believe to do that requires specific storage hardware that supports it, and I don't think you're looking to go down that path right now.

I dont want to put much more stress onto the SQL server as it is quite busy with both reads and updates.

Quantify this. What kind of options do you have to upgrade the hardware (I'm assuming it's several years old at this point)? Is the bottleneck CPU or I/O?

Regardless, if you're legitimately maxing out the available resources right now, virtualizing is a bad idea, as it adds inherent overhead to the system. Virtualization is great, and it may be appropriate here for other reasons, but definitely not if you're concerned about performance on your current physical hardware.

I don't want the primary server waiting on the secondary server to update before it commits a write.

I'm assuming this is because there's a slow link between the sites? If that's the case, and you want to protect against hardware failure, consider implementing a synchronous site-local mirror (on a separate physical box using Database Mirroring) for high availability, and then an asynchronous remote site using Log Shipping for disaster recovery.

Any kind of mirroring or replication is going to add some latency or load to the primary server. How much depends on the hardware and the workload. This is the tradeoff you have to make to protect against failures. It isn't completely free, but with proper hardware and design, it can be minimized to the point that it's irrelevant.

Best Answer

Related Solutions

Sql-server – SQLCMD Exporting 16.5 Million Rows

Sql-server – SQL Server DR, mirroring or virtualise and mirror

Related Question