SQL Server Data Upload – Best Data Upload Formats

csvdate formatsql server

Most of our automatically generated files contain integers, decimals, date, time, and sometimes varchar. This is instrumentation data. Since we will be uploading those files directly to a SQL Server database, we were trying to decide what is the most efficient file format for automatic generation from our instrumentation.

What is the most efficient file format for uploading data to a SQL Server database? I thought it would be Text format but when I did an experiment with CSV, CSV tends to take less space and is also easier for users to analyze.

Data file size would vary from 50K to 250K. Only a few range in MBs.

I am working on creating an application that would upload data into the db. We use LabView for our instrumentation data, so that is where we want the standardized file format.

I am leaning towards CSV. I do not have much experience with XML, so I might consider it that if the advantages outweigh those of CSV by a lot. Readability is my concern only in terms of quality control. There might be cases where the user would want to change or fix some data before they upload. Speed is also important but with such small file sizes, I am not too worried about it.

Best Answer

Generally, CSVs are the smallest file size for non-binary (text) data. XML adds a great deal of xml codes that bloat the file size. So CSVs tend to import faster. But XML importers can handle exception cases, etc., with more complexity.

In my career, I've worked with instrument imports at four different kinds of laboratories. Where there was no direct LIMS (Laboratory Information Management System)-to-instrument interface system in place, we found that delimited text files are the fastest to write working import scripts around.

CSV is a comma-delimited text file. If your input data will include commas, then using a different delimiter may make more sense, like a pipe character for example: | A CSV file or pipe-delimited file generally follows the rule that each row/line of text from column 1 to the end-of-line carriage return represents one row of sample data in your database. Then within that row, each column of data in your database is separated from other columns by your delimiter character(s), which could be commas or pipes or anything you want to use.

Text files are great as long as the data is consistent over time.