Get SSIS to check if excel file is updated

excelssis

So I have a SSIS package that reads from an excel file and populates a Sql server database. It assumes that the file will be updated monthly, and consequently, the data pull to update it is also done monthly.

However, as this excel file needs to be updated manually, there may be a case in which the user forgets to re-upload a new excel file. Naturally, I want to detect if that is the case. How would I be able to do this?

The best that I could think of was to use a script task to alter the excel file's sheet name to something like "already processed" after the data pull, and check for that condition at the beginning of the task. However, this seems like a work around solution that's overly complicated. Any ideas for better ways to do this?

Best Answer

You could have a meta data table that stores the file name, processing date, and last modified date. Then you can have your job run daily and identify whether the file's last modified date is greater than the stored value.

If the file hasn't been modified in 32 (arbitrary number but base it on business tolerance) send a notification to you and/or the business process owner saying no new file detected.

This won't cover cases where they've opened the file, changed nothing but clicked save none-the-less (or Save As and overlaid the file). At that point, you'd get to write actual change detection. Which could be a weird kind of fun but surely there's more important priorities for the business.

I'd advise against changing the tab name. Lots of reasons for it but the easiest one to enumerate is technical/license based. Programmatically renaming an Excel worksheet is an easy technical change. But, to overcome a business process deficit, you're going to take a dependency on the Excel object library. You get to read/write to Excel via SSIS for free. Reading/Writing to Excel from a Script Task is going to use the COM interop functionality. To use that, you need to have Excel installed. Excel on a desktop, yes. Excel on a Server, no (it kills your uptime + costs you an license for the Office installation)

Yes, I am quite aware of nuget packages, the ACE library etc that are free and can provide interactivity with Excel but then that gets us into server installation, GAC requirements, 32 vs 64 bittedness discussions etc.