Excel Power Query Load Times – Why do dependent queries reload CSV file data

csvmicrosoft excelpower-query

I have a series of queries in Excel which all depend on an initial query which loads data from CSV files, processes them and loads them into a table in the worksheet. The remaining queries all use worksheet tables as a source and don't utilize external datasources.

When I reload them individually them seem to take longer than they should and appear to be accessing the CSV files themselves.

I have all of the global and data settings for query to optimize loading (fast load, turn off background, etc.) based on some internet research and the original query loading/processing is acceptable given the amount of data. It would just seem that once I have the processed data in a table in the workbook that queries of that table and others would not require going back to datasource files.

Just wondering if I am missing something here. Thanks!

Best Answer

When a query references another query as a Source (or in a Merge or Append step), that entire Source query logic will run every time the referencing query is refreshed. If you have 5 queries referencing the same Source query, the Source query logic will be run 5 times, in parallel (by default). It's obviously inefficient.

In an Excel scenario, the pattern to avoid that inefficiency is to set the Source query to Load To an Excel table. Then the referencing queries can be pointed at that Excel table as their Source.

That does leaves you with a challenge to co-ordinate the refreshes, as hitting the Refresh All button risks inconsistent results. I typically mitigate that using Query Groups or by recording a refresh Macro.

Related Question