Excel – How should I design time series data for ease-of-use by Excel Pivot Tables

microsoft excelpivot table

I have an Excel spreadsheet with several years of daily data in the following form:

Date       | Category1   | Category 2  | ...   | Category30
------------------------------------------------------------
dd/mm/yyyy |  value      |  value      | ...   |  value 
...        |  ...        | ...         | ...   | ...
------------------------------------------------------------

I also have a list in the following form, where each of the categories from the first table is assigned one of five Type descriptions:

| Type  | Category   |
----------------------
| Type1 | Category1  |
| Type2 | Category2  | 
| Type2 | Category3  | 
| ...   | ...        | 
| Type5 | Category30 |
----------------------

I want to be able to organize and summarize these data using a Pivot Table, and plot the time series using a Pivot Chart. However, because of the way the data are tabulated I cannot group them by category, type or sequential month (I am interested in charting the data by month/year instead of grouping them by only months across years).

If I ordered the data like in the following table, all my problems would be solved, but I would be duplicating several column's values as many times as there are categories (and I am very much opposed to duplicating data):

Date        | Year  | Month  | Day  | Type    | Category    | Values
--------------------------------------------------------------------------
date(i)     | yyyy  | mmm    | dd   | Type(j) | Category(k) | value(i,j,k)
--------------------------------------------------------------------------

My question is, how should I design my table so that I can quickly handle the data with pivot tables without having to duplicate all the other information?

Best Answer

For a PivotTable to work as expected, you must arrange your data like a database: a row is a record, and a column is a field providing information on that record.

So first off, having several columns containing multiple values for a specific date appears wrong by my understanding of the situation. That translates to having many records (values) crammed into one record. By definition, having several columns for different categories prevent you from using this data meaningfully in a PivotTable.

If you can modify your source data on a permanent basis, here's a suggested layout:

 | Date       | Category | Value
 | 01/01/2011 | 1        | x
 | 01/01/2011 | 2        | y
 | 01/01/2011 | 25       | z
 | 01/02/2011 | 1        | x

To which you can append the TYPE by using a VLOOKUP formula that fetches the proper type for your category to avoid duplication and automatically reflect any change in your TYPE table. (Note that this assumes inverted columns in your type table: CATEGORY | TYPE for VLOOKUP to work)

 | Date       | Category | Type                           | Value
 | 01/01/2011 | 1        | VLOOKUP(B2, 'TypeTable', 2, 0) | x
 | 01/01/2011 | 2        | VLOOKUP(C2, 'TypeTable', 2, 0) | y
 | 01/01/2011 | 25       | VLOOKUP(D2, 'TypeTable', 2, 0) | z
 | 01/02/2011 | 1        | VLOOKUP(E2, 'TypeTable', 2, 0) | x

Such a table would contain lots of rows but feeding this to a PivotTable will allow you to manipulate it in every way: by date, by category, by type.

In fact, your source data would be a good RESULT of a PivotTable summarizing values by date with a column label on category! And God knows it can be hard to take a PivotTable result and translate it back to data that can then be used to make another PivotTable. Solutions including manual labor, macros, or a bunch of intermediate formulas.

Hope that helps.

Related Question