Excel – Find the previous Row with a specific value at in column A

microsoft excelmicrosoft-excel-2010worksheet-function

Sample sheet with and without formulae

Given the above sample sheet, which has semantic 'header', 'subheader' and 'subtotal' rows, I am trying to determine a formula to locate the previous subheader row relative the current cell. For example, if the formula were entered in F5, it would locate row 2 and, if entered into F17, it would locate row 13.

The rows are conditionally formatted as a header, subheader or subtotal, by the existence of values H, S or T in column $A:$A, i.e subheader row n is a row where $An = "S". Now, I would like to extend this concept to my formulae.

A header row will always be followed by a subheader (so I do not to have to worry about headers and subheaders that are out of order).

I have tried the following:

  1. =MATCH("S", OFFSET($A5, 0, 0, -ROW($A5), 1), 0)

    This always returns row 2, because MATCH returns the first match, in the set, and I can't limit the OFFSET height (i.e. recursively, because the previous subheader location is unknown);

  2. {=LARGE(MATCH("S", OFFSET($A5, 0, 0, -ROW($A5), 1), 0), 1)}

    This also returns 2, because, even in array context (i.e. with Ctrl+Alt+Enter), MATCH still just returns the first result;

  3. =LARGE(IF(OFFSET($A5, 0, 0, -ROW($A5), 1)="S", ROW(OFFSET($A5, 0, 0, -ROW($A5), 1))), 1)

    This returns 0, because IF is not expecting an array here, so expands OFFSET($A5, 0, 0, -ROW($A5), 1) to a single value 0, which does not match "S", and LARGE treats FALSE as a number;

  4. {=LARGE(IF(OFFSET($A5, 0, 0, -ROW($A5), 1)="S", ROW(OFFSET($A5, 0, 0, -ROW($A5), 1))), 1)}

    This returns #VALUE, because array expansion occurs too early, which leaves -ROW($A5) as array -{5}, which is not a valid numeric height parameter to OFFSET (I wanted the IF(OFFSET(...)="S",...) bit to be an array, not the -ROW($A5) bit, but Excel cannot distinguish).

I am currently targeting Excel 2010. Earlier versions are not applicable (although forward-compatibility is a bonus). I am trying to avoid VBA, since it is more difficult for me to distribute *.xlsm files than *.xlsx (besides, I already know how to do it with VBA).

Are there any other things I can try?

Best Answer

The easy way to do this is to cheat and use a mixed absolute/relative formula. This is an array formula (enter with CTRL+SHIFT+ENTER) entered into cell B4 but it could go anywhere in row 4. It will return the row number of the one marked S.

=MAX(IF($A$1:A4="S",ROW($A$1:A4)))

When copied down, the second part of the reference B4 and A4 will increase. This ensures that you get the row with the largest match that is above the current row. You can enter those formulas more quickly using F4 after typing/selecting the relevant range. This will cycle the dollar signs through all the choices.

Picture of ranges

picture of data and result

Used to replace your formulas

After reading the question a bit (and based on the edit by @SteveTaylor), it seems your use for this is to update your formulas. You can use the row that is returned from above along with INDEX to get ranges of data to sum. I see 2 formulas that can be replaced:

  • Total calculation for each labeled row of data. In this case, the subtotal row above can be referenced dynamically.
  • Total calcualtion for the substotal row. In this case the values to sum from above can be reference dynamically.

For the single row data, you can use the formula, starting in F3 as an array formula. Note that I switched over to using SUMPRODUCT which makes it much easier to go to more than 2 columns.

=C3*SUMPRODUCT(INDEX(D:E,MAX(IF($A$1:A3="S",ROW($A$1:A3))),),D3:E3)

For the total row formula, you can use, starting in F11, again array formula:

=SUM(F10:INDEX($F$1:F10, 1+MAX(IF($A$1:A11="S",ROW($A$1:A11)))))

If you want one formula to rule them all! then you can combine these into a nested IF based on the value in column A. Here is said array formula, starting in F2 which can be copied down.

=IF(
  A2="S", 
  SUM(D2:E2), 
    IF(A2="T", 
      SUM(F1:INDEX($F$1:F1, 1+MAX(IF($A$1:A2="S",ROW($A$1:A2))))), 
      C2*SUMPRODUCT(INDEX(D:E,MAX(IF($A$1:A2="S",ROW($A$1:A2))),),D2:E2)))

This formula does not differentiate between a blank row and a "data" row. It currently returns 0 for the spacer row which is fine.

Picture of results of and formulas for two blocks of your data.

enter image description here

enter image description here

Related Question