Excel – Remove URL Parameter in Excel

microsoft excelmicrosoft-excel-2007string manipulation

I have a URL of the following form:

http://www.example.com/page.html?param1=asdf&param2=asdfg&param3=asdfgh

I want to extract the value of param2 or 3 from the URL as well as remove that particular param from the URL. Any ideas on how to do this using Excel?

Best Answer

I don't believe Excel has built-in functions for handling URLs, so you'll have to resort to creatively combining regular string manipulation functions like LEN(), MID() and SEARCH().

Assuming your URL is in cell A1 and the parameter whose value you want to extract is in cell B1, try the following formulae.

To get the value of the given parameter:

=IFERROR(MID(A1; SEARCH(B1 & "="; A1) + LEN(B1) + 1; IFERROR(SEARCH("&"; A1; SEARCH(B1 & "="; A1)) - SEARCH(B1 & "="; A1) - LEN(B1) - 1; LEN(A1))); "")

To remove the parameter and its value from the URL:

=IFERROR(REPLACE(A1; SEARCH(B1 & "="; A1); IFERROR(SEARCH("&"; A1; SEARCH(B1 & "="; A1)) - SEARCH(B1 & "="; A1) + 1; LEN(A1)); ""); A1)

Note: depending on your regional settings, you may have to replace all the semicolons in the formulae with commas.


Detailed explanation

To get the value of the given parameter, we first need to locate it within the URL:

=SEARCH(B1 & "="; A1)

The SEARCH() function locates the first parameter (the given parameter) within the second parameter (the URL) and returns the number of the starting position where it occurs. Note that we append the equal sign to the name of the parameter, to be sure we're searching for the correct thing. Otherwise searching for param1 might instead return the location of, say, param10 if it occurs earlier in the URL.

With the parameter found, we need to return the portion (or substring) of the URL starting from where the parameter's value begins and ending right before the next ampersand. To do that, we use the MID() function, which takes three parameters: the string from which to return the substring, the position at which to start, and the number of characters to return.

=MID(A1; SEARCH(B1 & "="; A1); LEN(A1))

We're also using the LEN() function, which simply returns the length of the given string (in this case the URL). This is just a placeholder for now (the third parameter didn't become optional until Excel 2010), but will come in handy later when we want the value of the last parameter.

First we need to move the starting position from the beginning of the parameter itself to where its value begins. To do so, we add to the parameter's location within the string its length, as well as 1 (for the = sign).

=MID(A1; SEARCH(B1 & "="; A1) + LEN(B1) + 1; LEN(A1))

That's better, the value returned starts at the correct location. To get it to stop at the correct location, we need to locate the next ampersand sign, which signifies where the next parameter starts.

=SEARCH("&"; A1; SEARCH(B1 & "="; A1))

We're using the SEARCH() function again, this time adding a third parameter specifying the position to start the search at - we're not interested in ampersands preceding the given parameter, only those after it.

To get the length of the parameter's value from the above, we need to subtract the position where the parameter begins, the length of the parameter, and again 1 for the = sign.

=SEARCH("&"; A1; SEARCH(B1 & "="; A1)) - SEARCH(B1 & "="; A1) - LEN(B1) - 1

This works for all parameters except the last, because there's no trailing ampersand at the end of the URL. For that case, we can use the IFERROR() function to detect the error that Excel gives, and return some fixed number instead. A good choice is LEN(A1) - the length of the string is guaranteed to be greater than the length of any of its substrings, and if we pass this as the third argument to MID(), it'll return all characters from the given position to the end of the string, which is just what we need.

=IFERROR(SEARCH("&"; A1; SEARCH(B1 & "="; A1)) - SEARCH(B1 & "="; A1) - LEN(B1) - 1; LEN(A1))

Combining this with the above, we get the following formula:

=MID(A1; SEARCH(B1 & "="; A1) + LEN(B1) + 1; IFERROR(SEARCH("&"; A1; SEARCH(B1 & "="; A1)) - SEARCH(B1 & "="; A1) - LEN(B1) - 1; LEN(A1)))

One last thing: if B1 contains a parameter that doesn't exist in the URL, the above returns a #VALUE error. To return an empty string (or any other appropriate value) in such a case, wrap the whole thing in another IFERROR():

=IFERROR(MID(A1; SEARCH(B1 & "="; A1) + LEN(B1) + 1; IFERROR(SEARCH("&"; A1; SEARCH(B1 & "="; A1)) - SEARCH(B1 & "="; A1) - LEN(B1) - 1; LEN(A1))); "")

The formula to remove the parameter and its value from the URL is pretty similar, using the REPLACE() function to replace a given substring of the URL (which is defined more or less the same way as above) with an empty string.

You can probably streamline the formulae a bit by moving commonly used blocks (like SEARCH(B1 & "="; A1)) to their own columns and referencing those cells instead of typing out the formula several times. Those additional columns can then be hidden from view.

Related Question