I am trying to extract a value from a long string that may change over time. For example, the string could look something like this:


What I want to extract is the value between filename- and .tar.gz, essentially the file version. I need to do it this way because I may run the command again later, and by then the value may be something entirely different.

How can I do this? I'm currently only using grep, but I wouldn't mind using other utilities such as sed, awk, cut or whatever. To be perfectly clear, I need to extract only the file version part of the string. The string is very long (on both sides), so everything else needs to be cut out somehow.

With grep -P/pcregrep, using a positive look-behind and a positive look-ahead:

grep -P -o '(?<=STRING1).*?(?=STRING2)' infile

In your case, replace STRING1 with filename- and STRING2 with \.tar\.gz (the dots are escaped so that they match literal dots rather than any character).
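As a quick illustration, here is a minimal sketch of that substitution. The sample line and the version 1.2.3 are made up for the example (the real string in the question is much longer), and extract_version is just a hypothetical wrapper name. It assumes a grep built with PCRE support, such as GNU grep.

```shell
# Hypothetical sample line; only the filename-...tar.gz part matters here.
sample='GET http://example.com/pub/filename-1.2.3.tar.gz HTTP/1.1'

# extract_version is an illustrative helper name, not a standard utility.
# -o prints only the matched text; -P enables the look-behind/look-ahead.
extract_version() {
    grep -oP '(?<=filename-).*?(?=\.tar\.gz)'
}

printf '%s\n' "$sample" | extract_version
```

Because `.*?` is non-greedy, each match stops at the first .tar.gz that follows filename-, so several occurrences on one line are printed on separate lines.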

If you don't have access to pcregrep and/or if your grep doesn't support -P you can do this with your favourite text processing tool. Here's a portable way with ed that gives you the same output:

ed -s infile <<\IN

How it works: a newline is prepended to each STRING1 occurrence (so now there's at most one occurrence per line) then all lines not matching STRING1.*STRING2 are deleted; on the remaining ones we only keep what's between STRING1 and STRING2 and print the result.
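The same three steps can also be sketched as a sed and grep pipeline. This is not the ed script above, only an approximation of the described logic using different tools, with STRING1 and STRING2 hard-coded to filename- and \.tar\.gz. The name extract_all is hypothetical.

```shell
# extract_all is an illustrative helper name; it reads text on stdin.
# Step 1: insert a newline before every "filename-", so each resulting
#         line contains at most one occurrence of it.
# Step 2: keep only the lines where "filename-" is followed by ".tar.gz".
# Step 3: reduce each remaining line to the text between the two markers.
extract_all() {
    sed 's/filename-/\
filename-/g' |
    grep 'filename-.*\.tar\.gz' |
    sed 's/^filename-\(.*\)\.tar\.gz.*/\1/'
}

printf '%s\n' 'x filename-1.2.3.tar.gz y filename-4.5.tar.gz z' | extract_all
```

Note that `\(.*\)` is greedy here, so if a single filename- is followed by more than one .tar.gz before the next filename-, the capture extends to the last of them.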

