How to get the text between two words specified by their indices

text processing

Using awk, I can print the words at given indices as follows:

$ echo "The quick brown fox jumps over the lazy dog" | awk  '{print $3, $7}'
brown the

But I also want the text between the specified words, "brown" and "the", so the output should look like this:

brown fox jumps over the

It's not necessary to use awk specifically, but the indexing and tokenization of words should match awk's, to stay consistent with the other parts of my shell script that use awk.

I thought about printing the words from the first index to the last index, but that doesn't preserve the original whitespace between the words.
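To illustrate the problem, here is a minimal sketch of that attempt. The function name print_fields_joined and its two arguments are made up for this example, and the input deliberately has three spaces after "fox":

```shell
# Hypothetical helper: print fields $1..$2 of each input line,
# rejoined with single spaces (the original spacing is lost).
print_fields_joined() {
    awk -v first="$1" -v last="$2" '{
        out = $first
        for (i = first + 1; i <= last && i <= NF; i++)
            out = out " " $i
        print out
    }'
}

printf '%s\n' "The quick brown fox   jumps over the lazy dog" | print_fields_joined 3 7
# prints: brown fox jumps over the
```

The three spaces between "fox" and "jumps" collapse into one, which is exactly what I want to avoid.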

To put this in a more complicated but more accurate way: I want the text that begins at the start of the word at one index and ends at the end of the word at another index. How can I achieve that (preferably without bash loops)?

Best Answer

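With gawk, you can use the four-argument form of split() (a gawk extension) to capture both the fields and the separators between them. Passing " " as the separator makes split() tokenize the same way as awk's default field splitting, so leading whitespace does not produce an empty first element:

$ echo "The quick brown fox   jumps over the lazy dog" | gawk '{ n = split($0, a, " ", s); last = (n < 7 ? n : 7); for (i = 3; i <= last; i++) printf "%s%s", a[i], (i < last ? s[i] : "\n") }'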
brown fox   jumps over the
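
If gawk is not available, roughly the same result can be sketched in POSIX awk. This is a different technique from the split() approach above: it walks the record with match(), records where the first requested field starts and where the last one ends, and prints that slice of $0 unchanged. The function name between_fields and its two arguments are hypothetical, and the sketch assumes the default field separator (runs of spaces and tabs).

```shell
# Hypothetical helper: print the text from the start of field $1
# to the end of field $2, keeping the original whitespace.
between_fields() {
    awk -v first="$1" -v last="$2" '{
        rest = $0    # part of the record not yet scanned
        pos = 0      # number of characters of $0 already consumed
        start = 0    # 1-based offset where field "first" begins
        stop = 0     # 1-based offset where the latest field ends
        for (i = 1; i <= NF && i <= last; i++) {
            match(rest, /[^ \t]+/)            # next run of non-blanks is field i
            if (i == first) start = pos + RSTART
            stop = pos + RSTART + RLENGTH - 1
            pos = stop
            rest = substr(rest, RSTART + RLENGTH)
        }
        # Nothing is printed when the record has fewer than "first" fields.
        if (start) print substr($0, start, stop - start + 1)
    }'
}

printf '%s\n' "The quick brown fox   jumps over the lazy dog" | between_fields 3 7
# prints: brown fox   jumps over the
```

If the last index is larger than the number of fields, the output simply runs to the end of the final field.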