Bash – Append something to each list in a file

bashgrepregular expressionsedshell-script

I have a file, lists.txt, that looks like this:

// stuff at beginning of file

var list1 = new Array();
i = 0;
list1[i++] = 'a';
list1[i++] = 'b';
...
list1[i++] = 'z';

var list2 = new Array();
i = 0;
list2[i++] = 'a';
list2[i++] = 'b';
...
list2[i++] = 'z';

// other stuff at end of file

I need to append to each of these lists (there are more than two of them) and end up with something like this:

var list1 = new Array();
i = 0;
list1[i++] = 'a';
list1[i++] = 'b';
...
list1[i++] = 'z';
list1[i++] = 'something new';

var list2 = new Array();
i = 0;
list2[i++] = 'a';
list2[i++] = 'b';
...
list2[i++] = 'z';
list2[i++] = 'another thing';

// other stuff at end of file

I've been wracking my brain on this for a while. I know how to get the last occurrence of each list:

list1_last=$(grep "list1\[i++\]" lists.txt | tail -1)
list2_last=$(grep "list2\[i++\]" lists.txt | tail -1)

I know how to get everything between the start of the first list and the start of the second list (inclusive):

list1=$(sed -n '/var list1/,/var list2/p' lists.txt)

I know I can get list1 without the first line of list2 with this Perl one-liner or this crazy sed script.

But I'm having a hard time putting all the pieces together. How should I do this?

Edit

The additional values I want to append are in another file, additional-values.txt, which for example contains:

list1[i++] = 'something new';
list2[i++] = 'another thing';

I guess you could say I'm trying to merge the two files.

Edit 2

The actual file looks more like this:

// comment
// comment
// ...
var foo = "bar";

// comment
// comment
// ...
var i= 0;

// comment
// comment
// ...
var GoodDomains = new Array();
i=0;
GoodDomains[i++] = "anything.com";  // comment
GoodDomains[i++] = "something.com"; // comment
...
GoodDomains[i++] = "lastthing.com"; // comment
// THIS IS WHERE I WANT TO INSERT SOMETHING

// comment
// comment
// ...
var BadDomains = new Array();
i=0;
BadDomains[i++] = "anything.com";  // comment
BadDomains[i++] = "something.com"; // comment
...
BadDomains[i++] = "lastthing.com"; // comment
// THIS IS WHERE I WANT TO INSERT SOMETHING

// more lists, including GoodHosts, GoodURLs, etc.

// comment
// comment
// ...
for (i in GoodDomains) {
    ...
}

// loop through BadDomains, GoodHosts, GoodURLs, etc.

// comment
// comment
// ...
function IsNumIpAddr(host) {
    ...
}

I originally posted a simplified version because

  1. I'm not sure if the actual file will always follow this format (comments at the top, variable declarations, more comments, list definitions, functions, etc.)
  2. I'd like to find a generic solution to the problem (appending stuff to lists in the middle of a file)

Sorry if this was misleading.

Best Answer

Since you're trying with sed ranges, here's one possible way to do it. The lines in your additional-values.txt follow the same pattern:

KEY[i++] = 'VALUE'; //etc

and as far as I can tell, each line should be inserted in a range that is always delimited by

var KEY = new Array();

and an empty line


so you could process additional-values.txt and turn it into a sed script that for each line does:

/^var KEY = new Array();/,/^$/{
/^$/ i\
KEY[i++] = 'VALUE'; // etc
}

that is, in /^var KEY = new Array();/,/^$/ range, insert line KEY[i++] = 'VALUE'; // etc before the empty line. You then use the script to process lists.txt:

sed 's/\\/&&/g' additional-values.txt | \
sed 's|^\([^[]*\).*|/^var \1 = new Array();/,/^$/{\
/^$/ i\\\
&\
}|' | sed -f - lists.txt

The first sed escapes any backslashes, the second sed processes additional-values.txt turning it into a script that is used by the third sed (via -f) to process lists.txt.
e.g. sample additional-values.txt content:

GoodDomains[i++] = '^stuff/here/'; \
BadDomains[i++] = '%XYZ+=?\\<>';
GoodNetworks[i++] = '|*{};:\'; // Malware\\
BadDomains[i++] = '\$.|&$@"#"!||';

the result of:

sed 's/\\/&&/g' additional-values.txt | \
sed 's|^\([^[]*\).*|/^var \1 = new Array();/,/^$/{\
/^$/ i\\\
&\
}|'

is

/^var GoodDomains = new Array();/,/^$/{
/^$/ i\
GoodDomains[i++] = '^stuff/here/'; \\
}
/^var BadDomains = new Array();/,/^$/{
/^$/ i\
BadDomains[i++] = '%XYZ+=?\\\\<>';
}
/^var GoodNetworks = new Array();/,/^$/{
/^$/ i\
GoodNetworks[i++] = '|*{};:\\'; // Malware\\\\
}
/^var BadDomains = new Array();/,/^$/{
/^$/ i\
BadDomains[i++] = '\\$.|&$@"#"!||'; 
}

this is then passed to sed -f - lists.txt so with e.g. sample lists.txt:

// Counter Variable to initalize the arrays.
var i= 0;

var GoodDomains = new Array();
i=0;
GoodDomains[i++] = 'aba.com'; // Phish - 2010-02-05

var GoodNetworks = new Array();
i=0;
GoodNetworks[i++] = '10.0.0.0, 255.0.0.0';  // NRIP
// GoodNetworks[i++] = "63.140.35.160"; // DNSWCD 2o7

var BadDomains = new Array();
i=0;
BadDomains[i++] = '.0catch.com'; // AdServer - 2009-06-16

//var BadDomains = new Array();

running:

sed 's/\\/&&/g' additional-values.txt | \
sed 's|^\([^[]*\).*|/^var \1 = new Array();/,/^$/{\
/^$/ i\\\
&\
}|' | sed -f - lists.txt

outputs:

// Counter Variable to initalize the arrays.
var i= 0;

var GoodDomains = new Array();
i=0;
GoodDomains[i++] = 'aba.com'; // Phish - 2010-02-05
GoodDomains[i++] = '^stuff/here/'; \

var GoodNetworks = new Array();
i=0;
GoodNetworks[i++] = '10.0.0.0, 255.0.0.0';  // NRIP
// GoodNetworks[i++] = "63.140.35.160"; // DNSWCD 2o7
GoodNetworks[i++] = '|*{};:\'; // Malware\\

var BadDomains = new Array();
i=0;
BadDomains[i++] = '.0catch.com'; // AdServer - 2009-06-16
BadDomains[i++] = '%XYZ+=?\\<>';
BadDomains[i++] = '\$.|&$@"#"!||'; 

//var BadDomains = new Array();

If you prefer gnu sed and process substitution:

sed -E 's|^([^[]*).*|/^var \1 = new Array();/,/^$/{/^$/ i\\\n&\
}|' <(sed 's/\\/&&/g' additional-values.txt) | sed -f - lists.txt
Related Question