Text Processing – Join Every Two Lines with Commas

I have a file with more than 1000 lines. It starts as follows:

Station Name
Station Code
A N DEV NAGAR
ACND
ABHAIPUR
AHA
ABOHAR
ABS
ABU ROAD
ABR

I need to convert this to a file with comma-separated entries by joining every two lines. The final data should look like this:

Station Name,Station Code
A N DEV NAGAR,ACND
ABHAIPUR,AHA
ABOHAR,ABS
ABU ROAD,ABR
...

I was trying to write a shell script that echoes each pair of lines with a comma in between. But I guess a simpler, effective one-liner would do the job here, maybe in sed or awk.

Any ideas?

Best Answer

Simply use cat (if you like cats ;-)) and paste:

cat file.in | paste -d, - - > file.out

Explanation: paste reads from a number of files and pastes together the corresponding lines (line 1 from the first file with line 1 from the second file, etc.):

paste file1 file2 ...

Instead of a file name, we can use - (dash), which stands for stdin. paste takes the first line from the first file (which is stdin). Then it wants to read the first line from the second file (which is also stdin). However, since the first line of stdin has already been read and processed, what now waits on the input stream is the second line of stdin, which paste happily glues to the first one. The -d option sets the delimiter to a comma rather than a tab.
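To see this on a small scale, here is a minimal sketch that feeds the first few lines from the question through the same paste call. The function name join_pairs is only an illustrative assumption, not part of the original answer.

```shell
# Join every two lines of stdin with a comma (same paste call as above).
# join_pairs is a hypothetical helper name used only for this sketch.
join_pairs() {
    paste -d, - -
}

# Sample input: the first six lines shown in the question.
printf '%s\n' 'Station Name' 'Station Code' \
              'A N DEV NAGAR' 'ACND' \
              'ABHAIPUR' 'AHA' | join_pairs
```

Adding a third dash (paste -d, - - -) would group the lines in threes. If the input has an odd number of lines, paste still emits the delimiter, so the last output line ends with a trailing comma.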

Alternatively, do

cat file.in | sed "N;s/\n/,/" > file.out

P.S. Yes, one can simplify the above to

< file.in sed "N;s/\n/,/" > file.out

< file.in paste -d, - - > file.out

which has the advantage of not using cat.

However, I deliberately did not use this idiom, for the sake of clarity, even though it is less verbose. Also, I like cat (CATS ARE NICE). So please do not edit.
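For readers unfamiliar with the sed command used above, here is a small sketch of what it does. N appends the next input line to the pattern space, and the substitution then turns the embedded newline into a comma. The sketch uses $!N instead of a plain N, which is my own variation rather than the answer's exact command: it skips N on the last line, so a leftover odd line is printed unchanged. The function name join_pairs_sed is hypothetical.

```shell
# Join line pairs with sed.
#   $!N      : on every line except the last, append the next line
#   s/\n/,/  : replace the embedded newline with a comma
# join_pairs_sed is a hypothetical helper name used only for this sketch.
join_pairs_sed() {
    sed '$!N;s/\n/,/'
}

# Sample input: four lines taken from the question.
printf '%s\n' 'ABOHAR' 'ABS' 'ABU ROAD' 'ABR' | join_pairs_sed
```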

Alternatively, if you prefer paste to cats (paste is the command to concatenate files horizontally, while cat concatenates them vertically), you may use:

paste file.in | paste -d, - -
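Since the question also mentions awk, here is a hedged sketch of an equivalent. It is not part of the original answer, and the function name join_pairs_awk is made up for illustration. Unlike paste, it prints a leftover odd line without a trailing comma.

```shell
# Join every two lines of stdin with a comma using awk.
# Odd-numbered lines are remembered; even-numbered lines are printed
# together with the remembered line. join_pairs_awk is a hypothetical name.
join_pairs_awk() {
    awk '
        NR % 2 { first = $0; next }          # odd line: store it and move on
               { print first "," $0 }        # even line: emit the joined pair
        END    { if (NR % 2) print first }   # odd line count: emit the leftover
    '
}

printf '%s\n' 'ABHAIPUR' 'AHA' 'ABOHAR' 'ABS' | join_pairs_awk
```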

Related Solutions

Join Lines of Text with Repeated Beginning – Command Line Tips

This is a standard procedure for awk:

awk '
{
  k=$2
  for (i=3;i<=NF;i++)
    k=k " " $i
  if (!($1 in a))
    a[$1]=k
  else
    a[$1]=a[$1] "<br>" k
}
END{
  for (i in a)
    print i "\t" a[i]
}' long.text.file

If the file is sorted by the first word of each line, the script can be simpler:

awk '
{
  if($1==k)
    printf("%s","<br>")
  else {
    if(NR!=1)
      print ""
    printf("%s\t",$1)
  }
  for(i=2;i<NF;i++)
    printf("%s ",$i)
  printf("%s",$NF)
  k=$1
}
END{
print ""
}' long.text.file

Or just bash

unset n last
while read -r word definition
do
    if [ "$last" = "$word" ]
    then
        printf "<br>%s" "$definition"
    else 
        if [ "$n" ]
        then
            echo
        else
            n=1
        fi
        printf "%s\t%s" "$word" "$definition"
        last="$word"
    fi
done < long.text.file
echo
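To make the expected result concrete, here is a small run on made-up data. The sample lines and the function name group_sorted are assumptions for illustration only, and the awk body is a condensed variant of the sorted-input approach above, not the original script.

```shell
# Merge adjacent lines that share the same first word, separating the
# remaining text with <br>. Assumes the input is sorted by first word.
# group_sorted is a hypothetical helper name used only for this sketch.
group_sorted() {
    awk '
    {
        word = $1
        sub(/^[^ \t]+[ \t]+/, "")            # drop the first word, keep the rest
        if (NR > 1 && word == last)
            printf "<br>%s", $0              # same word: append to current line
        else {
            if (NR > 1) print ""             # finish the previous output line
            printf "%s\t%s", word, $0        # start a new output line
        }
        last = word
    }
    END { if (NR) print "" }                 # terminate the final line
    '
}

# Hypothetical sample input (word followed by a definition).
printf '%s\n' 'apple a fruit' 'apple a tech company' 'bat a flying mammal' | group_sorted
```

With this input, both "apple" lines end up on one tab-separated output line joined by <br>, and "bat" gets a line of its own.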

Concatenate multiple files with two blank lines as delimiter

With paste:

:|paste -sd'\n' file1.md - - file2.md
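Here is a minimal sketch of why this works, assuming GNU paste: with -s each input is pasted serially, the newline delimiter keeps the lines of each file intact, and every - reads the empty stdin supplied by : and contributes one empty line. The file contents and the function name concat_with_gap are made up for illustration.

```shell
# Concatenate two files with two blank lines in between.
# concat_with_gap is a hypothetical helper name; behaviour for empty
# stdin is assumed to match GNU paste (one empty line per "-").
concat_with_gap() {
    : | paste -sd'\n' "$1" - - "$2"
}

# Work in a throwaway directory so no existing files are touched.
dir=$(mktemp -d)
printf '%s\n' 'First file' 'more text' > "$dir/file1.md"
printf '%s\n' 'Second file' > "$dir/file2.md"

concat_with_gap "$dir/file1.md" "$dir/file2.md"

rm -rf "$dir"
```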
