Ubuntu – Is it possible with Gedit or the command line to modify every fourth line of a text file

command-line, gedit, libreoffice

I am trying to convert a text file into a tab separated spreadsheet. My text file is something like this:

Dog
Cat
Fish
Lizard
Wolf
Lion
Shark
Gecko
Coyote
Puma
Eel
Iguana

With standard search and replace functions in Gedit or LibreOffice, it's easy to replace the end of line with a tab. But if I just swap every line ending for a tab, I'll get this:

Dog   Cat   Fish   Lizard   Wolf   Lion   Shark   Gecko   Coyote   Puma   Eel   Iguana

But what I need to do is get it to look like this:

Dog   Cat   Fish   Lizard
Wolf   Lion   Shark   Gecko  
Coyote   Puma   Eel   Iguana

So, can I swap every end of line character for a tab except for every fourth line?

I don't know if that kind of conditional iteration can be done with regular expressions inside a program like Gedit or LibreOffice, so maybe this needs to be some kind of command line function? I'm not even clear on what the best tool to start with is.


Update:

I tried the following commands:

sed 'N;N;N;s/\n/\t/g' file > file.tsv

paste - - - - < file > file.tsv

pr -aT -s$'\t' -4 file > file.tsv

xargs -d '\n' -n4 < inputfile.txt

But when I try to open the resulting tsv file in LibreOffice, the columns are not quite right. I'm not sure if this means I'm not executing the above commands correctly, or if I'm doing something wrong in the LibreOffice import function:

[Screenshot: TSV opening in Calc]

Just for reference, the desired result should look like this:

[Screenshot: Proper columns]

Best Answer

You could use a command-line editor such as sed:

sed 'N;N;N;s/\n/\t/g' file > file.tsv

or, more programmatically, by appending a tab and a backslash line-continuation character to each line you want to join, using GNU sed's first~step address (0~4 matches every fourth line, and ! negates the match), and then piping the result into the classic one-liner for joining continued lines:

sed '0~4! s/$/\t\\/' file | sed -e :a -e '/\\$/N; s/\\\n//; ta'

See, for example, Sed One-Liners Explained:

  1. Append a line to the next if it ends with a backslash "\".

    sed -e :a -e '/\\$/N; s/\\\n//; ta'
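
To see what each stage of the two-step sed pipeline does, it can help to run the stages separately on a small sample. This is only a sketch: the sample data and the variable names (sample, marked, joined) are made up for illustration, and the 0~4 address and the \t escape both assume GNU sed.

```shell
# Eight sample lines; the data is invented for illustration.
sample=$(printf '%s\n' Dog Cat Fish Lizard Wolf Lion Shark Gecko)

# Stage 1 (GNU sed): on every line that is NOT a multiple of 4,
# append a tab followed by a backslash.
marked=$(printf '%s\n' "$sample" | sed '0~4! s/$/\t\\/')

# Stage 2: while the pattern space ends in a backslash, pull in the
# next line and delete the backslash-newline pair, then loop.
joined=$(printf '%s\n' "$marked" | sed -e :a -e '/\\$/N; s/\\\n//; ta')

# Print the result: two rows of four tab-separated fields.
printf '%s\n' "$joined"
```

Printing "$marked" instead of "$joined" shows the intermediate form, in which every line except each fourth one ends in a tab and a backslash.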
    

However, IMHO it would be easier with one of the other standard text-processing utilities, e.g.

paste - - - - < file > file.tsv

(the number of - will correspond to the number of columns) or

pr -aT -s$'\t' -4 file > file.tsv

(you can omit the -s$'\t' if you don't mind the output being separated by multiple tabs).
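
Before importing into LibreOffice, the grouping can be checked on a small sample with the tabs made visible. A minimal sketch, assuming scratch files named animals.txt and animals.tsv (hypothetical names), Unix line endings in the input, and GNU cat for the -A option:

```shell
# Build a small sample file; animals.txt is a hypothetical name.
printf '%s\n' Dog Cat Fish Lizard Wolf Lion Shark Gecko > animals.txt

# Join every four input lines into one tab-separated row.
paste - - - - < animals.txt > animals.tsv

# Show tabs as ^I and line ends as $ (GNU cat), so stray
# characters such as ^M (carriage return) would stand out.
cat -A animals.tsv
```

Each output row should contain exactly three ^I markers and end in a bare $; anything else between the last field and the $ points to unexpected characters in the input.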


The strange re-import behavior that you are observing is almost certainly because the original file has Windows-style CRLF line endings. If you need to work with files from Windows, then you can roll the conversion into the command in various ways e.g.

tr -d '\r' < file.csv | paste - - - -

or

sed 'N;N;N;s/\r\n/\t/g' file.csv

The former will remove ALL carriage returns whereas the latter will preserve a CR at the end of each of the new lines (which may be what you want if the intended end user is on Windows).
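
Whether a file really has CRLF endings can be confirmed before choosing between the two commands. The following is a sketch under assumptions: crlf.txt and clean.tsv are hypothetical file names, and the sample content is invented to reproduce the situation.

```shell
# Create a sample with Windows-style CRLF endings; crlf.txt is a hypothetical name.
printf '%s\r\n' Dog Cat Fish Lizard > crlf.txt

# Count the lines that end in a carriage return; a non-zero count
# indicates CRLF line endings.
cr_lines=$(grep -c "$(printf '\r')\$" crlf.txt)
echo "lines ending in CR: $cr_lines"

# Strip every CR, then group four lines per row.
tr -d '\r' < crlf.txt | paste - - - - > clean.tsv

# od -c prints each byte, so any leftover \r would be visible.
od -c clean.tsv
```

If the count is zero for the real input file, the line endings are not the cause and the import settings in LibreOffice are the next thing to check.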