Ubuntu – Data extraction from a text file using bash

bash text-processing

I am looking for a bash script. In a text file I have data like:

+------+------+
| Id   | User |
+------+------+
| 8192 | root |
| 8194 | root |
| 8202 | root |
| 8245 | root |
| 8434 | root |
| 8754 | root |
| 8761 | root |
| 8762 | root |
| 8764 | root |
| 8771 | root |
+------+------+

I want to extract the data like this:

8192,8194,8202,8245,8434,8754,8761,8762,8764

That is, I need the numbers from the first column (Id), except the last one, joined into a single line and separated by commas (,).

Could somebody help me with this?

Best Answer

You don't need a script for such a simple thing. You can use awk:

awk '$2 ~ "^[0-9][0-9]*$" { print $2 }' file.txt | head -n -1 | awk '{print}' ORS=',' | sed 's/,$/\n/'

Some explanations:

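awk '$2 ~ "^[0-9][0-9]*$" { print $2 }' file.txt - print the second whitespace-separated field of each line in file.txt, but only when it consists entirely of digits (the first field is the leading |).
head -n -1 - drop the last line, i.e. the last number (a negative count requires GNU head).
awk '{print}' ORS=',' - join all lines into one single line, with each number followed by a ,.
sed 's/,$/\n/' - replace the trailing , with a newline character (\n in the replacement requires GNU sed).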

Or, shorter:

awk '$2 ~ "^[0-9][0-9]*$" { print $2 }' ORS=',' file.txt | sed 's/,[0-9]*,$/\n/'
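If you would rather avoid the head and sed steps, the same result can be sketched in a single awk process. This is only one possible variant, not the method from the commands above: extract_ids is a made-up helper name, and the function assumes the table layout from the question, where the Id is the second whitespace-separated field.

```shell
#!/bin/bash
# extract_ids FILE: print every numeric Id except the last, comma-separated.
# "extract_ids" is a hypothetical helper name used only for this sketch.
extract_ids() {
    awk '
        # Collect every second field that consists only of digits.
        $2 ~ /^[0-9]+$/ { ids[n++] = $2 }
        END {
            # Stop one short of the end so the last Id is left out.
            for (i = 0; i < n - 1; i++)
                printf "%s%s", ids[i], (i < n - 2 ? "," : "\n")
        }
    ' "$1"
}
```

For the table in the question, extract_ids file.txt should print the comma-separated list the question asks for. Because the numbers are buffered in an array, nothing is printed when the file holds fewer than two Ids.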

Related Solutions

Ubuntu – Writing a bash script to read a text file of numbers into arrays, by column, on Ubuntu

Here is a script that stores the numbers from the text file in two arrays, x and y, as you wished:

#!/bin/bash

nl=$(wc -l < "$1")
declare -a x
declare -a y
for i in $(seq 1 "$nl")
do
    x[i]="$(awk -v p="$i" 'NR==p {print $1}' "$1")"
    y[i]="$(awk -v p="$i" 'NR==p {print $2}' "$1")"
done
# Up to this point, all the numbers from the first and second columns of the file
# are stored in x and y respectively. The following lines just print them again.
for it in $(seq 1 "$nl")
do
    echo "${x[$it]} ${y[$it]}"
done

Do not forget to give the script execute permission:

chmod +x script.sh

Usage

./script.sh numfile.txt

This assumes that you saved the above script as script.sh, that your text file containing the numbers is numfile.txt, and that both are in the same directory.
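The script above runs awk twice for every line of the file. As a rough sketch of a single-pass alternative, the read builtin can split each line into columns directly. The function name read_columns is hypothetical; x and y are the same array names as in the script. Note that the arrays here are indexed from 0, whereas the script fills them starting at index 1.

```shell
#!/bin/bash
# read_columns FILE: fill the global arrays x and y from columns 1 and 2.
# "read_columns" is a hypothetical name chosen for this sketch.
read_columns() {
    x=()
    y=()
    local a b _rest
    # read splits on whitespace; _rest swallows any extra columns.
    # "|| [[ -n $a ]]" keeps a final line that lacks a trailing newline.
    while read -r a b _rest || [[ -n $a ]]; do
        x+=("$a")
        y+=("$b")
    done < "$1"
}
```

After calling read_columns numfile.txt, the pairs can be printed with a loop over "${!x[@]}", for example echo "${x[$i]} ${y[$i]}".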

Ubuntu – convert txt file to csv separated with tabs

OK, so you need to replace the first two spaces and the last space in every line with a comma. You can't just replace every space, because the 3rd field may itself contain spaces. You can do this with a regular-expression replacement. Here's a sed command that works:

sed -re 's/^(\S*) (\S*) (.*) (\S+)\s*$/\1,\2,\3,\4/' in.txt > out.csv

With the above example this returns:

Account,Units,Description,Delta
2281,19,Toshiba PX-1982GRSUB,0
9618,200,HP MX19942-228b,-25
19246,4,CompuCom HD300g Hard Drive,4

This is still quite fragile when handling empty fields, and it breaks entirely if columns other than the 3rd contain spaces. It's very easy to introduce such malformed data if it is formatted manually, as done by your boss. You should suggest that he switch to a more robust table format (e.g. proper CSV & Co.) and editor (common spreadsheet tools can manipulate CSV quite well and flexibly, e.g. LibreOffice/OpenOffice Calc, Microsoft Excel and Google Docs).
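To see how the pattern splits a line, here is a small sketch that wraps the same substitution in a function and feeds it one line. The input line is reconstructed from the output shown above, so treat it as an assumption about what in.txt looked like. to_csv is a hypothetical name, and -E is used in place of -r (the two are equivalent in GNU sed, which is also needed for \S and \s).

```shell
#!/bin/bash
# to_csv: read space-separated lines on stdin, write comma-separated lines.
# Hypothetical wrapper around the sed substitution; assumes GNU sed.
to_csv() {
    # Groups 1 and 2 are the first two fields, group 4 is the last field,
    # and the greedy group 3 takes everything in between, spaces included.
    sed -E 's/^(\S*) (\S*) (.*) (\S+)\s*$/\1,\2,\3,\4/'
}

# Sample line reconstructed from the output above (an assumption).
printf '%s\n' '19246 4 CompuCom HD300g Hard Drive 4' | to_csv
# prints: 19246,4,CompuCom HD300g Hard Drive,4
```

The greedy (.*) in the middle is what lets the description keep its internal spaces, and it is also why a space inside any other column shifts the fields.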
