sed -n 10p myfile | wc -c
will count the bytes in the tenth line of myfile
(including the linefeed/newline character).
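As a quick sanity check, here is a minimal sketch on a throwaway file. The file contents and the variable names are made up for illustration, and mktemp is assumed to be available.

```shell
# Hypothetical sample file: 12 lines, "line 1" .. "line 12".
tmp=$(mktemp)
for i in 1 2 3 4 5 6 7 8 9 10 11 12; do
    printf 'line %s\n' "$i"
done > "$tmp"

# The tenth line is "line 10": 7 characters plus the trailing newline.
tenth_bytes=$(sed -n 10p "$tmp" | wc -c)
echo "$((tenth_bytes))"   # prints 8

rm -f "$tmp"
```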
A slightly less readable variant,
sed -n "10{p;q;}" myfile | wc -c
(or sed '10!d;q' or sed '10q;d') will stop reading the file after the tenth line, which matters on longer files (or streams). (Thanks to Tim Kennedy and Peter Cordes for the discussion leading to this.)
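To see that the early-exit variants print the same line, here is a small sketch. The generated input and the variable names v1, v2 and v3 are assumptions for illustration only.

```shell
# Hypothetical input: 100000 lines of the form "row N".
tmp=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 100000; i++) print "row", i }' > "$tmp"

# Each variant prints line 10 ("row 10" plus newline) and then quits,
# so sed never reads the remaining lines.
v1=$(sed -n '10{p;q;}' "$tmp" | wc -c)
v2=$(sed '10!d;q' "$tmp" | wc -c)
v3=$(sed '10q;d' "$tmp" | wc -c)
echo "$((v1)) $((v2)) $((v3))"   # prints 7 7 7

rm -f "$tmp"
```

On an endless stream, such as the output of yes, only the quitting variants return; plain sed -n 10p would keep reading forever.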
There are performance comparisons of different ways of extracting lines of text in the question "cat line X to line Y on a huge file".
In your example,
001
002
003
004
byte number 8 is the second newline, not the 0 on the next line.
The following will give you the number of full lines within the first $b bytes:
$ dd if=data.in bs=1 count="$b" | wc -l
It will report 2 with b set to 8 and it will report 1 with b set to 7.
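The two results can be reproduced with a scratch copy of the example data. This is only a sketch: the temporary file stands in for data.in, and n7 and n8 are illustrative variable names.

```shell
# Recreate the four-line example in a scratch file.
data=$(mktemp)
printf '001\n002\n003\n004\n' > "$data"

# 2>/dev/null hides dd's transfer statistics, which go to stderr.
n7=$(dd if="$data" bs=1 count=7 2>/dev/null | wc -l)
n8=$(dd if="$data" bs=1 count=8 2>/dev/null | wc -l)
echo "b=7: $((n7))  b=8: $((n8))"   # prints b=7: 1  b=8: 2

rm -f "$data"
```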
The dd utility, the way it's used here, will read from the file data.in, and will read $b blocks of size 1 byte.
As "icarus" rightly points out in the comments below, using bs=1 is inefficient. It's more efficient, in this particular case, to swap bs and count:
$ dd if=data.in bs="$b" count=1 | wc -l
This will have the same effect as the first dd command, but will read only one block of $b bytes.
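A small side-by-side sketch of the two forms, again on a scratch copy of the example data (the variable names slow and fast are made up):

```shell
data=$(mktemp)
printf '001\n002\n003\n004\n' > "$data"
b=8

slow=$(dd if="$data" bs=1 count="$b" 2>/dev/null | wc -l)   # $b reads of 1 byte each
fast=$(dd if="$data" bs="$b" count=1 2>/dev/null | wc -l)   # 1 read of $b bytes
echo "bs=1: $((slow))  bs=$b: $((fast))"   # prints bs=1: 2  bs=8: 2

rm -f "$data"
```

The equivalence holds for a regular file. When dd reads from a pipe, a single read may return fewer than $b bytes, so the bs="$b" count=1 form can come up short there; head -c "$b" is a common alternative where it is available.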
The wc utility counts newlines, and a "line" in Unix is always terminated by a newline. So the above command will still say 2 for any b from 8 up to 11, because byte 12 is the next newline. The result you are looking for is therefore whatever number the above pipeline reports, plus 1.
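The "plus 1" step could be wrapped up roughly as follows. line_after is a hypothetical helper name, not something from the question, and it assumes its second argument is at least 1.

```shell
data=$(mktemp)
printf '001\n002\n003\n004\n' > "$data"

# Hypothetical helper: line number of the byte that follows the first $2 bytes
# of file $1 (that is, the byte at zero-based offset $2). Assumes $2 >= 1.
line_after() (
    newlines=$(dd if="$1" bs="$2" count=1 2>/dev/null | wc -l)
    echo "$((newlines + 1))"
)

l7=$(line_after "$data" 7)     # the first 7 bytes hold 1 newline
l8=$(line_after "$data" 8)     # the first 8 bytes hold 2 newlines
l12=$(line_after "$data" 12)   # the first 12 bytes hold 3 newlines
echo "$l7 $l8 $l12"            # prints 2 3 4

rm -f "$data"
```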
This will obviously also count any stray newlines in the binary blob part of your file that precedes the ASCII text. If you knew where the ASCII bit starts, you could add skip="$offset" to the dd command, where $offset is the number of bytes to skip into the file. Note that skip counts input blocks of bs bytes, so $offset is a byte count only in the bs=1 form.
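A sketch of the effect of skip, using an invented file layout: a 6-byte prefix with stray newlines stands in for the binary blob, and the offset is assumed to be known in advance.

```shell
data=$(mktemp)
# Made-up layout: 6 "binary" bytes containing 3 newlines, then the ASCII text.
printf 'x\ny\nz\n001\n002\n003\n004\n' > "$data"
offset=6   # assumed start of the ASCII part, in bytes
b=8

# skip= counts input blocks, so bs=1 makes it a byte count.
with_skip=$(dd if="$data" bs=1 skip="$offset" count="$b" 2>/dev/null | wc -l)
without=$(dd if="$data" bs=1 count="$((offset + b))" 2>/dev/null | wc -l)
echo "with skip: $((with_skip))  without: $((without))"   # prints with skip: 2  without: 5

rm -f "$data"
```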
Best Answer
Using AWK:
This assumes that there are no spaces in the label (“Digital_…”).