Shell – Is it possible to truncate a file (in place, same inode) at the beginning

files, shell, tail, truncate

It is possible to remove trailing bytes from a file without writing to a new file (> newfile) and moving it back (mv newfile file). That is done with truncate:

truncate -s -1 file

It is possible to remove leading bytes, but only by writing to a new file and moving it back (which changes the inode), for example with some versions of tail:

tail -c +2 file > newfile ; mv newfile file

So, how can that be done without moving files around?
Ideally, like truncate, only a few bytes would need to be changed even for very big files.

Note: sed -i changes the file's inode, so even though it may be useful, it is not an answer to this question IMO.

Best Answer

With ksh93:

tail -c+2 < file 1<>; file

(where <>; is a ksh93-specific variant of the standard <> operator that truncates the file at the end if the redirected command was successful).

That removes the first byte (by writing the rest of the file over itself and truncating at the end).

The same can be done with sh:

{
  tail -c+2 < file &&
    perl -e 'truncate STDOUT, tell STDOUT'
} 1<> file

Note that this would fill in the holes of sparse files (you could still re-dig holes afterwards with fallocate -d, though).
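As a rough illustration of the overwrite-then-truncate idea, here is a minimal sketch that swaps the perl call for the truncate utility mentioned in the question and checks that the inode is unchanged. The scratch file and variable names are made up for the example, and it assumes GNU coreutils (mktemp, truncate) and a file that nothing else is writing to in the meantime.

```shell
# Hypothetical scratch file, used only for this demonstration.
f=$(mktemp)
printf 'abcdefgh' > "$f"

n=1                                          # number of leading bytes to drop
inode_before=$(ls -i "$f" | awk '{print $1}')
size=$(wc -c < "$f")

# Copy everything after the first n bytes over the start of the same file
# (1<> opens it read-write without truncating), then cut off the stale tail.
tail -c +"$((n + 1))" < "$f" 1<> "$f" &&
  truncate -s "$((size - n))" "$f"

inode_after=$(ls -i "$f" | awk '{print $1}')
result=$(cat "$f")
echo "$result (inode $inode_before -> $inode_after)"
rm -f "$f"
```

Unlike the perl variant, this computes the final size beforehand instead of asking for the current write offset, so it is only appropriate when the file size does not change while the command runs.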

Upon read/write errors, tail would likely bail out leaving the file partly overwritten (so for instance, abcdefgh could end up as bcddefgh if it fails after rewriting bcd). You could adapt the above so that it reports the writing offset upon error, so you know how to recover the data. Still with ksh93:

unset -v offset
{ tail -c+2 < file || false >#((offset=CUR)); } 1<>; file

After that, if $offset is set, it contains the amount of data that was successfully written.

On Linux (since 3.15), on ext4 or xfs file systems, one can collapse ranges of bytes whose size and offset are multiples of the file system block size, using the fallocate() system call or the fallocate utility.

So for instance

fallocate -c -l 8192 file

That would remove the first 8192 bytes of the file (assuming a file system whose block size is a divisor of 8192) without having to rewrite the rest of the file. But that's of no use if you want to remove a section whose size is not a multiple of the file system block size.
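The alignment constraint can be handled by rounding the requested length down to a block multiple first. The following is only a sketch: round_down and collapse_head are hypothetical helper names, and it assumes that GNU stat's %S (with -f) reports the granularity that fallocate requires on the file system in question, which may not hold everywhere.

```shell
# round_down BYTES BLOCK: largest multiple of BLOCK that is <= BYTES.
round_down() {
  echo "$(( $1 / $2 * $2 ))"
}

# collapse_head FILE BYTES: drop up to BYTES leading bytes, block-aligned.
# Hypothetical helper; needs Linux >= 3.15 and a file system that supports
# collapsing ranges (ext4 or xfs according to the text above).
collapse_head() {
  file=$1 want=$2
  # Assumption: the file system's fundamental block size is the required unit.
  blk=$(stat -f -c %S -- "$file") || return 1
  len=$(round_down "$want" "$blk")
  if [ "$len" -le 0 ]; then
    echo "nothing block-aligned to remove" >&2
    return 1
  fi
  fallocate -c -o 0 -l "$len" -- "$file"
}
```

For example, collapse_head file 10000 on a file system with 4096-byte blocks would ask fallocate to remove 8192 bytes; the remaining 1808 bytes would still need one of the rewriting approaches above.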

Related Solutions

Does multitail follow the inode or the file name by default

According to the multitail manual:

There are a few other options not fitting elsewhere, these are:
-f  This makes MultiTail follow the file. In case the original file gets
    renamed and a new file is created with the original filename, MultiTail
    will start watching the file with the original filename (the one you
    entered).

To me, this implies that by default it follows by inode / file descriptor rather than filename.

A cursory reading of the source bears this out; in exec.c:79, the follow_filename var (set in cmdline.c:889 or ui.c:966) defines whether the follow-by-filename flag (-F, --follow=name, etc.) is passed to tail.

Grep from the end of a file to the beginning

tac only helps if you also use grep -m 1 (assuming GNU grep) to have grep stop after the first match:

tac accounting.log | grep -m 1 foo

From man grep:

   -m NUM, --max-count=NUM
          Stop reading a file after NUM matching lines.

In the example in your question, both tac and grep need to process the entire file so using tac is kind of pointless.

So, unless you use grep -m, don't use tac at all, just parse the output of grep to get the last match:

grep foo accounting.log | tail -n 1
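A small check that both pipelines pick the same line, using a throwaway log file whose name and contents are invented for the example (tac and grep -m assume GNU tools):

```shell
# Hypothetical sample log, created only for this comparison.
log=$(mktemp)
printf '%s\n' 'foo 1' 'bar 2' 'foo 3' 'baz 4' > "$log"

# Read the file backwards and stop at the first match...
last_tac=$(tac "$log" | grep -m 1 foo)
# ...or read it forwards and keep only the final match.
last_tail=$(grep foo "$log" | tail -n 1)

echo "$last_tac"
echo "$last_tail"
rm -f "$log"
```

Both commands print the last matching line; the tac variant can stop early, which only matters when the match is near the end of a large file.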

Another approach would be to use Perl, awk, or any other scripting language. For example (where the pattern is foo):

perl -ne '$l=$_ if /foo/; END{print $l}' file

awk '/foo/{k=$0}END{print k}' file
