Tar – Transform a Tar Archive’s Paths Without Extracting It


GNU tar(1) has a neat option called --transform. From the man page:

--transform=EXPRESSION, --xform=EXPRESSION
    use sed replace EXPRESSION to transform file names

This allows transformation of path names on the fly as the archive is being extracted so that you may control where and how it will be extracted.
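As a minimal sketch of the extract-time behaviour described above, the snippet below builds a throwaway archive and applies the same kind of expression while listing and extracting. The temporary directory, the file names and the anchored expression `s,^\./foo,./bar,` are illustrative assumptions, not part of the original question; `--show-transformed-names` is the GNU tar option that makes a listing print the names after transformation.

```shell
#!/bin/bash
# Build a tiny throwaway archive (paths are illustrative only).
workdir="$(mktemp -d)"
mkdir -p "$workdir/src/foo"
touch "$workdir/src/foo/blah" "$workdir/src/foo/bleh"
tar -C "$workdir/src" -cf "$workdir/test.tar" ./foo

# Preview the names as they would be extracted, without writing any files.
preview="$(tar -tf "$workdir/test.tar" \
    --transform 's,^\./foo,./bar,' --show-transformed-names)"
echo "$preview"

# Extract with the same expression: the files land under bar/, not foo/.
mkdir "$workdir/out"
tar -C "$workdir/out" -xf "$workdir/test.tar" --transform 's,^\./foo,./bar,'
ls "$workdir/out/bar"
```

Note that the archive itself is unchanged by either command; the renaming only happens on the way out, which is exactly why the question asks for an in-place equivalent.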

My question is, is there a way to perform a similar transformation in situ; i.e., without extracting the archive?

Example

[user@host]$ tar tf test.tar
./foo/blah
./foo/bleh
[user@host]$ some_deep_magic 's/foo/bar/' test.tar
[user@host]$ tar tf test.tar
./bar/blah
./bar/bleh

Use case

I'm distributing a tar archive to basically clueless end users and would like it to extract into the correct path without any intervention from me. I'm trying to avoid the trivial solution of extracting the archive, renaming the directories and repacking, as the archive is largish.

Best Answer

You could mount the archive with archivemount or mountavfs and then recreate it from the mount point:

archivemount tarfile.tar /mnt
cd /mnt
tar cf /tmp/tarfile.tar --transform 's/foo/bar/' .

Write operations on the archive filesystem trigger a full rewrite of the archive on unmount, so renaming inside the mounted archive does not seem like a good option for large files.
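The re-pack step can be tried without any FUSE tooling installed. In the sketch below a plain directory stands in for the mount point (an assumption: with archivemount or AVFS the archive contents would appear there instead), and the expression is anchored so that only the leading `./foo` component is renamed. The paths and file names are hypothetical.

```shell
#!/bin/bash
# A plain directory stands in for the mount point of the mounted archive.
workdir="$(mktemp -d)"
mkdir -p "$workdir/mnt/foo"
touch "$workdir/mnt/foo/blah" "$workdir/mnt/foo/bleh"

# Re-pack from the "mount point", renaming the top-level directory on the fly.
# The anchored expression only touches names that start with ./foo.
tar -C "$workdir/mnt" -cf "$workdir/renamed.tar" \
    --transform 's,^\./foo,./bar,' .

# List the new archive to confirm the stored member names.
listing="$(tar -tf "$workdir/renamed.tar" | sort)"
echo "$listing"
```

Using `-C` instead of `cd` keeps the script's working directory unchanged, which makes it harder to write the new archive into the mounted tree by accident.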

EDIT

I don't know the implementation details, but it seems that this way we save the step of writing the extracted files to the filesystem.

Just a test to settle any doubts (run over a tar of my /usr):

#!/bin/bash

# read the archive once up front so that caching does not favour the later tests
cat /tmp/usr.tar > /dev/null

T="$(date +%s)"
tar xf /tmp/usr.tar
tar cf usr.tar usr --transform 's/usr/foo/'
T="$(($(date +%s)-T))"
echo "Tar/Untar seconds: ${T}"

T="$(date +%s)"
archivemount -o readonly -o nobackup /tmp/usr.tar /mnt
tar cf usr.tar /mnt  --transform 's/usr/foo/'
umount /mnt
T="$(($(date +%s)-T))"
echo "Archivemount seconds: ${T}"

T="$(date +%s)"
mountavfs
cd '/root/.avfs/tmp/usr.tar#'
tar cf /tmp/test/usr.tar   --transform 's/usr/foo/' .
T="$(($(date +%s)-T))"
echo "Avfs seconds: ${T}"

Output:

Tar/Untar seconds: 480
Archivemount seconds: failure, with a lot of read errors.
Avfs seconds: 217

So AVFS wins!
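One caveat when re-packing a tree like /usr with an expression such as `s/usr/foo/`: by default GNU tar also applies the transformation to symbolic link targets, so a link pointing at /usr/bin/tool would be stored as pointing at /foo/bin/tool. The sketch below, using a hypothetical miniature tree, compares the default with the `S` scope flag, which leaves symlink targets alone.

```shell
#!/bin/bash
# Miniature stand-in for /usr with one absolute symlink (names are made up).
workdir="$(mktemp -d)"
mkdir -p "$workdir/src/usr/bin"
touch "$workdir/src/usr/bin/tool"
ln -s /usr/bin/tool "$workdir/src/usr/link"

# Default scope: member names AND symlink targets are transformed.
tar -C "$workdir/src" -cf "$workdir/default.tar" --transform 's/usr/foo/' usr

# With the S flag the symlink target is stored untouched.
tar -C "$workdir/src" -cf "$workdir/flagged.tar" --transform 's/usr/foo/S' usr

default_listing="$(tar -tvf "$workdir/default.tar")"
flagged_listing="$(tar -tvf "$workdir/flagged.tar")"
echo "$default_listing" | grep -- '->'
echo "$flagged_listing" | grep -- '->'
```

Whether the rewritten targets are wanted depends on the use case; for an archive meant to unpack under a different top-level directory, relative links usually want the default and absolute links usually want `S`.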