You want to create a tar file somewhere other than where the files to be tarred reside?
There are many ways to do this.
If it is to be created locally (i.e., on the same machine):
tar czvf /path/to/destination/newfile.tar.gz ./SOURCEDIR_OR_FILES
You can add additional files or directories to tar at the end of that command.
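For example, to include several sources at once (the names here are just placeholders):
tar czvf /path/to/destination/newfile.tar.gz ./dir1 ./notes.txt ./dir2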
If it is to be created remotely (i.e., you want to create the tar file on a remote host from the one containing the data to be tarred):
tar czvf - ./SOURCEDIR_OR_FILES | ssh user@host 'cat > newfile.tar.gz'
The latter version is very versatile. For example, you can also "duplicate" a directory + subdirs using the same technique:
Duplicate a directory+subdirs to another local directory:
tar cf - ./SOURCEDIR_OR_FILES | ( cd LOCAL_DEST_DIR && tar xvf - )
Duplicate a directory+subdirs to another remote directory:
tar cvf - ./SOURCEDIR_OR_FILES | ssh user@host 'cd REMOTE_DEST_DIR && tar xf - '
Drop the 'v' if you don't need to see files listed as they are tarred (or untarred): it will go much faster, but won't print much unless there is an error.
I use "./..." for the source to force tar to store it as a RELATIVE path. In some cases you'll want to add additionnal path information:
For example to tar the crontab files, including the one in /etc, you could do:
cd / ; tar czf all_crons.tgz ./etc/*cron* ./var/spool/cron
I use the relative path on purpose: some OLD versions of tar can be dangerous and extract files with their original ABSOLUTE path, meaning you could do: cd /safedir ; tar xvf sometar
and have the files with absolute names overwrite files at their original path, which is OUTSIDE of /safedir and not underneath it! Very dangerous, and still possible, as there are old production servers out there. Better to get used to using relative paths all the time, even if you use a more recent tar.
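If you are unsure about an archive, you can list its contents first and check for absolute names before extracting (a quick sanity check, not part of the commands above):
tar tf sometar | grep '^/'
If that prints anything, the archive contains absolute paths and deserves extra care.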
I don't agree with the squashfs recommendations. You don't usually write a squashfs to a raw block device; think of it as an easily readable tar archive. That means you would still need an underlying filesystem.
ext2 has several severe limitations that restrict its usefulness today; I would therefore recommend ext4. Since this is meant for archiving, you would create compressed archives to go on it; that means you would have a small number of fairly large files that rarely change. You can optimize for that (a combined command sketch follows the list):
- Specify -I 128 to reduce the size of individual inodes, which reduces the size of the inode table.
- You can play with the -i option too, to reduce the size of the inode table even further. If you increase this value, fewer inodes will be created, and the inode table will therefore be smaller. However, that means the filesystem wastes more space on average per file, so this is a bit of a trade-off.
- You can indeed switch off the journal with -O ^has_journal. If you go down that route, though, I recommend that you set default options to mount the filesystem read-only; you can do this in fstab, or you could use tune2fs -E mount_opts=ro to record a default in the filesystem (you cannot do this at mkfs time).
- You should of course compress your data into archive files, so that the inode wastage isn't as bad a problem as it could be. You could create squashfs images, but xz compresses better, so I would recommend tar.xz files instead.
- You could also reduce the number of reserved blocks with the -m option to either mkfs or tune2fs. This sets the percentage (set to 5 by default) which is reserved for root only. Don't set it to zero; the filesystem requires some space for efficient operation.
AFAIK it is (unfortunately) not possible to truncate a file from the beginning (this may be true only for the standard tools; for the syscall level, see here). But by adding some complexity you can use normal truncation (together with sparse files): you can write to the end of the target file without having written all the data in between.
Let's first assume both files are exactly 5GiB (5120 MiB) and that you want to move 100 MiB at a time. You execute a loop which consists of
- copying one block from the end of the source file to its final offset in the target file (consuming disk space), then
- truncating the source file by one block (freeing disk space).
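A minimal shell sketch of that loop, assuming GNU dd and truncate, and assuming the source size is an exact multiple of the block size (the variable names are mine, not from the answer):
src=source.bin; dst=target.bin                  # placeholder names
bs=$((100 * 1024 * 1024))                       # move 100 MiB per step
dst_size=$(stat -c %s "$dst")
src_size=$(stat -c %s "$src")
n=$((src_size / bs))                            # full blocks in the source
while [ "$n" -gt 0 ]; do
    n=$((n - 1))
    # copy the last remaining block of src to its final offset in dst;
    # skip_bytes/seek_bytes make skip= and seek= count bytes, not blocks
    dd if="$src" of="$dst" bs="$bs" count=1 \
       iflag=skip_bytes,fullblock oflag=seek_bytes conv=notrunc \
       skip=$((n * bs)) seek=$((dst_size + n * bs))
    truncate -s $((n * bs)) "$src"              # drop the copied block
done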
But give it a try with smaller test files first, please...
Probably the files are neither the same size nor multiples of the block size. In that case the calculation of the offsets becomes more complicated.
seek_bytes and skip_bytes should be used then. If this is the way you want to go but need help with the details, then ask again.
Warning: depending on the dd block size, the resulting file will be a fragmentation nightmare.