Archive Tar Zip File Management – How to De-unzip, De-tar -xvf, and De-unarchive in a Messy Folder

Tags: archive, file management, tar, zip

Usually, I unarchive things with $ mkdir newFolder; $ mv *.zip newFolder; $ cd newFolder; $ unzip *.zip, but sometimes I get lazy and just run $ unzip *.zip in an arbitrary folder, which from time to time mixes the extracted files with the other content. I will list some methods here. Some archivers surely have fancy flags for this while others are more spartan; I am more interested in the latter.
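For reference, the careful routine above can be wrapped in a small function so that it costs no more typing than the lazy one. This is only a sketch for tar archives (unzip has a comparable -d option); the name extract_apart and the ".d" suffix are made up for illustration.

```shell
# Extract a tar archive into a fresh directory named after it, so that
# nothing lands in the current folder. extract_apart is a hypothetical name.
extract_apart() (
  archive=$1
  dest=${archive%.tar*}.d        # foo.tar.gz -> foo.d (assumed naming scheme)
  mkdir -- "$dest" || exit 1     # refuse to reuse an existing directory
  tar xf "$archive" -C "$dest"
)
```

Running extract_apart foo.tar leaves the current folder untouched and puts everything under foo.d; a second run fails instead of overwriting.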

Some ways to de-unarchive, are there others?

  1. $ find . -anewer fileThatExistedBeforeUnarchieving -ok rm '{}' \; The weaknesses are that it also lists the *.zip dirs, so you need the slow -ok confirmation, which gets tedious with many *.zip matches, and, for some reason, it does not seem to match everything extracted.

  2. If only a small number of files were extracted, remove them one by one; this is slow, cumbersome and error-prone.

  3. When I want to check whether the content of the archive is actually a folder, I sometimes list it with $ unzip -l *.bsd, which works at least in OpenBSD's version of unzip.
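The check in item 3 can be automated. The following is a minimal sketch for tar archives, assuming no newlines in member names; tar_toplevel is a hypothetical helper name, not part of any tool.

```shell
# Print the distinct top-level entries of a tar archive.
# A single line of output means everything sits under one folder
# (or the archive holds a single file).
tar_toplevel() {
  tar tf "$1" |
    sed -e 's|^\./||' -e '/^$/d' -e 's|/.*||' |  # strip "./", drop empties, keep first path component
    sort -u
}
```

For example, [ "$(tar_toplevel foo.tar | wc -l)" -eq 1 ] is true when extracting in place would create just one new entry.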

If you are referring to specific archiving tools, please name them where appropriate. Keep it simple, though: I am more interested in the WAYS you do it than in a single tool.

Best Answer

By name

You can generate the list of files in the archive and delete them, though this is annoyingly fiddly with archivers such as unzip or 7z that don't have an option to generate a plain list of file names. Even with tar, this assumes there are no newlines in file names.

tar tf foo.tar | while read -r file; do rm -- "$file"; done
unzip -l foo.zip | awk '
    p && /^ --/ {p=2}
    p==1 {print substr($0, 29)}
    /^ --/ {++p}
' | while …
unzip -l foo.zip | tail -n +4 | head -n -2 | while …  # GNU coreutils only
7z l -slt foo.zip | sed -n 's/^Path = //p' | while …  # works on tar.*, zip, 7z and more
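As a slightly more defensive variant of the tar loop, here is a sketch that defaults to a dry run and also cleans up directories that end up empty. It assumes no newlines in member names and that it is run in the directory where the archive was extracted; undo_tar and its --delete switch are hypothetical names. Note that it also removes pre-existing files that share a name with an archive member.

```shell
# Undo an accidental in-place "tar xf".
# Usage: undo_tar ARCHIVE [--delete]   (without --delete it only prints)
undo_tar() (
  archive=$1
  mode=${2:-}
  # Reverse sort so that files are handled before their parent directories.
  tar tf "$archive" | LC_ALL=C sort -r | while IFS= read -r entry; do
    entry=${entry%/}
    case $entry in ''|.) continue ;; esac
    if [ -d "$entry" ] && [ ! -L "$entry" ]; then
      # rmdir only succeeds on empty directories, so other content survives.
      if [ "$mode" = --delete ]; then
        rmdir -- "$entry" 2>/dev/null || :
      else
        printf 'rmdir %s\n' "$entry"
      fi
    elif [ -e "$entry" ] || [ -L "$entry" ]; then
      if [ "$mode" = --delete ]; then
        rm -- "$entry"
      else
        printf 'rm %s\n' "$entry"
      fi
    fi
  done
)
```

Run undo_tar foo.tar first, read the list, then repeat with --delete once it looks right.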

Instead of removing the files, you could move them to their intended destination.

tar tf foo.tar | while read -r file; do
  if [ -d "$file" ]; then continue; fi
  mkdir -p "/intended/destination/${file%/*}"
  mv -- "$file" "/intended/destination/$file"
done

Using FUSE

Instead of depending on external tools, you can (on most unices) use FUSE to manipulate archives using ordinary filesystem commands.

You can use Fuse-zip to peek into a zip, extract it with cp, list its contents with find, etc.

mkdir /tmp/foo.d
fuse-zip foo.zip /tmp/foo.d
## Remove the files that were extracted mistakenly (GNU/BSD find)
(cd /tmp/foo.d && find . \! -type d -print0) | xargs -0 rm
## Remove the files that were extracted mistakenly (zsh)
rm /tmp/foo.d/**(:"s~/tmp/foo.d/~~"^/)
## Extract the contents where you really want them
cp -Rp /tmp/foo.d /intended/destination
fusermount -u /tmp/foo.d
rmdir /tmp/foo.d

AVFS creates a view of your entire directory hierarchy where all archives have an associated directory (same name with # tacked on at the end) that appears to hold the archive content.

mountavfs
## Remove the files that were extracted mistakenly (GNU/BSD find)
(cd ~/.avfs/"$PWD/foo.zip#" && find . \! -type d -print0) | xargs -0 rm
## Remove the files that were extracted mistakenly (zsh)
rm ~/.avfs/$PWD/foo.zip\#/**/*(:"s~$HOME/.avfs/$PWD/foo.zip#~~"^/)
## Extract the contents where you really want them
cp -Rp ~/.avfs/"$PWD/foo.zip#" /intended/destination
umountavfs

By date

Assuming there hasn't been any other activity in the same hierarchy besides your extraction, you can tell the extracted files by their recent ctime. If you just created or moved the zip file, you can use it as a cutoff; otherwise use ls -lctr to determine a suitable cutoff time. If you want to make sure not to remove the zips, there's no need for manual approval: find is perfectly capable of excluding them. Here are example commands using zsh or find; note that the -cmin and -cnewer primaries are not in POSIX but exist on Linux (and other systems with GNU find), *BSD and OSX.

find . \! -name '*.zip' -type f -cmin -5 -exec rm {} +  # extracted <5 min ago
rm **/*~*.zip(.cm-6)  # zsh, extracted ≤5 min ago
find . -type f -cnewer foo.zip -exec rm {} +  # created or moved after foo.zip

With GNU find, FreeBSD and OSX, another way to specify the cutoff time is to create a file and use touch to set its mtime to the cutoff time.

touch -d … cutoff
find . -type f -newercm cutoff -delete
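The cutoff-file idea can be combined with a list-first habit. Below is a sketch, not a tested recipe for every system: it relies on the non-POSIX -cnewer primary mentioned above, the function name extracted_since and the --delete switch are invented for illustration, and the excluded name patterns are just an assumption about what your archives are called. Keeping the reference file outside the tree being searched avoids any doubt about it matching itself.

```shell
# List (default) or delete regular files whose ctime is newer than the
# mtime of a reference file.
# Usage: extracted_since REFERENCE [--delete]
extracted_since() (
  ref=$1
  if [ "${2:-}" = --delete ]; then
    find . -type f ! -name '*.zip' ! -name '*.tar' -cnewer "$ref" -exec rm -- {} +
  else
    find . -type f ! -name '*.zip' ! -name '*.tar' -cnewer "$ref" -print
  fi
)
```

Directories created by the extraction are left behind; empty ones can be removed afterwards with rmdir.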

Instead of removing the files, you could move them to their intended destination. Here's a way with GNU/*BSD/OSX find, creating directories in the destination as needed.

find . \! -name . -cmin -5 -type f -exec sh -c '
    for x; do
      mkdir -p "$0/${x%/*}"
      mv "$x" "$0/$x"
    done
  ' /intended/destination {} +

Zsh equivalent (almost: this one reproduces the entire directory hierarchy, not just the directories that will contain files):

autoload zmv
mkdir -p ./**/*(/cm-3:s"|.|/intended/destination|")
zmv -Q '(**/)(*)(.cm-3)' /intended/destination/'$1$2'

Warning, I haven't tested most of the commands in this answer. Always review the list of files before removing (run echo first, then rm if it's ok).
