So, I have a directory filled with other directories, and I was wondering whether it is possible to remove files that have no size. Typically these files are 0 bytes, and since I want to merge all these subdirectories, I could end up replacing a perfectly legitimate file with a weightless 0-byte file, and there goes my legitimate file. Is there any way to remove the zero-byte files?
Ubuntu – Remove files of 0 bytes in size via command line
command line, files
Related Solutions
Assuming you are on 14.04 (using python3), the small script below recursively lists the files in a given directory. It identifies each file's mimetype with the file command:
file --mime-type -b filename
Additionally, you can extend the script by adding a shutil command (e.g. .move / .copy) at the same level as the print command.
Adding mimetypes
For a combined search, you can add (or remove) mimetypes to search for by editing the filetypes tuple.
The script
#!/usr/bin/env python3
import os
import subprocess

source_dir = "/path/to/directory"
filetypes = ("image", "video")

for root, dirs, files in os.walk(source_dir):
    for name in files:
        file = os.path.join(root, name)
        # ask the file command for the mimetype, e.g. "image/png"
        ftype = subprocess.check_output(['file', '--mime-type', '-b', file]).decode('utf-8').strip()
        # compare only the main type (the part before the slash)
        if ftype.split("/")[0] in filetypes:
            print(file)
How to use it
Copy the script into an empty file, set the directory to list (source_dir) and the mimetype(s) to look for (filetypes), save it as list_files.py and run it by the command:
python3 /path/to/list_files.py
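As a sketch of the shutil extension mentioned above, the function below copies matching files into a target directory instead of printing them. Note that it swaps the file command for Python's mimetypes.guess_type, which guesses from the file name's extension only and never inspects the content. That swap, the function name copy_by_type and the target_dir parameter are assumptions made for illustration; they are not part of the original script.

```python
import mimetypes
import os
import shutil


def copy_by_type(source_dir, target_dir, filetypes=("image", "video")):
    """Copy files whose guessed main mimetype is in filetypes.

    Returns the list of destination paths. Hypothetical helper: files with
    the same name overwrite each other in target_dir, and target_dir should
    live outside source_dir so the walk does not pick up its own copies.
    """
    os.makedirs(target_dir, exist_ok=True)
    copied = []
    for root, dirs, files in os.walk(source_dir):
        for name in files:
            path = os.path.join(root, name)
            # guess_type returns (type, encoding); type is None if unknown
            guessed, _ = mimetypes.guess_type(path)
            if guessed and guessed.split("/")[0] in filetypes:
                copied.append(shutil.copy(path, target_dir))
    return copied
```

Replacing shutil.copy with shutil.move would move the files instead of copying them.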
Assuming both files are not huge, the python script below will do the job as well.
How it works
Both files are read by the script. Each line in file_1 (the file that has precedence) is split at the directory you entered for the file in the head section (in your example /mnt/app/).
Subsequently, the lines in file_1 are written to the output file (the merged file). At the same time, lines from file_2 are removed from the line list if the identifying string (the section after the mount point) occurs in the line. Finally, the "remaining" lines of file_2 (for which no dupe exists in file_1) are written to the output file as well. The result:
file_1:
1058,b8203a236b4f15316e516165a6546666,/mnt/app/Certificados/ca.crt
2694,8a815adefde4fa0c263e74832b15de64,/mnt/app/Certificados/ca.db.certs/01.pem
136,77bf2e5313dbaac4df76a4b72df2e2ad,/mnt/app/Certificados/ca.db.index
file_2:
1058,b8203a236b4f15316e516165a6546666,/mnt/app/Certificados/ca.crt
3,72b2ac90f7f3ff075a937d6be8fc3dc3,/mnt/temp/Certificados/ca.db.serial
2694,8a815adefde4fa0c263e74832b15de64,/mnt/app/Certificados/ca.db.certs/01.pem
136,77bf2e5313dbaac4df76a4b72df2e2ad,/mnt/app/Certificados/ca.db.index
merged:
1058,b8203a236b4f15316e516165a6546666,/mnt/app/Certificados/ca.crt
2694,8a815adefde4fa0c263e74832b15de64,/mnt/app/Certificados/ca.db.certs/01.pem
136,77bf2e5313dbaac4df76a4b72df2e2ad,/mnt/app/Certificados/ca.db.index
3,72b2ac90f7f3ff075a937d6be8fc3dc3,/mnt/temp/Certificados/ca.db.serial
The script
#!/usr/bin/env python3
#---set the path to file1, file2 and the mountpoint used in file1 below
f1 = "/path/to/file_1"; m_point = "/mnt/app"; f2 = "/path/to/file_2"
merged = "/path/to/merged_file"
#---
# pair each line of file_1 with the part of its path after the mount point
lines1 = [(l, l.split(m_point)[-1]) for l in open(f1).read().splitlines()]
lines2 = open(f2).read().splitlines()
with open(merged, "a") as out:
    for l in lines1:
        out.write(l[0]+"\n")
        # drop the lines of file_2 that contain the same identifying string
        lines2 = [line for line in lines2 if l[1] not in line]
    # whatever is left in file_2 has no counterpart in file_1
    for l in lines2:
        out.write(l+"\n")
How to use
- Copy the script into an empty file, save it as merge.py
- In the head section of the script, set the paths to f1 (file_1), f2, the path to the merged file and the mount point as mentioned in file_1.
Run it by the command:
python3 /path/to/merge.py
Edit
Or a tiny bit shorter:
#!/usr/bin/env python3
#---set the path to file1, file2 and the mountpoint used in file1 below
f1 = "/path/to/file_1"; m_point = "/mnt/app"; f2 = "/path/to/file_2"
merged = "/path/to/merged_file"
#---
lines = lambda f: open(f).read().splitlines()
lines1 = lines(f1); checks = [l.split(m_point)[-1] for l in lines1]
# keep only the lines of file_2 that match none of the identifying strings
lines2 = [l for l in lines(f2) if not any(c in l for c in checks)]
with open(merged, "a") as out:
    for item in lines1+lines2:
        out.write(item+"\n")
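The merge can also be written as a function over lists of lines, which makes it easy to check against the example listing. The sketch below is a slightly stricter variation, not the script's own method: it treats a line of file_2 as a dupe only if it ends with the identifying string, rather than containing it anywhere. The function name merge_lines is hypothetical, and it assumes the path is the last field on every line, as in the example.

```python
def merge_lines(lines1, lines2, m_point):
    """Return lines1 followed by the lines of lines2 without a counterpart.

    A line of lines2 counts as a dupe if it ends with the part of a
    lines1 path that follows m_point (assumes the path is the last field).
    """
    keys = tuple(l.split(m_point)[-1] for l in lines1)
    # str.endswith accepts a tuple and is False for an empty tuple
    rest = [l for l in lines2 if not l.endswith(keys)]
    return list(lines1) + rest


file_1 = [
    "1058,b8203a236b4f15316e516165a6546666,/mnt/app/Certificados/ca.crt",
    "2694,8a815adefde4fa0c263e74832b15de64,/mnt/app/Certificados/ca.db.certs/01.pem",
    "136,77bf2e5313dbaac4df76a4b72df2e2ad,/mnt/app/Certificados/ca.db.index",
]
file_2 = [
    "1058,b8203a236b4f15316e516165a6546666,/mnt/app/Certificados/ca.crt",
    "3,72b2ac90f7f3ff075a937d6be8fc3dc3,/mnt/temp/Certificados/ca.db.serial",
    "2694,8a815adefde4fa0c263e74832b15de64,/mnt/app/Certificados/ca.db.certs/01.pem",
    "136,77bf2e5313dbaac4df76a4b72df2e2ad,/mnt/app/Certificados/ca.db.index",
]
for line in merge_lines(file_1, file_2, "/mnt/app"):
    print(line)
```

With the example data this prints the three lines of file_1 followed by the ca.db.serial line, matching the merged listing above.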
Best Answer
Use the find command to find files by size and print file names to standard output.
Substitute -print with -delete to delete the files rather than print them on screen.
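The find command line itself is not shown above. With GNU find, a form such as find /path/to/directory -type f -size 0 -print is a common way to list empty files, but treat that as an assumption rather than the answer's original text. In keeping with the Python scripts elsewhere on this page, here is a minimal sketch of the same idea: list the zero-byte files first, and delete them only when asked. The function name remove_empty_files and its delete parameter are hypothetical.

```python
import os


def remove_empty_files(top, delete=False):
    """Return all regular files of 0 bytes below top.

    With delete=False (the default) nothing is removed, which mirrors
    printing first and deleting only once the list looks right.
    """
    found = []
    for root, dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            # skip symlinks so a link to an empty file is left alone
            if not os.path.islink(path) and os.path.getsize(path) == 0:
                found.append(path)
    if delete:
        for path in found:
            os.remove(path)
    return found
```

Calling remove_empty_files("/path/to/directory") only reports the files; pass delete=True once the list contains nothing you want to keep.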