To select files with human-readable names, you can use the [:print:] (printable) character class. You will find more about such classes in the manual for grep.
find . -type f -size 1033c -name "[[:print:]]*" ! -executable
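To see what each test in that command contributes, here is a minimal sketch run against a scratch directory. The directory and the file names (match.txt, too_big.txt, script.sh) are made up for illustration, and -executable is assumed to be available (it is a GNU find extension).

```shell
# Hypothetical scratch directory and file names, for illustration only.
demo_dir=$(mktemp -d)

printf '%1033s' '' > "$demo_dir/match.txt"     # exactly 1033 bytes (all spaces)
printf '%1034s' '' > "$demo_dir/too_big.txt"   # one byte too many for -size 1033c
printf '%1033s' '' > "$demo_dir/script.sh"     # right size ...
chmod +x "$demo_dir/script.sh"                 # ... but rejected by ! -executable

# Same tests as the command above, pointed at the scratch directory.
found=$(find "$demo_dir" -type f -size 1033c -name "[[:print:]]*" ! -executable)
echo "$found"

rm -rf "$demo_dir"
```

Only the path ending in match.txt should be listed. Note that the c suffix of -size counts bytes, not characters.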
On second thought, the "human-readable" requirement might refer to the file's content rather than its name. In other words, you would be searching for text files. That is a little trickier. As @D_Bye suggested in a comment, you should then use the file command to determine the file content type. But it would not be a good idea to run file after a pipe, because it would complicate the task of displaying the file's name. Here's what I suggest:
find . -type f -size 1033c ! -executable -exec sh -c 'file -b "$0" | grep -q text' {} \; -print
This is briefly how the file part works:
- The -exec predicate executes sh -c 'file -b "$0" | grep -q text' FILENAME for each FILENAME that satisfies all the previous conditions (type, size, non-executable).
- For each of those files, a shell (sh) runs this short script: file -b "$0" | grep -q text, with $0 set to the filename. The double quotes keep a filename that contains spaces in one piece.
- The file program determines the content type of each file and outputs this information. The -b option prevents printing the name of each tested file.
- grep filters the output coming from the file program, searching for lines containing "text". (Run file on a few files to see what its typical output looks like.)
- But grep does not output the filtered text, because it has the -q (quiet) option given. All it does is set its exit status to either 0 (which represents "true": the text was found) or 1 (which represents "false": the text "text" did not appear in the output from file).
- The true/false exit status coming from grep is passed on by sh to find and acts as the final result of the whole -exec sh -c 'file -b "$0" | grep -q text' {} \; test.
- If the above test returned true, the -print action is executed (i.e. the name of the tested file is printed).
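The exit-status mechanism described in the list can be tried out in isolation. The following is only a sketch: the scratch directory and the two file names are hypothetical, the size and executable tests are left out to keep it short, and the second half assumes the file utility is installed.

```shell
# grep -q prints nothing; only its exit status tells whether "text" was seen.
if printf 'ASCII text\n' | grep -q text; then echo "match"; else echo "no match"; fi
if printf 'data\n' | grep -q text; then echo "match"; else echo "no match"; fi

# Hypothetical scratch directory and file names, for illustration only.
demo_dir=$(mktemp -d)
printf 'hello world\n' > "$demo_dir/notes file.txt"   # text; the name contains a space
printf '\000\001\002\003' > "$demo_dir/blob.bin"      # binary bytes

# -exec acts as a test: -print only runs when the sh pipeline exits with 0.
text_files=$(find "$demo_dir" -type f -exec sh -c 'file -b "$0" | grep -q text' {} \; -print)
echo "$text_files"

rm -rf "$demo_dir"
```

The first two lines print "match" and "no match". With file available, the find call should list only the path of "notes file.txt".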
Yes, you can use find to look for non-executable files of the right size and then use file to check for ASCII. Something like:
find . -type f -size 1033c ! -executable -exec file {} + | grep ASCII
The question, however, isn't as simple as it sounds. 'Human readable' is a horribly vague term. Presumably, you mean text. OK, but what kind of text? Latin character ASCII only? Full Unicode? For example, consider these four files:
$ cat file1
abcde
$ cat file2
αβγδε
$ cat file3
abcde
αβγδε
$ cat file4
#!/bin/sh
echo foo
These are all text and human readable. Now, let's see what file makes of them:
$ file *
file1: ASCII text
file2: UTF-8 Unicode text
file3: UTF-8 Unicode text
file4: POSIX shell script, ASCII text executable
So, the find command above will only find file1 (for the sake of this example, let's imagine those files were 1033 bytes long). You could expand the find to look for the string text:
find . -type f -size 1033c ! -executable -exec file {} + | grep -w text
With the -w, grep will only print lines where text is found as a stand-alone word. That should be pretty close to what you want, but I can't guarantee that there is no other file type whose description might also include the string text.
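To compare the two grep filters without touching the filesystem, the sketch below feeds them the file output shown earlier. The fifth line (file5) is a made-up description, not real file output; it is there only to show what -w rejects.

```shell
# The first four lines are copied from the `file *` listing above.
# file5 is a MADE-UP description, included only to exercise -w.
listing='file1: ASCII text
file2: UTF-8 Unicode text
file3: UTF-8 Unicode text
file4: POSIX shell script, ASCII text executable
file5: made-up context data'

ascii_count=$(printf '%s\n' "$listing" | grep -c ASCII)     # file1 and file4
plain_count=$(printf '%s\n' "$listing" | grep -c text)      # also hits "context"
word_count=$(printf '%s\n' "$listing" | grep -cw text)      # whole word only

echo "ASCII: $ascii_count, text: $plain_count, text as a word: $word_count"

# Keep only the names; this assumes the names themselves contain no colon.
names=$(printf '%s\n' "$listing" | grep -w text | cut -d: -f1)
echo "$names"
```

In the real pipeline, file4 would already have been dropped by ! -executable if its execute bit is set, which is why it does not show up there even though its description contains ASCII.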
Best Answer
find doesn't have sophisticated options like ls. If you want ls -h, you need to call ls.
I recommend the -xdev option to avoid recursing into other filesystems, which would be useless if you're concerned about disk space.
If you use zsh as your shell, then instead of using find, you can use glob qualifiers. Limiting the file size is simple: L followed by a size; the size can have an optional unit before the number. If you don't care about the maximum depth, you can use **/ to recurse into subdirectories. If you care about maximum depth, it's more cumbersome as zsh glob patterns lack a way to express “at most n occurrences”. To avoid cross-device recursion, use the d glob qualifier; you need to find the device number, which you can display with the stat command under Linux (stat -c %d / to display just the number) or with zsh's own stat builtin (run zmodload zsh/stat to load it).
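For the find route, combining -xdev with a call to ls could look roughly like the sketch below. The scratch directory, the file names, and the +1k threshold are assumptions chosen so the example runs quickly; the k size suffix is a common extension rather than something POSIX requires.

```shell
# Hypothetical scratch directory and file names, for illustration only.
demo_dir=$(mktemp -d)
printf '%4096s' '' > "$demo_dir/big.bin"    # 4096 bytes, above the threshold
printf 'tiny\n'    > "$demo_dir/small.txt"  # 5 bytes, below it

# -xdev keeps find on one filesystem; ls -lh supplies the human-readable sizes.
report=$(find "$demo_dir" -xdev -type f -size +1k -exec ls -lh {} +)
echo "$report"

rm -rf "$demo_dir"
```

On a real system you would replace the scratch directory with the tree you want to inspect and raise the size threshold to whatever counts as large for you.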