How to find non-ASCII characters in text files

Tags: ascii, groovy, java

Is there a tool that can scan a small text file and look for any character not in the simple ASCII character set?

A simple Java or Groovy script would also do.

Best Answer

Well, it's still here after an hour, so I may as well answer it. Here's a simple filter in C that prints only the non-ASCII characters from its input, and exits with status 0 if there weren't any and 1 if there were. It reads from standard input only.

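#include <stdio.h>
#include <ctype.h>   /* isascii() is a POSIX extension, not ISO C */

int main(void)
{
    int c, flag = 0;

    /* Echo every byte outside the 7-bit ASCII range (0-127). */
    while ((c = getchar()) != EOF)
        if (!isascii(c)) {
            putchar(c);
            flag = 1;   /* remember that at least one was seen */
        }

    return flag;   /* 0 = all ASCII, 1 = non-ASCII found */
}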
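
Since the question asked for Java or Groovy, here is a rough Java sketch of the same idea. It is not part of the C filter above. The class name NonAsciiFinder, the method findNonAscii, and the "line:byte: 0xhh" report format are made up for illustration. It assumes the file is small enough to read into memory, takes the file path as its only argument instead of reading standard input, and counts positions in bytes, so a multi-byte UTF-8 character shows up as several hits.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is an assumption for this sketch.
class NonAsciiFinder {

    // Returns one "line:byte: 0xhh" entry for every byte above 0x7F.
    // Lines and byte positions are 1-based; lines are split on '\n' only.
    public static List<String> findNonAscii(byte[] data) {
        List<String> hits = new ArrayList<>();
        int line = 1;
        int column = 0;
        for (byte b : data) {
            int value = b & 0xFF;          // treat the byte as unsigned
            if (value == '\n') {
                line++;
                column = 0;
                continue;
            }
            column++;
            if (value > 0x7F) {            // outside 7-bit ASCII
                hits.add(line + ":" + column + ": 0x" + Integer.toHexString(value));
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java NonAsciiFinder <file>");
            System.exit(2);
        }
        List<String> hits = findNonAscii(Files.readAllBytes(Paths.get(args[0])));
        for (String hit : hits) {
            System.out.println(hit);
        }
        // Same convention as the C filter: 0 if clean, 1 if anything was found.
        System.exit(hits.isEmpty() ? 0 : 1);
    }
}
```

Run it as `java NonAsciiFinder somefile.txt` after compiling. Working on raw bytes keeps the check independent of the file's encoding, at the cost of reporting byte positions rather than character columns.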