How would I remove all non-ascii characters from one file? Would there be a specific command to perform this?
grep --colour='auto' -P -n '[^\x00-\x7F]' /usr/local/...
I believe this finds the characters within the workflow, but how would I remove all the instances of the characters in question?
Best Answer
ASCII characters are characters in the range from 0 to 177 (octal) inclusively.
To delete characters outside of this range in a file, use
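A minimal sketch of such a command, matching the options and file names this answer describes. The sample input created by printf is made up for illustration and is not part of the original question.

```shell
# Sample input for illustration only (hypothetical content with UTF-8 bytes);
# skip this line when working on a real file.
printf 'caf\303\251 na\303\257ve\n' >file

# Delete every byte outside the octal range 0-177 and write the result to newfile.
LC_ALL=C tr -dc '\0-\177' <file >newfile
```

With this sample input, newfile should contain only the ASCII bytes of the line, and file itself is not modified.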
The tr command is a utility that works on single characters, either substituting them with other single characters (transliteration), deleting them, or compressing runs of the same character into a single character.
The command above would read from file and write the modified content to newfile. The -d option to tr makes the utility delete characters (instead of transliterating them), and -c makes it consider characters outside the given interval (instead of inside).
LC_ALL=C makes sure that every byte value makes up a valid character. Without it, some tr implementations would abort if they found sequences of bytes that don't form valid characters in the locale's character encoding.
To replace the original file with the modified one, use
This renames the new file to the name of the old file after tr has completed successfully. If tr does not complete successfully, either because it could not read the original file or could not write to the new file, the original file will be left unchanged.
Alternatively, to preserve as much as possible of the metadata (permissions etc.) of the original file, use
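One way this could be done is sketched below: instead of renaming, the filtered content is copied back into the existing file and the temporary file is removed. The exact command is an assumption, and the sample input is hypothetical.

```shell
# Sample input for illustration only; skip this line when working on a real file.
printf 'caf\303\251 na\303\257ve\n' >file

# Filter into newfile, then overwrite the contents of the existing file
# (keeping its inode, ownership and permissions), then remove the temporary file.
LC_ALL=C tr -dc '\0-\177' <file >newfile && cat newfile >file && rm newfile
```

Because cat newfile >file truncates and rewrites the original in place rather than replacing it, the file keeps its metadata. The trade-off is that a failure during that copy could leave the original partially written, in which case newfile would still be present since rm would not run.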