Removing words comprised of upper and lowercase letters

grep, text processing

I have a file called file.txt. It contains words made up of upper- and lowercase letters, as well as words that combine upper- or lowercase letters with digits. I would like to filter this file so that the output is free of the words that consist only of letters and contain both upper- and lowercase letters. For example, the input file.txt:

Aaa
aBb
aB
Aa12
12aA
123
123Ab
AAA
aaa

In this file there are words made up of both upper- and lowercase letters (e.g. Aaa, aBb), and words that contain upper- and lowercase letters AND digits (e.g. 123Ab). In addition, there are words that contain only lowercase letters (e.g. aaa) or only uppercase letters (e.g. AAA).
I would like to remove only the words that consist entirely of letters and mix upper AND lowercase (e.g. Aaa, aBb), so the output is as follows:

Aa12
12aA
123
123Ab
AAA
aaa

Any ideas?

Best Answer

grep -Exv '[A-Za-z]*([A-Z][a-z]|[a-z][A-Z])[A-Za-z]*'

Explanation

  • The idea is to match the opposite of what you want first, i.e. those lines that consist only of letters and contain both upper- and lower-case ones. This uses grep -Ex, i.e. grep with extended regex (-E) that must match the whole line (-x). The -v flag then inverts the match, i.e. grep returns those lines that do not match the regex.
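  • The central part ([A-Z][a-z]|[a-z][A-Z]) matches a single upper-case letter followed by a lower-case letter, or vice versa. Any line made up only of letters that contains both cases must have such an adjacent pair somewhere, so this is enough to detect mixed case.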
  • The outer part [A-Za-z]*...[A-Za-z]* means that the rest of the line must comprise upper- or lower-case letters only.
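To check the explanation above against the sample from the question, here is a minimal sketch that recreates file.txt and runs the command on it. The LC_ALL=C prefix is an addition, not part of the original answer: it assumes you want [A-Z] and [a-z] to be treated as plain ASCII ranges, which some locales may not guarantee.

```shell
# Recreate the sample input from the question, one word per line.
printf '%s\n' Aaa aBb aB Aa12 12aA 123 123Ab AAA aaa > file.txt

# -E: extended regex, -x: the regex must match the whole line,
# -v: print only the lines that do NOT match.
# LC_ALL=C is an added precaution (an assumption, not in the original answer)
# so that the bracket ranges cover ASCII letters only.
LC_ALL=C grep -Exv '[A-Za-z]*([A-Z][a-z]|[a-z][A-Z])[A-Za-z]*' file.txt
# Prints:
# Aa12
# 12aA
# 123
# 123Ab
# AAA
# aaa
```

Aaa, aBb and aB are dropped because each consists only of letters and contains an adjacent upper/lower pair. Aa12 and 123Ab survive because the digits stop the whole-line match.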