I have a number of XML files containing Sanskrit texts that are to be converted to TeX. LaTeX's hyphenation only works on words of up to 63 characters; anything longer will not be hyphenated. I would like to grep my files for these overlong words, but grep doesn't appear to be the right tool here. Some of the words are in IAST transliteration, others in Devanāgarī. I suppose a Perl one-liner could do that?
Perl one-liner to find words longer than 63 characters
grep, latex, perl, regular expression
Best Answer
In an attempt to give this question a proper answer, based on the comments (and heeding Sobrique's note that XML should really be parsed with an XML parser):
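A minimal sketch of the idea in Python (not the Perl one-liner the question asks for), under a few assumptions: the files are well-formed XML, a "word" is a run of Unicode letters and combining marks, and length is counted in code points. Whether that count matches exactly what TeX counts towards its limit is not something this sketch verifies. The names `long_words`, `is_word_char` and `MAX_LEN` are made up for illustration.

```python
import unicodedata
import xml.etree.ElementTree as ET
from itertools import groupby

MAX_LEN = 63  # limit quoted in the question; treated here as an assumption


def is_word_char(ch):
    # Letters (L*) and combining marks (M*) both count as part of a word, so
    # Devanagari vowel signs and decomposed IAST diacritics stay attached.
    return unicodedata.category(ch)[0] in "LM"


def long_words(xml_text, max_len=MAX_LEN):
    """Return the words in the XML text content longer than max_len code points."""
    # Use a real XML parser so that tag and attribute names are never counted.
    root = ET.fromstring(xml_text)
    text = " ".join(root.itertext())
    # Split the text into maximal runs of word characters.
    words = (
        "".join(run)
        for is_word, run in groupby(text, key=is_word_char)
        if is_word
    )
    return [w for w in words if len(w) > max_len]
```

Because the text is taken from the parsed tree rather than matched against the raw file, long element or attribute names cannot produce false hits. Characters outside the letter and mark categories, such as a zero-width joiner, split a word in this sketch, which may or may not be what is wanted for Devanāgarī input.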