AppleScript character count in words with combining diacritics

applescript

If I put a combining accent on, say, "x", with a "w" before it and a "y" after it, like this: "wx̀y", the text displays correctly in BBEdit and Word, and both programs report a character count of 4. However, this AppleScript:

set a to "wx̀y"

display dialog (number of characters of a)

… will reply "3".

This is in Smile and Script Debugger; my Script Editor goes into beach-ball mode seconds after opening. It is a real problem when indexing some exotic texts, not just a curiosity, so any suggestions would be most welcome.

Best Answer

I did a bit of experimentation and got varied results. Fortunately, one of those results was 4. I got it by outsourcing the calculation to Python, whose len() counts the combining accent as a character of its own:

on run {input, parameters}
    set var to "wx̀y"

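    -- The heredoc below is Python 2 code, so "python" must resolve to a Python 2 interpreter.
    -- The value of var is pasted into a Python string literal, so it must not contain a single quote.
    set output to (do shell script ¬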
"python - <<EOF
# -*- coding: utf-8 -*-
print len(u'" & var & "')
EOF")

    return output
end run
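
To see why the two counts differ, here is a minimal Python 3 sketch. It is not part of the script above, and the function names are made up for illustration. It assumes the string is stored as "x" followed by U+0300 (the combining grave accent), which may not match your text exactly. len() counts code points, while skipping combining marks gives the smaller count that AppleScript reported:

```python
import unicodedata


def count_code_points(text):
    """Count Unicode code points; a combining mark counts as its own item."""
    return len(text)


def count_base_characters(text):
    """Count only characters that are not combining marks.

    unicodedata.combining() returns a non-zero class for combining marks,
    so this is a rough stand-in for "visible characters". It is not a full
    grapheme-cluster segmentation.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


# Assumed encoding of the sample: w, x, U+0300 (combining grave accent), y.
sample = "wx\u0300y"

print(count_code_points(sample))      # 4
print(count_base_characters(sample))  # 3
```

Which number is right for indexing depends on whether the index should treat the accent as a separate item. Also note that letters with a precomposed form (for example "e" plus U+0301) can shrink to one code point after NFC normalization, so counts may vary between texts unless they are normalized the same way first.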