I have a huge .txt file which contains lots of HTML entities representing Unicode characters, like this:
&#21696;&#29282;&#23665;
In Pinyin, this would read "Ai Lao Shan" or "Ai1 Lao2 Shan1", to be more precise.
I need a tool, command line, or Pages/Numbers macro (whatever works) that converts every string like &#....;
in said file into the proper Hanzi, which in this case would be:
哀牢山
Any suggestions for a tool or script or program that runs on macOS?
Best Answer
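You can install recode via the Terminal with Homebrew:
    brew install recode
and then use it to convert HTML to Unicode, like this (the file names are placeholders):
    recode html..utf-8 < input.txt > output.txt
This produces a copy of the file in which each numeric entity is replaced by the character it stands for, so the example from the question comes out as 哀牢山.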
(inspired by @creving's answer on Stack Overflow)