I need to convert the character encoding in some text files created by a third-party app on my MBP Catalina 10.15.6. I'm in unfamiliar waters here, so please indulge my ignorance. Also, please note that the 3rd party app is not the subject of this question – understanding how to reconcile the different character sets used in macOS is the subject.
I use an application (LTspice) on my MBP occasionally. There is also a Windows version of LTspice. LTspice provides a GUI for creating a circuit schematic, and LTspice creates a plain text file (.asc extension) to encode the schematic and other directives and parameters created in the LTspice GUI; this is the file I need to convert.
I assumed the .asc files were not ASCII-encoded, and so I ran the file utility on the .asc file to learn how it was encoded:
% file -I '/Users/seamus/Documents/LTspice/Rounding demo-MacMod.asc'
/Users/seamus/Documents/LTspice/Rounding demo-MacMod.asc: application/octet-stream; charset=binary
binary?!… This made no sense to me. I can open and edit this file in TextEdit. All of the characters are recognizable ASCII characters – which I understand to be a subset of UTF-8.
My next step was to open the file in the BBEdit app. This revealed new information. According to BBEdit, the demo-MacMod.asc file reported by file -I as binary is actually in "UTF-16 Little Endian" format. I know this is confusing… In an effort to clarify, I've placed a couple of screenshots below to illustrate how this file is rendered in BBEdit and TextEdit. The extra byte (¿) in the BBEdit screenshot is a NUL.
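The interleaved NUL bytes can also be seen from Terminal with a byte dump. The following is a minimal sketch that uses a small generated sample (the text abc, an arbitrary stand-in) rather than the real .asc file; the variable name sample is illustrative only.

```shell
# Build a tiny UTF-16LE sample as a stand-in for the real .asc file.
sample=$(mktemp)
printf 'abc\n' | iconv -f US-ASCII -t UTF-16LE > "$sample"

# Dump the file byte by byte: each ASCII character is followed by a NUL (\0).
od -c "$sample"

# Four characters occupy eight bytes in UTF-16LE.
wc -c < "$sample"
```

In UTF-16LE every ASCII character takes two bytes, the character itself followed by a zero byte, which would account for the extra NUL that BBEdit shows.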
I need a method (that I can automate/script) to convert these "UTF-16 Little Endian" files to "US-ASCII". I thought that the iconv tool would be perfect for this job:
iconv -l
...
# long list of character encodings which included:
US-ASCII
UCS-2LE UNICODELITTLE
UCS-2LE UNICODELITTLE looked like the best match to "UTF-16 Little Endian", but:
% iconv -f 'UCS-2LE UNICODELITTLE' -t 'US-ASCII' '/Users/seamus/Documents/LTspice/Rounding demo-MacMod.asc' > '/Users/seamus/Documents/LTspice/Rounding demo-MacMod-iconvASCII.asc'
iconv: conversion from UCS-2LE UNICODELITTLE unsupported
iconv: try 'iconv -l' to get the list of supported encodings
I don't know why I get this response. Clearly iconv -l says that UCS-2LE UNICODELITTLE is supported. Whether it's the correct match for "UTF-16 Little Endian" is another question, but I find nothing in the list that looks to be a better match.
So this is the gist of my question. I think it could be answered in one of two ways:
- What is my error in the use of iconv, or in my reading of man iconv or iconv -l?
- Is there another option for converting "UTF-16 Little Endian" to "US-ASCII" that can be automated/scripted?
Best Answer
With BBEdit this is easy. First open the file in BBEdit. If you let BBEdit install its command line tool you can even do this from Terminal with bbedit /path/to/filename. If the file has opened with the wrong encoding, select File > Reopen Using Encoding > correct encoding. I think it would be worth trying to reopen using UTF-16 Little-Endian & UTF-16 Little-Endian, no BOM to see if either of those opens the file as desired. When you have the file correctly opened, select File > Save As... and, in the Save As dialog box, choose the desired encoding, & also the line ending type if that matters.
For dealing with inverted red question marks, probably null (ASCII 0), select Text > Zap Gremlins... for the dialog below...
Using the options shown there should give a state like what you see in TextEdit. Try different options on copies of a couple of your files.
Because BBEdit has a command line tool, you should be able to script it once you have the correct options. BBEdit also works with AppleScript & Automator.
You can download BBEdit for free from the link I've given. It will start in demo mode, & when the demo mode expires it will continue to run in free mode where the features you need are still available.
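For a scripted conversion that does not go through BBEdit, iconv may still be workable. In the output of iconv -l, names printed on the same line are alternative names for a single encoding, so UCS-2LE and UNICODELITTLE would each be passed to -f on their own, not together as one quoted string. The sketch below assumes the files really are UTF-16 Little Endian without a byte order mark, uses the UTF-16LE name instead of UCS-2LE, and runs on a generated sample; utf16le_to_ascii and the file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert one UTF-16LE file to US-ASCII.
# $1 = source file, $2 = destination file.
utf16le_to_ascii() {
    iconv -f UTF-16LE -t US-ASCII "$1" > "$2"
}

# Demo on a generated sample, so no real .asc file is touched.
tmpdir=$(mktemp -d)
printf 'abc\n' | iconv -f US-ASCII -t UTF-16LE > "$tmpdir/sample.asc"

utf16le_to_ascii "$tmpdir/sample.asc" "$tmpdir/sample-ascii.asc"

# The converted copy is plain ASCII with no NUL bytes.
cat "$tmpdir/sample-ascii.asc"
```

Two caveats: if a file begins with a byte order mark, -f UTF-16 is probably the better choice, since it reads the mark to pick the byte order; and iconv stops with an error on any character that has no US-ASCII equivalent, so it is worth trying this on copies first.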