Windows – Putting a file encoded with UTF-8 on the clipboard with CLIP.EXE under Windows

clipboard unicode windows

I have a Java program that first writes a text file and then invokes "CMD /C CLIP < textfile" so that it can put an arbitrarily large file on the Windows clipboard. This works well.

Now I've found that there is an encoding issue. I have ensured that the file is valid UTF-8 (including the BOM, and it opens correctly in vim), but it appears that CLIP.EXE does not honor the BOM and switch the expected encoding to UTF-8.

So, how should I tell Windows and/or CLIP.EXE that this file is UTF-8 encoded and treat it accordingly? (If another encoding like UTF-16 or UTF-32 would work better for Unicode I can use that instead).

The system showing the behavior is Windows 7 and the default codepage in CMD.EXE is 850. I need this to work on systems I do not have any control over.

Best Answer

UTF-16 works for me, on my Windows 7 (my OEM ('cmd') codepage is 437, though it shouldn't matter).

How I tested:

  1. Open notepad, type some non-ASCII text (or copy from some site with many languages, like http://wikipedia.org)
  2. Save As, choose Encoding: Unicode (which means UTF-16), save as UTF16.txt
  3. In cmd, type clip < UTF16.txt
  4. Open new notepad, paste

Result: Text appears correctly.
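
On the Java side, the same idea could look roughly like the sketch below. It writes the text as UTF-16LE with an FF FE byte-order mark, which is the layout Notepad uses for its "Unicode" encoding, and then feeds the file to CLIP.EXE on standard input. The class and method names are made up for illustration. The sketch assumes that clip.exe is on the PATH and that it treats a Java-written file the same way as the Notepad-saved one; only the Notepad case was actually tested above. It starts clip directly with redirected input instead of going through "CMD /C CLIP < textfile", which should be equivalent.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; the names are illustrative only.
class ClipUtf16 {

    // Writes the text as UTF-16LE preceded by the FF FE byte-order mark.
    // StandardCharsets.UTF_16LE does not emit a BOM by itself, so it is
    // written explicitly here.
    static void writeUtf16LeWithBom(Path file, String text) throws IOException {
        try (OutputStream out = Files.newOutputStream(file)) {
            out.write(0xFF);
            out.write(0xFE);
            out.write(text.getBytes(StandardCharsets.UTF_16LE));
        }
    }

    // Starts clip with the file as its standard input (Windows only).
    // Assumption: clip.exe can be found on the PATH.
    // Returns the exit code of the clip process.
    static int copyFileToClipboard(Path file) throws IOException, InterruptedException {
        Process process = new ProcessBuilder("clip")
                .redirectInput(file.toFile())
                .redirectOutput(ProcessBuilder.Redirect.INHERIT)
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        return process.waitFor();
    }
}
```

A caller would create a temporary file, call `ClipUtf16.writeUtf16LeWithBom(file, text)`, and then `ClipUtf16.copyFileToClipboard(file)`. Because the data goes through a file rather than the command line, the size of the text is not limited by command-line length.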