Windows – Files with non-ASCII characters in file name in a Windows batch file

Tags: batch, batch file, command line, unicode, windows

On a usual (Western) Windows computer, I have a file

файл.txt

with non-ASCII letters in the file name. How can I do the following from a .bat file?

dir файл.txt
ren файл.txt file.txt

etc.?

I tried placing the above commands into a file mybat.bat (using UTF-8 or UTF-16 encoding), but it does not work even if I run it as cmd /u /c mybat.bat.

Note: the question is not how to put those letters in a batch file, but how to make the batch file do what is expected (in my example, to list the file and then rename it).

Note: the command dir > log.txt writes the file name файл.txt to the log as ????.txt. However, plain dir displays the file on the screen correctly as файл.txt.

Best Answer

Your main problem is the font (see https://stackoverflow.com/questions/9321419/unicode-utf-8-text-file-gibberish-on-windows-console-trying-to-display-hebrew). With the correct font you won't get question marks, so set the command prompt to use a font such as Courier New. Then you'll be able to type, display, and echo such characters.

If you then find that some commands still have issues, try chcp 65001. (In answer to your question: rest assured that chcp 65001 only affects that one cmd prompt window.) You need chcp 65001 for redirection to work with characters beyond U+007F; for example, a command like dir >asdf needs chcp 65001 in order to write a file containing those characters. Your ren command, however, works fine without 65001.
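A minimal sketch of the redirection case described above, assuming the commands run in a single cmd session and that log.txt (the name used in the question) is the output file:

```batch
@echo off
rem Switch this console session to the UTF-8 code page (65001).
rem ">nul" only hides the "Active code page" confirmation message.
chcp 65001 >nul

rem With code page 65001 active, the redirected listing is written as UTF-8,
rem so non-ASCII names should appear in log.txt instead of question marks.
dir > log.txt
```

The code page change applies only to the console window in which it runs; other cmd windows keep their own setting.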

Note: the OP points out a correction to this. His font was fine, but he needed chcp 65001.

Another case that needs chcp 65001 is a batch file encoded in UTF-8. Without it, even a batch file that merely contains letters like привет will have them converted into question marks when it is executed.

The OP also points out a great workaround for the problem that Notepad saves UTF-8 with a BOM, whereas chcp 65001 means UTF-8 without a BOM. If a batch file encoded as UTF-8 with a BOM contains just, e.g., dir or echo привет, it will not work even if cmd is using code page 65001, because cmd mixes the BOM into the first line. The workaround is to put the command(s) on the second line onward. (Alternatively, use a text editor that saves UTF-8 without a BOM.)
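Putting the pieces together, a batch file along the following lines may do what the question asks. This is a sketch under stated assumptions, not a verified recipe: it assumes the file is saved as UTF-8, that файл.txt sits in the current directory, and that the first line can be sacrificed in case the editor wrote a BOM.

```batch
rem Sacrificial first line. If the editor wrote a BOM it lands here and not on a real command.
@echo off
rem Make cmd read the remaining lines of this UTF-8 file as UTF-8.
chcp 65001 >nul

rem List the file, then rename it. Quotes guard against spaces in names.
dir "файл.txt"
ren "файл.txt" "file.txt"
```

If the file does start with a BOM, the first line may print a harmless "not recognized" message, since cmd sees the BOM bytes glued to the front of rem. Saving without a BOM avoids that message, and the first line can then be dropped.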
