Ubuntu – How to unzip a Japanese ZIP file, and avoid mojibake/garbled characters

encoding japanese unzip zip

I received a ZIP file from a Japanese customer.

When I try to unzip it, the file and folder names are garbled:

$ unzip ~/Downloads/【新入荷ECM】資料.zip
...
 inflating: БyРVУ№Й╫ECMБzОСЧ┐/123_ГЖБ[ГXГPБ[ГX.xlsx

What is the problem, and how can I avoid it?

Best Answer

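The problem is that most ZIP files circulating in Japan have their file and folder names encoded as Shift JIS, which unzip does not decode correctly by default on Ubuntu. Only the names are affected; the contents of the files are extracted intact.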
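
To illustrate the mechanism, here is a minimal Python sketch that reproduces the garbling. It assumes the name bytes are Shift JIS and that they were decoded as CP866; CP866 is only an inference from the garbled output in the question (which fallback code page unzip picks depends on the system locale), and the helper names `garble` and `repair` are made up for this example.

```python
def garble(name, stored="shift_jis", assumed="cp866"):
    """Encode a name the way the archive stores it, then decode it wrongly."""
    return name.encode(stored).decode(assumed)


def repair(garbled, stored="shift_jis", assumed="cp866"):
    """Undo the wrong decoding to recover the original name."""
    return garbled.encode(assumed).decode(stored)


original = "123_ユースケース.xlsx"
garbled = garble(original)
print(garbled)          # 123_ГЖБ[ГXГPБ[ГX.xlsx
print(repair(garbled))  # 123_ユースケース.xlsx
```

The bytes never change; only the table used to turn them into characters does, which is why naming the right encoding is enough to fix the output.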

The solution is to tell unzip which encoding the names use, by adding the -O shift-jis option to your command:

$ unzip -O shift-jis ~/Downloads/【新入荷ECM】資料.zip
...
 inflating: 【新入荷ECM】資料/123_ユースケース.xlsx

This way, the extracted file and folder names are perfectly readable on Ubuntu.
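
If your build of unzip does not accept -O, the same idea can be sketched with Python's standard zipfile module. This is a rough alternative under stated assumptions, not a replacement for the command above: it assumes the names are in CP932 (the Windows variant of Shift JIS) whenever the archive does not flag them as UTF-8, and the function names `real_name` and `extract_with_encoding` are hypothetical.

```python
import shutil
import zipfile
from pathlib import Path

UTF8_FLAG = 0x800  # general-purpose bit 11: the name is stored as UTF-8


def real_name(info, encoding="cp932"):
    """Return the member name as the archive's creator intended it."""
    if info.flag_bits & UTF8_FLAG:
        return info.filename  # zipfile already decoded it correctly
    # zipfile decoded the raw bytes as CP437; undo that, then decode properly.
    return info.filename.encode("cp437").decode(encoding)


def extract_with_encoding(zip_path, dest, encoding="cp932"):
    """Extract zip_path into dest, decoding legacy names with `encoding`."""
    dest = Path(dest).resolve()
    extracted = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            target = (dest / real_name(info, encoding)).resolve()
            try:
                target.relative_to(dest)  # refuse paths that escape dest
            except ValueError:
                raise ValueError(f"unsafe path in archive: {target}") from None
            if info.is_dir():
                target.mkdir(parents=True, exist_ok=True)
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            with zf.open(info) as src, open(target, "wb") as out:
                shutil.copyfileobj(src, out)
            extracted.append(target)
    return extracted
```

Calling `extract_with_encoding("archive.zip", "out")` (both paths are placeholders) writes the members under `out` with their Japanese names restored. On Python 3.11 or later, passing `metadata_encoding="cp932"` to `zipfile.ZipFile` may be a simpler route to the same result.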
