Ubuntu – Convert wordlist.txt files to make them work in GoldenDict

dictionary, file format, format conversion, goldendict

I've got five wordlist.txt dictionary files, downloaded from dict.cc, and I need to convert them into either .ifo / .dict.dz / .idx.gz or .index / .dict(.dz) format so that they work with GoldenDict.

  • Can someone please provide some insight on how to go about it the right way?
  • I would also like to keep the formatting of those dictionaries as clean, simple, and structured as possible once they are converted.

Best Answer

I don't have much time on my hands right now, but here's a quick overview:*

  1. Download Pyglossary from this link. Extract the archive and move the folder to a location of your choice. Make pyglossary.pyw executable and start it by double-clicking on the file.

    Alternatively, you can download a prepackaged older version of Pyglossary from the Google Code project page and install it with the Ubuntu Software Center or gdebi. Because this version is quite old (dating back to 2009), your mileage may vary.

  2. Rename your wordlist files to whatever you would like your dictionary to be called (e.g. DICTCC_EN_DE.txt)

  3. Point Pyglossary to your file and set the input and output formats as follows:

    Input: tabfile, output: stardict

  4. Click apply. WARNING: Depending on the length of the word list, the conversion might be very memory-intensive.

  5. Import the dictionary into GoldenDict. This may also use a lot of system memory (building the index, etc.).
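If Pyglossary complains about the input in step 3, it may help to normalize the word list first. The following is only a rough sketch, not part of Pyglossary: it assumes the dict.cc export is a UTF-8 text file with "#" comment lines at the top and tab-separated columns where the first two columns are the headword and its translation. The function names (to_tabfile_lines, convert_file) are made up for this example. It reads line by line, so memory use stays low even for large word lists.

```python
def to_tabfile_lines(lines):
    """Yield 'headword<TAB>definition' strings from a dict.cc-style export.

    Assumption: '#' lines are comments, columns are tab-separated, and
    only the first two columns matter for the dictionary entry.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Skip lines that do not have at least two columns.
        if len(fields) < 2:
            continue
        word, definition = fields[0].strip(), fields[1].strip()
        if word and definition:
            yield f"{word}\t{definition}"


def convert_file(src_path, dst_path):
    """Write a cleaned two-column tab file, one entry per line."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for entry in to_tabfile_lines(src):
            dst.write(entry + "\n")
```

You could then call something like convert_file("wordlist.txt", "DICTCC_EN_DE.txt") and point Pyglossary at the resulting file.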

Formatting should be acceptable (not perfect, but usable). For further formatting tweaks (e.g. to remove the annoying gender markers), you would have to edit the input file manually.
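Editing the input file does not have to be done by hand, though. Here is a small sketch of how the gender markers could be stripped with a script before the conversion. It assumes the markers look like {m}, {f}, {n} and {pl}; check your own files and adjust the pattern if they use something else. The name strip_gender_tags is just for illustration.

```python
import re

# Assumed marker set; extend the alternation if your files use other tags.
GENDER_TAG = re.compile(r"\{(?:m|f|n|pl)\}")


def strip_gender_tags(line):
    """Remove gender markers from every tab-separated field of one line."""
    fields = line.rstrip("\r\n").split("\t")
    cleaned = []
    for field in fields:
        # Replace the marker with a space, then collapse runs of whitespace
        # so that no double or trailing spaces are left behind.
        without_tags = GENDER_TAG.sub(" ", field)
        cleaned.append(" ".join(without_tags.split()))
    # Re-join with tabs so the column layout is preserved.
    return "\t".join(cleaned)
```

Run it over each line of the word list and write the result to a new file, so that the original download stays untouched in case the pattern removes more than you wanted.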


*: Please feel free to edit this post as you see fit to add more detail. I might do so myself later.