Sunday, March 9, 2014

Adding custom dictionaries to Kobo devices

I recently bought a Kobo Glo. Great little e-reader, that one.

I spent a while struggling to figure out how to install a Turkish-English dictionary on it, because that language pair is not supported by default. I decided to write a little tutorial about installing other dictionaries on a Kobo device, in case I or anyone else ever needs to do this again.

Check whether the dictionary you want is already available

1. Check here whether Kobo already offers the dictionary you want. If it does, download the file, connect your Kobo device to your computer, and place the file in the ".kobo\dict" folder on the device; you are done. If not, go to step 2.

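2. Check here whether someone else has already made a custom dictionary for the language pair you want (an index of custom dictionaries on MobileRead is linked in the comments: http://www.mobileread.com/forums/showthread.php?t=232883). If so, you can skip to step 13 for instructions on how to put the file on your Kobo device.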

Make a custom dictionary

3. Download the dictionary you want in StarDict format. This website has a lot of dictionaries.

4. Unpack the dictionary archive (a .tar file). You will get a .dz, an .idx, and an .ifo file. Rename the .dz file to .gz and extract it so that you end up with a .dict file. Give the .dict, .idx, and .ifo files matching, descriptive names, e.g. tur_eng.dict, tur_eng.idx, and tur_eng.ifo.

5. Go here and download the latest version of Penelope (according to its author, the code is hosted at https://github.com/pettarin/penelope).

6. Unzip the Penelope script and place it in the folder of your choice, for example "c:\Penelope". Put the dictionary files in the same folder as the Penelope files.

7. Install Python 3.3.4.

8. Download the marisa.zip file here or here. It contains two Windows executables, built from the open-source Marisa code by the user EbokJunkie on mobileread.com. The Penelope script invokes these executables. You can put them anywhere; I put them in "C:\tmp".

9. Right-click the file penelope3.py and choose "Edit with IDLE". Find the variables MARISA_BUILD_PATH and MARISA_REVERSE_LOOKUP_PATH and set them to "c:/tmp/marisa-build.exe" and "c:/tmp/marisa-reverse-lookup.exe" respectively, or to wherever you saved the marisa files. Save penelope3.py.

10. Open the command prompt and make the Penelope folder your current folder by typing the command "cd c:\Penelope".

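11. Type the following command to start the conversion: "penelope3.py -p tur_eng -f tur -t eng --output-kobo". The value after -p is the file-name prefix of the dictionary files from step 4, so it has to match the names you gave them. Replace "tur" with the language you want to translate from and "eng" with your target language.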

12. You will now have a Kobo dictionary file, for example dicthtml-tur-eng.zip.
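
The manual unpacking in step 4 can also be scripted. The following is only a minimal sketch, assuming the download is a tar archive (optionally compressed) that contains one .dict.dz, one .idx, and one .ifo file; the function name unpack_stardict and the example paths are hypothetical and not part of Penelope. It relies on the fact, used in step 4, that a .dz file can be decompressed like a .gz file.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def unpack_stardict(archive, out_dir, prefix):
    """Extract a StarDict archive and write prefix.dict, prefix.idx, prefix.ifo.

    Hypothetical helper: assumes the archive holds exactly one dictionary.
    Only use it on archives you trust, since it extracts every member.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # tarfile.open detects plain, gzip and bzip2 tarballs automatically.
    with tarfile.open(archive) as tar:
        tar.extractall(out_dir)

    results = {}
    # sorted() snapshots the listing before any new files are written.
    for path in sorted(out_dir.rglob("*")):
        if path.name.endswith(".dict.dz"):
            # A .dz file is gzip-compatible, so gzip can decompress it.
            target = out_dir / (prefix + ".dict")
            with gzip.open(path, "rb") as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            results["dict"] = target
        elif path.suffix in (".idx", ".ifo"):
            target = out_dir / (prefix + path.suffix)
            if path.resolve() != target.resolve():
                shutil.copyfile(path, target)
            results[path.suffix[1:]] = target
    return results
```

For example, unpack_stardict("dictionary.tar.bz2", r"c:\Penelope", "tur_eng") would leave tur_eng.dict, tur_eng.idx, and tur_eng.ifo in the Penelope folder, ready for step 11 (the archive name here is made up).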

Transfer the dictionary file to your Kobo device

13. Connect your Kobo device to your computer and go to the ".kobo\dict" folder on the reader.

14. To get your Kobo device to recognize the dictionary file, it has to replace one of the original dictionaries, which means it must carry that dictionary's file name. For example, I don't use the Spanish dictionary, so I renamed the file "dicthtml-es-en.zip" on my Kobo device to "dicthtml-es-en-backup.zip" to keep a backup in case I need it later. Then I renamed my Turkish-English dictionary to "dicthtml-es-en.zip".

15. Unplug your Kobo device, tap a word in a book, and select the "Spanish" dictionary; it will now look up words in your custom dictionary.
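
Steps 13 and 14 can be sketched as a small script as well. This is an illustration under assumptions, not an official procedure: the function name install_dictionary is hypothetical, and the path of the ".kobo\dict" folder depends on where your computer mounts the reader. The script keeps the first backup it makes, so running it again with a newer custom dictionary does not overwrite the original Kobo file.

```python
import shutil
from pathlib import Path


def install_dictionary(custom_zip, dict_dir, replaced_name="dicthtml-es-en.zip"):
    """Back up an original Kobo dictionary and put a custom one in its place.

    Hypothetical helper; dict_dir is the ".kobo/dict" folder on the reader.
    """
    dict_dir = Path(dict_dir)
    original = dict_dir / replaced_name
    # "dicthtml-es-en.zip" becomes "dicthtml-es-en-backup.zip"
    backup = original.with_name(original.stem + "-backup.zip")

    if not original.exists() and not backup.exists():
        raise FileNotFoundError("no dictionary named %s to replace" % replaced_name)

    # Only back up once, so a later run never overwrites the real original.
    if original.exists() and not backup.exists():
        original.rename(backup)

    # The custom dictionary takes over the original file name.
    shutil.copyfile(custom_zip, original)
    return original, backup
```

With the reader mounted as drive E: (an assumption), the call would look like install_dictionary(r"c:\Penelope\dicthtml-tur-eng.zip", r"E:\.kobo\dict").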

Credits:

26 comments:

  1. Thanks very much! I could make and install a German English dictionary into my Kobo glo. I am really happy.

  2. (Translated from Turkish:) Instead of the Star English dictionary, could we convert a more comprehensive dictionary such as the OED? Can we get access to the source files for these?

    1. Please comment in English, my Turkish is not very good yet and this blog is supposed to be in English. :)
      As far as I know, we are limited to the dictionaries available in StarDict format, unless you can find a way to convert dictionaries.

  3. This worked like a charm. The only issue I faced was that os.rename wasn't working in the Penelope script.

  4. Hi, thank you. It works well:)

  5. I would be forever grateful to you for this tutorial. Worked like a charm on my Kobo Aura.

  6. Hi, I'm Alberto Pettarin, the author of Penelope.

    SG pointed me to this post. I am glad that you find the tool useful and took time to write this tutorial.

    Just two updates:

    1. the code is now hosted on GitHub, including some documentation: https://github.com/pettarin/penelope

    2. on this MobileRead thread there is an index of custom dictionaries, mostly done with penelope, for several languages: http://www.mobileread.com/forums/showthread.php?t=232883

    (Maybe you want to update the post, thank you.)

    1. Thank you for creating the tool and also informing me of the updates; I updated my blog post accordingly. Don't hesitate to let me know if you have any other updates or comments. :)

    2. Excellent, thank you.

      Alberto

  7. Thanks for the post.
    I tried, but it didn't work for me. I encountered the following error:

    UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 1: ordinal not in range(128)

    can you give me some hints to solve this?

    thanks

    Nam

  8. Thanks a lot, really simple to follow.
    Notice that if you want a definition dictionary, and not translation, you have to set the same "from" and "to" language.

  9. Hello, is there a link where I can download that English - Turkish dictionary you made?:) Many thanks!

  10. Hello, I had problems with the prefix: it just isn't accepted and the operation is aborted (even though I did it exactly as shown here). Can anybody help me? Thanks

  11. Hello, I need help
    I followed the instructions and instead of getting the dicthtml zip file within the penelope folder I get another folder named pycache with a pyc file inside it (small size 15 KB)
    What am I doing wrong?

    1. Nvm, got it to work.
      Now how to combine two dictionaries (part 1 and 2)?

  12. Hello!
    Thanks for great instructions! Only, now when I tried I didn't get a zip-file in step 12. Instead of using the file penelope3.py in step 9 and 11 you should use the file penelope.py (without 3). :)

  13. I have penelope-3.0.1 and there is no penelope3.py file.

  14. Thanks for the detailed instructions. This method worked in the past, but the current firmware quite often replaces the changed dictionary with the default one (I suspect at sync).
    Do you know a way to have the custom dictionary persistent?
    Thanks in advance

  15. Hello!
    TNX for so good manual!
    But...
    I can't find penelope.py when I use penelope 2.1.40 and the same when I use penelope 3.0.1.9 - can't find penelope3.py.
    Where is my mistake?
    Evmolp

  16. I am experiencing the same problem as above. I downloaded and extracted the Penelope files, but there is no penelope file in the directory. There is another "penelope" directory inside the "penelope" folder, and it contains many Python files. Please help. I need to get it to work for use on my Kobo. I am trying to convert the Oxford and Merriam Advanced English-to-English dictionaries. Thanks for the forum.

    1. I am working with StarDict and need to create a Kobo dictionary.

    2. This comment has been removed by the author.

    3. This is the file I am trying to convert:

      http://www.mediafire.com/download/p0hhhfn680yazsv/MWADICT_EN-EN.tar.bz2

  17. I am trying to install the English-to-Hindi dictionary, but when I look up a word in the dictionary, the text shows up as squares.

  18. I discovered this 2014 forum that addressed the issue of Kobo custom dictionaries and I followed the instructions but, although I renamed the existing Kobo English-Spanish dictionary to dicthtml-en-es-backup.zip and my custom dictionary to dicthtml-en-es.zip, my reader continued using the English-Spanish dictionary, despite the fact that it was now in ...-backup form. If anyone is still monitoring this forum I would be grateful to have a suggestion. Thanks
