Python

Kindle Vocabulary Builder into Memrise

Introduction

A script that pulls the Kindle Vocabulary Builder database and converts it into a Memrise course.

The latest Kindle Paperwhite (second generation) offers the Vocabulary Builder feature. With Vocabulary Builder, you can look up words with the dictionary and memorize their definitions.

For my self-education I use http://memrise.com/ (both on my phone and desktop PC). I thought it would be great to pull the words I’ve looked up while reading English books on my Kindle and push them into my Memrise course.

How does it work?

  1. The script reads through vocab.db and looks for all English words (in the table WORDS).
  2. Each of the words (aka stems) is used for a definition lookup in the Cambridge Dictionary.
  3. The definition, usage example, pronunciation and audio mp3 of each word are retrieved and inserted into a new SQLite database, memrise.db (the mp3 is only written to disk, in the folder audio).
  4. Each new word is written to a text file, in a format suitable for bulk word import into Memrise.
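
Step 1 can be sketched with Python's built-in sqlite3 module. This is only an illustration, not the script's actual code: the table name WORDS comes from the description above, while the column names stem and lang, the 'en' language code and the helper name read_english_stems are my assumptions about the vocab.db layout.

```python
import sqlite3


def read_english_stems(conn):
    """Return the distinct English stems stored in the WORDS table.

    Assumption: WORDS has columns `stem` and `lang`, and English
    entries are tagged with lang = 'en'.
    """
    rows = conn.execute(
        "SELECT DISTINCT stem FROM WORDS WHERE lang = ? ORDER BY stem",
        ("en",),
    )
    return [stem for (stem,) in rows]


if __name__ == "__main__":
    # A tiny in-memory stand-in for vocab.db, so the sketch runs anywhere.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE WORDS (word TEXT, stem TEXT, lang TEXT)")
    conn.executemany(
        "INSERT INTO WORDS VALUES (?, ?, ?)",
        [("thinning", "thin", "en"), ("mere", "mere", "en"), ("Haus", "Haus", "de")],
    )
    print(read_english_stems(conn))
```

Against a real file you would pass sqlite3.connect("vocab.db") instead of the in-memory connection, after checking the actual column names.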

Prerequisites

  • Kindle Paperwhite (or newer)
  • vocab.db file (retrieved from your Kindle, from /Volumes/Kindle/system/vocabulary/)
  • Python 3
  • BeautifulSoup

References

I borrowed heavily from two GitHub projects:

TODOs

  • Parametrize hardcoded values, especially the English-Polish language pair
  • Upload audio files with pronunciation

Usage

DB conversion using the script

MBP:kindle-to-memrise jhartman$ ./kindle2memrise.py -h
usage: kindle2memrise.py [-h] [-kindleDB KINDLEDB]
                         [-dictionaryDB DICTIONARYDB] [-output OUTPUT]
                         [-revision REVISION] [-debug]

optional arguments:
  -h, --help            show this help message and exit
  -kindleDB KINDLEDB    Kindle vocabulary db filename (default: vocab.db)
  -dictionaryDB DICTIONARYDB
                        Memrise dictionary db filename (default: memrise.db)
  -output OUTPUT        Output file to import to memrise.com (default:
                        memrise.txt)
  -revision REVISION    Revision to output. Not specfied (default): last, 0 -
                        all
  -debug                Enable debug

The tool does not require any parameters: by default it looks for vocab.db in the current folder and writes its output files to that same folder.
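
For readers curious how such an interface can be declared, here is a minimal argparse sketch that approximates the help text above. It is my reconstruction from the usage output, not the script's source; the function name build_parser and the int type for -revision are assumptions, and the generated help will differ slightly in wording.

```python
import argparse


def build_parser():
    """Declare options mirroring the usage output shown above."""
    parser = argparse.ArgumentParser(
        prog="kindle2memrise.py",
        # Appends "(default: ...)" to each help string automatically.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("-kindleDB", default="vocab.db",
                        help="Kindle vocabulary db filename")
    parser.add_argument("-dictionaryDB", default="memrise.db",
                        help="Memrise dictionary db filename")
    parser.add_argument("-output", default="memrise.txt",
                        help="Output file to import to memrise.com")
    # Assumption: revisions are integers; None means "last revision".
    parser.add_argument("-revision", type=int, default=None,
                        help="Revision to output (omit for last, 0 for all)")
    parser.add_argument("-debug", action="store_true", help="Enable debug")
    return parser


if __name__ == "__main__":
    # No arguments: every option falls back to its default.
    args = build_parser().parse_args([])
    print(args.kindleDB, args.dictionaryDB, args.output)
```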

Pay special attention to the generated memrise.txt:

MBP:kindle-to-memrise jhartman$ tail memrise.txt
mere    Sam. Used to emphasize that something is not large or important. Example: It costs a mere twenty dollars.   mɪər
thinning    Rozcieńczać, rozrzedzać. To make a substance less thick, often by adding a liquid to it. Example: N/A   θɪn
carnivore   Zwierzę mięsożerne. An animal that eats meat. Example: N/A  ˈkɑːnɪvɔːr
embrace Obejmować (się). If you embrace someone, you put your arms around them, and if two people embrace, they put their arms around each other.. Example: We are always eager to embrace the latest technology.   ɪmˈbreɪs

This is the file that will be used to bulk-add words to your course.
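
Judging by the sample above, each line holds three columns: the word, a description (translation, definition and example) and the pronunciation. The sketch below shows one way to build such a line. I am assuming the columns are separated by tabs, and the helper name format_memrise_line and its parameters are made up for illustration.

```python
def format_memrise_line(word, translation, definition, example, pronunciation):
    """Build one line with three columns: word, description, pronunciation."""
    description = "{}. {}. Example: {}".format(
        translation.rstrip("."), definition.rstrip("."), example or "N/A"
    )
    fields = (word, description, pronunciation)
    # Collapse whitespace inside each field: a stray tab or newline
    # would otherwise shift the columns during import.
    cleaned = [" ".join(field.split()) for field in fields]
    # Assumption: columns are tab-separated.
    return "\t".join(cleaned)


if __name__ == "__main__":
    print(format_memrise_line(
        "carnivore", "Zwierzę mięsożerne", "An animal that eats meat.",
        None, "ˈkɑːnɪvɔːr",
    ))
```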

Bulk word add

Go to your Course, press Edit and in the Advanced options, look for Bulk add words:

Bulk Add words

Open memrise.txt in an editor (e.g. Notepad), select all, copy it, paste it into the Memrise Bulk Add form, then press Add:

Bulk Add words

That’s it!

Download

Download the script from GitHub: https://github.com/jaroslawhartman/kindle-to-memrise

Installing pygraphviz on macOS 10.12 (Sierra)

My first installation attempt failed because graphviz was missing:

(django_python_env) MBP:mysite jhartman$ pip3 install pygraphviz
Collecting pygraphviz
 Downloading pygraphviz-1.3.1.zip (123kB)
 100% |████████████████████████████████| 133kB 1.4MB/s
Building wheels for collected packages: pygraphviz
 Running setup.py bdist_wheel for pygraphviz ... error
 Complete output from command /Users/jhartman/scripts/django_python_env/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/private/var/folders/t7/v05f2p5j5kgdd8t_px1lv1jc0000gn/T/pip-build-zn5secqs/pygraphviz/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /var/folders/t7/v05f2p5j5kgdd8t_px1lv1jc0000gn/T/tmp34txy01_pip-wheel- --python-tag cp36:
 running bdist_wheel
 running build
 running build_py
 creating build
 creating build/lib.macosx-10.12-x86_64-3.6
 creating build/lib.macosx-10.12-x86_64-3.6/pygraphviz
 copying pygraphviz/__init__.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz
 copying pygraphviz/agraph.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz
 copying pygraphviz/graphviz.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz
 copying pygraphviz/release.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz
 copying pygraphviz/version.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz
 creating build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 copying pygraphviz/tests/__init__.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 copying pygraphviz/tests/test.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 copying pygraphviz/tests/test_attribute_defaults.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 copying pygraphviz/tests/test_attributes.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 copying pygraphviz/tests/test_clear.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 copying pygraphviz/tests/test_drawing.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 copying pygraphviz/tests/test_edge_attributes.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 copying pygraphviz/tests/test_graph.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 copying pygraphviz/tests/test_html.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 copying pygraphviz/tests/test_layout.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 copying pygraphviz/tests/test_node_attributes.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 copying pygraphviz/tests/test_readwrite.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 copying pygraphviz/tests/test_string.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 copying pygraphviz/tests/test_subgraph.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 copying pygraphviz/tests/test_unicode.py -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz/tests
 running egg_info
 writing pygraphviz.egg-info/PKG-INFO
 writing dependency_links to pygraphviz.egg-info/dependency_links.txt
 writing top-level names to pygraphviz.egg-info/top_level.txt
 reading manifest file 'pygraphviz.egg-info/SOURCES.txt'
 reading manifest template 'MANIFEST.in'
 warning: no previously-included files matching '*~' found anywhere in distribution
 warning: no previously-included files matching '*.pyc' found anywhere in distribution
 warning: no previously-included files matching '.svn' found anywhere in distribution
 no previously-included directories found matching 'doc/build'
 writing manifest file 'pygraphviz.egg-info/SOURCES.txt'
 copying pygraphviz/graphviz.i -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz
 copying pygraphviz/graphviz_wrap.c -> build/lib.macosx-10.12-x86_64-3.6/pygraphviz
 running build_ext
 building 'pygraphviz._graphviz' extension
 creating build/temp.macosx-10.12-x86_64-3.6
 creating build/temp.macosx-10.12-x86_64-3.6/pygraphviz
 clang -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -I/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/include/python3.6m -c pygraphviz/graphviz_wrap.c -o build/temp.macosx-10.12-x86_64-3.6/pygraphviz/graphviz_wrap.o
 pygraphviz/graphviz_wrap.c:2954:10: fatal error: 'graphviz/cgraph.h' file not found
 #include "graphviz/cgraph.h"
 ^
 1 error generated.
 error: command 'clang' failed with exit status 1

OK, so the next obvious step was to install the missing library using brew:

MBP:~ jhartman$ brew install libcgraph
Updating Homebrew...
==> Auto-updated Homebrew!
Updated 1 tap (homebrew/core).
==> Updated Formulae
ammonite-repl apache-flink checkstyle libass pcap_dnsproxy supervisor zanata-client
angular-cli bastet confuse neo4j s6 swiftformat

Error: No available formula with the name "libcgraph"
==> Searching for a previously deleted formula...
Error: No previously deleted formula found.
==> Searching for similarly named formulae...
Error: No similarly named formulae found.
==> Searching taps...
Error: No formulae found in taps.
MBP:~ jhartman$ brew install graphviz
==> Installing dependencies for graphviz: libtool, libpng, freetype, fontconfig, jpeg, libtiff, webp, gd
==> Installing graphviz dependency: libtool
==> Downloading https://homebrew.bintray.com/bottles/libtool-2.4.6_1.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring libtool-2.4.6_1.sierra.bottle.tar.gz
==> Using the sandbox
==> Caveats
In order to prevent conflicts with Apple's own libtool we have prepended a "g"
so, you have instead: glibtool and glibtoolize.
==> Summary
🍺 /usr/local/Cellar/libtool/2.4.6_1: 70 files, 3.7MB
==> Installing graphviz dependency: libpng
==> Downloading https://homebrew.bintray.com/bottles/libpng-1.6.29.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring libpng-1.6.29.sierra.bottle.tar.gz
🍺 /usr/local/Cellar/libpng/1.6.29: 26 files, 1.2MB
==> Installing graphviz dependency: freetype
==> Downloading https://homebrew.bintray.com/bottles/freetype-2.8.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring freetype-2.8.sierra.bottle.tar.gz
🍺 /usr/local/Cellar/freetype/2.8: 63 files, 2.6MB
==> Installing graphviz dependency: fontconfig
==> Downloading https://homebrew.bintray.com/bottles/fontconfig-2.12.1_2.sierra.bottle.1.tar.gz
######################################################################## 100.0%
==> Pouring fontconfig-2.12.1_2.sierra.bottle.1.tar.gz
==> Regenerating font cache, this may take a while
==> /usr/local/Cellar/fontconfig/2.12.1_2/bin/fc-cache -frv
🍺 /usr/local/Cellar/fontconfig/2.12.1_2: 487 files, 3.1MB
==> Installing graphviz dependency: jpeg
==> Downloading https://homebrew.bintray.com/bottles/jpeg-8d.sierra.bottle.2.tar.gz
######################################################################## 100.0%
==> Pouring jpeg-8d.sierra.bottle.2.tar.gz
🍺 /usr/local/Cellar/jpeg/8d: 19 files, 708.3KB
==> Installing graphviz dependency: libtiff
==> Downloading https://homebrew.bintray.com/bottles/libtiff-4.0.8.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring libtiff-4.0.8.sierra.bottle.tar.gz
🍺 /usr/local/Cellar/libtiff/4.0.8: 245 files, 3.4MB
==> Installing graphviz dependency: webp
==> Downloading https://homebrew.bintray.com/bottles/webp-0.6.0.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring webp-0.6.0.sierra.bottle.tar.gz
🍺 /usr/local/Cellar/webp/0.6.0: 36 files, 2.0MB
==> Installing graphviz dependency: gd
==> Downloading https://homebrew.bintray.com/bottles/gd-2.2.4_1.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring gd-2.2.4_1.sierra.bottle.tar.gz
🍺 /usr/local/Cellar/gd/2.2.4_1: 34 files, 1MB
==> Installing graphviz
==> Downloading https://homebrew.bintray.com/bottles/graphviz-2.40.1.sierra.bottle.1.tar.gz
######################################################################## 100.0%
==> Pouring graphviz-2.40.1.sierra.bottle.1.tar.gz
🍺 /usr/local/Cellar/graphviz/2.40.1: 536 files, 12.9MB

Everything looked fine, but pip still reported the same problem with the missing library. I think the reason was in the warning shown above:

In order to prevent conflicts with Apple's own libtool we have prepended a "g"
so, you have instead: glibtool and glibtoolize.

OK, so where did the library actually end up?

MBP:~ jhartman$ find /usr/local/Cellar/ -name "*graphviz*"
/usr/local/Cellar//graphviz
/usr/local/Cellar//graphviz/2.40.1/.brew/graphviz.rb
/usr/local/Cellar//graphviz/2.40.1/include/graphviz
/usr/local/Cellar//graphviz/2.40.1/include/graphviz/graphviz_version.h
/usr/local/Cellar//graphviz/2.40.1/lib/graphviz
/usr/local/Cellar//graphviz/2.40.1/share/graphviz
/usr/local/Cellar//graphviz/2.40.1/share/man/man7/graphviz.7
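
The same lookup can be scripted. The sketch below walks a Homebrew Cellar and returns the include and lib folders of a graphviz version that actually ships graphviz/cgraph.h, the header the compiler complained about. The function name find_graphviz_dirs is hypothetical, and the default path simply matches the layout seen in the find output above.

```python
from pathlib import Path


def find_graphviz_dirs(cellar="/usr/local/Cellar"):
    """Return (include_dir, lib_dir) for an installed graphviz version.

    Returns None when no version directory contains graphviz/cgraph.h.
    """
    formula_dir = Path(cellar) / "graphviz"
    if not formula_dir.is_dir():
        return None
    # Check version directories, highest name first (plain string ordering).
    for version_dir in sorted(formula_dir.iterdir(), reverse=True):
        header = version_dir / "include" / "graphviz" / "cgraph.h"
        if header.is_file():
            return version_dir / "include", version_dir / "lib"
    return None


if __name__ == "__main__":
    print(find_graphviz_dirs())
```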

So we’re almost there. Let’s convince pip to use the include and lib folders found above (by passing them via --install-option):

(django_python_env) MBP:mysite jhartman$ pip3 install pygraphviz --install-option="--include-path=/usr/local/Cellar/graphviz/2.40.1/include/" --install-option="--library-path=/usr/local/Cellar/graphviz/2.40.1/lib/"
/Users/jhartman/scripts/django_python_env/lib/python3.6/site-packages/pip/commands/install.py:194: UserWarning: Disabling all use of wheels due to the use of --build-options / --global-options / --install-options.
 cmdoptions.check_install_build_global(options)
Collecting pygraphviz
 Using cached pygraphviz-1.3.1.zip
Skipping bdist_wheel for pygraphviz, due to binaries being disabled for it.
Installing collected packages: pygraphviz
 Running setup.py install for pygraphviz ... done
Successfully installed pygraphviz-1.3.1

We’re done!
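
For reference, the pip invocation above can be assembled programmatically, which is handy when the graphviz version in the path changes. This is a sketch only: the helper name pygraphviz_install_command is made up, it calls pip as a module of the current interpreter rather than through pip3, and it merely builds the argument list without running it. As far as I know, newer pip releases no longer accept --install-option, so treat it as specific to the pip version used in this post.

```python
import sys


def pygraphviz_install_command(include_dir, lib_dir):
    """Return the pip argument list for the given graphviz directories."""
    return [
        sys.executable, "-m", "pip", "install", "pygraphviz",
        "--install-option=--include-path={}".format(include_dir),
        "--install-option=--library-path={}".format(lib_dir),
    ]


if __name__ == "__main__":
    command = pygraphviz_install_command(
        "/usr/local/Cellar/graphviz/2.40.1/include/",
        "/usr/local/Cellar/graphviz/2.40.1/lib/",
    )
    # Print instead of executing, so nothing gets installed by accident.
    print(" ".join(command))
```

To actually run it, the list can be handed to subprocess.run(command, check=True).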