Kindle

Kindle Vocabulary Builder into Memrise

Introduction

A script that pulls the Kindle Vocabulary Builder database and converts it into a Memrise course.

The latest Kindle Paperwhite (second generation) offers the Vocabulary Builder feature. With Vocabulary Builder, you can look up words with the dictionary and memorize their definitions.

For my self-education I use http://memrise.com/ (both on my phone and desktop PC). I thought it would be great to pull the words which I’ve looked up while reading English books on my Kindle and push them into my Memrise course.

How does it work?

  1. The script reads through vocab.db to look for all English words (in the table WORDS).
  2. Each of the words (aka stems) is used for a definition lookup in the Cambridge Dictionary.
  3. It retrieves the word definition, usage example, pronunciation and audio mp3, and inserts them into a new SQLite database, memrise.db (the mp3 is written to disk only, in the folder audio).
  4. Each new word is written to a text file, in a format suitable for bulk word import into Memrise.
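
Step 1 can be sketched roughly as below. This is a minimal illustration, not the code of the script itself: the table name WORDS comes from the description above, while the column names `stem` and `lang` and the helper name `read_english_stems` are assumptions about the vocab.db layout.

```python
import sqlite3


def read_english_stems(db_path):
    """Return the distinct English stems stored in the WORDS table.

    Assumption: WORDS has a `stem` column (base form of the word)
    and a `lang` column holding a language code such as "en".
    """
    query = (
        "SELECT DISTINCT stem FROM WORDS "
        "WHERE lang LIKE 'en%' ORDER BY stem"
    )
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(query).fetchall()
    finally:
        connection.close()
    # Each row is a 1-tuple; unpack it into a plain list of strings.
    return [stem for (stem,) in rows]
```

Calling `read_english_stems("vocab.db")` would then give the list of stems to look up in the dictionary.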

Prerequisites

  • Kindle Paperwhite (or newer)
  • vocab.db file (retrieved from your Kindle, from /Volumes/Kindle/system/vocabulary/)
  • Python 3
  • BeautifulSoup

References

I heavily sourced from two GitHub projects:

TODOs

  • Parametrize hardcoded values, especially the language pair English-Polish
  • Upload audio files with pronunciation

Usage

DB conversion using the script

MBP:kindle-to-memrise jhartman$ ./kindle2memrise.py -h
usage: kindle2memrise.py [-h] [-kindleDB KINDLEDB]
                         [-dictionaryDB DICTIONARYDB] [-output OUTPUT]
                         [-revision REVISION] [-debug]

optional arguments:
  -h, --help            show this help message and exit
  -kindleDB KINDLEDB    Kindle vocabulary db filename (default: vocab.db)
  -dictionaryDB DICTIONARYDB
                        Memrise dictionary db filename (default: memrise.db)
  -output OUTPUT        Output file to import to memrise.com (default:
                        memrise.txt)
  -revision REVISION    Revision to output. Not specfied (default): last, 0 -
                        all
  -debug                Enable debug

The tool does not require any parameters. By default it searches for vocab.db in the current folder and writes its output files into the same folder.
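
For reference, a command line like the one in the help output can be declared with argparse roughly as follows. This is a reconstruction from the help text only, not the script's actual source; the function name `build_parser` and the integer type of `-revision` are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options listed in the help output."""
    parser = argparse.ArgumentParser(prog="kindle2memrise.py")
    parser.add_argument(
        "-kindleDB", default="vocab.db",
        help="Kindle vocabulary db filename (default: %(default)s)")
    parser.add_argument(
        "-dictionaryDB", default="memrise.db",
        help="Memrise dictionary db filename (default: %(default)s)")
    parser.add_argument(
        "-output", default="memrise.txt",
        help="Output file to import to memrise.com (default: %(default)s)")
    # Assumption: revisions are integers; None stands for "last revision".
    parser.add_argument(
        "-revision", type=int, default=None,
        help="Revision to output. Not specified (default): last, 0 - all")
    parser.add_argument("-debug", action="store_true", help="Enable debug")
    return parser
```

With no arguments, `build_parser().parse_args([])` yields the defaults, which matches the behaviour described above.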

Pay special attention to the generated memrise.txt:

MBP:kindle-to-memrise jhartman$ tail memrise.txt
mere    Sam. Used to emphasize that something is not large or important. Example: It costs a mere twenty dollars.   mɪər
thinning    Rozcieńczać, rozrzedzać. To make a substance less thick, often by adding a liquid to it. Example: N/A   θɪn
carnivore   Zwierzę mięsożerne. An animal that eats meat. Example: N/A  ˈkɑːnɪvɔːr
embrace Obejmować (się). If you embrace someone, you put your arms around them, and if two people embrace, they put their arms around each other.. Example: We are always eager to embrace the latest technology.   ɪmˈbreɪs

This is the file which will be used to bulk-add words to your Course.
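
Judging by the sample above, each line holds three columns: the word, a description (translation, definition and example) and the pronunciation. A minimal sketch of producing such a line is shown below. It assumes the columns are separated by tabs, and the function name `format_memrise_line` is made up for illustration. Unlike the sample output, it strips a trailing period from the definition to avoid a doubled "..".

```python
def format_memrise_line(word, translation, definition,
                        example=None, pronunciation=""):
    """Format one word as a tab-separated line for bulk import.

    Assumption: three tab-separated columns, as in the sample
    memrise.txt (word, description, pronunciation).
    """
    description = "{}. {}. Example: {}".format(
        translation.rstrip("."),
        definition.rstrip("."),
        example or "N/A",  # the sample uses N/A when no example exists
    )
    fields = [word, description, pronunciation]
    # Collapse tabs and newlines inside a field so they cannot
    # break the column layout.
    cleaned = [" ".join(field.split()) for field in fields]
    return "\t".join(cleaned)
```

For example, `format_memrise_line("carnivore", "Zwierzę mięsożerne", "An animal that eats meat", pronunciation="ˈkɑːnɪvɔːr")` produces one line with the example set to N/A.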

Bulk word add

Go to your Course, press Edit and in the Advanced options, look for Bulk add words:

Open memrise.txt in an editor (e.g. Notepad), select all, copy it, paste it into the Memrise Bulk Add form and press Add:

That’s it!

Download

Download the script from GitHub: https://github.com/jaroslawhartman/kindle-to-memrise

How to check if all Calibre books are already in the Amazon cloud?

Preface

My primary software for managing Kindle books is Calibre. I use the Amazon cloud for book synchronisation, and sometimes I’d like to make sure that all the books stored in Calibre have already been sent to the cloud, mainly to avoid re-sending books and creating duplicates.

Getting a list of Amazon books

Refer to the Kindle Library List article.

After executing this step you should have one or more text files listing your books:

Jareks-MacBook-Pro:Downloads jhartman$ java -jar KindleLibrary.jar 1.htm
Amazon book list extractor
Elements found:400
Saving 1.html
Saving 1.txt
Saving 1.xml
Done!
Jareks-MacBook-Pro:Downloads jhartman$ java -jar KindleLibrary.jar 2.htm
Amazon book list extractor
Elements found:400
Saving 2.html
Saving 2.txt
Saving 2.xml
Done!
Jareks-MacBook-Pro:Downloads jhartman$ java -jar KindleLibrary.jar 3.htm
Amazon book list extractor
Elements found:373
Saving 3.html
Saving 3.txt
Saving 3.xml
Done!

Getting a list of Calibre books

Now open Calibre and press the small triangle icon next to the Convert Books icon. Follow the instructions below:

After pressing OK you should get a CSV file with your Calibre books:

Jareks-MacBook-Pro:Downloads jhartman$ head CalibreBooks.csv
title,authors
"Świat pani Malinowskiej ; Trzecia płeć","Tadeusz Dołęga-Mostowicz"
"Drugie życie doktora Murka","Tadeusz Dołęga-Mostowicz"
"Świat pani Malinowskiej","Tadeusz Dołęga-Mostowicz"
"Wysokie Progi","Tadeusz Dołęga-Mostowicz"
"Złota maska","Tadeusz Dołęga-Mostowicz"
"Morderstwo pod cenzura","Marcin Wroński"
"Wąż Marlo t1","Marcin Wroński"

Compare the lists

In the very last step, tidy up the output files.

#!/usr/bin/env bash

awk -F $'\t' '{print $1,$2}' 1.txt > Amazon.tmp
awk -F $'\t' '{print $1,$2}' 2.txt >> Amazon.tmp
awk -F $'\t' '{print $1,$2}' 3.txt >> Amazon.tmp
sort Amazon.tmp > Amazon.txt

awk -F ',' '{print $1,$2}' CalibreBooks.csv | sed 's/"//g' | sort | grep -v "title authors" > Calibre.txt

And finally, use a side-by-side text compare tool (JEdit, WinMerge or any other) to view the differences between Amazon.txt and Calibre.txt.

Review the differences and take the appropriate action:

  • Send missing files to the cloud
  • Remove duplicates in the cloud (yes, these will also be visible in the output)
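
If you prefer not to eyeball a diff, the same comparison can be sketched in Python. This is only an illustration under a few assumptions: the first two tab-separated columns of the Amazon text files are title and author (as the awk commands above suggest), the Calibre CSV has the `title` and `authors` columns shown earlier, and titles and authors are spelled identically in both sources. The function names are made up.

```python
import csv
from collections import Counter


def read_amazon(lines):
    """Collect (title, author) pairs from tab-separated lines."""
    books = []
    for line in lines:
        columns = line.rstrip("\n").split("\t")
        if len(columns) >= 2:
            books.append((columns[0].strip(), columns[1].strip()))
    return books


def read_calibre(csv_file):
    """Collect (title, author) pairs from the Calibre CSV export."""
    # The csv module copes with commas inside quoted titles.
    return [(row["title"].strip(), row["authors"].strip())
            for row in csv.DictReader(csv_file)]


def compare(amazon, calibre):
    """Return (missing in cloud, only in cloud, duplicated in cloud)."""
    missing_in_cloud = sorted(set(calibre) - set(amazon))
    only_in_cloud = sorted(set(amazon) - set(calibre))
    duplicates = sorted(
        book for book, count in Counter(amazon).items() if count > 1)
    return missing_in_cloud, only_in_cloud, duplicates
```

To use it, open 1.txt, 2.txt and 3.txt (UTF-8) and concatenate the results of `read_amazon`, open CalibreBooks.csv with `newline=""` for `read_calibre`, and pass both lists to `compare`.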

I hope it will help.

Kindle Library List

Extracts the list of documents from the Amazon Kindle web page and saves it into txt, xml and html files. The project is stored on GitHub.

How to use:

  1. Download (or build) KindleLibrary.jar
  2. Navigate to the Manage your content and devices Amazon page (tested using Chrome, but I trust it should work with any other web browser)
  3. Switch Show to Docs (Image 1)
  4. Scroll down to reach the end of your list (or until you see the Show more button) (Image 2)
  5. Save the html (File -> Save Page As…, using Complete Webpage). Override the default filename with an easy name, e.g. 1.
  6. If more docs are pending, press the Show More button at the bottom of the page and go back to step 4
  7. When all pages have been saved, open a command line and invoke the conversion:

(more…)

Kindle review

Originally, this was a review I wrote on Amazon.com. It’s still unpublished, maybe because of its “sensitive” content. However, I wanted to share it with you.

Almost perfect. But only ‘almost’. Why? I have no concerns about reading converted ebooks (in Amazon’s format, Mobi, or EPUBs converted to Mobi with Calibre). It’s really great! But it’s only for ‘hardcore’ readers who read only text. C’mon, Amazon! DO SOMETHING WITH THE PATHETIC PDF VIEWER! It’s absolutely useless! We all know it can be done better. How do we know that? It’s easy: the Chinese alternative firmware (called Duokan) has an almost perfect PDF viewer. It has all the features missing from the original firmware:

  • free-hand zoom factor. You can set 110% or 102%. You don’t have to stick to 100%, 150% or 200% as on the original Kindle
  • smarter movement after zooming. On the Kindle, scrolling a zoomed page is always awkward.
  • faster!

Of course, modifying the original firmware most likely breaks Amazon’s licence but… who cares?! If I can turn my device from “almost-useless-for-PDFs” to “really-nice-reading-of-PDFs-although-6”-display-and-now-I’m-truly-in-love-with-it” I don’t hesitate.

I really like document delivery over the air. I must admit that I use it 100% over Wi-Fi, because most of the time I’m too lazy to get a cable, connect it to my laptop and drag the doc from the laptop to the Kindle. It’s muuuch easier just to send it over email. And if you have Mac OS X you can easily prepare a simple Automator script and make this sending even simpler (just one click).

After years spent reading ebooks on PDAs and the iPhone, I’m now asking myself: “why so late?!”. The reading experience is amazing, and there is no video or review which can express it. You get a real-book-like experience without the weight and smell of a real book.

The rest is the same (or better). I do buy ebooks. But 90% of the books I read for pleasure are in Polish. The Kindle deals with the Polish language very well. Much better than Amazon does: a library of Polish books almost does not exist on Amazon. That’s why I buy books in Polish ebook stores and, of course, they are DRMed in a way that is not compatible with the Kindle. But at this moment it’s not a problem to DeDRM them and read them on any viewer.

I had a hard time deciding: Kindle or iPad? If I could spend 3-4x more money, most likely I’d buy an iPad. But for only US$189 I have the ultimate reading device, with small flaws that could be easily overcome.

For sure – for now it’s my favorite electronic gadget.

Jarek