Generating Kindle collections
Continuing on from my post a few months ago about playing with my Kindle, I’ve now amassed a fair number of books, and managing them is starting to be a bit of an issue. The main way to do this is with collections of books. However, Amazon, in their infinite wisdom, have decided not to provide any sort of tool for managing these outside of the Kindle itself, and the on-device interface is slow and awkward for any sort of mass change.
Life isn’t all that bad, as it turns out the collections are stored in a simple JSON file called “collections.json” in the “system” folder on your Kindle. Each collection does, however, refer to its ebooks by GUID, and those GUIDs are generated in several different ways, so we’re going to need a bit of help to figure out what they are, and having someone else do the drudge work of handling the JSON would be nice. Enter Kindelabra (yes, that’s the author’s chosen spelling…). Primarily, it’s a little Python/Gtk app for managing collections, but there’s also a reasonable library sitting behind it that we can play with. I found that some of the functionality was buried in the GUI rather than in the library, but that was easy enough to tidy up.
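To make the GUID business a bit more concrete, here is a minimal sketch of how an ID for a sideloaded file is commonly described as being built: a “*” followed by the SHA-1 of the file’s on-device path. This is my understanding of the community-documented format, not something Amazon publishes, and it is not Kindelabra’s actual code. The “/mnt/us” mount point, the “*” prefix and the function names are all assumptions.

```python
import hashlib
import json

# Assumption: on the device itself, the USB-visible storage is mounted here.
KINDLE_ROOT = "/mnt/us"


def sideloaded_id(relative_path):
    """Build the assumed collection ID for a sideloaded file.

    relative_path is the path as seen over USB, e.g. 'documents/book.mobi'.
    Assumption: the ID is '*' plus the SHA-1 hex digest of the on-device path.
    """
    device_path = f"{KINDLE_ROOT}/{relative_path.lstrip('/')}"
    return "*" + hashlib.sha1(device_path.encode("utf-8")).hexdigest()


def load_collections(path):
    """Read a collections.json file and return it as a plain dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

Calling sideloaded_id("documents/book.mobi") gives a 41-character string that can be compared against the entries in each collection’s item list. Books bought from Amazon appear to use a different, ASIN-based form, which is one reason to lean on a library rather than hand-rolling this.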
OK, so we’ve got a library for managing collections, but I don’t want to go through the mind-numbing process of building all of them individually, so let’s build something to do this automatically. I’d like a collection for each author I have more than 3 books by, which is something we can derive from the books’ metadata.
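The “more than 3 books” rule boils down to grouping books by author and discarding the small groups. A rough sketch of just that step, assuming the metadata has already been reduced to (author, book ID) pairs; the function name and the input shape are made up for illustration and are not part of Kindelabra:

```python
from collections import defaultdict

# An author needs strictly more than this many books to get a collection.
MIN_BOOKS = 3


def group_by_author(books):
    """Group book IDs by author, keeping only the prolific authors.

    books is an iterable of (author, book_id) pairs (a hypothetical shape;
    the real metadata will need massaging into this form first).
    """
    by_author = defaultdict(list)
    for author, book_id in books:
        by_author[author].append(book_id)
    # Keep authors with more than MIN_BOOKS books, i.e. four or more.
    return {a: ids for a, ids in by_author.items() if len(ids) > MIN_BOOKS}
```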
The script to do so is below (and on the author-collections branch of my GitHub fork of Kindelabra), but I’d like to note a few items in particular:
- We skip anything without an author, or with the author “Unknown”. This covers lots of non-ebook stuff, including many items generated from my previous post.
- Some authors are listed as “Last, First”, and I want everyone in “First Last” order. Admittedly, not every item with a ‘,’ is a “Last, First”, but only swapping the ones with a single comma skips most of the bad cases.
- All of the collections we make here have a “_” at the beginning. This means that when I sort by title on the Kindle, I get first my “__to read” collection, then all of these, then everything else, which is what I want.
- The script needs to coexist with existing collections, so it doesn’t wipe anything unless it was created earlier in the same run.
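Taken on their own, the rules in the list above can be sketched roughly as follows. This is a standalone illustration rather than the script itself: the function names are hypothetical, and the “@en-US” key suffix and the “items”/“lastAccess” fields are my assumption about how collections.json is laid out.

```python
def normalise_author(author):
    """Return the author as 'First Last', or None if the item should be skipped."""
    if not author or author.strip() == "Unknown":
        return None
    author = author.strip()
    # Only swap when there is exactly one comma; anything else is left alone.
    if author.count(",") == 1:
        last, first = (part.strip() for part in author.split(","))
        if first and last:
            return f"{first} {last}"
    return author


def collection_key(author, locale="en-US"):
    """Build the collection key, with the leading '_' used for sorting.

    Assumption: keys in collections.json look like 'Name@en-US'.
    """
    return f"_{author}@{locale}"


def merge_collections(existing, generated, timestamp_ms):
    """Add generated author collections without discarding existing ones.

    existing maps collection key -> {'items': [...], 'lastAccess': int}.
    generated maps author name -> list of book IDs.
    """
    # Copy so the caller's data (and its item lists) are left untouched.
    merged = {k: dict(v, items=list(v["items"])) for k, v in existing.items()}
    for author, ids in generated.items():
        entry = merged.setdefault(
            collection_key(author), {"items": [], "lastAccess": timestamp_ms}
        )
        for book_id in ids:
            if book_id not in entry["items"]:
                entry["items"].append(book_id)
    return merged
```

Merging into a copy rather than overwriting means a hand-built collection such as “__to read” passes through untouched, and an author collection that already exists only gains the books it was missing.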