Monday, January 12, 2009

Python and Rhythmbox

I've got a fairly large collection of digital music on a networked drive, and I access it from multiple machines on the network. I consolidated it five years ago when I started using iTunes, and over the past few years picked up about 400 songs from the iTunes Store. This is something I avoid now, since Amazon offers songs at higher bitrates and without crippling DRM that disregards Fair Use. (Apple seems to have recently changed its policy, though the pain their DRM crap has caused me doesn't make me a very loyal customer.)

Since I started working at Canonical, I've been using Ubuntu for more than just development -- it's my main-use machine. I still use iTunes every once in a while, but my primary media player is Rhythmbox. There are a couple of issues with the old library, though.

Obviously, Rhythmbox can't play Apple's encrypted .m4p files or a couple of audiobook files I have. What's more, there are about 200 files that iTunes is able to locate but which Rhythmbox cannot (this may be due to the differing case sensitivities of the respective OSs). In order to track all these issues down conveniently, I wanted to export the import errors and missing files as a text file. Sadly, Rhythmbox doesn't have this functionality.
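The case-sensitivity theory can be checked outside of Rhythmbox with plain Python. What follows is only a minimal sketch, not anything Rhythmbox provides: the function name `resolve_case_insensitive` and the example paths are hypothetical, and it assumes the library paths are stored relative to some music root directory. It walks a path one component at a time and looks for an on-disk entry that matches when case is ignored.

```python
import os


def resolve_case_insensitive(root, relative_path):
    """Return the real on-disk path for relative_path, ignoring case.

    Returns None if any component has no match. `root` is assumed to be
    an existing directory (hypothetical helper, not a Rhythmbox API).
    """
    current = root
    for part in relative_path.replace("\\", "/").split("/"):
        if not part:
            continue  # skip empty components from leading or doubled slashes
        try:
            entries = os.listdir(current)
        except OSError:
            return None  # missing directory, or a file used as a directory
        if part in entries:
            match = part  # an exact match wins
        else:
            # Otherwise accept the first entry that differs only by case.
            candidates = sorted(e for e in entries if e.lower() == part.lower())
            if not candidates:
                return None
            match = candidates[0]
        current = os.path.join(current, match)
    return current
```

If a path the library reports as missing resolves to a real file this way, the mismatch is one of case only, and renaming the file (or fixing the library entry) should be enough.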

Fortunately, it comes with a Python console :-)

The missing files export was fairly easy, after some digging around and poking at the Python objects:

Try as I might, I was completely unable to obtain similar data for the import errors. After looking at the C code, I was able to determine that although the import errors were treated generally as a media source, the related metadata was handled differently because of their nature (they can't provide the actual media itself). Yet I wasn't able to decipher how, exactly.

So, I hopped on their mailing list and asked for help :-) After a few quick exchanges, I was pointed in the right direction by one of the developers, who said that I needed to make use of the db object and some constants. After a quick test, this advice resulted in the following:

Note that the shell and rhythmdb objects are exposed by Rhythmbox in Python console sessions.

I now have a complete list of files that either need some file name updates or need to be burned to CD in iTunes and ripped to OGG.

So far, I've been pretty pleased with Rhythmbox. Thanks to their use of Python, I find I'm now becoming somewhat of a fan :-)


Kevin Dangoor said...

In Apple's defense, the DRM was never their choice, and the fact that Amazon had a full catalog of DRM-free music before Apple did was a power play.

More usefully, I wanted to mention that Apple lets you "upgrade" your existing songs to iTunes+ (DRM free, higher bitrate). They're still AAC files, but they're unencrypted. There's no need to lose the collection of songs you've got, thankfully (or go through a re-encode step with a loss of quality).

Duncan McGreggor said...

Oh, man! Kevin, thank you so much for that info about the upgrades!

*does a happy dance*

glyph said...

If you're interested in a music player written in Python, you might give Quod Libet a shot :).

I haven't used Rhythmbox in a while, but on earlier versions at least, QL worked much better on large collections. And QL has a feature which I'm pretty sure RB doesn't have: "tags from path", which lets you re-tag your collection easily, based on regular expressions on their full filenames.

Duncan McGreggor said...


Thanks for the heads up :-) Chris has mentioned it a few times, but I don't think he's using it anymore (or not as much as he used to? I forget). I'll have to ask him why...

The tags from path is a *great* feature! I could benefit from that immediately...

glyph said...


No prob! Let me know how you like it.

One of the best (and most unique) things about Quod Libet is that it's modular. If you like the tagging and library-organization features that it provides, but you don't want to switch music players, you can also use "Ex Falso", which is a front end to its tagging engine with no music-playing features.

thisfred said...

+100 for QL: I have used its *trunk* version for years, and it has only broken once (which was fixed within a day).

Since it's 100% pure Python, extending it through plugins is very easy. (RB severely limits what you can do from Python plugins.) Boilerplate is minimal and it is possible to write a useful plugin in 10 lines of code.

It is indeed faster and more stable than RB, especially with large collections, and it comes with a ton of optional plugins included.

It handles any ID3 tag you throw at it (including ones you invent), meaning you can search on their content, or add them to the UI and sort on them.

It allows for super easy (batch) editing of metadata through the included Ex Falso, including looking up the correct metadata on (through a plugin).

Its search is unsurpassed. Any field can be searched on, using substring matching, or, should the need arise, full-blown regular expressions. In fact the search interface is what I use most. With some clever queries, it's easy to create impromptu "playlists" for many occasions.