Electric Duncan

Saturday, January 17, 2009

Window Maker and Ubuntu

Pictured right is the new setup that I've recently started using for development. I've found it to be much, much faster and more memory efficient than Gnome. When I'm developing, sometimes I want to run several server and/or client instances, and I need as many resources as possible, ready for crazy, unexpected usages. Window Maker fits the bill.

As a high school student in the 80s, I lusted after NeXT boxes and the look of NeXTSTEP. I had a collection of glossy pamphlets from the company that were kept out on my desk for regular ogling. The best I could do, though, was incessantly use the Macs at the University of Maine at Orono (where I spent as much time as possible). When, as a new Linux user in '96 or '97, I discovered AfterStep, I switched from the Motif-alike FVWM. Immediately after Window Maker was released, I chose it as my primary window manager.

Since then, the Linux distributions and operating systems I've used have been distractingly varied. However, after several excellent years with Mac OS X, I'm actually quite happy to be back in Linux. Though I have generally enjoyed Gnome, I was really unhappy with how slowly it seemed to run, with delayed transitions between applications and other similar operations. With a new machine having 4GB of RAM, this seemed rather unnecessary. (However, I do run some heavy services on my machine... databases, web servers, etc.)

When I switched to Window Maker, I was stunned. First, it still works on a modern distro! Second, it's super fast. Finally, I *love* the clean desktop (I'm not a fan of having the desktop background display the contents of a folder). Not only that, but a quick test drive of some Python dockapp-writing software left me with some fun ideas for side-projects I could write to augment my working environment even further. (For example, dock apps for NetworkManager, update notifications, and Rhythmbox.)

Now I'm ready to start playing with architecture emulation for some exploratory networking projects...

Monday, January 12, 2009

Python and Rhythmbox

I've got a fairly large collection of digital music on a networked drive, and I access it from multiple machines on the network. I consolidated it 5 years ago when I started using iTunes, and over the past few years picked up about 400 songs from the iTunes store. This is something I avoid now, since Amazon offers songs at higher bitrates and without the crippling, non-Fair Use of DRM. (Apple seems to have recently changed its policy, though the pain their DRM crap has caused me doesn't make me a very loyal customer.)

Since I started working at Canonical, I've been using Ubuntu for more than just development -- it's my main-use machine. I still use iTunes every once in a while, but my primary media player is Rhythmbox. There are a couple of issues with the old library, though.

Obviously, Rhythmbox can't play Apple's encrypted .m4p files or a couple audio book files I have. What's more, there are about 200 files that iTunes is able to locate but which Rhythmbox cannot (this may be due to the differing case sensitivities of the respective OSs). In order to track all these issues down conveniently, I wanted to export the import errors and missing files as a text file. Sadly, Rhythmbox doesn't have this functionality.

Fortunately, it comes with a Python console :-)

The missing files export was fairly easy, after some digging around and poking at the Python objects:

Try as I might, I was completely unable to obtain similar data for the import errors. After looking at the C code, I was able to determine that though the import errors were treated generally as a media source, due to their nature (not being able to provide the actual media itself), the related metadata was handled differently. Yet I wasn't able to decipher how, exactly.

So, I hopped on their mail list and asked for help :-) After a few quick exchanges, I was pointed in the right direction by one of the developers, who said that I needed to make use of the db object and some constants. After a quick test, this advice resulted in the following:

Note that the shell and rhythmdb objects are exposed by Rhythmbox in Python console sessions.

I now have a complete list of files that either need some file name updates or need to be burned to CD in iTunes and ripped to OGG.
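
For the files that need name updates, the case-sensitivity hunch can be checked outside the player. What follows is only a minimal sketch: it assumes the missing paths have already been exported as plain strings, and the function name find_case_mismatch is my own invention for illustration, not anything Rhythmbox provides.

```python
import os


def find_case_mismatch(path):
    """Return the on-disk path matching `path` when case is ignored, or None.

    The path is walked one component at a time, and each component is
    compared case-insensitively against the real directory listing.
    """
    path = os.path.normpath(path)
    drive, rest = os.path.splitdrive(path)
    parts = [part for part in rest.split(os.sep) if part]
    current = drive + os.sep if os.path.isabs(path) else os.curdir
    for part in parts:
        if part in (os.curdir, os.pardir):
            # "." and ".." never show up in a directory listing.
            current = os.path.join(current, part)
            continue
        try:
            entries = os.listdir(current)
        except OSError:
            return None
        matches = [entry for entry in entries if entry.lower() == part.lower()]
        if not matches:
            return None
        # Prefer an exact match when one exists.
        chosen = part if part in matches else matches[0]
        current = os.path.join(current, chosen)
    return current
```

Any entry for which this returns a path that differs from the one in the list is a candidate for a simple rename; an entry that returns None is genuinely absent.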

So far, I've been pretty pleased with Rhythmbox. Thanks to their use of Python, I find I'm now becoming somewhat of a fan :-)

Thursday, January 08, 2009

Twisted Mail Server: The Conclusion

Holy old code, Batman -- it's been about 2.5 years since I first blogged about the Twisted mail server I cobbled together. In the intervening time, I've received tons of emails and instant messages requesting that the code be put up somewhere. Well, that time has come...

I recently decommissioned a box I had at tummy.com which was my primary mail server for several years. After many years of qmail and a few months with Postfix, I wanted a solution that I had more immediate control over. At that time, Gmail for Domains had just been announced and I became one of the beta testers for it. However, it was missing a few critical features, so I sat down and put together a mail server for myself. Abe Fettig had some nice examples in his O'Reilly book, so I started there. Then I jumped into the Twisted source code and discovered the rest that I needed.

A year or so later, Gmail for Domains (now Google Apps for business) was rockin' out, so I started moving my mail there. I eventually aliased the remainder of my domains and then stopped using the Twisted mail server. I kept the tummy.com box around for a while for DNS, but due to a tightening of the budget at home, I later had to pull the plug. When I reviewed the backup files from the server, I saw the old mail code and put it at the top of the list for my down-time projects.

I've sort of done that, but the process is incomplete. In the spirit of r0ml's OSCON 2008 talk, I'm placing this code in a public space for people to play with, find bugs in, report them, add patches, and branch from. The project page on Launchpad.net is here:

https://launchpad.net/tx/txmailserver

There are a couple non-standard parts of the code (in particular, the account, alias, and mail list management), but it can be improved very easily. The Dspam stuff was nascent at best (I didn't end up using it in production). Also, the relaying really needs to be looked into to make sure that it's done safely. A quick task someone could jump on right away is to convert the copious print statements to twisted.python.log calls.

Having the code in the state that it's in could be a lot of fun, really. The code base is tiny and there's nothing too tricky going on. For those that haven't had a chance to play with Bazaar, you can easily branch it to your own lp home dir, and if you make some cool improvements, I can easily merge those back in to the main branch. Looking at the list of branches, potential users and developers will be able to easily see the efforts that others have made. txMailServer could end up being a nice little piece of utilitarian code for folks...

Enjoy! And sorry it took so long :-(

Saturday, December 27, 2008

Intellectual Property and Open Source

A few months ago, I received a complimentary copy of Van Lindberg's new O'Reilly book Intellectual Property and Open Source: A Practical Guide to Protecting Code. The first thing that happened at home when the book was unwrapped was that three of us began arguing over who got to read it first.

This may seem like an odd thing to happen for what one could easily assume was a dry and less than interesting topic. However, at the time I was strongly considering the possibility of beginning a non-tech-industry startup built with both open source and proprietary code. The discussions with the potential founders of the startup had been very vigorous and exciting, but the big questions that remained revolved around patents, protecting IP, and providing protection against big business while still offering powerful, free code for use by individuals/private consumers. If you've read the book or even seen the table of contents, you can see why everyone wanted to be the first to read it and learn from the insights provided between its covers.

Instead of jumping into another startup, I ended up joining Canonical; this has kept me both very busy and exceptionally happy. The holiday break has provided an opportunity to finish reading the book, and it has been a delight. I have friends working on startups that depend upon exciting code to power some or all of the business models for their visions, and this book should be on their shelves, close at hand. Even if you're not involved directly with open source and intellectual property, this book is an excellent read.

Intellectual Property and Open Source accomplishes a difficult goal of sharing dense information while making the subject matter engaging. This is done through examples, thought experiments, and well developed analogies. Van does an excellent job of igniting a powerful curiosity on the part of the reader while providing rewards for this in the lucid explanations of related laws and perspectives. I am resisting the urge to turn this post into a long series of quotes, but at the very least I want to mention a few little "spoilers" ;-)

The book starts off with an excellent foundation, giving an overview of the origins of intellectual property from an economic and legal perspective. This was particularly useful for me, as I have no background in this field. Van Lindberg does a really great job of expressing some of the widely held (and diverse) views of IP in the open source community.

The book then launches the reader into an array of well organized chapters on patents, the patent system, trademarks, copyright, trade secrets and licenses. Every open source developer should read chapter 10 on choosing an open source license (the opening dialog had me laughing out loud, a hilarious parody of news groups and IRC arguments as well as a nod to The Princess Bride). There's also a chapter dedicated to patches and their relationships to copyright; another on reverse engineering; and the final one provides information and advice on establishing non-profits for open source projects -- the author even gives mention to our friends at the Software Freedom Conservancy (the umbrella non-profit for the Twisted Software Foundation).

In all honesty, I can't rave enough about this book. I've re-read parts of it just because I enjoyed the clarity of the explanations so much. Law is a twisty maze of easily confused subtleties to those who have not been trained in its dark arts. Through explicit language and examples, the author guides us past pitfalls of misunderstanding and brings us directly to all the major points.

If you are an Amazon shopper, you may want to act quickly: last I checked, there were only two copies left.

Enjoy!

Monday, December 15, 2008

Ubuntu Developer Summit

For the past two weeks, I've been listening, learning, discussing, and hacking various Landscape and Ubuntu initiatives with members of the Ubuntu community and fellow Canonical employees. It was an amazing experience, and we've got the next 6 months crammed full of plans... with the next 3 months already spec'ed out.

Canonical has surprised me. It's an extraordinary company... in both the modern business sense of the word and the original sense: a fellowship of companions with a common goal. While so far I have only had a chance to hear some personal histories, it's evident that every member of this company is an extraordinary individual with a rich background and a great deal to offer to the whole. Everyone works with an unprecedented amount of motivation towards the company vision, one that is well and tightly integrated into the corporate culture.

There is a bright future ahead for this amazing group...

Monday, December 08, 2008

The State of Graphs in Python

There is a sad need for standardization of graphs in Python. The topic has come up numerous times on various mail lists, news groups, forums, etc. There is even a wiki page dedicated to the discussion of the topic on python.org. Ach, when will the madness end?

As far as I can tell, Guido van Rossum essentially solved this issue 10 years ago when he published his paper on Python Patterns - Implementing Graphs. The graph representation is a simple dict and he provided a few functions for demonstration purposes. In 2004, UC Irvine professor David Eppstein started making public his Python graph-theoretic efforts (with a functional programming approach). Both of these represent a direct approach that appeals to my aesthetic sense.
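
To make the dict-based idea concrete, here is a small sketch in the spirit of that essay, adapted to current Python. It is an illustration of the approach rather than a verbatim copy of Guido's code: each key is a vertex, each value is the list of vertices it points to, and the example graph itself is made up.

```python
# A directed graph as a plain dict: vertex -> list of neighbouring vertices.
graph = {
    "A": ["B", "C"],
    "B": ["C", "D"],
    "C": ["D"],
    "D": ["C"],
    "E": ["F"],
    "F": ["C"],
}


def find_path(graph, start, end, path=None):
    """Return one path from start to end as a list of vertices, or None.

    This is a simple recursive depth-first search; it returns the first
    path it finds, which is not necessarily the shortest one.
    """
    path = (path or []) + [start]
    if start == end:
        return path
    if start not in graph:
        return None
    for node in graph[start]:
        if node not in path:  # skip vertices already visited (avoids cycles)
            new_path = find_path(graph, node, end, path)
            if new_path:
                return new_path
    return None


print(find_path(graph, "A", "D"))  # ['A', 'B', 'C', 'D']
```

Nothing here needs a custom vertex or edge class, which is exactly the appeal: the graph can be built, copied, pickled, and inspected with the tools every Python programmer already knows.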

Now, after years of tracking the lack of progress made in standardizing graph representations in Python, I've recently had a strong need for them. I did some checking around, and found projects that potentially met my needs. Sadly, none of them had the simplicity of Guido's original implementation (and therefore its anticipated speed benefits).

I was looking for graph implementations with no cruft, no external dependencies, no afterthoughts. I need something that balances runtime performance with a usable API, preferably created using PEP-8 (or similar) coding style.

Here's what I found, with some notes that I used to make a decision for my own needs:

PADS - David Eppstein's work; functional programming style; very strong math; leaves the implementation of the graph up to the developer-user
altgraph - too many utility and special-purpose methods for my taste; uses a custom graph object
python-graph - a new implementation; uses its own objects; seems to take the "framework" approach to graph implementation
graph - requires the use of custom vertex and edge objects
NetworkX - fairly complete; lots of redundant code; covers more than just a graph implementation (I only include it here because it seems to be fairly highly used)

If you know me, then you've guessed what's coming next. Yes, I'm going to contribute to the general chaos and announce yet another graph library. What I hope to accomplish with this is provide a very simple implementation based on Guido van Rossum's approach (dictionary-based) that doesn't consume much memory, can be operated on quickly, and can be used anywhere.

In keeping with this motivation, I've started a new project on Launchpad and named it simple-graph. My initial efforts will be aimed at implementing a dict-based graph per Guido's paper, with the possible inclusion of some of David's functions (updated to operate on a dict object). I will then spend some time taking inspiration from the best of what the other graph libraries have to offer while keeping things simple.
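
As a rough idea of what "functions that operate on a dict object" could look like, here is a hedged sketch. The names vertices, edges, and shortest_path are hypothetical and chosen only for this illustration; they are not the simple-graph API, and the breadth-first search is a generic textbook one rather than anything taken from PADS.

```python
from collections import deque


def vertices(graph):
    """Return every vertex, including ones that only appear as targets."""
    found = set(graph)
    for targets in graph.values():
        found.update(targets)
    return found


def edges(graph):
    """Return all directed edges as (source, target) tuples."""
    return [(source, target)
            for source, targets in graph.items()
            for target in targets]


def shortest_path(graph, start, end):
    """Breadth-first search for a path with the fewest edges, or None.

    Assumes None is never used as a vertex, since it marks the start
    of the back-pointer chain.
    """
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


example = {"A": ["B", "C"], "B": ["C", "D"], "C": ["D"], "D": ["C"]}
print(shortest_path(example, "A", "D"))  # ['A', 'B', 'D']
```

Because these are free functions over an ordinary dict, they add no memory overhead to the graph itself and can be mixed freely with code that knows nothing about the library.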

As I stated on the web panel at PyCon 2007, diversity is a good thing; it gives us a rich gene pool from which a full and healthy process of natural selection may occur. Let's hope that the efforts of so many Python programmers eventually lead to the inclusion of a graph object in the Python standard library.

Tuesday, November 11, 2008

Python and Lisp... Again

It seems that Lisp continually comes up in various conversations (virtual and otherwise) in the context of Python. In fact, maybe we could even call such occurrences the Python-Church conversations. Well, here it is again.

Earlier this year I started working on a new project: an object-oriented genetic programming library. I had a bunch of experiments I wanted to do, but I needed to assemble parts of programs in order to do it. I had hoped to use Python, but inspecting Python's AST ended up being too much of a hassle. I wanted to distribute, process, and manage evolutionary algorithms/programs across multiple remote Twisted servers, and manipulating permutations of partial programs would be much easier to integrate with Twisted (the target "platform") if the programs themselves could be evaluated and introspected easily with Python.

After some digging around, I eventually settled on using PyLisp, mostly for the simplicity of the code and the fact that it was just a single file. Since it hadn't been maintained since 2002, I decided to roll the original file into the genetic programming code and then apply any changes as-needed, over time.

More recently, I've wanted to use this modified PyLisp on other projects and as a result, I have split it out into its own project: pyLisp-NG. This naturally led to further code break-out, for a total of three projects:

pyLisp-NG - the functional programming (and introspection) component of the original project
Evolver - the code that allows one to do Python-based evolutionary programming (string-based as well as source code tree node optimization/search solution discovery)
txEvolver - will enable users to distribute genetic programming operations (such as merging parallel generations of computations)
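
To give a feel for why Lisp-style programs are convenient for this kind of work, here is a tiny sketch of the general idea. It does not use PyLisp or pyLisp-NG, and none of these names come from those projects: a program is just a nested tuple of the form (operator, argument, ...), which makes it trivial to evaluate, to enumerate every subtree, and to swap one subtree for another (the basic move in tree-based genetic programming).

```python
import operator

# Hypothetical operator table for this illustration only.
OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr, env):
    """Evaluate a nested-tuple program; strings are looked up in env."""
    if isinstance(expr, tuple):
        op, *args = expr
        values = [evaluate(arg, env) for arg in args]
        result = values[0]
        for value in values[1:]:
            result = OPERATORS[op](result, value)
        return result
    if isinstance(expr, str):
        return env[expr]
    return expr  # a literal number


def subtrees(expr, path=()):
    """Yield (path, subtree) pairs; a path is a tuple of child indexes."""
    yield path, expr
    if isinstance(expr, tuple):
        for index, child in enumerate(expr[1:], start=1):
            yield from subtrees(child, path + (index,))


def replace(expr, path, new):
    """Return a copy of expr with the subtree at path swapped for new."""
    if not path:
        return new
    index = path[0]
    child = replace(expr[index], path[1:], new)
    return expr[:index] + (child,) + expr[index + 1:]


program = ("+", ("*", "x", 2), 3)
print(evaluate(program, {"x": 4}))  # 11
mutated = replace(program, (1,), ("-", "x", 1))
print(evaluate(mutated, {"x": 4}))  # 6
```

Doing the same thing against Python's own AST is possible, but it involves far more node types and bookkeeping, which is the hassle mentioned above.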

pyLisp-NG was released earlier today and is available for download on PyPI.

Monday, November 10, 2008

txJSON-RPC

Tonight, I just pushed a new version of txJSON-RPC up to PyPI. Let me know if you have problems with this one, as the last one had some issues with setup.py.

The cool new feature in this version is the serialization available to the bundled jsonrpclib module (which doesn't depend on Twisted code, so anyone can use that). txjsonrpc.jsonrpclib now supports Python datetime -> JSON serialization. The date format is the same as that used by xmlrpclib: YYYYMMDDTHH:MM:SS.
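
For anyone curious what that kind of serialization looks like, here is a minimal sketch using only the standard library json module. It is not the txjsonrpc.jsonrpclib implementation; the class name DateTimeEncoder and the constant are mine. The one borrowed fact is the date pattern, which in strftime terms is the %Y%m%dT%H:%M:%S layout that xmlrpclib writes.

```python
import json
from datetime import datetime

# The xmlrpclib-style layout, e.g. 20081110T21:30:05
XMLRPC_DATE_FORMAT = "%Y%m%dT%H:%M:%S"


class DateTimeEncoder(json.JSONEncoder):
    """JSON encoder that writes datetime objects as xmlrpclib-style strings."""

    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.strftime(XMLRPC_DATE_FORMAT)
        # Let the base class raise TypeError for anything else.
        return super().default(obj)


payload = json.dumps({"sent": datetime(2008, 11, 10, 21, 30, 5)},
                     cls=DateTimeEncoder)
print(payload)  # {"sent": "20081110T21:30:05"}

# The receiving side has to know which fields are dates and parse them back.
sent = datetime.strptime(json.loads(payload)["sent"], XMLRPC_DATE_FORMAT)
```

Note that JSON has no date type, so the value travels as an ordinary string; the two ends simply have to agree on the format.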

Enjoy!

Sunday, September 28, 2008

Current and Future Happenings

Sorry there's been so much radio silence at this end lately... a lot has been going on, and it looks like it's going to stay that way for a while. I just need to get used to it and start posting again :-)

Canonical

The big news is that Canonical quite took me by surprise :-) I had planned on doing consulting work again, but I was made an offer of camaraderie, to come join a team at Canonical that I know well, and I couldn't resist. I'm now working on the Landscape team at Canonical, the same folks who brought you the much beloved Storm ORM :-)

Already, I've been working there for two weeks and it's been a delight. They use a lot of the same processes that we did at Divmod and in the Twisted project (in fact, three of us on the team are Twisted developers), so that was very smooth. Another thing that made the transition very easy was the manner in which they engage in a beautiful mix of group discussion and rapid development. The open source community roots at Canonical are very deep... and you can see them very clearly without digging :-)

At Canonical, I've repeatedly run across old friends from my Zope days, from Hacking Society in Colorado, and other places/associations from my past. I am somewhat stunned at the job Canonical has done in acquiring a talented and dedicated workforce. I've never seen a company embrace open source at the level and to the degree that this company does, while at the same time retaining all of the most excellent qualities of the community within the corporate culture. Someone should do a socio-technological/business PhD thesis on these guys...

In preparation for the many (and intense) marathon sprints that this team runs in a year, I've purchased a new laptop. It's the first dedicated Ubuntu dev machine/Desktop I've had... I've been running all my Ubuntu instances as virtual machines in Parallels and VMware Fusion (or as remote servers at colos and virtual host providers). My love for the Evolution mail client continues to grow and I've found the only reasons I miss the Mac are the automatic handling of sound and playing Spore :-)

SOA Conference

Now on to some future stuff. I've been invited to speak about dynamic languages (Python) and ultra-large scale (ULS) systems at SOA-India this year in Bangalore. The industry that has grown up around service oriented architectures (SOA) overwhelmingly tends towards Java, so this is a really great sign. I think the efforts that the Java Mothership has made in building bridges with dynamic languages such as Ruby and Python are having a tremendous impact throughout the programming world. I've got an eye on Ted Leung and the Jython team :-)

Anyway, the conference promises to be quite interesting, with speakers from around the world and with diverse backgrounds. I'm expecting to return from Bangalore with a multitude of new ideas and lots of new avenues to explore.

Blog

Speaking of SOA, I am still working on the second part of the book review for Josuttis' book SOA in Practice. Perhaps before I finish that one, though, I will blog about another O'Reilly title I have been enjoying immensely: Van Lindberg's Intellectual Property and Open Source. Note that Van has been quite active in the Python community and is contributing his expertise at many levels for the benefit of us all. Regardless, the book is very well written and I will have nothing but good to say about it :-)

After that, I'm going to finish up the draft I have for a blog post on metaclasses, based on notes I took while working with Incredible Pear on the PBS DTV project.

And finally, there have been more requests for me to write about setting up a Twisted Mail server... so, as one reader puts it, I will conclude the telling of that tale in an up-coming post as well :-)

Wednesday, August 27, 2008

netaddr Python Library

I recently got several feature requests for my NetCIDR Python library, and in the course of a conversation with one user in particular, I was made aware of the netaddr project. I took some time to explore the code details and was quite impressed: drkjam did a great job. The manner in which he implemented the many features (especially the IP math) was the kind of thing I wanted to do for NetCIDR ... at some point. After about an hour of digging around, testing out the API, and pondering, I decided to retire NetCIDR and encourage my users to migrate to netaddr.
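
For readers who haven't done much "IP math", the sketch below shows the flavor of operations involved. To be clear about what it is not: it uses neither netaddr nor NetCIDR, whose APIs aren't shown in this post. Instead it uses the ipaddress module from the modern Python standard library as a stand-in, and the addresses are arbitrary private-range examples.

```python
import ipaddress

network = ipaddress.ip_network("192.168.4.0/22")
address = ipaddress.ip_address("192.168.5.17")

# Membership: is the address inside the CIDR block?
inside = address in network

# Size and boundaries of the block.
size = network.num_addresses
first = network.network_address
last = network.broadcast_address

# Addresses behave like integers for arithmetic.
shifted = address + 300
offset = int(address) - int(first)

# Splitting a block into smaller blocks.
subnets = list(network.subnets(new_prefix=24))

print(inside, size, first, last)  # True 1024 192.168.4.0 192.168.7.255
print(shifted, offset)            # 192.168.6.61 273
print([str(subnet) for subnet in subnets])
```

The appeal of a library that handles this well is that the addresses stay readable objects while still supporting comparison, arithmetic, and containment checks.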

There are a couple more esoteric features in NetCIDR that netaddr currently doesn't have, but we've started talking about adding support for those in netaddr, at which point there will be no need to use NetCIDR.

To facilitate this, I've added a wiki page on the netaddr Google Code project for helping users make the transition from NetCIDR to the netaddr API.