Monday, October 26, 2009

Recent Work on Various Open Source Projects


Over the last few months, I've been doing lots of work on various open source projects. I've been so buried in them that I haven't blogged (or microblogged) much about them. So much has been happening, though, that I needed to take a break from the coding and communicate some of this :-)

txAWS


Over the last few months, Robert Collins, Thomas Herve, Jamshed Kakar, and I have been putting lots of effort into improving cloud support in txAWS, the async (Twisted) Python Amazon EC2 library. It's been a lot of fun to see that part of the library take shape and start getting production use at Canonical. We have implemented the following functionality groups and their associated API methods:
  • Instances
  • Key Pairs
  • Security Groups
  • Elastic IPs
  • Availability Zones
  • EBS
There is a ticket for the two remaining groups of functionality, with branches in progress for both.
Once those are merged, EC2 support in txAWS will be complete.

As a bonus, we've added support for arbitrary endpoints, and with that in place, we've successfully tested txaws.ec2 against Eucalyptus clouds :-)
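
For the curious, here's roughly what that looks like -- a minimal sketch only, assuming the txAWS names of this era (AWSCredentials, AWSServiceEndpoint, EC2Client) and a made-up Eucalyptus URL; details may differ in your checkout:

from twisted.internet import reactor
from txaws.credentials import AWSCredentials
from txaws.ec2.client import EC2Client
from txaws.service import AWSServiceEndpoint

# point the EC2 client at a non-AWS endpoint, e.g. a Eucalyptus cloud
creds = AWSCredentials("access key", "secret key")
endpoint = AWSServiceEndpoint(
    "https://eucalyptus.example.com:8773/services/Eucalyptus")
client = EC2Client(creds=creds, endpoint=endpoint)

def show(instances):
    for instance in instances:
        print instance.instance_id, instance.instance_state

d = client.describe_instances()
d.addCallback(show)
d.addErrback(lambda failure: failure.printTraceback())
d.addBoth(lambda ignored: reactor.stop())
reactor.run()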

txJSON-RPC

There's a new release of txJSON-RPC out now, downloadable from PyPI. Work on the next version has been a great deal of fun. What started out as a conversation on IRC with Terry Jones ended up as spec-driven doctest work on trunk, implementing support for multiple versions of the JSON-RPC spec (pre-1.0, 1.0, and 2.0).

With these changes in spec support, txJSON-RPC has really started to mature, and that's been fantastic. Even more, the jsonrpclib module that's included in txJSON-RPC (and can be used with non-Twisted projects) is getting spec version support as well.
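
To make the version differences concrete, here's what the wire formats look like, sketched as Python dicts (these shapes come from the published specs, not from txJSON-RPC internals):

# A JSON-RPC 1.0 request and its response; a 1.0 response always
# carries both "result" and "error" keys, one of them null.
request_10 = {"method": "echo", "params": ["Hello, world!"], "id": 1}
response_10 = {"result": "Hello, world!", "error": None, "id": 1}

# The same exchange in JSON-RPC 2.0: requests carry a version marker,
# params may be positional or named, and a response contains either
# "result" or "error" -- never both.
request_20 = {"jsonrpc": "2.0", "method": "echo",
              "params": ["Hello, world!"], "id": 1}
response_20 = {"jsonrpc": "2.0", "result": "Hello, world!", "id": 1}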

SOOM in txULS

As some may know, one of my computing passions is ultra large-scale systems. After a phone conversation with Jamshed Kakar and some nice exchanges on the Python ULS-SIG mailing list with Alex Drahon, I started working on a set of coding experiments in self-organizing objects. The Google Doc informally outlines the various stages and goals.

For now, the code is living in a txULS series on Launchpad. The reason for its inclusion in txULS is that the ultimate goal of the SOOM (self-organizing object meshes) code is to produce an async API for building Twisted services that provide the behaviours outlined in the Google Doc (linked above).

I would like to emphasize the networking-library-agnostic nature of the ULS-SIG: Twisted comes up since I spend a lot of time with Twisted, but every networking library is welcome. I'm personally interested in exploring (or watching other developers explore) various Stackless Python experiments in the ULS systems space.

txSpore

This project was a spontaneous effort resulting from an evening of code review when I first discovered the official Python API from EA/Maxis for the game Spore. It's been a blast and something that Chris Armstrong and I have been working on together. The API is currently feature-complete, but Chris has some excellent ideas about improving usage as well as some additional API augmentations that will make life easier for game developers.

Already more featureful and usable than the official Spore Python API, this library has great things in store. Chris has come up with several very cool demo ideas that take advantage of the new API and will push it to the limits. We're both pretty excited :-)

Isomyr

I love isometric games. I'm a freak for the classic look. One night about a month ago, Chris and I discovered Isotope, an isometric Python game engine by Simon Gillespie. It was last updated in 2005, at version 0.9, so I started working on a branch that could be released as 1.0. I never heard back from Simon after inquiring about his permission to release it as 1.0, so I forked the code into a new project: Isomyr. I released the code rewrite work I had done to that point (plus some changes, such as replacing some old code with NumPy) as 0.1. At which point things just went nuts...

Isomyr now has support for multiple worlds, customizable (per-world) in-game time and calendars, and basic interactive fiction development. The latest chunk of code (which hasn't been pushed up to Launchpad yet) adds support for general planetary simulation (e.g., axial tilt, varying daylight hours, seasons, and weather). As you might imagine, this has been a great deal of fun to work on!
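
To give a flavor of the kind of calculation involved -- this is not Isomyr's actual code, just a toy sketch of the standard day-length approximation from axial tilt and latitude:

from math import acos, pi, radians, sin, tan

def daylight_hours(day_of_year, latitude, axial_tilt=23.44,
                   year_length=365.25):
    """Approximate hours of daylight on a planet with the given tilt."""
    # solar declination swings between +/- axial_tilt over the year;
    # the offset of 80 days puts the March equinox near day 80
    declination = radians(axial_tilt) * sin(
        2 * pi * (day_of_year - 80) / year_length)
    # cosine of the hour angle at sunset; clamp for polar day/night
    x = min(1.0, max(-1.0, -tan(radians(latitude)) * tan(declination)))
    return 24 * acos(x) / pi

# midsummer at 40 degrees north: roughly 14.8 hours of daylight
print daylight_hours(172, 40.0)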

PyRRD

PyRRD has gotten some recent community love, with requests for a mailing list, new developer-oriented features, etc. PyRRD is currently at version 0.0.7, and the 0.1.0 release isn't too far away. Folks have been using trunk for a while, which added support for the RRDTool Python bindings back in March of this year. (PyRRD's focus has primarily been on users/developers who don't have the RRDTool Python bindings installed.) In the next couple of months, I expect that we'll be adding a few more features and then preparing the new release.

PyRTF

Another fixer-upper project, PyRTF (mirrored on Google Code as pyrtf-ng), has been on hiatus for a while, due to my diminished need to manipulate and interact with RTF files. However, a new developer has joined the project, and the code cleanup and unit test development now continue. Thanks, Christian Simms!

A while ago, Simon Cusack (the original author of PyRTF) and I had some great discussions about the future development of PyRTF and his interest in merging the recent changes into trunk on SourceForge. I deferred on that action, wanting to wait until the code cleanup, unit tests, and API changes had been completed. With Christian's help, we may get there now :-)

Wrap-up

It's been about a year since I've been so active in open source development, and it feels really good to be at it again :-) Being back in Colorado seems to have helped in subtle ways, but mostly it's been the increased interaction and interest from developers in the community that I can thank for my increased activity (and thus enjoyment). You guys are awesome. You're the reason for any code I produce.


Thursday, September 17, 2009

PyCon 2010 Talks Needed!


Hey folks,

At my last count, we've only received 20 talks so far for PyCon 2010! There are only 14 days remaining for talk submissions... if you've had a great idea about a talk for PyCon, now's the time to make it happen!

Below I've pasted some links that folks might find helpful. The first one has pretty much everything you need to know about submitting a talk for PyCon.



Sunday, September 13, 2009

txSpore: Twisted Spore



I just had a delightful weekend of coding :-) I spent the past two days porting the Spore Python API to Twisted. You can now incorporate Spore data (from static XML as well as REST requests) into your non-blocking Python applications/games!

This was a pretty easy task, really. The API just makes HTTP requests with twisted.web.client.getPage. There was a little bit of work involved in creating object models for the XML, and some head-scratching over the error-catching deferToThread unit test I tried to write (it's still buggy... need to figure that one out). Everything else was pretty much cake.
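
The core pattern is as simple as it gets; here's a minimal sketch (the asset id in the URL is made up, purely for illustration):

from twisted.internet import reactor
from twisted.web.client import getPage

def printBody(body):
    # the body arrives as a raw string (XML, for the Spore REST resources)
    print body

d = getPage("http://www.spore.com/rest/asset/500327625512")
d.addCallback(printBody)
d.addErrback(lambda failure: failure.printTraceback())
d.addBoth(lambda ignored: reactor.stop())
reactor.run()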

In fact, it was so much fun to kick back and write some playful code that I might overhaul the sync Python code as well and incorporate both into txSpore.

Do be aware, however, that the code still has some big improvements coming. The first thing I want to hit is creating an actual client object. Right now, the client module contains a series of functions (since state isn't currently needed). However, I want to start doing some basic object caching in order to limit the number of requests made to spore.com and improve response times. That's the big item for 0.0.2. Update: 0.0.2 is now released!

Next, I'd like to create some more demo apps that show off the API usage better. Right now, there's one demo (a .tac file). All it does is ask for a user name, render a user page, and then link to a user "Spore assets" page (that's the thumbnail image above).

One thing that might be fun to do is write a script that checks for the latest achievements and publishes them to various microblog/status sites with the Twisted PyngFM client :-)

There's a project page up on Launchpad for txSpore, and I've posted a notice and some updates to the Spore developer forums. It's also been published on PyPI.

Enjoy!


Monday, September 07, 2009

Windows Media to MP3 Conversion for Mac OS X and Linux


For the past couple years, my girlfriend has been amazingly (astonishingly) patient about a whole slew of .wma files that we've got on the network drive... backups of her CD collection made when she was a Windows user. We managed to save them right before the computer died, but she hasn't been able to listen to them when she's booted into Ubuntu or Mac OS X.

Late last month, after getting back from two weeks abroad, Marjorie said that she'd really like to have access to her music collection again (the CDs are cumbersome and stored away in boxes for our impending move back to Colorado). With that said, I did some digging around, and found some immediately helpful links (two years ago, a few google searches had turned up results that indicated too much effort was involved).

I started out by trying a couple free Mac OS X GUI applications, but these ended up being quite horrible: either they did not offer the functionality I desired, they were buggy to the point of being unusable, or they rendered audio with unlistenable artifacts.

In the end, I had to use mplayer and lame in combination. After googling around and some trial and error, I discovered the combination of mplayer options that would successfully extract the audio data from .wma files and dump them as .wav files.

I started with a shell script, but quickly changed to Python, since there were several locations for the .wma files, and none of them on nice paths. I've used this script several times since then, when more .wma files were discovered, and have yet to encounter any issues in sound quality. One nice-to-have would be to extract the .wma metadata and save it in the new .mp3 files as ID3 tags...

Anyway, here's the idea -- a minimal sketch of the approach (the real script just hard-coded its own source directories; the mplayer flags may need tweaking for your files):
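import os
import subprocess

# directories known to contain .wma files; adjust to taste
SOURCE_DIRS = ["/mnt/music/backups", "/mnt/music/old-cds"]

def convert(wma_path):
    base = os.path.splitext(wma_path)[0]
    wav_path = base + ".wav"
    # have mplayer decode the .wma and dump the audio as a .wav
    subprocess.call(["mplayer", "-vo", "null",
                     "-ao", "pcm:file=%s" % wav_path, wma_path])
    # encode the .wav as a high-quality .mp3
    subprocess.call(["lame", "-h", wav_path, base + ".mp3"])
    os.remove(wav_path)

for source_dir in SOURCE_DIRS:
    for dirpath, dirnames, filenames in os.walk(source_dir):
        for filename in filenames:
            if filename.lower().endswith(".wma"):
                convert(os.path.join(dirpath, filename))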



Hope someone else finds this useful and their significant others don't have to wait 2 years for their music!


Twisted Ping.fm Client


I merged async (Twisted) support into the Python Ping.fm library today and have already taken it for a test drive. I do love Twisted :-) The Twisted pyngfm API usage is identical to the synchronous API, with the usual exception of using deferreds and callbacks.

Here's a rough sense of the example usage -- the client I now use for command-line updates to Twitter, Identi.ca, Tumblr, Facebook, LinkedIn, Jaiku, and even Flickr. This isn't the pyngfm API itself, but the Ping.fm REST call the async version boils down to under the hood (the keys below are placeholders; the real ones are stored in an .ini-style config file):
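from urllib import urlencode
from twisted.internet import reactor
from twisted.web.client import getPage

params = urlencode({
    "api_key": "DEVELOPER-API-KEY",  # placeholder
    "user_app_key": "USER-APP-KEY",  # placeholder
    "post_method": "status",
    "body": "Hello from Twisted!",
})
d = getPage("http://api.ping.fm/v1/user.post", method="POST",
            postdata=params,
            headers={"Content-Type": "application/x-www-form-urlencoded"})
d.addErrback(lambda failure: failure.printTraceback())
d.addBoth(lambda ignored: reactor.stop())
reactor.run()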


There's another example in the README that iterates through the recent posts made to Ping.fm. If you do manage to use it and come across any issues, be sure to file a ticket.

Enjoy!



Sunday, July 05, 2009

Possible Directions in Human Resource Management



This afternoon, while having a nice soak in the tub, I got lost in a Strossian reverie. Or perhaps it was more along the lines of one of his characters, rather than the author himself. Regardless, I was thinking about management styles in modern corporations... and ways in which one might improvise.

I was only half paying attention, until the stream of thought actually started getting interesting. At which point I sat upright in the tub and started taking mental notes. Quickly showering off, I couldn't stop the flood of ideas, hoping to get to a written medium as soon as possible, lest they be forgotten.

I forget how the day-dreaming started... perhaps my usual: pick a random topic where I have some experience, and start playing with it. Run simulations and tweak parameters until there's nothing more to look at or some interesting permutation has popped up. 

Ah! Now I remember. I was thinking about the increasing importance of QA in software as more and more market share is gained. Not from the usual perspective ("we gotta make sure users don't see bugs, or our shit's gonna flop"), but rather as part of the original problem domain. It's very common for open source projects to suffer from a lack of sex appeal when it comes to QA. QA is essential for success; it is just as important as the code itself. As such, there should be ways of making QA as interesting, engaging, understood, and respected as the act of programming.

That train of thought spawned a couple more pathways for exploration, but the one that ended up being the most interesting was pondering the QA-interest problem at a human resource management level. What would it take to get talented and skilled engineers who would normally gravitate towards some other field of expertise interested in QA instead? How flexible would a company have to be to start attracting talent for this and other positions as new needs arose due to new pressures? How could it do so by growing and adapting from the inside?

Let's take these rather arbitrary initial conditions as part of a thought experiment following up on those questions:
  • a company with a sufficient number of employees to support multiple departments
  • a highly intelligent, motivated workforce
  • flexible employees and teams, probably working in a distributed, remote environment
Imagine with this sort of company that employees are not locked into teams, departments, divisions, etc. Let's say there is some mechanism which allows for an easy re-shuffling of talent throughout the company. For budgeting and reporting purposes, corporate structures would remain, but the individuals performing specific tasks would be granted much more expansive, organic freedoms when it came to projects and tasks.

From the individual's perspective, an employee could choose to work for and closely associate with whatever team they wanted to spend some time with -- given, of course, that this team could make use of the newly reallocated employee's particular skills and abilities. This is somewhat analogous to the geographical freedoms one has when working remotely: you can move anywhere in the world, to whichever time zone you prefer, whatever culture you want to enjoy, given that you can continue to work effectively.

Of course, checks and balances would have to be introduced. Otherwise, what's to prevent an employee from team-hopping, never really getting any work done, and just playing social butterfly? Karma/experience points could be earned on a per-task, per-project, per-team, per-department, and per-division basis. Moving at any particular level to a different group would reset an employee's karma for that group and all other organizational units below it. For instance, changing to another department would zero out any team, project, and unfinished task karma that had been accumulated. Perhaps some of the patterns from role playing games could come in handy as sources of inspiration.
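
To make the reset mechanics concrete, here's a toy model of the idea (nothing more than a sketch):

class Employee(object):
    # karma is tracked per organizational level, broadest to narrowest
    LEVELS = ["division", "department", "team", "project", "task"]

    def __init__(self):
        self.karma = dict.fromkeys(self.LEVELS, 0)

    def move(self, level):
        """Moving at a level resets karma there and at all levels below."""
        for lower in self.LEVELS[self.LEVELS.index(level):]:
            self.karma[lower] = 0

employee = Employee()
employee.karma.update(
    division=120, department=80, team=40, project=10, task=5)
employee.move("department")
# division karma survives; department and below are zeroed
print employee.karma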

The ability to opt for a move could also be governed by an appropriate accumulation of karma points for a given organizational unit, to be determined at the discretion of that organizational unit. For example, hitting a certain karma "level" would open up new movement opportunities. Obviously, in addition to having enough points, the target group would have to have a need and be willing to bring the requesting person on board.

Intuitively (i.e., without doing any research or checking sources), this feels like something modeled after biological systems, as opposed to the fairly static organizational systems we currently see in many corporations. A company that was able to successfully implement such a human resource management approach could likely see additional, unexpected benefits. Biological systems tend to be well-suited to surviving shock, drastic changes, and massive failures. We might see companies with longer-reaching vision, more innovation, less frequent employee turnover, and greater financial stability.

That aside, what appeals to me at a personal level is this: employees would be participating in their work at a new level. The management processes behind professional development would be opened up to them. Not only that, part of one's work would become a game with known rules and clearly defined markers for cumulative achievements. Rewards, though, would not be power-climbing, but rather lateral expansion and exploration. Deeper involvement in other areas of the company.

I'd love to hear from folks who can recommend some good reading/research materials on this topic...

Saturday, June 20, 2009

Mac OS X: Execute Shell Commands via Icon-Clicks



My main development machine is a custom PowerBook running Ubuntu natively. I use it when I'm sitting on the couch, my office comfy chair, the futon, floor, etc. Every once in a while, though, I want to work at a desk from my 24" iMac. Just to mix it up a little. However, that box is my gaming and web-browsing machine: it runs Mac OS X and that's the way I want to keep it. So, if I'm going to do work on the iMac, I need to ssh into the machines that have the environments set up for development.

In the course of an average day of writing code, I'll connect to anywhere from 1 to 5 remote machines, opening 5-10 ssh sessions in a terminal to each machine. If I'm at the iMac, this gets tedious. Today, it got tedious enough for me to do something about it. Here's what I want: to click on a Terminal icon and have an ssh connection automatically established to the box I need to work on. This is pretty easy on Linux and Windows, but I had no idea how to accomplish it on a Mac until tonight.

I thought I'd share my solution; others may like it... but I'm betting there are some pretty cool ways of doing this that didn't occur to me -- so feel free to share yours!


Profile Hack

From previous messing about with the open command, I knew I could open Terminal.app from the terminal:
open -n "/Applications/Utilities/Terminal.app"
This got me part way there... if only I could dynamically execute a command upon login... so, yeah, I did something nasty:
vi ~/.bash_profile
And then:
# automatically ssh to the requested host for shells spawned
# with REMOTE_CONNECTION set in the environment
if [ ! -z "$REMOTE_CONNECTION" ]; then
    ssh $REMOTE_CONNECTION
    REMOTE_CONNECTION=""
fi

.command Files


I was stumped at that point, until some googling revealed a nifty trick I didn't know about:
  • Create a new file in your favorite editor, using the .command extension
  • Add the commands you want executed
  • Save it and chmod 755
  • Double-click it and enjoy
So here's what I added to rhosgobel.command:
REMOTE_CONNECTION=rhosgobel \
open -n "/Applications/Utilities/Terminal.app"

The Obligatory Icon Tweak


I then used the standard "Get Info" trick of icon copying: "Get Info" for Terminal.app, copy icon, "Get Info" for all my .command files, paste icon.


Usage


Now, I just click my "Shells" menu, choose the destination, and start working on that machine. A new window or new tab opened with that instance of Terminal.app will give me a new session to that server, without having to manually ssh into it -- this is even more convenient than having an icon to double-click!

One bit of ugly I haven't figured out how to remove: when I open a shell to a remote server, there's another shell opened at the same time with a [Process completed] message.


Thursday, June 18, 2009

A Sinfonia on Messaging with txAMQP, Part III


A Sinfonia on Messaging:
  1. The Voice of Business
  2. The Voice of Architecture
  3. A RabbitMQ and txAMQP Interlude

Before we play our third voice in this three-part invention, we need to do some finger exercises. In particular, let's take a look at the concepts and tools we'll be using to implement and run our kilt store messaging scenario.


Messaging

The RabbitMQ FAQ has this to say about messaging:

Unlike databases which manage data at rest, messaging is used to manage data in motion. Use messaging to communicate between and scale applications, within your enterprise, across the web, or in the cloud.
Paraphrasing Wikipedia's entry on AMQP:
The AMQ protocol is for managing the flow of messages across an enterprise's business systems. It is middleware to provide a point of rendezvous between backend systems, such as data stores and services, and front end systems such as end user applications.

AMQP Essentials

AMQP is a protocol for middleware servers ("servers" is used in the most general sense here... anything that is capable of running a service) -- servers that accept, route, and buffer messages. The AMQP specification defines messaging-server LEGO blocks that can be combined in various ways and numbers to achieve any manner of messaging goals, with final forms as diverse as the combinations of their components.

For the visually inclined, below is a simple diagram of the AMQ protocol. I've put multiple virtual hosts ("virtual hosts" in the AMQP sense, not Apache!) in the diagram to indicate support for multiple server "segments" (domains in the most general sense). There could just as easily be multiple exchanges and queues in each virtual host, as well. Likewise for publishers and consumers.

[Diagram: the AMQ protocol -- publishers sending messages to exchanges, exchanges routing them to message queues, and consumers reading from those queues, within multiple virtual hosts on a single AMQP server]

I highly recommend reading the spec: it is exceedingly clear at both intuitive and practical levels. To better understand the diagram above, be sure to read the definition of terms at the beginning, as well as the subsections in 2.1 about the message queue and the exchange. Don't miss the message life-cycle section either -- you'll be reminded of circuitry diagrams and electronics kits, which is what AMQP really boils down to :-)

The Advanced Message Queuing Protocol specifies how exchanges and message queues can be created, chained together, and reconfigured, all dynamically. Any piece of code that has access to an API for your AMQP server can connect to it and communicate with other code -- using or creating messaging patterns from the simple to the deeply complex, and everything in between.


RabbitMQ Quickstart

RabbitMQ is a messaging system written in Erlang; in particular, it is an implementation of AMQP. The RabbitMQ web site provides documentation on installing and administering the messaging server. I run mine on Ubuntu, but since I've got a custom Erlang install, I didn't install the package (I dumped the source in /usr/lib/erlang/lib). To participate in the code play for this blog series, you'll need to install RabbitMQ.

Once you've got it installed, you'll need to start it up. If you've used something like Ubuntu's apt-get to install RabbitMQ, starting it up is as simple as this:
sudo rabbitmq-server

If you've got a custom setup like mine, you might need to do something like this (changing the defaults as needed):
BASE=/usr/lib/erlang/lib/rabbitmq-server-1.5.5/
BIN=$BASE/scripts/rabbitmq-server

RABBITMQ_MNESIA_BASE=$BASE/mnesia \
 RABBITMQ_LOG_BASE=/var/log/rabbitmq \
 RABBITMQ_NODE_PORT=5672 \
 RABBITMQ_NODENAME=rabbit \
 $BIN &


A txAMQP Example

Now that we've got a messaging server running, but before we try to implement our kilt store scenarios, let's take a quick sneak peek at txAMQP with a simple example having the following components:
  • a RabbitMQ server
  • a txAMQP producer
  • a txAMQP consumer
From reading the spec, we have a general sense of what needs to happen in our producer. It needs to:
  • connect to the RabbitMQ server
  • open a channel
  • send a message down the channel
Similarly, our reading lets us anticipate the needs of the consumer:
  • connect to the RabbitMQ server
  • open a channel
  • create an exchange and message queue on the RabbitMQ server, binding the two
  • check for in-coming messages and consume them
I have refactored some examples that the author of txAMQP created and I've put them up here. Once you download the three Python files (and the spec file, one directory level up), you can run them in two separate terminals. In terminal 1, start up the consumer:
python2.5 consumer amqp0-8.xml
In terminal 2, fire off a message:
python2.5 producer amqp0-8.xml \
"producer-to-consumer test message 1"
After running the producer with that message, you should see the same text rendered in the consumer terminal window. You can also fire the message off first, then start up the consumer. The message is sitting in a queue on your RabbitMQ instance and will be available to your consumer as soon as it connects.

Now that you see evidence of this working, you're going to be curious about the code :-) Go ahead and take a look. There are lots of comments in the code that give hints as to what's going on and the responsibilities that are being addressed.
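
To give a feel for what you'll see there, here is a condensed producer along the same lines -- a sketch only, assuming txAMQP's AMQClient, TwistedDelegate, and Content, plus the guest account on a stock RabbitMQ:

from twisted.internet import reactor
from twisted.internet.defer import inlineCallbacks
from twisted.internet.protocol import ClientCreator
from txamqp.client import TwistedDelegate
from txamqp.content import Content
from txamqp.protocol import AMQClient
import txamqp.spec

@inlineCallbacks
def pushText(client, body):
    yield client.authenticate("guest", "guest")
    channel = yield client.channel(1)
    yield channel.channel_open()
    # declare an exchange and a queue, bind them, and publish
    yield channel.exchange_declare(exchange="sinfonia", type="direct")
    yield channel.queue_declare(queue="test")
    yield channel.queue_bind(queue="test", exchange="sinfonia",
                             routing_key="test")
    yield channel.basic_publish(exchange="sinfonia", routing_key="test",
                                content=Content(body))
    yield channel.channel_close()
    reactor.stop()

spec = txamqp.spec.load("amqp0-8.xml")
d = ClientCreator(reactor, AMQClient, delegate=TwistedDelegate(),
                  vhost="/", spec=spec).connectTCP("localhost", 5672)
d.addCallback(pushText, "producer-to-consumer test message 1")
reactor.run()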

If you are familiar with Twisted, you may have noted that the code looks a little strange. If you're not, you may have noticed that the code looks normal, with the exception of extensive yield usage and the inlineCallbacks decorator. Let me give a quick overview:

Ordinarily, Twisted-based libraries and applications use the asynchronous Twisted deferred idiom. However, there's a little-used bit of syntactic sugar in Twisted (for Python 2.5 and greater) that lets you write async code that looks like regular, synchronous code. This was briefly explored in a post on another blog last year. The Twisted API docstring for inlineCallbacks has a concise example.

Briefly, the difference is as follows. In standard Twisted code, we assign a deferred-producing function's or method's return value to a variable and then call that deferred's methods (e.g., addCallback):
def someFunc():
   d1 = someAsyncCall()
   d1.addCallback(_someCallback)
   d2 = anotherAsyncCall()
   d2.addCallback(_anotherCallback)

With inlineCallbacks, you decorate your function (or method) and yield for every deferred-producing call:
from twisted.internet.defer import inlineCallbacks

@inlineCallbacks
def someFunc():
   result1 = yield someAsyncCall()
   # work with the result; no need for a callback
   result2 = yield anotherAsyncCall()
   # work with the second result; no need for a callback

Visually, this flows like regular Python code. Know, however, that each yield suspends the function without blocking the reactor (given that no blocking code is present, e.g., file I/O), and execution resumes as soon as the deferred has a result (which is assigned to the left-hand side). Since this latter idiom is used in txAMQP, I use it in the examples as well.

Next, we finally reach our implementation!



Thursday, June 11, 2009

A Sinfonia on Messaging with txAMQP, Part II


A Sinfonia on Messaging:
  1. The Voice of Business
  2. The Voice of Architecture
  3. A RabbitMQ and txAMQP Interlude

After writing the last blog post, I found a fantastic site that focuses on messaging in the enterprise. I have really enjoyed the big-picture overview I get from some of the Martin Fowler signature books in this series, so I ordered a copy of this one too.

On the web site, the authors give a nice example of messaging integration in Chapter 1. They provide a more detailed, supplier version of the kilt store (we're doing "manufacturing" as opposed to distribution) with "Widget-Gadget Corp", but the basic principles are the same. I highly recommend reading that entire page. I used it as the basis for much of this post.


Business Process Overview

At a top-level, we have the following business process for MacGrudder's kilt store:

[Diagram: the kilt store's top-level business process -- sales, billing, manufacturing, and shipping]

These are represented by a sales guy or a web store, a third-party billing service, MacGrudder, and a third-party shipping service.

Up until now, the sale process could be either a user deciding to buy something in the online store or the sales guy engaging with a customer. Both generated orders; neither shared resources. The web app interfaced with the payment gateway operated by billing/shipping guy. The sales guy had to call in his orders to the billing/shipping guy. Once orders were charged and approved, a printout was handed to MacGrudder, who then created the ordered kilt. Once completed, he'd set it aside for shipping guy to come box it up and slap a label on it.


The Voice of Architecture

We're now ready to weave in the second voice of our three-part invention. MacGrudder's original infrastructure consists of silos of applications, functionality, data, and process. We want to interconnect these separated areas in order to reduce long-term overhead incurred by redundant components and data. Practically, we want to see the following changes:


Unified orders: at the end of the sales process, there should be one abstraction of the "order", regardless of whether the source was the web store, a phone call, or the sales guy. The order abstraction will be a message (or series of messages, for orders with multiple items; we'll be addressing only the simple case of a single item).


Unified validation and billing: at the end of the order creation process, there should be a validated order or an invalid order (e.g., if there were insufficient funds). At the end of the billing process, there should be an approved order that has been paid for via a single means (e.g., a single payment gateway, without bothering billing guy for manual entry). Additionally, once an order has been validated, messages should be sent to other components in the system.


Unified status: at the end of the manufacturing process, both the shipping guy and customers should be aware that the product has been completed and is ready to be sent: the shipping guy can connect to our messaging system (probably via a service) and the customer can be notified by email or by checking the order status in the web kilt store.


In the next installment, we will finally start looking at the code. We'll look at the "unified orders" messaging solution after covering some basics with RabbitMQ and Twisted integration, and then see how far we get with implementation details and descriptions. Unified validation, billing, and status might have to be pushed to additional posts.



Sunday, June 07, 2009

A Sinfonia on Messaging with txAMQP, Part I


A Sinfonia on Messaging:
  1. The Voice of Business
  2. The Voice of Architecture
  3. A RabbitMQ and txAMQP Interlude

Prelude

A complete messaging solution is a three-part invention of business need, architecture, and implementation. In its final form, these three voices blend in harmony, with each one taking a dominant role depending upon which part of the solution one examines.

I have neither the ability nor the skill to seamlessly weave three concepts together while clearly explaining their roles. I will therefore separate the voices from one another and leave it as an exercise for the reader to construct an application and practice the principles involved, thus experiencing well-earned contrapuntal pleasures first-hand.

Introduction

As computing and data exchange systems have increased in complexity over the past 30 years, so has the need for improvements -- and, where possible, simplifications. Some of these efforts have focused on the decentralization of communications (shared, distributed load) and the decoupling of messaging from applications (removing redundancy and increasing delivery speed/throughput). The first steps towards this were made in the 1990s with explorations in the "middleware" application universe.

Messaging, as we now refer to it in the industry, arose from those middleware adventures: out of the business drive to refactor old software as new services for wider, more sophisticated audiences. With many new services replacing a single, monolithic application, formal and well-architected solutions were needed for creating, editing, and deleting shared data.

From Wikipedia:
Message-oriented middleware (MOM) is infrastructure focused on message sending that increases the interoperability, portability, and flexibility of an application by allowing the application to be distributed over multiple heterogeneous platforms. It reduces the complexity of developing applications that span multiple operating systems and network protocols by insulating the application developer from the details of the various operating system and network interfaces.
AMQP (Advanced Message Queuing Protocol) is one of these protocols.

In a recent blog post about Ubuntu as a business server, Vaughan-Nichols provides evidence for Ubuntu's and Canonical's commitment to enterprise, saying "... the new [version of] Ubuntu also includes AMQP [...] support. AMQP is an important set of middleware and SOA [...] protocols."

AMQP has demonstrated itself as a compelling protocol for messaging solutions, even to the point of being included in two Linux distributions. The code included in this blog series is txAMQP, an asynchronous Python AMQP library built with Twisted.

Note that last year I had planned to write a blog series on messaging with Twisted and XMPP, but was unable to as a result of time constraints. These days, I'm working with AMQP instead of XMPP, but I still hold some hope that I'll be able to write an analog of this series from the perspective of XMPP.


The Voice of Business

Dipping into the future, one of the goals for AMQP as developed by a special interest group is the following:
Decentralized, Locally Governed Federated Mesh of AMQP Brokers with standardized Global Addressing. The killer application for AMQP is transacted secure business messages between corporations - e.g. send a banking confirmation message to confirms@bank.com [...]
I find this rather exciting due to my interest in ultra large-scale systems; scenarios like the one described above are the seeds for tomorrow's ULS systems :-)

For now, though, let's look at a more immediate use for AMQP: a messaging protocol for shared services between departments in a small store. In this exercise, the voice of business is the primary melody; everything else (architecture and implementation) is done in support of this theme.


An Example

Fhionnlaidh MacGrudder creates hand-made kilts to order. He's got a sales guy who works with movie costume design shops and the like. He's got a web girl who wrote and maintains a custom store front app. He's got a friend who does shipping and billing for him (as well as some other local Glen Orchy artisans). Until now, these three business "groups" associated with the kilt shop have been maintaining their own records, sometimes copying and updating them manually from each others' various export files.

Fhionnlaidh's niece Fíona is a programmer and business student, and is dating shipping guy's son. Horrified by the inefficiencies in her uncle's business processes (and tired of her boyfriend's father's complaints), she has proposed the following:
  • sales guy will maintain customer contact info and offer this as a service to the store and billing
  • the web store will dynamically update displayed data when sales guy changes it
  • the CRM will dynamically update displayed data when a web store customer updates their info
  • MacGrudder will have a new web page he goes to where all pending orders are presented with their full details; changes can be updated by a customer in real-time until MacGrudder has started working on the order
  • billing/shipping guy will be notified instantly as soon as MacGrudder marks a kilt order as completed
This setup has the following benefits:
  • Contacts will be maintained in a single data store
  • There is zero latency between customer-driven updates and sales guy-driven updates
  • Customers have increased post-purchase flexibility with their orders
  • Shipping guy can plug into MacGrudder's messaging and be notified when packages are ready for pickup
  • Everyone has more time for buttered scones and tea (especially shipping guy, who will no longer be making unneeded trips down the glen)
The following changes will be made to the current software:
  • Contacts will need to be merged into the CRM
  • A read/write data service for the contacts will need to be created
  • The CRM front-end will need to be upgraded to an AJAX-enabled version
  • The web store app will need to be updated to support AJAX
  • A new page will be created which displays the status of all orders and allows MacGrudder to change an order from "pending" to "in-progress" to "completed"
  • The current "new order" email notification code in the web app will need to be changed so that it uses the same messaging as MacGrudder's status page
  • A new service needs to be created for shipping guy so that he can choose to be notified about pending pickups by email or he can check a web page or even make a query directly to the service, thus preventing unnecessary trips to MacGrudder's isolated little shop
  • After all the work is done, someone's going to need to order more scones
This example is not meant to fully justify messaging for businesses, but rather to provide a simple use case for which we can write some simple (and less than robust) code. It is a toy, but a conceptually useful one with a solid, concrete foundation.

In the next installment, we'll review the business process (with diagrams!) and then explore the architecture of the system, before and after. Another post will take that architecture and combine it with MacGrudder's already extant infrastructure, reusing as much as possible. With that in place, we will have the opportunity to look at some RabbitMQ basics and some actual txAMQP code.



Thursday, May 28, 2009

After the Cloud: Epilogue


After the Cloud:
  1. Prelude
  2. So Far
  3. The New Big
  4. To Atomic Computation and Beyond
  5. Open Heaps
  6. Heaps of Cash
  7. Epilogue

Though it's wildly exciting to imagine a future of computing where ubiquitous devices are more deeply integrated into the infrastructure we use to power our applications, the real purpose of these posts has been to explore possibilities.

Let's have crazy thoughts. Let's build upon them, imagining ways in which they could become a reality. Let's not only munch on the regular diet of the technical "now"; let's plant seeds in many experiments for the future.

This post is about thinking of small, mobile devices and cloud computing. But it's also a rough template. Let's do the same thing for the desktop: how might it evolve? Where will users be spending their time? Let's do it for the OS and the kernel: what radical changes can we envision there? The technology behind health care. Education. Our new patterns of behaviour in a constantly changing world. All of these and more deserve our attention.

The more we discuss such topics in a public forum, the more thought will be given to them. Such increased awareness and attention might spark the light of innovation years ahead of time, and do so in the context of an open exchange of ideas. Let's have Moore's law for the improved quality of life with regard to technology; let's take it out of the chip and into our lives.



Friday, May 22, 2009

Canonical's Vision


Canonical's most recent AllHands meeting finished last night (this morning, really... I can't believe I got up at 7:30am), and I'm somewhat at a loss for words. In a good way.

But I'll try anyway :-)

As someone who was highly skeptical of the validity of Canonical's business model prior to working here, I can say that not only do I not doubt our ability to be a hugely successful company, but I am deeply committed to that success. Before AllHands, Canonical had earned my respect and loyalty through the consistent support and care of its employees. After AllHands, I have a much greater practical, hands-on understanding of Canonical's strategies and the various projects involved in creating a reality of success.

What's more, though, is the completeness of my belief in the people and the vision. This is thanks to the massive exposure we've had during AllHands to the collective vision; the team projects; all the individuals with amazing histories, skills, unbelievable talent and ability to deliver; and most of all, the dedication that each employee of Canonical has to truly making the world a better place for anyone who depends upon technology.

Ubuntu is free, and that's great. But Canonical needs to be a huge commercial success if its free OS distribution is going to have the power to transform the market and thus people's lives. This AllHands has given me a complete picture of how that will happen: we're all working on a different part of this puzzle, and we're all making it happen.

Success in the marketplace is crucial. Not because of greed or the lust for power, but because we live in a world where value is exchanged. As part of that ecosystem, we want to bring the greatest value to the people. This is not "selling out"; it's selling. This does not give away a user's freedom; it helps guarantee its continued safety in a competitive, capitalist society.

If we want anyone to embrace Ubuntu instead of a non-free OS -- without asking our users to sacrifice anything -- we're going to need to make very serious changes in design, usability, integration, and stability. To do this in a clean, unified manner really only comes with a significant investment of time, direction, and capital. Due to the seriousness of Canonical's altruistic vision, as we generate this capital, we're making the dream come true for the world.

At AllHands, I've seen designs that will seriously challenge Apple. I've seen a usability team's plans for true computing goodness. I've seen revenue models that have made my jaw drop. I've seen glimpses of the bright future.

And baby, it's exciting as hell.


Monday, May 18, 2009

After the Cloud: Heaps of Cash


After the Cloud:
  1. Prelude
  2. So Far
  3. The New Big
  4. To Atomic Computation and Beyond
  5. Open Heaps
  6. Heaps of Cash
  7. Epilogue

A One-Two Punch

Let me give you the punchline first, this time. Imagine a service that:
  1. Combines the cloud management features of EC2 for any system (in this case, mobile devices), the monitoring and update management of Canonical's Landscape, the buying/selling power of an e-commerce application, the auctioning capabilities of eBay, and the data gathered by marketing campaigns.
  2. Seamlessly integrates all this into a cloud (Open Heap!) provisioning/acquisition system or as part of your mobile provider's billing and information web pages.
So where does the cash come in?


Power to the People

As usual, revenue would depend upon market adoption. That would depend upon appeal (addressed with marketing), usefulness (addressed by software engineering and usability), and viability. That last one's particularly interesting, as it's where people and the cash intersect.

A product suite and service, all built around open heaps, could have a long and fruitful life if implemented with the end user in mind. Users would have the opportunity to become partners in an extraordinary way: they would be consuming a service, while at the same time, being given the opportunity to resell a portion of that service for use in the cloud-like architectures of open heaps.

The first company that does this really well would have a continuously growing following of users. This company would be helping consumers earn immediate cash back on their property. This is something I believe deeply in; it's a positive manifestation of the continuing evolution of the consumer's role in the market. I'm convinced that the more symbiotic a relationship between consumer and producer, the healthier an economy will be.


Providers

In an open heap scenario, there are two providers: the mobile phone provider and the heap provider. Phone companies get to make money from the deal passively: through a partnership that provides them with a certain percentage of the revenue or indirectly through heap-related network use.

The heap provider (e.g., someone like Amazon or RightScale) would stand to make the most of anyone. Even though they wouldn't own the devices themselves (in contrast to current cloud providers), they would be able to assess fees on various transactions and for related services.

Imagine application developers "renting" potential CPU, memory, storage, and bandwidth from a heap that included hundreds of thousands of mobile users. The heap provider would be the trusted third party between the device owner and the application developer. In this way, the provider acts like an escrow service and can assess fees accordingly.

Imagine a dynamic sub-market arising out of this sort of provisioning: with millions of devices to choose from, a user is going to want to make theirs more appealing than thousands of others. Enter auctions. Look at how much money eBay makes. Look at the fractional fees that they assess... fees which have earned them billions of dollars.

Throw in value-adds like monitoring and specialized management features, and you've got additional sources of revenue. There's a lot of potential in something like this...


Review
Obviously, all of this is little more than creative musing. The technology isn't quite there yet for a lot of what is required to make this a reality. Regardless, given the sheer number of small, networkable devices in our society, we need to explore how best to exploit untapped resources in mobile computing: providing additional, cheaper environments for small applications, decreasing our dependency upon large data centers, and hopefully reducing the draw on power grids. We need decentralized, secure storage and processing. We need smarter, fairer, consumer-as-beneficiary economies.

Next, we develop a new segment of the market, where any user or company with one or more networked devices would be able to log in to an open heap provider's software and offer their machines as members of that cloud. There's a lot of work involved in making that happen, much of it focused on the design and implementation of really good software.

If we can accomplish all that, we will have reinvented the cloud as something far greater and more flexible than it is today.


Thursday, May 14, 2009

After the Cloud: Open Heaps


After the Cloud:
  1. Prelude
  2. So Far
  3. The New Big
  4. To Atomic Computation and Beyond
  5. Open Heaps
  6. Heaps of Cash
  7. Epilogue

Refresher

Up to now, we've considered technical explorations and possible related future directions for the technology surrounding the support of distributed applications and infrastructure. This post takes a break and returns to thoughts of provisioning resources on small devices such as mobile phones. As stated in To Atomic Computation and Beyond:
This could be just the platform for running small processes in a distributed environment. And making it a reality could prove to be quite lucrative. A forthcoming blog post will explore more about the possibilities involved with phone clouds...
But first: I'm so tired of the term "cloud," so I did some free-association... from cloud to clouds to "tons of little clouds" to "close to the ground" to cumulus -- which is Latin for "heap". Heap! It's irresistible :-)

"Open" is such a terribly abused word these days (more so than cloud), but using it as an adjective for a wild collection of ad-hoc, virtualized process spaces satisfies some subtle sense of humor. Open Heaps it is.


Starting Points

Let's think about the medium in our example: cellular telephony. Is there a potential market here? Here are some raw numbers from Wikipedia:
By November 2007, the total number of mobile phone subscriptions in the world had reached 3.3 billion, or half of the human population (although some users have multiple subscriptions, or inactive subscriptions), which also makes the mobile phone the most widely spread technology and the most common electronic device in the world.
I think we can count that as a tentative "yes."

Can we do this the easy way and just use TCP/IP? In other words, what about using WiFi phones or dual-mode mobile phones as the communication medium for devices in our open heaps? Well, that would certainly make many things much easier, since everything would stay in the TCP/IP universe. However, the market penetration of standard mobile phones is so much greater in comparison.

That being said, how many currently operating phones are capable of serving content on the internet, running background processes, etc.? Maybe only a small fraction -- perhaps small enough to justify supporting only devices such as handhelds, smartphones, MIDs, UMPCs, and netbooks.

Two possibilities for ventures here might be:
  1. A startup that developed an Open Heap offering for any Internet-connected device.
  2. A company that formed a partnership with one or more mobile carriers, acting as a bridge between the carrier-controlled network/device-management capabilities and the Internet.

The Business Problem

So, let's say we've got the technology ready to go that will allow users to upload a process hypervisor to their phones, and that this technology provides the ability for users to allot process resources (e.g., RAM, CPU, storage). There are still a couple basic principles to address to justify a business in this area:
  1. How will people be better off with this than without it?
  2. How will this technology generate revenue?
In general, I believe that consumers are always better off with more choices. I also believe that balanced systems run better than those that are rigged to benefit just one group. As such, I have an idealist's interest in things like Open Heaps, as they will empower interested consumers to earn revenue (however small) on their own property (mobile phones and other devices with marketable resources). What's more, if there are billions of devices available as nodes in Open Heaps, and there is a computing demand for those resources, then there will inevitably be competitors aiming to capitalize on them. Generally speaking, I also believe that increased competition provides a better chance for improved quality of service.

Conversely, imagine that Open Heaps don't happen: that the idle resources of mobile devices (or any other eligible equipment) either go untapped or, worse, are put to use by corporations that want the end consumer to have only limited power over their own property and how it's used. Dire scenarios aren't difficult to imagine, thanks to various examples of anti-consumer behaviour we've seen from large corporations and special interest organizations in the recent past.

So, yes -- I think we can make a case for this being of benefit to consumers (and thus a marketer's dream!). The more prevalent mobile devices become, the more they will integrate into our daily lives... and the more important it will be that these are managed as the rightful property of the consumer, people that have the right to rent or lease and profit from their property as they see fit.

But, how could this generate revenue?

Next up: Gimme da cash!



Thursday, April 23, 2009

Generators and Coroutines


This came up in the blog comments yesterday, but it really deserves a post of its own. A few days back, I was googling for code and articles that have been written regarding the use of Python 2.5 generators to build coroutines. All I got were many dead ends. Nothing really seemed to have any substance, nor did the materials dive into the depths in the manner I was hoping to find. I was at a loss... until I came across a blog post by Jeremy Hylton.

This was heaven-sent. Until then, I'd never looked at David Beazley's instructional materials, but I was immediately dumbstruck by the beautiful directness, clarity, and simplicity of his style. He is lucid on the topics while conveying great enthusiasm for them; with Python 2.5 generator-based coroutines, few have managed to do this a fraction as well as David has. I cannot recommend the content of these two presentations highly enough.
Even though I've never been to one of his classes, after reading his fascinating background, I'd love the chance to pick his brain... for about a year or two (class or no!).

I've done a little bit of prototyping using greenlets before, and will soon need to do much more than that with Python 2.5 generators. My constant companion in this work will be David's Curious Course. Also, don't give the other slides a pass, simply because you already understand generators. David's not just about conveying information: he's gifted at sharing and shifting perspective.
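
If you haven't seen the trick before, here's the flavor of it in miniature (in the style of the "Curious Course", though not taken from it): a generator becomes a coroutine the moment you start pushing values into it with send().

def grep(pattern):
    # a coroutine: lines are pushed in from the outside via send()
    print "Looking for %s" % pattern
    while True:
        line = yield
        if pattern in line:
            print line

g = grep("coroutine")
g.next()  # prime it: advance execution to the first yield
g.send("nothing to see here")
g.send("generators as coroutines are a delight")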



Wednesday, April 22, 2009

Functional Programming in Python


Over the past couple years or so I've toyed with functional programming, dabbling in Lisp, Scheme, Erlang, and most recently, Haskell. I've really enjoyed the little bit I've done and have broadened my experience and understanding in the process.

Curious as to what folks have done with Python and functional programming, I recently ran a google search I should have run years ago and discovered some community classics. I'm posting them here, in the event that I might spare others a similar oversight :-)
I've always enjoyed David's writing style, though I'd never read his FP articles until now. They were quite enjoyable and have aged well, despite referencing older versions of Python. Andrew's HOWTO provides a wonderful, modern summary.

I make fairly regular use of itertools but have never used the operator module -- though I now look forward to some FP-idiomatic Python playtime with it :-) I've never used functools, either.
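
For anyone who hasn't combined these modules before, here's the sort of idiom I mean (a trivial sketch, in Python 2.5 flavor):

import itertools
import operator
from functools import partial

# double every odd number under ten, then sum the results --
# no explicit loops, just composed functions
odds = itertools.ifilter(lambda n: n % 2, xrange(10))
double = partial(operator.mul, 2)
total = reduce(operator.add, itertools.imap(double, odds))
print total  # 2 * (1 + 3 + 5 + 7 + 9) == 50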

Enjoy!



Tuesday, April 21, 2009

After the Cloud: To Atomic Computation and Beyond


After the Cloud:
  1. Prelude
  2. So Far
  3. The New Big
  4. To Atomic Computation and Beyond
  5. Open Heaps
  6. Heaps of Cash
  7. Epilogue

To restate the problem: we've got cloud for systems and we've got cloud for a large number of applications. We don't have cloud for processes (e.g., custom, light-weight applications/long-running daemons).

Personally, I don't want a whole virtual machine to myself; I just need a tiny process space for my daemon. When my daemon starts getting slammed, I want new instances of it started in a cloud (and then killed when they're not needed).

What's more, over time, I want to be writing my daemon better and better... using less of everything (memory, CPU, disk) in subsequent iterations. I want this process cloud to be able to handle potentially significant changes in my software.

Dream Cloud

So, after all that stumbling around, thinking about servers in the data center as the horsepower behind distributed services, and then user PCs/laptops as a more power-friendly alternative, the obvious hit me: phones. They are almost ubiquitous. People leave them on, plugged in, and only use them for a fraction of that time. What if we were able to construct a cloud from cell phones? Hell, let's throw in laptops and netbooks, too. And Xboxes, Wiis, and TiVos. Theoretically, anything that could support (or be hacked to support) a virtual process space could become part of this cloud.

This could be just the platform for running small processes in a distributed environment. And making it a reality could prove to be quite lucrative. A forthcoming blog post will explore more about the possibilities involved with phone clouds... but for now, let's push things even further.

When I mentioned this idea to Chris Armstrong at the Ubuntu Developer Conference last December, he immediately asked me if I'd read Charles Stross' book Halting State. I had started it, but hadn't gotten to the part about the phones. A portion of Stross' future vision in that book dealt with the ability of users to legally run programs on others' phones. I really enjoyed the tale, but afterwards I was ready to explore other possibilities.

Horse-buggy Virtualization


So I sat down and pondered other possibilities over the course of several weeks. I kept trying to think like a business visionary given a new resource to exploit. But finally I stopped that and tried just imagining the possibilities, based on examples from computing and business history.

What's the natural thing for businesses to do when someone invents something or improves something? Put new improvements to old uses, potentially reinventing old markets in the process. That's just the sort of thing that could happen with the cloudification of mobile devices.

For examples, imagine this:
  • Phone cloud becomes a reality.
  • Someone in a garage in Silicon Valley buys a bunch of cheap phones, gumstix, or other small ARM components, rips off the cases, and sells them in rack-mountable enclosures.
  • Data centers start supplementing their old hardware offering with this new one that lets them use phone cloud tech (originally built for remote, hand-held devices) to sell tiny fractions of resources to users (on new, consolidated hardware... like having hundreds of phone uses in a single room with full bars, 24/7).
  • With the changing hardware and continuing improvements in virtualization software, more abstraction takes place.
  • Virtualization slowly goes from tool to prima materia, allowing designers not to focus on old-style, horse-drawn "machines" like your grandpa used to rack, but rather on abstract process spaces that provide just what is needed, for example, to enable a daemon to run.
Once you've gotten that far, you're just inches from producing a meta operating system: process spaces (and other abstracted bits) can be built up to form a traditional user space. Or they can be used to build something entirely different and new. The computing universe suddenly gets a lot more flexible and dynamic.

Democritus Meets Modern Software

So, let's say that my dream comes true: I can now push all my tiny apps into a cloud service and turn off the big machines I've got colocated throughout the US. But once this is in place, how can we improve our applications to take even better advantage of such a system, one so capable of massively distributing our running code?

This leads us to an almost metaphysical software engineering question: how finely can you divide an application before you reach the limits of functionality, where any further division would yield senseless bytes and syntax errors? In terms of running processes, what is your code atom?

Prior to a few years ago, the most common answer would likely have been "my script" or "my application". Unless, of course, you asked a Scheme programmer. Languages like Scheme, Haskell, and Erlang are finding rapidly increasing acceptance as solutions for distributed programming problems because functional languages lend themselves naturally to concurrency and parallelism.

If we had a massive computing cloud (atmosphere, more likely!) where we could run code in virtual process spaces, we could theoretically go even further than running a daemon: we could split our daemon up into async functions. These distributed functions could be available as continuously running microthreads/greenlets/whatever. They could accept an input and produce an output. Composing distributed functions could result in a program. Programs could change, failover, improve, etc., just by adding or removing distributed functions or by changing their order.
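
Here's a toy, single-machine sketch of that idea, assuming nothing about transport: each "distributed function" is modeled as a continuously running coroutine that accepts an input and produces an output, and a "program" is nothing more than an ordering of such stages. In the imagined cloud, each stage would live in its own process space with messages on the wire between them.

    # Wrap a plain function as a continuously running micro-process.
    def stage(fn):
        def run():
            result = None
            while True:
                value = yield result
                result = fn(value)
        g = run()
        next(g)  # prime the coroutine so it waits for its first input
        return g

    # A "program" is just an ordering of stages.
    def compose(stages):
        def program(value):
            for s in stages:
                value = s.send(value)
            return value
        return program

    double = stage(lambda x: x * 2)
    inc = stage(lambda x: x + 1)

    prog = compose([double, inc])
    print(prog(10))   # 21

    # "Upgrading" the program is just swapping or re-ordering stages:
    prog2 = compose([inc, double])
    print(prog2(10))  # 22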

From Atoms to Dynamic Programs

Once we've broken down our programs into distributed functions and have broken our concept of an "Operating System" down into virtual process spaces, we can start building a whole new world of software:
  • Software becomes very dynamic, very distributed.
  • The particulars of hardware become irrelevant (it just needs to be present, somewhere).
  • We see an even more marked correlation between power consumption and code, where individual functions could be metered by the joules they consume.
  • Just for fun, let's throw in dynamic selection of functions or even genetic algorithms, and we have ourselves one of the core branches of the predicted Ultra-large Scale Systems :-) (A toy sketch of that last bit follows this list.)
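
Here's a toy of dynamic selection of functions: run competing implementations of the same job, score them by a cost measure (wall-clock time below, standing in for metered joules), and let the cheapest one win the slot.

    import time

    def cost(fn, arg, trials=200):
        # Score an implementation by how long it takes; in the dream cloud
        # this would be metered energy rather than wall-clock time.
        start = time.time()
        for _ in range(trials):
            fn(arg)
        return time.time() - start

    def sum_loop(n):
        total = 0
        for i in range(n):
            total += i
        return total

    def sum_formula(n):
        return n * (n - 1) // 2

    candidates = [sum_loop, sum_formula]
    best = min(candidates, key=lambda fn: cost(fn, 100000))
    print("selected:", best.__name__)  # almost certainly sum_formula
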
I mention this not for cheap thrills, but because of the importance of having a vision. Even if we don't get where we think we're going, by looking ahead we have the opportunity to influence the journey and improve our chances of arriving somewhere equal to or better than where we originally intended.

From a more practical perspective: today, I'm concerned about running daemons in the cloud. Tomorrow I could very well be concerned with finer granularity than that. Why not explore the potential results of such technology? Yes, it may prove infeasible now; but even so, it could yield insights... and maybe more.

A Parting Message

Before I wind this blog post down, I'd like to share a couple of really excellent quotes. They are good not so much for their immediate content as for the pregnant potentials they contain, for the directions they can point our musings... and engineerings. These are two similar thoughts about messaging from two radically different contexts. I leave you with these moments of Zen:

On the Erlang mail list, four years ago, Erlang creator Joe Armstrong posted this:
In Concurrency Oriented (CO) programming you concentrate on the concurrency and the messages between the processes. There is no sharing of data.

[A program] should be thought of as thousands of little black boxes all doing things in parallel - these black boxes can send and receive messages. Black boxes can detect errors in other black boxes - that's all.
...
Erlang uses a simple functional language inside the [black boxes] - this is not particularly interesting - *any* language that does the job would do - the important bit is the concurrency.
On the Squeak mail list in 1998, Alan Kay had this to say:
...Smalltalk is not only NOT its syntax or the class library, it is not even about classes. I'm sorry that I long ago coined the term "objects" for this topic because it gets many people to focus on the lesser idea.

The big idea is "messaging" -- that is what the kernal of Smalltalk/Squeak is all about... The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be. Think of the internet -- to live, it (a) has to allow many different kinds of ideas and realizations that are beyond any single standard and (b) to allow varying degrees of safe interoperability between these ideas.

If you focus on just messaging -- and realize that a good metasystem can late bind the various 2nd level architectures used in objects -- then much of the language-, UI-, and OS based discussions on this thread are really quite moot.

Next up: The Business of Computing Atmospheres



Monday, April 20, 2009

After the Cloud: The New Big


After the Cloud:
  1. Prelude
  2. So Far
  3. The New Big
  4. To Atomic Computation and Beyond
  5. Open Heaps
  6. Heaps of Cash
  7. Epilogue

Intermission

I've made a few hints so far about what cloud service I'd like to see come into being, and at the end of this post, we'll get closer to discussing that. Hang in there: the post after this one will describe that in more detail. Then, after that, there will be at least one post which will take a peek at some of the many business opportunities that could come from this.

A Passing Comment

At PyCon 2006 in Dallas, TX, an after-hours event was held in a local bookstore. At one point during that evening, Itamar, Moshe and I got into a discussion about miniaturization and Moshe went off on a hilarious rant that Itamar and I just sat back and enjoyed. His whole tirade was based on the beauty and perfection of gumstix. This was the first I'd heard of them; I had no idea a product like that was on the market, and it hit me like a ton of bricks.

For the next day or so all I could think about was buying a boxload of gumstix computers and doing something with them -- anything! And not just because they were the coolest toys ever, but because there was something about them that I could just feel was a part of the future of computing (see my 2004 post on Dinosaurs and Mammals). It seemed that these miniature devices could help prototype what was destined to be one of the most exciting fields in the coming years for both systems and application engineers.

Sadly, I never did get that box :-) But neither did I stop thinking about them. Confronted with the problem of small distributed services sitting on big, barely-used iron, gumstix haunted my musings.

Tiny Apps in the Cloud?

When I was at Divmod, one of the strategies that Glyph and I were working on concerned Twisted adoption in web hosting and cloud environments. The differences between CGI and Twisted applications are magnified when one considers a cloud environment like Mosso versus one that would suitably support Twisted design principles. I spent a lot of time pondering the ramifications of that one, let me tell you. A potential merger permanently postponed those business possibilities, but a nice side benefit was the forking of Python Director into a pure-Twisted conversion, txLoadBalancer (with the beginnings of native, in-app load-balancing support).

Thoughts of adjusting tiny apps to be able to run on big cloud hardware still grated, though. It felt dangerously close to pounding round pegs into square holes. What I really wanted was something closer to the future hinted at by Ultra Large-Scale Systems research: massively distributed, fault-tolerant services running on everything :-) Until then, though, I would have been satisfied with tiny apps on tiny hardware, consuming only the resources they need in order to provide the service they were designed for.

This brought up ideas of distributed storage, memory, and processing, as well as the need for redundancy and failover. But tiny. All I could see was tiny hardware, tiny apps, tiny protocols, tiny power consumption. For me, tiny was big. The easiest "tiny" problem to address with small devices was storage. And I already knew the guys who were working on the problem.

Distributed Storage Done Right

There's an odd, rather abstract parallel between EC2 and Tahoe (a secure, decentralized, fault-tolerant filesystem). EC2 arose in part from a corporation acting in its own best interest: turn a liability into an asset. For Tahoe, the "body" in question isn't a corporation but a community, and the commodity isn't bottom lines but data owned and treasured by the members of that data-consuming community.

Here's a quick description of Tahoe from a 2008 paper:
Tahoe is a storage grid designed to provide secure, long-term storage, such as for backup applications. It consists of userspace processes running on commodity PC hardware and communicating with [other Tahoe nodes] over TCP/IP.
Tahoe is written in Python using Twisted and a capabilities system inspired by those defined by E. But what does this mean to a user? It means that anyone can set up and run a storage grid on their personal computers. All data is encrypted and redundant, so you don't need to trust the members of your community (your data grid); you just need to set aside some disk space on your machines for them.
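
To illustrate just the trust property (Tahoe itself does much more: encryption plus k-of-n erasure coding, so shares can be lost), here's a toy n-of-n XOR split. Any single grid member's share is indistinguishable from random noise, which is the heart of "store your data on machines you don't trust."

    import os

    def split(secret, n):
        # n-1 random pads, plus the secret XORed with all of them.
        pads = [os.urandom(len(secret)) for _ in range(n - 1)]
        last = bytes(secret)
        for pad in pads:
            last = bytes(a ^ b for a, b in zip(last, pad))
        return pads + [last]

    def join(shares):
        out = shares[0]
        for share in shares[1:]:
            out = bytes(a ^ b for a, b in zip(out, share))
        return out

    shares = split(b"my treasured data", 3)
    assert join(shares) == b"my treasured data"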

In a message to the Tahoe mail list, I responded to an associate who was exploring Tahoe for in-memory use by Python mapreduce applications. I wanted in-memory distributed storage for a different use case (tiny apps on tiny devices!) but our interests were similar. It turned out one of the primary Tahoe developers was working on related code; something that could be used as the basis for future support for distributed, solid-state devices.

Here's some nice dessert: Twisted coder David Reid was reported to have gotten Tahoe running on his iPhone. Now we're talking ;-) (Update: David has informed me that Allmydata has a Tahoe client that runs on his iPhone.)

Processing in the Right Direction

But what about the CPU? Running daemons? Can we do something similar with processing power? If a whole virtual machine is too much for users, can we get a virtual processing space? I want to be able to run my process (e.g., a Twisted daemon) on someone else's machine, but in such a way that they feel perfectly safe running it. I want Tahoe for processes :-)

As part of some recent experiments in setting up a virtual lab of running gumstix ARM images, I needed to connect multiple gumstix instances in a virtual network for testing purposes. In a search for such a solution, I discovered VDE. Then, unexpectedly, I ran across a couple of fascinating wiki pages on the site of the related super-project Virtual Square Networking. Their domain is currently not resolving for me, so I can't pull the exact text, but here's a blurb from a sister project on SourceForge:
View OS is a user configurable, modular process virtual machine, or system call hypervisor. For each process the user is able to define a "view of the world" in terms of file system, networking, devices, permissions, users, time and so on.
Man, that's so close, I can almost taste it!

Where is all this techno-rambling going? Well, I'm sure some of you have long since guessed by now :-) Regardless, I will save that for the next post.

Oh, and yes: tiny is the new big.

Next Up:
A Passing Message



Sunday, April 19, 2009

After the Cloud: So Far


After the Cloud:
  1. Prelude
  2. So Far
  3. The New Big
  4. To Atomic Computation and Beyond
  5. Open Heaps
  6. Heaps of Cash
  7. Epilogue

Systems Engineering in a Box

The recent redefinition of "the cloud" as a service and commodity is a brilliant bit of frugal resource management (making use of idle resources in an expensive data center) coupled with flawless marketing. Yes, from a business perspective, that's an amazing coup. But it's the 30,000 foot technical perspective that really impresses me:

In the same way that software frameworks, their libraries, and best practices have, through the trials of the last 40 years, productized application engineering, the cloud has started to experience something similar. What everyone is now calling the cloud is really the productization of systems engineering.

Systems engineering (and the management of related resources) has proven to be an expensive, time-consuming endeavor best left to the experts. Sadly, those who need it are often in the unenviable position of having to determine who the experts are without the proper background to do so effectively. When the planning, building, and management of large systems goes well, it's a labor of sweat and blood. When it doesn't, it's the same labor with a nightmare tinge about it, coupled with an odd time-dilation effect.

It seems that in applicable circumstances, some businesses are spared that nightmare by using a cloud service or product.

Bionic CGI

As someone with a long history and interest in application development, I was particularly keen on Google App Engine when it came out. This was a different take on the cloud, one that Mosso also seems to be embracing: upload an application that is capable of having its data access and views distributed/load-balanced across multiple systems (virtual or otherwise).

This is essentially CGI's grandchild. You have an application that needs to be started up by any number of machines in response to demand. A CGI app in Mosso will probably need very few (if any) adjustments in order to run "in the cloud." Google is a special case, since developers are using custom, black-box infrastructure built by Google (for insights into this, check out these papers), but I'd be willing to bet someone lunch that there is room for a CGI analogy at some level of Google App Engine. I guess with Google we have both application and systems engineering in a box, insofar as the systems support your application.

At any rate, it's CGI better than it was before. Better, stronger, faster.
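
To ground the analogy, here's a minimal stand-in (plain WSGI, my own toy, not Mosso's or Google's API) showing why this model "clouds" so easily: the app keeps no state in its own process, so any number of identical copies can be started on any number of machines behind a balancer.

    from wsgiref.simple_server import make_server

    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        # Any state would live in a shared store, never in this process;
        # that's what makes N copies interchangeable.
        return [b"hello from any one of N identical instances\n"]

    if __name__ == "__main__":
        make_server("", 8000, app).serve_forever()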

The Rub

However fascinating these cloud offerings may be, I find myself not getting what I need. As a developer of Twisted applications, I'm interested in small apps. Hell, I don't even like running databases and full-blown web servers. A while ago, I spent a couple of years working on Twisted-based application components that could be run as independent services (thus load-balanceable) and completely replace the standard web server + database + lots-of-code routine for application development.

So what about developers out there like me, who want to run tiny apps? We don't need "classic" web hosting, nor CGI in the cloud, nor cloud-virtualized versions of large machines.

As a small segment of the population, developers like me might seem like a waste of business consideration. But before dismissing us, consider this:
  1. Exploring small niches like this one often leads to interesting revelations.
  2. Market segments like this have proven quite vibrant and may expand into even greater territories (e.g., the iPhone apps phenomenon).
Next up: Tiny > *