Electric Duncan

Friday, December 28, 2012

The Secret History of Lambda

Being a bit of an origins nut (I always want to know how something came to be or why it is a certain way), one of the things that always bothered me with regard to Lisp was that no one seemed to talking about the origin of lambda in the lambda calculus. I suppose if I wasn't lazy, I'd have gone to a library and spent some time looking it up. But since I was lazy, I used Wikipedia. Sadly, I never got what I wanted: no history of lambda. [1] Well, certainly some information about the history of the lambda calculus, but not the actual character or term in that context.

Why lambda? Why not gamma or delta? Or Siddham ṇḍha?

To my great relief, this question was finally answered when I was reading one of the best Lisp books I've ever read: Peter Norvig's Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp. I'll save my discussion of that book for later; right now I'm going to focus on the paragraph at location 821 of my Kindle edition of the book. [2]

The story goes something like this:

Between 1910 and 1913, Alfred Whitehead and Bertrand Russell published three volumes of their Principia Mathematica, a work whose purpose was to derive all of mathematics from basic principles in logic. In these tomes, they cover two types of functions: the familiar descriptive functions (defined using relations), and then propositional functions. [3]
Within the context of propositional functions, the authors make a typographical distinction between free variables and bound variables or functions that have an actual name: bound variables use circumflex notation, e.g. x̂(x+x). [4]
Around 1928, Church (and then later, with his grad students Stephen Kleene and J. B. Rosser) started attempting to improve upon Russell and Whitehead regarding a foundation for logic. [5]
Reportedly, Church stated that the use of x̂ in the Principia was for class abstractions, and he needed to distinguish that from function abstractions, so he used ⋀x [6] or ^x [7] for the latter.
However, these proved to be awkward for different reasons, and an uppercase lambda was used: Λx. [8].
More awkwardness followed, as this was too easily confused with other symbols (perhaps uppercase delta? logical and?). Therefore, he substituted the lowercase λ. [9]
John McCarthy was a student of Alonzo Church and, as such, had inherited Church's notation for functions. When McCarthy invented Lisp in the late 1950s, he used the lambda notation for creating functions, though unlike Church, he spelled it out. [10]

It seems that our beloved lambda [11], then, is an accident in typography more than anything else.

Somehow, this endears lambda to me even more ;-)

[1] As you can see from the rest of the footnotes, I've done some research since then and have found other references to this history of the lambda notation.

[2] Norvig, Peter (1991-10-15). Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp (Kindle Locations 821-829). Elsevier Science - A. Kindle Edition. The paragraph in question is quoted here:

The name lambda comes from the mathematician Alonzo Church’s notation for functions (Church 1941). Lisp usually prefers expressive names over terse Greek letters, but lambda is an exception. Abetter name would be make-function. Lambda derives from the notation in Russell and Whitehead’s Principia Mathematica, which used a caret over bound variables: x̂(x + x). Church wanted a one-dimensional string, so he moved the caret in front: ^x(x + x). The caret looked funny with nothing below it, so Church switched to the closest thing, an uppercase lambda, Λx(x + x). The Λ was easily confused with other symbols, so eventually the lowercase lambda was substituted: λx(x + x). John McCarthy was a student of Church’s at Princeton, so when McCarthy invented Lisp in 1958, he adopted the lambda notation. There were no Greek letters on the keypunches of that era, so McCarthy used (lambda (x) (+ x x)), and it has survived to this day.

[3] http://plato.stanford.edu/entries/pm-notation/#4

[4] Norvig, 1991, Location 821.

[5] History of Lambda-calculus and Combinatory Logic, page 7.

[6] Ibid.

[7] Norvig, 1991, Location 821.

[8] Ibid.

[9] Looking at Church's works online, he uses lambda notation in his 1932 paper A Set of Postulates for the Foundation of Logic. His preceding papers upon which the seminal 1932 is based On the Law of Excluded Middle (1928) and Alternatives to Zermelo's Assumption (1927), make no reference to lambda notation. As such, A Set of Postulates for the Foundation of Logic seems to be his first paper that references lambda.

[10] Norvig indicates that this is simply due to the limitations of the keypunches in the 1950s that did not have keys for Greek letters.

[11] Alex Martelli is not a fan of lambda in the context of Python, and though a good friend of Peter Norvig, I've heard Alex refer to lambda as an abomination :-) So, perhaps not beloved for everyone. In fact, Peter Norvig himself wrote (see above) that a better name would have been make-function.

Tuesday, December 18, 2012

Seeking a Twisted Maintainer

Last week we posted on the Twisted Matrix blog about the maintainer position for the Twisted project being open. We are accepting applicants for a motivated and experienced release manager and core contributor. Our core maintainers are getting busier and busier with specialized Twisted work, and don't have the time that they used to be able to dedicate to maintaining Twisted.

The post on the Twisted Matrix blog gives a quick overview of the position; if you're interested, please check out the fellowship proposal for more details and email the address on that page (at the bottom).

Also, feel free to ping glyph, exarkun, or myself (oubiwann) on #twisted-dev on IRC to chat about it more.

Wednesday, December 12, 2012

Async in Python 3

Update: Guido has been working on PEP 3156; check on it regularly for the latest! (In the last two hours I've seen it updated with three big content changes.)

The buzz has died down a bit now, but the mellowing of the roaring flames has resulted in some nice embers in which an async for Python 3 is being forged. This is an exciting time for those of us who 1) love Python and 2) can't get us enough async.

I wanted to take the time to record some of the goodness here before I forgot or got too busy working on something else. So here goes:

The latest bout of Python async fever started in September of 2012 in this message when Christian M. Amsüss emailed the Python-ideas mail list about the state of async in Python and the hopes that a roadmap could be decided upon for Python 3. Note that this is the latest (re)incarnation of conversations that have been going on for some time and for which there is even a PEP (with related work on github).

After a few tens of messages were exchanged, Guido shared his thoughts, starting with:

This is an incredibly important discussion.

This seemed to really heat things up, eventually with core Twisted and Tornado folks chiming in. I learned a tremendous amount from the discussions that took place. There's probably a book deal in all that for a motivated archivist/interviewer...

After this went on for chunks of September and October, Guido stated that he'd like to break the discussion up into various sub-topics:

reactors
protocol implementations
Twisted (esp. Deferred)
Tornado
yield from vs. Futures

This was done in order to prevent the original thread from going over 100 messages and to better organize the discussion... but wow, things completely exploded after that (in good ways. mostly). It was async open season, and the ringing of shots in the air seemed continuous. If you scroll to about the half-way point of the October archive page, you will see the first of these new threads ([Python-ideas] The async API of the future: Reactors). These messages essentially dominate the rest of the October archives. It's probably not unexpected that this continued into November. A related thread was started on Python-dev and it seemed to revive an old thread this month (on the same list).

All of this got mentioned on Reddit, too. It inspired at least two blog posts of which I am aware: one post by Steve Dower, and another by Allen Short. Even better, though, Guido started an exploratory project called Tulip to test out some of these ideas in actual running code. As he mentions in the README, a tutorial by Greg Ewing was influential in the initial implementation of Tulip and initial design notes were made in the message [Python-ideas] Async API: some code to review.

Shortly after that, some of the Twisted devs local to Guido met with him at his former office in San Francisco. This went amazingly well and revolved mostly around the pros and cons of separating the protocol and transport functionality. Guido started experimenting with that in Tulip on December 6th. Yesterday, a followup meeting took place at the Rackspace office, this time with notes.

There's a long way to go still, but I find myself compulsively checking the commit log for Tulip now :-) It's exciting to imagine a future where Twisted and Tornado could easily interoperate with async support in Python 3 with a minimum of fuss. In fact, Glyph has already sketched out two classes which might be all that's needed for 2-way interoperation between Twisted and Python 3.

Here's to the future!

Tuesday, October 30, 2012

Async in Clojure: Playing with Agents, Part II

In the last post, we took a look at basic usage of Clojure's agent function. In this post, we'll dive a little bit deeper

Validation
We glossed over the options that you can define when creating an agent; one of them is the validator which one can use to check before the agent is updated with the passed value.

If we want to make sure that our read-agent always gets a string value, this is all we have to do:

Similarly, any function that takes a single value as a parameter can be used here. As you can see, we had to change our default value for the agent from nil to "" since there is now a string validator. If we hadn't, any time we tried to use that agent, we'd get java.lang.IllegalStateException.

When Things Go Wrong
Another option you can set when defining an agent is the error handler. This will be used in the event of an error, including if a value fails to be validated by your validator function. Here's an example:

With both of these options, you don't have to set them when the agent is defined; you can do it later with a function call, if you so desire (or if needs demand it):

Watch This!
So, we've got an error handler but no event handler? Yup. However, you can actually get callback-like behavior using watches. Check this out:

Now, any time our agent's state changes, the function passed to the watch will fire. As described in the docs, the parameters are: the agent, a key of your devising (must be unique per agent), and the handler that you want to have fired upon state change. The handler takes as parameters: the key you defined, the agent, the agent's old value, and the agent's new value.

All Together Now
With all our example code in place, we can now exercise the whole thing at once. Here's the whole thing:

To simply demonstrate the async nature and the callbacks in action, let's run the following:

Eventually, our callback will render output very much like the following:

Do note, however, that if we called a series of send-offs with different times (using the same agent and watch), we wouldn't see the ones with shorter times come back first. We'd see the callback output in the same order we called send-off. This is because the watch function is called synchronously on the agent's thread before any pending send-offs (or sends) are called. In future posts, I'll cover ways around this (constructing agents on the fly as well as exploring alternative solutions with external libraries).

Regardless, with these primitives, there are all sorts of things one can do. For instance...

Dessert
To close, check out this neat little bit of code that sends 1,000,000 messages in a ring. This code creates a chain of agents, and then actions are relayed through it (taken from the agents doc page):

Kicking this puppy off, our million messages finish in about 1 second :-)

Monday, October 29, 2012

Async in Clojure: Playing with Agents, Part I

Clojure has a very interesting async primitive: the agent. There is some good documentation on agents, but for those that come from a background such as mine (Python at Twisted), I thought it might be nice to present one way of using agents to mimic the familiar async + callback reactive-style programming.

Do note, however, that Clojure agents run in one of two threadpools (one intended for CPU-intensive tasks, and the other for I/O-intensive tasks). As such, this is quite different than the event-loop approach that Twisted uses (or async frameworks that utilize libraries such as libevent or libev). Twisted has the deferToThread functionality, which is ... well, not exactly close, really. Regardless, let's get started.

In the following examples, we're going to pretend we have huge files we'll be reading off a local disk.

What to Call
Clojure's agent function is very, very simple: you pass it a value (it's initial state) and some options, if needed. That's it.

To update its state, you use either the send or send-off functions. If you've got CPU-bound tasks whose state you want to manage with agents, then you should use the send function. If your tasks will be I/O-bound, then you should use the send-off function for updating agent state. (The threadpool dedicated for use by send has a fixed size, based on the number of processors on your system. The threadpool for send-off is exapandable with thread caching and keep-alives.) Since our examples are focused on disk I/O, we'll be using send-off. (they have the same signature, though, so the following usage information applies to both).

When you send-off something to an agent, you pass if a few things:

an agent
the action or update function
any number of additional parameters you want the action function to consume

Here's what that looks like:

What to Write
So, we know what an agent looks like when bound and we know how we're going to send an update to the agent, but how might we construct the update itself? Perhaps like this:

As you can see, the first value that an action function takes is the "old" value of the agent -- the value that the agent has prior to the action that will take place. Once this function returns, the agent's value will be set to the return value of the action function. (What's more, if we needed to access the agent itself inside the action function for any reason, we could do so using the *agent* variable -- accessible within the scope of the action function).

Before we go on, let's take a look at this in action from the REPL:

The first thing we do is switch from the default namespace to one dedicated to our examples (this makes managing scope in the REPL much cleaner). Then we load a file that has the agent and action function defined. Then we tell it to run our fake "big read" function, asking it to run for about 10 seconds. As you can see, send-off returns the agent immediately. We then get the current value of the agent by dereferencing it. Finally our big read finishes, and we see it print how long it took. We then look at the agent directly, and then dereference it again -- both showing us what we'd expect: that the value of the agent has been updated to the return value of our big read function. Finally, we shutdown the agent threads and exit the REPL.

(The start-clojure script is wrapped with rlwrap so that I have access to a command line history, persistent over different sessions. The script boils down to this: rlwrap java -cp /usr/local/clojure-1.4.0/clojure-1.4.0.jar clojure.main.)

We've seen the agent in action now, but there's a bit more we can do. We'll take a look at that in the next post.

Friday, October 19, 2012

libevent for Lisp: A Signal Example

At MindPool, there are several async I/O options we've been exploring for Lisp:

CMUCL/SBCL's SERVE-EVENT
IOlib
cl-event (libevent for Lisp, using cffi; three years old)
cl-async (also using cffi libevent wrapper, actively developed)

As luck would have it, I started with cl-event, and it was a fun little adventure (given the fact that it hasn't been maintained). I corresponded with the very nice original author a bit, asking if there were any updates in other locations, but sadly there weren't.

I was ready to dive in and get things current, when one last Google search turned up cl-async. This little bugger was hard to find, as at that point it had not been listed on CLiki. (But it is now :-)). Andrew Lyon has done a tremendous amount of work on cl-async, with a very complete set of bindings for libevent. This is just what I had been looking for, so I jumped in immediately.

As one might imagine from the topic of this post, there's a lot to be explored, uncovered, and developed further around async programming in Lisp. I'll start off slowly with a small example, and add more over the course of time.

I also hope to cover IOlib and SBCL's SERVE-EVENT in some future posts. Time will tell... For now, let's get started with cl-async in SBCL :-)

Dependencies
In a previous post, I discussed getting an environment set up with SBCL, I the rest of this post assumes that has been read and done :-)

Getting cl-async and Setting Up an SBCL Environment for Hacking
Now let's download cl-async and install the Libevent bindings :-)

With the Lisp Libevent bindings installed, we're now ready to create a Lisp image to assist us when exploring cl-async. A Lisp image saves the current state of the REPL, with all the loaded libaries, etc., allowing for rapid start-ups and script executions. Just the thing, when you're iterating on something :-)

Example: Adding a Signal Handler
Let's dive into some signal handling now! Here is some code I put together as part of an effort to beef up the examples in cl-async:

Note that the as: is a nickname for the package namespace cl-async:.

As one might expect, there is a function to start the event loop. However, what is a little different is that one doesn't initialize the event loop directly, but with a callback. As such, one cannot set up handlers, etc., except within the scope of this callback.

We've got the setup-handler function for that, which adds a callback for a SIGINT event. Let's try it out :-)

Once your script has finished loading the core, you should see output like the above, with no return to the shell prompt.

When we send a SIGINT with ^C, we can watch our callback get fired:

Next up, we'll take a look at other types of handlers in cl-async.

Thursday, October 18, 2012

Getting Started with Steel Bank Common Lisp

As some of you know, I've been a closet Lisp fan for several years. When I first joined Canonical in 2008, I was hacking on Lisp in Python, so that I could do genetic programming in Python. In fact, my first and only lightening talk at a Canonical sprint was on genetic algorithms and programming :-) (This was the same set of lightening talks that Vincenzo Di Somma gave a wonderful presentation on his photography; completely unrelated: this is one of my favorite pics of Vincenzo :-) ).

A few years later, I talked to Jim Baker about Python's AST, and how one might be able to do genetic programming by manipulating it directly, instead of running a Lisp in Python.

Throughout all this time, I've been touching in with various community projects, hacking on various Lispy Things, reading, etc., but generally doing so quite quietly. Over the past few months, however, I've really gotten into it, and Lisp has become a real force in my life, rapidly playing just as dominant a role as Python.

Similarly, MindPool has become active in several Lisp projects; as such, there are a great many things to share now. However, before I begin all that, I'd like to take an opportunity to get folks up and running with an example Lisp environment.

Future posts will explore various areas of Common Lisp, Scheme dialects, I/O loops, etc., but this one will provide a basis for all future posts that relate to Common Lisp and specifically the Steel Bank implementation.

Installing SBCL
If you don't have SBCL (Steel Bank Common Lisp; a pun on it's source parent, CMUCL), you need to install it:

For Ubuntu (12.04 LTS has 1.0.55): $ sudo apt-get install sbcl
Or you can go to the download page for everyone else.

apt-get for Lisp
Next, you'll need to install Quicklisp (as you might have surmised, it's like Debian apt-get for Common Lisp). The instructions on this page will get you up and running with Quicklisp.

I like having quicklisp available when I run SBCL, so I did the following after installing Quicklisp (and you might want to as well) from the sbcl prompt:
* (ql:add-to-init-file)

Readline Support
The default installation of SBCL doesn't have readline support for the REPL, so using your arrow keys won't give you the expected result (your command history). To remedy that, you can use a readline wrapper. First, install rlwrap:

Ubuntu: $ sudo apt-get install rlwrap
Mac OS X: $ brew install rlwrap

Then, create the chmoded 755 script ~/bin/start-sbcl with the following content (make sure that ~/bin is in your path):
rlwrap sbcl

At which point you can run the following and have access to a command history in SBCL:
$ start-sbcl
*

Why Steel Bank?
CMUCL gained an excellent reputation for being a highly performant, optimized implementation of Lisp. Based on CMUCL and continuing this tradition of excellent performance, SBCL's reputation preceded it. Over a range of different types of programs, SBCL not only compares favorably to other Lisp dialects, it seriously kicks ass all over.

SBCL comes in at 8th place in that benchmark ranking, beating out Go in 9th place. In all the languages that made it into the Top 10, I've only ever touched C, C++, Java, Scala, Lisp, and Go. In my list, SBCL made the Top 5 :-) Regardless, of all of them, Lisp has the syntax a find most pleasurable. Given my background in Python, this is not surprising ;-)

What's next?
Funny that you should ask... given my background with Twisted, I'll give you one guess ;-)

Monday, October 15, 2012

Rendering ReST with Klein and Twisted Templates

In a previous life, I spent about 25 hours a day worrying about content management systems written in Python. As a result of the battle scars built up during those days, I have developed a pretty strong aversion for a heavy CMS when a simple approach will do. Especially if the users are technologically proficient.

At MindPool, we're building out our infrastructure right now using Twisted so that we can take advantage of the super amazing numbers of protocols that Twisted supports to provide some pretty unique combined services for our customers (among the many other types of services we are providing). For our website, we're using the Bottle/Flask-inspired Klein as our micro web framework, and this uses the most excellent Twisted templating. (We are, of course, also using Twitter Bootstrap.)

Here's the rub, though: we want to manage our content in the git repo for our site with ReStructured Text files, and there's no way to tell the template rendering machinery (the flattener code) to allow raw HTML into the mix. As such, my first attempt at ReST support was rendering HTML tags all over the user-facing content.

This ended up being a blessing in disguise, though, as I was fairly unhappy with the third-party dependencies that had popped up as a result of getting this to work. After a couple false starts, I was hot on the trail of a good solution: convert the docutils-generated HTML (from the ReST source files) to Twisted Stan tags, and push those into the renderers.

This ended up working like a champ. Here's what I did:

Created a couple of utility functions for easily getting HTML from ReST and Stan from ReST.

Wrote a custom IRenderable for ReST content (not strictly necessary, but organizationally useful, given what else will be added in the future).

Updated the base class for "content" page templates to dispatch, depending upon content type.

Afterwards I was rewarded with some nicely rendered content on the staging MindPool site :-) (once the content text has been completed, we'll be pushing it live).

Kudos to David Reid for Klein and (as usual) to the Twisted community for one hell of a framework that is the engine of my internet.

Saturday, October 13, 2012

GIMP 2.8 and the Taming of Two Decades' Graphics Habits

At long last, I find myself in a 100% comfort zone with the GIMP. As a user, it's been a long road to get here ... I can't even imagine what the development teams have experienced over the past 16 years. Regardless, I am infinitely grateful for their efforts over that time, their hard work to bring the world a completely free, open source application for incredibly sophisticated image manipulation, drawing, and even painting.

I'll provide some more details on my experiences with GIMP 2.8, but first... a romp through the past!

I've been using graphics programs since my first exposure to the Macintosh Plus and SE machines in the late 1980s, where the head of the science department at Bangor High School often found himself without access to his machine (yeah, thanks to me...). As much as a science geek as I was, it was actually the graphics programs to which I was addicted. In particular, I was obsessed with using MacPaint to draw -- pixel by pixel -- a dithered bitmap of a photograph leaning next to the monitor.

About 6 or 7 years later, I was at it again doing graphics work on a non-profit's Mac (running System 7) with Photoshop 3.0 -- my first exposure to this software. It was absolutely mind-blowingly awesome what Photoshop enabled one to do with digital images. I was consumed. If I was awake, I was at the computer, creating something whacky from scratch or morphing something old into a new form. Though the work that I had done was focused on graphics for HTML and getting into the nascent field of "web design," some of my efforts ended up getting published as illustrations for a book.

I continued doing graphics as part of my work (in some form or another) from then on, usually borrowing a friend's machine in order to use Photoshop (which I could not afford). Despairing of not having my own copy of the software, I would often try out other drawing, painting, and image manipulations applications. Without fail, none would ever measure up to the power and even more, the usability, of Photoshop.

It wasn't too much after this that I got involved with Linux and open source software. As for many from that time, my life has not been the same since. But those early years were painful. My first distro (Slackware), I had to manually hack the ethernet drivers in C just to get connected to the network using my particular hardware. Regardless, Linux was all I could think about; it was all I wanted to use.

Desperately hungering for a Photoshop-like experience on Linux, I searched HotBot on a regular basis, and finally discovered GIMP. Based on a quick glance at the now-historic splash screens, it looks like my first use of the GIMP started after the 1.0 release, though I didn't start using it more regularly until 1.2 (which had my favorite GIMP splash screen to date).

Despite my hopes and the best efforts of an amazing development team, the GIMP infuriated me. What took one step in Photoshop usually took about 5 steps in the GIMP. The user interface was anti-intuitive, and seemed to be built by folks with an intimate knowledge of the underlying API and reflected this far too well.

With each major release of the GIMP, I would find myself downloading and installing it, but only actually using it a handful of times. Ultimately, I'd run crying back to Photoshop, borrowing a friend's computer or persuading someone to give me an unused copy. The very nature of graphic arts is visual and intuitive... I felt that the GIMP was blocking that natural flow. As a result, I couldn't get anything done in it (and what little I did looked bad).

Year later, while working at Canonical, I had several opportunities to create graphics for various fun projects, gags, or practical jokes. Given the nature of the company's mission, I made sure to do that work using the GIMP. Though the pain of the old days had faded considerably, and GIMP had, by 2008, reached a much higher degree of usability, I still found myself enduring awkward workflow moments on a regular basis and this continued throughout the 4 years of version 2.6's release. However, during that time, an impressive selection of GIMP plugins were created, some of which I came to adore so completely that I would switch from Photoshop to GIMP, just so I could use them on a project. This was a new -- and amazingly welcome -- change for me.

Then, earlier this year, the last few pieces fell into place.

With the release of GIMP 2.8 in May, I have not looked for nor pined over any other sophisticated graphics programs. At long last, the GIMP completely satisfies. The biggest single point of pain for me in 2.6 (given that I was running on Mac OS X most of the time) was the lack of single-window support. For every UI interaction, I had to click twice -- once to focus on the window, and once to actually perform the given action. It was crazy-making. 2.8 finally released the goodness of single-window mode for GIMP users around the globe. Futhermore, since I have been a layer-using maniac since Photoshop 3.0 and my graphics files tend to reach into the 200 and 300 MB range due to the numbers of layers (and resolution, of course), I have a desperate need to organize the chaos of all those layers. Prior to 2.8, my layer pallet on large projects was almost completely unnavigable. With 2.8, sanity and cleanliness abounds.

And there's more :-) Check out this short list:

single-window mode
multi-column dock windows
increased screen real-estate
layer groups
Cairo is used for all rendering
on-canvas text editing

For more details, you'll want to visit this page.

Oddly, this walk down memory lane and ultimate endorsement of the GIMP relates to MindPool. I used the GIMP to do all the complicated graphics work for our initial branding, logos, etc. Looking at the results, I'm sure if seems fairly silly and even trivial... but there's a LOT that went into that simplicity :-)

For instance, the shot of the moon is actually taken from a photograph of the moon. It has been passed through multiple filters in order to get just the level of abstract color and shape I was looking for. Similarly for the water, but the settings were actually quite different to achieve the result I was seeking. I think the layer count peaked around 30 in one version of the file.

Another example is the logo that we'll be using for the company's main site. This also took advantage of some plugins (new and old), though mostly it relied upon paths, brush strokes, and mask layers (the latter for the water pool reflection watermarked into the image). The new features in 2.8 made all of that work very easy to do. No longer is the tool interfering with my workflow; rather, it is facilitating it. Working with graphics using open source tools is now an absolute pleasure for me.

For giggles, I'm including a screenshot of GIMP 2.8 in action, with my single-window mode view stretched across two 27" monitors (full version is here). Graphical editing has never been so clean!

Thanks, GIMP team :-) It's nice to finally be home.

Fun link: For the historically inclined, you might enjoy this quick read about the beginnings of Photoshop and the family that made it all happen :-)

Monday, May 14, 2012

CERN, OpenStack Keep Resonance Cascades at Bay

Tim Bell preparing to get his
OpenStack on

As previously mentioned, there's a growing momentum around ops-oriented participation in the OpenStack community. DreamHost is deeply invested in DevOps, seeing how that's where we're going to be living in a few months! As Simon Anderson, CEO of DreamHost, recently said:

"When we're running a complex fabric of apps on over 5,000 servers across three data centers, we need a lean and nimble approach to software development and operational implementation. Without a DevOps approach, we wouldn't be able to push code into production as fast or as efficiently as we do, and our customers would not be happy! Today's developers demand up-to-the-hour security and performance updates to Internet infrastructure, so we aim to deliver just that with DevOps."

Though expressed in the context of our work, the import of DevOps that Simon's comment generally highlights is going to be increasingly important for nearly anyone running cloud services.

In particular, I've been following the work of the intrepid folks at CERN. As such, this post is not about DreamHost; rather, it's a mad tale of OpenStack, DevOps, and averting alien invasion.

After countless long-distance phone conversations, a flight to Switzerland, and spending several days buying pints for a security guard in the know (referred to from now on as "Barney"), I've uncovered some profound truths -- Mulder-style -- and have confirmed that the impact of OpenStack at CERN is huge.

Superficial examinations turn up the usual: CERN's planning slides, nice quotes, discussions of features and savings in time and money. For instance, in a recent email conversation with Tim "Gordon Freeman" Bell at CERN, I learned that

"The CERN Agile Infrastructure project aims to develop CERN's computing resources and processes to support the expanding needs of LHC physicists and the CERN organisation."

I think these guys have been hanging out with Simon! But once you slip behind the scenes, peek at some of the whiteboards in unattended rooms, or rifle through notes lying about, you see that things are not what they appear. I've included a shot of Mr. OpenStack-at-CERN himself; this was my first clue.

Publicly, he's been working with other teams at CERN to:

modernise the data centre configuration tools and automating operations procedures
exploit wide scale use of virtualisation, improving flexibility and efficiency
enhance monitoring such that the usage of the infrastructure can be fully understood and tuned to maximise the resources available

But privately, it seems that he and his team have been doing much, much more. This was alluded to in a statement made by team member Jan van Eldik: "We expect the number of requests to insert non-standard specimens into the scanning beam of the Anti-Mass Spectrometer to significantly decrease, once automation is in place and everyone is using the standard infrastructure we are setting up."

That isn't to say there haven't been incidents...

Innocuously enough, the current toolchains are based around:

OpenStack as a single Infrastructure-as-a-Service providing physics experiment services, developer boxes, applications servers as well as the large batch farm
Puppet for configuration management
Scientific Linux CERN as the dominant operating system with sizeable chunk of Windows installs

But that second bullet caught my eye, and one of Barney's pub mates confirmed a rumor that we'd heard: the Puppet instances are actually trained headcrabs. The primary training tool? You guessed it, a crowbar. Barney said that the folks from Dell took inspiration from this and developed it further for their OpenStack deployment framework after an extended visit to CERN.

Although Barney hadn't seen any evidence of resonance cascades, there have been minor cross-dimensional disturbances as a result of some "cowboy" activity and folks not following DevOps best practices. This has been kept quiet for obvious reasons, but has led to a small pest problem in some of CERN's older tunnel complexes. As rouge elements are discovered, CERN has been educating transgressors aggressively. (Sometimes they go as far as sending employees to Xen training... or was it Xen training?)

One artist's conception of what success will
look like for OpenStack at CERN

Despite the minor hiccoughs along the way, CERN is aiming for success. (Given the lack of Combine and forced relocation programs, they're already doing better than Black Mesa's Anomalous Materials team.) Plans are in place for an initial pre-production service, OpenStack deployment this year. Following that, they will be moving towards 300,000 virtual machines on 15,000 hosts spread across two data centres by 2015.

The OpenStack community is supporting them in their efforts with fantastic new features, high-quality discussions on the mail lists, and real-time interaction on the IRC channels. In an act of reciprocity and community spirit, operators at CERN have volunteered to contribute back to the OpenStack community with regard to operations best practices, reference architecture documentation, and support on the operators' mail list.

To see how other institutions were taking this news, I spent several days waiting on hold. In particular, Aperture Science could not be reached for comment. However, Ops team member Belmiro Rodrigues Moreira did say that there's an audio file being circulated at CERN of Cave Johnson threatening to "burn down OpenStack" ... with lemons. Given Aperture Science's failure record with time machine development, it's generally assumed to be a prank audio reconstruction. CloudStack developers are considered to be the prime suspects, seeing how much time they have on their hands while waiting for ant to finish compiling the latest Java contributions.

When asked what advice he could give to shops deploying OpenStack, Tim said simply: "Remember, the cake is a lie. Don't get distracted and don't stop. Just keep hacking."

Alyx, explaining to her dad why she loves DreamHost

Couldn't have said it better myself.

In closing, and interestingly enough, one of DreamHost's employees has an uncle who works at the Black Mesa Research Facility. Though his teleportation research team was too busy for an extended interview, his daughter did mention that she is a DreamHost customer and can't wait to use OpenStack while interning at CERN next summer. After all, that's what she uses to auto-scale her WordPress blog (she's in our private beta program).

It's a small world.

And, thanks to Tim and the rest at CERN, a safer one, too.