Tuesday, April 21, 2009

After the Cloud: To Atomic Computation and Beyond


After the Cloud:
  1. Prelude
  2. So Far
  3. The New Big
  4. To Atomic Computation and Beyond
  5. Open Heaps
  6. Heaps of Cash
  7. Epilogue

To restate the problem: we've got cloud for systems and we've got cloud for a large number of applications. We don't have cloud for processes (e.g., custom, light-weight applications/long-running daemons).

Personally, I don't want a whole virtual machine to myself; I just need a tiny process space for my daemon. When my daemon starts getting slammed, I want new instances of it started in a cloud (and then killed when they're not needed).

What's more, over time, I want to be writing my daemon better and better... using less of everything (memory, CPU, disk) in subsequent iterations. I want this process cloud to be able to handle potentially significant changes in my software.

Dream Cloud

So, after all that stumbling around, thinking about servers in the data center as the horsepower behind distributed services, and then user PCs/laptops as a more power-friendly alternative, the obvious hit me: phones. They are almost ubiquitous. People leave them on, plugged in, and only use them for a fraction of that time. What if we were able to construct a cloud from cell phones? Hell, let's throw in laptops and netbooks, too. And Xboxes, Wiis, and TiVos. Theoretically, anything that could support (or be hacked to support) a virtual process space could become part of this cloud.

This could be just the platform for running small processes in a distributed environment. And making it a reality could prove to be quite lucrative. A forthcoming blog post will explore more about the possibilities involved with phone clouds... but for now, let's push things even further.

When I mentioned this idea to Chris Armstrong at the Ubuntu Developer Conference last December, he immediately asked me if I'd read Charles Stross' book Halting State. I had started it, but hadn't gotten to the part about the phones. A portion of Stross' future vision in that book dealt with the ability of users to legally run programs on others' phones. I really enjoyed the tale, but afterwards I was ready to explore other possibilities.

Horse-buggy Virtualization


So I sat down and pondered other possibilities over the course of several weeks. I kept trying to think like business visionaries, given a new resource to exploit. But finally I stopped that and tried just imagining the possibilities based on examples from computing and business history.

What's the natural thing for businesses to do when someone invents something or improves something? Put new improvements to old uses, potentially reinventing old markets in the process. That's just the sort of thing that could happen with the cloudification of mobile devices.

For example, imagine this:
  • Phone cloud becomes a reality.
  • Someone in a garage in Silicon Valley buys a bunch of cheap phones, gumstix, or other small ARM components, rips off the cases, and sells them in rack-mountable enclosures.
  • Data centers start supplementing their old hardware offering with this new one that lets them use phone cloud tech (originally built for remote, hand-held devices) to sell tiny fractions of resources to users (on new, consolidated hardware... like having hundreds of phone users in a single room with full bars, 24/7).
  • With the changing hardware and continuing improvements in virtualization software, more abstraction takes place.
  • Virtualization slowly goes from tool to prima materia, allowing designers not to focus on old-style, horse-drawn "machines" like your grandpa used to rack, but rather abstract process spaces that provide just what is needed, for example, to enable a daemon to run.
Once you've gotten that far, you're just inches from producing a meta operating system: process spaces (and other abstracted bits) can be built up to form a traditional user space. Or they can be used to build something entirely different and new. The computing universe suddenly gets a lot more flexible and dynamic.

Democritus Meets Modern Software

So, let's say that my dream comes true: I can now push all my tiny apps into a cloud service and turn off the big machines I've got colocated throughout the US. But once this is in place, how can we improve our applications to take even better advantage of such a system, one so capable of massively distributing our running code?

This leads us to an almost metaphysical software engineering question: how small can you divide an application until you reach the limits of functionality, where any further division would be senseless bytes and syntax errors? In terms of running processes, what is your code atom?

Prior to a few years ago, the most common answer would likely have been "my script" or "my application". Unless, of course, you asked a Scheme programmer. Programming languages like Scheme, Haskell, and Erlang are finding rapidly increasing acceptance as solutions for distributed programming problems because functional programming languages lend themselves easily to the problems of concurrency and parallelism.

If we had a massive computing cloud (atmosphere, more likely!) where we could run code in virtual process spaces, we could theoretically go even further than running a daemon: we could split our daemon up into async functions. These distributed functions could be available as continuously running microthreads/greenlets/whatever. They could accept an input and produce an output. Composing distributed functions could result in a program. Programs could change, failover, improve, etc., just by adding or removing distributed functions or by changing their order.
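
As a minimal sketch of that idea, assume Python's asyncio stands in for the cloud of process spaces. The names compose, strip, upper, and exclaim are hypothetical, made up purely for illustration. Each function accepts one input and produces one output, and a "program" is nothing more than an ordered list of them:

```python
import asyncio

# Hypothetical "code atoms": each accepts one input and produces one output.
# In a real process cloud these would live in separate process spaces;
# here they are plain coroutines sharing a single event loop.
async def strip(text):
    return text.strip()

async def upper(text):
    return text.upper()

async def exclaim(text):
    return text + "!"

def compose(*functions):
    """Build a program by chaining functions in the given order."""
    async def program(value):
        for function in functions:
            # The output of one function becomes the input of the next.
            value = await function(value)
        return value
    return program

# A program is just an ordered list of functions...
shout = compose(strip, upper, exclaim)
# ...so changing the program means adding, removing, or reordering them.
murmur = compose(strip, exclaim)

if __name__ == "__main__":
    print(asyncio.run(shout("  hello cloud ")))
    print(asyncio.run(murmur("  hello cloud ")))
```

Nothing here is actually distributed, of course; the point is only that once the atoms are independent input/output functions, swapping, dropping, or reordering them is a change to a list rather than a rewrite.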

From Atoms to Dynamic Programs

Once we've broken down our programs into distributed functions and have broken our concept of an "Operating System" down into virtual process spaces, we can start building a whole new world of software:
  • Software becomes very dynamic, very distributed.
  • The particulars of hardware become irrelevant (it just needs to be present, somewhere).
  • We see an even more marked correlation between power consumption and code, where functions themselves could be measured in joules consumed per second.
  • Just for fun, let's throw in dynamic selection of functions or even genetic algorithms, and we have ourselves one of the core branches of the predicted Ultra-large Scale Systems :-)
I mention this not for cheap thrills, but rather because of the importance of having a vision. Even if we don't get to where we think we're going, by looking ahead and forward, we have the opportunity to influence our journey such that we increase the chances of getting to a place equal to or better than where we'd originally intended.

From a more practical perspective: today, I'm concerned about running daemons in the cloud. Tomorrow I could very well be concerned about finer granularity than that. Why not explore the potential results of such technology? Yes, it may prove infeasible now; but even so, it could render insights... and maybe more.

A Parting Message

Before I wind this blog post down, I'd like to paste a couple of really excellent quotes. They are good not so much for their immediate content, but for the pregnant potentials they contain; for the directions they can point our musings... and engineerings. These are two similar thoughts about messaging from two radically different contexts. I leave you with these moments of Zen:

On the Erlang mail list, four years ago, Erlang creator Joe Armstrong posted this:
In Concurrency Oriented (CO) programming you concentrate on the concurrency and the messages between the processes. There is no sharing of data.

[A program] should be thought of thousands of little black boxes all doing things in parallel - these black boxes can send and receive messages. Black boxes can detect errors in other black boxes - that's all.
...
Erlang uses a simple functional language inside the [black boxes] - this is not particularly interesting - *any* language that does the job would do - the important bit is the concurrency.
On the Squeak mail list in 1998, Alan Kay had this to say:
...Smalltalk is not only NOT its syntax or the class library, it is not even about classes. I'm sorry that I long ago coined the term "objects" for this topic because it gets many people to focus on the lesser idea.

The big idea is "messaging" -- that is what the kernal of Smalltalk/Squeak is all about... The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be. Think of the internet -- to live, it (a) has to allow many different kinds of ideas and realizations that are beyond any single standard and (b) to allow varying degrees of safe interoperability between these ideas.

If you focus on just messaging -- and realize that a good metasystem can late bind the various 2nd level architectures used in objects -- then much of the language-, UI-, and OS based discussions on this thread are really quite moot.

Next up: The Business of Computing Atmospheres



11 comments:

Christopher said...

Dangit, you are so not allowed to read directly from my brain!

Great piece. I twittered a few weeks ago that I wanted to have a Unix pipe for the web. That would fit just fine into what you said.

Also, I've been looking at building single-purpose web services farms that cooperate, first for load-balancing, and then to self-upgrade and self-deploy. This is still at the paper/wetware stage.

Ultimately the system would allow nodes to communicate with each other and update one another as to where the fastest and/or cheapest nodes are.

Good stuff!!!

Duncan McGreggor said...

*laughs*

Thanks Chris :-)

The more that people share these views together, the closer we get to making such things a reality... the next 5-10 years are going to be fascinating!

Duncan McGreggor said...

Chris,

As part of your research into pipes for the web, I highly recommend David Beazley's two talks (slides available from his site):
* Generator Tricks for Systems Programmers
* A Curious Course on Coroutines and Concurrency

In both he mentions the next() calls as being analogous to pipes. It's a great perspective. Functional programming offers this clearly, and it's really cool to see the parallels in Python. It's always nice when someone shines a light on something familiar, giving fresh insight on things we've just been taking for granted.

Christopher said...

Duncan,

Yeah, more talk, more awareness. Forget the Cloud. Did someone register Cloud2.0 yet?

Can we get past that already?

I'll look up David Beazley's stuff. This stuff is so interesting.

By the way, SQLite is a great data platform for this sort of thing.

Duncan McGreggor said...

*chuckles*

Yeah:-)

You know that Divmod's Axiom is built on top of SQLite, right? There's some very cool stuff that they've done with distributed services and Axiom.

Jamu Kakar said...

I was talking to a friend of mine several weeks ago and mentioned the idea of "phone as a cloud node" and he brought up a good point: you want to run as little as possible on your phone to maximise battery life. I think this will be a limiting factor that makes the idea somewhat unlikely to catch on in practice until battery life is significantly better than it is now.

Duncan McGreggor said...

Jamu, it always makes me chuckle when people naysay like that :-) It's not a limitation of technology, just a difference in the type of person! If there's *anything* that we have learned from the past 100 years of technological advancements, it's that a lack one year is readily overcome the next. Human ingenuity is astounding; certainly far more powerful than any current limitations!

That being said, I have two other thoughts on this matter:

1) More than anything, these blog posts are thought experiments, meant not to be prognostications, but catalysts for our friends and fellow technologists... Perhaps someone is on the verge of doing something really amazing, and a conversation like this one provides a needed spark. Or maybe it brings like-minded thinkers together to discuss more interesting ideas.

2) In particular, with a "phone cloud", I see many other problems than battery power... ones that are historically more cumbersome problems to overcome than simple technological limitations. However, there is a potential solution to these as well. It's almost entirely a business issue and requires proper motivation... I hope to cover that soon, in the next cloud post :-)

Anyway, don't lose heart! Keep thinking crazy! Take those crazy thoughts and push them even further! Don't let anyone slow you down :-) Who knows what pops out at the end of that adventurous process? I know I would *love* to see what you might come up with!

Anonymous said...

You should check out this talk from Eben Moglen, which relates to some of the ideas you've presented here but from a different angle:

Duncan McGreggor said...

mdzlog,

Wow, thanks for the link. I haven't finished watching the presentation yet, but this is awesome. It also reminds me of the post I made a couple years ago about the future of personal data. I'd like to see things head towards a situation where individuals own their own data and rent/lease it to corporations that want it, trade it as a commodity, etc. Power (and $) to the people...

Duncan McGreggor said...

Holy crap, mdzlog:

I just finished the "Freedom in the Cloud" presentation by Eben Moglen... the thoughts are flying through my mind and the synthesizing is becoming a run-away process in my brains :-) Can't wait to start a new blog series...

Pete Hunt said...

Hey Duncan - like the idea, but I'm skeptical.

First of all, have you seen the Barrelfish operating system and the Intel SCC? You might be interested.

But what about the data? There comes a point when your atoms become so small that bandwidth and (more likely) latency become your problem rather than compute power. How do you plan to address this? I don't think it's a problem that will ever go away; I guess that it could be alleviated with some sort of optimization routines that schedule computation near the data it wants to use?