Friday, November 28, 2014

Scientific Computing with Hy and IPython

This blog post is a bit different than other technical posts I've done in the past in that the majority of the content is not on the blog in or gists; instead, it is in an IPython notebook. Having adored Mathematica back in the 90s, you can imagine how much I love the IPython Notebook app. I'll have more to say on that at a future date.

I've been doing a great deal of NumPy and matplotlib again lately, every day for hours a day. In conjunction with the new features in Python 3, this has been quite a lot of fun -- the most fun I've had with Python in years (thanks Guido, et al!). As you might have guessed, I'm also using it with Erlang (specifically, LFE), but that too is for a post yet to come.

With all this matplotlib and numpy work in standard Python, I've been going through Lisp withdrawals and needed to work with it from a fresh perspective. Needless to say, I had an enormous amount of fun doing this. Naturally, I decided to share with folks how one can do the latest and greatest with the tools of Python scientific computing, but in the syntax of the Python community's best kept secret: Clojure-Flavoured Python (Github, Twitter, Wikipedia).

Spoiler: observed data and
polynomial curve fitting
Looking about for ideas, I decided to see what Clojure's Incanter project had for tutorials, and immediately found what I was looking for: Linear regression with higher-order terms, a 2009 post by David Edgar Liebke.

Nearly every cell in the tutorial notebook is in Hy, and for that we owe a huge thanks to yardsale8 for his Hy IPython magics code. For those that love Python and Lisp equally, who are familiar with the ecosystems' tools, Hy offers a wonderful option for being highly productive with a language supporting Lisp- and Clojure-style macros. You can get your work done, have a great time doing it, and let that inner code artist out!

(In fact, I've started writing a macro for one of the examples in the tutorial, offering a more Lisp-like syntax for creating class methods. We'll see what Paul Tagliamonte has to say about it when it's done ... !)

If you want to check out the notebook code and run it locally, just do the following:

This will do the following:

  • Create a virtualenv using Python 3
  • Download all the dependencies, and then
  • Start up the notebook using a local IPython HTTP server

If you just want to read along, you're more than welcome to do that as well, thanks to the IPython NBViewer service. Here's the link: Scientific Computing with Hy: Linear Regressions.

One thing I couldn't get working was the community-provided code for generating tables of contents in IPython notebooks. If you have any expertise in this area, I'd love to get your feedback to see how I need to configure the custom ihy IPython profile for this tutorial.

Without that, I've opted for the manual approach and have provided a table of contents here:
If all goes well, you will enjoy that as much as I did :-)

More soon ...


Friday, November 21, 2014

ErlPort: Using Python from Erlang/LFE

Update 1: This post has a sequel here.

Update 2: There is a new LFE library that provides more idiomatic access to Python from LFE/Erlang by wrapping ErlPort and creating convenience functions. Lisp macros were, of course, involved in its making.

This is a short little blog post I've been wanting to get out there ever since I ran across the erlport project a few years ago. Erlang was built for fault-tolerance. It had a goal of unprecedented uptimes, and these have been achieved. It powers 40% of our world's telecommunications traffic. It's capable of supporting amazing levels of concurrency (remember the 2007 announcement about the performance of YAWS vs. Apache?).

With this knowledge in mind, a common mistake by folks new to Erlang is to think these performance characteristics will be applicable to their own particular domain. This has often resulted in failure, disappointment, and the unjust blaming of Erlang. If you want to process huge files, do lots of string manipulation, or crunch tons of numbers, Erlang's not your bag, baby. Try Python or Julia.

But then, you may be thinking: I like supervision trees. I have long-running processes that I want to be managed per the rules I establish. I want to run lots of jobs in parallel on my 64-core box. I want to run jobs in parallel over the network on 64 of my 64-core boxes. Python's the right tool for the jobs, but I wish I could manage them with Erlang.

(There are sooo many other options for the use cases above, many of them really excellent. But this post is about Erlang/LFE :-)).

Traditionally, if you want to run other languages with Erlang in a reliable way that doesn't bring your Erlang nodes down with badly behaved code, you use Ports. (more info is available in the Interoperability Guide). This is what JInterface builds upon (and, incidentally, allows for some pretty cool integration with Clojure). However, this still leaves a pretty significant burden for the Python or Ruby developer for any serious application needs (quick one-offs that only use one or two data types are not that big a deal).

erlport was created by Dmitry Vasiliev in 2009 in an effort to solve just this problem, making it easier to use of and integrate between Erlang and more common languages like Python and Ruby. The project is maintained, and in fact has just received a few updates. Below, we'll demonstrate some usage in LFE with Python 3.

If you want to follow along, there's a demo repo you can check out:
Change into the repo directory and set up your Python environment:
Next, switch over to the LFE directory, and fire up a REPL:
Note that this will first download the necessary dependencies and compile them (that's what the [snip] is eliding).

Now we're ready to take erlport for a quick trip down to the local:
And that's all there is to it :-)

Perhaps in a future post we can dive into the internals, showing you more of the glory that is erlport. Even better, we could look at more compelling example usage, approaching some of the functionality offered by such projects as Disco or Anaconda.