Wednesday, April 02, 2014

Hash Maps in LFE: Request for Comment

As you may have heard, hash maps are coming to Erlang in R17. We're all pretty excited about this. The LFE community (yes, we have one... hey, being headquartered on Gutland keeps us lean!) has been abuzz with excitement: do we get some new syntax for Erlang maps? Or just record-like macros?

That's still an open question. There's a good chance that if we find an elegant solution, we'll get some new syntax.

In an effort to (re)start this conversation and get us thinking about the possibilities, I've drawn together some examples from various Lisps. At the end of the post, we'll review some related data structures in LFE... as a point of contrast and possible guidance.

Note that I've tried to keep the code grouped in larger gists, not split up with prose wedged between them. This should make it easier to compare and contrast whole examples at a glance.

Before we dive into the Lisps, let's take a look at maps in Erlang:

Erlang Maps
%% create a new map
A = #{key1 => Val1, key2 => Val2, ...}
%% pattern match a map
#{key1 := Pattern1, key2 := Pattern2, ...} = VarContainingAMap
%% updating a map with new key/value pairs
NewX = X#{ key1 => Val1, ... KeyN => ValN, ...}
%% updating a map with new values for keys that already exist
NewX = X#{ key1 := Val1, ... KeyN := ValN, ...}
%% keys can be any ground term
Z = #{{age, fred} => 12,
{age, bill} => 97,
{color, red} => {rgb, 255, 0, 0}}.
%% extracing values via pattern matching
#{{color, red} := X1} = Z.
view raw 01-maps.erl hosted with ❤ by GitHub

Common Lisp Hash Tables
;; create a new empty map
(defparameter *my-map* (make-hash-table))
;; add values to map
(setf (gethash 'key-one *my-map*) "value one")
(setf (gethash 'key-two *my-map*) "value two")
;; get a value from a map
(gethash 'key-one *my-map*)
"value one"
;; update a map
(setf (gethash 'key-two *my-map*) "value 2")
;; iterating over key/values
(maphash #'(lambda (k v) (format t "~a => ~a~%" k v)) *my-map*)
KEY-ONE => value one
KEY-TWO => value two
view raw 02-maps.lisp hosted with ❤ by GitHub

Racket Hash Tables
;; create a new empty map
(define my-map (hash))
;; add values to map
(define my-map (hash-set my-map 'key-one "value one"))
(define my-map (hash-set my-map 'key-two "value two"))
my-map
'#hash((key-one . "value one") (key-two . "value two"))
;; create a new map with default values
(define my-map (hash 'key-one "value one" 'key-two "value two"))
my-map
'#hash((key-one . "value one") (key-two . "value two"))
;; get a value from a map
(hash-ref my-map 'key-two)
"value two"
;; update a map
(hash-update my-map 'key-two (lambda (x) 2))
'#hash((key-one . "value one") (key-two . 2))
;; iterating over key/values
(hash-map my-map vector)
'(#(key-one "value one") #(key-two "value two"))
view raw 03-maps.rkt hosted with ❤ by GitHub

Clojure Hash Maps
;; create a new empty map
(def my-map (hash-map))
;; add values to map
(def my-map (assoc my-map :key-1 "value one"))
(def my-map (assoc my-map :key-2 "value two"))
;; or you could use merge
(def my-map (merge my-map {:key-1 "value one"}))
(def my-map (merge my-map {:key-2 "value two"}))
;; create a new map with default values
(assoc {} :key-1 "value one" :key-2 "value two")
{:key-2 "value two", :key-1 "value one"}
;; get a value from a map
(my-map :key-2)
"value two"
;; update a map
(def my-map (assoc my-map :key-2 "value 2"))
;; iterating over key/values
(map #(list %) my-map)
(([:key-2 "value 2"]) ([:key-1 "value one"]))
view raw 04-maps.clj hosted with ❤ by GitHub

Shen Property Lists
;; create a new property list with default values
(put my-map key-1 "value one")
(put my-map key-2 "value two")
;; get a value from a property list
(get my-map key-2)
"value two"
;; create a non-system (non-default) hash-table for property lists
;; we'll make it small for demo purposes
(set *hash-table* (vector 2))
;; add key/values
(put my-map key-1 "value one" (value *hash-table*))
(put my-map key-2 "value two" (value *hash-table*))
;; get a stored value
(get my-map key-1 (value *hash-table*))
"value one"
;; peek inside the hash table
(value *hash-table*)
<[[[my-map key-1] | "value one"] [[my-map key-2] | "value two"]] ...>

OpenLisp Hash Tables
;; create a new empty map
(defdynamic my-hash (make-hash-table))
;; add values to map
(puthash 'key-1 my-hash "value one")
(puthash 'key-2 my-hash "value two")
;; get a value from a map
(gethash 'key-2 my-hash)
"value two"
;; iterating over key/values
(maphash #'(lambda (key val) (print "key: " key " and value: " val)) my-hash)
"key: "key-2" and value: ""value two"
"key: "key-1" and value: ""value one"

LFE Property Lists
;; create a new empty property list
(set my-map '())
;; create a new property list with default values
(set my-map '(#(key-1 "value one") #(key-2 "value two")))
;; get a value from a property list
(: proplists lookup 'key-2 my-map)
#(key-2 "value two")
;; or this
(: proplists get_value 'key-2 my-map)
"value two"
;; iterating over key/values
(: lists map (lambda (x) x) my-map)
(#(key-1 "value one") #(key-2 "value two"))
view raw 08-plists.lfe hosted with ❤ by GitHub

LFE orddicts
;; create a new ordered-list-based dictionary
(set my-map (: orddict new))
;; add values to map
(set my-map (: orddict store 'key-1 '"value one" my-map))
(set my-map (: orddict store 'key-2 '"value two" my-map))
;; get a value from a map
(: orddict fetch 'key-2 my-map)
("value two")
;; iterating over key/values
(: lists map (lambda (x) x) my-map)
(#(key-1 ("value one")) #(key-2 ("value two")))
view raw 09-orddicts.lfe hosted with ❤ by GitHub

I summarized some very basic usability and aesthetic thoughts on the LFE mail list, but I'll restate them here:
  • Erlang syntax really is quite powerful; I continue to be impressed.
  • Clojure was by far the most enjoyable to work with... however, doing something similar in LFE would require quite a bit of additions for language or macro infrastructure. My concern here is that we'd end up with a Clojure clone rather than something distinctly Erlang-Lispy.
  • Racket had the fullest and most useful set of hash functions (and best docs).
  • Chicken Scheme was probably second.
  • Common Lisp was probably (I hate to say it) the most awkward of the bunch). I'm hoping we can avoid pretty much everything the way it was done there :-/
One of the things that makes Clojure such a joy to work with is the unified aspect of core functions and how one uses these to manipulate data structures of different types. Most other implementations have functions/macros that are dedicated to working with just maps. While that's clean and definitely has a strong appeal, Clojure reflects a great deal of elegance.

That being said, I don't think today is the day to propose unifying features for LFE/Erlang data types ;-) (To be honest, though, it's certainly in the back of my mind... this is probably also true for many folks on the mail list.)

Given my positive experience with maps (hash tables) in Racket, and Robert's initial proposed functions like map-new, map-set, I'd encourage us to look to Racket for some inspiration:
Additional thoughts:
  • "map" has a specific meaning in FPs (: lists map), and there's a little bit of cognitive dissonance for me when I look at map-*
  • In my experience, applications generally don't have too many records; however, I've known apps with 100s and 1000s of instances of hash maps; as such, the idea of creating macros for each hash-map (e.g., my-map-get, my-map-set, ...) terrifies me a little. I don't believe this has been proposed, and I don't know enough about LFE's internals (much less, Erlang's) to be able to discuss this with any certainty.
  • The thought did occur that we could put all the map functions in a module e.g., (: maps new ... ), etc. I haven't actually looked at the Erlang source and don't know how maps are implemented in R17 yet (nor how that functionality is presented to the developer). Obviously, once I have, this point will be more clear for me.
With this done, I then did a thought experiment in potential syntax additions for LFE. Below are the series of gists that demonstrate this.

Looking at this Erlang syntax:
A = #{key1 => Val1, key2 => Val2, ...}
view raw 01-maps.erl hosted with ❤ by GitHub

My fingers want to do something like this in LFE:
(set a #((=> key1 "value 1") (=> key2 "value 2")))
view raw 02-maps.lfe hosted with ❤ by GitHub

That feels pretty natural, from the LFE perspective. However, it looks like it might require hacking on the tuple-parsing logic (or splitting that into two code paths: one for regular tuple-parsing, and the other for maps...?).

The above syntax also lends itself nicely to these:
(set a #((:= key1 "value 1") (:= key2 "value 2")))
(set a #((=> key1 "value 1") (:= key2 "value 2")))
view raw 03-maps.lfe hosted with ❤ by GitHub

The question that arises for me is "how would we do this when calling functions?" Perhaps one of these:
(set a (map (=> 'key1 '"value 1")))
(set a (map (:= 'key1 '"value 1")))
# Or this:
(set a (map (add-pair 'key1 '"value 1")))
(set a (map (update-pair 'key1 '"value 1")))
view raw 04-maps.lfe hosted with ❤ by GitHub

Then, for Joe's other example:
Z = #{{age, fred} => 12,
{age, bill} => 97,
{color, red} => {rgb, 255, 0, 0}}.
view raw 05-maps.erl hosted with ❤ by GitHub

We'd have this for LFE:
(set a #((=> #(age fred) 12)
(=> #(age bill) 97)
(=> #(color red) #(rgb 200 0 0))))
view raw 06-maps.lfe hosted with ❤ by GitHub

Before we pattern match on this, let's look at Erlang pattern matching for tuples:
{Len, Status, Data} = {8, ok, "Trillian"}
view raw 07-maps.erl hosted with ❤ by GitHub

Compare this with pattern matching elements of a tuple in LFE:
(set (tuple len status data) #(8 ok "Trillian"))
view raw 08-maps.lfe hosted with ❤ by GitHub

With that in our minds, we turn to Joe's matching example against a specific map element:
#{{color, red} := X1} = Z.
view raw 09-maps.erl hosted with ❤ by GitHub

And we could do the same in LFE like this:
(set (map (update-pair (tuple 'color 'red) x1)) z)
view raw 10-maps.lfe hosted with ❤ by GitHub

I'm really uncertain about add-pair and update-pair, both the need for them and the names. Interested to hear from others who know how map is implemented in Erlang and the best way to work with that in LFE...