Sunday, September 18, 2011

Fun with pattern matching in Racket

Finally managed to circle back to playing with Racket.  The more I play, the more impressed I get.  Here are a few examples, from how Racket handles pattern matching:

(match 2
  [1 'one]
  [2 'two]
  [3 'three])

; outputs 'two

This is from the Racket language guide, which so far is proving to be a pretty good place to dip in your toe.  It's got tons of examples without drowning you in all the variations that Racket allows.  I get the feeling that Racket has a very lush vocabulary, which makes drowning a real danger.  Scuba gear attached, we dive in.

We see from the first example that something we might have suspected: we can match against pretty much anything.  Having this kind of general flexibility gives using the system a nice predictable even-ness.  We'll see later that Racket also allows us a few useful other tricks for certain data types.

(struct item (label size))
(match (item "foo" 4)
  [(item "foo" 4) (display "matched!")]
  [2 'two])

Here we create a struct that has two fields: a label and a size.  We match against a new value of this type with something that matches it directly.  Notice, too, that our patterns don't have to have the same "type", we can try to match against potentially very different data types.

That's cool, but not particularly powerful.  To gain some flexibility, we introduce variables into our patterns.  Once we do this, the value in that position in what we match against is bound to our variable, and we can use it later.

(struct item (label size))
(match (item "foo" 4)
  [(item lbl 4) (display lbl)])

; outputs foo

Now 'lbl' is going to have the string "foo" bound to it once this pattern matches.  Those who've played with Ocaml or Haskell might be familiar with this - and it's the heart of pattern matching.  Being able to pull your data apart and get at the values you need, and to do so in a very succinct way, is very powerful. Once you get used to having it, you loathe going back to languages that don't have it.

Do not fear, Racket has it in spades.

Let's get a bit trickier and switch to working with lists.  To help us play with lists, Racket has '...' which is a way to match multiple things in a list.

(match '(1 2 3 4 5)
  [(list x ...) (print x)])

; outputs '(1 2 3 4 5)

Again, not particularly interesting (at first).  We can match an entire list into a variable.  Now, we let our imagination run wild.  What else can this '...' do?  By looking at the syntax, we can hypothesize that the '...' after a variable means to match everything in that position and after with the variable.  Is it smart enough to match everything but, say, the last element and then bind that to a separate variable?

(match '(1 2 3 4 5)
  [(list x ... y) (begin (display x) (display "\n") (display y))])

; '(1 2 3 4)
; 5

It is smart enough!  We can match lists as if they were simple regular expressions.

We can even have multiple '...'

(match '(1 2 3 4 5)
  [(list 1 x ... 3 y ... z) y])

; '(4)

If the '...' follows a constant, it matches it being repeated

(match '(1 1 1 3)
  ((list 1 ... x) x))

; 3

Very cool, and this barely scratches the surface.    

Let's say we wanted to match the elements in a list, but we weren't sure of the order.  I can see you shaking your head "oh no".  Oh yes!

(match '(1 2 3 4)
  [(list-no-order 3 x 4 1) x])

; 2

What happens if we throw multiple variables in that pattern?

(match '(1 2 3 4)
  [(list-no-order 3 x 4 y) (begin (display x) (display y))])

; 12 (for me)

I have no idea what kind of deterministic order it picks, if any, but there you have it.  We can mix list-no-order with '...'

(match '(1 2 3 4 5)
  [(list-no-order 4 2 y ...) (display y)])

; '(1 3 5)


Going back to the list style, we can also match against a duplicate value

(match '(1 2 3 1)
  [(list x 2 3 x) x])

; 1

If we changed our matched value '(1 2 3 4), it would fail, because x can't be bound to both a 1 and a 4.  In the example, we bind x to the same value multiple times, so it succeeds.

What if we wanted to do the same thing, but we weren't sure where the duplicates would occur.  We might be tempted to use the list-no-order from before:

(match '(1 2 3 1)
  [(list-no-order 2 3 x x) x])

but this throws an error:

compile: unbound identifier (and no #%top syntax transformer is bound) in: x7

Not a particularly helpful error at that, unless you notice that 'x7' might be referring to some kind of uniquing going on that we don't want.  We can tweak our example to get what we want (and there may be a more idiomatic way - anyone know?)

    (match '(1 2 3 1)
      [(list-no-order 2 3 x ... ) x])
  [(list x x) x])

; 1

Can matching patterns have expressions?  Scary thought.  Oh wait, that's normal in these parts

(match '(2 4 6)
  [(list (? odd?) ...) (display "odd")]
  [(list (? even?) ...) (display "even")]
  [_ (display "neither odd nor even")])

; even

If you can use expresions, can you also call functions and check their results?  Of course

(match '(4 5 6)
  [(list 1 _ ...) (display "starts with 1")]
  [(app length 3) (display "has three elems")])

We've been using a lists quite a bit, but let's not forget that many of these patterns work with structs too

(struct person (name height))
(match (person "bob" 6.0)
  [(person (app string-length 3) _) 
   (display "Name of three chars")])

Incidentally, if we want to bind to the result of an inner pattern, like capturing the name that matches this string-length, we can use the 'and' keyword.

(struct person (name height))
(match (person "bob" 6.0)
  [(person (and x (app string-length 3)) _) 
   (display (string-append "Name of three chars: " x))])

This is ridiculously fun to see what all can do.  Thanks for reading!

For those curious in exploring further, the full list of pattern types that Racket supports is enormous, with lots more tricks like those above.

No comments:

Post a Comment