Saturday, October 8, 2011

Running a limited REPL in Racket

Let's say you're using Racket and want to expose only some functionality to a REPL.  One way to do this is to create a module that provides just the functionality you want, then create a namespace that includes only that module's exports.

modlang.rkt:


(module um racket
  ; along with our bindings, provide the core syntax (#%app for
  ; application, #%datum for literals) that expressions evaluated
  ; in the otherwise-empty namespace will need in order to compile
  (provide x doit #%module-begin #%app #%datum)
  (define x 100)
  (define (doit a) (display (+ a 1))))

use_modlang.rkt:


#lang racket


(define ns (make-base-empty-namespace))

; require a module's exports into the given namespace
(define (ns-req e ns)
  (parameterize ([current-namespace ns])
    (namespace-require e)
    ns))

(ns-req "modlang.rkt" ns)


(define (prompt-user)
  (printf "prompt> ")
  (define input-string (read-line (current-input-port)))
  ; keep looping until the user types "quit" (or we hit end-of-file)
  (cond
    [(and (string? input-string) (not (string=? input-string "quit")))
     (printf "~a\n" (eval (read (open-input-string input-string)) ns))
     (prompt-user)]))


(prompt-user)

And now you can run use_modlang.rkt.  At the prompt you can only do simple numerics, reference x, and call the doit function; even + isn't available.
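A session looks something like this (the exact error text will vary by Racket version; note that doit displays its result and then returns void, which our printf echoes):

prompt> x
100
prompt> (doit x)
101#<void>
prompt> (+ 1 2)
; an unbound-identifier error for + (which, being unhandled, also exits the loop)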

Wednesday, September 21, 2011

Promising New Programming Languages

It seems like lots of new programming languages are springing up these days.  A combination of the web, parallel programming, and outright curiosity seems to be fueling creativity towards new mixtures of classic features and experimental new ones.

Here are a few new(-ish) projects I'm keeping my eye on:


  • Rust (Mozilla Research) - A concurrent language that only allows sharing of mutable state if it's handed off between threads, using typestates to enforce this ownership.  Expected to be used in Project Servo, Mozilla's experimental parallel browser engine.  Recent talk slides look promising.  Still experimental (it hasn't reached 0.1 yet).
  • Kotlin (JetBrains) - A JVM language with impressively clean, lightweight syntax, in the spirit of a lighter Scala.  It may prove to be to Scala what Firefox was to Netscape, but it's not yet released, so it's hard to say for sure.  Still, their presentation at the JVM Language Summit is worth a watch.
  • ooc - You have to like the infectious audacity of this project: a hobbyist putting together a language, then proceeding to write one compiler for it in the language itself, and then another.
  • Dart (Google) - Mentioning it here because I'm curious, but the hype has already overshadowed it. We're still weeks away from knowing anything about it.  Still, Gilad Bracha is like a "curiously strong mint", so we at least know it won't be flavorless.

These are just the ones on my radar.  There are certainly others (Ceylon, Gosu, Mirah - and those are just the ones on the JVM), but they haven't quite made it to my Twitter follow list.

Sunday, September 18, 2011

Fun with pattern matching in Racket

Finally managed to circle back to playing with Racket.  The more I play, the more impressed I get.  Here are a few examples of how Racket handles pattern matching:


(match 2
  [1 'one]
  [2 'two]
  [3 'three])


; outputs 'two


This is from the Racket language guide, which so far is proving to be a pretty good place to dip your toe in.  It's got tons of examples without drowning you in all the variations that Racket allows.  I get the feeling that Racket has a very lush vocabulary, which makes drowning a real danger.  Scuba gear attached, we dive in.


The first example shows us something we might have suspected: we can match against pretty much anything.  This kind of general flexibility gives the system a nice, predictable evenness.  We'll see later that Racket also allows a few other useful tricks for certain data types.


(struct item (label size))
(match (item "foo" 4)
  [(item "foo" 4) (display "matched!")]
  [2 'two])


Here we create a struct that has two fields: a label and a size.  We match against a new value of this type with something that matches it directly.  Notice, too, that our patterns don't have to have the same "type"; we can try to match against potentially very different data types.

That's cool, but not particularly powerful.  To gain some flexibility, we introduce variables into our patterns.  Once we do this, the value in that position in what we match against is bound to our variable, and we can use it later.



(struct item (label size))
(match (item "foo" 4)
  [(item lbl 4) (display lbl)])


; outputs foo

Now 'lbl' is going to have the string "foo" bound to it once this pattern matches.  Those who've played with Ocaml or Haskell might be familiar with this - and it's the heart of pattern matching.  Being able to pull your data apart and get at the values you need, and to do so in a very succinct way, is very powerful. Once you get used to having it, you loathe going back to languages that don't have it.

Do not fear, Racket has it in spades.

Let's get a bit trickier and switch to working with lists.  To help us play with lists, Racket has '...', which matches multiple things in a list.


(match '(1 2 3 4 5)
  [(list x ...) (print x)])

; outputs '(1 2 3 4 5)

Again, not particularly interesting (at first).  We can match an entire list into a variable.  Now, we let our imagination run wild.  What else can this '...' do?  Looking at the syntax, we can hypothesize that a '...' after a variable means the variable matches everything in that position and after.  Is it smart enough to match everything but, say, the last element, and then bind that to a separate variable?

(match '(1 2 3 4 5)
  [(list x ... y) (begin (display x) (display "\n") (display y))])

; '(1 2 3 4)
; 5

It is smart enough!  We can match lists as if they were simple regular expressions.

We can even have multiple '...':

(match '(1 2 3 4 5)
  [(list 1 x ... 3 y ... z) y])

; '(4)

If the '...' follows a constant, it matches that constant repeated:

(match '(1 1 1 3)
  [(list 1 ... x) x])

; 3

Very cool, and this barely scratches the surface.    

Let's say we wanted to match the elements in a list, but we weren't sure of the order.  I can see you shaking your head "oh no".  Oh yes!

(match '(1 2 3 4)
  [(list-no-order 3 x 4 1) x])

; 2


What happens if we throw multiple variables in that pattern?


(match '(1 2 3 4)
  [(list-no-order 3 x 4 y) (begin (display x) (display y))])

; 12 (for me)

I have no idea what kind of deterministic order it picks, if any, but there you have it.  We can mix list-no-order with '...':

(match '(1 2 3 4 5)
  [(list-no-order 4 2 y ...) (display y)])

; '(1 3 5)

Cute.

Going back to the list style, we can also match against a duplicate value:

(match '(1 2 3 1)
  [(list x 2 3 x) x])

; 1

If we changed our matched value to '(1 2 3 4), the match would fail, because x can't be bound to both 1 and 4.  In the example above, x is bound to the same value both times, so it succeeds.
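For example, with a fallback clause added so the failure shows up as a value instead of a match error:

(match '(1 2 3 4)
  [(list x 2 3 x) x]
  [_ 'no-match])

; 'no-match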


What if we wanted to do the same thing, but we weren't sure where the duplicates would occur?  We might be tempted to use the list-no-order from before:


(match '(1 2 3 1)
  [(list-no-order 2 3 x x) x])

but this throws an error:

compile: unbound identifier (and no #%top syntax transformer is bound) in: x7

Not a particularly helpful error at that, unless you notice that 'x7' might be referring to some kind of uniquing going on that we don't want.  We can tweak our example to get what we want (and there may be a more idiomatic way - anyone know?)


(match 
    (match '(1 2 3 1)
      [(list-no-order 2 3 x ... ) x])
  [(list x x) x])


; 1

Can matching patterns have expressions?  Scary thought.  Oh wait, that's normal in these parts:


(match '(2 4 6)
  [(list (? odd?) ...) (display "odd")]
  [(list (? even?) ...) (display "even")]
  [_ (display "neither odd nor even")])

; even

If you can use expressions, can you also call functions and check their results?  Of course:

(match '(4 5 6)
  [(list 1 _ ...) (display "starts with 1")]
  [(app length 3) (display "has three elems")])
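
; has three elems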

We've been using lists quite a bit, but let's not forget that many of these patterns work with structs too:

(struct person (name height))
(match (person "bob" 6.0)
  [(person (app string-length 3) _) 
   (display "Name of three chars")])

Incidentally, if we want to bind the result of an inner pattern, like capturing the name that matched this string-length, we can use the 'and' pattern:

(struct person (name height))
(match (person "bob" 6.0)
  [(person (and x (app string-length 3)) _) 
   (display (string-append "Name of three chars: " x))])
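
; Name of three chars: bob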

It's ridiculously fun to see everything this can do.  Thanks for reading!

For those curious to explore further, the full list of pattern types that Racket supports is enormous, with lots more tricks like those above.







Monday, March 7, 2011

Will parallelism go implicit?

I've been spending quite a bit of time recently with Implicit Parallel Programming in pH, which luckily the school library had a copy of.  It's quite a good textbook to introduce programmers to functional programming, perhaps even one of the better ones (next to the original Introduction to Functional Programming by Bird and Wadler that I mentioned in an earlier blog post).  What turned me on to it, though, wasn't the treatment of functional programming - it was its far more audacious titular goal.

Which leads to the obvious question: what is implicit parallelism?

If you bring up the term around PL geeks, you might get a few questioning looks and some comments that hint at automatic loop parallelization.  Confusingly, that kind of parallelism is sometimes referred to as automatic parallelism: it takes sequential code and reasons about where parallelism could be introduced without changing any results.  Implicit parallelism is a more holistic approach - what if the only sequentiality in your program comes from its data dependencies?  Then the rest, it stands to reason, can be done in parallel.
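To make that concrete, here's a tiny sketch in plain Haskell syntax (pH is a Haskell dialect; the parallelism below is only potential - nothing in the code requests it):

-- 'a' and 'b' each depend only on 'n', not on each other, so the only
-- sequentiality here is the final (+) waiting on both results.  An
-- implicitly parallel implementation is free to compute them at the
-- same time; a sequential one computes them one after the other.
compute :: Int -> Int
compute n = a + b
  where
    a = sum [1 .. n]
    b = product [1 .. n]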

I mentioned this as an audacious effort.  The first and foremost reason, to my engineering mind, is: how in the world will you make this efficient?  I can just imagine the compiler identifying tens, perhaps even hundreds, of tiny tasks that can be done in parallel at any one timeslice.  How do we schedule these tasks in such a way that we don't pay a high overhead for synchronization and task switching?

I haven't dug into the implementations of the languages yet to see how they address this issue, but as I was brainstorming this morning it hit me.  Implicit parallelism isn't so different from implicit (or automatic) memory management.  When garbage collectors first came out, they weren't the lean mean things they are today.  That's part of the advantage of having decades to polish an approach, and now fewer and fewer applications programmers use languages that don't have gc support.  It saves more headaches than it causes, and that's generally a winning combination.

Not always, of course.  The HPC community has regularly shunned gc in their languages, where high performance codes are hand-tuned Fortran or C with manual memory management.  Still, I would argue that HPC is boutique, a cottage industry focused on maximal efficiency.

Most everyday programmers don't need these levels of efficiency and, I imagine, will instead opt for automatic support for parallelism.  This saves them the headache of separately tuning the same application to run comfortably on either 16 or 128 cores.  From the consumer's point of view, this is also a win: buying that new 256-core machine actually will make these apps run faster.

Some final thoughts

Perhaps it's time to focus on introducing more forms of automatic and implicit parallelism into our programming languages and runtime systems.  Our early attempts, drawing from the experiments of the past, could kick it off, and once industrial interest has turned to it, I'd expect steady progress in efficiency (just like we've seen in gc).

Is there some practical limit to the amount of parallelism in code?  There's a tension between Amdahl's law, that maximum speedup through parallelism is limited by the amount of sequential code, and Gustafson's law, which says that the increased computing power lets us tackle larger problems.  To continue to gain we may have to change how we approach problems, or even reframe the problems themselves.

Sunday, October 3, 2010

Functional Programming - Beauty in Obscurity?

"Sometimes when you read a book, it makes you feel like you're in heaven.  It's perfect.  It's elegant.  The world looks light, shiny, and happy.  Bird & Wadler first edition is one of those books"

  - Erik Meijer (for the whole interview go here)

That's high praise.  I was lucky enough to come across a copy of said book, titled Introduction to Functional Programming, sneaking time during the last few weekends to read about a third of it.  Based on what I've read so far, let me say I agree with Erik, and not without some sadness.  It is a beautiful book, but that's also one of its flaws.  In fact, the same could be said of functional programming in general.  Let me explain what I mean with an example taken from the book.

Non-decreasing sequences 
Suppose we want to define a function nondec which determines whether a sequence [x1, ..., xn] is in non-decreasing order...
...This condition can be expressed as a list comprehension in the following way: 
nondec xs = and [x <= y | (x, y) <- zip xs (tail xs)]

(Note, I've translated the above example to Haskell for ease of readability)

When I first read the code snippet I said "what in the world is that doing?", so if you thought the same thing you're not alone.  Let me break the parts down, and we'll see how the machinery works.

"nondec xs"

Note: this would be "let nondec xs" if you're using GHCi.  Here we define a new function called "nondec" that will test a list, here called 'xs', to see if it's in non-decreasing order.

"and [...]"

'and' is a function which will take a list of boolean values, apply a boolean 'and' to all the elements, and return the result.

"[ ... | ... ]"

This is a list comprehension.  With it we'll create a list made from values defined by the left and right sides of the vertical bar.

"[ x <= y | ... ]"

Here the list comprehension says that we'll be using the 'x' and 'y' defined on the right side of the vertical bar in the following way: we apply a less-than-or-equal comparison.  The results are the boolean values that make up the list we mentioned earlier (which we'll apply 'and' to).

Now let's look at the right hand side of the vertical bar, which in this case is easier to read from back to front.

"(tail xs)"

The argument to our function, if you recall, was 'xs'.  Here we apply the 'tail' function to it, which returns everything but the first element of the list.  For example: if we evaluated tail [1, 2, 3], we would get the list [2, 3] as the result.  Why we would want to do so will soon become apparent.

"zip xs (tail xs)"


The 'zip' function takes two lists, walks down both of them, and creates pairs of the elements at corresponding indices.  For example: zip [1, 2, 3] [4, 5, 6] would make the new list [(1, 4), (2, 5), (3, 6)].  To answer why we're zipping a list with itself, let's look a little closer at what the above is doing.  Let's say 'xs' is equal to [1, 2, 3, 4].


zip [1, 2, 3, 4] (tail [1, 2, 3, 4])


After tail is evaluated*, we have the following:

zip [1, 2, 3, 4] [2, 3, 4]

The 'zip' function can only zip elements up to the length of the shortest list, which gives us:

[(1, 2), (2, 3), (3, 4)]  -- notice that the final 4 from the first list goes unused.

Can you feel it starting to come together?  Now each element in the list has been paired with the next element in the list.  All that's left is to extract each pair and compare the elements, which is exactly what we do.

"[x <= y | (x, y) <- zip ...]"

The '<-' here says that the (x, y) pair will be drawn from the zip we just created, one at a time.  Each one will then go through the x <= y comparison and be turned into a boolean for our list.

Let's recap: to see if a list is in non-decreasing order, we pair the list with itself slid over by one element, so that each element is paired with its immediately following neighbor.  We then apply the comparison to each pair to generate a boolean, and 'and' the results together to get the final answer.
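To convince ourselves the recap holds, we can give it a quick spin in GHCi:

Prelude> let nondec xs = and [x <= y | (x, y) <- zip xs (tail xs)]
Prelude> nondec [1, 2, 2, 5]
True
Prelude> nondec [3, 1, 2]
False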

The example problems in the book leave you proud of your growing ability to read and even create samples like the above.  Which, I put forward, is both a blessing and a curse.

To explain let me give another example.  In high school, I studied classical Greek for fun (if you're reading this blog you're probably also a geek, so I know I'm in safe company).  Greek is a language of strange characters as well as strange grammar and pronunciation rules. I relished making flash cards for myself and learning things like which tone marker meant to prepend an 'h' sound to the front of a word.  I've long forgotten much of what I learned due in large part to not having other people around who had the patience to learn new characters, words and grammars.

The parallel between Greek and Haskell is striking.  Instead of talking about diphthongs and diacritical marks (like the 'h' sound), Haskellers will speak about list comprehensions, monads, evaluation strictness and thunks, and so on.  While programmers who learned C++ could go back in time and read Pascal or forward and read Java or C#, they will likely not be able to pick up Haskell source and easily understand what's going on.  For those who appreciate the aesthetic, the above example is a beautifully crafted** solution to a common problem.  But if programmers must learn Greek to speak your language of beautiful solutions, how many will go that extra mile?  For the brave few who do, will they find themselves having visions on the island of Patmos, struggling to reintegrate with the outside world?

* - I realize this explanation assumes eager evaluation.  My apologies.  I did so for clarity of the example.


** - Bonus factoid to programmers more used to languages like Java or C++.  Would you believe me if I told you the code above was actually fairly efficient, too?  Because of Haskell's non-strict (or lazy) evaluation, the 'and' function will only continue to check values in the list if they are true, and each element of that list is only created when it's needed.  
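To see that laziness at work, continue the GHCi session from the recap above and hand nondec an infinite list whose very first comparison fails.  Because 'and' stops at the first False, the answer comes back immediately:

Prelude> nondec (2 : 1 : [1 ..])
False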

Thursday, September 30, 2010

Simple Proof in Isabelle

As part of my "for fun" study lately I've been learning how to use the proof assistant Isabelle.  I've never used any proof assistants before, but it's not unlike working in a strange programming language - at first it feels awkward and later it feels... well, I don't know yet.  I looked for Isabelle tutorials that were at the hello world level, and sadly have yet to find one, so hopefully this post will help others who are looking for something similar.

Last night's adventure was to see if I could create a function that summed the first n natural numbers and then show that it's correct.

After a few iterations, it looked like this:

theory MyTheoryName
imports Main
begin


(* sums the first n natural numbers *)
primrec app :: "nat => nat" where
  "app 0 = 0"
| "app (Suc n) = Suc n + (app n)"

lemma "app 2 = 3" by simp


end

As it turns out, this doesn't quite pass muster in Isabelle, which is unable to unpack the natural number in the function application to its corresponding Peano encoding.  A slight tweak to the above (pointed out by my advisor) allows it to pass, though not without obfuscation.

theory MyTheoryName
imports Main
begin


primrec app :: "nat => nat" where
  "app 0 = 0"
| "app (Suc n) = Suc n + (app n)"
lemma "app (Suc (Suc 0)) = 3" by simp


end

[Screenshot: the simple Isabelle example loaded in the working buffer]
To use the above, open up Isabelle and copy it into the working buffer as shown in the screenshot to the left.

The fifth button on the button bar (the red triangle pointing right) lets us step through each piece of the proof and get feedback on what needs to get corrected for Isabelle to continue.  You can also use the left-pointing red triangle to back up and retype a piece of the proof.

The up and down-pointing red triangles let you reset the entire proof or run the entire buffer, respectively.

Tuesday, September 14, 2010

Haskell and Regular Expressions

Chapter 8 of Real World Haskell gives us a real gem, a look into how Haskell's type system allows it to do some fairly miraculous things with regular expressions (at least to programmers who come from a C-style background).

To begin, we need to install the regular expression modules:

> cabal install regex-posix

Then, we can open up GHCi and take a look at what it can do.

First, we load the module:


Prelude> :module +Text.Regex.Posix


Then, we can match a String against a regular expression pattern.  The magic lies in the result type we ask for.  For example, if we ask for a Bool, we get whether or not the pattern matches somewhere in the string:

Prelude Text.Regex.Posix> "my fool foot" =~ "foo." :: Bool
True

We can also expect a return type of String.  If we do so, we get the first match for the pattern:

Prelude Text.Regex.Posix> "my fool foot" =~ "foo." :: String
"fool"

Or, if we want all the matches, we can look for a list of lists of String, or a [[String]]:

Prelude Text.Regex.Posix> "my fool foot" =~ "foo." :: [[String]]
[["fool"],["foot"]]

Lastly, we can get a tuple of three strings.  The first string will be the text leading up to the first match, the second the match, and the third the remainder of the string:

Prelude Text.Regex.Posix> "my fool foot" =~ "foo." :: (String, String, String)
("my ","fool"," foot")

Haskell's type inference also means we don't have to give the type explicitly when the context implies it.  For example, in an 'if' condition:

Prelude Text.Regex.Posix> if "my fool foot" =~ "foo." then putStrLn "Match!" else putStrLn "No match!"
Match!