Posterous theme by Cory Watilo

core.logic & VPRI STEPS

I stumbled across this amazing blog series which is tackling computational linguistics by porting Prolog to core.logic. It reminds me a bit of my earlier attempt to implement Definite Clause Grammars. I got something experimental working but in order for core.logic to really scale for large parsing tasks we'll probably need to rethink how core.logic handles substitutions. That said, core.logic does have tabling so building packrat parsers shouldn't be too difficult.

All this brings me to the incredibly succinct program in Appendix II of the VPRI 2011 Report (Alan Kay et al). They start with a machine oriented lisp, define a grammar, implement Smalltalk, and finish with a non-trivial program.

Could we take similar approaches when writing software with Clojure?

Comparing JavaScript, CoffeeScript & ClojureScript

UPDATE: Jeremy Ashkenas (CoffeeScript creator) has pointed out on HN a somewhat intentional flaw in the final gist. Hopefully you can spot it and see that this post is about solving that very problem. I'm being a bit old school here - I don't like to give everything away :)

I’ve been spending a lot of time recently hacking on the ClojureScript language. I can say without qualification that I haven’t had this much fun programming since I first taught myself JavaScript nearly seven years ago. So let’s put aside logic programming for a moment and let’s talk about code complexity and code expressivity.

Recently on StackOverflow someone asked how to idiomatically construct a type in ClojureScript. Before we get into that let’s consider how this is done in JavaScript:

CoffeeScript gets a lot of deserved attention for its brevity for common tasks. For example the same thing in CoffeeScript:

That requires nearly half the amount of characters. Of course on real code the code compression isn't nearly that great - perhaps 10-20% in my experience. Still I find that CoffeeScript tends to give the feeling of compression for many common tasks and how a language feels day in and day out is important for programmer happiness.

Let's take a look at the same thing in ClojureScript:

The ClojureScript without the strange protocol form would give even better compression than CoffeeScript! So what does this protocol form do and why do we need that cluttering up our type definition?

ClojureScript, unlike JavaScript or CoffeeScript, promotes defining reusable abstractions. Imagine if all the types in your favorite library were swappable with your own implementations? Hmm ... perhaps that's an abstraction too far for many users of JavaScript or CoffeeScript.

Well here's a use case I think more people will get - neither JavaScript nor CoffeeScript provide any kind of doesNotUnderstand: hook that is fantastic for providing default implementations.

We've extended all objects including numbers to respond to the bar function. We can provide more specific implementations at anytime, i.e. by using extend-type on stringarray, Vector, even your custom types instead of default. It's important to note that this extension is safe and local to whatever namespace you defined your protocol.

Still not convinced? Let's demonstrate a very powerful form of extension that even Dart is getting behind.

In ClojureScript it's simple to construct types which act like functions. While this might sound esoteric consider very succinct operations like the following:

Wow. HashMaps in ClojureScript are functions! Now this may look like some special case provided by the language but that's not true. ClojureScript eats its own dog food - the language is defined on top of reusable abstractions.

How can we leverage this? An example - JavaScript and CoffeeScript both let you extract a range from strings and arrays. In JavaScript you have slice and CoffeeScript provides sugar via the [i..j] syntax. Neither provide you with a way to succinctly construct and manipulate the idea of a slice. For example:

IFn is one of the many reusable abstractions that ships with language. We define ISlice to illustrate that our type has dual functionality - as an object with fields that can be manipulated and as a function which can be applied to data!

Many people have the misconceived notion that Clojure/Script is only about functional programming - on the contrary Clojure/Script is very much "Object Oriented Programming: The Good Parts".

Another Taste of cKanren

Yesterday I posted an incorrect solution to this puzzle because I did not read the problem description correctly. At first it seemed like the problem did not lend itself at all to constraint logic programming. After a whole day of pondering it- I'm no longer sure that's true.

To be clear I don't think I would have arrived at the following code had I not seen the answer - 1, 3, 9, 27. Before we dive into what is curious about these values lets first see that I retained my original inspiration:

We still want all the stone values to be all different - there's no benefit to representing zero on the scale with stones themselves. We also still want the first stone to weigh one pound. To limit the search space we still want to ensure the stones weights are in increasing order and they still have to add up n, which in this case is 40.

These constraints decrease the search space very quickly. We then pass whatever weight sets are left to checko:

checko is a recursive goal. It takes 4 important parameters - ws a list of weights in ascending order, sl a list of weights from ws that we've already considered in reverse order, an accumulation list of weights we know we can represent in reverse sequential order, and finally the maximum weight.

The base case is simple, we have no more weights to consider. If the head of the accumulation list is equal to 40 we are done - if not we failed.

If we have more weights to consider we pull apart the weight list ws into w and wr. We then call subchecko with the weight w, sl, the accumulation list of weights r, and a unbound logic variable nr to hold subchecko's result. If subchecko succeeds, we move w into sl and we recur with wr one element shorter, sl one element longer, nr with all the weights we can now represent and n.

subchecko does quite a bit of work. First let us consider the base case - sl is empty. If so we have no values with which we can modify w and get other representable weights. If the accumuluation list r has any weights in it we check that w is exactly one greater than the first element of r. If so we unify an extended r to o. If r is empty (as it might be on the first call) we just add the weight straightaway.

So much for the base case. Now let's consider if sl is not empty. This means we have some math we can do. sl holds previously examined stones in reverse order. We can pull these out and add and subtract to determine other weights we can represent. Note this is very tree like! First we consider w-a, then we consider w itself, then we consider w+a. Because we keep track of the previously considered stones from ws in sl, we can at each considered weight use the sl to compute further weights in a tree like fashion. It's a bit like an in order tree traversal - only 1, 3, 9, 27 can satisfy this behavior.

Again this code shows an understanding of the solution - but I hope it reveals how fascinating logic programming can be! We're combining search and global constraints to produce the answer very quickly - ~12ms on my machine under Racket.

A Taste of cKanren

UPDATE: This doesn't actually solve the puzzle! I wasn't considering putting stones on either side of a physical scale. I'm leaving the post as is since it does highlight the powerful declarative aspect of cKanren w/o sacrificing efficiency.

Cédric Beust wrote up a fairly interesting challenge. Petit Laurent notified me about it on Twitter saying that core.logic might be a good fit for the problem. As it turns out core.logic in its current form is not ideal, however I'm planning on integrating cKanren functionality to core.logic which would extend core.logic to Constraint Logic Programming.

For fun I spent a little bit of time coding up a solution in cKanren with Racket. Read the problem description before proceeding.

Take a look at the following bit of code:

This is constraint logic programming. I declare 4 logic variables a, b, c, d to hold the weights of the stones and two to hold intermediate sums s1, s2. I specify that these variables are allowed to range from 1 to 40. I constrain them to be all different.

We know that at least one stone must weigh 1 pound, so we go ahead and unify the first stone with 1. To limit the search space we also declare that the stones increase in weight. This means that the set of stones with the smallest weights that would add up to 40 would be 1, 2, 3, 34. Again to decrease the search space we say that stones a, b, c must be less than or equal to 34 pounds. We then sum the four stones together and declare that they must add up to 40.

This simple declarative function will generate the stone weights that satifisfy all the constraints. This is not brute force search - we arrive at the solution in ~85ms on my machine.

The entire solution and output:

I'm looking forward to adding this functionality to core.logic. I'm excited to hear William Byrd and Daniel Friedman describe the ideas behind cKanren at the Clojure/conj.

Advanced Pattern Matching for JavaScript

The ClojureScript compiler will need to communicate that we're looking at ClojureScript code to the match macro, but I got this working easily after a little fiddling. This is pretty amazing. JS.Next indeed.

Anyone that thinks that Google Closure was a bad idea should look at how concise the final output is:

Of course the above pattern is hardly interesting - consider checking red-black tree balance:

Object Oriented Macros

It's beautiful thing to be able to integrate object oriented principles with macros. In order to avoid overhead (no wrapper types) when pattern matching primitives you can derive the compilation mechanism and specify what type of code you want to generate. This means you're able to write beautiful patterns that have brutally fast performance.

As mentioned before this is all that it takes to make it work on bits:

Every other pattern matching implementation, read and weep :)

In case you're not familiar this is edge work happening on match. This library already supports map patterns, seq patterns, guards, or patterns, matching locals, and provides an interface so that you can pattern match Java objects.

Solving The Mapping Dilemma

In his essay Why Are Black Boxes So Hard To Reuse? Gregor Kiczales outlines the lamentable truth about object oriented software development - the mapping dilemma. The gist of his excellent essay is that it’s incredibly difficult to implement good abstractions in our programming languages that scale to all levels and deliver the requirements at each level.

My experience as a developer, and I think many developers, is that his analysis is heartbreakingly accurate. You always end up giving up something or your abstraction simply cannot be applied to a large class of cases that we know are fundamentally trivially different. You shake your head, you roll up your sleeves and basically write the whole thing again filled with minutae that cannot abstracted away.

Well, for what seems like the very first time in my own code, I’ve avoided the mapping dilemma.

As you might know, I’ve been working on match, a pattern matching library for Clojure. Like the facilities present in Standard ML, OCaml, Haskell, Scala, Erlang, and Racket, the library allows you to easily pattern match data. If you’re a functional programming enthusiast all the goodies are there - matching maps, sequences, or patterns, guards, Java types, and naming submatches. Unlike the aforementioned programming languages (with the exception of Racket), match is extensible.

One feature I left off until recently was vector pattern matching. Originally this was going to be specifically for Clojure’s persistent vectors, but I realized the work should apply to any array-like (random access) data structure. This should include primitive arrays, buffers, and even individual bytes.

Now here’s the rub, if you’re pattern matching primitive arrays and bytes you probably care quite a bit about performance. Introducing even one object instantiation is an exorbitant cost. So wrapper types are out. Yet hardcoding the types that a person might want to pattern match is very undesirable.

So what did I do? I opened up the machinery to match and allow clients to tell me what the most optimal access is for a particular datatype. The match library is an optimizing pattern match compiler. I provide hooks to the compiler so you can specify the precise mapping that you need. You can pattern match primitive arrays and bytes without introducing any overhead. No wrapper types.

The benefit? Pattern matching arrays is 30 times faster than in Scala and about 10 times faster than in Racket. 

How much work did this all involve? The library is only around 1100 lines of code. We can use the same abstractions to pattern match everything from Java objects to individual bits. At every level of scale you get exactly the performance profile you want.