Saturday, April 9, 2011

The case of functional programming, excel and the web

When I was a CS student, I thought functional programming (FP) was absolutely useless. Lisp was a nice way of solving problems that took a list of strings as input and produced another list as output. But when it came to GUI programming, operating systems, or low-level network hacking, FP was nowhere to be seen. It just did not fit. The crazy approaches in doing FP GUIs just demonstrated how poorly FP was suited to the task.

I used to smiled when reading an yet another article demonstrating that QuickSort can now be implemented even more elegantly in two lines of code or that it is now as fast as procedural QuickSort thanks to some crazy compiler optimizations. When I wanted QuickSort, I called the library function. Thank you.

Years later I ended up in the lecture hall of Microsoft Research in Cambridge, UK. I was facing a series of talks on FP by some of the brightest minds the field had to offer. I prepared to be bored to death, especially since I could not leave for political reasons. When finally Simon Peyton Jones entered the room he caught my attention. Not because he is the godfather of Haskell, but because he did not wear any shoes. Famous CS people need their own crazy habit. Let it be a long beard like Ken Thomson, ugly glasses like Bill Gates, or a dislike of shoes.

I remember only one statement from these talks. Simon said that Excel is perhaps the most successful functional programming language of all times. This is kind of true. An Excel sheet is a bunch of functions and Excel decides when and how to evaluate these functions. Being the author of the KOffice spreadsheet myself, I realized that I had even implemented a functional language without knowing it.

Ever since I changed my mind. FP is useful, you just have to find the proper use case. I am still convinced that a general purpose programming language is not the right use case and therefore I pretty much ignore LISP, Haskell and all of their friends.

Whenever I see a new problem I wonder whether it might be a fit for functional programming. Several years ago when using template languages for generating HTML on the server side I realized that these are FP as well. A template is just a function that takes some data as input and produces HTML as output. It avoids state and mutable data.

However, templates have a problem with todays real-time push web applications. Modern web applications render their HTML on the client-side. It seems that they have to. Using XMLHTTPRequest the client polls for new data and then some nasty DOM traversal code modifies the HTML to incorporate the new data. Writing applications like these is no fun at all. When the design of the web page needs a change, the DOM traversal code must be adapted. Luckily I don't have to earn my money this way.

Lately I had the idea of combining my two favorite FP languages (templates and Excel) to write modern web apps more easily. The nice thing about spreadsheets is that the interpreter is very good at recomputing the value of all the function upon changes of the input data, i.e. when you change one cell in Excel only the minimum number of formulas is recomputed because Excel knows about the functional dependencies between the cells.
The cool thing is that Excel can do this optimization automatically.

Let's apply this idea to web programming. In Excel the input are cells with data and the output are strings and numbers in other cells. In web programming the output is HTML and the input is some live data coming from a database. A template is a function that maps input to output. So far so trivial.

It becomes more interesting when the input data changes, e.g. someone appended a new record to the database, or an existing record has been updated. Recomputing the entire template and replacing the HTML in the browser would be insanely inefficient. However, by pulling off the Excel trick, we can make it more efficient. The template engine must know the functional dependencies between the database and certain pieces of HTML. When the data changes it can determine which parts of the template need to be updated.

Thus, the server is simply sending minimal HTML patches to the client, telling it about the minimal HTML modifications. All it needs on the client side is a general purpose library that applies these patches to the DOM. On the server side a new template engine is required that works more like Excel, i.e. it can compute the minimal changes to its previous HTML output as input data is being updated. Since this magic is realized by the template engine, programmers are entirely shielded from this ... just as in spreadsheets.