Tag Archives: Haskell

The Dying Philosophers Problem

Reading a master’s thesis draft that mentions the dining philosophers problem – a parable about the difficulties of process synchronization, very well known in computer science – it occurred to me that it must not be a very good idea to eat just spaghetti (or just rice). I asked a nutritionist about it, and here is her answer. Even if they manage to avoid deadlock or livelock, dying of malnutrition is not going to be their first problem; go read the full story!

[Typos and thinkos corrected after initial publication]

Benja on denotational semantics

My friend Benja Fallenstein got carried away in the comment section of my latest post on recursion and wrote a tutorial on how it all relates to denotational semantics. It’s a shame it is buried somewhere in the comments, so I am reposting the comment here with Benja’s permission.

The tagline is: “Denotational semantics tutorials. They’re the new monad tutorials!”


Benja Fallenstein writes:

Having read and understood [the] entire post, I can now summarize Antti-Juhani’s complaint in a way that programmers with no theoretical background will understand:

“The definition is an infinite recursion. Sure, it’s self-referential, but since it recurses forever, it doesn’t actually define what ‘recursion’ is!”

:-)

That said, I find it kind of a pity to introduce all this theory without talking about why it nicely models the real world. Antti-Juhani argues that this is pretty much the definition of “popularized science,” but I have to admit that, as definitions go, I like “recursion: see recursion” better. :)

So let me have a quick go. [UPDATE: Didn’t turn out to be quick at all. This is, in fact, as long as Antti-Juhani’s original post. Proceed at your own peril.]

All of this comes from denotational semantics, which is concerned with what expressions in a programming language “mean” — and which ones mean the same thing. For example, “7” ‘denotes’ the number seven, “7+2” denotes the number nine, and “5+4” also denotes the number nine.

Now, here’s the knotty problem. If we have

f(x) = if x = 0 then 1 else x*f(x-1)

then f(4) denotes the number twenty-four, but what does f(-2) denote?

So, denotational semantics introduces a new, special value, “recurses forever.” An expression of type ‘integer’ either denotes an integer, or, like f(-2), it denotes “recurses forever.”
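For concreteness, the definition transcribes directly into Haskell (a transcription added for illustration, not part of the original comment):

f :: Integer -> Integer
f x = if x == 0 then 1 else x * f (x - 1)

-- f 4 evaluates to 24, but f (-2) never reaches the base case:
-- it recurses forever.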

Now, about these partial orders. (Reminder: A partial order says, for every pair (x,y) of elements of a certain set, that x is smaller than y, equal to y, larger than y, or “incomparable” to y.) Here is Benja’s guide to what the partial orders of denotational semantics mean:

Let’s say that you write a program where, in one place, you can plug in either x or y. If you can write such a program that prints “foo” when you plug in x and prints “bar” when you plug in y, then x and y are “incomparable.” If you can’t do that, but you can write a program that prints “foo” when you plug in y, and recurses forever when you plug in x, then x is smaller than y. If every program you can write will do the same thing whether you plug in x or y, then x and y are equal.

(We ignore details like out-of-memory errors and *how long* a program takes to do something.)

Examples:

– 5 and 7 are incomparable. Proof: if ___ == 5 then print “foo” else print “bar”.

– f(-2) < f(2). Proof: if ___ == 2 then print “foo” else print “foo” — if you plug in f(-2), this will loop forever; if you plug in f(2), it’ll print “foo”.

– (5+2) = 7. Whatever program I write, it will do the same thing, no matter which of these expressions I plug in.
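To make the plugging-in concrete, the example contexts can be rendered in Haskell, with a context as a function and plugging in as application (a transcription for illustration, using the f defined earlier):

contextA :: Integer -> String
contextA x = if x == 5 then "foo" else "bar"
-- contextA 5 == "foo" and contextA 7 == "bar": 5 and 7 are incomparable.

contextB :: Integer -> String
contextB x = if x == 2 then "foo" else "foo"
-- contextB (f 2) == "foo", while contextB (f (-2)) recurses forever,
-- because comparing x against 2 forces the argument.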

Now, with expressions of type ‘integer,’ the resulting partial order is: “Recurses forever” is smaller than everything else. All other values are incomparable.

Recall from Antti-Juhani’s post that the bottom element (⊥) is defined to be the smallest element of a mathematical universe (if such a beast exists). Since “recurses forever” is smaller than everything else, it’s the bottom element of this universe. That’s what denotational semantics calls it. (Haskell, too.)

Of course, this partial order is kind of boring (the technical term is ‘flat’). It gets more interesting when we consider functions.

In general, we can talk about functions from values of type S to values of type T, where we already know the partial orders of S and T. In particular, we can talk about functions from integers to integers.

Now, if you can find an x and a program that prints “foo” when you plug in f(x) and “bar” if you plug in g(x), then you can write a program that prints “foo” when you plug in f, and “bar” when you plug in g. (Exercise! ;-) ) Further, denotational semantics wants applying a function to a parameter and then doing something with the result to be the *only* way you can get information about a function. No f.get_function_name, please. (If your programming language has that, you can’t map the functions of your programming language 1:1 to the functions of denotational semantics.)

By my rule above, this implies that f <= g if and only if f(x) <= g(x) for all x. Since formal theories can’t have fuzzy stuff like my rule, denotational semantics takes this as the *definition* of the partial order on functions.

The bottom element of this mathematical universe, then, is the function that always returns bottom, no matter what it is given — or in other words, a function that recurses forever, no matter what parameter you pass to it.
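In Haskell, this bottom element of the function space can be written down directly; a minimal sketch:

botFun :: Integer -> Integer
botFun x = botFun x
-- Recurses forever on every argument, so it is below every other
-- function in the pointwise ordering just defined.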

(Side note. Denotational semantics allows only “monotone” functions — that is, if x <= y, then f(x) <= f(y) for all f, x, and y. That’s an example of how all these definitions allow us to make formal statements about the real world. In the real world, if you can find a function f and a program that prints “foo” when given f(x) and “bar” when given f(y), then you have a program that prints “foo” when given x and “bar” when given y — so if f(x) and f(y) are incomparable, then x and y must be incomparable, and the other way around, if x and y are comparable, then f(x) and f(y) must be comparable, too. (This reasoning doesn’t exclude the case that x <= y and f(x) >= f(y). If you really care, prove it yourself. :) ) This ends the side note.)

Recall, now, Antti-Juhani’s definition of recursion: An equation of the form x = some expression (which may contain x), interpreted as a recursive definition, defines x to be the smallest value satisfying this equation. The models constructed in denotational semantics have the property that for any equation of this form, the set of values satisfying it does have a smallest element, so ‘x’ is well-defined no matter what the equation is.

It’s important to note that this does not mean that x has to be “bottom,” because there are equations to which “bottom” is not a solution. Let’s look at a slightly simpler example than the one Antti-Juhani gave.

f(x) = if x = 0 then 0 else f(x-1)

We have f(0) = 0, but bottom(0) = bottom, so f can’t be bottom.

The equation has more than one solution, though. It is satisfied by a function that returns 0 for x >= 0 and “returns bottom” (recurses forever) for x < 0; but it is also satisfied by a function that returns 0 for every input. And by an infinite number of functions that return 0 for some negative inputs and bottom for others.

We “want” the first of these solutions, of course, because that’s what we get if we interpret the equation as a computer program, and computer programs are what we’re trying to model. But we’re in luck: The first solution is smaller than all the other ones, because for all inputs, they either return the same value, or the solution we want returns ‘bottom’ and the solution we don’t want returns 0 (and bottom < 0).
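Indeed, if we run the equation as a Haskell program (renamed g here so it does not clash with the earlier f), we observe exactly the least solution:

g :: Integer -> Integer
g x = if x == 0 then 0 else g (x - 1)

-- g 3 evaluates to 0; g (-1) recurses forever. That is, the program
-- computes 0 for x >= 0 and bottom for x < 0: the smallest solution.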

The point of it all is, of course, that when we interpret an equation as a computer program, we always get the least value satisfying the equation (if we use a partial order that follows my rule, above). And that, my friend, is denotational semantics.

On recursion

Note: This entry has been edited a couple of times (and might be edited in the future) to remove errors. Thanks to Benja Fallenstein for critique.

My previous post about recursion drew some comments. This is a topic that frequently confuses me, and there is a lot of confusion (some of it mine) in the blog post and its comments. On reflection, it seems prudent to examine what recursion really is.

Consider the following recursive definition: 0 is an expression; if T is an expression, then sT is an expression; nothing is an expression unless it has one of these two forms. Now, it is clear that 0 is an expression. It is equally clear that sssssK is not an expression.
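(As a side note, the definition could be transcribed into a Haskell algebraic data type; the following sketch is mine and foreshadows the IntList example below.)

data Expr = Zero | S Expr
-- Zero represents 0; S represents prefixing an s,
-- so S (S Zero) represents the expression ss0.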

In some applications, it makes sense to consider infinite expressions like “…ssssss0”; that is, an infinite number of ss before the final 0, and in others it doesn’t. Is that infinite utterance an expression? It does have one of the forms given in the definition, so it is not forbidden from being an expression, but neither is there anything in the definition requiring that it be one.

It is helpful to do this a bit more formally. Let us translate the self-referential definition into a set equation: The set of expressions is the set E that satisfies the equation


E = { 0 } ∪ { se | e ∈ E }

But this definition is bogus! There is more than one set E that satisfies the equation, as remarked above, and thus there is no “the set E”!

We need to fix the definition in a way that makes the set E unique. The key question is whether we want to regard the infinite utterance discussed above as an expression. If we want it not to be an expression, it is simplest to say “the set of expressions is the smallest set E that …”; if we want it to be an expression, we should say “the set of expressions is the largest set E that …”

Let us make the following definitions:

Consider the set equation S = … S …, in which the set S is alone on the left-hand side and occurs at least once on the right-hand side. This equation is a recursive definition of the smallest set S satisfying this equation, and a corecursive definition of the largest set S satisfying this equation.

Thus, a definition like


E = { 0 } ∪ { se | e ∈ E }

is not recursive because of its form; it is self-referential. By itself, it is an ambiguous and thus bogus definition. We use the words “recursive” and “corecursive” to indicate the intended method of disambiguation.

A rather nice example of this can be seen in the treatment of algebraic data types in strict (ML-style) and nonstrict (Haskell-style) functional programming languages. Consider the declaration (using Haskell syntax)

data IntList = Nil | Cons Int IntList

This can be read as a set equation as follows:


IntList = { Nil } ∪ { Cons n x | n ∈ Int ∧ x ∈ IntList }

Now, as a recursive definition, it is restricted to finite lists as found in ML, but as a corecursive definition, it also allows infinite lists as found in Haskell. (It unfortunately fails to address the question of bottom-terminated lists, which also can be found in Haskell.)
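Under the corecursive reading, an infinite list really is a value of this type; a quick sketch of how Haskell’s laziness realizes it:

allOnes :: IntList
allOnes = Cons 1 allOnes
-- Lazy evaluation can observe any finite prefix of this infinite value;
-- under the strict (ML-style) recursive reading, no such value exists.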

As it happens, the technique is not limited to set definitions; but to generalise it, we need to discuss a bit what we mean by the terms “smallest” and “largest”.

In a set context, a set is smaller than another set if it is the other’s subset. Notice how this makes the ordering partial, that is, there are sets that are not comparable (for example { 1 } and { 2 }; neither is smaller than the other, but neither are they equal). A collection of sets does not necessarily have a “smallest” set (that is, one of the sets that is a subset of all the others). However, we can always find the least upper bound (“lub” or “join”) of a collection of sets as the union of the sets, and the greatest lower bound (“glb” or “meet”) as the intersection of the sets, and if the lub or glb is itself a member of the collection, it is the smallest or largest set of the collection.

Haskellists will find things instantly familiar when I say that the smallest element of a mathematical universe (if it exists) is called bottom and is denoted by ⊥; somewhat less familiar will be that the greatest element (if it exists) is called top and is denoted by ⊤.

A partial function is any function that need not be defined everywhere. For example, f(x) = 1/x is a partial function on the reals (it is not defined for x = 0). We can form a partial ordering of (partial) functions having the same domain and codomain by specifying that f is smaller than g if f(x) = g(x) for all x at which f(x) is defined. Sometimes it will make sense to specify instead that f is smaller than g if f(x) is smaller than g(x) for all x at which f(x) is defined. Now, it happens that every chain of such functions has a lub; and in particular, the partial function that is defined nowhere is the smallest function (the bottom function).
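One concrete way to play with this ordering in Haskell is to model partial functions with Maybe, marking the undefined points with Nothing (an illustration only, not the semantics itself, since real nontermination is replaced here by an observable Nothing):

recipP :: Double -> Maybe Double
recipP x
  | x == 0    = Nothing          -- 1/x is undefined at x = 0
  | otherwise = Just (1 / x)

nowhereP :: Double -> Maybe Double
nowhereP _ = Nothing             -- defined nowhere: the bottom function

-- nowhereP is smaller than recipP: it agrees with recipP everywhere
-- nowhereP is defined, which is to say nowhere, so the condition
-- holds trivially.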

Now, with this definition of “ordering” of functions we can apply the definitions of recursion and corecursion to functions. For example, consider f(x) = f(x) + 1. The bottom function (defined nowhere) qualifies, since undefined + 1 is undefined; thus it is the smallest solution to the equation and thus this equation is a recursive definition of bottom. However, consider f(x) = if x = 0 then 1 else x*f(x-1): any solution to this equation must have f(0) = 1 and thus the bottom function is disqualified. In fact, the smallest solution is the factorial function, and thus the equation is a recursive definition of factorial.
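In Haskell, the least-fixed-point reading can even be spelled out explicitly with fix from Data.Function, which computes precisely the least solution of such an equation:

import Data.Function (fix)

factorial :: Integer -> Integer
factorial = fix (\f x -> if x == 0 then 1 else x * f (x - 1))

-- factorial 5 == 120, while factorial (-1) recurses forever,
-- just as the least solution of the equation prescribes.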

In the comments to my previous entry, I was asked whether the Haskell definition “ones = 1 : ones” is recursive. Of course, the question in itself is nonsensical; whether it is a recursive definition or something else (such as corecursive) depends on what the programmer intends. I will interpret this question as whether the equation is a recursive definition of the intended result, the infinite list of all ones.

First, observe that we can order Haskell lists (including the special “undefined” list) partially as follows: l1 is smaller than l2 if l1 is undefined, if they both are empty lists, or if (a) the first element of l1 is smaller than the first element of l2 and (b) the tail (the list with the first element removed) of l1 is smaller than the tail of l2.

The undefined list does not qualify as a solution for “ones = 1 : ones”. No finite list qualifies either. The infinite list of all ones does qualify, and it is the only solution; hence, the equation is a recursive definition of the infinite list of all ones; it is also a corecursive definition of the same list.
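For reference, here is that definition together with one way to observe it:

ones :: [Integer]
ones = 1 : ones

-- take 5 ones == [1,1,1,1,1]; the program computes finite views of
-- the equation’s only solution, the infinite list of all ones.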

Finally, let us consider the question that started it all: what should we make of the definition “recursion: see recursion”. First, note that an equation x = x admits all elements of the universe as a solution; hence, the proper reading of “recursion: see recursion” as a definition is that it is ambiguous and hence bad. It is a recursive definition of the bottom element of the universe (whatever that might be for a dictionary), and a corecursive definition of the top element of the universe (whatever that might be). In any case, it does not define recursion (unless, of course, the concept of recursion is the smallest or the largest concept in the universe of concepts).

Recursion that isn’t

Joey’s recent post reminded me of the old joke about an entry in a dictionary:

recursion see recursion

Non-geeks can stop reading here: if you didn’t understand the joke, you won’t understand what’s coming next either.

I mean, sure it’s a funny joke, but it doesn’t quite work. Technically, it’s not even an example of recursion, as the base case is missing. Perhaps it’s co-recursion? But it isn’t, as it’s not making progress.

Why can’t we say instead,

recursion if you do not understand it, see recursion

It would still be funny, and it might actually work.

Announcing darcs-monitor 0.3.1

Darcs-monitor will send email to a specified recipient about new changes added to a specific darcs repository. It can be run as an apply posthook (resulting in near-instantaneous “push” emails), or periodically from Cron, or occasionally by hand, whatever seems most convenient.
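(For the posthook route, darcs reads hook settings from the repository’s _darcs/prefs/defaults file; the exact darcs-monitor invocation is documented in the README, so the following line only sketches the shape, with placeholder arguments:)

apply posthook darcs-monitor <arguments as documented in the README>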

This new release (0.3) contains the following changes:

  • there is now support for the %CHANGES% and %SHORTREPO% substitutions in the template
  • a default template is now provided
  • &apos; and &quot; entity references in the darcs changes --xml output are now supported
  • patches with just the author email address (no real name) are now handled correctly
  • when multiple emails are sent, they are now sent in chronological order
  • emails are now announced to the user as they are sent

The minor update 0.3.1 brings the documentation up to date.

Benja Fallenstein and Benjamin Franksen provided some of the features and bug fixes in this release. My thanks to them :)

You can get it by darcs at http://antti-juhani.kaijanaho.fi/darcs/darcs-monitor/ (darcsweb available), or as a tarball at http://antti-juhani.kaijanaho.fi/software/dist/. It depends on mtl and HaXml and is written in Haskell, naturally. It has been tested only with GHC 6.6 so far. For installation and usage instructions, refer to the README at any of the locations above.

Please note that this software is very new and thus can be buggy. As with any email-sending software, bugs can result in major annoyance; user caution is warranted.