other posts

Method Dispatch in Bard

Bard 0.1 continues to inch toward release. I'm pretty happy with the reader. It won't be user-programmable in the 0.1 release—at least I don't think it will—but that's not by design. It's just that the current implementation of the reader doesn't provide hooks for user programming. I've been concentrating on getting it working as intended rather than designing and implementing a reader API. I do intend to provide a programmable reader eventually.

Mutable cells are presently implemented as Termite processes. Bard is biased toward immutable data, but does provide a few mechanisms for modeling mutable data. The most important such mechanism is the cell, a box that contains a value that can be updated. The main Bard implementation is a virtual machine built in Gambit Scheme. It includes primitives for concurrent and distributed programming, built on Guillaume Germain's Termite, an Erlang-style extension to Scheme. It seems natural to use the same model that Termite uses to model mutable state.

Bard doesn't support inheritance‐at least not in the usual sense. The most important reason for Bard's existence is to explore the implementation of a Lisp in which behavior, representation, and taxonomy of types are firmly separated. Conventional inheritance conflates taxonomy and representation (in other words, when you inherit from something, you get its data layout and its supertype relations, not one or the other, even though the two concepts are not logically related). In order to remain consistent with Bard's reason for existence, I had to reject conventional inheritance.

That decision left me with another: do I go with the dispatching solution favored by Haskell and Go, and require a polymorphic function to exactly match on the types of its parameters, or else fail? I'm sympathetic to that approach. It's certainly simple and efficient.

I never quite succeeded in deciding to go that way, though. I'm very fond of CLOS-style multiple dispatch with next-method dispatching. I liked it in Dylan and I like it in Common Lisp. But if Bard doesn't have inheritance, how can it possibly support that style of dispatch?

Well, I'll tell you.

Bard's current design completely separates behavior, representation, and taxonomy of types. Behavior is defined by generic functions and methods. I plan to provide a mechanism for collecting related functions into structures called Protocols, but it's possible Bard 0.1 will omit that feature; it's very convenient for documentation and organization of code, but it's not essential to the function of the language.

Representation is handled by special forms that define data layout: define-structure and define-vector. These forms define new representations that specify how to lay out bits and bytes in memory. They create concrete types called representations that can be used to instantiate new values. Importantly, remember, they don't support any form of inheritance.

Taxonomy is the province of categories. A category is an abstract type that may include one or more representations. Importantly, categories can be organized into type graphs in which each category can have supertypes and subtypes.

A category can have one or more supertypes; if you create a category and don't specify any supertypes, it automatically gets the supertype Anything. A category may also have zero or more subtypes.

Representations also participate in type graphs, but with different rules: a representation cannot have any subtypes, and its supertypes must be categories. You can't make a representation a subtype of another representation. You can get the same effect, though, if you really want it, by suitably arranging categories and representations.

A programmer moving from Smalltalk to CLOS finds that CLOS breaks apart concepts that were welded together in Smalltalk. In Smalltalk, a class combines data layout, supertype relations, and behavior into a single construct. CLOS increases the programmer's flexibility by separating data layout and supertype relations from behavior. CLOS classes define the representation of data and the relations among types, and generic functions define behavior. Bard takes this factoring a step further, separating data layout from type relationships. Where Common Lisp breaks the definition of a type into defclass for specifying the structure and supertypes, and defmethod for the functions and methods, Bard goes a step further: define-structure and define-vector define the layout of data, and define-category defines supertype relations. Essentially, Bard breaks defclass into two parts: one for data layout and one for taxonomy.

So what does all this have to do with method dispatch?

Well, I'll tell you.

Bard function definitions look very much like CLOS or Dylan method definitions. For example:

(define-function add ((x Number) (y Number))
  (+ x y))

(define-function add ((x Sequence) (y Sequence))
  (concat x y))
			

If you've ever used CLOS or Dylan, your instincts about the meanings of those forms will serve you well. They define different methods on a function, add, specialized for different argument types. By using define-category to arrange suitable graphs of types, you can control how method dispatch works. If you can understand class relationships in languages like Common Lisp, Dylan, and C++, then you can understand type relationships in Bard. They work essentially just like CLOS class relationships, except that they are independent of the representation of data.

Bard adds another wrinkle to method dispatch. Before we get to that, I just want to say that define-function supports other familiar features of CLOS, such as unspecialized (default) methods:

(define-function add (x y)
  (error "Don't know how to add the arguments ~s and ~s" x y))
			

and EQL specializers:

(define-function add (x (y == nothing))
  (sequence x))
			

The new wrinkle that Bard adds to dispatch is general predicate dispatch:

(define-function add (x y) ?? MonotonicallyIncreasing?
  (sequence x y))
			

In this example, MonotonicallyIncreasing? is a predicate that is applied to add's parameters when the generic function is applied. If the predicate returns true, then the method is applicable.

An interesting question that arises in the presence of predicate dispatch is: how can you tell which of two predicates is more specific? When dispatch is controlled by parameter types, it's possible to deterministically compute which of two methods is more specific, as long as the dispatch mechanism can compute a unique total ordering of related types. Bard uses the C3 method of linearization to do that, and it relies on the type graph constructed by the relations among categories.

One solution to the problem of ordering predicates is to devise a restricted predicate language in which all legal predicates have implications that can be used to order them with respect to one another. Dispatch is then a matter of proving a theorem regarding the implications of a given set of predicates.

Bard handles it differently: predicates are categories. You control the specificity of predicate specializers the same way you control the specificity of other types: by using define-category to establish supertype relations.

This decision simplifies the implementation of predicate dispatch considerably; in essence, predicate dispatch is handled no differently from ordinary class dispatch; it's just that the expression that you evaluate to determine whether a method is applicable can involve a more general predicate than just whether a value belongs to a particular type.

On the other hand, although the theorem-proving approach obliges you to use a decidable (and therefore restricted) predicate language, it can provide stronger safety guarantees. Since all implications are automatically decided, as long as there isn't a bug in the theorem prover, your predicates will be ordered appropriately. Using Bard's approach, it's up to the programmer to order predicates appropriately. On the one hand, it means that your predicate language is unrestricted, and you can use any test you like on input parameters; on the other hand, you're free to define programs with absurd implications. You could, for example, write methods that behave as if even numbers are not numbers!

Now to figure out how to compile (as Sequence {:name "Fred" :size 'large}})