Updating FreeBSD with ZFS Root File System

posted by Christian in General

This weekend I wanted to update my FreeBSD box and saw that the ZFS tools fundamentally changed. The UPDATING file has a corresponding note:

20090520:
        Update ZFS to version 13. ZFS users will need to re-build
        and install both kernel and world simultaneously in order 
        for the ZFS tools to work. Existing pools will continue to work
        without upgrade. If a pool is upgraded it will no longer be
        usable by older kernel revs. ZFS send / recv between 
        pool version 6 and pool version 13 is not supported.

Installing kernel and world simultaneously is scary. If something breaks you are left with a broken system, no way to roll back, and uncertain repair possibilities. Sure, you should have backups, but restoring a whole system is still something that takes a lot of time.

But is this not exactly one of ZFS' strength? ZFS provides easy snapshots and clones. When I search for "FreeBSD ZFS update" I stumbled of this Mail: Solaris live upgrade like FreeBSD ZFS-rootfs update, which gave me the idea to snapshot my root and usr file system. This should give me a reasonably easy rollback path.

Disclaimer

Here is what I did (with some minor tweaks). It might or might not work for you. You might loose all your data following my steps. I cannot be held reliable for that. I also probably made a few mistakes while writing this up. Back up your system!

Prerequistes

I have my FreeBSD with encrypted ZFS set-up like I described in my blog post FreeBSD: Encrypted ZFS Root with Geli. It should also work with a different encryption mechanism or without encryption.

Creating the Snapshots

Let us assume that the ZFS pool is called tank and that the root file system is under tank/root and the usr file system is under tank/usr.

First, let us create a snapshot of the usr file system and of the root file system:

$ sudo zfs snapshot tank/usr@pre_update
$ sudo zfs snapshot tank/root@pre_update

A snapshot is not writable. So let us create clones:

$ sudo zfs clone tank/usr@pre_update tank/usr_pre_update
$ sudo zfs clone tank/root@pre_update tank/root_pre_update

We need two versions of the root file system. The normal one (tank/usr) where usr points to the normal tank/usr and the pre update one (tank/root_pre_update) that points to the pre update snapshot tank/usr_pre_update. We will use a symlink for this.

Unfortunately, you cannot change the mountpoint of your usr file system while the system is running. So let us first finalize and boot into the pre update set-up to see that it is working.

$ echo Temporary mounting the pre update root
$ sudo mdkir -p /pre_update/root
$ sudo zfs set mountpoint=/pre_update/root tank/root_pre_update
$ echo Setting the mountpoint of the pre update usr file system
$ sudo zfs set mountpoint=/usr_pre_update tank/usr_pre_update
$ echo Symlinking usr to the pre update usr file system
$ cd /pre_update/root
$ sudo rmdir usr
$ sudo ln -s usr_pre_update usr
$ echo Setting the pre update root mountpoint to legacy
$ sudo zfs set mountpoint=legacy tank/root_pre_update

We need a similar set-up in the normal root file system where the symlink points to usr_post_update. In order to set the mountpoint of tank/usr to /usr_post/update we need to reboot into single user mode with ZFS disabled. We go to the boot loader prompt (normally by choosing 6 in the boot menu).

# unset vfs.root.mountfrom
# disable-module zfs
# set boot_single
# boot

Once we are in single user mode, we can import zpool and set the mountpoint.

# zpool import -d /boot/zfs
# zpool import -f -R /tank tank
# cd /tank/root
# rmdir usr
# ln -s usr_post_update usr
# zfs set mountpoint=/usr_post_update tank/usr_post_update

Now we reboot the system into the pre update environment. We go to the boot loader prompt and change the root file system to the pre update root file system.

# set vfs.root.mountfrom=zfs:tank/root_pre_update
# boot

If everything goes well, the machine boots up into the cloned environment. We can verify this by looking that /usr points to /usr_pre_update.

Let us reboot again into the normal environment (by doing nothing special during boot-up). Now /usr should point to /usr_post_update. With this system we can build and install the world and the kernel as mentioned in the Handbook - Rebuilding "world", without the reboot into single user mode (which will not work, anyway).

Rollback

If anything goes wrong during the update, we can boot our pre update environment by going to the boot loader prompt, selecting the pre update root file system and booting the old kernel.

# set vfs.root.mountfrom=zfs:tank/root_pre_update
# boot kernel.old

Here we can copy over the backup kernel to correct destination and roll back to the snapshot with the comfort of the system in a fully functional state that we know.

Should the update work out, we can destroy the pre update environment in order to safe disk space. Or we can leave it around to mitigate problems of the update we are not aware of just yet.

Resources

Equals != Equality

posted by Christian in Programming

The other day, I read the article How to Write an Equality Method in Java. Which reminded me that I wanted to write an entry about equality in programming. I cannot hope to cover this topic adequately in a blog post. My goal is merely to show that equality is very hard and that we often have a sever misconception about it.

Most programming languages give us several ways to check equality. Java has the operator == to test equality on primitives and references and the method boolean Object.equals(Object o) to test semantic equality of objects (we will discuss later what that means). Common Lisp has a lot more (eq, eql, equal, equalp, =, string=, ...). And did you know about the === operator in Javascript?

Semantics of Equality

What does it mean that two Objects are the same? In the OO world a naive approach is to say, that two objects are the same when they are instances of the same class and all members are the same. This is a nice recursive approach, and it is also not to hard to create a terminating implementation, if you take a bit care of circular references.

However, let us consider the numerical tower. Is the integer 2 the same as the float 2.0? If you interpret class in a Java sense, then they are not the same according to the above definition, because they are not instances of the same class. You might argue that integers can be seen as a super-class of floats (meaning that integers are a subset of floats, i.e., every integer is a float). So you could let the more specialized class decide the equality.

This approach does not help us with a different problem: Two strings, according to our definition, are equal if both strings consist of the same characters (in the same order). So the strings "FooBar" and "foobar" are not equal. But what if we do not care fore case sensitivity, e.g. because our file system does not care for it? Building up a set with all file names is a reasonable use-case. A standard Java HashSet<String> would not be very helpful here.

As a matter of fact, the Java String acknowledges this problem and provides equalsIgnoreCase(), but I cannot tell HashSet to use this method instead of equals() (without extending String).

Object relational mappers (ORM) have a similar, but slightly different, problem. They have to make sure that objects in the data base and in the program are matched up correctly. Here the instances of the class Person with name Maria Smith and Maria Jones might be the same object, just at a different point of time. This is usually mitigated by surrogate keys. So these objects are the same with respect to their ID (the surrogate key), but not with respect to the data they hold. A different distinction for the ORM system when it needs to synchronize objects with the persistency context.

Equivalence Relations

The equals() methods should of course implement an equivalence relation. However, very often they do not. An equivalence relation is reflexive (a = a), symmetrical (if a = b, then b = a), and transitive (if a = b and b = c, then a = c). It is quite easy to break the symmetry the class structure is not considered correctly. Let us assume you have a class A and a class B that is a subclass of A (i.e., it extends A in Java).

class A {
  int x,y;
  boolean equals(Object o) {
    if (o instanceof A) {
      A that = (A)o;
      return that.x == this.x && that.y == this.y;
    }
    return false;
  }
}

class B {
  int z;
  boolean equals(Object o) {
  if (o instanceof B) {
    B that = (B)o;
    return that.z == this.z && super.equals(o);
  }
}

If we have an instance a of A with x=1 and y=2 and an instance b of B with x=1, y=2, and z=3, then a.equals(b) will be true, but b.equals(a) will not. Just saying, only instances of the same class may be equal, is not always a viable solution, as explained above.

There Cannot be Just One

The real problem is, that there is more than one equivalence relation. Java (and many other OO languages) forces us to come up with one (more or less) canonical equivalence relation, rather than being able to define the real equivalence relation one currently needs.

A map implementation where you can define the equivalence relation would help. A Java interface for such a definition could look like this:

public interface Equality {
  boolean equals(Object o1, Object o2);
  int hash(Object o);
}

The problem is, that this equality definition needs to know about all classes whose instances might end up in the map. This problem can be solved by open classes and multiple dispatch (Java supports neither), but also by other techniques.

Mutability

The article I mentioned in the first paragraph mentions defining equality on mutable fields as a pitfall. IMHO the final keyword on variables and fields is underutilized in Java (and final should be the default) but this is beside the point. However, it is not so important that the fields used in the equals method are immutable, but that they do not change as long as the object is in the map. The way a lot of libraries and frameworks (e.g., Spring and Hibernate) work, you must give them mutable fields, but these fields might not change after the initialization is done (which is more than just calling the constructor).

Linguine with Meatballs and Tomato Sauce

posted by Christian in Food

The other day, my wife and I made linguine with meatballs and tomato sauce with an improvised recipe. We were quite pleased with the result. This recipe serves three to four people.

Ingredients

500 gLinguine
500 gMinced Beef
1 tsMeat Rub
Smoked Paprika
Salt
Pepper
2Spanish (Red) Onions
2 - 3 ClovesGarlic
Olive Oil
3 ts(Sun Flower) Oil
1 cnTomato Paste
500 mlTomato Puree
Italien Herbs
Shiraz

Directions

My wife made the tomato sauce. Fry the chopped onions and the chopped garlic in olive oil. Add the tomato paste and toast it. Add the tomato puree and cook it. Season it with the Italien herbs, smoked paprika, salt, and pepper. For the finishing touch add some more olive oil and a big sip of Shiraz.

While my wife prepared the sauce, I made the meatballs. Put the minced beef in a bowl and add the meat rub, smoked paprika, salt, and pepper We were considering to also add a chopped or thinly sliced chili. Next time we'll definitely do that. and mingle it. Form the meat to small meatballs with a diameter between one and two centimeters. Strew some salt and pepper over the meatballs.

Heat up a frying pan with (sun flower) oil and fry the meatballs for two minutes on one side, another two minutes from the other side (until they are crunchy and dark brown) then fry them a little more while tossing them around every now and then. Put the meatballs into the tomato sauce.

Boil water and cook the linguine. Serve the linguine topped with the meatball tomato sauce.

Duckprxy now with Javassist Implementation

posted by Christian in Programming

I just released version 0.2 of duckprxy. It now comes with a duck proxy implementation that uses Javassist: JavassistDuckPrxy.

Proxies created with JavassistDuckPrxy are not using runtime reflection, but dynamically created code that is compiled to byte code by Javassist. This should reduce the overhead to that of normal delegation, hence, very little. I did not do any performance tests yet, though.

FreeBSD: Encrpyted ZFS Root with Geli

posted by Christian in General

ZFS is supposed to support encryption, but it does not yet on FreeBSD. In a previous post I wrote about setting up ZFS on FreeBSD where the root file system uses UFS and the rest goes to logical ZFS volumes.

This time I use geli to encrypt a disk partition and use ZFS for the root file system. I encountered a few problems which I'd like to document here.

The FreeBSD Handbook section on encrypted disks show how to set up encrypted disks with geom or geli. However, you cannot use the geli rc-scripts to ask for the passphrase and attach the partition, because we need the partition ready for ZFS which is started by the loader. It needs to be started by the loader, because it shall provide the root file system.

Fortunately, geli can auto-detect encrypted partitions when it is started by the loader. I think it is not possible to use a key-file in this case. At least I could not see how.

I will now show how I set up a new FreeBSD installation with ZFS as the root file system. Following this instruction can and will destroy your data. It can also destroy your data you will put on the newly installed system in the future. I can take no liability.

Firstly, start installing FreeBSD 7.x. Partition the hard disk to have twro partitions. I used 3GB for the first partition which consists of two slices, the first is 1GB for the initial root file system (512MB should be sufficient) and 2GB for swap. The second partition will contain the encrpyted ZFS pool. Continue to install a minimal system.

After the installation is finished initialise the encrypted partition (assuming we want to encrypt ad0s2) and attach it:

# geli init -s 4096 -b /dev/ad0s2
# geli attach /dev/ad0s2

The -b flag will cause geli to ask for the passphrase at start-up.

Now the encrypted partition is ready for ZFS, but I encountered a problem (a bug?). During start-up the the passphrase was not accepted. The problem is that not all charactes I typed where recognized. We will take care of this later (we need a new kernel). For now we use a work-around. Add the following line to /boot/loader.conf

kern.geom.eli.visible_passphrase="1"

ATTENTION: This will show your passphrase when you type it during start-up. This way you will get feed-back what characters got lost.

Create the ZFS pool on the encrypted partition:

# zpool create tank /dev/ad0s2.eli

Edit /etc/rc.conf to enable ZFS:

zfs_enable="YES"

Reboot the system to see if everything works as expected. You should be asked for your passphrase. You will see the passphrase while typing it. After logging in as root you should be able to see that your encrypted partition (ad0s2.eli) is in the ZFS pool by using the following command:

# zpool status

Now you can follow the rest of the instructions for How to install FreeBSD 7.0 under ZFS.

After that is done you probably want to get rid of the passphrase work-around. If you did not have any problems typing in the passphrase, you can simply remove the setting that makes it visible in the /boot/loader.conf Now that we have the space for it, download the FreeBSD sources and create your own kernel configuration with the device dcons removed, which seems to cause the trouble (at least for me). I don't know why. I just found a discussion where someone mentioned that this might be the problem.

Duck Typing for Java

posted by Christian in Programming

Every now and then I need to implement some interface, but only one (or very few) methods of it. Usually I need something like this in test code, but it is useful elsewhere, too. The mocking libraries I have seen so far offer something similar, but not quite in the way I want to have it.

I had the idea to use something like duck typing for it for a while now. Yesterday I finally found some time to implement it. So I present to you duckprxy. You give it an interface to implement and an object to delegate to you and duckprxy will call the delegate according to the name of the method.

The service interface for this service is defined by DuckProxy. Currently there are two different implementations, but more on that later. Let us assume that the field duckProxy holds such an implementation of DuckProxy (e.g., via injection). Then we can do the following:

interface MyInterface {
    void foo();
    int bar(int x, int y);
}
public class Delegate {
    public int bar() { return 42; }
}
...
MyInterface prxy = duckProxy.makeProxy(MyInterface.class, new Delegate());
prxy.bar(2, 3); // Will return 42.

The standard behavior if an unimplemented method is called is to throw an exception. However, the delegate class can specify a fallback method to call.

public class Delegate {
    ...
    @DuckMethod(fallback = true)
    public int defaultMethod(
            @DuckArg(DuckArgType.NAME) String name) {
        log.info("called method " + name);
    }
}

Now every time an unspecified method is called, it will simply be logged.

Other features:

  • Mapping calls to methods with patterns.
  • Defining sub-delegates to allow easy overriding of a few methods, while the rest is handled by the sub-delegate.

Right now you need to define a class to use the extra-features with annotations. But I am thinking about also allowing to annotate the delegate variable instead of the delegate class.

As mentioned before, there are currently two implementations of DuckProxy. The first is DuckPrxyImpl. This one basically looks what method to call when the call actually happens. The second is DuckPrxyPreCompImpl. This one pre-computes the mapping of the methods in the given interface to the delegate.

I am thinking about using something like javassist for a third implementation that creates a class instead of a proxy. This way the performance impact should be negligible.

Klatschbase 0.7 with Persistence and Generics Parameters

posted by Christian in Lisp

I did not really need persistence in Klatschbase, because I never restarted the Lisp image Klatschbase was running in since version 0.2. The only reason I used a different Lisp image in 0.2 than in 0.1 was due to the fact that I moved Klatschbase to another server. Being able to deploy a new version without stopping your application is really a nice feature of Common Lisp.

However, an unexpected reboot of my virtual root server made it clear, that persistence would be a nice thing to have. The machine was rebooted again a few days later. Hence, I decided to do it. It was on my list, anyway.

So I reevaluated the options. I keep on looking at CouchDB. There are two Common Lisp libraries for it. I also played on connecting to it with simple-http directly. This works good.

I also had a brief look at Elepahnt. There was not a lot to see the last time I checked it out, but now it seems to have some good documentation. It is definitely worth a shot if you have to persist something.

But in the end, I decided that my persistence needs are quite basic. I just want to store the clients and their options and the rooms that the server has. I ended up also storing the current messages in the system. Lisp already has a fairly simpl way to store such things: writing s-expressions with write and reading them with read.

I have to admit that the approach that I used does not scale to well, because the whole state is written every n seconds (or minutes). But this is OK in the given context. The plus side is that it gave me persistence with under 100 lines of code.

Another thing I fixed in this version are the parameter names of the generic functions. For some reason I always thought that the names of the generic function parameters must be the respective class. Which means I ended up with generic function definitions like these:

(defgeneric poll-msgs-wait (msg-store t t))

This, of course, is not the case. While in sbcl the above code compiles, other Common Lisp implementations choke on them. The following, more readable version, should work fine:

(defgeneric poll-msgs-wait (msg-store start-key timeout))

FreeBSD 7 on Compaq 6710b

posted by Christian in General

As mentioned in the previous post, I installed FreeBSD on an HP Compaq 6710b. Previously, I had Ubuntu 8.04 installed.

Booting the ISO Image

First of all, I had to choose how to install. My preferred way of installing FreeBSD and Linux is to boot from a minimal CD and then get the packages (or whatever they are called for the system at hand) via network. The 6710b is an Intel 64 bit architecture. Hence, I assumed that ia64 is the right platform, but it is not. After reading the release notes, it was clear to me, that the correct platform is (counter intuitively) amd64. So I started my download of 7.0-RELEASE-amd64-bootonly.iso and booted the ISO.

The installation went quite smooth. I choose to have a 1GB root partition and 4GB swap partition and have the rest for a ZFS partition. See the post on ZFS on FreeBSD for the details.

WLAN

The 6710b has a Intel PRO/Wireless 3945ABG device. There is a project to get the 3945ABG WLAN device to work under FreeBSD, and now it is available in 8.0-Current and RELENG_7. I had a little trouble to get the device to work. The reason was that you need to accept the license for the firmware. To do so you need the following line in your /boot/loader.conf in order to state that you accept the license:

legal.intel_wpi.license_ack=1

Setting it via kenv and unloading and loading the module if_wpi did not help. I did not want to try every module, so I did a reboot. After that, the device worked, but there are still two strange problems: 1) I need to scan for WLANs, before the device connects to the configured WLAN. I also witnessed this problem under Ubuntu every now and then, but under FreeBSD it is reproducible. 2) Sometimes I loose the connection. After that I have to unload and load the if_wpi module.

X and DRI

I installed the port x11/xorg and configured it with X -configure. DRI was not working, even though the agp module was loaded and detected the Intel Mobile GM965/GL960 card correctly. 3D acceleration is not so important for me, but an X without 2D acceleration is unusable these days. Unfortunately, X made strange rendering errors that would only go away when I set the NoAccel option in the device section to true.

After trying several things I decided to grab the FreeBSD sources of RELENG_7 via the following command and compile my own kernel.

# csup -h cvsup2.de.freebsd.org /usr/share/examples/cvsup/stable-supfile

I basically simply compiled a new kernel using the GENERIC configuration file and installed it with the modules. After that DRI worked and the rendering problems were gone.

Suspend to RAM

To make it short: it does not work. I was expecting this, because I never had luck to get it to work under FreeBSD on a HP Compaq nx7010. The problem seems to be that HP Bioses are a bit, well, different than one normally expects.

Other Hardware

The LAN device just works. USB is no problem, either (checked with USB mouse and memory stick). I don't care about blue-tooth. CPU speed-stepping and SMP seems to work, but I did not take a close look.

Links

  • TuxMobil has or links reports and HowTos for Linux and *BSD on different (mobile) Hardware

ZFS on FreeBSD

posted by Christian in General

This posting is a small wrap-up of how I set-up ZFS on my FreeBSD 7 installation. I am currently in the process of installing FreeBSD 7 on a Compaq 6710b. A complete report about this will follow in a future posting, but this posting is not about the hardware side.

Sun's file system ZFS has a lot of cool features. If you understand german, you can listen to Chaosradio Express episode 49 to learn more about it.

DISCLAIMER: Disk related things always put your data at risk. So keep back-ups of your data. Following anything here might destroy your hard disk (or the data on it). I cannot take any responsibility for that. I only write up what worked for me.

I like to use disk encryption. ZFS in theory supports encryption, but it is not really available for FreeBSD, yet. So I set up GEOM Based Disk Encryption as described in the disk encryption chapter of the FreeBSD handbook. Using geli should work just as fine.

Instead of creating a file system with newfs (step 5) I set up the zfs pool using the geom device.

# zpool create tank ad4s1c.bde

Note that step 6 and 7 are not applicable in this set-up.

I used some tips from the FreeBSD ZFS quick start guide, like enabling compression on the ports tree, except the distfiles directory. However, I choose to have a plain old UFS root partition and normal swap partition. Fortunately, you can use one ZFS partition and still have different subdirectories in your root that use this ZFS partition without using symlinks.

# zfs create tank/usr
# zfs create tank/usr/ports
# zfs set compression=gzip tank/usr/ports
# zfs create tank/usr/ports/distfiles
# zfs set compression=off tank/usr/ports/distfiles
# zfs create tank/tmp
# zfs create tank/var

Once everything has been copied to the right place, I set the mountpoints.

# zfs set mountpoint=/usr tank/usr
# zfs set mountpoint=/var tank/var
# zfs set mountpoint=/tmp tank/usr

To have everything set-up at boot time, I added the following to /etc/rc.conf:

gbde_autoattach_all="YES"
gbde_devices="ad4s1c"
gbde_lockdir="/etc/gbde"
zfs_enable="YES"

Furthermore, I made sure that ad4s1c is not mentioned anymore in the fstab (neither ad4s1c.bde). It was still there because I created the partition with the installer.

Note that gbde is handled before ZFS during start-up. Hence, there is no problem having ZFS in an encrypted container like this. The same holds when you use geli.

I wanted to copy some data from a Linux machine (the former installation) to a FreeBSD machine (the new installation) using an external USB disk. You can use ZFS with fuse on linux. So, in principal, that should have worked something like this:

--- On the source machine:
# zpool create ext sdb1
# cp -r * /ext/
# zpool export ext
--- On the target machine
# zpool import ext
# cp -r /ext/* .

Unfortunately, I got a message that the pool was created with a newer ZFS version. So I tried to attach the disk on my iBook where Linux is installed, but while tryin to install fuse ZFS I had to realize that only amd64, x86, and sparc64 are supported by ZFS on fuse.

So far this set-up works just fine. I will see how useful thing like snapshots are.

The Visitor Pattern in Differen Languages

posted by Christian in Programming

In this post I want to explore the usage of the visitor pattern in different programming languages.

The goal is always to represent expression trees, i.e., to have the abstract syntax tree (AST) of an algebraic expression like 2 * 5 - 3.

Let us start with a simple implementation in Java:

import java.util.HashMap;
import java.util.Map;

public class ExpressionTree {
    
    public static interface Operator {
        int eval(int ... args);
    }
    
    public static Map<String, Operator> OPERATOR_MAP
    = new HashMap<String, Operator>();
    
    static {
        OPERATOR_MAP.put("+", new Operator() {
            public int eval(int ... args) {
                int result = 0;
                for (int i : args) {
                    result += i;
                }
                return result;
            }
        });
        OPERATOR_MAP.put("*", new Operator() {
            public int eval(int ... args) {
                int result = 1;
                for (int i : args) {
                    result *= i;
                }
                return result;
            }
        });
        OPERATOR_MAP.put("-", new Operator() {
            public int eval(int ... args) {
                if (args.length != 2) {
                    throw new IllegalArgumentException("wrong arity");
                }
                return args[0] - args[1];
            }
        });
        OPERATOR_MAP.put("/", new Operator() {
            public int eval(int ... args) {
                if (args.length != 2) {
                    throw new IllegalArgumentException("wrong arity");
                }
                return args[0] / args[1];
            }
        });
    }
    
    public static interface Visitor {
        void apply(Node node);
    }
    
    public static class Node {
        public void visit(Visitor visitor) {
            visitor.apply(this);
        }
    }
    
    public static class Literal extends Node {
        private int value;
        public Literal(int value) {
            this.value = value;
        }
        public int getValue() {
            return value;
        }
    }
    
    public static class Operation extends Node {
        private Node[] children;
        private String name;
        public Operation(Node[] children, String name) {
            this.children = children;
            this.name = name;
        }
        public String getName() {
            return name;
        }
        public Node[] getChildren() {
            return children;
        }
    }
    
    public static class EvalVisitor implements Visitor {
        private int value;
        public int evaluate(Node node) {
            apply(node);
            return this.value;
        }
        public void apply(Literal node) {
            value = node.getValue();
        }
        public void apply(Operation operation) {
            Node[] children = operation.getChildren();
            int[] values = new int[children.length];
            for (int i=0; i<children.length; i++) {
                EvalVisitor visitor = new EvalVisitor();
                children[i].visit(visitor);
                values[i] = visitor.getValue();
            }
            String name = operation.getName();
            Operator op = OPERATOR_MAP.get(name);
            if (op == null) {
                throw new RuntimeException("unknown operation " + name);
            }
            this.value = op.eval(values);
        }
        public void apply(Node node) {
            if (node instanceof Literal) {
                apply((Literal) node);
            }
            else if (node instanceof Operation) {
                apply((Operation) node);
            }
            else {
                throw new RuntimeException("unknown node instance");
            }
        }
        public int getValue() {
            return value;
        }
    }

    public static void main(String[] args) {
        System.out.println(
                new EvalVisitor().evaluate(
                        new Operation(new Node[] {
                                new Operation(new Node[] {
                                        new Literal(2),
                                        new Literal(5)},
                                        "*"),
                                new Literal(3)},
                                "-")));
    }

}

Two things can be observed here: 1) The base class Node has a method visit() which simply calls the visitor with itself as the argument. 2) The visitor uses instanceof to delegate to the according apply() methods. Hence, run time type information is used. Java's OO system uses single dispatch. Multiple dispatch would make this delegation apply() method superfluous, as we shall see later.

Not all languages allow us to use run time type information and we can get rid of it by modifying the the Visitor interface as follows:

    public static interface Visitor<T> {
        T apply(Literal node);
        T apply(Operation node);
    }

This, however, forces us to make the visit() method in Node abstract and to implement it in Literal and Operation. The code is always exactly the same. Thus, we have the same method in every concrete subclass of Node instead of having it just once. The approach would be required in OCaml if you want to use objects. Another option would be to use algebraic data structures, which are very well supported in OCaml.

Let's move on to a language that has multiple dispatch: Groovy.

def OPERATOR_MAP = [
    "+": { args ->
        int result = 0;
        for (int i : args) {
           result += i;
        }
        return result;
    },
    "*": { args ->
        int result = 1;
        for (int i : args) {
            result *= i;
        }
        return result;
    },
    "-": { args ->
        if (args.length != 2) {
            throw new IllegalArgumentException("wrong arity");
        }
        return args[0] - args[1];
    },
    "/": { args ->
        if (args.length != 2) {
            throw new IllegalArgumentException("wrong arity");
        }
        return args[0] / args[1];
    }]

public interface Visitor {
    void apply(Node node);
}

public class Node {
    public void visit(Visitor visitor) {
        visitor.apply(this);
    }
}

public class Literal extends Node {
    private int value;
    public Literal(int value) {
        this.value = value;
    }
    public int getValue() {
        return value;
    }
}

public class Operation extends Node {
    private def children;
    private String name;
    public Operation(def children, String name) {
        this.children = children;
        this.name = name;
    }
    public String getName() {
        return name;
    }
    public def getChildren() {
        return children;
    }
}

public class EvalVisitor implements Visitor {
    private int value;
    private def operatorMap;
    public EvalVisitor(def operatorMap) {
        this.operatorMap = operatorMap;
    }
    public int evaluate(Node node) {
        apply(node);
        return this.value;
    }
    public void apply(Literal node) {
        value = node.getValue();
    }
    public void apply(Operation operation) {
        Node[] children = operation.getChildren();
        int[] values = new int[children.length];
        for (int i=0; i<children.length; i++) {
            EvalVisitor visitor = new EvalVisitor(operatorMap);
            children[i].visit(visitor);
            values[i] = visitor.getValue();
        }
        String name = operation.getName();
        def op = operatorMap[name];
        if (op == null) {
            throw new RuntimeException("unknown operation " + name);
        }
        this.value = op(values);
    }
    public void apply(Node node) {
        throw new RuntimeException("unknown node instance");
    }
    public int getValue() {
        return value;
    }
}

println(new EvalVisitor(OPERATOR_MAP).evaluate(
    new Operation([
        new Operation([ new Literal(2), new Literal(5)], "*"),
        new Literal(3)],
        "-")));

The Java code was taken as the basis for the Groovy code, as one can clearly see. As mentioned before, the method apply() of the visitor that delegated.

The method visit() is not really needed, since we use run time type information, anyway. But let us first look at an implementation in Common Lisp.

(defpackage :expression-tree
  (:use :common-lisp))

(in-package :expression-tree)

(defparameter *operator-map*
  `((+ . ,#'+)
    (* . ,#'*)
    (- . ,#'-)
    (/ . ,#'/)))

(defclass visitor () ())

(defgeneric visitor-apply (visitor node))


(defclass node () ())

(defgeneric visit (node visitor))

(defmethod visit ((node node) (visitor visitor))
  (visitor-apply visitor node))


(defclass literal (node)
  ((value :reader literal-value :initarg :value)))

(defclass operation (node)
  ((name     :reader operation-name     :initarg :name)
   (children :reader operation-children :initarg :children)))


(defclass eval-visitor (visitor)
  ((value :accessor eval-visitor-value :initform 0)))

(defmethod visitor-apply ((visitor visitor) (node literal))
  (setf (eval-visitor-value visitor) (literal-value node)))

(defmethod visitor-apply ((visitor visitor) (node operation))
  (let ((operation (cdr (assoc (operation-name node) *operator-map*))))
    (when (null operation)
      (error "unknown operation ~A" (operation-name node)))
    (setf (eval-visitor-value visitor)
          (apply operation
                 (mapcar (lambda (child)
                           (let ((sub-visitor (make-instance 'eval-visitor)))
                             (visit child sub-visitor)
                             (eval-visitor-value sub-visitor)))
                         (operation-children node)))))

(defun expression-evaluate (node)
  (let ((visitor (make-instance 'eval-visitor)))
    (visit node visitor)
    (eval-visitor-value visitor)))

(expression-evaluate
 (make-instance 'operation :name '-
                :children
                (list
                 (make-instance 'operation
                                :name '*
                                :children
                                (list
                                 (make-instance 'literal :value 2)
                                 (make-instance 'literal :value 5)))
                 (make-instance 'literal :value 3))))

The same approach is used here with the visitor class and the visit method. The fact that visit really just calls visitor-apply with swapped arguments, makes it even more obvious that this method is not really needed in dynamic typed languages. So let us get rid of this extra step. And while we are at it, let us define a function sexp-to-ast that transforms an s-expression into a node graph.

(defpackage :expression-tree
  (:use :common-lisp))

(in-package :expression-tree)

(defparameter *operator-map*
  `((+ . ,#'+)
    (* . ,#'*)
    (- . ,#'-)
    (/ . ,#'/)))

(defclass visitor () ())

(defclass node () ())

(defgeneric visit (node visitor))

(defclass literal (node)
  ((value :reader literal-value :initarg :value)))

(defclass operation (node)
  ((name     :reader operation-name     :initarg :name)
   (children :reader operation-children :initarg :children)))


(defclass eval-visitor (visitor)
  ((value :accessor eval-visitor-value :initform 0)))

(defmethod visit ((node literal) (visitor eval-visitor))
  (setf (eval-visitor-value visitor) (literal-value node)))

(defmethod visit ((node operation) (visitor eval-visitor))
  (let ((operation (cdr (assoc (operation-name node) *operator-map*))))
    (when (null operation)
      (error "unknown operation ~A" (operation-name node)))
    (setf (eval-visitor-value visitor)
          (apply operation
                 (mapcar (lambda (child)
                           (let ((sub-visitor (make-instance 'eval-visitor)))
                             (visit child sub-visitor)
                             (eval-visitor-value sub-visitor)))
                         (operation-children node))))))

(defun expression-evaluate (node)
  (let ((visitor (make-instance 'eval-visitor)))
    (visit node visitor)))
    (eval-visitor-value visitor)))

(defclass eval-visitor (visitor) ())

(defmethod visit ((node literal) (visitor visitor))
  (literal-value node))

(defmethod visit ((node operation) (visitor visitor))
  (let ((operation (cdr (assoc (operation-name node) *operator-map*))))
    (when (null operation)
      (error "unknown operation ~A" (operation-name node)))
    (apply operation
           (mapcar (lambda (child)
                     (let ((sub-visitor (make-instance 'eval-visitor)))
                       (visit child sub-visitor)))
                   (operation-children node)))))



(defun sexp-to-ast (sexp)
  (if (listp sexp)
      (let ((children (mapcar #'sexp-to-ast (cdr sexp))))
        (make-instance 'operation :name (car sexp) :children children))
      (make-instance 'literal :value sexp)))

(expression-evaluate (sexp-to-ast '(- (* 2 5) 3)))

Visitors can have state. We exploited that fact, but let us redefine visit to return a value. This way we can get rid of the state and reuse our visitor.

(defpackage :expression-tree
  (:use :common-lisp))

(in-package :expression-tree)

(defparameter *operator-map*
  `((+ . ,#'+)
    (* . ,#'*)
    (- . ,#'-)
    (/ . ,#'/)))

(defclass node () ())

(defgeneric visit (node visitor))

(defclass literal (node)
  ((value :reader literal-value :initarg :value)))

(defclass operation (node)
  ((name     :reader operation-name     :initarg :name)
   (children :reader operation-children :initarg :children)))


(defclass eval-visitor () ())

(defmethod visit ((node literal) (visitor eval-visitor))
  (literal-value node))

(defmethod visit ((node operation) (visitor eval-visitor))
  (let ((operation (cdr (assoc (operation-name node) *operator-map*))))
    (when (null operation)
      (error "unknown operation ~A" (operation-name node)))
    (apply operation
           (mapcar (lambda (child) (visit child visitor))
                   (operation-children node)))))

(defun expression-evaluate (node)
  (let ((visitor (make-instance 'eval-visitor)))
    (visit node visitor)))


(defun sexp-to-ast (sexp)
  (if (listp sexp)
      (let ((children (mapcar #'sexp-to-ast (cdr sexp))))
        (make-instance 'operation :name (car sexp) :children children))
      (make-instance 'literal :value sexp)))

(expression-evaluate (sexp-to-ast '(- (* 2 5) 3)))

This, of course, can also be done in Groovy (and in Java). Here it is (with OPERATOR_MAP omitted):

public interface Visitor {
    Object apply(Node node);
}

public class Node {
}

public class Literal extends Node {
    private int value;
    public Literal(int value) {
        this.value = value;
    }
    public int getValue() {
        return value;
    }
}

public class Operation extends Node {
    private def children;
    private String name;
    public Operation(def children, String name) {
        this.children = children;
        this.name = name;
    }
    public String getName() {
        return name;
    }
    public def getChildren() {
        return children;
    }
}

public class EvalVisitor implements Visitor {
    private def operatorMap;
    public EvalVisitor(def operatorMap) {
        this.operatorMap = operatorMap;
    }
    public int apply(Literal node) {
        return node.getValue();
    }
    public int apply(Operation operation) {
        Node[] children = operation.getChildren();
        int[] values = new int[children.length];
        for (int i=0; i<children.length; i++) {
            values[i] = apply(children[i]);
        }
        String name = operation.getName();
        def op = operatorMap[name];
        if (op == null) {
            throw new RuntimeException("unknown operation " + name);
        }
        return op(values);
    }
    public Object apply(Node node) {
        throw new RuntimeException("unknown node instance");
    }
}


println(new EvalVisitor(OPERATOR_MAP).apply(
    new Operation([
        new Operation([ new Literal(2), new Literal(5)], "*"),
        new Literal(3)],
        "-")));

Here it is in Java (with OPERATOR_MAP omitted):

import java.util.HashMap;
import java.util.Map;

public class ExpressionTree {
    
    public static interface Operator {
        int eval(int ... args);
    }
    
    public static interface Visitor {
        void apply(Node node);
    }
    
    public static class Node {
        public void visit(Visitor visitor) {
            visitor.apply(this);
        }
    }
    
    public static class Literal extends Node {
        private int value;
        public Literal(int value) {
            this.value = value;
        }
        public int getValue() {
            return value;
        }
    }
    
    public static class Operation extends Node {
        private Node[] children;
        private String name;
        public Operation(Node[] children, String name) {
            this.children = children;
            this.name = name;
        }
        public String getName() {
            return name;
        }
        public Node[] getChildren() {
            return children;
        }
    }
    
    public static class EvalVisitor implements Visitor {
        private int value;
        public int evaluate(Node node) {
            apply(node);
            return this.value;
        }
        public void apply(Literal node) {
            value = node.getValue();
        }
        public void apply(Operation operation) {
            Node[] children = operation.getChildren();
            int[] values = new int[children.length];
            for (int i=0; i<children.length; i++) {
                EvalVisitor visitor = new EvalVisitor();
                children[i].visit(visitor);
                values[i] = visitor.getValue();
            }
            String name = operation.getName();
            Operator op = OPERATOR_MAP.get(name);
            if (op == null) {
                throw new RuntimeException("unknown operation " + name);
            }
            this.value = op.eval(values);
        }
        public void apply(Node node) {
            if (node instanceof Literal) {
                apply((Literal) node);
            }
            else if (node instanceof Operation) {
                apply((Operation) node);
            }
            else {
                throw new RuntimeException("unknown node instance");
            }
        }
        public int getValue() {
            return value;
        }
    }

    public static void main(String[] args) {
        System.out.println(
                new EvalVisitor().evaluate(
                        new Operation(new Node[] {
                                new Operation(new Node[] {
                                        new Literal(2),
                                        new Literal(5)},
                                        "*"),
                                new Literal(3)},
                                "-")));
    }

}

For the sake of completeness here is the mentioned variant in Java that does not use dynamic type features (with OPERATOR_MAP omitted):

import java.util.HashMap;
import java.util.Map;

public class ExpressionTree {
    
    public static interface Operator {
        int eval(int ... args);
    }
    
    public static interface Visitor<T> {
        T apply(Literal node);
        T apply(Operation node);
    }
    
    public static abstract class Node {
        public abstract Integer visit(Visitor<Integer> visitor);
    }
    
    public static class Literal extends Node {
        private int value;
        public Literal(int value) {
            this.value = value;
        }
        public int getValue() {
            return value;
        }
        public Integer visit(Visitor<Integer> visitor) {
            return visitor.apply(this);
        }
    }
    
    public static class Operation extends Node {
        private Node[] children;
        private String name;
        public Operation(Node[] children, String name) {
            this.children = children;
            this.name = name;
        }
        public String getName() {
            return name;
        }
        public Node[] getChildren() {
            return children;
        }
        public Integer visit(Visitor<Integer> visitor) {
            return visitor.apply(this);
        }
    }
    
    public static class EvalVisitor implements Visitor<Integer> {
        public Integer apply(Literal node) {
            return node.getValue();
        }
        public Integer apply(Operation operation) {
            Node[] children = operation.getChildren();
            int[] values = new int[children.length];
            for (int i=0; i<children.length; i++) {
                values[i] = children[i].visit(this);
            }
            String name = operation.getName();
            Operator op = OPERATOR_MAP.get(name);
            if (op == null) {
                throw new RuntimeException("unknown operation " + name);
            }
            return op.eval(values);
        }
    }

    public static void main(String[] args) {
        System.out.println(
                new EvalVisitor().apply(
                        new Operation(new Node[] {
                                new Operation(new Node[] {
                                        new Literal(2),
                                        new Literal(5)},
                                        "*"),
                                new Literal(3)},
                                "-")));
    }

}

So, what does all this tell us? Two things are obvious: 1) To get rid of the method visit() in the target object structure we need dynamic typing. 2) To get rid of the manual dispatching we need either multiple dispatch or we have to implement the visit() in every concrete sub-class of the target object structure (here all sub-classes of Node).

But there is still more to it. There is a subtle difference between the Groovy and the Common Lisp version. To emphasise this, in CL the visiting method is visit, but in Groovy it is apply. The method apply belongs to the visitor class. In CL a method does not belong to any class, but it is just a question of specialisation.