Skip to content
This repository has been archived by the owner on Jun 6, 2021. It is now read-only.

Note: Computed Gotos and Label Values

Alex Rønne Petersen edited this page Jun 7, 2013 · 3 revisions

An interpreter might be written like this:

pub enum OpCode : u8 {
    Add = 0;
    // ...
}

pub struct Insn {
    pub op : OpCode;
    pub next : &Insn;
}

pub fn dispatch(code : &[&Insn]) -> unit {
    let mut insn = code.[0];

    start: match insn.op {
        Add {
            // ...

            insn = insn.next;
            goto start;
        }
        // ...
    };
}

This traditional style of interpreter is rather slow. We can do better by threading the dispatch using label values and computed gotos:

pub enum OpCode : u8 {
    Add = 0;
    // ...
}

pub struct Insn {
    pub op : OpCode;
    pub lbl : *u8;
    pub next : &imm Insn;
}

pub fn dispatch(code : &[&mut Insn]) -> unit {
    for insn in code {
        if insn.lbl != null {
            loop;
        };

        match insn.op {
            Add { insn.lbl = &&handle_add; }
            // ...
        };
    };

    let mut insn = code.[0];
    goto *insn.lbl;

    handle_add: {
        // ...

        goto *(insn = insn.next).lbl;
    };

    // ...
}

The difference between the first version and this second version is that the first one uses a jump table (generated from the match). For every instruction, the operation code must be looked up in the table in order to compute the address to jump to. The second version has the address computed ahead of time and can simply jump to it without further ado. This may seem like a minor difference, but in an interpreter's dispatch loop, it can make all the difference in the world. For instance, the Erlang folks reported up to a 50% speedup from using computed gotos.

The particular syntax used is inherited from GNU C where the feature was first implemented.

The unary && operator is needed because labels live in a separate namespace from regular identifiers. It takes the address of a label and returns it as type *u8. A label address is guaranteed to be stable across function calls (so long as the dynamic library (if any) containing the function the label addresses were obtained in is not reloaded in some way).

The goto expression followed by a * and an arbitrary expression is a computed goto (as opposed to goto followed by an identifier). The expression given to a computed goto must be of type *u8. Jumping to a label that is not inside the currently executing function is illegal.

Both && and goto * are considered unsafe.

Clone this wiki locally