|
In June of 1997, I wanted to learn Java beter. I had a pretty good theoretical understanding of the language, but hadn't written much actual Java code. I decided to write a Lisp interpreter, and JSCHEME is the result. Here are the design decisions I dealt with.
Addendum: In 2001 (and again in 2002) we got letters from the friendly folks at Thredtec.com, who have a trademark on ``Silk'' for a Java simulation tool, so we changed the name to ``Jscheme,'' which was the name of Tim Hickey's implementation before he joined the SILK project.
(define (loop n) (if (> n 0) (loop (- n 1)))) (loop 1000)I conjecture that most of the time goes into a linear lookup of functions in the global environment. This conjecture is confirmed by seeing that the following variation runs about nine times faster:
(define loop2 (let ((> >) (- -)) (lambda (n) (if (> n 0) (loop2 (- n 1)))))) (loop2 1000)I can get another 15% faster by doing:
(eval `(define (loop3 n) (if (,< n 0) 0 (loop3 (,- n 1))))) (loop3 1000)With this I can loop 14,000 times in a second, which seems fast enough. Perhaps I should have the notion of immutable variables, and have lambda do some pre-treatment analysis, so the user can write the loop version and get the effect of the loop3 version.
However, I knew I really wanted to have a separate class for the interpreter (so I could swap several interpreters in; perhaps a simple interpreter, a tail-recursive interpreter, and an interpreter that supports full call/cc, as I did in PAIP). I also wanted to have a class for the primitive procedures, so I could have an R4RS, and an R5RS version some day. (I should probably clean things up so that the extensions are in a separate class from the R4RS primitives.) But if I put the list method in the primitive procedure class (where it will be used fairly often), then I would have to write Primitive.list(x, y) in the other classes (which also want to use list fairly often). I guess most Java programmers would just have accepted this, but I came up with the idea of putting the utility methods in their own class, SchemeUtils, and then having the five other top-level classes extend SchemeUtils. That way, I can use the unqualified name, and I get the modularity I want. So the class structure I ended up with is:
SchemePrimitives SchemeUtils Scheme Environment Pair InputPort Procedure Primitive JavaMethod Continuation Closure Macro |
new Primitive("list", env, 0, n) { public Object apply(Object args) { return args; }}; new Primitive("null?", env, 1) { public Object apply(Object args) { return truth(first(args) == null); }};
The other possibility is to have a big switch statement. I decided that loading 150 inner classes might be slow, so I went with the big switch. This means a lot of busy work, and breaking up the definition of each primitive into three places: first I need to define a long list of final static ints for the switch labels, then I need to initialize all the procedures, and finally I need to write the big switch that defines the semantics of each primitive. This is ugly (in Lisp it would be pretty) but I decided to put up with it. It ends up like this:
final static int ... LIST = 16, ... NULLQ = 23, ... // Define constants ... public static Environment installPrimitives(Environment env) { int n = Integer.MAX_VALUE; env .defPrim("list", LIST, 0, n) // list has 0 to n arguments ... .defPrim("null?", NULLQ, 1) // null? has exactly one argument ...; } public Object apply(Scheme interpreter, Object args) { ... Object x = first(args); switch(idNumber) { ... case LIST: return args; ... case NULLQ: return truth(x == null); } }
For numbers, I have only one type: Double. This is allowable by the specification but raises a problem with respect to exact/inexact numbers. I didn't implement inexact contagion properly. Technically, (max 4 3.9) should be an inexact 4. But I have no way to distinguish #e4 from #i4. I implement (exact? x) as true iff (= x (round x)) and if adding or subtracting 1 from x changes x. Rather than do the add/subtract each time, I test against the maximum allowable number. I guess I could have looked up the format for IEEE double floating point to see that there are 53 bits in the mantissa, but instead I wrote a program to figure out that I can represent exact integers up to 0x1fffffffffffff (just under 1016). I still have the problem that (* 2 0.5) comes out to an exact number, when it should be inexact. Here's the program:
(define (f lo hi) (let ((mid (+ (/ hi 2) (/ lo 2)) (round (sqrt (* lo hi))))) (if (or (= lo mid) (= mid hi)) lo (if (ok? mid) (f mid hi) (f lo mid))))) (define (ok? n) (and (< (- n 1) n (+ n 1)) (= (- (+ n 1) 1) n (+ (- n 1) 1)))) (number->string (f 1 (expt 2 63)) 16)
Scheme Type | Java Type |
|
pair | Pair | Java has no equivalent to use. |
empty list | (null) | We could have Pair and Empty as subclasses of List, or EMPTY could be a static var. But I decided its easiest to have null for (). This means only static methods on lists. |
boolean | Boolean | Simple enough. Note that the empty list is not considered false in Scheme, only #f is false. |
number | Double | We might add BigDecimal or BigInteger. |
string | char[] | Can't be String; they're immutable. Could be StringBuffer, but char[] is simpler and sufficient. |
character | Character | Simple enough. |
symbol | String | String is a somewhat confusing choice, because you have to remember that String means symbol and char[] means string, but String has the right properties: has a name, is internable, is immutable, doesn't need anything else. Another option is to have a class for Symbol that includes a slot for the global value. That would make lookup faster, but note that then we'd need a symbol table. Actually, we'd need one symbol table for each instance of the Scheme interpreter, and we'd need separate copies of every symbol in each instance. |
vector | Object[] | Because Vector is more powerful than is needed. |
procedure | Procedure
Primitive Closure SchemePrimitives Macro Continuation JavaMethod |
Procedure is the abstract superclass.
Primitive is a Scheme primitive. Closure is a user-defined function. SchemePrimitives holds Scheme code in a string; loaded at start-up. Macro is a subclass of Closure for doing macros. It has an expand method. Continuation is what you get from call/cc. JavaMethod encapsulates a Java Method. |
input port | InputPort | It would have been simpler to use Reader, but I wanted someplace to put the code for read, so I created a class. |
output port | PrintWriter | We could have had a class to hold the stringify method, but its not really necessary to have another class. |
Scheme Concept | Java Type | Notes |
interpreter | Scheme | The class Scheme implements eval, apply, and a few other methods. Note you can instantiate several different Schemes at once. |
environment | Environment | An environment has lists of variables and values. It uses linear search, which is slow, especially in the global environment. |
continuation | Continuation | call/cc builds a Continuation procedure which, when applied, throws a RuntimeException. call/cc then sets up a try/catch block that catches that identical exception only. Got it? |
error | RuntimeException | Call me lazy, but I didn't want to clutter the program with throws clauses, so I use RuntimeException, which doesn't need to be declared. |
symbol table | (none) | We use the internal Hashtable that Java uses in its String.intern() method as the symbol table. That is, symbols are represented as Strings that have been interned. |
variable | (none) | Variables are refered to implicitly by their position in Environments. |