Systems Analysis and Programming:
Thoughts from the Attic

As a teenager in the early 1970s, I enjoyed going up to the attic and looking through old stacks of Scientific American magazines. (We didn't have Wikipedia back then.) Most coveted were the September issues, which were dedicated to a single topic area. I remember reading about Life, The Universe, and Mathematics.


The Universe



But the issue that had the biggest effect on me was the September 1966 on Information (which I read about 40 years ago). The issue featured a terrific collection of authors who are now acknowledged as pioneering leaders in computer science: Evans and Sutherland explaining computer hardware; Fano and Corbato on operating systems; Tony Oettinger describing his natural language parser; and the two giants of my own subfield (Artificial Intelligence), McCarthy and Minsky, on Information Theory and AI. If I had somehow been able to comprehend everything in this issue, I could have cut a decade's time off my education in Computer Science.

In this essay I'll concentrate on on one article from the issue: Christopher Strachey's contribution on System Analysis and Programming. At the time I had seen only a few snippets of BASIC code; nothing more than a few lines. This short article was my first introduction to a high-level programming language and the first non-trivial program I'd ever read: a checkers-playing program. When I rediscovered this article recently I was surprised to find two things: (1) the programming language and programming style are thoroughly modern, and (2) there is a serious mismatch between the design and the implementation, or the systems analysis and programming as Strachey calls it. Let's look at what Strachey did surprisingly well, what he got wrong, and what he didn't include at all.

The Good, The Bad, and the Missing

The Checkers Program

Here is the code as presented by Strachey on page 117. I have done my best to copy the original verbatim (including copying the inconsistent usage of spaces around punctuation):
ChosenPosition(P) = value of
    § let L = LegalPositionsFrom(P)
      if Null(L) then Resign
      let (p,v) = BestPosition(NIL, - ∞,L,0)
      result is p §

BestPosition(P,V,L,d) = Null(L) → (P,V), value of
    § let (p, l) = Next(L)
      let v = - PositionValue(p,d + 1)
      result is (v > V) → BestPosition(p,v,l,d),
                          BestPosition(P,V,l,d) §

PositionValue(P,d) = Terminal(P,d) → TerminalValue(P), value of
    § let L = LegalPositionsFrom(P)
      let (p,v) = BestPosition(NIL,- ∞,L,d)
      result is v §

LegalPositionsFrom(P) = value of
    § let L = RemainingPositionList(P,Capture,5)
      result is Null(L)→RemainingPositionList(P,NonCapture,5),L §

RemainingPositionList(P,C,s) =

PartialPositionList(P,C,s,L) =
      Null(L)→( (s = -5)→NIL,
               RemainingPositionList(P,C,NextMoveDirection(s) ), 
         value of
    § let Φ = SingleDigitFrom(L)
      let Ip = MakeMove(P,C,s,Φ)
      let l = (C = Capture)→CaptureTree(Ip),
      result is Join(l,PartialPositionList(P,C,s, L - Φ))§

NextMoveDirection(s) = (s = 5) → 4, ( (s = 4) → - 4, -5)

FindMoveList(P,C,s) = value of
    § let (X,Y,K,σ) = P
      let Empty = ~ X ∧ ~ Y ∧ Board
      let ψ = (C = Capture) → (Shift(Empty,σs) ∧ Y), Empty
      let Φ = Shift(ψ, σs) ∧ X
      result is (s > 0) → Φ, Φ ∧ K §

MakeMove(P,C,s,Φ) = value of
    § let (X,Y,K,σ) = P
      let ψ = (C = Capture) → Shift(Φ, - σs), NIL
      let θ = (C = Capture) → Shift(ψ, - σs),
                              Shift(Φ, - σs)
      let Xk = Null(Φ ∧ K) → (θ ∧ LastRows), (θ - Φ)
      result is ((X - Φ + θ), (Y - ψ), (K - ψ ∧ K + Xk),σ,θ)§

FinalPosition(Ip) = value of
    § let (X,Y,K,σ,Φ) = Ip
      result is (Y,X,K, - σ)§

CaptureTree(Ip) = value of
    § let L = PartialCapturePositionList(Ip)
      result is Null(L) → (FinalPosition(Ip) ),
                           CombineCaptureTrees(L) §

PartialCapturePositionList(Ip) = value of
    § let (X,Y,K,σ,Φ) = Ip
      let P = (X,Y,K,σ)
      result is MinList(PCP(P,Φ,5),PCP(P,Φ,4),
                        PCP(P,Φ ∧ K, - 4), PCP(P,Φ ∧ K, - 5) )§

PCP(P,Φ,s) = value of
    § let (X,Y,K,σ) = P
      let ψ = Shift(Φ, - σs) ∧ Y
      let Empty = ~ X ∧ ~ Y ∧ Board
      let θ = Shift(ψ, - σs) ∧ Empty
      let Xk = Null(Φ ∧ K) → (θ ∧ LastRows), (θ - Φ)
      result is Null(θ) → NIL,
                   ((X - Φ + θ),(Y - ψ),(K - ψ ∧ K + Xk),σ,θ)§

CombineCaptureTrees(L) = Null(L) → NIL, value of
    § let (Ip,l) = Next(L)
      result is Join(CaptureTree(Ip),CombineCaptureTrees(l) )§

Representing Positions

Strachey uses a clever representation for board positions, one that goes back to Arthur Samuel's 1959 program and article. He numbers the positions as follows:

5 6 7 8
9 10 11 12
14 15 16 17
18 19 20 21
23 24 25 26
27 28 29 30
32 33 34 35
36 37 38 39

Note that this representation is not dense; some numbered squares (13, 22, 31) are not on the board. The advantage of this approach is that every square is either 4 or 5 away from its diagonal neighbors.

Given this numbering, a board position is a four-tuple (X, Y, K, σ), where σ is the player to move, and X,Y and K are bit patterns with a 1 in the square number where there is a piece: the X bit pattern represents the player's pieces; Y represents the opoonent's pieces; and K represents the kings (of either player).

A Translator for CPL

Strachey didn't have a CPL compiler when he wrote the article in 1966. The first CPL compiler emerged around 1970, and by the 1980s they were gone. So I had to create my own before I could run and debug the checkers program. I had three choices to make: (1) What CPL syntax is legal and what does it mean? (2) What parser-generator system should I use to create the CPL translator? (3) What language should I translate into?

(1) Most of CPL is straightforward, but I got some help by reading the 1963 and 1968 articles on CPL. I learned that "A block consists of one or more definitions followed by one or more commands, the whole being enclosed within section brackets." However, the checkers program does not follow this definition of block, because in the first function definition, ChosenPosition, we have a block that has a definition (let L), followed by a command (if Null(L)), followed by another definition. According to the 1963 paper, we should have a second block for this second definition. So either Strachey made a mistake in the checkers program or he was working with a different version of CPL then. I decided to accept definitions anywhere, not just at the start of blocks. That means that "value of" and "result is" are no-ops; the result of a function is always the last expression. I'm not sure if that was intended for CPL, but this interpretation works for all the code in the checkers program.

Another thing I learned from the 1963 paper is that CPL has two kinds of variable names: small names, which are single lower-case letters (or Greek letters, I presume from the checkers program), and large names, which start with an upper-case letter and continue with any letters. Small names need not be delimited by spaces, and it seems that implicit multiplication is allowed, so "σs" is equivalent to "σ * s".

(2) For the parser generator, I settled on YAPPS2 (Yet Another Python Parsing System), mostly because the author of the system, Amit Patel, is a friend who sits nearby, so if I ran into trouble I knew I'd be able to walk down the hall and get help. (It turns out his system was very easy to use, and I didn't need any help. Good job, Amit!)

(3) CPL is a flexible mostly-functional language, so it seems that Lisp would be a good target language. But CPL uses infix notation; I'd have to get all the operator precedence right to generate Lisp code with the parens in the right place. So I decided to generate Python code instead, and to ignore the problem of operator precedence: I would translate the CPL code "(K - ψ ∧ K + Xk)" into the Python code "(K - psi & K + Xk)" and hope that the operator precedence just works out. But there is one problem in using Python: it has a strict separation between expressions and statements (statements cannot appear inside expressions), and CPL does not have this strict distinction. Therefore, I generate code in a subset of Python that uses only expressions. For example, given the CPL function:

LegalPositionsFrom(P) = value of
    § let L = RemainingPositionList(P,Capture,5)
      result is Null(L)→RemainingPositionList(P,NonCapture,5),L §
the most straightforward translation to Python is:
def LegalPositionsFrom(P):
    L = RemainingPositionList(P, Capture, 5)
    return (RemainingPositionList(P, NonCapture, 5) if Null(L) else L)
That would work fine for this function. But in general, a CPL block can be an expression embedded inside another expression. Python statements (such as "L = ..." and "return ...") can't appear inside expressions. So we won't use them; instead we will translate a CPL block into a Python lambda, which we will then immediately call, using the syntax "(lambda ...)()". No arguments are passed in this call to the lambda (because it is just a block of code); however, we can introduce new local variables within the block by defining optional arguments to the lambda expression, with their default values. So we have:
def LegalPositionsFrom(P):
    return (lambda L=RemainingPositionList(P, Capture, 5):
    (RemainingPositionList(P, NonCapture, 5) if Null(L) else L))()
Notice that there is one exception to the "everything is an expression" rule: every function definition is a statement of the form "def f(x): return expr".

The 1963 article also made it clear that CPL has a logical data type that is distinct from the integer data type. In languages like C and Python, the bit pattern 0b1111 is identical to the integer 15; they ae both of type int. But in CPL the two are different. I implemented the CPL logical data type with a Logical class in Python. The reason I need this is that the expression int(0) - int(1) should be -1 (because here the minus sign is ordinary integer subtraction), but Logical(0) - Logical(1) should be Logical(0) (because here the minus sign is the logical-bitset-difference operator). The Logical class has the usual set of operators (and, or, not, etc.) and has a constructor that takes as input either a bit pattern or a list of set bits. Thus, we can use either Logical(0b1111) or Logical([0,1,2,3]) to represent the logical bitset in which the four rightmost bits are true.

With those choices made, I was ready to write the grammar of CPL in the YAPPS formalism, with the translation into Python. The original code uses Greek letters and math symbols (§, ∨, ∧, →); I chose to use HTML entities for these (rather than Unicode). You can see the grammar in the file cpl.g.

The Rest of the Checkers Program

Scientific American limited Strachey in his page count, and so didn't show the whole program. He left out some of the basic primitives of CPL (such as Shift, which does a logical shift of a bit string) and some of the parts of the checkers program that didn't seem crucial (such as printing the board). You can see these missing parts in the file or as an annotated web page.

Debugging the Checkers Program

Here are the bugs/misunderstandings I discovered as I implemented the CPL parser and checkers program:
  1. CPL language: As discussed above, I had to modify the definition of a CPL block to allow mixed definitions and statements, not just definitions first followed by statements.
  2. Typo in article: The code in the article left out a right parenthesis in PartialPositionList (at the end of the third line); I added it.
  3. NIL: The documentation says that NIL is "the empty list". Actually NIL is used three ways: (1) as a placeholder whose value doesn't matter, (2) as the empty list, and (3) as the bit string with all zeros. The conflation of (2) and (3) was confusing to me. I suppose I could have modified my implementation of the primitive routines to accomodate the conflation, but instead I changed one use of NIL in the program to Zero, the all-zero bit string.
  4. Definition of Join and MinList. Strachey's documentation says that Join(L1,L2) produces "a single list formed from the members of L1 and L2." This would be like the Lisp function append. But in the program it is used with the first argument being a single element, not a list. I changed my definition of Join to allow that. Similarly, MinList(L1,L2...) is documented to produce "a single list formed from the members of several lists, and also leaves out null lists and repetitions". But actually it is not passed lists, but rather individual bit strings (or NIL). Again, I changed my function to reflect its actual usage in the program, not the documentation.
  5. Dealing with kings. I was impressed that with the minor tweaks mentioned above, the program worked perfectly for manipulating moves and jumps of regular pieces. But it didn't do as well with kings. I didn't understand how the expression (K - ψ ∧ K + Xk) was supposed to give the locations of all the kings on the board after a move. Maybe I still have a misunderstanding of what the bit string operators mean, and maybe the expression is in fact correct, but under my understanding a clear and correct expression is (K - ψ - Φ + Xk). This expression says to start with K, the bit string of all the original king positions; then remove ψ, the position of the captured piece (if any); remove Φ, the position where the move originated; and add Xk, which is the ending position of the move if either (a) the end position is in the last row or (b) the moved piece was a king at the start of the move; Xk is the Zero bitstring otherwise. I made this change to MakeMove and PCP. With this final change, the program passed my complete test suite.
Overall, this is a very impressive piece of software for 1966. Other than the trivial missing parenthesis (and I'm thinking maybe that was the fault of the Scientific American typesetter, not of Strachey), there is only one real error in the code, with the handling of kings. (And it is possible that Strachey's expression is in fact correct, but I have the wrong impression of what he meant.)

However, the correspondance between software and documentation is sloppy. The number of arguments on functions don't match; not all documented functions are implemented; there is confusion between what is a list and what is a string of bits. If only Strachey the analyst and Strachey the programmer had iterated a few times, the article would have been even more of a work of beauty!

Peter Norvig