# The Chifir Virtual Machine

A simple project to change the mood while I was in the middle of grinding webdev stuff.

2022.7.4: Somehow I forgot to link the repository here. Here you go: https://git.sr.ht/~bctnry/chifir

Chifir is a virtual machine described in the paper "The Cuneiform Tablets of 2015" by Long Tien Nguyen and Alan Kay at Viewpoints Research Institute. In brief, Chifir is part of the "Cuneiform" system, which is designed to preserve programs for decades to come.

The virtual machine itself is extremely simple:

• Each word is 32 bits.
• A memory `M` of 2097152 words. (2097152 * 4 = 8388608 bytes = 8 MiB)
• A program counter PC with a length of 32 bits.
• A memory-mapped 512 * 684 black & white bitmap display:
  • Each pixel is represented as a single 32-bit word.
  • The display memory starts at address 1048576.
• Each instruction is made up of 4 words: one operator and three operands, named as `A`, `B` and `C` respectively.
• The instruction set is defined as follows:
1. PC ← M[A]
2. If M[B] = 0, then PC ← M[A]
3. M[A] ← PC
4. M[A] ← M[B]
5. M[A] ← M[M[B]]
6. M[M[B]] ← M[A]
7. M[A] ← M[B] + M[C]
8. M[A] ← M[B] - M[C]
9. M[A] ← M[B] × M[C]
10. M[A] ← M[B] ÷ M[C]
11. M[A] ← M[B] modulo M[C]
12. If M[B] < M[C], then M[A] ← 1, else M[A] ← 0
13. M[A] ← NOT(M[B] AND M[C])
14. Refresh the screen
15. Get one character from the keyboard and store it in M[A]
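The memory and display numbers in the list above can be sanity-checked with a little arithmetic. This is only a sketch: it assumes 512 is the width and 684 the height, and that pixels are stored row by row starting from the top-left corner. The row-major layout and the `pixel_address` helper are assumptions made for illustration; they are not stated in the list.

```python
MEMORY_WORDS = 2097152      # size of M, in 32-bit words
DISPLAY_BASE = 1048576      # first word of display memory
WIDTH, HEIGHT = 512, 684    # assumed to mean width x height


def pixel_address(x, y):
    """Address of pixel (x, y), assuming a row-major layout (an assumption)."""
    if not (0 <= x < WIDTH and 0 <= y < HEIGHT):
        raise ValueError("pixel out of range")
    return DISPLAY_BASE + y * WIDTH + x


DISPLAY_WORDS = WIDTH * HEIGHT              # one 32-bit word per pixel
DISPLAY_END = DISPLAY_BASE + DISPLAY_WORDS  # one past the last pixel
```

Under these assumptions the display takes 350208 words, starts exactly halfway through memory (2097152 / 2 = 1048576) and ends at address 1398784, well before the end of `M`.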

All operands are treated as unsigned 32-bit integers. When a result exceeds the 32-bit maximum, the high bits are discarded (i.e. the result is taken modulo 2^32). `PC` is increased by 4 after every instruction, except when instruction 1 or 2 assigns it directly.
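The whole instruction set fits in one small function. The following is a minimal sketch written from the description above, not the code from the repository. The names `step`, `read_key` and `refresh` are made up for illustration. Three behaviours are assumptions, because the description above does not cover them: instruction 3 stores the address of the current instruction, division by zero simply raises Python's `ZeroDivisionError`, and unknown opcodes raise an error.

```python
MASK = 0xFFFFFFFF  # every stored value wraps modulo 2**32


def step(M, pc, read_key=lambda: 0, refresh=lambda: None):
    """Execute the instruction at pc and return the new pc."""
    op, a, b, c = M[pc:pc + 4]
    if op == 1:                       # PC <- M[A]
        return M[a]
    if op == 2:                       # jump only if M[B] is zero
        return M[a] if M[b] == 0 else pc + 4
    if op == 3:
        M[a] = pc                     # assumption: address of this instruction
    elif op == 4:
        M[a] = M[b]
    elif op == 5:
        M[a] = M[M[b]]
    elif op == 6:
        M[M[b]] = M[a]
    elif op == 7:
        M[a] = (M[b] + M[c]) & MASK
    elif op == 8:
        M[a] = (M[b] - M[c]) & MASK
    elif op == 9:
        M[a] = (M[b] * M[c]) & MASK
    elif op == 10:
        M[a] = M[b] // M[c]           # assumption: ZeroDivisionError on 0
    elif op == 11:
        M[a] = M[b] % M[c]
    elif op == 12:
        M[a] = 1 if M[b] < M[c] else 0
    elif op == 13:
        M[a] = ~(M[b] & M[c]) & MASK  # NAND, kept to 32 bits
    elif op == 14:
        refresh()                     # hypothetical display callback
    elif op == 15:
        M[a] = read_key() & MASK      # hypothetical keyboard callback
    else:
        raise ValueError(f"unknown opcode {op}")  # assumption
    return pc + 4


# Tiny demo: wrapping add, NAND, then a conditional jump.
M = [0] * 64
M[0:4] = [7, 20, 21, 22]     # M[20] <- M[21] + M[22]
M[4:8] = [13, 23, 21, 21]    # M[23] <- NOT(M[21] AND M[21])
M[8:12] = [2, 24, 23, 0]     # if M[23] == 0 then PC <- M[24]
M[21], M[22], M[24] = 0xFFFFFFFF, 2, 40

pc = 0
for _ in range(3):
    pc = step(M, pc)
```

After the three steps, `M[20]` is 1 (0xFFFFFFFF + 2 wrapped modulo 2^32), `M[23]` is 0, and `pc` is 40 because the conditional jump was taken.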

The input is the most confusing bit. Because the system was originally intended to preserve Smalltalk-72, several ASCII codes are mapped to different symbols:

• `\n` is "Up" and `\r` is "Enter". I was not well-versed in the original Smalltalk-72 so this "Up" is mapped to the up arrow key and "Enter" the enter key. It can be easily modified into "Enter key = `\n`", "Enter key = `\r\n`" or whatever you like.
• 33 is the exclamation mark `!` but in Chifir it's the "Return" (a big upward arrow with no fill) symbol. It's the good old `return` construct that we all know. But in later Smalltalk `^` is used as the return symbol, and `^` is mapped to another symbol in Chifir, so I'm not sure which to use.
• 34 is the double quote `"` but in Chifir it's "Hand", which is like a quote in LISP, used to signify that "this is a symbol".
• 37 is the percent sign `%` but in Chifir it's the "Eye", which in Smalltalk-72 means "see", as in "do this when you saw this message".
• 38 is the ampersand sign `&` but in Chifir it's the `○` symbol (a white box symbol in the original Smalltalk-72 manual). It's used as the "bitwise logical operation prefix" symbol. e.g. `+` is the normal addition and `&+` is the bitwise OR, `*` is the normal multiplication and `&*` is the bitwise AND.
• 63 is the question mark `?` but in Chifir it's the "Right" symbol. It's the conditional statement (i.e. the `if-then-else` construct). The conditional statement in Smalltalk-72 has the form of `condition [Right] (then-clause) else-clause`; it's probably a direct influence from McCarthy60. It's used in method definitions as well. For example:
```
to box var | x y size tilt
(
[Eye]draw [Right] ([Smile] place x y turn tilt. square size.)
[Eye]undraw [Right] ([Smile] white. SELF draw. [Smile] black)
[Eye]turn [Right] (SELF undraw. [Hand]tilt ← tilt + :. SELF draw.)
[Eye]grow [Right] (SELF undraw. [Hand]size ← size + :. SELF draw.)
isnew [Right] ([Hand]x ← [Hand]y ← 256. [Hand]size ← 50.
[Hand]tilt ← 0. SELF draw)
)
```

Or, if you want an easier time reading it, the same thing in plain ASCII:

```
to box var | x y size tilt
(
%draw ? (@ place x y turn tilt. square size.)
%undraw ? (@ white. SELF draw. @ black)
%turn ? (SELF undraw. "tilt _ tilt + :. SELF draw.)
%grow ? (SELF undraw. "size _ size + :. SELF draw.)
isnew ? ("x _ "y _ 256. "size _ 50.
"tilt _ 0. SELF draw)
)
```

The `%draw`, `%undraw`, etc. are actually conditions; `isnew` is the condition "is a new object being created?", so the then-clause of this condition is effectively the constructor.

The "?" does look like asking about the name of the message (e.g. `%draw?` = "is the message `draw`?"). I have no idea if this is a coincidence or not.

• 64 is the at sign `@` but in Chifir it's the "Smile". The "Smile" represents the turtle object. Smalltalk-72 is very different from Smalltalk-80; it's more like "LOGO but OOP".
• 94 is the circumflex sign `^` but in Chifir it's the upward arrow `↑`. Now, the upward arrow does not exist in the Smalltalk-72 manual (only the "thicc" upward arrow, which is the `return` construct), it probably does not exist in Smalltalk-76 either. I thought it was the `super` construct (as in inheritance) but that does not exist in Smalltalk-72.
• 95 is the underscore sign `_` but in Chifir it's the left arrow `←`. It's mainly used to assign stuff (combined with "Hand") so in Smalltalk-72 it would be something like `[Hand] d ← 3` (or `"d _ 3` if we directly translate the left arrow as `_`).
• 96 is the backtick sign `` ` `` but in Chifir it's the unary minus, the same construct as the APL high minus symbol `¯`.
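The mapping in the list above is easy to write down as a table. Here is a small sketch that turns the plain-ASCII spelling of a Smalltalk-72 snippet back into the bracketed notation used in the first listing. The `GLYPHS` table and `to_glyphs` function are made-up names. The `[Return]` spelling is invented here to match the other bracketed names, and rendering the backtick as `¯` just borrows the APL comparison. The two key codes (`\n` as "Up", `\r` as "Enter") are left out because they are keys rather than printed symbols.

```python
# ASCII character -> how it is written in the bracketed notation.
GLYPHS = {
    "!": "[Return]",   # hypothetical spelling of the "Return" arrow
    '"': "[Hand]",
    "%": "[Eye]",
    "&": "○",          # bitwise-operation prefix
    "?": "[Right]",
    "@": "[Smile]",
    "^": "↑",
    "_": "←",
    "`": "¯",          # unary minus, written like the APL high minus
}

_TABLE = str.maketrans(GLYPHS)


def to_glyphs(source):
    """Rewrite plain-ASCII Smalltalk-72 text using the symbol names."""
    return source.translate(_TABLE)


line = '%turn ? (SELF undraw. "tilt _ tilt + :. SELF draw.)'
converted = to_glyphs(line)
```

Here `converted` is `[Eye]turn [Right] (SELF undraw. [Hand]tilt ← tilt + :. SELF draw.)`, which is the `turn` line of the `box` example.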

Some constructs in the original Smalltalk-72 are lacking here, e.g. the thicc colon (the normal colon gets the next value in the message evaluated, while the thicc colon gets the next literal token), the keyhole (I haven't read enough of the manual to know what it's for, but it's probably some kind of inspection utility), and the `'s` symbol (the subscript construct, the `.` in `A.B`; but that can be unified with methods, e.g. à la Io, so it shouldn't be a problem).

Yeah, if you want to preserve programs you can definitely do a lot better than this...

(BTW I haven't heard about Project Oberon being treated as a permacomputing-related project & with Project Oberon you got a whole computing stack as well.)

• You need to install PySDL2.
• Took me about 3 hours? I lost count. Half of the 3 hours was spent on testing things out. The "afternoon hack" part is definitely true.
• The `HOTAREA` part is there because I don't want to refresh the whole screen every time instruction 14 is executed.
• The file reading part is not tested.
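The `HOTAREA` code itself isn't shown here, so the following is only a guess at the general technique, usually called a dirty rectangle: remember the bounding box of every pixel written since the last refresh, and redraw just that box when instruction 14 runs. `HotArea`, `touch` and `take` are hypothetical names, not taken from the repository.

```python
class HotArea:
    """Bounding box of the pixels written since the last refresh."""

    def __init__(self):
        self.box = None  # (x0, y0, x1, y1), inclusive; None means clean

    def touch(self, x, y):
        """Grow the box so that it also covers pixel (x, y)."""
        if self.box is None:
            self.box = (x, y, x, y)
        else:
            x0, y0, x1, y1 = self.box
            self.box = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))

    def take(self):
        """Return the box to redraw (or None) and mark the screen clean."""
        box, self.box = self.box, None
        return box


hot = HotArea()
hot.touch(10, 20)   # e.g. called on every write into display memory
hot.touch(3, 40)
first = hot.take()  # e.g. called when instruction 14 is executed
second = hot.take()
```

In this run `first` is `(3, 20, 10, 40)` and `second` is `None`, so a refresh with nothing written in between has nothing to redraw.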

## About programming in this VM

• `JMP` feels kinda weird. For an instruction at address `a`, the simplest way to do `JMP [some-address]` is `M[a] = 1; M[a+1] = a+2; M[a+2] = [some-address]`. `M[a+1]` is the `A` operand, so `PC ← M[A]` is really `PC ← M[M[a+1]]`, which means the target has to sit in a memory word of its own (here the otherwise unused `B` slot). Other common machines don't need this `M[a+1] = a+2` indirection.
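• Conditional branching feels even weirder, because you have to use instruction 12 and instruction 2 together. Instruction 2 jumps when the tested word is 0, so the result of the comparison ends up inverted: the sequence below stores `x < y` in `T` and then jumps to 1000 when `T` is 0, i.e. it does `IF NOT (x < y) THEN GOTO 1000`. For a real `IF x < y THEN GOTO 1000` you would have to flip `T` first (e.g. with another instruction 12 comparing `T` against a word holding 1) or jump over an unconditional `JMP`. The basic sequence looks like this: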
```
X:     x
Y:     y
T:     [ignored]
PC+0:  IF M[M[PC+2]] < M[M[PC+3]] THEN M[M[PC+1]] ← 1 ELSE M[M[PC+1]] ← 0
PC+1:  T
PC+2:  X
PC+3:  Y
PC+4:  IF M[M[PC+6]] == 0 THEN PC = M[M[PC+5]]
PC+5:  PC+7
PC+6:  T
PC+7:  1000
```

where `X`, `Y` and `T` are addresses outside the range `PC+0` to `PC+7`. To be honest, this kind of indirect maneuvering is kinda killing me.
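The two encodings above can be wrapped in small helpers so the `a+2` and `PC+7` bookkeeping is done once. This is a sketch only: `jmp`, `branch_unless_less`, `run` and `demo` are made-up names, and `run` understands just the three opcodes used here (1, 2 and 12). Note that instruction 2 jumps when the tested word is 0, so this sequence takes the branch when `M[x] < M[y]` is false.

```python
def jmp(at, target):
    """Words for an unconditional jump stored at address `at`."""
    # Opcode 1 is PC <- M[A]; A points at the unused B slot holding the target.
    return [1, at + 2, target, 0]


def branch_unless_less(at, x, y, t, target):
    """Words for: M[t] <- (M[x] < M[y]); if M[t] == 0 then PC <- target."""
    return [12, t, x, y,            # M[t] <- 1 if M[x] < M[y] else 0
            2, at + 7, t, target]   # target lives in the unused C slot (at+7)


def run(M, pc, steps):
    """Run `steps` instructions (opcodes 1, 2, 12 only); return the final PC."""
    for _ in range(steps):
        op, a, b, c = M[pc:pc + 4]
        if op == 1:
            pc = M[a]
        elif op == 2:
            pc = M[a] if M[b] == 0 else pc + 4
        elif op == 12:
            M[a] = 1 if M[b] < M[c] else 0
            pc += 4
        else:
            raise ValueError(f"opcode {op} is not handled in this sketch")
    return pc


def demo(x_value, y_value):
    """Place the branch at address 100 and report where execution ends up."""
    M = [0] * 2048
    X, Y, T = 10, 11, 12            # data words outside the code
    M[X], M[Y] = x_value, y_value
    M[100:108] = branch_unless_less(100, X, Y, T, 1000)
    return run(M, 100, steps=2)
```

With these definitions `demo(3, 5)` returns 108 (3 < 5, so execution falls through to the next instruction) and `demo(7, 5)` returns 1000 (the comparison is false, so the jump is taken).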