The Invisible Language
Wolf Wejgaard
[Presented at the EuroFORTH conference 1996]
Computer science seems to have a problem with the programming language Forth. Forth is
not visible in the journals of the computer societies. Recent books about programming
languages do not mention Forth at all. Dictionaries of computing describe Forth poorly. I
believe that we can gain new insight into Forth (and CS) if we understand the reasons for
this peculiar situation.
For many years I have been looking for an answer to the question "What is Forth?" I know how a Forth system looks and how it behaves, or at least I can find this information in the ANSI Forth standard document [1]. But the standard does not tell me how Forth fits into the world of computer science. What are the properties of Forth that make this programming method so special?
Computer science does not answer my question, either. I have been scanning the journals of the computer societies for many years, and have not found any information about Forth. I have looked up four new books about programming languages, and none of them mentions Forth. A fifth book contains the word Forth, but only in a reference to another book. The sad fact is: the science of programming ignores the programming language Forth.
To be sure, this is a sad situation for computer science. It must be difficult to overlook a programming method that solves very practical problems in the real world and is the basis for two national (and de facto international) standards [1,2]. I assume that every respectable computer scientist does know about Forth, but for some interesting reason chooses to ignore it. Peter Knaggs [3] offers the explanation that Forth carries the image of a hacker's language. This is probably true, but it does not explain why Forth is ignored. If a hacker's language is really productive and contributes to international standards, I would expect this fact to raise the curiosity of a true computer scientist.
Forth differs from conventional programming methods in several respects, which makes it truly difficult to understand Forth from a conventional point of view. One of the most obvious differences is the method of generating target code, i.e. compilation. I have tried to understand the essence of this difference [4], and offer my findings here for discussion.
A conventional compiler transforms source code into target code in two general stages, analysis and synthesis. The analysis of the source code generates an intermediate code, which is used to synthesize (generate) the executable target code.
The analysis stage comprises three logical phases. Lexical analysis scans the source text and creates a symbol table. The keywords of the programming language are recognized because the compiler contains knowledge of the programming language (its grammar). This knowledge is also used in the following phases.
Syntax analysis checks the correctness of the program's syntax and transforms the program into a new representation, the syntax tree. Finally, semantic analysis checks the correct use of data types and, where necessary, adds type conversion operators to the syntax tree.
The output of the analysis is the intermediate code representation of the program. Several methods of representation exist; one of them is Reverse Polish notation.
The synthesis stage transforms the intermediate code into (optimized) executable code for the target processor.
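The two stages can be illustrated with a minimal sketch in Python. It is not taken from any real compiler. It assumes a toy source language of integer arithmetic with parentheses, uses the shunting-yard method to produce Reverse Polish intermediate code, and targets a hypothetical stack machine whose instruction names (PUSH, ADD, SUB, MUL, DIV) are invented for the illustration. Semantic analysis is omitted because the toy language has only one data type.

```python
import re

# The compiler's built-in knowledge of the language: operators and their precedence.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def lex(source):
    """Lexical analysis: split the text into number, operator and parenthesis tokens."""
    return re.findall(r"\d+|[-+*/()]", source)


def to_rpn(tokens):
    """Syntax analysis (shunting-yard): infix tokens -> Reverse Polish intermediate code."""
    output, stack = [], []
    for tok in tokens:
        if tok.isdigit():
            output.append(tok)
        elif tok in PRECEDENCE:
            # Emit pending operators of higher or equal precedence first (left associative).
            while stack and stack[-1] in PRECEDENCE and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]:
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the matching "("
    while stack:
        output.append(stack.pop())
    return output


def synthesize(rpn):
    """Synthesis: emit one instruction of the hypothetical stack machine per item."""
    opcodes = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}
    return [opcodes[item] if item in opcodes else f"PUSH {item}" for item in rpn]


rpn = to_rpn(lex("2 * (3 + 4)"))   # ['2', '3', '4', '+', '*']
target = synthesize(rpn)           # ['PUSH 2', 'PUSH 3', 'PUSH 4', 'ADD', 'MUL']
```

Note that all knowledge about operators and precedence sits in the compiler itself, in the table and in the three functions, and not in the program being compiled.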
How does Forth fit into this picture? Let us look at the Forth method of generating target code. The interpreter parses the source text, picks up a word and looks it up in the dictionary. The interpreter then executes or compiles the word, and creates the code.
The dictionary obviously corresponds to the symbol table, but where are the phases of lexical, syntactic and semantic analysis? The answer is simple and surprising: these phases are performed by the words themselves. Moreover, the words also perform the synthesis and compile their own code.
In principle every word does its own compiling action. In conventional Forth systems this may not be obvious. The interpreter performs the action for the normal colon and code words, and only asks the compiling words (i.e. the immediate and defining words) to act for themselves. However, in at least one modern Forth system [5] every word is in effect an object that responds to the messages 'interpret' and 'compile'. Thus the interpreter parses the text, looks up the word in the dictionary and invokes the interpreting or the compiling action of the word, depending on the value of the variable STATE.
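The idea that every word responds to 'interpret' and 'compile' can be sketched in a few lines of Python. This is only an illustration of the principle, not the implementation of Holon or of any other Forth system. The class names Word and VM, the threaded-list representation of compiled code, and the handling of numbers are all assumptions made for the example.

```python
class Word:
    """A dictionary entry that knows how to interpret and how to compile itself."""

    def __init__(self, name, interpret, compile_action=None):
        self.name = name
        self.interpret = interpret
        # An ordinary word compiles itself by appending its action to the new definition.
        self.compile = compile_action or self.compile_call

    def compile_call(self, vm):
        vm.current.append(self.interpret)


def colon(vm):
    """Defining word ':' reads the next name and switches STATE to compiling."""
    vm.new_name = next(vm.tokens)
    vm.current = []
    vm.state = True


def semicolon(vm):
    """Compiling word ';' finishes the definition and switches STATE back."""
    body = vm.current

    def run(vm):
        for action in body:
            action(vm)

    vm.add(Word(vm.new_name, run))
    vm.current = None
    vm.state = False


def compile_only(vm):
    raise RuntimeError("this word may only be used while compiling")


class VM:
    def __init__(self):
        self.stack = []         # data stack
        self.words = {}         # the dictionary
        self.state = False      # STATE: False = interpreting, True = compiling
        self.current = None     # body of the definition under construction
        self.tokens = iter(())
        self.add(Word("+", lambda vm: vm.stack.append(vm.stack.pop() + vm.stack.pop())))
        self.add(Word("*", lambda vm: vm.stack.append(vm.stack.pop() * vm.stack.pop())))
        self.add(Word("dup", lambda vm: vm.stack.append(vm.stack[-1])))
        self.add(Word(":", colon))
        self.add(Word(";", compile_only, semicolon))

    def add(self, word):
        self.words[word.name] = word

    def evaluate(self, text):
        """The whole interpreter: parse a word, look it up, let it act for itself."""
        self.tokens = iter(text.split())
        for token in self.tokens:
            word = self.words.get(token)
            if word is None:
                # Assumption: anything not in the dictionary is an integer literal.
                n = int(token)
                word = Word(token, lambda vm, n=n: vm.stack.append(n))
            if self.state:
                word.compile(self)
            else:
                word.interpret(self)


vm = VM()
vm.evaluate(": square dup * ; 7 square 1 +")
# vm.stack is now [50]
```

The interpreter loop contains no grammar. The words ':' and ';' are ordinary dictionary entries, so a program could add further compiling words in the same way.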
With the simple device of "letting every word mind its own business" Forth eliminates the complexity of a conventional compiler. If my speculation is true, it could explain why computer scientists hesitate to look deeper into Forth. Can you imagine a professor at the end of a course on compilers, saying: "And now we shall have a look at a programming language that doesn't need all the complicated stuff we have been discussing"?
[Forth source code is written in Reverse Polish notation, which can be transformed directly into executable code. Thus the formal stages of lexical and structural analysis are simply not needed, or rather have already been done. The Forth programmer performs these actions en passant when writing the code (and would be surprised to learn that he or she is doing such complicated things).]
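A small Python sketch may show what "transformed directly" means here. It assumes source text restricted to integers and four arithmetic operators, and a hypothetical stack machine with invented instruction names. Each token becomes exactly one instruction, in a single pass, with no precedence rules and no syntax tree.

```python
# Hypothetical stack-machine opcodes; the names are illustrative only.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def translate(source):
    """Translate whitespace-separated Reverse Polish text token by token."""
    code = []
    for token in source.split():
        if token in OPCODES:
            code.append(OPCODES[token])
        else:
            code.append(f"PUSH {int(token)}")  # anything else must be an integer
    return code


code = translate("3 4 + 2 *")   # ['PUSH 3', 'PUSH 4', 'ADD', 'PUSH 2', 'MUL']
```

The ordering of operations, which a conventional parser has to derive from the text, was fixed by the programmer when writing `3 4 + 2 *` instead of `(3 + 4) * 2`.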
Consider the regulation of traffic at a crossroad. One established solution consists in installing traffic lights and the corresponding controllers. A modern control system might include sensors for measuring the actual traffic, and can be quite complex. The driver of a vehicle does not have to think about the crossing. Stop, Wait, Go. The system is the authority; it contains all the knowledge and tells you what to do.
Now let us convert the crossing into a roundabout (traffic circle). Suddenly the old traffic control system is obsolete and the traffic controls itself. No traffic lights, no controllers, no sensors, no programs, no algorithms. With the simple decision of letting every driver mind his/her own business and treating them as responsible people, we have disposed of a lot of complexity. The knowledge is in the drivers, not in the crossroad. The drivers respond dynamically to the situation.
Forth is a roundabout of programming. You can add new rules, new compiling words and new data types to the system dynamically in the actual program. There is no fixed compiler authority watching over you. The program is the compiler.
Manufacturers of traffic control systems are not particularly interested in roundabouts.
[1] ANSI X3.215-1994, American National Standard for Information Systems - Programming Languages - Forth
[2] IEEE Std. 1275-1994, IEEE Standard for Boot Firmware: Core Requirements and Practices
[3] P. J. Knaggs, A Look at FORTH's Academic Standing, Proc. EuroFORTH, 1993
[4] A.V. Aho, R. Sethi and J.D. Ullman, Compilers: Principles, Techniques, and Tools, Addison-Wesley, 1986
[5] W. Wejgaard, Holon - A New Way of Forth, Proc. EuroFORTH, 1992, pp. 13-18