Chapter 4: Grammars and Parsing


Grammars serve as a concise definition of how meaningful sentences in a language can be constructed and as a tool for diagnosing malformed sentences. The first test of a sentence's validity is typically its adherence to the language's grammar. Of course, it is possible to construct sentences in a natural language that are grammatically correct but still make no sense. In other words, a natural language's grammar captures a small but important aspect of a sentence's validity with respect to a language.

A compiler's front-end performs several steps to establish the validity of its input. An input stream is scanned for tokens as discussed in Chapter 3. Tokens defined using regular sets could be processed by scanners that were constructed automatically from the regular-set specifications. Just as regular sets guide the actions of an automatically constructed scanner, so also can the actions of the parsers described in Chapter 5 and Chapter 6 be guided by a grammar that specifies a programming language's syntax. Modern programming languages often contain a grammar in their specification as a guide to those who teach, study, or use the language. Based on the analysis discussed in this chapter, such grammars can also participate in the automatic construction of syntax-checking parsers. Programming language rules that are not easily expressed by grammars are enforced during the semantic analyses discussed in Chapter 7, Chapter 8, and Chapter 9.

In Chapter 2, we discuss the rudiments of context-free grammars (CFGs) and define a simple language using a CFG. Here, we formalize the definition and notation for CFGs and present algorithms that analyze such grammars in preparation for the parsing techniques covered in Chapter 5 and Chapter 6.