Chapter 3: Scanning—Theory and Practice


In this chapter, we discuss the theoretical and practical issues involved in building a scanner. For the purposes of crafting a compiler, the scanner's job is to translate an input stream of characters into a stream of tokens, each corresponding to a terminal symbol of a programming language.
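
To make the character-to-token translation concrete, the following is a minimal sketch in Python, not a definitive implementation. The token names (NUMBER, IDENT, ASSIGN, PLUS, SKIP), the `Token` type, and the `scan` function are assumptions chosen for illustration; they describe a toy language rather than any language defined in this chapter. The sketch leans on Python's `re` module, which tries alternatives in the order listed instead of always choosing the longest match, so the order of the entries matters.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # name of the terminal symbol, e.g. "IDENT"
    text: str  # the characters that were matched


# Hypothetical token classes for a toy language: (name, pattern) pairs.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("PLUS", r"\+"),
    ("SKIP", r"\s+"),  # whitespace is recognized but produces no token
]

# Combine the patterns into one regular expression with named groups.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def scan(source: str) -> Iterator[Token]:
    """Translate a string of characters into a stream of tokens."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at offset {pos}"
            )
        pos = match.end()
        if match.lastgroup != "SKIP":
            yield Token(match.lastgroup, match.group())
```

Under these assumptions, `list(scan("x = y + 42"))` yields five tokens with kinds IDENT, ASSIGN, IDENT, PLUS, and NUMBER; the whitespace between them is consumed without producing a token.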

More generally, scanners perform specified actions triggered by an associated pattern of input characters. Techniques related to scanning are found in most software components that are tasked with identifying structure in their input. For example, the processing of network packets, the display of Web pages, and the interpretation of digital video and audio media require some form of scanning.