Chapter 3: Scanning—Theory and Practice


Getting Started (5 minutes)

In each of the following sections, go as far as you can in the time allotted. When the time is up, move on to the next section so that you have a chance to work on every section while in studio.


  1. Open the JFlex Documentation page in a browser window.
  2. Open the following files as you'll be modifying them throughout the session.
  3. Follow the instructions for ant builds.
    This may be the first time you are asked to follow these instructions, which provide steps for configuring your project and workspace to allow the ant builds to work properly.

    Please follow the instructions carefully.

  4. Running the studio code should produce some output in the console window.
  5. Look at the console output. What do you see and how does it relate to the scanner specification and sample input files?

    Spend a few minutes discussing and documenting (in report.txt) your observations.

  6. To simplify project navigation, maintenance, and appearance, any automatically generated files are placed in the autogen folder at the outermost level of your project. Any automatically generated Java source files are placed in the autogen src package.
    The ant builds should automatically refresh your workspace so that you can see these files.

    Take a look at the Yylex source file in the autogen package. This file was automatically generated by JFlex based on the scanner specification in studio3.jflex.

    Try to find portions of the Yylex class that correspond to patterns and code that are supplied in the studio3.jflex specification.

  7. If you have time, glance at the build.xml file. Discuss and document how it controls the build process you witnessed.

Studio Exercises

For each of the following exercises, add appropriate patterns to the studio3.jflex file and have the action call found to respond to the pattern. Also, add whatever strings you need to the TestFiles/input file so that you can verify that your pattern works as you expect. Be sure to add some counterexamples, to make certain your pattern catches only what you expect.

As much as possible, accumulate the patterns, examples, and counterexamples as you move from one exercise to the next, so that you can continue to test that each pattern behaves as it should. In other words, add a new pattern to your studio3.jflex file for each exercise while keeping all the old patterns. At the same time, add to the TestFiles/input file so you can test that the patterns you specified work as you expect. Be sure to include examples and counterexamples of the patterns in the input file.

Also, note that the effects of the patterns you add may be unexpected and may cause you to change things a bit. Document and report what you learn and how you go about getting JFlex to do your bidding.

Part 1 (20 minutes)

  1. On seeing the string goodbye, cause the program to terminate (call System.exit(0))
  2. For each of your first names, have the scanner react or respond to your name in some interesting way.
  3. Look at (in the default package; you may have to refresh the workspace to see it) and find where the actions appear that you arranged for the above strings. Document your findings, and compare what you see to what you coded for actions in Lab 1. For input that none of your patterns match, you still see output in the console window. What is producing that output? Try modifying the action for unmatched input and see if it works.
  4. Find binary integers that are even (end with a 0)
  5. Find binary integers that have at least two 1s in them
  6. Find strings that begin with an h and continue as a binary integer with
    1. a 1, 2 digits from the end
    2. a 1, 3 digits from the end
    3. a 1, 5 digits from the end
    4. a 1, 10 digits from the end
    Report the effects of the above strings on the size of the FSA generated by JFlex.
  7. Using the alphabet {a,b,c}, find identifiers (strings) of length at least 2 that end with a letter other than the one with which they begin.
  8. Same as above, but extend the alphabet to {a,b,c,d,e}
  9. Find strings of decimal digits
  10. Same as above, but now distinguish
  11. Look at and report where you see the code using a GOTO table to realize the FSA.
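
Before committing a pattern to studio3.jflex, it can help to try it against your examples and counterexamples in plain Java. The sketch below is only an illustration: it uses java.util.regex rather than JFlex, the class and member names (Part1Sketch, EVEN, TWO_ONES, ENDS_DIFFERENT, oneAtKFromEnd, reachableStates) are made up for this example, and it reads "a 1, k digits from the end" as "the k-th digit counting from the end is 1". The last method models a deterministic automaton that remembers the most recent k bits and counts how many distinct memories it can reach, which suggests why the FSA grows quickly as k increases. The figure JFlex reports for your whole specification will differ from this count.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

public class Part1Sketch {
    // Binary integers that end with a 0.
    static final Pattern EVEN = Pattern.compile("[01]*0");

    // Binary integers that contain at least two 1s.
    static final Pattern TWO_ONES = Pattern.compile("[01]*1[01]*1[01]*");

    // Over {a,b,c}: length at least 2, last letter differs from the first.
    static final Pattern ENDS_DIFFERENT =
        Pattern.compile("a[abc]*[bc]|b[abc]*[ac]|c[abc]*[ab]");

    // An h followed by a binary integer whose k-th digit from the end is 1 (k >= 1).
    static Pattern oneAtKFromEnd(int k) {
        return Pattern.compile("h[01]*1[01]{" + (k - 1) + "}");
    }

    // Model a DFA whose state is the last k bits read (missing bits count as 0)
    // and return the number of states reachable from the all-zero state.
    static int reachableStates(int k) {
        int mask = (1 << k) - 1;
        Set<Integer> seen = new HashSet<>();
        ArrayDeque<Integer> work = new ArrayDeque<>();
        seen.add(0);
        work.add(0);
        while (!work.isEmpty()) {
            int state = work.remove();
            for (int bit = 0; bit <= 1; bit++) {
                int next = ((state << 1) | bit) & mask; // shift in the new bit
                if (seen.add(next)) {
                    work.add(next);
                }
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(EVEN.matcher("1010").matches());            // true
        System.out.println(EVEN.matcher("101").matches());             // false
        System.out.println(TWO_ONES.matcher("0100").matches());        // false
        System.out.println(ENDS_DIFFERENT.matcher("abcab").matches()); // true
        System.out.println(ENDS_DIFFERENT.matcher("aba").matches());   // false
        System.out.println(oneAtKFromEnd(2).matcher("h0110").matches()); // true
        for (int k : new int[] {2, 3, 5, 10}) {
            System.out.println(k + " -> " + reachableStates(k) + " states");
        }
    }
}
```

The counts printed for k = 2, 3, 5, and 10 are 4, 8, 32, and 1024, one state for each possible setting of the last k bits. Compare that trend with what JFlex reports as you add each of the h patterns.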

Part 2 (15 minutes)

  1. Come up with two simple regular expressions for the same language. Try each separately. What do you notice about the number of states in the FSA that results from each expression? What do you expect to happen for regular expressions that denote exactly the same set?
  2. Find comments using the regular expression
               /- a* -/
    where a can be any character at all. What do you expect this pattern to do? What kind of behavior do you see?
  3. Now find comments using the regular expression specified on page 12 (slide 24) of this part of the tutorial notes.
  4. Try to simplify the regular expression. What is the effect on the number of states produced by JFlex?
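
The behavior you are asked to predict for the first comment pattern can be previewed outside JFlex. The following is a rough sketch using java.util.regex, not a JFlex specification. NAIVE stands in for /- a* -/ with "any character" written as a dot, and TIGHT is one possible stricter expression chosen for this illustration. TIGHT is an assumption and is not necessarily the expression given in the tutorial notes. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CommentSketch {
    // "/-", then any characters, then "-/": the body may itself contain "-/".
    static final Pattern NAIVE = Pattern.compile("/-.*-/");

    // "/-", then a body that can never run past a closing "-/", then "-/".
    static final Pattern TIGHT = Pattern.compile("/-([^\\-]|-+[^\\-/])*-+/");

    // Collect every non-overlapping match of p in input, left to right.
    static List<String> findAll(Pattern p, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String line = "x /- one -/ y /- two -/ z";
        System.out.println(findAll(NAIVE, line)); // [/- one -/ y /- two -/]
        System.out.println(findAll(TIGHT, line)); // [/- one -/, /- two -/]
    }
}
```

With the permissive pattern, the text between the two comments is swallowed into a single match, because the longest possible match is preferred. JFlex also prefers the longest match, so a similar effect is worth looking for in your console output.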

Part 3 (20 minutes)

We next develop a sequence of expressions that build up components of a programming language such as Java. Develop patterns, examples, and counterexamples for the following:
  1. Java identifiers (begin with a letter, continue with letters or digits)
  2. The keyword if. Try interchanging the order of this pattern and the previous one in your JFlex file. What behavior do you see?
  3. Add 5 or so more keywords and test that they work for you.
  4. Now try some Java punctuation:
  5. Java (quoted) strings
  6. Comments of both varieties (// and /*... */)
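
JFlex picks the rule that matches the longest prefix of the remaining input and, when several rules match the same length, the rule that appears first in the specification. The sketch below imitates that policy in plain Java so you can predict what interchanging the identifier and keyword patterns will do. It is an illustration only: the class, the rule names IF and ID, and the helper methods are hypothetical, and java.util.regex stands in for the generated scanner. It relies on these two simple patterns matching greedily, which gives the longest match here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RuleOrderSketch {
    // Return the name of the rule that wins at the start of input:
    // longest match first, earliest rule on ties; null if nothing matches.
    static String classify(Map<String, Pattern> rules, String input) {
        String best = null;
        int bestLength = -1;
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            // Strictly greater, so an earlier rule keeps a tie.
            if (m.lookingAt() && m.end() > bestLength) {
                best = rule.getKey();
                bestLength = m.end();
            }
        }
        return best;
    }

    // Build the two rules in either order; LinkedHashMap preserves that order.
    static Map<String, Pattern> rules(boolean keywordFirst) {
        Pattern keyword = Pattern.compile("if");
        Pattern identifier = Pattern.compile("[A-Za-z][A-Za-z0-9]*");
        Map<String, Pattern> ordered = new LinkedHashMap<>();
        if (keywordFirst) {
            ordered.put("IF", keyword);
            ordered.put("ID", identifier);
        } else {
            ordered.put("ID", identifier);
            ordered.put("IF", keyword);
        }
        return ordered;
    }

    public static void main(String[] args) {
        System.out.println(classify(rules(true), "if"));    // IF
        System.out.println(classify(rules(false), "if"));   // ID
        System.out.println(classify(rules(true), "iffy"));  // ID
    }
}
```

When the identifier rule comes first, the keyword rule can never win, since every keyword is also an identifier of the same length. Check whether JFlex says anything about such a rule when it processes your specification.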

Part 4 (time permitting)

One or both of the following

What to submit

Finishing Up

Submit your work as directed by your instructor.

Copyright 2010 by Ron K. Cytron
Last modified 14:32:04 CDT 13 August 2010