Fall 2015 CSE 425 Lab 0: Translating Horn Clauses in C++

Due by Tuesday September 15, 2015 at 11:59 pm
(the deadline applies to our e-mail server's receipt of your submission, which must include a .zip file containing your solution)
Final grade percentage: 5 percent

In this lab you are allowed (and in fact encouraged) to work in teams of 2 or 3 people (but not more), though you also are allowed to work individually if you prefer.

Objective:

This lab is intended to give you experience with basic techniques for parsing expressions, using Horn clauses from the domain of logic programming as a specific example.

In this lab assignment, you will write a program that can (1) scan an input file to recognize tokens from it, and (2) parse the stream of tokens to recognize well-formed Horn clauses (in a Prolog-like format).

Assignment:

    Part I - Readings and Resources:

  1. The following readings in the course text books may be useful as reference material while working on this lab assignment:

  2. The following on-line resource also may be useful while working on this lab assignment:

    Part II - Program Design and Implementation:

    Note: the details of this assignment are intentionally somewhat under-specified, leaving you some room to choose what you think is the best way to implement them, as long as what you do is reasonable and you explain your design decisions in comments in the code and in your project report.

  3. Open up Visual Studio 2013 and create a new C++ Win32 Console Application project for your lab 0 assignment, e.g. named cse425_lab0.

    In the ReadMe.txt file, put the lab number and names of the people submitting the lab solution, and as you work on this lab please record your design, implementation, and testing approach in this file as your project report.

    Please use the default project settings. Specifically, your program must use precompiled headers, which puts the #include "stdafx.h" directive at the top of each of your .cpp files; that directive must be the first non-comment line in each of those files.

  4. Modify the main function's signature so that it looks like the standard (i.e., portable between Windows and Linux) main function entry point for C++:

    int main (int argc, char * argv[])

    The main function should first check whether or not the program was passed exactly two command line arguments (in addition to the program's name which is always in argv[0]): if so the program should return the result of calling the parsing function (described below) with the two command line arguments (C style strings); otherwise the program should print a helpful message indicating how the program should be run, e.g.:

    cout << "usage: " << argv[0] << " <input_file_name> <output_file_name>" << endl;

    Important: make sure that the main function uses aliasing correctly and is exception safe, e.g., it should not access locations in argv beyond the last valid location (as indicated by the value in argc), it should not allow an exception to propagate uncaught outside the main function, etc.
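    As a minimal sketch of the checks described above, the fragment below shows one way the argument count test and the exception guard could fit together. The names guarded_main, parse_file, and the error code values are hypothetical (the assignment does not fix them), returning a non-zero code after the usage message is an assumption, and the logic is written as a helper only so that it can be exercised on its own; in the lab it would be the body of main itself.

```cpp
// In the Visual Studio project, #include "stdafx.h" would come before these lines.
#include <exception>
#include <iostream>

// Hypothetical result codes; the assignment leaves the actual values up to you.
enum ResultCode { SUCCESS = 0, BAD_COMMAND_LINE = 1, UNEXPECTED_EXCEPTION = 2 };

// Stand-in for the parsing function described in the next step (assumed name).
int parse_file(const char* input_name, const char* output_name) {
    (void)input_name;
    (void)output_name;
    return SUCCESS;
}

// The logic that main would contain: check argc before touching argv[1] or argv[2],
// and do not let any exception escape.
int guarded_main(int argc, char* argv[]) {
    const int expected_argc = 3; // program name plus two file names
    try {
        if (argc != expected_argc) {
            // argv[0] is only read when argc says it is present.
            const char* program = (argc > 0 && argv[0] != nullptr) ? argv[0] : "cse425_lab0";
            std::cout << "usage: " << program
                      << " <input_file_name> <output_file_name>" << std::endl;
            return BAD_COMMAND_LINE;
        }
        return parse_file(argv[1], argv[2]);
    } catch (const std::exception& e) {
        std::cout << "error: " << e.what() << std::endl;
    } catch (...) {
        std::cout << "error: unknown exception" << std::endl;
    }
    return UNEXPECTED_EXCEPTION;
}
```

    With this arrangement, main reduces to a single statement that forwards argc and argv, and every path out of the program yields a defined return code.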

  5. The parsing function should take two strings (C style strings are fine, or if you prefer you can pass C++ style strings or references to C++ style strings) and use them as the names of input and output files respectively: the parsing function should construct an input file stream with the first one and an output file stream with the second one. If either of those streams fails to open correctly (for reading or writing respectively) the parsing function should print an error message and return a unique non-zero error code, both indicating the kind of error that occurred (for example if the file named in the first argument did not exist, and/or if the file named in the second argument existed already and was read-only).
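    The stream checks described above might look like the following sketch. The function name parse_file, the enumerator names, and the particular non-zero values are assumptions made for illustration; any distinct non-zero codes that you document in your report would serve equally well.

```cpp
// In the Visual Studio project, #include "stdafx.h" would come before these lines.
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical codes: one distinct non-zero value per kind of failure.
enum ParseResult { PARSE_OK = 0, INPUT_OPEN_FAILED = 3, OUTPUT_OPEN_FAILED = 4 };

int parse_file(const std::string& input_name, const std::string& output_name) {
    // Open the input first, so a missing input file never creates an output file.
    std::ifstream input(input_name.c_str());
    if (!input.is_open()) {
        std::cout << "error: could not open '" << input_name
                  << "' for reading" << std::endl;
        return INPUT_OPEN_FAILED;
    }

    std::ofstream output(output_name.c_str());
    if (!output.is_open()) {
        std::cout << "error: could not open '" << output_name
                  << "' for writing" << std::endl;
        return OUTPUT_OPEN_FAILED;
    }

    // Scanning and parsing of the token stream would follow here.
    return PARSE_OK;
}
```

    Opening the streams in this order is a design choice, not a requirement: it avoids truncating or creating the output file when the input file cannot be read.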

    Otherwise, the parsing function should construct an instance of the scanner class (described below) using the input file stream object, and repeatedly extract token objects (described below) from the scanner via its extraction operator (operator>>) until no more tokens remain, parsing the stream of tokens it extracts and identifying well formed Horn clauses within that stream according to the following context-free input grammar:

    hornclause -> head [SEPARATOR body]

    head -> predicate

    body -> predicate {AND predicate}

    predicate -> name LEFTPAREN [args] RIGHTPAREN

    name -> LABEL

    args -> symbol {COMMA symbol}

    symbol -> LABEL | NUMBER

    Where AND, COMMA, LABEL, LEFTPAREN, NUMBER, RIGHTPAREN, and SEPARATOR are all terminal tokens corresponding to the kinds of tokens recognized by the scanner (as described below); args, body, head, hornclause, name, predicate, and symbol are non-terminal symbols in the grammar; and ->, |, {}, and [] are (EBNF) metasymbols for (respectively) production, selection, zero or more repetitions, and zero or one occurrences of a sequence of symbols in the grammar, as we discussed in class.

    Whenever the parsing function detects a well formed Horn clause it should write it out (using the strings from the tokens from which it was recognized) on its own line to the output file, and start parsing for the next Horn clause starting with the next token it obtains from the scanner.

    Whenever the parsing function recognizes that the sequence of tokens it is currently parsing does not represent a well formed Horn clause it should print (to the standard output stream) an error message indicating where and how it failed (i.e., what kind of terminal token it had expected, what production it was in the midst of parsing, and what tokens it had seen so far for the current Horn clause it was attempting to parse); the parsing function should then start over with the next token from the scanner, and attempt to recognize a valid Horn clause from that point.
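    One common way to turn an EBNF grammar like the one above into code is recursive descent, with one function per non-terminal. The sketch below is a simplified recognizer under several assumptions: it reads tokens from a vector rather than from the scanner, it only reports whether a clause was recognized (it does not write output or print the error details the assignment asks for), and the names Tok, Kind, HornParser, peek, and accept are hypothetical.

```cpp
// In the Visual Studio project, #include "stdafx.h" would come before these lines.
#include <cstddef>
#include <string>
#include <vector>

enum Kind { AND, COMMA, LABEL, LEFTPAREN, NUMBER, RIGHTPAREN, SEPARATOR, UNKNOWN };

struct Tok {
    Kind kind;
    std::string text;
};

class HornParser {
public:
    explicit HornParser(const std::vector<Tok>& tokens) : tokens_(tokens), pos_(0) {}

    // hornclause -> head [SEPARATOR body], with head -> predicate
    bool hornclause() {
        if (!predicate()) return false;
        if (accept(SEPARATOR)) return body();
        return true;
    }

    // Index of the next unconsumed token.
    std::size_t position() const { return pos_; }

private:
    // body -> predicate {AND predicate}
    bool body() {
        if (!predicate()) return false;
        while (accept(AND)) {
            if (!predicate()) return false;
        }
        return true;
    }

    // predicate -> name LEFTPAREN [args] RIGHTPAREN, with name -> LABEL
    bool predicate() {
        if (!accept(LABEL) || !accept(LEFTPAREN)) return false;
        if (peek(LABEL) || peek(NUMBER)) { // args is optional
            if (!args()) return false;
        }
        return accept(RIGHTPAREN);
    }

    // args -> symbol {COMMA symbol}
    bool args() {
        if (!symbol()) return false;
        while (accept(COMMA)) {
            if (!symbol()) return false;
        }
        return true;
    }

    // symbol -> LABEL | NUMBER
    bool symbol() { return accept(LABEL) || accept(NUMBER); }

    // True when the next token exists and has the given kind.
    bool peek(Kind k) const { return pos_ < tokens_.size() && tokens_[pos_].kind == k; }

    // Consume the next token only if it has the given kind.
    bool accept(Kind k) {
        if (!peek(k)) return false;
        ++pos_;
        return true;
    }

    std::vector<Tok> tokens_; // copied so the parser never refers to a destroyed vector
    std::size_t pos_;
};
```

    Each place that returns false is where a full solution would record the expected token kind, the production being parsed, and the tokens seen so far, before resuming with the next token.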

  6. Declare (in a separate .h file) and define (in a separate .cpp file) a token struct type that has: a public enumerated type for the kind of token, which has the enumeration labels AND, COMMA, LABEL, LEFTPAREN, NUMBER, RIGHTPAREN, SEPARATOR, and UNKNOWN; a public member variable of the enumerated type; a public member variable of type (C++ style) string; and a public constructor that sets the first public member variable's value to UNKNOWN but leaves the second member variable as an empty string.
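    For illustration, a token type matching the description in step 6 could be sketched as follows. The member names kind_type, kind, and text are hypothetical, and the declaration and definition are shown together here although the assignment asks for them to be split between a .h file and a .cpp file.

```cpp
// In the Visual Studio project, #include "stdafx.h" would come first in the .cpp file.
#include <string>

// Declaration (would live in the header file).
struct token {
    enum kind_type { AND, COMMA, LABEL, LEFTPAREN, NUMBER, RIGHTPAREN, SEPARATOR, UNKNOWN };

    kind_type kind;   // what kind of token this is
    std::string text; // the string from which the token was recognized

    token();
};

// Definition (would live in the source file).
// The string member is default-constructed, so it starts out empty.
token::token() : kind(UNKNOWN) {}
```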
  7. Declare (in a separate .h file) and define (in a separate .cpp file) a scanner class that has: a reference to an input file stream as a member variable; a public constructor that takes a reference to an input file stream and uses it to initialize the member variable; a public type conversion operator to bool that returns true as long as the input file stream its member variable references remains valid for reading, and otherwise returns false; and an extraction operator (operator>>) that takes a reference to a token object and returns a reference to the scanner class object on which the extraction operator was invoked.

    Each time the scanner object's extraction operator is invoked, it should try to extract a (C++ style) string from the input file stream referenced by its member variable, and if it is successful should assign the string to the passed token object's string member variable, and then set the passed token object's kind (enumerated type) member variable as follows:

    If the string consists entirely of lowercase or uppercase alphabetic characters (in 'a' to 'z' or 'A' to 'Z') the token is a LABEL token.

    If the string consists entirely of decimal digit characters (in '0' to '9') the token is a NUMBER token.

    The token is an AND token if the string is "^"; a SEPARATOR if the string is ":-"; a LEFTPAREN token if the string is "("; a RIGHTPAREN token if the string is ")"; or a COMMA token if the string is ",".

    Otherwise the token is an UNKNOWN token.
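    The classification rules above can be sketched as a small helper that the extraction operator calls. The names classify, is_letter, is_digit, and input_ are hypothetical, the token type is repeated in compact form so that the fragment stands alone, and treating an empty string as UNKNOWN is an assumption. Note that whitespace-delimited string extraction reads a run such as p(a) as a single string, which these rules would classify as UNKNOWN, so this sketch assumes the tokens in the input file are separated by whitespace.

```cpp
// In the Visual Studio project, #include "stdafx.h" would come before these lines.
#include <cstddef>
#include <fstream>
#include <string>

struct token {
    enum kind_type { AND, COMMA, LABEL, LEFTPAREN, NUMBER, RIGHTPAREN, SEPARATOR, UNKNOWN };
    kind_type kind;
    std::string text;
    token() : kind(UNKNOWN) {}
};

// Explicit range checks follow the character ranges given in the assignment.
bool is_letter(char c) { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); }
bool is_digit(char c) { return c >= '0' && c <= '9'; }

token::kind_type classify(const std::string& s) {
    if (s == "^") return token::AND;
    if (s == ":-") return token::SEPARATOR;
    if (s == "(") return token::LEFTPAREN;
    if (s == ")") return token::RIGHTPAREN;
    if (s == ",") return token::COMMA;

    bool letters = !s.empty(); // an empty string is neither a LABEL nor a NUMBER
    bool digits = !s.empty();
    for (std::size_t i = 0; i < s.size(); ++i) {
        letters = letters && is_letter(s[i]);
        digits = digits && is_digit(s[i]);
    }
    if (letters) return token::LABEL;
    if (digits) return token::NUMBER;
    return token::UNKNOWN;
}

class scanner {
public:
    explicit scanner(std::ifstream& input) : input_(input) {}

    // True while the referenced stream has not failed.
    operator bool() const { return !input_.fail(); }

    // On a successful extraction, fill in the token; otherwise leave it untouched.
    scanner& operator>>(token& t) {
        std::string s;
        if (input_ >> s) {
            t.text = s;
            t.kind = classify(s);
        }
        return *this;
    }

private:
    std::ifstream& input_;
};
```

    Because the extraction operator returns the scanner and the scanner converts to bool, a loop of the form while (s >> t), where s is a scanner and t is a token, ends once the stream can supply no further strings.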

  8. Build and run your program, and put it through a series of trials that test it with good coverage of cases involving files containing both well formed and badly formed Horn clauses, with names of existing and missing files, etc. in as many combinations as you can think of.

    In your project report please document which cases you ran, summarize what your program did and whether or not that was correct behavior (and why or why not), in each case.

    Please make sure that your code compiles and runs correctly in a Visual Studio 2013 C++ Win32 Console Application project with the default settings, on the Windows machines in the studio/lab where we have our class. This is especially important if you developed your code on another platform or your own machine, since there may be differences between compiler and development environment installations (an identical software image is installed on all of the machines in the class studio/lab environment).

  9. Prepare a .zip file that contains all of your project's header and source code files, input/output traces for the important cases you tested, and your project report. Send the .zip file containing your lab 0 solution as an e-mail attachment to the course e-mail account (cse425@seas.wustl.edu) by the submission deadline for this assignment. If you need to make changes to your lab solution you are welcome to send a new .zip file, and we will grade the latest one received prior to the deadline (according to the time stamp the server puts on the e-mail).

    IMPORTANT: please make sure that no .exe file is included in any .zip file you send, as the WUSTL e-mail servers will block delivery of e-mail with a .zip attachment that includes a .exe file. Please also make sure to send a .zip file (not a .7z file) when you send your solutions.

    Part III - Extra Credit (1 to 5 percent of assignment's value depending on quality and completeness):

  10. Optionally, extend your solution so that (like the definition of the body) the head of a Horn clause may be either a conjunction of multiple predicates or a single predicate.

    Please add an extra credit section to your project report documenting your design and implementation of that additional capability, including showing how the input and grammar were extended to support that feature, and describing how you modified your implementation to parse those grammatical extensions. In that same section, please show examples of input and output from the different cases that you tested in order to validate that it is working correctly.

    Please submit both the required and extra credit portions of your program code and your project report together (rather than in a separate directory).


Posted 10:50am Monday August 31, 2015, by
Chris Gill