CSE 542 Fall 2024


Lab 3: "That ends this strange eventful history"

Points: 100, weighted as 25% of semester grade

Submit: files uploaded to the assignment page for this lab in Canvas

Due: by 11:59pm Friday December 6, 2024


Purpose and Objectives

The purpose of this assignment is to give you experience using threads, atomic reference counted (Arc) pointers, mutexes, and input/output abstractions to manage information concurrently and across network connections. We will use those additional abstractions to augment and optimize performance of your lab 2 solution as a capstone lab exercise for this semester.

In this assignment you will work in a group of 1, 2, or 3 people (but not more) to refactor your Lab 2 solution into a concurrently executing form, and then extend it with additional features including networked interactions to obtain files over sockets from a common server endpoint. You will refactor your code from the previous lab assignment to (1) develop a client program that can conduct high-latency IO operations in parallel using multiple threads according to the fork-join concurrency model, (2) develop a multi-threaded server to support performance of multiple scripts in multiple client programs, and (3) expand the client program's IO model to use networked communication. It is again possible that your programs may encounter badly formed lines; in many cases a program should simply skip such a line (though it may print out a warning message saying that it did so) and proceed to subsequent steps.

Throughout this assignment, you will again work with input files in particular formats, which the programs you are writing will use as described below. For example, two client programs may be given two different script files, partial_hamlet_act_ii_script.txt and partial_macbeth_act_i_script.txt for the first parts of Act II from William Shakespeare's "Hamlet, Prince of Denmark" and Act I from William Shakespeare's "Macbeth" respectively (text obtained from the Literature Page web site at http://www.literaturepage.com/read/shakespeare_hamlet.html), with corresponding configuration files hamlet_ii_1a_config.txt, hamlet_ii_1b_config.txt, and hamlet_ii_2a_config.txt for the first script file and macbeth_i_1_config.txt, macbeth_i_2a_config.txt, and macbeth_i_2b_config.txt for the second.

As in the previous lab assignments, each scene fragment's configuration file gives the name of each character in the scene fragment along with their corresponding part file, e.g., Polonius_hamlet_ii_1a.txt and Reynaldo_hamlet_ii_1a.txt in hamlet_ii_1a_config.txt, Polonius_hamlet_ii_1b.txt and Ophelia_hamlet_ii_1b.txt in hamlet_ii_1b_config.txt, King_hamlet_ii_2a.txt and Queen_hamlet_ii_2a.txt and Rosencrantz_hamlet_ii_2a.txt and Guildenstern_hamlet_ii_2a.txt in hamlet_ii_2a_config.txt, FIRST_WITCH_macbeth_i_1.txt and SECOND_WITCH_macbeth_i_1.txt and THIRD_WITCH_macbeth_i_1.txt and ALL_macbeth_i_1.txt in macbeth_i_1_config.txt, MALCOLM_macbeth_i_2a.txt and DUNCAN_macbeth_i_2a.txt and SOLDIER_macbeth_i_2a.txt in macbeth_i_2a_config.txt, and LENNOX_macbeth_i_2b.txt and MALCOLM_macbeth_i_2b.txt and DUNCAN_macbeth_i_2b.txt and ROSS_macbeth_i_2b.txt in macbeth_i_2b_config.txt.


Compilation and Execution

For full credit on the assignment, the code for your lab solution must compile without warnings or errors with the Rust version 1.71.0 tools that are installed on the Linux Lab machines (and are accessed by adding module rust-2023) this semester, and must run correctly on those machines. You are free to use other platforms and compilers to develop your code, but before submitting your solution you should please make sure it also compiles without warnings or errors and runs correctly on the Linux Lab machines, which is where your solution will be graded.


Assignment

To complete this assignment please implement the following features:

  1. Log in to one of the Linux Lab machines, and in your directory for this course create a new Rust package for the client program for this assignment: e.g., cargo new lab3client and in that package's src directory (1) create a lab3 directory and then (2) copy over the .rs files from your Lab 2 solution (with the files from the previous assignment's lab2 directory going into this assignment's lab3 directory). Update the code (i.e., replacing lab2 with lab3 in the appropriate comment and code lines) for this new lab's directory structure.

    Compile and run your program and confirm that it still behaves as it did at the end of the previous lab assignment.

    Thread-safe output and data sharing

  2. To hedge against potential future evolutions of your code to involve multiple threads printing to the standard output stream at once, replace all uses of the println! macro with uses of the writeln! macro with its first argument being std::io::stdout().lock(), and process the Result returned by that macro to determine success or failure.

    Similarly, replace all uses of the eprintln! macro with uses of the writeln! macro with its first argument being std::io::stderr().lock(), and process the Result returned by that macro to determine success or failure.

    Compile and run your program and confirm that it still behaves as it did at the end of the previous lab assignment.
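
    As a minimal sketch of the replacement described in step 2 (the helper names emit_line and print_line are hypothetical and not part of the assignment), the locked handle can be passed to writeln! and the returned Result checked by the caller:

```rust
use std::io::{self, Write};

/// Writes one line to any `Write` sink and hands the Result back to the caller.
/// (`emit_line` is a hypothetical helper name used only for this sketch.)
fn emit_line<W: Write>(mut out: W, text: &str) -> io::Result<()> {
    writeln!(out, "{}", text)
}

/// Prints a line through a locked standard output handle, so the whole line
/// is written while this thread holds the lock. Returns false on failure.
fn print_line(text: &str) -> bool {
    match emit_line(io::stdout().lock(), text) {
        Ok(()) => true,
        Err(_) => false, // e.g., a closed pipe; the caller decides how to react
    }
}
```

    The same pattern should work for the standard error stream by passing std::io::stderr().lock() instead.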

  3. To allow mutable data to be shared safely across multiple threads, modify your Play struct so that its vector holds elements of type Arc<Mutex<SceneFragment>> instead of simply SceneFragment. In your Play struct's process_config method, instead of pushing just a new SceneFragment, push a new Arc initialized with a new Mutex initialized with a new SceneFragment into the vector.

    In your Play struct's prepare and recite methods, match on the result of a call to lock on the appropriate element of that vector, using a ref or mut pattern to extract an immutable or mutable SceneFragment reference and invoke methods using that reference (and for the calls to the enter and exit methods additional references obtained in a similar way).

    Compile and run your program and confirm that it still behaves as it did at the end of the previous lab assignment when given well formed input files.
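
    The locking pattern in step 3 might look roughly like the following sketch. The SceneFragment shown here is a simplified stand-in with assumed fields, and the method names add_fragment, prepare_all, and describe_all are hypothetical; your own structs and methods will differ.

```rust
use std::sync::{Arc, Mutex};

// Simplified stand-in for the lab's SceneFragment (fields are assumptions).
struct SceneFragment {
    title: String,
    prepared: bool,
}

impl SceneFragment {
    fn new(title: &str) -> Self {
        SceneFragment { title: title.to_string(), prepared: false }
    }
    fn prepare(&mut self) {
        self.prepared = true;
    }
    fn describe(&self) -> String {
        format!("{} (prepared: {})", self.title, self.prepared)
    }
}

struct Play {
    fragments: Vec<Arc<Mutex<SceneFragment>>>,
}

impl Play {
    fn new() -> Self {
        Play { fragments: Vec::new() }
    }

    // Push an Arc wrapping a Mutex wrapping the new fragment.
    fn add_fragment(&mut self, title: &str) {
        self.fragments.push(Arc::new(Mutex::new(SceneFragment::new(title))));
    }

    // Mutable access: a `ref mut` pattern borrows the guard mutably.
    fn prepare_all(&self) -> usize {
        let mut count = 0;
        for fragment in &self.fragments {
            match fragment.lock() {
                Ok(ref mut guard) => {
                    guard.prepare();
                    count += 1;
                }
                Err(_) => {} // poisoned mutex: this sketch just skips it
            }
        }
        count
    }

    // Immutable access: a `ref` pattern borrows the guard immutably.
    fn describe_all(&self) -> Vec<String> {
        let mut out = Vec::new();
        for fragment in &self.fragments {
            match fragment.lock() {
                Ok(ref guard) => out.push(guard.describe()),
                Err(_) => {}
            }
        }
        out
    }
}
```

    In each match arm the guard dereferences to the SceneFragment, and the lock is released when the guard goes out of scope at the end of the match.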

  4. Similarly, modify your SceneFragment struct so that its vector holds elements of type Arc<Mutex<Player>> instead of simply Player. In your SceneFragment struct's process_config method, instead of pushing just a new Player, push a new Arc initialized with a new Mutex initialized with a new Player into the vector.

    Add a function that compares two references to Arc<Mutex<Player>> and has a return type of std::cmp::Ordering. The function should match on calls to the lock method on each of its passed references, and should return std::cmp::Ordering::Equal if either of those returns an error. Otherwise it should match both results using ref patterns to obtain immutable references to the underlying Player structs, and match on the result of passing them into a call to Player::partial_cmp. If that call returns Some the function should return the value carried by that enum label, and otherwise it should return std::cmp::Ordering::Equal.

    Use the function in a call to sort_by to sort the vector (instead of simply calling sort as was done in the previous lab assignment).

    Update your SceneFragment struct's methods as needed, to match on the result of a call to lock on the appropriate element of that vector, using a ref or mut pattern to extract and use an immutable or mutable Player reference instead of calling methods directly on the vector's elements.

    Compile and run your program and confirm that it still behaves as it did at the end of the previous lab assignment when given well formed input files.
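
    One possible shape for the comparison function in step 4 is sketched below. The Player struct here is a stand-in with assumed fields and a derived ordering, and the names compare_players and sort_players are hypothetical.

```rust
use std::cmp::Ordering;
use std::sync::{Arc, Mutex};

// Stand-in Player with assumed fields; the derived ordering compares
// first_line and then name.
#[derive(PartialEq, PartialOrd)]
struct Player {
    first_line: usize,
    name: String,
}

// Compares two shared players, treating any lock failure as "equal".
// Caution: if both references pointed at the same Mutex, the second
// lock call would block forever.
fn compare_players(a: &Arc<Mutex<Player>>, b: &Arc<Mutex<Player>>) -> Ordering {
    match a.lock() {
        Err(_) => Ordering::Equal,
        Ok(ref first_guard) => match b.lock() {
            Err(_) => Ordering::Equal,
            Ok(ref second_guard) => {
                // Deref coercion turns each guard reference into &Player.
                let first: &Player = first_guard;
                let second: &Player = second_guard;
                match Player::partial_cmp(first, second) {
                    Some(ordering) => ordering,
                    None => Ordering::Equal,
                }
            }
        },
    }
}

// Sorts the shared players using the comparison function above.
fn sort_players(players: &mut Vec<Arc<Mutex<Player>>>) {
    players.sort_by(compare_players);
}
```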

    Multi-threaded file operations

  5. Modify your Play struct's process_config method so that instead of calling each SceneFragment struct's prepare method directly with each configuration file name, it spawns a thread that makes that call and stores the thread's handle in a collection.

    After spawning all those threads the Play struct's process_config method should match on a join with each of their handles, and should handle any error that is returned from the join as though it had been returned by calling the method directly.

    Modify the SceneFragment struct's prepare method so that if it would have returned an error it instead calls the panic! macro so that the thread panics and an error result is returned when the thread that spawned it joins with it.

    Compile and run your program and confirm that it still behaves as it did at the end of the previous lab assignment when given well formed input files. Then test your program with a modified script file that will induce file IO errors (e.g., one containing names of config files that do not exist) and make sure those errors are detected and handled by the program as they were in the previous lab assignment.
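
    The fork-join structure in step 5 can be sketched as follows, under simplifying assumptions: prepare_fragment is a hypothetical free function standing in for SceneFragment::prepare (in your code each thread would instead work through a clone of the fragment's Arc), and the String error type is chosen only for illustration.

```rust
use std::fs::File;
use std::thread;

// Hypothetical stand-in for SceneFragment::prepare: panics rather than
// returning an error, so that the spawning thread sees Err from join().
fn prepare_fragment(config_name: &str) {
    if let Err(e) = File::open(config_name) {
        panic!("could not open {}: {}", config_name, e);
    }
}

// Fork: spawn one thread per configuration file and keep each handle.
// Join: wait for every thread, turning any panic into an error result.
fn prepare_all(config_names: Vec<String>) -> Result<(), String> {
    let mut handles = Vec::new();
    for name in config_names {
        handles.push(thread::spawn(move || prepare_fragment(&name)));
    }

    let mut outcome = Ok(());
    for handle in handles {
        match handle.join() {
            Ok(()) => {}
            Err(_) => outcome = Err("a scene fragment failed to prepare".to_string()),
        }
    }
    outcome
}
```

    This sketch joins every handle even after a failure is seen, so no thread is left running when the function returns.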

  6. Similarly, modify your SceneFragment struct's process_config method so that instead of calling the Player struct's prepare method directly with each part file name, it spawns a thread that makes that call and stores the thread's handle in a collection.

    After spawning all those threads the method should join with each of their handles, and should unwrap the result of each of those joins so that if the thread being joined had a panic the current thread then also panics (thus propagating the panic upward).

    Modify the Player struct's prepare method so that if it would have returned an error it instead calls the panic! macro so that the thread panics and an error result is returned when the thread that spawned it joins with it.

    Compile and run your program and confirm that it still behaves as it did at the end of the previous lab assignment when given well formed input files. Then test your program with a modified script file that will induce file IO errors (e.g., with its config files containing names of player part files that do not exist) and make sure those errors are detected and handled by the program overall as they were in the previous lab assignment.

    Multi-threaded server

  7. Create another new Rust package for this assignment: e.g., cargo new lab3server and in that package's src directory create a lab3 directory containing empty mod.rs and server.rs files. Also copy over the return_wrapper.rs file from your lab 3 client package's lab3 directory into this one.

    Modify the main.rs file so that it declares a public module named lab3. Modify the mod.rs file in the lab3 directory so that it declares public modules named server and return_wrapper.

  8. In the server.rs file declare a Server struct with a listener member of type Option<TcpListener> and a listening_addr member of type String.

    Above that, declare a static CANCEL_FLAG variable of type AtomicBool that is initialized to false.

    Implement an associated new function for the Server struct, which initializes the Option field to be None and the String field to be empty.

    Implement an is_open method for the Server struct, which returns false if its Option field is None, and otherwise returns true.

    Implement an open method for the Server struct, which takes a string slice and calls TcpListener::bind with it. If that call is successful, the open method should store the listener that was returned by it in a Some label in the Server struct's Option field, and store a copy of the string slice in the struct's other field (generally speaking this second field is useful for error and debugging messages but isn't essential to the server's operation).
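
    A minimal sketch of the declarations in step 8 is shown below. The return type of open (std::io::Result<()>) and the helper cancel_requested are assumptions made for this illustration; the assignment does not prescribe them.

```rust
use std::net::TcpListener;
use std::sync::atomic::{AtomicBool, Ordering};

// Shared flag that later tells the server's accept loop to stop.
static CANCEL_FLAG: AtomicBool = AtomicBool::new(false);

// Hypothetical helper, included only to show how the flag is read.
fn cancel_requested() -> bool {
    CANCEL_FLAG.load(Ordering::SeqCst)
}

struct Server {
    listener: Option<TcpListener>,
    listening_addr: String,
}

impl Server {
    fn new() -> Self {
        Server { listener: None, listening_addr: String::new() }
    }

    fn is_open(&self) -> bool {
        self.listener.is_some()
    }

    // Binds to the given address; the fields change only if bind succeeds.
    fn open(&mut self, addr: &str) -> std::io::Result<()> {
        let listener = TcpListener::bind(addr)?;
        self.listener = Some(listener);
        self.listening_addr = addr.to_string();
        Ok(())
    }
}
```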

  9. Implement a run method for the Server struct, which stays in a loop as long as the static CANCEL_FLAG variable is false and the Server struct's Option field is not None.

    In each iteration of the loop, the run method should call the accept method of the listener stored in the struct's Option field, and then immediately should again check the static CANCEL_FLAG variable and if it is true should return; otherwise, if the accept call was successful the run method should spawn a child thread (e.g., using a move closure) to manage the newly accepted socket connection.

  10. The child thread should read in a text token from the accepted socket connection, and if the token is "quit" the thread should store the value true in the CANCEL_FLAG variable and return. Otherwise, the thread should treat the token as the name of a file and try to open that file for reading. If the file cannot be opened, the thread should shut down the connection and return. Otherwise, the thread should read in all the contents of the file and write them out over the connection and then return. For security purposes, your server may assume that the only files it should open and stream out over the connection will reside within the current directory in which the server is running, which will be true in all of the test cases I will run in evaluating your lab solution. That is, the server can check the token for any characters indicating a directory path or expansion of an environment variable (including / or \ or .. or $) and decline to try to open the file if any of those are found.
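
    Steps 9 and 10 could be sketched along the following lines. This version uses free functions (run, handle_connection, is_safe_file_name are hypothetical names) rather than Server methods, and it assumes the client terminates its token with a newline, which the assignment does not specify.

```rust
use std::fs;
use std::io::{BufRead, BufReader, Write};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

static CANCEL_FLAG: AtomicBool = AtomicBool::new(false);

// Rejects tokens with path or environment-variable characters.
fn is_safe_file_name(token: &str) -> bool {
    !token.is_empty()
        && !token.contains('/')
        && !token.contains('\\')
        && !token.contains("..")
        && !token.contains('$')
}

// Serves one connection (assumes a newline-terminated token).
fn handle_connection(mut stream: TcpStream) {
    let mut token = String::new();
    let read_result = match stream.try_clone() {
        Ok(clone) => BufReader::new(clone).read_line(&mut token),
        Err(e) => Err(e),
    };
    if read_result.is_err() {
        return;
    }
    let token = token.trim();
    if token == "quit" {
        CANCEL_FLAG.store(true, Ordering::SeqCst);
        return;
    }
    if is_safe_file_name(token) {
        if let Ok(contents) = fs::read(token) {
            // If the client has gone away there is nothing more to do here.
            let _ = stream.write_all(&contents);
        }
    }
    // Reached for unsafe names, unopenable files, and completed transfers.
    let _ = stream.shutdown(Shutdown::Both);
}

// Accept loop: re-checks the flag right after accept returns, because a
// "quit" request may have arrived while this thread was blocked.
fn run(listener: &TcpListener) {
    while !CANCEL_FLAG.load(Ordering::SeqCst) {
        let accepted = listener.accept();
        if CANCEL_FLAG.load(Ordering::SeqCst) {
            return;
        }
        if let Ok((stream, _peer)) = accepted {
            thread::spawn(move || handle_connection(stream));
        }
    }
}
```

    Because accept blocks until a connection arrives, a server written this way only notices the flag after one more connection is made, which is why the test client in step 12 connects a second time after sending "quit".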

  11. Modify the main function in the main.rs file so that it has a return type of ReturnWrapper. It should check that the command line arguments to the program have exactly two tokens, one with the program's name and one with a network address at which the server will listen and accept connections, and if not should print a usage message and return an error code.

    Otherwise, the main function should declare a variable initialized with Server::new(), pass the network address token into a call to its open method, and then call its run method.

  12. Create another new Rust package for a simple test client with which to validate the server's behavior: e.g., cargo new lab3testclient and in that package's src directory modify the main function in the main.rs file so that it checks that the command line arguments to the program have exactly three tokens, one with the program's name, one with a network address with which to connect to a server, and one with a token to send to the server. If the wrong number of command line arguments was given, the main function should print a usage message and return an error code.

    Otherwise, the main function should pass a copy of the network address into a call to TcpStream::connect, and if a connection was successfully established should send the token to the server over that connection. If the token was anything other than "quit", the test client should read lines of text from the connection and print each one out, until the connection is shut down by the server at which point the test client's main function should return Ok(()).

    If the token was "quit", the test client should instead declare a variable of type std::time::Duration that is initialized to a value of one second, pass that variable into a call to std::thread::sleep, call TcpStream::connect again with the same network address (to wake up the server out of the accept call) and then return Ok(()).
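
    The test client's exchange in step 12 might be sketched as below. The function name request_file is hypothetical, the lines are collected rather than printed so the sketch is easier to check, and a newline is assumed as the token terminator.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::net::TcpStream;
use std::thread;
use std::time::Duration;

// Sends a token to the server and returns the lines it sends back.
// For "quit", waits one second and reconnects to wake the server's accept.
fn request_file(addr: &str, token: &str) -> io::Result<Vec<String>> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(token.as_bytes())?;
    stream.write_all(b"\n")?; // assumed newline terminator

    if token == "quit" {
        let pause = Duration::from_secs(1);
        thread::sleep(pause);
        // The result is ignored: the server may already have stopped listening.
        let _wake = TcpStream::connect(addr);
        return Ok(Vec::new());
    }

    // Read until the server shuts down the connection (end of stream).
    let mut lines = Vec::new();
    for line in BufReader::new(stream).lines() {
        lines.push(line?);
    }
    Ok(lines)
}
```

    A main function built on this sketch would print each returned line and then return Ok(()).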

    From local to networked file IO

  13. Add a get_buffered_reader function to the script_gen.rs file in your lab3client package, which takes an immutable reference to a String and returns a Result with a newly initialized BufReader.

    The get_buffered_reader function should check whether the string that was passed to it begins with "net:" followed by a dotted decimal IPv4 address and another colon, followed by a port number and another colon, followed by a file name (e.g., "net:127.0.0.1:7777:partial_macbeth_act_i_script.txt"). If it does, the get_buffered_reader function should separate out a token containing just its dotted decimal address and port number (with a colon between them) and pass that token into a call to TcpStream::connect.

    If that call fails, the get_buffered_reader function should return an error, or otherwise if it succeeds should (1) separate out a token with the file name that follows the colon after the port number, and send it to the server over the connection, and then (2) use the connection handle to initialize and return a new BufReader.

    If the string passed into the get_buffered_reader function is not formatted in that way, the function should use it as a file name, call File::open with it, and use the file handle to initialize and return a new BufReader (or should return an error if one occurred during any of those steps).
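
    The parsing and dispatch described in step 13 can be sketched as follows. The helper split_net_name is hypothetical, the boxed Read trait object is just one assumed way to give both branches the same BufReader type, and the newline sent after the file name is an assumption about how the server delimits its token.

```rust
use std::fs::File;
use std::io::{self, BufReader, Read, Write};
use std::net::{Ipv4Addr, TcpStream};

// Splits "net:<address>:<port>:<file>" into ("<address>:<port>", "<file>").
// Returns None when the string does not have that form.
fn split_net_name(name: &str) -> Option<(String, String)> {
    let rest = name.strip_prefix("net:")?;
    let mut parts = rest.splitn(3, ':');
    let address = parts.next()?;
    let port = parts.next()?;
    let file = parts.next()?;
    if address.parse::<Ipv4Addr>().is_err()
        || port.parse::<u16>().is_err()
        || file.is_empty()
    {
        return None;
    }
    Some((format!("{}:{}", address, port), file.to_string()))
}

// Returns a reader over either a remote file or a local one.
fn get_buffered_reader(name: &String) -> io::Result<BufReader<Box<dyn Read>>> {
    match split_net_name(name) {
        Some((endpoint, file)) => {
            let mut stream = TcpStream::connect(endpoint.as_str())?;
            stream.write_all(file.as_bytes())?;
            stream.write_all(b"\n")?; // assumed newline terminator
            Ok(BufReader::new(Box::new(stream) as Box<dyn Read>))
        }
        None => {
            let file = File::open(name)?;
            Ok(BufReader::new(Box::new(file) as Box<dyn Read>))
        }
    }
}
```

    With this shape, callers such as grab_trimmed_file_lines can read lines from the returned BufReader without knowing whether the data came from a local file or a socket.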

  14. Modify the grab_trimmed_file_lines function in the script_gen.rs file so that instead of opening a file and wrapping the file handle in a BufReader, it instead calls the get_buffered_reader function and checks the result of that call.

    Testing

  15. To test your program, you should please make sure it compiles with no errors or warnings, and run it (and check the output to make sure it appears correct) with relevant script files, scene fragment configuration files, and part files, such as: partial_hamlet_act_ii_script.txt, hamlet_ii_1a_config.txt, hamlet_ii_1b_config.txt, hamlet_ii_2a_config.txt, Polonius_hamlet_ii_1a.txt, Reynaldo_hamlet_ii_1a.txt, Polonius_hamlet_ii_1b.txt, Ophelia_hamlet_ii_1b.txt, King_hamlet_ii_2a.txt, Queen_hamlet_ii_2a.txt, Rosencrantz_hamlet_ii_2a.txt, and Guildenstern_hamlet_ii_2a.txt.

  16. You are again encouraged to generate your own script file, configuration files and part files for segments of other plays, and to fuzz some or all of them by adding extra tokens, badly formed lines, badly formed tokens, extra whitespace, blank lines, etc., and also shuffling the lines in part files so that they are out of order and also selectively removing lines from some part files, to test your program's handling of key use cases across a wide range of possible inputs.

  17. You also should test your code's ability to manage both local and remote files at once, both by varying the format of the script file name passed to the lab3client program and by modifying some entries in the script files and configuration files so that they specify remote files as well (e.g., "net:127.0.0.1:7777:hamlet_ii_1a_config.txt" or "net:127.0.0.1:7777:DUNCAN_macbeth_i_2a.txt").

  18. You should also stress test your code by running multiple copies of your lab3client program at once, some with the same script file and some with different script files, and involving both local and remote files.

  19. In your ReadMe.txt file please add a subsection titled "Testing" and in it please summarize how you tested your solution, including any problems your testing detected, and how you addressed those.

What to Submit

ReadMe.txt

As you work on your program, please write down the design decisions you faced, and describe both the solutions you chose to address them and the rationale for the choices you made, in a ReadMe.txt file.

The first section of your ReadMe.txt file should include:

  1. the number of the lab (e.g., "CSE 542 Fall 2024 Lab 3")
  2. the names and e-mail addresses of all team members
  3. an overview of how your program was designed, and
  4. insights, observations and questions you encountered while completing the assignment.

The second section of your ReadMe.txt file should provide detailed instructions for how to:

  1. unzip or otherwise unpack your files,
  2. build your program(s), and
  3. run your program(s) on the CEC Linux Lab machines where your lab solutions will be evaluated.

Important: Your programs must be able to be unpacked, built, and run using only the instructions in your ReadMe.txt file and the 1.71.0 versions of rustc and cargo and other tools already present on the CEC Linux Lab machines.

The third section of your ReadMe.txt file should provide a reasonably detailed description of how you developed and tested your solution, including each stage of how you refactored and extended your Lab 2 solution to implement this lab assignment. Please also describe the kinds of script, configuration, and character part files you used and their formats (including well formed and badly formed content and local and remote file names to test how your program handled those variations), and any other scenarios that you tested that you consider important.

Electronic Submission

Please submit your solution by uploading (on the assignment page for this lab in Canvas):
  1. your source code files, organized in separate directories for your lab3client, lab3server, and lab3testclient packages;
  2. any example files you think useful (for example, the different configuration and character part files you used to test your solution);
  3. output from relevant significant tests of your programs; and
  4. your ReadMe.txt file.


Grading

Please treat this assignment as you would a commercial or research product that you are shipping to a large group of customers. Please take the time to test, document, and check your work to make sure all parts are shipped, are of high quality, and behave reasonably under a range of different operating conditions. Grading will focus both on the quality of your solution and on the quality of the product itself, as submitted. To ensure the best possible result (and accordingly the highest possible grade), please pay attention to the following issues:

Correct Compilation and Operation

Your program must compile and run correctly on the CEC Linux Lab machines using the source files you provide. Please make sure to fix all compilation errors and warnings before submitting your lab solution. Missing files will be handled with a 5 point deduction (possibly per file) if you need to supply a new file in order for your solution to build successfully.

Design Quality

Design decisions are largely yours to make, and as long as the design is concise, coherent, consistent, and addresses all design forces in the assignment, you will receive full credit. One key area to consider is whether each abstraction in your design does a single job, does that job completely, and collaborates appropriately with other abstractions. Minor deductions (1-3 points, but please be aware minor deductions can add up) will be made for abstractions that are unnecessarily large or have inappropriate inter-dependencies. Major deductions (5-15 points) will be made for larger problems like neglecting key requirements of the assignment, code that does not compile or that has a run-time error when it runs, etc.

Implementation Quality

We will take into account different approaches to implementation. Minor deductions will be made for things your program may get away with but are not good, like neglecting to check the return value from a function call. Major deductions will be made for problems that produce incorrect or extremely inefficient behavior of the program. The former kinds of errors should be eliminated during your coding and code review phases, and the latter kinds of errors should be eliminated during your testing phase, which should include running different combinations of configuration and character part files containing well formed and/or badly formed contents.

Coding Style

Please code clearly with both the reader of the code and the user of the program in mind. Please use consistent indentation and placement of braces, and comment your code thoroughly. Use whitespace liberally - it's free and it makes it a lot easier to read your code. When grading, tagged comments may be added to the code indicating areas where particular issues to mention were found, whether or not points are deducted. Except in extreme cases, only minor deductions will be made for each style issue (though again, those may add up).

Documentation

Please make sure you provide adequate instructions on how to build and run your program. Also, please make sure to document your solution. Minor or even major deductions will be made for inadequate explanation of how your solution does what it does, why you made key design choices, or how the user (or grader) can successfully build and run your program. Even if how you did something is obvious to you, please assume it is not obvious to the reader.

Missing Files

Missing files in the delivered software make it difficult or impossible to evaluate your solution. An automatic deduction of up to 5 points may be applied for each missing or corrupted file that is submitted later on request.

Late Submission

Labs received within 24 hours after the deadline will be graded with an automatic 10 point deduction. Labs received more than 24 hours after but within 48 hours of the deadline will be graded with an automatic 20 point deduction. Labs received more than 48 hours past the deadline may not be graded, except under extenuating circumstances. If you are running late completing the assignment, please let me know about the trouble as soon as you can (and it may be possible to give you a brief extension if you request it in advance of the deadline), and please turn in as much as you can before each deadline so at least some credit can be given for the work you have done.

Grading Issues to watch out for

As you develop your lab solution, please follow the programming guidelines for this course, and especially please avoid the following practices, which may result in deductions: