CSE 422S: Studio 4

Userspace Benchmarking

"I wish it need not have happened in my time," said Frodo.
"So do I," said Gandalf, "and so do all who live to see such times. But that is not for them to decide. All we have to decide is what to do with the time that is given us."

The Fellowship of the Ring, Book 1, Chapter 2

Benchmarking programs can give important insights into how they perform (including where potential performance bottlenecks may exist) under different conditions. In addition, benchmarking in userspace serves as an introduction to the concepts behind benchmarking in the kernel, which in combination allows us to measure the impact of the kernel on userspace performance and vice versa.

In this studio, you will:

  1. Benchmark programs using command line tools
  2. Benchmark programs using Linux's clock functions

Please complete the required exercises below.

As you work through these exercises, please record your answers, and when finished email your results to eng-cse422s@email.wustl.edu with the phrase Userspace Benchmarking in the subject line.

Make sure that the name of each person who worked on these exercises is listed in the first answer, and make sure you number each of your responses so it is easy to match your responses with each exercise.

Required Exercises

  1. As the answer to the first exercise, list the names of the people who worked together on this studio.

  2. First, we need some programs to benchmark. Please download a code package that includes four programs here. Unzip the package, build the programs with the Makefile that is provided in it, and run each program a few times. As the answer to this exercise, describe briefly what each program does.

  3. Coarse-grained benchmarking can be done directly from the command line, using a bash command named time. This is a keyword built into the bash shell itself, so its documentation is found under man bash. As the answer to this exercise, use that command to capture the timing of a few runs of each of the test programs and show the results of those runs.

  4. The time command outputs three different pieces of timing information. As the answer to this exercise please say what they are and briefly explain the differences among them.

  5. Now compare the results of the following two commands. As the answer to this exercise please describe and briefly explain what you observe about the relationships among the user and real timing information for these runs.

  6. Look at the code in sing.c and execute the following command. As the answer to this exercise, please describe and explain what you observed about the relationship between the user and sys timing information.

  7. Now we're going to switch to using the C API for Linux's clocks. First, we'll look at exactly what clocks are available and get some info about each one, with the function clock_getres. You can find the documentation for this function in the manual pages: man clock_getres. Warning: Internet versions of man pages may not be up to date. Use the version on your Raspberry Pi.

    Look through the clocks listed in the clock_getres man page. As the answer to this exercise, name a clock that would be well suited for userspace benchmarking (and explain briefly why), and then name a clock that would be poorly suited for userspace benchmarking (and explain briefly why).

  8. Next, use clock_getres to write a short program called getres.c that gives the resolutions for several different clock types. This function requires a structure called a timespec, which is also documented in the same man page and is the basic data structure used to report timing information from the kernel to userspace.

    As the answer to this exercise, copy and paste your program output (include at least one _COARSE clock type) and explain briefly what is meant by the resolution values that were output by your program.
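As a starting point for the exercise above, here is a minimal sketch of how clock_getres fills in a timespec. The helper name print_resolution is hypothetical (it is not part of the code package), and the sketch assumes a Linux system with glibc. A complete getres.c would call it from main for each clock of interest, including at least one _COARSE clock.

```c
#define _GNU_SOURCE            /* assumption: glibc on Linux */
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: print the resolution that clock_getres reports
 * for one clock. Returns 0 on success, -1 if the clock is unavailable. */
int print_resolution(const char *name, clockid_t id)
{
    struct timespec res;   /* tv_sec: whole seconds, tv_nsec: nanoseconds */

    if (clock_getres(id, &res) != 0) {
        perror(name);
        return -1;
    }
    printf("%-24s %lld s %ld ns\n", name, (long long)res.tv_sec, res.tv_nsec);
    return 0;
}
```

For example, print_resolution("CLOCK_MONOTONIC", CLOCK_MONOTONIC) prints one line for that clock. The values printed depend on your kernel and hardware, so no expected output is shown here.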

  9. Based on the descriptions of Linux timers, the HZ variable, and jiffies from chapter 11 in our textbook (the assigned readings for today), as the answer to this exercise please explain briefly what you think the difference is between CLOCK_MONOTONIC and CLOCK_MONOTONIC_COARSE, and why.

  10. Write a second short program that uses the function clock_gettime to figure out how long a call to clock_gettime takes. As the answer to this exercise, report this value, and describe briefly how you obtained it.
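One possible approach to the clock_gettime measurement (a sketch, not the only valid method) is to make many back-to-back calls between two timestamps and divide the elapsed time by the number of calls. The function names timespec_diff_ns and estimate_gettime_cost_ns are hypothetical, and the choice of clock and call count is left to you.

```c
#define _GNU_SOURCE            /* assumption: glibc on Linux */
#include <time.h>

/* Elapsed time from *start to *end, in nanoseconds. */
long long timespec_diff_ns(const struct timespec *start,
                           const struct timespec *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (long long)(end->tv_nsec - start->tv_nsec);
}

/* Estimate the average cost of one clock_gettime call on clock `id`
 * by timing `calls` consecutive calls. The measured interval also
 * covers roughly one bracketing call, which matters little when
 * `calls` is large. Returns -1.0 on bad input or clock failure. */
double estimate_gettime_cost_ns(clockid_t id, long calls)
{
    struct timespec start, end, scratch;

    if (calls <= 0 || clock_gettime(id, &start) != 0)
        return -1.0;
    for (long i = 0; i < calls; i++)
        clock_gettime(id, &scratch);
    if (clock_gettime(id, &end) != 0)
        return -1.0;

    return (double)timespec_diff_ns(&start, &end) / (double)calls;
}
```

Calling estimate_gettime_cost_ns(CLOCK_MONOTONIC, 1000000) returns an average in nanoseconds. Averaging over many calls hides the clock's own resolution limit, which a single pair of timestamps could not do.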

  11. Copy parallel_dense_mm.c into a new file called timed_parallel_dense_mm.c. First modify the code in the new file so that you time the critical computational loop with the CLOCK_MONOTONIC_RAW clock. Then modify the code again so that the program takes a second parameter (which defaults to 1) and executes the timed segment multiple times. Your program should output the minimum, mean, and maximum times over all timed iterations.

    Run your program for 100 iterations with matrix size 100. As the answer to this exercise, show the reported timing values, and based on the minimum, mean, and maximum timing values, say briefly what you think a common running time actually is and why.
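The min/mean/max bookkeeping for the last exercise can be sketched as follows. This is only an illustration under stated assumptions: the names timing_stats, time_iterations, and placeholder_work are hypothetical, and placeholder_work merely stands in for the critical loop in parallel_dense_mm.c, whose code is not reproduced here.

```c
#define _GNU_SOURCE            /* assumption: glibc on Linux */
#include <stddef.h>
#include <time.h>

/* Hypothetical container for the three reported values, in nanoseconds. */
struct timing_stats {
    double min_ns;
    double mean_ns;
    double max_ns;
};

/* Run work(arg) `iterations` times, timing each run separately with
 * CLOCK_MONOTONIC_RAW. Returns 0 on success, -1 on bad arguments or
 * if the clock cannot be read. */
int time_iterations(void (*work)(void *), void *arg, int iterations,
                    struct timing_stats *out)
{
    double sum = 0.0;

    if (work == NULL || out == NULL || iterations < 1)
        return -1;

    for (int i = 0; i < iterations; i++) {
        struct timespec start, end;
        double ns;

        if (clock_gettime(CLOCK_MONOTONIC_RAW, &start) != 0)
            return -1;
        work(arg);
        if (clock_gettime(CLOCK_MONOTONIC_RAW, &end) != 0)
            return -1;

        ns = (double)(end.tv_sec - start.tv_sec) * 1e9
           + (double)(end.tv_nsec - start.tv_nsec);

        if (i == 0 || ns < out->min_ns) out->min_ns = ns;
        if (i == 0 || ns > out->max_ns) out->max_ns = ns;
        sum += ns;
    }
    out->mean_ns = sum / iterations;
    return 0;
}

/* Placeholder workload (NOT the studio's matrix multiply): sums n terms.
 * `volatile` keeps the compiler from removing the loop. */
void placeholder_work(void *arg)
{
    int n = *(int *)arg;
    volatile double acc = 0.0;

    for (int i = 0; i < n; i++)
        acc += i * 0.5;
}
```

In your own program the timed region would be the matrix multiplication itself, and the iteration count would come from the second command line parameter. Timing each iteration separately, rather than the whole batch, is what makes the minimum and maximum meaningful.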

Things to turn in

In addition to the answers above, please submit: