CSE 422S: Studio 10

Kernel Synchronization


"When it comes down to it, the problems with sharing data between threads are all due to the consequences of modifying data."

—Anthony Williams, C++ Concurrency in Action, Section 3.1, p. 34

Like many userspace programs, the Linux kernel runs concurrently with itself: multiple threads of execution may be running kernel code at the same time. Kernel code must therefore be careful not to create race conditions and other concurrency bugs by accessing shared data without protection.

In this studio, you will:

  1. Create a race condition in a kernel module
  2. Use kernel synchronization to resolve the race condition

Please complete the required exercises below, as well as any optional enrichment exercises that you wish to complete.

As you work through these exercises, please record your answers, and when finished email your results to eng-cse422s@email.wustl.edu with the phrase Kernel Synchronization in the subject line.

Make sure that the name of each person who worked on these exercises is listed in the first answer, and make sure you number each of your responses so it is easy to match your responses with each exercise.


Required Exercises

  1. As the answer to the first exercise, list the names of the people who worked together on this studio.

  2. Write code for a kernel module that creates one thread on each core of your Raspberry Pi, each of which simply calls the same function and returns (i.e., your threads should not do the periodic or repeating behavior from lab 1, though you can use your solution to lab 1 as a starting point for this exercise).

    As the answer to this exercise, describe briefly how you ensured that each thread that was created runs only on the processor to which it was assigned.
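    A minimal sketch of one way to pin a thread to each core using the kthread API (names such as thread_fn, studio10, and NUM_CORES are illustrative, not prescribed by this handout):

    ```c
    #include <linux/module.h>
    #include <linux/kthread.h>
    #include <linux/err.h>

    #define NUM_CORES 4  /* the Raspberry Pi has 4 cores; num_online_cpus() also works */

    static struct task_struct *threads[NUM_CORES];

    static int thread_fn(void *data)
    {
            /* each thread just calls the common function and returns */
            return 0;
    }

    static int __init studio10_init(void)
    {
            int cpu;

            for (cpu = 0; cpu < NUM_CORES; cpu++) {
                    /* create the thread in a stopped state, bind it to
                     * exactly one CPU, and only then let it run */
                    threads[cpu] = kthread_create(thread_fn, NULL,
                                                  "studio10-%d", cpu);
                    if (IS_ERR(threads[cpu]))
                            return PTR_ERR(threads[cpu]);
                    kthread_bind(threads[cpu], cpu);
                    wake_up_process(threads[cpu]);
            }
            return 0;
    }
    ```

    Creating the thread with kthread_create() (rather than kthread_run()) is what makes the binding reliable: the thread has not yet been scheduled when kthread_bind() fixes its CPU affinity.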

  3. Now we will create a data race in the kernel, so that we can later resolve it. Declare a global int variable named shared_data with the volatile qualifier, as in:

    volatile int shared_data = 0;

    The volatile qualifier tells the compiler that this variable will be modified outside of the current execution context, and has the effect of disabling certain optimizations.

    Within the (common) function that is run by each of your threads, write a for loop that increments the shared_data variable one million (1,000,000) times, as in:

    #define iters 1000000

    int i;
    for (i = 0; i < iters; i++) {
        shared_data++;
    }

    In your module's exit function, print out the value of shared_data to the system log. As the answer to this exercise, explain briefly (1) why the value that should be printed if there are no data races should be 4,000,000 and (2) why a data race might cause a different value to be printed (e.g., based on the single variable data race example from the kernel synchronization I slides).
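    As a hint for part (2): shared_data++ is a read-modify-write sequence, not a single atomic step, so two cores can interleave their accesses. A sketch of one lost-update interleaving (register names are illustrative):

    ```c
    /* Two threads each execute shared_data++ while shared_data == 5:
     *
     *   core 0                         core 1
     *   load  r0 <- shared_data (5)
     *                                  load  r1 <- shared_data (5)
     *   add   r0 <- r0 + 1      (6)
     *                                  add   r1 <- r1 + 1      (6)
     *   store shared_data <- r0 (6)
     *                                  store shared_data <- r1 (6)
     *
     * Two increments ran, but shared_data only advanced by one: an
     * update was lost. Over millions of iterations on four cores, many
     * updates can be lost, so the printed value is typically well
     * below 4,000,000.
     */
    ```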

  4. Compile your module, and then load and unload it three times on your Raspberry Pi, recording the value that was produced by each run. If you consistently saw 4,000,000, then you did not successfully create a race condition (make sure your variable is declared as volatile and that your threads really execute simultaneously on different cores).

    As the answer to this exercise report the values that were output by your module in each of those runs.

  5. Make a new copy of your kernel module and change the type of your global integer variable to atomic_t, a type defined in include/linux/types.h. This type should only be accessed through special mutators and accessors: initialize the variable using the function atomic_set(), increment it with the function atomic_add(), and access its value with the function atomic_read(). The prototypes for these functions are found in include/asm-generic/atomic.h.

    As the answer to this exercise please show your code that implements these features.
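    A sketch of the atomic_t changes (ITERS, thread_fn, and the init/exit names are illustrative, not prescribed by this handout):

    ```c
    #include <linux/module.h>
    #include <linux/atomic.h>

    #define ITERS 1000000

    static atomic_t shared_data;

    static int thread_fn(void *data)
    {
            int i;

            for (i = 0; i < ITERS; i++)
                    atomic_add(1, &shared_data);  /* atomic read-modify-write */
            return 0;
    }

    static int __init studio10_init(void)
    {
            atomic_set(&shared_data, 0);  /* initialize before starting threads */
            /* ... create and bind one thread per core, as before ... */
            return 0;
    }

    static void __exit studio10_exit(void)
    {
            pr_info("shared_data = %d\n", atomic_read(&shared_data));
    }
    ```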

  6. Compile your new module and on your Raspberry Pi load and unload it three times, and record the value that was output to the system log each time. If you saw any result other than 4,000,000 then you have not successfully resolved the race condition (make sure you used the correct type and only accessed it with the functions noted above).

    As the answer to this exercise please report the values that were output in each run.

  7. Have your threads print a statement to the system log just before they start their for loop and just after they finish it. The timestamps on those log messages serve as a crude measurement of how long your code takes to execute. Compile your module, then load and unload it on your Raspberry Pi. As the answer to this exercise, report how long it took for each of the threads to complete its loop.
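    One way to sketch this crude timing (smp_processor_id() and the message text are illustrative choices; the elapsed time comes from subtracting the bracketed timestamps that dmesg prints for the two messages):

    ```c
    #include <linux/smp.h>

    static int thread_fn(void *data)
    {
            int i;

            pr_info("studio10: cpu %d starting loop\n", smp_processor_id());
            for (i = 0; i < ITERS; i++)
                    atomic_add(1, &shared_data);
            pr_info("studio10: cpu %d finished loop\n", smp_processor_id());
            return 0;
    }
    ```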

  8. Make another copy of your original kernel module. This time, rather than modifying your global integer to be atomic, we will use a mutex to protect it. Use the macro DEFINE_MUTEX(mutex_name) to statically declare a global mutex variable, and then use the functions mutex_lock(&mutex_name) and mutex_unlock(&mutex_name) inside the loop within the thread function to protect access to the shared_data variable.

    As the answer to this exercise please show the code that implements these features.
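    A sketch of the mutex-protected version (the mutex name and ITERS are illustrative):

    ```c
    #include <linux/mutex.h>

    #define ITERS 1000000

    static volatile int shared_data = 0;
    static DEFINE_MUTEX(shared_data_mutex);  /* statically defined and initialized */

    static int thread_fn(void *data)
    {
            int i;

            for (i = 0; i < ITERS; i++) {
                    mutex_lock(&shared_data_mutex);    /* may sleep until acquired */
                    shared_data++;                     /* critical section */
                    mutex_unlock(&shared_data_mutex);
            }
            return 0;
    }
    ```

    Note that locking and unlocking inside the loop body, as the exercise specifies, means the mutex is acquired and released once per increment, which is what makes the timing comparison in the next exercise interesting.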

  9. Again, use kernel print statements as a crude timestamp for your thread functions. Compile your module, and load and unload it on your Raspberry Pi. As the answer to this exercise, explain briefly (1) how long your mutex-protected code took to complete compared to the atomic variable version, and (2) why it might be necessary in some cases to use a mutex instead of an atomic variable.

Things to turn in