CSE 422S: Studio 7

Process Family Tree


There were many Bagginses and Boffins, and also many Tooks and Brandybucks; there were various Grubbs (relations of Bilbo Baggins' grandmother), and various Chubbs (connexions of his Took grandfather); and a selection of Burrowses, Bolgers, Bracegirdles, Brockhouses, Goodbodies, Hornblowers and Proudfoots. Some of these were only very distantly connected with Bilbo, and some of them had hardly ever been in Hobbiton before, as they lived in remote corners of the Shire.

The Fellowship of the Ring, Book 1, Chapter 1

Processes are one of the most important abstractions in any operating system. They are the basis for scheduling, memory management, accounting, and more. Even the kernel itself runs as a process, and spawns other processes to manage kernel-level services!

In this studio, you will:

  1. Write simple userspace programs to learn more about working with processes
  2. Learn how to do simple kernel module I/O
  3. Write a kernel module that explores the task_struct process data structure

Please complete the required exercises below, as well as any optional enrichment exercises that you wish to complete.

As you work through these exercises, please record your answers, and when finished email your results to eng-cse422s@email.wustl.edu with the phrase Process Family Tree in the subject line.

Make sure that the name of each person who worked on these exercises is listed in the first answer, and make sure you number each of your responses so it is easy to match your responses with each exercise.


Required Exercises

  1. As the answer to the first exercise, list the names of the people who worked together on this studio.

  2. Write a short program called simple_fork.c that uses the fork() function to spawn a child process from a parent process. The parent process should print a statement before the fork and print out the child's PID after the fork. The child process should print out a statement after it has been spawned that uses the getpid() and getppid() functions to print out its PID and its parent's PID. Compile and run your program on your Raspberry Pi, and show your program's output as the answer to this question.

  3. Write a second program called tree_fork.c that takes an integer as a command line parameter, and up to a fixed limit of 10 produces a "binary family tree" of processes with that many generations. This will create 2n - 1 processes, where n is the number of generations. For n=1 the program will create a single process, for n=5 it will create 31, and for n=10 it will create 1023 processes, etc. We limit the number of generations to be 10 at most, since larger numbers would take a long time to run, and could even freeze your Raspberry Pi!

    If the command line parameter is not provided the program should output a helpful usage message and exit. Otherwise, it should convert the command line parameter into an integer and store it in a generations variable.

    If the generations variable is less than 1 or greater than 10, the program should output a helpful usage message and exit. Otherwise, it (and any child processes it spawns) should do the following (with the original program being the first generation):

    (1) print out a line that says what generation it belongs to and its own PID;

    (2) increment the current generation;

    (3a) if the last generation has been reached (according to the generations variable) simply return 0 or (3b) otherwise spawn two child processes;

    (4) use wait(0) to wait for any successfully spawned child processes to complete; and then

    (5) return 0 if both were successfully spawned or -1 if either spawn (or both) failed.

    Hint: it is important to check the pid_t return values from each call to fork() (or whatever call you used to spawn each child process) carefully, since (1) these calls may fail as indicated by a negative return value, (2) 0 indicates that the child process is running, and (3) a positive number indicates that the parent process is running.

    Hint: based on this, it's straightforward to implement the logic for this exercise using a while loop with a couple pid_t variables (one for each child), if/else statements to handle the different cases for success/failure (and on success running in the parent/child process after the fork), and judicious use of the continue statement for code branches that are running in a child process.

    As the answer to this exercise, show output of your program when it is run with 4 generations (which should spawn a total of 15 processes if implemented correctly).

  4. Now we're going to return kernel module design using these ideas. First, from your linux source directory, find and copy the file samples/kobject/kobject-example.c into the directory where you keep your kernel module code. This is a kernel module that uses a feature called kobjects to provide an interface to exchange data between the kernel and userspace. Each data item is called an attribute, and for each attribute you provide a show and store function that is called when userspace reads and writes these values, respectively.

    This particular module provides three attributes: foo, baz, and bar. Once loaded, you can find them in the sysfs filesystem under the /sys/kernel/kobject-example/ directory.

    Modify this file so that a system log message is printed when any of foo, baz, or bar is updated, and the old value as well as the new value of each variable that was updated is shown in the message.

    Now we can build your modified module as we did in the previous studio. First, update your Makefile so it contains the appropriate .o file target for your new module, and then generate the .ko file for your new module, as follows:

    module add raspberry

    and then

    KERNEL=kernel7

    (or set KERNEL=kernel7 if that's what your shell requires) and then

    make -C ../linux/ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- SUBDIRS=$PWD modules

    Then, use sftp to copy the generated .ko file over to your Raspberry Pi, and load the module using sudo insmod.

    From a terminal with root permissions you can then read values out of these attributes with the cat command, and you can write values into these attributes with the echo command, as in:

    sudo bash

    and then

    cd /sys/kernel/kobject-example/

    and then

    cat foo

    which will show the current (original) value of the foo attribute and then

    echo 42 > foo

    which will write the value 42 into the attribute foo and then

    cat foo

    which will show the new value of attribute foo. Note that you must have a root terminal (which sudo bash gives you) to write into these commands (i.e. sudo echo doesn't work).

    As the answer to this exercise, show output that demonstrates your module successfully changing the values offoo, bar, and baz, using these commands.

  5. Now we're going to write a kernel module that reads a PID through the sysfs interface and prints that process' ancestry in the system log.

    Make a new kernel module file called family_reader.c that is based on your modified kobject-example.c file (i.e., to start with, please make a copy of it). This module should create a single system attribute (that's integer valued as in the previous exercise) under /sys/kernel/fam_reader/. When you write an integer to this attribute, your module should try to print out the PID's ancestry (i.e., its parent and then its parent's parent, etc. all the way up to the init task). There are a few steps involved:

    Side note: The modern Linux kernel makes a distinction between "real" and "virtual" PIDs for the benefit of migrating processes across different virtual hosts. The virtual PID is the PID that a process sees from userspace.

    1. You will need to convert your integer input into a proper kernel PID. Use the function find_vpid(), which returns a pid*. This function can fail, so be sure to check its return value before dereferencing the pointer.
    2. Next you can convert your pid* to a task_struct* by passing it into the function pid_task() along with the flag PIDTYPE_PID. This function can fail, so be sure to check its return value before dereferencing the pointer.
    3. Once you have a task_struct*, you can access any of the data it stores. In particular, the real_parent field stores a task_struct* pointer for the process that spawned it, and the comm field is a string that gives the command name. Note: there is a separate field called parent, which is not what we want for this exercise. The parent is the logical parent that shares process group signals and allows for waiting between parent and child.
    4. Work your way back up the family tree, printing out each task's PID and command name, all the way back to the init task with a PID of one.

    Compile your new kernel module as you did for the previous one, and then use sftp to copy the resulting .ko file over to your Raspberry Pi. On your Raspberry Pi, install the module, and then use sudo bash to give your terminal session root access. In the directory for that module (which will be under /sys/kernel/), use cat and echo to read and write the value of that module's attribute, and then use dmesg to confirm that the system log shows your module is working correctly.

    Use the ps command to find the PID of the sudo process that is running in your current terminal window, use echo to set the attribute of your module to that PID, and then use dmesg to see the ancestry of that process. As the answer to this exercise please show the messages from the system log that show the ancestry of the sudo process.


Things to turn in

Optional Enrichment Exercises

  1. The task struct includes a lot of interesting process data and process accounting. Try printing out other fields into the system log, and as the answer to this exercise please describe briefly what you printed out and show some of the system log messages from doing that.

  2. The file sched.c includes a lot of facilities for working with tasks, including the ability to modify specific tasks or iterate over every task in the system with macros such as do_each_process(). Try out some of those capabilities, and as the answer to this exercise please describe briefly what you did and what you observed when you did that.