CSE 422S: Studio 14

VFS Layer


All filesystems rely on the VFS to enable them not only to coexist, but also to interoperate.

— Robert Love, Linux Kernel Development, 3rd Edition, Chapter 13, pp. 261.

The virtual filesystem (VFS) layer allows a wide range of filesystems to be used within Linux, even if their implementation details vary significantly. Each filesystem is required to implement a common set of abstractions, which in turn allows Linux to handle them in a uniform manner. As a result, a process's view of a filesystem is standardized as well, which allows even kernel threads and other specialized processes to interact with different filesystems in a common and portable (at least within Linux) manner.

In this studio, you will:

  1. Write a simple kernel module that accesses the filesystem mounted on your Raspberry Pi, via a kernel thread's process descriptor (task_struct).
  2. Extend that kernel module to explore some of the VFS data structures, including directory entries for the current working directory and the root directory among others.
  3. Extend your kernel module further to do the same for a userspace task.

Please complete the required exercises below, as well as any optional enrichment exercises that you wish to complete.

As you work through these exercises, please record your answers, and when finished email your results to eng-cse422s@email.wustl.edu with the phrase VFS Layer in the subject line.

Make sure that the name of each person who worked on these exercises is listed in the first answer, and make sure you number each of your responses so it is easy to match your responses with each exercise.


Required Exercises

  1. As the answer to the first exercise, list the names of the people who worked together on this studio.

  2. Write a kernel module that spawns a single kernel thread. That thread should use the current macro to access its own process descriptor (struct task_struct declared in <linux/sched.h>) and print out the values (i.e., the addresses they contain) of three of its task_struct's fields to the system log: fs, files, and nsproxy.

    These fields give a process direct access into the virtual filesystem. Respectively, these fields are pointers to the process's filesystem structure (struct fs_struct, declared in <linux/fs_struct.h>), its open file table structure (struct files_struct, declared in <linux/fdtable.h>), and its namespace proxy structure (struct nsproxy, a newer struct declared in <linux/nsproxy.h> that wraps the pointer to the mnt_namespace struct described in the textbook).

    Compile your module and load it on your Raspberry Pi, examine the system log to see your module's output, and then unload it. As the answer to this exercise, please show the lines of the system log that contain your module's output, including the values of the three pointers.

  3. Modify your code so that the kernel thread uses the fs field of its process descriptor to access two fields of the process' filesystem structure (struct fs_struct, declared in <linux/fs_struct.h>): pwd and root.

    These fields are path structures (struct path, declared in <linux/path.h>) for the process' current working directory and the root directory, respectively. Each of these path structures contains two fields: mnt which points to a VFS mount structure (struct vfsmount, declared in <linux/mount.h>) and dentry which points to a directory entry structure (struct dentry, declared in <linux/dcache.h>).

    Modify your module so that its kernel thread prints out the values (the addresses they point to) of both of those path structures' mnt fields. If the values of those pointers differ, your module's kernel thread should also print out the pointer values found in the mnt_root and mnt_sb fields of the VFS mount structures to which they point.

    Compile your module and load it on your Raspberry Pi, examine the system log to see your module's output, and then unload it. As the answer to this exercise, please explain, based on your module's output to the system log, whether or not (and why or why not) you think the current working directory and the root directory are part of the same mounted filesystem.

  4. Modify your module so that its kernel thread prints out the values (the addresses of the locations they point to) of both of those path structures' dentry fields. If the values of those pointers differ, your module's kernel thread should also print out the strings in the d_iname fields of the directory entry structures to which they point.

    Compile your module and load it on your Raspberry Pi, examine the system log to see your module's output, and then unload it. As the answer to this exercise, please explain, based on your module's output to the system log, whether or not (and why or why not) you think the process' current working directory is the same as its root directory.

  5. Modify your module so that it takes an optional PID parameter. If a valid PID is passed in, the kernel thread should explore the VFS structures of the task with that PID instead of its own (which it reaches via the current macro). If no PID is given, the module's behavior should not change.

    You can refer back to Lab 1 for more information on how to pass in parameters to your kernel module and the Process Family Tree Studio for information on getting the proper struct task_struct.

    Compile your module and load it on your Raspberry Pi, passing in a PID that you know belongs to a userspace thread (for example, you could write your own program that simply spins indefinitely). Examine the system log to see your module's output, and then unload the module. As the answer to this exercise, please explain, based on your module's output to the system log, whether or not (and why or why not) you think the process' current working directory is the same as its root directory.

  6. Modify your module so that its kernel thread traverses the list of directory entries whose head is in the d_subdirs field of the directory entry structure to which the path struct for the root directory points. These are the directory entries for the files and directories within the root directory that are currently held in the kernel's directory entry cache.

    Special functions are needed to traverse Linux kernel data structures, as described in Chapter 6 of the LKD course textbook. Review the discussion and examples in that chapter, and use the appropriate functions in your module's kernel thread to obtain and print out the d_iname field of each directory entry in that list, to the system log.

    Compile your module and load it on your Raspberry Pi, examine the system log to see your module's output, and then unload it. As the answer to this exercise, please show the output from your module that contains the names of the entries in the root directory.

  7. Modify your module so that as its kernel thread traverses the list of directory entries in the d_subdirs field of the root directory entry, it checks the d_subdirs field and d_child field of each directory entry it visits: if the list in either of those fields is non-empty, the kernel thread should print out the d_iname field of each directory entry in that list.

    Compile your module and load it on your Raspberry Pi, examine the system log to see your module's output, and then unload it. As the answer to this exercise, based on the output you saw from this exercise (and from the previous one, and which list from which directory entry is being printed out in each case), please explain how the list in a directory entry's d_subdirs field differs from the list in its d_child field.


Things to turn in

Optional Enrichment Exercises

  1. Make a copy of your kernel module and modify that module's kernel thread so that it does a full (recursive) depth-first traversal of the mounted filesystem, starting at the root directory. When it reaches the directory entry structure for a non-directory file, it should simply print out a line to the system log with the file's name. When it reaches the directory entry structure for a directory, it should print out that directory's name and then recursively explore that directory before visiting any of the other directory entry structures within its parent directory. As the answer to this exercise, please show a fragment from the system log that demonstrates depth-first traversal of the filesystem.

  2. Make a copy of your kernel module and modify that module's kernel thread so that it does a full breadth-first traversal of the mounted filesystem, starting at the root directory. The system log messages for this version should print out all of the directory entries within the root directory, then all the directory entries for the first subdirectory of the root directory, then all the directory entries for the second subdirectory of the root directory, etc. As the answer to this exercise, please show a fragment from the system log that demonstrates breadth-first traversal of the filesystem.