Linux Kernel Programming

Project 2025 – Block-sharing: Slice it, don’t waste it!

Authors

Julien Sopena, Sorbonne University

Jérôme Coquisart, RWTH

In this project, you will have to implement new features on top of an existing file system, ouiche_fs, in the form of a kernel module for Linux 6.5.7.

As usual, you are strongly encouraged to use a cross-reference tool to navigate the Linux kernel source code, be it integrated in your development environment or online like Bootlin’s Elixir.

You are also strongly encouraged to read the documentation available in Documentation/filesystems/vfs.rst or online to better understand how the virtual file system (VFS) layer works and how a file system is implemented.

Objectives of the project

In this project, you will design and implement a block-sharing file system that allows multiple small files to share the same block. This approach significantly reduces disk space wastage and improves performance when handling small files. Traditional file systems allocate dedicated blocks for each file, which can lead to inefficiencies and fragmentation. While this fragmentation is negligible for large files, it becomes a major issue for numerous small files, resulting in substantial space loss:

  • for a 100 MB file, the loss is at most 4 KB (block size), which represents a 0.004% space loss
  • for 100,000 files of 1 KB each, the loss is 100,000 * 3 KB ≈ 300 MB, i.e., a 300% space loss relative to the 100 MB of actual data

1 Project roadmap

The project is structured into a series of progressive steps, each focusing on a specific feature to implement. It is essential to complete these steps sequentially, as each builds upon the foundation laid by the previous one.

Important

Please read the assignment in its entirety before starting your implementation!!!

Having a clear understanding of all the requirements and tasks beforehand allows you to make more informed design decisions, reducing the need for frequent revisions and ensuring a smoother development process.

1.1 Mounting and testing a vanilla OuicheFS

This project is based on the OuicheFS file system that you need to download from GitHub at the following address: https://github.com/rgouicem/ouichefs. Be careful to use the branch named v6.5.7!

Alongside the source code, you will find comprehensive documentation detailing the design of this simple file system, including the relationships between various kernel structures and their interactions. Feel free to explore the source code, which is well-documented, and experiment with the vanilla file system in your virtual machine. This hands-on approach will help you understand its functionality, identify its limitations, …

After downloading the source and building the kernel module, you will need to format a partition with OuicheFS and mount it. You can then test the file system by creating files, writing data, and reading it back.

1.2 Reimplementation of the read and the write functions

The first step of your project is to reimplement the read and the write functions of OuicheFS without changing their behavior. Currently, OuicheFS does not implement these functions and therefore uses the default implementation provided by the Linux kernel. Unfortunately, this implementation assumes that the data of a file systematically begins at the start of a data block, which will no longer be the case at the end of the project.

To simplify this step, you will implement read and write functions that behave like the default implementation, but that will not use the page cache. To do this, you will need to use the sb_bread and brelse functions provided by the Linux kernel to read data directly from the disk. You will also need to use the mark_buffer_dirty and sync_dirty_buffer functions to write data to the disk.

The read function, whose prototype you can find in the documentation of the VFS referred to above, has the following semantics:

  • You must return the number of bytes you copied to userspace. There are some cases when that quantity is 0.
  • Updating the offset given to you by the application is your responsibility.
  • Note that you do not have to return all the bytes the application asked for. The returned quantity and the offset update need only be consistent with what you actually copy. An application that wishes to read until the end of the file (like cat(1)) will simply issue the read syscall again. However, note that returning 0 is interpreted by the application as meaning there are no more bytes left to read, prompting it to give up.

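The read semantics above boil down to clamping the requested length against the end of the file. The sketch below shows that arithmetic in isolation; the helper name and signature are illustrative, not part of OuicheFS.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the file size, the current offset (pos) and
 * the number of bytes requested (len), compute how many bytes a read may
 * return. At or past EOF it returns 0, which userspace takes as "no more
 * bytes left to read".
 */
static size_t bytes_to_read(uint64_t file_size, uint64_t pos, size_t len)
{
	if (pos >= file_size)
		return 0;               /* at or past EOF */
	if (len > file_size - pos)
		len = file_size - pos;  /* clamp to what is left in the file */
	return len;
}
```

Your actual read function would then copy that many bytes with copy_to_user() and advance the offset by the same amount.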
The write function has the following semantics:

  • You must return the number of bytes you have copied from userspace.
  • You must still update the offset.
  • You do not have to accept all the bytes the application sent you.
  • All the updates to the inode are your responsibility.
  • Note that writes done after the end of the file are valid. It is your responsibility to fill the possible blanks in the file.
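A write loop typically translates the file offset into a logical block index, an offset inside that block, and the number of bytes that still fit in it. The following arithmetic sketch (names are illustrative, only the 4096-byte block size comes from OuicheFS) shows that translation:

```c
#include <stddef.h>
#include <stdint.h>

#define OUICHEFS_BLOCK_SIZE 4096 /* block size used by OuicheFS */

/*
 * Illustrative helper: for a file position pos, compute which logical
 * block it falls in, the byte offset inside that block, and how many
 * bytes fit before the end of the block. A full write loops over
 * blocks, copying at most `room` bytes per iteration.
 */
struct blk_pos {
	uint32_t block; /* logical block index within the file */
	uint32_t off;   /* byte offset inside that block */
	size_t room;    /* bytes available until the end of the block */
};

static struct blk_pos locate(uint64_t pos)
{
	struct blk_pos p;

	p.block = pos / OUICHEFS_BLOCK_SIZE;
	p.off = pos % OUICHEFS_BLOCK_SIZE;
	p.room = OUICHEFS_BLOCK_SIZE - p.off;
	return p;
}
```

Writes past EOF use the same arithmetic: the blocks between the old end of file and `pos` must be allocated (or zero-filled) before the copy.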

You are strongly advised to read the functions ouichefs_write_begin and ouichefs_write_end to see what checks and modifications are made. Your write function will be called instead of them!

Check that your implementation works correctly using the test bench you wrote in the previous step and measure the performance loss compared to the default implementation.

1.3 Modification of the data structures

To allow ouichefs to store data from multiple files in the same block, we will need to introduce a new type of block: the ouichefs_sliced_block. The 4096 bytes of these blocks will be considered as 32 slices of 128 bytes. The first slice of each of these blocks will contain the block’s metadata, while the remaining 31 slices will contain the file data. Partially filled blocks will be recorded in a list so that they can be completed.

Thus, in the first slice, we will find (at least):

  1. a 32-bit bitmap where each bit represents the availability of a slice (1 for free, 0 for occupied)
  2. the number of the next block in the list of partially filled blocks, if some slices are still available
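One possible C layout for the metadata slice, together with a naive free-slice lookup, is sketched below. The structure and helper names are suggestions, not part of the OuicheFS sources; only the bitmap convention (1 = free, 0 = occupied) and the next-block field are mandated by the assignment.

```c
#include <stdint.h>

#define OUICHEFS_SLICE_SIZE 128
#define OUICHEFS_SLICES_PER_BLOCK 32 /* 4096 / 128 */

/*
 * Possible layout of the metadata slice (slice 0) of a sliced block.
 * The remaining bytes of the 128-byte slice are free for extra metadata.
 */
struct ouichefs_sliced_meta {
	uint32_t free_bitmap; /* 1 = free, 0 = occupied; bit 0 = this slice */
	uint32_t next_block;  /* next partially filled block, 0 if none */
};

/*
 * Return the index of a free data slice (1..31), or -1 if the block is
 * full. Slice 0 is never returned: it holds the metadata.
 */
static int find_free_slice(uint32_t bitmap)
{
	for (int i = 1; i < OUICHEFS_SLICES_PER_BLOCK; i++)
		if (bitmap & (1u << i))
			return i;
	return -1;
}
```

A freshly sliced block would have free_bitmap == 0xFFFFFFFE: every data slice free, bit 0 permanently marked occupied by the metadata.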

We also need to modify the ouichefs_sb_info structure to add a s_free_sliced_blocks field, which will contain the number of the first block in the list of partially filled blocks (0 if the list is empty).

Now that everything is in place to manage block slices, we need a mechanism to use them in the case of small files. To avoid overloading the ouichefs_inode structure, we will recycle the index_block field, which is unnecessary since there is no indirection in this case. The idea is as follows:

  1. the 27 least significant bits will be used to store the block number containing the slice (only the blocks of the first 512 gigabytes of a partition can be sliced)
  2. the 5 most significant bits will be used to store the slice number within the block

There is no need to add a field to identify small files here, as the inode already stores the file size.
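The 27/5 bit split of the recycled index_block field can be expressed with a couple of shift-and-mask helpers. The helper names below are illustrative (not from the OuicheFS sources); only the encoding itself follows the assignment:

```c
#include <stdint.h>

/* 27 least significant bits: block number (first 512 GB of the partition) */
#define SLICE_BLOCK_MASK 0x07FFFFFFu

/* Pack a (block, slice) pair into a 32-bit index_block value. */
static uint32_t pack_slice_loc(uint32_t block, uint32_t slice)
{
	return (slice << 27) | (block & SLICE_BLOCK_MASK);
}

/* Extract the block number (low 27 bits). */
static uint32_t slice_loc_block(uint32_t v)
{
	return v & SLICE_BLOCK_MASK;
}

/* Extract the slice number (high 5 bits). */
static uint32_t slice_loc_slice(uint32_t v)
{
	return v >> 27;
}
```

With 27 bits for the block number, 2^27 blocks of 4 KB give exactly the 512 GB limit mentioned above; the 5 slice bits cover the 32 slices of a block.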

1.4 Implementation of a sysfs interface

To check the state of your file system, you will implement a file in the virtual file system sysfs. Thus, in a directory /sys/fs/ouichefs/<partition>/, you will find the following files:

  • free_blocks : the number of free blocks
  • used_blocks : the number of used blocks (sliced or not)
  • sliced_blocks : the number of sliced blocks (not the number of slices)
  • total_free_slices : the total number of free slices (in partially filled blocks)
  • files : the number of files
  • small_files: the number of files smaller than 128 bytes
  • total_data_size: the total size of the data in the file system (in bytes)
  • total_used_size: the total size of the used blocks (in bytes)
  • efficiency: the ratio of the total size of the data in the file system to the total size of the used blocks (in %)

1.5 Modification of the write function

With the necessary structures in place to support block sharing, update the write function (implemented in Section 1.2) to handle files smaller than 128 bytes.

For this step, assume that all files fall within this size limit. If a file exceeds 128 bytes, temporarily return an error. Your implementation should ensure seamless completion or modification of a file’s content, provided it remains within the 128-byte limit.

1.6 Implementation of the ioctl command to display the content of a block

To facilitate debugging and ensure the correctness of the slice-based storage mechanism, you will implement an ioctl command. This command will allow you to display the content of the block containing the slice of a given file descriptor. The output will consist of 32 lines, each containing 128 characters, representing the slices within the block. If the file is not small enough to use the slice-based storage, the command should return an appropriate error.

This ioctl command will temporarily replace the read function for small files, enabling you to verify that slices are correctly filled and that the metadata is properly updated. This tool will be invaluable for debugging and validating your implementation as you progress through the project.

1.7 Deleting Small Files and Releasing Slices

Update OuicheFS to support the deletion of small files, ensuring that the slice used by the file is properly released. If this release results in the creation of a partially filled block, update the metadata accordingly. In cases where the block becomes completely free, it should be released and added back to the list of free blocks, maintaining the integrity of the file system’s free block management.
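The core of the release logic is setting the slice's bit back to 1 and checking whether the block now holds any data at all. A minimal sketch, assuming the bitmap convention of Section 1.3 (1 = free, 0 = occupied, bit 0 reserved for the metadata slice); the helper name is illustrative:

```c
#include <stdbool.h>
#include <stdint.h>

/* All data slices (bits 1..31) free, metadata slice (bit 0) occupied. */
#define ALL_DATA_SLICES_FREE 0xFFFFFFFEu

/*
 * Mark `slice` as free in the bitmap. Returns true if the block no
 * longer holds any data, in which case the caller should release the
 * block back to the file system's free block list.
 */
static bool release_slice(uint32_t *bitmap, unsigned int slice)
{
	*bitmap |= 1u << slice;
	return *bitmap == ALL_DATA_SLICES_FREE;
}
```

If the block was full before the release, it must also be inserted into the list of partially filled blocks so the freed slice can be reused.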

1.8 Coexistence with traditional files

It is now time to enable OuicheFS to handle files larger than 128 bytes. Complete the write function implemented in Section 1.5 so that it switches the file to legacy storage (with dedicated blocks and an indirection block). Remember to release the slice once the data has been copied.

To simplify, we will not revert to slice storage if data happens to be deleted. Therefore, small files not using the new slice system may appear.

1.9 Implementation of the read function for small files

Your file system now supports files of any size and allows them to be edited correctly. However, reading small files is still done via the ioctl implemented in Section 1.6. It is now time to modify the read function implemented in Section 1.2 so that it can read files smaller than 128 bytes.

1.10 Handling Files Spanning Multiple Slices

For now, our optimization only applies to files smaller than 128 bytes. Space wastage is therefore still present for slightly larger files. However, everything is already in place in the current implementation to limit space loss to one slice for files smaller than a page.

Enhance the read and write functions to support the use of multiple contiguous slices within the same block. Since the data resides in the same block, knowing the starting slice number is sufficient to access all the data. Consequently, no additional fields need to be added to the structures. The main challenge lies in managing data relocation efficiently (avoiding slice fragmentation) when the subsequent slice is already occupied.
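Allocating a multi-slice file then amounts to finding a run of n contiguous free bits among the data slices. A naive first-fit sketch (helper name illustrative; a best-fit or relocation-aware strategy, as suggested below, could reduce fragmentation):

```c
#include <stdint.h>

/*
 * Find n contiguous free slices (1 = free) among the data slices 1..31
 * of a sliced block. Returns the index of the first slice of the run,
 * or -1 if no run of that length exists.
 */
static int find_free_run(uint32_t bitmap, unsigned int n)
{
	unsigned int run = 0;

	for (int i = 1; i < 32; i++) {
		if (bitmap & (1u << i)) {
			if (++run == n)
				return i - (int)n + 1;
		} else {
			run = 0;
		}
	}
	return -1;
}
```

When a file grows and the next slice is occupied, the data has to be relocated to a large enough run, either in the same block or in another partially filled block.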

You can later enhance the placement strategies to reduce data copying and optimize the free space search algorithm. If needed, you can leverage the metadata space available in the first slice of the block to achieve these improvements.

1.11 Bonus: Slice Defragmentation and Optimization

As the file system is used, external fragmentation will naturally occur, leading to an increase in the number of partially filled blocks. To address this, you will implement a defragmentation algorithm designed to consolidate files into fewer blocks, thereby optimizing storage efficiency. This defragmentation process can be triggered automatically when the proportion of partially filled blocks exceeds a predefined threshold (e.g., 10% of the total number of blocks). Additionally, you may implement an ioctl command to allow manual triggering of the defragmentation process, giving users greater flexibility and control over storage optimization.

You could also handle the case of files that have shrunk after exceeding the 4096-byte limit. Currently, these files are treated as regular files and are not optimized. It might be beneficial to move their data back into slices, thereby reducing internal fragmentation and improving storage efficiency.

2 Bug bounty

If you find any bug in ouiche_fs, do not hesitate to report it and fix it! If you manage to fix a bug, provide a patch and an explanation in your report. After the submission deadline, you can submit your fix through a pull request on GitHub (in exchange for a few bonus points of course…).

3 Submission

Submission rules

Do not share or make your code available before you have received your final grade for this course. You must keep your code private.

The project is a group work, with groups consisting of 2 students. You can use the Matrix room to find teammates if needed.

Submission process
Projects must be submitted as patches via email, in the same fashion as labs, to lkp-maintainers@os.rwth-aachen.de, with the subject [PATCH X/Y] project: ... (these are the same instructions as those provided during the lab, but replace lab0X: task0Y by project).

The submission deadline is 06.08.2025 (inclusive).

This time, you will submit only the patches applied to the ouichefs module, not patches made to the Linux kernel. We will apply your patches against the latest ouichefs code, and compile your module. Also, you don’t have to modify the Makefile, except for the line ouichefs-objs := fs.o super.o inode.o file.o dir.o ... if you are adding files to the module.

When you submit, scripts will execute basic tests; make sure you pass them throughout the development of your module. As with the labs, you can submit patches as many times as you want before the deadline. Only the last submission will be used for grading. Passing the tests is not a requirement but is highly encouraged to make sure you are respecting the lab’s instructions. Note that only one member of the group has to submit.

The reference evaluation script for the project will be uploaded later.

Alongside your source code, provide a short PDF report (2–4 pages). This report should contain explanations about your design choices (algorithms and data structures) and a status report with the following sections:

  • List of features implemented and functional
  • List of features implemented but not fully functional. In this case, explain the problems, why they occur, and potential fixes if you have any in mind.
  • List of features not implemented.
  • Short conclusion based on your experimental results.

You will be graded on the following points:

  • Meaningful patch descriptions
  • Code clarity and compliance with the kernel coding style
  • Design choices
  • Presentation quality
  • Compliance with the project instructions
  • Code correctness

Bonus points can be obtained if you find a bug in the original repository and if you are able to fix it. To claim your bonus points, make a pull request on https://github.com/rgouicem/ouichefs after the deadline and before your final presentation.

Final presentation:
You will need to present your work in a short 15-minute in-person presentation. During this presentation, you will:

  1. Present the features you implemented
  2. Explain and defend your implementation choices
  3. Perform an interactive demonstration on your laptop

The final presentations will be held during the week of 11.08.2025.

You can reserve a slot for your presentation using this link.

4 Appendix

4.1 Buffer cache (include/linux/buffer_head.h)

/*
 * Returns a pointer to the cached buffer of block bno on partition sb.
 * The size of the read buffer is sb->s_blocksize, and the data in the buffer is accessible
 * through the b_data member of the struct buffer_head returned.
 */
struct buffer_head *sb_bread(struct super_block *sb, sector_t bno);
/*
 * Marks the buffer pointed at by bh as dirty, so that the page cache writes it back to disk.
 */
void mark_buffer_dirty(struct buffer_head *bh);
/*
 * Forces the write back of the buffer pointed at by bh on disk.
 * Returns 0 on success.
 */
int sync_dirty_buffer(struct buffer_head *bh);
/*
 * Releases a reference to the buffer bh.
 */
void brelse(struct buffer_head *bh);

4.2 Inode (include/linux/fs.h)

/*
 * Marks the inode as dirty (after a modification), so that the VFS
 * writes it back to disk.
 */
void mark_inode_dirty(struct inode *inode);