Writing Filesystems

From Genunix

This article has been identified as a draft. It is currently undergoing a community review. Please add your comments to the discussion page.

Do not quote any text on this page! It is still a draft!


How to write a Solaris filesystem

Finally fulfilling a promise I gave a while ago, I'm going to braindump a little mini-series here on what I know about Solaris filesystems: the source code structure, the undocumented interfaces that filesystem code in Solaris must use to actually work, how these interfaces work, and how new implementations can avoid the locking issues and typical design faults that Solaris filesystems have had in the past. As the series unfolds, a reimplementation of the FAT filesystem driver will be written, explaining step by step how the skeleton code grows into a functional implementation.

As an editorial note, I originally titled this series "Writing a Solaris filesystem in 21 days". I must admit I failed at doing it in 21 days, so beware, it might take longer ...

Table of contents

Introduction 
What does writing a (disk-based) filesystem for Solaris involve? Why this article?
Sourcecode Structure 
The source file tree for a Solaris filesystem driver and associated utilities
Build Environment 
How to set up an OpenSolaris workspace for developing your filesystem
Module glue 
Filesystem drivers are kernel modules - but not 'ordinary' kernel modules ...
Mount option handling 
See how mount option parsing can be handed off to the framework
VFS and Vnode interfaces 
Provides an overview of per-mountpoint (VFS) and per-file (Vnode) operations that a filesystem driver must/may implement
Specifics about VFS Operations
VFS_MOUNT(), VFS_UNMOUNT(), and VFS_FREEVFS()
VFS_STATVFS()
VFS_ROOT()
VFS_SYNC()
VFS_VGET() - Requesting a file node from a filesystem instance
Userdata I/O 
Excursion on how read/write and mmap-based I/O work
A simple locking protocol for a read/write filesystem 
Reentrancy in filesystems and the need for shared/exclusive access for directory and file updates
Specifics about Vnode interfaces for file I/O
Opening and closing files - VOP_OPEN(), VOP_CLOSE()
Reading and writing files via system calls - VOP_READ(), VOP_WRITE()
Support for mmap'ed IO - VOP_MAP(), VOP_ADDMAP() and VOP_DELMAP()
Backend for mapped IO - VOP_GETPAGE(), VOP_PUTPAGE()
File and directory attributes - VOP_GETATTR(), VOP_SETATTR(), VOP_ACCESS()
ACLs and security attributes - VOP_GETSECATTR(), VOP_SETSECATTR()
ioctl on files - VOP_IOCTL()
Specifics about directory-related Vnode operations
Reading directory contents - VOP_READDIR() and VOP_LOOKUP()
File and directory creation/removal/rename - VOP_CREATE(), VOP_REMOVE(), VOP_RENAME(), VOP_MKDIR(), VOP_RMDIR()
VOP_INACTIVE() and a vnode's lifecycle
Shows how VFS_VGET() and VOP_INACTIVE() complement each other, and describes support for forced unmount
Generic directory walking support code 
Demonstrates how readdir/lookup codepaths can be unified using a generic directory walker mechanism
Mapping file/directory offsets to disk blocks 
One of the most important tasks of a filesystem - where's my data, dude?
Timestamps 
POSIX atime, mtime, ctime, and their (non)-equivalents in a given filesystem
Filesystem Utilities
mount / unmount
fsck
mkfs
Next: Writing Filesystems - Introduction