A technical overview of file systems, including functions, structures, types, partitions, volumes, performance, resilience, security, recovery, and forensics.


The file system (filesystem) is a logical structure and a set of algorithms, methods, and conventions responsible for the organisation, storage, retrieval, management, and protection of persistent data on secondary memory devices, such as hard drives, SSDs, optical media, and other non-volatile storage media.

At an abstract level, the file system defines the rules according to which data is named, organised into logical units (files), and grouped into directories (or folders), while also controlling the operations of creation, deletion, reading, writing, and manipulation of attributes and permissions associated with these objects. It thus provides a uniform and standardised interface between the operating system and the underlying storage hardware, hiding the physical peculiarities of devices and exposing a logical model that is understandable to users and applications.

Technically, a file system is composed of:

  • Metadata structures (such as superblocks, inodes, FATs, MFTs), which store control and management information (e.g. physical location, size, timestamps, permissions, links).
  • Allocation and space-mapping mechanisms, which define how data is physically arranged in the blocks or sectors of the device.
  • Directory structures and indexes, which implement efficient search and navigation mechanisms.
  • Integrity, security, and fault-tolerance policies, such as journaling, checksums, and access control.

In the context of modern operating systems, the file system is implemented as an intermediate layer — generally called the Virtual File System (VFS) — which abstracts the multiplicity of supported file systems, allowing the OS kernel to handle various formats and devices in a transparent, interoperable, and extensible manner.

Therefore, the file system not only enables persistent storage and rational organisation of data, but also imposes access rules, protection mechanisms, and optimisation strategies that directly influence the efficiency, reliability, and security of contemporary computing systems.

Function and Purpose of the File System in the Operating System Context

The file system plays a fundamental role in the architecture of operating systems by providing the mechanisms required for the abstraction, organisation, persistent storage, and efficient management of data on secondary memory devices.

Abstraction of Physical Storage

The operating system uses the file system to hide physical details of storage hardware (e.g. sectors, cylinders, blocks, mechanical latencies) and to present users and applications with a logical, uniform, and hierarchical view of data composed of files and directories. This abstraction allows multiple storage devices and technologies to be accessed transparently.

Data Organisation and Structuring

Through the file system, the operating system provides methods for:

  • Naming files and directories (symbolic identifiers);
  • Hierarchical structuring (directories, subdirectories, absolute and relative paths);
  • Metadata association (size, permissions, timestamps, owner);
  • Management of free space and fragmentation.
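The naming, hierarchy, and metadata mechanisms above are exposed through standard APIs. A minimal illustrative sketch in Python (the directory and file names are invented for the demo) creates a small hierarchy and reads back the metadata the file system maintains:

```python
import os
import stat
import tempfile
import time

# Create a small hierarchy in a temporary directory and inspect its metadata.
# The names "reports" and "summary.txt" are illustrative only.
with tempfile.TemporaryDirectory() as root:
    sub = os.path.join(root, "reports")          # subdirectory (hierarchy)
    os.mkdir(sub)
    path = os.path.join(sub, "summary.txt")      # symbolic name for the file
    with open(path, "w") as f:
        f.write("hello")

    info = os.stat(path)                         # metadata kept by the file system
    print("size:", info.st_size)                 # size in bytes -> 5
    print("mode:", stat.filemode(info.st_mode))  # permissions, e.g. -rw-r--r--
    print("mtime:", time.ctime(info.st_mtime))   # last-modification timestamp
    print("owner uid:", info.st_uid)             # owner identifier
```

The `os.stat` call surfaces exactly the metadata categories listed above: size, permissions, timestamps, and owner.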

Access Management and Concurrency Control

The file system implements access control mechanisms (ACLs, POSIX permissions, etc.), ensuring data isolation, integrity, and confidentiality. In addition, it provides support for concurrency, locking, and synchronisation of read/write operations, which are essential in multi-user and multitasking environments.

Data Persistence and Integrity

The file system is responsible for ensuring that data remains durably stored, even after shutdowns or failures. Techniques such as journaling, checksums, transaction logs, and recovery policies are employed to mitigate loss and corruption.

Interface for Users and Applications

The operating system provides, through the file system, programming interfaces (syscalls) and utility commands for the creation, reading, writing, deletion, and manipulation of files and directories. This standardised interface enables the development of applications that are independent of the underlying storage hardware.
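On POSIX systems this standardised interface can be seen directly: Python's low-level `os` functions map almost one-to-one onto the `open(2)`, `write(2)`, `read(2)`, and `unlink(2)` syscalls, regardless of which file system backs the path. A sketch (the file name is illustrative):

```python
import os
import tempfile

# The low-level os functions wrap the POSIX syscalls open(2), write(2),
# read(2), and unlink(2); the application never sees the underlying format.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "example.dat")       # illustrative name

fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)  # create + open for writing
os.write(fd, b"persistent data")                     # write(2)
os.close(fd)

fd = os.open(path, os.O_RDONLY)                      # reopen for reading
data = os.read(fd, 1024)                             # read(2)
os.close(fd)
print(data)                                          # b'persistent data'

os.unlink(path)                                      # remove the directory entry
os.rmdir(tmpdir)
```

The same code runs unchanged whether the volume is formatted with ext4, NTFS, or APFS, which is precisely the hardware independence the paragraph describes.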

Resource Multiplexing and Sharing

It allows multiple users and processes to access the same files and devices in a controlled and secure manner, enabling features such as network sharing, mounting of external volumes, usage quotas, and versioning.

Difference Between Partition, Volume, and Filesystem

Partition

A partition is a logical subdivision of a physical storage device (such as a hard disk or SSD). Through a partitioning scheme (e.g. MBR, GPT), the total space of the device is segmented into independent areas, each delimited by defined start and end addresses on the disk.
Each partition operates in an isolated manner: it may host different file systems, be intended for distinct functions (OS, swap, data, recovery), or even remain unformatted. Partitioning aims to organise, protect, and enable multiple environments within the same physical device, while also facilitating data management and recovery.

Volume

A volume is a logical abstraction, generally implemented and managed by the operating system, which represents a storage unit usable by the user.
A volume may correspond directly to a physical partition, but it may also be formed by multiple aggregated partitions (for example, via LVM – Logical Volume Manager, RAID, or distributed file systems), or even be only a fraction of a partition.
In short, the volume is the entity that the operating system mounts and makes available for reading and writing, and it may receive a letter (e.g. C: in Windows) or be mounted in a directory (e.g. /home in Linux).

Filesystem (File System)

The file system is the logical structure and the set of methods that define how data is organised, stored, and accessed within a volume. It is the file system that dictates the arrangement of files, directories, metadata, allocation mechanisms, protection, integrity, and data recovery.
In practical terms, for a volume (or partition) to be usable for file storage, it must be “formatted” with a specific file system (e.g. NTFS, ext4, FAT32).

Relationship Among Them

  • A physical device may contain one or more partitions.
  • Each partition may be associated with one or more volumes, depending on the adopted logical management scheme.
  • Each volume must be formatted with a file system in order to store and organise data in a structured manner.

Practical Example:
On a 1 TB hard drive, one may create three partitions:

  • One of 200 GB (OS), formatted with NTFS, which will be volume C:;
  • One of 700 GB (data), formatted with exFAT, which will be volume D:;
  • One of 100 GB for swap, without a file system, used by the OS.

In advanced environments, a logical volume may encompass multiple partitions from different disks (RAID, LVM), and be presented as a single mount point.
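On Linux, the device → mount point → file system mapping is visible in `/proc/mounts`. The sketch below parses a mounts-format table to make the relationship concrete; the sample lines are invented for the demo, not read from a real machine:

```python
# Each line of a /proc/mounts-style table ties the three concepts together:
# a block device (typically one partition or logical volume), the directory
# where its volume is mounted, and the file system used to format it.
# These sample lines are illustrative only.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sda2 /home ext4 rw,relatime 0 0
/dev/mapper/vg0-data /srv/data xfs rw,noatime 0 0
"""

def parse_mounts(text):
    """Return (device, mount_point, fstype) triples from mounts-format text."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            rows.append((fields[0], fields[1], fields[2]))
    return rows

for dev, mnt, fs in parse_mounts(SAMPLE_MOUNTS):
    print(f"{dev:22} mounted at {mnt:10} as {fs}")
```

Note how the third line shows an LVM logical volume (`/dev/mapper/vg0-data`) appearing as an ordinary mountable device, exactly as described above.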

Types of File Systems

Classic File Systems

FAT12, FAT16, FAT32

The FAT (File Allocation Table) family was developed by Microsoft starting in the late 1970s and became the native file system of MS-DOS. It uses an allocation table to map data blocks and control free disk space.

  • FAT12 was intended for floppy disks, supporting only a few megabytes.
  • FAT16 allowed addressing of up to 2 GB.
  • FAT32 expanded support to volumes of up to 2 TB and files of up to 4 GB, becoming a standard in removable media (USB drives, SD cards).

FAT does not provide security features, journaling, or advanced permission control, making it limited in multi-user or critical environments.
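The allocation table itself is essentially a linked list: entry *i* holds the number of the next cluster in a file's chain, terminated by an end-of-chain marker. A toy sketch (the table contents and the `EOC` value are invented; real FAT12 uses markers such as 0xFFF):

```python
# In a FAT, entry i stores the number of the next cluster in the file's chain,
# with a special end-of-chain marker. This toy table is illustrative only.
EOC = -1  # stand-in for FAT's end-of-chain marker (e.g. 0xFFF in FAT12)

# fat[i] -> next cluster of the chain that passes through cluster i
fat = {2: 5, 5: 6, 6: 9, 9: EOC,   # one file occupying clusters 2 -> 5 -> 6 -> 9
       3: 4, 4: EOC}               # another file on clusters 3 -> 4

def cluster_chain(fat, start):
    """Follow the allocation table from `start` to the end-of-chain marker."""
    chain = []
    cluster = start
    while cluster != EOC:
        chain.append(cluster)
        cluster = fat[cluster]
    return chain

print(cluster_chain(fat, 2))  # [2, 5, 6, 9]
```

The directory entry only needs to record the starting cluster (here, 2); everything else is recovered by walking the table, which is also why a damaged FAT makes files unrecoverable by their chains alone.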

NTFS

NTFS (New Technology File System), introduced in Windows NT, features a record-based architecture (MFT – Master File Table), support for journaling, compression, encryption (EFS), granular permissions (ACLs), disk quotas, and automatic recovery.
NTFS supports large files and volumes, symbolic links, alternate streams, and integration with Active Directory. It is the standard file system for modern Windows systems.

ext2, ext3, ext4

The ext (extended file system) family was developed for Linux:

  • ext2: simple, efficient, without journaling.
  • ext3: added journaling for greater resilience to failures.
  • ext4: increased maximum file/volume sizes, improved performance, and added support for delayed allocation, extents, integrity checking, and extended timestamps.

ext4 is widely adopted in Linux distributions due to its stability and robustness.

HFS/HFS+

HFS (Hierarchical File System) was Apple’s file system for classic Mac OS. HFS+ (or Mac OS Extended) brought improvements such as support for long names, larger files, and journaling. Both were optimised for magnetic hard drives and were replaced by APFS in more recent versions of macOS.

exFAT

exFAT (Extended File Allocation Table) is aimed at high-capacity removable media, overcoming FAT32 limitations. It supports large volumes and files, fast allocation, and is compatible with Windows, macOS, and embedded systems, becoming the standard in SDXC cards and portable devices.

Modern File Systems

Btrfs

Btrfs (B-tree file system), developed for Linux, incorporates advanced features such as snapshots, checksums for data and metadata, online balancing, native compression, integrated RAID, and dynamic volume expansion. Its design is aimed at high integrity, flexibility, and simplified administration.

ZFS

ZFS (Zettabyte File System), created by Sun Microsystems, is notable for its support for large volumes and files, integrity through checksums, deduplication, snapshots, compression, RAID-Z, self-repair, and unified management of volumes and filesystems. It is used in mission-critical environments, enterprise storage, and servers.

XFS

Developed by SGI for UNIX, XFS is optimised for performance with large files and parallel operations. It supports journaling, online expansion, dynamic allocation, quotas, and is frequently used in high-performance servers and data storage solutions.

APFS

APFS (Apple File System) replaced HFS+ in the Apple ecosystem. It is designed for SSDs and flash storage, offering native encryption, snapshots, clones, efficient space management, and high performance in simultaneous operations. It is the standard on macOS, iOS, and related devices.

File Systems for Specific Devices

Flash (F2FS, JFFS2)

  • F2FS (Flash-Friendly File System) was created to optimise the performance of NAND flash-based devices, such as SSDs and SD cards, reducing write amplification and managing the internal structure of flash memory.
  • JFFS2 (Journaling Flash File System v2) is used in embedded systems and NOR/NAND memories, with a focus on fault tolerance and efficient management of erasable blocks.

SD Cards, USB Drives

In removable media, systems such as FAT32, exFAT, and F2FS prevail due to their broad compatibility across operating systems and the typical access profile (small files, frequent read/write operations).

Network File Systems

NFS

NFS (Network File System), created by Sun Microsystems, allows UNIX and Linux systems to share directories and files over the network, providing transparency in remote mounting, support for multiple clients, and integration with authentication and permissions.

SMB/CIFS

SMB (Server Message Block), an early dialect of which Microsoft marketed as CIFS (Common Internet File System), is widely used in Windows environments for file and printer sharing, as well as interprocess communication on local and corporate networks.

Lustre, GlusterFS, CephFS

  • Lustre: a distributed file system for HPC (High Performance Computing) environments, scaling to thousands of nodes and petabytes of data.
  • GlusterFS: a distributed filesystem based on aggregation of network volumes for horizontal scalability.
  • CephFS: part of the Ceph project, it offers a distributed filesystem with high availability, automatic replication, fault tolerance, and integration with object storage.

Performance, Robustness, and Security

Performance: Benchmarks, Latency, and Throughput

The performance of a file system is evaluated through metrics such as latency (response time of individual I/O operations), throughput (data transfer rate, usually expressed in MB/s), and IOPS (Input/Output Operations per Second).
Factors that impact performance include:

  • Block allocation strategies;
  • Block size and cache policy;
  • Journaling and redundancy techniques;
  • Parallelism in operations and support for multiple threads/processes.

Specific benchmarks, such as fio, bonnie++, iozone, and vdbench, are used to measure performance under different access patterns (sequential, random, read, write).
Modern file systems, such as XFS and ZFS, optimise operations for heavy workloads by providing parallelism and efficient management of caches and buffers.
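A deliberately naive probe illustrates how latency and throughput are derived from timed I/O. This sketch does not control caching, queue depth, or access patterns the way fio or IOzone do, so its numbers mostly reflect the page cache rather than the device:

```python
import os
import tempfile
import time

# Naive write probe: real benchmarks (fio, iozone) control caching and access
# patterns; this sketch does not, so treat the numbers as illustrative.
BLOCK = b"\0" * 4096     # 4 KiB block, a common file system block size
COUNT = 256              # write 1 MiB in total

fd, path = tempfile.mkstemp()
start = time.perf_counter()
for _ in range(COUNT):
    os.write(fd, BLOCK)
os.fsync(fd)             # force the data to the device, not just the cache
elapsed = time.perf_counter() - start
os.close(fd)
os.unlink(path)

mb = COUNT * len(BLOCK) / (1024 * 1024)
print(f"wrote {mb:.1f} MiB in {elapsed * 1000:.1f} ms "
      f"({mb / elapsed:.1f} MB/s, {elapsed / COUNT * 1e6:.1f} us per write)")
```

The `os.fsync` call matters: without it, the measurement would time writes into memory, not onto the storage medium, which is the classic pitfall that dedicated benchmark tools are built to avoid.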

Resilience: Journaling, Checksums, and Self-Healing

Resilience refers to the file system’s ability to withstand and recover from failures, ensuring data integrity and availability.

  • Journaling: A mechanism that records metadata operations (and, in some cases, data) before they are actually committed to disk. In the event of an abrupt failure (e.g. power outage), the journal is used to restore the consistent state of the system. NTFS, ext3/ext4, XFS, and ReFS use journaling to prevent corruption.
  • Checksums: Integrity checks (e.g. CRC or cryptographic hashes) applied to data and metadata for detection of silent corruption, especially in systems such as ZFS and Btrfs.
  • Self-healing: Systems such as ZFS and Btrfs detect discrepancies using checksums and, in redundant environments (RAID), can automatically reconstruct corrupted data from healthy copies, enabling “self-healing”.
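The checksum-plus-redundancy idea behind self-healing can be sketched in a few lines. This is a toy model in the spirit of ZFS/Btrfs, not their actual on-disk logic; the data and the injected corruption are invented for the demo:

```python
import zlib

# Toy self-healing over two mirrored copies of a block: each copy carries a
# CRC-32, and a mismatched primary is rebuilt from the healthy mirror.
def store(data):
    """Keep a block plus its CRC-32, as a (checksum, data) record."""
    return (zlib.crc32(data), data)

def read_with_repair(primary, mirror):
    """Return valid data, repairing the primary from the mirror if its
    checksum no longer matches (silent corruption detected)."""
    checksum, data = primary
    if zlib.crc32(data) == checksum:
        return data, primary
    m_checksum, m_data = mirror
    if zlib.crc32(m_data) != m_checksum:
        raise IOError("both copies corrupted; cannot self-heal")
    return m_data, (m_checksum, m_data)   # rewrite primary from healthy copy

good = store(b"important record")
corrupted = (good[0], b"importXnt record")      # bit rot: data changed, CRC stale
data, healed = read_with_repair(corrupted, good)
print(data)            # b'important record' (served from the mirror)
print(healed == good)  # True: the primary copy was reconstructed
```

The key property is that corruption is *detected* by the checksum even when the device reports a successful read, and *repaired* only because a redundant copy exists, which is why self-healing requires RAID-style redundancy.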

Security: Permissions, ACLs, Encryption, and Logs

Security in file systems includes mechanisms to control access, ensure confidentiality, authentication, integrity, and traceability of operations.

  • Traditional permissions: User/group-based models (Unix-like: rwx) that restrict access to files and directories.
  • ACLs (Access Control Lists): Allow greater granularity, specifying detailed permissions for multiple users or groups on individual objects.
  • Encryption: It may be applied at different levels:
    • Transparent at the file system level (e.g. NTFS EFS, APFS, ext4 with e4crypt);
    • At the block/disk level (LUKS, BitLocker, FileVault);
    • On specific files.
  • Logs and auditing: Systems may record critical operations (creation, deletion, access, permission changes) in logs for auditing, forensic, and compliance purposes. Tools such as auditd (Linux), Event Viewer (Windows), and syslog are used for monitoring.
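The traditional rwx model amounts to nine mode bits per object, which can be read and changed through the standard interface. A small POSIX-oriented sketch (the temporary file exists only for the demo; on non-POSIX systems `chmod` may honour only the owner write bit):

```python
import os
import stat
import tempfile

# Traditional Unix permissions are nine mode bits (rwx for owner, group,
# others), stored in the file's metadata and enforced by the file system.
fd, path = tempfile.mkstemp()
os.close(fd)

os.chmod(path, 0o640)                      # rw- for owner, r-- for group
mode = os.stat(path).st_mode
print(stat.filemode(mode))                 # e.g. -rw-r----- on POSIX
print("owner can write:", bool(mode & stat.S_IWUSR))
print("others can read:", bool(mode & stat.S_IROTH))

os.unlink(path)
```

ACLs extend exactly this model: where the mode bits allow only three classes of principal (owner, group, others), an ACL entry can name any additional user or group with its own permission set.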

Recovery and Forensics in File Systems

The ability to perform recovery and forensic analysis is vital in corporate and critical environments:

  • Recovery tools: Used to recover deleted files, restore partitions, and repair corrupted metadata (e.g. TestDisk, PhotoRec, extundelete, chkdsk, fsck).
  • Snapshots and backups: Snapshots (point-in-time copies of the state of the file system) and backup routines facilitate fast restoration after failures, corruption, or attacks (e.g. ransomware).
  • Digital forensics: Detailed analysis of metadata, logs, timestamp records, and traces left by disk operations makes it possible to reconstruct security incident scenarios, investigate unauthorised access, and recover digital evidence.