Storage
Different workloads demand different storage characteristics. Choosing the right type is a question of access patterns, performance requirements, and cost. The three fundamental categories — object, block, and filesystem — serve distinct purposes. Understanding them prevents the common mistake of reaching for the wrong tool — like using a database for file storage, or a filesystem for data that should be in object storage.
Object Storage
Object storage treats data as discrete objects, each with a unique key, the data itself, and metadata. There's no directory hierarchy — just a flat namespace (though key prefixes like images/2024/photo.jpg simulate folders).
S3-compatible APIs have become the de facto standard. Whether you're using AWS S3, MinIO, or another provider, the interface is the same: PUT an object, GET it by key, DELETE it when you're done.
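The flat-namespace model and the put/get/delete interface can be sketched in a few lines of Python. This is an illustrative in-memory toy, not a real client library; the class and method names are invented for the sketch:

```python
# Toy model of object storage: a flat namespace of keys, whole-object
# put/get/delete, and prefix listing to simulate folders.

class ObjectStore:
    def __init__(self):
        self._objects = {}  # key -> (data, metadata); one flat namespace

    def put(self, key, data, metadata=None):
        # PUT always replaces the whole object; there is no append.
        self._objects[key] = (bytes(data), metadata or {})

    def get(self, key):
        data, _meta = self._objects[key]
        return data

    def delete(self, key):
        self._objects.pop(key, None)

    def list(self, prefix=""):
        # Key prefixes like "images/2024/" stand in for directories.
        return sorted(k for k in self._objects if k.startswith(prefix))

store = ObjectStore()
store.put("images/2024/photo.jpg", b"\x89PNG...", {"content-type": "image/png"})
store.put("logs/2024/app.log", b"started\n")
print(store.list("images/"))  # ['images/2024/photo.jpg']
```

Note that put replaces the entire object each time — the model has no notion of editing in place, which is exactly the constraint real object stores impose.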
Object storage excels at scale. It handles petabytes without operational complexity, provides built-in redundancy, and costs significantly less per gigabyte than block storage. It's the right choice for backups, static assets, log archives, and data lakes.
The tradeoffs are latency and access patterns. You can't append to an object or modify a byte range — you replace the whole thing. And access times, while consistent, are higher than local disk. Object storage is for data you write once and read by key, not for databases or applications that need random I/O.
Lifecycle rules automatically transition objects between storage tiers or delete them after a retention period. Versioning preserves previous versions of objects, providing a safety net against accidental overwrites.
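Conceptually, a lifecycle rule maps an object's age to an action. A minimal sketch — the tier names and day thresholds here are hypothetical, not any provider's defaults:

```python
# Hypothetical lifecycle policy: tiers and thresholds are illustrative.
def lifecycle_action(age_days):
    """Map an object's age to a storage tier or deletion."""
    if age_days < 30:
        return "standard"    # hot tier: frequent access
    if age_days < 90:
        return "infrequent"  # cheaper per GB, higher retrieval cost
    if age_days < 365:
        return "archive"     # cold tier: slow, very cheap retrieval
    return "delete"          # retention period expired

print(lifecycle_action(10))   # standard
print(lifecycle_action(200))  # archive
```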
Block Storage
Block storage presents raw storage volumes to the operating system, which formats them with a filesystem and mounts them like a local disk. It's the storage type that databases, virtual machines, and high-performance applications depend on.
Performance is characterized by IOPS (input/output operations per second) and throughput (megabytes per second). Different workloads stress different dimensions — a database needs high IOPS for random reads, while video processing needs high throughput for sequential reads.
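The two dimensions are related by I/O size: throughput is roughly IOPS times the size of each operation. A quick calculation (the workload numbers are illustrative):

```python
def throughput_mib_s(iops, io_size_kib):
    """Throughput implied by an IOPS rate and a per-operation I/O size."""
    return iops * io_size_kib / 1024  # KiB/s -> MiB/s

# A database doing 16,000 random 4 KiB reads moves only ~62.5 MiB/s,
# even though 16,000 IOPS is a demanding rate...
print(throughput_mib_s(16000, 4))    # 62.5

# ...while a video pipeline doing 500 sequential 1 MiB reads moves
# 500 MiB/s at a far lower IOPS rate.
print(throughput_mib_s(500, 1024))   # 500.0
```

This is why a volume sized for one dimension can be badly undersized for the other.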
Block volumes support snapshots — point-in-time copies that are nearly instantaneous to create. Snapshots are incremental (only changed blocks are stored), making them efficient for regular backup points.
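The incremental mechanism can be sketched with a toy model — a volume as a dict of blocks, where each snapshot stores only the blocks that differ from the previous state (real implementations track changed blocks at the storage layer, not by comparing contents):

```python
# Toy model of incremental snapshots.
def take_snapshot(volume, previous_blocks):
    """Return only the blocks that differ from the previous state."""
    return {i: data for i, data in volume.items()
            if previous_blocks.get(i) != data}

base = {0: b"boot", 1: b"data-v1", 2: b"logs"}
snap1 = take_snapshot(base, {})       # first snapshot: all 3 blocks
current = dict(base)
current[1] = b"data-v2"               # only block 1 changes
snap2 = take_snapshot(current, base)  # incremental: stores 1 block
print(len(snap1), len(snap2))         # 3 1
```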
The tradeoff is cost and flexibility. Block storage is more expensive per gigabyte than object storage, and a volume is typically attached to a single host at a time.
Filesystems
Network and distributed filesystems share storage across multiple hosts.
NFS (Network File System) is the traditional option for shared access to files over a network. It's simple, widely supported, and sufficient for many workloads — but performance degrades over high-latency links and it has limited scalability.
Distributed filesystems like Ceph and GlusterFS spread data across a cluster of nodes, providing scalability, redundancy, and high availability. They're complex to operate but solve problems that centralized storage cannot.
Copy-on-write filesystems (ZFS, Btrfs) offer built-in features like snapshots, checksumming, compression, and data deduplication. ZFS in particular is a strong choice for storage servers where data integrity is paramount.
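The checksumming idea — compute a checksum on write, verify it on every read — is what lets these filesystems detect silent corruption. A minimal sketch of that pattern, assuming a toy block store (hashlib is standard; the class is invented for illustration):

```python
import hashlib

# Toy model of checksum-on-write, verify-on-read integrity checking.
class ChecksummedStore:
    def __init__(self):
        self._blocks = {}  # block id -> (data, checksum)

    def write(self, block_id, data):
        # Store the checksum alongside the data at write time.
        self._blocks[block_id] = (data, hashlib.sha256(data).hexdigest())

    def read(self, block_id):
        # Recompute and compare on every read; mismatch means bit rot.
        data, checksum = self._blocks[block_id]
        if hashlib.sha256(data).hexdigest() != checksum:
            raise IOError(f"checksum mismatch on block {block_id}")
        return data

fs = ChecksummedStore()
fs.write(7, b"important bytes")
assert fs.read(7) == b"important bytes"

# Simulate silent on-disk corruption: the data flips but the stored
# checksum does not, so the next read raises instead of returning garbage.
fs._blocks[7] = (b"imp0rtant bytes", fs._blocks[7][1])
```

With redundant copies (as ZFS mirrors provide), a failed verification can also trigger self-healing: read the other copy and rewrite the bad one.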
Container Storage
In containerized environments, storage requires additional abstraction. Containers are ephemeral by default — when a container stops, its filesystem is discarded.
Volumes provide persistent storage that outlives individual containers. In Kubernetes, Persistent Volumes decouple storage provisioning from consumption, and CSI (Container Storage Interface) drivers allow any storage backend to integrate with the orchestrator.
StatefulSets manage workloads that need stable storage and network identity — databases, message brokers, and other stateful services that can't simply be replaced with a fresh instance.
For backup strategies across all storage types, see backup-strategies. For container-specific storage patterns, see container-runtimes.