Longhorn on Production Clusters: Storage Configuration, Tuning, and Gotchas

Longhorn is a lightweight, distributed block storage system built specifically for Kubernetes. It runs entirely inside your cluster, turning local disks on worker nodes into replicated persistent volumes with no external storage array required. That simplicity is what makes it appealing, especially in the Rancher and SUSE ecosystem where it ships as the default storage option. You get persistent storage that is easy to install, easy to understand, and tightly integrated with the Kubernetes lifecycle. This guide goes beyond helm install, which gets you a working demo but not production-grade storage. The gap between “Longhorn is running” and “Longhorn is running well under real production workloads” is where most teams lose hours (or data).

CloudCasa uses Longhorn’s CSI snapshot interface for automated, off-cluster backups and, for teams that need the fastest possible recovery times, integrates directly with Longhorn’s DR Volumes to enable storage-layer-aware restores. We see a lot of Longhorn deployments across our customer base, and the pattern is consistent: getting Longhorn installed is straightforward, but getting it tuned for production takes deliberate effort.

This post covers the configuration decisions that matter once you move past the default install: replica counts, data locality, resource reservations, disk layout, and the failure modes that catch people off guard.

Start with Dedicated Disks

This is the single most impactful decision you can make, and it happens before you install anything.

Longhorn stores replica data on the local disks of your worker nodes. If you let it use the root disk, you’re sharing I/O bandwidth with the OS, kubelet, container runtime, and every emptyDir volume on that node. Under load, this turns into unpredictable latency spikes and, worse, the risk of triggering DiskPressure conditions that evict pods.

Set up dedicated disks (SSDs or NVMe if your workloads are I/O-sensitive) and mount them at a consistent path across nodes, something like /var/lib/longhorn. Then configure Longhorn to use only those mount points.
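
If you prefer to manage disk layout declaratively, the disks also appear on Longhorn’s Node custom resource (nodes.longhorn.io), which Longhorn creates for each worker. A minimal sketch, assuming a node named worker-1 and a dedicated disk mounted at /var/lib/longhorn (both names are placeholders):

apiVersion: longhorn.io/v1beta2
kind: Node
metadata:
  name: worker-1
  namespace: longhorn-system
spec:
  disks:
    dedicated-nvme:        # the map key is an arbitrary disk name
      path: /var/lib/longhorn
      allowScheduling: true
      storageReserved: 0   # nothing else competes for a dedicated disk
      tags: []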

If you absolutely must use the root disk, keep Longhorn’s Storage Minimal Available Percentage at the default 25% and set Storage Over Provisioning Percentage to 100%. This limits how aggressively Longhorn consumes space. On a dedicated disk, you can lower the minimum to 10%, since nothing else competes for that space.
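
Both values live in Longhorn’s settings, which can be inspected and managed as Setting objects in the longhorn-system namespace (or through the UI). A sketch of the two settings above, using the names from the Longhorn settings reference:

apiVersion: longhorn.io/v1beta2
kind: Setting
metadata:
  name: storage-minimal-available-percentage
  namespace: longhorn-system
value: "25"
---
apiVersion: longhorn.io/v1beta2
kind: Setting
metadata:
  name: storage-over-provisioning-percentage
  namespace: longhorn-system
value: "100"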

Replica Count: Two Is Often Enough

The default replica count is 3. That is a safe default, but it is not always the right one.

Each replica is a full copy of the volume data, stored on a different node. Three replicas means every write gets committed to three nodes before it is acknowledged. For a 100 GB volume, that is 300 GB of consumed disk space and 3x the write amplification on your network.

For most production workloads, set the default replica count to 2. You still get node-level redundancy (one node can fail without data loss), and you significantly reduce disk consumption and write latency. The read path can also benefit: Longhorn can serve reads from either replica, so two replicas already give you read distribution.

Reserve 3 replicas for workloads where the data is irreplaceable and the application cannot tolerate any window of reduced redundancy during a node failure. Databases that don’t have their own replication (a single-instance Postgres, for example) are good candidates.

For workloads that already handle replication at the application layer (think Kafka, Cassandra, or CockroachDB), consider strict-local data locality with a single replica. More on that next.

Data Locality: The Performance Lever Most People Miss

Data locality controls whether Longhorn tries to keep a replica on the same node as the pod consuming the volume. There are three settings:

disabled (default): Replicas are placed wherever Longhorn’s scheduler finds space. Every read and write goes over the network to at least one remote node.

best-effort: Longhorn attempts to keep one replica local to the consuming pod. If that is not possible (the node is full, for example), it falls back to remote replicas. This is the setting you want as your default for most StorageClasses.

strict-local: Longhorn keeps exactly one replica, always on the same node as the pod. No network round-trips. This gives you the highest IOPS and lowest latency, close to raw disk performance. The tradeoff: if that node goes down, the data is unavailable until it comes back.

The practical guidance:

  • Set your default StorageClass to best-effort. This gives you local-read performance for the common case while maintaining multi-node redundancy.
  • Create a second StorageClass with strict-local for applications that handle their own replication (an example follows the best-effort class below). A distributed database with three replicas at the application level does not need three more at the storage level.
  • Avoid disabled in production unless you have a specific reason (such as wanting maximum flexibility in replica placement across zones).

Here is what a best-effort StorageClass looks like:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-production
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
parameters:
  numberOfReplicas: "2"
  dataLocality: "best-effort"
  staleReplicaTimeout: "30" # minutes before a failed replica is cleaned up
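
And a strict-local counterpart for applications that replicate themselves (the class name is illustrative):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-strict-local
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
parameters:
  numberOfReplicas: "1" # strict-local requires exactly one replica
  dataLocality: "strict-local"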

Resource Reservations: Protect the Storage Plane

Longhorn runs instance manager pods on every node. These pods manage the engines (which handle I/O for attached volumes) and replicas. If Kubernetes schedules CPU-hungry workloads onto a node and starves these pods, your storage I/O stalls.

Two settings control this:

Guaranteed Instance Manager CPU: This reserves a percentage of each node’s allocatable CPU for instance manager pods. The default is 12%. For nodes running more than a handful of volumes, increase this to 15–25%. The rule of thumb: (estimated max engine and replica count on the node × 0.1) / total allocatable CPUs × 100. For example, a node expected to host 30 engines and replicas with 16 allocatable CPUs works out to 30 × 0.1 / 16 × 100 ≈ 19%.

Priority Class: Use the built-in system-cluster-critical PriorityClass or create a custom high-priority one, and assign it to Longhorn’s components. This ensures that under node pressure, Kubernetes evicts application pods before it touches your storage infrastructure.

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: longhorn-critical
value: 1000000 # high, but below the built-in system-* classes (2000000000+)
globalDefault: false
description: "Priority class for Longhorn storage components"

Then set priorityClass: longhorn-critical in your Longhorn Helm values under longhornManager, longhornDriver, and longhornUI.
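
In Helm values form, that looks something like this (key names per the Longhorn chart; verify against your chart version):

longhornManager:
  priorityClass: longhorn-critical
longhornDriver:
  priorityClass: longhorn-critical
longhornUI:
  priorityClass: longhorn-critical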

Snapshot Hygiene: The Silent Disk Eater

Longhorn creates system snapshots automatically on certain operations (replica rebuilds, volume detach/attach cycles, upgrades). These accumulate quietly. On a busy cluster with frequent pod rescheduling, you can find that snapshot chains are consuming 2–3x the actual volume data.

Build a cleanup routine:

  • Set a recurring snapshot job with a sensible retention policy (see the sketch after this list). For most workloads, keeping the last 5–10 snapshots is plenty.
  • Purge snapshots on applications with their own replication. If your app replicates at the application layer, you do not need Longhorn snapshots for point-in-time recovery. Delete them aggressively.
  • Monitor snapshot space separately from volume space. The Longhorn UI shows this, but it is easy to overlook until a node runs out of room.
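
A sketch of the recurring snapshot job from the first bullet, using Longhorn’s RecurringJob resource; the schedule, group, and retention values are illustrative:

apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: snapshot-retain
  namespace: longhorn-system
spec:
  task: snapshot        # take a snapshot and prune down to the retain count
  cron: "0 */6 * * *"   # every six hours
  retain: 5             # keep the last 5 snapshots per volume
  concurrency: 2        # limit how many volumes are processed in parallel
  groups:
    - default           # applies to volumes in the default group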

Replica Auto-Balance: Keep Things Even

As you add and remove nodes, or as volumes are created and deleted, replica distribution across nodes can become uneven. One node ends up holding a disproportionate number of replicas, becoming a bottleneck and a larger blast radius if it fails.

Set Replica Auto-Balance to least-effort. In this mode, Longhorn periodically moves the minimum number of replicas needed to even out redundancy across zones (or nodes, if you’re not using zones). It is not aggressive: it will not disrupt running volumes unnecessarily, but it will gradually correct imbalances. If you want a fully even spread, best-effort rebalances more aggressively.
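
As a Setting object (the same value can be changed in the UI):

apiVersion: longhorn.io/v1beta2
kind: Setting
metadata:
  name: replica-auto-balance
  namespace: longhorn-system
value: "least-effort"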

The Gotchas

Scheduling race conditions on bulk PVC creation. If you create many PVCs simultaneously (common in CI/CD or batch workloads), Longhorn’s scheduler can place multiple replicas on the same disk before the space accounting catches up. The result: over-provisioned disks that fail when the actual data lands. Mitigation: set Storage Over Provisioning Percentage conservatively (100–150%) and stagger bulk PVC creation if possible.

Volume attachment storms after node reboot. When a node comes back after a restart, all volumes that were attached to it try to reattach simultaneously, and any volumes that degraded during the outage start rebuilding replicas. With many volumes this can saturate the network and cause timeouts. Throttle rebuild traffic with the Concurrent Replica Rebuild Per Node Limit setting, and if you restore from backup at scale, the Concurrent Volume Backup Restore Per Node Limit throttles those operations the same way.

Network partitions and split-brain. Longhorn relies on the node network for replica synchronization. If your network partitions (common in on-prem environments with flaky switches), you can end up with divergent replicas. Use a dedicated storage network if your infrastructure supports it. This separates storage replication traffic from pod-to-pod traffic and significantly improves stability.
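
Longhorn supports this through its Storage Network setting, which points at a Multus NetworkAttachmentDefinition in namespace/name form. A sketch, assuming a NetworkAttachmentDefinition named storage-net in kube-system (both placeholders):

apiVersion: longhorn.io/v1beta2
kind: Setting
metadata:
  name: storage-network
  namespace: longhorn-system
value: "kube-system/storage-net" # changing this requires volumes to be detached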

Upgrades that reset settings. Some Longhorn upgrades have reset custom settings to defaults. Always export your settings before upgrading (the Longhorn UI has a “Backup Settings” option, or capture them with kubectl -n longhorn-system get settings.longhorn.io -o yaml), and verify them afterward.

Backing Up Longhorn Volumes

All the tuning in the world does not help if you lose the cluster. Longhorn snapshots protect you from volume-level failures, but they live on the same nodes as your data. A multi-node failure, a bad upgrade, or an infrastructure outage can take them all out.

You need off-cluster backups. Since Longhorn v1.3, full CSI snapshot support means any CSI-compatible backup tool can create consistent, portable snapshots of your Longhorn volumes. CloudCasa integrates with Longhorn’s CSI layer to automate this: scheduled backups, cross-cluster restores, and (for teams using Longhorn’s DR Volumes feature) storage-layer-aware recovery that dramatically reduces restore times.
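
With the external snapshotter CRDs installed, a VolumeSnapshotClass that routes CSI snapshots to Longhorn’s backup target looks something like this (type: bak sends snapshot data to the configured backupstore; type: snap would keep it in-cluster):

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: longhorn-backup
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: bak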

Whatever tool you choose, the principle is the same: your backup must live outside the blast radius of the cluster it protects.

Wrapping Up

Longhorn is a solid choice for Kubernetes-native storage, especially in the Rancher and SUSE ecosystem. But the defaults are optimized for ease of getting started, not for production performance and resilience. Adjust your replica counts, enable data locality, protect the storage plane with resource reservations, and stay on top of snapshot hygiene. Those four things alone will get you most of the way to a reliable production setup.
