Providing High Availability to NFS in Ceph using Ganesha

Introduction to Ceph and Ceph-Ganesha

Ceph-Ganesha is an NFS gateway embedded within Ceph, with powerful orchestration features that enable high availability and dynamic management on a multi-node Ceph cluster. In this article we focus on the declarative simplicity of its deployment and demonstrate its HA capabilities.


Ceph is an open-source, software-defined storage platform that delivers highly scalable object, block, and file storage from a unified cluster. At its core, Ceph’s architecture is built on a distributed network of independent nodes. Data is stored across OSDs (Object Storage Daemons), managed by Monitors, and orchestrated by Managers.


Ceph architecture explained

The Ceph File System (CephFS) is a POSIX-compliant file system that sits atop this infrastructure, providing a distributed and fault-tolerant namespace. For a system administrator, Ceph offers a great alternative to traditional storage arrays by providing a single, resilient platform that can grow linearly with the addition of commodity hardware.


Its self-healing and self-managing capabilities are key benefits, reducing the operational overhead typically associated with petabyte-scale storage.


What is NFS Ganesha in Ceph?

NFS Ganesha is an open-source NFS server that acts as a user-space gateway, a key distinction from conventional NFS servers that reside within the operating system’s kernel. This fundamental design choice provides a more robust and stable service environment. A bug in a user-space daemon is far less likely to cause a catastrophic system failure, a crucial advantage for a critical service endpoint. Ganesha’s architecture is also designed for maximum compatibility, supporting a full range of NFS protocols from NFSv3 to NFSv4.2, ensuring it can serve a diverse client base.


The true genius of Ganesha lies in its File System Abstraction Layer, or FSAL. This modular architecture decouples the NFS protocol logic from the underlying storage. For a Ceph environment, the FSAL_CEPH module is the key, enabling Ganesha to act as a sophisticated Ceph client. This means administrators can provide a consistent NFS interface to clients while benefiting from the full power and scalability of the Ceph cluster, all without exposing the underlying Ceph infrastructure directly.


Cephadm integration: Declarative deployment of Ceph-Ganesha

The integration of Ganesha with the Ceph orchestrator (cephadm) elevates its deployment from a manual, host-specific task to an elegant, cluster-wide operation. This partnership allows for a declarative approach to service management, where a single command can manage the entire lifecycle of the Ganesha service.
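
As a quick illustration of this declarative model, the orchestrator can be asked at any time which services it has been told to run and which daemons are actually running. A minimal check, assuming a cephadm-managed cluster, looks like this:

List the services the orchestrator is managing, together with their target placement

# sudo ceph orch ls

List the individual daemons the orchestrator has actually deployed

# sudo ceph orch ps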


For any mission-critical service, a system administrator’s primary concern is ensuring business continuity. Unplanned downtime can lead to significant data loss, loss of productivity, and damaged reputation. High Availability (HA) is the architectural principle that addresses this concern by eliminating single points of failure. For an NFS service, this means that if one server node goes offline, another node can seamlessly take over its duties. This provides administrators with peace of mind and allows for planned maintenance without impacting the end-user. For Ceph, its inherent distributed nature is the perfect complement to an HA NFS service, as the underlying storage is already resilient to node failures.


Preparing CephFS Storage for Ganesha

A successful Ganesha deployment begins with preparing the underlying CephFS storage. A seasoned administrator will provision the necessary pools to host the filesystem data and metadata, setting the stage for the service to be deployed.
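
Before provisioning anything, it is sensible to confirm that the cluster is healthy and that the orchestrator can see the hosts that will later run the Ganesha daemons. A quick sanity check, using the ceph-node1/2/3 host names that appear later in this article, might look like this:

Check the overall cluster health and capacity

# sudo ceph -s

List the hosts known to the orchestrator

# sudo ceph orch host ls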


Create a dedicated pool for NFS Ganesha data with autoscaling enabled

# sudo ceph osd pool create ganeshapool 32 32

# sudo ceph osd pool set ganeshapool pg_autoscale_mode on


Create a metadata pool; the bulk flag tells the PG autoscaler to start the pool with a full set of PGs rather than growing it gradually

# sudo ceph osd pool create ganeshapool_metadata 16 16

# sudo ceph osd pool set ganeshapool_metadata bulk true
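
With both pools created, it is worth confirming that the PG autoscaler has picked them up and registered their flags. A simple check is:

Review the PG autoscaler's view of the new pools, including target PG counts and the bulk flag

# sudo ceph osd pool autoscale-status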


Tie the pools to a new CephFS filesystem

# sudo ceph osd pool application enable ganeshapool cephfs

# sudo ceph osd pool application enable ganeshapool_metadata cephfs

# sudo ceph fs new ganeshafs ganeshapool_metadata ganeshapool

# sudo ceph fs set ganeshafs max_mds 3

# sudo ceph orch apply mds ganeshafs --placement="3 ceph-node1 ceph-node2 ceph-node3"
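
Once the filesystem exists and the MDS service has been applied, a quick check confirms that the MDS daemons have come up and taken their ranks:

Show the filesystem, its MDS ranks and the backing pools

# sudo ceph fs status ganeshafs

Confirm that the orchestrator has scheduled the MDS service

# sudo ceph orch ls mds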

Deploying the Ceph NFS Ganesha Service

With the storage foundation laid, Ganesha itself can be deployed either with YAML service specifications or with simple orchestration CLI commands. The ceph orch apply command is an instruction to the orchestrator to ensure the desired state of the NFS service. By specifying a placement count and listing the cluster’s hosts, the administrator ensures that a Ganesha daemon runs on every designated node, a critical step for a resilient and highly available service.
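
For reference, the YAML route expresses the same desired state as the CLI command used below. A minimal service specification might look like the following sketch, where the file name nfs.yaml is arbitrary:

Contents of nfs.yaml

service_type: nfs
service_id: myganeshanfs
placement:
  hosts:
    - ceph-node1
    - ceph-node2
    - ceph-node3

Apply the specification through the orchestrator

# sudo ceph orch apply -i nfs.yaml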


Deploy the Ganesha NFS service across all three specified hosts


# sudo ceph orch apply nfs myganeshanfs ganeshafs --placement="3 ceph-node1 ceph-node2 ceph-node3"


This single command initiates a complex, multi-faceted deployment. The orchestrator pulls the necessary container images, configures the daemons, and distributes them across the specified hosts. This contrasts sharply with manual, host-by-host installations, showcasing the power of centralized orchestration.
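
Once the orchestrator reports the daemons as running, the NFS cluster itself can be inspected. In recent Ceph releases the mgr nfs module exposes commands for this; the exact commands available depend on the release:

List the NFS clusters known to Ceph

# sudo ceph nfs cluster ls

Show which hosts and ports the daemons of this cluster are serving from

# sudo ceph nfs cluster info myganeshanfs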


Advanced capabilities: Dynamic exports and service resilience

Once the Ganesha service is running, its power is further revealed through its dynamic export management capabilities. Instead of editing static configuration files, an expert can create, modify, and delete NFS exports on the fly using a series of simple commands. This is invaluable in dynamic environments where storage needs change rapidly.


Create a new export to make the CephFS filesystem accessible


# sudo ceph nfs export create cephfs myganeshanfs /ganesha ganeshafs --path=/
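
The export can then be listed back to confirm it was registered against the cluster, /ganesha being the pseudo path chosen above:

List the exports defined for this NFS cluster

# sudo ceph nfs export ls myganeshanfs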

The true value of this distributed deployment lies in its service resilience. The Ceph orchestrator is constantly monitoring the health of the Ganesha daemons. Should a host fail, the orchestrator will automatically detect the loss and take action to ensure the service remains available. This automated failover process provides a high degree of transparency to clients, moving Ganesha from a simple gateway to a genuinely high-availability service. Its architecture is built to withstand disruption, making it an indispensable part of a robust storage strategy.

Real-world example

Suppose we have a cluster with three Ganesha-ready nodes. Because each node runs its own Ganesha daemon serving the same CephFS-backed export, a client can mount the share through node 1, node 2 or node 3 interchangeably, and if the node it is connected to goes offline it can simply remount through one of the surviving nodes.
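
To make this concrete, here is what it looks like from an NFS client. This is a sketch that assumes NFSv4.1, the default NFS port, and an arbitrary mount point of /mnt/ganesha:

Mount the export through ceph-node1

# sudo mkdir -p /mnt/ganesha

# sudo mount -t nfs -o nfsvers=4.1 ceph-node1:/ganesha /mnt/ganesha

If ceph-node1 later goes offline, the same export can be remounted through another node

# sudo umount -f /mnt/ganesha

# sudo mount -t nfs -o nfsvers=4.1 ceph-node2:/ganesha /mnt/ganesha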

Conclusion: Why Ceph-Ganesha is essential for modern storage

NFS Ganesha is more than just a gateway; it is a critical component for integrating traditional file services with modern, scalable storage. By leveraging the command-line orchestration of cephadm, administrators can deploy a highly available, resilient, and dynamically manageable service. The process is a testament to the power of declarative infrastructure management, simplifying what would otherwise be a complex task. The architectural design of Ganesha, combined with the power of the Ceph orchestrator, makes it a perfect solution for meeting the demanding storage requirements of today’s hybrid environments.

SIXE