Sunday, October 13, 2024

How Does Longhorn Use Kubernetes Worker Node Storage as PV?

Longhorn installs as a set of microservices within a Kubernetes cluster and treats each worker node as a potential storage provider. It uses disk paths available on each node to create storage pools and allocates storage from these pools to dynamically provision Persistent Volumes (PVs) for applications. By default, Longhorn uses /var/lib/longhorn/ on each node, but you can specify custom paths if you have other storage paths available.

Configuring Longhorn to Use a Custom Storage Path

To configure Longhorn to use existing storage paths on the nodes (e.g., /mnt/disks), follow these steps:

1. Install Longhorn in the Kubernetes Cluster:

Install Longhorn using Helm or the Longhorn YAML manifest:

kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/master/deploy/longhorn.yaml

You can also install Longhorn with Helm or through an app catalog such as the Rancher Apps & Marketplace.
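If you go the Helm route, the default data path mentioned earlier can be overridden cluster-wide at install time. The values file below is a minimal sketch; it assumes the longhorn/longhorn chart and its defaultSettings.defaultDataPath value, and /mnt/disks is just the example path used later in this post.

# values.yaml -- minimal sketch for the longhorn/longhorn Helm chart
# install with, for example:
#   helm repo add longhorn https://charts.longhorn.io
#   helm install longhorn longhorn/longhorn -n longhorn-system --create-namespace -f values.yaml
defaultSettings:
  defaultDataPath: /mnt/disks   # example path; adjust to your nodes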

2. Access the Longhorn UI:

Once installed, access the Longhorn UI to configure and manage your Longhorn setup.

By default, Longhorn is accessible through a Service of type ClusterIP, but you can change it to NodePort or LoadBalancer if needed.


kubectl get svc -n longhorn-system
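The UI itself is served by the longhorn-frontend Service. If you decide to expose it as a NodePort, one option is a small merge patch like the sketch below; the file name is arbitrary, and you may prefer an Ingress or LoadBalancer in your environment.

# nodeport-patch.yaml -- minimal sketch; apply with:
#   kubectl -n longhorn-system patch svc longhorn-frontend --type merge --patch-file nodeport-patch.yaml
spec:
  type: NodePort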


3. Add a New Storage Path on Each Node:

Before configuring Longhorn, ensure that the desired storage paths are created and available on each node. For example, you might want to use /mnt/disks as your custom storage directory:

mkdir -p /mnt/disks

You may want to mount additional disks or directories to this path for greater storage capacity.

4. Configure Longhorn to Use the New Storage Path:

Open the Longhorn UI (<Longhorn-IP>:<Port>) and navigate to Node settings.

Select the node where you want to add a new disk path.

Click Edit Node and Disks, and then Add Disk.

Specify the Path (e.g., /mnt/disks) and Tags (optional).

Set the Storage Allow Scheduling option to true to enable Longhorn to schedule storage volumes on this disk.

Repeat this process for each node in the cluster that should contribute storage.

5. Verify Storage Path Configuration:

After adding the new storage paths, Longhorn will automatically create storage pools based on these paths. Check the Nodes section in the Longhorn UI to see the updated disk paths and available storage.

6. Create a Persistent Volume (PV) Using Longhorn:

Now that Longhorn is using your custom storage paths, you can create Persistent Volumes that utilize this storage.

Either create a new PersistentVolumeClaim (PVC) that dynamically provisions a PV using the Longhorn StorageClass or use the Longhorn UI to manually create volumes.
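For the dynamic route, a PVC like the following is enough. This is a minimal sketch: the storageClassName "longhorn" is what a default install creates, while the claim name and the 5Gi size are placeholders.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc        # placeholder name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 5Gi         # placeholder size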

Example: Configuring a Node's Storage for Longhorn

Below is an example YAML configuration for adding a disk path (/mnt/disks) to a node, which can also be done through the UI:

apiVersion: longhorn.io/v1beta1
kind: Node
metadata:
  name: <node-name>
  namespace: longhorn-system
spec:
  disks:
    disk-1:
      path: /mnt/disks
      allowScheduling: true
      storageReserved: 0
  tags: []

path: Specifies the custom path on the node where Longhorn will allocate storage.

allowScheduling: Enables Longhorn to schedule volumes on this disk.

storageReserved: (Optional) The amount of disk space, in bytes, reserved for the system and other non-Longhorn usage; Longhorn will not allocate it to volume replicas.


Important Considerations When Using Node Storage for Longhorn:

1. Data Redundancy and Availability:

Longhorn provides replication for data redundancy. When using node-local storage, configure enough replicas (e.g., 3 for high availability) so that data stays available even if one node goes down; see the StorageClass sketch after this list for where the replica count is set.

This means you need enough storage capacity across multiple nodes to accommodate these replicas.

2. Storage Path Consistency:

Ensure that the same storage path (/mnt/disks) is present on each node where you want Longhorn to store data.

If a node does not have the specified path, Longhorn will not be able to use it, leading to scheduling failures.

3. Handling Node Failures:

If a node with the custom storage path fails or becomes unavailable, the replicas stored on that node become unavailable; any volume whose only healthy replica lived on that node is inaccessible until the node recovers or the data is rebuilt elsewhere.

Consider setting up anti-affinity rules and replication strategies in Longhorn to handle such scenarios gracefully.

4. Storage Permissions:

Make sure the Kubernetes worker node's storage directory has the appropriate permissions for Longhorn to read/write data.

5. Longhorn's Built-in Backup and Restore:

Utilize Longhorn’s built-in backup and restore capabilities to safeguard data if you are using node-local storage paths, as this storage may not be as reliable as network-based or cloud-backed storage solutions.
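As referenced under consideration 1, the replica count (and optionally which tagged disks are used) is typically set per StorageClass. The manifest below is a minimal sketch: the longhorn-replicated name and the "fast" disk tag are placeholders, while numberOfReplicas, staleReplicaTimeout, and diskSelector are standard Longhorn StorageClass parameters.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-replicated      # placeholder name
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "3"          # keep data available if one node is lost
  staleReplicaTimeout: "2880"    # minutes before a failed replica is cleaned up
  diskSelector: "fast"           # optional; only use disks tagged "fast" (placeholder tag)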

How to create a Kubernetes Operator?

Creating a Kubernetes operator involves building a controller that watches Kubernetes resources and takes action based on their state. The most common approach is to use the Kubebuilder framework or the Operator SDK, but you can also build one directly against the Kubernetes API using the client libraries.

Below, I'll show an example of a simple operator using the client-go library, which is the official Kubernetes client for Go. This operator will watch a custom resource called Foo and log whenever a Foo resource is created, updated, or deleted.

Prerequisites

Go programming language installed.

Kubernetes cluster and kubectl configured.

client-go and apimachinery libraries installed.


To install these dependencies, initialize a Go module (the module name foo-operator below is just an example) and fetch the libraries:

go mod init foo-operator
go get k8s.io/client-go@v0.27.1
go get k8s.io/apimachinery@v0.27.1

Step 1: Define a Custom Resource Definition (CRD)

Create a foo-crd.yaml file to define a Foo custom resource:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: foos.samplecontroller.k8s.io
spec:
  group: samplecontroller.k8s.io
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          x-kubernetes-preserve-unknown-fields: true
  scope: Namespaced
  names:
    plural: foos
    singular: foo
    kind: Foo
    shortNames:
      - fo

Apply this CRD to the cluster:

kubectl apply -f foo-crd.yaml

Step 2: Create a Go File for the Operator

Create a new Go file named main.go:

package main

import (
	"context"
	"flag"
	"fmt"
	"log"
	"os"
	"os/signal"
	"syscall"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the Kubernetes configuration from ~/.kube/config
	kubeconfig := flag.String("kubeconfig", clientcmd.RecommendedHomeFile, "Path to the kubeconfig file")
	flag.Parse()
	config, err := clientcmd.BuildConfigFromFlags("", *kubeconfig)
	if err != nil {
		log.Fatalf("Error building kubeconfig: %v", err)
	}

	// Create a dynamic client, which can work with arbitrary (custom) resources
	dynClient, err := dynamic.NewForConfig(config)
	if err != nil {
		log.Fatalf("Error creating dynamic client: %v", err)
	}

	// Define the GVR (GroupVersionResource) for the Foo custom resource
	gvr := schema.GroupVersionResource{
		Group:    "samplecontroller.k8s.io",
		Version:  "v1",
		Resource: "foos",
	}

	// Create a ListWatch for Foo resources in all namespaces using the dynamic client
	fooListWatcher := &cache.ListWatch{
		ListFunc: func(options metav1.ListOptions) (runtime.Object, error) {
			return dynClient.Resource(gvr).Namespace(metav1.NamespaceAll).List(context.TODO(), options)
		},
		WatchFunc: func(options metav1.ListOptions) (watch.Interface, error) {
			return dynClient.Resource(gvr).Namespace(metav1.NamespaceAll).Watch(context.TODO(), options)
		},
	}

	// Create a controller (informer) that handles Foo add, update, and delete events
	stopCh := make(chan struct{})
	defer close(stopCh)
	_, controller := cache.NewInformer(fooListWatcher, &unstructured.Unstructured{}, 0, cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			foo := obj.(*unstructured.Unstructured)
			fmt.Printf("New Foo Added: %s\n", foo.GetName())
		},
		UpdateFunc: func(oldObj, newObj interface{}) {
			foo := newObj.(*unstructured.Unstructured)
			fmt.Printf("Foo Updated: %s\n", foo.GetName())
		},
		DeleteFunc: func(obj interface{}) {
			// The deleted object may be a tombstone, so check the type before using it
			if foo, ok := obj.(*unstructured.Unstructured); ok {
				fmt.Printf("Foo Deleted: %s\n", foo.GetName())
			}
		},
	})

	// Run the controller until a termination signal is received
	go controller.Run(stopCh)

	// Wait for a signal to stop the operator
	sigCh := make(chan os.Signal, 1)
	signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)
	<-sigCh
	fmt.Println("Stopping the Foo operator...")
}

Step 3: Running the Operator

1. Build and run the Go program:

go run main.go


2. Create a sample Foo resource to test:

# Save this as foo-sample.yaml
apiVersion: samplecontroller.k8s.io/v1
kind: Foo
metadata:
  name: example-foo

Apply this resource:

kubectl apply -f foo-sample.yaml

Step 4: Check the Output

You should see logs in the terminal indicating when Foo resources are added, updated, or deleted:

New Foo Added: example-foo
Foo Updated: example-foo
Foo Deleted: example-foo

Explanation

1. Dynamic Client: The operator uses the dynamic client to interact with the custom resource since Foo is a CRD.


2. ListWatcher: A cache.ListWatch whose List and Watch functions call the dynamic client is used to monitor changes in Foo resources.


3. Controller: The controller is set up to handle Add, Update, and Delete events for the Foo resource.


4. Signal Handling: It gracefully shuts down on receiving a termination signal.



Further Enhancements

Use a code generation framework like kubebuilder or Operator SDK for complex operators.

Implement reconcile logic to manage the desired state.

Add leader election for high availability.


This example demonstrates the basic structure of an operator using the Kubernetes API. For production-grade operators, using a dedicated framework is recommended.
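If you later deploy the operator inside the cluster instead of running it with a local kubeconfig, its ServiceAccount needs permission to list and watch Foo resources. The manifest below is a minimal sketch; the foo-operator names and the default namespace are placeholders, not something defined earlier in this post.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: foo-operator            # placeholder name
  namespace: default            # placeholder namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: foo-operator
rules:
  - apiGroups: ["samplecontroller.k8s.io"]
    resources: ["foos"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: foo-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: foo-operator
subjects:
  - kind: ServiceAccount
    name: foo-operator
    namespace: default

When running in-cluster you would also replace the kubeconfig loading in main.go with rest.InClusterConfig() from k8s.io/client-go/rest.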
