Securing Couchbase Data Platform With TLS Certs

In this blog, we will use Google Kubernetes Engine (GKE) to set up a Kubernetes cluster with version 1.12, deploy the Autonomous Operator, and then deploy a Couchbase cluster with server groups, persistent volumes, and x509 TLS certificates.

The overall steps are as follows:

  1. Initialize gcloud utils
  2. Deploy kubernetes cluster (v1.12+) with two nodes in each availability zone
  3. Deploy Autonomous Operator 1.2
  4. Deploy Couchbase Cluster
  5. Perform Server Group Autofailover


Prerequisites:

  • kubectl (gcloud components install kubectl)
  • GCP account with the right credentials

Initialize gcloud utils

Download gcloud sdk for the OS version of your choice from this URL.

You will need Google Cloud credentials to initialize the gcloud CLI:

cd google-cloud-sdk
./bin/gcloud init

Deploy Kubernetes Cluster (v1.12+)

Deploying a Kubernetes cluster on GKE is fairly straightforward. For resilience, it’s a good idea to spread nodes across all available zones within a given region. This lets us use the server group, rack zone, or availability zone (AZ) awareness features within Couchbase Server: if we lose an entire AZ, Couchbase can fail over that whole AZ and the application stays active, since it still has the working dataset.

gcloud container clusters create rd-k8s-gke --region us-east1 --machine-type n1-standard-16 --num-nodes 2
Details about the above command:

  • K8s cluster name: rd-k8s-gke
  • machine-type: n1-standard-16 (16 vCPUs and 60 GB RAM)
  • num-nodes per AZ: 2

More machine types can be found here.

At this point, a k8s cluster with the required number of nodes should be up and running:

$ gcloud container clusters list
rd-k8s-gke us-east1 1.12.6-gke.10 n1-standard-16 1.12.6-gke.10 6 RUNNING

Details of the k8s cluster can be found below:

$ kubectl cluster-info
Kubernetes master is running at
GLBCDefaultBackend is running at
Heapster is running at
KubeDNS is running at
Metrics-server is running at

Deploy Autonomous Operator 1.2

GKE supports RBAC in order to limit permissions. Since the Couchbase Operator creates resources in our GKE cluster, we will need to grant it the permission to do so.

$ kubectl create clusterrolebinding cluster-admin-binding --clusterrole cluster-admin --user $(gcloud config get-value account)
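For reference, the same binding can be expressed declaratively; a minimal sketch, where the user name is a placeholder for your gcloud account email:

```yaml
# Declarative equivalent of the clusterrolebinding command above.
# Replace the user name with the email reported by:
#   gcloud config get-value account
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-admin-binding
subjects:
  - kind: User
    name: you@example.com          # placeholder
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
```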

Download the appropriate Operator package for your environment. Untar the package and deploy the admission controller.

$ kubectl create -f admission.yaml

Check the status of the admission controller:

$ kubectl get deployments
couchbase-operator-admission 1 1 1 1 7s
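For context, the heart of admission.yaml is a validating webhook registration that intercepts CouchbaseCluster objects before they are stored. A rough sketch only: the service name and path below are illustrative, and the packaged file also contains the Deployment, Service, and RBAC objects.

```yaml
# Conceptual sketch of the webhook registration inside admission.yaml
# (values are illustrative, not the exact packaged ones).
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  name: couchbase-operator-admission
webhooks:
  - name: couchbase-operator-admission.default.svc
    rules:
      - apiGroups: ["couchbase.com"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["couchbaseclusters"]
    clientConfig:
      service:
        name: couchbase-operator-admission
        namespace: default
        path: /couchbaseclusters/validate   # hypothetical path
      caBundle: "<base64 CA bundle>"        # populated from the packaged cert
```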

The following diagram best illustrates how the admission controller works in concert with the Operator and the Couchbase cluster:

The next steps are to create the CRD, the Operator role, and the Operator deployment:

$ kubectl create -f crd.yaml
$ kubectl create -f operator-role.yaml
$ kubectl create -f operator-deployment.yaml

Once deployed, the Operator becomes ready and available within seconds:

couchbase-operator-admission 1 1 1 1 11m
couchbase-operator 1 1 1 1 25s

Deploy Couchbase Cluster

Couchbase cluster will be deployed with the following features:

  • TLS certificates
  • Server groups (each server group in one AZ)
  • Persistent volumes (which are AZ aware)
  • Server group auto-failover

TLS Certificates

It’s fairly easy to generate TLS certificates; detailed steps can be found here.
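As a minimal sketch of the shape of that process, the commands below generate a self-signed CA and a server certificate chain with openssl. The subject names and file names are hypothetical; real deployments must follow the SAN/wildcard requirements in the Couchbase documentation.

```shell
# Sketch: self-signed CA plus a server cert signed by it.
# CN values and file names are placeholders, not the documented ones.
openssl genrsa -out ca.key 2048
openssl req -new -x509 -days 3650 -key ca.key -out ca.crt -subj "/CN=Couchbase CA"
openssl genrsa -out pkey.key 2048
openssl req -new -key pkey.key -out server.csr -subj "/CN=*.cb-gke-demo.default.svc"
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -out chain.pem -days 365
# Sanity check: the chain should verify against the CA.
openssl verify -CAfile ca.crt chain.pem
```

The resulting files then become the two Kubernetes secrets the cluster references, along the lines of `kubectl create secret generic couchbase-server-tls --from-file chain.pem --from-file pkey.key` and `kubectl create secret generic couchbase-operator-tls --from-file ca.crt` (check the Operator documentation for the exact file names it expects).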

Once deployed, the TLS secrets can be listed with the kubectl get secrets command, like below:

$ kubectl get secrets
NAME                   TYPE   DATA AGE
couchbase-operator-tls Opaque 1    1d
couchbase-server-tls   Opaque 2    1d

Server Groups

Setting up server groups is also straightforward, which will be discussed in the following sections when we deploy the Couchbase cluster YAML file.
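As a preview, the server-group portion of that spec looks roughly like this. This is a sketch assuming the Operator 1.2 field names and our three us-east1 zones; verify against the configuration reference for your version.

```yaml
# Sketch of the server-group section of a CouchbaseCluster spec
# (field names assumed from the Operator 1.2 schema).
spec:
  serverGroups:            # zones the cluster is allowed to use
    - us-east1-b
    - us-east1-c
    - us-east1-d
  servers:
    - name: data-east1-b
      size: 1
      services:
        - data
      serverGroups:        # pin this set of pods to a single AZ
        - us-east1-b
    # ...similar entries for us-east1-c and us-east1-d
```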

Persistent Volumes

Persistent volumes provide a reliable way to run stateful applications. Creating them on a public cloud is a one-click operation.

First, we can check what storageclass is available for use:

$ kubectl get storageclass
standard (default) 1d

All the worker nodes available in the k8s cluster should have failure domain labels, like below:

$ kubectl get nodes -o yaml | grep zone
us-east1-b
us-east1-b
us-east1-d
us-east1-d
us-east1-c
us-east1-c

NOTE: I didn’t have to add any failure domain labels; GKE added them automatically.

Create PV for each AZ:

$ kubectl apply -f svrgp-pv.yaml

YAML file svrgp-pv.yaml can be found here.
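As a sketch, svrgp-pv.yaml would contain one zone-pinned StorageClass per AZ, along the lines of the following (names are illustrative, not the file's actual contents):

```yaml
# Hypothetical zone-pinned StorageClass; one such object per zone (b, c, d).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: sc-us-east1-b
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  zone: us-east1-b       # dynamically provisioned disks land in this AZ
```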

Create a secret for accessing the Couchbase UI:

$ kubectl apply -f secret.yaml
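A minimal sketch of what secret.yaml might contain; the secret name and credentials are placeholders, and the values are base64-encoded:

```yaml
# Placeholder admin credentials secret (do not use these values in production).
apiVersion: v1
kind: Secret
metadata:
  name: cb-gke-demo-auth
type: Opaque
data:
  username: QWRtaW5pc3RyYXRvcg==   # base64("Administrator")
  password: cGFzc3dvcmQ=           # base64("password")
```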

Finally, deploy a Couchbase cluster with TLS support, along with server groups (which are AZ aware) and on persistent volumes (which are also AZ aware).

$ kubectl apply -f couchbase-persistent-tls-svrgps.yaml

The YAML file couchbase-persistent-tls-svrgps.yaml can be found here.
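As a sketch of its key sections, the cluster spec ties together the TLS secrets and server groups created above. Field names follow the Operator 1.2 schema as I understand it; verify them against the configuration reference for your version.

```yaml
# Sketch of the top of couchbase-persistent-tls-svrgps.yaml
# (secret names match the ones shown earlier; field names assumed).
apiVersion: couchbase.com/v1
kind: CouchbaseCluster
metadata:
  name: cb-gke-demo
spec:
  authSecret: cb-gke-demo-auth               # UI/admin credentials (placeholder name)
  tls:
    static:
      serverSecret: couchbase-server-tls     # chain.pem + pkey.key
      operatorSecret: couchbase-operator-tls # ca.crt
  serverGroups:
    - us-east1-b
    - us-east1-c
    - us-east1-d
  # ...plus servers, buckets, and volumeClaimTemplates sections
```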

Give it a few minutes; the Couchbase cluster will come up, and it should look like this:

$ kubectl get pods
cb-gke-demo-0000 1/1 Running 0 1d
cb-gke-demo-0001 1/1 Running 0 1d
cb-gke-demo-0002 1/1 Running 0 1d
cb-gke-demo-0003 1/1 Running 0 1d
cb-gke-demo-0004 1/1 Running 0 1d
cb-gke-demo-0005 1/1 Running 0 1d
cb-gke-demo-0006 1/1 Running 0 1d
cb-gke-demo-0007 1/1 Running 0 1d
couchbase-operator-6cbc476d4d-mjhx5 1/1 Running 0 1d
couchbase-operator-admission-6f97998f8c-cp2mp 1/1 Running 0 1d

A quick check on persistent volume claims can be done like below:

$ kubectl get pvc

In order to access the Couchbase cluster UI, we can either port-forward port 8091 from any pod or from the service itself to a local machine, or expose the UI via a load balancer.

$ kubectl port-forward service/cb-gke-demo-ui 8091:8091

Port-forward any pod like below:

$ kubectl port-forward cb-gke-demo-0002 8091:8091

At this point, the Couchbase server is up and running and we have a way to access it.

Perform Server Group Auto-Failover

When a Couchbase cluster node fails, it can be failed over automatically: the full working set remains available, no user intervention is needed, and the application sees no downtime.

If the Couchbase cluster is set up to be server group (SG), AZ, or rack zone (RZ) aware, then even if we lose an entire SG, the whole server group is failed over and the working set remains available, again with no user intervention and no application downtime.
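In the cluster spec, this behavior is driven by a handful of auto-failover settings. A sketch, assuming the Operator 1.2 field names; check the configuration reference for your version:

```yaml
# Sketch of the auto-failover knobs in the CouchbaseCluster spec
# (field names assumed from the Operator 1.2 schema).
spec:
  cluster:
    autoFailoverTimeout: 30         # seconds a node/group must be down first
    autoFailoverMaxCount: 3         # how many sequential failovers are allowed
    autoFailoverServerGroup: true   # treat a whole server group as one failure unit
```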

For disaster recovery, XDCR can be used to replicate Couchbase data to another Couchbase cluster. If the entire source data center or region is lost, applications can cut over to the remote site and won’t see downtime.

Let’s take down the server group. Before that, let’s see what the cluster looks like:

Delete all pods in group us-east1-b; once the pods are deleted, the Couchbase cluster will see that those nodes are down:

The Operator constantly watches the cluster definition. It sees that the server group is lost, spins up three new pods, re-establishes the claims on the PVs, performs delta-node recovery, and eventually runs a rebalance operation until the cluster is healthy again, all with no user intervention whatsoever.

After some time, the cluster is back up and running.

From the operator logs:

$ kubectl logs -f couchbase-operator-6cbc476d4d-mjhx5 

We can see that the cluster is automatically rebalanced.


Sustained differentiation is key to our technology, and we have added quite a number of new supportability features. These enterprise-grade capabilities enable end users to find issues faster and operationalize the Couchbase Operator in their environments more efficiently. We are very excited about this release; feel free to give it a try!


All YAML files and help files used for this blog can be found here.

This article was originally published here