A. About NDB Cluster
MySQL NDB Cluster is an in-memory database cluster for MySQL built on a shared-nothing, distributed architecture. NDB Cluster uses the NDB storage engine to store data in memory-optimized tables, with durability, very fast performance, and low latency, so it can handle extremely high transaction volumes while remaining ACID compliant. NDB Cluster distributes data redundantly across several cluster nodes, with no single point of failure, making it possible to meet a 99.999% system availability SLA.
In a nutshell, NDB Cluster nodes (processes) can be divided into three building blocks by type:
1. Data Nodes (ndbd / ndbmtd)
When the storage engine for a table is defined as ndb (or ndbcluster), the table's data is stored in the memory of the data node servers, and that data is directly accessible from all SQL nodes / API nodes, which is where users access it.
2. SQL Nodes (mysqld) / API Nodes
SQL-based operations are performed via SQL nodes running the mysqld server daemon, which includes support for the NDB storage engine. Applications or user sessions performing queries, DML, or even DDL can connect to any of the SQL nodes, which access the backend data node servers where the data is stored.
3. Management Nodes (ndb_mgmd)
Management nodes serve the centralized, cluster-wide configuration parameters for the data nodes and the API / SQL nodes, and they also control cluster membership. A new node that comes up first connects to the management node to obtain the cluster parameters it needs to run. Therefore, the management node has to be running before any other node is started. In the event of network partitioning, the management node acts as the arbitrator for the cluster.
NDB Cluster Architecture
This article does not aim to discuss NDB Cluster in detail; instead, it provides a simple DIY solution for deploying the management node, SQL nodes, and data nodes on Kubernetes. More information about NDB Cluster can be found at https://dev.mysql.com/doc/refman/8.0/en/mysql-cluster-overview.html
B. Docker Image for NDB Cluster
In a microservices context, Kubernetes is an orchestrator for Docker containers. This article uses Minikube for simplicity. If you have not yet installed Minikube, please follow sections A and B of this post, https://mysqlsg.blogspot.com/2020/05/deploying-mysql-innodb-cluster-on.html, before continuing with this article.
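All kubectl commands in this article target the mysql-cluster namespace. If that namespace does not already exist in your Minikube (for example, because you only skimmed the post above), create it first; this is a minimal sketch assuming the default Minikube context:
$ kubectl create namespace mysql-cluster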
Next, you need an NDB Cluster Docker image. If you don't have one, run the command below to pull the image from Docker Hub and store it on your Minikube.
$ minikube ssh -- docker pull mysql/mysql-cluster
Then, run the following command to check whether the NDB Cluster image was downloaded successfully.
$ minikube ssh -- docker images | grep mysql-cluster
The above command should return "mysql/mysql-cluster   latest".
C. Pre-requisites for Management Node
To deploy a management node, you need a config.ini. config.ini is the file that contains all the cluster parameters required for starting the data nodes and SQL nodes. You can use the sample below as a starting point (please name the file config_ini.yaml):
[ndbd default]
NoOfReplicas=2
DataMemory=98M
[ndb_mgmd]
NodeId=1
HostName=mgmt-0
DataDir=/var/lib/mysql
[ndbd]
HostName=dataa-0
NodeId=2
DataDir=/var/lib/mysql
ServerPort=2202
[ndbd]
HostName=datab-0
NodeId=3
DataDir=/var/lib/mysql
ServerPort=2202
[mysqld]
[mysqld]
Run the following command to upload config_ini.yaml onto Kubernetes as a configMap named config-ini.
$ kubectl -n mysql-cluster create configmap config-ini --from-file=config_ini.yaml
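If you want to confirm that the configMap was created with the expected content before moving on, you can inspect it; this check is optional and not part of the original steps:
$ kubectl -n mysql-cluster describe configmap config-ini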
The config-ini configMap has to be mounted as /etc/config.ini on the management node; see section F for the detailed manifest in YAML format. To start the ndb_mgmd daemon, the YAML for the management node in section F contains:
"sleep 30; ndb_mgmd --config-file=/etc/config.ini --config-dir=/home; while true; do sleep 1; done;"
The "sleep 30" in the command argument pauses the pod to ensure the ndb_mgmd daemon on the management node starts only after all the data node pods are available. All data node pods must be reachable before ndb_mgmd is started; otherwise the management node will fail to start.
Although a management node does not need to persist data, I recommend using a StatefulSet with replicas=1 to deploy the management node. We set replicas=1 because the management node does not need to be redundant: its high availability can be provided by Kubernetes. If the management node crashes, Kubernetes will automatically start a new management node pod for the NDB Cluster.
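You can verify this self-healing behavior later, once the cluster from section F is running, by deleting the management pod and watching Kubernetes recreate it; this is an optional check, not one of the original steps:
$ kubectl -n mysql-cluster delete pod mgmt-0
$ kubectl -n mysql-cluster get pod mgmt-0 -w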
D. Pre-requisites for Data Node
As with the management node, to start a data node we need an option file that tells the data node the hostname of the management node. The following is an example of the option file required by a data node (please name the file datanode_ini.yaml):
[mysqld]
ndbcluster
ndb-connectstring=mgmt-0
Run the following command to upload datanode_ini.yaml as a configMap named datanode.
$ kubectl -n mysql-cluster create configmap datanode --from-file=datanode_ini.yaml
This configMap has to be mounted as /etc/datanode.cnf (see section F for details). To start the ndbd daemon, the YAML uses the following command:
"sleep 60; ndbd --defaults-file=/etc/datanode.cnf --ndb-connectstring=mgmt-0; while true; do sleep 1; done;"
The "sleep 60" in the command argument pauses the pod to ensure the ndbd daemon on the data node starts after the management node is up.
Although a data node stores data in memory, the LDM (Local Data Manager) threads write copies (snapshots) of the data to disk at every regular local checkpoint (LCP) to provide durability. An LDM is responsible for tuple storage, hash and T-tree indexes, page buffer and tablespace management, writing and restoring local checkpoints, and Disk Data log management. Thus, I recommend using a StatefulSet with a PV/PVC for physical disk storage. However, in this article I do not use a PV/PVC, for simplicity.
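If you do want durable storage for the checkpoint and log files, a StatefulSet can request it through volumeClaimTemplates. The fragment below is only a sketch of how the dataa StatefulSet from section F might be extended (apply the same to datab); the volume name, storage size, and omission of a storage class are assumptions to adapt to your cluster:
# sketch: additions to the dataa StatefulSet from section F
spec:
  template:
    spec:
      containers:
      - name: mysql
        volumeMounts:
        - name: ndb-data              # new, PVC-backed volume (hypothetical name)
          mountPath: /var/lib/mysql   # matches DataDir in config.ini
  volumeClaimTemplates:               # added at the StatefulSet spec level
  - metadata:
      name: ndb-data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi                # assumed size; adjust to your data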
To provide redundancy, each data node within a data node group should have (at least) one replica. So I recommend two StatefulSets, each with replicas=1, for a single data node group. If we want to expand to two or more data node groups, we just need to set replicas equal to the number of data node groups. In theory, expanding the number of data node groups is used to store databases that are larger than the memory capacity of one data node, and table data can then be sharded across data node groups for a performance gain. See the diagram below.
Data Nodes on Kubernetes with single node group vs multi node groups
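As an illustration, with NoOfReplicas=2 each pair of [ndbd] sections in config.ini forms one node group, so a second node group simply means two more [ndbd] entries (plus matching pods and Services). The fragment below is a sketch only; the dataa-1 / datab-1 hostnames and node IDs are assumptions that would correspond to running each data node StatefulSet with replicas=2:
[ndbd]
HostName=dataa-1
NodeId=4
DataDir=/var/lib/mysql
ServerPort=2202
[ndbd]
HostName=datab-1
NodeId=5
DataDir=/var/lib/mysql
ServerPort=2202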
E. Pre-requisites for SQL Node
To deploy an SQL node, you need a proper option file. Use the following sample as a starting point (file name: my_cnf.yaml) and deploy it as a configMap on Kubernetes.
[mysqld]
ndbcluster
ndb-connectstring=mgmt-0
user=mysql
Run this command to deploy:
$ kubectl -n mysql-cluster create configmap my-cnf --from-file=my_cnf.yaml
To spin up mysqld on Kubernetes, we need a Secret to hold the root password. Use the YAML below (file name: secret.yaml) to set the root password to "root".
apiVersion: v1
kind: Secret
metadata:
  name: mysql-root-password
  namespace: mysql-cluster
type: Opaque
data:
  password: cm9vdA==
Run this command to deploy:
$ kubectl apply -f secret.yaml
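The password value in a Secret's data field must be base64-encoded; cm9vdA== is simply "root" encoded that way. If you want a different password, you can generate the value like this (a hint, not part of the original steps):
$ echo -n 'root' | base64
cm9vdA==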
In contrast to InnoDB Cluster, mysqld in an NDB Cluster does not have to store data, because all data is intended to live on the data nodes. Because the SQL nodes can be stateless, I recommend deploying them on Kubernetes as a Deployment.
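A stateless Deployment also makes scaling the SQL tier simple. Keep in mind, though, that every SQL node needs a free [mysqld] slot in config.ini (the sample in section C defines two), so before scaling past two you would add another [mysqld] section and restart the management node. The command below is only a sketch of the scaling step itself:
$ kubectl -n mysql-cluster scale deployment mysql-cluster --replicas=3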
F. NDB Cluster YAML File
At this point, your preparation is complete and you are ready to deploy your first NDB Cluster on Kubernetes! All you need now is a YAML file. Copy the text below, paste it into a text file, and save it as all_cluster.yaml. You can use this file as a baseline YAML and freely modify it to meet your needs.
---
apiVersion: v1
kind: Service
metadata:
  name: mgmt-0
  namespace: mysql-cluster
spec:
  ports:
  - name: mgmtport
    port: 1186
    targetPort: 1186
  - name: pingport
    port: 7
    targetPort: 7
  selector:
    statefulset.kubernetes.io/pod-name: mgmt-0
  clusterIP: None
---
apiVersion: v1
kind: Service
metadata:
  name: dataa-0
  namespace: mysql-cluster
spec:
  ports:
  - name: dataport
    port: 2202
    targetPort: 2202
  - name: pingport
    port: 7
    targetPort: 7
  selector:
    statefulset.kubernetes.io/pod-name: dataa-0
  clusterIP: None
---
apiVersion: v1
kind: Service
metadata:
  name: datab-0
  namespace: mysql-cluster
spec:
  ports:
  - name: dataport
    port: 2202
    targetPort: 2202
  - name: pingport
    port: 7
    targetPort: 7
  selector:
    statefulset.kubernetes.io/pod-name: datab-0
  clusterIP: None
---
apiVersion: v1
kind: Service
metadata:
  name: mysql-cluster
  namespace: mysql-cluster
  labels:
    app: mysql-cluster
spec:
  ports:
  - name: tcp-rw
    port: 3306
    targetPort: 3306
  selector:
    app: mysql-cluster
  type: LoadBalancer
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mgmt
  namespace: mysql-cluster
spec:
  serviceName: "mgmt"
  replicas: 1
  selector:
    matchLabels:
      app: mgmt
  template:
    metadata:
      labels:
        app: mgmt
    spec:
      containers:
      - image: mysql/mysql-cluster:latest
        name: mysql
        volumeMounts:
        - name: mysql-configmap-volume
          mountPath: /etc/config.ini
          subPath: config.ini
        command: ["/bin/sh"]   # do not modify
        args: ["-c", "sleep 30; ndb_mgmd --config-file=/etc/config.ini --config-dir=/home; while true; do sleep 1; done;"]   # do not modify
      volumes:
      - name: mysql-configmap-volume
        configMap:
          name: config-ini
          items:
          - key: config_ini.yaml
            path: config.ini
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dataa
  namespace: mysql-cluster
spec:
  serviceName: "dataa"
  replicas: 1
  selector:
    matchLabels:
      app: dataa
  template:
    metadata:
      labels:
        app: dataa
    spec:
      containers:
      - image: mysql/mysql-cluster:latest
        name: mysql
        volumeMounts:
        - name: mysql-configmap-volume
          mountPath: /etc/datanode.cnf
          subPath: datanode.cnf
        command: ["/bin/sh", "-c", "sleep 60; ndbd --defaults-file=/etc/datanode.cnf --ndb-connectstring=mgmt-0; while true; do sleep 1; done;"]
      volumes:
      - name: mysql-configmap-volume
        configMap:
          name: datanode
          items:
          - key: datanode_ini.yaml
            path: datanode.cnf
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: datab
  namespace: mysql-cluster
spec:
  serviceName: "datab"
  replicas: 1
  selector:
    matchLabels:
      app: datab
  template:
    metadata:
      labels:
        app: datab
    spec:
      containers:
      - image: mysql/mysql-cluster:latest
        name: mysql
        volumeMounts:
        - name: mysql-configmap-volume
          mountPath: /etc/datanode.cnf
          subPath: datanode.cnf
        command: ["/bin/sh", "-c", "sleep 60; ndbd --defaults-file=/etc/datanode.cnf --ndb-connectstring=mgmt-0; while true; do sleep 1; done;"]
      volumes:
      - name: mysql-configmap-volume
        configMap:
          name: datanode
          items:
          - key: datanode_ini.yaml
            path: datanode.cnf
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql-cluster
  namespace: mysql-cluster
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mysql-cluster
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mysql-cluster
    spec:
      hostname: mysql-cluster
      containers:
      - image: mysql/mysql-cluster:latest
        name: mysql
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-root-password
              key: password
        volumeMounts:
        - name: mysql-configmap-volume
          mountPath: /etc/my.cnf
          subPath: my.cnf
      volumes:
      - name: mysql-configmap-volume
        configMap:
          name: my-cnf
          items:
          - key: my_cnf.yaml
            path: my.cnf
G. Deploying NDB Cluster YAML
Simply run the following command and wait about a minute for the cluster to be ready:
$ kubectl apply -f all_cluster.yaml
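If you prefer not to guess at the timing, you can watch the pods come up, or block until they are all Ready; these checks are optional additions:
$ kubectl -n mysql-cluster get pods -w
$ kubectl -n mysql-cluster wait --for=condition=Ready pod --all --timeout=300s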
H. Testing NDB Cluster
After about a minute, run the following command to see whether all pods are running:
goldfish.local:~/webinar/cge-k8s $ kubectl -n mysql-cluster get pod
NAME                             READY   STATUS    RESTARTS   AGE
dataa-0                          1/1     Running   0          68s
datab-0                          1/1     Running   0          68s
mgmt-0                           1/1     Running   0          69s
mysql-cluster-796d8b4d78-dwn9j   1/1     Running   0          68s
mysql-cluster-796d8b4d78-n6n4l   1/1     Running   0          68s
goldfish.local:~/webinar/cge-k8s $
Check cluster status from management node:
goldfish.local:~/webinar/cge-k8s $ kubectl -n mysql-cluster exec -it mgmt-0 -- ndb_mgm -c localhost -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @172.17.0.12  (mysql-5.7.30 ndb-7.6.14, Nodegroup: 0, *)
id=3    @172.17.0.15  (mysql-5.7.30 ndb-7.6.14, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @172.17.0.16  (mysql-5.7.30 ndb-7.6.14)

[mysqld(API)]   2 node(s)
id=4    @172.17.0.14  (mysql-5.7.30 ndb-7.6.14)
id=5    @172.17.0.13  (mysql-5.7.30 ndb-7.6.14)

goldfish.local:~/webinar/cge-k8s $
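Beyond show, the ndb_mgm client offers other quick health checks that may be handy here (optional, not part of the original walkthrough), for example:
$ kubectl -n mysql-cluster exec -it mgmt-0 -- ndb_mgm -c localhost -e "all status"
$ kubectl -n mysql-cluster exec -it mgmt-0 -- ndb_mgm -c localhost -e "all report memoryusage"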
Now you can play with your cluster. The following is an example.
1. Create an NDB table from SQL node 1 and insert three records
goldfish.local:~/webinar/cge-k8s $ kubectl -n mysql-cluster exec -it mysql-cluster-796d8b4d78-dwn9j -- mysql -uroot -h127.0.0.1 -proot -e "create database test; create table test.test (i int) engine=ndb; insert into test.test values (1); insert into test.test values (2); insert into test.test values (3); select * from test.test;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+------+
| i    |
+------+
|    1 |
|    3 |
|    2 |
+------+
goldfish.local:~/webinar/cge-k8s $
2. Query that data from SQL node 2
goldfish.local:~/webinar/cge-k8s $ kubectl -n mysql-cluster exec -it mysql-cluster-796d8b4d78-n6n4l -- mysql -uroot -h127.0.0.1 -proot -e "select * from test.test;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+------+
| i    |
+------+
|    1 |
|    3 |
|    2 |
+------+
goldfish.local:~/webinar/cge-k8s $
The NDB Cluster works!
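The manifest in section F also exposes the SQL nodes through a LoadBalancer Service on port 3306, which the examples above bypass by exec-ing into the pods. On Minikube you can reach that Service from your workstation too; an optional way to get a URL for it (assuming Minikube's service tunneling works in your environment) is:
$ minikube service mysql-cluster -n mysql-cluster --url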
I. Future Article on NDB Cluster
I will write a subsequent article about running NDB Cluster on Kubernetes. That article will discuss ways to expand the data node groups online, without shutting down the cluster, and explain how this can be done on Kubernetes. It will also discuss a hybrid deployment, where the data nodes are deployed on VMs / bare metal (outside the Kubernetes cluster).
Disclaimer:
The methods and tricks presented here are experimental only, and it is your own responsibility to test them, implement them, and provide support in case of issues. They are shown as examples and are not meant for a production deployment, as this is not a formally supported configuration. I encourage more testing to be done, including by the development team. An implementation with as few layers as possible is recommended, along with support from Oracle Support for any real-world implementation serving a business application.
Stay tuned, thank you!