MySQL Cluster . It is a high availability, high redundancy version of MySQL adapted for the distributed computing environment. Use the NDB Cluster storage engine to allow multiple MySQL servers to run in a cluster. This storage engine is available in the binary distributions of MySQL 5.0 and in the RPMs compatible with the latest Linux distributions .
[ hide ]
- 1 Panoramic
- 2 Basic MySQL Cluster basics
- 3 Elements of Architecture
- 4 Main processes
- 5 Hardware, software and networks
- 6 Configuration
- 7 TCP / IP connections
- 8 Sources
MySQL Cluster is a technology that enables clustering of databases in memory in a non-sharing environment. The non-sharing architecture allows the system to run on cheap hardware , and without any special hardware or software requirements . They also have no single point of failure because each component has its own memory and disk.
MySQL Cluster integrates the standard MySQL server with a memory clustered storage engine called NDB.
Basic MySQL Cluster basics
NDB Storage Engine
This is a memory storage engine that offers high availability and persistence of data. It is highly configurable offering a large number of options to handle load balancing and fault tolerance.
Management node (MGM node)
This type of node fulfills the function of managing, controlling and coordinating the other nodes within the cluster. Implement data configuration functions, start or stop other nodes within the cluster, run backups, or other administrative tasks. Because it controls and configures the rest of the nodes, it must be started before any other type of nodes using the ndb_mgmd command.
This type of node stores the data. The number of nodes of this type within the cluster equals the number of replicas times the number of fragments. That is, if 4 replicates of the data are handled with 2 fragments, 8 data nodes would be required. It is not necessary to handle more than one replica. This type of node is raised using the ndbd command.
SQL node (MySQL server)
Clustered data is accessed through this type of node. Basically, it consists of a MySQL Server that uses the NDB storage engine. It is started using the ndbcluster command, specifying the necessary configuration file for this server.
To connect to a MySQL cluster remotely, you must use the same client used to connect to a non-clustered MySQL server . The cluster is transparent to clients.
There are other types of clients that communicate with the administration server and provide the same functionalities as a node of this type. Unlike administrative nodes, clients allow administration tasks to be run remotely. Some tasks that can be performed with these clients include starting or stopping nodes, managing debug message tracking, showing the status of other nodes and their respective versions, performing backups, etc.
Elements of Architecture
A MySQL Cluster consists of a set of machines, each running a number of processes including MySQL servers , NDB Cluster data nodes , management servers, and specialized data access programs.
The MySQL CLuster architecture is designed not to have a single point of failure, each component will be a separate server and must have its own storage and memory capacity to work. In the MySQL Cluster architecture, four points are proposed, a manager, an SQL node and two data nodes, this in order to distribute the loads, maintain high availability and redundancy of data.
The MySQL Cluster is made up of three types of nodes, these are:
The administration node (ndb_mgmd): This type of node fulfills the function of managing, controlling and coordinating the other nodes within the cluster, it provides configuration data, it allows starting and stopping nodes, executing backup copies, it allows to know the status of data nodes and performs other administrative activities. Since this node type manages the configuration of other nodes, such a node must be started first, before any other node. In high availability environments it is recommended to have more than one administration node or manager node.
The data node (ndbd): This is the type of node that stores the cluster data. There are as many data nodes as replicas. It is not necessary to have more than one data node but if you only have one data node the redundancy would be lost, so in the proposed model it is proposed to work with two data nodes, which will work in synchronization and each one will be the replica of the other.
The SQL node (mysqld): These nodes are the ones that access the cluster data and the ones that maintain the LMS database schemas . In the case of MySQL Cluster, a client node is a traditional MySQL server that uses the NDB Cluster engine . These nodes will be the nodes through which the applications will access the data stored in the database, in this case the Database Cluster.
The architecture presented in this example is a basic architecture that shows how to install and configure a MySQL clustered database . It is important to clarify that in high availability environments it is recommended to install more than one manager node and more than one SQL node , this in order to distribute loads and always support the operation of the system.
All of these programs work together to form a MySQL Cluster. When the data is stored in the NDB Cluster engine , the tables are stored in the data nodes. Such tables are directly accessible from all the other MySQL servers in the cluster. Data stored on MySQL Cluster data nodes can be replicated: The cluster can handle individual data node failures with no other impact other than aborting a few transactions due to loss of transaction status. Since transactional applications are supposed to deal with transactional failures, this should not be a problem.
By bringing MySQL Cluster to the Open Source world , MySQL provides clustered data processing with high availability, high performance, and scalability available to everyone who needs it.
MySQLD The traditional database server process that can be used in clustered or isolated environments. For this process to be used within a MySQL cluster, it needs to be specially built to support the NDB storage engine, the compiled binaries available on the MySQL site already integrate this functionality into the process.
An easy way to ensure that the correct version is available for a MySQL cluster is by invoking the SHOW ENGINES command within the environment from the server looking for the existence of the NDB engine. This command displays all of the engines supported by the currently installed process.
In order to join a MySQL server to a cluster, it is necessary to be able to interact with the administration node of said cluster. For this, the MySQL configuration file (my.cfg) must specify the connection string to said server. Communication between servers is done using the TCP / IP protocol, so it is necessary to indicate in the connection string the IP address of the administrative node and the port on which the administration service is published.
The ndbd process is in charge of handling all the data in the tables using the cluster ndbd engine. This process supports the functionality of handling distributed transactions between nodes, recovery of faulty or offline nodes, checkpoint to disk (the moment data is actually written to disk), online backup, and other tasks related to the cluster layout.
The set of ndbd distributed processes cooperate collectively in the task of data management. Each of these processes generates a set of independent log files that is stored in the DataDir directory specified in the configuration file for each data node (config.ini). In order to connect a data node it is necessary to provide the ndbd process with the necessary information about the administrative node. Analogously to the MySQLserver node, you must specify its IP address and port in the configuration file.
When an ndbd process starts running, two processes are raised, one of them is called angel. The angel process monitors the execution of the storage process, and in case of failure, starts it again. For this reason, if you try to stop the ndbd process using the Unix kill command, it is necessary to kill both processes, starting with the angel process, since otherwise it will start the data process again. The best way to kill an ndbd process on a node is by using the appropriate commands on the management node or using an external management client.
The execution process uses a thread to read, write, scan data, and perform other activities. This thread is implemented asynchronously in order to easily perform multiple concurrent activities. Another thread is used to monitor that execution does not stop or cause a deadlock. Access to files on disk is done through multiple threads, each managing a particular data file. In this way the ndbd process is able to make exhaustive use of architectures with multiple processors in an optimal way.
It is the process that controls the administration server, being responsible for knowing and maintaining the cluster configuration and distributing this information to all the nodes that request it when joining the cluster. It also maintains the log of the cluster activities and reports its status to the clients that connect to it.
In a scheme with a single administration server, it is not necessary to specify a connection string to the administration node since the server itself is the same. If you are working in a scheme where there is more than one administration node, you must specify to identify each one with a specific ID and indicate the connection strings to each one of them.
Along with the NDB_MDMd process is ndb_mgm, which is responsible for managing the administration client that interacts with the administration node. The management client communicates with the management node using an API in C . This API can be used to develop applications dedicated to controlling and managing the cluster in a personalized way.
Hardware, software and networks
One of the advantages of MySQL Cluster is that it can run on normal hardware without any special requirements other than large amounts of RAM , due to the fact that all data is stored in memory. Naturally, multiple and faster CPUs improve performance. The memory requirements for cluster processes are relatively small.
The software requirements for Cluster are modest. The machines operating systems do not require any unusual modules, services, applications or strange configuration to support MySQL Cluster.
- For Mac OS Xor Solaris , the standard installation is sufficient.
- For Linux,the MySQL software requirements are simple: all you need is a production version of MySQL-MAX0; to have cluster support. It is not necessary to compile MySQL to use cluster.
For communication between nodes, the cluster supports TCP / IP network in any standard topology , and at least a 100 Mbps Ethernet network , plus a switch , hub , or router is expected to provide network connectivity to the entire cluster. It is recommended that MySQL Cluster be run on its subnet that is not shared with non-cluster machines for the following reasons:
Security: Communication between cluster nodes is not encrypted. The only way to protect transmissions within a MySQL Cluster is to run your cluster on a protected network. If it is about using MySQL Cluster for web applications , the cluster must reside behind a firewall and not in your DMZ or elsewhere.
Efficiency: Initializing a MySQL Cluster on a private or protected network allows the cluster to make exclusive use of bandwidth between machines in the cluster. Using a switch for the MySQL Cluster not only helps protect you from unauthorized access to cluster data, it also ensures that the cluster nodes are protected from interference caused by transmissions between other machines on the network. For added confidence, dual switches and dual cards can be used to eliminate the network as a single point of failure; various devices support failures for these communication links.
To avoid unnecessary resource reservation, the server is configured by default with the NDB engine disabled. To activate NDB, you need to configure the server’s my.cnf configuration file with the –ndbcluster option.
Since MySQL server is part of the cluster, you need configuration data that you know how to access the MGM node to get cluster configuration data. The default behavior is to search for the MGM node on localhost. However, you can specify your location where you are, this can be done in my.cnf or on the MySQL server command line . Before the NDB can be used, at least one MGM node must be operational, as well as the desired data nodes.
Setting up MySQL Cluster requires working with two files:
my.cnf: Specify options for all executables of the MySQL Cluster. This file must be accessible by each executable in the cluster.
config.ini: This file is read-only by the MySQL Cluster administration server, which distributes the information contained in it to all the processes participating in the cluster. config.ini contains a description of each node in the cluster. This includes configuration parameters for data nodes and configuration parameters for connections between all nodes in the cluster.
TCP / IP connections
TCP / IP is the default transport mechanism for establishing connections in MySQL Cluster. Normally it is not necessary to define connections since the cluster automatically creates a connection between all the data nodes, between each data node and the MySQL server nodes and between each node and the administration server.