Database Clusters – Scale up & Increase availability for Mission critical databases

What is a Database Cluster?

Databases provide back-end support for the critical applications used in an enterprise (such as ERP and CRM) by storing, organizing and retrieving all the data those applications use. A database cluster is a group of multiple such database servers/ nodes, combined in order to provide high availability and to scale up the number of database servers as application requirements grow (among other benefits). Let us learn more about database clusters in this article.

You might be wondering why a Computer Networking blog is giving an introduction to Database Clustering. Think about it like this: computer networks are basically built to enable applications to run over the network, and applications depend on databases to function. So, a basic understanding of applications/ databases is useful while designing networks, especially for anticipating the demand for network bandwidth.

Generally, the I/O interconnect between the members (nodes/ servers) of a database cluster is the weakest link. Specialized low-latency, high-capacity interconnects such as InfiniBand or SCI can deliver better performance between the nodes of a cluster. A SAN (Storage Area Network) can be employed in large clusters to improve their reliability and performance.

Applications of Database Clusters: Any application that needs to deal with large amounts of data in real time might find database clusters very useful. Some examples include real-time analytics, large e-commerce transactions, big multi-player games, telecom soft-switches, IPTV/ Video on Demand applications, share trading applications, ERP/ CRM applications for large companies, etc.

Types of Database Clusters:

Database servers are used to host databases. A small database can be hosted on a single database server. But when it becomes bigger (starts handling more data), additional servers (nodes) can be added, and database clustering software can be used to combine all these individual database servers into a large cluster that works as a single (huge) database system.

There are two major architectures that are popular for clustering databases – Shared Nothing Database Cluster & Shared Disk/ Shared Everything Database Cluster. Depending on the Database / Database Clustering software vendor, one of these might be used.

Shared Nothing Database Cluster:

In this architecture, the database files are divided into parts, and each part is hosted on a database server/ node that has exclusive control over the data it hosts. So, when an application needs to read or write some data, the clustering software routes the request to the node/ server that holds that particular data. Essentially, the database files are divided into sub-groups, each sub-group is hosted by certain nodes, and the work is divided among them.

Generally, database clustering software ensures that at least two nodes contain the same information (data), so that even if one node fails, users can fail over to another node almost instantly. Data is generally replicated synchronously across the nodes that store the same data, so that the back-up copies always carry the newest updates.
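To make the routing and fail-over idea concrete, here is a minimal Python sketch of a shared nothing cluster. It is only an illustration and not how any particular vendor implements it: the names `Node` and `ShardedCluster`, the hash-based choice of partition and the two-replica layout are all assumptions made for this example.

```python
import hashlib


class Node:
    """One database server that exclusively owns the rows stored on it."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.alive = True


class ShardedCluster:
    """Toy shared nothing cluster (hypothetical): one replica group per partition."""

    def __init__(self, node_groups):
        # node_groups is a list of lists; each inner list holds the
        # replica nodes that store the same partition of the data.
        self.node_groups = node_groups

    def _group_for(self, key):
        # Assumption: a stable hash of the key selects the partition.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.node_groups)
        return self.node_groups[index]

    def write(self, key, value):
        live = [node for node in self._group_for(key) if node.alive]
        if not live:
            raise RuntimeError("no live replica for key %r" % key)
        # Synchronous replication: every live replica is updated before
        # the write returns. (Re-syncing a node that rejoins is not modelled.)
        for node in live:
            node.rows[key] = value

    def read(self, key):
        # Fail-over: use the first replica in the group that is still alive.
        for node in self._group_for(key):
            if node.alive:
                return node.rows.get(key)
        raise RuntimeError("no live replica for key %r" % key)


cluster = ShardedCluster([[Node("a1"), Node("a2")], [Node("b1"), Node("b2")]])
cluster.write("order:42", "paid")
cluster._group_for("order:42")[0].alive = False  # simulate a node failure
print(cluster.read("order:42"))  # prints "paid", served by the surviving replica
```

Because every key maps to exactly one replica group, no two groups ever contend for the same data, which is the essence of the shared nothing design.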

MySQL is a popular open-source database system which offers a database cluster that follows the shared nothing type of database clustering. Each installation of this cluster has three types of nodes: Data nodes, Application nodes and Management nodes.

Shared Disk/ Shared Everything Database Cluster:

In the shared disk/ shared everything database cluster architecture, database files are logically shared among multiple interconnected nodes. The database files are common to multiple servers/ nodes and mostly stored using specialized storage systems (like NAS/ SAN/ SCSI disks, etc). Any part of the database can be accessed and updated by multiple database servers/ nodes.
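The contrast with the shared nothing model can be sketched in a few lines of Python. Again, this is just an illustration under stated assumptions: `SharedStorage` and `DbNode` are made-up names, and the single global lock is a simplification added here so that concurrent updates from different nodes do not overwrite each other. Real products coordinate access in far more fine-grained ways.

```python
import threading


class SharedStorage:
    """Stands in for the common storage (for example a SAN) that every node can reach."""

    def __init__(self):
        self.rows = {}
        # Assumption: one global lock guards all rows in this toy model.
        self.lock = threading.Lock()


class DbNode:
    """A database server that reads and updates the shared storage."""

    def __init__(self, name, storage):
        self.name = name
        self.storage = storage

    def increment(self, key, amount=1):
        # Read-modify-write is done under the lock so that updates made
        # by other nodes at the same time are not lost.
        with self.storage.lock:
            self.storage.rows[key] = self.storage.rows.get(key, 0) + amount

    def read(self, key):
        with self.storage.lock:
            return self.storage.rows.get(key)


storage = SharedStorage()
nodes = [DbNode("node1", storage), DbNode("node2", storage)]


def worker(node):
    for _ in range(1000):
        node.increment("stock:widget")


threads = [threading.Thread(target=worker, args=(node,)) for node in nodes]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(nodes[0].read("stock:widget"))  # prints 2000: both nodes updated the same row
```

Here any node can touch any row, so the design effort moves from deciding who owns the data to coordinating concurrent access to it.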

Oracle 11g is a common database system that employs the shared everything database cluster model. There are two types of nodes in this database cluster: Database servers (nodes) and Mirrored disk sub-system nodes.

Advantages of Database Clusters:

Database clusters provide high availability and fail-over. There is generally little or no downtime when individual nodes/ database servers fail or during scheduled maintenance.
It is possible to scale up the number of database read/write operations by adding nodes/ servers, which may be commodity hardware. With many database cluster vendors, scaling up can be on-demand and instantaneous.
It is possible to distribute clusters to remote data centers using the geographic replication feature offered by some database cluster vendors, in order to achieve disaster recovery and to take the data closer to the users accessing it (hence reducing latency).
Clusters can do automatic data partitioning (sharding) with basic load-balancing among the various cluster members.
Database clustering is a good option for application partitioning.
The cluster can be backed up (either manually or automatically) so that database contents can be restored after individual node/ cluster failures.
Cluster performance can be monitored in real time for parameters like breaches of user-defined thresholds, read-write stats, etc., if the cluster vendor supports this.
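As a small illustration of the monitoring point above, the following Python sketch checks per-node read/write rates against user-defined thresholds. Everything in it is hypothetical: the `NodeStats` fields, the `find_breaches` helper and the sample numbers are invented for this example and do not come from any vendor's monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    """A snapshot of read/write activity for one cluster node (made-up fields)."""

    name: str
    reads_per_sec: float
    writes_per_sec: float


def find_breaches(stats, thresholds):
    """Return (node, metric, value, limit) for every metric above its limit."""
    breaches = []
    for node in stats:
        for metric, limit in thresholds.items():
            value = getattr(node, metric)
            if value > limit:
                breaches.append((node.name, metric, value, limit))
    return breaches


# Invented sample values, purely for illustration.
snapshot = [
    NodeStats("db1", reads_per_sec=1200, writes_per_sec=300),
    NodeStats("db2", reads_per_sec=400, writes_per_sec=950),
]
limits = {"reads_per_sec": 1000, "writes_per_sec": 900}

for name, metric, value, limit in find_breaches(snapshot, limits):
    print(name, metric, value, "exceeds", limit)
```

With these sample values, db1 is flagged for its read rate and db2 for its write rate.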

excITingIP.com
