MS SQL Clustering

SteelEye LifeKeeper High Availability Solution

SQL Servers form a crucial part of many applications, and are nearly always valuable to a business. However reliance upon an SQL server also results in it becoming a single point of failure, where downtime affects multiple systems and end users leading to a loss of productivity by staff and/or a loss of sales.

It is therefore no wonder that solutions like Oracle's RAC or Microsoft's Clustered SQL Server are popular, as they offer redundancy and resiliance to failure, helping save organisations downtime and loss of revenue. Unfortunately, such solutions come with a large price tag and often require expensive hardware (for example shared storage) resulting in the price of the solution making it only viable in a minority of cases.

Nordicmind can help you gain the benefits of High Availability Clustering, at a fraction of the cost with increased functionality. Our solutions offer features not available with traditional clustering products - for instance disaster recovery, LAN or WAN failover and the option of whether or not to use shared storage, while also providing resiliance and enhancing availability.

Our solutions are generally based around SteelEye's award winning LifeKeeper High Availability product, which is available for Windows NT, 2000 and 2003. With over 10 years of development and usage, it is a solution that can be trusted.

No Requirement for Enterprise Edition of SQL Server

LifeKeeper does not require enterprise versions of the operating system or application, helping you to control costs. It also provides for a more flexible solution which can be expanded in the future to cluster any application.

No requirement for Shared Storage

One of LifeKeeper's benefits is that it does not require shared storage, which can result in significant cost savings, as well as providing increased redundancy and resiliance (offsite recovery, remote data store etc). This is undertaken through the use of SteelEye's Data Replication product, which is suitable for a LAN or WAN configurations.

LifeKeeper uses Server Virtualisation, resulting in failovers that are largely transparent to end users - in that end users often do not notice that the application is now running on a different node within a cluster, and do not need to reconfigure their application. One of the advantages of this approach is that nodes can be taken offline for administration during normal working hours.

Server Virtualisation uses floating IP addresses, which move across with applications on failover removing any need for clients to reconfigure, as well as ensuring that all necessary files are available on both servers for the application to restart where it left off previously. To make files available to multiple nodes within a cluster either shared storage (SAN, SCSI Array, NAS etc) or data replication over a network can be used.

ProActive Monitoring of Resources

LifeKeeper actively monitors and protects all resources required by the SQL Server application, and actively checks that the application itself is running, resulting in responsive failover to another node within the cluster if there are problems (e.g. faulty network cable, disk failure or server hang).

LifeKeeper's active monitoring helps to ensure the highest level of availability, going beyond just checking whether the server is alive, as seen in many other products.

LifeKeeper is capable of protecting the following optional SQL Server services :

  • Distributed Transaction Coordinator
  • SQLServerAgent
  • Microsoft Search
  • MSSQLServerADHelper

Active/Active or Active/Standby Configurations

An SQL Server cluster is capable of running in two basic configurations. The first, where both servers are running seperate instances of SQL Server (Active/Active), or secondly where one server acts as a hot backup for the other (Active/Standby). Both servers can run other applications, which may (or may not) be under control of the clustering software.

The SQL Cluster is capable of being configured in various configurations, using either shared storage, or the data replication configuration, as shown in the following images.

Within the cluster, all data files are stored on shared or mirrored volumes. While the application binary files are installed on each host locally.

LifeKeeper monitors all resources (processes, disk, ip addresses) and upon detecting a failure it initiates a failover.


The first scenario shows LifeKeeper protecting a Active/Standby configuration using shared storage. In this scenario, one node remains active, while another acts as a hot backup. The advantage of such a configuration is that in the event of failure on the primary node, there should be no performance degradation. However the doubling of hardware costs normally means that other configurations (E.g. N+1 or Active/Active) are used on larger clusters.


Data Replication (non-shared storage)

The second scenario (below) shows LifeKeeper protecting an Active/Standby configuration, without the use of shared storage. SteelEye's data replication product is used to mirror data between the nodes over a LAN. The LAN is normally a cross over ethernet cable, allowing for a high speed, low latency, low risk of failure configuration.

The mirroring between nodes can be either synchronous or asynchronous depending upon the environment and requirements. Normally within a LAN environment where bandwidth is not an issue, Synchronous replication will be used, which will help ensure data integrity.

Synchronous mirroring provides greater protection for data, but can result in poorer performance as writes have to take place on both servers before control returns to the calling application.

Asynchronous mirroring is near real time, in that writes occur on the local disk, and are queued to be sent to the remote server before returning control to the calling application.

See here for more information on data mirroring

Disaster Recovery Solution

This next scenario shows data replication taking place over a WAN. Because WANs often have lower bandwidth or greater latency than a LAN, for performance reasons it is often necessary to sacrifice some of the fault tolerance capabilities of the replication and use an asynchronous solution rather than a synchronous solution.


Shared Storage High Availability Solution

The final scenario shows shared storage being used in an Active/Active configuration where there are two instances of SQL server, which may be running on either nodes. This allows for maximum hardware utilisation, and leads to greatest performance, as well as allowing for a manual load sharing configuration to exist. Each instance of the SQL server is "virtual" in that it can move from one host to the other, without clients being aware - this is normally done through the use of floating IP addresses.

In this configuration, either shared storage or data replication technologies could be used. Shared storage generally has quicker failover times, and there is never any need to perform a full resynchronisation of the data after e.g. a node failure.


Original Source: www.openminds.co.uk

 

  • Does not require Enterprise Edition of SQL Server

  • Allows monitoring and protection of SQL Databases with or without shared storage

  • Protects multiple instances of SQL 2000

  • Active/Active environment

  • Disaster Recovery solutions available

  • N+1 or cascading failover

  • Failover across a LAN for local recovery

  • Failover over a WAN for off-site disaster recovery