High Availability and Load Balancing with VMware Clusters

Running VMware ESXi on a host server makes the most of that hardware because you can run multiple applications on isolated virtual machines (VMs). However, most organizations need more VMs than a single physical server can hold. This is where VMware clusters come in.

VMware clusters group multiple physical servers and treat them as a single machine by aggregating and managing their combined resources as a unit. When used in server virtualization, VMware clusters unlock vSphere High Availability (HA), load balancing and VMware vSAN features.

Gain Flexibility with vSphere Cluster

A vSphere cluster is a set of ESXi hosts configured to share resources such as processor, memory, network and storage. The limits depend on the vSphere version: a vSphere 6.x cluster can accommodate up to 64 ESXi hosts (vSphere 5.x allowed 32), with each 6.x host supporting up to 1,024 VMs.

Using vSphere clusters allows IT administrators to aggregate and organize virtualization resources in a VMware environment and tie them back to the underlying physical resources. Suppose a cluster contains three physical servers, each with four dual-core processors running at 8 GHz per core and 16 GB of memory.

The total computing power for such a cluster becomes 192 GHz (3 servers × 4 processors × 2 cores × 8 GHz), while the memory available is 48 GB (3 × 16 GB). With this setup, you don’t need to worry about the physical composition of the underlying cluster resources. All you need to do is set up resource pool policies based on the aggregate available resources via the vCenter Server.
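The arithmetic behind those totals can be checked with a few lines of Python. This is only a sketch: the constants come from the example above, while the function and variable names are made up for illustration.

```python
# Figures from the example cluster above; all names here are illustrative.
HOSTS = 3
PROCESSORS_PER_HOST = 4
CORES_PER_PROCESSOR = 2
GHZ_PER_CORE = 8
MEMORY_GB_PER_HOST = 16


def cluster_capacity(hosts, processors, cores, ghz_per_core, memory_gb):
    """Return (total GHz, total GB) for a cluster of identical hosts."""
    total_ghz = hosts * processors * cores * ghz_per_core
    total_gb = hosts * memory_gb
    return total_ghz, total_gb


total_ghz, total_gb = cluster_capacity(
    HOSTS, PROCESSORS_PER_HOST, CORES_PER_PROCESSOR, GHZ_PER_CORE, MEMORY_GB_PER_HOST
)
print(f"{total_ghz} GHz, {total_gb} GB")  # 192 GHz, 48 GB
```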

The VMware infrastructure assigns the resources to the VMs within the boundaries of those policies automatically.

For example, the marketing department reserves 128 GHz and 32 GB from a cluster of 192 GHz and 48 GB, leaving 64 GHz and 16 GB for the sales department. Allocating 64 GHz of computing power and 16 GB of memory to the sales department doesn’t prevent you from resizing that allocation on the fly.

If the sales department’s workload increases, you can dynamically raise its computing power from 64 GHz to 96 GHz. You can also raise its memory from 16 GB to 20 GB without shutting down the associated VMs. Nor are the resources taken from the marketing department lost to it for good.

When you take 32 GHz from the marketing department’s reservation, the sales department uses that capacity only while the marketing department is idle. When the marketing department’s workload picks up again, it reclaims its 32 GHz automatically. Because idle capacity is lent out in this manner, reserving resources for different pools does not waste them.
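The lend-and-reclaim behavior described above can be modeled with a toy allocator. This is a deliberately simplified sketch, not VMware's scheduler: it ignores shares, limits and expandable reservations, and the `Pool` class and `allocate` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    reserved_ghz: int  # capacity guaranteed to the pool
    demand_ghz: int    # capacity the pool's VMs currently ask for


def allocate(pools, cluster_ghz):
    """Grant each pool up to its reservation, then lend idle capacity to busy pools."""
    # Step 1: every pool gets what it needs, capped at its reservation.
    grants = {p.name: min(p.demand_ghz, p.reserved_ghz) for p in pools}
    spare = cluster_ghz - sum(grants.values())
    # Step 2: pools that want more borrow from whatever is left idle.
    for p in pools:
        extra = min(p.demand_ghz - grants[p.name], spare)
        grants[p.name] += extra
        spare -= extra
    return grants


# Marketing is partly idle, so sales borrows 32 GHz on top of its 64 GHz.
idle = allocate([Pool("marketing", 128, 96), Pool("sales", 64, 96)], 192)
print(idle)  # {'marketing': 96, 'sales': 96}

# Marketing gets busy again and reclaims its full reservation.
busy = allocate([Pool("marketing", 128, 128), Pool("sales", 64, 96)], 192)
print(busy)  # {'marketing': 128, 'sales': 64}
```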

Achieve High Availability with vSphere HA

High availability (HA) is a system’s ability to operate continuously without downtime. Availability is usually expressed as the ratio of uptime (the total time the system is available) to total time (uptime plus downtime) in a given year.

For example, the popular metric “five-nines”, or 99.999% availability, translates to roughly 5.26 minutes or less of total downtime in a year. Organizations can achieve HA in different ways, including redundant network interface cards (NICs), HA applications, and server clusters.
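To see where the five-nines figure comes from, the short sketch below converts an availability percentage into yearly downtime. It assumes a year of 365.25 days, and the function names are illustrative rather than part of any VMware tool.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumes an average year of 365.25 days


def availability(uptime, downtime):
    """Availability as the fraction of total time the system was up."""
    return uptime / (uptime + downtime)


def downtime_minutes_per_year(availability_pct):
    """Yearly downtime budget, in minutes, for a given availability percentage."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR


five_nines = downtime_minutes_per_year(99.999)
print(f"{five_nines:.2f} minutes")  # 5.26 minutes

# Each extra nine cuts the yearly downtime budget by a factor of ten.
for pct in (99.0, 99.9, 99.99, 99.999):
    print(pct, round(downtime_minutes_per_year(pct), 2))
```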

VMware uses a licensed feature known as vSphere HA to provide broad-based and cost-effective high availability at the virtualization layer. When enabled, vSphere HA automatically restarts failed VMs on other ESXi hosts that have spare capacity. This minimizes service disruptions and downtime while eliminating the need for costly dedicated hardware and additional software.

While the terms vSphere HA and vSphere Fault Tolerance (FT) are often used interchangeably, they mean different things. vSphere HA minimizes downtime by restarting VMs after a failure, so the affected workloads are briefly unavailable while they boot on another host. vSphere FT instead keeps a live secondary copy of a protected VM running on a different host, so the workload continues without interruption when the primary fails, at the cost of additional resources.

vSphere HA leverages a High Availability cluster (a logical grouping of ESXi hosts pooled on the same network) to protect against ESXi host, VM and application failures. Restarting VMs on different ESXi hosts is possible because the HA cluster has shared storage that keeps virtual machine disk (VMDK) files accessible to all the hosts within the cluster.

vSphere HA uses an agent known as Fault Domain Manager (FDM) to monitor the availability of physical servers. When you set up a VMware cluster, the vCenter Server places the FDM agent on each ESXi host in the cluster. One of the ESXi hosts in the cluster becomes the master, while the others are slaves. The master host monitors the signals from the slaves in the cluster and communicates with the vCenter Server.

If the master host fails to detect a signal from any host or virtual machine in the vSphere environment, it instructs vSphere HA to undertake remedial steps. If an entire host has failed, all VMs on that hardware get restarted on other servers in the cluster with spare capacity. If only a VM has failed, vSphere HA restarts that VM.
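The host-failure path can be sketched as two steps: spot hosts whose signal has gone quiet, then plan where their VMs restart. The code below is a rough illustration under stated assumptions, not how FDM is implemented. The 15-second timeout, the host and VM names, and the "most spare memory first" rule are all assumptions, and real vSphere HA also considers datastore heartbeats and admission control, which are not modeled here.

```python
def find_failed_hosts(last_heartbeat, now, timeout_s):
    """Return hosts whose most recent heartbeat is older than timeout_s seconds."""
    return [h for h, seen in last_heartbeat.items() if now - seen > timeout_s]


def plan_restarts(failed_hosts, vms_by_host, spare_gb, vm_memory_gb):
    """Map each VM from a failed host to the surviving host with the most spare memory."""
    spare = {h: gb for h, gb in spare_gb.items() if h not in failed_hosts}
    plan = {}
    for host in failed_hosts:
        for vm in vms_by_host.get(host, []):
            target = max(spare, key=spare.get, default=None)
            if target is None or spare[target] < vm_memory_gb[vm]:
                plan[vm] = None  # no surviving host has room for this VM
                continue
            plan[vm] = target
            spare[target] -= vm_memory_gb[vm]
    return plan


# Hypothetical cluster state: esxi-3 stopped sending heartbeats at t=70.
heartbeats = {"esxi-1": 100, "esxi-2": 100, "esxi-3": 70}
failed = find_failed_hosts(heartbeats, now=101, timeout_s=15)

plan = plan_restarts(
    failed,
    vms_by_host={"esxi-3": ["web", "db"]},
    spare_gb={"esxi-1": 8, "esxi-2": 6, "esxi-3": 0},
    vm_memory_gb={"web": 4, "db": 4},
)
print(failed)  # ['esxi-3']
print(plan)    # {'web': 'esxi-1', 'db': 'esxi-2'}
```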

Manage Cluster Resources with VMware DRS

Like vSphere HA, VMware Distributed Resource Scheduler (DRS) is a licensable feature that you can add to the VMware cluster. When you enable VMware DRS, vCenter Server uses its system algorithms coupled with your own defined rules to manage and optimize the cluster resources.

VMware DRS treats the amalgamated CPU, memory and storage resources as global resource pools that all virtual machines in the cluster can access. VMware DRS intelligently monitors the workload of running VMs and their resource consumption on ESXi hosts against resource assignment policies within the cluster.

If a particular workload violates the set policies, or there is potential for improvement, VMware DRS leverages VMware vMotion to dynamically reassign the VMs to different ESXi hosts within the cluster.
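One way to picture this kind of rebalancing is a greedy loop that keeps moving a VM from the busiest host to the idlest one until the gap between them is small. The sketch below is an assumption-laden illustration and not the actual DRS algorithm: it balances on CPU demand only, the 4 GHz threshold is arbitrary, and the `rebalance` function and host names are hypothetical.

```python
def rebalance(placement, threshold_ghz=4.0, max_moves=100):
    """Greedily migrate VMs until the busiest and idlest hosts differ by <= threshold_ghz.

    placement maps host -> {vm: cpu_demand_ghz} and is modified in place.
    Returns the list of (vm, source_host, destination_host) migrations.
    """
    moves = []
    for _ in range(max_moves):
        load = {h: sum(vms.values()) for h, vms in placement.items()}
        src = max(load, key=load.get)
        dst = min(load, key=load.get)
        gap = load[src] - load[dst]
        if gap <= threshold_ghz:
            break
        # Only VMs smaller than the gap can narrow it when moved.
        candidates = [vm for vm, ghz in placement[src].items() if 0 < ghz < gap]
        if not candidates:
            break
        # Prefer the VM that leaves the two hosts closest to even.
        vm = min(candidates, key=lambda v: abs(gap - 2 * placement[src][v]))
        placement[dst][vm] = placement[src].pop(vm)
        moves.append((vm, src, dst))
    return moves


cluster = {
    "esxi-1": {"a": 20, "b": 16, "c": 12},
    "esxi-2": {"d": 8},
    "esxi-3": {"e": 10},
}
migrations = rebalance(cluster)
final_load = {h: sum(vms.values()) for h, vms in cluster.items()}
print(migrations)
print(final_load)  # {'esxi-1': 24, 'esxi-2': 20, 'esxi-3': 22}
```

In a real cluster each entry in `migrations` would correspond to a vMotion operation, and memory, affinity rules and migration cost would also influence the decision.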

When you create a new VM, you don’t have to specify the host if you have enabled the DRS feature. VMware DRS automatically collects hosts’ details and the new VM’s resource consumption details in the cluster and generates recommendations for placements.
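A placement recommendation of this kind can be sketched as a filter-and-rank step: drop hosts that cannot fit the new VM, then order the rest by how much headroom they would keep. This is an illustrative assumption about the ranking, not the DRS placement logic itself, and `recommend_hosts` and the host names are hypothetical.

```python
def recommend_hosts(free, vm_cpu_ghz, vm_mem_gb):
    """Return hosts that can fit the VM, ordered from most to least remaining headroom.

    free maps host -> (free_ghz, free_gb).
    """
    headroom = {
        host: (ghz - vm_cpu_ghz, gb - vm_mem_gb)
        for host, (ghz, gb) in free.items()
        if ghz >= vm_cpu_ghz and gb >= vm_mem_gb
    }
    # Sort by remaining CPU first, then remaining memory, largest first.
    return sorted(headroom, key=headroom.get, reverse=True)


free_capacity = {"esxi-1": (10, 4), "esxi-2": (30, 12), "esxi-3": (20, 2)}
recommendations = recommend_hosts(free_capacity, vm_cpu_ghz=8, vm_mem_gb=4)
print(recommendations)  # ['esxi-2', 'esxi-1'] (esxi-3 lacks memory)
```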

This way, VMware DRS provides load balancing and quality of service (QoS) functionality. By migrating VMs to different ESXi hosts automatically, VMware DRS improves performance within vSphere environments. For this reason, most organizations use VMware DRS together with vSphere HA to achieve both failover and load balancing.

In the case of failover, vSphere HA restarts VMs on other ESXi hosts automatically, while DRS intelligently checks the available computing resources to recommend VM placements within the cluster.

High Availability Load Balancing with Parallels RAS

An organization requires an effective load balancer, whether in on-premises datacenters or public clouds, to maintain application availability for its customers, partners and end users. Besides ensuring optimal response times and high service availability for mission-critical applications, an effective load balancer allows an enterprise to scale up and accommodate any surge in traffic.

Parallels® Remote Application Server (RAS), an all-in-one virtual desktop infrastructure (VDI) solution, provides effective load balancing via its High Availability Load Balancing (HALB) feature. Parallels HALB eliminates the challenges associated with multi-gateway environments by ensuring that traffic gets redirected only to healthy gateways.
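The idea of sending traffic only to healthy gateways can be illustrated with a small round-robin balancer that skips gateways marked as down. This is a generic sketch of the technique and not the Parallels HALB implementation. The `GatewayBalancer` class and the gateway names are hypothetical, and in practice the health status would come from periodic health checks rather than manual calls.

```python
class GatewayBalancer:
    """Round-robin over gateways, skipping any that are marked unhealthy."""

    def __init__(self, gateways):
        self.gateways = list(gateways)
        self.healthy = set(self.gateways)
        self._next = 0

    def mark(self, gateway, is_healthy):
        """Record the result of a health check for one gateway."""
        if is_healthy:
            self.healthy.add(gateway)
        else:
            self.healthy.discard(gateway)

    def pick(self):
        """Return the next healthy gateway, or raise if none is available."""
        for _ in range(len(self.gateways)):
            gateway = self.gateways[self._next % len(self.gateways)]
            self._next += 1
            if gateway in self.healthy:
                return gateway
        raise RuntimeError("no healthy gateway available")


balancer = GatewayBalancer(["gw-1", "gw-2", "gw-3"])
balancer.mark("gw-2", False)  # pretend a health check on gw-2 failed
picks = [balancer.pick() for _ in range(4)]
print(picks)  # ['gw-1', 'gw-3', 'gw-1', 'gw-3']
```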

The process of deploying Parallels HALB is straightforward. Just install and configure the HALB appliance, then add it to Parallels RAS from the Console, and you’re good to go. Most importantly, you can run multiple HALB appliances in Parallels RAS simultaneously, further minimizing downtime to guarantee application availability.

Download your free 30-day Parallels RAS trial today, and experience high availability and load balancing for your applications!