What is load balancing?

In a large computing system, a single server cannot always handle the load or demand of the network traffic. This is where load balancing comes into place. Load balancing is the method of ensuring that networks run smoothly, by spreading the network or application traffic across a group of servers. Load balancers typically exist between client devices and backend servers and distribute the incoming requests to available servers that are capable of handling the request.

Types of Load Balancers

Hardware load balancers: proprietary rack-and-stack appliances that run specialized firmware and use purpose-built ASICs to distribute traffic.

Software-based load balancers: load-balancing services that run on standard x86 servers or virtual machines.

How do load balancers work?

Load balancers can be: hardware appliances or software running on standard servers and virtual machines.

Load balancers work by: sitting between client devices and backend servers, checking which servers are healthy, and distributing incoming requests among them according to a load-balancing algorithm.

Hardware vs software-based load balancers

There are two categories of load balancers: hardware-based and software-based. A hardware-based load balancer, also called a Hardware Load Balancing Device (HLD), is a proprietary rack-and-stack appliance that runs specialized firmware.

HLDs are built around Application-Specific Integrated Circuits (ASICs) that distribute traffic between clients and servers. A software-based load balancer, on the other hand, runs on standard x86 servers or virtual machines (VMs) as an Application Delivery Controller (ADC).

Built with specialized ASICs, HLDs manage traffic between clients and servers with minimal impact on the processor. During peak times, organizations must provision enough HLDs to meet increased demand, which means that many HLDs may sit idle during off-peak periods.

In contrast, software-based load balancers run their services on clustered VMs. A typical software-based load balancer designates one primary cluster server to distribute client workloads to the secondary servers, which minimizes downtime if one server fails. In this regard, software-based load balancers can scale more elastically to meet growing demand than their HLD counterparts.

Common Load Balancing algorithms

Load Balancing algorithms select which backend server handles the client traffic based on two factors: the server’s health status and pre-defined rules. A load-balancing algorithm first identifies which server pool can correctly respond to clients’ requests. Next, it uses pre-configured rules to choose an appropriate server within the pool to handle the traffic.
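The two-step selection described above can be sketched in a few lines of Python. This is an illustrative outline only, not an implementation from any particular product; the server names, the health flags, and the `pick_server` helper are all made up for the example.

```python
import random

# Hypothetical server pool: each entry records whether the server
# passed its last health check.
servers = [
    {"name": "web1", "healthy": True},
    {"name": "web2", "healthy": False},
    {"name": "web3", "healthy": True},
]

def pick_server(pool, rule):
    healthy = [s for s in pool if s["healthy"]]  # step 1: filter to healthy servers
    if not healthy:
        raise RuntimeError("no healthy servers available")
    return rule(healthy)                         # step 2: apply the pre-configured rule

# A trivial rule: pick any healthy server at random.
chosen = pick_server(servers, rule=random.choice)
print(chosen["name"])  # web1 or web3, never the unhealthy web2
```

The `rule` parameter is where the algorithms described below plug in: round robin, least connections, hashing, and so on are all different ways of choosing among the already-filtered healthy servers.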

Typical load balancing algorithms include:

Round robin

The load balancer forwards requests to the servers in sequence. After transmitting a request to the last server, the process restarts with the first server. There are two variants of the round-robin algorithm: weighted round robin and dynamic round robin.

The weighted round-robin algorithm assigns each server a weight based on its capacity and configuration. Weighted round robin is useful where the pool of servers is not identical. In contrast, the dynamic round-robin algorithm computes each server's weight in real time to determine where to forward requests.
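Plain and weighted round robin can both be sketched with `itertools.cycle`. The server names and weights below are made up; a weight of 2 simply means that server appears twice as often in the rotation.

```python
from itertools import cycle

# Plain round robin: cycle through the pool in order.
servers = ["web1", "web2"]
rr = cycle(servers)
print([next(rr) for _ in range(4)])  # ['web1', 'web2', 'web1', 'web2']

# Weighted round robin: web1 has weight 2, so it is repeated in the
# rotation and receives twice as many requests as web2.
weights = {"web1": 2, "web2": 1}
weighted_order = [s for s, w in weights.items() for _ in range(w)]
wrr = cycle(weighted_order)
print([next(wrr) for _ in range(6)])  # ['web1', 'web1', 'web2', 'web1', 'web1', 'web2']
```

A dynamic variant would recompute `weights` periodically from live server metrics instead of using a fixed table.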

Least connections method

As the name suggests, the least connections algorithm selects the server with the fewest active connections. It is appropriate where client workloads result in longer sessions.
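The selection itself reduces to taking a minimum over per-server connection counts. The counts below are invented for illustration; a real balancer would track them as connections open and close.

```python
# Hypothetical snapshot of active connections per server.
connections = {"web1": 12, "web2": 4, "web3": 9}

# Pick the server with the fewest active connections.
target = min(connections, key=connections.get)
print(target)  # web2
```

The least response time and least bandwidth methods described below follow the same pattern, substituting response time or consumed Mbps (or a combination) for the connection count as the quantity being minimized.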

Least Response Time method

The load balancer selects the server that has the fewest active connections and the minimum response time. This method is appropriate in instances where clients demand a prompt response from the server.

Least bandwidth method

The load balancer computes the bandwidth (in Mbps) required to send the client workloads to various servers. It then sends the request to the server that consumes the minimum bandwidth.

Hashing method

The load balancer computes a hash value from the client's packet, typically the source IP address. The hash value determines which server receives the client's workload. In other words, the client's IP address determines the server that receives the request, so the same client is consistently routed to the same server.
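A minimal sketch of IP hashing, assuming a fixed server list and the client's source IP as the hash key (both made up for the example): hash the address, then take the result modulo the pool size.

```python
import hashlib

servers = ["web1", "web2", "web3"]

def server_for(client_ip):
    # Hash the client's IP and map it onto the server pool.
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

print(server_for("203.0.113.7"))
# The mapping is deterministic: the same client always lands on the same server.
print(server_for("203.0.113.7") == server_for("203.0.113.7"))  # True
```

Note that a simple modulo scheme remaps many clients when the pool size changes; production balancers often use consistent hashing to limit that churn.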

Custom Load method

The load balancer queries each server's load, such as CPU and memory consumption, using Simple Network Management Protocol (SNMP). It then forwards incoming requests to servers based on each server's current workload.

Resource-based method

The load balancer determines which servers are idle based on existing sessions, CPU and memory consumption, and counters. It then distributes client workloads to servers that are consuming the least resources.

How can Parallels RAS help with load balancing?

Parallels RAS allows you to load balance your extensive IT infrastructure without the need for complex configurations or expensive add-ons. It balances both RDSH servers and internal components.

With Parallels RAS, server load balancing is available out of the box from the first installation. User access is distributed among healthy servers that host the same application. It also allows resource-based or round-robin-based balancing. Your IT infrastructure gets dynamic provisioning of RDS servers, allowing you to scale the number of hosts up or down dynamically.

Parallels RAS offers high-availability gateway load balancing and high redundancy, reducing the possibility of downtime and disruption. It effectively lets you reap the benefits of load balancing without the associated cost or complexity overhead.

Download the free trial of Parallels RAS to reap the benefits of load balancing!
