Implementing a Linux Virtual Server
[30 min read - published 9/19/2005 11:38:59 AM - Audience: Expert]
1. Load balancing and Linux Virtual Server (LVS)
1.1. What is load balancing?
With the success of the Internet and the variety of services it offered, companies ran into significant technical problems caused by the overloading of their architectures. Their equipment quickly proved obsolete and unable to handle the load, as the demand for resources (Web servers, mail, or other media) had grown sharply and suddenly. To meet this sudden demand, companies employed various techniques.
Before load balancers, site administrators used (and some still use) a load-balancing process known as "DNS Round Robin". This process relies on a DNS feature that makes it possible to associate several IP addresses with one host name. Each DNS "A" record maps a host name (such as www.monsite.com) to an IP address (such as 220.220.220.221). Usually only one IP address is associated with a host name.
With DNS Round Robin, it is possible to give multiple IP addresses to a single host name, distributing the traffic more or less evenly across the addresses in the list.
This seems to be a simple and effective way to distribute traffic among several servers, so why not use it to implement load distribution? The reason is that DNS Round Robin has several limitations, including unpredictable distribution of the load and a lack of fault tolerance. Understanding how DNS works helps to explain the problems with DNS-based load balancing.
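The lack of fault tolerance can be illustrated with a small simulation. This is only a sketch, not a DNS implementation: it mimics a name server rotating through the A records of one host name. Only the first address comes from the text above; the other two addresses and the `down` set are hypothetical.

```python
from collections import Counter
from itertools import cycle

# Hypothetical A records for www.monsite.com. Only the first address
# appears in the article; the other two are made up for illustration.
A_RECORDS = ["220.220.220.221", "220.220.220.222", "220.220.220.223"]


def round_robin_answers(records, n_queries):
    """Return the address handed out for each of n_queries lookups,
    rotating through the records in order."""
    rotation = cycle(records)
    return [next(rotation) for _ in range(n_queries)]


def lost_requests(answers, down):
    """Count the lookups that were pointed at a server that is down.
    Nothing in the rotation checks server health, so these fail."""
    return sum(1 for address in answers if address in down)


answers = round_robin_answers(A_RECORDS, 9)
per_server = Counter(answers)
failed = lost_requests(answers, down={"220.220.220.222"})

print(per_server)
print("requests sent to a dead server:", failed)
```

In this idealized rotation every server receives the same share, yet one third of the clients are still sent to the dead machine. In practice, resolver caching also makes the real distribution less even than the rotation suggests.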
Better solutions were needed to handle redundancy, modularity, and administration. As Web sites became increasingly critical, the load balancer was born from this need.
Load distribution, or "Server Load Balancing", has several advantages, which is why this technology is so widely used today. Its principal advantages, which directly address the needs of heavily visited and critical Web sites, are:
High availability:
Load distribution can check the status of each server, exclude from the pool a server that no longer answers, and bring it back in when it comes up again. This is automatic and does not require any intervention from the administrator.
Modularity / Scalability / Flexibility:
Load balancing allows servers to be added or removed at any time. A machine can be taken out for maintenance, even during peak hours, with little or no impact on the site. Likewise, when the load on the site increases, servers can be added immediately to absorb the extra traffic, in a completely transparent way.
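Both advantages come down to maintaining a pool of servers whose membership changes over time. The following is a minimal sketch of that idea, not the way any particular load balancer implements it. The `ServerPool` class, the `probe` callable, and the 10.0.0.x addresses are all hypothetical.

```python
class ServerPool:
    """Pool of real servers with an up/down flag for each one."""

    def __init__(self, servers):
        # Every server starts out in the active pool.
        self.status = {server: True for server in servers}

    def active(self):
        """Servers currently eligible to receive traffic."""
        return [server for server, up in self.status.items() if up]

    def add(self, server):
        """Bring a new server into the pool (scaling up)."""
        self.status[server] = True

    def remove(self, server):
        """Take a server out of the pool (maintenance)."""
        self.status.pop(server, None)

    def run_health_checks(self, probe):
        """Call probe(server) for each server and record the result.
        A server that fails is excluded; one that answers again is
        brought back, with no administrator action."""
        for server in self.status:
            self.status[server] = bool(probe(server))


pool = ServerPool(["10.0.0.11", "10.0.0.12"])

reachable = {"10.0.0.11"}  # pretend 10.0.0.12 stopped answering
pool.run_health_checks(lambda server: server in reachable)
after_failure = pool.active()

reachable.add("10.0.0.12")  # the server comes back up
pool.run_health_checks(lambda server: server in reachable)
after_recovery = pool.active()

pool.add("10.0.0.13")  # extra capacity for a traffic peak
print(after_failure, after_recovery, pool.active())
```

A real health check would replace the lambda with an actual probe, for example a TCP connection or an HTTP request to each server.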
1.2. LVS presentation
Linux Virtual Server is the product of an open source project. This software operates at layers 3 and 4 of the OSI model. Installed on a machine, its role is to spread the load by distributing the requests received from outside among several local servers.

For the system to work smoothly, the real servers must have the same software configuration, so that end users notice no difference between the servers when they connect.
The operating mode remains relatively simple: when a client sends a request to an IP address, that address is actually the external address of the load balancer. The load balancer forwards the packet to one of the real servers behind it, which answers the initial request. The load balancer itself decides how to distribute the load across the servers, according to an algorithm chosen by the administrator. To do so it maintains a table, which is managed with the "ipvsadm" tool. This table lists the real servers, their state, and all the services offered.
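As a rough illustration, the table could be filled with `ipvsadm` commands like the ones built below. The script only assembles and prints the command lines; it does not run them. The options used (`-A` to add a virtual service, `-a` to add a real server, `-t` for a TCP service, `-r` for a real server, `-s` for the scheduler, and `-m`, `-i`, `-g` for the forwarding method) belong to ipvsadm, while the addresses and the helper function are hypothetical.

```python
# Hypothetical addresses, chosen only for the example.
VIP = "192.168.1.100:80"
REAL_SERVERS = ["10.0.0.11:80", "10.0.0.12:80"]

# ipvsadm forwarding options for the three LVS methods.
FORWARDING_FLAGS = {
    "nat": "-m",     # masquerading (NAT)
    "tunnel": "-i",  # IP-in-IP tunneling
    "direct": "-g",  # gatewaying (direct routing)
}


def ipvsadm_commands(vip, real_servers, scheduler="rr", mode="nat"):
    """Build the command lines that declare one virtual service and
    attach each real server to it."""
    flag = FORWARDING_FLAGS[mode]
    # One virtual service on the VIP, with the chosen scheduler.
    commands = [f"ipvsadm -A -t {vip} -s {scheduler}"]
    # One entry per real server behind that service.
    for server in real_servers:
        commands.append(f"ipvsadm -a -t {vip} -r {server} {flag}")
    return commands


commands = ipvsadm_commands(VIP, REAL_SERVERS)
for command in commands:
    print(command)
```

Here `rr` selects plain round-robin scheduling. Once such commands have been run as root on the load balancer, `ipvsadm -L -n` lists the resulting table.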
In certain cases, you may need to combine two load balancers in a cluster. This more complex case requires the installation of the "keepalived" tool.
The advantage of a tool such as LVS is that the unavailability of a real server, whatever the reason (software problem, breakdown...), is automatically detected. The server in question is immediately excluded from the active pool of servers and the table of the load balancer is automatically updated. Your application thus remains available 99.9% of the time.
Let's now see how LVS manages load balancing.
1.3. Various types of load balancing
LVS offers three types of load balancing:
- NAT
- IP tunneling
- Direct routing.
1.3.1. NAT
This method is the easiest to set up and also the best known. The load balancer keeps a table in memory that stores the state of each server as well as the state of its connections. It can thus dispatch work intelligently to the various servers. The benefit is that the real servers do not require any particular configuration (Windows and Unix machines can coexist).

In a load-distribution architecture based on NAT as presented in the diagram, the load balancer sits on two separate subnets (often two different VLANs). It acts as the network gateway of the real servers, so all their traffic is routed through it.
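The address rewriting involved can be sketched in a few lines. This is a simplified model, not the kernel implementation: packets are reduced to a source, a destination, and a payload, connections are tracked per client address only, and all addresses and class names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: str


class NatBalancer:
    """Toy NAT load balancer: rewrites addresses in both directions."""

    def __init__(self, vip, real_servers):
        self.vip = vip
        self.real_servers = real_servers
        self.connections = {}  # client address -> chosen real server
        self._next = 0

    def inbound(self, packet):
        """Client -> VIP: rewrite the destination to a real server."""
        server = self.connections.get(packet.src)
        if server is None:
            # New client: pick the next server in round-robin order.
            server = self.real_servers[self._next % len(self.real_servers)]
            self._next += 1
            self.connections[packet.src] = server
        return replace(packet, dst=server)

    def outbound(self, packet):
        """Real server -> client: rewrite the source back to the VIP,
        so the client only ever sees the virtual address."""
        return replace(packet, src=self.vip)


VIP = "192.168.1.100"  # hypothetical virtual IP
balancer = NatBalancer(VIP, ["10.0.0.11", "10.0.0.12"])

request = Packet(src="203.0.113.5", dst=VIP, payload="GET /")
forwarded = balancer.inbound(request)

# The real server replies through its gateway, the load balancer.
reply = Packet(src=forwarded.dst, dst=request.src, payload="200 OK")
returned = balancer.outbound(reply)

print(forwarded)
print(returned)
```

Both `inbound` and `outbound` run on the load balancer, which is why it carries the traffic in both directions with this method.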
1.3.2. IP Tunneling
This method has the advantage of avoiding a potential overload of the load balancer. With the NAT method, the load balancer can easily become a bottleneck for the platform, because all traffic passes through it both inbound and outbound. With IP tunneling, the load balancer only handles incoming requests: it encapsulates each one in a new IP packet addressed to the chosen real server, and once the real server has processed the request, it sends the reply directly to the client. This significantly reduces the load on the load balancer.
This method is also attractive because the real servers can be located in different places (not necessarily near the load balancer).
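The encapsulation step can be modelled as follows. This is a conceptual sketch only: a packet is reduced to three fields, the tunnel is represented by nesting one packet inside another, and the addresses and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: object  # a string, or another Packet when encapsulated


def encapsulate(packet, balancer_ip, real_server_ip):
    """Load balancer side: wrap the untouched client packet in an
    outer packet addressed to the chosen real server."""
    return Packet(src=balancer_ip, dst=real_server_ip, payload=packet)


def real_server_reply(outer, body):
    """Real server side: unwrap the original packet, then answer the
    client directly, using the virtual IP as the source address."""
    inner = outer.payload
    return Packet(src=inner.dst, dst=inner.src, payload=body)


VIP = "192.168.1.100"        # hypothetical virtual IP
BALANCER = "192.168.1.1"     # hypothetical load balancer address
REAL_SERVER = "198.51.100.7"  # hypothetical, possibly on a remote site

request = Packet(src="203.0.113.5", dst=VIP, payload="GET /")
outer = encapsulate(request, BALANCER, REAL_SERVER)
reply = real_server_reply(outer, "200 OK")

print(outer)
print(reply)
```

The reply never goes back through the load balancer. For this to work, each real server has to accept packets addressed to the virtual IP and support IP-in-IP decapsulation.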

1.3.3. Direct routing
This technique is nearly identical to IP tunneling, with the only difference that here the real servers are on the same local area network as the load balancer. The benefit is the same as with IP tunneling (the load balancer is no longer a point of congestion), but without IP encapsulation: the real servers are reached directly on the local network and still answer clients from the virtual IP address.
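In direct routing, the load balancer forwards the request by changing only the link-layer destination of the frame. The sketch below models that step under simplifying assumptions: a frame is reduced to its MAC and IP addresses plus a payload, and the MAC addresses, IP addresses, and function name are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str
    payload: str


def direct_route(frame, balancer_mac, real_server_mac):
    """Re-send the frame on the LAN to the chosen real server.
    Only the MAC addresses change; the IP addresses are untouched,
    so the real server still sees a packet for the virtual IP."""
    return replace(frame, src_mac=balancer_mac, dst_mac=real_server_mac)


VIP = "192.168.1.100"                # hypothetical virtual IP
BALANCER_MAC = "02:00:00:00:00:01"   # hypothetical
REAL_SERVER_MAC = "02:00:00:00:00:11"  # hypothetical
ROUTER_MAC = "02:00:00:00:00:fe"     # hypothetical upstream router

incoming = Frame(
    src_mac=ROUTER_MAC,
    dst_mac=BALANCER_MAC,
    src_ip="203.0.113.5",
    dst_ip=VIP,
    payload="GET /",
)
forwarded = direct_route(incoming, BALANCER_MAC, REAL_SERVER_MAC)
print(forwarded)
```

Because a MAC address is only meaningful on one network segment, this is what ties the real servers to the load balancer's LAN. Each real server must also accept traffic for the virtual IP without answering ARP requests for it.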
