A chap on Twitter was asking for this type of article. We
did not have one, and neither did our competition; in fact, I could not find a
good example anywhere. So I have written this one. I won’t claim it
is very good, but it is a little better than nothing (or than out-of-date items).
Load balancing is a generic term used to describe a set of
practices aimed at achieving resilience, redundancy and scale in a farm of
similar application servers.
Most commonly, this terminology is used in connection with web
or Internet-facing servers (though any application is a candidate for load
balancing). This raises the most important question: what is it you want to
achieve?
In a lot of cases this will simply be failover (or
redundancy). This is where you have an automatic capability so that, should your
application fail on one server, a second server takes over the task. In this
scenario, work or session information may be lost, but users are able to
continue/restart with the minimum of downtime (i.e. time in which they cannot
do anything).
Failover is nearly interchangeable with the terms resilience
and clustering; which terminology is used simply comes down to the preference of
the person talking about this space. I tend to use
redundancy/failover to mean an active/standby scenario. This is where you have
one instance of the application running at any one time, with the capability of
switching which is active automatically and/or by an operator’s intervention.
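
As a rough illustration of the active/standby idea, here is a minimal Python
sketch. The class names, the server names and the `healthy` flag are all
invented for the example; a real product would detect failure itself rather
than have a flag set by hand.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    healthy: bool = True


class ActiveStandbyPair:
    """One instance serves traffic at a time; the other waits as standby."""

    def __init__(self, primary: Instance, secondary: Instance) -> None:
        self.active = primary
        self.standby = secondary

    def switch(self) -> None:
        """Swap the roles. An operator could also call this by hand."""
        self.active, self.standby = self.standby, self.active

    def route(self) -> str:
        """Return the name of the instance that should take the next request."""
        # Automatic failover: only switch if the standby is actually usable.
        if not self.active.healthy and self.standby.healthy:
            self.switch()
        if not self.active.healthy:
            raise RuntimeError("no healthy instance available")
        return self.active.name


pair = ActiveStandbyPair(Instance("server-a"), Instance("server-b"))
first = pair.route()           # server-a is active and healthy
pair.active.healthy = False    # simulate the application failing on server-a
second = pair.route()          # the standby takes over
```

Note that nothing here carries session state across the switch, which matches
the point above that work or session information may be lost.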
I use cluster when we are talking about multiple instances
running at the same time with the work load being spread across them; this is
true load balancing. The advantage of this type of operation is that it also
allows your applications to scale to deal with numbers of users that would be
greater than one server could manage.
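
To make "spreading the work load" a little more concrete, the sketch below
sends each new piece of work to the instance that currently has the least
outstanding work. This is only one possible policy (often called least
connections), and the `Cluster` class and server names are assumptions made
for the example.

```python
class Cluster:
    """All instances are active; new work goes to the least busy one."""

    def __init__(self, servers):
        # Number of requests currently in progress on each server.
        self.in_progress = {server: 0 for server in servers}

    def assign(self):
        """Pick the server with the fewest requests in progress."""
        server = min(self.in_progress, key=self.in_progress.get)
        self.in_progress[server] += 1
        return server

    def finish(self, server):
        """Record that a request on this server has completed."""
        self.in_progress[server] -= 1


cluster = Cluster(["app-1", "app-2", "app-3"])
assigned = [cluster.assign() for _ in range(4)]
cluster.finish("app-2")          # app-2 is now idle
next_server = cluster.assign()   # so it receives the next request
```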
So how do you do all the above? Or, more importantly, where do you do this work? There are two places: inside
the application and outside it. Many modern applications have a concept of
failover or redundancy, and a few are able to cluster themselves. There are
also third-party products (often in the form of a plug-in) that can achieve
this.
The problem is that they are often simplistic, and that they
add more compute load onto the servers (they are having to work out and manage
the distribution of load, as well as working on the actual task they are there
for).
Therefore it has become almost standard to deploy an
external load balancing product for applications that have a large audience.
With the advent of Cloud Computing, however, there has been a bit of a step
backward in this respect, due to the smaller scale of many deployments and the
limitations of the clouds themselves. But this is changing rapidly, as cloud
vendors add the functionality and the “deployers” of these
applications become more aware of the possibilities of load balancing.
So let’s focus on external load balancing. This is a product
space that has been around for quite a few years now and has undergone a couple
of dramatic evolutionary leaps.
Before load balancing by a dedicated device or application,
we had DNS round robin (though this is probably better described as load
sharing as no measurement of the load is taken into account). DNS round robin
is still a valid method and there are companies that are willing to do this for
you as a service (even adding intelligence in terms of availability of the
service).
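
A simplified model of DNS round robin, in Python, might look like the
following. It assumes the name server rotates the list of addresses on every
query and that each client simply uses the first address it is given; real
servers and resolvers vary in ordering and caching. The addresses are
documentation examples, not real servers.

```python
from collections import deque


class RoundRobinRecordSet:
    """Hands out the same addresses in a rotating order, one rotation per query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        """Return the current ordering, then rotate it for the next query."""
        current = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the back
        return current


records = RoundRobinRecordSet(["192.0.2.10", "192.0.2.11", "192.0.2.12"])

# Six clients each take the first address from their answer.
first_choices = [records.answer()[0] for _ in range(6)]
```

Nothing in this model looks at how busy a server is, or even whether it is
up, which is why load sharing is the better description.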
The next step in the evolution of the load balancing space
was to put “something” in front of the servers and have this make a decision on
which server to send the traffic to. In their earliest forms these were pretty
simple devices that sat in the path of the traffic “messing about with IP
addresses”(tm)! What this meant was that the service was identified as IP1,
which was an address owned by the load balancer. The load balancer would forward
connections aimed at this IP address, changing the destination IP address
to that of one of the servers (IP2, IP3 etc.) configured for this service. On
the way back the load balancer would change the server’s address back to the
original (IP1), so the client only ever saw the service address.
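
The address rewriting can be sketched in a few lines of Python. This is a toy
model, not how any particular product works: the `Packet` and
`NatLoadBalancer` names are invented, servers are chosen in simple rotation,
and a client is identified by its address alone (a real device would track
ports as well).

```python
from dataclasses import dataclass, replace
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


class NatLoadBalancer:
    def __init__(self, service_ip, server_ips):
        self.service_ip = service_ip
        self._servers = cycle(server_ips)
        self._connections = {}  # client address -> server chosen for it

    def inbound(self, packet):
        """Client to server: swap the service address for a real server."""
        server = self._connections.get(packet.src)
        if server is None:
            server = next(self._servers)
            self._connections[packet.src] = server
        return replace(packet, dst=server)

    def outbound(self, packet):
        """Server to client: put the service address back as the source."""
        return replace(packet, src=self.service_ip)


lb = NatLoadBalancer("IP1", ["IP2", "IP3"])
request = Packet(src="client-1", dst="IP1")
forwarded = lb.inbound(request)                            # now addressed to IP2
reply = lb.outbound(Packet(src=forwarded.dst, dst=request.src))
```

From the client’s point of view the whole conversation is with IP1; it never
learns that IP2 did the work.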
Functionality was added around this, such as health
monitoring to ensure that the servers in the farm or pool were actually up and
working. A huge number of other enhancements have been made to this generation
of the load balancing family, but this is the core behaviour. It is both a good
and a bad point that this enables applications to scale by adding more servers
to those available for a specific service: good because it allows scaling, bad
because it encourages rather inefficient use of resources.
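
Health monitoring can be sketched as a pool that probes each server and stops
using any that fail repeatedly. In this Python example the probe is just a
function passed in (in practice it might be a TCP connect or an HTTP request),
and the threshold of two consecutive failures is an arbitrary assumption.

```python
from typing import Callable, Dict, List


class MonitoredPool:
    def __init__(self, servers: List[str], probe: Callable[[str], bool],
                 failures_allowed: int = 2) -> None:
        self._probe = probe
        self._failures_allowed = failures_allowed
        # Consecutive failed checks per server.
        self._failures: Dict[str, int] = {server: 0 for server in servers}

    def run_checks(self) -> None:
        """Probe every server once and update its failure count."""
        for server in self._failures:
            if self._probe(server):
                self._failures[server] = 0   # a success clears the count
            else:
                self._failures[server] += 1

    def available(self) -> List[str]:
        """Servers that have not yet reached the failure threshold."""
        return [s for s, n in self._failures.items()
                if n < self._failures_allowed]


down = {"web-2"}  # pretend web-2 has stopped responding
pool = MonitoredPool(["web-1", "web-2", "web-3"], probe=lambda s: s not in down)

pool.run_checks()
after_one_check = pool.available()    # one failure is tolerated
pool.run_checks()
after_two_checks = pool.available()   # web-2 is now taken out of service
down.clear()
pool.run_checks()
after_recovery = pool.available()     # web-2 is back in the pool
```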
It was this problem that the second generation of the load
balancing family tree first addressed. This was achieved by a dramatic
re-architecting of the core behaviour. Rather than being a lump in the network
passing connections through, the products became application proxies
(though most called themselves accelerators). What this change meant was that
rather than passing on the connection, the proxy terminates it and initiates a
new connection of its own for the communication between itself and the server.
This means that individual requests from multiple clients can be sent down the
same connection from the proxy to the server.
The result is that there can be dramatic improvements in the
performance of the servers as they are not having to work as hard. For example,
rather than trying to set up and tear down lots of TCP connections per second,
there can be a much smaller number of long lasting connections that are reused
again and again. Similarly, rather than coping with lots of high latency
connections eating up processing and memory resource, all the servers see are
fast local connections from the load balancer (accelerator).
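
The effect of connection reuse on the server can be shown with a small
counting exercise in Python. Everything here is a stand-in: no real sockets
are opened, the class names are invented, and it assumes the worst case in
which every client request arrives on its own connection.

```python
class BackendConnection:
    def send(self, request):
        return f"response to {request}"


class BackendServer:
    def __init__(self):
        self.connections_opened = 0

    def open_connection(self):
        self.connections_opened += 1   # each one costs the server a TCP setup
        return BackendConnection()


class PassThrough:
    """First generation: every client connection becomes a server connection."""

    def __init__(self, server):
        self._server = server

    def handle(self, request):
        return self._server.open_connection().send(request)


class ReusingProxy:
    """Second generation: requests share a few long-lived server connections."""

    def __init__(self, server, pool_size=2):
        self._pool = [server.open_connection() for _ in range(pool_size)]
        self._next = 0

    def handle(self, request):
        connection = self._pool[self._next]
        self._next = (self._next + 1) % len(self._pool)
        return connection.send(request)


requests = [f"GET /page/{i}" for i in range(100)]

direct_server = BackendServer()
direct = PassThrough(direct_server)
for request in requests:
    direct.handle(request)

proxied_server = BackendServer()
proxy = ReusingProxy(proxied_server)
for request in requests:
    proxy.handle(request)
```

With these assumptions the first server sets up 100 connections and the
second only 2, for the same 100 requests.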
Other performance improvements are also available by
offloading SSL encryption/decryption work to the proxy and
manipulating/inspecting XML on the proxy.
All of the above reduces the amount of work required from
the servers behind the proxy, so much so that in most cases some of this server
resource can be removed. So rather than adding servers to scale your
applications it is possible to remove servers whilst still being able to handle
larger amounts of traffic.
Along with the improvements in performance, a trend started
in this generation of the family to add more intelligence to the products.
Functionality such as the ability to inspect, modify and remove HTTP headers, and
to make forwarding decisions based on the content of the traffic, became common.
It is this sort of functionality that causes these devices to sometimes be called
content switches.
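
A content switching decision might look something like the Python sketch
below. The pool names, the URL prefixes and the `X-Internal-Debug` header are
all made up for illustration; `X-Forwarded-For` is the conventional header a
proxy uses to pass on the client’s address.

```python
def normalise(headers):
    """HTTP header names are case-insensitive, so compare them in lower case."""
    return {name.lower(): value for name, value in headers.items()}


def choose_pool(method, path, headers):
    """Pick a server pool from the content of the request."""
    headers = normalise(headers)
    if path.startswith(("/images/", "/static/")):
        return "static-pool"
    if "mobile" in headers.get("user-agent", "").lower():
        return "mobile-pool"
    if method == "POST" and path.startswith("/api/"):
        return "api-pool"
    return "default-pool"


def rewrite_headers(headers, client_ip):
    """Remove one header and add another before forwarding to the server."""
    forwarded = normalise(headers)
    forwarded.pop("x-internal-debug", None)   # hypothetical header to strip
    existing = forwarded.get("x-forwarded-for")
    forwarded["x-forwarded-for"] = (
        f"{existing}, {client_ip}" if existing else client_ip
    )
    return forwarded


pool = choose_pool("GET", "/", {"User-Agent": "ExampleBrowser Mobile"})
outgoing = rewrite_headers({"Host": "www.example.com", "X-Internal-Debug": "1"},
                           "203.0.113.7")
```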
We now reach the third generation of this family (which is
where we are today). Basically, take everything that has gone before, i.e. all
the load balancing and acceleration, and wrap around it a huge degree of
intelligence to manage the application traffic on the network.
This generally means the addition of a scripting language
and an API to the product to allow granular control of what happens to the
application. I tend to talk about this type of functionality as a translation
between how the business describes behaviour and how this is enabled on the
network. For example, the CEO says it would be good to be able to
differentiate between users of the application (e.g. gold card holders and
silver card holders), and to ensure the high value users get a good service even
during busy periods. The scripting language is the tool to make this happen with
the application on the network by, for example, identifying the users (by account
number), monitoring the performance of the servers, and, when this drops to a
certain threshold, slowing down the connections of the silver card users.
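
The gold/silver rule could be expressed roughly as follows. Python stands in
here for whatever scripting language a given product provides, and the account
numbers, the 500 ms threshold and the 200 ms delay are invented values, not
recommendations.

```python
GOLD_ACCOUNTS = {"1001", "1002"}   # hypothetical gold card account numbers
SLOW_RESPONSE_MS = 500.0           # servers count as "busy" above this
SILVER_DELAY_MS = 200.0            # delay applied to silver users when busy


def delay_for_request(account_number, avg_server_response_ms,
                      gold_accounts=GOLD_ACCOUNTS):
    """Return how long (in ms) to hold back this request before forwarding it."""
    if avg_server_response_ms < SLOW_RESPONSE_MS:
        return 0.0                 # servers are coping, nobody is slowed down
    if account_number in gold_accounts:
        return 0.0                 # gold users are never throttled
    return SILVER_DELAY_MS         # silver users make room during busy periods


quiet_silver = delay_for_request("2001", avg_server_response_ms=120.0)
busy_silver = delay_for_request("2001", avg_server_response_ms=900.0)
busy_gold = delay_for_request("1001", avg_server_response_ms=900.0)
```

The point is less the code than the translation: a business statement about
gold and silver customers becomes a few lines of policy sitting in front of
the application.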
This generation is mostly labelled Application Delivery
Controllers (ADCs), a term coined by Gartner a few years ago.
One word of warning: the above descriptions and
categories are not hard and fast rules. There is a lot of crossover and
blurring of functionality. There are a couple of vendors who call their
products ADCs even though they are still selling what is, by my definition, in
essence a load balancer, albeit one with a lot of functionality.
There are also two other ways in which the products in the
family tree are differentiated. One is by commercial and open source versions,
and the other is the split of the space into hardware appliance type products
and software application products.
Historically the hardware appliance was King of this space
for a couple of reasons. The first is that the majority of the commercial
players came from a network hardware background and therefore this was the
natural thing for them to do. Also, until fairly recently commodity hardware
was not powerful enough to run these products. Custom ASICs and Network
Processors were required to achieve the performance needed in these tasks. This
has dramatically changed and it is fair to say that similar performance can be
achieved whatever route (hardware or software) you take, and the decision is
now more about features, ease of use and value for money than raw performance.
Most devices can outperform the network connectivity of even the largest
websites.
So what does the future hold for this family of products? For some time now application delivery has been an integral part of how
organisations take their services to market. More and more work that was
previously done inside the application is being shifted to the ADC sitting on
the network. This is especially true of tasks that need to be replicated
across a number of services, e.g. authentication, geo-IP lookups and content
translation (for example for mobile devices). This makes the ADC an essential
part of the application.
Now with organisations looking at moving (or developing new)
applications to Cloud infrastructures it is important to take this
functionality into the Clouds. However, most cloud vendors are not in a
position to offer more than simple load balancing. If you need the higher
functionality that you have become used to, you need to deploy an ADC in your
Cloud environment yourself.
The problem with this is that you obviously cannot float a
hardware appliance in a virtual data centre. Therefore the future, almost by
default, has to be in software ADCs. It is for this reason that Citrix recently
announced their own virtual product, only 6 years after ZXTM became available
as a software product. Other vendors will probably be following along later.
The following diagram is a snapshot of some of the players
in this space and how they fit into the different functional generations.
Nick Bond