Scaling Compute Load
One of our customers approached Digipede (www.digipede.net) while designing their first implementation of an SOA. What led them to SOA was the need to expand internal access to a critical financial analytics application that performed both short- and long-running computations. They needed to give many more people access to this service and expected demand to increase in the future. At the same time, they were under pressure to reduce the turnaround time for long-running analytics requests.
A critical issue for this company's service (and for services in many SOAs) is that it is highly compute-intensive. Typical analytics for a financial-services company include running scenarios based on years of historical data. As in many such simulations, better results can be achieved by running more scenarios, thus using more computing power. Clearly, this client needed to improve the performance and scalability of the service in a way that could scale to their future needs.
The classic approach to scaling up a service involves acquiring the fastest machine possible. However, this approach can be prohibitively expensive; the cost of a 32-way SMP box can easily top $1 million. Moreover, scaling up is ultimately a dead end. While you can scale to a certain extent, the size of the server limits the ultimate scalability. To avoid that expense and at the same time achieve the required scalability, many enterprises are turning to distributed computing, bringing together many lower-cost computers to achieve the desired result.
Frequently, we find people insisting that network load balancing (NLB) is a solution to this problem. It is not. NLB balances network load, not CPU load. It directs traffic to servers performing the fewest transactions per interval of time. This is a poor measure of load for compute-intensive tasks: a server that is handling a long-running, CPU-intensive request completes few (or even zero) requests over the interval and therefore looks underutilized to NLB. What does NLB do in this case? It piles more work on that computer. A round-robin configuration of NLB doesn't solve the problem either, because it hands out requests in turn without regard to how busy each server already is.
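To make the failure mode concrete, here is a minimal sketch in Python. It is a toy model of the transaction-count heuristic as described above, not the algorithm of any particular NLB product. The Server class, the server names, and the load figures are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Server:
    # All fields are illustrative assumptions, not metrics from a real NLB product.
    name: str
    completed_in_interval: int  # requests finished during the last interval
    cpu_utilization: float      # fraction of CPU in use, 0.0 to 1.0


def pick_by_transactions(servers):
    """Toy transaction-count heuristic: the server that completed the fewest requests wins."""
    return min(servers, key=lambda s: s.completed_in_interval)


def pick_by_cpu(servers):
    """CPU-aware alternative: the server with the least busy processor wins."""
    return min(servers, key=lambda s: s.cpu_utilization)


servers = [
    Server("web-1", completed_in_interval=120, cpu_utilization=0.35),
    Server("web-2", completed_in_interval=95, cpu_utilization=0.30),
    # Tied up with one long-running analytics job: nothing completed, CPU saturated.
    Server("analytics-1", completed_in_interval=0, cpu_utilization=1.00),
]

transaction_choice = pick_by_transactions(servers)
cpu_choice = pick_by_cpu(servers)

print("Transaction-count heuristic picks:", transaction_choice.name)
print("CPU-aware heuristic picks:", cpu_choice.name)
```

In this toy setup the transaction-count heuristic sends the next request to the saturated server, because zero completed requests looks like idleness. A selector that looks at CPU utilization avoids that server.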