Java Vibes

June 22, 2009

Notes on QoS and Capacity Planning

The architecture you create must address the following service-level requirements: performance, scalability, reliability, availability, extensibility, maintainability, manageability, and security.

You will have to make trade-offs between these requirements. For example, if the most important service-level requirement is the performance of the system, you might sacrifice the maintainability and extensibility of the system to ensure that you meet the performance quality of service.


The performance requirement is usually measured in terms of response time for a given screen or transaction per user. In addition to response time, performance can also be measured in transaction throughput, which is the number of transactions processed in a given time period, usually one second.
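The relationship between response time and throughput can be sketched with a back-of-envelope calculation; the figures below (50 users, 250 ms per transaction) are illustrative assumptions, not benchmarks:

```java
// Sketch: deriving transaction throughput from response time and
// concurrency. All workload figures here are assumed for illustration.
public class ThroughputDemo {

    // Transactions completed per second, given a mean end-to-end
    // response time in milliseconds and the number of concurrent users
    // (each user issues its next transaction as soon as one completes).
    static double throughputPerSecond(double meanResponseMillis, int concurrentUsers) {
        return concurrentUsers * (1000.0 / meanResponseMillis);
    }

    public static void main(String[] args) {
        // 50 concurrent users, each transaction averaging 250 ms.
        double tps = throughputPerSecond(250.0, 50);
        System.out.println("Estimated throughput: " + tps + " tx/sec");
    }
}
```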

Some performance considerations to keep in mind while designing systems:

- Network overheads: making multiple fine-grained network calls versus a single coarse-grained network call.
- Memory issues: heap size for objects created on the heap, and perm size for reflection and dynamically created classes.
- Logging issues: the amount of logging, synchronous versus asynchronous I/O, etc.
- Concurrency issues: whether thread safety is needed.
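The fine-grained versus coarse-grained trade-off can be sketched as below; the "remote calls" are simulated with a counter, and the accessor names are hypothetical:

```java
// Illustrative sketch contrasting fine- and coarse-grained access to a
// remote service. Each method stands in for a network round trip, which
// we simulate by incrementing a counter.
public class GranularityDemo {
    static int remoteCalls = 0;

    // Fine-grained accessors: each one costs a separate round trip.
    static String fetchName()  { remoteCalls++; return "Ada"; }
    static String fetchEmail() { remoteCalls++; return "ada@example.com"; }

    // Coarse-grained: one call returns all the data at once,
    // costing a single round trip.
    static String[] fetchCustomerDetails() {
        remoteCalls++;
        return new String[] { "Ada", "ada@example.com" };
    }

    public static void main(String[] args) {
        fetchName();
        fetchEmail();               // two round trips so far
        fetchCustomerDetails();     // one more round trip for all fields
        System.out.println("Simulated round trips: " + remoteCalls);
    }
}
```

Returning a transfer object in one coarse-grained call is the usual way to cut the per-call network overhead described above.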


Scalability is the ability to support the required quality of service as the system load increases without changing the system. A system can be considered scalable if, as the load increases, the system still responds within the acceptable limits.

The capacity of a system is defined as the maximum number of processes or users a system can handle and still maintain the quality of service. If a system is running at capacity and can no longer respond within an acceptable time frame, then it has reached its maximum scalability. Scalability can be added vertically or horizontally:

Vertical scaling in a J2EE application involves running additional server instances on the same machine. With vertical scaling, the machine’s processing power, CPU usage, and JVM heap memory configurations are the main factors in deciding how many server instances should be run on one machine (also known as the server-to-CPU ratio).
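A back-of-envelope server-to-CPU calculation might look like this; all of the figures (16 GB RAM, 2 GB heap per JVM, one instance per CPU) are assumptions for illustration:

```java
// Sketch: how many server instances fit on one machine, bounded both by
// available memory (heap per JVM) and by the chosen instances-per-CPU
// ratio. Every input figure below is an illustrative assumption.
public class VerticalScalingDemo {

    static int serverInstances(int machineRamMb, int heapPerInstanceMb,
                               int cpus, int instancesPerCpu) {
        int byMemory = machineRamMb / heapPerInstanceMb; // memory-bound limit
        int byCpu = cpus * instancesPerCpu;              // CPU-bound limit
        return Math.min(byMemory, byCpu);                // the tighter bound wins
    }

    public static void main(String[] args) {
        // 16 GB machine, 2 GB heap per JVM, 4 CPUs, one instance per CPU.
        System.out.println(serverInstances(16384, 2048, 4, 1));
    }
}
```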

Horizontal scaling involves adding more machines to the cluster, thus increasing the overall system capacity.

Vertical scaling typically does not have an impact on the architecture, but the architecture must be created with special consideration for horizontal scaling.


Reliability ensures the integrity and consistency of the application and all its transactions.

As the load increases on your system, your system must continue to process requests and handle transactions as accurately as it did before the load increased.


Availability ensures that a service/resource is always accessible. In an environment with redundant components and failover, an individual component can fail (hurting reliability), yet the service remains available thanks to the redundancy.

Load balancing (with failover) is a mechanism whereby the server load is distributed to different nodes within the server cluster, based on a load-balancing policy. Many algorithms can define the load distribution policy, ranging from simple round robin to more sophisticated algorithms such as minimum load, weight-based, and last access time.
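The simplest of these policies, round robin, can be sketched in a few lines; the node names are placeholders, and a real balancer would also track node health and session affinity:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of a round-robin load-balancing policy: requests are
// handed to cluster nodes in strict rotation.
public class RoundRobinBalancer {
    private final List<String> nodes;
    private final AtomicInteger next = new AtomicInteger(0);

    public RoundRobinBalancer(List<String> nodes) {
        this.nodes = nodes;
    }

    // Pick the next node in rotation; the atomic counter keeps the
    // policy safe under concurrent requests.
    public String choose() {
        int i = Math.floorMod(next.getAndIncrement(), nodes.size());
        return nodes.get(i);
    }

    public static void main(String[] args) {
        RoundRobinBalancer lb =
            new RoundRobinBalancer(List.of("node-a", "node-b", "node-c"));
        for (int i = 0; i < 4; i++) {
            System.out.println(lb.choose());
        }
    }
}
```

A weight-based policy would replace the rotation with a selection biased toward higher-capacity nodes, but the interface stays the same.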

Two popular methods of load balancing in a cluster are DNS round robin and hardware load balancing. DNS round robin provides a single logical name that resolves, in turn, to the IP addresses of the nodes in the cluster. This option is inexpensive, simple, and easy to set up, but it doesn’t provide any server affinity or high availability. In contrast, hardware load balancing offers virtual IP addressing. Here, the load balancer presents a single IP address for the cluster, which maps to the addresses of each machine in the cluster. The load balancer receives each request and rewrites headers to point to other machines in the cluster. If we remove any machine from the cluster, the requests are still served seamlessly. Hardware load balancing has obvious advantages in terms of server affinity and high availability.
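The client side of DNS round robin is visible through the standard `InetAddress` API, which returns every address published under one logical name; `localhost` is used below only so the example resolves without external DNS:

```java
import java.net.InetAddress;

// Sketch: DNS round robin publishes several A records under one logical
// host name. A client can observe all of them via getAllByName; which
// one it actually connects to depends on the resolver's ordering.
public class DnsLookupDemo {

    static InetAddress[] resolveAll(String host) throws Exception {
        return InetAddress.getAllByName(host);
    }

    public static void main(String[] args) throws Exception {
        for (InetAddress addr : resolveAll("localhost")) {
            System.out.println(addr.getHostAddress());
        }
    }
}
```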


Extensibility is the ability to add additional functionality or modify existing functionality without impacting existing system functionality. You should consider the following when you create the architecture and design to help ensure extensibility: low coupling, interfaces, and encapsulation.
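Programming to an interface is the core of all three; the sketch below uses hypothetical payment-processor names to show new functionality being added without touching existing code:

```java
// Sketch: low coupling through an interface. The caller depends only on
// the PaymentProcessor abstraction, so new implementations can be added
// without modifying existing ones. All names here are illustrative.
public class ExtensibilityDemo {

    interface PaymentProcessor {
        String process(double amount);
    }

    static class CardProcessor implements PaymentProcessor {
        public String process(double amount) { return "card:" + amount; }
    }

    // Added later: no change to CardProcessor or to checkout() below.
    static class WalletProcessor implements PaymentProcessor {
        public String process(double amount) { return "wallet:" + amount; }
    }

    // Low coupling: the caller knows only the interface.
    static String checkout(PaymentProcessor p, double amount) {
        return p.process(amount);
    }

    public static void main(String[] args) {
        System.out.println(checkout(new CardProcessor(), 10.0));
        System.out.println(checkout(new WalletProcessor(), 10.0));
    }
}
```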


Maintainability is the ability to correct flaws in the existing functionality without impacting other components of the system. When creating an architecture and design, you should consider the following to enhance the maintainability of a system: low coupling, modularity, and documentation.


Manageability deals with system monitoring of the QoS requirements and the ability to change the system configuration to improve the QoS dynamically without changing the system.
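In the Java world, JMX is the standard route to this kind of runtime tuning. A minimal sketch, with a hypothetical connection-pool setting exposed as an MBean so a monitoring tool (e.g. jconsole) could change it without redeploying:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Minimal JMX sketch: a tunable configuration value exposed as an MBean.
// The MBean name and attribute are hypothetical examples.
public class ManageabilityDemo {

    public interface PoolConfigMBean {
        int getMaxConnections();
        void setMaxConnections(int max);
    }

    // By JMX convention, the implementation class name is the interface
    // name minus the "MBean" suffix.
    public static class PoolConfig implements PoolConfigMBean {
        private volatile int maxConnections = 20;
        public int getMaxConnections() { return maxConnections; }
        public void setMaxConnections(int max) { maxConnections = max; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        PoolConfig config = new PoolConfig();
        server.registerMBean(config, new ObjectName("demo:type=PoolConfig"));

        // A management console could now adjust this at runtime; here we
        // simulate that adjustment in-process.
        config.setMaxConnections(50);
        System.out.println("Max connections: " + config.getMaxConnections());
    }
}
```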


Security is the ability to ensure that the system cannot be compromised. Security includes not only issues of confidentiality and integrity (SQL injection, URL injection etc), but also relates to Denial-of-Service (DoS) attacks that impact availability. Creating an architecture that is separated into functional components makes it easier to secure the system because you can build security zones around the components. If a component is compromised, then it is easier to contain the security violation to that component.
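On the SQL injection point specifically, the standard JDBC defence is a parameterized query; the table and column names below are placeholders:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;

// Sketch: why string concatenation invites SQL injection, and how a
// PreparedStatement avoids it. Table/column names are illustrative.
public class SafeQueryDemo {

    // Unsafe: user input becomes part of the SQL text, so an input like
    // "' OR '1'='1" rewrites the query's logic.
    static String unsafeSql(String userInput) {
        return "SELECT * FROM users WHERE name = '" + userInput + "'";
    }

    // Safe: the driver binds the value separately from the SQL text, so
    // the input is always treated as data, never as SQL.
    static PreparedStatement safeQuery(Connection conn, String userInput)
            throws Exception {
        PreparedStatement ps =
            conn.prepareStatement("SELECT * FROM users WHERE name = ?");
        ps.setString(1, userInput);
        return ps;
    }
}
```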

