Java Application performance analysis and optimization using AppDynamics
Posted by Jai on December 11, 2012
The enterprise Java application stack keeps growing bigger, which makes it increasingly difficult to keep control over all the layers of the infrastructure and get the most out of it. One of the basic requirements of any web application is good performance. In this post we will cover a typical enterprise Java web application setup and see how to analyze and optimize it using the AppDynamics tool.
Java enterprise web application N-tier setup
Take as an example the N-tier Java web application below, which interacts with a complex middleware system, integrates with numerous external web APIs, and sits on top of an equally powerful backend storage system.
The diagram covers a fairly common yet complex enterprise application setup:
- Web Servers (eg. Apache web server)
- Application Servers (tomcat application server)
- Mobile application server (tomcat application server)
- Email Server
- Web content management server (eg. Team Site, Alfresco)
- Web application Administration server (tomcat application server)
- File servers (Shared disk eg. NFS)
- Real time/Messaging/Queue server (eg. ActiveMQ)
- Data/File processing backend servers (tomcat application server)
- Data storage/Database servers (eg. MySQL/Oracle)
Looking at the diagram above, the application servers are the single communication channel with the web servers, and they in turn communicate with the internal middleware and backend servers. Let's assume the application servers are not performing well and the web application's response time has degraded for some reason. We will investigate how to carry out the performance tuning with the help of a monitoring tool, following a set of well-defined steps.
The goal: tune the application servers to reduce the load average (CPU utilization) and improve the response time of the web application.
From a bird's-eye view, the common bottlenecks for the application servers in this setup can be:
- Load on application, number of requests each server can handle
- Database communication
- Messaging/Queue system
- Network bandwidth
- File server, Network Disk read/writes
- Hardware (CPU, local disk etc.)
- Java virtual machine (JVM, GC)
- Hosted application implementation/code
Just guessing and eyeballing the problem won't help much. If you are lucky you may strike the jackpot, but that is not always the case. We need numbers to prove, pinpoint, or rule out the possible bottlenecks.
There are quite a few tools that can help us measure the infrastructure, the application servers, and the backend systems. The one we will use in this discussion is AppDynamics.
AppDynamics gives you unique real-time visibility into how your applications perform inside many of the industry-leading Java application servers such as WebLogic, WebSphere, JBoss, Tomcat, GlassFish, and others:
- Visualize and monitor all your JVM dependencies
- Real-Time monitoring of JVM performance, health, and exceptions
- Cross-JVM visibility for monitoring of distributed transactions
- Troubleshoot Java code latency in minutes
One thing at a time:
The best way to analyze the whole system is to change one thing at a time. Change one thing and analyze its impact before changing multiple things at once.
Load (number of calls/requests per minute) on the application
One of the first suspects for degraded response time in any web application is the load (number of calls/requests per minute). Each server has a sustainable load it can handle. We will first analyze the correlation between the load and the other system parameters.
Correlation diagrams give further insight into how one system parameter behaves as another varies, and we will use this approach to see the impact of one parameter on another. The data below shows some measurements for the web application taken with AppDynamics:
Number of threads:
GC (Garbage Collection) Time Spent:
Database response time:
Analyzing the above graphs, you will see that the increase in average response time and in system resource utilization is not directly correlated with the load on the application, though the probability of both is indeed higher under heavier load.
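The visual comparison can also be quantified. As a minimal sketch, the snippet below computes Pearson's correlation coefficient between two metric series, e.g. calls per minute versus average response time; the sample values are made up for illustration, not taken from the graphs above.

```java
import java.util.Arrays;

public class MetricCorrelation {

    // Pearson correlation: covariance(x, y) / (stddev(x) * stddev(y)),
    // ranging from -1 (inverse) through 0 (none) to +1 (perfect).
    public static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = Arrays.stream(x).average().orElse(0);
        double meanY = Arrays.stream(y).average().orElse(0);
        double cov = 0, varX = 0, varY = 0;
        for (int i = 0; i < n; i++) {
            cov  += (x[i] - meanX) * (y[i] - meanY);
            varX += (x[i] - meanX) * (x[i] - meanX);
            varY += (y[i] - meanY) * (y[i] - meanY);
        }
        return cov / Math.sqrt(varX * varY);
    }

    public static void main(String[] args) {
        double[] callsPerMin = {120, 180, 240, 300, 360}; // hypothetical load
        double[] avgRespMs   = {85, 90, 140, 150, 210};   // hypothetical latency
        System.out.printf("correlation = %.2f%n",
                pearson(callsPerMin, avgRespMs));
    }
}
```

A coefficient near 1 would indicate response time rising in lock-step with load; the partial correlation seen in the graphs above corresponds to a value well below that.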
Most performance consultants will point to the database as the first bottleneck in the system. The reasons can be many:
- Number of connections allowed from application server to database
- Driver used to connect to database
- Database connection pooling, statement caching etc.
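Connection pooling itself is normally handled by a library such as commons-dbcp or c3p0 in front of the JDBC driver. As an illustration of the principle only (the `SimplePool` class and its factory-based constructor are made up for this sketch, not a real library API), a fixed-size pool caps concurrent database use and reuses expensive connections:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Illustrative fixed-size resource pool; a real deployment would use
// a pooling library (commons-dbcp, c3p0, ...) with statement caching.
public class SimplePool<T> {
    private final BlockingQueue<T> idle;

    public SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());      // pre-create all resources
        }
    }

    // Blocks until a pooled resource is free, capping concurrent use.
    public T borrow() throws InterruptedException {
        return idle.take();
    }

    public void release(T resource) {
        idle.offer(resource);             // return to the pool
    }

    public int available() {
        return idle.size();
    }

    public static void main(String[] args) throws InterruptedException {
        SimplePool<String> pool = new SimplePool<>(2, () -> "connection");
        String conn = pool.borrow();
        System.out.println("available after borrow: " + pool.available());
        pool.release(conn);
        System.out.println("available after release: " + pool.available());
    }
}
```

The pool size here plays the same role as the "number of connections allowed from application server to database" listed above: too small and requests queue up, too large and the database becomes the contention point.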
If the database response time were the bottleneck for one application server, the same impact should be visible on all the servers. However, data taken from two different application servers shows no correlation between the slow response of one server and that of the other.
In the same way that multiple application servers communicate with the database server, the queue server response seen from one server does not directly relate to that seen from another. Analyzing the queue backend response times with AppDynamics lets us rule out the queue system as the bottleneck.
Analyzing the incoming and outgoing data transfer (KB/s) on the network lets us compare the statistics against the available network bandwidth, and we can easily rule out bandwidth as the bottleneck here.
Depending on the nature of the application, heavy disk read/write activity can still be a reason for degraded response time, whether measured as data volume or as operations per second.
The data compared above shows that disk writes are not the sole cause of the high response time for the application.
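When AppDynamics data is inconclusive, raw disk throughput can also be checked directly. The sketch below (file name and sizes are arbitrary) writes a few megabytes sequentially and reports MB/s; pointing the target path at the NFS mount instead of a local temp file would exercise the shared disk from the file servers tier:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class DiskWriteCheck {

    // Writes `blocks` megabytes sequentially and returns elapsed seconds.
    public static double writeMegabytes(Path target, int blocks) throws IOException {
        byte[] block = new byte[1024 * 1024];   // 1 MB buffer
        long start = System.nanoTime();
        try (OutputStream out = Files.newOutputStream(target)) {
            for (int i = 0; i < blocks; i++) {
                out.write(block);
            }
            out.flush();
        }
        return (System.nanoTime() - start) / 1e9;
    }

    public static void main(String[] args) throws IOException {
        // Point this at the NFS mount to test the shared disk instead.
        Path tmp = Files.createTempFile("io-check", ".bin");
        int blocks = 16;
        double seconds = writeMegabytes(tmp, blocks);
        System.out.printf("wrote %d MB in %.3f s (%.1f MB/s)%n",
                blocks, seconds, blocks / seconds);
        Files.delete(tmp);
    }
}
```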
One direct question: is the machine actually doing something during the periods of high response time? Is the response time high because the machine is busy processing something?
From the diagram it seems the machine is indeed doing a lot during the high response time, so high CPU utilization may well be a cause of the high response time.
Number of concurrent threads
Theoretically, an increase in the number of threads in the system may lead to high response time. It is quite probable that more threads lead to contention, with every thread trying to acquire system resources (CPU, database, disk, etc.).
Looking at the above graph, it seems the high response time may be the result of the high number of threads, or equally the cause of it.
Following the one-thing-at-a-time approach, it seems a good idea to analyze the system while controlling the number of threads. Theoretically it is good to have a limit that keeps the system healthy and prevents it from being overloaded, but in the current case capping the number of concurrent threads does not directly improve the end response time.
Third-Party integration points/web services
Nowadays every web application interacts with numerous external web services: OpenID authentication, single sign-on mechanisms, authentication via social networking accounts, displaying third-party information on the portal, or submitting end-user behavior data to external websites.
If communication with any of these external services is slow, it will lead to slow response times for the application. We need to analyze the communication with these external systems and measure the impact.
The automatic discovery of backends by AppDynamics fits best here. You can see the response times of the different parties, compare them with the application response time, and check whether there is a direct correlation between the two.
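If no agent is available, the same timing can be collected by hand around each outbound call. Below is a minimal sketch: the `timed` helper, the backend name, and the warning threshold are all made up for illustration, and `Thread.sleep` stands in for the real HTTP call:

```java
import java.util.concurrent.Callable;

public class BackendTimer {

    // Runs the call, measures wall-clock time, and logs it if it exceeds
    // the threshold -- a crude version of what an APM agent records.
    public static <T> T timed(String backend, long warnMillis, Callable<T> call)
            throws Exception {
        long start = System.nanoTime();
        try {
            return call.call();
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMs > warnMillis) {
                System.out.println("SLOW backend " + backend
                        + ": " + elapsedMs + " ms");
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated slow external call (e.g. an OpenID provider).
        String result = timed("openid-provider", 50, () -> {
            Thread.sleep(120);   // stand-in for the real HTTP request
            return "ok";
        });
        System.out.println("result: " + result);
    }
}
```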
Java virtual machine (JVM)
For any Java application, JVM tuning is a must. Ask any Java programmer: the settings must also be specific to your application.
The picture speaks for itself. You cannot afford to have a Java application spending this much time in major garbage collection.
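The same "time spent in GC" numbers that the agent charts can also be read from inside the JVM through the standard management beans, without any agent attached:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    public static void main(String[] args) {
        // One bean per collector (e.g. young and old generation).
        for (GarbageCollectorMXBean gc :
                ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(),
                    gc.getCollectionCount(),
                    gc.getCollectionTime());
        }
    }
}
```

Sampling these counters periodically and diffing them gives the GC time spent per interval, which is exactly the trend to watch while trying different GC settings.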
JVM Garbage collection Tuning
JVM tuning is quite a ball game in itself. For further reading, see How to tune Java garbage collection.
Java version: 1.6
- -Xms3072M -Xmx3072M -XX:MaxPermSize=384m -XX:+UseParallelGC
- -Xms3072M -Xmx3072M -XX:MaxPermSize=384m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
New Area Size / NewRatio
Here we will try the following four scenarios with different GC settings and try to minimize the time spent in GC:
- -Xms1536M -Xmx3072M -XX:MaxPermSize=384m
- -Xms3072M -Xmx3072M -XX:MaxPermSize=384m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:NewRatio=2
- -Xms3072M -Xmx3072M -XX:MaxPermSize=384m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
- -Xms3072M -Xmx3072M -XX:MaxPermSize=384m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:NewRatio=3
Based on the above settings, you can see the comparison of GC time spent between the different scenarios. Clearly the fourth option is best suited for the current application, with no peaks in major garbage collection time.
Heap Area Size/Xms, Xmx
This can be analyzed further with the min/max heap memory settings, and it also depends very much on how your application behaves.
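A quick sanity check of the effective heap settings can be done from inside the JVM; with -Xms equal to -Xmx, as in the scenarios above, totalMemory() should stay close to maxMemory() from startup:

```java
public class HeapInfo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        // maxMemory reflects -Xmx, totalMemory the currently committed heap.
        System.out.println("max heap   : " + rt.maxMemory()   / mb + " MB");
        System.out.println("total heap : " + rt.totalMemory() / mb + " MB");
        System.out.println("free heap  : " + rt.freeMemory()  / mb + " MB");
    }
}
```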
After GC tuning, the statistics for total GC time spent:
In some cases the infrastructure is the reason, but in most cases it is the application, designed or coded badly, that causes bottlenecks in different parts of the system.
The next step is to analyze and optimize the high-response-time (Very Slow or Stalled) requests to get better performance from the system.
The statistics before performance tuning:
The statistics after performance tuning:
Comparing the diagrams, we can see improved GC time spent and lower CPU utilization, which leads to improved response time. The remaining peaks in average response time can still be attributed to the implementation or to complex business transactions.
In the end it comes down to regular monitoring, troubleshooting, and problem solving for the stalled and very slow business transactions listed in AppDynamics.
Read more on Java monitoring with the AppDynamics tool: Java Monitoring Solutions.
Treat performance testing as a first-class citizen of the development process, and happy application tuning!