haroldr
VMware Employee

Java Performance on VMware ESX

As a performance engineer at VMware, I have done a lot of testing with Java applications on VMware ESX. I have consistently found the performance and scaling of Java applications to be excellent, with no special tuning required. As a start at demonstrating this, I recently published the results of some SPECjvm2008 experiments in VROOM!, the VMware Performance Engineering team's blog. While SPECjvm2008 focuses only on core Java performance and can't be used to demonstrate multi-VM scaling, the results do show that there is nothing inherent in Java itself that would lead to poor performance when virtualized.

On a few occasions, I have worked with customers who were experiencing performance issues with their Java deployments on ESX. In all of these cases, the root cause turned out to be a configuration issue in their environment, and not really related to Java itself. In most cases, the issues were related to memory overcommitment. One of my colleagues has written an excellent document, Java in Virtual Machines on VMware ESX: Best Practices (http://www.vmware.com/resources/techresources/1087), which provides guidance on avoiding these and other common issues when deploying Java applications on ESX.

I would like to use this thread for a discussion of questions or issues related to Java performance on VMware ESX 3.5 or vSphere 4.0. Post your comments, questions, or experiences, and I'll do my best to respond.

Hal

27 Replies
tcutts
Enthusiast

Did you find any more information on the possible JVM-Network issue that you were investigating? We have many tomcat-based applications running in JVMs and recently, there has been a reported slowdown. They do not push the CPU at all, but they do use back-end JDBC Oracle/MS-SQL DBs. If this is an issue, I want to be able to escalate internally and within VMware.

We do not have the luxury of being able to move the apps to a physical environment. We are using JVM jdk1.5.0_14-b03 (64-bit) on RHEL 4.8 (64-bit) 2vCPU.

Any input would greatly enhance additional troubleshooting that we would perform to validate.

I have now completed my migration to ESX 4.0, and while the performance of some Java applications does seem to have improved (notably Lucene), at least one of my users is still reporting poor performance from a Tomcat application. I am trying to arrange a meeting with the user at the moment to narrow down where the performance problem is. The difficulty is that we're dealing with a vast number of layers of code here, any one of which might be having trouble in the virtualised environment:

The user's own java code

Tomcat

JDBC

The JVM

The Oracle client libraries

The OS itself

The virtual network card in the guest

The virtual switch

to name a few. We're all using different distributions, so unless it's a fundamental problem with Linux itself, I think we can discount that. I've also tried several java versions, with no change, so it isn't that, again unless there's a fundamental problem with Java on ESX guests, which I don't believe.

I've asked the user to give me:

a) The SQL query that's slow (so I can use it directly with sqlplus, and see if it's the Oracle client stack or below that's causing it)

b) A tiny CLI java app which uses JDBC to make that same query (which should tell us whether it's JDBC-related, or something in the higher user code)
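
A tiny CLI app along the lines of (b) might look like the sketch below. It is only an illustration, not code from this thread: the class name `JdbcQueryTimer`, the argument order, and the default of 10 runs are all made up for the example, and it assumes Java 7 or later plus a suitable JDBC driver on the classpath. It times the connect separately from the query and drains the result set so that network transfer is included.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;

// Hypothetical stand-alone timer for a single query; not code from the thread.
class JdbcQueryTimer {

    // Median of the samples, converted from nanoseconds to milliseconds.
    static double medianMillis(long[] nanos) {
        long[] sorted = nanos.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        double medianNanos = (sorted.length % 2 == 1)
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
        return medianNanos / 1_000_000.0;
    }

    // Runs the query once, fetching every row, and returns elapsed nanoseconds.
    static long timeQuery(Connection conn, String sql) throws SQLException {
        long start = System.nanoTime();
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                // Drain the result set so network transfer time is included.
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 4) {
            System.out.println("usage: JdbcQueryTimer <jdbc-url> <user> <password> <sql> [runs]");
            return;
        }
        // At least one run, so the median is always defined.
        int runs = args.length > 4 ? Math.max(1, Integer.parseInt(args[4])) : 10;
        long connectStart = System.nanoTime();
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            System.out.printf("connect: %.1f ms%n", (System.nanoTime() - connectStart) / 1e6);
            long[] samples = new long[runs];
            for (int i = 0; i < runs; i++) {
                samples[i] = timeQuery(conn, args[3]);
            }
            System.out.printf("median query time over %d runs: %.1f ms%n",
                    runs, medianMillis(samples));
        }
    }
}
```

Running the same command on the physical and the virtual client, against the same database, gives two medians that can be compared directly. The median is used rather than the mean so that a single slow first run (driver loading, cold caches) does not dominate.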

I've also asked the user to put some instrumentation into their code to time various sections, so they can really tell me where the slowdown is. The application is quite large and complex, and the reports are really just nebulous "it's too slow" reports.
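
For the instrumentation, something as simple as the following would probably do. This is a rough sketch under my own assumptions: `SectionTimer` and the section names are invented for illustration, it needs Java 8 for the lambdas, and it is not thread-safe, so a real servlet would want one timer per request.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for timing named sections of a request; not from the thread.
class SectionTimer {
    // Insertion-ordered so the report lists sections in the order they first ran.
    private final Map<String, Long> totals = new LinkedHashMap<>();

    // Runs the section and adds its elapsed time to the running total for that name.
    void time(String name, Runnable section) {
        long start = System.nanoTime();
        try {
            section.run();
        } finally {
            long elapsed = System.nanoTime() - start;
            Long previous = totals.get(name);
            totals.put(name, previous == null ? elapsed : previous + elapsed);
        }
    }

    // Total nanoseconds recorded for a section, or 0 if it never ran.
    long totalNanos(String name) {
        Long total = totals.get(name);
        return total == null ? 0L : total;
    }

    // One line per section with its accumulated time in milliseconds.
    String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : totals.entrySet()) {
            sb.append(String.format("%-12s %8.1f ms%n", e.getKey(), e.getValue() / 1e6));
        }
        return sb.toString();
    }

    // Stand-in for real work in the demo below.
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        SectionTimer timer = new SectionTimer();
        // Placeholder sections; in the real application these would wrap
        // the JDBC call, the business logic, the page rendering, and so on.
        timer.time("db-query", () -> sleepQuietly(20));
        timer.time("render", () -> sleepQuietly(5));
        System.out.print(timer.report());
    }
}
```

Logging the report for a handful of slow requests on both the physical and the virtual machine should show which section accounts for the difference.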

I'll let you know what results I get from the above tests, if the user supplies me with some code.

If any of you have done anything like the above analysis, I'd like to hear results.

haroldr
VMware Employee

You are quite correct that with any application, Java or not, there are a large number of possible sources of performance problems. One item that you left out of your list is storage performance, which is the source of many problems that are initially blamed on ESX.

I have just published a performance troubleshooting guide for vSphere that focuses on identifying sources of performance problems in ESX 4.0. Most of the information in this guide is applicable to previous versions of ESX as well. The guide is available at , and has some simple checks that can help to identify or rule out common problems.

I do not currently have access to a good servlet-based application for benchmarking, and so I have not been able to do any testing on Tomcat. However, I have been doing some testing with an enterprise-class, three-tier, EJB-based application on SLES10 SP2. Since I am working with a partner, I will hold off mentioning the specific application server and database involved. However, this application does have a significant networking load, as well as relying heavily on an external database (also running in a VM on a separate ESX host).

The tunings that I have found most significant are:

  • Database performance: Ensuring that the database is performing properly is the number one factor in ensuring that the application provides acceptable response-times. No amount of JVM-level tuning will help if the DB performance is poor. My tuning was mostly related to the layout of the tablespaces on the storage, and proper sizing of the DB's in-memory caches.

  • Use the vmxnet virtual adapters: Switching from the default e1000 adapter to the vmxnet3 adapter gave me a 10% increase in peak throughput (at acceptable response-times) for this particular workload. A nice paper about the benefits of the vmxnet3 adapter has been published in the VMware technical resources. For network-intensive VMs, tuning interrupt-coalescing can also help (see this document: ), but be careful with this one as it affects all VMs on a host, and there is a small tradeoff between the efficiency gains and higher latency.

Your approach of breaking out possible problems into small test cases is a good one. However, be careful not to rule things out too quickly. A small test program to test DB access might miss something like improperly sized JDBC connection pools, which will only become obvious by monitoring the application when under load. I have seen one Java application in which the problem was a JDBC connection pool that was too small. However, the best solution was not to increase the size of the pool, but to switch the VM to use vmxnet2, which decreased network latency and took enough pressure off of the connection pool. This is just one example of why it always makes sense to ensure that you are using best practices. This document is a good read.
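
To see why lower network latency can relieve a connection pool, a back-of-the-envelope estimate using Little's law helps: the average number of connections in use is roughly the request rate multiplied by the time each request holds a connection. The sketch below is illustrative only. The class name and all of the numbers (400 requests per second, 50 ms and 40 ms hold times, a pool of 18) are hypothetical and are not measurements from the application I described.

```java
// Hypothetical estimate of average connection-pool demand; numbers are made up.
class PoolSizeEstimate {

    // Little's law: average connections in use =
    // arrival rate (requests/s) x average time each connection is held (s).
    static int connectionsNeeded(double requestsPerSecond, double holdMillis) {
        return (int) Math.ceil(requestsPerSecond * holdMillis / 1000.0);
    }

    public static void main(String[] args) {
        double requestsPerSecond = 400.0; // assumed load
        int poolSize = 18;                // assumed configured pool size

        // Each request holds a connection for the DB round trips it makes,
        // so network latency is part of the hold time.
        int before = connectionsNeeded(requestsPerSecond, 50.0); // slower network path
        int after = connectionsNeeded(requestsPerSecond, 40.0);  // faster network path

        System.out.println("pool size: " + poolSize);
        System.out.println("needed at 50 ms hold time: " + before); // 20, requests queue
        System.out.println("needed at 40 ms hold time: " + after);  // 16, pool suffices
    }
}
```

This is an average, so a real pool needs headroom for bursts. The point is only that shaving a few milliseconds off each database round trip lowers the number of connections in use at a given load, which is the same effect as enlarging the pool without adding load on the database.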

Let us know what you find out. These case studies are helpful to everyone.

tcutts
Enthusiast

Having gone through those best practices, yes, we're on board with all of them.

The only variable we have here is the 'hardware' platform on which the tomcat java service is running. The software is on an NFS server, so both physical and virtual servers are getting their tomcat copy from the same place. The same goes for the Oracle client software. The Oracle server is on a physical machine, and is the same for both the physical and virtual clients. The OS installed on the physical and virtual machines is also identical, with the exception of some of the device driver modules in use, a natural consequence of virtualising the system.

So we have already controlled for most variables. I'm still waiting for the broken-down code snippets from the users.

Tim

tcutts
Enthusiast

It turned out that our performance problems were down to a manufacturing fault in the Ethernet passthrough modules of our HP c7000 blade chassis. We had the modules replaced, and the Java performance issues went away. It turned out not to be a VMware issue at all, although for some reason the Java applications were particularly badly affected.

Tim

deeanna
Contributor

Hi,

I'm a noob to the concept of virtualization -- however I've been looking into it and I thought this would be the appropriate thread to post to.

I have a similar stack as others who have posted here:

- Windows OS x64

- Tomcat 6

- Java 6 x64

- MySQL DB x64

I'd like to start testing this stack and take advantage of virtualization, but obviously need to see the potential impact on performance. Would appreciate any pointers that anyone has. I was looking to download a trial / free version of VMware Server, but my guess is that VMware ESXi is what I want to download / trial first? Would appreciate any other pointers from those in the know.

Thanks,

Deeanna

mittim12
Immortal

Welcome to the forums, Deeanna. I can confirm that you would want to run a trial of ESXi and not VMware Server. ESXi is the enterprise Type 1 hypervisor, while VMware Server has to be installed on top of an existing OS.

On a side note, if you run into any problems I would recommend starting a new thread. I think you will get a lot more visibility that way.

Aftenposten
Enthusiast

Tim,

Do you have any more information regarding the manufacturing fault? I am experiencing a performance issue with one VM running Java and also have the c7000 blade chassis, although with Virtual Connect modules. It is probably not the same issue, but I would greatly appreciate any information you have about it.

Regards,

GB

shishir08
Hot Shot

In vSphere 5, a new feature has been introduced that is intended to solve the performance issues of Java applications under memory overcommitment.

Java applications faced a severe performance penalty when run in VMs under memory overcommitment. The reason is that the Java garbage collector allocates a large chunk of virtual memory as its heap and manages that heap itself. For this reason, customers are very cautious and slow to virtualize Java, and are often unhappy with the result.

To overcome this problem, application-level ballooning is a technique that effectively reduces the Java heap size by providing balloon pages directly from the Java heap, instead of the traditional approach of allocating pages from the OS.

Application-level ballooning is the first step in solving the problem: it gives ESX more knowledge of Java applications, allowing more intelligent memory management for Java workloads.

With this, ESX can avoid swapping activity and the resulting performance degradation. The Java balloon driver is the critical piece that solves the problem. With it, the ESX memory scheduler becomes more intelligent and aware of the Java applications running in the VM.
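
Because the JVM claims and manages its heap itself, it helps to know how large that heap actually is before deciding how much memory the guest needs. As a rough illustration (the class name `HeapReport` is made up, and this is only a sketch using the standard `java.lang.management` API, not anything specific to ESX), the following prints the heap and non-heap figures from inside the JVM:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper that prints how much memory the JVM has claimed.
class HeapReport {

    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // MemoryUsage.getMax() returns -1 when no maximum is defined.
    static String describe(long bytes) {
        return bytes < 0 ? "undefined" : toMiB(bytes) + " MiB";
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();

        // "committed" is what the JVM has actually obtained from the guest OS;
        // "max" is the ceiling it may grow to (set with -Xmx for the heap).
        System.out.println("heap used:          " + describe(heap.getUsed()));
        System.out.println("heap committed:     " + describe(heap.getCommitted()));
        System.out.println("heap max:           " + describe(heap.getMax()));
        System.out.println("non-heap committed: " + describe(nonHeap.getCommitted()));
    }
}
```

If the heap maximum plus the non-heap memory and the guest OS's own needs exceed the memory the VM can count on, the host may have to balloon or swap pages that the JVM still considers part of its heap, which is the situation described above.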
