Greetings,
I have a newly built ESX 3.5 server running on a SunFire 40 system with dual single-core AMD CPUs. I have zero (0) virtual machines running on it, and I noticed in the VI Client that 50% of the CPU was being consumed. Delving further with esxtop, I discovered that helper is indeed consuming 50% of the CPUs. Does anyone know what is causing this and how to make it stop, or how to troubleshoot the problem further? With no VMs running, it shouldn't be consuming 2 GHz of CPU horsepower.
Thanks in advance!
M
Well, since it was a new install, the 'helper' service may have been completing some initial tasks to update the server or configure the host for VC. So maybe this is by design. You didn't state how long you left it that way, so perhaps you were a bit hasty...
Ok - it started happening again, so I left it alone. It's been at 50% for about two hours now. No VMs running - and the ESX Server has been left completely alone. Any ideas? I will examine the logs after more coffee 😄
From the console, type ps -ef and see what gets returned. Look for the process ID of that 'helper' service.
Then try kill and see if it dies, or kill -9 and see if you can kill the service. Then see if it returns. Also, the line that helper runs on should show a path; what path is it pointing to?
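If the ps -ef listing is long, a small script can pick out the candidate lines. This is only a rough sketch: it assumes the usual eight-column ps -ef layout (UID PID PPID C STIME TTY TIME CMD), the function name find_processes is made up for illustration, and the sample listing is invented, not output from a real ESX host. If 'helper' lives only inside the VMkernel, it may not show up in ps at all.

```python
def find_processes(ps_output, needle="helper"):
    """Return (pid, command) pairs from `ps -ef` text whose command contains needle."""
    matches = []
    lines = ps_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        # Assumed layout: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # not a full ps -ef row
        pid, cmd = int(fields[1]), fields[7]
        if needle.lower() in cmd.lower():
            matches.append((pid, cmd))
    return matches


# Invented sample listing, for illustration only.
SAMPLE = """\
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 09:00 ?        00:00:01 init
root      1234     1  0 09:01 ?        00:00:00 /usr/sbin/example-helper --daemon
root      1300     1  0 09:02 ?        00:00:00 /usr/sbin/sshd
"""

for pid, cmd in find_processes(SAMPLE):
    print(pid, cmd)
```

On a real console you would feed it the captured text of ps -ef instead of SAMPLE. An empty result would suggest that nothing by that name is running in the service console.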
Digging a little further, I refreshed my recollection of 'helper'. 'Helper' is one of the esxtop "worlds" that represents an 'asynchronous task' (per the guide to monitoring ESX 2 with esxtop: http://www.vmware.com/pdf/esx2_using_esxtop.pdf - I know this is 3.5, but I presume helper is still used the same way).
Now, what's interesting is that examining the CPU load through plain old 'top' shows the system is basically idle.
So, riddle me this, Batman: does 'top' show no CPU load because it only shows the performance of the service console and not the VMkernel? Or is the load shown in esxtop and VirtualCenter a "phantom load"? In other words, is this system really running at 50% of two CPUs?
Fun fun
I've attached the 'top' output screenshot.
Top shows only the service console.
esxtop includes the vmkernel.
--Matt
So does this mean you did an upgrade from 2.5 to 3.5?
No - completely fresh installation of 3.5 (complete format of the hard drive). I was simply citing the esxtop guide reference to 'helper worlds' which was written in the 2.5 era. Since esxtop still shows 'helper' it appears that hasn't changed from 2.5 (but who can say).
So, as the other poster concurred, top only shows the service console - so something within the VMkernel is sapping half of my CPU capacity. I perused all the logs (vmkernel, vmksummary, vmkwarning, messages) and there is nothing of interest - the logs are actually much "cleaner" than on my 3.0.2 servers.
More info.
Expanding the 'helper' world in esxtop shows that helper0-1 is consuming the CPU, and that there are 22 "helper worlds" in all. Only helper0-1 is affecting the CPU. Can anyone out there correlate helper0-1 to an actual ESX process? Attached is a screenshot showing the expanded helper worlds in esxtop.
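For what it's worth, here is how one busy world could add up to the 50% host figure. This is back-of-the-envelope arithmetic, assuming esxtop reports a world's usage relative to a single physical CPU while the VI Client shows usage relative to the whole host. The function name host_share and the 100% input are illustrative, not values read from the screenshot.

```python
def host_share(world_used_pct, num_pcpus):
    """Convert usage relative to one physical CPU into a share of the whole host."""
    if num_pcpus <= 0:
        raise ValueError("num_pcpus must be positive")
    return world_used_pct / num_pcpus


# Assumption: one world pinned at about 100% of one CPU on a two-CPU host.
print(host_share(100.0, 2))  # 50.0
```

Under that assumption, a single world spinning on one of two single-core CPUs would be enough to account for the whole 50%.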
M
P.S. I transferred a couple of VMs to the system, and when I fired them up, the system did reflect their load on top of the 50% being consumed by helper0-1. helper0-1 didn't 'back off' or do anything nice like that.
I have the same problem after a fresh install of ESX 3.5 dated Feb 20. esxtop shows that helper0 is consuming around 98% of CPU time.
My server is an IBM x235, which is not supported by VMware. Is your server supported?
I have the same problem with four servers. I will now move the PCI cards to other slots.
Even with the latest fixes, it still happens. As a workaround, I have set a CPU limit on the helper process. Because the x235 is not supported, I didn't open an SR.
Are your servers on the list of supported servers?
Yes, the servers (Primergy RX600S4) are on the HCL. I will update you after reordering the PCI cards.
Workaround for the RX600S4:
Do not use PCI-Slot 6 for ESX 3.5 at the moment.
Hi,
I have exactly the same issue on 2 brand new Sun Blade X6250 servers with ESX 3.5 (which is a configuration supported by both Sun and VMware).
I opened a support request with VMware and they told me that this issue has been seen in the past and should be fixed by updating the server's BIOS.
They also confirmed that it could be related to one of the PCI slots and that I should try different slots. That is difficult, however, since the blades only have 2 slots and both are in use.
I then opened a service request with Sun, but they seem to be pointing to VMware to find a fix for the problem.
This leaves me stuck in the middle.
So any ideas on how to solve the problem are welcome.
Thanks,
JM.
On my x235, I have an unused Ethernet card in slot #5. I will try removing it and will also check for a BIOS update.
For jmlemmer: even though VMware responded by saying that you need to apply a BIOS update and left it at that, you should insist on further investigation by another engineer. Some of them are not very curious and tend to close their incidents quickly.
As an example, we had a problem with ESX servers not failing over to controller B when we reset SAN controller A. The settings were correct. The first incident concluded that it was an IBM problem, not a VMware one. Then we spoke to IBM, and even though they tried to say they had no problems, we insisted and ran all their dummy tests. Meanwhile I opened a new case with VMware, and the new engineer showed more interest in solving the problem. He came up with instructions for starting ESX with more debug information for the HBA and confirmed again that it was a problem on the IBM side. He also offered to speak with IBM, which probably helped IBM agree that their DS4500 SAN has a problem.
So if you have a support contract for the server and the BIOS update doesn't fix it, open a new case with VMware and ask them to speak with your contact at Sun. If the problem is in the hardware, VMware has to help you diagnose exactly what is not working, and only they have access to information on how to increase the diagnostic level in ESX.