tomk3534
Contributor

ESX 3.0.2 ftPerl high cpu usage

Had a problem yesterday where one CPU (CPU 0) on an ESX host was pegged at 95-100%. Ran top and found that ftPerl was using 79-80% CPU. Thought it was something on a VM, but all the VMs were showing average CPU usage. Found something on the internet indicating it was HA. Ran a "Reconfigure HA" on the problem host and the process ended. The host then returned to normal. Is this a known issue?
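
For anyone who would rather script this check than watch top, here is a minimal sketch. It assumes the service console's ps accepts `-eo pcpu,pid,comm` and that the process shows up under the name ftPerl, as it did in top here. The function name `high_cpu_procs` and the 50% threshold are made up for illustration. Note that ps reports CPU averaged over the life of the process, so it can lag behind what top shows.

```shell
#!/bin/sh
# Read "pcpu pid comm" lines (the layout printed by `ps -eo pcpu,pid,comm`)
# from stdin and print "PID CPU%" for every process whose command name
# equals $1 and whose CPU percentage is at least $2.
# high_cpu_procs is a hypothetical helper name, not a VMware tool.
high_cpu_procs() {
    name="$1"
    threshold="$2"
    awk -v name="$name" -v min="$threshold" \
        '$3 == name && $1 + 0 >= min + 0 { print $2, $1 }'
}

# Example use on the service console (threshold is an arbitrary choice):
#   ps -eo pcpu,pid,comm | high_cpu_procs ftPerl 50
```

If it prints a PID, that host is a candidate for the "Reconfigure HA" step described above.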

18 Replies
ekos
Contributor

Thanks for posting this solution! One of our hosts was experiencing the exact same issue.

Reconfiguring HA did the trick!

richlane79
Contributor

We've been getting the same issue on our new cluster, but running on 3.5 U3. It's not tied to a particular host either. Reconfiguring HA does the trick, but we find the problem returns periodically.

Will be raising this with tech support.

Threonine
Contributor

Just ran into this at a customer site. We are running vCenter 2.5 Update 4 (Build 147633) and ESX 3.5 Update 3 (Build 123630) with 4 hosts. This was causing major issues for vRanger backups (slowing backups to a crawl). Resolved (at least temporarily) by running the "Reconfigure for VMware HA" procedure on the affected host. Very disappointing to see this issue occur as High Availability has been around for quite some time and obviously this issue still exists with the latest builds.

Update: The issue recurred about 20 minutes later on another ESX host. I've completely disabled HA for now to clear it up.

sbarnhart
Enthusiast

Add me in as another "me too". Reconfiguring for HA on the affected host appeared to solve the problem for that host (for now). I will monitor and see if it returns.

I had initially suspected something to do with VMware Converter, as we have been doing conversions to this host and they have been GLACIAL. XP boxes with 8-9GB of used disk space are taking 2 hours to convert to VMs. Reconfiguring for HA has cleared the high CPU utilization but not sped up conversion, unfortunately.

sidhas
Contributor

Count me in too. ftPerl is consuming all of processor 0. Reconfiguring for HA helps for a couple of days, then it's back. Running VC 2.5 Update 4 and ESX 3.5 Update 3. It may be hardware dependent: I have two DL380 G4s in a cluster with HA that have no issues, and a pair of BL460 blades in a cluster with HA that have the problem. All were installed identically from a scripted install, with the SIM agent install scripted after that.

sidhas
Contributor

In case anyone else has this issue, what finally worked for me was recreating the cluster. I followed this KB article and the problem has not returned. http://kb.vmware.com/kb/1003715

tohmeiphun
Contributor

Hi, I'm running 3.0.2, what if you just kill ftPerl?

-tom

cody_bunch
Hot Shot

I've found rebuilding the cluster to work well: http://professionalvmware.com/2009/05/27/ftperl-hates-me/

-Cody Bunch

vExpert, VCP VI3

http://professionalvmware.com

amsh
Contributor

Same happened to me too:

ESX 3.5U4 + VC2.5U4. The cluster is configured for EVC.

I noticed this behavior on one of the ESX servers. I evacuated all the VMs from it and wanted to reboot it. After entering maintenance mode, the CPU returned to normal. Exited maintenance mode and moved the VMs back to it.

After one day, the second ESX in this cluster showed this problem.

Reconfigure for HA fixed it.

steveanderson3
Contributor

We are experiencing this issue on three different hosts, in three different environments, all running ESX 3.5 U4. Seems like there should be a better fix than rebuilding the entire cluster? That is quite a bit of work depending on your cluster size. I wonder if this is fixed in 4.0.

javella
Contributor

Cody,

Did rebuilding the cluster prevent future ftPerl issues, or did it only help that one time? We're having the same ftPerl 100% usage issue on our clusters. We may as well reconfigure HA host by host whenever the problem occurs, since we fix one and it starts on another. This is vCenter 2.5 U4 + ESX 3.5 U4 with all the latest patches =( The VMware engineer on our support case did mention there are still some issues to be worked out. He said disabling the pegasus service usually fixes most of them, which we've already had to do for other memory leak issues. We'll be giving him a call back 😃

-Joaquin

cody_bunch
Hot Shot

Joaquin,

Let me know how the case plays out then. It will be great to get some official response/fix to this. Yes, rebuilding the cluster was the only way I got it to stay gone.

-Cody Bunch

vExpert, VCP VI3

http://professionalvmware.com

-Jason_Pope-
Contributor

Hi all,

I had the same issue running ESX 3.5 U4 (build 163429). I did all the HA stuff (disabling and enabling) and it kept coming back.

I have been graphing the load averages in nagios for all my VM Hosts and noticed the loads rising slowly over a period of a couple of days, when suddenly it would spike on one of the servers. Looking at that server I saw ftPerl having a field day.
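
A load check along these lines can be sketched in a few lines of shell. This is not the check used above, only an illustration that follows the usual Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL). The function name `check_load` and the thresholds 4 and 8 are assumptions chosen for the example.

```shell
#!/bin/sh
# Classify a load average against warning and critical thresholds.
# Prints a one-line status and returns 0 (OK), 1 (WARNING) or 2 (CRITICAL).
# check_load is a hypothetical name used only for this sketch.
check_load() {
    load="$1"
    warn="$2"
    crit="$3"
    awk -v l="$load" -v w="$warn" -v c="$crit" 'BEGIN {
        if (l + 0 >= c + 0) { print "CRITICAL - load1=" l; exit 2 }
        if (l + 0 >= w + 0) { print "WARNING - load1=" l; exit 1 }
        print "OK - load1=" l
        exit 0
    }'
}

# Example use on a Linux host or the service console, where the first
# field of /proc/loadavg is the 1-minute load average:
#   check_load "$(cut -d' ' -f1 /proc/loadavg)" 4 8
```

Graphing the value over a few days is what makes the slow climb visible; a threshold check like this only catches the final spike.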

One thing I did notice every time I re-enabled HA on the hosts (while watching top) was the HP VMM (HP Virtual Machine Manager) service spiking a little throughout the reconfigure. Since we don't use HPVMM in our environment, I removed it, then reconfigured HA across the cluster. My load averages have now stayed low and constant for days.

Hope this helps someone, whether HPVMM is the real cause or this is just a fluke.

a2alpha
Expert

If you don't mind me asking, what do you mean by rebuilding the cluster? Did you just create a new cluster, add the existing hosts to it, and then enable HA / DRS on it, or did you do that and also reinstall all the hosts?

Sorry to bug you but I am seeing this in one of our environments and I want to make sure it stays gone!!

Thanks,

Dan

-Jason_Pope-
Contributor

To take them out of their cluster, drag and drop the individual hosts onto the top level of the tree. Once all are on the top level, delete the cluster they were under, create a new cluster, and set up the HA/DRS settings you want. Then add the hosts into that cluster (drag and drop again). HA should then reconfigure itself on the hosts.

Also check for the HP Virtual Machine Manager agent if you have that installed on your hosts. I found that removing it (as we don't use Insight Manager to manage the VMs) stopped the issues I had with HA's ftPerl going crazy. It has been running sweetly for almost 2 months now.

a2alpha
Expert

Thanks for the quick response. Is HPVMM installed by default on an ESX 3.5 installation? I take it that would only be on an HP host. This site is Dell, and we didn't install the Dell management stuff manually.

Thanks again.

-Jason_Pope-
Contributor

HPVMM is not installed by default, but it can be installed by a sysadmin that has HP Systems Insight Manager to monitor the servers (Dell or HP).

Just run rpm -qa | grep -i hpvmm

If not there then don't worry.
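
If you want to run that check across several hosts, a small wrapper like this may help. It is only a sketch: `has_hpvmm` is a made-up function name, and matching on the string "hpvmm" follows the rpm query above rather than a confirmed package name.

```shell
#!/bin/sh
# Read package names from stdin (for example, the output of `rpm -qa`),
# print any that contain "hpvmm" (case-insensitive) and return 0.
# If none match, print a short note and return 1.
# has_hpvmm is a hypothetical helper name used only for this sketch.
has_hpvmm() {
    if grep -i 'hpvmm'; then
        return 0
    fi
    echo "no hpvmm package found"
    return 1
}

# Example use on the service console:
#   rpm -qa | has_hpvmm
```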

a2alpha
Expert

Thanks for this, I'll give it a go.

Dan
