CPU spikes on a PowerEdge 850

thor918 · ‎08-31-2008

hi there.

I just finished putting together a PowerEdge 850 with a PERC 5/i SAS RAID card and two SATA disks in RAID 1.

The system seems fine; everything is green under health status. It has only 1 GB of RAM at the moment, but it runs just fine with that.

I just noticed that when I look at the performance window for my CPU, there are regular CPU spikes.

They seem to occur every 10 minutes.

http://home.no.net/thor918/vmware/spikes.jpg

Does anyone have any clue why I get these spikes, which average around 40%?

The spikes occur even if no virtual machines are running.

thor918 · ‎09-11-2008

I tried the same thing as you now; the spikes never go away on my machine.

The only way I found that works is to allocate less CPU to the process that goes crazy,

but that is hardly a fix; it's just like sweeping dust under the carpet. It's there but you can't see it.

When I find time I will investigate more.

ar039 · ‎09-11-2008

Hi there, thought I would just say I have the exact same symptoms, every 10mins a CPU spike, but with a different configuration.

I have a whitebox with: ASUS P5K Motherboard, Intel Q6600 CPU, 8GB RAM, LSI MegaRAID 8308ELP RAID card, 4x500GB SATAII HDD and Intel Pro/1000MT NIC.

During the time of the raised CPU this is the contents of the /var/log/messages file:

Sep 11 23:27:32 LSIESG: LSIESG:INTERNAL :: StorelibManager::getEnclosureConfig - StorelibManager::getEnclosureConfig:ProcessLibComma

Sep 11 23:27:32 sfcbd: INTERNAL StorelibManager::getEnclosureConfig - StorelibManager::getEnclosureConfig:ProcessLibCommandCallfail

Sep 11 23:27:32 LSIESG: LSIESG:INTERNAL :: StorelibManager::fireStorelibCommand - caller StorelibManager::getBBUStatus, ProcessLibCo

Sep 11 23:27:32 sfcbd: INTERNAL StorelibManager::fireStorelibCommand - caller StorelibManager::getBBUStatus, ProcessLibCommandCall

Sep 11 23:27:33 LSIESG: LSIESG:INTERNAL :: StorelibManager::fireStorelibCommand - caller StorelibManager::getBBUCapacityInfo, Proces

Sep 11 23:27:33 sfcbd: INTERNAL StorelibManager::fireStorelibCommand - caller StorelibManager::getBBUCapacityInfo, ProcessLibComman

Sep 11 23:27:33 LSIESG: LSIESG:INTERNAL :: StorelibManager::fireStorelibCommand - caller StorelibManager::getBBUDesignInfo, ProcessL

Sep 11 23:27:33 sfcbd: INTERNAL StorelibManager::fireStorelibCommand - caller StorelibManager::getBBUDesignInfo, ProcessLibCommandC

Sep 11 23:27:34 LSIESG: LSIESG:INTERNAL :: StorelibManager::fireStorelibCommand - caller StorelibManager::getBBUProperties, ProcessL

Sep 11 23:27:34 sfcbd: INTERNAL StorelibManager::fireStorelibCommand - caller StorelibManager::getBBUProperties, ProcessLibCommandC

The RAID card does not have a BBU.

thor918 · ‎09-12-2008

Did you check with the resxtop command whether it's the same process that the others in this thread have reported spiking?

Your log messages seem different from mine. I have the BBU on my controller.

thor918 · ‎09-12-2008

A tip to help investigate the source.

Enable SSH on the box (there is a how-to on enabling SSH; try searching for it on the net).

Open two different SSH sessions:

In session 1, type:

tail -f /var/log/messages

(this shows the last lines of the log and keeps monitoring the file for changes)

In session 2, type:

esxtop

(this shows the list of processes with their CPU use)

Try to see which log messages appear exactly when the spike comes.

From the latest test, when I monitored the log exactly while it was spiking:

Sep 12 19:26:22 LSIESG: LSIESG:INTERNAL :: StorelibManager::fireStorelibCommand - caller StorelibManager::getConnectorInfo, ProcessLibCommandCall failed, rval = 0x2

Sep 12 19:26:22 sfcbd: INTERNAL StorelibManager::fireStorelibCommand - caller StorelibManager::getConnectorInfo, ProcessLibCommandCall failed, rval = 0x2

Sep 12 19:26:22 LSIESG: LSIESG:INTERNAL :: StorelibManager::discover - DatadiscoveryfailedforConnector;Errorcode=2

Sep 12 19:26:22 sfcbd: INTERNAL StorelibManager::discover - DatadiscoveryfailedforConnector;Errorcode=2
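
The correlation step above can also be done offline on a copy of /var/log/messages. Here is a minimal Python sketch, assuming the log uses the "Mon DD HH:MM:SS source:" prefix seen in the excerpts in this thread. The function names, the 60-second grouping window, and the assumed year are illustrative choices and not part of any VMware tooling.

```python
import re
from datetime import datetime, timedelta

# Matches the "Mon DD HH:MM:SS source:" prefix seen in this thread's log excerpts.
LINE_RE = re.compile(r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) (\w+):")


def parse_line(line, year=2008):
    """Return (timestamp, source) for a syslog-style line, or None if it does not match."""
    m = LINE_RE.match(line)
    if not m:
        return None
    # These log lines carry no year, so one has to be assumed.
    stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
    return stamp, m.group(2)


def group_bursts(lines, sources=("sfcbd", "LSIESG"), max_gap=timedelta(seconds=60)):
    """Group lines from the given sources into bursts.

    A new burst starts whenever the quiet period since the previous
    matching line is longer than max_gap.
    """
    parsed = (parse_line(line) for line in lines)
    stamps = sorted(p[0] for p in parsed if p and p[1] in sources)
    bursts = []
    for t in stamps:
        if bursts and t - bursts[-1][1] <= max_gap:
            bursts[-1][1] = t  # extend the current burst
        else:
            bursts.append([t, t])  # start a new burst
    return [(start, end) for start, end in bursts]


def burst_intervals(bursts):
    """Time between the starts of consecutive bursts."""
    return [b[0] - a[0] for a, b in zip(bursts, bursts[1:])]


# Two lines copied from the excerpt above; a real run would read the whole log file.
sample = [
    "Sep 12 19:26:22 LSIESG: LSIESG:INTERNAL :: StorelibManager::discover - DatadiscoveryfailedforConnector;Errorcode=2",
    "Sep 12 19:26:22 sfcbd: INTERNAL StorelibManager::discover - DatadiscoveryfailedforConnector;Errorcode=2",
]
for start, end in group_bursts(sample):
    print(start, "->", end)
```

If the intervals returned by burst_intervals on a full log cluster around ten minutes, that supports the idea that the sfcbd/LSIESG messages and the CPU spikes share a cause.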

jstretch · ‎09-12-2008

Same problem here with a Dell 1950 running RAID5. top shows several sfcbd processes spiking every ten minutes, and the same StorelibManager messages appear in the log:

Sep 13 00:14:55 LSIESG: LSIESG:INTERNAL :: StorelibManager::fireStorelibCommand - caller StorelibManager::getConnectorInfo, ProcessLibCommandCall failed, rval = 0x2

Sep 13 00:14:55 sfcbd: INTERNAL StorelibManager::fireStorelibCommand - caller StorelibManager::getConnectorInfo, ProcessLibCommandCall failed, rval = 0x2

Sep 13 00:14:55 LSIESG: LSIESG:INTERNAL :: StorelibManager::discover - DatadiscoveryfailedforConnector;Errorcode=2

Sep 13 00:14:55 sfcbd: INTERNAL StorelibManager::discover - DatadiscoveryfailedforConnector;Errorcode=2

Sep 13 00:14:55 vmkernel: 0:00:29:50.221 cpu3:1509)WARNING: UserThread: 406: Peer table full for sfcbd

Sep 13 00:14:55 vmkernel: 0:00:29:50.221 cpu3:1509)WARNING: World: vm 8149: 910: init fn user failed with: Out of resources!

Sep 13 00:14:55 vmkernel: 0:00:29:50.221 cpu3:1509)WARNING: World: vm 8149: 1775: WorldInit failed: trying to cleanup.

Unfortunately we can't progress with an Infrastructure deployment until we get this issue fixed in our evaluation lab.

ar039 · ‎09-14-2008

Hi Thor,

The previous lines from /var/log/messages occur exactly when the spike starts. The spike lasts approx 2 mins and occurs every 10 mins. The process in resxtop which was associated with the spike is sfcbd.833867.

I followed the previous instructions to reduce its impact by using System Resource Allocation for sfcb, and this is a good temporary solution for me.

thor918 · ‎09-16-2008

Good to hear something good came out of this topic.

I use the resource allocation trick; however, I would much rather find out whether it can be solved properly.

Could someone with the same CPU spike problem who has active support file it as a support request?

http://www.vmware.com/support/policies/defect.html

thor918 · ‎09-27-2008

Hi folks,

Looks like "nick.couchman" was right in his first post in this thread about "health monitoring":

http://communities.vmware.com/message/1061379#1061379

Dave.Mishchenko has come up with more info about the problem process.

It seems the sfcbd process is the CIM server. The CIM server is used to present health status (Virtual client\Configuration \ Health Status).

Turning off the CIM server did remove the CPU spikes I was experiencing!

So if the sfcbd process is just for health monitoring, I don't see any problem with adjusting how much CPU that process may take.

I would rather use the allocation adjustment trick than disable health monitoring altogether.

Looks like VMware should look into the code of the health monitoring server.

thor918 · ‎09-28-2008

Good news everyone!

Looks like a firmware update from VMware was made available (18 Sept)...

After I upgraded with that update, the CPU spikes are gone!

I'm going to test it for a while, but it sure looks good.

(edited.....)

Dabj · ‎09-28-2008

Nice to hear something is happening. I will try it on Monday. Small question: I don't see the update on the VMware site.

Greetings.


thor918 · ‎09-28-2008

I may have been mistaken about the date.

The update was available through the "VMware Infrastructure Update" program.

Release Date: 18-Sep-2008

Build Number: 113338

ESXi - patches

Dabj · ‎10-20-2008

Finally put the update to the test. It did indeed resolve my issue.

Thanks, VMware!

Case closed.

imvirtualized · ‎11-22-2008

Sadly I have to say it doesn't!

With ESXi downloaded today and running on a Dell PowerEdge T300, those spikes are there every 10 minutes and last for 2 minutes.

Is there any fix I might have missed?

Thanks a lot

Dave_Mishchenko · ‎11-22-2008

Have you tried to disable the CIM providers? And have you tried to narrow down the problem process? You can use resxtop in the RCLI appliance or esxtop if you have console / SSH access (not recommended) to your host.

imvirtualized · ‎11-23-2008

Unfortunately not yet, but I'll try this as soon as I have access to the server.

Is there any procedure for using RCLI? (I'm a Unix/Linux user, but CLI on Windows does scare me a bit ^^)

Thanks a lot

thor918 · ‎11-23-2008

A download does not necessarily mean that you get the newest update.

What date is the download from?

Have you checked with the ESX update service that is bundled with VIC?

I see the newest download on the download page is:

ESX Server 3i U3 Installable Refresh

Version 3.5 Update 3 | 123629 - 11/06/08

The newest patch for ESXi is from the same date as above:

http://support.vmware.com/selfsupport/download/

I guess that download should have the spikes fixed. Hmm, I hope the spikes don't return when I update again...

imvirtualized · ‎11-23-2008

I downloaded the version on the download page.

I mean this one:

Version 3.5 Update 3 | 123629 - 11/06/08

I've also checked for updates using the Windows client (something like Start>VMware>VMware update).

Is this the right way to check for updates? (sorry, ESXi newbie here ^^)

I'll try to find the crazy process during the week. Should I search for anything else?

I hope for your sake that the spikes won't come back!

mmurrin · ‎12-04-2008

This is an issue for me as well. I have ESXi installed on a Dell PowerEdge 2950 III. I have Update 3 build 123629 installed but the spikes are still happening every 10 minutes. I can lower the reservation on the process causing the spike but I am only keeping that option open if there is nothing else I can do.

Wurzelsepp · ‎12-05-2008

Hi all

I have the same problem here...

Hardware:

DELL Poweredge 2950

2x Xeon 5150

8GB Ram

ESXi build 123629

CPU spikes every 10 min... does nobody have a solution for this problem?

The only thing I can do is go to Configuration / System Resource Allocation / Advanced

and edit the sfcb process (change the CPU limit from unlimited to 466 MHz, for example)... but that's only a workaround.
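
To put the 466 MHz example in perspective, a limit in MHz can be read as a share of one core. A small Python sketch follows. The 2660 MHz figure is an assumed nominal clock for one Xeon 5150 core (it is not stated in the post), and limit_share is a hypothetical helper name, so substitute your own host's values.

```python
def limit_share(limit_mhz, core_mhz):
    """Fraction of one core's clock that a CPU limit in MHz represents."""
    if core_mhz <= 0:
        raise ValueError("core_mhz must be positive")
    return limit_mhz / core_mhz


# 466 MHz is the example limit from the post above.
# 2660 MHz is an ASSUMED nominal clock for one Xeon 5150 core.
share = limit_share(466, 2660)
print(f"{share:.1%} of one core")
```

Under that assumption, the limit caps sfcb at roughly a sixth of a single core, which hides the spike but does not stop the underlying polling.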

sapro27 · ‎12-15-2008

Same problem with DELL Poweredge 2950 III and ESX3i Build 130755