VMware Cloud Community
bhawkins4194
Contributor

Seeing hard crashes, any known issues or troubleshooting for a Cisco UCS-E blade?

I have a Cisco ISR router with a UCS-E blade, and I have been having issues ever since we got it. I performed the latest 5.5 Update 2 upgrade hoping the hard crashes would stop, but it does not seem to have made a difference: tonight I had 7 VMs go down when the host apparently crashed and rebooted. I could not see the purple screen everyone is talking about. Is there a crash log I can get off the device? I would really like to know whether I have failing hardware or whether I am asking too much of this little guy. The crashes appear every few days while virtual machines are running; I don't see any recorded when the virtual machines are suspended. I used the special Cisco build of ESXi that I have available to download, with a standard setup. The only thing I think I did wrong in the setup is that the hypervisor is installed to the hard drive and not to the SD card that the instructions say to use. This is the same drive that hosts the datastore.

5 Replies
Alistar
Expert

Hello,

we can gain more insight if you provide vmkernel.log and vmkwarning.log from /var/log - use WinSCP to connect to your ESXi host and gather these files from there. It is highly improbable that the ESXi is crashing solely because it is deployed on hard drives and not on SD cards because the whole hypervisor gets loaded into memory on boot.
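
Once the two files are copied off the host, a small script can help narrow down where to look. The following is only a rough sketch: the keyword list is my own guess at terms worth searching for, not an official VMware list, and `find_suspect_lines` and `scan_file` are hypothetical helper names. It assumes the logs were saved into the current directory under their original names.

```python
from pathlib import Path

# Assumed search terms (illustrative only); adjust to whatever you are hunting for.
KEYWORDS = ("panic", "nmi", "machine check", "backtrace")


def find_suspect_lines(text, keywords=KEYWORDS):
    """Return (line_number, line) pairs for lines containing any keyword.

    Matching is case-insensitive; line numbers start at 1.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        if any(word in lowered for word in keywords):
            hits.append((number, line))
    return hits


def scan_file(path, keywords=KEYWORDS):
    """Read a locally copied log file and return its suspect lines."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    text = Path(path).read_text(errors="replace")
    return find_suspect_lines(text, keywords)


if __name__ == "__main__":
    # File names as mentioned above, assumed to be in the current directory.
    for name in ("vmkernel.log", "vmkwarning.log"):
        if Path(name).exists():
            for number, line in scan_file(name):
                print(f"{name}:{number}: {line}")
```

The last entries written before the host went down are usually the most interesting, so it is worth reading the lines around each hit rather than only the hit itself.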

Good luck!

Stop by my blog if you'd like 🙂 I dabble in vSphere troubleshooting, PowerCLI scripting and NetApp storage - and I share my journeys at http://vmxp.wordpress.com/
bhawkins4194
Contributor

We might have to wait until it crashes again; I missed the logs before the end-of-day recycle. You may see a few times in the logs where it was powered down this evening. I did notice there was a BIOS update from Cisco, which I applied. (Fingers crossed that fixes it.)

Thanks Again.

Alistar
Expert

Hi again,

unfortunately I really can't see anything in the logs 🙁 Anyway, a BIOS upgrade is a great way to stabilize things, so fingers crossed here. Also, has anyone done a physical inspection of the hardware itself? We had one fellow on this forum in the past whose server reset at random, and eventually he found out that it was overheating - cleaning the internals helped.

You're welcome, and hopefully the crashes will stop appearing! 🙂

cykVM
Expert

Also check whether Cisco provides further information on firmware/driver versions that are known/tested to be working. That mainly means firmware for the storage controller(s) and NICs in use, but may also include, for example, the remote management card (IPMI or similar). In rare cases even a firmware downgrade might be needed.

bhawkins4194
Contributor

Well, it ran for almost three weeks without a crash, but I woke up this morning to emails saying servers were offline. I have attached the latest logs. I'm not sure what I am looking for, so if someone sees something, could they point it out so that in the future I can take a look myself?
