VMware Cloud Community
timowest
Contributor

ESX crashes

We have encountered frequent ESX crashes and are wondering what the reason is.

11 Replies
timowest
Contributor

We are running ESX 3.0.1 with internal storage on an HP ProLiant tower.

On a crash, the console output contains lines like the following:

\* Waiting for time-out heartbeat

\* WARNING : SCSI: 5444: READ of handleID 0x4b9

\* WARNING : LinBlock: 1284: Asked to abort - ignoring

Any ideas or similar experiences?
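If the console scrolls too fast to read, it can help to tally which warnings repeat most. Below is a minimal Python sketch, assuming the warnings always follow the `WARNING : <module>: <number>: <message>` shape seen in the lines above. The function name `count_warnings` is made up for illustration, and what the number means is not confirmed anywhere in this thread.

```python
import re
from collections import Counter

# Assumed shape, taken from the console lines quoted above:
#   WARNING : SCSI: 5444: READ of handleID 0x4b9
WARNING_RE = re.compile(
    r"WARNING\s*:\s*(?P<module>\w+)\s*:\s*(?P<number>\d+)\s*:\s*(?P<message>.*)"
)


def count_warnings(lines):
    """Count warnings per (module, number) pair; other lines are ignored."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            counts[(match.group("module"), int(match.group("number")))] += 1
    return counts


if __name__ == "__main__":
    # The three console lines quoted in the post above.
    console = [
        "* Waiting for time-out heartbeat",
        "* WARNING : SCSI: 5444: READ of handleID 0x4b9",
        "* WARNING : LinBlock: 1284: Asked to abort - ignoring",
    ]
    for (module, number), hits in count_warnings(console).most_common():
        print(module, number, hits)
```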

continuum
Immortal

Maybe someone pulled the plug?

Just kidding. Do you really expect that anyone will be able to help you with so little background info?



VirtualNoitall
Virtuoso

Hello,

I would open a support call with VMware. Beyond that, have a look at the logs in the /var/log/vmware directory, key in on the time of the crash, and start working backwards.
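A rough Python sketch of that "work backwards from the crash time" step. It assumes the log lines start with a syslog-style timestamp such as `Jan 12 14:03:21`, which has no year, so one has to be supplied. The function name `lines_before_crash` and the 30-minute default window are illustrative choices, not anything VMware ships.

```python
from datetime import datetime, timedelta


def lines_before_crash(lines, crash_time, window_minutes=30):
    """Return log lines stamped within window_minutes before crash_time.

    Assumes each line begins with a 15-character syslog-style timestamp
    ("Jan 12 14:03:21"); the year is borrowed from crash_time.
    """
    start = crash_time - timedelta(minutes=window_minutes)
    selected = []
    for line in lines:
        try:
            stamp = datetime.strptime(
                f"{crash_time.year} {line[:15]}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # no recognisable timestamp at the start of the line
        if start <= stamp <= crash_time:
            selected.append(line.rstrip("\n"))
    return selected
```

You would read a log file into a list of lines, pass the approximate crash time as a `datetime`, and widen the window if nothing interesting shows up.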

RS_1
Enthusiast

Hello, I have the LinBlock problem too.

The VMs don't respond to ping anymore, but the ESX host does. If I try to connect via SSH or with the VI Client, I get a timeout.

I can only see the vmkernel log on the console.

Did you get an answer from VMware?

Thx

opbz
Hot Shot

Hi

When you say crashes... what crashes? The service console access? The VMs? The VC?

By the way, how patched is your ESX 3.0.1? Try the command esxupdate -l query. If you do not see a whole lot of patches, I would consider patching to be a first step. At least apply the critical ones.

RS_1
Enthusiast

hi,

When I say crashes, I mean all VMs are down (I can't tell whether they are really down, but I can't reach them over the network), VC is down, and the service console responds but I can't connect. I can't log in on the local console either (I can enter the login but get no password prompt). My ESX 3.0.1 is 100% patched.

If I can't get an answer from VMware, I'll probably move the VMs to a 3.5 host.

Regards

opbz
Hot Shot

Do you do anything to get it back up and running, or does it recover by itself?

  • To see if it's a network problem:

How many physical NICs do you have, and how are they connected?

  • To see if it's ESX related:

How much space do you have on your LUNs? Run vdf -l, and also check the space on /.

How is your storage configured? Do you have separate disks for the /vmfs volumes?

  • If your server comes back up without a reboot, do you get any error messages on your VMs, like unexpected shutdowns?

Are you using VC? Does your server disconnect from it?

By the way, there was an old problem similar to yours at http://communities.vmware.com/message/500553. The LinBlock number is slightly different but similar. It seemed related to networking and to the Broadcom card, although granted, that was for ESX 2.5.x.

I also know that there were problems with Broadcom cards on ESX 3.0.1:

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=2242&slice...

This particular issue appears to have been resolved by ESX 3.0.2 Update 1.

RS_1
Enthusiast

I have 2 physical network cards, one for the console and the other carrying VLANs (both Intel e1000). Both are healthy and correctly configured.

I have one big VMFS volume with 70 GB free.

I use VC, and the ESX host was disconnected (ping OK, but I can't connect, even over SSH).

BTW, this is the first time it has happened.

opbz
Hot Shot

How much space do you have left on root?

Did it ever resolve itself, or did you have to reboot?

RS_1
Enthusiast

/dev/cciss/c0d0p2 4.9G 2.4G 2.3G 52% /

/dev/cciss/c0d0p1 99M 29M 65M 31% /boot

none 131M 0 131M 0% /dev/shm

/dev/cciss/c0d0p6 2.0G 123M 1.7G 7% /var/log

I need to power off the DL380 G5 to solve the problem. The system was almost frozen (the vmkernel log was still visible on Alt-F11).
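For anyone wanting to automate the free-space check, here is a small Python sketch that reads df-style rows like the ones above and flags nearly full mount points. It assumes six whitespace-separated columns with the use percentage in the fifth. The function names and the 90% threshold are arbitrary choices for illustration.

```python
def parse_df_lines(lines):
    """Return (mount_point, used_percent) pairs from df-style rows."""
    rows = []
    for line in lines:
        parts = line.split()
        # Expect: device, size, used, available, use%, mount point.
        if len(parts) != 6 or not parts[4].endswith("%"):
            continue  # header or malformed line
        try:
            rows.append((parts[5], int(parts[4].rstrip("%"))))
        except ValueError:
            continue  # percentage column was not a number
    return rows


def nearly_full(lines, threshold=90):
    """List mount points whose usage is at or above threshold percent."""
    return [mount for mount, used in parse_df_lines(lines) if used >= threshold]


if __name__ == "__main__":
    # The rows posted above.
    rows = [
        "/dev/cciss/c0d0p2 4.9G 2.4G 2.3G 52% /",
        "/dev/cciss/c0d0p1 99M 29M 65M 31% /boot",
        "none 131M 0 131M 0% /dev/shm",
        "/dev/cciss/c0d0p6 2.0G 123M 1.7G 7% /var/log",
    ]
    print(parse_df_lines(rows))
    print(nearly_full(rows))
```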

opbz
Hot Shot

We are kind of running dry here...

Your system is ESX 3.0.1, patched with all 70+ patches.

Your networking is working fine.

You seem to have plenty of space.

Your server is fully qualified in the VMware support matrix.

I'm assuming your server is also at the latest BIOS and firmware.

I also assume you have no other external networking problems.

I would say that if this happens again, you have 2 choices:

1: Gather vm-support output and escalate to VMware.

2: Upgrade to ESX 3.0.2 or 3.5.0 and hope the newer drivers in those releases resolve your issue.
