We have encountered frequent ESX crashes and are wondering what the reason is.
We have ESX 3.0.1 with internal storage on an HP ProLiant tower.
On crash, the console output contains lines like the following:
\* Waiting for time-out heartbeat
\* WARNING : SCSI: 5444: READ of handleID 0x4b9
\* WARNING : LinBlock: 1284: Asked to abort - ignoring
Any ideas or similar experiences?
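When comparing console output like the lines above against other reports, it can help to split each warning into its component name, the number that follows it, and the message. The snippet below is only a minimal sketch: the regular expression is inferred from the three sample lines in this post and not from any documented vmkernel log format, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the sample console lines only (an assumption,
# not a documented format): "WARNING : <component>: <number>: <message>".
WARNING_RE = re.compile(
    r"WARNING\s*:\s*(?P<component>\w+):\s*(?P<number>\d+):\s*(?P<message>.*)"
)


def parse_warning(line):
    """Return (component, number, message) for a warning line, else None."""
    match = WARNING_RE.search(line)
    if match is None:
        return None
    return (
        match.group("component"),
        int(match.group("number")),
        match.group("message").strip(),
    )


def count_by_component(lines):
    """Count how many warnings each component produced."""
    parsed = (parse_warning(line) for line in lines)
    return Counter(item[0] for item in parsed if item is not None)


if __name__ == "__main__":
    sample = [
        "* Waiting for time-out heartbeat",
        "* WARNING : SCSI: 5444: READ of handleID 0x4b9",
        "* WARNING : LinBlock: 1284: Asked to abort - ignoring",
    ]
    for line in sample:
        print(parse_warning(line))
    print(count_by_component(sample))
```

Lines that are not warnings (such as the heartbeat line) are skipped, so the same function could be run over a longer capture of the console.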
Maybe someone pulled the plug?
Just kidding - do you really expect that someone will be able to help you with so little background info?
Hello,
I would open a support call with VMware. Outside of that, have a look at the logs in the /var/log/vmware directory, key in on the time of the crash, and start working backwards.
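As a rough illustration of working backwards from the crash time, here is a small sketch that keeps only the log lines stamped within a window before the crash and lists them newest first. It assumes syslog-style lines that start with a "Mon DD HH:MM:SS" timestamp and carry no year; that format, the sample lines, the host name and the 30-minute default are all assumptions for the example, not something taken from the actual ESX logs.

```python
from datetime import datetime, timedelta

STAMP_LEN = 15  # length of an assumed "Mon DD HH:MM:SS" prefix


def parse_stamp(line, year):
    """Return the timestamp at the start of a syslog-style line, or None."""
    try:
        return datetime.strptime(
            f"{year} {line[:STAMP_LEN]}", "%Y %b %d %H:%M:%S"
        )
    except ValueError:
        return None


def entries_before(lines, crash_time, window_minutes=30):
    """Return lines stamped within the window before crash_time, newest first."""
    start = crash_time - timedelta(minutes=window_minutes)
    hits = []
    for line in lines:
        # The year is borrowed from crash_time because the lines carry none.
        stamp = parse_stamp(line, crash_time.year)
        if stamp is not None and start <= stamp <= crash_time:
            hits.append((stamp, line.rstrip("\n")))
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [line for _, line in hits]


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample = [
        "Mar  5 12:00:00 esx1 vmkernel: early message",
        "Mar  5 12:20:00 esx1 vmkernel: later message",
        "Mar  5 12:29:00 esx1 vmkernel: last message before the crash",
    ]
    crash = datetime(2008, 3, 5, 12, 30, 0)
    for entry in entries_before(sample, crash, window_minutes=15):
        print(entry)
```

Lines without a parseable timestamp are ignored, so continuation lines would need separate handling.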
Hello, I have the LinBlock problem too.
The VMs don't respond to ping anymore, but the ESX host does. If I try to connect over SSH or with the VI Client, I get a timeout.
I only see the vmkernel log on the console.
Did you get an answer from VMware?
Thx
Hi
When you say crashes... what crashes? The service console access? The VMs? The VC?
By the way, how patched is your ESX 3.0.1? Try the command esxupdate -l query. If you do not see a whole lot of patches, I would consider patching to be a first step. At least apply the critical ones.
hi,
When I say crashes, I mean all VMs down (I can't tell whether they're really down, but I can't reach them over the network) and VC down. The console responds, but I can't connect. I can't log in on the local console either (I can enter the login, but I get no password prompt). My ESX 3.0.1 is 100% patched.
If I can't get an answer from VMware, I'll probably move the VMs to a 3.5 host.
Regards
Do you do anything to get it back up and running, or does it recover by itself?
To see if it's a network problem:
How many physical NICs do you have, and how are they connected?
To see if it's ESX related:
How much space do you have on your LUNs? Run vdf -l and also check the space on /.
How is your storage configured? Do you have separate disks for the /vmfs volumes?
If your server comes back up without a reboot, do you get any error messages on your VMs, like unexpected shutdowns?
Are you using VC? Does your server disconnect from it?
By the way, there was an old problem similar to yours at http://communities.vmware.com/message/500553, with a slightly different LinBlock number but similar symptoms. It seemed related to networking and to the Broadcom card. Granted, it's for ESX 2.5.x.
By the way, I do know that there were problems with Broadcom cards on ESX 3.0.1.
This particular issue appears to have been resolved by ESX 3.0.2 U1.
I have 2 physical network cards, one for the console and the other with VLANs (both Intel e1000). They are both healthy and well configured.
I have one big VMFS with 70 GB left.
I use VC, and the ESX host was disconnected (ping OK, but I can't connect, even over SSH).
BTW, this is the first time it has happened.
How much space do you have left at root?
Did it ever resolve itself, or did you have to reboot?
/dev/cciss/c0d0p2 4.9G 2.4G 2.3G 52% /
/dev/cciss/c0d0p1 99M 29M 65M 31% /boot
none 131M 0 131M 0% /dev/shm
/dev/cciss/c0d0p6 2.0G 123M 1.7G 7% /var/log
I need to power off the DL380 G5 to solve the problem. The system was almost frozen (the vmkernel log was still showing on Alt-F11).
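The disk usage listing above can also be checked mechanically. This is just a sketch, assuming rows in the six-column layout shown in the post (device, size, used, available, use%, mount point); the function names and the 90% threshold are arbitrary choices for illustration and not a VMware recommendation.

```python
def parse_df(text):
    """Map each mount point to its use percentage from df-style rows."""
    usage = {}
    for row in text.strip().splitlines():
        fields = row.split()
        # Expect: device, size, used, available, use%, mount point.
        if len(fields) != 6 or not fields[4].endswith("%"):
            continue
        usage[fields[5]] = int(fields[4].rstrip("%"))
    return usage


def nearly_full(usage, threshold=90):
    """Return mount points at or above the threshold (arbitrary default)."""
    return sorted(mount for mount, pct in usage.items() if pct >= threshold)


if __name__ == "__main__":
    listing = """
/dev/cciss/c0d0p2 4.9G 2.4G 2.3G 52% /
/dev/cciss/c0d0p1 99M 29M 65M 31% /boot
none 131M 0 131M 0% /dev/shm
/dev/cciss/c0d0p6 2.0G 123M 1.7G 7% /var/log
"""
    usage = parse_df(listing)
    print(usage)
    print(nearly_full(usage))
```

With the figures posted here, nothing comes close to the threshold, which matches the conclusion that free space is not the problem.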
We are kind of running dry here...
Your system is ESX 3.0.1 patched with all 70+ patches.
Your networking is working fine.
You seem to have plenty of space.
Your server is fully qualified in the support matrix for VMware...
I assume your server is also at the latest BIOS and firmware.
I also assume you have no other external networking problems.
I would say that if this happens again, you have 2 choices:
1: Gather vm-support output and escalate to VMware.
2: Upgrade to ESX 3.0.2 or 3.5.0 and hope the newer drivers in those releases resolve your issue.