VMware Cloud Community
Tikitiboo201110
Enthusiast

vSphere ESX 4.1 server stops responding

Hi,

We have a vSphere ESX 4.1 server that runs four virtual machines. cPanel and Webmin run on the VMs.

Suddenly the server became unresponsive, and I could not connect to it at all.

I had to restart it with the power button to bring it back up.

Where should I start looking for the cause of the problem? This is the second time it has happened in the last two weeks.

Three of the VMs are web servers hosting many websites.

Kind Regards

Mili

8 Replies
LOAGANATHAN
Enthusiast

Hi,

Did you see any errors on the direct console of the ESX server while it was unresponsive? What is the hardware configuration and model? Are you using iSCSI or a Fibre Channel HBA for storage connectivity?

Shakaal
Hot Shot

Hi,

Please check the following things when the server hangs:

1. Ping the server. If it responds, connect over SSH (for example with PuTTY) and run esxtop to check performance. You can also run "tail -f /var/log/vmkernel" to watch for ongoing error messages.

2. If ping doesn't respond, try the direct console of the ESX host and check whether you can log in. If you can, try the steps above. If not, reboot, generate the logs by running the vm-support command or from the VI Client, and upload them to this post along with the timestamp.

Regards

Tikitiboo201110
Enthusiast

The server hung again today and I am very worried!

I tried to ping the main IP of the VMware host, but it timed out.

I tried to ping one of the websites hosted on one of the virtual servers and got one reply, but the rest timed out.

I couldn't connect to the console at all.

So I restarted the server and everything is back to normal. I generated a system log bundle.

But the log file is massive (300 MB). If you could tell me which specific files you need, I can upload them here.

There are three folders inside the log bundle:

tmp

var

vmfs

I was able to upload the tmp and vmfs folders as they are small.

The timestamp is:

Date: 12 Dec 2012

Time: between 16:55 and 17:30

I checked the server's performance chart in the VI Client and can see high CPU usage during the period when the server was not responding (not pinned at 100%, but fluctuating between 50% and 100%).

This server has four VMs, and I can see the same CPU pattern on one of them but not on the other three.

Many Thanks

Shakaal
Hot Shot

Hi,

Please upload /var/log/messages* and /var/log/vmkernel*.
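
If the full files are too large to attach, one option is to trim them to the incident window first. The following is only a rough sketch, assuming the logs use the classic syslog prefix ("Mon DD HH:MM:SS host ..."); the function name `lines_in_window` and the sample lines are illustrative, not part of any VMware tool.

```python
import re
from datetime import datetime

# Classic syslog prefix, e.g. "Jan 12 12:19:46 ds8193 vmkernel: ..."
STAMP = re.compile(r"^([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s")

def lines_in_window(lines, month, day, start, end):
    """Yield lines stamped on `month day` between start and end (HH:MM:SS, inclusive)."""
    lo = datetime.strptime(start, "%H:%M:%S").time()
    hi = datetime.strptime(end, "%H:%M:%S").time()
    for line in lines:
        m = STAMP.match(line)
        if not m:
            continue  # continuation or truncated line without a timestamp
        if m.group(1) != month or int(m.group(2)) != day:
            continue
        t = datetime.strptime(m.group(3), "%H:%M:%S").time()
        if lo <= t <= hi:
            yield line

# Hypothetical sample lines in the assumed format
sample = [
    "Jan 12 12:19:46 ds8193 vmkernel: 0:00:11:23.807 cpu0:4247)VSCSI: 2519: handle 8195(vscsi0:0):Reset [Retries: 0/0]",
    "Jan 12 10:43:49 ds8193 vmkernel: 0:01:18:20.073 cpu6:4102)NetSched: 3817: outputlist overflow. dropping 5 packets",
]
window = list(lines_in_window(sample, "Jan", 12, "10:00:00", "11:00:00"))
for line in window:
    print(line)
```

To run it against a real file, pass an open file object instead of `sample`, for example `lines_in_window(open("/var/log/vmkernel"), "Dec", 12, "16:55:00", "17:30:00")`. Syslog lines of this kind carry no year, so rotated files may need to be checked separately.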

Regards

Tikitiboo201110
Enthusiast

Thanks. I have attached them.

Shakaal
Hot Shot

Hi,

I don't see any log entries for the time you mentioned, and no errors are reported. These are the last lines in the vmkernel log:

"commands)
Jan 12 12:19:46 ds8193 vmkernel: 0:00:11:23.807 cpu0:4247)VSCSI: 2519: handle 8195(vscsi0:0):Reset [Retries: 0/0]
Jan 12 12:19:46 ds8193 vmkernel: 0:00:11:23.807 cpu0:4247)VSCSI: 2319: handle 8195(vscsi0:0):Completing reset (0 outstanding commands)"

The absence of any log entries for the event points to a possible hardware problem. I would ask you to have the hardware vendor run complete hardware diagnostics, and to make sure the BIOS and firmware on the host are up to date.

============

However, I do see some packet drops, but they occurred before the time the issue happened:

0:01:18:20.073 cpu6:4102)NetSched: 3817: outputlist overflow. dropping 1 packets
Jan 12 10:43:49 ds8193 last message repeated 2 times
Jan 12 10:43:49 ds8193 vmkernel: 0:01:18:20.073 cpu6:4102)NetSched: 3817: outputlist overflow. dropping 5 packets
Jan 12 10:43:49 ds8193 vmkernel: 0:01:18:20.073 cpu6:4102)NetSched: 3817: outputlist overflow. dropping 4 packets
Jan 12 10:43:49 ds8193 vmkernel: 0:01:18:20.074 cpu6:4102)NetSched: 3817: outputlist overflow. dropping 5 packets
Jan 12 10:43:49 ds8193 vmkernel: 0:01:18:20.074 cpu6:4102)NetSched: 3817: outputlist overflow. dropping 3 packets
Jan 12 10:43:49 ds8193 vmkernel

cpu8:4104)NetSched: 3817: outputlist overflow. dropping 2 packets
Jan 12 10:45:07 ds8193 vmkernel

===========
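
To gauge how heavy drops like these are, the NetSched lines can be totalled per CPU. This is a minimal sketch, assuming the message format shown in the excerpt above; `dropped_per_cpu` is a hypothetical helper, and "last message repeated N times" lines are deliberately not counted, so the totals are a lower bound.

```python
import re

# Matches e.g. "cpu6:4102)NetSched: 3817: outputlist overflow. dropping 5 packets"
DROP = re.compile(
    r"cpu(\d+):\d+\)NetSched: \d+: outputlist overflow\. dropping (\d+) packets"
)

def dropped_per_cpu(lines):
    """Return {cpu_number: total_dropped_packets} for NetSched overflow lines."""
    totals = {}
    for line in lines:
        m = DROP.search(line)
        if m:
            cpu = int(m.group(1))
            totals[cpu] = totals.get(cpu, 0) + int(m.group(2))
    return totals

# The complete NetSched lines from the excerpt above
sample = [
    "0:01:18:20.073 cpu6:4102)NetSched: 3817: outputlist overflow. dropping 1 packets",
    "Jan 12 10:43:49 ds8193 last message repeated 2 times",
    "Jan 12 10:43:49 ds8193 vmkernel: 0:01:18:20.073 cpu6:4102)NetSched: 3817: outputlist overflow. dropping 5 packets",
    "Jan 12 10:43:49 ds8193 vmkernel: 0:01:18:20.073 cpu6:4102)NetSched: 3817: outputlist overflow. dropping 4 packets",
    "Jan 12 10:43:49 ds8193 vmkernel: 0:01:18:20.074 cpu6:4102)NetSched: 3817: outputlist overflow. dropping 5 packets",
    "Jan 12 10:43:49 ds8193 vmkernel: 0:01:18:20.074 cpu6:4102)NetSched: 3817: outputlist overflow. dropping 3 packets",
    "cpu8:4104)NetSched: 3817: outputlist overflow. dropping 2 packets",
]
totals = dropped_per_cpu(sample)
print(totals)
```

Running the same function over the whole vmkernel log would show whether the drops cluster around particular times or are spread across the day.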

Regards

Tikitiboo201110
Enthusiast

Hi, could anyone here tell me what the following warnings mean? This is from the "vmkwarning" log file on the console.

Thanks

Jan 12 09:26:13 ds8193 vmkernel: 0:00:00:16.123 cpu12:4127)WARNING: ScsiScan: 1677: Add path: vmhba0:C0:T32:L0
Jan 12 09:26:13 ds8193 vmkernel: 0:00:00:16.203 cpu12:4127)WARNING: ScsiScan: 1677: Add path: vmhba0:C2:T0:L0
Jan 12 09:26:14 ds8193 vmkernel: 0:00:00:20.487 cpu0:4096)WARNING: ScsiHost: 903: SCSI command failed on handle 0x410008032840: Not supported
Jan 12 09:26:14 ds8193 vmkernel: 0:00:00:20.488 cpu0:4096)WARNING: ScsiHost: 903: SCSI command failed on handle 0x41000802f440: Not supported
Jan 12 09:26:14 ds8193 vmkernel: 0:00:00:20.488 cpu0:4096)WARNING: ScsiHost: 903: SCSI command failed on handle 0x410008033040: Not supported
Jan 12 09:26:14 ds8193 vmkernel: 0:00:00:20.488 cpu0:4096)WARNING: ScsiHost: 903: SCSI command failed on handle 0x410008033040: Not supported
Jan 12 09:26:14 ds8193 vmkernel: 0:00:00:20.488 cpu0:4096)VMNIX: WARNING: VmkDev: 2117: scsi_add_device(vml0, 0, 0, 0) [t10.DP______BACKPLANE000000] failed with -19 (alive 1)
Jan 12 12:09:08 ds8193 vmkernel: TSC: 72667732 cpu0:0)WARNING: MemMap: 1793: Reducing number of colors from 192 to 64
Jan 12 12:09:09 ds8193 vmkernel: 0:00:00:16.678 cpu2:4127)WARNING: ScsiScan: 1677: Add path: vmhba0:C0:T32:L0
Jan 12 12:09:09 ds8193 vmkernel: 0:00:00:16.747 cpu2:4127)WARNING: ScsiScan: 1677: Add path: vmhba0:C2:T0:L0
Jan 12 12:09:09 ds8193 vmkernel: 0:00:00:20.877 cpu0:4096)WARNING: ScsiHost: 903: SCSI command failed on handle 0x410008032840: Not supported
Jan 12 12:09:09 ds8193 vmkernel: 0:00:00:20.877 cpu0:4096)WARNING: ScsiHost: 903: SCSI command failed on handle 0x41000802f440: Not supported
Jan 12 12:09:09 ds8193 vmkernel: 0:00:00:20.877 cpu0:4096)WARNING: ScsiHost: 903: SCSI command failed on handle 0x410008033040: Not supported
Jan 12 12:09:09 ds8193 vmkernel: 0:00:00:20.877 cpu0:4096)WARNING: ScsiHost: 903: SCSI command failed on handle 0x410008033040: Not supported
Jan 12 12:09:09 ds8193 vmkernel: 0:00:00:20.877 cpu0:4096)VMNIX: WARNING: VmkDev: 2117: scsi_add_device(vml0, 0, 0, 0) [t10.DP______BACKPLANE000000] failed with -19 (alive 1)
Feb 14 09:12:57 ds8193 vmkernel: TSC: 72641954 cpu0:0)WARNING: MemMap: 1793: Reducing number of colors from 192 to 64
Feb 14 09:12:58 ds8193 vmkernel: 0:00:00:16.455 cpu2:4126)WARNING: ScsiScan: 1677: Add path: vmhba0:C0:T32:L0
Feb 14 09:12:58 ds8193 vmkernel: 0:00:00:16.527 cpu2:4126)WARNING: ScsiScan: 1677: Add path: vmhba0:C2:T0:L0
Feb 14 09:12:58 ds8193 vmkernel: 0:00:00:20.767 cpu0:4096)WARNING: ScsiHost: 903: SCSI command failed on handle 0x410008031680: Not supported
Feb 14 09:12:58 ds8193 vmkernel: 0:00:00:20.768 cpu0:4096)WARNING: ScsiHost: 903: SCSI command failed on handle 0x410008031680: Not supported
Feb 14 09:12:58 ds8193 vmkernel: 0:00:00:20.768 cpu0:4096)WARNING: ScsiHost: 903: SCSI command failed on handle 0x410008031680: Not supported
Feb 14 09:12:58 ds8193 vmkernel: 0:00:00:20.768 cpu0:4096)WARNING: ScsiHost: 903: SCSI command failed on handle 0x41000802f440: Not supported
Feb 14 09:12:58 ds8193 vmkernel: 0:00:00:20.768 cpu0:4096)VMNIX: WARNING: VmkDev: 2117: scsi_add_device(vml0, 0, 0, 0) [t10.DP______BACKPLANE000000] failed with -19 (alive 1)
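
A long warning list like this is easier to read once repeated messages are grouped. The sketch below is only an illustration, assuming the "WARNING: Component: line: message" layout seen above; `summarize` is a hypothetical helper, and handle addresses are masked so that otherwise identical messages are counted together.

```python
import re
from collections import Counter

# Captures the component name and message text after "WARNING:"
WARN = re.compile(r"WARNING: (\w+): \d+: (.*)$")

def summarize(lines):
    """Count warnings by (component, message), masking hex handle addresses."""
    counts = Counter()
    for line in lines:
        m = WARN.search(line.rstrip("\n"))
        if m:
            msg = re.sub(r"0x[0-9a-f]+", "<handle>", m.group(2))
            counts[(m.group(1), msg)] += 1
    return counts

# A few lines copied from the log above
sample = [
    "Jan 12 09:26:13 ds8193 vmkernel: 0:00:00:16.123 cpu12:4127)WARNING: ScsiScan: 1677: Add path: vmhba0:C0:T32:L0",
    "Jan 12 09:26:14 ds8193 vmkernel: 0:00:00:20.487 cpu0:4096)WARNING: ScsiHost: 903: SCSI command failed on handle 0x410008032840: Not supported",
    "Jan 12 09:26:14 ds8193 vmkernel: 0:00:00:20.488 cpu0:4096)WARNING: ScsiHost: 903: SCSI command failed on handle 0x41000802f440: Not supported",
]
counts = summarize(sample)
for (component, msg), n in counts.most_common():
    print(n, component, msg)
```

Grouping the full file this way would make it easier to see whether the same few warnings simply recur in each block or whether something new shows up near the hangs.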

Tikitiboo201110
Enthusiast

Anybody here?????
