For a couple of months now I've been experiencing an issue where my ESX hosts lose connectivity with my iSCSI SAN VMFS volumes.
As a result, the ESX hosts become unresponsive, the associated VMs disconnect, and the only remedy is to reboot the host.
The issue happens randomly. I have escalated it to VMware, but I haven't received a solution yet.
I see no errors on my switches, and there are no hardware issues either. My SAN infrastructure is solid and there are 2 paths for every VMFS volume.
Has anybody else experienced a similar issue?
Themis
Please have a look at this KB describing the issue identification path:
---
iSCSI SAN software
http://www.starwindsoftware.com
Thank you for your recommendations.
I have already tried everything mentioned in the article.
There are no hardware issues and no issues with the switches.
What type of SAN and switch have you got?
Are your server and SAN on the HCL?
Have you checked the following directory when the error happens?
/proc/scsi/vmkiscsi/1
it could highlight some problems...
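If it helps, here is a rough Python sketch for capturing whatever is under that path at the moment of failure, so the output can be saved and attached to a support case. Only the path comes from the post above; the function name dump_proc_path is made up for illustration, and this assumes a Python interpreter is available wherever you run it. It makes no assumption about whether the path is a file or a directory on your host.

```python
import os


def dump_proc_path(path):
    """Return {file_path: text} for a single file, or for every regular
    file directly inside a directory. A missing path yields an empty dict."""
    if os.path.isdir(path):
        candidates = [os.path.join(path, name) for name in sorted(os.listdir(path))]
    else:
        candidates = [path]

    contents = {}
    for candidate in candidates:
        # Skip subdirectories and anything that does not exist.
        if not os.path.isfile(candidate):
            continue
        try:
            with open(candidate) as handle:
                contents[candidate] = handle.read()
        except (IOError, OSError) as exc:
            # Record the failure instead of aborting the whole capture.
            contents[candidate] = "<unreadable: %s>" % exc
    return contents


if __name__ == "__main__":
    # Path taken from the post above.
    for name, text in dump_proc_path("/proc/scsi/vmkiscsi/1").items():
        print("==== %s ====" % name)
        print(text)
```

Running it right after a host loses its volumes, and again on a healthy host, gives two captures to compare.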
Thank you for your responses. Here are some more details:
The iSCSI SAN software is DataCore SANmelody 2.0.4.2 running on 2 HP ProLiant G5 servers.
The storage attached to each of the servers is an HP MSA70, and all the iSCSI SAN volumes that are presented to my 4 ESX hosts are mirrored.
I have two iSCSI switches, HP ProCurve 1800G-24, that are trunked together.
My SANmelody servers are using NC360T NICs. I team two NICs and have one cable connecting to each iSCSI switch.
Each ESX server also uses two NICs for the iSCSI network.
Please let me know if you need more information.
To find out where the problem is (ESX or SAN), try using some other iSCSI software. You could try StarWind: there is a trial version and, as far as I know, good support.
No
Thank you,
Themis Tokkaris
Systems Engineer
Truly Nolen of America Inc
From: TimPhillips <communities-emailer@vmware.com>
To: <themis.tokkaris@trulymail.net>
Date: 03/04/2010 09:22 AM
Subject: New message: "ESX hosts lose connectivity with iSCSI SAN LUNs"
Not a good idea, my friend.
The SAN infrastructure is solid.
I think the issue here might be with NIC teaming.
The StarWind spam is becoming more subtle on these forums, I see.
In what way could this be spam? On what planet?
I'm just trying to help Themis, in the way I can.
Good morning
If you want to help me, let's try to resolve the issue, not recommend that I tear apart my SAN infrastructure and start from scratch.
Sincerely
Themis
What model of NICs are you using for the iSCSI connections on your ESX hosts? Are they iSCSI HBAs or regular NICs? If you're using the software iSCSI initiator, how are your two iSCSI NICs configured on the ESX hosts?
Are your DataCore servers running as VMs or on physical hardware?
In what way could this be spam? On what planet?
I'm just trying to help Themis, in the way I can.
On the planet where nearly every post you've made is a "helpful recommendation" that the poster use Starwind, with no advice on what the actual problem may ever be.
Yeah, OK, I am a spam bot. Just a very smart one.
@Themis!!! We're waiting for you to answer.
The NICs are NC360T.
My two iSCSI NICs are both active under the iSCSI vSwitch.
I am not using any RDMs.
The DataCore servers are HP DL380 G5s, and they are physical servers dedicated as SAN.
Each server has an MSA70 attached to it, and the DataCore software is configured as a partnership between the servers.
All volumes are mirrored.