VMware Cloud Community
neyz
Contributor

iSCSI storage redundancy Equallogic PS6000 + ESX: LUN disappears when one switch goes down!

Hello everyone, this is my first post here :)

OK, so as the title says, I'm trying to work out redundancy under ESX. Because network setups can be quite confusing, I made a quick diagram of our infrastructure showing only one of my 3 ESX hosts, and I will try to lay out the configuration details as clearly as possible.

On my ESX host, I have configured a new vSwitch with the VMkernel and VM network port groups attached to 4 of my NICs (NIC 2, 3, 4 and 5), all set to active in NIC teaming (I also tried putting 2 of them in standby). The failover detection method is beacon probing (I also tried link status).

On another vSwitch I have the service console with 2 NICs attached (NIC 1 and 6).

My 2 switches are connected via a LAG of 4 links, and RSTP is active and working (even though I see a lot of broadcast activity as soon as I connect my LAN switch to the SAN switches, but that is another story).

My SAN is connected with 2 links to switch A and 2 to switch B for each controller. I have created 3 volumes on that SAN, and each of them is configured on all 3 of my ESX hosts. So far, everything is great.

Now I pull the power plug on switch A. Switch B takes over and I can still ping the SAN from the LAN switch. I can also ping my little guest VM, but the VM itself quickly dies. I cannot even reboot it; I get an error message stating that the file can't be found. So I go into my datastore, and that's when I see that it is empty! I need to run a full rescan of the LUNs to regain access to the datastore and start my VM again.

From my point of view, even though ESX detects that some links are down when I pull the cord, the iSCSI session just dies on me and won't "restart". I read a bit about multipathing, but the "Manage Paths" button is greyed out, and I'm not even sure that this is what I need.

So basically, as you can see, I'm pretty new to all this and quite lost when it comes to redundancy. Hell, I didn't even know about spanning tree before yesterday! So if any of you can give me some advice on achieving redundancy, I would greatly appreciate it ^^

Thanks a lot!


Accepted Solutions
kjb007
Immortal

I wouldn't put my vmnetwork and my storage network on the same vSwitch. I'd separate them out into their own vSwitch, and use two pNICs for each vSwitch.

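Once you have the traffic separated out, I'd make sure a vmkping, and not a ping, to the storage target works over each physical NIC. vmkping sends from the VMkernel interface, which is what carries the iSCSI traffic, whereas a plain ping goes out through the service console. Pull all the cables, and plug them in one at a time to make sure routing works.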
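
A rough sketch of how the one-cable-at-a-time check could be scored, assuming vmkping prints a BSD-ping-style summary line ("N packets transmitted, M packets received"). That output format, the helper names and the 10.0.0.10 address in the sample are assumptions for illustration, not something taken from this setup. Capture the vmkping output for each NIC in turn and pass it to the function:

```python
import re

def parse_ping_summary(output):
    """Return (sent, received) from a ping-style summary line, or None if absent."""
    match = re.search(r"(\d+) packets transmitted, (\d+) (?:packets )?received", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def path_is_healthy(output, max_lost=0):
    """True if a summary line was found and at most max_lost packets were dropped."""
    summary = parse_ping_summary(output)
    if summary is None:
        return False  # no summary line: treat the path as failed
    sent, received = summary
    return sent > 0 and (sent - received) <= max_lost

# Hypothetical captured output for one NIC (address is made up).
sample = (
    "--- 10.0.0.10 ping statistics ---\n"
    "3 packets transmitted, 3 packets received, 0% packet loss\n"
)
print(path_is_healthy(sample))
```

A NIC whose output fails this check with only its own cable plugged in is the one to look at first (cabling, switch port, or teaming settings).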

-KjB

VMware vExpert

depping
Leadership

Welcome to the forums,

Could you tell us how you've set up the virtual switches / port groups and vmnics as well?

Duncan

VMware Communities User Moderator


neyz
Contributor

OK,

I have separated the VMkernel from the VM network.

With both switch A and switch B on, the vmkping command was working correctly. I then turned off switch A; at that point vmkping keeps running, but about 10 packets get lost. The same thing happens when I turn switch A back on: I lose a few packets, but then everything works.
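
For anyone wanting to put a number on that failover gap, here is a small sketch that finds the longest run of missing replies in captured ping output. It assumes each reply line carries an `icmp_seq=N` field and that replies arrive in order; the sample lines and the 10.0.0.10 address are made up for illustration.

```python
import re

def longest_gap(output):
    """Return the largest number of consecutive missing icmp_seq values."""
    seqs = [int(s) for s in re.findall(r"icmp_seq=(\d+)", output)]
    longest = 0
    # Compare each reply with the next one; a jump in sequence numbers
    # means the replies in between never came back.
    for prev, cur in zip(seqs, seqs[1:]):
        longest = max(longest, cur - prev - 1)
    return longest

# Hypothetical capture: replies 0-2 arrive, 3-12 are lost, 13-14 arrive.
sample = "\n".join(
    f"64 bytes from 10.0.0.10: icmp_seq={n} ttl=64 time=0.3 ms"
    for n in (0, 1, 2, 13, 14)
)
print(longest_gap(sample))
```

If pings go out once per second, the gap count is roughly the outage in seconds, which can then be compared against what the VMs and the iSCSI session tolerate.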

At least my VM running Iometer does not crash anymore and I can still see my datastores, so I think it is safe to say that this fixed my problem :)

Thanks a lot!

By the way, what is the difference between setting both NICs to active and putting one in standby?

Thanks again, guys!

kjb007
Immortal

Having all your NICs active allows them all to be used at the same time. That means that if you have multiple VMs behind that port group, they will be spread across all of the available NICs. By default, the first VM to come up will use the first NIC, the second will use the next, and so on. In standby mode, the standby NIC will only be used if the primary NIC fails.
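
As a toy model of that difference (a simplification for illustration, not how ESX is actually implemented), each VM port can be thought of as being pinned to one uplink out of the usable set, with a standby NIC joining that set only when an active one has failed. The function names and the vmnic2/vmnic3 labels below are hypothetical:

```python
def effective_uplinks(active, standby=(), failed=()):
    """Uplinks usable right now: live actives, plus one live standby per failed active."""
    live_active = [nic for nic in active if nic not in failed]
    live_standby = [nic for nic in standby if nic not in failed]
    missing = len(active) - len(live_active)
    return live_active + live_standby[:missing]

def pick_uplink(port_id, active, standby=(), failed=()):
    """Pin a virtual port to one usable uplink, round-robin by port number."""
    uplinks = effective_uplinks(active, standby, failed)
    if not uplinks:
        return None  # no path left at all
    return uplinks[port_id % len(uplinks)]

# Both NICs active: consecutive VM ports alternate between the two uplinks.
print([pick_uplink(p, ["vmnic2", "vmnic3"]) for p in range(4)])

# One active, one standby: everything uses vmnic2 until it fails.
print(pick_uplink(0, ["vmnic2"], standby=["vmnic3"]))
print(pick_uplink(0, ["vmnic2"], standby=["vmnic3"], failed={"vmnic2"}))
```

In both layouts a single NIC failure leaves the VMs with a path; the active/active layout simply spreads the load over both links while they are up.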

-KjB

VMware vExpert
