Hi. I posted this in the vNetwork community last week, but no responses yet so I figured I'd try here.
We have 4 ESXi 6.0 hosts connected to 2 Dell EqualLogic PS6210X arrays via 2 Dell PowerConnect 5424 1Gb switches - all BASE-T Ethernet, no SFP. The hosts have separate vSwitches for vMotion and iSCSI, and each vSwitch has dual teamed NICs with the default "Route based on originating virtual port ID" configuration. The iSCSI NICs are 10Gb, auto-negotiating down to 1Gb (the array NICs also auto-negotiate). The switches have separate VLANs for iSCSI and vMotion. All connections from the hosts and SANs are fully redundant, with one connection from each NIC pair going to each switch, and the switches are LAGged together.
We have 2 new Dell N4032 10Gb switches that I will be setting up with the VLANs, etc., in the next couple of days. I'd love to move everything to the new switches without any downtime. Moving the vMotion connections should be easy as long as VMs stay put (I'd put the hosts into maintenance mode for the vMotion cable changes), so iSCSI is the main concern. Here is the plan for moving the iSCSI connections, if possible:
Configure the new switches with VLANs and LAG, and schedule a maintenance window - not downtime, but to minimize traffic
On the arrays, unplug one NIC on each offline (standby) controller from the 1g switch and plug it into the 10g switch
Disable one iSCSI vmnic on each host; move the cable for the disabled NIC from 1g to 10g
Restart (fail over) the EQ controllers on each array so that the controller with mixed 1g and 10g connections comes online
Enable the vmnic now connected to the 10g switch; confirm traffic on that vmnic and on the EQ NIC connected to the 10g switch
Disable the 2nd iSCSI vmnic still connected to the 1g switch; at this point, there should be no active connections from the hosts to the 1g switches
The currently offline EQ controller still has both NICs plugged into 1g; move both of those to their respective 10g switches
Restart the EQ controllers again so that the one with both 10g connections comes online
On the hosts, move the 2nd iSCSI vmnic connections from the 1g to the 10g switches and enable them
The now-offline controllers on the EQs still have one port each on a 1g switch; move those to the 10g switches
DRINK!
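The invariant the steps above are built around is that every host keeps at least one usable iSCSI path to the active controller at every step. Before the window, it can be worth walking that through mechanically. Below is a minimal, hypothetical Python sketch (the vmnic/eth names and the "fabric" model are my own simplification, not output from a real host): it treats the 1g and 10g switch pairs as two disconnected fabrics and asserts a path survives each state change in the plan.

```python
# Sanity check of the migration plan: at every step, each host must keep
# at least one usable iSCSI path to the active EQ controller. A path is
# usable when an enabled host vmnic and an enabled port on the active
# controller sit on the same fabric ("1g" or "10g"); the two fabrics are
# assumed NOT to be interconnected. All names here are hypothetical.

def usable_paths(host_nics, active_ports):
    """host_nics / active_ports: dicts of name -> (fabric, enabled)."""
    return [(h, a)
            for h, (hf, hon) in host_nics.items() if hon
            for a, (af, aon) in active_ports.items() if aon and af == hf]

# Starting state: everything on the 1g fabric.
host = {"vmnic2": ("1g", True), "vmnic3": ("1g", True)}
ctrl = {"eth0": ("1g", True), "eth1": ("1g", True)}

steps = []

# Disable one host vmnic and recable it to the 10g fabric.
host["vmnic2"] = ("10g", False)
steps.append(("vmnic2 moved, still disabled", dict(host), dict(ctrl)))

# Fail over to the controller with one 1g and one 10g port.
ctrl = {"eth0": ("1g", True), "eth1": ("10g", True)}
steps.append(("mixed-speed controller active", dict(host), dict(ctrl)))

# Enable the recabled vmnic; paths now exist on both fabrics.
host["vmnic2"] = ("10g", True)
steps.append(("vmnic2 enabled on 10g", dict(host), dict(ctrl)))

# Disable the vmnic still on the 1g fabric.
host["vmnic3"] = ("1g", False)
steps.append(("vmnic3 disabled on 1g", dict(host), dict(ctrl)))

# Fail over to the controller whose ports are both on 10g.
ctrl = {"eth0": ("10g", True), "eth1": ("10g", True)}
steps.append(("all-10g controller active", dict(host), dict(ctrl)))

# Recable and enable the second vmnic on the 10g fabric.
host["vmnic3"] = ("10g", True)
steps.append(("vmnic3 enabled on 10g", dict(host), dict(ctrl)))

for label, h, c in steps:
    paths = usable_paths(h, c)
    assert paths, f"no path left at step: {label}"
    print(f"{label}: {len(paths)} path(s)")
```

Note this only models link connectivity; it says nothing about the brief I/O pause inherent in an EQ controller failover itself, which the plan already accepts.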
Is the plan above doable? My biggest concern is the brief time that the EQ array online controllers are connected to 1g and 10g at the same time. Will that freak them out? Are there any other gaping holes in my plan?
Thanks.
Hi zenking,
Is it possible to build a VLAN trunk between the old and new switches?
Then you can move every single link over to the new switches while losing only one path of your iSCSI connections at a time. Your multipathing should compensate for this loss.
Thanks, efanjo. Good question. I suspect the answer is yes, but I'm checking with Dell.
ETA: After reading more about the switches, they probably cannot be LAGged together, because I think the 1g switches would need some 10g ports, and I don't think anything like that can be added to the 5424. We'll see if Dell says anything different.
The N4032 should be capable of using 1Gb links.
I don't think the Dell PC5424 runs DNOS, but the N Series is on DNOS 6.x, so you cannot LAG between them; otherwise, your overall steps look good to me.
I am not familiar with the EQ PS6210x - do you have to restart the controller to bring the interface online? Couldn't you just shut/no shut the interface?
The N4032 does auto-negotiate to 1g, but I don't know whether that will apply to a LAG. Still waiting on a Dell response, though.
VFK,
On the EQ, restarting rolls over the controllers, so that's what takes one controller offline and brings the other online. Odd terminology, and something I found confusing at first, too.
So you wouldn't anticipate a problem with a LUN briefly having both a 1g and a 10g connection at the same time? I think we could plan for very low I/O, but I would expect a little if we keep VMs live (mainly some web databases).
That is possible if you have multipathing set to RR or iSCSI port binding. In that case you could either use active/standby on the vSphere host, or configure your 10GbE interface speed to 1GbE during the migration and change the interface back to 10GbE once you have completely moved to the new switches.
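The reason mixed-speed paths are worth avoiding under Round Robin is simple arithmetic: RR sends (roughly) equal IO volume down each path, so the 1GbE leg gates the total. Here is a deliberately simplified illustration of that - an idealized model I wrote for this thread, not an EqualLogic measurement, and it ignores queue depth, IOPS-per-path switching, and protocol overhead.

```python
# Illustrative model of Round Robin across paths of unequal speed:
# each path carries an equal share of the data, all paths run
# concurrently, and elapsed time is set by the slowest path's share.
# Speeds are in Gbps. This is a back-of-the-envelope sketch only.

def rr_throughput_gbps(path_speeds_gbps):
    """Effective throughput when equal volume goes down each path."""
    n = len(path_speeds_gbps)
    time_per_unit = max((1.0 / n) / s for s in path_speeds_gbps)
    return 1.0 / time_per_unit

print(rr_throughput_gbps([1, 1]))    # two 1GbE paths: ~2 Gbps
print(rr_throughput_gbps([1, 10]))   # mixed pair: still ~2 Gbps, not 11
print(rr_throughput_gbps([10, 10]))  # two 10GbE paths: ~20 Gbps
```

This is why forcing the 10GbE port down to 1GbE during the migration, as suggested above, costs essentially nothing while both fabrics are in play.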
By VLAN trunk I did not mean a LAG.
Just a simple link carrying all needed VLANs.
Just connect your old switches to the new ones with one 1Gbit link, so that every device connected to one switch can communicate with devices on the other switches.
One Layer 2 network over four switches.
Then you can migrate each link from 1Gbit to 10Gbit. Maybe you should also disable DRS in your cluster to avoid unnecessary vMotion traffic.
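The trunk idea above boils down to keeping all four switches in one broadcast domain while links move one at a time. A toy graph check makes the point (switch names and the single-trunk topology are hypothetical, for illustration only): without a trunk the 1g and 10g fabrics are isolated, with it every switch can reach every other, so a surviving path always exists while individual host or array links are recabled.

```python
from collections import deque

# Toy L2 connectivity check for the trunk idea: switches are nodes,
# inter-switch links (LAGs, or a single trunk carrying the iSCSI VLAN)
# are undirected edges. Names are hypothetical.

def reachable(links, a, b):
    """True if switch b is reachable from switch a over the given links."""
    seen, queue = {a}, deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return True
        for x, y in links:
            nxt = y if x == node else x if y == node else None
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

old_lag = ("pc5424-1", "pc5424-2")   # existing 1g inter-switch LAG
new_lag = ("n4032-1",  "n4032-2")    # new 10g inter-switch LAG
trunk   = ("pc5424-1", "n4032-1")    # single trunk joining the fabrics

# Without the trunk the two fabrics are isolated...
assert not reachable([old_lag, new_lag], "pc5424-1", "n4032-2")
# ...with it, any switch reaches any other.
assert reachable([old_lag, new_lag, trunk], "pc5424-2", "n4032-2")
print("one L2 domain across all four switches")
```

Whether that trunk link can physically exist between these particular models is exactly the open question in the rest of the thread.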
Thanks, efanjo. I appreciate the response, but I don't think that gets me where I'm going. VLANs aren't the issue; link speed is. Earlier today, I read somewhere that connecting a 1g port on one switch to a 10g port on another switch is a no-no - there has to be a 10g uplink module on the 1g switch in order to connect the two different-speed switches.
If you can point me to a link that states that it can be done without the module, please post.
Just an update after we accomplished this task. Although a special module would be required to maintain two-way traffic between the different-speed switches, connecting directly from the 1g switch to the 10g switch worked in that one direction, and that's all we needed. To summarize, we were able to move all of our dual-NICed hosts over to the new switches one leg at a time, then roll over the EQ controllers. We started to run into a problem when Group Manager lost its connection between the EQs while the members were being moved, because the dual NICs on the controllers couldn't communicate with each other. Connecting from the 1g to the 10g switch (both ports in the iSCSI VLAN) allowed that communication to be reestablished.