Ok not sure where to start with this one, or if this is even the best place to be asking but here goes....
We have an HP blade system in a c7000 enclosure with two 460bc blades, with iSCSI mezzanine cards.
We have four Cisco 3020 switches: two in the first interconnect bays and two in the second. The second pair deals with the iSCSI traffic and is on a separate network. The first two switches are uplinked to our main Cisco layer 3 switch.
Recently we were losing network connection with our VM systems, so we started some fault finding and it would appear the switch in slot 1 was faulty, so we had a replacement shipped out by HP.
So I put the switch into the blade centre, with no config, and no network cables plugged in, but we still have the second switch live and connected.
Within moments of plugging it in, the systems started to become unresponsive. Once we removed the switch, the systems recovered and were accessible again.
When this happens we also lose connection to the blades. We are using a dvSwitch with both network cards of the blades attached to it.
Can anyone suggest somewhere I can look to see how to fix this issue?
Thanks
Kimbie
One thing I've noticed with our blades and chassis is that you can get into mismatched firmware issues with replacement parts. I would confirm that the replacement part has the same firmware level as you currently have on your Virtual Connect and modules.
Can you move the switch in interconnect bay 2 over to bay 1 and see if the problem still occurs? This could indicate a problem with the backplane of the enclosure. Also, check the firmware as mentioned above.
I will try to get the bays swapped over one day this week. Since it's a production system, we need to do it very early.
I will check the firmware, but the switches are both Cisco 3020s, so a slight difference in the IOS should not matter.
Kimbie
Kimbie,
When you plug the new unconfigured switch into the interconnect bay, it is powered on and the ESX hosts/dvSwitch can see active ports. Depending on the configuration, the dvSwitch may try to fail back to the unconfigured switch/uplinks, and that's when you lose connection.
What you basically need to do is to configure the Cisco switch before the dvSwitch can "see" it.
I could think of three possible solutions to this issue:
Disable/detach all the uplinks to the missing Cisco switch in interconnect bay 1 in the dvSwitch's configuration, plug the Cisco 3020 into bay 1, configure it and attach the cables/uplinks to the core switch. Once done, re-attach/enable the dvSwitch uplinks.
Plug the Cisco 3020 into an unused interconnect bay, configure it and attach the uplinks to the core switch. Once done, relocate it to interconnect bay 1.
Plug the Cisco 3020 into an unused interconnect bay and disable all ports. Once done, relocate it to interconnect bay 1. Configure the Cisco switch, connect it to the core switch and last but not least re-enable the ports one-by-one.
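On the switch console, the "disable all ports" step of the third option could look roughly like the sketch below. It assumes the blade-facing ports are GigabitEthernet0/1 to 0/16 and leaves the external uplink ports alone; that range is an assumption, so check it with "show interfaces status" on your own unit first.

```
! Sketch only: shut the internal (blade-facing) ports before relocating the switch.
! Assumes Gi0/1 - 16 are the server downlinks; verify on your unit first.
configure terminal
 interface range GigabitEthernet0/1 - 16
  shutdown
 end
! Save the config so the ports stay down when the switch powers up in bay 1.
copy running-config startup-config
```

After the move, entering "no shutdown" on one interface at a time brings each blade port back in a controlled way.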
André
I agree with André. Inserting the unconfigured replacement switch is most likely bringing up the NICs on the ESX hosts as "active", and thus the ESX hosts are attempting to pass traffic over these unconfigured ports. The ESX hosts do not have the ability to "know" what VLAN configurations are assigned to the physical uplinks and intelligently route traffic, so the traffic would be sent via these uplinks and would be lost. Your best option is to configure the switch offline in an unused I/O slot or chassis and then replace the failed switch.
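Before letting the hosts use the replacement switch, a few standard IOS show commands can confirm that it really carries the VLANs the hosts expect. This is only a sketch of what one might check. The interface number is a placeholder for one of your blade-facing ports.

```
! Sketch only: checks to run on the replacement switch before the hosts use it.
! The interface number is a placeholder; substitute your own blade-facing port.
show interfaces status
show interfaces trunk
show vlan brief
show spanning-tree interface GigabitEthernet0/9 detail
show interfaces status err-disabled
```

"show interfaces trunk" lists the VLANs allowed and forwarding on each trunk, and "show vlan brief" shows whether those VLANs actually exist on the switch. Comparing both outputs against the working switch should make any mismatch obvious.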
Ok bit of an update.
We went with the suggestion of removing the physical adaptor from the blades. We then added the switch and configured it with our config.
We had pings running to the blades, nic, and VMs running on that blade.
As soon as we added the blade into the VM system we lost contact with the blade and the VMs. If we went onto the switch via telnet on the fe interface and did "shut" against port 9, which is the blade, the blade could be pinged and could ping the VMs.
Now I think this sounds like a Cisco-style issue, but I do not know if it is. I have shown the configs below. The config is identical on our two Cisco 3020 switches, apart from the IP and name, which I have changed.
Switch 1
!
version 12.2
service config
no service pad
service timestamps debug uptime
service timestamps log uptime
no service password-encryption
!
hostname VM-Switch01
!
enable secret 5 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
!
no aaa new-model
system mtu routing 1500
udld aggressive
ip subnet-zero
!
!
mls qos map cos-dscp 0 8 16 24 32 46 46 56
!
!
macro global description cisco-global
!
!
!
errdisable recovery cause link-flap
errdisable recovery interval 60
!
spanning-tree mode rapid-pvst
spanning-tree loopguard default
spanning-tree extend system-id
!
vlan internal allocation policy ascending
!
!
interface FastEthernet0
ip address dhcp
no ip route-cache
!
interface GigabitEthernet0/1
switchport trunk encapsulation dot1q
switchport mode trunk
speed 1000
spanning-tree portfast
spanning-tree bpduguard enable
spanning-tree link-type point-to-point
!
interface GigabitEthernet0/2
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/3
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/4
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/5
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/6
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/7
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/8
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/9
switchport trunk encapsulation dot1q
switchport mode trunk
speed 1000
spanning-tree portfast
spanning-tree bpduguard enable
spanning-tree link-type point-to-point
!
interface GigabitEthernet0/10
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/11
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/12
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/13
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/14
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/15
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/16
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/17
switchport trunk encapsulation dot1q
switchport mode trunk
spanning-tree bpduguard enable
spanning-tree link-type point-to-point
!
interface GigabitEthernet0/18
switchport access vlan 50
switchport mode access
spanning-tree bpduguard enable
!
interface GigabitEthernet0/19
!
interface GigabitEthernet0/20
!
interface GigabitEthernet0/21
!
interface GigabitEthernet0/22
!
interface GigabitEthernet0/23
!
interface GigabitEthernet0/24
switchport trunk encapsulation dot1q
switchport mode trunk
macro description cisco-switch
auto qos voip trust
spanning-tree link-type point-to-point
!
interface Vlan1
ip address 10.99.60.151 255.255.255.0
no ip route-cache
!
interface Vlan90
no ip address
no ip route-cache
!
ip default-gateway 10.99.60.100
ip http server
snmp-server community xxxxxxxxx RO
snmp-server location VM Blade Centre 1
snmp-server contact xxxxxxxxx
!
control-plane
!
!
line con 0
line vty 0 4
password manager
login
line vty 5 15
password manager
login
!
end
-
Switch 2
!
version 12.2
service config
no service pad
service timestamps debug uptime
service timestamps log uptime
no service password-encryption
!
hostname VM-Switch02
!
enable secret 5 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
!
no aaa new-model
system mtu routing 1500
udld aggressive
ip subnet-zero
!
!
mls qos map cos-dscp 0 8 16 24 32 46 46 56
!
!
macro global description cisco-global
!
!
!
errdisable recovery cause link-flap
errdisable recovery interval 60
!
spanning-tree mode rapid-pvst
spanning-tree loopguard default
spanning-tree extend system-id
!
vlan internal allocation policy ascending
!
!
interface FastEthernet0
ip address dhcp
no ip route-cache
!
interface GigabitEthernet0/1
switchport trunk encapsulation dot1q
switchport mode trunk
speed 1000
spanning-tree portfast
spanning-tree bpduguard enable
spanning-tree link-type point-to-point
!
interface GigabitEthernet0/2
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/3
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/4
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/5
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/6
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/7
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/8
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/9
switchport trunk encapsulation dot1q
switchport mode trunk
speed 1000
spanning-tree portfast
spanning-tree bpduguard enable
spanning-tree link-type point-to-point
!
interface GigabitEthernet0/10
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/11
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/12
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/13
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/14
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/15
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/16
speed 1000
spanning-tree portfast
!
interface GigabitEthernet0/17
switchport trunk encapsulation dot1q
switchport mode trunk
spanning-tree bpduguard enable
spanning-tree link-type point-to-point
!
interface GigabitEthernet0/18
switchport access vlan 50
switchport mode access
spanning-tree bpduguard enable
!
interface GigabitEthernet0/19
!
interface GigabitEthernet0/20
!
interface GigabitEthernet0/21
!
interface GigabitEthernet0/22
!
interface GigabitEthernet0/23
!
interface GigabitEthernet0/24
switchport trunk encapsulation dot1q
switchport mode trunk
macro description cisco-switch
auto qos voip trust
spanning-tree link-type point-to-point
!
interface Vlan1
ip address 10.99.60.152 255.255.255.0
no ip route-cache
!
interface Vlan90
no ip address
no ip route-cache
!
ip default-gateway 10.99.60.100
ip http server
snmp-server community xxxxxxxxx RO
snmp-server location VM Blade Centre 1
snmp-server contact xxxxxxxxx
!
control-plane
!
!
line con 0
line vty 0 4
password manager
login
line vty 5 15
password manager
login
!
end
Port 24 on both switches runs directly back to our main layer 3 switch, a Cisco 3750.
I would appreciate any help on this.
Thanks
Kimbie