VMware Cloud Community
Kimbie
Contributor

Problems with ESXi and HP Blades

OK, not sure where to start with this one, or if this is even the best place to be asking, but here goes...

We have an HP blade system: a c7000 enclosure with two BL460c blades, each with iSCSI mezzanine cards.

We have 4 Cisco 3020 switches: 2 in the first pair of interconnect bays and 2 in the second. The second pair handles the iSCSI traffic and is on a separate network; the first two switches are uplinked to our main Cisco layer 3 switch.

Recently we were losing network connection to our VM systems, so we started fault finding, and it appeared the switch in interconnect bay 1 was faulty, so HP shipped out a replacement.

So I put the replacement switch into the blade centre with no config and no network cables plugged in; the second switch was still live and connected.

Within moments of plugging it in, the systems started to become unresponsive; after removing the switch, the systems recovered and were accessible again.

When this happens we also lose connection to the blades. We are using a dvSwitch, with both network cards of each blade attached to it.

Can anyone suggest somewhere I can look to see how to fix this issue?

Thanks

Kimbie

6 Replies
reverseninja
Contributor

One thing I've noticed with our blades and chassis is you can get into mismatched firmware issues with replacement parts. I would confirm that the replacement part has the right firmware level as you currently have on your Virtual Connect and modules.

jayolsen
Expert

Can you move the switch from interconnect bay 2 over to bay 1 and see if the problem still occurs? If it does, that could indicate a problem with the enclosure's backplane. Also, check the firmware as mentioned above.

Kimbie
Contributor

I will try to get the bays swapped over one day this week; since it's a production system we need to do it very early.

I will check the firmware, but the switches are both Cisco 3020s, so a slight difference in the IOS version should not matter.

Kimbie

a_p_
Leadership

Kimbie,

when you plug the new, unconfigured switch into the interconnect bay, it powers on and the ESX hosts/dvSwitch see active ports. Depending on the configuration, the dvSwitch may try to fail back to the unconfigured switch/uplinks, and that's when you lose connection.

What you basically need to do is configure the Cisco switch before the dvSwitch can "see" it.

I can think of three possible solutions to this issue:

  1. Disable/detach all the dvSwitch uplinks that go to the missing Cisco switch in interconnect bay 1, plug the Cisco 3020 into bay 1, configure it, and attach the cables/uplinks to the core switch. Once done, re-attach/enable the uplinks in the dvSwitch.

  2. Plug the Cisco 3020 into an unused interconnect bay, configure it and attach the uplinks to the core switch. Once done, relocate it to interconnect bay 1.

  3. Plug the Cisco 3020 into an unused interconnect bay and disable all ports. Once done, relocate it to interconnect bay 1. Configure the Cisco switch, connect it to the core switch, and, last but not least, re-enable the ports one by one.
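For option 3, the port-disable step might look something like the following on the 3020. This is only a sketch: it assumes the standard IOS "interface range" syntax and the 24-port Gi0/1-24 numbering seen in the configs posted later in this thread.

```
! On the replacement switch, while it is still in the unused bay:
conf t
 interface range GigabitEthernet0/1 - 24
  shutdown
 end
write memory
!
! After relocating it to bay 1, configuring it, and cabling the
! uplink to the core switch, re-enable the ports one by one:
conf t
 interface GigabitEthernet0/1
  no shutdown
 end
```

Bringing the ports up individually lets you watch for the dvSwitch failing back before committing all uplinks at once.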

André

stevenbright1
Enthusiast

I agree with André. Inserting the unconfigured replacement switch is most likely bringing up the NICs on the ESX hosts as "active", so the ESX hosts attempt to pass traffic over these unconfigured ports. The ESX hosts have no way to "know" what VLAN configurations are assigned to the physical uplinks and route traffic intelligently, so traffic sent via these uplinks would be lost. Your best option is to configure the switch offline in an unused I/O slot or chassis and then replace the failed switch.


Kimbie
Contributor

OK, a bit of an update.

We went with the suggestion of removing the physical adapters from the blades. We then added the replacement switch and configured it with our config.

We had pings running to the blade's NIC and to VMs running on that blade.

As soon as we added the blade back into the VM system, we lost contact with the blade and the VMs. If we telnetted onto the switch via the fe (management) interface and did a "shut" against port 9, which is the blade, the blade could be pinged and could ping the VMs.
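For anyone following along, the workaround described above amounts to roughly this from the switch CLI (a sketch; Gi0/9 is the blade-facing port mentioned):

```
! telnet to the switch over the fe0 management interface, then:
conf t
 interface GigabitEthernet0/9
  shutdown
 end
! with Gi0/9 shut, the blade responds to ping again;
! a "no shutdown" reproduces the loss of connectivity
```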

Now, I think this sounds like a Cisco-side issue, but I do not know for sure. I have posted the configs below; the config is identical on our two Cisco 3020 switches. I have changed the IP and the hostname.

Switch 1

!
version 12.2
service config
no service pad
service timestamps debug uptime
service timestamps log uptime
no service password-encryption
!
hostname VM-Switch01
!
enable secret 5 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
!
no aaa new-model
system mtu routing 1500
udld aggressive
ip subnet-zero
!
mls qos map cos-dscp 0 8 16 24 32 46 46 56
!
macro global description cisco-global
!
errdisable recovery cause link-flap
errdisable recovery interval 60
!
spanning-tree mode rapid-pvst
spanning-tree loopguard default
spanning-tree extend system-id
!
vlan internal allocation policy ascending
!
interface FastEthernet0
 ip address dhcp
 no ip route-cache
!
interface GigabitEthernet0/1
 switchport trunk encapsulation dot1q
 switchport mode trunk
 speed 1000
 spanning-tree portfast
 spanning-tree bpduguard enable
 spanning-tree link-type point-to-point
!
interface GigabitEthernet0/2
 speed 1000
 spanning-tree portfast
!
interface GigabitEthernet0/3
 speed 1000
 spanning-tree portfast
!
interface GigabitEthernet0/4
 speed 1000
 spanning-tree portfast
!
interface GigabitEthernet0/5
 speed 1000
 spanning-tree portfast
!
interface GigabitEthernet0/6
 speed 1000
 spanning-tree portfast
!
interface GigabitEthernet0/7
 speed 1000
 spanning-tree portfast
!
interface GigabitEthernet0/8
 speed 1000
 spanning-tree portfast
!
interface GigabitEthernet0/9
 switchport trunk encapsulation dot1q
 switchport mode trunk
 speed 1000
 spanning-tree portfast
 spanning-tree bpduguard enable
 spanning-tree link-type point-to-point
!
interface GigabitEthernet0/10
 speed 1000
 spanning-tree portfast
!
interface GigabitEthernet0/11
 speed 1000
 spanning-tree portfast
!
interface GigabitEthernet0/12
 speed 1000
 spanning-tree portfast
!
interface GigabitEthernet0/13
 speed 1000
 spanning-tree portfast
!
interface GigabitEthernet0/14
 speed 1000
 spanning-tree portfast
!
interface GigabitEthernet0/15
 speed 1000
 spanning-tree portfast
!
interface GigabitEthernet0/16
 speed 1000
 spanning-tree portfast
!
interface GigabitEthernet0/17
 switchport trunk encapsulation dot1q
 switchport mode trunk
 spanning-tree bpduguard enable
 spanning-tree link-type point-to-point
!
interface GigabitEthernet0/18
 switchport access vlan 50
 switchport mode access
 spanning-tree bpduguard enable
!
interface GigabitEthernet0/19
!
interface GigabitEthernet0/20
!
interface GigabitEthernet0/21
!
interface GigabitEthernet0/22
!
interface GigabitEthernet0/23
!
interface GigabitEthernet0/24
 switchport trunk encapsulation dot1q
 switchport mode trunk
 macro description cisco-switch
 auto qos voip trust
 spanning-tree link-type point-to-point
!
interface Vlan1
 ip address 10.99.60.151 255.255.255.0
 no ip route-cache
!
interface Vlan90
 no ip address
 no ip route-cache
!
ip default-gateway 10.99.60.100
ip http server
snmp-server community xxxxxxxxx RO
snmp-server location VM Blade Centre 1
snmp-server contact xxxxxxxxx
!
control-plane
!
line con 0
line vty 0 4
 password manager
 login
line vty 5 15
 password manager
 login
!
end

-


Switch 2

Identical to Switch 1 line-for-line, except for the hostname and management IP:

hostname VM-Switch02
!
interface Vlan1
 ip address 10.99.60.152 255.255.255.0
 no ip route-cache

Port 24 on both switches runs directly back to our main layer 3 switch, a Cisco 3750.

I would appreciate any help on this

Thanks

Kimbie
