VMware Cloud Community
dagkl
Contributor

ESX NICs went down

I have been trying to teach myself the Cisco ASA that I got hold of a while ago, but it is not easy. To ease the setup process, I decided to attach it to my ESX host so I could test it alongside my production environment. I created a virtual switch attached to a dedicated NIC on the ESX host and plugged the ASA in. When I did that, I lost all connection to the ESX host. I had to power the host off and reboot it to make it work again.

The inside address of the Cisco ASA is 192.168.1.1, which is the same as my default gateway. This is the only thing I can think of that could cause the problem, but I still don't understand it, because the ASA is not connected to anything else (although that was the plan). It is on a dedicated vSwitch.
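
One way to rule an address clash in or out is to write down every device with its IP and look for duplicates. Below is a minimal sketch in Python. The device names and the 192.168.1.10 service console address are made up for illustration; only 192.168.1.1 comes from this post.

```python
from collections import defaultdict
import ipaddress

def find_duplicate_ips(assignments):
    """Return {ip: [owners]} for every IP claimed by more than one owner."""
    owners = defaultdict(list)
    for owner, ip in assignments:
        # ip_address() also validates the string and normalizes the key
        owners[ipaddress.ip_address(ip)].append(owner)
    return {str(ip): names for ip, names in owners.items() if len(names) > 1}

# Hypothetical inventory; only the 192.168.1.1 entries reflect the post.
inventory = [
    ("old router (default gateway)", "192.168.1.1"),
    ("ASA inside interface", "192.168.1.1"),
    ("ESX service console", "192.168.1.10"),  # assumed address
]

conflicts = find_duplicate_ips(inventory)
for ip, names in conflicts.items():
    print(f"{ip} is claimed by: {', '.join(names)}")
```

A duplicate found this way only matters if both owners sit on the same layer-2 segment, so it is a hint rather than proof.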

See the attached picture of my networking settings.

16 Replies
JoJoGabor
Expert

Strange. Are you connecting via a switch or directly? Do you need a crossover cable to connect directly?

kjb007
Immortal

Did you lose console access to the ESX host? Or just the networking portion?

-KjB

vExpert/VCP/VCAP vmwise.com / @vmwise -KjB
dagkl
Contributor

I could not get access to the guests or the host itself, on any interface. Communication with the host just stopped. I don't have a console attached to it, and instead of carrying a 19-inch CRT monitor down two floors I decided to flip the power switch.

So the service console is configured with a default gateway of 192.168.1.1. The Cisco also uses this address, since it is meant to take over from my other router some day.

Is there a log somewhere that I can look at to see whether the ESX host shut its networking down by itself because of a network loop or something? Could it be a fault with the network cards or drivers, since I also get really bad throughput, as described in another thread on this site?
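
For what it's worth, on ESX 3.x with a service console the usual places to look are, as far as I know, /var/log/vmkernel and /var/log/messages. Below is a small Python sketch for pulling link-related lines out of such a file. The keywords are an assumption about what the driver might log, and the sample lines are invented, not real ESX output.

```python
import re

# Keywords are an assumption: adjust them to whatever the NIC driver actually logs.
LINK_PATTERN = re.compile(r"link (is )?(down|up)|watchdog|timeout", re.IGNORECASE)

def link_events(lines, pattern=LINK_PATTERN):
    """Return (line_number, text) for each line that matches the pattern."""
    return [
        (number, line.rstrip())
        for number, line in enumerate(lines, start=1)
        if pattern.search(line)
    ]

# Hypothetical sample lines, made up for this example.
sample = [
    "Oct  1 20:15:01 esx vmkernel: vmnic1 link is down\n",
    "Oct  1 20:15:02 esx vmkernel: unrelated message\n",
    "Oct  1 20:15:09 esx vmkernel: vmnic1 link is up\n",
]

events = link_events(sample)
for number, text in events:
    print(number, text)
```

To run it against a real file, open the file and pass the file object to link_events() in place of the sample list.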

JoJoGabor
Expert

How are your networks set up? I assume your Service Console is sharing the same pNIC as your VMs, and that this is the pNIC being attached to the ASA?

Can you confirm how you are physically connecting the devices? Also check what the NIC lights do when you attach it.

dagkl
Contributor

Yes, the service console is using the same NIC as my guests, but I connected the ASA to the other NIC, which was not in use.

I tried again this evening, but this time I replaced my old router with the ASA. I am now running my internet connection through the ASA. The funny thing is that the connection to my ESX host has gone down again. But this time there are no odd connections or loops that can occur. It is a plain connection on the inside interfaces of the ASA. It worked for some time and now it has stopped.

Hmm, either this has something to do with the ASA or with the physical NICs on the ESX host itself.

kjb007
Immortal

Have you turned off the packet filtering/firewall on the ASA? To know for sure, I would use the 2nd interface and plug into it directly, or, if you're not using your old router anymore, set up a separate segment and go through that.

-KjB

dagkl
Contributor

I tried some more this evening. Yesterday evening I finally got my ASA to work properly. It sounds lame, but I had not been able to get port forwarding to work. Now it is working, so I have changed the network setup for the new router, and also the config for the virtual networks (last picture). I tried to plug the ESX host into the ASA, and the same thing happened. Then I moved the ESX server up one floor, where I had a monitor, and connected it to the ASA with a 15-meter TP cable. To my surprise the ESX host now works... for a while. It works for about an hour, and what I see is that first it starts to work very hard and the CPU goes off the charts. The whole ESX host then turns into syrup, and connecting to it becomes more and more difficult. Look at the attached picture: at this moment I cannot even see the charts, they will not load anymore. The second picture shows the chart a short while after.

Is there a log somewhere that can explain a little more about what happens, say if we suspect something like the NICs?

All this happened with only one guest running.

Regarding the packet filtering/firewall on the ASA: no, I have not turned that off, and I don't know how to do it either.

kjb007
Immortal

These look to be VC maps. Have you connected to the console itself to see what kind of responses you are getting? Meaning, does everything appear slow if you are right in front of the ESX host itself? It appears you may still be dealing with network problems, and not necessarily with ESX.

-KjB

dagkl
Contributor

Yes, the server was really slow, and I used VC when taking those screenshots. I tried going to the console and it was really slow there too. But I have come to the conclusion that something must be wrong with my NICs, the ASA, or the cables I am using, not with ESX. I think the state the ESX host was in was a reaction to other things.

I will try more tomorrow night. Got a vista exam to read for tonight.

Thx for all help.

RegNullify
Contributor

This is similar to a problem that I am having. I have two ESX hosts in an HA cluster, and at random both hosts will show a status of disconnected, along with all of the VMs running on them. However, the ESX hosts are not down and the VMs are running fine. I have a total of six network ports and three vSwitches (0, 1, 2) on each ESX host. Each vSwitch consists of two pNICs that are teamed. I did this to logically segregate network traffic and ensure functionality for the various tasks. I have two Cisco 6509s, and one pNIC from each team plugs into a separate 6509. The 6509s use virtualization between the two switches to make network connectivity as resilient as possible. Any thoughts or suggestions? I will add screenshots later. Thanks for your support.

pnic-0,1 = vSwitch0 "Service Console Network" (VMotion disabled) Purpose = Virtual Center to ESX host management

pnic-2,3 = vSwitch1 "Production VM Network" Purpose = Public facing VM network

pnic-4,5 = vSwitch2 "Service Console Network" (VMotion enabled) Purpose = VMotion network traffic only
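
Written out as data, the layout above can be checked mechanically for labels or pNICs that show up on more than one vSwitch. This is only a sketch in Python; the dictionary structure and function names are my own illustration, not anything ESX or VC exposes.

```python
from collections import defaultdict

# Layout copied from the list above; the dict shape is purely illustrative.
vswitches = {
    "vSwitch0": {"pnics": ["pnic-0", "pnic-1"], "label": "Service Console Network", "vmotion": False},
    "vSwitch1": {"pnics": ["pnic-2", "pnic-3"], "label": "Production VM Network", "vmotion": None},  # not applicable
    "vSwitch2": {"pnics": ["pnic-4", "pnic-5"], "label": "Service Console Network", "vmotion": True},
}

def duplicate_labels(layout):
    """Return {label: [vswitch names]} for labels used on more than one vSwitch."""
    by_label = defaultdict(list)
    for name, cfg in layout.items():
        by_label[cfg["label"]].append(name)
    return {label: names for label, names in by_label.items() if len(names) > 1}

def shared_pnics(layout):
    """Return {pnic: [vswitch names]} for pNICs assigned to more than one vSwitch."""
    users = defaultdict(list)
    for name, cfg in layout.items():
        for pnic in cfg["pnics"]:
            users[pnic].append(name)
    return {pnic: names for pnic, names in users.items() if len(names) > 1}

print("Duplicate labels:", duplicate_labels(vswitches))
print("Shared pNICs:", shared_pnics(vswitches))
```

Here the same "Service Console Network" label turns up on both vSwitch0 and vSwitch2, while no pNIC is shared between vSwitches.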

A+, N+, CNA, CNE, MCP, MCSA, VCP310, VCP410, VCI <------ Long time dedicated IT Professional specializing in U.S. Federal Government implementations.
kjb007
Immortal

Do you have two service console ports in your configuration? In vSwitch0 you have console, and in vSwitch2 you have console. Which has your vmkernel port for vmotion? I'm assuming vSwitch2, but does that have a service console port also? Are your service console and vmotion on the same segment? If so, you don't need this additional service console port on vSwitch2. Also, since you will have two ports from a single switch connecting to separate vSwitches in your virtual networking, spanning tree may be killing one of your ports, which will cause the connection between VC and ESX to die.

I would remove the service console port from vSwitch2 and leave vmotion there. Also, on your 6509, make sure portfast is enabled, and if possible, turn STP off. Also, make sure port security is not enabled, or turn it off for that port, as you may see the same MAC on multiple ports, again causing your switch to disable an active port and cause your problems.
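
To illustrate the "same MAC on multiple ports" condition, here is a rough Python sketch that takes (MAC, port) pairs, for example copied by hand from the switch's MAC address table, and reports any MAC seen on more than one port. The addresses and port names are made up, and this is not how the switch itself implements port security.

```python
from collections import defaultdict

def macs_on_multiple_ports(observations):
    """observations: iterable of (mac, port). Return {mac: sorted ports} for MACs seen on more than one port."""
    ports = defaultdict(set)
    for mac, port in observations:
        # Lower-case so the same address in different notation is counted once
        ports[mac.lower()].add(port)
    return {mac: sorted(seen) for mac, seen in ports.items() if len(seen) > 1}

# Hypothetical MAC table snapshot; addresses and port names are invented.
seen = [
    ("00:50:56:AA:BB:01", "Gi1/1"),
    ("00:50:56:aa:bb:01", "Gi2/1"),
    ("00:50:56:aa:bb:02", "Gi1/2"),
]

flapping = macs_on_multiple_ports(seen)
for mac, where in flapping.items():
    print(f"{mac} seen on ports: {', '.join(where)}")
```

Any MAC this reports is a candidate for what port security might be reacting to.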

-KjB

RegNullify
Contributor

Greetings,

Let me clarify. First of all, I am using VC 2.5.0 B84767 to manage my hosts, which are ESX3i 3.5.0, B94430. My apologies for not being more informative. With that said, the networking portion of VC allows you to create one of two network connection types: a "Virtual Machine" or a "VMkernel". In my case I have three virtual switches per ESX host, each built using two pNICs. Now, in VC 2.0 I do recall what you are talking about. One network or vSwitch was purely a "Service Console" for ESX host management. You would of course create another vSwitch for VMs, and then you would create a "VMkernel" connection dedicated to VMotion traffic. I am under the impression that VMware is getting away from having as many networks and is squeezing the VMotion functionality into the service console. Unless I am missing something, in VC 2.5 there is no "Service Console" connection type; it is a VMkernel type, and the distinguishing factor for VMotion is simply a check box to enable it. Basically I have two service console networks, and one is VMotion enabled.

One thing you said that piques my interest is good ole Spanning Tree Protocol. In my organization it is a requirement that it be used, but I am going to turn it off and see what happens. Portfast is enabled. Thanks for your support, I do greatly appreciate it.

vSwitch0 is a "VMkernel" connection type and VMotion is disabled on this vSwitch

vSwitch1 is a "Virtual Machine" connection type for VMs

vSwitch2 is a "VMkernel" connection type with VMotion enabled on the vSwitch

dagkl
Contributor

I am stuck, but I believe now more than ever that there is something wrong with my physical NICs. First of all, I didn't get more than 1 MB per second when I used my old router. The problems I encounter with my new Cisco ASA 5505 are as follows:

I bought new cables, all 1.5 meters long, but there was no change. I have to reboot, because the NICs go down again. I try a 15-meter cable, and then the ESX host works. When I switch over to the 1.5-meter cable, the ESX host stops working again, and the NICs will not come up even if I switch back to the long cable.

I have an unmanaged switch that I set up, and there both the long and the short cables work fine. I try to switch over to the ASA again, and it goes down again, new reboot.

So, just for the fun of it, I try to configure the VLAN ID of my inside network (see the drawing from earlier in this thread), in case it has something to do with that. The ESX host behaves correctly, and on the unmanaged switch I now lose connection to the inside network. I switch over to the ASA again, and according to my theory I should no longer be able to see the service console, while the inside network should be OK. But nothing works, and it is a new reboot.

I can only come to the conclusion that the physical NICs on my ESX host are bad, but what I do not understand is why they work on my old unmanaged router and my unmanaged switch, and not on the Cisco ASA.

If there is something wrong with my pNICs, my theory is that they do not have the correct driver in ESX. They are NVIDIA nForce® 590 SLI™ MCP built-in dual Gigabit MAC with external Marvell PHY. I am not sure whether they are properly supported. I could buy another NIC that is supported, just to try.

If it is a router thing, then it must be something about the way the ASA talks to any network device. Remember that on the ESX host there is a virtual switch. Maybe there is some sort of incompatibility here.

Any thoughts?

kjb007
Immortal

Have you made sure that Spanning tree and port security are not enabled? Remove the extra layers so you can be sure they are not causing issues. A known good NIC would also help to eliminate that line of troubleshooting.

-KjB

RegNullify
Contributor

Greetings,

Thanks for your help. After some further investigation I have verified the items listed below. The only snag is turning off STP. In my organization it is a requirement, and shutting off STP would impact the core switches, so that won't be happening anytime soon. I am going to carve out another subnet using the default VLAN, plug everything into an unmanaged switch, uplink it, and then re-IP all of the hosts. Hopefully that will have positive results. If you have any other suggestions, let me know. Thanks for your support.

Spanning Tree Protocol is enabled

Port Security is disabled

Known good NICs confirmed and 100% supported by ESX 3i, and proper drivers have been installed

dagkl
Contributor

I bought a new network card with dual NICs from Intel, one that was on the HCL, and now everything works fine. I've just installed the ASA, so I haven't been able to test speed yet. But anyway, there are no problems with the NICs now.

Thank you for your help.
