VMware Cloud Community
zeiktuvai
Contributor

ESXi Networking vSwitch networking issue

Hello everyone!
   I have been fighting a very strange networking issue and am turning to the community for help.  I have a vSwitch with two uplinks and three port groups, one for each of three VLANs.  One of those VLANs is for VMs that connect directly to the ISP network with a global IP address.
  On the physical switch, both ports are configured as trunk ports with a native VLAN and allowed VLANs (but not as a port group).  At random, one VM will lose the ability to ping the external gateway (the ISP gateway, not my internal router), even though another VM on the same server and the same network can still ping it.
  I have tried all kinds of things, including using a laptop to confirm that each IP can ping the ISP gateway, and connecting the server directly to the ISP equipment instead of to my own switch.

I just don't get why one VM can ping the gateway and the other cannot.

Edit 1: I noticed that on the Windows VM that cannot ping, the connection status shows sent bytes at 0 while received bytes keep going up.
I hope I explained this properly. Thank you in advance for your help.
Greg


Accepted Solutions
zeiktuvai
Contributor

Thank you to everyone that responded.  

I am posting this in case anyone else runs across this issue in the future.  This was NOT a VMware issue. The ISP's ONT, which terminates the service at the business, was faulty and was blocking random MACs from transmitting.

I don't really know the details behind it, but with a new ONT the issue has been resolved.

9 Replies
ShahabKhan
VMware Employee

Hi Greg,

What is the subnet mask of your Global IP network?

zeiktuvai
Contributor

255.255.255.248 (/29)

So, it randomly started working, but then others stopped.

Could our switch be misconfigured?  Should the two switchports be put into an LACP port group, with that configured on the vSwitch uplinks?
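
For reference, a /29 leaves very few addresses to go around, which matters when checking for duplicate or exhausted IPs. The sketch below uses Python's standard `ipaddress` module; the prefix `203.0.113.0/29` is a placeholder from the documentation range, not the actual ISP block, so substitute the real one.

```python
import ipaddress

# Hypothetical /29 block (RFC 5737 documentation range); replace with the
# prefix actually assigned by the ISP.
net = ipaddress.ip_network("203.0.113.0/29")

# Usable host addresses exclude the network and broadcast addresses.
hosts = list(net.hosts())

print("netmask:        ", net.netmask)
print("total addresses:", net.num_addresses)
print("usable hosts:   ", len(hosts))
for ip in hosts:
    print("  ", ip)
```

One of the usable addresses normally belongs to the ISP gateway, so only the remainder are available for VMs and test devices such as the laptop.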

zeiktuvai
Contributor

I am at my wits' end here..

Other machines randomly stop communicating with the gateway, and I can't figure out why.  I have removed the second adapter from the vSwitch, to no effect.  I also connected a laptop to that network with the same IP and it pinged fine, so it has to be something on the vSwitch.  We ran this same configuration on Hyper-V with no problems before we switched to VMware.

markuslam1
Contributor

Has your network team already checked the physical configuration?

ShahabKhan
VMware Employee

Hi,

Can you check the Security and Traffic Shaping configuration on the vSwitch? If possible, please share a screenshot as well.

Regards

compdigit44
Enthusiast

As others may have suggested, I would remove one uplink at a time and test. Also, are the NIC drivers and firmware on the HCL and up to date?

grimsrue
Enthusiast

Hello Zeiktuvai,

You may have one of these issues going on.

1. You may have a duplicate IP address. To rule that out, I would change the IP address of the VM with connectivity issues.

2. Some switch interfaces do not like seeing multiple MACs, which happens when multiple VMs talk out of the same physical interface. I'm not sure what type of switch your ESXi host is connected to, but you might need to look at STP and see whether Spanning-Tree PortFast is enabled on the interfaces.

3. Check that the port group your VMs are using has Forged Transmits and MAC Learning, under Security, set to "Accept".

4. Your ring buffers may be running out of space on your VMs. You can check the ring buffers by running the commands below from the ESXi CLI.

esxcli network vm list     (to get the World ID)

esxcli network vm port list -w <World ID>     (to get the port ID)

vsish -e get /net/portsets/vSwitch0/ports/<port ID>/vmxnet3/rxSummary     (replace <port ID> with the port number, and vSwitch0 with your vSwitch name if it differs)

zeiktuvai
Contributor

So it is very likely that this is not a VMware issue after all.  During testing, we grabbed a fiber media converter and plugged a laptop directly into the feed coming in from the ISP.  And guess what: we couldn't ping the gateway from that laptop either.  With the laptop connected directly to the ISP and the rest of our data center completely disconnected, I'm going to lean towards the ISP.

But it's possible the network team has screwed something up too 🤣

Thank you to everyone who responded. The ISP is coming out tomorrow to take a look, so I'll keep you posted.
