Hey, sorry for my bad explanation..
We have only one VLAN, as you can see in the following output:
Switch Name    Num Ports  Used Ports  Configured Ports  MTU   Uplinks
vSwitch0       4082       13          128               1500  vmnic0,vmnic1

  PortGroup Name        VLAN ID  Used Ports  Uplinks
  VM Network            0        7           vmnic0,vmnic1
  Management Network    0        1           vmnic0,vmnic1
How do I check whether the uplinks are allowed on that VLAN? In my mind, they should already be allowed on the native VLAN 0.
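For what it's worth, the output above is what the ESXi shell prints; if you have SSH/shell access to the host, you can at least confirm the host-side VLAN assignments yourself (names as in my output above):

```sh
# List all standard vSwitches, their uplinks, and each portgroup's VLAN ID
esxcfg-vswitch -l

# Equivalent esxcli view of the portgroups and their VLANs
esxcli network vswitch standard portgroup list
```

Whether the uplinks are allowed on the VLAN on the physical switch side has to be checked on the switch itself, not on the host.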
Any particular reason why you are tagging the VMs with the native VLAN ID?
If your native VLAN ID really is 0, you do not need to do virtual switch tagging for that VLAN.
Take a look at this KB for reference:
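For reference, a trunk port toward an ESXi host on a Cisco switch typically looks roughly like this (interface name and VLAN numbers are placeholders, not taken from your environment); the native VLAN is the one carried untagged, which is why portgroups with VLAN ID 0 work without any tagging:

```
interface GigabitEthernet0/1
 switchport mode trunk
 switchport trunk native vlan 1
 switchport trunk allowed vlan 1,10,20
```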
Thanks for the KB link!
Why do you think I am tagging the VMs with the native VLAN ID? I think this is done automatically by the host, not by me.
I just looked into the configuration of my Cisco switches; VLAN trunking is active on all connected ports. But still no success.
I can't even reach my gateway from the host. It's ridiculous, I really don't know why! What else should I check?
I see there are 7 ports in use by the VM Network portgroup.. can you confirm that the other 6 machines can ping the gateway and that all communication is OK for them..?
Are the Domain Controller and the Exchange server on the same subnet as the new host? If so, that would explain why you can ping those machines but nothing else in the environment. I would first look at the default gateway on the VM,
next I would look for hidden (ghost) network devices on the VM and delete them if they are present.
Well, that's interesting. I've tested every VM running on the host. There is one VM which can't ping the gateway. I can't even reach this VM from a local client, although I can reach it from another VM. But there is no difference between this VM and the other VMs running on the host; in vSphere I can't see any difference in the configuration.
Yes, they are on the same subnet as the host and all of the other network devices. I can reach every VM running on this host from a different VM, but right now I can't reach the host or this one particular VM from a local client.
When you say "From another VM I can reach this VM", I am assuming you are doing either a ping or an RDP.
Remove the network adapter from the VM..
check the KB below and remove any hidden devices from the guest..
then add a new adapter and assign the IP..
I'm a little confused about what can reach what and what can't, but I also agree with the above poster regarding the same subnet. If you have a single flat network (192.168.1.0/24), then anything on that network would be able to reach each other without needing to hit the gateway. Additionally, anything on the same network AND on the same ESXi host AND on the same vSwitch (looks like you just have one) doesn't need to even exit the host. It will be locally switched so it doesn't touch the external network at all.
If the issue is access between VMs on the same host and network, that would indicate a problem with your guest configuration. Maybe bad IP, maybe duplicate IP, maybe bad subnet mask, OS firewall, etc.
If the issue is access between VMs on different hosts but on the same network, that would indicate a problem (including the above possibilities) with L2 switching. You have multiple links listed in your output; are they both set to active, or is one set to passive? If they are both active, are you using route based on virtual port ID without port channels (the default)? I would check which uplinks are being used by which VMs, since that could indicate that your switchport configurations are different. You can check this by SSHing into the host, running esxtop, and hitting n for the network view. One of the columns will show you which vmnic is being used by each VM (and by each vmk as well, since you say the host management network might be wonky too). If VMs on one vmnic have connectivity issues but VMs on the other are fine, that usually indicates an upstream switchport config issue.
If all of these are OK but you have issues from separate subnets, this is an L3 problem which usually is gateway, mask, or external firewall related.
I asked about route based on virtual port ID because I have seen misconfigurations between load balancing choice and network configuration that can cause connection oddities, mostly regarding port channel configurations.
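To spell out the esxtop check mentioned above (hostname is a placeholder):

```sh
ssh root@esxi-host   # your host's address or name goes here
esxtop               # then press 'n' to switch to the network view
# The TEAM-PNIC column shows which physical vmnic each VM port and
# vmkernel port is currently pinned to
```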
Yes, I'm doing a ping from the VM. The VM which is not reachable is a virtual appliance from Sophos, so I can't remove any hidden devices.
Now there is another interesting thing:
After updating the ESXi host and rebooting it, I can't access my vSphere VM anymore, but I can now access the new VM which I couldn't reach before. And even the ESXi host itself is now reachable for me. I don't understand why. There is no IP conflict, I am really sure.
These VMs are Linux based, so I can't follow your KB link. Or should I try to remove the network adapter and re-assign it to the VM?
Thanks for your detailed answer.
Our network configuration is pretty simple. I am really confused as to why I can now reach the ESXi host after the reboot, but not the vSphere VM running on it.
The VM which was not reachable before is now reachable after the reboot of the ESXi host. Confusing for me.
There is definitely no IP conflict and no bad subnet mask configuration. The OS firewalls are not active; most of the VMs are Linux based without any firewalls.
Why is it possible to ping every VM from the DC running on the ESXi host, but not from a local client connected to the same network as the ESXi host?
I checked the IP configuration several times; nothing special and no mistakes.
I'm not sure what you mean when you say "local client." Is this another VM on the same network on the same host? Is this another VM on a different host? Is this another server or desktop in the data center? Is it on the same network? "local client" doesn't describe anything meaningful to troubleshoot with.
Additionally, you keep saying that you can't reach a VM after the reboot, and then you say you can reach a VM after the reboot. Again, are these different VMs? Where are they located? Do you have more than one ESXi host, or just one?
The best I can glean and tell you from this is that you rebooted your host and the network reachability state changed. My guess is that you have a single vSwitch with active/active NICs load balanced based on virtual port ID, and the pinning changed after the reboot. Again this would indicate a problem with the upstream switch configuration. My best guess: either the ports are configured differently instead of identically, the ports are in a port channel when they should not be, or the ports are attached to different switches and the switches have different configs for either the server ports or the upstream uplink ports.
In other words, if your ESXi host has 2 active ports that it can use for traffic for individual functions (VMs or vmkernel for management, vmotion, etc.), once a function is pinned to a port, it won't actively change unless there is a fault. If one of your ports is connected to a switch with a good config, and one is connected to a switch with a bad config (or one port has a good config and one has a bad config), then anything pinned to the good port will work and anything pinned to the bad port won't.
If you have access to the upstream switches I would verify the config (or get a network guy to verify the config). If not, you can test this as follows:
1. Create a new VMK port on this same network and assign an IP.
2. Edit the VMK port (not the vSwitch itself), override the switch NIC teaming settings, assign one NIC as active and one as standby, and apply the change.
3. Check reachability of the IP address.
4. Edit the VMK port, make the standby NIC active and the active NIC standby, and apply the change.
5. Check reachability of the IP address again.
If it is reachable with one NIC active but not the other NIC, then you know which path is the problem.
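The steps above can be sketched with esxcli; the portgroup name, IP address, and vmnic names below are assumptions, so adjust them to your environment:

```sh
# 1. Create a test portgroup and vmkernel port on vSwitch0, assign a spare IP
esxcli network vswitch standard portgroup add --portgroup-name=vmk-test --vswitch-name=vSwitch0
esxcli network ip interface add --interface-name=vmk1 --portgroup-name=vmk-test
esxcli network ip interface ipv4 set --interface-name=vmk1 --ipv4=192.168.1.250 --netmask=255.255.255.0 --type=static

# 2. Override teaming on the portgroup: vmnic0 active, vmnic1 standby
esxcli network vswitch standard portgroup policy failover set --portgroup-name=vmk-test --active-uplinks=vmnic0 --standby-uplinks=vmnic1

# 3. Ping 192.168.1.250 from an external machine and note the result

# 4. Swap the roles: vmnic1 active, vmnic0 standby
esxcli network vswitch standard portgroup policy failover set --portgroup-name=vmk-test --active-uplinks=vmnic1 --standby-uplinks=vmnic0

# 5. Ping again; if one NIC works and the other doesn't, that uplink's path is the problem
```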
Edit: again just to reiterate, testing from a VM on this same host on the same network won't validate anything because this traffic won't actually exit the ESXi host. It will be locally switched on the vSwitch. You will have to validate with something external, either on the same subnet or different subnet (preferably both).
Again, thanks for your detailed answer!
Local client is my personal PC which I use in our company. It's not a VM, just a computer connected to the same network.
We have only one ESXi host; every VM is running on it. Nothing special is configured; just the driver for our RAID controller was integrated after installation.
You are right, the server has 2 NICs; both are connected to the same switch, and both ports are configured completely identically.
I will test your suggestion and configure a new VMK port to test both NICs separately.
I will also try a different switch.. I can imagine there is a problem with the physical switch in our network.
I will let you know asap.
Does that mean all the VMs are reachable now..? Do you still have any issues..?
Really sorry for this late response; there were a lot of business trips in the meantime..
As I suspected, the hardware switches were the problem. On these switches you can configure each port with a different profile (router, switch, IP phone, etc.).
I changed the ports connected to my ESXi host to the "switch" profile and voila, every VM is reachable and can also connect to the internet.
This discussion can be closed; hopefully no one else will ever run into such a problem.