VMware Cloud Community
Highspeedlane
Contributor

Help with virtual Nic losing connectivity

We have one virtual server (Windows 2003 R2 Ent) whose nic periodically loses connectivity, but only partially. If you log into the server with Client and access a network share shortcut, you'll get a "cannot connect to resource" error. If you try to ping the physical server hosting that share, it fails. I say "partial" connectivity loss because I am still able to authenticate at the domain, which means the nic has connectivity to the Active Directory server.

If you go to the nic in Network Connections, right click and select "repair", connectivity is restored. Then if you log out and return the next day, the problem has returned. This doesn't happen to another virtual guest server on the same ESXi host, only this one.

Our setup is ESXi 4.1 Update 1 (this also happened with ESXi 4.1, though). The virtual nics are connected to one physical nic, with the ESXi host on another physical nic (both Gig MM fiber connectors). This partial loss of connectivity also happened when the virtual guests and hypervisor shared the same physical nic.

The virtual nic I have in use now is the Intel Pro/1000 MT, but I also tried the two other VMware nics available, with no change either.

Just wondering if this is something anyone else has any experience with and what I might look into to resolve it. Thanks for any help in advance.

4 Replies
opbz
Hot Shot

hi

not sure if this will help or not.

But have you looked at what your NIC is doing from the ESX server level?

What does netstat -int tell you? Do you have full connectivity? Any dropped packets or other errors? What does esxcfg-nics tell you? Is the nic fully up at 1000 and with full duplex?

You also don't mention how your networking is configured at the ESX level. Are you using iSCSI? How many nics are on the vswitch? How many other vswitches are there? What is the VM being used for? Is it heavily used? Do you have VMware Tools installed and updated? What is the VMware hardware version; is it at version 7? What about other VMs on the ESX server? Could one of them be affecting this server?

Have you tried pinging the VM from the server that hosts the share? I expect it to fail but.... Is the physical host on the same network as the VM? Can you access that host from other VMs?

What driver is your VM using for its nic? Was this VM created from scratch, or is it a P2V one?

hopefully I have given you some ideas...

Highspeedlane
Contributor

Thanks for the reply. I'm a novice with much of this but I'll try to answer as many questions as I know.

I just went in to the vm server, logged in, and could not access a network share (resource could not be found). I also could not ping the physical server hosting the share. Then, without changing anything, I went to the physical server hosting the share and found I could ping back to the virtual server. When I went back to the virtual server, I could at that point ping the physical server.

- netstat -e = no errors, no discards

- esxcfg-nics -l = 1000 Mbps, Full duplex, MTU 1500, Up

- nics on vswitch - 5 virtual, 2 physical

- # of vswitches - 1

- vm is used for - network security tools (vulnerability scanner, anti-virus server)

- vmtools installed and updated with latest upgrade to ESXi 4.1 Update 1 (also happened with older 4.1 and vmtools)

- IPv6 disabled

- driver - Microsoft 6.3.6.31

- vmware hardware version - ? (not sure where to get this info)

- could another vm be affecting this one - I don't believe so

- is physical host on same network as the vm - yes

- is vm created from scratch or a P2v - from scratch

Thanks if this helps give you any ideas, let me know. This seems to be very intermittent and unpredictable.

opbz
Hot Shot

OK, let's look a bit more at this then.

- nics on vswitch - 5 virtual, 2 physical

So you only have 1 vswitch, and it is using 2 physical nics? Is that correct?

You can see this in Configuration -> Networking.

If you then click on the properties of the vswitch and then on the adapters tab, it shows you the networks that are detected; they should be the same on both nics. You need to make sure that both nics are up and seeing similar networks. It might be an idea to limit this to one nic for troubleshooting. But this depends on how the VM is associated with a physical nic... also, that usually happens at VM power-up, so it should be constant.

- driver - Microsoft 6.3.6.31

The driver I mean is the adapter type listed in the settings of the VM under the nic: flexible, vmxnet3, that kind of stuff.

- vmware hardware version - ? (not sure where to get this info)

This is mentioned on the Summary tab for the VM. If it's not at HW 7, it might be an idea to update it. But if you created the VM from scratch and the server you created it on was ESX 4, then you should be at this level already.

On the VM where you are having the problems, do you have any sort of power management running? Something that could be knocking off the nics? Does the problem occur only after you leave the VM for a while, or has it ever happened that everything is working fine one minute and the next minute the network has dropped?

Do you have anything else that tries to access that share? Does that have problems?

This is a bit messy...

I'm trying to see where the problem can be. As it is, we have 4 options:

1: the VM itself, which is why I ask about hw versions, drivers and power management

2: the ESX server, which is why I was asking about the netstat bit... that shows the ESX server appears to be fine

3: the server providing the share

4: a network issue
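
Since the problem is intermittent, it might help to log exactly when each target stops answering, which should narrow down which of the four options above is in play. Below is a rough Python sketch, not a finished tool. It assumes Python is available on the affected VM (a Windows 2003 guest would need an older Python release) and that the share host answers on TCP port 445. The function names are made up for illustration.

```python
import socket
import time


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    sock = None
    try:
        sock = socket.create_connection((host, port), timeout)
        return True
    except (OSError, socket.error):
        return False
    finally:
        if sock is not None:
            sock.close()


def find_transitions(samples):
    """Given [(timestamp, target, ok), ...] in time order, return only the
    samples where a target's state differs from its previous sample."""
    last = {}
    changes = []
    for stamp, target, ok in samples:
        if target in last and last[target] != ok:
            changes.append((stamp, target, ok))
        last[target] = ok
    return changes


def monitor(targets, interval=60, rounds=None):
    """Poll each (name, host, port) target every `interval` seconds and
    print a line when a target is first seen or changes state.
    Runs forever when rounds is None."""
    last = {}
    count = 0
    while rounds is None or count < rounds:
        for name, host, port in targets:
            ok = tcp_reachable(host, port)
            if last.get(name) != ok:
                stamp = time.strftime("%Y-%m-%d %H:%M:%S")
                print("{0} {1} {2}".format(stamp, name, "UP" if ok else "DOWN"))
                last[name] = ok
        count += 1
        if rounds is None or count < rounds:
            time.sleep(interval)
```

To use it, you would call something like `monitor([("share-host", "192.0.2.10", 445), ("domain-controller", "192.0.2.20", 389)])` with your own addresses (the ones here are placeholders). If only the share host goes DOWN while the domain controller stays UP, that points more toward option 3 or 4 than toward the VM's nic as a whole.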

Highspeedlane
Contributor

Thanks again. When I get a bit more time I'll dig deeper using your post to get more of the info. This vm guest server is not used by domain users, but only by security personnel. The main issue is automated tools. We have a utility on it that automatically retrieves the Windows security event logs from the domain member workstations, and periodically I have noticed an unexplained absence of some of these files being retrieved. This didn't happen when the utility was installed on a physical server running the same Windows server version, so my suspicions are falling on this intermittent nic activity as being a culprit there.

Again, thanks for the assistance and I'll get back here when I have more details or info to share.
