VMware Cloud Community
NICJerry
Contributor

One VM loses connection with another

I have a small vSphere cluster consisting of three ESX servers.  Last week we updated them all to ESX 4.1 U2.  Lately (starting even before the upgrade), our monitoring VM (Ubuntu 6.06.2 running Nagios Core 3.2.3) has been completely losing its connection to our primary Windows 2003 web server VM.  When that happens, all services and websites being monitored by Nagios are reported as down and alerts are sent out.  There is nothing wrong with the web server: all sites are up and all services are fine.  The problem is that the Nagios VM cannot 'see' the web server for some reason.  I can't ping the web server's IP from the command line of the Nagios VM, but all other VMs and physical machines can reach the web server without issue.  The Nagios VM continues to see all other hosts; it only goes blind to the web server.  ONLY the Nagios VM is affected.  Last time, the problem just went away on its own.  This time it seems to be taking longer.

Has anyone seen anything like this before?  If there's additional info I can provide that would help, let me know.  Thanks.

7 Replies
NICJerry
Contributor

As I was saying... the problem just went away entirely on its own.  No intervention whatsoever.  Nothing was changed, rebooted, etc.  ???

ranjitcool
Hot Shot

Hello,

I need some more info before I can answer.

Are both the VMs on the same vSwitch? How is the network set up?

I don't think the upgrade would cause something like that, but I have sometimes seen my NICs start acting up, and then I remove and re-add them. This is not often, though.

RJ

Please award points if you find my answers helpful Thanks RJ Visit www.rjapproves.com
NICJerry
Contributor

They are NOT on the same vSwitch.  In fact, the two VMs are not even on the same ESX host.  Each ESX host has a connection to a switch that leads to the firewall/outside and other connection(s) to a SAN network.

ranjitcool
Hot Shot

Okay, so the problem was only between Nagios and the web server, and all other VMs were fine?

R

NICJerry
Contributor

Everything else was just fine.  The only issue was the Nagios VM could not see the web server VM.  Nagios could see all other VMs, including VMs on the same host/vSwitch, and all other hosts could see the web server.
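Since only one source VM loses only one destination, one thing that could be checked from the Nagios side during an outage is its ARP entry for the web server's IP. The following is a minimal sketch, assuming a Python interpreter is available on the Linux guest and that the kernel exposes the usual /proc/net/arp table. The function names are made up for illustration and the addresses in the usage note are placeholders, not the real ones.

```python
def parse_arp_table(text):
    """Parse text in Linux /proc/net/arp format into {ip: (mac, flags)}."""
    entries = {}
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed rows
        ip, _hw_type, flags, mac = fields[0], fields[1], fields[2], fields[3]
        entries[ip] = (mac.lower(), int(flags, 16))
    return entries


def arp_status(entries, ip):
    """Return the cached MAC for ip, or 'incomplete' / 'missing'."""
    if ip not in entries:
        return "missing"
    mac, flags = entries[ip]
    # Flag bit 0x2 (ATF_COM) marks a completed entry; without it the
    # neighbour never answered the ARP request.
    if not (flags & 0x2) or mac == "00:00:00:00:00:00":
        return "incomplete"
    return mac


def read_arp_table(path="/proc/net/arp"):
    """Read the live ARP table (Linux only)."""
    with open(path) as handle:
        return parse_arp_table(handle.read())
```

Usage would be something like arp_status(read_arp_table(), "192.0.2.10") with the web server's real IP substituted. Comparing the returned MAC against the MAC of the web server's vNIC, both while the problem is happening and after it clears, would show whether the Nagios VM is holding a stale or wrong entry.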

ranjitcool
Hot Shot

Was the Windows VM able to see others, as in ping other VMs on both hosts?

I would say that if Nagios was able to see everything and Windows was not, it was something with Windows.

What hardware version is the Windows VM running, and what is the NIC type?

R

NICJerry
Contributor

The hardware version is 4 and the adapter type is Flexible.

One thing I noticed: on the Summary tab of the web server VM there is a View all link for IP addresses.  That list is incomplete.  Being a web server, the Windows VM has several IPs bound to it, and I do not see all of them on the View all list.  Could that be an issue?  The IP that Nagios monitors is NOT on that list.
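To make that comparison less error-prone than eyeballing two lists, something like the following could be used. It is only a sketch: the function name is hypothetical, the addresses in the usage note are placeholders, and it assumes both lists have been copied by hand (one from the guest's own IP configuration, one from the View all dialog) and contain only IPv4 addresses.

```python
import ipaddress


def missing_from_summary(bound_in_guest, shown_in_summary):
    """Return the IPs bound in the guest that the Summary list does not show."""
    # Parsing through ipaddress normalizes whitespace issues and
    # rejects typos instead of silently comparing bad strings.
    bound = {ipaddress.ip_address(a.strip()) for a in bound_in_guest}
    shown = {ipaddress.ip_address(a.strip()) for a in shown_in_summary}
    return [str(a) for a in sorted(bound - shown)]
```

For example, missing_from_summary(["192.0.2.10", "192.0.2.11"], ["192.0.2.10"]) returns the one address the Summary tab is not showing. If the monitored IP is consistently in that result, that at least confirms the observation above in a repeatable way.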
