VMware Networking Community
jaelae
Enthusiast

Guest Introspection dependencies and how to make it highly available

I'm trying to understand how Guest Introspection behaves so we can address some rather critical issues in our environment. We are running NSX 6.2.3 with a single NSX Manager and three controller nodes. Currently we use only the firewall portion of NSX along with a few Edge DHCP nodes. We decided against running full antivirus clients on each VM and instead use an NSX-supported AV product (Trend in our case), which means each of our clusters has the Guest Introspection service deployed, as well as the Trend service, from the Service Deployments screen in NSX Manager.

My understanding of just the Guest Introspection piece is that the NSX Manager VM itself deploys the Guest Introspection VMDK to each host, and then the controllers take over as the component it reports into. This means that if the NSX Manager is rebooted or something happens to it, our VMs are still protected, as the manager is not needed for Guest Introspection (or Trend, for that matter) to operate.

- Is that the correct understanding of how that works?

Now here is our issue. NSX Manager has been rebooted in the past without causing any issues on any service deployments or any NSX component errors. Everything appears to work correctly during the reboot. However, when our NSX Manager VM ran out of disk space, a large number of VMs stopped working. They were up and running but lost network connectivity and had to be restarted. We were able to replicate this and confirm it was related, which is very worrisome. This tells me that NSX Manager can be rebooted safely, but that hosts still appear to report into it, perhaps for some kind of validation. The concern is that if something like this happens to NSX Manager, it is a single point of failure for our environment.

Any feedback is much appreciated!

Accepted Solutions
Sreec
VMware Employee

My understanding of just the Guest Introspection piece is that the NSX Manager VM itself deploys the Guest Introspection VMDK to each host, and then the controllers take over as the component it reports into. This means that if the NSX Manager is rebooted or something happens to it, our VMs are still protected, as the manager is not needed for Guest Introspection (or Trend, for that matter) to operate.

- Is that the correct understanding of how that works?

The Guest Introspection feature deploys an appliance per host, plus the required VIBs on each host. However, the NSX Controllers play no role here. The main components for this feature are vCenter (VC), ESXi with the VIBs, NSX Manager, the service VM (SVM), the guest virtual machine with VMware Tools and the required modules, and finally a supported third-party security VM.

Now here is our issue. NSX Manager has been rebooted in the past without causing any issues on any service deployments or any NSX component errors. Everything appears to work correctly during the reboot. However, when our NSX Manager VM ran out of disk space, a large number of VMs stopped working. They were up and running but lost network connectivity and had to be restarted. We were able to replicate this and confirm it was related, which is very worrisome. This tells me that NSX Manager can be rebooted safely, but that hosts still appear to report into it, perhaps for some kind of validation. The concern is that if something like this happens to NSX Manager, it is a single point of failure for our environment.

Since NSX Manager is a management-plane component, in this case it communicates with vCenter's ESX Agent Manager (VC-EAM) to deploy and monitor VIBs and SVMs on the hosts. So that part would certainly be impacted when the Manager is down for whatever reason (network, storage, host crash, etc.). Yes, NSX Manager is a single point of failure (with very low impact), and you need to leverage vSphere HA to protect it. The NSX-T management plane cluster may be able to handle failure scenarios with no impact at all (I'm not fully sure about this).
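
To make the dependency picture above easier to reason about, here is a minimal sketch that models it as a lookup table. It is an illustration only: the function names and groupings are paraphrased from this thread, not taken from any NSX API, and the assumption that per-host scanning does not depend on NSX Manager reflects what the replies expect rather than verified behavior (the outage described in the question suggests reality can differ).

```python
# Illustrative model only. Names paraphrase the thread; none of this is an NSX API.
# Assumption: scanning on a host depends only on host-local pieces, not on NSX Manager.
DEPENDS_ON = {
    "deploy VIBs and SVMs": {"NSX Manager", "vCenter EAM"},
    "monitor VIBs and SVMs": {"NSX Manager", "vCenter EAM"},
    "guest scanning on a host": {
        "ESXi VIBs",
        "Service VM",
        "Third-party security VM",
        "VMware Tools modules",
    },
}


def impacted(failed_component):
    """Return the functions that list failed_component as a dependency."""
    return sorted(
        function
        for function, components in DEPENDS_ON.items()
        if failed_component in components
    )


if __name__ == "__main__":
    for component in ("NSX Manager", "NSX Controller", "Service VM"):
        print(component, "->", impacted(component) or "no modeled impact")
```

In this model a failed NSX Manager affects only deployment and monitoring, and a failed controller affects nothing related to Guest Introspection, which matches the statement that the controllers have no role here.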

Cheers,
Sree | VCIX-5X| VCAP-5X| VExpert 7x|Cisco Certified Specialist
Please KUDO helpful posts and mark the thread as solved if answered

3 Replies
szilagyic
Hot Shot

I have never heard of VMs being affected when the NSX Manager is having issues. AFAIK NSX Manager is only for deploying, pushing config, etc. The services run on the ESXi host, so as long as the host is up I would think you should be fine.

However, you are asking about Guest Introspection and making that component highly available, which is exactly what we are looking at as well. In our case we are looking at Symantec DCS. In testing, we shut down the SVA on a host, and the AV scanning on that host stopped (basically, everything failed open). This is a little worrisome to us as well. Sure, it's easy to re-deploy a new SVA in a matter of minutes, but as far as we know this is a manual process. If the SVA has an issue in the middle of the night, it is not fixed until somebody can get to it. We have asked both VMware and Symantec to address this issue, and so far we have nothing concrete from either side explaining how to make Guest Introspection highly available on each host.
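
Until there is a supported way to make the SVA highly available, one stopgap is to alert on hosts whose SVA is not running so the manual re-deploy starts sooner. The sketch below shows only the decision logic. The `sva_state` mapping, the host names, and the `"running"` state string are hypothetical; how you collect that state (vCenter inventory, a monitoring tool, and so on) is left open.

```python
def unprotected_hosts(sva_state):
    """Return hosts whose SVA is not reported as running.

    sva_state maps a host name to a state string. Both the host names and
    the state strings are hypothetical placeholders for whatever your
    inventory or monitoring source reports.
    """
    return sorted(host for host, state in sva_state.items() if state != "running")


if __name__ == "__main__":
    # Hypothetical snapshot: one SVA has been shut down, as in the test above.
    snapshot = {
        "esx01": "running",
        "esx02": "stopped",
        "esx03": "running",
    }
    for host in unprotected_hosts(snapshot):
        print(f"ALERT: SVA on {host} is not running; scanning there fails open")
```

This does not restore protection by itself. It only shortens the window in which a host runs with scanning failed open.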

Techstarts
Expert

They were up and running but lost network connectivity and had to be restarted.

I'm surprised that the unavailability of NSX Manager caused network downtime for VMs. I hope you have opened a case with VMware and Trend Micro; I certainly would. I have read this article but couldn't find any dependency between NSX Manager and the DSVA that could lead to a fail-closed situation.

VMware has stated in all its documents that NSX Manager sits in the management plane and therefore shouldn't cause a fail-closed scenario.

However, our NSX Manager VM ran out of disk space and a ton of VMs stopped working.

FYI, if a VM runs out of disk space, even vSphere HA won't help, as the failure domain is outside the hypervisor layer.
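
Because HA cannot catch a full disk inside the guest, the practical protection is to alert before the disk fills. Here is a minimal sketch of a threshold check, assuming you can obtain total and used bytes for the appliance's filesystem from some monitoring source. The 80% and 90% thresholds are arbitrary example values, and `check_path` only reads the local filesystem of wherever the script runs.

```python
import shutil


def disk_status(total_bytes, used_bytes, warn_pct=80.0, crit_pct=90.0):
    """Classify disk usage as 'ok', 'warning', or 'critical'.

    The default thresholds are example values, not vendor recommendations.
    """
    used_pct = 100.0 * used_bytes / total_bytes
    if used_pct >= crit_pct:
        return "critical"
    if used_pct >= warn_pct:
        return "warning"
    return "ok"


def check_path(path="."):
    """Classify usage of the local filesystem that contains path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return disk_status(usage.total, usage.used)


if __name__ == "__main__":
    print("local filesystem:", check_path("."))
```

Separating `disk_status` from `check_path` lets the same thresholds be applied to numbers gathered remotely, which is the more likely case for a locked-down appliance.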

With Great Regards,