Julien_FR57
Contributor

vSphere 6.5 - NSX Manager 6.3 for antimalware - problem with EAM and host integration


Hello,

I have a VMware deployment with a vSphere ESXi 6.5.0 Update 1 cluster, a vCenter, and NSX Manager 6.3.5, used for agentless antimalware with Trend Micro Deep Security.

The VMware license edition is Enterprise. The NSX Manager license edition is Standard (used only for agentless antimalware).

This VMware cluster was deployed last November directly with these versions (except NSX Manager, which was initially deployed as version 6.3.3).

The initial cluster contained 2 hosts.

On the network side, there are:

- one dvSwitch with some dvPortGroup for VMs (production).

- one standard virtual switch : vmservice-vswitch

    - vmservice-vmknic-pg

    - vmservice-vshield-pg

- one vSwitch0 for management :

    - DC_MGMT_HV

    - Management Network

    - Vmotion-00

Everything worked during the first deployment, including the NSX Manager and Trend Micro Deep Security integration.

On the NSX side, the hosts were not prepared because of the license level: host preparation is not possible with this license, and it is not needed just for antimalware.

Initially, the Guest Introspection and Trend Micro DSVA appliances were deployed without any problem.

The network & security services for Guest Introspection and DSVA are configured with the port group set to "Specified on host".

The Agent VM is configured on each host.

For information, at the beginning of the year, the Meltdown and Spectre patches were applied to the cluster.

Last week we had to add two new hosts to the cluster.

We are having problems integrating these two hosts.

It is impossible to deploy the Guest Introspection and Trend Micro DSVA VMs on them.

The hosts were prepared and inserted into the cluster.

We can see that com.vmware.vim.eam tries to install an agent but fails with an error.

The vCenter WebUI is displaying "Cannot complete the operation : Agent Network(s) : "missing:'DistributedVirtualPortGroup:dvportgroup-70:405....934f" not available on host."

If we look at the EAM logs, we find this message:

2018-01-26T15:14:51.417Z | ERROR | host-2194-1 | AuditedJob.java | 75 | JOB FAILED: [#842713751] DeployVmJob(AgentImpl(ID:'Agent:7ed64a22-2873-4bce-b681-c6d6671697fb:null')), Cause:

com.vmware.eam.EamException: Can't provision VM for AgentImpl(ID:'Agent:7ed64a22-2873-4bce-b681-c6d6671697fb:null') due to AgentVmDatastore or AgentVmNetwork missing.

        at com.vmware.eam.job.DeployVmJob.doJob(DeployVmJob.java:204)

        at com.vmware.eam.job.DeployVmJob.call(DeployVmJob.java:156)

        at com.vmware.eam.job.DeployVmJob.call(DeployVmJob.java:108)

        at com.vmware.eam.async.impl.AuditedJob.call(AuditedJob.java:35)

        at com.vmware.eam.async.impl.FutureRunnable.run(FutureRunnable.java:52)

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

        at java.lang.Thread.run(Thread.java:748)
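
For anyone triaging a similar failure, here is a minimal Python sketch for pulling the failing agent IDs out of EAM log lines like the one above. The pipe-separated field layout is inferred from that single sample line only, and the names `EamLogEntry`, `parse_eam_line` and `failed_agent_ids` are hypothetical helpers, not part of any VMware tooling.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class EamLogEntry(NamedTuple):
    timestamp: str
    level: str
    thread: str
    source_file: str
    line_no: int
    message: str


def parse_eam_line(line: str) -> Optional[EamLogEntry]:
    """Split one 'time | level | thread | file | line | message' record.

    Returns None for lines that do not match this layout, such as the
    stack-trace lines that follow a failed job.
    """
    parts = [p.strip() for p in line.split("|", 5)]
    if len(parts) != 6 or not parts[4].isdigit():
        return None
    return EamLogEntry(parts[0], parts[1], parts[2], parts[3], int(parts[4]), parts[5])


# Agent IDs in the sample look like 'Agent:<36-character UUID>:null'.
AGENT_ID = re.compile(r"Agent:([0-9a-f-]{36})")


def failed_agent_ids(lines: Iterable[str]) -> List[str]:
    """Collect the agent UUIDs mentioned in ERROR records containing 'JOB FAILED'."""
    ids: List[str] = []
    for line in lines:
        entry = parse_eam_line(line)
        if entry and entry.level == "ERROR" and "JOB FAILED" in entry.message:
            ids.extend(AGENT_ID.findall(entry.message))
    return ids
```

Feeding it the lines of the EAM log file gives the list of agents whose deployment job failed, which can then be matched against the agency shown in the vCenter UI.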

There is nothing in the other log files (vmkernel, esxupdate, ...).

I tried everything I could, without success. I checked the knowledge base and verified many things, and all the prerequisites seem to be met.

We upgraded NSX Manager from 6.3.3 to 6.3.5 in case it was a bug, but nothing improved.

We are experiencing another problem: a VM that was first deployed on host 1 or 2 cannot be moved to host 3 or 4. But a VM deployed on host 3 or 4 can be moved to host 1 or 2 and then back to host 3 or 4.

Maybe the same problem is causing both behaviors.

Thanks for your assistance,

1 Solution

Accepted Solutions
Julien_FR57
Contributor

Hello,

To keep you informed, I was able to solve my problem with the VMware support engineering team.

In fact, it was a problem with the vCenter. The missing dvportgroup that was blocking the deployment was probably still present somewhere in an unknown database, but we did not find it.

It seems to be a ghost dvportgroup that may have been configured in the past and then deleted, but is still referenced somewhere.


The workaround applied was :
1. Create new cluster and put new Hosts (3 and 4) inside.
2. Deploy GI and make sure it is operational.
3. Migrate all the workload from Host1 and Host2 to Host3 and Host4.
4. Create dummy cluster and move Host1 and Host2 to it, so the GI VMs will get deleted.
5. Destroy the existing cluster.
6. Move Host1 and Host2 to the newly created cluster and deploy the GI on them.
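
To make the ordering of these steps explicit, here is a toy Python model of the workaround. It only tracks cluster membership, GI placement and workload placement in plain sets; it does not talk to vCenter, and the cluster names and the `run_workaround` function are hypothetical.

```python
from typing import Dict, Set


def run_workaround(old_hosts: Set[str], new_hosts: Set[str]) -> Dict[str, Set[str]]:
    """Replay the six steps on a toy inventory and return the final cluster layout."""
    clusters: Dict[str, Set[str]] = {"old-cluster": set(old_hosts)}
    gi_hosts: Set[str] = set(old_hosts)        # hosts currently running a GI VM
    workload_hosts: Set[str] = set(old_hosts)  # hosts currently running workloads

    # Step 1: create the new cluster and put the new hosts inside.
    clusters["new-cluster"] = set(new_hosts)

    # Step 2: deploy GI there and confirm it is operational before moving anything.
    gi_hosts |= clusters["new-cluster"]
    assert clusters["new-cluster"] <= gi_hosts

    # Step 3: migrate all workloads to the new hosts.
    workload_hosts = set(new_hosts)

    # Step 4: move the old hosts to a dummy cluster; their GI VMs get deleted.
    assert not (workload_hosts & old_hosts), "old hosts must be evacuated first"
    clusters["dummy-cluster"] = clusters["old-cluster"]
    clusters["old-cluster"] = set()
    gi_hosts -= old_hosts

    # Step 5: destroy the original cluster, which must now be empty.
    assert not clusters["old-cluster"]
    del clusters["old-cluster"]

    # Step 6: move the old hosts into the new cluster and deploy GI on them.
    # Dropping the empty dummy cluster from the model is an assumption;
    # the post does not say what happened to it.
    clusters["new-cluster"] |= clusters.pop("dummy-cluster")
    gi_hosts |= old_hosts
    assert clusters["new-cluster"] <= gi_hosts

    return clusters


print(run_workaround({"host1", "host2"}, {"host3", "host4"}))
```

The assertions encode the two ordering constraints implied by the steps: the old hosts are only moved once they carry no workload, and the original cluster is only destroyed once it is empty.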

All is fine now with the NSX GI and Trend Micro DSVA deployment.

Thanks for your assistance.

Regards,

Julien.

8 Replies
cnrz
Expert

Would it be possible to set the agent VM network property on the host before adding it to the cluster?

A license (NSX or Trend Micro) or compatibility issue is also possible. Which version of Deep Security is running? Since Trend Micro licensing may also be CPU-socket based, could an additional license be needed for the new hosts?

https://success.trendmicro.com/solution/1060499-deep-security-and-vmware-compatibility-matrix#collap...

What does the NSX Manager>Summary>License Information show as the consumed license? Does this exceed the current standard license CPU count?

Adding the two additional hosts to the cluster consumes additional CPU licenses, and a host must be added to the dVS before it is added to an NSX-prepared cluster:

https://thecloudxpert.net/2017/07/howto-add-host-vmware-nsx-enabled-vsphere-cluster/

Prerequisites

This post assumes the following:

  • The target VMware ESXi host(s) are already connected to vCenter Server.
  • The target VMware ESXi host(s) are already connected to any relevant Distributed Switches.

https://docs.vmware.com/en/VMware-NSX-for-vSphere/6.3/com.vmware.nsx.admin.doc/GUID-EA477D96-E2D3-48...

Deploy a Partner Service

If the partner solution includes a host-resident virtual appliance, you can deploy the service after the solution is registered with NSX Manager.

Prerequisites

Ensure that:

  • The partner solution is registered with NSX Manager.
  • NSX Manager can access the partner solution's management console.
  • The required license edition has been assigned. See https://kb.vmware.com/kb/2145269.

Procedure

  1. Click Networking & Security and then click Installation.
  2. Click the Service Deployments tab and click the New Service Deployment (Add) icon.
  3. In the Deploy Network and Security Services dialog box, select the appropriate solution(s).
  4. In Specify schedule (at the bottom of the dialog box), select Deploy now to deploy the solution immediately or select a deployment date and time.
  5. Click Next.
  6. Select the datacenter and cluster(s) where you want to deploy the solution and click Next.
  7. Select the datastore on which to add the solution service virtual machines storage or select Specified on host. The selected datastore must be available on all hosts in the selected cluster. If you selected Specified on host, the datastore for the ESX host must be specified in the AgentVM Settings of the host before it is added to the cluster. See vSphere API/SDK Documentation.
  8. Select the distributed virtual port group to host the management interface. This port group must be able to reach the NSX Manager’s port group. If the network is set to Specified on host, the network to be used must be specified in the Agent VM Settings > Network property of each host in the cluster. See vSphere API/SDK Documentation. You must set the agent VM network property on a host before you add it to a cluster. Navigate to Manage > Settings > Agent VM Settings > Network and click Edit to set the agent VM network. The selected port group must be available on all hosts in the selected cluster.
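
The requirement that the selected port group be available on every host in the cluster can be checked with a few lines of Python once the per-host port group names have been exported. This is only a sketch: the inventory below is hypothetical sample data (host names are invented, and picking DC_MGMT_HV as the agent VM network is an assumption for illustration), and `hosts_missing_network` is not a VMware API.

```python
from typing import Dict, List, Set


def hosts_missing_network(required_pg: str, host_port_groups: Dict[str, Set[str]]) -> List[str]:
    """Return the hosts (sorted by name) whose port group list lacks required_pg."""
    return sorted(
        host for host, port_groups in host_port_groups.items()
        if required_pg not in port_groups
    )


# Hypothetical export of port group names per host; host3 lacks the agent network.
inventory = {
    "host1": {"vmservice-vmknic-pg", "vmservice-vshield-pg", "DC_MGMT_HV"},
    "host2": {"vmservice-vmknic-pg", "vmservice-vshield-pg", "DC_MGMT_HV"},
    "host3": {"vmservice-vmknic-pg", "vmservice-vshield-pg"},
    "host4": {"vmservice-vmknic-pg", "vmservice-vshield-pg", "DC_MGMT_HV"},
}

print(hosts_missing_network("DC_MGMT_HV", inventory))  # prints ['host3']
```

An empty list means every host exposes the port group by name. It does not prove that the reference stored by vCenter or EAM points at that same port group.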
Julien_FR57
Contributor

Hello,

Thank you very much for your answer.

I checked everything you mentioned:

- licenses: everything is OK on the vCenter regarding the vCenter, vSphere and NSX licenses

Extract from the VMware NSX Install Guide: "Starting in NSX 6.2.3, the default license upon install will be NSX for vShield Endpoint. This license enables use of NSX for deploying and managing vShield Endpoint for anti-virus offload capability only, and has hard enforcement to restrict usage of VXLAN, firewall, and Edge services, by blocking host preparation and creation of NSX Edges."

- compatibility with trend micro deep security : full compatibility and we run supported versions

- agent VM network property is set on the host

- NTP and DNS are OK

When the host is inserted into the cluster, com.vmware.vim.eam installs the agent and the operation completes successfully.

But as soon as the host exits maintenance mode, com.vmware.vim.eam tries to install an agent again and the error message "Cannot complete the operation... Agent Network(s): missing dvportgroup... not available on host." appears.

I configured everything exactly as described in the procedure you sent me.

Thanks,

Sreec
VMware Employee

May I know what error message you are getting? What is the ESXi version of the new hosts?

We experience an other problem, it's not possible to move a VM from host 1 or 2 to host 3 or 4 if the VM is firstly deployed on the host 1 or 2. But If a VM is deployed on host 3 or 4, it's possible to move it on host 1 or 2 and to move it again on host 3 or 4.

Also, can you confirm that the newly deployed hosts are correctly added to the existing distributed switch and that their status is green?

Cheers,
Sree | CKA|CKAD|VCIX-3X| VCAP-4X| VExpert 5x
Julien_FR57
Contributor

Here are the running versions :

VMware ESXi, 6.5.0, 7526125

NSX Manager : Version:6.3.5 Build 7119875

vSphere Client : Version 6.5.0.13000 Build 7312210

The vCenter WebUI is displaying "Cannot complete the operation : Agent Network(s) : "missing:'DistributedVirtualPortGroup:dvportgroup-70:405....934f" not available on host."

If we look at the EAM logs, we find this message:

2018-01-26T15:14:51.417Z | ERROR | host-2194-1 | AuditedJob.java | 75 | JOB FAILED: [#842713751] DeployVmJob(AgentImpl(ID:'Agent:7ed64a22-2873-4bce-b681-c6d6671697fb:null')), Cause:

com.vmware.eam.EamException: Can't provision VM for AgentImpl(ID:'Agent:7ed64a22-2873-4bce-b681-c6d6671697fb:null') due to AgentVmDatastore or AgentVmNetwork missing.

        at com.vmware.eam.job.DeployVmJob.doJob(DeployVmJob.java:204)

        at com.vmware.eam.job.DeployVmJob.call(DeployVmJob.java:156)

        at com.vmware.eam.job.DeployVmJob.call(DeployVmJob.java:108)

        at com.vmware.eam.async.impl.AuditedJob.call(AuditedJob.java:35)

        at com.vmware.eam.async.impl.FutureRunnable.run(FutureRunnable.java:52)

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

        at java.lang.Thread.run(Thread.java:748)

There seems to be nothing else in the other log files (vmkernel, esxupdate, ...).

The status of the new host is green and the host is correctly added to the dvSwitch.

But NSX Manager and Guest Introspection are not connected to a dvSwitch; they use a standard switch:

On the network side, there are:

- one dvSwitch with some dvPortGroup for VMs (production).

- one standard virtual switch : vmservice-vswitch

    - vmservice-vmknic-pg

    - vmservice-vshield-pg

- one vSwitch0 for management :

    - DC_MGMT_HV

    - Management Network

    - Vmotion-00

cnrz
Expert

Is AutoDeploy or VUM enabled on the Cluster? EAM may try to use VUM for deployment, and if VUM is enabled but stopped then it may fail to prepare the host.

https://verbrough.wordpress.com/2016/07/12/nsx-6-2-3-guest-introspection-deployment/

The guest introspection service deployment is performed per cluster. If you are deploying the Guest Introspection service to a cluster with vSphere hosts using vSphere Auto Deploy in a stateless configuration the deployment will fail.

  • What does the Installation Status show before the hosts are added to the cluster, and after the hosts are added to the cluster?

Installation > Service Deployments > Installation Status for Guest Introspection Service and Trend Micro Deep Security Service

If it shows Succeeded, then both the Guest Introspection service and the DSVA (Deep Security Virtual Appliance) are deployed on all ESX hosts in the cluster.

If it shows Failed, the reason for the failure may be shown in the pop-up behind the red exclamation mark (System Alarm).

https://verbrough.wordpress.com/2016/07/12/nsx-6-2-3-guest-introspection-deployment/

Installation_Status_Failed.png

  • Also, what are the datastore and port group settings for the Guest Introspection and Trend Micro Deep Security services at the cluster level? (Are the datastore and the distributed port group configured at the cluster level available on the newly deployed hosts?)

Select the datastore, the distributed port group used by your NSX cluster, and IP assignment method, then click Next.

https://help.deepsecurity.trendmicro.com/10/0/Reference/ref-install-ep-serv.html

Guest_Introspection_pg.png

About standard switch deployments for the service VMs: is it possible that this is not supported with the current versions of NSX and Trend Micro Deep Security? As the link below shows, a standard switch is supported for vShield, but for NSX only a distributed switch appears to be supported.

NSX and agentless virus protection

However, the documentation says that both service VMs and workload VMs are only supported on a vSphere Distributed Switch (vDS).

[Screenshots: 65320_65320.PNG, DSVA vSS.PNG]

NSX and vSphere Distributed Switches

NSX services are not supported on vSphere Standard Switch. VM workloads must be connected to vSphere Distributed Switches to use NSX services and features.

NSX & vSphere Standard Switch Compatibility · vrandom

it does work, but isn’t supported by VMware, so obviously shouldn’t be utilized in production environments.

Julien_FR57
Contributor

Autodeploy is disabled

pastedImage_3.png

VUM is enabled.

pastedImage_4.png

The status of the NSX Service Deployment :

pastedImage_2.png

Here is the error message :

pastedImage_1.png

The vCenter WebUI is displaying "Cannot complete the operation : Agent Network(s) : "missing:'DistributedVirtualPortGroup:dvportgroup-70:405....934f" not available on host."

If we look at the EAM logs, we find this message:

2018-01-26T15:14:51.417Z | ERROR | host-2194-1 | AuditedJob.java | 75 | JOB FAILED: [#842713751] DeployVmJob(AgentImpl(ID:'Agent:7ed64a22-2873-4bce-b681-c6d6671697fb:null')), Cause:

com.vmware.eam.EamException: Can't provision VM for AgentImpl(ID:'Agent:7ed64a22-2873-4bce-b681-c6d6671697fb:null') due to AgentVmDatastore or AgentVmNetwork missing.

Here are the steps: the host is standalone, prepared, and in maintenance mode. It is inserted into the cluster and the agent installation succeeds, but as soon as the host exits maintenance mode, the error message appears.

pastedImage_0.png

The Guest Introspection VM is not deployed on the new host.

Julien_FR57
Contributor

I found this VMware KB article: VMware Knowledge Base

I wonder whether my issue is something similar, even though I am running VMware 6.5.

The error message indicates that dvportgroup-70 is missing and is not available on the host.

The problem is that this dvportgroup-70 does not exist anywhere.

Maybe I should apply the same procedure described in the KB.
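
A quick way to confirm that the reference in the error is stale is to extract the object type and ID from the message and compare the ID with the dvportgroup IDs that vCenter actually lists. Here is a minimal Python sketch of that comparison. The regular expression is based only on the one error string quoted in this thread, the list of known IDs is hypothetical sample data, and `missing_reference` and `is_stale` are made-up helper names.

```python
import re
from typing import Iterable, Optional, Tuple

# Matches the start of "missing:'<Type>:<id>:..." as seen in the vCenter error.
MISSING_REF = re.compile(r"missing:'(\w+):([\w-]+)")


def missing_reference(message: str) -> Optional[Tuple[str, str]]:
    """Return (object type, object id) from the error text, or None if absent."""
    match = MISSING_REF.search(message)
    return (match.group(1), match.group(2)) if match else None


def is_stale(ref_id: str, known_ids: Iterable[str]) -> bool:
    """True when the referenced ID is not among the IDs currently in the inventory."""
    return ref_id not in set(known_ids)


error = ("Cannot complete the operation : Agent Network(s) : "
         "\"missing:'DistributedVirtualPortGroup:dvportgroup-70:405....934f\" "
         "not available on host.")

ref = missing_reference(error)
# Hypothetical IDs of the dvportgroups that exist today.
known = ["dvportgroup-101", "dvportgroup-102"]
if ref is not None:
    print(ref, "stale" if is_stale(ref[1], known) else "present")
```

If the ID is reported as stale, the agency configuration is pointing at an object that no longer exists in the inventory, which matches the symptom described here.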
