VMware Networking Community
lagsam
Contributor

NSX 6.4 DFW -- SCCM imaging very slow or never completes

After migrating the SCCM VM to our new platform (Synergy 12000 ProLiant Gen10, vSphere 6.7, and NSX 6.4), SCCM imaging is very slow or never completes.

Putting the SCCM VM on the DFW exclusion list fixes the problem, but this is only a temporary solution.

I created a timeout settings configuration and tripled all of the values, but that did not resolve the issue.

We have reports (to be verified) that another migrated application is also slow.

VMware ESXi 6.7 Update 1 seems to have the fix.

  • ESXi 6.7 Update 1 adds a Microsemi Smart PQI (smartpqi) LSU plug-in to support attached disk management operations on the HPE ProLiant Gen10 Smart Array Controller.
  • ESXi 6.7 Update 1 adds Quick Boot support for Intel i40en and ixgben Enhanced Network Stack (ENS) drivers, and extends support for HPE ProLiant and Synergy servers. For more information, see VMware knowledge base article 52477.

I would just like to know whether anyone has had a similar issue with any application running in an NSX environment.

Thanks

4 Replies
mdac
Enthusiast

A couple of questions:

1. What version of NSX are you running?

2. Does the SCCM VM have more than one vNIC in the same portgroup/subnet? This is important as it could be causing asymmetric filtering on the DFW.

3. Does the SCCM VM connect to a logical switch (VXLAN-backed portgroup) or a VLAN-backed portgroup?

4. Does the SCCM VM have to route through an ESG for imaging purposes?
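
To illustrate why question 2 matters, here is a minimal toy model in Python. It assumes each vNIC gets its own stateful filter with its own connection table, and that a reply is only passed if that same filter saw the outbound packet. The class, function names, and addresses are hypothetical; this is not NSX code.

```python
from dataclasses import dataclass, field

# Toy model only: one stateful filter per vNIC, each with a private
# connection table. All names here are made up for illustration.


@dataclass
class VnicFilter:
    name: str
    conns: set = field(default_factory=set)

    def outbound(self, flow):
        """The VM sends a packet out of this vNIC: remember the flow."""
        self.conns.add(flow)

    def inbound_reply_allowed(self, flow):
        """A reply passes only if this filter saw the outbound packet."""
        return flow in self.conns


def reply_passes(out_vnic, in_vnic, flow):
    """Send on one vNIC, receive the reply on another (or the same) vNIC."""
    out_vnic.outbound(flow)
    return in_vnic.inbound_reply_allowed(flow)


# Made-up flow: (source IP, source port, destination IP, destination port)
flow = ("10.0.0.10", 49152, "10.0.0.50", 445)
nic0, nic1 = VnicFilter("vnic0"), VnicFilter("vnic1")

symmetric = reply_passes(nic0, nic0, flow)   # same vNIC in both directions
asymmetric = reply_passes(nic0, nic1, flow)  # reply arrives on the other vNIC
```

In this model the symmetric case passes and the asymmetric case does not, because the second filter has no state for the flow. With a single vNIC this situation cannot arise.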

I noticed a mention of an i40-based Intel adapter. There is a known issue with certain NIC types and software LRO (see KB 57993). I don't think that's the issue here, though, because you mentioned that things improved with the DFW turned off.

Thanks,

Mike

My blog: https://vswitchzero.com Follow me on Twitter: @vswitchzero
lagsam
Contributor

Mike

1. NSX version 6.4.1

2. SCCM VM has only one vNIC

3. SCCM is on a VLAN-backed portgroup (we also tried it on VXLAN, with the same result)

4. SCCM VM traffic is not yet routed through an ESG; it is currently on L2 because we have not migrated our DG to VXLAN

If we put the SCCM VM on the DFW exclusion list, it works.

One more point:

Our Synergy 12000 Gen10 currently has a problem with the QLogic qfle3 driver that results in PSODs.

VMware recommended using the bnx2x driver until HPE fixes the issue with the QLogic driver.

Thank you for your reply.

mdac
Enthusiast

Interesting. Roughly how many firewall rules are defined? There is an inherent performance cost associated with the DFW, but under normal circumstances it should not cause what you describe. If you do have a lot of rules, try creating an allow-all-services rule for the SCCM VM at the very top of the DFW. I'd be curious to see whether simply having the dvfilter applied is the problem, or whether it is something specific to filtering performance.
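
The reasoning behind the top-of-list rule can be sketched with a toy first-match evaluator in Python. It assumes rules are checked top-down and the first match wins. The `Rule` fields, the rule names, and the count of 500 filler rules are all hypothetical, and the real DFW lookup may not cost time in proportion to the number of rules examined.

```python
from dataclasses import dataclass

# Toy first-match rule table. Not the NSX data model; every name and
# value below is an assumption made for illustration.


@dataclass
class Rule:
    name: str
    applies_to: str  # VM name, or "any"
    service: str     # e.g. "tcp/445", or "any"
    action: str      # "allow" or "block"


def evaluate(rules, vm, service):
    """Return (rule name, action, rules examined) for the first matching rule."""
    for examined, rule in enumerate(rules, start=1):
        if rule.applies_to in ("any", vm) and rule.service in ("any", service):
            return rule.name, rule.action, examined
    return None, "block", len(rules)


filler = [Rule(f"rule-{i}", f"vm-{i}", "any", "allow") for i in range(500)]
default = Rule("default", "any", "any", "allow")
sccm_top = Rule("sccm-allow-all", "sccm", "any", "allow")

before = evaluate(filler + [default], "sccm", "tcp/445")
after = evaluate([sccm_top] + filler + [default], "sccm", "tcp/445")
```

In this model, `before` falls through every filler rule to the default rule, while `after` matches on the first rule. If imaging is still slow with such a rule in place, rule lookup is less likely to be the cause and the filter itself becomes the suspect.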

You might want to open an SR with GSS so they can have a look. A packet capture taken while the VM is subject to the DFW (not on the exclusion list) may help.
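
One way to use such captures is to compare what enters the filter with what leaves it. The Python sketch below assumes two captures were taken, one before and one after the DFW filter, and that each was already parsed into hashable per-packet tuples. The capture and parsing steps are not shown, and the function name and tuple layout are hypothetical.

```python
from collections import Counter


def missing_after_filter(pre, post):
    """Return packets seen before the filter but not after it.

    Works as a multiset difference, so a packet that appears twice before
    the filter and once after it is reported once.
    """
    remaining = Counter(post)
    dropped = []
    for pkt in pre:
        if remaining[pkt] > 0:
            remaining[pkt] -= 1
        else:
            dropped.append(pkt)
    return dropped


# Made-up packets: (source IP, destination IP, source port, destination port, TCP seq)
pre_filter = [
    ("10.0.0.50", "10.0.0.10", 445, 49152, 1000),
    ("10.0.0.50", "10.0.0.10", 445, 49152, 2460),
    ("10.0.0.50", "10.0.0.10", 445, 49152, 3920),
]
post_filter = [
    ("10.0.0.50", "10.0.0.10", 445, 49152, 1000),
    ("10.0.0.50", "10.0.0.10", 445, 49152, 3920),
]

dropped = missing_after_filter(pre_filter, post_filter)
```

Here `dropped` contains the one packet that never made it through. An empty result across the slow phase would suggest the filter is adding delay rather than dropping traffic.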

Regards,

Mike

lagsam
Contributor

We only have the default rules, so it is really wide open.

We were working with VMware support yesterday, and they ran four pktcap captures simultaneously while we were imaging a laptop. They couldn't find anything. The problem only occurs during the driver download phase of imaging; it doesn't occur during the initial and OS download phases. My colleague asked whether the Guest Introspection part of the DFW is inspecting the individual driver files during the download. Although we are not yet using Guest Introspection, we wanted to ask GSS whether the Guest Introspection service does anything even when it is not configured.

We also turned off SpoofGuard. Same problem.
