Problem VMs synchronization

vXav · ‎04-04-2019

Does anyone have information about event IDs 700/701 in the ADAM (VMware VDMDSG) event log, category "online defragmentation", on connection servers?

These events indicate that the ADAM database is being defragmented.

I have 4 connection servers in the same group and only one of them reports these events. I guess it is some sort of master in the group?

I recently noticed something rather odd:

The session counts and events were synchronized on all 4 servers.
Servers A and B were reporting 0 problem VMs.
Server C was reporting 1 problem VM.
Server D (the one with the defrag events) was reporting 3 problem VMs.
The showrepl command reported successful replication on all 4 servers (it seems to always say successful though...).
I removed the 1 problem VM on server C -> it disappeared on server D.
I removed the other 2 problem VMs on server D -> ok.
Now it seems to be in sync.
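
As a rough illustration of the comparison described above, here is a minimal Python sketch that flags which servers disagree with the majority view. The server labels, VM names, and the `find_divergent_servers` helper are all hypothetical; the data would have to be collected by hand from each connection server's admin console, since this does not call any Horizon API.

```python
from collections import Counter


def find_divergent_servers(problem_vms_by_server):
    """Return {server: differing VM names} for servers that disagree
    with the most commonly reported set of problem VMs."""
    # Freeze each set so identical views can be counted together.
    frozen = {name: frozenset(vms) for name, vms in problem_vms_by_server.items()}
    majority, _ = Counter(frozen.values()).most_common(1)[0]
    # Symmetric difference: VMs seen on one side but not the other.
    return {
        name: sorted(vms ^ majority)
        for name, vms in frozen.items()
        if vms != majority
    }


# Hypothetical data mirroring the counts in the post (0, 0, 1 and 3 problem VMs).
reported = {
    "A": set(),
    "B": set(),
    "C": {"vm-1"},
    "D": {"vm-1", "vm-2", "vm-3"},
}

divergent = find_divergent_servers(reported)
for server, vms in sorted(divergent.items()):
    print(f"server {server} differs from the majority view: {vms}")
```

With the sample data, servers C and D are flagged, which matches the state described before the manual cleanup.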

It may be completely unrelated but could it be that this defrag messes with the replication?

Our configuration is pretty vanilla, but we had a lot of issues with Horizon in the past, including what we think were replication issues (we had to shut down the whole thing and reboot...).

I can't recall how many times I swore at Horizon.

Any ideas?

BenFB · ‎04-04-2019

Try following the KB article "Restart order of the View environment to clear ADLDS (ADAM) synchronization in Horizon View (2068381..." to clear any issues.

Which version of Horizon are you running? There were some VC cache issues in 7.4 that would lead to synchronization issues.

vXav · ‎04-04-2019

I am running 7.4 and I did have issues last year where an interruption of connectivity to the vCenter would break everything and only a shutdown of all the CS would fix it.

VMware then provided me with a hotfix that got rid of this. But shutting down everything is not easy as we have around 300 users at the moment and half of them in tunneling...

We are looking at upgrading to 7.8 but I am waiting on VMware to comment on this as we are using cloud pods and I don't want another horror story with Horizon.

BenFB · ‎04-08-2019

That's what I was referring to; we had to apply the same hot patch.

I would advise moving away from tunneling on the connection servers if possible. If you absolutely must maintain it, I would shift it to UAG, where it's easier to see active users and place them in quiesce mode for maintenance. I feel it's good hygiene to regularly follow the KB article "Restart order of the View environment to clear ADLDS (ADAM) synchronization in Horizon View (2068381...". Normally this is addressed by monthly patching of the Windows OS, but I suspect your tunneling configuration is making that difficult.

I would suggest engaging VMware if you feel clarification is still needed on the defrag.

sjesse · ‎04-08-2019

You're going to need an outage at some point to move to the UAG anyway. I personally don't see the security server being available for much longer, so you'll want to schedule both at the same time. The good news is that once you are using the UAG you can restart the connection servers independently, as the tunnel is moved from the connection server to the UAG.

I'd work with support directly about the mixed-version cloud pod; I'm going to be doing the same in the future. According to that other post, it looks like adding a fake global entitlement fixes it. I'd confirm with support so that you get a direct response, as that seems like a bug to me.

vXav · ‎04-09-2019

I do have to use tunneling on half the servers because access from workstations on the LAN to the VDI VLAN is blocked.

We could indeed have UAGs internally for LAN clients but it would be yet another bunch of servers and layer of load balancing, hence increasing the complexity (which is already up there).

By the way I wasn't aware of the quiesce mode, which sounds very useful.

Clearing the ADLDS every now and again sounds like a good idea but as you said the tunneling servers are in the way.

Did you upgrade your environment from 7.4? If you use cloud pods did you encounter issues?

BenFB · ‎04-09-2019

Depending on the number of connection servers you have, you might be able to reduce them by adding UAG. I would rather have groups of internal and external load-balanced UAGs instead of connection servers with tunneling. You have more visibility into who is actively using the UAG and can leverage quiesce mode to move them in and out of production.
