Hi there,
since some weeks we are experiencing increasingly issues with stucking deployments.
Not always but often this issue occures while new deployments, but more often while reprovision a VM. The affected VM is running and ready for login but stays in state "Finalized" instead of "On".
Therefore all Day2-Actions except of "Destroy" are inaccessable for enduser. We have a workaround Workflow, where a user can select his stucking VM and reset the status to "On" in Iaas DB. But this is not a solution, because this behaviour occures at about 80% of every reprovion task.
Does anyone have similar issues?
Attention: These Logs are upside down because they are exported from Log Insight. The newest lines are on top.
Here a badtrace from all.log ->
ATDEVCICPWS0031: Workflow upgrade completed
ATDEVCICPWS0031: VirtualMachineTracking: workflowId=[59ab1756-3c37-4714-a1dc-ce266bf4e0ca], state=[Finalized] is being upgraded, state plugins will not be called
ATDEVCICPWS0031: Not executing external workflows or any activities in state Requested while workflow is being upgraded.
ATDEVCICPWS0031: Not executing external workflows or any activities in state Requested while workflow is being upgraded.
ATDEVCICPWS0031: Waiting for workflow upgrade to complete ...
ATDEVCICPWS0031: VirtualMachineTracking: workflowId=[59ab1756-3c37-4714-a1dc-ce266bf4e0ca], state=[Requested] is being upgraded, state plugins will not be called
ATDEVCICPWS0031: Not executing external workflows or any activities in state Requested while workflow is being upgraded.
ATDEVCICPWS0031: Not executing external workflows or any activities in state Requested while workflow is being upgraded.
token="zzEvD0ay"] GetWorkItem AgentID:ESXVCSA12Q taskType: PowerOn Long Running: False Data Collection: False Can run long running data collection: True Can run long ru
token="cUQ9h25J"] Machine 59ab1756-3c37-4714-a1dc-ce266bf4e0ca: Name: ATDEVCICPWS0031, State: Finalized.
oken="cUQ9h25J"] ATDEVCICPWS0031: [EventBrokerService] Resuming workflow with id 59ab1756-3c37-4714-a1dc-ce266bf4e0ca on queue 6061669e-a75d-427a-b955-a85cd0dbcced. Sta
token="cUQ9h25J"] [EML001] ATDEVCICPWS0031: Email template file not found [C:\Program Files (x86)\VMware\vCAC\Server\Templates\ManagerPowerOn.xslt] for template [Manage
token="cUQ9h25J"] ATDEVCICPWS0031: Sent email to ----using template PowerOn
and here a snippet from good trace ->
Virtual Machine ATDEVCICPWS0031 is on
ATDEVCICPWS0031: [EventBrokerService] Resuming workflow with id 59ab1756-3c37-4714-a1dc-ce266bf4e0ca on queue a572343e-7232-436a-b84b-14766a0831a2. State
Machine 59ab1756-3c37-4714-a1dc-ce266bf4e0ca: Name: ATDEVCICPWS0031, State: On.
ATDEVCICPWS0031: sendEBSMessage56(workflow=59ab1756-3c37-4714-a1dc-ce266bf4e0ca) Sent event with id 5f6ec210-f13e-11e7-560a-31c639da8969
ATDEVCICPWS0031: sendEBSMessage56(workflow=59ab1756-3c37-4714-a1dc-ce266bf4e0ca) About to send an event with timeout (00:30:00) for state VMPSMasterWork
ATDEVCICPWS0031: sendEBSMessage56(workflow=59ab1756-3c37-4714-a1dc-ce266bf4e0ca) for state VMPSMasterWorkflow32.On phase PRE event
ATDEVCICPWS0031: [EventBrokerService] Resuming workflow with id 59ab1756-3c37-4714-a1dc-ce266bf4e0ca on queue 2de0d7f0-2069-4bc4-844b-ecdcf3d0a2ef. Stat
ATDEVCICPWS0031: sendEBSMessage55(workflow=59ab1756-3c37-4714-a1dc-ce266bf4e0ca) Sent event with id 5bc19890-f13e-11e7-560a-31c639da8969
ATDEVCICPWS0031: sendEBSMessage55(workflow=59ab1756-3c37-4714-a1dc-ce266bf4e0ca) About to send an event with timeout (00:30:00) for state VMPSMasterWorkf
ATDEVCICPWS0031: sendEBSMessage55(workflow=59ab1756-3c37-4714-a1dc-ce266bf4e0ca) for state VMPSMasterWorkflow32.TurningOn phase POST event
ATDEVCICPWS0031: [EventBrokerService] Resuming workflow with id 59ab1756-3c37-4714-a1dc-ce266bf4e0ca on queue 12706aac-e8c4-4eb5-ae6e-e541aeb56e82. Stat
[EML001] ATDEVCICPWS0031: Email template file not found [C:\Program Files (x86)\VMware\vCAC\Server\Templates\ManagerPowerOn.xslt] for template [ManagerP
ATDEVCICPWS0031: Sent email to ----- using template PowerOn
These are the last lines from log concerning this specific VM.
It seems to me IaaS is waiting for something. But between "PowerOn" and "PowerOn Long Running" are only a few seconds in logfile :smileyconfused:
What version of vRA is this? Has it always stalled on this many reprovisions before?
we are using version 7.3 (the latest build I think).
But now I think I'm a step further.
Some of my customization workflows in vRO crashed while parsing the JSON-Payload.
e.g.: Error in (Workflow:Deactivate ReProvisioning Flag / Scriptable task (item1)#12758) SyntaxError: Unexcpected character in string: '\s'
I found out if I start the reprovision via vRA GUI the property "Legacy.Workflow.User" looks like this -> user@domain.local
otherwise started via my bulk reprovisioning XaaS Workflow "Legacy.Workflow.User" uses a different notation -> domain\serviceuser
but...
It still doesn't work as expected.
I think it must still be a problem with domain user notation. Thats the only difference I'm aware between vRA GUI and Xaas Workflow.
That's what I was going to suggest next is your workflows may need debugging to see if they're behaving the same way as upon a new provision. In the case of your problem, it's looking at "\s" as a linefeed character and probably needs to be escaped or a scriptable task put in place to convert the username from DLL format to UPN format.
Now it works.
There was one more scriptable task in my last workflow I found with this "\s" error.