VMware Cloud Community
darrenoid
Enthusiast

Help Cleaning Up Requests Stuck In Progress

Hello,

We have been working on getting vRA 7 deployed. In the process, we have amassed some strange failures and requests that are stuck "In Progress". I am trying to figure out how to clean these up.

In the cases where installing software components hung and the deployment request is stuck "In Progress" forever, I was able to delete them using this reference: http://open902.com/vra7-delete-stuck-in-progress-deployments/.

The remaining cases are a little harder for me to figure out how to clean up. They are deployments that have completed, so the deployment exists as an item, but the machine object under the deployment no longer exists. This results in multiple expiry and destroy requests that never complete and stay in progress.

For example:

InProgress1.jpg

The deployments exist under Items:

InProgress2.jpg

But trying to destroy some of them results in an error:

The following component requests failed: Server2012R2Agent. Internal error in processing component request: [Rest Error]: {Status code: 502}, {Error code: 10107} , {Error Source: null}, {Error Msg: You cannot perform that action because the system cannot connect to the provider at https://VRAURLREDACTED/WAPI/.}, {System Msg: Provider service is not available or in error state.}

And some of them do not have an available action to destroy:

InProgress3.jpg

I would appreciate any advice on how to go about cleaning these up. I am thinking this may involve the IaaS database...

~ Darrenoid


Accepted Solutions
darrenoid
Enthusiast

Hello AstraMonti,

I did manage to get this cleaned up, but it was an ugly process. The requests were the easy part. I ran a query against the PostgreSQL database on the vRA appliance to change all requests that were in progress to failed. I did that because I am very familiar with my environment, I was positive the only in-progress requests were stuck ones, and I had a lot of them. This can also be done one at a time. Here is an example of that, showing the table and status I used:

postgres=# UPDATE cat_request SET state='PROVIDER_FAILED' where STATE='IN_PROGRESS' AND id='653194eb-bd83-4fce-b386-a0d52b6c5616' AND requestnumber='1321';
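A minimal sketch of how the one-at-a-time update above could be guarded so that it only commits when exactly one row changes. This is an illustration, not the procedure from the post: it uses an in-memory SQLite database as a stand-in so it can run anywhere, and the helper name `fail_stuck_request` and the demo rows are hypothetical. Only the table name `cat_request`, its `state`, `id`, and `requestnumber` columns, and the two state values come from the query above. Against the real appliance database you would use a PostgreSQL driver and its own parameter style.

```python
import sqlite3


def fail_stuck_request(conn, request_id, request_number):
    """Mark one IN_PROGRESS request as PROVIDER_FAILED.

    Commits only if exactly one row was changed; otherwise rolls back
    and returns False. (Hypothetical helper, for illustration only.)
    """
    cur = conn.cursor()
    cur.execute(
        "UPDATE cat_request SET state = 'PROVIDER_FAILED' "
        "WHERE state = 'IN_PROGRESS' AND id = ? AND requestnumber = ?",
        (request_id, request_number),
    )
    if cur.rowcount != 1:
        conn.rollback()  # nothing matched, or more matched than expected
        return False
    conn.commit()
    return True


# Stand-in database: SQLite in memory, NOT the real vRA PostgreSQL schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cat_request (id TEXT PRIMARY KEY, requestnumber INTEGER, state TEXT)"
)
conn.executemany(
    "INSERT INTO cat_request VALUES (?, ?, ?)",
    [
        ("653194eb-bd83-4fce-b386-a0d52b6c5616", 1321, "IN_PROGRESS"),
        ("hypothetical-other-request", 1322, "IN_PROGRESS"),
    ],
)
conn.commit()

changed = fail_stuck_request(conn, "653194eb-bd83-4fce-b386-a0d52b6c5616", 1321)
print("request updated:", changed)
```

Checking the affected row count before committing means a mistyped id or request number leaves the table untouched instead of silently doing nothing or changing the wrong rows.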

Now, the deployments that I could not destroy in vRA were the hard ones to remove. Before I could remove the deployments, I had to remove the machines associated with them. According to VMware, the new Cloud Client command-line tool should be able to do that for me:

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=21442...

For me it didn't work. When I ran the forceunregister command I got no errors, but nothing happened. The machine never got destroyed.

So I started looking into the IaaS part of this. The IaaS database stores the information shown under Infrastructure > Managed Machines in vRA. I noticed the machines were listed there, but with a weird status (I can't remember it exactly), something like in progress or deploying. One had a status of missing. I tried to delete them from that interface, and only the machine marked as missing was deleted.

So I logged into the IaaS server with the database and found the table where managed machine status info is stored. Sorry, I don't remember which table it was. I manually changed the status of the stuck VMs to "missing". After that I was able to delete the machines from Infrastructure > Managed Machines. Once the machines were deleted, I was able to destroy all of the requests.

Of all the stuck deployments I had, most had already been removed from vCenter.

Cheers,

Darrenoid

10 Replies
AstraMonti
Enthusiast

Hi Darrenoid,


Did you have any success solving this?


Thanks.

Michael_Rudloff
Enthusiast

Not having the option to destroy elements means some entitlements aren't set up correctly. I have seen this before, where the entitlement was named a certain way but didn't really do what its name said. It seems radical, but maybe try adding 'all' action entitlements and see if the option becomes available.

As for 'in progress' processes - have a look at my site - I wrote a how-to:

http://open902.com/vra7-delete-stuck-in-progress-deployments/

___ My own knowledge base made public: http://open902.com
darrenoid
Enthusiast

Hello Michael,

Thank you. It's funny, I included your URL in my original post. I had already read it, and it was a huge help for me in dealing with the requests stuck in progress.

Regards,

Darrenoid

cbaker01
Contributor

Thanks for figuring this out. I have been working with VMware support on this for over a month. Their only solution was to run a 6.2 fix, even though the KB states not to use it on a 7.0 environment. Is there any way you can go back and take screenshots of where you were?

pastedImage_0.png

As you can see, all the machines are stuck, and I have others that are not in here but show up in the Cloud Client. Some of them were used with expiring blueprints, so now I get about 100 emails a day for servers that have expired but that I can only see in the Cloud Client.

pastedImage_1.png

darrenoid
Enthusiast

Hello cbaker01,

I will try to poke around the IaaS database when I am back at work on Tuesday to see if I can retrace my steps for you.

Regards,

Darrenoid

darrenoid
Enthusiast

Hello cbaker01,

For the machines you have under Managed Machines, I would suggest checking the IaaS log to see what it says when you try to destroy them: C:\Program Files (x86)\VMware\vCAC\Server\Logs\All.log. When I was experiencing the issue, the log told me that it could not execute the workflow while the machine was in a certain state.
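All.log can be large, so a small filter helps when looking for what it says about one machine. The sketch below is only an illustration: the helper name `lines_mentioning` is hypothetical, it assumes nothing about the log's format beyond it being a text file, and the UTF-8 decoding (with undecodable bytes replaced) is an assumption about the encoding.

```python
def lines_mentioning(log_path, machine_name):
    """Return the lines of a text log that mention machine_name (case-insensitive)."""
    needle = machine_name.lower()
    matches = []
    # errors="replace" keeps the scan going if the file is not clean UTF-8.
    with open(log_path, encoding="utf-8", errors="replace") as log_file:
        for line in log_file:
            if needle in line.lower():
                matches.append(line.rstrip("\n"))
    return matches


# Example call on the IaaS server (path from the post; VM name is a placeholder):
# for line in lines_mentioning(
#     r"C:\Program Files (x86)\VMware\vCAC\Server\Logs\All.log", "VMNAME001"
# ):
#     print(line)
```

Running it right after a failed destroy attempt and reading the last few matches should surface the workflow error for that machine, if the log records one.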

If that is the case, you can change the state to one that allows deletion. I retraced my steps and found the table where that information is stored. It's in the IaaS database for vRA. The table is VirtualMachine and the field is VirtualMachineState. Here is an example of a SQL query from SQL Server Management Studio that lists a VM in that table by name and shows whether it is managed:

SQLMachineState1.jpg

Once you have located the right VM, you can change its status to missing like so:

UPDATE [vra].[dbo].[VirtualMachine]
SET VirtualMachineState = 'Missing'
FROM [vra].[dbo].[VirtualMachine]
WHERE VirtualMachineName = 'VMNAME001'
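A minimal sketch of the look-up-then-update sequence described above, written so that the previous state is recorded and nothing is changed unless exactly one VM matches the name. It is an illustration only: it runs against an in-memory SQLite stand-in rather than the real MSSQL IaaS database, the helper name `mark_vm_missing` is hypothetical, and the initial state value `PlaceholderStuckState` is made up because the actual stuck status is not given in the thread. The table name `VirtualMachine` and the columns `VirtualMachineName` and `VirtualMachineState` are the ones named in the post.

```python
import sqlite3


def mark_vm_missing(conn, vm_name):
    """Set VirtualMachineState to 'Missing' for exactly one VM.

    Returns the previous state so it can be noted down, or None if the
    name matched zero or several rows and nothing was changed.
    (Hypothetical helper, for illustration only.)
    """
    rows = conn.execute(
        "SELECT VirtualMachineState FROM VirtualMachine WHERE VirtualMachineName = ?",
        (vm_name,),
    ).fetchall()
    if len(rows) != 1:
        return None
    previous_state = rows[0][0]
    conn.execute(
        "UPDATE VirtualMachine SET VirtualMachineState = 'Missing' "
        "WHERE VirtualMachineName = ?",
        (vm_name,),
    )
    conn.commit()
    return previous_state


# Stand-in database: SQLite in memory, NOT the real IaaS schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE VirtualMachine (VirtualMachineName TEXT, VirtualMachineState TEXT)"
)
# 'PlaceholderStuckState' is a made-up value standing in for the unknown stuck status.
conn.execute("INSERT INTO VirtualMachine VALUES ('VMNAME001', 'PlaceholderStuckState')")
conn.commit()

previous = mark_vm_missing(conn, "VMNAME001")
print("previous state:", previous)
```

Keeping the previous state makes it possible to put the row back if deleting the machine from Managed Machines still fails afterwards.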

After that hopefully you can destroy the machine from the managed machines page. Then you should be able to delete the deployment itself from the Items tab.

As for the ones that are not listed under Managed Machines: deployments will never show up there, just machines. I am not sure what to do for those. Perhaps look in the appliance's PostgreSQL database for where they are still listed as "IN_PROGRESS", change them to "PROVIDER_FAILED", and see whether they then show up or can be removed with the Cloud Client forceunregister command. This is just a wild guess...

Good luck,

Darrenoid

Czernobog
Expert

Bumping this because I had a similar issue, and support pointed me to an updated version of the KB article. It includes a procedure to remove a missing VM using the Cloud Client (that did not work for me either), instructions to remove it from PostgreSQL, and a stored procedure for removing the VM from IaaS (see the attachment at the bottom of the KB):

Removing a virtual machine from vRealize Automation 7.x using Cloud Client (2144269) | VMware KB

aenagy
Hot Shot

Michael.Rudloff:

When I browse to this URL the message "This content is password protected. To view it please enter your password below:" is displayed and there is a text input field for password. How do we access the content?

JayhawkEric
Expert

There really needs to be a way to kill these "in progress" items via UI/CLI/API.  This is getting to be such a pain with all of the external systems we are integrating with.

FYI... If you upgrade to 7.3 you can now "Force Destroy" full deployments (as long as you want to kill off all VMs within them). We have done this multiple times lately. https://docs.vmware.com/en/vRealize-Automation/7.3/com.vmware.vra.prepare.use.doc/GUID-B96CACBF-4E6C...

VCP5-DV twitter - @ericblee6 blog - http://vEric.me