VMware Cloud Community
cgaya
Contributor

Machines in failed deployments have a status of Missing

We've noticed that machines that are part of a failed deployment show up as Missing in the vRA UI. To reproduce:

  1. Submit any request that deploys a vSphere VM, and make it fail (I force this by having one of my EBS subscriptions throw an error)
  2. Note the name of the VM
  3. After the request fails, go to Service Broker, Resources tab, Virtual Machines and search for the VM. The VM State column will show as Missing even though the deployment has failed and vRA has deleted the machine from vSphere (as expected).

You can also observe that the deployment itself still has an expiry date, its lease can still be extended, and the other deployment Day 2 actions are still available. The only Day 2 action available on the VM is Delete.

When viewing the vRA inventory, this behavior makes it difficult to determine whether a machine is truly missing or whether it's just an orphaned record from a failed deployment. To work around this, I have tried the following methods to either delete the VM object or set the lease on the deployment to 1 day, so that failed machines don't stick around with Missing status until the lease runs out:

1. In our EBS error handler, delete the VM via VraMachineService.deleteMachine(). This throws an error, presumably because the deployment is still in progress. Fair enough.

2. In our EBS error handler, add a tag to the VM. A lease policy watches for this tag and sets the lease to 1 day for deployments with machines matching it. Code snippet:

// tagsArray and machine are defined earlier in the error handler
var updateMachineSpec = new VraUpdateMachineSpecification();
updateMachineSpec.tags = tagsArray;
var machineService = vra.vraHost.createInfrastructureClient().createMachineService();
// updateMachine returns the updated machine object
machine = machineService.updateMachine(machine, updateMachineSpec);

This code seems to work: the returned machine object has the new tag along with all previous tags on the VM. However, the tag never appears on the VM object in the vRA UI, and my lease policy never applies to the VM!

3. Have an EBS subscription that either deletes the VM OR adds a tag to the VM AFTER the deployment has failed, using the code below:

// Find all machines related to the deploymentId
var deploymentFilter = "deploymentId eq '" + inputProperties.deploymentId + "'";
var machines = VraEntitiesFinder.getMachines(vRAHost, deploymentFilter);
machines.forEach(function (machine) {
    // try to add a tag to each machine or delete each machine here
});

The problem here is that VraEntitiesFinder no longer returns any machines after the deployment has completed with a failed status. As a result we never get inside the machines.forEach() function. This is despite the fact that through the REST API I can still see the machine resources associated with the deployment at the /deployment/api/resources/{resourceId} endpoint and can still execute day 2 actions against those resources.
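Since the REST API still returns the resources, one possible fallback is to list them over REST instead of through VraEntitiesFinder. The following is only a minimal sketch under assumptions: httpGet is a hypothetical helper you would supply yourself (for example, a wrapper around a vRO REST host call that returns the parsed JSON body), the /deployment/api/deployments/{deploymentId}/resources path and the paged "content" array are assumed from the Deployment API and should be checked against your vRA version, and "Cloud.vSphere.Machine" is assumed to be the resource type of the VM.

```javascript
// Sketch only. httpGet is a hypothetical helper: it takes a path and
// returns the parsed JSON response body.
function listDeploymentResources(httpGet, deploymentId) {
    // Assumed endpoint; verify against your vRA version's API docs.
    var path = "/deployment/api/deployments/" +
        encodeURIComponent(deploymentId) + "/resources";
    var body = httpGet(path);
    // Paged responses are assumed to carry their items in "content".
    return (body && body.content) ? body.content : [];
}

// Keep only machine resources; the type string is an assumption.
function filterMachines(resources) {
    return resources.filter(function (resource) {
        return resource.type === "Cloud.vSphere.Machine";
    });
}
```

The resource ids returned this way could then be used with the /deployment/api/resources/{resourceId} endpoint mentioned above.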

4. Have an EBS subscription that fires after a deployment is complete and changes the lease to 1 day when the deployment has failed. This DOES work. However, since the UI only shows the state of the most recent request on a deployment, you can no longer tell that the deployment failed unless you drill down into the deployment history. I think having users not be able to see that a deployment has failed in the UI is even worse than missing machines in the inventory, so this is a no from me.

5. Add tags to the VM via day 2 action (with a lease policy targeting the tags). Same drawbacks as point 4.

6. Update the deployment name with text that is targeted by a lease policy. Same drawbacks as 4.

If anyone else has any ideas then please let me know.

Thanks!

3 Replies
emacintosh
Hot Shot

So if you don't need the deployment, would you consider just deleting it at that point? Or do you want it to be there in a failed state but without the missing machine resource?

Wasn't sure if you had other resources in the deployment that may need to stick around.

cgaya
Contributor

I did think about deleting the deployment, but then we'd lose the ability to review the inputs and error message. No other resources; this is a single-VM blueprint.

emacintosh
Hot Shot

If you are using vRO for your subscriptions, then both the failure message and the request inputs should be available to you in the inputProperties input on your workflow (for the deployment completed topic at least). So you could have a workflow that deletes the deployment (probably asynchronously), and you would still have the error message and request inputs via that workflow run. Not sure if the same is true for ABX actions.
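As a rough sketch of that idea, the workflow could capture the details before it deletes anything. The field names used here (status, failureMessage, requestInputs) are assumptions about the deployment completed payload, and buildFailureRecord is just an illustrative helper name, so check them against the inputProperties you actually receive.

```javascript
// Sketch only: status, failureMessage and requestInputs are assumed
// field names from the deployment completed event payload.
function buildFailureRecord(inputProperties) {
    if (!inputProperties || inputProperties.status !== "FAILED") {
        return null; // nothing to record unless the deployment failed
    }
    return {
        deploymentId: inputProperties.deploymentId,
        failureMessage: inputProperties.failureMessage || "",
        requestInputs: inputProperties.requestInputs || {}
    };
}

// In a vRO scriptable task you might log the record before deleting:
// var record = buildFailureRecord(inputProperties);
// if (record) { System.log(JSON.stringify(record)); }
```

With the record logged in the workflow run (or shipped to another log system), the inputs and error message remain reviewable after the deployment is gone.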

As a side note, we also subscribe event topics to workflows that simply log those types of properties to our enterprise logging system. If you have something similar available, it can help in scenarios like this.
