VMware Beta Community
jleavers1
Contributor

Deployment failure: unable to wait for post customization phase

Deployments are failing with the following error:

Aug 08 19:44:20 photon bash[2574]: {"level":"info","ts":1659987860.6753228,"caller":"repair/heartbeat.go:102","msg":"updating component [{errorSet {RecoveryError 0001-01-01 00:00:00 +0000 UTC urn:vcloud:entity:vmware:capvcdCluster:8319ce29-cba5-4f65-8f20-46d0207cb386  map[]} true}] in RDE: [test-1(urn:vcloud:entity:vmware:capvcdCluster:8319ce29-cba5-4f65-8f20-46d0207cb386)]"}
Aug 08 19:44:20 photon bash[2574]: {"level":"error","ts":1659987860.6754072,"caller":"app/main.go:584","msg":"error creating cluster [test-1(urn:vcloud:entity:vmware:capvcdCluster:8319ce29-cba5-4f65-8f20-46d0207cb386)] : [error while bootstrapping the machine [test-1/EPHEMERAL_TEMP_VM]; unable to wait for post customization phase [guestinfo.cloudinit.kind.cluster.creation.status] : [invalid postcustomization phase: [failed] for key [guestinfo.cloudinit.kind.cluster.creation.status] for vm [EPHEMERAL_TEMP_VM]]]","stacktrace":"main.processRDE\n\t/app/main.go:584"}
lzichong
VMware Employee

Hi jleavers1,

Could you share the values set for the default sizing policy? We recommend at least 2 cores and 4 GB RAM. The ephemeral VM is currently created from the default sizing policy. We've seen this issue from time to time; the most common causes have been the GitHub API rate limit and insufficient compute resources.

Thanks!

jleavers1
Contributor

Hi,

I also get the same error with a custom sizing policy. Do you know whether the issue described in "Failed deployments using TKGm on VCD · Issue #1351 · vmware/container-service-extension" (github.com) has been resolved in 4.0?

jleavers1
Contributor

It also fails immediately, i.e. all the log messages are from the same second:

Aug 11 10:26:04 photon bash[17962]: {"level":"info","ts":1660213564.434535,"caller":"kind/postCustomizationWaiter.go:80","msg":"Start: waiting for the bootstrapping phase [guestinfo.cloudinit.kind.cluster.creation.status] to complete"}
Aug 11 10:26:04 photon bash[17962]: {"level":"error","ts":1660213564.6452365,"caller":"cluster/clusterManager.go:238","msg":"error waiting for creation of cluster [test-2(urn:vcloud:entity:vmware:capvcdCluster:ac941b3a-571c-4a4e-b571-a5d4b6274d8a)]: [error while bootstrapping the machine [test-2/EPHEMERAL_TEMP_VM]; unable to wait for post customization phase [guestinfo.cloudinit.kind.cluster.creation.status] : [invalid postcustomization phase: [failed] for key [guestinfo.cloudinit.kind.cluster.creation.status] for vm [EPHEMERAL_TEMP_VM]]]","stacktrace":"gitlab.eng.vmware.com/core-build/vcd-k8s-provider/src/cluster.CreateCluster\n\t/app/src/cluster/clusterManager.go:238\nmain.processRDE\n\t/app/main.go:566"}
Aug 11 10:26:04 photon bash[17962]: {"level":"info","ts":1660213564.6459718,"caller":"repair/heartbeat.go:102","msg":"updating component [{errorSet {ScriptExecutionError 2022-08-11 10:26:04.645883229 +0000 UTC m=+293.497732785 urn:vcloud:vm:c5f20465-3c43-4c4c-9628-3df7a79b4dbf map[Detailed Error:[error while bootstrapping the machine [test-2/EPHEMERAL_TEMP_VM]; unable to wait for post customization phase [guestinfo.cloudinit.kind.cluster.creation.status] : [invalid postcustomization phase: [failed] for key [guestinfo.cloudinit.kind.cluster.creation.status] for vm [EPHEMERAL_TEMP_VM]]] during cluster creation]} false}] in RDE: [test-2(urn:vcloud:entity:vmware:capvcdCluster:ac941b3a-571c-4a4e-b571-a5d4b6274d8a)]"}
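
For anyone sifting through entries like these: each line is a syslog-style prefix followed by a JSON object, so the fields can be pulled apart mechanically. Below is a minimal Python sketch. The names `parse_cse_line` and `errors_only` are hypothetical (they are not part of CSE), and the sketch assumes the JSON payload starts at the first `{` on the line and that `ts` is Unix epoch seconds, which matches the timestamps in the lines above.

```python
import json
from datetime import datetime, timezone


def parse_cse_line(line):
    """Parse one journal line into a dict, or return None if it has no JSON payload.

    Assumption: the payload starts at the first '{' on the line.
    """
    start = line.find("{")
    if start == -1:
        return None
    try:
        record = json.loads(line[start:])
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    # Assumption: "ts" is Unix epoch seconds.
    if isinstance(record.get("ts"), (int, float)):
        record["time_utc"] = datetime.fromtimestamp(
            record["ts"], tz=timezone.utc
        ).strftime("%Y-%m-%d %H:%M:%S")
    return record


def errors_only(lines):
    """Return the parsed records whose level is "error"."""
    records = (parse_cse_line(line) for line in lines)
    return [r for r in records if r and r.get("level") == "error"]


# First log line from the post above, copied verbatim.
SAMPLE = (
    'Aug 11 10:26:04 photon bash[17962]: {"level":"info","ts":1660213564.434535,'
    '"caller":"kind/postCustomizationWaiter.go:80","msg":"Start: waiting for the '
    'bootstrapping phase [guestinfo.cloudinit.kind.cluster.creation.status] to complete"}'
)

record = parse_cse_line(SAMPLE)
print(record["time_utc"], record["level"], record["caller"])
```

Feeding the whole journal through `errors_only` leaves just the error records, whose `msg` and `stacktrace` fields are then easier to read than the raw lines.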

agoel
VMware Employee

Hello jleavers1

https://github.com/vmware/container-service-extension/issues/1351 - The VM reboot issue caused by the cloud-init script failure is resolved in CSE 3.1.4, and the fix is also in CSE 4.0 Beta.

If you are seeing the issue in CSE 4.0 Beta, it would definitely be helpful to triage it with us.

Thanks.

lzichong
VMware Employee

Hi jleavers1,

Thanks for the feedback. As the error is quite vague, the only way to troubleshoot further is to log into the Ephemeral VM and look at the log at `/var/log/cloud-final.err`. In the current beta, on any script error the Ephemeral VM is deleted and cleaned up, and a new one is started. For GA, we have added an option to disable auto repair so that you can go into the VM to troubleshoot, similar to rollback in CSE 3.x. Also for GA, the error message and the log messages will give more detail on exactly what failed.

The current workaround to troubleshoot and keep the Ephemeral VM from being deleted is to stop the CSE service or power off the CSE 4.0 server VM while the script is executing. Because the stage you are failing at is quite early in the script, you can turn off the server once you see one or two successful ScriptExecutionEvents in the UI. Once the script has started, it should continue to execute even with CSE 4.0 powered off.

You can log in to the Ephemeral VM and monitor the script in real time with `tail -f /var/log/cloud-final.err`; the output should show exactly why the script execution failed.
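
If the log is long, the phase that failed can also be picked out mechanically. The Python sketch below assumes the bootstrap script runs with shell tracing enabled and records each phase through `vmtoolsd --cmd 'info-set <key> <status>'`, which is the line format seen in the `cloud-final.err` output posted in this thread. The function names are hypothetical and not part of CSE.

```python
import re

# Matches shell-trace lines such as:
#   + vmtoolsd --cmd 'info-set guestinfo.cloudinit.kind.binary.install.status successful'
INFO_SET = re.compile(r"^\++\s+vmtoolsd --cmd 'info-set (\S+) (.*)'\s*$")


def phase_statuses(log_text):
    """Return {guestinfo key: last value set}, keyed in order of first appearance."""
    statuses = {}
    for line in log_text.splitlines():
        match = INFO_SET.match(line)
        if match:
            key, value = match.groups()
            statuses[key] = value  # a later "failed" overwrites "in_progress"
    return statuses


def failed_phases(statuses):
    """Return the keys whose final recorded status is "failed"."""
    return [key for key, value in statuses.items() if value == "failed"]


# Trace lines copied from the cloud-final.err output in this thread.
SAMPLE_TRACE = """\
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.kind.binary.install.status successful'
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.clusterctl.binary.install.status in_progress'
++ vmtoolsd --cmd 'info-set guestinfo.cloudinit.clusterctl.binary.install.status failed'
++ vmtoolsd --cmd 'info-set guestinfo.cloud_init_script_execution_status 8'
"""

statuses = phase_statuses(SAMPLE_TRACE)
print(failed_phases(statuses))
```

Reading the file with `open("/var/log/cloud-final.err").read()` and passing the text to `phase_statuses` would give the same summary for a full log.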

Please let us know whether you can get into the Ephemeral VM, and post the logs if possible.

Thanks!

jleavers1
Contributor

Hi,

Thanks for the reply. I stopped the CSE service during deployment and then got the contents of /var/log/cloud-final.err from the VM:

root@ubuntu-2004-001:~# cat /var/log/cloud-final.err
Cloud-init v. 21.1-19-gbad84ad4-0ubuntu1~20.04.2 running 'modules:final' at Mon, 15 Aug 2022 16:05:40 +0000. Up 8.38 seconds.
+ export COMPONENT_YAML_DOWNLOAD=guestinfo.cloudinit.provider.infra.components.yaml.download.status
+ COMPONENT_YAML_DOWNLOAD=guestinfo.cloudinit.provider.infra.components.yaml.download.status
+ export METADATA_YAML_DOWNLOAD=guestinfo.cloudinit.provider.metadata.yaml.download.status
+ METADATA_YAML_DOWNLOAD=guestinfo.cloudinit.provider.metadata.yaml.download.status
+ export KIND_BINARY_INSTALL=guestinfo.cloudinit.kind.binary.install.status
+ KIND_BINARY_INSTALL=guestinfo.cloudinit.kind.binary.install.status
+ export CLUSTERCTL_BINARY_INSTALL=guestinfo.cloudinit.clusterctl.binary.install.status
+ CLUSTERCTL_BINARY_INSTALL=guestinfo.cloudinit.clusterctl.binary.install.status
+ export KIND_CLUSTER_CREATION=guestinfo.cloudinit.kind.cluster.creation.status
+ KIND_CLUSTER_CREATION=guestinfo.cloudinit.kind.cluster.creation.status
+ export KIND_CLUSTER_CAPVCD_INSTALL=guestinfo.cloudinit.kind.cluster.capvcd.install.status
+ KIND_CLUSTER_CAPVCD_INSTALL=guestinfo.cloudinit.kind.cluster.capvcd.install.status
+ export ANTREA_MANIFEST_DOWNLOAD=guestinfo.cloudinit.antrea.manifest.download.status
+ ANTREA_MANIFEST_DOWNLOAD=guestinfo.cloudinit.antrea.manifest.download.status
+ export ANTREA_CRS_INSTALL_STATUS=guestinfo.cloudinit.antrea.crs.install.status
+ ANTREA_CRS_INSTALL_STATUS=guestinfo.cloudinit.antrea.crs.install.status
+ export KIND_CLUSTER_CAPVCD_READY=guestinfo.cloudinit.kind.cluster.capvcd.ready.status
+ KIND_CLUSTER_CAPVCD_READY=guestinfo.cloudinit.kind.cluster.capvcd.ready.status
+ export KIND_CLUSTER_CREATE_TARGET_CLUSTER=guestinfo.cloudinit.kind.cluster.capi.yaml.apply.status
+ KIND_CLUSTER_CREATE_TARGET_CLUSTER=guestinfo.cloudinit.kind.cluster.capi.yaml.apply.status
+ export TARGET_CLUSTER_READY=guestinfo.cloudinit.target.cluster.ready.status
+ TARGET_CLUSTER_READY=guestinfo.cloudinit.target.cluster.ready.status
+ export TARGET_CLUSTER_GET_KUBECONFIG=guestinfo.cloudinit.target.cluster.get.kubeconfig.status
+ TARGET_CLUSTER_GET_KUBECONFIG=guestinfo.cloudinit.target.cluster.get.kubeconfig.status
+ export TARGET_CLUSTER_INSTALL_COMPONENT=guestinfo.cloudinit.target.cluster.capvcd.init.status
+ TARGET_CLUSTER_INSTALL_COMPONENT=guestinfo.cloudinit.target.cluster.capvcd.init.status
+ export TARGET_CLUSTER_SELF_MANAGE=guestinfo.cloudinit.target.cluster.clusterctl.move.status
+ TARGET_CLUSTER_SELF_MANAGE=guestinfo.cloudinit.target.cluster.clusterctl.move.status
+ export TARGET_CLUSTER_DOWNLOAD_TANZU_CRS=guestinfo.cloudinit.target.cluster.tanzu.crs.download.status
+ TARGET_CLUSTER_DOWNLOAD_TANZU_CRS=guestinfo.cloudinit.target.cluster.tanzu.crs.download.status
+ export TARGET_CLUSTER_INSTALL_TANZU_CRS=guestinfo.cloudinit.target.cluster.tanzu.crs.install.status
+ TARGET_CLUSTER_INSTALL_TANZU_CRS=guestinfo.cloudinit.target.cluster.tanzu.crs.install.status
+ export TARGET_CLUSTER_TRUE_SCALE=guestinfo.cloudinit.target.cluster.full_capi.apply.status
+ TARGET_CLUSTER_TRUE_SCALE=guestinfo.cloudinit.target.cluster.full_capi.apply.status
+ export INSTALL_RDE_PROJECTOR=guestinfo.cloudinit.target.cluster.install.rde.projector.status
+ INSTALL_RDE_PROJECTOR=guestinfo.cloudinit.target.cluster.install.rde.projector.status
+ export CREATE_RDE_PROJECTOR_INSTANCE=guestinfo.cloudinit.target.cluster.create.rde.projector.instance.status
+ CREATE_RDE_PROJECTOR_INSTANCE=guestinfo.cloudinit.target.cluster.create.rde.projector.instance.status
+ mkdir -p /root/infrastructure-vcd/v1.0.0
+ export CURRENT_STATE=guestinfo.cloudinit.provider.infra.components.yaml.download.status
+ CURRENT_STATE=guestinfo.cloudinit.provider.infra.components.yaml.download.status
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.provider.infra.components.yaml.download.status in_progress'
+ wget -O /root/infrastructure-vcd/v1.0.0/infrastructure-components.yaml https://raw.githubusercontent.com/vmware/cluster-api-provider-cloud-director/vkp/infrastructure-vcd/v1.0.0/infrastructure-components.yaml
--2022-08-15 16:07:58--  https://raw.githubusercontent.com/vmware/cluster-api-provider-cloud-director/vkp/infrastructure-vcd/v1.0.0/infrastructure-components.yaml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.110.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 44716 (44K) [text/plain]
Saving to: ‘/root/infrastructure-vcd/v1.0.0/infrastructure-components.yaml’

     0K .......... .......... .......... .......... ...       100% 9.58M=0.004s

2022-08-15 16:08:03 (9.58 MB/s) - ‘/root/infrastructure-vcd/v1.0.0/infrastructure-components.yaml’ saved [44716/44716]

+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.provider.infra.components.yaml.download.status successful'
+ export CURRENT_STATE=guestinfo.cloudinit.provider.metadata.yaml.download.status
+ CURRENT_STATE=guestinfo.cloudinit.provider.metadata.yaml.download.status
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.provider.metadata.yaml.download.status in_progress'
+ wget -O /root/infrastructure-vcd/v1.0.0/metadata.yaml https://raw.githubusercontent.com/vmware/cluster-api-provider-cloud-director/vkp/infrastructure-vcd/v1.0.0/metadata.yaml
--2022-08-15 16:08:05--  https://raw.githubusercontent.com/vmware/cluster-api-provider-cloud-director/vkp/infrastructure-vcd/v1.0.0/metadata.yaml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.110.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 165 [text/plain]
Saving to: ‘/root/infrastructure-vcd/v1.0.0/metadata.yaml’

     0K                                                       100% 21.2M=0s

2022-08-15 16:08:06 (21.2 MB/s) - ‘/root/infrastructure-vcd/v1.0.0/metadata.yaml’ saved [165/165]

+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.provider.metadata.yaml.download.status successful'
+ export CURRENT_STATE=guestinfo.cloudinit.kind.binary.install.status
+ CURRENT_STATE=guestinfo.cloudinit.kind.binary.install.status
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.kind.binary.install.status in_progress'
++ uname
+ wget -O /usr/local/bin/kind https://kind.sigs.k8s.io/dl/v0.14.0/kind-Linux-amd64
--2022-08-15 16:08:08--  https://kind.sigs.k8s.io/dl/v0.14.0/kind-Linux-amd64
Resolving kind.sigs.k8s.io (kind.sigs.k8s.io)... 34.159.25.198, 34.159.132.250, 2a05:d014:275:cb00:ec0d:12e2:df27:aa60, ...
Connecting to kind.sigs.k8s.io (kind.sigs.k8s.io)|34.159.25.198|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://github.com/kubernetes-sigs/kind/releases/download/v0.14.0/kind-Linux-amd64 [following]
--2022-08-15 16:08:08--  https://github.com/kubernetes-sigs/kind/releases/download/v0.14.0/kind-Linux-amd64
Resolving github.com (github.com)... 140.82.121.4
Connecting to github.com (github.com)|140.82.121.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/148545807/f7cce986-7744-41e7-ab52-56e1c7b37102?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20220815%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220815T160808Z&X-Amz-Expires=300&X-Amz-Signature=f9055e450d40cb7c858b176a42544588d59748d8e087a78399022498db62a606&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=148545807&response-content-disposition=attachment%3B%20filename%3Dkind-linux-amd64&response-content-type=application%2Foctet-stream [following]
--2022-08-15 16:08:08--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/148545807/f7cce986-7744-41e7-ab52-56e1c7b37102?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20220815%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220815T160808Z&X-Amz-Expires=300&X-Amz-Signature=f9055e450d40cb7c858b176a42544588d59748d8e087a78399022498db62a606&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=148545807&response-content-disposition=attachment%3B%20filename%3Dkind-linux-amd64&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.109.133, 185.199.110.133, 185.199.111.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6658458 (6.3M) [application/octet-stream]
Saving to: ‘/usr/local/bin/kind’

     0K .......... .......... .......... .......... ..........  0% 40.4M 0s
     [... wget progress lines for 50K through 6450K omitted ...]
  6500K ..                                                    100% 4578G=0.1s

2022-08-15 16:08:09 (52.3 MB/s) - ‘/usr/local/bin/kind’ saved [6658458/6658458]

+ chmod +x /usr/local/bin/kind
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.kind.binary.install.status successful'
+ export CURRENT_STATE=guestinfo.cloudinit.clusterctl.binary.install.status
+ CURRENT_STATE=guestinfo.cloudinit.clusterctl.binary.install.status
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.clusterctl.binary.install.status in_progress'
++ uname
+ wget -O /usr/local/bin/clusterctl https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.1.3/clusterctl-Linux-amd64
--2022-08-15 16:08:11--  https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.1.3/clusterctl-Linux-amd64
Resolving github.com (github.com)... 140.82.121.4
Connecting to github.com (github.com)|140.82.121.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/124157517/eb64ec78-1a74-481a-a98a-64640115bdaa?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20220815%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220815T160811Z&X-Amz-Expires=300&X-Amz-Signature=5973f5e5c97916ec9cc83f8a22fd424f150affef4aec6fde67f2116a6fa6fb8e&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=124157517&response-content-disposition=attachment%3B%20filename%3Dclusterctl-linux-amd64&response-content-type=application%2Foctet-stream [following]
--2022-08-15 16:08:11--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/124157517/eb64ec78-1a74-481a-a98a-64640115bdaa?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20220815%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220815T160811Z&X-Amz-Expires=300&X-Amz-Signature=5973f5e5c97916ec9cc83f8a22fd424f150affef4aec6fde67f2116a6fa6fb8e&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=124157517&response-content-disposition=attachment%3B%20filename%3Dclusterctl-linux-amd64&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.108.133, 185.199.111.133, 185.199.110.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 503 Egress is over the account limit.
2022-08-15 16:08:11 ERROR 503: Egress is over the account limit..

++ catch 8 61
++ retval=8
+++ date
+++ caller
++ error_message='Mon 15 Aug 2022 04:08:11 PM UTC 61 /root/bootstrap.sh: wget -O /usr/local/bin/clusterctl https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.1.3/clusterctl-$(uname)-amd64'
++ echo 'Mon 15 Aug 2022 04:08:11 PM UTC 61 /root/bootstrap.sh: wget -O /usr/local/bin/clusterctl https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.1.3/clusterctl-$(uname)-amd64'
++ vmtoolsd --cmd 'info-set guestinfo.cloudinit.clusterctl.binary.install.status failed'
++ vmtoolsd --cmd 'info-set guestinfo.cloud_init_script_execution_failure_reason Mon 15 Aug 2022 04:08:11 PM UTC 61 /root/bootstrap.sh: wget -O /usr/local/bin/clusterctl https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.1.3/clusterctl-$(uname)-amd64'
++ vmtoolsd --cmd 'info-set guestinfo.cloud_init_script_execution_status 8'
Cloud-init v. 21.1-19-gbad84ad4-0ubuntu1~20.04.2 running 'modules:final' at Mon, 15 Aug 2022 16:07:34 +0000. Up 6.35 seconds.
2022-08-15 16:08:13,539 - cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
2022-08-15 16:08:13,540 - util.py[WARNING]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_scripts_user.py'>) failed
ci-info: no authorized SSH keys fingerprints found for user ubuntu.
Docker, Kind, Clusterctl installation done and up after 45.53 seconds
2022-08-15 16:08:13,560 - util.py[WARNING]: Failed to write boot finished file /var/lib/cloud/instance/boot-finished
Traceback (most recent call last):
  File "/usr/bin/cloud-init", line 11, in <module>
    load_entry_point('cloud-init==21.1', 'console_scripts', 'cloud-init')()
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 890, in main
    retval = util.log_time(
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2348, in log_time
    ret = func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 670, in status_wrapper
    atomic_helper.write_json(status_path, status)
  File "/usr/lib/python3/dist-packages/cloudinit/atomic_helper.py", line 44, in write_json
    return write_file(
  File "/usr/lib/python3/dist-packages/cloudinit/atomic_helper.py", line 39, in write_file
    raise e
  File "/usr/lib/python3/dist-packages/cloudinit/atomic_helper.py", line 26, in write_file
    tf = tempfile.NamedTemporaryFile(dir=os.path.dirname(filename),
  File "/usr/lib/python3.8/tempfile.py", line 679, in NamedTemporaryFile
    (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
  File "/usr/lib/python3.8/tempfile.py", line 389, in _mkstemp_inner
    fd = _os.open(file, flags, 0o600)
FileNotFoundError: [Errno 2] No such file or directory: '/var/lib/cloud/data/tmpmzytv4pw'
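
The trace above points at the `clusterctl` download: the request to objects.githubusercontent.com returned `503 Egress is over the account limit`, while the earlier downloads in the same run succeeded. A common general mitigation for this class of server-side error is to retry 5xx responses with exponential backoff. The following is a minimal Python sketch of that technique under stated assumptions. It is not what `/root/bootstrap.sh` does (the trace shows a single plain `wget` call), and `fetch_with_retry`, its parameters, and the delay values are hypothetical.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, attempts=5, base_delay=2.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch url, retrying HTTP 5xx responses with exponential backoff.

    Returns (body_bytes, attempts_used). Non-5xx HTTP errors are raised at once.
    The opener and sleep parameters exist only so the function can be exercised
    without network access.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            with opener(url) as response:
                return response.read(), attempt
        except urllib.error.HTTPError as err:
            # A 4xx response will not fix itself, and the last attempt
            # has nothing left to retry, so re-raise in both cases.
            if err.code < 500 or attempt == attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

With the defaults, the waits between attempts are 2, 4, 8 and 16 seconds, so a failure that clears within about half a minute would be absorbed instead of failing the whole bootstrap.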
jleavers1
Contributor

I ran /root/bootstrap.sh manually and it got further, but then it got stuck, as shown below:

++ uname
+ wget -O /usr/local/bin/clusterctl https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.1.3/clusterctl-Linux-amd64
--2022-08-15 16:21:04--  https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.1.3/clusterctl-Linux-amd64
Resolving github.com (github.com)... 140.82.121.3
Connecting to github.com (github.com)|140.82.121.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/124157517/eb64ec78-1a74-481a-a98a-64640115bdaa?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20220815%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220815T162023Z&X-Amz-Expires=300&X-Amz-Signature=c175a9aa5c6c3af8e8d30a76ed6aac67a64b34f172516e55135d538563b0e59e&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=124157517&response-content-disposition=attachment%3B%20filename%3Dclusterctl-linux-amd64&response-content-type=application%2Foctet-stream [following]
--2022-08-15 16:21:04--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/124157517/eb64ec78-1a74-481a-a98a-64640115bdaa?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20220815%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220815T162023Z&X-Amz-Expires=300&X-Amz-Signature=c175a9aa5c6c3af8e8d30a76ed6aac67a64b34f172516e55135d538563b0e59e&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=124157517&response-content-disposition=attachment%3B%20filename%3Dclusterctl-linux-amd64&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.108.133, 185.199.111.133, 185.199.110.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 62491727 (60M) [application/octet-stream]
Saving to: ‘/usr/local/bin/clusterctl’

/usr/local/bin/clusterctl                                                         100%[===========================================================================================================================================================================================================>]  59.60M  64.3MB/s    in 0.9s

2022-08-15 16:21:05 (64.3 MB/s) - ‘/usr/local/bin/clusterctl’ saved [62491727/62491727]

+ chmod +x /usr/local/bin/clusterctl
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.clusterctl.binary.install.status successful'

+ export CURRENT_STATE=guestinfo.cloudinit.kind.cluster.creation.status
+ CURRENT_STATE=guestinfo.cloudinit.kind.cluster.creation.status
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.kind.cluster.creation.status in_progress'

+ kind create cluster --config /root/kind-cluster-with-extramounts.yaml
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.24.0) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Not sure what to do next? 😅  Check out https://kind.sigs.k8s.io/docs/user/quick-start/
+ kind export kubeconfig
Set kubectl context to "kind-kind"
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.kind.cluster.creation.status successful'

+ export EXP_CLUSTER_RESOURCE_SET=true
+ EXP_CLUSTER_RESOURCE_SET=true
+ export CURRENT_STATE=guestinfo.cloudinit.kind.cluster.capvcd.install.status
+ CURRENT_STATE=guestinfo.cloudinit.kind.cluster.capvcd.install.status
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.kind.cluster.capvcd.install.status in_progress'

+ clusterctl init --config /root/.cluster-api/clusterctl.yaml --core cluster-api:v1.1.3 -b kubeadm:v1.1.3 -c kubeadm:v1.1.3 -i vcd:v1.0.0 --v=10
Using configuration File="/root/.cluster-api/clusterctl.yaml"
Installing the clusterctl inventory CRD
Creating CustomResourceDefinition="providers.clusterctl.cluster.x-k8s.io"
Fetching providers
Fetching File="core-components.yaml" Provider="cluster-api" Type="CoreProvider" Version="v1.1.3"
Fetching File="bootstrap-components.yaml" Provider="kubeadm" Type="BootstrapProvider" Version="v1.1.3"
Fetching File="control-plane-components.yaml" Provider="kubeadm" Type="ControlPlaneProvider" Version="v1.1.3"
Fetching File="infrastructure-components.yaml" Provider="vcd" Type="InfrastructureProvider" Version="v1.0.0"
Fetching File="metadata.yaml" Provider="cluster-api" Type="CoreProvider" Version="v1.1.3"
Fetching File="metadata.yaml" Provider="kubeadm" Type="BootstrapProvider" Version="v1.1.3"
Fetching File="metadata.yaml" Provider="kubeadm" Type="ControlPlaneProvider" Version="v1.1.3"
Fetching File="metadata.yaml" Provider="vcd" Type="InfrastructureProvider" Version="v1.0.0"
Creating Namespace="cert-manager-test"
Installing cert-manager Version="v1.5.3"
Fetching File="cert-manager.yaml" Provider="cert-manager" Type="" Version="v1.5.3"
Creating Namespace="cert-manager"
Creating CustomResourceDefinition="certificaterequests.cert-manager.io"
Creating CustomResourceDefinition="certificates.cert-manager.io"
Creating CustomResourceDefinition="challenges.acme.cert-manager.io"
Creating CustomResourceDefinition="clusterissuers.cert-manager.io"
Creating CustomResourceDefinition="issuers.cert-manager.io"
Creating CustomResourceDefinition="orders.acme.cert-manager.io"
Creating ServiceAccount="cert-manager-cainjector" Namespace="cert-manager"
Creating ServiceAccount="cert-manager" Namespace="cert-manager"
Creating ServiceAccount="cert-manager-webhook" Namespace="cert-manager"
Creating ClusterRole="cert-manager-cainjector"
Creating ClusterRole="cert-manager-controller-issuers"
Creating ClusterRole="cert-manager-controller-clusterissuers"
Creating ClusterRole="cert-manager-controller-certificates"
Creating ClusterRole="cert-manager-controller-orders"
Creating ClusterRole="cert-manager-controller-challenges"
Creating ClusterRole="cert-manager-controller-ingress-shim"
Creating ClusterRole="cert-manager-view"
Creating ClusterRole="cert-manager-edit"
Creating ClusterRole="cert-manager-controller-approve:cert-manager-io"
Creating ClusterRole="cert-manager-controller-certificatesigningrequests"
Creating ClusterRole="cert-manager-webhook:subjectaccessreviews"
Creating ClusterRoleBinding="cert-manager-cainjector"
Creating ClusterRoleBinding="cert-manager-controller-issuers"
Creating ClusterRoleBinding="cert-manager-controller-clusterissuers"
Creating ClusterRoleBinding="cert-manager-controller-certificates"
Creating ClusterRoleBinding="cert-manager-controller-orders"
Creating ClusterRoleBinding="cert-manager-controller-challenges"
Creating ClusterRoleBinding="cert-manager-controller-ingress-shim"
Creating ClusterRoleBinding="cert-manager-controller-approve:cert-manager-io"
Creating ClusterRoleBinding="cert-manager-controller-certificatesigningrequests"
Creating ClusterRoleBinding="cert-manager-webhook:subjectaccessreviews"
Creating Role="cert-manager-cainjector:leaderelection" Namespace="kube-system"
Creating Role="cert-manager:leaderelection" Namespace="kube-system"
Creating Role="cert-manager-webhook:dynamic-serving" Namespace="cert-manager"
Creating RoleBinding="cert-manager-cainjector:leaderelection" Namespace="kube-system"
Creating RoleBinding="cert-manager:leaderelection" Namespace="kube-system"
Creating RoleBinding="cert-manager-webhook:dynamic-serving" Namespace="cert-manager"
Creating Service="cert-manager" Namespace="cert-manager"
Creating Service="cert-manager-webhook" Namespace="cert-manager"
Creating Deployment="cert-manager-cainjector" Namespace="cert-manager"
Creating Deployment="cert-manager" Namespace="cert-manager"
Creating Deployment="cert-manager-webhook" Namespace="cert-manager"
Creating MutatingWebhookConfiguration="cert-manager-webhook"
Creating ValidatingWebhookConfiguration="cert-manager-webhook"
Waiting for cert-manager to be available...
Updating Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Creating Certificate="selfsigned-cert" Namespace="cert-manager-test"
Deleting Namespace="cert-manager-test"
Deleting Issuer="test-selfsigned" Namespace="cert-manager-test"
Deleting Certificate="selfsigned-cert" Namespace="cert-manager-test"
Installing Provider="cluster-api" Version="v1.1.3" TargetNamespace="capi-system"
Creating objects Provider="cluster-api" Version="v1.1.3" TargetNamespace="capi-system"
Creating Namespace="capi-system"
Creating CustomResourceDefinition="clusterclasses.cluster.x-k8s.io"
Creating CustomResourceDefinition="clusterresourcesetbindings.addons.cluster.x-k8s.io"
Creating CustomResourceDefinition="clusterresourcesets.addons.cluster.x-k8s.io"
Creating CustomResourceDefinition="clusters.cluster.x-k8s.io"
Creating CustomResourceDefinition="machinedeployments.cluster.x-k8s.io"
Creating CustomResourceDefinition="machinehealthchecks.cluster.x-k8s.io"
Creating CustomResourceDefinition="machinepools.cluster.x-k8s.io"
Creating CustomResourceDefinition="machines.cluster.x-k8s.io"
Creating CustomResourceDefinition="machinesets.cluster.x-k8s.io"
Creating ServiceAccount="capi-manager" Namespace="capi-system"
Creating Role="capi-leader-election-role" Namespace="capi-system"
Creating ClusterRole="capi-system-capi-aggregated-manager-role"
Creating ClusterRole="capi-system-capi-manager-role"
Creating RoleBinding="capi-leader-election-rolebinding" Namespace="capi-system"
Creating ClusterRoleBinding="capi-system-capi-manager-rolebinding"
Creating Service="capi-webhook-service" Namespace="capi-system"
Creating Deployment="capi-controller-manager" Namespace="capi-system"
Creating Certificate="capi-serving-cert" Namespace="capi-system"
Creating Issuer="capi-selfsigned-issuer" Namespace="capi-system"
Creating MutatingWebhookConfiguration="capi-mutating-webhook-configuration"
Creating ValidatingWebhookConfiguration="capi-validating-webhook-configuration"
Creating inventory entry Provider="cluster-api" Version="v1.1.3" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v1.1.3" TargetNamespace="capi-kubeadm-bootstrap-system"
Creating objects Provider="bootstrap-kubeadm" Version="v1.1.3" TargetNamespace="capi-kubeadm-bootstrap-system"
Creating Namespace="capi-kubeadm-bootstrap-system"
Creating CustomResourceDefinition="kubeadmconfigs.bootstrap.cluster.x-k8s.io"
Creating CustomResourceDefinition="kubeadmconfigtemplates.bootstrap.cluster.x-k8s.io"
Creating ServiceAccount="capi-kubeadm-bootstrap-manager" Namespace="capi-kubeadm-bootstrap-system"
Creating Role="capi-kubeadm-bootstrap-leader-election-role" Namespace="capi-kubeadm-bootstrap-system"
Creating ClusterRole="capi-kubeadm-bootstrap-system-capi-kubeadm-bootstrap-manager-role"
Creating RoleBinding="capi-kubeadm-bootstrap-leader-election-rolebinding" Namespace="capi-kubeadm-bootstrap-system"
Creating ClusterRoleBinding="capi-kubeadm-bootstrap-system-capi-kubeadm-bootstrap-manager-rolebinding"
Creating Service="capi-kubeadm-bootstrap-webhook-service" Namespace="capi-kubeadm-bootstrap-system"
Creating Deployment="capi-kubeadm-bootstrap-controller-manager" Namespace="capi-kubeadm-bootstrap-system"
Creating Certificate="capi-kubeadm-bootstrap-serving-cert" Namespace="capi-kubeadm-bootstrap-system"
Creating Issuer="capi-kubeadm-bootstrap-selfsigned-issuer" Namespace="capi-kubeadm-bootstrap-system"
Creating ValidatingWebhookConfiguration="capi-kubeadm-bootstrap-validating-webhook-configuration"
Creating inventory entry Provider="bootstrap-kubeadm" Version="v1.1.3" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v1.1.3" TargetNamespace="capi-kubeadm-control-plane-system"
Creating objects Provider="control-plane-kubeadm" Version="v1.1.3" TargetNamespace="capi-kubeadm-control-plane-system"
Creating Namespace="capi-kubeadm-control-plane-system"
Creating CustomResourceDefinition="kubeadmcontrolplanes.controlplane.cluster.x-k8s.io"
Creating CustomResourceDefinition="kubeadmcontrolplanetemplates.controlplane.cluster.x-k8s.io"
Creating ServiceAccount="capi-kubeadm-control-plane-manager" Namespace="capi-kubeadm-control-plane-system"
Creating Role="capi-kubeadm-control-plane-leader-election-role" Namespace="capi-kubeadm-control-plane-system"
Creating ClusterRole="capi-kubeadm-control-plane-system-capi-kubeadm-control-plane-aggregated-manager-role"
Creating ClusterRole="capi-kubeadm-control-plane-system-capi-kubeadm-control-plane-manager-role"
Creating RoleBinding="capi-kubeadm-control-plane-leader-election-rolebinding" Namespace="capi-kubeadm-control-plane-system"
Creating ClusterRoleBinding="capi-kubeadm-control-plane-system-capi-kubeadm-control-plane-manager-rolebinding"
Creating Service="capi-kubeadm-control-plane-webhook-service" Namespace="capi-kubeadm-control-plane-system"
Creating Deployment="capi-kubeadm-control-plane-controller-manager" Namespace="capi-kubeadm-control-plane-system"
Creating Certificate="capi-kubeadm-control-plane-serving-cert" Namespace="capi-kubeadm-control-plane-system"
Creating Issuer="capi-kubeadm-control-plane-selfsigned-issuer" Namespace="capi-kubeadm-control-plane-system"
Creating MutatingWebhookConfiguration="capi-kubeadm-control-plane-mutating-webhook-configuration"
Creating ValidatingWebhookConfiguration="capi-kubeadm-control-plane-validating-webhook-configuration"
Creating inventory entry Provider="control-plane-kubeadm" Version="v1.1.3" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-vcd" Version="v1.0.0" TargetNamespace="capvcd-system"
Creating objects Provider="infrastructure-vcd" Version="v1.0.0" TargetNamespace="capvcd-system"
Creating Namespace="capvcd-system"
Creating CustomResourceDefinition="vcdclusters.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="vcdmachines.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="vcdmachinetemplates.infrastructure.cluster.x-k8s.io"
Creating ServiceAccount="capvcd-controller-manager" Namespace="capvcd-system"
Creating Role="capvcd-leader-election-role" Namespace="capvcd-system"
Creating ClusterRole="capvcd-system-capvcd-manager-role"
Creating RoleBinding="capvcd-leader-election-rolebinding" Namespace="capvcd-system"
Creating ClusterRoleBinding="capvcd-system-capvcd-manager-rolebinding"
Creating Service="capvcd-webhook-service" Namespace="capvcd-system"
Creating Deployment="capvcd-controller-manager" Namespace="capvcd-system"
Creating Certificate="capvcd-serving-cert" Namespace="capvcd-system"
Creating Issuer="capvcd-selfsigned-issuer" Namespace="capvcd-system"
Creating MutatingWebhookConfiguration="capvcd-mutating-webhook-configuration"
Creating ValidatingWebhookConfiguration="capvcd-validating-webhook-configuration"
Creating inventory entry Provider="infrastructure-vcd" Version="v1.0.0" TargetNamespace="capvcd-system"

Your management cluster has been initialized successfully!

You can now create your first workload cluster by running the following:

  clusterctl generate cluster [name] --kubernetes-version [version] | kubectl apply -f -

Using configuration File="/root/.cluster-api/clusterctl.yaml"
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.kind.cluster.capvcd.install.status successful'

+ export CURRENT_STATE=guestinfo.cloudinit.kind.cluster.capvcd.ready.status
+ CURRENT_STATE=guestinfo.cloudinit.kind.cluster.capvcd.ready.status
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.kind.cluster.capvcd.ready.status in_progress'

++ kubectl get pods -n capi-kubeadm-bootstrap-system -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 10
++ kubectl get pods -n capi-kubeadm-bootstrap-system -o 'jsonpath={.items[*].status.phase}'
+ [[ Running =~ ^(Running )*Running$ ]]
++ kubectl get pods -n capi-kubeadm-control-plane-system -o 'jsonpath={.items[*].status.phase}'
+ [[ Running =~ ^(Running )*Running$ ]]
++ kubectl get pods -n capi-system -o 'jsonpath={.items[*].status.phase}'
+ [[ Running =~ ^(Running )*Running$ ]]
++ kubectl get pods -n capvcd-system -o 'jsonpath={.items[*].status.phase}'
+ [[ Running =~ ^(Running )*Running$ ]]
+ kubectl get svc -n capi-kubeadm-bootstrap-system capi-kubeadm-bootstrap-webhook-service
NAME                                     TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
capi-kubeadm-bootstrap-webhook-service   ClusterIP   10.96.216.13   <none>        443/TCP   16s
+ kubectl get svc -n capi-kubeadm-control-plane-system capi-kubeadm-control-plane-webhook-service
NAME                                         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
capi-kubeadm-control-plane-webhook-service   ClusterIP   10.96.193.67   <none>        443/TCP   16s
+ kubectl get svc -n capi-system capi-webhook-service
NAME                   TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
capi-webhook-service   ClusterIP   10.96.100.46   <none>        443/TCP   18s
+ kubectl get svc -n capvcd-system capvcd-webhook-service
NAME                     TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
capvcd-webhook-service   ClusterIP   10.96.251.72   <none>        443/TCP   15s
+ kubectl wait --for=condition=Ready pods --all -n capi-kubeadm-bootstrap-system --timeout=240s
pod/capi-kubeadm-bootstrap-controller-manager-56bdcdf797-lxmjx condition met
+ kubectl wait --for=condition=Ready pods --all -n capi-kubeadm-control-plane-system --timeout=240s
pod/capi-kubeadm-control-plane-controller-manager-85dc7657bf-ng9vg condition met
+ kubectl wait --for=condition=Ready pods --all -n capi-system --timeout=240s
pod/capi-controller-manager-8b5d94fc5-tmmtb condition met
+ kubectl wait --for=condition=Ready pods --all -n capvcd-system --timeout=240s
pod/capvcd-controller-manager-857dc7c958-l72zk condition met
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.kind.cluster.capvcd.ready.status successful'

+ export CURRENT_STATE=guestinfo.cloudinit.antrea.manifest.download.status
+ CURRENT_STATE=guestinfo.cloudinit.antrea.manifest.download.status
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.antrea.manifest.download.status in_progress'

+ wget -O /root/v1.5.4-antrea.yaml.template https://raw.githubusercontent.com/vmware/cluster-api-provider-cloud-director/vkp/tanzu/crs/v1.5.4/v1.5.4-antrea.yaml.template
--2022-08-15 16:22:31--  https://raw.githubusercontent.com/vmware/cluster-api-provider-cloud-director/vkp/tanzu/crs/v1.5.4/v1.5.4-antrea.yaml.template
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 154744 (151K) [text/plain]
Saving to: ‘/root/v1.5.4-antrea.yaml.template’

/root/v1.5.4-antrea.yaml.template                                                 100%[===========================================================================================================================================================================================================>] 151.12K  --.-KB/s    in 0.004s

2022-08-15 16:22:32 (34.2 MB/s) - ‘/root/v1.5.4-antrea.yaml.template’ saved [154744/154744]

+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.antrea.manifest.download.status successful'

+ export CURRENT_STATE=guestinfo.cloudinit.antrea.crs.install.status
+ CURRENT_STATE=guestinfo.cloudinit.antrea.crs.install.status
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.antrea.crs.install.status in_progress'

+ sed -e s/__CLUSTER_NAME__/test-3/g -e s/__NAMESPACE_NAME__/test-3-ns/g /root/v1.5.4-antrea.yaml.template
+ kubectl create ns test-3-ns
namespace/test-3-ns created
+ kubectl apply -f /root/v1.5.4-antrea.yaml
clusterresourceset.addons.cluster.x-k8s.io/test-3-antrea created
secret/test-3-antrea-crs created
secret/test-3-antrea-addon created
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.antrea.crs.install.status successful'

+ export CURRENT_STATE=guestinfo.cloudinit.kind.cluster.capi.yaml.apply.status
+ CURRENT_STATE=guestinfo.cloudinit.kind.cluster.capi.yaml.apply.status
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.kind.cluster.capi.yaml.apply.status in_progress'

+ base64 /root/capi_mini_b64.yaml -d
+ kubectl apply -f /root/infrastructure-vcd/v1.0.0/capi_mini.yaml
cluster.cluster.x-k8s.io/test-3 created
secret/capi-user-credentials created
vcdcluster.infrastructure.cluster.x-k8s.io/test-3 created
vcdmachinetemplate.infrastructure.cluster.x-k8s.io/test-3-control-plane created
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/test-3-control-plane created
kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/test-3-worker-pool-1 created
machinedeployment.cluster.x-k8s.io/test-3-worker-pool-1 created
vcdmachinetemplate.infrastructure.cluster.x-k8s.io/test-3-worker-pool-1 created
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.kind.cluster.capi.yaml.apply.status successful'

+ export CURRENT_STATE=guestinfo.cloudinit.target.cluster.ready.status
+ CURRENT_STATE=guestinfo.cloudinit.target.cluster.ready.status
+ vmtoolsd --cmd 'info-set guestinfo.cloudinit.target.cluster.ready.status in_progress'

++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
++ kubectl get machines -n test-3-ns -o 'jsonpath={.items[*].status.phase}'
+ [[ Pending =~ ^(Running )*Running$ ]]
+ sleep 20
sakthi2019
VMware Employee

Please use the cluster's kubeconfig to collect the logs with the following script, then check the logs under each pod name.
https://github.com/vmware/cloud-provider-for-cloud-director/blob/main/scripts/generate-k8s-log-bundl....

Also, review a similar issue discussed here: https://communities.vmware.com/t5/VMware-Cloud-Director-Container/Deployment-stuck-in-loop/td-p/2922... 

lzichong
VMware Employee

Hi jleavers1,

Following up on sakthi2019's comment: after running /root/bootstrap.sh, approximately how long were you stuck? The step in your latest logs, where it waits for the machines to come up, usually takes some time; when CSE is running, about 30 minutes are allocated for the machines to come up before it errors out. It is possible that nothing was wrong and the machines simply needed more time. If there are errors, though, please run the script sakthi2019 provided to collect all the Kubernetes logs so that we can investigate.
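
For reference, the wait shown in the posted trace can be sketched as a polling loop with a time limit. This is only an illustration based on the `kubectl get machines` polling and the `^(Running )*Running$` check visible in the log; it is not the actual bootstrap.sh or CSE code. The function names `all_running` and `wait_until_running` are made up for the example, and the 1800-second limit in the example call simply mirrors the ~30 minutes mentioned above.

```shell
#!/usr/bin/env bash

# Succeeds only if the phase list is non-empty and every space-separated
# entry is "Running" (same pattern as the check in the trace).
all_running() {
  printf '%s' "$1" | grep -Eq '^(Running )*Running$'
}

# wait_until_running TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until its output passes all_running, or gives up once
# TIMEOUT_SECONDS of sleeping have accumulated.
wait_until_running() {
  local timeout=$1 interval=$2 elapsed=0 phases
  shift 2
  while true; do
    # A failing COMMAND is treated the same as "no phases reported yet".
    phases=$("$@" 2>/dev/null) || phases=""
    if all_running "$phases"; then
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${elapsed}s; last phases: '${phases}'" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Example call (values taken from this thread, adjust as needed):
#   wait_until_running 1800 20 kubectl get machines -n test-3-ns \
#     -o 'jsonpath={.items[*].status.phase}'
```

On timeout the last observed phases are printed to stderr, which makes it easier to tell "still Pending" apart from "no machines returned at all".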

Additionally, since you were able to get past the previous step by running /root/bootstrap.sh manually, I believe it should behave the same way when run from VCDKE. Could you try creating the cluster once more without stopping the CSE service, to see whether it still fails at that step? If it does, please continue with the previous approach: disable the CSE service and run /root/bootstrap.sh manually. If you hit errors, please follow sakthi2019's comment so that we can investigate the Kubernetes logs, provided it gets as far as deploying the machines.

Thank you!

jleavers1
Contributor

Hi,

Thanks for the link to the similar issue - when I checked the NSX-ALB setup I found that the licence had expired. I have added a new licence and can now progress further. Now, the first control plane node is created, but no other control plane or worker nodes after that.

Unfortunately I couldn't easily get into the VM to investigate further - the password found by editing the guest properties in vCD does not work and nor does connecting from the machine that has the SSH key. Unless a different user (not root) should be used? I have also reset the password by booting to single-user mode, but it appears to be overwritten by cloud-init once the VM boots again.

Instead I added another user called 'test' and added this user to the sudo group, so I can now SSH in.

I can also connect using the KUBECONFIG from vCD - I saw that the csi-vcd-controllerplugin-0 pod was not running:

$ kubectl get po -n kube-system
NAME                                                 READY   STATUS    RESTARTS      AGE
antrea-agent-jbtqc                                   2/2     Running   6 (22m ago)   64m
antrea-controller-975d6b99b-nkkpr                    1/1     Running   3 (22m ago)   64m
coredns-67c8559bb6-g966l                             1/1     Running   3 (22m ago)   64m
coredns-67c8559bb6-nttpp                             1/1     Running   3 (22m ago)   64m
csi-vcd-controllerplugin-0                           0/3     Pending   0             64m
etcd-test-6-control-plane-52h42                      1/1     Running   5 (22m ago)   64m
kube-apiserver-test-6-control-plane-52h42            1/1     Running   5 (22m ago)   64m
kube-controller-manager-test-6-control-plane-52h42   1/1     Running   5 (22m ago)   64m
kube-proxy-rtkw5                                     1/1     Running   3 (22m ago)   64m
kube-scheduler-test-6-control-plane-52h42            1/1     Running   5 (22m ago)   64m
vmware-cloud-director-ccm-86c85cb478-bnsmb           1/1     Running   3 (22m ago)   64m

This was because of the node taint:

Events:
Type     Reason            Age                 From               Message
----     ------            ----                ----               -------
Warning  FailedScheduling  54m (x1 over 55m)   default-scheduler  0/1 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
Warning  FailedScheduling  58m (x7 over 64m)   default-scheduler  0/1 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
Warning  FailedScheduling  34m (x16 over 49m)  default-scheduler  0/1 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
Warning  FailedScheduling  39s (x23 over 22m)  default-scheduler  0/1 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
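
To pull the blocking taint out of events like these without reading each line, something along the following lines may help. It is a rough sketch: `taint_keys_from_events` is a hypothetical helper name, and it assumes the scheduler message keeps the `had taint {key: value}` wording shown above.

```shell
#!/usr/bin/env bash

# Reads event text on stdin and prints each distinct taint key that appears
# in a "had taint {key: value}" scheduler message.
taint_keys_from_events() {
  grep -o 'had taint {[^:}]*' | sed 's/^had taint {//' | sort -u
}

# Example call (pod name taken from the output above):
#   kubectl describe pod -n kube-system csi-vcd-controllerplugin-0 | taint_keys_from_events
```

For the events above this would print the single key node-role.kubernetes.io/master, however many times the warning repeats.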

I removed the taint to start the pods:

$ kubectl taint nodes test-6-control-plane-52h42 node-role.kubernetes.io/master:NoSchedule-

$ kubectl get po -n kube-system
NAME                                                 READY   STATUS    RESTARTS      AGE
antrea-agent-jbtqc                                   2/2     Running   6 (28m ago)   70m
antrea-controller-975d6b99b-nkkpr                    1/1     Running   3 (28m ago)   70m
coredns-67c8559bb6-g966l                             1/1     Running   3 (28m ago)   70m
coredns-67c8559bb6-nttpp                             1/1     Running   3 (28m ago)   70m
csi-vcd-controllerplugin-0                           3/3     Running   0             70m
csi-vcd-nodeplugin-zlf7d                             2/2     Running   0             2m35s
etcd-test-6-control-plane-52h42                      1/1     Running   5 (28m ago)   70m
kube-apiserver-test-6-control-plane-52h42            1/1     Running   5 (28m ago)   70m
kube-controller-manager-test-6-control-plane-52h42   1/1     Running   5 (28m ago)   70m
kube-proxy-rtkw5                                     1/1     Running   3 (28m ago)   70m
kube-scheduler-test-6-control-plane-52h42            1/1     Running   5 (28m ago)   70m
vmware-cloud-director-ccm-86c85cb478-bnsmb           1/1     Running   3 (28m ago)   70m

I have attached the logs generated using the script as suggested - one from before the manual taint removal and one after.

When I restarted the CSE service it deleted the VMs. 

sakthi2019
VMware Employee

> Unfortunately I couldn't easily get into the VM to investigate further - the password found by editing the guest properties in vCD does not work and nor does connecting from the machine that has the SSH key. Unless a different user (not root) should be used? I have also reset the password by booting to single-user mode, but it appears to be overwritten by cloud-init once the VM boots again.

These are known issues that are being resolved in the GA release.

> When I restarted the CSE service it deleted the VMs
A new flag to keep the VM(s) on failure is part of GA. But did the retry on cluster creation happen?


