
Detailed Error: [error while bootstrapping the machine [/EPHEMERAL-TEMP-VM]; unable to wait for post customization phase [guestinfo.cloudinit.antrea.manifest.download.status] : [invalid postcustomization phase: [failed] for key [guestinfo.cloudinit.antrea.manifest.download.status] for vm [EPHEMERAL-TEMP-VM] due to :[
<13>Jul 17 12:44:36 test1@System/administrator: + wget -O /root/v1.6.1-antrea.yaml.template
<13>Jul 17 12:44:36 test1@System/administrator: --2023-07-17 12:44:36--
<13>Jul 17 12:44:36 test1@System/administrator: Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.111.133, 185.199.108.133, ...
<13>Jul 17 12:44:36 test1@System/administrator: Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
<13>Jul 17 12:44:36 test1@System/administrator: HTTP request sent, awaiting response...
The problem was resolved after adding a DNS configuration on the Edge gateway; that way, every network created uses the DNS servers defined by the Edge. I set 8.8.8.8 as the primary DNS and the VM was then able to resolve GitHub.
Hello, I am facing a ScriptExecution error while trying to deploy a TKGm cluster. The deployment fails with the error "unable to wait for post customization phase". I tried to access the control plane VM "EPHEMERAL_TEMP_VM" to figure out the problem and noticed that four services (cloud-init, cloud-init-local, cloud-final, cloud-config) are not running, as shown in the attached screenshot. I am using CSE 4.0 with vCloud Director 10.4, TKG 1.4.3 and 1.5.4 (which are supported by CSE 4.0), and the Ubuntu non-FIPS TKG OS templates downloaded from the official product page, as required by the documentation. I modified the template manually and successfully got all the services running. I then tried to upload the modified template to my catalog to use it as my TKG cluster template, but it is not available among the template options. So the problem resides specifically in the Ubuntu template, and I am not allowed to modify it to get all the services running. Is there any workaround I can try to solve this problem? Thanks in advance.
Hi, I have the same issue. When can we increase the 1800-second timeout? Regards.
How can I create or edit clusters with the API?
Hi, any update on this topic?
Hi, I did manage to remove rows; however, the table itself still shows the same size. Could you please let me know what the next step should be? Also, is it normal that the CSE beta is using the audit_trail table and not only audit_event?
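For what it is worth, on Postgres a plain DELETE only marks rows as dead; the table keeps its on-disk size until it is vacuumed, which would explain the behaviour above. A quick, generic Postgres check (run against the vcloud database; nothing Cloud Director specific, so verify against your own setup):

-- On-disk size of audit_trail, including indexes and TOAST data
SELECT pg_size_pretty(pg_total_relation_size('audit_trail'));
-- Live vs. dead tuples: a large n_dead_tup after the DELETE means the rows
-- are logically gone but the space has not been reclaimed yet
SELECT relname, n_live_tup, n_dead_tup FROM pg_stat_user_tables WHERE relname = 'audit_trail';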
is there any "Desired State Configuration" tool, like Argo CD, compatible with Cloud Director? No Like instead of using the API to create the cluster, define specifications of the cluster in anoth... See more...
is there any "Desired State Configuration" tool, like Argo CD, compatible with Cloud Director? No Like instead of using the API to create the cluster, define specifications of the cluster in another tool and let the tool build automatically the cluster? You can use Terraform Provider to create clusters. This is still under development though.
KB to clean up the audit table: https://kb.vmware.com/s/article/2106123 The KB is a little old; for Postgres the following should work: DELETE from audit_trail WHERE event_time < '2022-09-01 06:00:00.000';
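One caveat worth adding (a general Postgres note, not from the KB): the DELETE above removes rows logically but does not shrink the table on disk. To return the space to the operating system a VACUUM FULL is needed; it rewrites the table and takes an exclusive lock, so run it in a maintenance window, ideally with the VCD cells stopped:

-- Rewrites audit_trail and reclaims the disk space freed by the DELETE
VACUUM FULL audit_trail;
-- Refresh planner statistics afterwards
ANALYZE audit_trail;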
select event_type, count(*) as num from audit_trail group by event_type order by num desc
This should give you the aggregates without having to rely on limits, and the query should be pretty fast; I can run it in under 10 seconds on a 30 GB DB.
Update: I just discovered with another SQL query that there are older logs in this table, so using "limit 5" alone does not display the first entries. By using instead
SELECT id, event_type, event_time, org_member_id, tenant_id FROM audit_trail ORDER BY event_time limit 100;
I end up with the real first events, which already seem related to the beta. So they probably date back to when the beta and the clusters were first deployed.
Hi @akrishnakuma @agoel, As mentioned, I doubt the login events are the ones consuming the most space in my case. Database: vcloud; table: audit_trail. (I am wondering whether this audit_trail growth is expected, or maybe it is due to previous troubleshooting on this Cloud Director instance; I do not have full control of its history.) The weird thing is that the first rows are always the same: they are never purged, and therefore the database grows in size. If I look at the latest rows (I have just built a new cluster), I see a lot of "modify" events, and I think those are the ones filling the database because the payload is large. (I cannot display it with the query, because it breaks all the formatting; details of a similar event can be seen attached to my previous post.) My immediate concern is how to clean this database. Could I just run a query to remove the first (oldest) 1000 rows, for example? Note: this Cloud Director database is not in a cluster anymore.
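If it helps as a starting point, a sketch of what such a cleanup could look like on Postgres (the cutoff date and row count below are hypothetical; whether deleting audit_trail rows is safe for Cloud Director itself is exactly the question for support, so take a database backup first):

-- Preview how many rows an age-based cleanup would touch before deleting anything
SELECT count(*) FROM audit_trail WHERE event_time < '2023-01-01 00:00:00.000';
-- Delete the 1000 oldest rows inside a transaction so it can still be rolled back
BEGIN;
DELETE FROM audit_trail WHERE id IN (SELECT id FROM audit_trail ORDER BY event_time ASC LIMIT 1000);
COMMIT;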
Hi @ccalvetbeta @rickvanvliet, Thanks for this report; we will fix the repeated logins at top priority. Based on our understanding, an audit trail log for a login should be ~1 KB, so even with that many logins we should not use up the database to that extent. Our suspicion is therefore that something else is going on to raise the size to the 160 GB mentioned. Do you have a sense of which tables could be large in this database? What are the frequent operations you perform, and at what scale do you run?
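To answer the question about which tables are large, a standard Postgres query along these lines (nothing Cloud Director specific; run it while connected to the vcloud database) lists the biggest tables:

-- Ten largest tables in the current database, including indexes and TOAST data
SELECT relname AS table_name, pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;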
Thanks for bringing this up. We are looking into it and might come back with some questions. Could you share the VCD version you are using?
Hi, I have a similar issue: vCloud Director "database full". In my case the audit_trail table in the vcloud database of the vCloud Director Postgres instance is now 160 GB. The database size has been increased multiple times, but that cannot be the solution in the long term. Stopping the vApps associated with the Tanzu Kubernetes cluster "beta" stops the generation of new logs. I also see many "Access token created ..." events, but I am not sure whether they are related to the beta or to legacy CSE clusters. I think the entries consuming the largest amount of data are the events of type "definedEntity/modify ''beta006'' (9ebee87d-9d05-4c3f-b8e7-01ea477ac48c)" (beta006 is one of the clusters created with CSE beta), because their Details are very large. See the attached file.

Solution attempted so far: in "Administration" > "General settings" > "Activity logs" I have already reduced the log history to keep and show to 20 days, but entries older than 20 days do not seem to be removed from audit_trail. I guess there is a script responsible for cleaning old events; if so, and someone knows how to start it manually, please let me know. I am not even sure it would work, because I was under the assumption that these settings apply to the audit_event table, and when I looked at the rows there, that table is empty.

Questions:
- Is it expected that clusters created with CSE beta generate events, and therefore many rows, in the audit_trail table? (Note: it is possible that Cloud Director is/was configured with advanced settings I am not aware of, such as extra logging added during a previous support call.)
- What is the best way of cleaning the audit_trail table?
- Are the "Activity logs" settings supposed to have an impact on the audit_trail table? If yes, how do I manually start the cleaning script?
- What would be the impact of deleting the oldest rows using SQL commands against audit_trail? If it could be done without breaking Cloud Director, it would be an easy workaround.
Hi @StjepanC, I apologize that my messages are not going through. Yes, please open an SR; let me know when it is submitted and, if possible, the SR number.
Thanks @ccalvetbeta for the clarification. I mistook the load balancer use based on the CSE 4.0 context. You are right: we do not have this feature covered in the CPI component and will need to add it. We plan to use the `loadBalancerIP` field to specify it. Would you mind creating a ticket for this in Bugzilla (via VMware support), or at least in the cloud provider repository (https://github.com/vmware/cloud-provider-for-cloud-director)? Otherwise, please let us know and we will do it; you could add yourself as a customer there.
Hello, data center group networks for cluster deployments have been tested and are a supported workflow.
Hi, Regarding using Cluster API directly: I think you are right, and it is better not to manipulate it; better to consider it just as a tool used by VCD and CSE to build the cluster. It is still interesting to know about as part of troubleshooting, for example to read the related logs. The goal now will be to automate everything, including creating and deleting clusters, via the VCD API (similar to what the CSE CLI offers with the legacy version). Second question: is there any "Desired State Configuration" tool, like Argo CD, compatible with Cloud Director? That is, instead of using the API to create the cluster, define the specifications of the cluster in another tool and let the tool build the cluster automatically?
Thank you for the reply. However, that seems more related to modifying the "management IP" of the cluster; my question is about the load balancer services used by the applications (Create an External Load Balancer | Kubernetes). The goal is to know, before redeploying the application in a new cluster, what the external IP will be, instead of having to identify it after deploying the load balancer service in the new cluster. So predictive vs. reactive.