ccalvetbeta's Posts

How can I create or edit clusters with the API?
Hi, any update on this topic?
Hi, I did manage to remove rows. However, the table itself is still displayed at the same size. Could you please let me know what the next step should be? Is it normal that CSE beta uses audit_trail and not only audit_event?
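On the "same size after removing rows" point: in PostgreSQL a plain DELETE only marks rows as dead, and the space is normally handed back to the operating system only by VACUUM FULL, which rewrites the table under an exclusive lock. Whether running that against the Cloud Director database is supported is not confirmed here. The sketch below only illustrates the general effect, using SQLite from the Python standard library as a stand-in (an assumption made so the example is runnable anywhere); the "details" column is a hypothetical name for the large payload.

```python
import os
import sqlite3
import tempfile


def measure_delete_then_vacuum(row_count=200, payload_bytes=4096):
    """Return the database file size before DELETE, after DELETE, and after VACUUM."""
    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # isolation_level=None means autocommit, so VACUUM never runs inside a transaction.
        conn = sqlite3.connect(path, isolation_level=None)
        # Stand-in table; "details" is a hypothetical column name for the payload.
        conn.execute("CREATE TABLE audit_trail (id INTEGER PRIMARY KEY, details TEXT)")
        conn.execute("BEGIN")
        conn.executemany(
            "INSERT INTO audit_trail (details) VALUES (?)",
            [("x" * payload_bytes,) for _ in range(row_count)],
        )
        conn.execute("COMMIT")
        sizes["before"] = os.path.getsize(path)

        # Deleting rows frees pages inside the file but does not shrink the file.
        conn.execute("DELETE FROM audit_trail")
        sizes["after_delete"] = os.path.getsize(path)

        # VACUUM rewrites the file, which is what actually releases the space.
        conn.execute("VACUUM")
        sizes["after_vacuum"] = os.path.getsize(path)
        conn.close()
    return sizes


if __name__ == "__main__":
    print(measure_delete_then_vacuum())
```

The file does not get smaller after the DELETE and only shrinks after the VACUUM, which matches the symptom described in the post.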
Update: I just discovered with another SQL query that there are older logs in this table, so using "limit 5" alone does not display the first entries. By using instead SELECT id, event_type, event_time, org_member_id, tenant_id FROM audit_trail ORDER BY event_time LIMIT 100; I end up with the real first events, which already seem related to the beta. So they may start at the time the beta and the clusters were first deployed.
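The reason "limit 5" alone is misleading is that SQL gives no ordering guarantee without ORDER BY. A minimal sketch of the difference, again using an in-memory SQLite table as a stand-in for the real PostgreSQL table; the column names come from the query above, while the rows (including the "session/login" event type) are made-up sample data:

```python
import sqlite3


def build_demo():
    """Create an in-memory stand-in for audit_trail with rows inserted out of time order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE audit_trail ("
        "id INTEGER PRIMARY KEY, event_type TEXT, event_time TEXT, "
        "org_member_id TEXT, tenant_id TEXT)"
    )
    # Illustrative rows only; ISO timestamps sort chronologically as text.
    rows = [
        (1, "definedEntity/modify", "2022-06-03T10:00:00", "m1", "t1"),
        (2, "session/login", "2022-06-01T08:00:00", "m1", "t1"),
        (3, "definedEntity/modify", "2022-06-02T09:00:00", "m2", "t1"),
    ]
    conn.executemany("INSERT INTO audit_trail VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def oldest_events(conn, limit):
    """Return the oldest events first, using an explicit ORDER BY."""
    return conn.execute(
        "SELECT id, event_type, event_time, org_member_id, tenant_id "
        "FROM audit_trail ORDER BY event_time ASC LIMIT ?",
        (limit,),
    ).fetchall()


if __name__ == "__main__":
    for row in oldest_events(build_demo(), 2):
        print(row)
```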
Hi @akrishnakuma @agoel As mentioned, I doubt the login events are the ones consuming the most space in my case. Database: vcloud Table: audit_trail (I am wondering whether this audit_trail usage is expected or whether it is due to previous troubleshooting on this Cloud Director instance; I don't have full knowledge of its history.) The weird thing is that if I look at the first rows they are always the same: they are never purged, and therefore the database grows in size. If I look at the latest rows (I have just built a new cluster), I see a lot of "modify" events, and I think those are the ones filling the database because the payload is large. (I can't display it with the query, because it breaks all formatting.) Example of such an event: the details of a similar event can be seen in the attachment to my previous post. My immediate concern is how to clean this database. Could I just run a query to remove the first (oldest) 1000 rows, for example? Note: this Cloud Director database is not in a cluster anymore.
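For reference, a "remove the oldest N rows" statement could look like the sketch below. It runs against an in-memory SQLite stand-in, not against Cloud Director, and it says nothing about whether such a delete is safe or supported on a real instance, which is exactly the open question in the post. The helper names and the sample rows are hypothetical.

```python
import sqlite3


def demo_connection(row_count):
    """Build a stand-in audit_trail with one row per day (sample data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE audit_trail (id INTEGER PRIMARY KEY, event_type TEXT, event_time TEXT)"
    )
    conn.executemany(
        "INSERT INTO audit_trail VALUES (?, ?, ?)",
        [(i, "definedEntity/modify", f"2022-06-{i:02d}T00:00:00") for i in range(1, row_count + 1)],
    )
    conn.commit()
    return conn


def delete_oldest(conn, batch_size):
    """Delete the batch_size oldest rows by event_time and return how many were removed."""
    cur = conn.execute(
        "DELETE FROM audit_trail WHERE id IN ("
        "SELECT id FROM audit_trail ORDER BY event_time ASC LIMIT ?)",
        (batch_size,),
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = demo_connection(5)
    print("deleted:", delete_oldest(conn, 2))
    print("remaining ids:", [r[0] for r in conn.execute("SELECT id FROM audit_trail ORDER BY id")])
```

Deleting in small batches keeps each transaction short; on a real database a backup beforehand would be the obvious precaution.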
Hi, I have a similar issue: vCloud Director reports "database full". In my case the table "audit_trail" in the "vcloud" database of the vCloud Director PostgreSQL instance is now 160 GB. The database size has been increased multiple times, but this can't be the long-term solution. Stopping the vApps associated with the Tanzu Kubernetes cluster "beta" stops the generation of new logs. I also see many "Access token created ..." events, but I am not sure whether they are related to beta or legacy CSE clusters. I think the events consuming the most space are those of type "definedEntity/modify ''beta006'' (9ebee87d-9d05-4c3f-b8e7-01ea477ac48c)", because their Details are very large (beta006 is one of the clusters created with CSE beta). See attached file. Solution attempted so far: under "Administration" > "General settings" > "Activity logs", I have already reduced the log history to keep and to show to 20 days, but entries older than 20 days do not seem to be removed from audit_trail. I guess there is a script responsible for cleaning old events; if so, and someone knows how to start it manually, please let me know. I am not even sure it would work, because I was under the assumption that these settings apply to the "audit_event" table, and when I looked there, that table had no rows. Questions: Is it expected that clusters created with CSE beta create events and therefore many rows in the "audit_trail" table? (Note: it is possible that Cloud Director is or was configured with advanced settings I am not aware of, such as extra logging added during a previous support call.) What is the best way to clean the "audit_trail" table? Are the "Activity logs" settings supposed to have an impact on the "audit_trail" table? If yes, how can the cleaning script be started manually? What would be the impact of deleting the oldest rows from "audit_trail" using SQL commands? If it could be done without breaking Cloud Director, it would be an easy workaround.
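One way to check which event types actually consume the space, rather than guessing, is to group by event_type and sum the payload length. The sketch below shows the shape of such a query on an in-memory SQLite stand-in; "details" is a hypothetical name for the payload column (the real column name is not given in the post), and the rows are made-up sample data.

```python
import sqlite3

SIZE_BY_TYPE_SQL = (
    "SELECT event_type, COUNT(*) AS row_count, SUM(LENGTH(details)) AS payload_chars "
    "FROM audit_trail GROUP BY event_type ORDER BY payload_chars DESC"
)


def size_by_event_type(conn):
    """Return (event_type, row_count, payload_chars) tuples, largest payload total first."""
    return conn.execute(SIZE_BY_TYPE_SQL).fetchall()


def build_sample():
    """Stand-in table: many small login-style rows and a few very large modify rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE audit_trail (id INTEGER PRIMARY KEY, event_type TEXT, details TEXT)"
    )
    rows = [("session/login", "x" * 100)] * 10 + [("definedEntity/modify", "x" * 50000)] * 3
    conn.executemany("INSERT INTO audit_trail (event_type, details) VALUES (?, ?)", rows)
    conn.commit()
    return conn


if __name__ == "__main__":
    for event_type, row_count, payload_chars in size_by_event_type(build_sample()):
        print(event_type, row_count, payload_chars)
```

In this sample the three large "modify" rows outweigh the ten small rows, which is the pattern suspected in the post.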
Hi, regarding using Cluster API directly, I think you are right and it is better not to manipulate it. Better to consider it just as a tool used by VCD and CSE to build the cluster. It is still interesting to know about it for troubleshooting, for example to read the related logs. The goal now will be to automate everything, including creating and deleting clusters, via the VCD API (similar to what the CSE CLI offers with the legacy version). Second question: is there any "Desired State Configuration" tool, like Argo CD, compatible with Cloud Director? That is, instead of using the API to create the cluster, define the specifications of the cluster in another tool and let the tool build the cluster automatically?
Thank you for the reply. However, it seems more related to modifying the "management IP" of the cluster. My question is about the load balancer services used by the applications: Create an External Load Balancer | Kubernetes. The goal is to know, before redeploying the application in a new cluster, what the external IP will be, instead of having to identify it after deploying the load balancer service in the new cluster. So predictive vs. reactive.
Hi, thank you for the reply. I am aware they can be selected. My questions are: Has it been tested by VMware? (I do not want to spend time creating a lab to test it if this scenario has not even been tested by VMware.) Is it supported? (Or, more exactly, will it be a supported configuration when CSE 4 is GA?)
I have done some tests on this topic with a cluster created with 3 control plane nodes. If one control plane node is shut down from vCenter, "get pods -A" continues to work (as expected). If two control plane nodes are shut down, "get pods -A" doesn't work anymore (expected). After restarting one of the control plane nodes, "get pods -A" works again (expected). So the basic functionality of multiple control plane nodes is working. One issue is that no errors are reported in the events or in the status of the cluster in the CSE plugin (status is "ready"). The only visible signs are at the load balancer level, which shows that some endpoints are down, and at the vApp, which shows some VMs down. Would it be possible to add some kind of "health" view to the CSE plugin (for example: all control plane nodes up and running, worker nodes up and running, load balancer associated with the management IP deployed, etc.)? Second issue: I deleted one of the control plane VMs on purpose. As mentioned above, no information is reported by the CSE plugin; it still shows "3 nodes". It doesn't recreate the missing node (no "auto-heal", which would be the best). Is there a procedure for replacing a failed node in such a case?
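The observed behaviour matches etcd quorum arithmetic: a cluster of n members needs floor(n/2) + 1 healthy members to serve requests. A minimal sketch, assuming one etcd member per control plane node (the usual stacked layout, not confirmed for CSE in this thread); the function names are illustrative:

```python
def quorum(members):
    """Minimum number of healthy etcd members needed for the cluster to serve requests."""
    if members < 1:
        raise ValueError("members must be at least 1")
    return members // 2 + 1


def api_available(members, down):
    """True if enough members remain up to keep quorum."""
    if not 0 <= down <= members:
        raise ValueError("down must be between 0 and members")
    return members - down >= quorum(members)


if __name__ == "__main__":
    for down in range(4):
        print(f"3 control plane nodes, {down} down -> available: {api_available(3, down)}")
```

With 3 members the quorum is 2, so one node down still works, two nodes down does not, and bringing one back restores service, as seen in the tests above.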
Hi, currently, when a service is exposed via a load balancer from Kubernetes, it picks the first available IP address in the first "IP Allocations" range of the associated edge that still has a free IP. In a disaster recovery scenario it would help to be able to define which IP will be used, similar to: Use static IP with load balancer - Azure Kubernetes Service | Microsoft Docs. Is it already planned to provide such a capability in Tanzu CSE? If yes, when will it be available? If not, this is a feature request.
Hi, I do not have full details, but from what I understood: NSX ALB communicates with vCenter using a "vCenter account" dedicated for this purpose in vCenter (this is part of "create NSX-T Cloud"). So it seems that NSX ALB was somehow no longer able to communicate with vCenter, maybe because the password had been modified or something similar. (Note: I may be mistaken and the issue was with the account connecting to NSX Manager, but the concept is the same: an issue with the credentials used by the NSX-T Cloud.) After fixing the credentials the deployment was successful. Summary: the issue was not related to Tanzu/CSE but to the underlying NSX ALB infrastructure. Unfortunately it is not easy to pinpoint the origin when looking at the error at the Tanzu/CSE level. Hence the feature request to add a "prerequisite" check and/or a wizard showing the progress of a cluster deployment step by step (completed steps, current step, and next steps). That way it would be easier to pinpoint the origin of such an issue when one step is stuck.
Hi, any update on this request? I was trying to generate a template but it didn't work: testtenant@desktop:~$ clusterctl generate cluster test Error: failed to read "cluster-template.yaml" from provider's repository "infrastructure-vcd": failed to get GitHub release v1.0.0: failed to read release "v1.0.0": GET https://api.github.com/repos/vmware/cluster-api-provider-cloud-director/releases/tags/v1.0.0: 404 Not Found [] Or is clusterctl generate still not supported for vCloud Director? As stated in this link: "clusterctl generate command doesn't support the generation of CAPI yaml for Cloud Director; Follow the guidelines provided below configure the CAPI Yaml file" cluster-api-provider-cloud-director/WORKLOAD_CLUSTER.md at main · vmware/cluster-api-provider-cloud-director · GitHub
Hi, so far I have only deployed the legacy and beta CSE clusters in a "Routed Organization Virtual Data Center Network". It works. Is it supported/tested to deploy in a "Routed Data Center Group Network"? The goal is to see whether the distributed firewall could be used to isolate clusters deployed in different networks.
Hi, I am aware it is using Cluster API in the background, which is why I am interested in how to use it directly. Could you please provide a guide on how to execute steps 2 to 4? The goal is to automate everything, ideally with the API or kubectl, and as a second choice with a CLI.  - Create a new cluster including Control Plane and Worker Pools  - Create new worker node pools  - Edit the Control Plane (like the number of nodes)  - Edit a worker node pool I am wondering whether Cluster API could be used in all scenarios or whether it is not possible for the first step (creating a new cluster).
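In upstream Cluster API, the node counts behind "edit control plane" and "edit a worker node pool" are the spec.replicas fields of the KubeadmControlPlane and MachineDeployment objects. The sketch below only builds the corresponding kubectl patch command lines; it does not run them. Whether CSE tolerates direct edits to these objects is not confirmed in this thread, and the object names and namespace used in the example are hypothetical.

```python
import json
import shlex


def replicas_patch(replicas):
    """Build a JSON merge patch that sets spec.replicas."""
    if replicas < 0:
        raise ValueError("replicas must not be negative")
    return {"spec": {"replicas": replicas}}


def kubectl_patch_command(kind, name, namespace, replicas):
    """Return a shell-quoted kubectl command that applies the replicas patch."""
    patch = json.dumps(replicas_patch(replicas), separators=(",", ":"))
    args = [
        "kubectl", "patch", kind, name,
        "--namespace", namespace,
        "--type", "merge",
        "--patch", patch,
    ]
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # Hypothetical object names and namespace, for illustration only.
    print(kubectl_patch_command("machinedeployment", "beta006-workers", "default", 2))
    print(kubectl_patch_command("kubeadmcontrolplane", "beta006-control-plane", "default", 3))
```

Such commands would have to target whichever cluster holds the Cluster API objects, and creating a cluster from scratch (the first step in the list) is a separate problem not covered by this sketch.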
After a successful deployment in the GUI, is it possible to do the same with Cluster API? The goal would be to automate all deployments, as in Cluster API Provider for VMware Cloud Director - VMware Cloud Provider Blog. For example, could Cluster API be used to edit the number of nodes, create new worker nodes, etc.?
The new capability to create multiple nodes for the control plane is good. However, how should they be operated? I am thinking of etcd: https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/ For example: What would be the process to replace a failed etcd member? Removing a failed member doesn't seem possible if the members are managed by Tanzu. I didn't see an option in the GUI to delete a specific node. Will there be an option to back up etcd directly from the GUI or via the CLI?
In the end it worked without any action on my side, so it is just slow to start.
I managed to create a new cluster; it is now in state "ready". It was initially provisioned with 3 control plane nodes and 1 worker node. I am trying to increase it to two worker nodes. In the resize wizard I select 2 for "Number of Nodes" and click submit. I end up with the message "Acknowledged node pool resize request", but after that nothing happens: no new events or tasks. The CSE journal doesn't seem to contain anything relevant to this request, only a "status check" of the cluster every minute. Is it a known issue, or is it supposed to work?
Now it is working: the task in vCenter is shown below, and the journalctl CSE logs are attached.