VMware Cloud Community
L0g333
Contributor

Could not connect to one or more vCenter Server systems https://v-center:443/sdk \\ vpxd service crashes \\ vim.fault.InvalidName

Hello everyone,

At a customer's site we are running a vCenter Server Appliance (version 6.7.0.41000, build 14836122). Since this morning, my monitoring solution can no longer query information about the connected ESXi hosts. After logging in to the vCenter web UI, I received the error message: "Could not connect to one or more vCenter Server systems https://v-center:443/sdk".

I found this thread, in which user bleuze had the same problem: "SOLVED: Could not connect to one or more vCenter Server systems".

I followed the instructions and found out that vmware-vpxd is not running. After restarting the service everything seems to work fine for a few seconds, until vmware-vpxd crashes again.

pastedImage_8.png

I investigated further in /storage/log/vmware/vpxd/vpxd.log (attached in the next post). I am not sure what is causing the crash. Maybe a faulty package was installed or downloaded by Update Manager? The errors start at line 3292 of the log file:

2019-10-29T10:37:00.981+01:00 error vpxd[05017] [Originator@6876 sub=[SSO] opID=a048760] [UserDirectorySso] GetUserInfo exception: N7Vmacore9Authorize25AuthUserNotFoundExceptionE(User localos\com.vmware.vim.eam)
--> [context]zKq7AVECAAAAAGC34QAVdnB4ZAAA4AArbGlidm1hY29yZS5zbwAAWCUbAIyfGAHcNvV2cHhkAAFLaPUBemv1AQhx9QGiXPUBY131AZKdngGaoZ+CIbEBAWxpYnZpbS10eXBlcy5zbwABA9JyAUzWcQH943EBpD5yAHFvIwA6ciMAnVYrA9RzAGxpYnB0aHJlYWQuc28uMAAE3Y4ObGliYy5zby42AA==[/context]
2019-10-29T10:37:00.983+01:00 error vpxd[05017] [Originator@6876 sub=[SSO] opID=a048760] [UserDirectorySso] NormalizeUserName(com.vmware.vim.eam, false) exception: N7Vmacore9Authorize25AuthUserNotFoundExceptionE(User localos\com.vmware.vim.eam)
--> [context]zKq7AVECAAAAAGC34QAVdnB4ZAAA4AArbGlidm1hY29yZS5zbwAAWCUbAIyfGAHcNvV2cHhkAAFLaPUBemv1AQhx9QGiXPUBY131AZKdngGaoZ+CIbEBAWxpYnZpbS10eXBlcy5zbwABA9JyAUzWcQH943EBpD5yAHFvIwA6ciMAnVYrA9RzAGxpYnB0aHJlYWQuc28uMAAE3Y4ObGliYy5zby42AA==[/context]
2019-10-29T10:37:00.998+01:00 info vpxd[05017] [Originator@6876 sub=vpxLro opID=a048760] [VpxLRO] -- FINISH lro-233
2019-10-29T10:37:02.244+01:00 info vpxd[05007] [Originator@6876 sub=DAS] [FdmManager::MonitorVmHostStateCallback] All VMs and hosts have been protected.
2019-10-29T10:37:02.665+01:00 info vpxd[04985] [Originator@6876 sub=vpxLro opID=5c421eb3] [VpxLRO] -- BEGIN lro-238 -- CustomFieldsManager -- vim.CustomFieldsManager.addFieldDefinition -- 52c75802-dce2-4af1-9514-2c6831594611(5232e022-9e06-f146-e2fb-3654948be4fa)
2019-10-29T10:37:02.666+01:00 info vpxd[04985] [Originator@6876 sub=vpxLro opID=5c421eb3] [VpxLRO] -- FINISH lro-238
2019-10-29T10:37:02.666+01:00 info vpxd[04985] [Originator@6876 sub=Default opID=5c421eb3] [VpxLRO] -- ERROR lro-238 -- CustomFieldsManager -- vim.CustomFieldsManager.addFieldDefinition: vim.fault.DuplicateName:
--> Result:
--> (vim.fault.DuplicateName) {
-->    faultCause = (vmodl.MethodFault) null,
-->    faultMessage = <unset>,
-->    name = "com.vmware.vsan.clusterstate",
-->    object = 'vim.CustomFieldsManager:9334614a-3cb3-4a53-b975-2fce444bafc1:CustomFieldsManager'
-->    msg = ""
--> }
--> Args:
-->
--> Arg name:
--> "com.vmware.vsan.clusterstate"
--> Arg moType:
--> "vim.ClusterComputeResource"
--> Arg fieldDefPolicy:
-->
--> Arg fieldPolicy:
-->

Opening the Update Manager on vSphere Web Client displays "An unexpected error occured".

I am a little confused about how to proceed. I'd be very grateful if somebody had an idea how to fix this problem.

Thank you in advance!

1 Solution

Accepted Solutions
KocPawel
Hot Shot

I found this in the log:

2019-10-29T10:36:22.729+01:00 error vpxd[05056] [Originator@6876 sub=vpxdVdb] Shutting down the VC as there is not enough free space for the Database(used: 95%; threshold: 95%).

This should be helpful:

https://kb.vmware.com/s/article/67017

and this:

VMware Knowledge Base

vCenter automatically shuts down the vCenter service when vPostgres DB usage exceeds 95% (I don't remember exactly; the value could be different).
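To check quickly whether a vpxd.log contains this shutdown message, a minimal Python sketch like the one below could be used. The regular expression is built only from the wording of the log line quoted above, and the function name `find_db_space_shutdowns` is made up for illustration; other vCenter builds may word the message differently.

```python
import re

# Pattern derived from the vpxd.log line quoted in this thread (an assumption:
# other vCenter versions may phrase the message differently).
DB_SPACE_RE = re.compile(
    r"not enough free space for the Database\(used: (\d+)%; threshold: (\d+)%\)"
)


def find_db_space_shutdowns(lines):
    """Return (timestamp, used_pct, threshold_pct) for each matching log line."""
    hits = []
    for line in lines:
        match = DB_SPACE_RE.search(line)
        if match:
            # The timestamp is the first whitespace-separated field of the line.
            timestamp = line.split(" ", 1)[0]
            hits.append((timestamp, int(match.group(1)), int(match.group(2))))
    return hits


# Two lines taken from the log excerpts in this thread.
sample = [
    "2019-10-29T10:36:22.729+01:00 error vpxd[05056] [Originator@6876 sub=vpxdVdb] "
    "Shutting down the VC as there is not enough free space for the Database"
    "(used: 95%; threshold: 95%).",
    "2019-10-29T10:37:00.998+01:00 info vpxd[05017] [Originator@6876 sub=vpxLro "
    "opID=a048760] [VpxLRO] -- FINISH lro-233",
]

print(find_db_space_shutdowns(sample))
```

On the appliance itself, the same function could be fed the open file, for example `open("/storage/log/vmware/vpxd/vpxd.log", errors="replace")`, since it only needs an iterable of lines.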


15 Replies
L0g333
Contributor

I could not attach the log file to my original post, so I have attached it here.

Vijay2027
Expert

It looks like you are hitting a known issue with 6.7U3:

VMware Knowledge Base

L0g333
Contributor

Correct. I found out a few minutes ago that we are affected by the excessive hardware health alarms described here: VMware Knowledge Base

vCenter received tons of events, which filled up the SEAT partition:

pastedImage_1.png

Thank you, KocPawel, for the hint. I was able to resize the vCenter partition with the help of this article: https://vm.knutsson.it/2018/07/10fb-does-not-support-flow-control-autoneg/#more-749 .

As a first step, I'll limit event retention to 14 days, which should be enough to keep the SEAT partition from filling up.

Vijay2027
Expert

Resizing /storage/seat is not a good option. During the next upgrade you might be limited to the large and x-large options.

L0g333
Contributor

That would be unfortunate. What would you recommend?

I have a vCenter database backup that was created automatically last night, and a VEEAM backup, also created last night. The error occurred this morning, so I could restore vCenter from the VEEAM backup, start it, and limit the event retention time so that it does not fill up the disk completely.

I have never had to restore vCenter from a VEEAM VM backup. Is there anything I should consider before doing this?

Vijay2027
Expert

In the KB (VMware Knowledge Base), follow the section:

To workaround the excessive events filling VCDB, user can follow the below step:

L0g333
Contributor

Thank you for your response. I have read this KB article, but I have already resized the SEAT partition. So my question is how to revert this, to avoid running into the upgrade problems you mentioned.

Do you know if there is anything I should consider before restoring the VCSA from my VEEAM backup? I would do this to get the VCSA back into the state before the resize, and then follow the truncation solution described in the KB article.

Vijay2027
Expert

Please share the output of the command below:

df -Th

L0g333
Contributor

root@VCSA001 [ ~ ]# df -Th
Filesystem                               Type      Size  Used Avail Use% Mounted on
devtmpfs                                 devtmpfs  4.9G     0  4.9G   0% /dev
tmpfs                                    tmpfs     4.9G  872K  4.9G   1% /dev/shm
tmpfs                                    tmpfs     4.9G  684K  4.9G   1% /run
tmpfs                                    tmpfs     4.9G     0  4.9G   0% /sys/fs/cgroup
/dev/sda3                                ext4       11G  6.3G  3.8G  63% /
tmpfs                                    tmpfs     4.9G  1.5M  4.9G   1% /tmp
/dev/mapper/log_vg-log                   ext4      9.8G  3.5G  5.8G  38% /storage/log
/dev/mapper/updatemgr_vg-updatemgr       ext4       99G  1.3G   93G   2% /storage/updatemgr
/dev/mapper/db_vg-db                     ext4      9.8G  389M  8.9G   5% /storage/db
/dev/mapper/dblog_vg-dblog               ext4       15G  310M   14G   3% /storage/dblog
/dev/mapper/netdump_vg-netdump           ext4      985M  1.3M  916M   1% /storage/netdump
/dev/mapper/autodeploy_vg-autodeploy     ext4      9.8G   34M  9.2G   1% /storage/autodeploy
/dev/mapper/seat_vg-seat                 ext4       15G  4.6G  9.4G  33% /storage/seat
/dev/mapper/imagebuilder_vg-imagebuilder ext4      9.8G   23M  9.2G   1% /storage/imagebuilder
/dev/sda1                                ext4      120M   34M   78M  31% /boot
/dev/mapper/core_vg-core                 ext4       25G  180M   24G   1% /storage/core
/dev/mapper/archive_vg-archive           ext4       50G   30G   18G  63% /storage/archive
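
For anyone who wants to keep an eye on these numbers, output like this can be screened with a short Python sketch. It assumes the seven-column `df -Th` layout shown above; the function names and the 60% cutoff are hypothetical and only serve the illustration. The sample contains three rows copied from the output above.

```python
def parse_df(text):
    """Parse `df -Th` text into dicts, skipping the prompt and header lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly 7 fields and a numeric Use% column.
        if len(parts) == 7 and parts[5].endswith("%") and parts[5][:-1].isdigit():
            rows.append(
                {"mount": parts[6], "size": parts[2], "use_pct": int(parts[5][:-1])}
            )
    return rows


def mounts_at_or_above(rows, threshold_pct):
    """Return mount points whose usage is at or above threshold_pct."""
    return [row["mount"] for row in rows if row["use_pct"] >= threshold_pct]


# Three rows copied from the df output in this thread.
DF_SAMPLE = """\
root@VCSA001 [ ~ ]# df -Th
Filesystem                               Type      Size  Used Avail Use% Mounted on
/dev/sda3                                ext4       11G  6.3G  3.8G  63% /
/dev/mapper/seat_vg-seat                 ext4       15G  4.6G  9.4G  33% /storage/seat
/dev/mapper/archive_vg-archive           ext4       50G   30G   18G  63% /storage/archive
"""

rows = parse_df(DF_SAMPLE)
for row in sorted(rows, key=lambda r: r["use_pct"], reverse=True):
    print(f'{row["mount"]:<18} {row["use_pct"]:>3}% of {row["size"]}')
print("at or above 60%:", mounts_at_or_above(rows, 60))
```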

sushilkm
Enthusiast

Did you try truncating the events and other housekeeping entries in the embedded Postgres DB?

L0g333
Contributor

No, I did not. But I guess this would have been successful. Sadly, I have already resized the SEAT partition and am now afraid of future problems because of Vijay2027's hint about large and x-large upgrades. I thought resizing would not be a problem. I am now thinking of restoring the VCSA to a state before it ran full and truncating the logs afterwards.

sushilkm
Enthusiast

I would still suggest truncating the DB to avoid getting into trouble during the upgrade. If this system is not production-critical, it might be a good idea to build a new one and do a selective restore.

L0g333
Contributor

Thank you for your help! What exactly do you mean by selective restore?

Vijay2027
Expert

The vCSA is well below the default storage size of 300 GB. Looks good for now.
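
As a rough cross-check of that statement, the sizes of the disk-backed filesystems from the df output earlier in the thread can be added up. This is only a sketch under assumptions: it treats the sum of filesystem sizes as an approximation of the provisioned storage (swap and filesystem overhead are not included), and it assumes the `df -h` suffixes are powers of 1024.

```python
# Conversion factors from df -h suffixes to GiB (assumes 1024-based units).
UNIT_TO_GIB = {"K": 1 / 1024**2, "M": 1 / 1024, "G": 1.0, "T": 1024.0}


def to_gib(size):
    """Convert a df -h size such as '9.8G' or '985M' to GiB."""
    return float(size[:-1]) * UNIT_TO_GIB[size[-1]]


# Size column of the non-tmpfs filesystems in the df output posted above.
SIZES = {
    "/": "11G",
    "/storage/log": "9.8G",
    "/storage/updatemgr": "99G",
    "/storage/db": "9.8G",
    "/storage/dblog": "15G",
    "/storage/netdump": "985M",
    "/storage/autodeploy": "9.8G",
    "/storage/seat": "15G",
    "/storage/imagebuilder": "9.8G",
    "/boot": "120M",
    "/storage/core": "25G",
    "/storage/archive": "50G",
}

total_gib = sum(to_gib(size) for size in SIZES.values())
print(f"approx. {total_gib:.0f} GiB across listed filesystems")
print("below 300 GB default:", total_gib < 300)
```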