VMware Cloud Community
rcl23
Contributor

VCSA 7.0.3.01000: service "VMware vStats Service" (vstats) does not start after power off.

Hi,

After a long while, we noticed that on one of our vCenter 7.0.3.01000 appliances the "VMware vStats Service" (vstats) is stopped, although it is supposed to be running since its startup type is set to "Automatic", and there is no warning in the vCenter GUI... 😞
It cannot be started through the GUI (A system error occurred. See logs for details) or via SSH; the error is the following:

root@xxxxxxxxxxxxxxxxx [ ~ ]# service-control --start vstats
Operation not cancellable. Please wait for it to finish...
Performing start operation on service vstats...
Error executing start on service vstats. Details {
    "detail": [
        {
            "id": "install.ciscommon.service.failstart",
            "translatable": "An error occurred while starting service '%(0)s'",
            "args": [
                "vstats"
            ],
            "localized": "An error occurred while starting service 'vstats'"
        }
    ],
    "componentKey": null,
    "problemId": null,
    "resolution": null
}
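
For completeness, since service-control on the VCSA is a front end to vMon, I assume the vMon log may hold more detail than this generic error; a quick check would be (standard VCSA path):

# vMon supervises vstats; failed starts are usually explained here
tail -n 200 /var/log/vmware/vmon/vmon.log | grep -i vstats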

The only valuable thing I found in the support bundle logs (xxxxxxxxxxxxxxxxx-2023-06-01--13.53-61238\var\log\vmware\vstats\) is that the service has not worked since February 24th, 2023: due to a maintenance operation ongoing that day, the vCenter was powered off, and the "VMware vStats Service" (vstats) never started again.
I need this service to be running in order to configure backup of this vCenter.
Where can I find useful information that would indicate why this service can no longer run?

Thanks in advance, any help would be appreciated.
Regards.

14 Replies
mbufkin
Enthusiast

Have you considered updating the VCSA to 7.0.3.01400? I know you are on version 7, but I found an article for 6.5 and 6.7 that references an update that fixed this issue on those versions. I'm just wondering whether that's been addressed in 7.0.3.01400.

rcl23
Contributor

Hello,

Thanks for your reply.
Upgrading is indeed a possibility, but that would not explain what the issue was.
I would also consider running this tool; maybe it would give me some interesting information:
https://flings.vmware.com/vsphere-diagnostic-tool
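
If I run it, my understanding from the fling page is that it is executed directly on the appliance shell, roughly like this (untested here; archive name and steps per my reading of the fling's readme):

# copy the downloaded fling archive to the VCSA, extract it and run as root
unzip vdt-*.zip
cd vdt*
python vdt.py    # prompts for the SSO administrator password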

its-vmware
Contributor

Did you get any resolution on this? I'm running 7.0.3.01400 (21477706) and the same thing happened after using shutdown from the :5480 maintenance page.

Ajay1988
Expert

Is vstats the only service down?

Did you try "service-control --start --all"? See which service fails.

Also check the logs under /var/log/vmware/vstats/.
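
For example (standard VCSA paths):

# list the run state of every service first
service-control --status --all
# then start them all and note which one fails
service-control --start --all
# and look at the most recent vstats logs
ls -ltr /var/log/vmware/vstats/
tail -n 50 /var/log/vmware/vstats/vstats-stderr.log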

If you think your queries have been answered,
mark this response as "Correct" or "Helpful".

Regards,
AJ
rcl23
Contributor

The issue is not fixed yet; still investigating. I will of course share any progress here.
OK, it looks like upgrading to 7.0.3.01400 would not help much in this case...

rcl23
Contributor

Yes, " vstats " is the only service down with automatic startup type.
I did tried to start all services via the usual command line " service-control --start --all ", but it fails on " vstats "...
Like I said in my first post, I have checked \var\log\vmware\vstats\ logs and did not found any information after February, 24th, 2023, when we performed a maintenance operation and the vCenter has been powered off for a duration of around 15 hours.

Ajay1988
Expert

So it seems something happened on 24th Feb... Are there any logs from the days before it stopped?

If you think your queries have been answered,
mark this response as "Correct" or "Helpful".

Regards,
AJ
rcl23
Contributor

In " vstats-stderr.log ", we reapeatdly observe that kind of event:

2023-02-24T18:25:24Z ERROR vchelper/eventsListener.go:364 Error while executing WaitForUpdates: Post "https://localhost:443/sdk": context canceled {"vStats": "MVP"}
vstats/vchelper.(*vcEventsListener).executeWaitForEvents
src/vstats/vchelper/eventsListener.go:364
vstats/vchelper.(*vcEventsListener).waitForEventsLoop.func1
src/vstats/vchelper/eventsListener.go:289
vstats/vchelper.(*vcEventsListener).waitForEventsLoop
src/vstats/vchelper/eventsListener.go:301
vstats/vchelper.(*entityTracker).executeGetEvents
src/vstats/vchelper/entityTracker.go:176
vstats/vchelper.(*entityTracker).processGetEvents.func1
src/vstats/vchelper/entityTracker.go:139
vstats/util/recoverable.Func.func1
src/vstats/util/recoverable/recoverable.go:70
vstats/vchelper.(*entityTracker).processGetEvents
src/vstats/vchelper/entityTracker.go:158
2023-02-24T18:25:24Z ERROR vchelper/client.go:358 Invoke method failed {"vStats": "MVP", "req": "CancelWaitForUpdatesBody", "error": "Post \"https://localhost:443/sdk\": context canceled"}
vstats/vchelper.(*reloginRoundTripper).RoundTrip
src/vstats/vchelper/client.go:358
2023-02-24T18:25:24Z ERROR vchelper/eventsListener.go:361 Error executing CancelWFU: Post "https://localhost:443/sdk": context canceled {"vStats": "MVP"}
vstats/vchelper.(*vcEventsListener).executeWaitForEvents
src/vstats/vchelper/eventsListener.go:361
vstats/vchelper.(*vcEventsListener).waitForEventsLoop.func1
src/vstats/vchelper/eventsListener.go:289
vstats/vchelper.(*vcEventsListener).waitForEventsLoop
src/vstats/vchelper/eventsListener.go:301
vstats/vchelper.(*entityTracker).executeGetEvents
src/vstats/vchelper/entityTracker.go:176
vstats/vchelper.(*entityTracker).processGetEvents.func1
src/vstats/vchelper/entityTracker.go:139
vstats/util/recoverable.Func.func1
src/vstats/util/recoverable/recoverable.go:70
vstats/vchelper.(*entityTracker).processGetEvents
src/vstats/vchelper/entityTracker.go:158

In " vstats-runtime.log.stderr ", the file ends like that:
...
2023/02/24 17:40:44 Reporting span 56cc320da497e27e:56cc320da497e27e:0000000000000000:1
2023/02/24 18:11:25 Reporting span 6f160590cae0a4b5:6f160590cae0a4b5:0000000000000000:1
2023/02/24 18:42:06 Reporting span 47dc932706780806:47dc932706780806:0000000000000000:1
2023/02/24 19:12:46 Reporting span 000e58fcaa2ac781:000e58fcaa2ac781:0000000000000000:1
2023/02/24 19:25:24 DEBUG: closing tracer
2023/02/24 19:25:24 DEBUG: closing reporter

In " vstats-stdout.log ":
...
2023-02-24T18:25:24Z ERROR vchelper/eventsListener.go:361 Error executing CancelWFU: Post "https://localhost:443/sdk": context canceled {"vStats": "MVP"}
vstats/vchelper.(*vcEventsListener).executeWaitForEvents
src/vstats/vchelper/eventsListener.go:361
vstats/vchelper.(*vcEventsListener).waitForEventsLoop.func1
src/vstats/vchelper/eventsListener.go:289
vstats/vchelper.(*vcEventsListener).waitForEventsLoop
src/vstats/vchelper/eventsListener.go:301
vstats/vchelper.(*entityTracker).executeGetEvents
src/vstats/vchelper/entityTracker.go:176
vstats/vchelper.(*entityTracker).processGetEvents.func1
src/vstats/vchelper/entityTracker.go:139
vstats/util/recoverable.Func.func1
src/vstats/util/recoverable/recoverable.go:70
vstats/vchelper.(*entityTracker).processGetEvents
src/vstats/vchelper/entityTracker.go:158
2023-02-24T18:25:24Z ERROR vchelper/eventsListener.go:364 Error while executing WaitForUpdates: Post "https://localhost:443/sdk": context canceled {"vStats": "MVP"}
vstats/vchelper.(*vcEventsListener).executeWaitForEvents
src/vstats/vchelper/eventsListener.go:364
vstats/vchelper.(*vcEventsListener).waitForEventsLoop.func1
src/vstats/vchelper/eventsListener.go:289
vstats/vchelper.(*vcEventsListener).waitForEventsLoop
src/vstats/vchelper/eventsListener.go:301
vstats/vchelper.(*entityTracker).executeGetEvents
src/vstats/vchelper/entityTracker.go:176
vstats/vchelper.(*entityTracker).processGetEvents.func1
src/vstats/vchelper/entityTracker.go:139
vstats/util/recoverable.Func.func1
src/vstats/util/recoverable/recoverable.go:70
vstats/vchelper.(*entityTracker).processGetEvents
src/vstats/vchelper/entityTracker.go:158
2023-02-24T18:25:24Z INFO vchelper/client.go:552 Successfully logged out from VC {"vStats": "MVP"}
2023-02-24T18:25:24Z INFO mixedmode/mm_init.go:108 Closing cache CoreDB in mixed mode {"vStats": "MVP"}
2023-02-24T18:25:24Z INFO cache/cache.go:207 Closing CoreDB {"vStats": "MVP"}
2023-02-24T18:25:24Z INFO mixedmode/mm_init.go:112 Closing database CoreDB instance in mixed mode {"vStats": "MVP"}
2023-02-24T18:25:24Z INFO mixedmode/mm_init.go:121 Closing database StateDB instance in mixed mode {"vStats": "MVP"}
2023-02-24T18:25:24Z INFO mixedmode/mm_init.go:130 Closing database TsDB instance in mixed mode {"vStats": "MVP"}

In " vstats-telemetry.log ":
...
2023-02-24T18:22:12Z INFO introspect/introspect.go:307 introspection-summary {"vStats": "MVP", "name": "vrm:VM:periodic_refresh_wait_time_summary", "count": 3642, "sum": 4616969, "min": "NaN", "max": "NaN", "median": "NaN", "p75": "NaN", "p95": "NaN", "p99": "NaN", "p99.9": "NaN"}
2023-02-24T18:22:12Z INFO introspect/introspect.go:307 introspection-summary {"vStats": "MVP", "name": "vrm:VM:resource_events_sent_count", "count": 3802, "sum": 0, "min": "NaN", "max": "NaN", "median": "NaN", "p75": "NaN", "p95": "NaN", "p99": "NaN", "p99.9": "NaN"}
2023-02-24T18:22:12Z INFO introspect/introspect.go:307 introspection-summary {"vStats": "MVP", "name": "vrm:VM:tracker_events_count", "count": 160, "sum": 160, "min": "NaN", "max": "NaN", "median": "NaN", "p75": "NaN", "p95": "NaN", "p99": "NaN", "p99.9": "NaN"}
2023-02-24T18:22:12Z INFO introspect/introspect.go:307 introspection-summary {"vStats": "MVP", "name": "vrm:rm:from_bridgeID_execution_time_summary", "count": 0, "sum": 0, "min": "NaN", "max": "NaN", "median": "NaN", "p75": "NaN", "p95": "NaN", "p99": "NaN", "p99.9": "NaN"}
2023-02-24T18:22:12Z INFO introspect/introspect.go:307 introspection-summary {"vStats": "MVP", "name": "vrm:rm:to_bridgeID_execution_time_summary", "count": 0, "sum": 0, "min": "NaN", "max": "NaN", "median": "NaN", "p75": "NaN", "p95": "NaN", "p99": "NaN", "p99.9": "NaN"}

And that's it for the log files under "/var/log/vmware/vstats/" modified at, or just before, the time the VCSA was shut down.
It does not mean much to me; I do not see a lot of helpful information...



MadMancunian
Contributor

Just want to add that I have the exact same problem...
VCSA 7.0.3.01300


RajeevVCP4
Expert

You need to open a case with VMware.

It seems there are missing relations/columns in the database required for the vstats service.

Rajeev Chauhan
VCIX-DCV6.5/VSAN/VXRAIL
Please mark as helpful or correct if my answer is useful to you
likeahoss
Enthusiast


@RajeevVCP4 wrote:

You need to open a case with VMware.

It seems there are missing relations/columns in the database required for the vstats service.


How did you come to this conclusion?  Why isn't there a KB for this?

Thanks!

RajeevVCP4
Expert
Expert
Jump to solution

There is an internal KB related to this error which we can't publish here.

Rajeev Chauhan
VCIX-DCV6.5/VSAN/VXRAIL
Please mark as helpful or correct if my answer is useful to you
rcl23
Contributor

Hello,

A solution has been found: since last Friday, the "vstats" service has stayed up.
The issue comes from the fact that we originally migrated our 6.5 vCenters with an external PSC to 7.0 vCenter appliances with an embedded PSC.
This is a hypothesis, but we may reasonably think that "accidentally" powering up the old PSC on 02/25 is the root cause of our problem.
Even though we followed the procedure provided with the 7.0 migration wizard to delete all references to the old external PSCs, it seems there was still some kind of link between our vCenters and those supposedly decommissioned machines...

The following KB article shows how to list the references between vCenters and the old PSCs; it confirmed that the old PSCs were still registered:
https://kb.vmware.com/s/article/2043509
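
For reference, the listing from that KB boils down to something like this on the appliance (the lstool path can vary by build, and "old-psc" below is a placeholder for your PSC's hostname):

# dump all Lookup Service registrations to a file
python /usr/lib/vmidentity/tools/scripts/lstool.py list \
    --url http://localhost:7080/lookupservice/sdk > /tmp/ls_services.txt
# then search it for the old PSC hostnames
grep -i "old-psc" /tmp/ls_services.txt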

Then we performed the following procedures, once again, to decommission the old PSCs:
https://kb.vmware.com/s/article/75177

https://kb.vmware.com/s/article/2106736
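
The unregister step itself is the cmsso-util command from those KBs, run on the vCenter; the FQDN and password below are placeholders:

# remove the stale PSC registration from the SSO domain
cmsso-util unregister --node-pnid old-psc.example.com \
    --username administrator@vsphere.local --passwd '<password>'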

This procedure may need to be executed several times if you still find stale entries for the old PSCs...
After that, the "vstats" service could be started on the vCenter. I hope this helps!

Ajay1988
Expert

Interesting. There is a high chance that this is the problem, as vmdird on 7.0 could have been updated again when the old PSC was powered on. Maybe some stale entries were left behind, or a clean decommission was not done.
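
A quick way to double-check which partners vmdird still knows about (vdcrepadmin ships with the appliance; the password is a placeholder):

# list all directory servers known to vmdird in this SSO domain
/usr/lib/vmware-vmdir/bin/vdcrepadmin -f showservers \
    -h localhost -u administrator -w '<password>'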

Thanks for sharing. 

If you think your queries have been answered,
mark this response as "Correct" or "Helpful".

Regards,
AJ