VMware Cloud Community
itorder
Contributor

Snapshot stun issue

Hello,

I am seeing long stuns on a very busy VM during backups, in the snapshot consolidation step. They last around 40 s, which brings down the jobs running on the VM.

I have seen that "snapshot.maxConsolidateTime" can be tuned on versions 3.5 to 5.5, but I am on version 6.5. What can be done?

Regards,

8 Replies
pragg12
Hot Shot

Hi,

Welcome to VMTN. 🙂

Refer to the VMware KB article below on this particular setting.

How to increase the time limit on snapshot consolidation (2146270)

I haven't used this setting before, so if you want to try it, try it on a test VM first.

I need some more insight into the affected VM.
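
As a rough illustration of what trying this on a test VM could look like, here is a minimal sketch that sets the option in a copy of a VM's .vmx text. It assumes the usual `key = "value"` line format of .vmx files; the helper name `set_vmx_option` is hypothetical, and whether the key is honored on 6.5 is not confirmed here. The same option can typically also be added through the VM's advanced configuration parameters in the client while the VM is powered off.

```python
import re


def set_vmx_option(vmx_text: str, key: str, value: str) -> str:
    """Return vmx_text with `key = "value"` set.

    Hypothetical helper: replaces an existing entry for `key`,
    or appends a new line if the key is not present.
    """
    line = f'{key} = "{value}"'
    # Match a whole line that assigns this key (keys are compared case-insensitively here).
    pattern = re.compile(
        rf'^[ \t]*{re.escape(key)}[ \t]*=.*$',
        re.MULTILINE | re.IGNORECASE,
    )
    if pattern.search(vmx_text):
        # Use a function as replacement so the value is inserted literally.
        return pattern.sub(lambda _m: line, vmx_text)
    separator = "" if (not vmx_text or vmx_text.endswith("\n")) else "\n"
    return vmx_text + separator + line + "\n"


if __name__ == "__main__":
    original = 'displayName = "test-vm"\n'
    print(set_vmx_option(original, "snapshot.maxConsolidateTime", "30"))
```

Work on a copy of the file and keep the original, so the change is easy to roll back if the stun time does not improve.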

1. What troubleshooting steps have you performed so far ?

2. Total size of VM ?

3. Application(s) hosted on VM ?

4. Any active or manual snapshot on VM before backup starts ?

5. Is VM doing any IO intensive operation during backup time ?

6. Have you tried modifying backup schedule to see if you face long stun times again ?

Consider marking this response as "Correct" or "Helpful" if you think my response helped you in any way.
itorder
Contributor

Hi,

Thanks for your reply. I've seen this link before, but it only refers to 6.0, not 6.5.

We will give it a try next week.

2. Total size of VM ?

     2 x 100 GB

3. Application(s) hosted on VM ?

     SAP, plus Orchestrator on some VMs, or custom apps.

4. Any active or manual snapshot on VM before backup starts ?

     No

5. Is VM doing any IO intensive operation during backup time ?

     Yes, this is precisely the moment when the VM is stunned for 40 s. With a "normal" I/O rate the stuns are around 1 s.

6. Have you tried modifying backup schedule to see if you face long stun times again ?

     We can't because of internal policies.

pragg12
Hot Shot

Normally, it doesn't take 40 s for a 200 GB VM to perform snapshot tasks. However, since you mentioned SAP and high I/O in response to questions 3 and 5 respectively, I have a few more questions.

7. What's the underlying storage type for this VM ? HDD or SSD ?

8. How many other VMs on the same datastore as this SAP VM ?

9. Are the VMs for ques 8 on same ESXi host as SAP VM ?

10. Does the backup schedule of the VMs for ques 8 run at same time as the SAP VM ?

itorder
Contributor

7. What's the underlying storage type for this VM ? HDD or SSD ?

A mix of SSD and HDD. The VMs are stored on Nutanix, and we are using Rubrik to back up the VMs.

8. How many other VMs on the same datastore as this SAP VM ?

They are all located on different ESXi hosts, but on the same datastore.

We already tried migrating one VM to a different datastore, but we saw the same stun time, 40 s.

9. Are the VMs for ques 8 on same ESXi host as SAP VM ?

All SAP VMs are on the same ESXi host. We already tried migrating one VM to a different datastore, but we saw the same stun time, 40 s.

10. Does the backup schedule of the VMs for ques 8 run at same time as the SAP VM ?

The backup window is from midnight to 8 am. We can set the precise backup time within this window.

pragg12
Hot Shot

Hi,

I see you have marked my previous response as Correct. Did you find something that resolved your issue ?

itorder
Contributor

Hello,

I marked it as correct by mistake; I haven't found a fix yet.

We tried changing MaxConsolidateTime from 6 to 30, but this setting is documented for v6.0, and the VM was still stunned for 41 s.

Regards,

pragg12
Hot Shot

I have a few suggestions to check this further.

1. Migrate the affected VM to another host in the cluster and wait for the backup to run at the same time, to see whether the stun time is still the same.

2. If the stun time is still the same, move the backup to a different time within the same backup window, after the backups of the other VMs on the same datastore have finished.

Share your findings once you have checked. Also, let us know the replication factor (RF) in place for the underlying Nutanix datastore cluster.
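
To compare the stun times between these tests, a small script can pull the durations out of the VM's vmware.log. This is only a sketch: it assumes the log records each stun as `Checkpoint_Unstun: vm stopped for <N> us`, which you should verify against your own log first, and the function names are hypothetical.

```python
import re
from typing import Dict, List, Union

# Assumed log message format; check your own vmware.log before relying on it.
STUN_RE = re.compile(r"Checkpoint_Unstun: vm stopped for (\d+) us")


def stun_seconds(log_text: str) -> List[float]:
    """Return every stun duration found in the log text, in seconds."""
    return [int(m.group(1)) / 1_000_000 for m in STUN_RE.finditer(log_text)]


def summarize(label: str, log_text: str) -> Dict[str, Union[str, int, float]]:
    """Summarize one test run: number of stuns and the longest one."""
    stuns = stun_seconds(log_text)
    return {
        "test": label,
        "count": len(stuns),
        "max_s": max(stuns, default=0.0),
    }


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = (
        "vcpu-0| Checkpoint_Unstun: vm stopped for 41000000 us\n"
        "vcpu-0| Checkpoint_Unstun: vm stopped for 1000000 us\n"
    )
    print(summarize("original host", sample))
```

Running it once per test (original host, other host, different backup time) gives a like-for-like comparison of the longest stun in each case.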

Linjo
Leadership

Unchecked the "Correct" answer since it seems that it was in error.

I would also recommend opening an SR with Nutanix, since they would be able to trace this in their storage stack better.

Best Regards, Linjo
