VMware Cloud Community
rickmaccubbin
Contributor

VM Power on delay

I'm trying to solve an issue where certain VMs hang on power on. They progress to 66% and stall, and the length of the stall depends on how much RAM the VM has. It seems the ESXi host is reading the datastore vswap file at ~100MB/sec for the entire time the VM is hung. If the VM has 512MB of RAM, the delay is 10 seconds; if it has 32GB, it can be up to 10 minutes.

I don't think it's normal for these reads to be performed before power on, else we'd see it on every single VM.

Running version 6.7U3, Dell M630 with all firmware up to date, connected via FC to a Unity SAN.

Below is a power on delay with 16GB RAM.

 

Power on - Start 05/26/2021 12:52:23 Finish 05/26/2021 12:55:02
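
A quick sanity check on these figures, as a rough sketch rather than a diagnosis: if the host reads a swap file the size of the VM's RAM at a steady rate, the stall should scale linearly with RAM. The snippet assumes no memory reservation and treats "100MB/sec" as 100 MiB/s; the function names are made up for illustration.

```python
from datetime import datetime

# Assumption: the ~100MB/sec read rate from the post, taken as MiB per second.
READ_RATE_MIB_S = 100


def expected_stall_seconds(ram_gib, rate_mib_s=READ_RATE_MIB_S):
    """Time to read a swap file as large as the VM's RAM at a fixed rate."""
    return ram_gib * 1024 / rate_mib_s


def observed_seconds(start, finish, fmt="%m/%d/%Y %H:%M:%S"):
    """Elapsed seconds between two timestamps in the format used above."""
    delta = datetime.strptime(finish, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


observed = observed_seconds("05/26/2021 12:52:23", "05/26/2021 12:55:02")
expected = expected_stall_seconds(16)
implied_rate = 16 * 1024 / observed  # MiB/s needed to read 16 GiB in that window

print(f"observed stall: {observed:.0f} s")
print(f"expected stall: {expected:.2f} s")
print(f"implied read rate: {implied_rate:.1f} MiB/s")
```

For the 16GB case this gives about 164 seconds expected against 159 seconds observed, which is consistent with one full sequential read of a 16GB file at roughly 100MB/sec.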

 

 

18 Replies
e_espinel
Virtuoso

Hello.
Just so we are on the same page:
Did you update VMware Tools on the affected VMs?
Have you installed a patch level higher than Update 3 (14320388)? If not, you could upgrade to Build 17167734 (ESXi 670-202011002). It is not the latest available, but it is the one I have tested with some customers without problems so far.

The patches can be obtained from the following link

https://my.vmware.com/group/vmware/patch#search

 

Enrique Espinel
Senior Technical Support on IBM, Lenovo, Veeam Backup and VMware vSphere.
VSP-SV, VTSP-SV, VTSP-HCI, VTSP
Please mark my comment as Correct Answer or assign Kudos if my answer was helpful to you, Thank you.
rickmaccubbin
Contributor

I'd rather not apply any further upgrades unless this issue is specifically listed as resolved in the release notes. The issue is only occurring in 1 datacentre out of 8, and only on certain VMs.

Yes, VMware Tools has been updated, with no change.

rickmaccubbin
Contributor

Reserving the VM memory completely eliminates the delay.

scott28tt
VMware Employee

1. The swap file is created at power on, in the VM home directory alongside the VMX file, and is (memory size - memory reserved)

2. A disk performance issue? LUN thrashing?

3. What have VMware support said?
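
Point 1 can be written out as a small calculation. This is a minimal sketch of the sizing rule stated there (swap file size = memory size - memory reserved); `vswp_size_mb` is a hypothetical helper name, not a VMware API.

```python
def vswp_size_mb(configured_mb, reserved_mb=0):
    """Swap file size per the rule above: configured memory minus reservation."""
    if reserved_mb < 0 or reserved_mb > configured_mb:
        raise ValueError("reservation must be between 0 and the configured memory")
    return configured_mb - reserved_mb


# A 16GB VM with no, half, and full memory reservation.
for reserved in (0, 8192, 16384):
    size = vswp_size_mb(16384, reserved)
    print(f"reserved {reserved:>5} MB -> swap file {size:>5} MB")
```

With the memory fully reserved the swap file is 0 MB, so there is nothing on the datastore to read at power on. That would fit the earlier observation in this thread that reserving the VM memory completely eliminates the delay.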

 


-------------------------------------------------------------------------------------------------------------------------------------------------------------

Although I am a VMware employee I contribute to VMware Communities voluntarily (ie. not in any official capacity)
rickmaccubbin
Contributor

 

1. The swap file is created at power on, in the VM home directory alongside the VMX file, and is (memory size - memory reserved)

My main question is: why does the VM need to read the entire swap file before booting? On a host with 768GB of RAM and only one VM with 16GB allocated, it still reads this swap file, and I don't understand why. The delay is not in creating the swap file; it occurs after the swap file is created, when the ESXi host reads the entire amount (16GB) back at 100MB/sec.

 

2. A disk performance issue? LUN thrashing?

No, the storage is under no stress during this time; something else is limiting the reads to 100MB/sec.


3. What have VMware support said?

It's been escalated to the Storage team and I'm awaiting their response. The case number is #21220866705 if you're interested.
depping
Leadership

That is very strange. A swap file gets newly created, and I have no idea why it would need to be read by the VM during boot, as the VM itself is normally not even aware of the swap file. It is almost as if the swap file is zeroed out during boot, but I have never heard of that before, to be honest.

rickmaccubbin
Contributor

Thank you for understanding and acknowledging that this is not normal behaviour. It's quite frustrating to waste time with support being told the issue is a 2 year old BIOS and other completely unrelated things. I feel half sane now.
depping
Leadership

You could probably validate swap access through "esxtop". Also, is this a power-on or a restart / reset?
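
One way to do that check offline, sketched under assumptions: esxtop can write its counters as CSV in batch mode (for example `esxtop -b -d 2 -n 30 > capture.csv`), and the capture can then be filtered for the counters of interest. The column names in the sample below are placeholders, not real esxtop counter names, and `column_means` is a hypothetical helper.

```python
import csv
import io


def column_means(csv_text, needle):
    """Average every CSV column whose header contains `needle` (case-insensitive)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    wanted = [i for i, name in enumerate(header) if needle.lower() in name.lower()]
    sums = {i: 0.0 for i in wanted}
    count = 0
    for row in reader:
        if len(row) != len(header):
            continue  # skip truncated or malformed rows
        for i in wanted:
            sums[i] += float(row[i])
        count += 1
    if count == 0:
        return {}
    return {header[i]: sums[i] / count for i in wanted}


# Placeholder capture: two samples, made-up column names and values.
SAMPLE = (
    '"time","disk(example) read MB/s","disk(example) write MB/s"\n'
    '"t0","98.0","1.0"\n'
    '"t1","102.0","3.0"\n'
)

print(column_means(SAMPLE, "read"))
```

Capturing during a stalled power on and during a normal one, then comparing the read counters for the datastore in question, would show whether the host really is reading at a flat rate for the whole stall.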

e_espinel
Virtuoso

Hello.
All operating systems have a maintenance policy, i.e. the application of patches periodically. Patching improves features, performance and prevents bugs.
If your ESXi hosts are on version 6.7 Update 3, you need to read the following article:

https://kb.vmware.com/s/article/76159

 

rickmaccubbin
Contributor

...what? No offense, but your responses are the equivalent of the "SFC /scannow" replies on Microsoft forums that have nothing to do with the actual issue. The article you linked refers to ESXi boot issues, not VM boot issues.

rickmaccubbin
Contributor

Only during power on operations from powered off state.

depping
Leadership

Thanks. I just spoke with an engineer who, I think, also spoke to you via Reddit. He is investigating it, as this is very odd.

e_espinel
Virtuoso

Hello.
Of course, your problem is with the power-on of some VMs.
I pointed you to KB76159 because you are at exactly that level, ESXi 6.7 Update 3 (Build 14320388), not as a solution to your problem.

 

rickmaccubbin
Contributor

Update:

vMotioning to any other datastore solved the issue. After vMotioning back to the original datastore, the issue reoccurred, so I thought it could be isolated to a single datastore.

However, after migrating all VMs except one off the datastore in question, I could no longer reproduce the issue, which leaves me thinking it's something related to queue depths.
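
To illustrate why queue depth could matter here, a back-of-the-envelope sketch using Little's law: sustained throughput is roughly the number of outstanding I/Os times the I/O size, divided by per-I/O latency. The I/O size and latency below are hypothetical values chosen for illustration, not measurements from this environment, and `throughput_mib_s` is a made-up helper.

```python
def throughput_mib_s(outstanding_ios, io_size_kib, latency_ms):
    """Little's law estimate: data in flight divided by per-I/O latency."""
    mib_in_flight = outstanding_ios * io_size_kib / 1024
    return mib_in_flight / (latency_ms / 1000)


# Hypothetical numbers: 64 KiB reads with 0.625 ms latency each.
for depth in (1, 4, 32):
    rate = throughput_mib_s(depth, io_size_kib=64, latency_ms=0.625)
    print(f"{depth:>2} outstanding I/O(s) -> about {rate:.0f} MiB/s")
```

Under these assumed values, a read stream limited to a single outstanding I/O tops out near 100 MiB/s no matter how idle the array is, while deeper queues scale well beyond that. Whether anything like this is actually happening on the affected datastore is unconfirmed.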

 

rickmaccubbin
Contributor

Your posts degrade the quality of the forum. If I wanted generic instructions totally unrelated to my issue, I'd speak with VMware support.

depping
Leadership

Yes, it is very strange indeed that an SvMotion back and forth resolves this issue... I have never come across this.

DickyDck
Contributor

Wondering if this ever got a full solution and root cause. I have the same thing, but on only one VM. All VMs use the same datastore, but just the one hangs; it has 8GB of RAM assigned but took over 16 hours to power on.
