Recently we have migrated to 6.5.0, 5969303 from 6.0 successfully but after the activity, all the VM's/hosts getting performance issues like intermittently disconnects VM's as well hosts from vcenter with the following warning error.
we have contacted VMware but they suggest us to upgrade blade's firmware. I did not see any change after upgradation of firmware still we are facing the issues.
anyone facing similar issues? Please comment any suggestion to avoid this.
This warning occurs due to high latency at the device level. Can you check with storage team on this ?
Also launch ssh session to ESXi host, run esxtop press u
it will display all the luns with performance report, check the DAVG column, if DAVG is higher than 20 then you have to check with storage team,
What storage model are you using here? And what is the microcode version?
Are you experiencing this error on multiple hosts? Look at the disk performance stats for the virtual machines running on the volume. What's the model of the blade & storage array?
Yes facing with all the hosts.
Yes, I have checked with Dell Storage team they have routed to hardware team for firmware upgradation but I could not see any outcome after updating to latest firmware.
Hello Randhir,
Thanks for your reply, I have gone through all the readings but no luck
can you share Vpxa log during the issue ...
Regards,
Randhir
The most likely explanation for this is simple: You're overloading your storage. There may be nothing wrong with the host or the storage, but if the storage is too active for the capabilities of the hardware, you'll begin to see these warnings. So what you should look at next are performance graphs for this storage to see what the latencies are and other metrics. Either move some workloads off this/these datastores to another on the same array, or if the entire array is over-worked you may need to plan on adding performance to it somehow.
VMhost upgrade will not increase the storage array workload, I suggest check this-
1- Ask your storage team check for the response on these LUNs and also see if there is a rapid increase to workload after upgrade? Storage administrator can check historical graphs.
2- Can you provide the logs in vmkernel.log file from your ESXi hosts ? You will see the errors related connections drops & established to storage array.
3- We faced similar issue few years ago and we had to reduce the I\O block size on ESXi to 32 KB, this was recommended by the storage vendor. You will need to do environmental research to reduce the block size.
Datastore disconnection from storage
This is the setting in advance page below- default is 32 MB.
Provide historical graph of following,
> Storage controller CPU usage - before and after upgrade
> LUN I\O reads\Write stats - before and after upgrade
> Response time to LUNS
> Review the performance and look at maximum I\Ops supported by storage controller if it's excessively unitized.
Regards,
Deepak Negi