VMware Cloud Community
mordzy
Enthusiast

Windows Server 2003 / SQL Server 2000 guest freezing, disk queue at maximum

Yeah, yeah, it's legacy, I know, but it's still in production.

Here are the specs:

  • Farm of around 30 guests
  • 5 vSphere servers in vCenter
  • EMC iSCSI SAN with 24 spindles
  • The server in question is 32-bit Windows Server 2003. Memory is around 3.4 GB used of 4 GB available; CPU is 2 sockets with 4 cores per socket. It was P2V'd a number of years ago.
  • All other servers appear to be fine and don't have any performance issues.

The 2003 server keeps hanging and becoming unresponsive. End users of the software and website that talk to the SQL database are complaining of poor performance, and the desktop UI pauses and hangs at the same time.

I have checked the performance of the vSphere host that runs this guest, and its resources are not overloaded. CPU peaks at around 40%.

I have loaded perfmon on the guest and added the average disk queue, CPU % and memory/page file counters. Whenever the UI hangs and users experience performance issues, the average disk queue hits 100%. When the disk queue returns to normal, responsiveness returns.

I have also run perfmon on another server that resides in the same datastore and on the same vSphere host, and I don't see any of the hangs or disk queue maximums there, so the problem would appear to be guest related.

Does anyone have any suggestions?

1 Solution

Accepted Solutions
Nithy07cs055
Hot Shot

Yes, you can do it for the guest that is affected. And yes, as per EMC, go ahead with the reboot.

I always suggest not over-provisioning your storage for production servers. Do a good capacity analysis and make sure the load is well balanced across all the storage.

All the best, and I hope you fix the issue. :)

Thanks and Regards, Nithyanathan R Please follow my page and Blog for more updates. Blog : https://communities.vmware.com/blogs/Nithyanathan Twitter @Nithy55 Facebook Vmware page : https://www.facebook.com/Virtualizationworld

10 Replies
Nithy07cs055
Hot Shot

Did you check the disk latency of the VM? Can you try adding a new hard disk and changing the adapter type?

Which adapter type was used during the P2V conversion?

mordzy
Enthusiast

Thanks. It's an LSI Logic SCSI controller as part of the virtual hardware, and the disks are also SCSI. That was the first thing I checked, to make sure it wasn't IDE.

I will look at the latency. However, because none of the other servers on the same LUN/disk have an issue, I figured it might be related to something with just that guest.

mordzy
Enthusiast

I've checked the latency and I get around half a second on that LUN. The LUN is part of a storage pool of 28 disks. What's odd is that another pool of 14 disks, with different servers on it, also sees the same momentary latency and disk queue issues, but they don't last as long.

I've counted a total of 30 guest machines and confirmed that no capacity planning was ever done in the first place; the previous IT person just kept creating new virtual machines. It seems to me that the SAN is oversubscribed, which is showing up as latency on this particular SQL server, which provides the back end to the CRM software.
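To put a rough number on the oversubscription suspicion, here is a minimal back-of-envelope sketch in Python. Only the 28-disk pool and the 30 guests come from this thread. The per-disk IOPS figure, the read/write mix, the RAID write penalty of 4 (the usual figure quoted for RAID 5), the per-guest demand and the headroom factor are all placeholder assumptions, and the function names are made up for illustration. Real values would have to come from the array configuration and from measured guest I/O.

```python
def host_iops_capacity(disks, iops_per_disk, read_fraction, write_penalty):
    """Rough host-visible IOPS a disk pool can sustain for a given I/O mix.

    Each host read costs one back-end I/O; each host write costs
    `write_penalty` back-end I/Os (for example 4 on RAID 5).
    """
    raw_backend_iops = disks * iops_per_disk
    cost_per_host_io = read_fraction + (1.0 - read_fraction) * write_penalty
    return raw_backend_iops / cost_per_host_io


def is_oversubscribed(guest_demand_iops, capacity_iops, headroom=0.8):
    """True if total guest demand exceeds the usable share of capacity.

    `headroom` is an assumed safety factor: only that fraction of the
    theoretical capacity is treated as usable.
    """
    return sum(guest_demand_iops) > capacity_iops * headroom


# Assumed inputs: 150 IOPS per spindle, 70% reads, RAID 5 write penalty of 4.
capacity = host_iops_capacity(disks=28, iops_per_disk=150,
                              read_fraction=0.7, write_penalty=4)

# Assumed demand: 30 guests at a flat 100 IOPS each (replace with measurements).
demand = [100] * 30
```

With these placeholder numbers, `is_oversubscribed(demand, capacity)` comes out true, but the result is only as good as the inputs. The point of the sketch is the shape of the check, not the figures.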

Nithy07cs055
Hot Shot

You can check the latency in detail using the esxtop command: press d for the disk adapter view, u for the disk device view, or v for the per-VM disk view.

Did you try migrating the virtual machine to another host and datastore to see if it helps? Did you check the PSP (path selection policy) on the host? Make sure Storage I/O is enabled for better performance. Also check vmkernel.log and vobd.log (the observation log); you can get some pointers from them.

Please paste a screenshot of esxtop on the host and I'll check the values.
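If a screenshot is awkward to capture during a short stall, esxtop can also run in batch mode (for example `esxtop -b -d 5 -n 60 > esxtop.csv`) and write its counters to a CSV file. The Python sketch below scans such a file for latency samples above a threshold. It assumes that the first column holds the timestamp and that the latency counters have "MilliSec/Command" in their column names; check both against the header of your own capture. The 20 ms default threshold and the function name are illustrative choices, not official guidance.

```python
import csv
import io


def slow_samples(csv_text, threshold_ms=20.0, needle="MilliSec/Command"):
    """Return (timestamp, column_name, value) for every latency sample
    at or above `threshold_ms`.

    Columns are selected by substring match on the header, so the exact
    counter names do not need to be known in advance.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    latency_cols = [i for i, name in enumerate(header) if needle in name]

    hits = []
    for row in reader:
        for i in latency_cols:
            try:
                value = float(row[i])
            except (ValueError, IndexError):
                continue  # skip blank or truncated cells
            if value >= threshold_ms:
                hits.append((row[0], header[i], value))
    return hits
```

To use it, read the capture with `open("esxtop.csv", newline="").read()` and pass the text to `slow_samples`. A stall like the half-second latency reported earlier in this thread would show up as values around 500.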

mordzy
Enthusiast

Thanks Nithy.

There's not much point moving datastores, as other guests are affected too, just not as badly. The PSP is MRU, but the network throughput does not max out.

I will check Storage I/O. Will I need to enable it on all guests or just the one I have issues with?

I will get the log details ASAP.

Thanks

Nithy07cs055
Hot Shot

Do it on all the host devices.

[Attached screenshot: storage.JPG]

Nithy07cs055
Hot Shot

I hope you identified the issue and fixed it. Let me know the results if you have tried any other method. :)

mordzy
Enthusiast

Hi Nithy.

I've turned Storage I/O on. Do I not need to nominate a priority VM? I also spoke to EMC, who just wanted to reboot the SAN. WTF.

I've got an older machine that I've filled with disks. I'm going to V2V the SQL server to it, as it's my belief that the SAN is oversubscribed: no capacity plan was ever done, and additional VMs have just been added with no thought to the knock-on effect on performance. If I have no issues on this isolated server/datastore, I will then do a capacity plan and also check the performance of the hardware on the other servers. Because it's production, I have limited downtime, and the issue is causing significant problems.

I will feed back my findings. Your thoughts would be appreciated.

Nithy07cs055
Hot Shot

Yes, you can do it for the guest that is affected. And yes, as per EMC, go ahead with the reboot.

I always suggest not over-provisioning your storage for production servers. Do a good capacity analysis and make sure the load is well balanced across all the storage.

All the best, and I hope you fix the issue. :)

mordzy
Enthusiast

The problem is that the previous head of IT did not capacity plan. Instead, they just bought a SAN, consolidated, then added more VMs.

Because SQL is impacted, I'm going to have to move it onto its own server and then try a retrospective capacity plan or reduce the number of VMs. I think many are for testing or just for the sh*ts and giggles, so some internal consolidation of server functions will help as well.

Thanks very much for your help. Will send thanks.
