VMware Cloud Community
ufo8mydog
Enthusiast

Storage vmotion causes massive IO Wait in Guest

Hi there

I am doing a Storage VMotion from one iSCSI datastore to another (currently vCenter 4 and ESX 3.5 U4; I have not tried ESX 4 yet).

The operation drives the load average inside the Red Hat 5.3 guest to massive levels (e.g., 300+). I can still ping it and SSH in, but it might as well be down. My question is: why?
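For anyone who wants to put a number on the I/O wait from inside the guest, here is a minimal Python sketch. It assumes a Linux guest that exposes the standard /proc/stat counters, takes two samples of the aggregate cpu line, and reports what share of the elapsed CPU time was spent in iowait. The function names are made up for illustration, and the Python that ships with an older guest may need small syntax adjustments.

```python
import time


def read_cpu_times(stat_text):
    """Return the counters of the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The aggregate line is labelled exactly 'cpu'; per-core lines are 'cpu0', 'cpu1', ...
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of elapsed CPU time spent in iowait between two samples."""
    # Field order: user nice system idle iowait irq softirq steal [guest ...].
    # Only the first eight are summed, because guest time is already part of user.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0 or len(deltas) < 5:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Take two samples 'interval' seconds apart and return the iowait percentage."""
    with open(path) as handle:
        before = read_cpu_times(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        after = read_cpu_times(handle.read())
    return iowait_percent(before, after)
```

Calling sample_iowait() once during the SVMotion and once while the VM is idle would show whether the guest's CPUs are mostly waiting on disk rather than doing work.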

  • Storage VMotion works fine on powered-off guest VMs

  • The network link to the SANs is only at 7.5% capacity (2 x EqualLogic PS5000X). Flow control is enabled.

  • VMware Tools are fully updated and working

  • The SV motion is purring along and it will finish eventually
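To put the 7.5% figure from the list above in perspective, here is a quick back-of-the-envelope conversion. The 1 Gbit/s link speed is an assumption (the actual link speed is not stated here), and the function name is just for illustration.

```python
def link_throughput_mb_per_s(link_gbit_per_s, utilization_percent):
    """Convert a link speed and a utilization percentage into MB/s (decimal megabytes)."""
    bits_per_s = link_gbit_per_s * 1e9 * utilization_percent / 100.0
    return bits_per_s / 8 / 1e6


# Assumed: a 1 Gbit/s iSCSI link running at the reported 7.5% utilization.
print(link_throughput_mb_per_s(1.0, 7.5))
```

With those assumed numbers the result is roughly 9.4 MB/s, far below what such a link can carry, which is why the network itself does not look like the bottleneck.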

This also happens to us sometimes right after a VMotion attempt: load runs away to ridiculous levels for up to 15 minutes.

VMware support (at least the level 1 team I've spoken to) has never been able to understand the problem, let alone resolve it. Any help would be greatly appreciated :)

8 Replies
ufo8mydog
Enthusiast

Bump... :)

AndreTheGiant
Immortal

Your problem is quite strange.

You are doing SVMotion from VMware ESX3.5 using VC4, correct?

You are not using Equallogic Volume migration?

Does the RH server have only VMDK disks, or also an RDM or an iSCSI disk (inside the VM)?

What is the CPU/memory load on the ESX host running the RH VM?

Andre

**if you found this or any other answer useful please consider allocating points for helpful or correct answers

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
ufo8mydog
Enthusiast

>>You are doing SVMotion from VMware ESX3.5 using VC4, correct?

>>You are not using Equallogic Volume migration?

Yes, that's right, I'm doing the SVMotion through VC4.

>>Does the RH server have only VMDK disks, or also an RDM or an iSCSI disk (inside the VM)?

>>What is the CPU/memory load on the ESX host running the RH VM?

Only VMDK disks; there are no RDM or iSCSI disks inside the VM.

The strange thing is that from the outside the load looks fine: during the operation CPU is at 700 MHz and RAM at 75% of 2 GB total. The physical hosts currently have 16 GB of RAM each, with 75-90% allocated on each, so there is no memory oversubscription that I am aware of.

It's only when you log in to the VM that you see the load shoot up to massive levels and everything become unresponsive.

AndreTheGiant
Immortal

>>they are allocated at between 75-90%

Maybe this is the problem.

SVMotion on ESX 3.5 needs extra RAM resources.

Make sure that your RH VM is not using its VM swap file.

Andre

ufo8mydog
Enthusiast

Hi Andre,

How do I check that for certain?

(Would it be a good idea to bump up the RAM by 1 GB or so before attempting the transfer?)

0 Kudos
AndreTheGiant
Immortal

Just use the VI Client: select your VM, go to the Performance tab, and choose Change Chart Options.

Then select Memory / Memory Swapped.

Andre

depping
Leadership

Or you could use esxtop to see whether any swapping occurs.

Duncan

VMware Communities User Moderator | VCP | VCDX

If you find this information useful, please award points for "correct" or "helpful".

ufo8mydog
Enthusiast

I'm running another hot SVMotion test now (I gave the VM another 1 GB of RAM). Swapping is firmly at 0. The VM itself seems fragile: if I sneeze, the load seems to climb :)

Could the VM be starved of I/O bandwidth to the shared storage perhaps, causing the load to climb? That is the only other thing I can think of.
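One thing worth knowing here: on Linux the load average counts tasks in uninterruptible sleep (state D, usually waiting on disk I/O) as well as runnable ones, so a load of 300+ with very little CPU use is consistent with many processes blocked on storage. Below is a rough Python sketch for counting them inside the guest, assuming the standard /proc/<pid>/stat files. The function names are illustrative, and it looks at processes only, not individual threads.

```python
import os


def task_state(stat_line):
    """Extract the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so take the first field after the last ')'.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    return rest[0]


def tally_states(stat_lines):
    """Count how many processes are in each state (R, S, D, ...)."""
    counts = {}
    for line in stat_lines:
        try:
            state = task_state(line)
        except (ValueError, IndexError):
            continue  # malformed or empty entry
        counts[state] = counts.get(state, 0) + 1
    return counts


def read_stat_lines(proc_root="/proc"):
    """Yield the stat contents of every numeric (per-process) entry under /proc."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                yield handle.read()
        except OSError:
            continue  # the process exited between listing and reading
```

Running tally_states(read_stat_lines()) during the SVMotion and comparing the "D" count with the load average would show whether the load is made up of processes stuck waiting on I/O.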

Currently I have it set up as follows. Now that I think of it, the SVMotion and the SAN traffic may be contending for the one link:

VMKernel

vmotion - enabled

Active: vmnic0, vmnic3, vmnic4, vmnic5

Should I have something like this instead?

VMKernel

vmotion - disabled

Active: vmnic0, vmnic4

VMKernel2

vmotion - enabled

Active: vmnic3, vmnic5

If so:

i) Can they be on the same vswitch, or should they be separated, or does it not matter?

ii) How do I ensure that only VMotion traffic goes over VMKernel2, and not general iSCSI traffic? I know I can check 'use this port group for VMotion', but I would like to isolate the traffic so that VMKernel carries only iSCSI and VMKernel2 carries only VMotion.

iii) I assume that svmotion and vmotion traffic are functionally the same thing when setting up the ports.

Thanks for all your assistance.
