VMware Cloud Community
jdelpero
Contributor

vSphere on 3 x HS22 ESXi IBM Blades w Shared Storage HA Issue

This is another out of resources issue.

I have 3 IBM HS22 blades running ESXi, each with 24GB RAM and an Intel Xeon E5540 @ 2.53GHz, on an S Series blade server with 1.6TB shared storage.

I have a really weird issue with HA telling me that I have no resources available. I have only 4 VMs (3 using 6GB RAM and 1 using 8GB RAM, all of which have no reservations and only a single vCPU).

I have manually set the slot size using das.slotMemInMB to 6444 so I can guarantee a minimum of 3 slots per blade. I also set resource allocations on the VMs' RAM to 6144.

I can only power up two VMs though... By the RAM calculation I should surely have 22/6 = 3 slots per blade.

I think the issue is with CPU...

When I review the Cluster summary page it states 12 Processors and 30GHz available. But when I look at Resource Allocation it states there is only a Total Capacity of 1900MHz. Total RAM shows up correctly at roughly 72GB.

Why wouldn't I be able to see all 30GHz in the CPU total capacity?

Am I on the right track? Would this be my issue?

Thanks for your assistance...

8 Replies
jdelpero
Contributor

Sorry, I forgot to mention that it would be a total of 5 slots in use over two blades. But 3 slots per blade over two blades gives 6, which is enough (9 in total, but you drop one server for the HA calculation).
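The slot arithmetic above can be sketched roughly as follows. This is a simplified model, not the actual HA admission control code: it takes the smaller of the CPU and memory slot counts per host and then drops the largest host. The 256 MHz CPU slot size, the 10 GHz per blade figure, and the even split of the 1900 MHz cluster total are assumptions for illustration; only the 6444 MB memory slot and the 22GB usable RAM come from the posts.

```python
def slots_per_host(cpu_mhz, mem_mb, slot_cpu_mhz, slot_mem_mb):
    """A host's slot count is limited by whichever resource runs out first."""
    return min(cpu_mhz // slot_cpu_mhz, mem_mb // slot_mem_mb)


def cluster_slots_after_failures(hosts, slot_cpu_mhz, slot_mem_mb, failures=1):
    """Slots left once the `failures` largest hosts are assumed lost."""
    per_host = sorted(
        slots_per_host(cpu, mem, slot_cpu_mhz, slot_mem_mb) for cpu, mem in hosts
    )
    return sum(per_host[: len(per_host) - failures])


SLOT_MEM_MB = 6444   # das.slotMemInMB from the post
SLOT_CPU_MHZ = 256   # assumed CPU slot size, not stated in the thread

# CPU reported correctly: memory is the limit, 3 slots per blade.
healthy = [(10_000, 22 * 1024)] * 3   # 10 GHz per blade is an assumed figure
# CPU misreported: 1900 MHz cluster total, split evenly as an assumption.
broken = [(1900 // 3, 22 * 1024)] * 3

print(cluster_slots_after_failures(healthy, SLOT_CPU_MHZ, SLOT_MEM_MB))  # 6
print(cluster_slots_after_failures(broken, SLOT_CPU_MHZ, SLOT_MEM_MB))   # 4
```

Under these assumptions the misreported CPU capacity, not RAM, becomes the limiting resource, and the cluster falls below the 5 slots needed.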

AndreTheGiant
Immortal

Have a look at this site for how slot size works:

http://www.yellow-bricks.com/vmware-high-availability-deepdiv/

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
jdelpero
Contributor

Thanks for the link, but there has to be something going wrong with the cluster's resource pooling, because I can only see 1900MHz of 30GHz on the resource page. RAM is definitely set fine, but the slot calculation is barely able to create 2 slots due to the lack of CPU.

jdelpero
Contributor

OK, I completely removed vCenter Server and reinstalled. Now I can see 25GHz, and that seems to have resolved the problem with HA.

But... now I have a weird issue with Fault Tolerance. Basically the machine starts and the secondary starts, but then the secondary orphans itself instantly and the VM bounces between protected and unprotected constantly.

Is this to do with the FT logging network? Does that require me to enter the DNS names and IPs into the hosts file for the FT logging network? vMotion and HA work fine.

I have the FT network on a single 1Gbps NIC and on its own subnet... could this be a gateway issue or something weird? Thanks for any help.

AndreTheGiant
Immortal

There are a lot of documents about FT and best practices.

Check that you have a dedicated VMkernel interface and a dedicated network for it.

Andre

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
jdelpero
Contributor

Gosh this is killing me... in my test lab FT just worked.

I only have 2 NICs for the moment whilst I wait for an additional 2-NIC daughter card. When I receive these I want to separate the 3 networks.

Current NIC config:

1 NIC - vMotion and Management 192.168.70.x/24 (My management IP range which has a gateway)

1 NIC - VMkernel FT Logging 192.168.80.x/24 (As far as I know this has to be a different subnet. I have no gateway on this subnet.)
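As a quick sanity check on that addressing, here is a small sketch using Python's standard ipaddress module. It confirms the two ranges do not overlap and that every FT logging address sits inside the FT subnet, which matters when that subnet has no gateway. The three host addresses are hypothetical; the thread only gives the /24 ranges.

```python
import ipaddress

mgmt = ipaddress.ip_network("192.168.70.0/24")    # vMotion and management
ft_log = ipaddress.ip_network("192.168.80.0/24")  # FT logging, no gateway


def all_in_subnet(network, addresses):
    """True if every address is directly reachable inside the given subnet."""
    return all(ipaddress.ip_address(a) in network for a in addresses)


# Hypothetical FT logging VMkernel addresses, one per blade.
ft_ips = ["192.168.80.11", "192.168.80.12", "192.168.80.13"]

print(mgmt.overlaps(ft_log))          # False
print(all_in_subnet(ft_log, ft_ips))  # True
```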

End Solution:

1 NIC - vMotion

1 NIC - FT Logging

2 NIC - Failover/Load balanced Management

Like I mentioned, I have statically added all DNS entries to each ESXi server's hosts file and to vCenter, so this can't be a DNS issue. The disk is thick and I have no other VMs running at the time of testing.

HA and vMotion currently work perfectly... FT just will not work at all. It configures the VM and I can see it creates a secondary. Then it fires up the primary, which succeeds. It then starts the secondary, which registers as protected for about 2 seconds and is then instantly orphaned. After that it just bounces the secondary around my 3 hosts, repeating the issue.

In my test labs with the same gear this all worked fine. But now that I'm putting it into production of course it's not working haha.

Is there something I've missed? The site survey tool tells me everything is good. Is this an issue with vLockstep? (Using Intel Xeon E5540s.) I am only in trial mode until the vSphere licences arrive. Is there support of any kind I can contact?

jdelpero
Contributor

Alrighty, I have found the issue :)

Apparently the virtual machines had been snapshotted at some stage, and although the snapshots had been removed, the CTK variables still needed to be removed from the .vmx file.

As below :) I hope this helps someone, as I wasted a heap of time on this.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=101340...

To remove CTK variables from the .vmx file:

1. Log in to the ESX service console.

2. Power off the virtual machine.

3. Unregister the virtual machine from the vCenter Server Inventory by right-clicking the virtual machine and clicking Remove from Inventory.

4. Open the .vmx file in a text editor.

5. Locate variables similar to the following and delete the entire line:

scsi0:0.ctkEnabled = "true"

ctkEnabled = "true"

Notes:

• If there is more than one virtual disk, there are additional scsi#:#.ctkEnabled entries. These must be removed as well.

• You may also find ide#:#.ctkEnabled entries. These must be removed as well.

6. To guarantee that changed block tracking cannot be enabled, add the following line to the configuration file:

ctkDisallowed="true"

7. Save and close the file.

8. Open the datastore browser and change to the directory where the .vmx file is located.

9. Right-click the .vmx file and click Add to Inventory.

10. Power on the virtual machine to apply the changes.
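If you have several VMs to fix, the text edit in the steps above could be scripted. Below is a minimal sketch, assuming the .vmx content has already been read into a string from a copy of the file while the VM is powered off and unregistered. The function name strip_ctk and the sample content are made up for illustration, and it only handles the scsi and ide entries mentioned in the KB.

```python
import re

# Matches a bare ctkEnabled line and per-disk entries such as
# scsi0:0.ctkEnabled or ide0:0.ctkEnabled.
CTK_LINE = re.compile(r'^\s*(?:(?:scsi|ide)\d+:\d+\.)?ctkEnabled\s*=', re.IGNORECASE)
# Matches an existing ctkDisallowed line so it is not duplicated.
DISALLOW_LINE = re.compile(r'^\s*ctkDisallowed\s*=', re.IGNORECASE)


def strip_ctk(vmx_text: str) -> str:
    """Drop all ctkEnabled lines and append ctkDisallowed = "true"."""
    kept = [
        line
        for line in vmx_text.splitlines()
        if not CTK_LINE.match(line) and not DISALLOW_LINE.match(line)
    ]
    kept.append('ctkDisallowed = "true"')
    return "\n".join(kept) + "\n"


# Made-up sample content, not taken from a real VM.
sample = '''displayName = "vm01"
scsi0:0.fileName = "vm01.vmdk"
scsi0:0.ctkEnabled = "true"
ctkEnabled = "true"
'''

print(strip_ctk(sample))
```

Keep a backup of the original .vmx before writing the result back, then re-add the VM to the inventory as described above.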

a_p_
Leadership

Just in case you or somebody else needs to do this again, there's an easier way to edit these settings using the vSphere Client.

See What is Changed Block Tracking in vSphere?

André