bigtrev3
Contributor

SRM 1.01 Licensing

Having been burnt by the SRM licensing requirement that each protected host must have a license for each CPU socket, the following questions come to mind:

1. If you do not want or need to protect all of your hosts in a data center, because only a subset of VMs needs to be protected for DR, how are the licenses allocated, particularly if you have multiple HA/DRS clusters?

2. What does the message 'The SRM Feature: SRM_PROTECTED_HOST is overallocated by xx licenses.' mean?

In our environment we have a cluster of 6 x 4 socket hosts which runs all of the 'protected' VMs - thus we purchased 24 x SRM 1CPU licenses (it seemed a bit expensive at the time). In addition there is a test cluster of 2 x 4 socket hosts and another 'high performance' cluster of 3 x 2 socket hosts. This adds up to 38 sockets.

Every 15 minutes we get the following event: 'The SRM Feature: SRM_PROTECTED_HOST is overallocated by 20 licenses'.

Does this mean we have purchased 20 too many SRM licenses?

The figure of 20 is not the difference between the 24 sockets we have purchased licenses for and the 38 actually managed by this vCenter or license server (that would be 14). My theory is that since HA on the 'protected' cluster indicates that all VMs could actually be run on a single 4 socket host (failover capacity is 5 in a six host cluster), only four licenses are needed. This is even more complex than I had thought, but it implies that we could have bought SRM in 4 socket batches as the utilisation of the clusters increases.
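The socket arithmetic in this post can be checked with a short Python sketch. The host and socket counts are the ones given above; the cluster labels and variable names are only illustrative.

```python
# Host and socket counts as described in the post.
# The dictionary keys are illustrative labels, not real cluster names.
clusters = {
    "protected": {"hosts": 6, "sockets_per_host": 4},
    "test": {"hosts": 2, "sockets_per_host": 4},
    "high_performance": {"hosts": 3, "sockets_per_host": 2},
}


def sockets(cluster):
    """Total CPU sockets in one cluster."""
    return cluster["hosts"] * cluster["sockets_per_host"]


sockets_by_cluster = {name: sockets(c) for name, c in clusters.items()}
total_sockets = sum(sockets_by_cluster.values())

purchased_licenses = 24       # SRM 1CPU licenses bought
reported_overallocation = 20  # figure from the recurring event

# Shortfall if every socket managed by this vCenter needed a license.
shortfall_if_all_counted = total_sockets - purchased_licenses

print(sockets_by_cluster)        # {'protected': 24, 'test': 8, 'high_performance': 6}
print(total_sockets)             # 38
print(shortfall_if_all_counted)  # 14
print(shortfall_if_all_counted == reported_overallocation)  # False
```

So licensing every socket would leave a shortfall of 14, which does not match the reported 20.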

BTW - The SRM failover cluster consists of 5 x 4 Socket Hosts to the same spec as the protected cluster.

3 Replies
Smoggy
VMware Employee

I will see if I can answer your questions

1. If you do not want/need to protect all of your hosts in a data center because only a subset of VMs need to be protected for DR how are the licenses allocated, particularly if you have multiple HA/DRS clusters.

I just talked about something similar in another thread, but the bottom line for ease of use is to either license everything (OK, OK, nice for us, I know :) ) or split the cluster. See that thread, though.

2. What does the 'The SRM Feature: SRM_PROTECTED_HOST is overallocated by xx licenses.' mean?

This means the system thinks you don't have enough SRM licenses for the current configuration. So can I just check the numbers? BTW, I am not the license police, so I won't come tracking you down :) In the production cluster where your protected VMs reside you have 6 x 4 socket hosts, meaning you've got 24 sockets, and then you say you have 24 single socket SRM licenses? This should be fine, so we need to look at the configuration.

- what version of SRM is this? 1.x? or 4.0?

- verify that none of the protected VMs have been moved to any of the hosts in either the test or HPC clusters. As soon as SRM sees a protected VM on a host it has not seen before, that host will withdraw socket licenses from the SRM_PROTECTED_HOST total equal to the number of sockets in that host. Therefore it is definitely possible to go into your overdraft, as it were, and overallocate. Another take on this: double check that no one has created new protection groups in your SRM server that map to datastores in the test or HPC cluster, as this has the same effect, causing those hosts to ask for licenses.

- verify at the recovery site that no one has configured bi-directional replication and created a protection group at that site. IF that is done AND those hosts/SRM server are pointing at the SAME license server as the primary site, again it is possible to overallocate.

- final check is to restart the SRM service; during service startup, license allocation information is dumped to the log. It would be useful to review that if you can attach it.

- I lied... there is one final final check. It is also possible that the SRM license entries in the license file are not right (i.e. they weren't generated to support even your default 24 sockets). There should be two sections in the file: PROD_SRM and SRM_PROTECTED_HOST. PROD_SRM is the entry for the SRM server itself, and SRM_PROTECTED_HOST is the "pool" of socket licenses for the ESX hosts to tap into when they host protected VMs. I will admit license portals are a minefield, and you wouldn't be the first person to have generated the SRM keys incorrectly and allocated yourself a 24 socket PROD_SRM key and a 1 socket SRM_PROTECTED_HOST key, which is basically the wrong way round. The SRM_PROTECTED_HOST socket count should equal the number of sockets in the protected cluster.
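The pool behaviour described in the checks above can be pictured with a toy Python model. It is only a sketch of the rule as stated here (a host is charged for all of its sockets the first time a protected VM is seen on it), not SRM's actual implementation; the class, method, and host names are all hypothetical.

```python
class ProtectedHostPool:
    """Toy model of the SRM_PROTECTED_HOST socket pool (not SRM's real code)."""

    def __init__(self, purchased):
        self.purchased = purchased
        self.charged = {}  # host name -> sockets charged against the pool

    def see_protected_vm_on(self, host, sockets):
        # A host is charged once, for all of its sockets, the first time
        # a protected VM is seen on it; later sightings change nothing.
        self.charged.setdefault(host, sockets)

    @property
    def in_use(self):
        return sum(self.charged.values())

    @property
    def overallocated_by(self):
        return max(0, self.in_use - self.purchased)


pool = ProtectedHostPool(purchased=24)

# Six 4-socket hosts in the protected cluster (hypothetical names).
for i in range(1, 7):
    pool.see_protected_vm_on(f"prod-esx{i}", 4)
print(pool.in_use, pool.overallocated_by)  # 24 0

# A protected VM turns up on a 2-socket host outside that cluster.
pool.see_protected_vm_on("hpc-esx1", 2)
print(pool.in_use, pool.overallocated_by)  # 26 2
```

Under this model a 24 socket pool exactly covers the protected cluster, and any protected VM landing on another host pushes the pool into overallocation by that host's socket count.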

hope this helps,

Lee

bigtrev3
Contributor

Lee, as per the thread heading, this is SRM 1.01...

Just to confirm - all VMs that we wish to configure are in the 24 CPU socket cluster (while this is the case the message reads 'overallocated by 20'). As a test I moved one of my 'protected' machines to the high performance cluster (6 CPU sockets total) - the message changed to 'overallocated by 18'.

This looks to me like there is some underlying calculation going on that is checking not how many CPU sockets there actually are, but how many are really needed to run the 'protected' VMs.

The events have stopped following a restart of the SRM service - log file entries below:

Changing to state: DrLicensedState

FlexLM: Created license job with server '27000@oxgbvctr01.oxfam.org.uk'

FlexLM: Checked out license 'PROD_SRM', count: 1, total: 1, avail: 1 pending: 0, daysLeft: 3650000

FlexLM: Checked out license 'SRM_PROTECTED_HOST', count: 22, total: 22, avail: 24 pending: 0, daysLeft: 3650000

FlexLM: Checked out license 'PROD_SRM', count: 1, total: 1, avail: 1 pending: 0, daysLeft: 3650000
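For comparing these figures across restarts, the checkout lines can be pulled apart with a small Python sketch. The regular expression is written against the two line shapes quoted above only; what each field (count, total, avail) actually means is not stated in the thread, so the code just extracts the numbers. The function and variable names are illustrative.

```python
import re

# Pattern written against the log lines quoted above; note that the
# original lines have no comma between the "avail" and "pending" fields.
CHECKOUT = re.compile(
    r"Checked out license '(?P<feature>[^']+)', "
    r"count: (?P<count>\d+), total: (?P<total>\d+), avail: (?P<avail>\d+),? "
    r"pending: (?P<pending>\d+), daysLeft: (?P<days_left>\d+)"
)

log_lines = [
    "FlexLM: Checked out license 'PROD_SRM', count: 1, total: 1, "
    "avail: 1 pending: 0, daysLeft: 3650000",
    "FlexLM: Checked out license 'SRM_PROTECTED_HOST', count: 22, total: 22, "
    "avail: 24 pending: 0, daysLeft: 3650000",
]


def parse_checkout(line):
    """Return (feature, numeric fields) for a checkout line, else None."""
    match = CHECKOUT.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    feature = fields.pop("feature")
    return feature, {key: int(value) for key, value in fields.items()}


checkouts = {}
for line in log_lines:
    parsed = parse_checkout(line)
    if parsed is not None:
        feature, numbers = parsed
        checkouts[feature] = numbers

for feature, numbers in checkouts.items():
    print(feature, numbers["count"], numbers["avail"])
# PROD_SRM 1 1
# SRM_PROTECTED_HOST 22 24
```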

Now it seems to be using 22 licenses .... this does not equate to any combination of CPU sockets in the clusters.

Smoggy
VMware Employee

Sorry, I missed that in the heading.

Are you on the latest SRM 1.x patch? The fact that restarting the service has reset the count rings a bell. I need to dig through the patch history for SRM 1.x, as I seem to recall a bug that was fixed relating to licenses not being released / allocated correctly; usually a service restart reset things.

I will have a dig around and see if I can find it.

In the cluster where the protected VMs are, what socket numbers are being reported for each ESX host? Can you check that VirtualCenter is reporting the number correctly for each host in the cluster? I think you said it was 6 x 4 socket, but can you confirm that is how they are all displaying?

cheers

Lee
