PROBLEM
After upgrading from Virtual Center 2.x with a MSDE Database to Virtual Center 2.5 with a MS SQL Server (Express, Standard or Enterprise) your database server has increasingly lengthy periods of very high CPU utilisation and may become unresponsive.
CAUSE
The SQL upgrade scripts (cleanup_upgrade_mssql.sql) did not complete successfully at the time of the upgrade and were not reported as an error; the upgrade appeared to complete successfully. The stored procedures designed to rollup statistics do not complete their function and eventually consume 100% of resources and never complete.
By default the stats_rollup1_proc process runs every 30 minutes in sqlservr.exe to rollup statistics stored in the table VPX_Hist_Stat1, this process does not complete its function due to the mis-configuration performed by the upgrade and the number of rows in VPX_Hist_Stat1 will continue to grow causing the procedure to overwhelm resources.
WORKAROUND
Use Virtual Infrastructure Client to disable Statistics Collection.
Drop down menu - Administration > VirtualServer Management Server Configuration...
Select - Statistics : Uncheck all collection options
Server CPU usage will revert to normal.
RESOLUTION
Courtesy of Ramprasad K S of VMware Technical Support (KSRam)
I have been able to finish the analysis and reach the root cause. This has
resulted in a possible corrective action without much disruption to existing
installation. The down side of this action is that you will loose any
performance data that is currently stored in the DB.
The root cause here is that the upgrade process failed to completely upgrade.
Though the Wizard shows the process as success, there are some activities
that are not committed correctly.
Corrective action.
Truncate the data from table VPX_HIST_STAT1
Execute each of the commands from C:\Program
Files\VMware\Infrastructure\VirtualCenter
Server\dbupgrade\Upgrade-v3-to-v4\T-SQL\cleanup_upgrade_mssql.sql.
To simplify the process please see the (below process) which would help you
to execute the commands.
These commands (SQL) are to be executed on the VirtualCenter Database.
This action would put the database in correct state so that any further
performance statistics collection is not bothered.
Note: Depending on your configuration you may need to download and install the Microsoft SQL Server 2005 Express Edition Toolkit in order to access the Microsoft SQL Server Management Studio to perform the following process.
-
1. Shut down the VirtualCenter Server Service
2. Backup the database
3. Execute
TRUNCATE TABLE VPX_HIST_STAT1
4. Execute
TRUNCATE TABLE VPX_HIST_STAT;
alter table vpx_hist_stat drop constraint FK_VPX_HIST_STAT_REF_VPX_SAMP;
truncate table vpx_sample;
CREATE UNIQUE CLUSTERED INDEX PK_VPX_HIST_STAT1 ON VPX_HIST_STAT1
(COUNTER_ID,TIME_ID);
CREATE UNIQUE CLUSTERED INDEX PK_VPX_HIST_STAT2 ON VPX_HIST_STAT2
(COUNTER_ID,TIME_ID);
CREATE UNIQUE CLUSTERED INDEX PK_VPX_HIST_STAT3 ON VPX_HIST_STAT3
(COUNTER_ID,TIME_ID);
CREATE UNIQUE CLUSTERED INDEX PK_VPX_HIST_STAT4 ON VPX_HIST_STAT4
(COUNTER_ID,TIME_ID);
5. Execute
create view VPXV_HIST_STAT_DAILY as
select st.SAMPLE_TIME, st.SAMPLE_INTERVAL, sd.NAME as STAT_NAME,
sd.GROUP_NAME as STAT_GROUP,
case sd.ROLLUP_TYPE
when 0 then 'average'
when 1 then 'maximum'
when 2 then 'minimum'
when 3 then 'latest'
when 4 then 'summation'
when 5 then 'none'
end as STAT_ROLLUP_TYPE,
si.MOID AS ENTITY,
d.DEVICE_NAME, hs.STAT_VAL as STAT_VALUE
from VPX_SAMPLE_TIME1 st
, VPX_DEVICE d
, VPX_STAT_COUNTER sc
, VPX_STAT_DEF sd
, VPX_HIST_STAT1 hs
, VPX_STAT_ID si
where hs.counter_id=sc.COUNTER_ID
and sc.STAT_ID = sd.ID
and isnull(sc.DEVICE_ID,1)=isnull(d.DEVICE_ID,1)
and hs.TIME_ID = st.TIME_ID
and sc.ENTITY_ID=si.ID;
6. Execute
create view VPXV_HIST_STAT_WEEKLY as
select st.SAMPLE_TIME, st.SAMPLE_INTERVAL, sd.NAME as STAT_NAME,
sd.GROUP_NAME as STAT_GROUP,
case sd.ROLLUP_TYPE
when 0 then 'average'
when 1 then 'maximum'
when 2 then 'minimum'
when 3 then 'latest'
when 4 then 'summation'
when 5 then 'none'
end as STAT_ROLLUP_TYPE,
si.MOID AS ENTITY,
d.DEVICE_NAME, hs.STAT_VAL as STAT_VALUE
from VPX_SAMPLE_TIME2 st
, VPX_DEVICE d
, VPX_STAT_COUNTER sc
, VPX_STAT_DEF sd
, VPX_HIST_STAT2 hs
, VPX_STAT_ID si
where hs.counter_id=sc.COUNTER_ID
and sc.STAT_ID = sd.ID
and isnull(sc.DEVICE_ID,1)=isnull(d.DEVICE_ID,1)
and hs.TIME_ID = st.TIME_ID
and sc.ENTITY_ID=si.ID;
7. Exceute
create view VPXV_HIST_STAT_MONTHLY as
select st.SAMPLE_TIME, st.SAMPLE_INTERVAL, sd.NAME as STAT_NAME,
sd.GROUP_NAME as STAT_GROUP,
case sd.ROLLUP_TYPE
when 0 then 'average'
when 1 then 'maximum'
when 2 then 'minimum'
when 3 then 'latest'
when 4 then 'summation'
when 5 then 'none'
end as STAT_ROLLUP_TYPE,
si.MOID AS ENTITY,
d.DEVICE_NAME, hs.STAT_VAL as STAT_VALUE
from VPX_SAMPLE_TIME3 st
, VPX_DEVICE d
, VPX_STAT_COUNTER sc
, VPX_STAT_DEF sd
, VPX_HIST_STAT3 hs
, VPX_STAT_ID si
where hs.counter_id=sc.COUNTER_ID
and sc.STAT_ID = sd.ID
and isnull(sc.DEVICE_ID,1)=isnull(d.DEVICE_ID,1)
and hs.TIME_ID = st.TIME_ID
and sc.ENTITY_ID=si.ID;
8. Exceute
create view VPXV_HIST_STAT_YEARLY as
select st.SAMPLE_TIME, st.SAMPLE_INTERVAL, sd.NAME as STAT_NAME,
sd.GROUP_NAME as STAT_GROUP,
case sd.ROLLUP_TYPE
when 0 then 'average'
when 1 then 'maximum'
when 2 then 'minimum'
when 3 then 'latest'
when 4 then 'summation'
when 5 then 'none'
end as STAT_ROLLUP_TYPE,
si.MOID AS ENTITY,
d.DEVICE_NAME, hs.STAT_VAL as STAT_VALUE
from VPX_SAMPLE_TIME4 st
, VPX_DEVICE d
, VPX_STAT_COUNTER sc
, VPX_STAT_DEF sd
, VPX_HIST_STAT4 hs
, VPX_STAT_ID si
where hs.counter_id=sc.COUNTER_ID
and sc.STAT_ID = sd.ID
and isnull(sc.DEVICE_ID,1)=isnull(d.DEVICE_ID,1)
and hs.TIME_ID = st.TIME_ID
and sc.ENTITY_ID=si.ID;
9. Exceute
CREATE VIEW VPXV_STAT_COUNTERS as
select sd.ID AS ID
,case sd.ROLLUP_TYPE
when 0 then 'average'
when 1 then 'maximum'
when 2 then 'minimum'
when 3 then 'latest'
when 4 then 'summation'
when 5 then 'none'
end as STAT_ROLLUP
, sd.NAME
, sd.GROUP_NAME
, case sd.TYPE
when 0 then 'absolute'
when 1 then 'delta'
when 2 then 'rare'
end as TYPE
, sd.UNIT
, sd.ASSOCIATE_IDS
, sd.STAT_LEVEL AS STATS_LEVEL
FROM VPX_STAT_DEF sd;
10. Exceute
CREATE VIEW VPXV_STAT_HISTORICAL_INTERVALS as
select sid.INTERVAL_SEQ_NUM as ID
, sid.INTERVAL_DEF_NAME as NAME
, sid.INTERVAL_VAL as SAMPLING_PERIOD
, sid.INTERVAL_LENGTH as LENGTH
, sid.STATS_LEVEL
, case sid.ROLLUP_ENABLED_FLG
when 1 then 'Y'
when 0 then 'N'
end as ENABLED
FROM VPX_STAT_INTERVAL_DEF sid;
11. Exceute
drop table VPX_ALARM_EXPRESSION_bk ;
drop table VPX_ALARM_ACTION_bk ;
drop table VPX_SCHED_SCHEDULER_bk ;
drop table VPX_SCHED_ACTION_bk ;
drop table VPX_ACCESS_bk ;
drop table VPX_COMPUTE_RESOURCE_bk ;
drop table VPX_HOST_bk ;
drop table VPX_DATASTORE_bk ;
drop table VPX_NETWORK_bk ;
drop table VPX_RESOURCE_POOL_bk ;
drop table VPX_DS_ASSIGNMENT_bk ;
drop table VPX_HOST_CPU_bk ;
drop table VPX_HOST_NODE_bk ;
drop table VPX_GUEST_DISK_bk ;
drop table VPX_FIELD_DEF_bk ;
drop table VPX_FIELD_VAL_bk ;
drop table VPX_VM_bk ;
drop table VPX_GUEST_IP_ADDRESS_bk ;
drop table VPX_GUEST_NET_ADAPTER_bk ;
drop table vpx_object_type_bk ;
TRUNCATE TABLE VPX_HIST_STAT;
alter table vpx_sample_time1 drop column row_id;
alter table vpx_sample_time2 drop column row_id;
alter table vpx_sample_time3 drop column row_id;
alter table vpx_sample_time4 drop column row_id;
drop procedure hist_stats_upgrade_proc;
Great Document...
Thanks. Please feel free to rate it.
Exceute ...
from where?
Seb
From the Virtual Server Database. ( In my case SERVER_NAME\MSDB_VC )
I just follow your procedure and the issue is gone!. Thank you very much. Now i have my VC working 100%.
Added a Note on where to execute from using Microsoft SQL Server 2005 Express Edition Toolkit.
I followed the instructions and confirmed that it resolves the problem.
Very good explanation and solution to the problem!
Best Regards,
Shin
Thank you Shin. Please feel free to rate it.
Yup this work really great. We were struggling for past couple of weeks with VC now as we dopped that table and recreated it performance has improved.
Thanks man
Your welcome.
Never lost historical data, is that normal?
Seb
Great document. Thanks for putting it together. When running the queries I get errors because constraint FK_VPX_HIST_STAT_REF_VPX_SAMP does not exist and then the tables it creates later already exist. Even with that it still has helped CPU usage and it taps out much more infrequently.
hi
i have this issue but i have not done an upgrade, this was a new install on a express 2005 DB and new install of VC 2.5
thanks for the fix! worked like a charm!
Has there been a fix released since June that fixes this issue without losing historical performance data?
Same issue and got the errors when executing the SQL statements but VC/SQL is behaving much better now.
I noticed that VC 2.5 update 3 was recently released but I couldnt find anything in the release notes about this issue being resolved. Odd since VMware has known of this for months.
Also, why wouldn't they release an executable "patch" to perform the fix instead of forcing us to execute a bunch of SQL statements against the db??
Hi, we have the same issue, we did all the steps, but unfortunately the problem persists. Exactly every 30' the VC freezes for 8'.
Is this a bug that all VC users experience? If yes, is there an official patch from VMWare?
Any help would be appreciated.
Thanks
Done this, any ideas on sequence 4 why I get this:
Msg 3728, Level 16, State 1, Line 2
'FK_VPX_HIST_STAT_REF_VPX_SAMP' is not a constraint.
Msg 3727, Level 16, State 0, Line 2
Could not drop constraint. See previous errors.