VMware Cloud Community
MarkoEngelmann
Contributor
Contributor

Solution to full transaction log for database VIM_VCDB

Hi,

after having struggled to solve the reason for above error-message over the past weeks, i want to share my experience and the solution.

The information to (really) get rid of the well known error message is allready available online but scattered over different places.

Let me sum things up:

My environment:

We have three different Vsphere 4.1 Essential Environments with 120 VMs on 3 vm-hosts each. Two of them running since 2 years now, without any problems. They all are deployed using the MS SQL 2005 express-edition supplied with the vcenter server installer. The 4GB datasize limit of this edition should not be a real limit for our usecases.

My problem:

  2 of them started showing the following "transaction log"-full messages in the windows eventlog after having upgraded to vsphere 4.1 U2.

--

Event 9002, MSSQL$SQLEXP_VIM

The transaction log for database 'VIM_VCDB' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases

--

This was repeated every 6 hours at 1:01 am, 7:01 am, 1:01pm, 7:01pm each day on both vcenter server hosts.

So - what changed? what to do? Did we hit the 4GB limit of the engine when datasize + logsize is greater then 4GB ?

The workaround on

http://www.vladan.fr/how-to-fix-%E2%80%9Ctransaction-log-for-database-%E2%80%98vim_vcdb%E2%80%99-is-...

was easely spotted and applied. I increased the transaction log area limit vom 512MB to 2GB as suggested.

This eased the situation a bit but dit not really solve the underlying problem. The error-messages continued. More problematic: the vcenter server service started to crash - so the problem started to reach my users. Manual shrink's of the logarea solved those cases.

The situation at this point:

   Site1: vcenter-database is about 2.4GB in size, transaction log limit is increased to 3, 4  then 5 GB. Still "transaction log" errors.

   Site2: vcenter-database is about 1.8GB in size, transaction log limit increased to 2, 3 then 6 GB. Still "transaction log" errors.

  At this point we where a little bit puzzled. Our production vcenter database is running since 3 years now, has an 360days retention period but is still only 800 MB in size. Why are  the databases of our development enviroments this large?

So. What lead to a solution where the table names VPX_EVENT_ARG and VPX_EVENT i found mentioned in a forum message.

A closer inspection of our vcenter database in site1 using the SQL Server Management Studio showed this picture:

  VPX_EVENT was about 700 MB in size, with an additional index of 400MB

  VPX_EVENT_ARG was about 400 MB in size with an additional index of 300 MB.

  So in sum, this tables did hold about 1.5GB of data. A look into the table led to details for events up to 2 years ago. Seems they never had been cleaned up, despite the retention period was never longer than 180days on those servers. In site1 the EVENT-table had 2.1 million lines of event-data, in site2 we had 1.1 million lines.

As we dont need event information older than 30 days we could easely drop all those information. Using the procedure described here:

  http://blogs.vmware.com/kb/2010/09/dealing-with-vcenter-41-database-tables-growth.html

we were able to simply drop/erase all of the existing event history, starting  from scratch afterwards.

In short: During a maintenance-windows we simply made a full-backup of the database and issued

truncate table vpx_event_arg;

followed by
delete from vpx_event;

SOLUTION:

But - here comes the trick - whereas the "truncate"-step only takes seconds and no transaction-log space at all, the "delete" took 35 minutes and  13GB(!) of transaction log. On site2 it took 15 minutes and 8GB of transaction log size. With the default limits of 1GB or the suggest 2-3GB for the transaction log in place from the tipps above this procedure would never had succeeded. So, we needed 13 GB of transaction log size to erase 1.5GB of actual data consisting of 2.1 million table rows on site1.

Which leads us this this conclusion: The updated vcenter 4.1 (u2 ?) tried to clean up the mess in the event-tables but never had a chance to come to an successfull end with the default (or slightly increased) transaction log limits discussed everywhere. If your disk-space allows, increase the transaction log limit to 20GB (guess your actualy needs based on my numbers above).

The real problem here is, i think, the counterintuitive storage needs for the transaction log when cleaning up internal data. Who had guessed a factor of 10 in the first place? Anyone?

I also suspect, the internal cleanup-procedures would have catched up by increasing the transaction log limit alone - but i havent tested this in practice.

Hope this helps Smiley Wink

Kind Regards

Marko Engelmann

Reply
0 Kudos
6 Replies
EdWilts
Expert
Expert

It sounds like your recovery mode is not set to SIMPLE as per VMware's recommendations.  We've had nothing but problems when our DBAs tried to overrule VMware's best practices.

These KB articles are two of the many places where they recommend simple:  http://kb.vmware.com/kb/1001046 & http://kb.vmware.com/kb/1003980

.../Ed (VCP4, VCP5)
Reply
0 Kudos
MarkoEngelmann
Contributor
Contributor

Hi,

Thanks for your reply - i just forgot to mention this: checking the recovery-mode was set the "simple" was one of my first tasks. And, it was and has ever been Smiley Sad

"Simple" reclaims log space after the transaction finishes - you still allocate the disk space during the runtime of the transaction. In my case this was a lot more space than we expected Smiley Wink.

Regards

Marko

Reply
0 Kudos
MarkoEngelmann
Contributor
Contributor

Just a short follow-up:

On site3, which is running since 4 months now, the same issue started out of the blue.

We changed the retention period for events and  performance-data down to 30 days to keep the database small. This change  was made about a month ago, the error-messages (internal tasks every 6  hours, see above..) started to show up about a week ago. So, here we have the  perfect testcase for our issue.

This time I only changed the transaction-log upper limit to 20GB and waited for the vcenter cleanup-jobs to do its magic.

Success! The database is now 256MB smaller, the transaction log grew to 3.3GB.

Conclusion: for expiring data from the vcenter-database, you will need ~13 times its size in the transaction log.

I will now "shrink" the database as usual. Everything back to normal.

Kind Regards

Marko Engelmann

Reply
0 Kudos
eoIT
Contributor
Contributor

Thanks,

This was successful for us too, we were getting maxed out logs on "Past Day stats rollupVIM_VCDB" job, and upping the max limit on our transaction logs to somthing ridiculous allowed the job to complete. Once the job was successful, I could then afterwards shrink the log file down to a tiny size.

Cheers!

Reply
0 Kudos
KraL
Enthusiast
Enthusiast

Hi,

In addition to the first post, and to prevent massive LOG file growth, you should tune the SQL query for VPX_EVENT table.

I had over 8M rows, and it take bunch of GB to delete them all in one shot (actually more than 30GB) ...

Just select them 500K per 500K, and the LOG file should not exceed 10Go.

Instead of 'delete from vex_event;' as suggested on Dealing with VCenter 4.1 Database Tables Growth | VMware Support Insider - VMware Blogs , just select less rows at once.


1. get to top 1000, in order to know how old is the oldest row.

SELECT TOP 1000 [EVENT_ID]

      ,[CHAIN_ID]

      ,[CREATE_TIME]

  FROM [VCDB].[dbo].[VPX_EVENT]

2. query one month and see how much rows would be impacted

SELECT [EVENT_ID]

      ,[CHAIN_ID]

      ,[CREATE_TIME]

  FROM [VCDB].[dbo].[VPX_EVENT] WHERE [CREATE_TIME] <= '2013-03-30';   

3. Delete theses rows and repeat until clearing the table

delete from vpx_event WHERE [CREATE_TIME] <= '2013-03-30';

You can check the actual size remaining on your table

- properties on table dbo.VPX_EVENT

- storage tab:

     - index space list remaining size,

     - row count remaining rows in this table.

Reply
0 Kudos
ron_kool
Contributor
Contributor

Great article Marko,

after searching on the internet for this problem with vCenter 5.0 I found your article and followed the procedure you described.

Our transcaction log had to grow to 28GB for a 6GB database but there after the problem was solved.

Thanks for sharing.

Regards, Ron

Reply
0 Kudos