Enthusiast

vCenter broken after upgrade to 4.1

I've been running vCenter 4.0U2 for a few weeks with no problems managing our vSphere 4 cluster. A couple of days ago I updated vCenter to 4.1 in preparation for upgrading the vSphere hosts. The upgrade went fine with no errors, and vCenter 4.1 was running at the end of Friday, but when I came in yesterday I noticed that the vCenter Server service itself was not running. If I manually restart it, it starts, then stops again a few seconds later. The Windows 2008 server it's running on was constantly trying to restart the service over the weekend, so the event log is full of event 7031/7036 errors. The actual error isn't very helpful; all I get is:

"The VMware vCenter Server service terminated unexpectedly"

I've checked as much as I can. The DB server (SQL 2008 Express) is running fine, and the VIM_VCDB database is accessible locally and remotely. This server does nothing other than vCenter, so I'm at a bit of a loss. All other vCenter services are running OK.

The only other thing I did before rebooting the server after the vCenter update was install a couple of outstanding Windows patches, but these seemed innocent enough (s/w compatibility update, .NET 3.5 SP1, Malicious s/w removal tool, PowerShell 2.0/WinRM 2.0).

Is there anything else worth checking to find out what's causing this?

12 Replies
Immortal

Is IIS installed and running? Do you have plenty of space available on the Windows volumes? What error do you get if you try to manually start the service?

Enthusiast

Thanks Troy

IIS is not installed on this server (never has been). If I manually start the vCenter service there are no errors and the service starts OK. Approximately 15 seconds after starting, it stops with the event 7031 'terminated unexpectedly' error. The vCenter service, along with the VirtualCenter Management WebServices service, runs under the local domain admin account, and I re-entered the password for it just in case. It didn't make any difference: the service starts, then stops. The WebServices service is running fine.

There are no space issues at all on any volumes, and plenty of Windows resources free.

Immortal

here are a couple places I would start

not necessarily the issue, but I would look at the registry just in case

or

Enthusiast

I might be onto something here. I tracked down the vpxd.log, and noticed some SQL errors relating to the transaction log. I'm not an SQL expert by any means, but this snip from the log seems pretty severe. I've done nothing to the SQL database, and left all settings at default, but if there's a recommended way round this, then I'd be interested in hearing it.

Here's the snip:

SQL execution failed: update vpx_sequence WITH (ROWLOCK) set id = ? where name = ?
Execution elapsed time: 0 ms
Diagnostic data from driver is 42000:1:9002:[Microsoft][ODBC SQL Server Driver][SQL Server]The transaction log for database 'VIM_VCDB' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases
Bind parameters:
datatype: 1, size: 4, arraySize: 0
value = 62401
datatype: 11, size: 26, arraySize: 0
value = "VPX_EVENT_SEQ"
Failed to get next sequence number: "ODBC error: (42000) - [ODBC SQL Server Driver][SQL Server]The transaction log for database 'VIM_VCDB' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases" is returned when executing SQL statement "update vpx_sequence WITH (ROWLOCK) set id = ? where name = ?"
Win32 Exception (0xe06d7363) detected at 76E376FD
rip: 0x00000076e376fd rsp: 0x000000056dc550 rbp: 0x000000056dcbf0
rax: 0x000000056dc590 rbx: 0x000000056de260 rcx: 0x000000056dc060
rdx: 0x00000000002200 rdi: 0x000000056dd2e0 rsi: 0x000000056dd850
r8: 0000000000000000 r9: 0000000000000000 r10: 0x0000003fb50000
r11: 0x000000056dc590 r12: 0x000000056de770 r13: 0x000000056dd8e8
r14: 0000000000000000 r15: 0x000000056dc698
Backtrace
backtrace[00] rip 000000018010a1aa Vmacore::System::Stacktrace::CaptureWork
backtrace[01] rip 00000001800e8018 Vmacore::System::SystemFactoryImpl::CreateFileWriter
backtrace[02] rip 00000001800e850e Vmacore::System::SystemFactoryImpl::CreateQuickBacktrace
backtrace[03] rip 000000013feae22c (no symbol)
backtrace[04] rip 0000000076edc9cf UnhandledExceptionFilter
backtrace[05] rip 0000000076fc8120 RtlCharToInteger
backtrace[06] rip 0000000076f895a4 Cspecific_handler
backtrace[07] rip 0000000076f85b4d RtlIntegerToChar
backtrace[08] rip 0000000076f89947 Cspecific_handler
backtrace[09] rip 0000000076f720b1 RtlRaiseException
backtrace[10] rip 0000000076e376fd RaiseException
backtrace[11] rip 00000000705938bb isexception_typeof
backtrace[12] rip 0000000076f96721 RtlRestoreContext
backtrace[13] rip 000000014012ed1b (no symbol)
backtrace[14] rip 00000001404b7b8e (no symbol)
backtrace[15] rip 00000001404dbb72 (no symbol)
backtrace[16] rip 00000001404dd76e (no symbol)
backtrace[17] rip 00000001404dde21 (no symbol)
backtrace[18] rip 00000001404dde60 (no symbol)
backtrace[19] rip 000000013fbaff96 (no symbol)
backtrace[20] rip 000000013fbb0919 (no symbol)
backtrace[21] rip 000000013fbabb4f (no symbol)
backtrace[22] rip 000007fefee9fe11 SetServiceStatus
backtrace[23] rip 0000000076e3be3d BaseThreadInitThunk
backtrace[24] rip 0000000076f76a51 RtlUserThreadStart
Generating minidump ...
CoreDump: Writing minidump
VpxUnhandledException
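
For anyone scanning vpxd.log for the same failure, here is a minimal Python sketch that pulls the error codes out of a "Diagnostic data from driver" line like the one in the snip. It assumes the colon-separated prefix is SQLSTATE, a record number, and the SQL Server native error number; that layout is inferred from this single log line, not taken from VMware documentation. The function names are made up for illustration. Native error 9002 is SQL Server's "transaction log is full" error.

```python
import re

# Assumed layout, inferred from the log line in this thread:
#   <SQLSTATE>:<record>:<native error>:<message>
DIAG_RE = re.compile(
    r"(?P<sqlstate>[0-9A-Z]{5}):(?P<record>\d+):(?P<native>\d+):(?P<message>.*)"
)

LOG_FULL_ERROR = 9002  # SQL Server: transaction log for the database is full


def parse_odbc_diagnostic(line):
    """Return the parsed fields of a diagnostic line, or None if none is found."""
    match = DIAG_RE.search(line)
    if match is None:
        return None
    return {
        "sqlstate": match.group("sqlstate"),
        "native_error": int(match.group("native")),
        "message": match.group("message").strip(),
    }


def is_log_full(line):
    """True if the line carries SQL Server native error 9002."""
    parsed = parse_odbc_diagnostic(line)
    return parsed is not None and parsed["native_error"] == LOG_FULL_ERROR


# The diagnostic line from the snip, shortened after the first sentence.
sample = (
    "Diagnostic data from driver is 42000:1:9002:[Microsoft][ODBC SQL Server "
    "Driver][SQL Server]The transaction log for database 'VIM_VCDB' is full."
)

if __name__ == "__main__":
    print(parse_odbc_diagnostic(sample))
    print(is_log_full(sample))
```

Running `is_log_full` over each line of the log would flag this condition without reading through the backtrace.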

Immortal

below is a snip from your snip. Your problem is that the transaction log is full.

42000:1:9002:[Microsoft][ODBC SQL Server Driver][SQL Server]The transaction log for database 'VIM_VCDB' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases

http://kb.vmware.com/kb/1000125

You may also want to think about switching the database from the full recovery model to the simple recovery model.

http://kb.vmware.com/kb/1003980
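
The two checks suggested here (looking at log_reuse_wait_desc, as the error message advises, and considering the simple recovery model) can be sketched as T-SQL. The Python below only builds the statements as strings so they can be pasted into a query window; the helper names are hypothetical, and nothing here connects to a server. Note that the simple recovery model gives up point-in-time restores from log backups, so treat the second statement as an option to weigh rather than a default.

```python
def quote_literal(value):
    """Quote a value as a T-SQL N'...' string literal."""
    return "N'" + value.replace("'", "''") + "'"


def quote_identifier(name):
    """Quote a name as a T-SQL [bracketed] identifier."""
    return "[" + name.replace("]", "]]") + "]"


def log_reuse_query(database):
    """Statement that shows the recovery model and why log space is not reused."""
    return (
        "SELECT name, recovery_model_desc, log_reuse_wait_desc "
        "FROM sys.databases WHERE name = " + quote_literal(database) + ";"
    )


def set_simple_recovery(database):
    """Statement that switches the database to the simple recovery model."""
    return "ALTER DATABASE " + quote_identifier(database) + " SET RECOVERY SIMPLE;"


if __name__ == "__main__":
    # VIM_VCDB is the database name from the error message in this thread.
    print(log_reuse_query("VIM_VCDB"))
    print(set_simple_recovery("VIM_VCDB"))
```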

Enthusiast

That's great, thanks. Looks like I might be on the right track now. I'm running out of time to resolve this today, but will be back on the case tomorrow. I'll post back with results.

Hot Shot

We ran into the same problem. You'll just need to do a transaction log backup on the SQL server and you'll be good to go. I'm still trying to figure out why the log isn't growing when it needs to though.

Virtuoso

Yes, that is very true. I ran into this kind of problem before, and it turned out that the transaction log had reached its size limit. I have since set it to grow (and shrink) without a size limit and have never hit this problem again.

Kind Regards,

AWT

/* Any kind of comment or input would be greatly appreciated */
Enthusiast

Just to let you know that everything is now working OK. It looks like the SQL transaction log (TL) was full, which was causing the problems. Looking closer at SQL, the VCDB database was set to unrestricted growth in 1MB increments, but the transaction log was set to 10% growth with a 500MB limit (and it was at 480MB).
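
Those numbers show how little room the log had left. As a rough check (plain arithmetic, not a model of SQL Server's autogrow behaviour), this Python sketch compares the next 10% growth increment with the space remaining under the 500MB cap. The function name is made up for illustration.

```python
def growth_headroom(current_mb, growth_pct, max_mb):
    """Return (next increment, remaining headroom, whether the increment fits)."""
    next_increment = current_mb * growth_pct / 100
    headroom = max_mb - current_mb
    return next_increment, headroom, next_increment <= headroom


if __name__ == "__main__":
    # Values from the post: 480MB log, 10% growth, 500MB limit.
    print(growth_headroom(480, 10, 500))  # (48.0, 20, False)
```

A full 10% increment (48MB) is more than the 20MB left under the limit, so the log was effectively at its ceiling.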

Using SQL Server Management Studio, I've now set the TL to unrestricted growth. I then backed up the TL, followed by a full backup. The vCenter Server service then started OK and has been running without problems ever since.
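
For anyone who prefers a query window to the Management Studio dialogs, the same three steps can be sketched as T-SQL. The Python below just builds the statements in the order described above. The logical log file name (VIM_VCDB_log) and the backup folder are assumptions for illustration; the real logical name can be read from sys.database_files in the vCenter database.

```python
def build_fix_statements(database, log_file, backup_dir):
    """Build T-SQL for: unrestricted log growth, log backup, full backup."""
    db = "[" + database.replace("]", "]]") + "]"

    def lit(value):
        # Quote a value as a T-SQL N'...' string literal.
        return "N'" + value.replace("'", "''") + "'"

    base = backup_dir.rstrip("\\") + "\\" + database
    return [
        f"ALTER DATABASE {db} MODIFY FILE (NAME = {lit(log_file)}, MAXSIZE = UNLIMITED);",
        f"BACKUP LOG {db} TO DISK = {lit(base + '_log.trn')};",
        f"BACKUP DATABASE {db} TO DISK = {lit(base + '_full.bak')};",
    ]


if __name__ == "__main__":
    # Hypothetical logical file name and backup folder; adjust to your server.
    for statement in build_fix_statements("VIM_VCDB", "VIM_VCDB_log", r"C:\Backup"):
        print(statement)
```

Removing the size cap only stops the log from hitting the limit; regular log backups (or the simple recovery model) are still what keep it from growing indefinitely.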

One to remember! Thanks for the pointers.

Contributor

I had the same problem. Thank you for the solution.

Enthusiast

I had this exact same problem today - thanks for the solution!

salimkhoury.com | @SalimKhoury
Contributor

Had the same issue.

Thanks for the post
