I've been running vCenter 4.0U2 for a few weeks with no problems managing our vSphere 4 cluster. A couple of days ago I updated vCenter to 4.1 in preparation for upgrading the vSphere hosts. The upgrade went fine with no errors, and vCenter 4.1 was running at the end of Friday, but when I came in yesterday I noticed that the vCenter Server service itself was not running. If I manually restart it, it starts, then stops again a few seconds later. The Windows 2008 server it's running on was constantly trying to restart the service over the weekend, so the event log is full of event 7031/7036 errors. The actual error isn't very helpful; all I get is:
"The VMware vCenter Server service terminated unexpectedly"
I've checked as much as I can. The DB server (SQL 2008 Express) is running fine, and the VIM_VCDB database is accessible locally and remotely. This server does nothing other than vCenter, so I'm at a bit of a loss. All other vCenter services are running OK.
The only other thing I did prior to rebooting the server after the vCenter update was to install a couple of outstanding Windows patches, but these seemed innocent enough (s/w compatibility update, .NET 3.5 SP1, Malicious s/w removal tool, PowerShell 2.0/WinRM 2.0).
Is there anything else worth checking to find out what's causing this?
IIS is not installed on this server (never has been). If I manually start the vCenter service there are no errors and the service starts OK. Approx 15s after starting, it stops with the event 7031 'terminated unexpectedly' error. The vCenter service, along with the VirtualCenter Management WebServices service, runs under the local domain admin account, and I re-entered the password for it just in case. It didn't make any difference: the service starts, then stops. The WebServices service is running fine.
There are no space issues at all on any volumes, and plenty of Windows resources free.
I might be onto something here. I tracked down the vpxd.log, and noticed some SQL errors relating to the transaction log. I'm not an SQL expert by any means, but this snip from the log seems pretty severe. I've done nothing to the SQL database, and left all settings at default, but if there's a recommended way round this, then I'd be interested in hearing it.
Here's the snip:
Diagnostic data from driver is 42000:1:9002:[Microsoft][ODBC SQL Server Driver][SQL Server]The transaction log for database 'VIM_VCDB' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases
Failed to get next sequence number: "ODBC error: (42000) - [ODBC SQL Server Driver][SQL Server]The transaction log for database 'VIM_VCDB' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases" is returned when executing SQL statement "update vpx_sequence WITH (ROWLOCK) set id = ? where name = ?"
rdx: 0x00000000002200 rdi: 0x000000056dd2e0 rsi: 0x000000056dd850
r8: 0000000000000000 r9: 0000000000000000 r10: 0x0000003fb50000
r11: 0x000000056dc590 r12: 0x000000056de770 r13: 0x000000056dd8e8
r14: 0000000000000000 r15: 0x000000056dc698
backtrace rip 000000018010a1aa Vmacore::System::Stacktrace::CaptureWork
backtrace rip 00000001800e8018 Vmacore::System::SystemFactoryImpl::CreateFileWriter
backtrace rip 00000001800e850e Vmacore::System::SystemFactoryImpl::CreateQuickBacktrace
backtrace rip 000000013feae22c (no symbol)
backtrace rip 0000000076edc9cf UnhandledExceptionFilter
backtrace rip 0000000076fc8120 RtlCharToInteger
backtrace rip 0000000076f895a4 Cspecific_handler
backtrace rip 0000000076f85b4d RtlIntegerToChar
backtrace rip 0000000076f89947 Cspecific_handler
backtrace rip 0000000076f720b1 RtlRaiseException
backtrace rip 0000000076e376fd RaiseException
backtrace rip 00000000705938bb isexception_typeof
backtrace rip 0000000076f96721 RtlRestoreContext
backtrace rip 000000014012ed1b (no symbol)
backtrace rip 00000001404b7b8e (no symbol)
backtrace rip 00000001404dbb72 (no symbol)
backtrace rip 00000001404dd76e (no symbol)
backtrace rip 00000001404dde21 (no symbol)
backtrace rip 00000001404dde60 (no symbol)
backtrace rip 000000013fbaff96 (no symbol)
backtrace rip 000000013fbb0919 (no symbol)
backtrace rip 000000013fbabb4f (no symbol)
backtrace rip 000007fefee9fe11 SetServiceStatus
backtrace rip 0000000076e3be3d BaseThreadInitThunk
backtrace rip 0000000076f76a51 RtlUserThreadStart
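For anyone scanning their own vpxd.log for the same thing, here is a rough Python sketch that pulls the SQLSTATE, the native SQL Server error number, and the message out of a "Diagnostic data from driver" line like the one above. The function name and the regular expression are my own and only cover the line format shown in this snip. The log doesn't say what the middle field ("1") means, so it is skipped.

```python
import re

# Matches the diagnostic line format seen in the snip above:
#   Diagnostic data from driver is 42000:1:9002:[Microsoft][...][SQL Server]message
# Assumption: other vpxd.log versions may format this line differently.
DIAG_RE = re.compile(
    r"Diagnostic data from driver is "
    r"(?P<sqlstate>\w{5}):\d+:(?P<native>\d+):"
    r"(?:\[[^\]]*\])*(?P<message>.*)"
)


def parse_driver_diagnostic(line):
    """Return sqlstate, native error number and message, or None if no match."""
    match = DIAG_RE.search(line)
    if match is None:
        return None
    return {
        "sqlstate": match.group("sqlstate"),
        "native_error": int(match.group("native")),
        "message": match.group("message").strip(),
    }


sample = (
    "Diagnostic data from driver is 42000:1:9002:[Microsoft][ODBC SQL Server Driver]"
    "[SQL Server]The transaction log for database 'VIM_VCDB' is full."
)
info = parse_driver_diagnostic(sample)
print(info["sqlstate"], info["native_error"], info["message"])
```

Native error 9002 is SQL Server's "transaction log is full" error, so a match on that number is the thing to look for.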
Below is a snip from your snip. Your problem is that the transaction log is full.
42000:1:9002:[Microsoft][ODBC SQL Server Driver][SQL Server]The transaction log for database 'VIM_VCDB' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases
You may also want to think about switching to the simple recovery model instead of full.
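As a rough sketch of both suggestions, the Python below just prints two T-SQL statements that could be pasted into a query window: one reads the log_reuse_wait_desc column that the error message points to, the other switches the database to the simple recovery model. The helper names are made up for illustration. Whether simple recovery is acceptable depends on your backup requirements, since it gives up point-in-time restore from log backups.

```python
def quote_ident(name):
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(text):
    """Quote a SQL Server Unicode string literal, doubling embedded quotes."""
    return "N'" + text.replace("'", "''") + "'"


def log_reuse_query(db):
    # sys.databases shows the recovery model and why log space cannot be reused
    return (
        "SELECT name, recovery_model_desc, log_reuse_wait_desc "
        "FROM sys.databases WHERE name = " + quote_literal(db) + ";"
    )


def set_simple_recovery(db):
    # Under SIMPLE recovery the log is truncated at checkpoints, so no log backups are needed
    return "ALTER DATABASE " + quote_ident(db) + " SET RECOVERY SIMPLE;"


print(log_reuse_query("VIM_VCDB"))
print(set_simple_recovery("VIM_VCDB"))
```

If log_reuse_wait_desc comes back as LOG_BACKUP, the log is waiting for a transaction log backup before its space can be reused.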
That's great, thanks. Looks like I might be on the right track now. I'm running out of time to resolve this today, but will be back on the case tomorrow. I'll post back with results.
We ran into the same problem. You'll just need to do a transaction log backup on the SQL server and you'll be good to go. I'm still trying to figure out why the log isn't growing when it needs to though.
Yes, that is very true. I ran into this kind of problem before, and it turned out that the transaction log had reached its limit. I have now set it to grow / shrink without a limit and have never hit this problem again.
Just to let you know that everything is now working ok. It looks like the SQL TL was full, which was causing some problems. Looking closer at SQL, the VCDB database was set to unrestricted growth in 1MB increments, but the transaction log was set to 10% growth with a 500MB limit (and it was at 480MB).
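To put numbers on why the log had nowhere to go, here is the arithmetic for those settings as a small Python sketch (the function name is mine). It only does the sums and makes no claim about exactly how SQL Server handles a final growth step that doesn't fit under the limit.

```python
def log_growth_headroom(current_mb, growth_pct, max_mb):
    """Return (next increment, size after that increment, space left under the cap) in MB."""
    next_increment = current_mb * growth_pct / 100
    size_after_growth = current_mb + next_increment
    headroom = max_mb - current_mb
    return next_increment, size_after_growth, headroom


# Figures from the post: log at 480MB, 10% growth, 500MB limit
increment, after, headroom = log_growth_headroom(480, 10, 500)
print(f"next 10% increment: {increment:.0f}MB")
print(f"size after that increment: {after:.0f}MB")
print(f"space left under the limit: {headroom}MB")
```

A full 10% step from 480MB is 48MB, which would take the log to 528MB, past the 500MB limit. Either way there was at most 20MB left before the log hit the cap.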
Using SQL Server Management Studio, I've now set the TL to unrestricted growth. I then backed up the TL, followed by a full backup. The vCenter Server service then started OK, and has been running without problems ever since.
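For anyone who prefers a query window to the GUI, the same three steps can be sketched as T-SQL. The Python below only prints the statements, in the order I did them. The logical log file name (VIM_VCDB_log) and both backup paths are placeholders I've invented, not values from my server; the real logical name can be read from sys.database_files in the database.

```python
def quote_ident(name):
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(text):
    """Quote a SQL Server Unicode string literal, doubling embedded quotes."""
    return "N'" + text.replace("'", "''") + "'"


def fix_statements(db, log_file, log_backup_path, full_backup_path):
    """Build the statements: lift the log size cap, back up the log, then take a full backup."""
    return [
        f"ALTER DATABASE {quote_ident(db)} MODIFY FILE "
        f"(NAME = {quote_literal(log_file)}, MAXSIZE = UNLIMITED);",
        f"BACKUP LOG {quote_ident(db)} TO DISK = {quote_literal(log_backup_path)};",
        f"BACKUP DATABASE {quote_ident(db)} TO DISK = {quote_literal(full_backup_path)};",
    ]


# Hypothetical logical file name and backup paths; substitute your own
for statement in fix_statements(
    "VIM_VCDB",
    "VIM_VCDB_log",
    r"C:\Backup\VIM_VCDB_log.trn",
    r"C:\Backup\VIM_VCDB_full.bak",
):
    print(statement)
```

Unrestricted growth does mean the log can eventually fill the disk, so regular log backups (or the simple recovery model suggested earlier) are still worth setting up.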
One to remember! Thanks for the pointers.