VMware Cloud Community
wapiti10
Enthusiast

Windows Server 2003 R2 64-bit, 16 GB RAM, 4 vCPUs: Extremely Long Boot Times

ESX 3.5 Host with 16 Cores and 64 GB RAM

Guest in question: Windows Server 2003 R2 64-bit, 16 GB RAM, 4 vCPUs

Started with 1 GB RAM, and the server booted fine. I configured my server with updates and shut down.

I added 15 GB of RAM and the server began taking up to 30 minutes to boot all the way up, hanging on the Windows splash screen with the scrolling graphic for the majority of that time.

Since then, here is what I have done (checking each time that the server sees the right amount of RAM and the correct number of CPUs; it does):

Scoured the community.

Checked the services for any that didn't start; nothing glaring.

Checked the event log (the event log service didn't start until Windows finally came up).

Checked limits and reservations: no reservations, and Unlimited is checked.

-no ballooning or swapping taking place on host

-shut down the server

-removed 14 GB of RAM from guest (2 GB total now) = boot in 2 minutes

-shut down, added 2 GB of RAM (4 GB total now) = boot in 2 minutes

-shut down, added 4 GB of RAM (8 GB total now) = boot in 2 minutes

-shut down, added 4 GB of RAM (12 GB total now) = boot in 2 minutes

-shut down, added 4 GB of RAM (back to 16 GB now) = boot in 2 minutes

-let server stand, selected "restart" from the shutdown menu = 30 minutes to come back to Windows

-doubled page file to 10 GB (I know that ideally I want my page file to equal the amount of memory in the OS, but my OS partition is too small. COULD THIS BE MY PROBLEM? Though it wouldn't explain why it booted in 2 minutes with 16 GB of RAM when I stepped the server up...)

-selected "restart" from the shutdown menu = 30 minutes to come back to Windows
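The pattern in the test sequence above is easier to see when the results are tabulated. The following Python sketch is only an illustration: the boot times are the approximate figures quoted in the post, and the labels "cold" (power-on after a shutdown) and "restart" (guest restart from the shutdown menu) are assumed names, not VMware terminology.

```python
# Observations transcribed from the post: (RAM in GB, kind of boot, approx. minutes).
# "cold" and "restart" are illustrative labels chosen for this sketch.
observations = [
    (2, "cold", 2),
    (4, "cold", 2),
    (8, "cold", 2),
    (12, "cold", 2),
    (16, "cold", 2),
    (16, "restart", 30),
    (16, "restart", 30),  # after doubling the page file to 10 GB
]


def slowest_by_kind(rows):
    """Return the longest boot time (in minutes) seen for each kind of boot."""
    worst = {}
    for _ram_gb, kind, minutes in rows:
        worst[kind] = max(worst.get(kind, 0), minutes)
    return worst


summary = slowest_by_kind(observations)
for kind, minutes in sorted(summary.items()):
    print(f"{kind}: up to {minutes} minutes")
```

Grouped this way, the long boots line up with the guest restart rather than with the amount of RAM alone, since 16 GB booted in 2 minutes after a power-on.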

OK, so you see my issue. I am looking for some help and I have some additional questions:

1. Could it be the page file?

2. Could I have a bad pair of memory modules in my host? Is there a log or a memory test in ESX that I could look at to find out?

3. Any other suggestions?

Thanks,

Dallas

58 Replies
zimmermann1
Contributor

Hello

Any replies to your SR so far?

Thanks in advance.

epping
Expert

Hi,

Give this a go: power down the VM, create a new VM from scratch with the 4 vCPUs and the memory, assign it the existing disk, and power on. Let me know if this improves the performance.

Thanks

zimmermann1
Contributor

Was this a hint for me?

epping
Expert

yes

bludden
Contributor

I have VMs that have issues with warm reboots. The VMs hang at the 2003 loading screen with the moving bar animation forever. If I do a reset of the VM when I see this activity, it boots up much, much quicker.




Brian

zimmermann1
Contributor

I have the same problem, but I only use 2 vCPUs and 16 GB RAM.

ncarde
Enthusiast

The SR was closed out at this time as a known issue that VMware is working on correcting in a future release.

The interim solution provided by VMware was to use either of the two options in the following KB (brute-force power operations or disabling page sharing; neither is that palatable, IMO).

http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1004901&sl...

As more customers leverage 'jumbo' VMs (e.g. in an SAP configuration), this issue may become more prevalent. It's good that VMware is working on a more permanent 'fix'.

bludden
Contributor

Thank you so much for this information, it has brought a lot of clarity to the testing I've been doing!

Since changing over to single CPUs our VMs have been working much better. I wish they would clarify what "specific hardware" is causing the problem....

Brian

robertl30
Contributor

This thread perfectly describes the issue we're seeing in our ESX 3.5 Update 3 environment. I've had to build 2 jumbo VMs for an email archiving project I'm doing: 4 vCPU, 16GB RAM, many TBs of RDM storage, W2K3R2x64 Std. I found the article that states you need to lie to VMware and tell it you're using Enterprise Edition Windows to get it to not complain about having over 4GB of RAM. And now I've seen the article referenced above, which states that "on some hardware" you see these long boot times.

I just completed a test where I tried shutdown/power-on vs. restart guest. Night and day difference: 1 minute boot time compared to 30 minutes or more. I also learned on my own that if I reduce the RAM to 15GB I get more reasonable boot times of 5 or 6 minutes. But the vendor of this archiving app wants 16GB, so you know how it will go if I say VMware can't handle that... they'll be putting me on physicals, and I'm already having to fight them to use VMware at all.

So... what's up with this? Large database apps are going to need 16, 32, 64GB of RAM. That's the whole point of an x64 OS. Is VMware working on this issue?

jesse_gardner
Enthusiast

We have the same problem as well. 64-bit Windows Server 2003 and 2008 both exhibit this. It seems that the more CPUs and the more RAM, the more outrageous the reboot times get. I did open an SR on the problem, but was told that it would be addressed in Update 3. Well, it obviously wasn't.

We're running on IBM x3850s and x3850 M2s (Intel Pentium-based and Core-based CPUs); both exhibit the problem.

rrgavin
Contributor

We are experiencing the same problem as mentioned above. Same configs, hardware.... I have passed it on to our TAM and BCS people, I will also approach IBM about this for their input. If I get an answer I will let you know.

Thx

jesse_gardner
Enthusiast

Was just made aware of KB 1004901, which seems to describe this problem perfectly. For what it's worth, I'm not seeing any proven performance degradation after the long boot is complete.

polysulfide
Expert

For optimal page file performance, set it to a large enough size and make it static by making the min and max size the same. Then run PageDefrag from Sysinternals to defragment the page file. Dynamically expanding or system-managed page files create a HUGE and FRAGMENTED file at creation, which SUCKS for performance. Make it adequate, persistent, and defrag it.
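As a rough illustration of the "min and max the same" rule, here is a minimal Python sketch that checks whether a page file entry is static. It assumes entries follow the `path initial maximum` form (sizes in MB) that Windows stores in the PagingFiles registry value; the function names and the sample values are hypothetical and chosen only for this example.

```python
def parse_paging_entry(entry):
    """Split an entry such as 'P:\\pagefile.sys 16384 16384' into its parts."""
    # rsplit from the right so a path containing spaces stays intact.
    path, initial_mb, maximum_mb = entry.rsplit(None, 2)
    return path, int(initial_mb), int(maximum_mb)


def is_static(entry):
    """True when initial and maximum sizes are equal and non-zero."""
    _path, initial_mb, maximum_mb = parse_paging_entry(entry)
    # In this sketch, 0 0 is treated as a system-managed (non-static) page file.
    return initial_mb > 0 and initial_mb == maximum_mb


# Hypothetical sample entries, not taken from any real server.
for sample in (r"P:\pagefile.sys 16384 16384", r"P:\pagefile.sys 4096 8192"):
    print(sample, "->", "static" if is_static(sample) else "not static")
```

A page file whose initial and maximum sizes differ can still grow at run time, which is the fragmentation risk described above.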

If it was useful, give me credit

Jason White - VCP

Zahni
Contributor

Yes, I have the same problem on new HP 465 G5 blades with 2x Opteron 2384 and 64 GB RAM.

This is not fixed with Update 4. And I have not only slow boot times, but also slow SAN access.

Here is a simple test:

Download H2testw.

With Mem.ShareScanGHz = 4 (default), the write performance is only 21-30 MB/s; with Mem.ShareScanGHz = 0, it runs at 127 MB/s.

VM Config: Windows 2008 64 bit, 16 GB RAM, 4 VCPU.
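For anyone who wants to repeat a comparison like this without H2testw, a rough sequential-write measurement can be sketched in Python. This is not H2testw and will not reproduce its numbers exactly; the function names, file name, and sizes below are assumptions made for illustration. Run it once with each Mem.ShareScanGHz setting and compare the results.

```python
import os
import tempfile
import time

MB = 1024 * 1024


def measure_write_mb_per_s(path, total_mb=64, chunk_mb=4):
    """Write about total_mb of random data to path and return throughput in MB/s."""
    chunk = os.urandom(chunk_mb * MB)
    written_mb = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written_mb < total_mb:
            f.write(chunk)
            written_mb += chunk_mb
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to disk before timing stops
    elapsed = time.perf_counter() - start
    return written_mb / max(elapsed, 1e-9)


def run_once(total_mb=64):
    """Measure a write into a temporary directory that is removed afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        return measure_write_mb_per_s(os.path.join(tmp, "write_test.bin"), total_mb)


if __name__ == "__main__":
    print(f"sequential write: {run_once():.1f} MB/s")
```

To test a SAN-backed volume rather than the temp directory, call `measure_write_mb_per_s` with a path on that volume, and use a file size large enough that caching does not dominate the result.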

jesse_gardner
Enthusiast

For the record, I'm not seeing slower SAN access between my operating system versions (2003 x86, 2003 x64, 2008 x86, 2008 x64). Granted, I have not tried turning page sharing off to see if it would improve across the board, but I do know that 2008 x64 with 4 CPUs is not significantly slower than the other OSes.

Tested with HDD Speed.

APlatt
Contributor

I too would like to chime in on this thread.

Same issues as all posts above mine.

Server 2003 Datacenter x64 as the guest, running on Dell R805 boxes with quad-core AMD Opteron 2356 2.3 GHz procs, 2 onboard NICs plus dual quad-port NIC cards in the PCI slots (10 active NICs total), and 64 GB RAM in each R805.

4 R805s in the cluster.

The 2003 Server guests are running 4 vCPUs and 16 GB RAM in an MSCS cluster using VMware best practices, with SQL 2005 in a failover config. All fully patched with all MS patches as of today.

Guests take up to an hour to restart, and can take 15-20 minutes to power up from cold start.

Even after they are up, the boxes seem sluggish.

These are virgin VMs with no databases being activated at this time.

The page file is on a separate drive from the C drive (the P drive for me). C and P are on the local drives of the R805s, as per VMware best practices. Unlimited resources; page files set to 18022 min and 28056 max as static entries on the P drive. No other data on the P drive.

All ESX boxes fully patched as of today, running 3.5.0, 153875.

VMware Tools upgraded on the guest OSes as well.

SAN is an EMC and has been flawless for some time now.

Will be following this thread.

Regards,

Allen P.

robertl30
Contributor

I have a case opened with VMware and we're (slowly) troubleshooting. I sent them logs anyway. Hopefully they'll come up with something.

My 4vCPU/16GB x64 VM takes 72 minutes to restart (or 3 minutes if I do a shutdown/power on).

FWIW, I tried turning page sharing off and rebooting the host. No help. The VM still takes over an hour to restart, and then it's very sluggish when under IO load. In particular we see a "system" thread that's gobbling CPU. If I drop it down to 4GB, then the IO is normal and reboots are normal. But then I'm missing a lot of RAM used by my app for database caching, so performance is still poor (though in a different way): I become CPU bound rather than disk bound.

I really hope they fix this. A 16GB machine isn't that unusual a config anymore.

APlatt
Contributor

I'm trying to copy some SQL .BAK files across, and even with the NIC only showing 1% utilization, my 4 procs are bouncing between 50 and 80% utilization and the desktop is almost unusable. I can go to Start > Run, type a command in, and watch the characters show up one by one by one....

Turning off the page file also did not resolve the problem for me; only reducing the memory to below 4 GB seemed to make things respond better.

I agree this is just plain silly.

RolandK
Contributor

We have two VMs, each W2K3 64-bit R2 with SP2 and 4 vCPUs. One has 16 GB of RAM and the other 8 GB of RAM. Both need about 5 minutes to reboot.

Our installation: ESX 3.5 Update 3 with 68 GB RAM and four quad-core AMD Opteron 2.2 GHz CPUs.

Check the VM Settings, Options, General Options, Guest Operating System Version. It should be Microsoft Windows Server 2003, Enterprise Edition (64-bit), not Standard Edition, because there is an issue with that in Update 3. It is solved in Update 4.

Kind regards

Roland Kudelic

robertl30
Contributor

Roland-

Are you saying you got this to work? Is the trick to use Enterprise Edition of Windows Server 2003 x64?

We are on ESX 3.5 Update 3. I found an article which explained that in order to access more than 4GB of RAM you have to set the VM (Edit Settings, Options, Version) to Windows Server 2003, Enterprise Edition, 64-bit. That is how we are set with this VM. I never really liked that setting, as it seemed like I was "lying" to VMware about what was installed. But if I set that to Standard Edition I get errors when trying to set vRAM higher than 4GB.

You mention an issue is fixed in Update 4. Can you be more specific? What exactly is fixed? Is there an article on it? Thanks.
