We just recently migrated a very large file server into ESX3i and Server 2008. Since the migration we've noticed we're getting very large spikes in CPU to 100% for some time. This is not very normal for a file server. Just curious if anyone has used Server 2008 on ESX3 and if there were any issues? I'm not sure what could be causing this. I've also noticed in my test system (also 3i) when I do a "full" format on a disk, it takes 3-4 minutes instead of the usual 2 seconds that always happened in Server 2003. Maybe this has something to do with it? Maybe we need an update or updated drivers for the SCSI controller.
Any input would be greatly appreciated.
I am seeing this issue with ESX 3.5 and Windows 2008 64-bit. Currently I'm just investigating the disk format issue. It does take quite a bit longer on Windows 2008 for some reason.
Kevin Riley
VMware Support
I am also seeing the same issue. I have a freshly installed Windows Server 2008 guest running on an ESXi host and the performance is poor compared to a freshly installed Windows Server 2003 guest on the same host. Both VMs are configured with the same resources but the 2003 one seems to run much quicker in terms of disk access and lower CPU usage.
Could this be a driver issue?
I have a problem very similar to this with Windows Server 2008. When I first logged into Server 2008 over RDP, it was laggy, slow, and generally unresponsive. Windows Server 2003 is a different story. CPU idle actually looks like a real machine, and I am able to use it without fear of overloading my VMware Server.
I'd like to know why as well, but I don't think we can point these issues directly to drivers. Server 2008 is quite a bit heavier than Server 2003 in terms of FS size. It's more secure, but the base install is 1.5 gigs larger for Server 08. Perhaps this causes an issue?
If anyone ever figures anything out, I'd like to know. I have a bunch of users complaining about their Windows 2008 virtual machines. Anything from slow access, periodic "hangs", choppy mouse movement, high CPU, you name it. It appears to be worse using the VI client than using Remote Desktop, but it's still noticeable.
I'm seeing this same problem on a brand new Server 08 x64 host on ESX 3.5. I've noticed the VM will peg the vCPU(s) on startup when it has 2 vCPUs, however does not do this with 1 vCPU. Host seems snappy enough when it's up and running, just longer boot times (sitting at the screen with the green progress bar) with 2+ vCPUs.
I work with the original poster. We're performing comprehensive performance comparisons between Win 2003 and Win 2008 on ESX 3.02 and physical hardware. We're seeing much higher CPU usage in certain situations. It appears to be related to file system access/security checks when files are read. You can easily run this comparison yourself by choosing a large folder of files (10,000+ files) and doing a "Reset permissions" on all files. Compare the results from 2003 and 2008 when virtual and you'll see a real difference. Win 2008 is also slower than 2003 in this respect on physical hardware, but not nearly as bad. When we perform large file transfers (single 300 MB files), massive FTPs, etc., we see great performance, more or less the same as physical. So this appears to have something to do with context switching/ACL checking on 2008, at least in our case. I still need to slice and dice our data some more; we'll post back if we are able to draw further conclusions.
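For anyone who wants a repeatable version of this kind of comparison, here is a rough Python sketch. It is not the test described above: it creates a folder of small files and times one os.chmod call per file as a stand-in for "Reset permissions". That is an assumption made for illustration only, since on Windows os.chmod just toggles the read-only attribute and does not rewrite ACLs. The function name and file count are made up.

```python
import os
import stat
import tempfile
import time


def time_permission_pass(file_count=10000):
    """Create file_count small files, then time one chmod per file.

    Hypothetical stand-in for "Reset permissions": on Windows os.chmod
    only flips the read-only attribute, it does not rewrite ACLs.
    """
    with tempfile.TemporaryDirectory() as root:
        paths = []
        for i in range(file_count):
            path = os.path.join(root, "file_%05d.txt" % i)
            with open(path, "w") as handle:
                handle.write("x")
            paths.append(path)

        start = time.perf_counter()
        for path in paths:
            # One metadata update per file; no file data is read or written.
            os.chmod(path, stat.S_IREAD | stat.S_IWRITE)
        elapsed = time.perf_counter() - start

    return file_count, elapsed
```

Running time_permission_pass() on a 2003 guest and a 2008 guest with the same virtual hardware would give two elapsed times to compare; it only measures per-file metadata overhead, not ACL evaluation itself.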
I'm assuming you are all running ESX 3.5 U2 (official support was added in this version)?
We are investigating this issue. One of the things that came up earlier is that Windows 2008 does more context switching than 2003. Also check the system calls and bytes written.
I ran the timeit utility (from the resource kit), and here is a comparison of both:
Windows 2003:
C:\WINDOWS\system32> timeit -d -i -t format.com E: /FS:NTFS
The type of the file system is NTFS.
WARNING, ALL DATA ON NON-REMOVABLE DISK
DRIVE E: WILL BE LOST!
Proceed with Format (Y/N)? Y
Verifying 15351M
Creating file system structures.
Format complete.
15719568 KB total disk space.
15653344 KB are available.
Version Number: Windows NT 5.2 (Build 3790)
Exit Time: 2:24 pm, Tuesday, October 14 2008
Elapsed Time: 0:00:11.359
Process Time: 0:00:01.031
System Calls: 72537
Context Switches: 130392
Page Faults: 6481
Bytes Read: 68855276
Bytes Written: 67989072
Bytes Other: 22704
Windows 2008:
C:\Windows\System32> timeit -d -i -t format.com E: /FS:NTFS
The type of the file system is NTFS.
WARNING, ALL DATA ON NON-REMOVABLE DISK
DRIVE E: WILL BE LOST!
Proceed with Format (Y/N)? y
Formatting 15357M
Creating file system structures.
Format complete.
15725564 KB total disk space.
15659348 KB are available.
Version Number: Windows NT 6.0 (Build 6001)
Exit Time: 12:54 am, Tuesday, October 14 2008
Elapsed Time: 0:04:04.921
Process Time: 0:00:29.015
System Calls: 349053
Context Switches: 271627
Page Faults: 2921
Bytes Read: 70004866
Bytes Written: 3286217320
Bytes Other: 716987
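To make the two runs easier to compare, the short Python sketch below divides each Windows 2008 figure by the matching Windows 2003 figure. The numbers are copied from the timeit output above; the dictionary keys and the ratios helper are just names chosen for this example.

```python
# Figures copied from the two timeit runs above.
win2003 = {
    "elapsed_s": 11.359,
    "process_s": 1.031,
    "system_calls": 72537,
    "context_switches": 130392,
    "bytes_written": 67989072,
}
win2008 = {
    "elapsed_s": 4 * 60 + 4.921,  # 0:04:04.921
    "process_s": 29.015,
    "system_calls": 349053,
    "context_switches": 271627,
    "bytes_written": 3286217320,
}


def ratios(baseline, other):
    """Return other/baseline for every metric in baseline."""
    return {name: other[name] / baseline[name] for name in baseline}


for name, value in ratios(win2003, win2008).items():
    print("%-17s %6.1fx" % (name, value))
```

On these figures the biggest gap is in bytes written (about 48x), followed by process time (about 28x) and elapsed time (about 22x), while context switches only roughly double.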
I'm running U2.
3.02 using Vista as the machine definition. 32 bit Win 2003/2008
I've just found that building a VM to run with a single vCPU the 2008 host runs much quicker for me than on 4 vCPUs.
I'm running on 3.5 U2.
I've opened a support case with VMware on it. I should hear from them today hopefully...
On the OS display I've noticed that it'll show sometimes the OS listed as vista in virtualcenter, while other times it shows server 2008 64-bit correctly... I've always seen it correctly displayed though when I went into the properties of the VM itself FWIW.
We have some preliminary results from our testing. Note that it's hard to isolate variables unless you have a lot of identical hardware lying around. Our primary test was an intensive file copy operation between a physical and a virtual Windows host. The physical host was kept the same for the duration of the tests, while we changed which VMs and platforms we tested against. The file copy operation consisted of 50,000 files totalling 350 MB of data, run on 1 vCPU guests, with inbound and outbound copies tested. Second note: none of these tests came anywhere near saturating our SAN or LUNs; we tested on SATA and FC drives on isolated LUNs with the same results.
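A workload of that shape (many small files, little total data) can be approximated with the Python sketch below. It is only an illustration under assumptions: it copies within a local temporary directory rather than between a physical and a virtual host, it reads "350 mb" as 350 MiB, and it makes every file the same size (about 7 KB with the defaults). The function name is made up.

```python
import os
import shutil
import tempfile
import time


def time_small_file_copy(file_count=50000, total_bytes=350 * 1024 * 1024):
    """Copy file_count equally sized files; return (seconds, files_per_second)."""
    size = total_bytes // file_count  # about 7 KB per file with the defaults
    payload = b"\0" * size
    with tempfile.TemporaryDirectory() as work:
        src = os.path.join(work, "src")
        dst = os.path.join(work, "dst")
        os.mkdir(src)
        for i in range(file_count):
            with open(os.path.join(src, "f%06d.bin" % i), "wb") as handle:
                handle.write(payload)

        start = time.perf_counter()
        shutil.copytree(src, dst)  # dst must not exist before the call
        elapsed = time.perf_counter() - start

    rate = file_count / elapsed if elapsed > 0 else float("inf")
    return elapsed, rate
```

Files per second is a more telling number than MB/s here, because with files this small the per-file open/close/metadata cost dominates the actual data transfer.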
Here are the initial observations with a speculative conclusion.
1) When running on identical physical hardware Windows 2008 is 10-50% slower than 2003 in intensive file I/O operations (test above)
2) Virtualizing Windows 2003 OR Windows 2008 causes a 30-50% loss in performance for our specific test, compared with the identical non-virtual OS on bare hardware (it's the context switching). We also performed some large file tests with single-gigabyte files; virtual performance was almost identical to raw hardware.
3) When running on identical virtual hardware, Windows 2008 is tremendously slower (400%) than Windows 2003 in very specific situations (high context switch situations). Otherwise performance is roughly equivalent.
If you read this you're probably asking, "Why does test 3 show the OSes performing almost the same in most situations, but test 1 doesn't?" It's all about the context switching: we just can't get those numbers to rise properly when virtualized, and it appears Win 2008 has a higher requirement on top of that. With context switch performance restricted, the OSes perform pretty close to each other except in cases where that restriction causes a huge loss of performance. Clear as mud to me.
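If you want to watch the context switch rate yourself while a test runs, one option is the built-in typeperf tool and the "\System\Context Switches/sec" counter. The Python sketch below wraps it; this is not how the numbers above were collected, the function names are made up, and the parser simply assumes typeperf's usual two-column CSV output (timestamp, value).

```python
import csv
import io
import subprocess


def parse_typeperf_csv(text):
    """Return numeric samples from typeperf CSV output, skipping header and status lines."""
    samples = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 2:
            continue  # blank lines and one-field status messages
        try:
            samples.append(float(row[1]))
        except ValueError:
            continue  # header row or a non-numeric sample
    return samples


def sample_context_switches(samples=10):
    """Run typeperf (Windows only), one sample per second, and return the values."""
    result = subprocess.run(
        ["typeperf", r"\System\Context Switches/sec", "-sc", str(samples)],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_typeperf_csv(result.stdout)
```

Calling sample_context_switches(30) inside the guest during a file copy, once on 2003 and once on 2008, gives two lists of per-second rates that can be averaged and compared.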
We plan to test further, isolating performance differences from ESX 3.02 to 3.5U2 on the same SAN, same hardware, same VM's. This one should be interesting.
Conclusions?
Context Switching is a major issue for Virtualization over bare metal.....
Windows 2008 is just like its father, Vista. MS needs to go back to the drawing board; something is sub-optimal in the kernel. This is the same sort of thing that made me format my laptop after running Vista for 18 months: it was just "slow" in a way that was hard to put my finger on.
If anyone wants the spreadsheet with our testing methodology and results, let me know. More results to come.......
Spoke with a support rep today; it's a confirmed problem that VMware is aware of and working on. He indicated that this was a problem with quad-core Intel ESX hosts and that there would be a release to resolve this issue. He wasn't sure when, but it didn't sound like it would make it into U3. There isn't a KB article about this issue yet, though he mentioned that there was some sort of possible workaround...
Can you find out what the workaround is...?
On 17 Oct 2008, at 22:13, DanDill <communities-emailer@vmware.com
I believe it was to disable VMkernel.Boot.pageSharing on the ESX host, though don't quote me on that. That seemed to be aimed more at hosts that had memory contention and/or were taking a very long time to boot (an hour or so). Neither of those applied to my situation, as mine boot slowly but still in a reasonable amount of time.
this may be of relevance as well: http://kb.vmware.com/kb/1004901
Apparently we're all having slightly different issues. Does anyone else notice the abysmal performance of Win 2008 in general and an even worse showing when virtualized (even with 1 vCPU)? Perhaps we shouldn't expect great performance from a heavily hit IIS server or File Server with millions of files. Win2k3 runs laps around it though....
I've noticed that formatting a disk in W2008 is much slower overall than in W2003, both in virtual machines and on physical servers.
So I believe that this is probably by design.