VMware Communities
marcoecc
Contributor

WS10 on Linux host breaks Windows 8 guests

I've recently upgraded WS on Linux (CentOS 6.4) from 9.0.1 to 10.0.0.

All the existing guests work perfectly well, both before and after upgrading their virtual hardware from 9 to 10, except for a Windows8 VM.

On this Win8 VM's bootup, WS10 will always show a dialog window saying "The operation on file <virtual-disk-file> failed. Cancel, Retry, or Continue (pass the error to the guest)". The message is longer, but no further detail is given.

The vmware log shows something like: msg.vmxaiomgr.retrycontabort.rudeunplug:Operation on file <virtual-disk-file> failed. If the file resides on a remote file system, please make sure your network connection and the server where this disk resides are functioning properly. If the file resides on removable media, reattach the media. (BTW, the virtual disk is local to the WS host machine).

"Retry" will simply keep showing the same error forever, "Cancel" shuts down the virtual machine, and "Continue" will always and reliably corrupt Win8's disk beyond repair: after exactly 3 times hitting "Continue" in the same session, Win8 enters in an infinite loop of automatic repair, and is unsalvageable: none of the repair methods, advanced or not, command prompt (bootrec, bootsect, bcdedit, bcdboot, etc) nor the repair disc or usb can solve it. I still have this VM available only becuause I had external binary backups of all the partitions.

"strace" shows that WS10 issues a syscall that WS9 doesn't, which returns an error and causes the dialog window and subsequent errors, and that seems to be the culprit:

vmware.23216:syscall_295(0x5a, 0x25e9de0, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

vmware.23216:syscall_295(0x10, 0x25e9de0, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

vmware.23219:syscall_295(0x5a, 0x25e9de0, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

vmware.23219:syscall_295(0x5a, 0x25e9de0, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

vmware.23317:syscall_295(0x5a, 0x7fee6d1d3f90, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

vmware.23317:syscall_295(0x5a, 0x25f0840, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

errno 22 is EINVAL (invalid argument) and by looking at the kernel headers I believe this syscall is "preadv". Notably, some "syscall_295" calls do succeed (not shown here).
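Out of curiosity I also put together a tiny stand-alone test (entirely my own sketch, nothing to do with WS; it assumes an x86_64 host where 295 is preadv, and uses a throwaway file /tmp/preadv_test.bin). It issues the raw syscall with the same 0x6a5 iovec count as the failing calls, so it should show whether the kernel itself rejects such a count with EINVAL:

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>

#define NVEC 0x6a5   /* 1701 entries, same count as the failing calls */

int main(void)
{
    struct iovec iov[NVEC];
    char *buf;
    long rc;
    int fd, i;

    fd = open("/tmp/preadv_test.bin", O_RDWR | O_CREAT, 0600);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, (off_t)NVEC * 4096) < 0) { perror("ftruncate"); return 1; }

    buf = malloc((size_t)NVEC * 4096);
    if (buf == NULL) { perror("malloc"); return 1; }

    for (i = 0; i < NVEC; i++) {
        iov[i].iov_base = buf + (size_t)i * 4096;
        iov[i].iov_len  = 4096;
    }

    /* Raw syscall 295 (preadv on x86_64): fd, iov, iovcnt, pos_low, pos_high.
       The kernel is expected to reject iovcnt > 1024 (UIO_MAXIOV) with EINVAL. */
    errno = 0;
    rc = syscall(295, fd, iov, NVEC, 0L, 0L);
    printf("syscall_295 with %d iovecs = %ld, errno %d (%s)\n",
           NVEC, rc, errno, strerror(errno));
    return 0;
}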

The error persists with 10.0.1, which I was prompted to download and install about 3 hours ago: unfortunately, it made no difference whatsoever to my problem.

When running WS9, I cannot see any such call being made to either syscall_295 (preadv) or syscall_296 (pwritev), and the Windows8 VM works perfectly well (audio and USB included).

The irony is that I forked out for the upgrade explicitly for its advertised support for Windows 8.1 (the guest is still 8), and I'm extremely disappointed at this inexplicable regression.

I have a fairly "regular" CentOS 6.4 machine, with all the latest updates, no extraneous kernel modules (eg, no Nvidia native driver), and just the basic software to support WS.

I've gone back and forth between 9.0.1 and 10.0.0 (and 10.0.1) several times over the last week, restoring backups in between, and this situation is reproducible every time, without fail.

Additionally, if I don't hit "Continue" (thus preventing WS10 from corrupting the vdisk), then the same VM with the same disk will work with no issue in WS9 again, including after changing the virtual HW version from 10 back to 9 in the vmx file in the cases where I had upgraded it: even this makes no difference.

Above I mentioned "<virtual-disk-file>" because I've tried three different configurations:

1. the VM was originally running from its factory partition on the original, unchanged hard disk (now not the boot disk), using the entire physical disk as its virtual disk (/dev/sdb, with MSR, EFISYS and 2 RE partitions, plus Windows's)

2. thinking that accessing the disk device could be part of the problem, I moved the entire disk setup into a single partition (by disk imaging), so the virtual disk was /dev/sdb7, a single physical partition (160GB) rather than the whole disk.

3. I then created a regular -flat.vmdk file (160GB) and a new VM using that rather than a physical disk or partition

I can still test each of the 3 configurations: WS9 works perfectly well with each of them; WS10 fails in exactly the same way: operation failed on /dev/sdb, or /dev/sdb7, or Windows8-flat.vmdk, always with syscall_295 returning -1 (errno 22).

I checked the syscall's arguments in the strace output: the first one is the file descriptor, and each time it matches the one returned by a recent preceding "open" syscall on the same virtual disk file.

The Windows8 VM boots via EFI, but again, this is no issue in WS9 after setting firmware = "efi" in the vmx file, of course.

I don't know what else to try, except running WS10 on the host and WS9 in a hardware-virtualized guest, and Win8 within it -- but I'd rather avoid this.

I also checked the .vmdk files: I can't figure out why WS seems to be randomly changing IDs in the .vmdk descriptor (not the -flat file), e.g.:

CID=0689ad28

parentCID=ffffffff

ddb.longContentID = "dcae36e73fb7776da57edc180689ad28"

but this is not a problem with WS9 with any VM, or WS10 with any non-Windows machine.

Can anybody help?

(Despite having just paid £90.95 for the upgrade to 10.0.0, I seem to understand I have no support entitlement for this product.)

11 Replies
dariusd
VMware Employee

Interesting.  Painful.  I'm very glad you had a backup!  Thanks for your calm and detailed post in the face of a situation that could have been much worse.  Let's try to figure out what went wrong...

At what point of the Win8 boot process does the VM fail?  Has the circulating-white-dots progress gizmo appeared on the startup screen by the point of the failure?

Does the host strace show any attempt at a regular pread/pwrite immediately after the failing preadv/pwritev?  If your kernel is working correctly, it is rejecting the preadv/pwritev because there are too many iov entries (iovcount of 0x6a5 exceeds 0x400, the kernel limit, so EINVAL is actually as expected), but then if your glibc is working correctly, it should handle that condition gracefully and convert the preadv/pwritev to a pread or pwrite syscall -- slightly less efficient due to the use of an internal bounce buffer, but it should work.  I have no idea why we are generating an I/O with 0x6a5 iov entries... that seems excessive, but it might simply be indicative of an unusual request by the guest...
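To make that explicit, here is a rough sketch of the fallback I'm describing (my own illustration of the behavior I'd expect from the libc wrapper, not actual glibc source): try the vectored call first, and if the kernel rejects it, fall back to a single pread() into a bounce buffer and scatter the data into the caller's iovecs:

#define _GNU_SOURCE
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

ssize_t preadv_with_fallback(int fd, const struct iovec *iov, int iovcnt, off_t off)
{
    ssize_t n;
    size_t total = 0, copied = 0, chunk;
    char *bounce;
    int i;

    n = preadv(fd, iov, iovcnt, off);        /* try the vectored syscall first */
    if (n >= 0 || (errno != EINVAL && errno != ENOSYS))
        return n;                            /* success, or a genuine I/O error */

    /* Fallback: one pread() into a bounce buffer, then scatter into the iovecs. */
    for (i = 0; i < iovcnt; i++)
        total += iov[i].iov_len;

    bounce = malloc(total);
    if (bounce == NULL)
        return -1;

    n = pread(fd, bounce, total, off);
    if (n > 0) {
        for (i = 0; i < iovcnt && copied < (size_t)n; i++) {
            chunk = iov[i].iov_len;
            if (chunk > (size_t)n - copied)
                chunk = (size_t)n - copied;
            memcpy(iov[i].iov_base, bounce + copied, chunk);
            copied += chunk;
        }
    }
    free(bounce);
    return n;
}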

Just to round out the information gathering, I'd appreciate seeing the output of the following two commands on your CentOS 6.4 host:

   uname -a

   rpm -qf /lib64/libc-*

Thanks,

--

Darius

marcoecc
Contributor

Hi Darius,

thank you for your quick reply. I've reached sky-high levels of frustration with this problem, which has been going on for about 3 weeks now, but your hints seem very promising!

The Win8 VM fails right at the start, just 2-3 seconds after the EFI screen disappears (I set bios.bootdelay = "5000"), while the light blue Win8 logo is displayed. The whirling dots have at that point barely appeared.

If I hit "continue" on a "fresh" VM exactly 3 times, then the VM boots, but Win8 shows the blue screen and no means can repair the installation (I think I know and tried all of them, including repair and install disc, after extensive researching): the disk gets fundamentally corrupted, as "chkdsk" also confirms. On subsequent boots of the same VM, the WS "operation failed on <vdisk-file>" error window no longer appears, and Win8 just goes straight in one of the blue screens variations from which there's no exit but shutdown.

There don't seem to be any successful preadv/pwritev calls after the failing ones. The following are the last lines of a child strace log from one of these failed boots: it contains 256 calls to syscall_295 (most of them) and syscall_296 (far fewer), and only the last two fail.

Interestingly, the 3rd argument is almost always fairly small: 0x1, 0x4, 0x8, 0x10, etc., while in the failing calls it is 0x6a5. There is also a successful call with 0x100 (still far lower than 0x6a5), though:

syscall_295(0x5a, 0x7fee6c03a000, 0x100, 0x31f0ff000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = 0x100000

syscall_295(0x5a, 0x25bbcd0, 0x7, 0x2b5839000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = 0x7000

syscall_295(0x5a, 0x27e7d80, 0x7, 0x2baf71000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = 0x7000

syscall_295(0x5a, 0x25bbcd0, 0x7, 0x2c49d9000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = 0x7000

syscall_295(0x5a, 0x27e7d80, 0x7, 0x2cd759000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = 0x7000

syscall_295(0x5a, 0x27e7d80, 0x1, 0x2f56f0000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = 0x1000

syscall_295(0x5a, 0x27e7d80, 0x1, 0x2bad0e000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = 0x1000

syscall_295(0x5a, 0x25e9de0, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

syscall_295(0x10, 0x25e9de0, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

Possibly interesting is also the fact that I have exactly 3 child strace traces containing "errno 22" for that syscall, and I'm positive these correspond to the 3 times I can hit "Continue" before Win8 boots into its blue screen (note the PIDs):

# grep "errno 22" *

vmware.23216:syscall_295(0x5a, 0x25e9de0, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

vmware.23216:syscall_295(0x10, 0x25e9de0, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

vmware.23219:syscall_295(0x5a, 0x25e9de0, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

vmware.23219:syscall_295(0x5a, 0x25e9de0, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

vmware.23317:syscall_295(0x5a, 0x7fee6d1d3f90, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

vmware.23317:syscall_295(0x5a, 0x25f0840, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

>>> I have no idea why we are generating an I/O with 0x6a5 iov entries... that seems excessive, but it might simply be indicative of an unusual request by the guest.

Indeed it can't be a coincidence that all the failing calls have 0x6a5, which seems quite a high value for its purpose. When this happens, the VM is literally just booting up, at the light blue Win8 logo: nothing unusual.

Erm... perhaps we should actually find out where that 0x6a5 (constant value) comes from!

This is the command line I use when troubleshooting the issue; it is issued as "root", after "vmware" has been started by the regular "vmware" user, and just after starting the relevant virtual machine.

# rm -f strace10/vmware.* ; strace -e trace=file -o strace10/vmware -f -ff -p $(ps -ef | grep Windows8.vmx | grep -v grep | awk '{ print $2; }')

This is the output of the 2 commands as requested:

# uname -a

Linux hypervisor 2.6.32-358.23.2.el6.x86_64 #1 SMP Wed Oct 16 18:37:12 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

# rpm -qf /lib64/libc-*

glibc-2.12-1.107.el6_4.5.x86_64

Kind regards,

Marco

dariusd
VMware Employee

What are the next few file-related syscalls following the failing syscall_295?  i.e.  Assuming your most recent strace run was with "-e trace=file" and not a more limited set of traced functions, what is the output of

   grep -A5 'syscall_29[56].*errno 22' vmware.23216

The value 0x6a5 probably derives from the structure of the file(s) on disk that the guest kernel is trying to load.  Once the guest kernel has loaded its filesystem structures into memory, a request to read the entirety of a large and fragmented file can be represented as a single I/O to the virtual storage device with a huge IOV, each describing a single contiguous run of allocation units containing file data.  Sometimes we need to split or merge these I/Os or iovec entries along the way to the host (particularly when the virtual disk is backed by a growable .vmdk which might add its own internal fragmentation to the mix).

An I/O vector of that size could result from a guest read of a highly-fragmented file of as little as 13 MBytes.
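(Rough arithmetic behind that estimate, assuming each iovec entry describes a single run of about 8 KiB: 0x6a5 = 1701 entries, and 1701 x 8 KiB is roughly 13.3 MB.)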

I don't think there is anything further to learn by attempting to "Continue" and/or repair the broken VM... at that point, the guest is probably very confused about what was actually read/written to the disk, and subsequent guest disk corruption is not a surprise.

I think the best step forwards is to see what is happening immediately after the failed preadv/pwritev, using the command I gave above.

Any chance you could attach a vmware.log from a failed boot?  (Just choose "Use advanced editor" in the top-right when composing your reply, then use the "Browse" button below the compose window to attach it.)

Thanks,

--

Darius

marcoecc
Contributor

Hi Darius,

The failing calls are the last ones each child process executes (3 of them, corresponding to each "Continue" before the VM boots into Win8's blue screen):

# grep -A5 'syscall_29[56].*errno 22' vmware.*

vmware.23216:syscall_295(0x5a, 0x25e9de0, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

vmware.23216:syscall_295(0x10, 0x25e9de0, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

--

vmware.23219:syscall_295(0x5a, 0x25e9de0, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

vmware.23219:syscall_295(0x5a, 0x25e9de0, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

--

vmware.23317:syscall_295(0x5a, 0x7fee6d1d3f90, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)

vmware.23317:syscall_295(0x5a, 0x25f0840, 0x6a5, 0x2e779a000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) = -1 (errno 22)




>>> The value 0x6a5 probably derives from the structure of the file(s) on disk that the guest kernel is trying to load.  Once the guest kernel has loaded its filesystem structures into memory, a request to read the entirety of a large and fragmented file can be represented as a single I/O to the virtual storage device with a huge IOV, each describing a single contiguous run of allocation units containing file data.  Sometimes we need to split or merge these I/Os or iovec entries along the way to the host (particularly when the virtual disk is backed by a growable .vmdk which might add its own internal fragmentation to the mix). <<<


So it seems that Windows8 reading the various EFI boot files and directories causes those calls with the 0x6a5 3rd argument. From your previous message I understand the kernel's limit is 0x400 for that value, and that WS relies on the kernel to convert those calls to regular "pread"/"pwrite" when they fail. That doesn't seem to be happening, at least with the latest kernel on CentOS 6.4. But why doesn't WS ensure it doesn't issue an "illegal" call in the first place, and use "pread"/"pwrite" itself when the -v flavours can't be used?
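To illustrate the kind of capping I have in mind (purely my own sketch of the idea, I'm obviously not suggesting this is how your I/O code is structured), an oversized scatter read could be split into batches of at most IOV_MAX entries so the kernel limit is never exceeded:

#define _GNU_SOURCE
#include <limits.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef IOV_MAX
#define IOV_MAX 1024   /* the kernel's per-call limit (UIO_MAXIOV) */
#endif

ssize_t preadv_capped(int fd, const struct iovec *iov, int iovcnt, off_t off)
{
    ssize_t total = 0, n, want;
    int done = 0, batch, i;

    while (done < iovcnt) {
        batch = iovcnt - done;
        if (batch > IOV_MAX)
            batch = IOV_MAX;

        n = preadv(fd, iov + done, batch, off + total);
        if (n < 0)
            return total > 0 ? total : -1;
        total += n;

        /* A short read means EOF (or something odd): stop rather than guess. */
        want = 0;
        for (i = done; i < done + batch; i++)
            want += iov[i].iov_len;
        if (n < want)
            break;

        done += batch;
    }
    return total;
}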




I've re-installed 10.0.0 and immediately accepted, downloaded and installed the 10.0.1 update, like before. After restarting WS10, I've re-launched a Win8 VM (Windows8vmdk) that was working perfectly just 1 minute before with WS9: USB, audio and all, and no messages in the terminal window from where I launched WS as user vmware.


On the VM's launch in WS10.0.1 I immediately got the error (see WS10-error-1.png) and hit "Cancel": I forgot to mention earlier that even in this case WS10 says it might already have corrupted the disk (can't sync, see WS10-error-2.png) -- however if I re-install WS9, it will work (because I didn't "Continue"). Another thing I forgot to mention earlier: WS10.0.1 keeps showing this annoying message:

Home directory /home/vmware not ours.

Home directory /home/vmware not ours.

Home directory /home/vmware not ours.

Home directory /home/vmware not ours.

Home directory /home/vmware not ours.

Home directory /home/vmware not ours.

in the terminal window where I launched WS ("vmware") as the regular non-privileged user "vmware". Of course /home/vmware is owned (u&g) by vmware. This doesn't happen with WS9 or WS10.0.0.




After closing WS, this is the content of the directory:

[vmware@hypervisor:0:/vms/vmware/Windows8vmdk]$ ll -tr

total 167782872

-rw-r--r--. 1 vmware vmware            0 Oct 26 11:49 Windows8vmdk.vmsd

-rw-r--r--. 1 vmware vmware          267 Oct 26 11:49 Windows8vmdk.vmxf

drwxrwxrwx. 3 vmware vmware         4096 Oct 26 11:55 caches

-rw-------. 1 vmware vmware      3174922 Oct 26 18:00 vmmcores-1.gz

-rw-------. 1 vmware vmware      3123617 Oct 26 18:11 vmmcores-2.gz

-rw-r--r--. 1 vmware vmware       160684 Oct 26 22:04 vmware-2.log

-rw-r--r--. 1 vmware vmware       151279 Oct 26 22:13 vmware-1.log

-rw-r--r--. 1 vmware vmware       152497 Oct 28 01:15 vmware-0.log

drwxrwxrwx. 2 vmware vmware         4096 Oct 28 01:22 Windows8vmdk.vmdk.lck

-rwxr-xr-x. 1 vmware vmware         3328 Oct 28 01:22 Windows8vmdk.vmx

-rw-------. 1 vmware vmware        74232 Oct 28 01:24 Windows8vmdk.nvram

-rw-r--r--. 1 vmware vmware          514 Oct 28 01:24 Windows8vmdk.vmdk

-rw-r--r--. 1 vmware vmware 171798691840 Oct 28 01:24 Windows8vmdk-flat.vmdk

-rw-------. 1 vmware vmware      3051455 Oct 28 01:25 vmmcores.gz

-rw-r--r--. 1 vmware vmware      1026638 Oct 28 01:25 vmware.log

(the older core dumps were caused by the alsa driver: unless I set 24bit, 44100Hz in Win8, the virtual machine receives a SIGSEGV (sigh) when using sound; unrelated to the present issue, and easily solved thanks to your very good KB)

The .vmdk is left locked, the .vmx says "not shut down cleanly", there's a core dump, and the terminal window shows:

[vmware@hypervisor:0:/vms/vmware/Windows8vmdk]$ vmware

Logging to /tmp/vmware-vmware/vmware-modconfig-31875.log

User requested abort: Exiting because of failed disk operation.

(BTW, I found how to solve the annoying "glide" (or other GTK2 engine), "canberra" and "pk-gtk" shared object warnings, so the only messages on the terminal stdout/stderr besides the modconfig notice are actual errors.)


Please also find attached the vmware.log mentioned above and the strace files corresponding to this launch.





Kind regards,

Marco




marcoecc
Contributor

Just noticed: the 3rd argument in the failing calls is now -always- 0x6a9, i.e. 4 higher than before, just as this VM's name is 4 characters longer. Probably obvious and irrelevant, but at least it means 0x6a5 is not some "magic" value.

dariusd
VMware Employee

Hi Marco!

There is an important distinction to make between vmware-vmx's interaction with libc and your libc's interaction with your kernel.  The kernel has a limit of 0x400 iovec entries for its preadv/pwritev syscall; libc does not (or should not) have any such limit.  strace shows what is going on between libc and the kernel, not what happens between WS and libc.  We always talk to libc, and libc is supposed to be responsible for issuing the preadv to the kernel as-is, checking if the kernel rejected it, and in the failure case allocating a buffer and reissuing the request as a pread syscall, precisely to ensure that the application (WS) does not need to deal with the implementation details of the underlying kernel.  We're depending on that libc behavior, and for some reason it's not working out on your system.  As far as I can tell, your kernel is doing the right thing... it is your libc that is failing us.

Time to go rummage through the source of the CentOS 6.4 glibc-2.12-1.107.el6_4.5.x86_64 ...

Thanks,

--

Darius

dariusd
VMware Employee

From the source, it seems that the implementations of preadv/pwritev do not have the oversized-request recovery implied by the manpage.  I don't quite know why.

I now see that the documentation doesn't say that preadv and pwritev behave in the way I described... they only discuss that in the context of readv and writev (EDIT: I originally wrote "pread and pwrite" here...  Don't post while sleepy, kids.).  The source code suggests that the latter don't actually have the oversized-request recovery either...  Weird.
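For anyone who wants to check their own glibc, this is the sort of quick throwaway test I'd use (mine, not from the glibc test suite): call the preadv() wrapper on an ordinary file with more than 1024 one-byte iovec entries. If the wrapper has the fallback it returns a (possibly short) byte count; if it just passes the request straight to the kernel it fails with EINVAL:

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

#define NVEC 1701   /* 0x6a5, same count as the failing calls */

int main(void)
{
    static char buf[NVEC];
    struct iovec iov[NVEC];
    ssize_t n;
    int fd, i;

    fd = open("/etc/hosts", O_RDONLY);   /* any ordinary readable file will do */
    if (fd < 0) { perror("open"); return 1; }

    for (i = 0; i < NVEC; i++) {
        iov[i].iov_base = &buf[i];
        iov[i].iov_len  = 1;
    }

    errno = 0;
    n = preadv(fd, iov, NVEC, 0);
    printf("preadv(fd, iov, %d, 0) = %zd, errno %d (%s)\n",
           NVEC, n, errno, n < 0 ? strerror(errno) : "-");
    return 0;
}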

I'll file a bug report here to investigate further.

Thanks,

--

Darius

Message was edited by: Darius Davis

marcoecc
Contributor

Hi Darius,

Is there any workaround I can use in the meantime? This looks like it is going to take some time to fix properly, unless you release a patch that, like WS9, doesn't make these p*v system calls. I've been looking for things like a compatibility wrapper library, but I've found nothing.

Without any doubt the cause of the error is the new system calls, whether in this instance they are mishandled by WS10, glibc or the kernel: strace shows WS9 doesn't make them. This also only happens with Windows8, and since my setup is fairly conventional, I wonder how this could not have been picked up by QA or regression testing... I can't be the only one running a Windows8 guest on a CentOS (or RHEL) 6.4 host. I'm wondering whether it's my particular Windows 8 EFI and boot loader settings (meaning: the combination of boot manager and loader entries, etc.) that somehow manages to trip WS/glibc into making these unhandled calls, but the setup is still exactly as the OEM (Lenovo) created it: there must be thousands of these same setups. Besides, I still could not trust a product/library combination that at any moment could issue these "invalid" calls and not just crash, but corrupt the virtual disk beyond repair.

BTW, the "independent", "persistent", "write cache" settings also (predictably) make no difference: I tried 'em all to no avail.

I bought this upgrade on 5th Oct, and more than 3 weeks later I'm still unable to run a Windows 8 guest on a product that was advertised everywhere on VMware's site, and also by email, as supporting Windows 8.1 guests. I suspect WS9 would have worked with my (not yet upgraded) Windows 8.1 guest just as well as it does with Windows 8, which is: perfectly well. If it ain't broken...

thanks,

Marco

dariusd
VMware Employee

Hi Marco,

The preadv/pwritev syscalls were added to Linux 2.6.30+ for performance reasons.  We added support for them to WS10 so that we could benefit from that performance improvement.  This explains why WS9 doesn't make those syscalls... it simply doesn't support them at all, using the older and lower-performance interfaces instead.
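To illustrate the difference (a simplified sketch only, not our actual I/O path): the same scattered read done the old way, one pread() per buffer, versus a single preadv() covering the whole request:

#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Pre-2.6.30 style: one pread() syscall per buffer. */
ssize_t scattered_read_pread(int fd, struct iovec *iov, int iovcnt, off_t off)
{
    ssize_t total = 0, n;
    int i;

    for (i = 0; i < iovcnt; i++) {
        n = pread(fd, iov[i].iov_base, iov[i].iov_len, off + total);
        if (n <= 0)
            break;
        total += n;
    }
    return total;
}

/* Vectored style: the whole scattered request in a single syscall. */
ssize_t scattered_read_preadv(int fd, struct iovec *iov, int iovcnt, off_t off)
{
    return preadv(fd, iov, iovcnt, off);
}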

The I/O pattern generated by our EFI firmware is markedly different from that generated by legacy BIOS.  It is possible that that is the reason you are seeing this failure and no-one else has reported it yet.  It might also be something nonstandard in the OEM installation you're virtualizing... I don't think you would find many folks who have virtualized an EFI-based OEM Windows 8 installation.  We perform significant amounts of testing of Windows 8 guests on RHEL 6.x hosts (which in this regard should be identical to CentOS 6.x hosts) and have not encountered this failure at all.

I'm sitting on the fence a little here, but the more I look at the documentation for preadv/pwritev, the more I become convinced that this is actually a bug in glibc which we simply didn't discover in our testing... the syscall wrapper functions are not behaving as described, and we're failing as a result.

Either way, we'll have to make it work, so I've filed a bug report and marked it with the appropriate severity level so that it should receive prompt attention.  I'll let you know if I become aware of any workarounds.

Thanks,

--

Darius

marcoecc
Contributor

Hi Darius,

thank you for your reply and clarification, in particular about when the p*v syscalls were added to the kernel and WS10 (and of course I knew why WS9 doesn't use them).

Not all of it makes complete sense to me, though. WS9 performs very well, from what I can see: all my Linux and Windows VMs worked for many months to my complete satisfaction, performance included. Even so, I decided to upgrade.

But what puzzles me most is the EFI explanation. As I hinted in the previous message, even if there were a magical fix for what happens during the EFI boot process and we could get past it and boot the Windows 8 guest normally (perhaps by converting it to legacy BIOS, or some other way), the underlying fact would remain: WS10 is issuing system calls that, mishandled by either WS10, glibc or the kernel, can at any time result in serious data corruption on the virtual disk file or device, plus core dumps, even after boot, whenever the 3rd argument exceeds 0x400. The only (unlikely) sense I can make of it is that you're implying WS10 only uses these new system calls (or only uses them with a 3rd argument that can be too large) during the boot process, so that once the boot problem is solved (whether it is a general EFI/Win8 problem or one specific to how Lenovo set up machines like mine), everything would automatically be fine: either no more p*v calls would be issued at all, or never again with illegal arguments. I will "strace" some Linux guests under WS10 to see what happens.

You mention doing many tests with CentOS or RHEL 6 hosts (of course)... did these tests also cover Windows 8 guests, 64bit, with EFI BIOS and GPT disk?

I believe that most Win8 guests would not have an OEM EFI setup, but surely some P2V machines out there do. My Win8 boot configuration is trivial anyway: 1 boot manager entry, 1 boot loader entry, with just default values and options.

I can't help thinking that, while you wait for upstream to fix either glibc or the kernel, you should meanwhile either not make these calls, or ensure you only make them with arguments that won't cause data corruption and crashes (whatever the cause), without relying on glibc/kernel to fix the calls for you. Relying on that sounds a bit strange to me: I would not, by default, count on the underlying layer to gracefully handle my function or system calls. In other words, if I know there's some limit, I try to ensure I don't exceed it, even if the library promises to fall back gracefully on some alternative if I do; I'd consider any case where I exceed it a programming error, which the underlying fall-back mechanism might (but shouldn't have to) rescue.
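And by "know the limit" I mean something as trivial as this (my own throwaway snippet): the per-call iovec limit can even be queried at runtime rather than assumed:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Runtime per-call iovec limit; -1 means "indeterminate". */
    long iov_max = sysconf(_SC_IOV_MAX);
    printf("sysconf(_SC_IOV_MAX) = %ld\n", iov_max);
    return 0;
}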

Also, on receiving an I/O error your code seems to simply assume, straight away, that something is wrong with the storage device. And, like others noted in older posts I found online, that message ("If the file resides on a remote file system, please make sure your network connection and the server where this disk resides are functioning properly. If the file resides on removable media, reattach the media."), when it refers to a local hard disk or partition, or to a local vdisk flat file on a disk physically attached to the host's main board, is so incredibly annoying... almost more than the error itself. What about instead "uh-oh, the preadv() call failed, it might not be I/O, let's try a pread() instead"?

(I just want this product to work for what I paid it for!)

I really appreciate your help and will now just keep an eye here for any updates on this issue.

Meanwhile I shall run WS9 and WS10 on the same machine if possible, or downgrade to WS9 completely if not. Perhaps I'll get a corresponding extension on my paid free-upgrade period!? :-)

thanks,

Marco

dariusd
VMware Employee

Hi Marco!

This issue should be resolved by the Workstation 10.0.2 update, released just recently, which adds a workaround for the failure of glibc's preadv/pwritev/preadv64/pwritev64 functions to conform to their documented interface.

If you haven't already done so, please consider updating Workstation and the problem should be resolved.

Let me know if you continue to encounter any difficulties!

Cheers,

--

Darius
