VMware {code} Community
Andreas_Johanss
Contributor

VMCI-socket stability issues

Hi,

I have now finished my Java wrapper of the socket API and am experiencing severe stability issues in my tests. They occur when I push traffic through it (bluescreens on the host pointing all over the place). My test setup is as follows:

Host:

  • Vista Ultimate SP1

  • WS 6.5.1 build-126130

VM:

  • WinXP SP3

  • VMware Tools 7.8.4 build-126130

I have about 6 concurrent handles open at the same time, and very soon the computer either freezes or gives me a bluescreen. Some of the bluescreens I have observed are:

  • ntfs.sys crash

  • IRQL_NOT_LESS_OR_EQUAL

At first I thought it was because of a conflict with my other virtualization products (VirtualBox and Virtual PC 2007), but uninstalling them didn't resolve the issue. Something is going badly wrong deep in kernel code, since Vista can't recover from the error. If anyone has seen this before, I would be grateful to know more about the cause.

Kind Regards

Andreas


Accepted Solutions
admin
Immortal

We found a potential bug in our Windows implementation (we are still working on validating it) where you can crash if one thread calls close while another thread is blocked on an active socket operation.

So if t1 calls recv -- blocks

t2 calls close

t1 wakes up and the socket is gone.

Do you ever close a socket while other threads could be using it?
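
One way to rule this out on the Java side is to defer the real close until no thread is inside a socket call. The following is only a minimal sketch: `GuardedHandle` and its `realClose` callback are hypothetical names, not part of the VMCI sockets API, and it assumes blocked calls eventually return (for example through a receive timeout) so the deferred close can run.

```java
/** Hypothetical guard: the real close runs only when no call is in flight. */
final class GuardedHandle {
    private final Runnable realClose;   // assumed to wrap the native close call
    private int inFlight = 0;           // threads currently inside a socket call
    private boolean closeRequested = false;

    GuardedHandle(Runnable realClose) {
        this.realClose = realClose;
    }

    /** Call before each native socket operation; false means already closed. */
    synchronized boolean enter() {
        if (closeRequested) {
            return false;
        }
        inFlight++;
        return true;
    }

    /** Call in a finally block after each native socket operation. */
    void exit() {
        boolean runNow;
        synchronized (this) {
            inFlight--;
            runNow = closeRequested && inFlight == 0;
        }
        if (runNow) {
            realClose.run();            // last operation out closes the socket
        }
    }

    /** Requests a close; the real close is deferred while calls are in flight. */
    void close() {
        boolean runNow;
        synchronized (this) {
            if (closeRequested) {
                return;                 // a second close is a no-op
            }
            closeRequested = true;
            runNow = inFlight == 0;
        }
        if (runNow) {
            realClose.run();
        }
    }
}
```

In a wrapper, each recv or send would be bracketed by `enter()` and a `finally { exit(); }`, throwing an `IOException` when `enter()` returns false.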

7 Replies
admin
Immortal

Hi Andreas,

I'm interested in figuring out what exactly is going wrong here. Probably the easiest way to do this is for you to provide a memory dump to us.

See http://support.microsoft.com/kb/254649 for details on how to set that up. A kernel or full memory dump will be most useful. I will provide you with details on how to get the dump to us out of band (I need to go figure out the details).

Andreas_Johanss
Contributor

Great arolett,

Vista writes a kernel memory dump by default, and the one from my last crash yesterday is 308 MB (I can't provide a full memory dump since I have more than 2 GB of RAM installed). If you can't find a place where I can upload it, I can provide a link where you can download it.

Kind Regards

Andreas

admin
Immortal

Thanks for the crash dump Andreas.

One of our other Windows VMCI Sockets developers took a look.

Here is his reply:

If there's a bug in our code, then unfortunately it's happening earlier on, with the crash only occurring later, elsewhere in the kernel. Here's the stack that caused the crash:

b3191a9c 8268e5b2 badb0d00 00000000 00000000 nt!KiTrap0E+0x2ac
b3191b10 82625e43 c2714158 c348a008 20660000 nt!MiInvalidateCollidedPfns+0xb
b3191b64 827dd887 85493920 c0598400 00000002 nt!MiValidateImagePages+0x362
b3191b90 827e678d 85493920 c0598400 97ddad78 nt!MiSwitchBaseAddress+0x4c
b3191bb4 8284a1c6 85493920 00000000 00000000 nt!MiRelocateImageAgain+0xd6
b3191ccc 8284e2d8 b3191d20 0000000f 00000000 nt!MmCreateSection+0x51d
b3191d40 8265ca1a 01eaea9c 0000000f 00000000 nt!NtCreateSection+0x177
b3191d40 76e59a94 01eaea9c 0000000f 00000000 nt!KiFastCallEntry+0x12a

Looking at the Java processes, I see lots of threads waiting on VMCI datagrams, but nothing that looks like it might have caused a crash. Most of them are identical to this:

c575aa28 826bc2ff nt!KiSwapContext+0x26
c575aa6c 82659cc8 nt!KiSwapThread+0x44f
c575aac0 a2462ff1 nt!KeWaitForSingleObject+0x492
c575aae4 a2464481 vmci!VSockOS_EventWaitSingle+0x43
c575aaf8 a246461f vmci!__VSockWaitQ_Wait+0x17
c575ab18 a2463cf2 vmci!VSockWaitQ_Wait+0x2b
c575ab44 a24640bc vmci!VSockSocketDgramWaitMsg+0x74
etc.

The processes are paged out, so I can't see the user-land stacks (not that I have the symbols anyway).

It might be helpful to have the list of socket operations that led to this (or a minimal test case, if you can come up with one), so that we can try to recreate it. Also, do you have to have 6 handles open, or does it crash with fewer? And does it crash when you run this in the guest too?

Andreas_Johanss
Contributor

Thanks for analyzing the crash dump. It crashes when I have fewer sockets open as well, and if I send less traffic it takes longer for the crash to occur. In my tests I had a VMCI-socket server running on the host and a VM acting as the client.

I will try to create a minimal example that shows the crash and get back to you when I have something ready.

Kind Regards

Andreas

Andreas_Johanss
Contributor

Hi arolett,

I went through my code over the weekend and found a bug where an I/O stream could read from or write to a closed socket. I think that was the cause of my crashes, as you also suggest. Since fixing the bug I haven't experienced any crashes, and running an RDP session over the tunnel works fine. I will do more extensive tests with multiple VMs running concurrently to see whether the implementation is acceptable in terms of stability and performance.
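
For anyone who hits the same problem, the kind of check involved can be sketched roughly as follows. This is not my actual wrapper code: `NativeRecv` is a hypothetical stand-in for the JNI receive call, and the class name is made up for illustration. The stream refuses to touch the native layer once it has been closed. Note that this only catches use after close; it does not by itself protect a read that is already blocked when close is called.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the native receive call; not a VMware API. */
interface NativeRecv {
    int recv(byte[] buf, int off, int len) throws IOException;
}

/** Input stream that fails fast in Java instead of reading a closed socket. */
final class CheckedSocketInputStream extends InputStream {
    private final NativeRecv socket;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    CheckedSocketInputStream(NativeRecv socket) {
        this.socket = socket;
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        int n = read(one, 0, 1);
        // A non-positive count is treated as end of stream here.
        return n <= 0 ? -1 : one[0] & 0xFF;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (closed.get()) {
            // Never hand a closed socket to the native layer.
            throw new IOException("stream closed");
        }
        return socket.recv(b, off, len);
    }

    @Override
    public void close() {
        closed.set(true);
    }
}
```

The matching output stream would perform the same check before every write.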

Thanks for your help.

Kind Regards

Andreas

jyothyr
Contributor

Hi Andreas, could you share your use case? Also, why did you choose the VMCI-socket API over TCP for your application?
