6 Replies Latest reply on Jun 22, 2017 11:10 PM by dariusd

    Issue in Workstation 12.5.5 Linux vmnet module

    swordfeng Lurker

      I discovered an occasional 'kernel stack overflow' error with the RIP register pointing to the function 'VNetBridgeNotify'.

      I then looked at the source code and found what may be a bug in vmnet/bridge.c:

       

      --- bridge.c 2017-05-14 02:24:23.764324763 +0800
      +++ bridge_new.c 2017-05-14 02:24:20.494352085 +0800
      @@ -1146,7 +1146,7 @@
       void *data) // IN: device pertaining to event
       {
       VNetBridge *bridge = list_entry(this, VNetBridge, notifier);
      - struct net_device *dev = (struct net_device *) data;
      + struct net_device *dev = netdev_notifier_info_to_dev(data);
       
       switch (msg) {
       case NETDEV_UNREGISTER:

       

      Other, similar kernel code uses 'netdev_notifier_info_to_dev' to extract the struct net_device from the notifier data. For example, see linux/drivers/net/ppp/pppoe.c - Elixir - Free Electrons
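For readers without the kernel tree at hand, here is a minimal userspace sketch of why the plain cast is wrong. The types `mock_net_device` and `mock_notifier_info` and the functions `mock_info_to_dev` and `buggy_cast` are hypothetical stand-ins, not the real kernel definitions; the sketch only assumes that the notifier data points to a small wrapper structure holding the device pointer, which is the role `netdev_notifier_info_to_dev` plays in the patch.

```c
/* Hypothetical userspace stand-ins for kernel types; not the real definitions. */
struct mock_net_device {
    char name[16];
};

/* Assumption: the callback's third argument points to a wrapper like this,
 * rather than to the device itself. */
struct mock_notifier_info {
    struct mock_net_device *dev;
};

/* Plays the role of netdev_notifier_info_to_dev(): unwrap the device pointer. */
static struct mock_net_device *mock_info_to_dev(const void *data)
{
    const struct mock_notifier_info *info = data;
    return info->dev;
}

/* Pre-patch behaviour: treat the wrapper itself as if it were the device. */
static struct mock_net_device *buggy_cast(void *data)
{
    return (struct mock_net_device *)data;
}
```

With a wrapper that points at a device, `mock_info_to_dev` returns the device, while `buggy_cast` returns the address of the wrapper, so any field read through it lands in unrelated memory.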

        • 1. Re: Issue in Workstation 12.5.5 Linux vmnet module
          dariusd Virtuoso
          VMware Employees, User Moderators

          Thank you very much for reporting this!  I'll see that it gets fixed.

           

          Cheers,

          --

          Darius

          • 2. Re: Issue in Workstation 12.5.5 Linux vmnet module
            dariusd Virtuoso
            User Moderators, VMware Employees

            Hi again, swordfeng!

             

            Linux kernel 3.11 changed the usage of the 3rd argument to the notifier function from a struct net_device * to a struct netdev_notifier_info *.  Your patch will fix it neatly for newer Linux kernels, but the official fix we have queued up ends up being a little bit more complicated because we still support host OSes older than kernel 3.11. 
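One way to support both kernel generations is a compile-time version check. The sketch below is a userspace illustration only and is not VMware's official fix, which is not shown in this thread. `MOCK_KERNEL_VERSION`, `MOCK_LINUX_VERSION_CODE`, the mock types, and `notifier_data_to_dev` are hypothetical names; in a real module the check would presumably use `LINUX_VERSION_CODE` and `KERNEL_VERSION` from `<linux/version.h>`.

```c
/* Assumption: same encoding as the classic kernel KERNEL_VERSION() macro. */
#define MOCK_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Hypothetical build-time setting; defaults to the 4.10.13 kernel from this thread. */
#ifndef MOCK_LINUX_VERSION_CODE
#define MOCK_LINUX_VERSION_CODE MOCK_KERNEL_VERSION(4, 10, 13)
#endif

/* Hypothetical stand-ins for the kernel types. */
struct mock_net_device {
    int ifindex;
};

struct mock_notifier_info {
    struct mock_net_device *dev;
};

/* Pick the extraction that matches the kernel the module is built against. */
static struct mock_net_device *notifier_data_to_dev(void *data)
{
#if MOCK_LINUX_VERSION_CODE >= MOCK_KERNEL_VERSION(3, 11, 0)
    /* 3.11 and newer: data points to a wrapper holding the device pointer. */
    return ((struct mock_notifier_info *)data)->dev;
#else
    /* Older kernels: data is the device pointer itself. */
    return (struct mock_net_device *)data;
#endif
}
```

Because the choice is made by the preprocessor, only one of the two branches is compiled into a given build, so there is no runtime cost.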

             

            We haven't seen any other reports of a stack overflow though... The misuse of the pointer is very nasty, but I haven't yet seen how a stack overflow could result.  Is there any chance you could share more details of the failure?  Information on which Linux distribution you're running might be useful, and/or a photo of the panic message on your screen.

             

            Thanks,

            --

            Darius

            • 3. Re: Issue in Workstation 12.5.5 Linux vmnet module
              swordfeng Lurker

              Hi, dariusd!

               

              Nice to hear that this is going to be fixed soon.

               

              I'm currently using Arch Linux with kernel 4.10.13, running patched modules (https://aur.archlinux.org/cgit/aur.git/tree/?h=vmware-modules-dkms). The patches seem unrelated to this issue, however.

               

              I have taken a picture and kept the kernel module that produced the 'stack overflow'. Here is a piece of its disassembly:

               

              0000000000004390 <VNetBridgeNotify>:
                  4390: e8 00 00 00 00          callq  4395 <VNetBridgeNotify+0x5>
                  4395: 55                      push   %rbp
                  4396: 48 89 e5                mov    %rsp,%rbp
                  4399: 41 54                   push   %r12
                  439b: 53                      push   %rbx
                  439c: 48 89 fb                mov    %rdi,%rbx
                  439f: 48 83 ec 08             sub    $0x8,%rsp
                  43a3: 48 83 fe 02             cmp    $0x2,%rsi
                  43a7: 0f 84 bf 00 00 00       je     446c <VNetBridgeNotify+0xdc>
                  43ad: 48 83 fe 06             cmp    $0x6,%rsi
                  43b1: 0f 84 8c 00 00 00       je     4443 <VNetBridgeNotify+0xb3>
                  43b7: 48 83 fe 01             cmp    $0x1,%rsi
                  43bb: 74 0b                   je     43c8 <VNetBridgeNotify+0x38>
                  43bd: 48 83 c4 08             add    $0x8,%rsp
                  43c1: 31 c0                   xor    %eax,%eax
                  43c3: 5b                      pop    %rbx
                  43c4: 41 5c                   pop    %r12
                  43c6: 5d                      pop    %rbp
                  43c7: c3                      retq
                  43c8: 48 83 7f 28 00          cmpq   $0x0,0x28(%rdi)
                  43cd: 75 ee                   jne    43bd <VNetBridgeNotify+0x2d>
                  43cf: 48 81 ba 00 05 00 00    cmpq   $0x0,0x500(%rdx)
                  43d6: 00 00 00 00
                  43da: 75 e1                   jne    43bd <VNetBridgeNotify+0x2d>
                  43dc: 4c 8d 67 18             lea    0x18(%rdi),%r12
                  43e0: 48 89 d7                mov    %rdx,%rdi
                  43e3: 48 89 55 e8             mov    %rdx,-0x18(%rbp)
                  43e7: 4c 89 e6                mov    %r12,%rsi
                  43ea: e8 00 00 00 00          callq  43ef <VNetBridgeNotify+0x5f>

              The cmpq at 43cf reads from 0x500(%rdx), and %rdx holds the third argument, data. This convinces me that the invalid access through *data causes this issue.

              • 4. Re: Issue in Workstation 12.5.5 Linux vmnet module
                dariusd Virtuoso
                User Moderators, VMware Employees

                Ahhhhh yes, that'd do it!  Not really a stack overflow in the most common/traditional sense (e.g. too many function calls or stack frames too large) but instead it's a wild pointer dereference that's "too far" past the current stack pointer and causes a page fault.  I agree that your patch will take care of it.
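To make the "too far" point concrete, here is a small sketch that compares the offset read in the disassembly (`cmpq $0x0,0x500(%rdx)`, an 8-byte read at offset 0x500) with the size of the object the pointer really refers to. `mock_notifier_info`, `access_in_bounds`, and `buggy_read_in_bounds` are hypothetical names, and the wrapper is assumed to hold only a device pointer; the real kernel structure may differ between versions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in: assumed to contain only the device pointer. */
struct mock_net_device;
struct mock_notifier_info {
    struct mock_net_device *dev;
};

/* Offset and width of the read seen at 43cf: cmpq $0x0,0x500(%rdx). */
#define FIELD_OFFSET ((size_t)0x500)
#define FIELD_WIDTH  ((size_t)8)

/* True if the byte range [offset, offset + width) fits inside the object. */
static bool access_in_bounds(size_t object_size, size_t offset, size_t width)
{
    return offset <= object_size && width <= object_size - offset;
}

/* The pre-patch read, applied to the object that data actually points to. */
static bool buggy_read_in_bounds(void)
{
    return access_in_bounds(sizeof(struct mock_notifier_info),
                            FIELD_OFFSET, FIELD_WIDTH);
}
```

Under these assumptions the wrapper is only a pointer wide, so a read at offset 0x500 falls well outside it and touches whatever memory happens to follow.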

                 

                Thanks for the awesome help, swordfeng.  We greatly appreciate it!

                 

                Cheers,

                --

                Darius

                • 5. Re: Issue in Workstation 12.5.5 Linux vmnet module
                  dariusd Virtuoso
                  User Moderators, VMware Employees

                  I wasn't able to squeeze this fix into the Workstation 12.5.6 update which was released just now.  It's still in the queue for evaluation for the subsequent patch release.  No specific timeline to share, I'm afraid.  In the meantime, your patch should continue to work against Workstation 12.5.6.

                   

                  Thanks again!

                  --

                  Darius

                  • 6. Re: Issue in Workstation 12.5.5 Linux vmnet module
                    dariusd Virtuoso
                    User Moderators, VMware Employees

                    Workstation 12.5.7 has been released, and it contains a fix for this issue. Without the fix, the bug could cause host kernel panics or unreliable network bridging on Linux host kernel version 3.11 and newer.

                     

                    VMware Workstation 12 Pro Version 12.5.7 Release Notes

                    Download VMware Workstation Pro

                     

                    VMware Workstation 12 Player Version 12.5.7 Release Notes

                    Download VMware Workstation Player

                     

                    Cheers,

                    --

                    Darius