VMware Communities
KappaWingman
Contributor

Running VMware Workstation with KVM on Linux concurrently

Hello,

I am using VMware Workstation and KVM on Fedora 32.

When I run both hypervisors as the same user, the guest running under KVM crashes frequently.

The problem is reported in Fedora bugzilla 1876123 – KVM VM crashes, auto-reboot and hangs frequently when VMWARE Workstation VM are both runni...

However, the developers at Red Hat said it is not a Linux kernel or KVM problem and asked me to contact VMware.

Recently I changed the user that runs KVM. Previously the same user ran both hypervisors.

The problem has not recurred so far, but I need to test further.

So I would like VMware to look into what the root cause of the crashes is when the same user runs both hypervisors.

Thanks!

14 Replies
KappaWingman
Contributor

Well, after testing for a longer period, running the KVM guests as a different user makes no difference.

The KVM guests still crash after about three hours of running concurrently with VMware Workstation guests.

jkwhite
VMware Employee

Additional details are required.

What version of Workstation are you using? Does it still happen with WS16?

Did some component change before the issue occurred, such as a kernel upgrade, WS upgrade, qemu upgrade, or anything else?

Please provide the vmware.log and qemu command lines from a failing scenario.
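
For gathering the host-side details requested here, a minimal Python sketch (not part of the original post) could report the running kernel release and which hypervisor kernel modules are loaded. It assumes the usual module names `kvm`, `kvm_intel` and `kvm_amd` for KVM and `vmmon` and `vmnet` for VMware Workstation, and that `/proc/modules` is readable on the host. The function names are made up for illustration.

```python
import platform

# Assumed module names: kvm* for KVM, vmmon/vmnet for VMware Workstation.
HYPERVISOR_MODULES = {"kvm", "kvm_intel", "kvm_amd", "vmmon", "vmnet"}


def loaded_hypervisor_modules(proc_modules_text):
    """Return the hypervisor modules named in /proc/modules-style text, sorted."""
    # The first whitespace-separated field of each line is the module name.
    names = {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}
    return sorted(names & HYPERVISOR_MODULES)


def host_report(path="/proc/modules"):
    """Collect the kernel release and loaded hypervisor modules of this host."""
    try:
        with open(path) as handle:
            text = handle.read()
    except OSError:
        text = ""  # not Linux, or the file is unreadable: report no modules
    return {"kernel": platform.release(), "modules": loaded_hypervisor_modules(text)}


print(host_report())
```

If both `vmmon` and one of the `kvm` modules appear in the report, both hypervisors are loaded on the host at the same time, which is the scenario discussed in this thread.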

KappaWingman
Contributor

Yes, it still happens with WKS16.

I am using Fedora 33 now. The kernel is 5.8.4 and QEMU was updated to 5.1.

So all components are updated to the latest version.

I just tried to start a KVM guest with VMware Workstation running.

Now the KVM guest is using 100% vCPU and is not responsive.

I have attached the related log to this reply.


Thanks.

hotvooboy
VMware Employee

Hi KappaWingman,

Thank you for posting your query in the community.

I set up a similar system (qemu+kvm/WS) on my AMD EPYC machine.

Here is a brief description:

1. Software versions

Host: Linux 5.8.15-301.fc33.x86_64, Fedora 33 (Workstation Edition)

qemu version: 5.1.0

Workstation version: 16894299

2. CentOS 7.8 VMs

Five KVM VMs and one VMware VM were installed and running concurrently. The Docker service was also enabled.

However, no VM crash has been observed so far.

Did I miss something compared with your system?

Thanks.

KappaWingman
Contributor

Thanks for the reply.

In the past, the VMs in KVM had various issues, such as 100% CPU usage or random crashes.

They usually occurred within an hour, and more often when running for three to six hours.

I haven't used the dual-hypervisor setup for weeks.

Now I have FC33 with kernel 5.8.16.

Let me try to start the test again.

By the way, once the problem starts, I don't know how to debug it.

hotvooboy
VMware Employee

Thanks for the information.

My host OS is FC33 with kernel 5.8.15. All VMs look good after running for ten hours. Did you run any workload in the VMs?

Did you hit the same issue on a machine with an Intel CPU?

KappaWingman
Contributor

Good to hear that.

Since yesterday I have run KVM and VMware concurrently twice, for about six hours continuously each time.

It looks like the problem is gone with the later Fedora 33 updates and kernel.

The problem really did occur in September; I recorded many crashes back then (1876123 – KVM VM crashes, auto-reboot and hangs frequently when VMWARE Workstation VM are both runni...).

If I encounter the problem again, I will post an update in this forum.

Thanks.

Kappa-Wingman
Contributor

I am running Rancher / RKE inside KVM. The host kernel is now 5.9 with WKS 16.1, and the guest VM is CentOS 7.9.

Just now I encountered a KVM freeze (on one server) and a random reboot (on another server).

The core dump message is similar to the earlier ones, with 'BUG: unable to handle kernel paging request at'. I am not sure if it is related to etcd, KVM or VMware.

[ 5623.572853] device veth8f02d85 left promiscuous mode
[ 5623.572865] docker0: port 1(veth8f02d85) entered disabled state
[ 6544.533860] BUG: unable to handle kernel paging request at ffffffffffffffff
[ 6544.534764] IP: [<ffffffff9644dba6>] vfs_write+0xd6/0x1f0
[ 6544.534764] PGD 39814067 PUD 39816067 PMD 0
[ 6544.534764] Oops: 0000 [#1] SMP
[ 6544.537067] Modules linked in: xt_nat veth xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack br_netfilter bridge stp llc overlay(T) sunrpc dm_mirror dm_region_hash dm_log dm_mod xfs vfat fat crc32_pclmul snd_hda_codec_generic libcrc32c ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device iTCO_wdt iTCO_vendor_support snd_pcm aesni_intel sg snd_timer lrw gf128mul glue_helper ablk_helper cryptd pcspkr snd lpc_ich i2c_i801 joydev virtio_rng soundcore virtio_balloon i6300esb binfmt_misc ip_tables ext4 mbcache jbd2 sr_mod cdrom virtio_blk virtio_console ahci crct10dif_pclmul crct10dif_common libahci virtio_net net_failover failover
[ 6544.537067]  crc32c_intel serio_raw libata qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_pci virtio_ring virtio drm_panel_orientation_quirks ptp_kvm ptp pps_core fuse
[ 6544.537067] CPU: 1 PID: 13014 Comm: etcd Kdump: loaded Tainted: G               ------------ T 3.10.0-1160.6.1.el7.x86_64 #1
[ 6544.537067] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[ 6544.537067] task: ffff8bd9fc97c200 ti: ffff8bd9d9eb4000 task.ti: ffff8bd9d9eb4000
[ 6544.537067] RIP: 0010:[<ffffffff9644dba6>]  [<ffffffff9644dba6>] vfs_write+0xd6/0x1f0
[ 6544.537067] RSP: 0018:ffff8bd9d9eb7ed0  EFLAGS: 00010202
[ 6544.537067] RAX: 0000000000000019 RBX: ffff8bda4df94600 RCX: 0000000000000000
[ 6544.537067] RDX: 0000000000000000 RSI: ffff8bd9cc301240 RDI: ffff8bda44bb22e8
[ 6544.537067] RBP: ffff8bd9d9eb7f00 R08: 000000005fca06cf R09: 000000005fca06cf
[ 6544.537067] R10: 000000000f323812 R11: 0000000000000019 R12: 0000000000000019
[ 6544.537067] R13: 0000000000000002 R14: ffffffffffffffff R15: 0000000000000000
[ 6544.537067] FS:  000000c000156e90(0000) GS:ffff8bda4e900000(0000) knlGS:0000000000000000
[ 6544.537067] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6544.537067] CR2: ffffffffffffffff CR3: 000000003b106000 CR4: 0000000000340fe0
[ 6544.537067] Call Trace:
[ 6544.537067]  [<ffffffff9644e96f>] SyS_write+0x7f/0xf0
[ 6544.537067]  [<ffffffff96995226>] tracesys+0xa6/0xcc
[ 6544.537067] Code: 48 89 df 48 8b 40 18 48 85 c0 0f 84 05 01 00 00 e8 a0 97 14 00 49 89 c4 4d 85 e4 7e 39 48 8b 73 18 41 bd 02 00 00 00 4c 8b 76 30 <41> 0f b7 06 66 25 00 f0 66 3d 00 40 b8 02 00 00 40 44 0f 44 e8
[ 6544.537067] RIP  [<ffffffff9644dba6>] vfs_write+0xd6/0x1f0
[ 6544.537067]  RSP <ffff8bd9d9eb7ed0>
[ 6544.537067] CR2: ffffffffffffffff

 

Kappa-Wingman
Contributor

Hello,

I am getting more crashes when I run some Rancher nodes on KVM:

One of the crashes:


[ 6544.533860] BUG: unable to handle kernel paging request at ffffffffffffffff
[ 6544.534764] IP: [<ffffffff9644dba6>] vfs_write+0xd6/0x1f0
[ 6544.534764] PGD 39814067 PUD 39816067 PMD 0
[ 6544.534764] Oops: 0000 [#1] SMP
[ 6544.537067] Modules linked in: xt_nat veth xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xt_addrtype iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack br_netfilter bridge stp llc overlay(T) sunrpc dm_mirror dm_region_hash dm_log dm_mod xfs vfat fat crc32_pclmul snd_hda_codec_generic libcrc32c ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device iTCO_wdt iTCO_vendor_support snd_pcm aesni_intel sg snd_timer lrw gf128mul glue_helper ablk_helper cryptd pcspkr snd lpc_ich i2c_i801 joydev virtio_rng soundcore virtio_balloon i6300esb binfmt_misc ip_tables ext4 mbcache jbd2 sr_mod cdrom virtio_blk virtio_console ahci crct10dif_pclmul crct10dif_common libahci virtio_net net_failover failover
[ 6544.537067] crc32c_intel serio_raw libata qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_pci virtio_ring virtio drm_panel_orientation_quirks ptp_kvm ptp pps_core fuse
[ 6544.537067] CPU: 1 PID: 13014 Comm: etcd Kdump: loaded Tainted: G ------------ T 3.10.0-1160.6.1.el7.x86_64 #1
[ 6544.537067] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[ 6544.537067] task: ffff8bd9fc97c200 ti: ffff8bd9d9eb4000 task.ti: ffff8bd9d9eb4000
[ 6544.537067] RIP: 0010:[<ffffffff9644dba6>] [<ffffffff9644dba6>] vfs_write+0xd6/0x1f0
[ 6544.537067] RSP: 0018:ffff8bd9d9eb7ed0 EFLAGS: 00010202
[ 6544.537067] RAX: 0000000000000019 RBX: ffff8bda4df94600 RCX: 0000000000000000
[ 6544.537067] RDX: 0000000000000000 RSI: ffff8bd9cc301240 RDI: ffff8bda44bb22e8
[ 6544.537067] RBP: ffff8bd9d9eb7f00 R08: 000000005fca06cf R09: 000000005fca06cf
[ 6544.537067] R10: 000000000f323812 R11: 0000000000000019 R12: 0000000000000019
[ 6544.537067] R13: 0000000000000002 R14: ffffffffffffffff R15: 0000000000000000
[ 6544.537067] FS: 000000c000156e90(0000) GS:ffff8bda4e900000(0000) knlGS:0000000000000000
[ 6544.537067] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6544.537067] CR2: ffffffffffffffff CR3: 000000003b106000 CR4: 0000000000340fe0
[ 6544.537067] Call Trace:
[ 6544.537067] [<ffffffff9644e96f>] SyS_write+0x7f/0xf0
[ 6544.537067] [<ffffffff96995226>] tracesys+0xa6/0xcc
[ 6544.537067] Code: 48 89 df 48 8b 40 18 48 85 c0 0f 84 05 01 00 00 e8 a0 97 14 00 49 89 c4 4d 85 e4 7e 39 48 8b 73 18 41 bd 02 00 00 00 4c 8b 76 30 <41> 0f b7 06 66 25 00 f0 66 3d 00 40 b8 02 00 00 40 44 0f 44 e8
[ 6544.537067] RIP [<ffffffff9644dba6>] vfs_write+0xd6/0x1f0
[ 6544.537067] RSP <ffff8bd9d9eb7ed0>
[ 6544.537067] CR2: ffffffffffffffff

 

Another crash:

[ 2488.195495] ------------[ cut here ]------------
[ 2488.195942] kernel BUG at mm/slub.c:3774!
[ 2488.196300] invalid opcode: 0000 [#1] SMP
[ 2488.196351] Modules linked in: ipt_REJECT nf_reject_ipv4 xt_set ipt_rpfilter xt_multiport iptable_raw ip_set_hash_ip ip_set_hash_net ip_set vxlan ip6_udp_tunnel udp_tunnel veth xt_nat xt_statistic iptable_mangle xt_comment xt_mark ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack br_netfilter bridge stp llc overlay(T) sunrpc dm_mirror dm_region_hash dm_log dm_mod snd_hda_codec_generic snd_hda_intel snd_hda_codec crc32_pclmul snd_hda_core snd_hwdep snd_seq ghash_clmulni_intel vfat fat iTCO_wdt iTCO_vendor_support snd_seq_device xfs snd_pcm libcrc32c aesni_intel snd_timer lrw gf128mul glue_helper ablk_helper snd cryptd sg pcspkr lpc_ich
[ 2488.196351] joydev i2c_i801 soundcore virtio_rng virtio_balloon i6300esb binfmt_misc ip_tables ext4 mbcache jbd2 sr_mod cdrom qxl drm_kms_helper ahci libahci syscopyarea sysfillrect sysimgblt fb_sys_fops virtio_blk ttm virtio_console drm libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw virtio_net net_failover failover virtio_pci virtio_ring virtio drm_panel_orientation_quirks ptp_kvm ptp pps_core fuse
[ 2488.196351] CPU: 0 PID: 2579 Comm: kubelet Kdump: loaded Tainted: G ------------ T 3.10.0-1160.6.1.el7.x86_64 #1
[ 2488.196351] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[ 2488.196351] task: ffff8a7d695c3180 ti: ffff8a7c6e43c000 task.ti: ffff8a7c6e43c000
[ 2488.196351] RIP: 0010:[<ffffffffbaa2617c>] [<ffffffffbaa2617c>] kfree+0x13c/0x140
[ 2488.196351] RSP: 0000:ffff8a7c6e43fc90 EFLAGS: 00010246
[ 2488.196351] RAX: 7966697265762019 RBX: ffff8a7c75f11800 RCX: 0000000000000000
[ 2488.196351] RDX: 7966697265760019 RSI: 0000000000000000 RDI: ffff8a7c75f11800
[ 2488.196351] RBP: ffff8a7c6e43fca8 R08: ffff8a7d695c3718 R09: ffff8a7c6e43fc68
[ 2488.196351] R10: 0000000000000001 R11: ffffce4b01d7c440 R12: ffff8a7c75f11a70
[ 2488.196351] R13: ffffffffba93e37c R14: ffff8a7c75f11a70 R15: ffff8a7d695c3180
[ 2488.196351] FS: 00007f9554ff9700(0000) GS:ffff8a7d76e00000(0000) knlGS:0000000000000000
[ 2488.196351] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2488.196351] CR2: 0000000000000000 CR3: 000000016871e000 CR4: 0000000000340ff0
[ 2488.196351] Call Trace:
[ 2488.196351] [<ffffffffba93e37c>] __audit_free+0x1dc/0x250
[ 2488.196351] [<ffffffffba8a1f24>] do_exit+0x8f4/0xa20
[ 2488.196351] [<ffffffffbaf87229>] ? schedule+0x29/0x70
[ 2488.196351] [<ffffffffba8a20cf>] do_group_exit+0x3f/0xa0
[ 2488.196351] [<ffffffffba8b307e>] get_signal_to_deliver+0x1ce/0x5e0
[ 2488.196351] [<ffffffffba82c527>] do_signal+0x57/0x6f0
[ 2488.196351] [<ffffffffba8c6a20>] ? abort_exclusive_wait+0xa0/0xa0
[ 2488.196351] [<ffffffffba82cc32>] do_notify_resume+0x72/0xc0
[ 2488.196351] [<ffffffffbaf952ef>] int_signal+0x12/0x17
[ 2488.196351] Code: 49 8b 03 31 f6 f6 c4 40 74 04 41 8b 73 68 4c 89 df e8 e9 15 fa ff eb 84 4c 8b 58 30 48 8b 10 80 e6 80 4c 0f 44 d8 e9 28 ff ff ff <0f> 0b 66 90 66 66 66 66 90 55 48 89 e5 41 57 41 56 41 55 41 54
[ 2488.196351] RIP [<ffffffffbaa2617c>] kfree+0x13c/0x140
[ 2488.196351] RSP <ffff8a7c6e43fc90>
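
When several such traces need to be compared, a small script can pull out the key fields. The following is only a minimal sketch in Python, not something used in this thread: it assumes the oops layout shown above (as printed by the 3.10.0 el7 guest kernel), and `summarize_oops` is a hypothetical helper name. The sample lines are copied from the second trace.

```python
import re

# Patterns for the oops fields in the traces above (3.10.0 el7 format assumed).
HEADLINE = re.compile(r"\]\s*((?:BUG: |kernel BUG at ).*)$")
COMM = re.compile(r"PID: (\d+) Comm: (\S+)")
RIP = re.compile(r"RIP:?\s.*?\[<[0-9a-f]+>\]\s+(\S+)\s*$")


def summarize_oops(text):
    """Return headline, faulting process and RIP symbol of the first oops in text."""
    summary = {"headline": None, "pid": None, "comm": None, "rip": None}
    for line in text.splitlines():
        if summary["headline"] is None:
            match = HEADLINE.search(line)
            if match:
                summary["headline"] = match.group(1).strip()
        if summary["comm"] is None:
            match = COMM.search(line)
            if match:
                summary["pid"] = int(match.group(1))
                summary["comm"] = match.group(2)
        if summary["rip"] is None:
            match = RIP.search(line)
            if match:
                summary["rip"] = match.group(1)  # symbol+offset/size
    return summary


SAMPLE = """\
[ 2488.195942] kernel BUG at mm/slub.c:3774!
[ 2488.196351] CPU: 0 PID: 2579 Comm: kubelet Kdump: loaded Tainted: G ------------ T 3.10.0-1160.6.1.el7.x86_64 #1
[ 2488.196351] RIP: 0010:[<ffffffffbaa2617c>] [<ffffffffbaa2617c>] kfree+0x13c/0x140
"""

print(summarize_oops(SAMPLE))
```

For the two traces above, this shows at a glance that they fault in different places (`vfs_write` in `etcd`, `kfree` in `kubelet`).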

Kappa-Wingman
Contributor

Hello,

I am still having crashes with KVM (now kernel 5.9 and WKS 16.1) and docker-ce.

The attached file has the details of the crashes.

Thanks.

Kappa-Wingman
Contributor

It's strange.

I have a KVM VM that has had FreeIPA (an identity server) installed for weeks. It did not crash when I ran it as the only KVM guest alongside some VMs in VMware WKS.

Today that VM crashed when I ran a K8S cluster in some KVM guests on the same machine.

Attached is the stack trace.

Kappa-Wingman
Contributor

I am using Fedora 33 now and the problem persists.

This time the host machine also has problems: many programs dump core and crash.

I have attached the related log files.

 

Kappa-Wingman
Contributor

Greetings. I have upgraded to Fedora 34.

It uses kernel 5.11.x, and the random crashes persist.

Kappa-Wingman
Contributor

Greetings,

I am using Fedora 34 and kernel 5.13.4-200.fc34.x86_64.

Since VMware Workstation 16.1.2 does not support kernel 5.13.4, I used the patched modules from

https://github.com/mkubecek/vmware-host-modules/tree/workstation-16.1.2

I have now been running VMware Workstation and KVM together without crashes for about two weeks, so it seems that some of those patches fix the problem. Thanks.
