cynar's Posts

When run under the VMware hypervisor, your system will want to have:
* proper drivers installed (X11 / mesa)
* the "open-vm-tools" (and, where available, "open-vm-tools-desktop") packages installed
* the "vmtoolsd" systemd service (from the open-vm-tools package above) active and running

That's good enough. Alternatively, simply change the screen resolution manually inside your guest. The tools are just a convenience.
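A minimal sketch of verifying that checklist inside the guest. The package names are Fedora's (adjust for your distribution); the `check` helper is just an illustration, not part of open-vm-tools.

```shell
#!/bin/sh
# Verify the three guest-tools prerequisites listed above.
# Prints one PRESENT/MISSING line per item.
check() {
  label=$1; shift
  if "$@" >/dev/null 2>&1; then
    echo "$label: PRESENT"
  else
    echo "$label: MISSING"
  fi
}
check "open-vm-tools package"         rpm -q open-vm-tools
check "open-vm-tools-desktop package" rpm -q open-vm-tools-desktop
check "vmtoolsd service active"       systemctl is-active --quiet vmtoolsd
```

Whatever the environment, the script prints exactly three status lines, so it is safe to run on a host that has neither rpm nor systemd.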
Fedora seems to be on a mission to not only nudge, but almost bully people into Wayland. I believe Wayland is the right destination, but the path remains fraught with ... challenges.
I totally like the idea of Wayland. I am pained by the friction when I want to enable it (for KDE, on Fedora Linux, on VMware Workstation). Sigh.

So, on the topic of this problem here: it seems as if Fedora Linux despises X11 enough to ship something ancient and heavily patched, whereas Ubuntu (even in 22.04 LTS) has a more recent (still old) baseline version. And then there is Arch Linux. Or rather me trying EndeavourOS, as in "make the install painless". What can I say:

   24.678 ms [ 19987] | } /* glXChooseVisual */

So "glxinfo" starts about 70 times faster on Arch Linux than on Fedora Linux (38/39). No, I don't care about glxinfo, but I care a lot about any other X11 client having a two-second delay on startup, be it Kate, Visual Studio Code, or Firefox.

I had always picked Fedora Linux for being reasonably up to date, for the enormous number of packages in an RPM-based distribution, and for its similarity with Red Hat Enterprise Linux to reduce cognitive load. Time to revisit my choice of distribution?
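The "about 70 times faster" figure follows directly from the two timings quoted in this thread: 24.678 ms on Arch versus 1657.946 ms on Fedora 38/39. A one-liner to check the arithmetic:

```shell
# Ratio of the two glXChooseVisual timings quoted in the thread:
# Fedora 38/39 (1657.946 ms) vs Arch/EndeavourOS (24.678 ms).
awk 'BEGIN { printf "%.0fx\n", 1657.946 / 24.678 }'
# prints 67x, which the post rounds to "about 70 times"
```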
FWIW, on a fresh Ubuntu LTS 22.04.3 this same call is

   564.412 ms [ 2932] | glXChooseVisual();

which is still not fast, but one second faster than on Fedora 39. Alas, OpenSUSE Tumbleweed, a fully rolling distribution, also gets to that very, very annoying 1.7 seconds of overhead / delay.

Finally, forcing software rendering completely eradicates these startup delays, but then that's software rendering.

I'd conclude from all this that the VMware driver infrastructure and newer versions of the X11 server (as shipped in rolling distributions) do not interact well. A fix would be very much appreciated for those who do their work _inside_ X11-based Linux distributions.
The X11 driver seems to be running into a performance regression: X11 applications call glXChooseVisual at the very start of the process, to set up for X11. Calling this has become _very_ slow - so slow that, even on fast hardware, it delays the start of every single X11 process by about two seconds.

I am reproducing this on Fedora Linux (38 or 39) with the following simple steps:

   ltrace -T --output=ltrace.log glxinfo64 -B
   grep glXChooseVisual ltrace.log

This yields

   glXChooseVisual(0x55b51abe2da0, 0, 0x55b519abb040, 0) = 0x55b51ac03f00 <1.657946>

with the number at the right flagging a 1.7-second startup delay from calling a function whose call should effectively be free.

Now, this is not a problem under Wayland - but I cannot use Wayland right now (no fractional scaling; challenges in the interaction between KDE and the VMware graphics stack).
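The measurement above can be scripted: with `ltrace -T`, each logged call ends in `<seconds>`, so the duration is easy to pull out with awk. A sketch, using one embedded sample line instead of a live ltrace run:

```shell
# Extract the glXChooseVisual wall time from an ltrace -T log.
# One sample line embedded for illustration; in practice, point awk
# at the ltrace.log produced by the commands above.
cat > /tmp/ltrace.sample <<'EOF'
glXChooseVisual(0x55b51abe2da0, 0, 0x55b519abb040, 0) = 0x55b51ac03f00 <1.657946>
EOF
# ltrace appends the call duration in <seconds>; field 2 when split on < and >.
awk -F'[<>]' '/glXChooseVisual/ { printf "%.1f s\n", $2 }' /tmp/ltrace.sample
# prints 1.7 s
```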
It would seem as if tuning the Linux kernel's NVMe parameters may help alleviate the problem.

I had the opportunity to set up another virtual machine on that specific WD_BLACK SN850X SSD - basically booting a fresh and _natively_ installed Fedora Linux 39 from its three physical partitions (EFI, boot, data) via VMware Workstation physical drive access, using the virtual NVMe controller. Initially, this setup was also suffering from occasionally massively degraded performance (see above).

One kernel tuning parameter seems to be making The Difference:

   nvme.poll_queues=64

Let's look at one of the results of the exploratory probing:

Fedora 38, many "20 GB split virtual disk" files, virtual SCSI controller
   read: IOPS=39.9k, BW=156MiB/s (164MB/s)(9361MiB/60004msec)

Fedora 39, "Physical Drive partitions", virtual NVMe controller (virtual hardware rev 21)
   read: IOPS=170k, BW=665MiB/s (697MB/s)(38.9GiB/60002msec)

The performance difference is substantial, all the while

   dmesg --follow --level warn --time-format iso

does not show any of the NVMe timeout problems. So, for the time being, I am running this virtualized physical Fedora Linux 39 with

   sudo grubby --update-kernel=ALL --args="nvme.poll_queues=64"
   sudo grubby --info=ALL

Random notes:
* What a lovely rabbit hole to fall into ...
* Nothing comes for free - NVMe polling consumes more CPU. Does it matter?
* The optimal poll queue count is not known, and neither is it clear whether split read and write poll queues are beneficial - see https://elixir.bootlin.com/linux/latest/source/drivers/nvme/host/pci.c (or rather the version applying to your Linux kernel) for All Of The Truth (because I was unable to find any useful documentation).
* "What Modern NVMe Storage Can Do, And How To Exploit It: High-Performance I/O for High-Performance Storage Engines" (vldb.org) is an interesting article explaining a great many details about I/O performance in Linux.
And finally, for stress-testing and exploration, "Benchmark persistent disk performance on a Linux VM" in the Google Cloud Compute Engine documentation is a useful resource with pre-cooked "fio" commands.
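For a sense of scale, the speedup implied by the two fio runs quoted above can be computed directly from the numbers in the post (39.9k IOPS / 156 MiB/s on the SCSI setup versus 170k IOPS / 665 MiB/s on NVMe with poll queues):

```shell
# Rough speedup implied by the two fio results quoted in the post.
awk 'BEGIN { printf "%.1fx IOPS, %.1fx bandwidth\n", 170/39.9, 665/156 }'
# prints 4.3x IOPS, 4.3x bandwidth
```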
FYI, VMware Workstation 17.5, with its new hardware version 21 and refreshed NVMe support, exhibits the same unwanted behaviour.

On Fedora 39 (beta) with "Linux fedora-gnome 6.5.6-300.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Oct 6 19:57:21 UTC 2023 x86_64 GNU/Linux", the commands

   cd /
   rg 1 | wc -l

yield plenty of timeouts - and with that comes massive delays. Run this after a fresh boot with very cold caches.

[ 182.238025] nvme nvme0: I/O 192 QID 5 timeout, completion polled
[ 222.686266] nvme nvme0: I/O 224 QID 6 timeout, completion polled
[ 252.894103] nvme nvme0: I/O 0 QID 3 timeout, completion polled
[ 283.614594] nvme nvme0: I/O 1 QID 3 timeout, completion polled
[ 283.614607] nvme nvme0: I/O 192 QID 8 timeout, completion polled
[ 283.614626] nvme nvme0: I/O 128 QID 11 timeout, completion polled
[ 314.589824] nvme nvme0: I/O 227 QID 6 timeout, completion polled
[ 344.990885] nvme nvme0: I/O 128 QID 1 timeout, completion polled
[ 345.024297] nvme nvme0: I/O 193 QID 8 timeout, completion polled
[ 376.286125] nvme nvme0: I/O 129 QID 11 timeout, completion polled
[ 432.094158] nvme nvme0: I/O 92 QID 4 timeout, completion polled
[ 432.094170] nvme nvme0: I/O 0 QID 6 timeout, completion polled
[ 462.302211] nvme nvme0: I/O 248 QID 3 timeout, completion polled
[ 462.302237] nvme nvme0: I/O 32 QID 6 timeout, completion polled
[ 462.302242] nvme nvme0: I/O 96 QID 7 timeout, completion polled
[ 462.302247] nvme nvme0: I/O 130 QID 11 timeout, completion polled
[ 493.021506] nvme nvme0: I/O 225 QID 3 timeout, completion polled
[ 526.301435] nvme nvme0: I/O 32 QID 5 timeout, completion polled
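Counting these events is a handy one-liner when comparing boots. A sketch, with three sample lines embedded so it is self-contained; against a live system, replace the sample file with `dmesg | grep -c ...`:

```shell
# Count NVMe completion-timeout events in a dmesg excerpt.
# Three sample lines embedded for illustration; the real log had many more.
cat > /tmp/dmesg.sample <<'EOF'
[  182.238025] nvme nvme0: I/O 192 QID 5 timeout, completion polled
[  222.686266] nvme nvme0: I/O 224 QID 6 timeout, completion polled
[  252.894103] nvme nvme0: I/O 0 QID 3 timeout, completion polled
EOF
grep -c 'nvme.*timeout, completion polled' /tmp/dmesg.sample
# prints 3
```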
I have been trialling "Virtualize IOMMU (IO memory management unit)" as a completely isolated change for a few hours. It does seem as if enabling it addresses the unacceptable keyboard behaviour of VMware Workstation 17 Pro, running on a Windows 11 Pro host, with Fedora Linux 38 inside (on Linux kernel 6.4.13), on the KDE desktop, with "accelerated key repeat" configured in KDE.

The host hardware is a Dell Inspiron 7610 notebook with a Tiger Lake 11800H CPU (i.e. no big.LITTLE / performance+efficiency CPU core split), a hybrid Intel+NVIDIA GPU setup, and 64 GB RAM. The VMs get 16 cores + 32 GB + max VRAM + 3D acceleration.

With this, I have been able to run VMware Workstation 17 on the Microsoft Windows hypervisor stack ("bcdedit /set hypervisorlaunchtype auto") with tolerable UI performance for the first time ever; I could run VMware Workstation 17 with WSL2 in parallel and not be totally put off by VMware Workstation. Interactive performance of VMware Workstation 17 in this specific mode is still noticeably worse than running it at CPL0 ("bcdedit /set hypervisorlaunchtype off"), but it is usable for the first time ever, with KDE.
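The post toggles this in the Workstation UI. For reference, the checkbox corresponds to a single line in the VM's .vmx file; the key name below is my assumption, not confirmed by the post, so verify it by diffing your own .vmx before and after toggling the option in the UI:

```
vvtd.enable = "TRUE"
```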
Repairing the effectively unusable keyboard when not in CPL0 would be very helpful. I would actually purchase another license to upgrade to a confirmed fixed version of VMware Workstation.

Context: When running VMware Workstation on top of the Microsoft Windows virtualization interface, to make it compatible with
* Windows Sandbox
* Hyper-V
* WSL2
* ... and probably some more technologies where Microsoft have built on top of their virtualization stack

the keyboard in a VMware Workstation virtual machine gets any of
* phantom key repeats (cursor keys are very popular)
* seconds of delay until a key _registers_

... unless the user wiggles the (Bluetooth) mouse while typing on the (USB) keyboard, because the mouse apparently triggers some kind of acceleration / interrupt handling that makes the keyboard work fine.

This is not a problem of the virtual machine - everything works perfectly fine when it runs at CPL0 / in VMware's bare-metal virtualization mode.
I have now established a work-around on the VMware Workstation 17.0.2 (Windows 11) host and applied additional tuning to my existing Fedora Linux 38 guest installation:

a) enable the Fedora 38 init ramdisk to boot from more than just nvme

   cat << EOF > /etc/dracut.conf.d/scsi.conf
   add_drivers+=" vmw_pvscsi mptbase mptscsih mptspi mptsas "
   EOF
   dracut -f -v
   halt -p

b) switch the controller for the existing disk in VMware Workstation
- remove the existing NVMe hard disk (this only unlinks it; the backing store remains, your data is safe)
- add a new hard disk of type SCSI, picking the original backing store VMDK
- start the VM

c) tune for access (optional)
- add "noatime" and "ssd" to the btrfs mounts in /etc/fstab
- reboot

d) validate
- have plenty of software and data on your virtual hard disk

   cd /
   rg 1 | wc -l
   //exp: no system hangs, obviously no NVMe timeouts
   //act: ... exactly as expected, returns a number in the millions

This works much, much better than before. Incidentally, it seems as if a (software CI) task which previously took 55 seconds to complete now takes 5% less time (non-scientific measurement) - but really, the important part for me: no 30-second pauses / stalls due to NVMe controller timeouts.

Based on the above, I can only suggest ignoring the VMware recommendation to pick NVMe and going for SCSI as a fast and robust choice.
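Step (a) can be rehearsed safely before touching the real system: generate the dracut drop-in into a scratch directory first, inspect it, and only then copy it to /etc/dracut.conf.d/ and run `dracut -f`. A sketch:

```shell
#!/bin/sh
# Dry-run of step (a): write the dracut drop-in to a temp dir so you can
# inspect it before installing it under /etc/dracut.conf.d/.
tmp=$(mktemp -d)
cat > "$tmp/scsi.conf" <<'EOF'
add_drivers+=" vmw_pvscsi mptbase mptscsih mptspi mptsas "
EOF
cat "$tmp/scsi.conf"
```

Note the spaces inside the quotes: dracut's `add_drivers+=` concatenates these fragments, so the leading and trailing spaces keep driver names from running together.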
I have a (working) VM with a persistent independent NVMe hard disk. Now I am trying to add another disk to it - specifically, either a SCSI or a SATA disk, again persistent and independent.

Whenever I do this, booting breaks _at the BIOS == firmware level_: a boot device is no longer found, although the NVMe disk is still present. There is a CD/DVD (IDE) device, but it is not connected, and it does point to an ISO that no longer exists. Once I remove the SCSI or SATA disk again, the BIOS is happy to boot off the totally unchanged NVMe device.

What is the best approach to add a SCSI or SATA disk to a VM with an existing NVMe disk while keeping things booting? My goal is to switch my existing disk from the VMware virtual NVMe controller (which seems to have issues) to the VMware virtual SCSI controller. This is VMware Workstation 17.0.2 (on a Windows 11 Professional host).
... and as for reproduction, it seems as if "sddm" (as the greeter / login screen) triggers the problem in the majority of the cases, see below. Most likely a reproducer would then be downloading the Fedora 38 KDE spin from "Fedora KDE Plasma Desktop | The Fedora Project" and simply running that with 1/16 CPUs and 16 GB of memory.

Apr 27 07:19:20 fedora kernel: CPU: 14 PID: 1565 Comm: sddm-greeter Not tainted 6.2.12-300.fc38.x86_64 #1
Apr 28 11:03:37 fedora kernel: CPU: 12 PID: 1221 Comm: sddm-greeter Not tainted 6.2.12-300.fc38.x86_64 #1
Apr 28 11:05:51 fedora kernel: CPU: 8 PID: 1189 Comm: sddm-greeter Not tainted 6.2.12-300.fc38.x86_64 #1
Apr 28 11:20:06 fedora kernel: CPU: 13 PID: 1245 Comm: sddm-greeter Not tainted 6.2.12-300.fc38.x86_64 #1
Apr 28 12:59:20 fedora kernel: CPU: 2 PID: 1180 Comm: sddm-greeter Not tainted 6.2.13-300.fc38.x86_64 #1
Apr 28 20:33:59 fedora kernel: CPU: 6 PID: 1185 Comm: sddm-greeter Not tainted 6.2.13-300.fc38.x86_64 #1
Apr 29 07:57:27 fedora kernel: CPU: 5 PID: 1247 Comm: sddm-greeter Not tainted 6.2.13-300.fc38.x86_64 #1
Apr 30 06:59:19 fedora kernel: CPU: 13 PID: 1203 Comm: sddm-greeter Not tainted 6.2.13-300.fc38.x86_64 #1
Apr 30 09:56:27 fedora kernel: CPU: 6 PID: 1200 Comm: sddm-greeter Not tainted 6.2.13-300.fc38.x86_64 #1
Apr 30 10:16:09 fedora kernel: CPU: 3 PID: 1237 Comm: sddm-greeter Not tainted 6.2.13-300.fc38.x86_64 #1
May 01 09:46:58 fedora kernel: CPU: 9 PID: 1197 Comm: sddm-greeter Not tainted 6.2.13-300.fc38.x86_64 #1
May 01 10:45:38 fedora kernel: CPU: 0 PID: 1189 Comm: sddm-greeter Not tainted 6.2.13-300.fc38.x86_64 #1
May 02 05:47:06 fedora kernel: CPU: 13 PID: 1197 Comm: sddm-greeter Not tainted 6.2.13-300.fc38.x86_64 #1
May 02 19:56:28 fedora kernel: CPU: 0 PID: 1243 Comm: sddm-greeter Not tainted 6.2.13-300.fc38.x86_64 #1
May 03 06:40:39 fedora kernel: CPU: 0 PID: 1235 Comm: sddm-greeter Not tainted 6.2.13-300.fc38.x86_64 #1
May 03 06:46:05 fedora kernel: CPU: 6 PID: 1186 Comm: sddm-greeter Not tainted 6.2.13-300.fc38.x86_64 #1
May 03 08:47:52 fedora kernel: CPU: 5 PID: 1191 Comm: sddm-greeter Not tainted 6.2.13-300.fc38.x86_64 #1
May 03 09:26:15 fedora kernel: CPU: 14 PID: 1194 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 04 06:25:22 fedora kernel: CPU: 0 PID: 1246 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 04 07:13:01 fedora kernel: CPU: 6 PID: 1194 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 05 06:49:56 fedora kernel: CPU: 5 PID: 1230 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 06 08:09:45 fedora kernel: CPU: 2 PID: 1232 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 06 08:17:14 fedora kernel: CPU: 8 PID: 1189 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 06 19:28:54 fedora kernel: CPU: 15 PID: 1226 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 08 06:47:32 fedora kernel: CPU: 10 PID: 1198 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 09 06:02:04 fedora kernel: CPU: 11 PID: 1229 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 09 06:04:18 fedora kernel: CPU: 4 PID: 1186 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 09 07:43:41 fedora kernel: CPU: 5 PID: 3194 Comm: Renderer Tainted: G W 6.2.14-300.fc38.x86_64 #1
May 09 07:43:42 fedora kernel: CPU: 0 PID: 3194 Comm: Renderer Tainted: G W 6.2.14-300.fc38.x86_64 #1
May 10 06:40:07 fedora kernel: CPU: 14 PID: 1241 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 10 06:48:43 fedora kernel: CPU: 8 PID: 1189 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 10 14:25:38 fedora kernel: CPU: 11 PID: 1221 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 11 05:37:59 fedora kernel: CPU: 12 PID: 1227 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 11 05:42:20 fedora kernel: CPU: 13 PID: 1184 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 11 10:32:03 fedora kernel: CPU: 14 PID: 2357 Comm: Renderer Tainted: G W 6.2.14-300.fc38.x86_64 #1
May 11 10:32:03 fedora kernel: CPU: 14 PID: 2357 Comm: Renderer Tainted: G W 6.2.14-300.fc38.x86_64 #1
May 11 10:32:25 fedora kernel: CPU: 2 PID: 2357 Comm: Renderer Tainted: G W 6.2.14-300.fc38.x86_64 #1
May 12 08:09:22 fedora kernel: CPU: 1 PID: 1234 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
May 12 09:05:20 fedora kernel: CPU: 2 PID: 1225 Comm: sddm-greeter Not tainted 6.2.14-300.fc38.x86_64 #1
Just to add some context and an indication of the extent of the challenges, the output of "journalctl | grep 'kernel: vmw_generic_ioctl'" on the VM in question is below. That's 41 entries over the course of two weeks, all on Fedora 38 with a Linux 6.2+ kernel. I picked the search term as it seems to be the entry point into the trouble and seems to match "per incident" pretty well. Almost every cold virtual machine boot to the KDE desktop seems to trigger something, perhaps with the odd other incident sprinkled in occasionally.

Apr 26 08:00:56 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
Apr 27 07:17:02 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
Apr 27 07:19:20 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
Apr 28 11:03:37 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
Apr 28 11:05:51 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
Apr 28 11:20:06 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
Apr 28 12:59:20 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
Apr 28 20:33:59 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
Apr 29 07:57:27 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
Apr 30 06:59:19 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
Apr 30 09:56:27 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
Apr 30 10:16:09 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 01 09:46:58 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 01 10:45:38 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 02 05:47:06 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 02 19:56:28 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 03 06:40:39 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 03 06:46:05 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 03 08:47:52 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 03 09:26:15 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 04 06:25:22 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 04 07:13:01 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 05 06:49:56 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 06 08:09:45 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 06 08:17:14 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 06 19:28:54 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 08 06:47:32 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 09 06:02:04 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 09 06:04:18 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 09 07:43:41 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 09 07:43:42 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 10 06:40:07 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 10 06:48:43 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 10 14:25:38 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 11 05:37:59 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 11 05:42:20 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 11 10:32:03 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 11 10:32:03 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 11 10:32:25 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 11 10:32:25 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
May 12 08:09:22 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
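The "per incident" frequency above is easiest to see when grouped by day. A sketch with three sample lines embedded; on the affected machine, pipe `journalctl | grep 'kernel: vmw_generic_ioctl'` into the same awk:

```shell
# Count vmw_generic_ioctl journal hits per day ($1 = month, $2 = day).
cat > /tmp/journal.sample <<'EOF'
Apr 26 08:00:56 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
Apr 27 07:17:02 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
Apr 27 07:19:20 fedora kernel: vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
EOF
awk '{ count[$1 " " $2]++ } END { for (d in count) print d, count[d] }' /tmp/journal.sample | sort
# prints:
#   Apr 26 1
#   Apr 27 2
```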
@Mikero wrote:
Got a quick answer on this... Seems that was fixed in:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/vmwgfx?id=a950b989ea29ab3b38ea7f6e3d2540700a3c54e8
and
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/vmwgfx?id=1a6897921f52ceb2c8665ef826e405bd96385159

I very much appreciate the very quick and very specific response - alas, it seems as if those commits might not fully address the challenge.

https://gitlab.com/cki-project/kernel-ark is the source code for Fedora kernel builds. According to

   git tag --contains 1a6897921f52ceb2c8665ef826e405bd96385159
   git tag --contains a950b989ea29ab3b38ea7f6e3d2540700a3c54e8

those commits made it at least into the "upstream" Linux v6.2 kernel release. Fedora 38 has only ever shipped Linux 6.2 kernels, so I have always been running with those fixes included - and hence it is surprising that I still get the kernel diagnostics (right now I am on a Fedora Linux 38 6.2.14 distro kernel - all my system software is distro software).
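The `git tag --contains` technique used above can be demonstrated on a throwaway repository (a sketch; the repo, commit, and tag name are made up for illustration):

```shell
#!/bin/sh
# Demonstrate `git tag --contains <commit>`: create a commit, tag a
# "release" after it, and confirm the tag is reported as containing it.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "fix: hypothetical vmwgfx fix"
git tag v6.2
git tag --contains HEAD
# prints v6.2
```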
@Mikero wrote:
Thanks Wil for the tag, and thanks @cynar for reporting. I have brought it to the attention of our vmwgfx team.

Many thanks for the feedback. I can only suggest that someone from the vmwgfx team look into the Fedora ABRT analytics, digging a bit for backtraces which point towards the VMware stack.

Just to provide additional context: on my end, I get the warnings very regularly with the Fedora 38 KDE spin running X11 on an external 4K screen (and a Tiger Lake 11800H CPU). The kernel bug above was the first time that I ever noticed it. From a general stability point of view, the system remains alive, although Firefox definitely has rendering artifacts; what the root cause of that is, I have no idea. That virtual machine is generally used for software development, so Visual Studio Code and JetBrains IntelliJ run inside.
Alas, I have no support contract with VMware, so no support ticket from my end (this is VMware Workstation 17.0.2, purchased through the US online store).
FWIW, a real use after free:

   BUG: KFENCE: use-after-free read in drm_gem_handle_delete+0x4b/0xd0

I am under the impression that this is the VMware stack, all the way up and down, on Fedora 38 (with very up-to-date mesa).

***************
[17468.742850] ------------[ cut here ]------------
[17468.742856] refcount_t: addition on 0; use-after-free.
[17468.742888] WARNING: CPU: 14 PID: 2357 at lib/refcount.c:25 refcount_warn_saturate+0xe1/0x110
[17468.742899] Modules linked in: xt_mark xt_comment tun xt_nat veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink xt_addrtype nft_compat br_netfilter bridge stp llc overlay snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink qrtr vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock sunrpc binfmt_misc intel_rapl_msr intel_rapl_common rapl vmw_balloon pcspkr pktcdvd vmw_vmci i2c_piix4 joydev loop zram crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic nvme ghash_clmulni_intel vmwgfx sha512_ssse3 nvme_core vmxnet3 nvme_common drm_ttm_helper ttm serio_raw ata_generic pata_acpi ip6_tables ip_tables fuse
[17468.743014] CPU: 14 PID: 2357 Comm: Renderer Tainted: G W 6.2.14-300.fc38.x86_64 #1
[17468.743016] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[17468.743018] RIP: 0010:refcount_warn_saturate+0xe1/0x110
[17468.743022] Code: eb 92 ff 0f 0b c3 cc cc cc cc 80 3d 91 b3 ae 01 00 0f 85 5e ff ff ff 48 c7 c7 38 9e 8d a7 c6 05 7d b3 ae 01 01 e8 bf eb 92 ff <0f> 0b c3 cc cc cc cc 48 c7 c7 90 9e 8d a7 c6 05 61 b3 ae 01 01 e8
[17468.743023] RSP: 0018:ffffac3b522f7a88 EFLAGS: 00010286
[17468.743025] RAX: 0000000000000000 RBX: ffff8f2e69d8b200 RCX: 0000000000000000
[17468.743027] RDX: 0000000000000003 RSI: 0000000000000027 RDI: 00000000ffffffff
[17468.743028] RBP: ffff8f2de8753e40 R08: 0000000000000000 R09: ffffac3b522f7918
[17468.743029] R10: 0000000000000003 R11: ffff8f30addfffe8 R12: 0000000000000001
[17468.743030] R13: ffffac3b522f7ad8 R14: ffffac3b522f7ae0 R15: ffffac3b522f7ad8
[17468.743031] FS: 00007f22f79ff6c0(0000) GS:ffff8f30ae180000(0000) knlGS:0000000000000000
[17468.743033] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[17468.743034] CR2: 00007f22c28e3000 CR3: 000000015fc96002 CR4: 0000000000770ee0
[17468.743062] PKRU: 55555554
[17468.743063] Call Trace:
[17468.743066] <TASK>
[17468.743067] objects_lookup+0x8d/0xd0
[17468.743076] drm_gem_object_lookup+0x3a/0x60
[17468.743079] vmw_user_bo_lookup+0x11/0x70 [vmwgfx]
[17468.743140] vmw_translate_mob_ptr+0x56/0x170 [vmwgfx]
[17468.743153] vmw_cmd_res_switch_backup+0xa3/0xd0 [vmwgfx]
[17468.743165] vmw_execbuf_process+0x54b/0x1160 [vmwgfx]
[17468.743179] ? __pfx_vmw_execbuf_ioctl+0x10/0x10 [vmwgfx]
[17468.743191] vmw_execbuf_ioctl+0x151/0x280 [vmwgfx]
[17468.743204] ? __pfx_vmw_execbuf_ioctl+0x10/0x10 [vmwgfx]
[17468.743216] drm_ioctl_kernel+0xc6/0x170
[17468.743219] drm_ioctl+0x235/0x410
[17468.743221] ? __pfx_vmw_execbuf_ioctl+0x10/0x10 [vmwgfx]
[17468.743234] ? __pfx_drm_ioctl+0x10/0x10
[17468.743236] vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
[17468.743252] __x64_sys_ioctl+0x8d/0xd0
[17468.743274] do_syscall_64+0x59/0x90
[17468.743280] ? do_syscall_64+0x68/0x90
[17468.743281] ? syscall_exit_to_user_mode+0x17/0x40
[17468.743283] ? do_syscall_64+0x68/0x90
[17468.743285] ? do_syscall_64+0x68/0x90
[17468.743286] ? do_syscall_64+0x68/0x90
[17468.743288] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[17468.743291] RIP: 0033:0x7f2327528edd
[17468.743374] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
[17468.743376] RSP: 002b:00007f22f79fcdf0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[17468.743378] RAX: ffffffffffffffda RBX: 0000000000000028 RCX: 00007f2327528edd
[17468.743379] RDX: 00007f22f79fceb0 RSI: 000000004028644c RDI: 0000000000000029
[17468.743380] RBP: 00007f22f79fce40 R08: 000000000000b3f0 R09: 00007f22f79fcf48
[17468.743381] R10: 0000000000000001 R11: 0000000000000246 R12: 00007f22f79fceb0
[17468.743382] R13: 000000004028644c R14: 0000000000000029 R15: 00007f22f79fcf48
[17468.743385] </TASK>
[17468.743386] ---[ end trace 0000000000000000 ]---
[17468.942740] ------------[ cut here ]------------
[17468.942746] refcount_t: saturated; leaking memory.
[17468.942762] WARNING: CPU: 14 PID: 2357 at lib/refcount.c:22 refcount_warn_saturate+0x51/0x110
[17468.942774] Modules linked in: xt_mark xt_comment tun xt_nat veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink xt_addrtype nft_compat br_netfilter bridge stp llc overlay snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink qrtr vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock sunrpc binfmt_misc intel_rapl_msr intel_rapl_common rapl vmw_balloon pcspkr pktcdvd vmw_vmci i2c_piix4 joydev loop zram crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic nvme ghash_clmulni_intel vmwgfx sha512_ssse3 nvme_core vmxnet3 nvme_common drm_ttm_helper ttm serio_raw ata_generic pata_acpi ip6_tables ip_tables fuse
[17468.942841] CPU: 14 PID: 2357 Comm: Renderer Tainted: G W 6.2.14-300.fc38.x86_64 #1
[17468.942845] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[17468.942846] RIP: 0010:refcount_warn_saturate+0x51/0x110
[17468.942850] Code: 84 bc 00 00 00 c3 cc cc cc cc 85 f6 74 46 80 3d 1e b4 ae 01 00 75 ee 48 c7 c7 10 9e 8d a7 c6 05 0e b4 ae 01 01 e8 4f ec 92 ff <0f> 0b c3 cc cc cc cc 80 3d f7 b3 ae 01 00 75 cb 48 c7 c7 c0 9e 8d
[17468.942853] RSP: 0018:ffffac3b522f79e8 EFLAGS: 00010286
[17468.942855] RAX: 0000000000000000 RBX: ffff8f2d8ce5ae00 RCX: 0000000000000000
[17468.942857] RDX: 0000000000000003 RSI: 0000000000000027 RDI: 00000000ffffffff
[17468.942858] RBP: ffff8f2de8753e40 R08: 0000000000000000 R09: ffffac3b522f7878
[17468.942859] R10: 0000000000000003 R11: ffff8f30addfffe8 R12: 0000000000000001
[17468.942860] R13: ffffac3b522f7a38 R14: ffffac3b522f7a38 R15: ffffac3b522f7a34
[17468.942862] FS: 00007f22f79ff6c0(0000) GS:ffff8f30ae180000(0000) knlGS:0000000000000000
[17468.942864] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[17468.942865] CR2: 00007f22d47bb000 CR3: 000000015fc96002 CR4: 0000000000770ee0
[17468.942948] PKRU: 55555554
[17468.942950] Call Trace:
[17468.942952] <TASK>
[17468.942953] objects_lookup+0xc3/0xd0
[17468.942960] drm_gem_object_lookup+0x3a/0x60
[17468.942963] vmw_user_bo_lookup+0x11/0x70 [vmwgfx]
[17468.942986] vmw_translate_mob_ptr+0x56/0x170 [vmwgfx]
[17468.942999] vmw_cmd_res_switch_backup+0xa3/0xd0 [vmwgfx]
[17468.943011] vmw_execbuf_process+0x54b/0x1160 [vmwgfx]
[17468.943024] ? __pfx_vmw_execbuf_ioctl+0x10/0x10 [vmwgfx]
[17468.943036] vmw_execbuf_ioctl+0x151/0x280 [vmwgfx]
[17468.943048] ? __pfx_vmw_execbuf_ioctl+0x10/0x10 [vmwgfx]
[17468.943059] drm_ioctl_kernel+0xc6/0x170
[17468.943063] drm_ioctl+0x235/0x410
[17468.943065] ? __pfx_vmw_execbuf_ioctl+0x10/0x10 [vmwgfx]
[17468.943077] ? __pfx_drm_ioctl+0x10/0x10
[17468.943079] vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
[17468.943094] __x64_sys_ioctl+0x8d/0xd0
[17468.943099] do_syscall_64+0x59/0x90
[17468.943104] ? do_syscall_64+0x68/0x90
[17468.943106] ? syscall_exit_to_user_mode+0x17/0x40
[17468.943108] ? do_syscall_64+0x68/0x90
[17468.943109] ? do_syscall_64+0x68/0x90
[17468.943110] ? syscall_exit_to_user_mode+0x17/0x40
[17468.943112] ? do_syscall_64+0x68/0x90
[17468.943113] ? lapic_next_deadline+0x28/0x30
[17468.943144] ? clockevents_program_event+0x86/0xf0
[17468.943149] ? hrtimer_interrupt+0x127/0x240
[17468.943152] ? sched_clock_cpu+0xb/0xc0
[17468.943155] ? __irq_exit_rcu+0x3d/0x140
[17468.943159] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[17468.943163] RIP: 0033:0x7f2327528edd
[17468.943187] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
[17468.943189] RSP: 002b:00007f22f79fc690 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[17468.943191] RAX: ffffffffffffffda RBX: 0000000000000028 RCX: 00007f2327528edd
[17468.943192] RDX: 00007f22f79fc750 RSI: 000000004028644c RDI: 0000000000000029
[17468.943193] RBP: 00007f22f79fc6e0 R08: 00000000000003d8 R09: 00007f22f79fc7e8
[17468.943194] R10: 0000000000000001 R11: 0000000000000246 R12: 00007f22f79fc750
[17468.943194] R13: 000000004028644c R14: 0000000000000029 R15: 00007f22f79fc7e8
[17468.943196] </TASK>
[17468.943197] ---[ end trace 0000000000000000 ]---
[17490.879236] ==================================================================
[17490.879242] BUG: KFENCE: use-after-free read in drm_gem_handle_delete+0x4b/0xd0
[17490.879250] Use-after-free read at 0x00000000e28982f2 (in kfence-#229):
[17490.879253] drm_gem_handle_delete+0x4b/0xd0
[17490.879255] vmw_bo_unref_ioctl+0xf/0x20 [vmwgfx]
[17490.879279] drm_ioctl_kernel+0xc6/0x170
[17490.879281] drm_ioctl+0x235/0x410
[17490.879283] vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
[17490.879298] __x64_sys_ioctl+0x8d/0xd0
[17490.879302] do_syscall_64+0x59/0x90
[17490.879306] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[17490.879310] kfence-#229: 0x000000002d4c52cb-0x00000000b9fb100f, size=512, cache=kmalloc-512
[17490.879312] allocated by task 2357 on cpu 14 at 17468.655415s:
[17490.879347] __kmem_cache_alloc_node+0x2ab/0x2f0
[17490.879349] kmalloc_trace+0x26/0x90
[17490.879352] vmw_bo_create+0x3c/0xa0 [vmwgfx]
[17490.879368] vmw_gem_object_create_ioctl+0x6b/0x120 [vmwgfx]
[17490.879439] drm_ioctl_kernel+0xc6/0x170
[17490.879441] drm_ioctl+0x235/0x410
[17490.879443] vmw_generic_ioctl+0xa4/0x110 [vmwgfx]
[17490.879458] __x64_sys_ioctl+0x8d/0xd0
[17490.879459] do_syscall_64+0x59/0x90
[17490.879461] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[17490.879464] freed by task 2357 on cpu 2 at 17490.879219s:
[17490.879496] ttm_bo_vm_close+0x12/0x20 [ttm]
[17490.879504] remove_vma+0x25/0x50
[17490.879506] do_mas_align_munmap+0x2dc/0x4b0
[17490.879509] do_mas_munmap+0xd2/0x120
[17490.879510] __vm_munmap+0xba/0x170
[17490.879511] __x64_sys_munmap+0x17/0x20
[17490.879512] do_syscall_64+0x59/0x90
[17490.879514] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[17490.879517] CPU: 2 PID: 2357 Comm: Renderer Tainted: G W 6.2.14-300.fc38.x86_64 #1
[17490.879520] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[17490.879521] ==================================================================
@wila wrote: Point taken. Pardon me for jumping in.

Many thanks for your response - and I hope I didn't come across too harshly. I myself have been singing the "community-based forum" song ever since NNTP was a thing, so I am somewhat aware of the various factors that come into play here (sadly so).
I appreciate where you are coming from. Truth be told, I see two major core components in play here:
* the Linux kernel with the various VMware vmw* modules
* Mesa with the (VMware) svga3d driver

From a VMware customer's point of view, I am paying VMware to maintain those components, to take ownership of problems exposed in or through them, and to protect VMware's reputation by addressing technical challenges. Fedora Linux as an organization bundles what VMware has (kindly) pushed out, follows an upstream-first policy (as opposed to, say, Ubuntu), and, AFAICS, tries to hold the owner of these components (see above) accountable for identifying any issues. I do understand that this post is entirely devoid of technical content - but, as a VMware customer, I hope I have made my expectations regarding ownership and responsibility clear.
Note that a use-after-free always leaves a bad taste: such issues tend to offer an avenue for memory-based security attacks on affected systems.