It seems that tuning the Linux kernel's NVMe parameters may help alleviate the problem.
I had the opportunity to set up another virtual machine on that specific WD_BLACK SN850X SSD - basically booting a fresh and _natively_ installed Fedora Linux 39 from its three physical partitions (EFI, boot, data) via VMware Workstation physical drive access, using the virtual NVMe controller.
Initially, this setup was also suffering from occasionally massively degraded performance (see above).
One kernel tuning parameter seems to make The Difference:
nvme.poll_queues=64
Let's look at one of the results of the exploratory probing:
Fedora 38, many "20 GB split virtual disk" files, virtual SCSI controller
read: IOPS=39.9k, BW=156MiB/s (164MB/s)(9361MiB/60004msec)
Fedora 39, "Physical Drive partitions", virtual NVMe controller (virtual hardware rev 21)
read: IOPS=170k, BW=665MiB/s (697MB/s)(38.9GiB/60002msec)
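As a quick sanity check on these two result lines, dividing bandwidth by IOPS gives the average request size, and dividing the two IOPS figures gives the speedup. The helper names `avg_kib` and `speedup` below are made up for this sketch; the inputs are the rounded values from the results above.

```shell
#!/bin/sh
# Average request size in KiB implied by a bandwidth (MiB/s) and an IOPS figure.
avg_kib() { awk -v bw="$1" -v iops="$2" 'BEGIN { printf "%.1f\n", bw * 1024 / iops }'; }

# Ratio of two IOPS figures.
speedup() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'; }

avg_kib 156 39900      # Fedora 38, virtual SCSI controller
avg_kib 665 170000     # Fedora 39, virtual NVMe controller
speedup 170000 39900   # NVMe run relative to SCSI run
```

Both runs come out at about 4 KiB per request, so they are at least comparable in block size, and the NVMe setup delivers roughly 4.3 times the IOPS.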
The performance difference is substantial, all the while
dmesg --follow --level warn --time-format iso
does not show any of the NVMe timeout problems.
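For a longer test run it can be handier to count suspicious lines than to watch the log scroll by. This is only a sketch: the function name `count_nvme_timeouts` is made up, and the pattern is an assumption, since the exact wording of NVMe timeout messages varies between kernel versions.

```shell
#!/bin/sh
# Count the lines on stdin that look like NVMe timeout reports.
# The pattern is an assumption; adjust it to what your kernel actually logs.
count_nvme_timeouts() {
    grep -ciE 'nvme.*timeout'
}

# Reading the kernel log may require root on some systems.
dmesg --level warn,err 2>/dev/null | count_nvme_timeouts || true
```

A count of 0 after a stress run would match the observation above; anything else is worth reading in full with the dmesg command shown earlier.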
So, for the time being, I am running this virtualized physical Fedora Linux 39 with
sudo grubby --update-kernel=ALL --args="nvme.poll_queues=64"
sudo grubby --info=ALL
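After a reboot it is worth confirming that the setting actually arrived. A minimal sketch, assuming the nvme driver exposes its parameters under /sys/module/nvme/parameters (it does on the kernels I know of, but check yours); the function name `cmdline_poll_queues` is made up for illustration.

```shell
#!/bin/sh
# Extract the nvme.poll_queues value from a kernel command line file
# (defaults to /proc/cmdline); prints nothing if the argument is absent.
cmdline_poll_queues() {
    sed -n 's/.*nvme\.poll_queues=\([0-9][0-9]*\).*/\1/p' "${1:-/proc/cmdline}" 2>/dev/null
}

echo "kernel command line: $(cmdline_poll_queues)"

# Value the driver is currently using, if the parameter file is exposed.
cat /sys/module/nvme/parameters/poll_queues 2>/dev/null || echo "parameter not exposed"
```

If the first line shows an empty value, the grubby change has not reached the running kernel yet.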
Random notes:
What a lovely rabbit hole to fall into ...
Nothing comes for free - NVMe polling consumes more CPU. Does it matter?
The optimal poll queue count is not known, nor is it clear whether split read and write poll queues are beneficial - see https://elixir.bootlin.com/linux/latest/source/drivers/nvme/host/pci.c (or rather the version applying to your Linux kernel) for All Of The Truth (because I was unable to find any useful documentation).
What Modern NVMe Storage Can Do, And How To Exploit It: High-Performance I/O for High-Performance St... is an interesting article explaining a great many details about I/O performance in Linux.
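Short of reading pci.c, the running kernel can at least tell you which nvme parameters it knows and what they are currently set to. A small sketch; the function name `list_params` is made up, and the default path assumes the driver exposes its parameters in sysfs. Recent kernels appear to offer `write_queues` alongside `poll_queues`, but treat that as something to verify for your version.

```shell
#!/bin/sh
# Print "name = value" for every readable parameter file in a directory.
# Defaults to the nvme driver's sysfs parameter directory.
list_params() {
    dir="${1:-/sys/module/nvme/parameters}"
    for f in "$dir"/*; do
        [ -r "$f" ] || continue
        printf '%s = %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

list_params
```

Any parameter listed there can be tried on the kernel command line in the same `nvme.<name>=<value>` form as above.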
And finally, for stress-testing and exploration, Benchmark persistent disk performance on a Linux VM | Compute Engine Documentation | Google Clou... is a useful resource with pre-cooked "fio" commands.
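The exact fio job behind the numbers above is not given, so the following is only a sketch of a comparable probe: 60 seconds of 4 KiB random reads. The helper name `fio_randread`, the job name, the queue depth of 64 and the choice of the io_uring engine with `--hipri` are all assumptions. As far as I understand, poll queues only serve I/O that is submitted as polled I/O, which is why `--hipri` is in there.

```shell
#!/bin/sh
# Hypothetical helper: 60 s of 4 KiB random reads against a file or block device.
# It only reads, but double-check the target before pointing it at a raw device.
fio_randread() {
    target="$1"
    fio --name=randread_probe --filename="$target" \
        --rw=randread --bs=4k --direct=1 \
        --ioengine=io_uring --hipri --iodepth=64 \
        --time_based --runtime=60 --group_reporting
}

# Example (device name is a placeholder):
#   sudo sh -c '. ./probe.sh; fio_randread /dev/nvme0n1'
```

Running it once with and once without the kernel parameter, and comparing the IOPS lines, gives a rough before/after picture.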