What version of DSM are you running on Synology? You should make sure you're always running the latest. I have two in my lab and they routinely release updates that address "iSCSI stability issues" and other things that can cause similar behavior.
We are running the newest available version, DSM 6.1.3-15152 Update 3.
It seems that ESXi somehow loses the iSCSI connection if the benchmarks are too intensive. In the current setup there are two LUNs attached to a VM; the VM is running CrystalDiskMark on one LUN and doing some Windows operations (copying a bigger folder) on the second LUN. After a while, the VM freezes.
Is there no way to limit the network load to prevent this behaviour? I mean, we need good throughput, but also some threshold so that we don't overload ESXi or, rather, the Synology.
This could be due to your use of the file-based LUN. While that option is attractive because it offers the most VMware-rich data services, it comes at a price. In my tests, that option was the lowest-performing and the least stable. You may want to repeat your tests using the block-level LUN option instead. I've found it produces lower latency and higher throughput.
Thank you for the fast reply! My results are basically completely different. File-based LUN is out-performing block-based LUN.
- Block-based LUN: lower CPU usage, higher storage usage, lower throughput of ~200 Mb/s (Block-based LUN screenshot)
- File-based LUN: higher CPU usage, lower storage usage, higher throughput of ~330 Mb/s (File-based LUN screenshot)
Furthermore, the file-based LUN managed to reach ~50,000 read IOPS, whereas the block-based LUN only reached ~10,000. Does Synology have some kind of internal caching? Is the RAM used for that? How much can the Synology cache: GBs of data, or rather only small bits, like for the IOPS benchmark? I'm asking because the HDDs only run at 5,400 rpm.
Just to be sure: you have a well-performing setup? Could you please provide some information about it? Perhaps I can transfer some settings to my own setup.
Well, to be fair, you're using a RackStation model with more CPU horsepower and more RAM, but it's odd you're seeing that much of a difference between file-based and block-based LUNs. There is a write cache setting in Storage Manager at HDD/SSD -> General. I don't know to what extent they can use RAM as a read or write cache buffer. But if you only have 4 x 4 TB drives at 5,400 RPM and you're seeing those numbers even from sequential reads, then that's pretty damn good. You might try to remove one of the uplinks and just go with two rather than three. Some systems have a hard time coping with odd-numbered links with a round-robin MPIO setting.
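On the caching question, a rough back-of-the-envelope check may help. The sketch below estimates how many random-read IOPS four 5,400 RPM drives could deliver from the platters alone. The 10 ms average seek time is an assumption (typical for this drive class, not measured on this setup), the function names are made up for illustration, and treating the array as four independent spindles is a best case that ignores RAID overhead.

```python
# Rough estimate of random-read IOPS for spinning disks.
# ASSUMPTION: 10 ms average seek time; this is a typical figure for a
# 5,400 RPM drive, not a value measured on the setup in this thread.

def hdd_random_iops(rpm: float, avg_seek_ms: float) -> float:
    """Approximate random IOPS of one drive: 1 / (seek + rotational latency)."""
    # On average the platter must turn half a revolution before the data arrives.
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)

def array_random_read_iops(drives: int, rpm: float, avg_seek_ms: float) -> float:
    """Best case: reads spread evenly over all drives, no RAID overhead."""
    return drives * hdd_random_iops(rpm, avg_seek_ms)

if __name__ == "__main__":
    per_drive = hdd_random_iops(5400, 10.0)
    array = array_random_read_iops(4, 5400, 10.0)
    measured = 50_000  # read IOPS reported for the file-based LUN above
    print(f"per drive: {per_drive:.0f} IOPS, 4 drives: {array:.0f} IOPS")
    print(f"measured result is about {measured / array:.0f}x the disk-only estimate")
```

Under these assumptions the disks alone land at a few hundred random-read IOPS, so a result of ~50,000 would have to be served largely from some cache (RAM on the Synology or in the guest), most likely because the benchmark's test file is small enough to fit in it. It says nothing about how large that cache is.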
1) Were you able to fix the ESXi freeze under high load on the Synology? If yes, how?
2) is file-based LUN really faster than block-based LUN?
Thx for your feedback in advance!