<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Help me understand why SIOC and shares did not protect my storage in ESXi Discussions</title>
    <link>https://communities.vmware.com/t5/ESXi-Discussions/Help-me-understand-why-SIOC-and-shares-did-not-protect-my/m-p/2760646#M272827</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Mattias and thank you for your reply!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;OK, that sounds logical, but I can't figure out what the NetApp controller did in response to get the I/O identified as a non-ESXi workload. It does have a read cache that could possibly interfere, but the VM was constantly writing. Will check with NetApp.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Any idea why shares didn't work? Did I set them too high?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;It would be nice to be able to cap a VM without killing it, if this happens again.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;/Bjorn&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Tue, 13 Sep 2016 13:08:14 GMT</pubDate>
    <dc:creator>BjornJohansson</dc:creator>
    <dc:date>2016-09-13T13:08:14Z</dc:date>
    <item>
      <title>Help me understand why SIOC and shares did not protect my storage</title>
      <link>https://communities.vmware.com/t5/ESXi-Discussions/Help-me-understand-why-SIOC-and-shares-did-not-protect-my/m-p/2760644#M272825</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 13.3333px;"&gt;Today a SQL server went bananas on our storage. It affected basically all VMs, and when we killed the server the problem went away. &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 13.3333px;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 13.3333px;"&gt;While the problem is now solved (some dev messed up), I would like to understand why neither SIOC nor modifying the disk shares had any effect.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 13.3333px;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Environment:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;ESXi 6.0 hosts running on HP BL460c Gen9 blade servers&lt;/LI&gt;&lt;LI&gt;NetApp MetroCluster running in Active/Active over Fibre Channel (with some aggregates/datastores not being replicated)&lt;/LI&gt;&lt;LI&gt;Datastores via NFS&lt;/LI&gt;&lt;LI&gt;SIOC enabled with 25 ms latency setting&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;To begin with: &lt;/P&gt;&lt;P&gt;Metro storage does not support SIOC, we know. Also, mixing workloads on the same disks (for example CIFS shares mixed with VMware workloads) or internal jobs like deduplication may affect SIOC. The point here is that the problematic VM resides on a non-replicated datastore. No dedup jobs, backup jobs, snapshots etc. were running during the problem. 
Still, the only SIOC events we could see on the datastore were: &lt;EM&gt;"An unmanaged I/O workload is detected on a SIOC-enabled datastore".&lt;/EM&gt;&lt;/P&gt;&lt;P style="font-size: 13.3333px;"&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 10pt; line-height: 1.5em;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 10pt; line-height: 1.5em;"&gt;Problem:&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 10pt; line-height: 1.5em;"&gt;While the problem was ongoing we could see VM write latency between 10 and 1000 ms. Read latency also jumped. NetApp showed lower values, but had 100% drive utilization and back-to-backs.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Since we had found the bully VM, we let it run and started modifying disk shares on the DB disk, but nothing happened. We also capped the IOPS to 500 without any effect.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Looking at performance on the NetApp:&lt;/P&gt;&lt;P&gt;300-350 MB/s in write throughput&lt;/P&gt;&lt;P&gt;3000 IOPS&lt;/P&gt;&lt;P&gt;10-100 ms in write latency&lt;/P&gt;&lt;P&gt;15-70 ms in read latency&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 10pt; line-height: 1.5em;"&gt;There are 24 physical disks backing this datastore.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Can someone with more experience please help me understand this better? I know that some questions should be directed to NetApp, but I think SIOC and shares are relevant.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;/BL&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 12 Sep 2016 14:00:01 GMT</pubDate>
      <guid>https://communities.vmware.com/t5/ESXi-Discussions/Help-me-understand-why-SIOC-and-shares-did-not-protect-my/m-p/2760644#M272825</guid>
      <dc:creator>BjornJohansson</dc:creator>
      <dc:date>2016-09-12T14:00:01Z</dc:date>
    </item>
    <item>
      <title>Re: Help me understand why SIOC and shares did not protect my storage</title>
      <link>https://communities.vmware.com/t5/ESXi-Discussions/Help-me-understand-why-SIOC-and-shares-did-not-protect-my/m-p/2760645#M272826</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;There are several factors that come into play here.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;SIOC itself only acts on ESXi workloads, not on workloads from other storage operations such as RAID rebuilds, CIFS workloads and so on. In your case, according to the message &lt;EM style="font-size: 14px; font-family: proxima-nova, Arial, sans-serif; color: #666666;"&gt;"An unmanaged I/O workload is detected on a SIOC-enabled datastore"&amp;nbsp; &lt;/EM&gt;SIOC detected latency above the specified threshold (25 ms), but because ESXi classified the workload as a non-ESXi workload, SIOC couldn't do anything with it other than report it.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Here is where the tricky part comes in.&lt;/P&gt;&lt;P&gt;In this case it actually was a VM that caused the high latency, to which we mere mortals would say "&lt;EM&gt;Hey, a VM caused it, so it's sure as hell an ESXi workload&lt;/EM&gt;". Well, that's not entirely true.&lt;/P&gt;&lt;P&gt;How SIOC reacts depends on the type of workload and on how the storage array handles it. It is therefore crucial to have an array/solution that is supported with SIOC.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I can take an example from my own experience with SIOC and an EMC array running an unsupported setup with auto-tiering and FAST Cache.&lt;/P&gt;&lt;P&gt;The problem was the same as yours: a VM did some stuff that resulted in high latency. The problem wasn't the VM's workload per se, but when the VM started to do its thing the storage array did what it was supposed to do: place hot data in the cache, move cold data to disks and kick in a tiering job. Due to the extremely high workload on the VM the array couldn't keep up, and the result from VMware's perspective was high latency on that datastore. SIOC couldn't do anything, because it was never the VMs that caused the latency but storage operations in the backend.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I hope this clarifies a little how SIOC operates.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 13 Sep 2016 09:45:03 GMT</pubDate>
      <guid>https://communities.vmware.com/t5/ESXi-Discussions/Help-me-understand-why-SIOC-and-shares-did-not-protect-my/m-p/2760645#M272826</guid>
      <dc:creator>MattiasN81</dc:creator>
      <dc:date>2016-09-13T09:45:03Z</dc:date>
    </item>
    <item>
      <title>Re: Help me understand why SIOC and shares did not protect my storage</title>
      <link>https://communities.vmware.com/t5/ESXi-Discussions/Help-me-understand-why-SIOC-and-shares-did-not-protect-my/m-p/2760646#M272827</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Mattias and thank you for your reply!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;OK, that sounds logical, but I can't figure out what the NetApp controller did in response to get the I/O identified as a non-ESXi workload. It does have a read cache that could possibly interfere, but the VM was constantly writing. Will check with NetApp.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Any idea why shares didn't work? Did I set them too high?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;It would be nice to be able to cap a VM without killing it, if this happens again.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;/Bjorn&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 13 Sep 2016 13:08:14 GMT</pubDate>
      <guid>https://communities.vmware.com/t5/ESXi-Discussions/Help-me-understand-why-SIOC-and-shares-did-not-protect-my/m-p/2760646#M272827</guid>
      <dc:creator>BjornJohansson</dc:creator>
      <dc:date>2016-09-13T13:08:14Z</dc:date>
    </item>
    <item>
      <title>Re: Help me understand why SIOC and shares did not protect my storage</title>
      <link>https://communities.vmware.com/t5/ESXi-Discussions/Help-me-understand-why-SIOC-and-shares-did-not-protect-my/m-p/2760647#M272828</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Did you set a share value on all the virtual disks on the SIOC-enabled datastore?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="Capture.JPG"&gt;&lt;img src="https://communities.vmware.com/t5/image/serverpage/image-id/67963i2527BA0C78D176CF/image-size/large?v=v2&amp;amp;px=999" role="button" title="Capture.JPG" alt="Capture.JPG" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;If you change the shares to low or set an IOPS limit on a VM, you can at least have some control if a VM starts writing/reading like a lunatic.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 15 Sep 2016 10:44:43 GMT</pubDate>
      <guid>https://communities.vmware.com/t5/ESXi-Discussions/Help-me-understand-why-SIOC-and-shares-did-not-protect-my/m-p/2760647#M272828</guid>
      <dc:creator>MattiasN81</dc:creator>
      <dc:date>2016-09-15T10:44:43Z</dc:date>
    </item>
    <item>
      <title>Re: Help me understand why SIOC and shares did not protect my storage</title>
      <link>https://communities.vmware.com/t5/ESXi-Discussions/Help-me-understand-why-SIOC-and-shares-did-not-protect-my/m-p/2760648#M272829</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Indeed, I did change those values, without any impact. I could only test for a short while &lt;SPAN style="font-size: 13.3333px;"&gt;since we had complaints from other customers.&lt;/SPAN&gt; &lt;SPAN style="font-size: 10pt;"&gt;If I recall correctly, the VM only generated ~480 IOPS but was constantly writing 300 MB/s to disk. Setting the IOPS limit to 100 would possibly have been better than the 500 cap I tried. &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
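      <!-- A rough check of the numbers in this post, as a sketch only. It takes the ~480 IOPS and 300 MB/s quoted above, derives the average I/O size, and estimates the throughput ceiling under a given IOPS cap. Assumptions that are not from the thread: decimal units (1 MB = 1,000,000 bytes), an I/O size that stays the same once a cap is applied, and a limit that counts each I/O once regardless of its size. The function names are hypothetical.

```python
# Figures quoted in the thread; everything else is an assumption for illustration.
BYTES_PER_MB = 1_000_000   # assumption: decimal megabytes
BYTES_PER_KB = 1_000       # assumption: decimal kilobytes


def avg_io_size_kb(throughput_mb_s: float, iops: float) -> float:
    """Average size of one I/O in KB for a given throughput and I/O rate."""
    return throughput_mb_s * BYTES_PER_MB / iops / BYTES_PER_KB


def throughput_at_cap_mb_s(iops_cap: float, io_size_kb: float) -> float:
    """Throughput ceiling in MB/s if the I/O size is unchanged under the cap."""
    return iops_cap * io_size_kb * BYTES_PER_KB / BYTES_PER_MB


observed_iops = 480    # "~480 IOPS"
observed_mb_s = 300    # "300 MB/s" in writes

io_kb = avg_io_size_kb(observed_mb_s, observed_iops)
print(f"average I/O size: {io_kb:.0f} KB")

for cap in (500, 100):
    if cap >= observed_iops:
        # The VM never issues this many I/Os, so the cap is never reached.
        print(f"cap {cap} IOPS: above the observed rate, no throttling expected")
    else:
        ceiling = throughput_at_cap_mb_s(cap, io_kb)
        print(f"cap {cap} IOPS: about {ceiling:.1f} MB/s at the same I/O size")
```

Under these assumptions a 500 IOPS cap sits above what the VM was issuing, which is consistent with it having no visible effect. How ESXi actually accounts for large I/Os on an NFS datastore is not covered in this thread. -->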
      <pubDate>Thu, 15 Sep 2016 11:43:32 GMT</pubDate>
      <guid>https://communities.vmware.com/t5/ESXi-Discussions/Help-me-understand-why-SIOC-and-shares-did-not-protect-my/m-p/2760648#M272829</guid>
      <dc:creator>BjornJohansson</dc:creator>
      <dc:date>2016-09-15T11:43:32Z</dc:date>
    </item>
  </channel>
</rss>

