      • 495. Re: New  !! Open unofficial storage performance thread
        lpisanec Novice

        I needed to get some storage for a small business where I work.

        We did not want to spend a lot of money, but of course we wanted top performance. Space was a concern, as we have around 6TB of data, growing by about 300GB/year (or more).

         

        At first we were looking to get a Dell Powervault MD3600 or MD3620 and a 10GE switch plus some 10GE cards.

        That was a bit costly - it would have been around 25-30k EUR for 12*2TB or 24*1TB raw space.

        Besides that we would have been locked in to Dell providing spare parts, service and possible upgrades in the future.

         

        So I did some research and came up with a plan to build our own storage with Nexenta, a zfs-based storage appliance.

        It is based on a Supermicro chassis, a Supermicro motherboard, 2x Xeon E5-2609 and 256GB RAM. The HDDs are cheap nearline SAS drives from Toshiba, 24x 1TB at 7.2k rpm, connected through a SAS expander to a single-port SAS controller (LSI 2308).

        22 of the drives are in a mirror-stripe configuration (also known as RAID10); the last 2 are hot spares.
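        (For anyone sizing something similar, here is a rough back-of-the-envelope sketch in Python. The per-drive IOPS number is purely an assumption, a typical ballpark for 7.2k rpm nearline SAS, not something I measured:)

        # Rough sizing of a ZFS mirror-stripe (RAID10-style) pool.
        DRIVES_TOTAL = 24
        HOT_SPARES = 2
        DRIVE_TB = 1.0
        IOPS_PER_DRIVE = 80                        # assumed, not measured

        data_drives = DRIVES_TOTAL - HOT_SPARES    # 22
        mirror_vdevs = data_drives // 2            # 11 two-way mirrors
        usable_tb = mirror_vdevs * DRIVE_TB        # ~11 TB before ZFS overhead

        # Reads can be served from either side of a mirror; writes hit both sides.
        est_read_iops = data_drives * IOPS_PER_DRIVE      # ~1760
        est_write_iops = mirror_vdevs * IOPS_PER_DRIVE    # ~880
        print(mirror_vdevs, usable_tb, est_read_iops, est_write_iops)

        Anything much beyond those disk-only numbers in the results below is the RAM cache (ARC) doing the work rather than the spindles.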

         

        The storage box is connected via InfiniBand (Mellanox ConnectX-2 QDR) using SRP to two ESXi 5.1 hosts and will serve as the main storage for 15-20 VMs.

        As for redundancy: it was too expensive to remove all single points of failure. I decided it would suffice to keep some spare parts on hand in case something dies; we can live happily with an hour or two of downtime to replace things. Besides, we can fall back to 1G Ethernet if InfiniBand goes down and takes the IB switch and most of the cards with it, and our backup server can serve as a new "head" for the storage in a pinch as well.

         

        So far this storage setup has set us back around 15k EUR.

         

        512GB test size, no write cache at all
        (read-ahead caching is too good)

        Test name                | Latency | Avg iops | Avg MBps | cpu load
        Max Throughput-100%Read  | 1.07    | 50625    | 1582     | 4%
        RealLife-60%Rand-65%Read | 93.02   | 634      | 4        | 3%
        Max Throughput-50%Read   | 71.88   | 823      | 25       | 3%
        Random-8k-70%Read        | 91.53   | 643      | 5        | 4%
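        (To sanity-check any row in these tables: MBps is just IOPS times the access spec's block size - 32k for the Max Throughput tests, 8k for the RealLife/Random-8k ones. For example, in Python:)

        # MBps ~= IOPS * block_size_KB / 1024
        def mbps(iops, block_kb):
            return iops * block_kb / 1024.0

        print(mbps(50625, 32))   # Max Throughput-100%Read -> ~1582 MBps, matching the table
        print(mbps(634, 8))      # RealLife-60%Rand-65%Read -> ~5 MBps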

         

         

        512GB test size, with writeback cache

        Test name                | Latency | Avg iops | Avg MBps | cpu load
        Max Throughput-100%Read  | 1.11    | 49724    | 1553     | 4%
        RealLife-60%Rand-65%Read | 36.20   | 1613     | 12       | 3%
        Max Throughput-50%Read   | 1.04    | 51132    | 1597     | 3%
        Random-8k-70%Read        | 40.30   | 1441     | 11       | 3%

         

         

        32GB test size (fits into RAM cache), writeback cache

        Test name                | Latency | Avg iops | Avg MBps | cpu load
        Max Throughput-100%Read  | 1.11    | 48925    | 1528     | 4%
        RealLife-60%Rand-65%Read | 1.25    | 41147    | 321      | 39%
        Max Throughput-50%Read   | 1.05    | 51345    | 1604     | 3%
        Random-8k-70%Read        | 1.16    | 43747    | 341      | 40%

         

         

        Tests to get the maximum performance out of InfiniBand,
        block size 128k and 1M, test size 32GB,
        writeback enabled

        Test name                  | Latency | Avg iops | Avg MBps | cpu load
        128k 100% random read      | 3.45    | 17098    | 2137     | 34%
        128k 100% random write     | 22.21   | 2654     | 331      | 5%
        128k 100% sequential read  | 3.33    | 17731    | 2216     | 34%
        128k 100% sequential write | 2.65    | 22040    | 2755     | 3%
        1M 100% random read        | 26.62   | 2249     | 2249     | 2%
        1M 100% random write       | 150.84  | 399      | 399      | 3%
        1M 100% sequential read    | 25.37   | 2350     | 2350     | 4%
        1M 100% sequential write   | 198.17  | 302      | 302      | 5%
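        (For context on how much of the InfiniBand link that is: a ConnectX-2 QDR port signals at 40 Gbit/s but carries roughly 32 Gbit/s of data after 8b/10b encoding. A rough utilisation check, assuming a single active port and ignoring SRP/PCIe overhead:)

        QDR_DATA_GBIT = 32                      # 40 Gbit/s signalling minus 8b/10b encoding
        link_mb_s = QDR_DATA_GBIT * 1000 / 8    # ~4000 MB/s of payload per port

        for label, mb_s in [("128k seq read", 2216), ("128k seq write", 2755), ("1M seq read", 2350)]:
            print(f"{label}: {mb_s} MBps = {mb_s / link_mb_s:.0%} of one QDR port")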
        • 496. Re: New  !! Open unofficial storage performance thread
          IRIpl Lurker

          lpisanec

          Where did you get the SRP driver for ESXi 5.1? Or do you use iSCSI over InfiniBand (IPoIB)?

          Mellanox doesn't provide an SRP driver for ESXi 5.1 on its official web site!

           

          Can you explain?

          • 497. Re: New  !! Open unofficial storage performance thread
            lpisanec Novice

            Mellanox does provide official drivers that support SRP; they were released a week or so ago.

             

            http://www.mellanox.com/page/products_dyn?&product_family=36&mtag=vmware_drivers

             

            Driver version 1.8.1.0 is the one you want (or greater, if someone reads this in a few months).

             

            Quote from release notes:

            MLNX-OFED-ESX package contains:
            o MLNX-OFED-ESX-1.8.1.zip - Hypervisor bundle which contains the following
              kernel modules:
               - mlx4_core (ConnectX family low-level PCI driver)
               - mlx4_ib (ConnectX family InfiniBand driver)
               - ib_core
               - ib_sa
               - ib_mad
               - ib_umad
               - ib_ipoib
               - ib_cm
               - ib_srp
            
            

            ...

             

              - Storage:
                o NFS over IPoIB
                o GPFS over IPoIB
                o SRP
            • 498. Re: New  !! Open unofficial storage performance thread
              IRIpl Lurker

              lpisanec

               

              Can you run a performance test with my IOMeter profile? -> http://vmblog.pl/OpenPerformanceTest32-4k-Random.icf

               

              This profile includes a 4k random read/write test.

               

              And could you show a screenshot of the vSphere Client configuration for the SCSI SRP devices, if they show up under "Storage Adapters"?

               

              Thanks

               

              Tom

              • 499. Re: New  !! Open unofficial storage performance thread
                lpisanec Novice

                There you go.

                 

                 

                Test name                         | Latency | Avg iops | Avg MBps | cpu load
                Max Throughput-100%Read           | 0.51    | 48395    | 1512     | 6%
                RealLife-60%Rand-65%Read          | 0.89    | 36656    | 286      | 46%
                Max Throughput-50%Read            | 0.46    | 48343    | 1510     | 3%
                Random-8k-70%Read                 | 0.67    | 42661    | 333      | 30%
                4k-Max Throu-100%Read-100%Random  | 0.43    | 50842    | 198      | 31%
                4k-Max Throu-100%Write-100%Random | 4.25    | 12011    | 46       | 10%

                 

                [Attached screenshot: Screen Shot 2013-03-17 at 23.58.03.png]

                • 500. Re: New  !! Open unofficial storage performance thread
                  alan@vcit.ca Novice

                  The SAN is a Dell R710 (1x L5600-series CPU, 3GB RAM) running Windows Server 2008 R2 and the Microsoft iSCSI Software Target v3.3. IOMeter was run directly on the R710 (not from a VM via the hypervisor's iSCSI). The data volume is a GPT partition with the NTFS default 64K allocation unit size. The RAID stripe element size on the PERC 6i is 64K.

                   

                  Dell PERC 6i controller, 6x 2TB Seagate ES.2 3.5" SATA, RAID-10, no hot spares

                  Access Specification     | IOPS      | MB/s   | Avg Resp Time (ms)
                  Max Throughput-100%Read  | 26,818.24 | 838.07 | 2.22
                  RealLife-60%Rand-65%Read | 1,070.44  | 8.36   | 41.94
                  Max Throughput-50%Read   | 22,213.96 | 694.19 | 2.66
                  Random-8k-70%Read        | 946.45    | 7.39   | 47.99

                   

                  I'm happy with the sequential I/O results, but the random I/O results are just OK. I'm guessing random I/O results would be much better with more spindles, but I'm limited to 6 drive bays.
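                  (To put a number on the spindle guess, here is a naive disk-only floor; the per-drive IOPS figure is just an assumption for 7.2k rpm SATA:)

                  IOPS_PER_DRIVE = 80                           # assumed ballpark for a 7.2k rpm SATA drive
                  drives = 6

                  read_floor = drives * IOPS_PER_DRIVE          # ~480: every spindle can serve reads
                  write_floor = (drives // 2) * IOPS_PER_DRIVE  # ~240: each write lands on both mirror halves
                  print(read_floor, write_floor)

                  The measured ~1,070 IOPS on the 65%-read RealLife mix sits above that naive floor, which is plausible with deep queues (NCQ) and the controller cache helping out, but the result should still scale roughly with spindle count.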

                   

                  Comments appreciated on how I can improve the random I/O results, if at all. Also, would a StarWind target with a read cache help at all with random I/O? The MS iSCSI target doesn't do caching...

                  • 501. Re: New  !! Open unofficial storage performance thread
                    mikeyb79 Novice

                    If you want to keep Server 2008 R2 as the host OS, then yeah, StarWind iSCSI will get you caching, although you'll need more than 3GB of RAM for it to be noticeable. That said, on the face of it, that's not bad for 6 SATA spindles.

                     

                    What does performance look like across the wire?

                    • 502. Re: New  !! Open unofficial storage performance thread
                      alan@vcit.ca Novice

                      Thanks for your advice re: StarWind, mikeyb79. Based on benchmarking tests I've run with IOMeter against a StarWind iSCSI target with caching enabled on the LUN, you are right - a small RAM cache makes little difference to read performance. I'm guessing that even a larger cache (e.g. 16GB) won't help that much with random IO performance. What do you think? I'm not familiar enough with the StarWind caching algorithms to know for sure, and StarWind themselves haven't been very forthcoming: http://www.starwindsoftware.com/forums/starwind-f5/how-properly-benchmark-caching-benefits-t3123.html

                       

                      When I run IOMeter within a Windows 7 64-bit VM against a directly attached VMDK that resides on a LUN exposed by the MS iSCSI target, performance is pretty close to "native" for smaller IO sizes. However, as I haven't yet implemented MPIO, network throughput is currently bottlenecked by a single Gig NIC. I'll rectify that this weekend.

                       

                      In general, I'm pretty happy with this performance given the relatively low cost of the hardware. It would be nice to get some SSD caching involved for better random IO performance, but I don't think that's going to be possible.

                       

                      I'll post back when I have some "in VM" results to share...

                       

                      Thanks!

                      • 503. Re: New  !! Open unofficial storage performance thread
                        jb42 Novice

                        I'm still figuring all this out so take this with a grain of salt...

                         

                        One thing: your memory bus speed may pick up by about 70% on that box if you fill out all your DIMM slots with 1333MHz sticks, so there could be potential for a better-than-expected cache bonus from even a moderate increase in memory, as long as it's balanced across all 3 channels.

                         

                        I think you could get a 2-3x random I/O boost with ZFS on a *nix box, but you may need 12-16x the RAM. If you think about doing it, it'd be interesting to see how much additional memory by itself improves your results on Windows Server with StarWind, and then what additional boost, if any, you get from ZFS.
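                        (The reason the RAM multiple matters so much is just cache-hit math. A toy model - every number below is a made-up assumption, only meant to show the shape of the curve:)

                        DISK_IOPS = 1000     # assumed: what the spindles alone sustain on a random mix
                        RAM_IOPS = 50000     # assumed: what a cache hit sustains (CPU/network bound in practice)

                        def effective_iops(hit_rate):
                            # Weighted-average service time of hits and misses, inverted back to IOPS.
                            return 1 / (hit_rate / RAM_IOPS + (1 - hit_rate) / DISK_IOPS)

                        for hr in (0.0, 0.5, 0.8, 0.95):
                            print(f"cache hit rate {hr:.0%}: ~{effective_iops(hr):,.0f} IOPS")

                        A 50% hit rate only roughly doubles the random result; the big gains only show up once most of the working set fits in RAM, which is why the RAM multiple is so large.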

                         
                        
                        iSCSI SAN on a Dell PE 2900 running OpenIndiana for ZFS and COMSTAR, with napp-it.
                        2x Xeon E5410 @ 2.33 GHz, 48GB RAM. Dual-aggregated 1Gb NICs.
                        6x 2TB 7.2k SATA drives "passed through" the PERC controller
                        as individual RAID 0 disks, which is said to reduce performance.
                        RAID 10 ZFS pool. iSCSI volume LU.

                        The IOMeter test is run from a VM guest using Round Robin MPIO on an R510 ESXi host.
                        The test is run at 96,000,000 sectors (about 46GB, roughly the size of the 48GB of RAM) to try to saturate the ARC (RAM cache). I still
                        think I'm getting some cache benefit, though, as I get about 40MBps on the RealLife/Random tests with a 4GB test size.
                        Test name                | Latency | Avg iops | Avg MBps | cpu load
                        Max Throughput-100%Read  | 17.18   | 3424     | 107      | 5%
                        RealLife-60%Rand-65%Read | 14.80   | 2874     | 22       | 8%
                        Max Throughput-50%Read   | 10.66   | 5541     | 173      | 4%
                        Random-8k-70%Read        | 19.48   | 2079     | 16       | 11%
                        • 504. Re: New  !! Open unofficial storage performance thread
                          alan@vcit.ca Novice
                          Same hardware, this time with results from within a Windows 7 64-bit VM (1.5GB of RAM, 1 vCPU) using the iSCSI software initiator, running on an ESXi 4.1 U2 host. The target VMDK resides on a LUN exposed by StarWind that has a 1GB cache (WT mode, basically just a read cache):

                          Access Specification     | IOPS                                  | MB/s   | Avg Resp Time (ms)
                          Max Throughput-100%Read  | 3398.88 (NIC on SAN almost saturated) | 106.22 | 9.09
                          RealLife-60%Rand-65%Read | 1069.34                               | 8.35   | 26.16
                          Max Throughput-50%Read   | saturates NIC on SAN                  |        |
                          Random-8k-70%Read        | 994.77                                | 7.77   | 28.66

                           

                          Results from within the same Windows 7 64-bit VM (1.5GB of RAM, 1 vCPU) with the iSCSI software initiator, running on the ESXi 4.1 U2 host. This time the target VMDK resides on a LUN exposed by StarWind with no caching:

                           

                          Access Specification     | IOPS                                   | MB/s   | Avg Resp Time (ms)
                          Max Throughput-100%Read  | 3,247.61 (NIC on SAN almost saturated) | 101.49 | 9.46
                          RealLife-60%Rand-65%Read | 1118.26                                | 8.74   | 24.26
                          Max Throughput-50%Read   | saturates NIC on SAN                   |        |
                          Random-8k-70%Read        | 975.25                                 | 7.62   | 28.10
                          As I mentioned before, the StarWind caching doesn't seem to offer much improvement, but perhaps that's because the 1GB cache is simply too small.
                          I'm going to try the same test (from the VM) using the MS iSCSI target for comparison's sake. Results should be similar to the StarWind non-cached LUN, but the proof is in the pudding.
                          • 505. Re: New  !! Open unofficial storage performance thread
                            alan@vcit.ca Novice

                            Here are the results using the MS iSCSI target instead of StarWind, IOMeter run within the same test Windows 7 64-bit VM:

                             

                            Access Specification     | IOPS                                   | MB/s   | Avg Resp Time (ms)
                            Max Throughput-100%Read  | 3,353.83 (NIC on SAN almost saturated) | 104.81 | 17.83
                            RealLife-60%Rand-65%Read | 1177.83                                | 9.20   | 51.24
                            Max Throughput-50%Read   | saturates NIC on SAN                   |        |
                            Random-8k-70%Read        | 1040.72                                | 8.13   | 57.96

                             

                            As far as I know, the MS iSCSI target does no caching, so these results suggest that the 1GB Starwind cache provides little if any benefit.

                             

                            Based on the "native" performance for max throughput (both 100% read and 50/50 read/write), the network throughput is clearly the limiting factor. With over 26,000 IOPS and over 800 MB/s throughput for reads, I would need 8 NICs in order to keep up! Pretty crazy...
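                            (The 8-NIC figure is just wire-speed math; assuming roughly 115 MB/s of usable iSCSI payload per gigabit link:)

                            import math

                            native_mb_s = 838      # measured locally on the R710 for 100% sequential reads
                            per_gbe_mb_s = 115     # assumed usable iSCSI payload per 1 Gbit/s link

                            print(math.ceil(native_mb_s / per_gbe_mb_s))   # -> 8 links to carry the native read rate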

                             

                            I had planned on using (2) pNICs on the SAN and on each ESXi server and then have two iSCSI subnets and bind the vmknic to a single pNIC. Basically, a standard MPIO setup with round robin and switching between storage paths with each IO operation.

                             

                            Can anyone out there advise on what's involved in using more than (2) pNICs with MPIO? Is it just a matter of creating additional vmknics and then binding each to a pNIC? Is it worth bothering?

                             

                            About the only sequential IO in our environment that would generate sustained iSCSI traffic would be PHD backups (virtual fulls) and I'm guessing that a good chunk of the job time is spent on hashing, compression, and verification rather than on IO operations, so backup job times likely wouldn't be reduced by a huge amount with faster disk throughput.

                             

                            Any thoughts?

                            • 506. Re: New  !! Open unofficial storage performance thread
                              jb42 Novice

                              A bit of Unix-side network tuning today. Tremendous gains on the sequential tests! In case you made the same mistake I did on your Solaris-based ZFS storage rig, reconfigure your network with ipadm instead of ifconfig and see if you get a huge boost too. (By the way, this is a 4GB test, so all the reads come from the ARC cache; write-back cache is also enabled.)

                               

                               

                              Test name                | Latency | Avg iops | Avg MBps | cpu load
                              Max Throughput-100%Read  | 8.26    | 7072     | 221      | 10%
                              RealLife-60%Rand-65%Read | 3.68    | 12254    | 95       | 20%
                              Max Throughput-50%Read   | 8.71    | 6771     | 211      | 16%
                              Random-8k-70%Read        | 4.07    | 13111    | 102      | 20%

                              • 507. Re: New  !! Open unofficial storage performance thread
                                PNeum Lurker

                                Hi guys,

                                 

                                We just bought an HUS 110 with 16 SSDs and 21 15K SAS drives, and it looks like we have a performance issue.
                                Tested from a VDI on an HP blade system, through two Cisco MDS 9100 4Gb FC switches that also carry a light load to an old EVA4400 array.

                                This test was run on an empty (freshly configured) disk array.

                                 

                                SERVER TYPE: W7 on ESX4.1

                                CPU TYPE / NUMBER: VCPU/4

                                HOST TYPE: HP BL460c G1, 52GB RAM, 2xXeonE5450 @ 3.00GHz

                                STORAGE TYPE / DISK NUMBER / RAID LEVEL: HUS110 / 16x200GB SSD 3x(4+1)+1P/ RAID5

                                Test name                | Latency | Avg iops | Avg MBps | cpu load
                                Max Throughput-100%Read  | 4.78    | 11944    | 373      | 72%
                                RealLife-60%Rand-65%Read | 40.45   | 1301     | 10       | 59%
                                Max Throughput-50%Read   | 30.21   | 1657     | 51       | 51%
                                Random-8k-70%Read        | 33.27   | 1511     | 11       | 49%

                                 

                                 

                                 

                                SERVER TYPE: W7 on ESX4.1

                                CPU TYPE / NUMBER: VCPU/4

                                HOST TYPE: HP BL460c G1, 52GB RAM, 2xXeonE5450 @ 3.00GHz

                                STORAGE TYPE / DISK NUMBER / RAID LEVEL: HUS110 / 20x300GB 15K SAS 4x(4+1)+1P / RAID5

                                Test name                | Latency | Avg iops | Avg MBps | cpu load
                                Max Throughput-100%Read  | 4.78    | 11964    | 373      | 71%
                                RealLife-60%Rand-65%Read | 116.92  | 501      | 3        | 18%
                                Max Throughput-50%Read   | 30.38   | 1645     | 51       | 50%
                                Random-8k-70%Read        | 125.57  | 468      | 3        | 34%

                                 

                                 

                                If I run the test on a single VMware 4.1 host, without the other VDI load and outside of vCenter, the latency goes down. It looks like something is wrong on the first ESX host. Could it be a path balancing setting? What should I check?

                                SERVER TYPE: W7 on ESX4.1

                                CPU TYPE / NUMBER: VCPU/4

                                HOST TYPE: HP BL460c G1, 52GB RAM, 2xXeonE5405 @ 2.00GHz

                                STORAGE TYPE / DISK NUMBER / RAID LEVEL: HUS110 / 15x 200GB SSD / Raid5 4+1

                                Test name                | Latency | Avg iops | Avg MBps | cpu load
                                Max Throughput-100%Read  | 4.56    | 12286    | 383      | 84%
                                RealLife-60%Rand-65%Read | 9.76    | 4073     | 31       | 2%
                                Max Throughput-50%Read   | 5.86    | 9441     | 295      | 72%
                                Random-8k-70%Read        | 5.56    | 8453     | 66       | 2%

                                 

                                Why are the read results so similar between SSD and SAS? Is the FC HBA the bottleneck? How else could I test it?
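                                (The sequential reads on both pools sit right at what a single 4Gb FC path can deliver, which is part of why I suspect the HBA/fabric rather than the disks. A rough check - the ~400 MB/s figure is the usual usable payload of a 4Gb FC link:)

                                FC_4G_MB_S = 400   # ~4.25 Gbaud, 8b/10b encoded -> roughly 400 MB/s usable per path
                                measured = {"SSD pool": 383, "15K SAS pool": 373}   # Max Throughput-100%Read MBps from above

                                for pool, mb_s in measured.items():
                                    print(f"{pool}: {mb_s} MB/s = {mb_s / FC_4G_MB_S:.0%} of one 4Gb FC path")

                                The random (RealLife / Random-8k) rows are where the SSD group actually pulls ahead, since those are not limited by the link.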

                                 

                                Thank you so much for any advice

                                • 508. Re: New  !! Open unofficial storage performance thread
                                  tdubb123 Master

                                  I ran the IOMeter tests on a StorSimple appliance. Can you comment on the performance?

                                  • 509. Re: New  !! Open unofficial storage performance thread
                                    ryatwork Lurker

                                    Host:

                                    IBM x3650 M4

                                    (2) Intel 2.9 GHz 8-core

                                    192 GB

                                    (2) IBM 6Gb SAS HBA

                                     

                                    SAN:

                                    IBM DS3524 2 GB Cache

                                    (2) EXP3524

                                    (2) LSI LSISAS6160 Switches

                                    (2) 200 GB SSD

                                    (10) 15k 146 GB

                                    (48) 10k 600 GB

                                     

                                    Guest:

                                    Windows 2008R2 Enterprise

                                    2 CPUs, 8 cores each

                                    8 GB

                                     

                                    Raid 10 (46) 10k 600GB cache on

                                    Test Name                | IOps     | MBps    | AVG Resp (ms) | CPU %
                                    Max Throughput-100%Read  | 33249.68 | 1039.05 | 1.70          | 6.30
                                    RealLife-60%Rand-65%Read | 6893.56  | 53.86   | 6.71          | 3.51
                                    Max Throughput-50%Read   | 12531.99 | 391.62  | 4.71          | 6.00
                                    Random-8k-70%Read        | 6779.73  | 52.97   | 6.22          | 3.67

                                     

                                    Raid 10 (46) 10k 600 GB cache off

                                    Test Name                | IOps     | MBps   | AVG Resp (ms) | CPU %
                                    Max Throughput-100%Read  | 16346.72 | 510.84 | 3.75          | 3.11
                                    RealLife-60%Rand-65%Read | 4153.40  | 32.45  | 10.89         | 2.77
                                    Max Throughput-50%Read   | 1395.45  | 43.61  | 35.31         | 2.44
                                    Random-8k-70%Read        | 5466.86  | 42.71  | 8.08          | 3.04

                                     

                                    Diskpool (48) 10k 600 GB cache on 3 drive preservation

                                    Test Name                | IOps     | MBps    | AVG Resp (ms) | CPU %
                                    Max Throughput-100%Read  | 33363.03 | 1042.59 | 1.69          | 6.45
                                    RealLife-60%Rand-65%Read | 3814.13  | 29.80   | 12.07         | 3.06
                                    Max Throughput-50%Read   | 13355.51 | 417.36  | 4.47          | 3.00
                                    Random-8k-70%Read        | 3589.38  | 28.04   | 12.03         | 3.34

                                     

                                    Diskpool (48) 10K 600 GB cache off 3 drive preservation

                                    Test Name                | IOps     | MBps   | AVG Resp (ms) | CPU %
                                    Max Throughput-100%Read  | 15781.88 | 493.18 | 3.88          | 4.77
                                    RealLife-60%Rand-65%Read | 1533.44  | 11.98  | 31.20         | 2.33
                                    Max Throughput-50%Read   | 992.59   | 31.02  | 49.80         | 2.32
                                    Random-8k-70%Read        | 1669.93  | 13.05  | 28.86         | 2.28

                                     

                                    Raid10 (10) 146 GB 15k cache on

                                    Test Name                | IOps     | MBps    | AVG Resp (ms) | CPU %
                                    Max Throughput-100%Read  | 33245.00 | 1038.91 | 1.71          | 6.51
                                    RealLife-60%Rand-65%Read | 4250.68  | 33.21   | 10.56         | 3.42
                                    Max Throughput-50%Read   | 12514.10 | 391.07  | 4.73          | 2.95
                                    Random-8k-70%Read        | 3762.80  | 29.40   | 11.64         | 3.52

                                     

                                    Raid10 (10) 146 GB 15k cache off

                                    Test Name                | IOps     | MBps   | AVG Resp (ms) | CPU %
                                    Max Throughput-100%Read  | 16380.95 | 511.90 | 3.75          | 3.27
                                    RealLife-60%Rand-65%Read | 2871.88  | 22.44  | 17.25         | 2.26
                                    Max Throughput-50%Read   | 1355.16  | 42.35  | 38.09         | 2.20
                                    Random-8k-70%Read        | 3083.47  | 24.09  | 16.05         | 2.29

                                     

                                    Raid1 (2) 200 GB SSD cache on

                                    Test Name                | IOps     | MBps    | AVG Resp (ms) | CPU %
                                    Max Throughput-100%Read  | 36207.60 | 1131.49 | 1.60          | 6.78
                                    RealLife-60%Rand-65%Read | 10244.95 | 80.04   | 5.71          | 2.47
                                    Max Throughput-50%Read   | 12445.03 | 388.91  | 4.77          | 2.91
                                    Random-8k-70%Read        | 10911.61 | 85.25   | 5.30          | 2.56

                                     

                                    Raid1 (2) 200 GB SSD cache off

                                    Test Name                | IOps     | MBps   | AVG Resp (ms) | CPU %
                                    Max Throughput-100%Read  | 9446.24  | 295.20 | 6.23          | 2.67
                                    RealLife-60%Rand-65%Read | 10015.22 | 78.24  | 5.64          | 2.72
                                    Max Throughput-50%Read   | 4008.13  | 125.25 | 14.68         | 1.70
                                    Random-8k-70%Read        | 10665.43 | 83.32  | 5.33          | 2.76