4 Replies Latest reply on Jan 4, 2008 11:55 AM by FredPeterson

    Intel CPU - Enable hardware prefetch?

    bolsen Hot Shot

      IBM enables the CPU hardware prefetch by default but Intel recommends turning the feature off depending on what the server is doing.  Anyone have any preferences?

        • 1. Re: Intel CPU - Enable hardware prefetch?
          bolsen Hot Shot

          Hmmm, I'm starting to think it should be off.

          • 2. Re: Intel CPU - Enable hardware prefetch?
            RParker Guru


            I think you would be wrong.  Try it and see what happens.






            Instruction supply may become a substantial bottleneck in future generation processors that have very long memory latencies and run application workloads with large instruction footprints such as database servers. Prefetching is a well-known technique for improving the effectiveness of the cache hierarchy.



            employs a hardware-based breadth-first search of future control-flow to cope with weakly-biased future branches, prescient instruction prefetch uses precomputation to resolve which control-flow path to follow. Furthermore, as the precomputation frequently contains load instructions, prescient instruction prefetch often improves performance by prefetching data.



            prefetch uses helper threads to perform instruction prefetch on behalf of the main thread.



            A key challenge for instruction prefetch is to accurately predict control flow sufficiently in advance of the fetch unit to tolerate the latency of the memory hierarchy. The notion of prescient instruction prefetch was first introduced as a technique that uses helper threads to improve single-threaded application performance by performing judicious and timely instruction prefetch. 



            • 3. Re: Intel CPU - Enable hardware prefetch?
              bolsen Hot Shot


              Just found this in the Systemx redbook.






              BIOS levels permit various settings for performance in certain IBM System x





                Processor Adjacent Sector Prefetch

              When this setting is enabled (enabled is the default for most systems), the processor retrieves both sectors of a cache line when it requires data that is not currently in its cache. When it is disabled, the processor will only fetch the sector of the cache line that includes the data requested. For instance, only one 64-byte line from the 128-byte sector will be prefetched with this setting disabled.





              This setting can affect performance, depending on the application running on the server and memory bandwidth utilization. Typically, it affects certain benchmarks by a few percent, although in most real applications it will be negligible. This control is provided for benchmark users who want to fine-tune configurations and settings.
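
To make the adjacent-sector behavior concrete, here is a minimal Python sketch. It only uses the sizes quoted above (64-byte lines paired into a 128-byte sector) and assumes the two halves of a sector are naturally aligned. The function name `lines_fetched` is made up for illustration; it is not an Intel or IBM interface, and real hardware may behave differently.

```python
LINE_SIZE = 64      # bytes per line (figure quoted in the Redbook text)
SECTOR_SIZE = 128   # bytes per sector, i.e. two adjacent lines (same source)


def lines_fetched(address, adjacent_sector_prefetch):
    """Return the base addresses of the lines brought in on a miss at `address`.

    Assumption: sectors are naturally aligned, so the other half of the
    sector is found by flipping the 64-byte bit of the line's base address.
    """
    line_base = address - (address % LINE_SIZE)
    if not adjacent_sector_prefetch:
        # Setting disabled: only the line holding the requested data.
        return [line_base]
    buddy = line_base ^ LINE_SIZE  # the other line in the same 128-byte sector
    return sorted([line_base, buddy])


# Hypothetical miss address, chosen only for illustration.
miss = 0x1050
print([hex(a) for a in lines_fetched(miss, adjacent_sector_prefetch=False)])
print([hex(a) for a in lines_fetched(miss, adjacent_sector_prefetch=True)])
```

With the setting enabled, every miss moves twice as many bytes over the bus, which is why the benefit depends on whether the neighboring line actually gets used.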




                Processor Hardware Prefetcher

              When this setting is enabled (disabled is the default for most systems), the processor is able to prefetch extra cache lines for every memory request. Recent tests in the performance lab have shown that you will get the best performance for most commercial application types if you disable this feature. The performance gain can be as much as 20% depending on the application. For high-performance computing (HPC) applications, we recommend you enable HW Prefetch, and for database workloads, we recommend you leave HW Prefetch disabled.




              Both prefetch settings do decrease the miss rate for the L2/L3 cache when they are enabled, but they consume bandwidth on the front-side bus, which can reach capacity under heavy load. By disabling both prefetch settings, multi-core setups achieve generally higher performance and scalability.
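
The bandwidth trade-off described above can be sketched as a toy calculation. This is not a measurement: the per-core demand, the prefetch overhead, and the bus capacity below are made-up numbers, and `bus_utilization` is a hypothetical helper. It only shows why extra prefetch traffic matters more as the core count grows on a shared front-side bus.

```python
def bus_utilization(cores, demand_per_core, prefetch_overhead, bus_capacity):
    """Fraction of shared-bus capacity consumed by all cores together.

    prefetch_overhead is the extra traffic added by prefetching, expressed
    as a fraction of demand traffic (0.0 means prefetch is off).
    """
    traffic = cores * demand_per_core * (1 + prefetch_overhead)
    return traffic / bus_capacity


# All values are illustrative assumptions, not figures from the Redbook.
DEMAND_GBPS = 1.0   # demand traffic per core
BUS_GBPS = 8.0      # shared front-side bus capacity
OVERHEAD = 0.5      # prefetch adds 50% more traffic in this toy model

for cores in (1, 2, 4, 8):
    off = bus_utilization(cores, DEMAND_GBPS, 0.0, BUS_GBPS)
    on = bus_utilization(cores, DEMAND_GBPS, OVERHEAD, BUS_GBPS)
    note = " (bus saturated)" if on >= 1.0 else ""
    print(f"{cores} cores: prefetch off {off:.0%}, prefetch on {on:.0%}{note}")
```

In this model the bus has headroom at low core counts, so the lower miss rate from prefetching is nearly free. Once utilization approaches 100%, the prefetch traffic competes with demand misses, which fits the Redbook's point about multi-core scalability.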



              • 4. Re: Intel CPU - Enable hardware prefetch?
                FredPeterson Expert

                Based on that, bolsen, I'd venture to say it's a "turn off" for ESX hosts, given the nature of ESX and the way processors flip between different guests with different memory spaces and instructions.  There would be dead time in the processor as it "forgets" what it prefetched for one guest while it waits to work for a different one.


                Interesting.  I'll have to keep it in mind when we replace our fleet this year (we're still stuck on regular Xeons on IBM Blades and 366).