VMware Communities
JoshSimons
VMware Employee

SC11: Virtualized Hadoop

Abhinav Chawade from AMAX gave an excellent talk on virtualized Hadoop performance in Intel's booth theatre on the exhibit floor during SC11. He covered the results detailed in Jeff Buell's whitepaper, A Benchmarking Case Study of Virtualized Hadoop Performance on VMware vSphere 5, which was created as a collaborative effort between AMAX, VMware, and Mellanox.

Big Data was a big part of SC this year, with numerous papers on Hadoop and related technologies. Several papers discussed running Hadoop in a cloud environment, perhaps most notably Purlieus: Locality-aware Resource Allocation for MapReduce in a Cloud, which I found very interesting. The abstract is reproduced below.

"We present Purlieus, a MapReduce cloud resource allocation system aimed at enhancing the performance of MapReduce jobs in the cloud. Purlieus provisions virtual MapReduce clusters in a locality-aware manner enabling MapReduce virtual machines (VMs) access to input data and importantly, intermediate data from local or close-by physical machines. We demonstrate how this locality-awareness during both map and reduce phases of the job not only improves runtime performance of individual jobs but also has an additional advantage of reducing network traffic generated in the cloud data center. This is accomplished using a novel coupling of, otherwise independent, data and VM placement steps. We conduct a detailed evaluation of Purlieus and demonstrate significant savings in network traffic and almost 50% reduction in job execution times for a variety of workloads."
