VMware Cloud Community
jcummings_g2
Contributor

Adding Distributions via config-distro.rb

The Admin manual gives a high-level example of using config-distro.rb to add new distros, but it has no examples of, or pointers to, any of the distro URLs.


We're interested in deploying Cloudera CDH4 as a test, but we don't have the full set of URL inputs that the Ruby script needs for the distribution. I believe I have the primary distro, but I'm unsure about Pig and Hive.

So far, I have this:

cd sbin

config-distro.rb --name Cloudera 4 --hadoop http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/4/

Any info or examples for adding a full CDH4 distro?

jessehuvmw
Enthusiast

Hi, please take a look at the sections "About Tarball- and Yum-Deployed Hadoop Distributions" and "Configure a Yum-Deployed Hadoop Distribution" on page 14 of the VMware BDE Admin and User Guide. They describe how to add a distro.

In VMware BDE, CDH4, MapR and Pivotal HD are treated as Yum-deployed Hadoop distributions, so please follow "Configure a Yum-Deployed Hadoop Distribution" to add the CDH4 distro. This involves mirroring the official CDH4 yum repo and a little copy/paste work in /opt/serengeti/www/distros/manifest. There is a sample CDH4 distro configuration in /opt/serengeti/www/distros/manifest.sample.

config-distro.rb is used to configure a tarball-deployed Hadoop distribution (e.g. Apache Hadoop, CDH3, Greenplum, Hortonworks).

-Jesse Hu @VMware

Cheers, Jesse Hu
airt
Contributor

Hi, has anyone managed to use the latest stable distribution, CDH 4.3.0? I followed the steps from the guide and managed to add CDH 4.1.2, but 4.3.0 is not working.

Thanks!

Update: The exact problem was that no "Deployment type" was available when I was creating a cluster for CDH 4.3.0 in the "Create New Big Data Cluster" pop-up window. The solution was to add the deployment types (copied from CDH 4.1.2) to the map file (/opt/serengeti/www/specs/map). Now I am able to choose a cluster type and create a cluster.

jessehuvmw
Enthusiast

Hi airt, in BDE Beta 1, after you add a new distro in /opt/serengeti/www/distros/manifest, you will see it in the Serengeti CLI via 'distro list'. To show the distro in the BDE GUI, you also need to add it in /opt/serengeti/www/specs/map, just as you have done (i.e. copy the CDH 4.1.2 entry in this file and change the version to 4.3.0).

Cheers, Jesse Hu
shashi1978
Contributor

Hi,

I am getting a different problem.

When I add the distro URLs, I get the following errors:

[root@172 sbin]# config-distro.rb --name hdp --vendor HDP --version 1.3.2 --hadoop http://public-repo-1.hortonworks.com/HDP/centos5/1.x/updates/1.3.2.0/tars/hadoop-1.2.0.1.3.2.0-111.t... --pig http://public-repo-1.hortonworks.com/HDP/centos5/1.x/updates/1.3.2.0/tars/pig-0.11.1.1.3.2.0-111.tar... --hive http://public-repo-1.hortonworks.com/HDP/centos5/1.x/updates/1.3.2.0/tars/hive-0.11.0.1.3.2.0-111.ta... --hbase http://public-repo-1.hortonworks.com/HDP/centos5/1.x/updates/1.3.2.0/tars/hbase-0.94.6.1.3.2.0-111-s... --zookeeper http://public-repo-1.hortonworks.com/HDP/centos5/1.x/updates/1.3.2.0/tars/zookeeper-3.4.5.1.3.2.0-11... --hve true

Errors:

  The option --hadoop can be used only for one of these distros: APACHE, GPHD, HDP, BIGTOP.

  The option --pig can be used only for one of these distros: APACHE, GPHD, HDP, BIGTOP.

  The option --hive can be used only for one of these distros: APACHE, GPHD, HDP, BIGTOP.

  The option --hbase can be used only for one of these distros: APACHE, GPHD, HDP, BIGTOP.

  The option --zookeeper can be used only for one of these distros: APACHE, GPHD, HDP, BIGTOP.

  The option --repos is missing.

Usage: config-distro [options]

    -n, --name DISTRO_NAME           Distro name.

    -d, --vendor DISTRO_VENDOR       Valid distro vendor name.

    -v, --version DISTRO_VERSION     Release version of the hadoop distro.

    -a, --hadoop TARBALL_URL         Hadoop tarball url.

    -p, --pig TARBALL_URL            Pig tarball url.

    -i, --hive TARBALL_URL           Hive tarball url.

    -b, --hbase TARBALL_URL          Hbase tarball url.

    -z, --zookeeper TARBALL_URL      Zookeeper tarball url.

    -e, --hve HVE_SUPPORTED          Is HVE supported? Apache Hadoop 1.2+ and Pivotal HD support HVE.

    -r, --repos REPOS                Package repos url

    -o, --roles ROLES                Chef roles supported by this distro

    -y, --yes                        Answer yes for all confirmation.

    -h, --help                       Show this help.

[root@172 sbin]#

Even with the --repos option added, there is a different error:

[root@172 sbin]# config-distro.rb --name hdp --vendor HDP --version 1.3.2 --repos --hadoop /opt/serengeti/sbin/hadoop-1.2.0.1.3.2.0-111.tar.gz  --repos --pig /opt/serengeti/sbin/pig-0.11.1.1.3.2.0-111.tar.gz  --repos --hive /opt/serengeti/sbin/hive-0.11.0.1.3.2.0-111.tar.gz --repos --hbase /opt/serengeti/sbin/hbase-0.94.6.1.3.2.0-111-security.tar.gz --repos --zookeeper /opt/serengeti/sbin/zookeeper-3.4.5.1.3.2.0-111.tar.gz --hve true --yes

Errors:

  The value(--zookeeper) of option --repos format must be http://*.repo or https://*.repo.

Usage: config-distro [options]

    -n, --name DISTRO_NAME           Distro name.

    -d, --vendor DISTRO_VENDOR       Valid distro vendor name.

    -v, --version DISTRO_VERSION     Release version of the hadoop distro.

    -a, --hadoop TARBALL_URL         Hadoop tarball url.

    -p, --pig TARBALL_URL            Pig tarball url.

    -i, --hive TARBALL_URL           Hive tarball url.

    -b, --hbase TARBALL_URL          Hbase tarball url.

    -z, --zookeeper TARBALL_URL      Zookeeper tarball url.

    -e, --hve HVE_SUPPORTED          Is HVE supported? Apache Hadoop 1.2+ and Pivotal HD support HVE.

    -r, --repos REPOS                Package repos url

    -o, --roles ROLES                Chef roles supported by this distro

    -y, --yes                        Answer yes for all confirmation.

    -h, --help                       Show this help.
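A note on the second error: the usage text lists --repos as an option that takes a value (REPOS), and the message reports that value as "--zookeeper", i.e. the option name that follows the last --repos in the command. The following is only a rough sketch of the kind of format check the message describes. The method name, the regular expression, and the example .repo URL are assumptions made for illustration and are not taken from config-distro.rb.

```ruby
# Hypothetical check modelled on the error text
# "must be http://*.repo or https://*.repo".
# The constant, method name and regex are illustrative assumptions.
REPO_URL_PATTERN = %r{\Ahttps?://.+\.repo\z}

def repos_value_ok?(value)
  # =~ returns nil when the pattern does not match
  !(REPO_URL_PATTERN =~ value).nil?
end

# The value the script reported in the error message:
puts repos_value_ok?("--zookeeper")
# A local tarball path is not a .repo URL either:
puts repos_value_ok?("/opt/serengeti/sbin/hadoop-1.2.0.1.3.2.0-111.tar.gz")
# A made-up URL in the expected shape:
puts repos_value_ok?("http://repo.example.invalid/hdp/hdp.repo")
```

Under these assumptions the first two calls print false and the last prints true, which matches what the error message suggests: --repos expects a .repo URL, not a tarball path or another option name.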

jessehuvmw
Enthusiast

This is a bug in BDE 2.0. You can run this command on the BDE Server to patch it:

sed -i 's|.*include? @options.vendor|["GENERIC", "MAPR", "PHD", "INTEL", "BIGTOP", "CDH"].include? @options.vendor or (@options.vendor == "HDP" and @options.version.to_f >= 2)|' /opt/serengeti/sbin/config-distro.rb
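To make the effect of the patch easier to see, here is a minimal sketch of the condition that the sed command writes into the script, pulled out into a standalone method. The method name and the reading that a true result means "this distro is configured through --repos" are assumptions; the vendor list and the HDP version test are copied from the replacement text in the command above.

```ruby
# Hypothetical standalone version of the patched condition.
# The vendor list and the HDP version test come from the sed replacement text;
# the method name and what a true result means are assumptions.
REPO_VENDORS = ["GENERIC", "MAPR", "PHD", "INTEL", "BIGTOP", "CDH"].freeze

def repos_expected?(vendor, version)
  # String#to_f reads the leading float, so "1.3.2".to_f is 1.3
  REPO_VENDORS.include?(vendor) || (vendor == "HDP" && version.to_f >= 2)
end

puts repos_expected?("HDP", "1.3.2")  # false: HDP 1.x falls outside the condition
puts repos_expected?("HDP", "2.0.6")  # true: HDP 2.x and later
puts repos_expected?("CDH", "4.3.0")  # true: CDH is in the vendor list
```

If that reading is right, HDP 1.3.2 no longer satisfies the condition after the patch, which would explain why the tarball options such as --hadoop and --pig are accepted again for it.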

BTW, when you log in to the BDE Server and open the Serengeti CLI, please log in as user serengeti, not root. Some functions will not work when you are logged in as root.

Cheers, Jesse Hu
TDRoy
Enthusiast

This solved the issue for me as well. I was constantly receiving the error that --repos is required.

Thank you

shashi1978
Contributor

Hi Jesse,

It solved the issue. Thanks a lot.

Shashi
