Hi,
I am getting this error when creating a cluster:
(cluster create --name hdp --distro HDP-1.3.2 --appManager Ambari --networkName Hadoop_NW)
serengeti>appmanager list
NAME DESCRIPTION TYPE URL
----------------------------------------------------------------------
Default Default application manager Default
ambari AmbariServer Ambari http://10.6.55.239:8080
==========================
It seems that the agent on the host is not able to connect to the server. The problem is that the Ambari Server is not located at localhost:8080. How can I change it to the Ambari server's address?
Running setup agent script...
==========================
{'exitstatus': 1, 'log': "Host registration aborted. Ambari Agent host cannot reach Ambari Server 'localhost:8080'. Please check the network connectivity between the Ambari Agent host and the Ambari Server"}
Connection to node1.hadooptest.com closed.
SSH command execution finished
host=node1.hadooptest.com, exitcode=1
ERROR: Bootstrap of host node1.hadooptest.com fails because previous action finished with non-zero exit code (1)
ERROR MESSAGE: tcgetattr: Invalid argument
Connection to node1.hadooptest.com closed.
STDOUT: {'exitstatus': 1, 'log': "Host registration aborted. Ambari Agent host cannot reach Ambari Server 'localhost:8080'. Please check the network connectivity between the Ambari Agent host and the Ambari Server"}
Connection to node1.hadooptest.com closed.
Hi,
It sounds like the FQDN of the Ambari server is not properly set. DNS is a requirement of the setup. Make sure that "hostname -f" returns the correct FQDN on every server in the setup and can be resolved correctly by all other hosts.
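The check above can also be scripted. The following is a minimal Python sketch, not part of BDE or Ambari: the function name fqdn_problems and its resolver parameter are made up for illustration. It flags a name that is empty, a localhost alias, or not fully qualified, and then tries to resolve it.

```python
import socket


def fqdn_problems(name, resolver=socket.gethostbyname):
    """Return a list of problems found with a host name (empty list = looks fine)."""
    problems = []
    lowered = name.strip().lower()
    if not lowered or lowered == "localhost" or lowered.startswith("localhost."):
        problems.append("name is empty or a localhost alias")
    elif "." not in lowered:
        problems.append("name is not fully qualified")
    try:
        address = resolver(lowered)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        problems.append("name does not resolve: %s" % exc)
    else:
        if address.startswith("127."):
            problems.append("name resolves to loopback address %s" % address)
    return problems
```

Calling fqdn_problems(socket.getfqdn()) on each host gives roughly the same information as checking "hostname -f" by hand, and calling it with the Ambari server's name from each node checks resolution in the other direction.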
Cheers
Charlie
Hi,
hostname -f gives:
[root@localhost conf]# hostname -f
ambari.hadooptest.com
and all the hosts are able to ping ambari.hadooptest.com:
ping ambari.hadooptest.com
PING ambari.hadooptest.com (10.6.55.239) 56(84) bytes of data.
64 bytes from ambari.hadooptest.com (10.6.55.239): icmp_seq=1 ttl=64 time=12.9 ms
I am able to create default clusters, but not clusters that use the Ambari app manager.
Mohsin
Hi Mohsin,
That is strange. Are you able to post the serengeti, Ambari server and agent logs?
Cheers
Charlie
Hi Mohsin,
Could you post the serengeti log (/opt/serengeti/log/serengeti.log) and the Ambari server log (/var/log/ambari-server/)? I will take a look at them and try to find the root cause.
Thanks,
-qing
Hi Mohsin,
Could you post the contents of /etc/hosts from the Ambari server too please.
Thanks
Charlie
Hi Charlie,
The following are the entries in /etc/hosts:
[root@ambari ~]# more /etc/hosts
127.0.0.1 localhost
10.6.55.241 node1.hadooptest.com
10.6.55.242 node2.hadooptest.com
10.6.55.243 node3.hadooptest.com
10.6.55.244 node4.hadooptest.com
10.6.55.245 node5.hadooptest.com
10.6.55.246 node6.hadooptest.com
10.6.55.247 node7.hadooptest.com
10.6.55.248 node8.hadooptest.com
10.6.55.249 node9.hadooptest.com
10.6.55.250 node10.hadooptest.com
10.6.55.251 node11.hadooptest.com
10.6.55.252 node12.hadooptest.com
10.6.55.239 ambari.hadooptest.com
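A hosts file like this can be checked mechanically. Below is a minimal Python sketch; the helper names parse_hosts and loopback_names are hypothetical and only for illustration. It parses hosts-file text and lists any name other than localhost that points at a 127.x.x.x address, which is one way a server could end up advertising itself as localhost.

```python
def parse_hosts(text):
    """Map each host name in hosts-file text to its IP address (first entry wins)."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            mapping.setdefault(name.lower(), address)
    return mapping


def loopback_names(mapping):
    """Return non-localhost names that point at a 127.x.x.x address."""
    return sorted(
        name for name, address in mapping.items()
        if address.startswith("127.") and not name.startswith("localhost")
    )
```

For the entries listed above, loopback_names returns an empty list, so this particular check does not point to /etc/hosts as the cause.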
Mohsin
Hi Mohsin,
I have tested it on my BDE server with Ambari.
Steps:
1. Set the hostname to 'localhost' on the Ambari server ('hostname localhost').
2. Restart the Ambari server service ('service ambari-server restart').
3. Create a cluster using this Ambari server. I then got the same error message as you:
{'exitstatus': 1, 'log': "Host registration aborted. Ambari Agent host cannot reach Ambari Server 'localhost:8080'. Please check the network connectivity between the Ambari Agent host and the Ambari Server"}
4. Set the hostname back to the correct value using the command 'hostname FQDN'.
5. Restart the Ambari server service ('service ambari-server restart').
6. The cluster then resumed successfully on the BDE server.
So, could you try steps 4 to 6? Let me know if you have any questions. If it still fails, we will need to ask the Hortonworks engineers in their community.
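Steps 4 and 5 can be wrapped in a small script. This is only a sketch under assumptions: the function names hostname_fix_commands and run_commands are made up, and the script must run as root on the Ambari server. By default it only prints the commands.

```python
import subprocess


def hostname_fix_commands(fqdn):
    """Build the commands for steps 4 and 5: set the hostname, restart Ambari server."""
    if "." not in fqdn or fqdn.lower().startswith("localhost"):
        raise ValueError("expected a fully qualified name, got %r" % fqdn)
    return [
        ["hostname", fqdn],
        ["service", "ambari-server", "restart"],
    ]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.check_call(argv)  # raises CalledProcessError on failure
```

For example, run_commands(hostname_fix_commands("ambari.hadooptest.com")) prints the two commands without running them; pass dry_run=False to execute. Note that 'hostname NAME' only changes the name of the running system, so the hostname also has to be set in the OS configuration if it should survive a reboot.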
Thanks,
-qing
Hi Qing,
Thanks for the solution. It solved the earlier problem, but now I have a new one. The error is "Failed to start ping port listener of:[Errno 98] Address already in use". This is the only address on the LAN ... what could be causing this issue?
Mohsin
The failed nodes: 1
----------------------------------------------------------------------------
[NAME] hdp2-worker-0
[STATUS] VM Ready
[Error Message] ==========================
Copying common functions script...
==========================
scp /usr/lib/python2.6/site-packages/common_functions
host=node5.hadooptest.com, exitcode=0
==========================
Copying OS type check script...
==========================
scp /usr/lib/python2.6/site-packages/ambari_server/os_check_type.py
host=node5.hadooptest.com, exitcode=0
==========================
Running OS type check...
==========================
Cluster primary/cluster OS type is redhat6 and local/current OS type is redhat6
Connection to node5.hadooptest.com closed.
SSH command execution finished
host=node5.hadooptest.com, exitcode=0
==========================
Checking 'sudo' package on remote host...
==========================
sudo-1.8.6p3-12.el6.x86_64
Connection to node5.hadooptest.com closed.
SSH command execution finished
host=node5.hadooptest.com, exitcode=0
==========================
Copying repo file to 'tmp' folder...
==========================
scp /etc/yum.repos.d/ambari.repo
host=node5.hadooptest.com, exitcode=0
==========================
Moving file to repo dir...
==========================
Connection to node5.hadooptest.com closed.
SSH command execution finished
host=node5.hadooptest.com, exitcode=0
==========================
Copying setup script file...
==========================
scp /usr/lib/python2.6/site-packages/ambari_server/setupAgent.py
host=node5.hadooptest.com, exitcode=0
==========================
Running setup agent script...
==========================
Restarting ambari-agent
Verifying Python version compatibility...
Using python /usr/bin/python2.6
Found ambari-agent PID: 1682
Stopping ambari-agent
Removing PID file at /var/run/ambari-agent/ambari-agent.pid
ambari-agent successfully stopped
Verifying Python version compatibility...
Using python /usr/bin/python2.6
Checking for previously running Ambari Agent...
Starting ambari-agent
Verifying ambari-agent process status...
ERROR: ambari-agent start failed
Agent out at: /var/log/ambari-agent/ambari-agent.out
Agent log at: /var/log/ambari-agent/ambari-agent.log
('INFO 2015-04-01 06:19:59,137 HostCheckReportFileHandler.py:109 - Creating host check file at /var/lib/ambari-agent/data/hostcheck.result
INFO 2015-04-01 06:19:59,205 Controller.py:211 - No commands sent from the Server.
INFO 2015-04-01 06:20:09,207 Heartbeat.py:76 - Sending heartbeat with response id: 1 and timestamp: 1427869209207. Command(s) in progress: False. Components mapped: False
INFO 2015-04-01 06:20:09,251 Controller.py:211 - No commands sent from the Server.
INFO 2015-04-01 06:20:19,252 Heartbeat.py:76 - Sending heartbeat with response id: 2 and timestamp: 1427869219252. Command(s) in progress: False. Components mapped: False
INFO 2015-04-01 06:20:19,296 Controller.py:211 - No commands sent from the Server.
INFO 2015-04-01 06:20:29,296 Heartbeat.py:76 - Sending heartbeat with response id: 3 and timestamp: 1427869229296. Command(s) in progress: False. Components mapped: False
INFO 2015-04-01 06:20:29,340 Controller.py:211 - No commands sent from the Server.
INFO 2015-04-01 06:20:39,340 Heartbeat.py:76 - Sending heartbeat with response id: 4 and timestamp: 1427869239340. Command(s) in progress: False. Components mapped: False
INFO 2015-04-01 06:20:39,384 Controller.py:211 - No commands sent from the Server.
INFO 2015-04-01 06:20:49,384 Heartbeat.py:76 - Sending heartbeat with response id: 5 and timestamp: 1427869249384. Command(s) in progress: False. Components mapped: False
INFO 2015-04-01 06:20:49,428 Controller.py:211 - No commands sent from the Server.
INFO 2015-04-01 06:20:59,429 Heartbeat.py:76 - Sending heartbeat with response id: 6 and timestamp: 1427869259429. Command(s) in progress: False. Components mapped: False
INFO 2015-04-01 06:21:05,061 main.py:83 - loglevel=logging.INFO
INFO 2015-04-01 06:21:10,870 main.py:83 - loglevel=logging.INFO
INFO 2015-04-01 06:21:10,871 DataCleaner.py:36 - Data cleanup thread started
INFO 2015-04-01 06:21:10,875 DataCleaner.py:71 - Data cleanup started
INFO 2015-04-01 06:21:10,876 DataCleaner.py:73 - Data cleanup finished
ERROR 2015-04-01 06:21:10,877 PingPortListener.py:44 - Failed to start ping port listener of:[Errno 98] Address already in use
INFO 2015-04-01 06:21:10,877 PingPortListener.py:52 - Ping port listener killed
', None)
Connection to node5.hadooptest.com closed.
SSH command execution finished
host=node5.hadooptest.com, exitcode=255
ERROR: Bootstrap of host node5.hadooptest.com fails because previous action finished with non-zero exit code (255)
ERROR MESSAGE: tcgetattr: Invalid argument
Connection to node5.hadooptest.com closed.
STDOUT: Restarting ambari-agent
Verifying Python version compatibility...
Using python /usr/bin/python2.6
Found ambari-agent PID: 1682
Stopping ambari-agent
Removing PID file at /var/run/ambari-agent/ambari-agent.pid
ambari-agent successfully stopped
Verifying Python version compatibility...
Using python /usr/bin/python2.6
Checking for previously running Ambari Agent...
Starting ambari-agent
Verifying ambari-agent process status...
ERROR: ambari-agent start failed
Agent out at: /var/log/ambari-agent/ambari-agent.out
Agent log at: /var/log/ambari-agent/ambari-agent.log
('INFO 2015-04-01 06:19:59,137 HostCheckReportFileHandler.py:109 - Creating host check file at /var/lib/ambari-agent/data/hostcheck.result
INFO 2015-04-01 06:19:59,205 Controller.py:211 - No commands sent from the Server.
INFO 2015-04-01 06:20:09,207 Heartbeat.py:76 - Sending heartbeat with response id: 1 and timestamp: 1427869209207. Command(s) in progress: False. Components mapped: False
INFO 2015-04-01 06:20:09,251 Controller.py:211 - No commands sent from the Server.
INFO 2015-04-01 06:20:19,252 Heartbeat.py:76 - Sending heartbeat with response id: 2 and timestamp: 1427869219252. Command(s) in progress: False. Components mapped: False
INFO 2015-04-01 06:20:19,296 Controller.py:211 - No commands sent from the Server.
INFO 2015-04-01 06:20:29,296 Heartbeat.py:76 - Sending heartbeat with response id: 3 and timestamp: 1427869229296. Command(s) in progress: False. Components mapped: False
INFO 2015-04-01 06:20:29,340 Controller.py:211 - No commands sent from the Server.
INFO 2015-04-01 06:20:39,340 Heartbeat.py:76 - Sending heartbeat with response id: 4 and timestamp: 1427869239340. Command(s) in progress: False. Components mapped: False
INFO 2015-04-01 06:20:39,384 Controller.py:211 - No commands sent from the Server.
INFO 2015-04-01 06:20:49,384 Heartbeat.py:76 - Sending heartbeat with response id: 5 and timestamp: 1427869249384. Command(s) in progress: False. Components mapped: False
INFO 2015-04-01 06:20:49,428 Controller.py:211 - No commands sent from the Server.
INFO 2015-04-01 06:20:59,429 Heartbeat.py:76 - Sending heartbeat with response id: 6 and timestamp: 1427869259429. Command(s) in progress: False. Components mapped: False
INFO 2015-04-01 06:21:05,061 main.py:83 - loglevel=logging.INFO
INFO 2015-04-01 06:21:10,870 main.py:83 - loglevel=logging.INFO
INFO 2015-04-01 06:21:10,871 DataCleaner.py:36 - Data cleanup thread started
INFO 2015-04-01 06:21:10,875 DataCleaner.py:71 - Data cleanup started
INFO 2015-04-01 06:21:10,876 DataCleaner.py:73 - Data cleanup finished
ERROR 2015-04-01 06:21:10,877 PingPortListener.py:44 - Failed to start ping port listener of:[Errno 98] Address already in use
INFO 2015-04-01 06:21:10,877 PingPortListener.py:52 - Ping port listener killed
', None)
Connection to node5.hadooptest.com closed.
----------------------------------------------------------------------------
cluster hdp2 resume failed: Task execution failed: An exception happens when App_Manager (Ambari) creates the cluster: (hdp2). Creation fails..
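A note on the error in the log above: on Linux, Errno 98 is EADDRINUSE, which means a local TCP port is already bound by another process. It does not refer to an IP address conflict on the LAN. The following Python sketch shows one way to test whether a port is taken on a node; the function name port_in_use is made up for illustration, and the port number to test is an assumption (8670 is, as far as I know, the default ping_port in ambari-agent.ini, so check your own configuration).

```python
import errno
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if binding a TCP socket to (host, port) fails with EADDRINUSE."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise  # some other problem, e.g. permission denied
    else:
        return False
    finally:
        probe.close()
```

If port_in_use(8670) returns True on a node while ambari-agent is supposed to be stopped, some other process, possibly a leftover agent, is still holding the port.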
Hi Mohsin,
You need to kill the Ambari processes that are still running.
Log in to each Hadoop node and run the following commands:
ps -ef | grep ambari
kill -9 <process_id>
Then run cluster resume on the BDE server.
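With many nodes, the two commands above can be scripted. Here is a minimal Python sketch to run on each node; the function names ambari_pids and kill_ambari_processes are hypothetical. It picks the PIDs out of 'ps -ef' output, skips the grep process and itself, and by default only prints what it would kill.

```python
import os
import signal
import subprocess


def ambari_pids(ps_output):
    """Extract PIDs of Ambari processes from 'ps -ef' output, skipping grep."""
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # header line or something unexpected
        command = fields[7]
        if "ambari" in command and not command.startswith("grep"):
            pids.append(int(fields[1]))
    return pids


def kill_ambari_processes(dry_run=True):
    """Print a 'kill -9' line per Ambari PID; send SIGKILL only if dry_run is False."""
    output = subprocess.check_output(["ps", "-ef"], universal_newlines=True)
    for pid in ambari_pids(output):
        if pid == os.getpid():
            continue  # never kill this script itself
        print("kill -9 %d" % pid)
        if not dry_run:
            os.kill(pid, signal.SIGKILL)  # same signal as 'kill -9'
```

Run kill_ambari_processes() first to review the list, then kill_ambari_processes(dry_run=False) as root.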
Thanks,
-qing