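Background: our project uses a CDH big-data cluster, but my own machine is a Mac with an Apple M-series chip. There is very little material online about setting up a CDH cluster on the ARM architecture, so I tried it myself and wrote up the steps.
1. Start Docker
I have Docker Desktop installed.
2. Search for CDH images, then pull the one you want
Search command: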
docker search cloudera
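Pull command:
Note: when choosing which image to download, prefer the one with the most stars (the STARS column in the search results). Stars act as a rating, and a highly rated image is generally less likely to cause problems.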
docker pull cloudera/quickstart
After the pull succeeds, the following command lists the downloaded image:
docker images |grep cloud
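The startup log later in this post warns that the image's platform (linux/amd64) does not match the host (linux/arm64/v8), which suggests the image runs under emulation on an M-series Mac. A minimal sketch of requesting the platform explicitly, assuming your Docker version supports the `--platform` flag on `docker pull`; `needs_emulation` is a hypothetical helper name, and the commands are only printed, not executed:

```shell
# Return success (0) when the given machine type is ARM-based, meaning an
# amd64 image such as cloudera/quickstart would not match the host platform.
needs_emulation() {
  case "$1" in
    arm64|aarch64) return 0 ;;
    *) return 1 ;;
  esac
}

if needs_emulation "$(uname -m)"; then
  # Printed rather than executed, so this sketch has no side effects.
  echo "docker pull --platform linux/amd64 cloudera/quickstart"
else
  echo "docker pull cloudera/quickstart"
fi
```

Naming the platform should not make the emulated image any faster; it only makes the choice explicit.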
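3. Start the container directly
The start command is as follows:
Notes: --hostname is the hostname you want to give the CDH container.
-v /etc/localtime:/etc/localtime:ro mainly addresses the container's clock not matching the host's. It seems to work with or without this option, so feel free to try both.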
docker run \
--hostname=wzx.cloudera \
--privileged=true -t -i \
-v /etc/localtime:/etc/localtime:ro \
-p 8888:8888 -p 10000:10000 -p 10020:10020 -p 11000:11000 -p 18080:18080 \
-p 18081:18081 -p 18088:18088 -p 19888:19888 -p 21000:21000 -p 21050:21050 \
-p 2181:2181 -p 25000:25000 -p 25010:25010 -p 25020:25020 -p 50010:50010 \
-p 50030:50030 -p 50060:50060 -p 50070:50070 -p 50075:50075 -p 50090:50090 \
-p 60000:60000 -p 60010:60010 -p 60020:60020 -p 60030:60030 -p 7180:7180 -p 7183:7183 \
-p 7187:7187 -p 80:80 -p 8020:8020 -p 8032:8032 -p 8042:8042 -p 8088:8088 -p 8983:8983 -p 9083:9083 \
cloudera/quickstart /usr/bin/docker-quickstart
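A long hand-typed list of -p options is easy to mistype. A minimal sketch of generating the flags from a list of ports instead, so that the host and container ports always match; `port_flags` is a hypothetical helper name, the port list is only a subset of the one above, and the final command is printed rather than executed:

```shell
# Print "-p PORT:PORT" for every port given as an argument, separated by
# single spaces, so host and container ports are always identical.
port_flags() {
  flags=""
  for port in "$@"; do
    flags="${flags:+$flags }-p ${port}:${port}"
  done
  printf '%s\n' "$flags"
}

# A few of the ports from the command above; extend the list as needed.
echo "docker run $(port_flags 7180 8888 8020 8042 8088 50070) cloudera/quickstart /usr/bin/docker-quickstart"
```

Once the printed command looks right, it can be pasted into the terminal together with the other options (--hostname, --privileged and so on).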
The startup log is below. Hue failed to start; everything else started successfully.
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
Starting mysqld: [ OK ]
if [ "$1" == "start" ] ; then
if [ "${EC2}" == 'true' ]; then
FIRST_BOOT_FLAG=/var/lib/cloudera-quickstart/.ec2-key-installed
if [ ! -f "${FIRST_BOOT_FLAG}" ]; then
METADATA_API=http://169.254.169.254/latest/meta-data
KEY_URL=${METADATA_API}/public-keys/0/openssh-key
SSH_DIR=/home/cloudera/.ssh
mkdir -p ${SSH_DIR}
chown cloudera:cloudera ${SSH_DIR}
curl ${KEY_URL} >> ${SSH_DIR}/authorized_keys
touch ${FIRST_BOOT_FLAG}
fi
fi
if [ "${DOCKER}" != 'true' ]; then
if [ -f /sys/kernel/mm/redhat_transparent_hugepage/defrag ]; then
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
fi
cloudera-quickstart-ip
HOSTNAME=quickstart.cloudera
hostname ${HOSTNAME}
sed -i -e "s/HOSTNAME=.*/HOSTNAME=${HOSTNAME}/" /etc/sysconfig/network
fi
(cd /var/lib/cloudera-quickstart/tutorial;nohup python -m SimpleHTTPServer 80 &)
# TODO: check for expired CM license and update config.js accordingly
fi
+ '[' start == start ']'
+ '[' '' == true ']'
+ '[' true '!=' true ']'
+ cd /var/lib/cloudera-quickstart/tutorial
+ nohup python -m SimpleHTTPServer 80
nohup: appending output to `nohup.out'
JMX enabled by default
Using config: /etc/zookeeper/conf/zoo.cfg
Starting zookeeper ... STARTED
starting datanode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-datanode-wzx.cloudera.out
Started Hadoop datanode (hadoop-hdfs-datanode): [ OK ]
starting journalnode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-journalnode-wzx.cloudera.out
Started Hadoop journalnode: [ OK ]
starting namenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-namenode-wzx.cloudera.out
Started Hadoop namenode: [ OK ]
starting secondarynamenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-secondarynamenode-wzx.cloudera.out
Started Hadoop secondarynamenode: [ OK ]
Setting HTTPFS_HOME: /usr/lib/hadoop-httpfs
Using HTTPFS_CONFIG: /etc/hadoop-httpfs/conf
Sourcing: /etc/hadoop-httpfs/conf/httpfs-env.sh
Using HTTPFS_LOG: /var/log/hadoop-httpfs/
Using HTTPFS_TEMP: /var/run/hadoop-httpfs
Setting HTTPFS_HTTP_PORT: 14000
Setting HTTPFS_ADMIN_PORT: 14001
Setting HTTPFS_HTTP_HOSTNAME: wzx.cloudera
Setting HTTPFS_SSL_ENABLED: false
Setting HTTPFS_SSL_KEYSTORE_FILE: /var/lib/hadoop-httpfs/.keystore
Setting HTTPFS_SSL_KEYSTORE_PASS: password
Using CATALINA_BASE: /var/lib/hadoop-httpfs/tomcat-deployment
Using HTTPFS_CATALINA_HOME: /usr/lib/bigtop-tomcat
Setting CATALINA_OUT: /var/log/hadoop-httpfs//httpfs-catalina.out
Using CATALINA_PID: /var/run/hadoop-httpfs/hadoop-httpfs-httpfs.pid
Using CATALINA_OPTS:
Adding to CATALINA_OPTS: -Dhttpfs.home.dir=/usr/lib/hadoop-httpfs -Dhttpfs.config.dir=/etc/hadoop-httpfs/conf -Dhttpfs.log.dir=/var/log/hadoop-httpfs/ -Dhttpfs.temp.dir=/var/run/hadoop-httpfs -Dhttpfs.admin.port=14001 -Dhttpfs.http.port=14000 -Dhttpfs.http.hostname=wzx.cloudera
Using CATALINA_BASE: /var/lib/hadoop-httpfs/tomcat-deployment
Using CATALINA_HOME: /usr/lib/bigtop-tomcat
Using CATALINA_TMPDIR: /var/run/hadoop-httpfs
Using JRE_HOME: /usr/java/jdk1.7.0_67-cloudera
Using CLASSPATH: /usr/lib/bigtop-tomcat/bin/bootstrap.jar
Using CATALINA_PID: /var/run/hadoop-httpfs/hadoop-httpfs-httpfs.pid
Started Hadoop httpfs (hadoop-httpfs): [ OK ]
starting historyserver, logging to /var/log/hadoop-mapreduce/mapred-mapred-historyserver-wzx.cloudera.out
Started Hadoop historyserver: [ OK ]
starting nodemanager, logging to /var/log/hadoop-yarn/yarn-yarn-nodemanager-wzx.cloudera.out
Started Hadoop nodemanager: [ OK ]
starting resourcemanager, logging to /var/log/hadoop-yarn/yarn-yarn-resourcemanager-wzx.cloudera.out
Started Hadoop resourcemanager: [ OK ]
starting master, logging to /var/log/hbase/hbase-hbase-master-wzx.cloudera.out
Started HBase master daemon (hbase-master): [ OK ]
starting rest, logging to /var/log/hbase/hbase-hbase-rest-wzx.cloudera.out
Started HBase rest daemon (hbase-rest): [ OK ]
starting thrift, logging to /var/log/hbase/hbase-hbase-thrift-wzx.cloudera.out
Started HBase thrift daemon (hbase-thrift): [ OK ]
Starting Hive Metastore (hive-metastore): [ OK ]
Started Hive Server2 (hive-server2): [ OK ]
Starting Sqoop Server: [ OK ]
Sqoop home directory: /usr/lib/sqoop2
Setting SQOOP_HTTP_PORT: 12000
Setting SQOOP_ADMIN_PORT: 12001
Using CATALINA_OPTS: -Xmx1024m
Adding to CATALINA_OPTS: -Dsqoop.http.port=12000 -Dsqoop.admin.port=12001
Using CATALINA_BASE: /var/lib/sqoop2/tomcat-deployment
Using CATALINA_HOME: /usr/lib/bigtop-tomcat
Using CATALINA_TMPDIR: /var/tmp/sqoop2
Using JRE_HOME: /usr/java/jdk1.7.0_67-cloudera
Using CLASSPATH: /usr/lib/bigtop-tomcat/bin/bootstrap.jar
Using CATALINA_PID: /var/run/sqoop2/sqoop-server-sqoop2.pid
Starting Spark history-server (spark-history-server): [ OK ]
Starting Hadoop HBase regionserver daemon: starting regionserver, logging to /var/log/hbase/hbase-hbase-regionserver-wzx.cloudera.out
hbase-regionserver.
Starting hue: [FAILED]
Started Impala State Store Server (statestored): [ OK ]
Setting OOZIE_HOME: /usr/lib/oozie
Sourcing: /usr/lib/oozie/bin/oozie-env.sh
setting JAVA_LIBRARY_PATH="$JAVA_LIBRARY_PATH:/usr/lib/hadoop/lib/native"
setting OOZIE_DATA=/var/lib/oozie
setting OOZIE_CATALINA_HOME=/usr/lib/bigtop-tomcat
setting CATALINA_TMPDIR=/var/lib/oozie
setting CATALINA_PID=/var/run/oozie/oozie.pid
setting CATALINA_BASE=/var/lib/oozie/tomcat-deployment
setting OOZIE_HTTPS_PORT=11443
setting OOZIE_HTTPS_KEYSTORE_PASS=password
setting CATALINA_OPTS="$CATALINA_OPTS -Doozie.https.port=${OOZIE_HTTPS_PORT}"
setting CATALINA_OPTS="$CATALINA_OPTS -Doozie.https.keystore.pass=${OOZIE_HTTPS_KEYSTORE_PASS}"
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
setting OOZIE_CONFIG=/etc/oozie/conf
setting OOZIE_LOG=/var/log/oozie
Using OOZIE_CONFIG: /etc/oozie/conf
Sourcing: /etc/oozie/conf/oozie-env.sh
setting JAVA_LIBRARY_PATH="$JAVA_LIBRARY_PATH:/usr/lib/hadoop/lib/native"
setting OOZIE_DATA=/var/lib/oozie
setting OOZIE_CATALINA_HOME=/usr/lib/bigtop-tomcat
setting CATALINA_TMPDIR=/var/lib/oozie
setting CATALINA_PID=/var/run/oozie/oozie.pid
setting CATALINA_BASE=/var/lib/oozie/tomcat-deployment
setting OOZIE_HTTPS_PORT=11443
setting OOZIE_HTTPS_KEYSTORE_PASS=password
setting CATALINA_OPTS="$CATALINA_OPTS -Doozie.https.port=${OOZIE_HTTPS_PORT}"
setting CATALINA_OPTS="$CATALINA_OPTS -Doozie.https.keystore.pass=${OOZIE_HTTPS_KEYSTORE_PASS}"
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
setting OOZIE_CONFIG=/etc/oozie/conf
setting OOZIE_LOG=/var/log/oozie
Setting OOZIE_CONFIG_FILE: oozie-site.xml
Using OOZIE_DATA: /var/lib/oozie
Using OOZIE_LOG: /var/log/oozie
Setting OOZIE_LOG4J_FILE: oozie-log4j.properties
Setting OOZIE_LOG4J_RELOAD: 10
Setting OOZIE_HTTP_HOSTNAME: wzx.cloudera
Setting OOZIE_HTTP_PORT: 11000
Setting OOZIE_ADMIN_PORT: 11001
Using OOZIE_HTTPS_PORT: 11443
Setting OOZIE_BASE_URL: http://wzx.cloudera:11000/oozie
Using CATALINA_BASE: /var/lib/oozie/tomcat-deployment
Setting OOZIE_HTTPS_KEYSTORE_FILE: /var/lib/oozie/.keystore
Using OOZIE_HTTPS_KEYSTORE_PASS: password
Setting OOZIE_INSTANCE_ID: wzx.cloudera
Setting CATALINA_OUT: /var/log/oozie/catalina.out
Using CATALINA_PID: /var/run/oozie/oozie.pid
Using CATALINA_OPTS: -Doozie.https.port=11443 -Doozie.https.keystore.pass=password -Xmx1024m -Doozie.https.port=11443 -Doozie.https.keystore.pass=password -Xmx1024m -Dderby.stream.error.file=/var/log/oozie/derby.log
Adding to CATALINA_OPTS: -Doozie.home.dir=/usr/lib/oozie -Doozie.config.dir=/etc/oozie/conf -Doozie.log.dir=/var/log/oozie -Doozie.data.dir=/var/lib/oozie -Doozie.instance.id=wzx.cloudera -Doozie.config.file=oozie-site.xml -Doozie.log4j.file=oozie-log4j.properties -Doozie.log4j.reload=10 -Doozie.http.hostname=wzx.cloudera -Doozie.admin.port=11001 -Doozie.http.port=11000 -Doozie.https.port=11443 -Doozie.base.url=http://wzx.cloudera:11000/oozie -Doozie.https.keystore.file=/var/lib/oozie/.keystore -Doozie.https.keystore.pass=password -Djava.library.path=:/usr/lib/hadoop/lib/native:/usr/lib/hadoop/lib/native
Using CATALINA_BASE: /var/lib/oozie/tomcat-deployment
Using CATALINA_HOME: /usr/lib/bigtop-tomcat
Using CATALINA_TMPDIR: /var/lib/oozie
Using JRE_HOME: /usr/java/jdk1.7.0_67-cloudera
Using CLASSPATH: /usr/lib/bigtop-tomcat/bin/bootstrap.jar
Using CATALINA_PID: /var/run/oozie/oozie.pid
Starting Solr server daemon: [ OK ]
Using CATALINA_BASE: /var/lib/solr/tomcat-deployment
Using CATALINA_HOME: /usr/lib/solr/../bigtop-tomcat
Using CATALINA_TMPDIR: /var/lib/solr/
Using JRE_HOME: /usr/java/jdk1.7.0_67-cloudera
Using CLASSPATH: /usr/lib/solr/../bigtop-tomcat/bin/bootstrap.jar
Using CATALINA_PID: /var/run/solr/solr.pid
Started Impala Catalog Server (catalogd) : [ OK ]
Started Impala Server (impalad): [ OK ]
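With a log this long it is easy to miss a failure like the Hue one. A minimal sketch for picking out the failed services, assuming the relevant lines follow the "Starting <name>: [FAILED]" shape seen above; `failed_services` is a hypothetical helper name, not something shipped in the image:

```shell
# Read a startup log on stdin and print the name of each service whose
# init script reported [FAILED].
failed_services() {
  grep -F '[FAILED]' | sed 's/^Starting \(.*\): *\[FAILED\].*$/\1/'
}

# Demo on two lines shaped like the log above.
printf '%s\n' 'Starting hue: [FAILED]' 'Started Hadoop namenode: [ OK ]' | failed_services
```

If the terminal output was not saved, piping `docker logs <container-id> 2>&1` into the function should work the same way, assuming Docker still has the container's output.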
Running jps shows that the following processes are all up:
[root@wzx /]# jps
1032 NameNode
7346 Jps
3287 RunJar
1875 NodeManager
2655 RESTServer
7108 -- process information unavailable
2108 ResourceManager
3007 ThriftServer
563 QuorumPeerMain
1207 SecondaryNameNode
3651 RunJar
6886 Bootstrap
657 DataNode
844 JournalNode
1554 Bootstrap
6971
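Comparing a jps listing by eye gets tedious when it has to be repeated after every restart. A minimal sketch that reports which expected daemons are absent, assuming the "PID Name" two-column layout shown above; `missing_daemons` and the list of expected names are illustrative choices, not an official checklist:

```shell
# Read `jps` output on stdin and print every expected daemon name (given as
# arguments) that does not appear in the second column.
missing_daemons() {
  jps_output=$(cat)
  for name in "$@"; do
    printf '%s\n' "$jps_output" |
      awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' ||
      printf '%s\n' "$name"
  done
}

# Demo with a shortened copy of the listing above.
printf '%s\n' '1032 NameNode' '657 DataNode' '563 QuorumPeerMain' |
  missing_daemons NameNode DataNode ResourceManager
```

Inside the container the same function could be fed live data with `jps | missing_daemons NameNode DataNode ResourceManager NodeManager`.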
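I ran into a problem here: although jps showed the processes, the CDH web UI did not work properly once I opened it. So it is best to look up the container ID first;
restarting the container also helps.
docker ps    # shows the container ID
docker exec -it 3dc308a6f3eb bash    # 3dc308a6f3eb is the container ID found with the command above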
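Copying the container ID by hand can be skipped. A minimal sketch that extracts it from `docker ps` output, assuming the default column layout (ID first, image second) and that the IMAGE column shows exactly cloudera/quickstart; `container_ids_for_image` is a hypothetical helper name, and the demo runs on sample text rather than a live Docker call:

```shell
# Read `docker ps` output on stdin and print the ID of each container whose
# IMAGE column equals the given image name. The header row is skipped.
container_ids_for_image() {
  awk -v image="$1" 'NR > 1 && $2 == image { print $1 }'
}

# Demo on text shaped like `docker ps` output (not a live call).
printf '%s\n' \
  'CONTAINER ID   IMAGE                 COMMAND' \
  '3dc308a6f3eb   cloudera/quickstart   /usr/bin/docker-quickstart' |
  container_ids_for_image cloudera/quickstart
```

With a running container, `docker ps | container_ids_for_image cloudera/quickstart` should print the ID, which can then be passed to `docker exec -it <id> bash` or `docker restart <id>`.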
4. Start the Cloudera Manager admin UI
Command:
sudo /home/cloudera/cloudera-manager --express --force
The startup output is as follows:
[root@wzx /]# sudo /home/cloudera/cloudera-manager --express --force
[QuickStart] Shutting down CDH services via init scripts...
kafka-server: unrecognized service
JMX enabled by default
Using config: /etc/zookeeper/conf/zoo.cfg
[QuickStart] Disabling CDH services on boot...
error reading information on service kafka-server: No such file or directory
[QuickStart] Starting Cloudera Manager server...
[QuickStart] Waiting for Cloudera Manager API...
[QuickStart] Starting Cloudera Manager agent...
[QuickStart] Configuring deployment...
Submitted jobs: 14
[QuickStart] Deploying client configuration...
Submitted jobs: 15
[QuickStart] Starting Cloudera Management Service...
Submitted jobs: 23
[QuickStart] Enabling Cloudera Manager daemons on boot...
________________________________________________________________________________
Success! You can now log into Cloudera Manager from the QuickStart VM's browser:
http://quickstart.cloudera:7180
Username: cloudera
Password: cloudera
Notes: http://quickstart.cloudera:7180 is the address of the admin UI.
Username: cloudera is the username.
Password: cloudera is the password.
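The log above includes a "Waiting for Cloudera Manager API..." step, so the UI may not answer immediately. A minimal sketch of a generic retry loop that could poll the address until it responds; `wait_until` is a hypothetical helper name, and the attempt count and delay in the usage line are arbitrary assumptions:

```shell
# Run a command repeatedly until it succeeds.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (commented out so the sketch makes no network calls):
# wait_until 30 10 curl -fsS -o /dev/null http://127.0.0.1:7180
```

The commented example would try the URL up to 30 times, 10 seconds apart, and stop at the first successful HTTP response.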
To make access from the browser simpler, add a mapping to the hosts file:
sudo vi /etc/hosts
Add the following line:
127.0.0.1 quickstart.cloudera
After that you can reach the CDH admin UI at 127.0.0.1:7180 (or, with the mapping above, at quickstart.cloudera:7180).
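Editing the hosts file by hand works, but repeating the step can leave duplicate lines. A minimal sketch of an idempotent alternative, assuming the usual "IP name [aliases...]" hosts format; `add_host_entry` is a hypothetical helper name, and the demo writes to a temporary file rather than the real /etc/hosts:

```shell
# Append "IP NAME" to a hosts-style file unless a non-comment line already
# lists NAME as a hostname or alias.
add_host_entry() {
  hosts_file=$1
  ip=$2
  name=$3
  if awk -v n="$name" '
      !/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
      END { exit !found }' "$hosts_file"; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Demo on a throwaway file; the second call changes nothing.
demo_hosts=$(mktemp)
printf '127.0.0.1 localhost\n' > "$demo_hosts"
add_host_entry "$demo_hosts" 127.0.0.1 quickstart.cloudera
add_host_entry "$demo_hosts" 127.0.0.1 quickstart.cloudera
cat "$demo_hosts"
rm -f "$demo_hosts"
```

Pointing the function at the real /etc/hosts requires root privileges, just like the sudo vi step above.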
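One problem, though: when I ran jps at this point, the other processes were no longer listed. (The log above shows the script shutting down the CDH services that had been started via init scripts.)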
[root@wzx /]# jps
10754 Jps
563 -- process information unavailable
10096 Main
7108 -- process information unavailable
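Then enter the username and password. After logging in, the page looks like the screenshot below. Pretty, isn't it? 😄
Look closely at the two warnings in the screenshot; the fix is described below.
Click Hosts and take a look: the Clock check is down, which means the NTP service has not started.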