hadoop 2.7.1 + hive 2.1.1 + spark 2.0.1


Install Hadoop by following the usual tutorial; nothing special is required there.

Hive installation:

1: Download the package from http://mirrors.hust.edu.cn/apache/ (version: apache-hive-2.1.1-bin.tar.gz).

2: tar -zxvf apache-hive-2.1.1-bin.tar.gz -C /usr/local/ && mv /usr/local/apache-hive-2.1.1-bin /usr/local/hive (so that /usr/local/hive is the root of the unpacked tree, matching $HIVE_HOME below).

3: vim /etc/profile.d/java.sh, add the following, then source it:

export JAVA_HOME=/usr/local/jdk1.7.0_79

export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

export HADOOP_HOME=/usr/local/hadoop

export HIVE_HOME=/usr/local/hive

export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin
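To apply the new variables in the current shell and confirm they resolve (a minimal check; hive --version only works once the package from step 2 is in place):

source /etc/profile.d/java.sh
echo $HIVE_HOME      # should print /usr/local/hive
hive --version       # should report Hive 2.1.1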

4: In $HIVE_HOME/conf/, copy and rename the template with cp hive-log4j2.properties.template hive-log4j2.properties, then set property.hive.log.dir = /usr/local/hive/logs/ in the copy.
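One possible way to do both the copy and the edit in one go (the sed expression is just a convenience; editing the file by hand works equally well):

cd $HIVE_HOME/conf
cp hive-log4j2.properties.template hive-log4j2.properties
sed -i 's|^property.hive.log.dir.*|property.hive.log.dir = /usr/local/hive/logs/|' hive-log4j2.properties
mkdir -p /usr/local/hive/logs    # make sure the log directory exists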

5: The Hadoop cluster must be started first; then run schematool -dbType derby -initSchema to initialize the metastore.

6: Install MySQL 5.6.

7: Put the MySQL JDBC driver into $HIVE_HOME/lib; this setup uses mysql-connector-java-5.1.31-bin.jar.
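Assuming the driver jar was downloaded to /usr/local/src (the source path here is an assumption; adjust it to wherever the jar was actually saved):

cp /usr/local/src/mysql-connector-java-5.1.31-bin.jar $HIVE_HOME/lib/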

8: In $HIVE_HOME/conf/, run cp hive-default.xml.template hive-site.xml.

9: Overwrite hive-site.xml (> hive-site.xml) with the following content:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>hdfs://192.168.0.70:9000/user/hive/warehouse</value>
    </property>
    <property>
        <name>datanucleus.fixedDatastore</name>
        <value>false</value>
    </property>
    <property>
        <name>datanucleus.autoCreateSchema</name>
        <value>true</value>
    </property>
    <property>
        <name>datanucleus.autoCreateTables</name>
        <value>true</value>
    </property>
    <property>
        <name>datanucleus.autoCreateColumns</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://192.168.0.70:9083</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://192.168.0.70:3306/hive?createDatabaseIfNotExist=true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
    </property>
</configuration>
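If xmllint happens to be installed (an assumption, it is not part of Hive), it can catch XML typos in the edited file before Hive complains about them:

xmllint --noout $HIVE_HOME/conf/hive-site.xml    # silent output means the file is well-formed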

10: cp hive-env.sh.template hive-env.sh, then append:

HADOOP_HOME=/usr/local/hadoop

export HIVE_CONF_DIR=/usr/local/hive/conf

export HIVE_AUX_JARS_PATH=/usr/local/hive/lib

11:

$HADOOP_HOME/bin/hadoop fs -mkdir -p /user/hive/warehouse

$HADOOP_HOME/bin/hadoop fs -mkdir -p /tmp/hive/

hadoop fs -chmod 777 /user/hive/warehouse

hadoop fs -chmod 777 /tmp/hive

Synchronize the jline version between Hive and Hadoop:

cp /usr/local/hive/lib/jline-2.12.jar /usr/local/hadoop/share/hadoop/yarn/lib 
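A quick check that the directories and the copied jar from this step are in place (plain listings, nothing assumed beyond the paths used above):

hadoop fs -ls -d /user/hive/warehouse /tmp/hive    # both should show rwxrwxrwx
ls /usr/local/hadoop/share/hadoop/yarn/lib/jline-2.12.jar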

12:mysql -uroot -p123456

grant all privileges on *.* to root@'%' identified by '123456';

flush privileges;
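Optionally verify that the grant took effect and that the JDBC URL from hive-site.xml is reachable with these credentials:

mysql -uroot -p123456 -e "SELECT host, user FROM mysql.user WHERE user='root';"
mysql -h 192.168.0.70 -P 3306 -uroot -p123456 -e "SELECT VERSION();"    # same host/port/credentials as hive-site.xml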

13: Run schematool -dbType mysql -initSchema to initialize the metastore schema in MySQL.

14: Start the Hive metastore service: nohup /usr/local/hive/bin/hive --service metastore &> metastore.log &

       Then run hive; the CLI should connect successfully.
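To confirm the metastore service is really up, look for its process and for the Thrift port configured in hive.metastore.uris (jps and netstat are assumed to be available on the host):

jps                          # the metastore appears as a RunJar process
netstat -tlnp | grep 9083    # the Thrift listener from hive-site.xml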

15: Log in to MySQL; the hive database is now visible.

16: Test from the hive CLI:

(1) Create a database:

create database db_hive_test;

(2) Create a test table:

use db_hive_test;

create table student(id int,name string) row format delimited fields terminated by '\t';

(3) Create a student.txt file and put some data in it (id and name separated by a Tab):

vi student.txt

1001    zhangsan

1002    lisi

1003    wangwu

1004    zhaoli

(4) load data local inpath '/home/hadoop/student.txt' into table db_hive_test.student;

(5)select * from student;

(6)desc formatted student;

(7) View it through the HDFS web UI: http://192.168.0.70:50070/explorer.html#/user/hive/warehouse/db_hive_test.db

(8) Check the created table via MySQL:

        use hive;

       select * from TBLS;   (the newly created student table appears here)
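For a slightly more readable view of the same metastore table (the column names follow the standard Hive metastore schema and may differ between Hive versions):

mysql -uroot -p123456 -e "SELECT TBL_ID, DB_ID, TBL_NAME, TBL_TYPE FROM hive.TBLS;"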

Spark installation

1: Install Scala

    Download from: http://www.scala-lang.org/download/2.11.7.html (scala-2.11.7.tgz)

   tar zxvf /usr/local/src/scala-2.11.7.tgz -C /usr/local/ && mv /usr/local/scala-2.11.7 /usr/local/scala

2: vim /etc/profile.d/java.sh

  Add: export SCALA_HOME=/usr/local/scala

           export PATH=$PATH:$SCALA_HOME/bin

3:source /etc/profile

4:scala -version

5: Download Spark from http://mirrors.hust.edu.cn/apache/spark/

6: tar zxvf /usr/local/src/spark-2.0.1-bin-hadoop2.7.tgz -C /usr/local/ && mv /usr/local/spark-2.0.1-bin-hadoop2.7 /usr/local/spark

7: vim /etc/profile.d/java.sh and append the following:

  export SPARK_HOME=/usr/local/spark

  export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin

8:source /etc/profile.d/java.sh

9:spark-shell --version

10:run-example org.apache.spark.examples.SparkPi 10
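If the local example runs correctly, its output contains the computed approximation, which can be filtered out of the log noise like this (the exact digits vary from run to run):

run-example org.apache.spark.examples.SparkPi 10 2>&1 | grep "Pi is roughly"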

11:cd /usr/local/spark/conf/ 

cp spark-env.sh.template spark-env.sh 

vi spark-env.sh and append the following:

export SCALA_HOME=/usr/local/scala

export JAVA_HOME=/usr/local/jdk1.7.0_79

export SPARK_MASTER_IP=192.168.0.70

export SPARK_WORKER_MEMORY=1024m

12:$SPARK_HOME/sbin/start-all.sh
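A quick sanity check that the standalone daemons came up (the process names are the ones used by Spark's standalone scripts):

jps | grep -E 'Master|Worker'    # Master on this node, Worker(s) on the slave nodes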

13: Submit a job to the Spark standalone cluster:

spark-submit --master spark://192.168.0.70:7077 --class org.apache.spark.examples.SparkPi --name Spark-Pi \
/usr/local/spark/examples/jars/spark-examples_2.11-2.0.1.jar

14: Check the Spark master web UI: http://master:8080/

15: To use Spark together with Hadoop, start the Hadoop cluster and the Spark cluster separately:

$HADOOP_HOME/sbin/start-all.sh

$SPARK_HOME/sbin/start-all.sh

16: To run Spark jobs on YARN, edit spark-env.sh:

vim /usr/local/spark/conf/spark-env.sh

# append the following

export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop

17: Submit a Spark job to YARN:

spark-submit --master yarn-cluster --class org.apache.spark.examples.SparkLR --name SparkLR /usr/local/spark/examples/jars/spark-examples_2.11-2.0.1.jar
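In yarn-cluster mode the driver runs inside YARN, so the job's output ends up in the container logs rather than in the submitting terminal; one way to fetch them afterwards (the application id is printed by spark-submit and also shown in the ResourceManager UI):

yarn logs -applicationId <application_id>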

18: Check the YARN ResourceManager web UI: http://master:8088/

19: Combined with HDFS: use a file in HDFS as the Spark job's input.

spark-submit --master yarn-cluster --class org.apache.spark.examples.JavaWordCount --name JavaWordCount  /usr/local/spark/examples/jars/spark-examples_2.11-2.0.1.jar  hdfs://master:9000/tmp/
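JavaWordCount needs something to read under hdfs://master:9000/tmp/, so upload a text file there first; reusing the student.txt from the Hive test is just one option, any text file will do:

hadoop fs -mkdir -p /tmp
hadoop fs -put /home/hadoop/student.txt /tmp/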
