@zhangyy 2018-04-12T14:00:44.000000Z

Hue collaboration framework



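  • 1: Introduction to Hue
  • 2: Installing and configuring Hue (environment preparation)
  • 3: Installing Hue
  • 4: Integrating Hue with other frameworks
  • 5: Restarting Hue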

1: Introduction to Hue

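Hue is an open-source web UI for Apache Hadoop. It evolved from Cloudera Desktop, which Cloudera later released to the Hadoop community as open source under the Apache License, and it is built on the Python web framework Django. With Hue you can interact with a Hadoop cluster from a browser-based web console to analyze and process data: for example, working with data on HDFS, running MapReduce jobs, executing Hive SQL statements, and browsing HBase tables.

For its own data, including user authentication and authorization, Hue uses SQLite by default; it can also be configured to use MySQL, PostgreSQL, or Oracle. Its features include:

  • HDFS access: browse HDFS data from the browser.
  • Hive editor: write and run HQL scripts and view the results, along with other Hive-related functions.
  • A Solr search application, with corresponding data visualizations and dashboards.
  • An Impala application for interactive data queries.
  • A Spark editor and dashboard, integrated in recent versions.
  • A Pig editor that can run the scripts you write.
  • An Oozie scheduler: submit and monitor Workflows, Coordinators, and Bundles through the dashboard.
  • HBase support: query, modify, and visualize data.
  • A Metastore browser for accessing Hive metadata and the corresponding HCatalog.
  • Support for jobs, Sqoop, ZooKeeper, and databases (MySQL, SQLite, Oracle, and others).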

2: Installing and configuring Hue

Initial MySQL environment setup

2.1 Remove the old MySQL packages:

    rpm -e MySQL-server-5.6.24-1.el6.x86_64 MySQL-client-5.6.24-1.el6.x86_64

2.2 Install the newer MySQL packages:

    rpm -Uvh MySQL-server-5.6.31-1.el6.x86_64.rpm
    rpm -Uvh MySQL-client-5.6.31-1.el6.x86_64.rpm
    rpm -ivh MySQL-devel-5.6.31-1.el6.x86_64.rpm
    rpm -ivh MySQL-embedded-5.6.31-1.el6.x86_64.rpm
    rpm -ivh MySQL-shared-5.6.31-1.el6.x86_64.rpm
    rpm -ivh MySQL-shared-compat-5.6.31-1.el6.x86_64.rpm
    rpm -ivh MySQL-test-5.6.31-1.el6.x86_64.rpm

2.3 Change the MySQL password:

Log in with the initial random password, which is stored in /root/.mysql_secret:

    mysql -uroot -polCiurMmcTS1zkQn

Change the password:

    set password=password('123456');

Grant login access:

    grant all on *.* to root@'namenode01.hadoop.com' identified by '123456';
    GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '123456' WITH GRANT OPTION;
    flush privileges;

2.4 Create the Hive metadata:

    cd /home/hadoop/yangyang/hive
    bin/hive

Note: for installing and configuring Hive, see the Hive installation guide.

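2.5 Create the Oozie database tables:

In the MySQL client, create the database:

    create database oozie;

Then initialize the Oozie schema:

    cd /home/hadoop/yangyang/oozie
    bin/oozie-setup.sh db create -run -sqlfile oozie.sql

Note: for installing Oozie, see the Oozie installation guide.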

2.6 Install the packages Hue depends on:

    yum -y install ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi gcc gcc-c++ krb5-devel libtidy libxml2-devel libxslt-devel openldap-devel python-devel sqlite-devel openssl-devel gmp-devel

2.7 Finally, remove the JDK packages that were installed as dependencies:

    rpm -e java_cup-0.10k-5.el6.x86_64 java-1.7.0-openjdk-devel-1.7.0.101-2.6.6.4.el6_8.x86_64 tzdata-java-2016d-1.el6.noarch java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64 java-1.7.0-openjdk-1.7.0.101-2.6.6.4.el6_8.x86_64 --nodeps

3: Installing Hue

3.1 Build and install Hue

    tar -zxvf hue-3.7.0-cdh5.3.6.tar.gz
    mv hue-3.7.0-cdh5.3.6 yangyang/hue
    cd yangyang/hue
    make apps

3.2 Edit the Hue configuration file

    cd yangyang/hue/desktop/conf
    vim hue.ini

In hue.ini, set:

    [desktop]
      # Set this to a random string, the longer the better.
      # This is used for secure hashing in the session store.
      secret_key=jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn<qW5o
      # Webserver listens on this address and port
      http_host=192.168.3.1
      http_port=8888
      # Time zone name
      time_zone=Asia/Shanghai
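
The comment on secret_key asks for a long random string. A minimal Python sketch for producing one follows; the function name generate_secret_key, the 60-character default length, and the character set are assumptions made for this illustration, not requirements stated by Hue.

```python
import secrets
import string

# Characters used for the key (an assumption for this sketch). Quotes,
# whitespace, '#', ';' and '%' are left out as a precaution, since INI-style
# parsers may give some of them special meaning.
ALPHABET = string.ascii_letters + string.digits + "!@^*()-_=+[]{}:,.<>?"


def generate_secret_key(length: int = 60) -> str:
    """Return a random string drawn from ALPHABET using a CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# Print a line that can be pasted under [desktop] in hue.ini.
print("secret_key=" + generate_secret_key())
```

Each run prints a different key; paste the printed line over the existing secret_key entry.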

3.3 Start Hue and test

    cd yangyang/hue
    build/env/bin/supervisor

3.4 Test in a browser

    http://192.168.3.1:8888

4: Integrating Hue with other frameworks

Integrating Hue with Hadoop 2.x

4.1 Integrating Hue with HDFS:

Edit Hadoop's hdfs-site.xml to enable WebHDFS:

    vim hdfs-site.xml

Add:

    <property>
      <name>dfs.webhdfs.enabled</name>
      <value>true</value>
    </property>
    <!-- Disable permission checking -->
    <property>
      <name>dfs.permissions.enabled</name>
      <value>false</value>
    </property>

Edit Hadoop's core-site.xml to configure the proxy-user permissions Hue needs to access HDFS:

    vim core-site.xml

    <property>
      <name>hadoop.proxyuser.hue.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.hue.groups</name>
      <value>*</value>
    </property>
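
A typo in either XML file is easy to miss until Hue fails to reach HDFS. As a rough sketch, the properties can be read back with Python's standard xml.etree module. The helper name read_hadoop_properties and the inline SAMPLE text are hypothetical and exist only for this illustration; SAMPLE assumes the usual `<configuration>` root element of Hadoop site files.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text: str) -> dict:
    """Parse the text of a Hadoop *-site.xml file into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Hypothetical sample mirroring the properties discussed in this section.
SAMPLE = """<configuration>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hue.hosts</name>
    <value>*</value>
  </property>
</configuration>"""

props = read_hadoop_properties(SAMPLE)
for key in sorted(props):
    print(key, "=", props[key])
```

To check a real file, pass its text to read_hadoop_properties instead of SAMPLE and compare the returned values with the ones you intended to set.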
Edit the Hue configuration file:

    cd yangyang/hue/desktop/conf
    vim hue.ini

    [hadoop]
      # Configuration for HDFS NameNode
      # ------------------------------------------------------------------------
      [[hdfs_clusters]]
        # HA support by using HttpFs
        [[[default]]]
          # Enter the filesystem uri
          fs_defaultfs=hdfs://namenode01.hadoop.com:8020
          # NameNode logical name.
          ## logical_name=
          # Use WebHdfs/HttpFs as the communication mechanism.
          # Domain should be the NameNode or HttpFs host.
          # Default port is 14000 for HttpFs.
          webhdfs_url=http://namenode01.hadoop.com:50070/webhdfs/v1
          hadoop_hdfs_home=/home/hadoop/yangyang/hadoop
          hadoop_bin=/home/hadoop/yangyang/hadoop/bin
          # Change this if your HDFS cluster is Kerberos-secured
          ## security_enabled=false
          # Default umask for file and directory creation, specified in an octal value.
          ## umask=022
          # Directory of the Hadoop configuration
          hadoop_conf_dir=/home/hadoop/yangyang/hadoop/etc/hadoop
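
Before pointing Hue at webhdfs_url, it can help to query the same endpoint directly. The sketch below builds a WebHDFS REST URL and lists a directory with the standard library only. The op=LISTSTATUS operation and the user.name query parameter are part of the WebHDFS REST API; the function names and the user "hue" are assumptions chosen for this example.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def webhdfs_request_url(webhdfs_url: str, path: str, op: str, user: str) -> str:
    """Build a WebHDFS REST URL for an absolute HDFS path and an operation."""
    base = webhdfs_url.rstrip("/")
    query = urlencode({"op": op, "user.name": user})
    return f"{base}{quote(path)}?{query}"


def list_status(webhdfs_url: str, path: str = "/", user: str = "hue",
                timeout: float = 10.0) -> list:
    """Return the entry names under `path` (performs a network call)."""
    url = webhdfs_request_url(webhdfs_url, path, "LISTSTATUS", user)
    with urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    return [entry["pathSuffix"] for entry in payload["FileStatuses"]["FileStatus"]]


# Building the URL needs no network access.
print(webhdfs_request_url(
    "http://namenode01.hadoop.com:50070/webhdfs/v1", "/", "LISTSTATUS", "hue"))
```

Calling list_status("http://namenode01.hadoop.com:50070/webhdfs/v1") from a machine that can reach the NameNode should return the names under /; an HTTP error at this point usually means WebHDFS is not enabled or the services have not been restarted.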

4.2 Integrating Hue with YARN:

In the [hadoop] section of hue.ini:

      [[yarn_clusters]]
        [[[default]]]
          # Enter the host on which you are running the ResourceManager
          resourcemanager_host=namenode01.hadoop.com
          # The port where the ResourceManager IPC listens on
          resourcemanager_port=8032
          # Whether to submit jobs to this cluster
          submit_to=True
          # Resource Manager logical name (required for HA)
          ## logical_name=
          # Change this if your YARN cluster is Kerberos-secured
          ## security_enabled=false
          # URL of the ResourceManager API
          resourcemanager_api_url=http://namenode01.hadoop.com:8088
          # URL of the ProxyServer API
          proxy_api_url=http://namenode01.hadoop.com:8088
          # URL of the HistoryServer API
          history_server_api_url=http://namenode01.hadoop.com:19888
          # In secure mode (HTTPS), if SSL certificates from Resource Manager's
          # Rest Server have to be verified against certificate authority
          ## ssl_cert_ca_verify=False

Note: after making these changes, restart the Hadoop HDFS and YARN services.
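
Once YARN is back up, the ResourceManager REST API offers a quick way to confirm that resourcemanager_api_url is correct. The sketch below uses the ResourceManager's /ws/v1/cluster/info endpoint, whose JSON response carries a clusterInfo object with a state field. The function names and the SAMPLE_PAYLOAD dict are hypothetical and only illustrate the response shape.

```python
import json
from urllib.request import urlopen


def cluster_info_url(resourcemanager_api_url: str) -> str:
    """Return the cluster-info endpoint for a ResourceManager base URL."""
    return resourcemanager_api_url.rstrip("/") + "/ws/v1/cluster/info"


def cluster_state(payload: dict) -> str:
    """Extract the ResourceManager state from a cluster-info response."""
    return payload["clusterInfo"]["state"]


def fetch_cluster_state(resourcemanager_api_url: str, timeout: float = 10.0) -> str:
    """Query the ResourceManager and return its state (performs a network call)."""
    with urlopen(cluster_info_url(resourcemanager_api_url), timeout=timeout) as resp:
        return cluster_state(json.load(resp))


# Hypothetical, trimmed-down response used only to show the parsing step.
SAMPLE_PAYLOAD = {"clusterInfo": {"state": "STARTED"}}

print(cluster_info_url("http://namenode01.hadoop.com:8088"))
print(cluster_state(SAMPLE_PAYLOAD))
```

On a host that can reach the cluster, fetch_cluster_state("http://namenode01.hadoop.com:8088") performs the real request.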

4.3 Integrating Hue with Hive

4.3.1 Add the Hive remote metastore configuration

Edit Hive's hive-site.xml:

    vim hive-site.xml

    <property>
      <name>hive.metastore.uris</name>
      <value>thrift://namenode01.hadoop.com:9083</value>
      <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
    </property>

Start the Hive metastore:

    bin/hive --service metastore &

Start HiveServer2:

    bin/hiveserver2 &

4.3.2 Configure Hue for Hive

Edit the Hue configuration file:

    vim hue.ini

    [beeswax]
      # Host where HiveServer2 is running.
      # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
      hive_server_host=namenode01.hadoop.com
      # Port where HiveServer2 Thrift server runs on.
      hive_server_port=10000
      # Hive configuration directory, where hive-site.xml is located
      hive_conf_dir=/home/hadoop/yangyang/hive
      # Timeout in seconds for thrift calls to Hive service
      server_conn_timeout=120
      # Choose whether Hue uses the GetLog() thrift call to retrieve Hive logs.
      # If false, Hue will use the FetchResults() thrift call instead.
      ## use_get_log_api=true
      # Set a LIMIT clause when browsing a partitioned table.
      # A positive value will be set as the LIMIT. If 0 or negative, do not set any limit.
      ## browse_partitioned_table_limit=250
      # A limit to the number of rows that can be downloaded from a query.
      # A value of -1 means there will be no limit.
      # A maximum of 65,000 is applied to XLS downloads.
      ## download_row_limit=1000000
      # Hue will try to close the Hive query when the user leaves the editor page.
      # This will free all the query resources in HiveServer2, but also make its results inaccessible.
      ## close_queries=false
      # Thrift version to use when communicating with HiveServer2
      ## thrift_version=5

4.4 Integrating Hue with an RDBMS

In hue.ini, the [[[mysql]]] block sits under [librdbms] and [[databases]]:

    [librdbms]
      [[databases]]
        [[[mysql]]]
          # Name to show in the UI.
          ## nice_name="My SQL DB"
          # For MySQL and PostgreSQL, name is the name of the database.
          # For Oracle, Name is instance of the Oracle server. For express edition
          # this is 'xe' by default.
          ## name=mysqldb
          # Database backend to use. This can be:
          # 1. mysql
          # 2. postgresql
          # 3. oracle
          engine=mysql
          # IP or hostname of the database to connect to.
          host=namenode01.hadoop.com
          # Port the database server is listening to. Defaults are:
          # 1. MySQL: 3306
          # 2. PostgreSQL: 5432
          # 3. Oracle Express Edition: 1521
          port=3306
          # Username to authenticate with when connecting to the database.
          user=root
          # Password matching the username to authenticate with when
          # connecting to the database.
          password=123456
          # Database options to send to the server when connecting.
          # https://docs.djangoproject.com/en/1.4/ref/databases/
          ## options={}
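
In hue.ini the number of brackets around a section name gives its nesting depth, so a `[[[mysql]]]` block only takes effect when it sits inside its parent sections. The toy parser below illustrates that rule. It is not Hue's own parser, the name parse_nested_ini is hypothetical, and the sample assumes the standard layout in which `[[[mysql]]]` is nested under `[librdbms]` and `[[databases]]`.

```python
def parse_nested_ini(text: str) -> dict:
    """Parse INI text whose section depth equals the number of '[' brackets."""
    root: dict = {}
    stack = [root]  # stack[d] is the dict of the section currently open at depth d
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments (including '##' defaults)
        if line.startswith("["):
            depth = len(line) - len(line.lstrip("["))
            name = line.strip("[]").strip()
            if depth > len(stack):
                raise ValueError(f"section {name!r} skips a nesting level")
            del stack[depth:]  # close any deeper or sibling sections
            stack.append(stack[-1].setdefault(name, {}))
        elif "=" in line:
            key, value = line.split("=", 1)
            stack[-1][key.strip()] = value.strip()
    return root


# Sample text assuming the standard hue.ini layout for RDBMS connections.
SAMPLE = """
[librdbms]
  [[databases]]
    [[[mysql]]]
      ## name=mysqldb
      engine=mysql
      host=namenode01.hadoop.com
      port=3306
      user=root
"""

cfg = parse_nested_ini(SAMPLE)
print(cfg["librdbms"]["databases"]["mysql"])
```

Feeding the sketch a `[[[mysql]]]` block without its parents raises ValueError, which mirrors why the block must be placed inside the right parent sections of the real file.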

4.5 Integrating Hue with Oozie

Edit hue.ini:

    [liboozie]
      # The URL where the Oozie service runs on. This is required in order for
      # users to submit jobs. Empty value disables the config check.
      oozie_url=http://namenode01.hadoop.com:11000/oozie
      # Requires FQDN in oozie_url if enabled
      ## security_enabled=false
      # Location on HDFS where the workflows/coordinator are deployed when submitted.
      remote_deployement_dir=/user/hadoop/oozie-apps
    ###########################################################################
    # Settings to configure the Oozie app
    ###########################################################################
    [oozie]
      # Location on local FS where the examples are stored.
      local_data_dir=/home/hadoop/yangyang/oozie/oozie-apps
      # Location on local FS where the data for the examples is stored.
      sample_data_dir=/home/hadoop/yangyang/oozie/oozie-apps/map-reduce/input-data
      # Location on HDFS where the oozie examples and workflows are stored.
      remote_data_dir=/user/hadoop/oozie-apps
      # Maximum of Oozie workflows or coordinators to retrieve in one API call.
      oozie_jobs_count=100
      # Use Cron format for defining the frequency of a Coordinator instead of the old frequency number/unit.
      ##enable_cron_scheduling=true

5: Restarting Hue

Kill the supervisor-related processes, including the process occupying port 8888, then restart Hue:

    build/env/bin/supervisor &

Start HiveServer2:

    bin/hiveserver2 &
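
After the restart, a plain TCP connect is a simple way to see which services are listening before opening the browser. The sketch below uses only the standard socket module. The port numbers come from the configuration in this article; the names port_is_open, report, and SERVICES are assumptions made for this illustration.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hosts and ports as configured in this article.
SERVICES = {
    "Hue web UI": ("192.168.3.1", 8888),
    "Hive metastore": ("namenode01.hadoop.com", 9083),
    "HiveServer2": ("namenode01.hadoop.com", 10000),
    "Oozie": ("namenode01.hadoop.com", 11000),
}


def report(services: dict = SERVICES) -> None:
    """Print one line per service saying whether its port accepts connections."""
    for name, (host, port) in services.items():
        status = "up" if port_is_open(host, port) else "DOWN"
        print(f"{name:15s} {host}:{port} {status}")
```

Run report() on a machine inside the cluster network. An open port only shows that something is listening, so the browser checks are still needed.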

5.1 Check in the browser

    http://192.168.3.1:8888

[Screenshots of the hdfs, mysql, hive, and oozie pages in Hue]
