[关闭]
@zhangyy 2020-07-16T14:48:38.000000Z 字数 2616 阅读 106

cdh5.14.2中集成安装kylin与使用测试

大数据平台构建


  • 一:kylin 简介
  • 二:安装配置kylin
  • 三:kylin 运行实例

一:kylin 简介

  1. Apache Kylin™是一个开源的分布式分析引擎,提供Hadoop/Spark之上的SQL查询接口及多维分析(OLAP)能力以支持超大规模数据,最初由eBay Inc. 开发并贡献至开源社区。它能在亚秒内查询巨大的Hive表。

image_1cfmo0npv15od1epg1e1e1qq216rlp.png-381.6kB

  1. kylin 软件下载:
  2. 社区版kylin下载地址:https://archive.apache.org/dist/kylin/ ,本次测试使用apache-kylin-2.3.1.tar.gz

image_1cfmo369bv731o2u119d1h8gcc316.png-294.4kB

二:在cdh5.14.2 上面配置安装kylin

2.1: kylin 安装的环境配置

  1. login: node-01.flyfish
  2. tar -zxvf apache-kylin-2.3.1-cdh57-bin.tar.gz -C /usr/local/
  3. cd /usr/local/
  4. mv apache-kylin-2.3.1-bin/ kylin
  5. vim /etc/profile
  6. ----
  7. ### kylin ####
  8. export KYLIN_HOME=/usr/local/kylin
  9. PATH=$PATH:$HOME/bin:$KYLIN_HOME/bin
  10. ---
  11. source /etc/profile

image_1cfmpltko7qi8c8188e49tgk51j.png-161.4kB

2.2:kylin的启动验证

  1. cd /usr/local/kylin/
  2. ./check-env.sh

image_1cfmpog7616c3dudcii6kmvib2g.png-104.8kB

  1. su - hdfs
  2. hdfs dfs -chmod -R 777 /

image_1cfmpq8t9shuen6ki2t4f13403d.png-77.6kB

  1. 从新检测处理
  2. cd /usr/local/kylin/
  3. ./check-env.sh

image_1cfmps8015mil3c6ot1atf1r8k3q.png-47.5kB

  1. 启动kylin
  2. ./kylin.sh start

image_1cfmpuc6f1i5h3f7mjp1ueg15lg47.png-332.5kB

image_1cfmq37il17q7jsa58un651n374.png-499.3kB

image_1cfmq47lo14i018851dj2lj81gms7h.png-226kB

  1. 默认用户名:ADMIN
  2. 密码:KYLIN

image_1cfmq8foa5i91ip8gahrlijjk8e.png-273.2kB

三:kylin 运行实例使用测试

  1. cd /usr/local/kylin/bin
  2. ./sample.sh

image_1cfmqtpc06731qdc1f9d7gd1at18r.png-238.1kB

image_1cfmr0lso1rf211dspl7c7r1ign9o.png-330.1kB

image_1cfmr194n14pg1jt51heh19asqvja5.png-421.3kB

image_1cfmr2vji2709151a2maaj18fqc2.png-481.6kB

  1. 从启kylin
  2. cd /usr/local/kylin/bin
  3. ./kylin.sh stop
  4. ./kylin.sh start

image_1cfmr8h441m2616d4nj517go1nllcf.png-164.5kB

  1. 重新刷新kylin的元数据

image_1cfmrdelj1fe517vehn71ja1kesdp.png-651.3kB

image_1cfmreathjtuqh65i4hjc1k47em.png-527.5kB

image_1cfmrg3mq185q427155q1i35a5ug6.png-551.5kB

image_1cfmrh9jq11v61hbo5b11r011ulsh3.png-353.6kB

  1. impala的加载表:
  2. impala-shell -i "INVALIDATE METADATA"
  3. 单独刷新一张表:
  4. refrash + 表名
  5. hive default 当中多了几张kylin的表

image_1cfmrq46tp1l1akn1cjjjio1qn7hg.png-90.3kB

  1. 构建cube

image_1cfn261lbr821q281oqq1ko0bnpp.png-381.7kB

image_1cfn2b040mtqogs1qa31rhk67i30.png-304.3kB

  1. 这个地方如果机器配置不够的话,尽量日期间隔选小一点。

image_1cfn2brd115871vl94gb26a1u33d.png-335.5kB

image_1cfn2ke301r9gine1jfi187itap.png-380.1kB

  1. 这一步会比较耗时,因为这步会进行预计算,默认是MapReduce作业。

image_1cfn2t8r81v8k1empaqc15gf1pm816.png-566kB

image_1cfn3vt7p1lrs1nuv1bc81vg6g5u1j.png-395.4kB

  1. kylin的数据查询

image_1cfn4bquj1m1tnod8cloq15u33g.png-402.1kB

image_1cfn4gajtmp8gj6117717fj1er73t.png-409.9kB

image_1cfn4hcpt11caaesia5p9a21k4a.png-459.8kB

  1. 查询构建完成的cube
  2. 先运行简单的count,可以看到耗时4.12s,再次执行基本在0.5s级,基本是毫秒级别
  3. 就可以查询出来,这是因为kylin 支持缓存的功能

image_1cfn4ihlvhi71sv21s6u1f79qmg4n.png-371kB

  1. 复杂的查询:
  2. select sum(KYLIN_SALES.PRICE)
  3. as price_sum,KYLIN_CATEGORY_GROUPINGS.META_CATEG_NAME,KYLIN_CATEGORY_GROUPINGS.CATEG_LVL2_NAME
  4. from KYLIN_SALES inner join KYLIN_CATEGORY_GROUPINGS
  5. on KYLIN_SALES.LEAF_CATEG_ID = KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID and
  6. KYLIN_SALES.LSTG_SITE_ID = KYLIN_CATEGORY_GROUPINGS.SITE_ID
  7. group by KYLIN_CATEGORY_GROUPINGS.META_CATEG_NAME,KYLIN_CATEGORY_GROUPINGS.CATEG_LVL2_NAME
  8. order by KYLIN_CATEGORY_GROUPINGS.META_CATEG_NAME asc,KYLIN_CATEGORY_GROUPINGS.CATEG_LVL2_NAME desc

image_1cfn4p19jfsn1jeu10ag1a82156s54.png-518kB

image_1cfn4phdbkqf1cdt9111f8k1g475h.png-242.7kB

四:kylin 运行实例二

4.1 数据文件准备

  1. create_table.sql department.csv employee.csv

image_1cfngq2ae1su2ge7do01d0ohuv9.png-39.6kB

4.2 在hdfs 上面创建文件上传

  1. hdfs 上面创建上传目录
  2. hdfs dfs -mkdir /kylin-test
  3. hdfs dfs -put department.csv employee.csv /kylin-test

image_1cfnh06sufhoh8i1amp65c1k78m.png-161.4kB

4.3 在hive 上面执行sql 脚本加载数据与验证

  1. 执行create_table.sql
  2. create_table 内容如下
  3. ---
  4. DROP TABLE IF EXISTS employee;
  5. CREATE TABLE employee(
  6. id int,
  7. name string,
  8. deptId int,
  9. age int,
  10. salary float
  11. )
  12. ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  13. STORED AS TEXTFILE;
  14. DROP TABLE IF EXISTS department;
  15. CREATE TABLE department(
  16. id int,
  17. name string
  18. )
  19. ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  20. STORED AS TEXTFILE;
  21. LOAD DATA INPATH '/kylin-test/employee.csv' OVERWRITE INTO TABLE employee;
  22. LOAD DATA INPATH '/kylin-test/department.csv' OVERWRITE INTO TABLE department;
  23. ---

  1. hive 中执行create_table.sql
  2. hive -f create_table.sql

image_1cfnhidm943d1vh110ovnd91nou13.png-223.1kB

  1. hive -e "use default;select * from employee"

image_1cfnhkkns1u9k18uv1a303631v0s1g.png-141.9kB

  1. hive -e "use default;select * from department"

image_1cfnhmr67k5i1gjba4r2gn1aoe1t.png-209.9kB

4.4 在kylin 上面创建project

image_1cfni3tric3n132v17kk2541ego2q.png-260.7kB

image_1cfni5b081pnhrkl1hvsep21qt037.png-285.4kB

image_1cfni5sqq1b2dogl4jd1kuhtad3k.png-275.7kB

  1. 加载hive数据到kylin当中

image_1cfni9i60t5qo6d1hhjdka1a2141.png-209.2kB

image_1cfnicpf615th5iuugvfv71g6q4e.png-224.5kB

image_1cfnidhovqhilo69eb1lpag6v4r.png-276.1kB

image_1cfnie73a1a621lsqd861riia2058.png-179.6kB

  1. 创建model,入project的名称和描述:

image_1cfnijehl1iq511881lu96b51gm365.png-211.9kB

image_1cfnil81q1gma1t5mphovr21ql86i.png-283.2kB

image_1cfninc7d50sthreqgv771pci6v.png-290.6kB

image_1cfnip2pmv2n1dfd70bg67hpo7c.png-254.3kB

image_1cfnisd321ibr1fr1fs9ffj8f57p.png-343.5kB

  1. 选择事实表,并点击add Lookup Table查询表

image_1cfnitmt37ttnvm123f50olup8m.png-335.2kB

  1. 选择维度字段

image_1cfnj2b9f1e6mjbs1b4f16pgh4793.png-319.3kB

image_1cfnj41e6ek8mo512i04a7al9g.png-294.3kB

image_1cfnj521def8p05q0kj0ru599t.png-292kB

image_1cfnj721t1bph1vuu704q0t1chbaa.png-299.2kB

image_1cfnj81mpgrj14no28uhqjmhcan.png-323.3kB

image_1cfnj8htijqgkr15jcaoualkb4.png-281.4kB

image_1cfnj9pmka9sh3u1rbr1gvn58bh.png-307.8kB

  1. 创建cube

image_1cfnjcsjs11eh1hft1avj1kaf1vtdbu.png-222kB

image_1cfnjeapc9gu1l8rh441gqpm8mcb.png-314.3kB

image_1cfnjfcqr8j21fnu1l8719sc1hpgdo.png-270.4kB

image_1cfnjh93a187q1aru15mr1oq710jfei.png-317.3kB

image_1cfnji6ba162i9r51daehiq24dev.png-364.6kB

image_1cfnjpdju1d5imu91nk1g8s1npugc.png-358.2kB

image_1cfnjqdnf6aj1jb317veipirfagp.png-244.7kB

image_1cfnjrn3semh12s9uocjcatcjh6.png-312.8kB

image_1cfnjsqdaq4bm2c1lba3i41svehj.png-292.2kB

image_1cfnjtc3mi2i286163q1on61kpji0.png-348.6kB

image_1cfnjtr4619501nloc2q17vi1sq8id.png-270.2kB

image_1cfnjub0210a1rmn13fggck1liq.png-351.8kB

  1. cube创建完成

image_1cfnjusi3to41cvq64okmk15erj7.png-328.7kB

  1. 构建cube

image_1cfnk0m8p1t7l1gumcu6167k1hc3jk.png-316.4kB

image_1cfnk1bds9onebq69dbp61ba1kh.png-248.3kB

image_1cfnk1qls39an6l995bbgc2oku.png-267.4kB

image_1cfnk3ctn1v6t17jr3b16sjkdllb.png-277.3kB

image_1cfnk58gt12c7opf1p3l12fs1dsflo.png-642.7kB

image_1cfnk64dh1pem1hl712rbo511cc9m5.png-263.8kB

image_1cfnk8da0k8b5mhk481k5m1jm8mi.png-313kB

image_1cfnkcvdm188jsocaf513r16j0mv.png-317.9kB

  1. 查询测试:
  2. select count(*) from department;
  3. select max(salary) from EMPLOYEE

image_1cfnkguufpmjo71nv318dkdqunc.png-345.6kB

image_1cfnkjoa547t1f3j1dk31e4g1o0lo9.png-331.9kB

image_1cfnkl7fa10eiob516os1bq1fkpq6.png-238.1kB

  1. 统计各部门员工薪资总和:
  2. select d.ID,sum(e.SALARY) as salary from EMPLOYEE as e left join DEPARTMENT as d on e.DEPTID=d.id group by d.ID order by salary desc

image_1cfnkncqonm91i0qttc1sb21qrlqj.png-373.2kB

image_1cfnko11c19ho93e1q7u1ocmu7rr0.png-353.8kB

添加新批注
在作者公开此批注前,只有你和作者可见。
回复批注