@zhangyy 2018-06-10T21:26:25.000000Z

Installing and Using Phoenix on CDH 5.14.2

Big Data Platform Construction


  • I. Installing and configuring Phoenix
  • II. Basic Phoenix operations
  • III. Bulk loading data into HBase with Phoenix
  • IV. Exporting data from HBase to HDFS with Phoenix

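I. Installing and configuring Phoenix

1.0 Introduction to Phoenix

  1. Phoenix started as an open-source project at Salesforce. Salesforce is a CRM vendor, and CRM software is heavily database-driven, so it is not surprising that the company produced a database middleware layer. Phoenix later became a top-level project of the Apache Software Foundation.
  2. What exactly is Phoenix? In essence, it is an open-source SQL engine, written in Java, that operates on HBase through the JDBC API.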

1.1 Download the parcel files required by CDH

  Download location:
  http://archive.cloudera.com/cloudera-labs/phoenix/parcels/latest/
  Files to download:
  CLABS_PHOENIX-4.7.0-1.clabs_phoenix1.3.0.p0.000-el7.parcel
  CLABS_PHOENIX-4.7.0-1.clabs_phoenix1.3.0.p0.000-el7.parcel.sha1
  manifest.json
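
The three files can also be fetched from the command line. The following is a minimal sketch, assuming `wget` is installed and the archive URL is still reachable; the function names `parcel_urls` and `download_parcel_files` are made up for this example.

```shell
# Base URL and parcel name are copied from the list above.
BASE_URL="http://archive.cloudera.com/cloudera-labs/phoenix/parcels/latest"
PARCEL="CLABS_PHOENIX-4.7.0-1.clabs_phoenix1.3.0.p0.000-el7.parcel"

# Print the three URLs that make up the parcel repository (hypothetical helper).
parcel_urls() {
  base=$1
  parcel=$2
  printf '%s/%s\n' "$base" "$parcel"
  printf '%s/%s.sha1\n' "$base" "$parcel"
  printf '%s/manifest.json\n' "$base"
}

# Download each file into the current directory (hypothetical helper).
# Nothing is downloaded until this function is called explicitly.
download_parcel_files() {
  for url in $(parcel_urls "$1" "$2"); do
    wget -c "$url" || return 1
  done
}

# Usage: download_parcel_files "$BASE_URL" "$PARCEL"
```

Keeping the URL list in its own function makes it easy to print and check the URLs before downloading anything.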

1.2 Configure the httpd service

  yum install -y httpd*
  service httpd start
  chkconfig httpd on
  mkdir -p /var/www/html/phoenix
  mv CLABS_PHOENIX-4.7.0-1.clabs_phoenix1.3.0.p0.000-el7.parcel* /var/www/html/phoenix/
  mv manifest.json /var/www/html/phoenix/
  cd /var/www/html/phoenix/
  # Cloudera Manager looks for a .sha file next to the parcel, so rename the .sha1 file
  mv CLABS_PHOENIX-4.7.0-1.clabs_phoenix1.3.0.p0.000-el7.parcel.sha1 CLABS_PHOENIX-4.7.0-1.clabs_phoenix1.3.0.p0.000-el7.parcel.sha
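
Before pointing Cloudera Manager at the new repository, it can help to confirm that the parcel was not corrupted during download. The sketch below compares the parcel's SHA-1 with the hash stored in the renamed .sha file. It assumes `sha1sum` and `awk` are available; `verify_parcel_sha` is a name invented for this example.

```shell
# Compare the SHA-1 of a parcel with the hash stored in its .sha file.
# Returns 0 when the two match and 1 otherwise (hypothetical helper).
verify_parcel_sha() {
  parcel=$1
  sha_file=${2:-${parcel}.sha}
  [ -f "$parcel" ] && [ -f "$sha_file" ] || return 1
  actual=$(sha1sum "$parcel" | awk '{print $1}')
  expected=$(awk 'NR==1 {print $1}' "$sha_file")
  [ "$actual" = "$expected" ]
}

# Usage (run inside /var/www/html/phoenix):
#   verify_parcel_sha CLABS_PHOENIX-4.7.0-1.clabs_phoenix1.3.0.p0.000-el7.parcel \
#     && echo "hash matches"
```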

1.3 Configure Phoenix on CDH 5.14.2

[7 screenshots of the Phoenix configuration steps in Cloudera Manager]

1.4 Deploy the client configuration for the HBase service and restart it

[5 screenshots of deploying the client configuration and restarting the HBase service]

1.5 Connecting with Phoenix

  cd /opt/cloudera/parcels/CLABS_PHOENIX/bin

  # Log in to HBase through Phoenix
  ./phoenix-sqlline.py

  # The ZooKeeper address must be specified explicitly
  ./phoenix-sqlline.py node-01.flyfish:2181:/hbase
  !table
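
The connection argument has the form host:port:znode. A small sketch of building it from its parts follows. The helper names are invented for this example, and passing a SQL file as a second argument is an assumption based on how the upstream `sqlline.py` script behaves, so check it against the wrapper shipped in your parcel.

```shell
PHOENIX_BIN=/opt/cloudera/parcels/CLABS_PHOENIX/bin

# Build the "<host>:<port>:<znode>" argument used by phoenix-sqlline.py.
# Port and znode default to the values used in this article (hypothetical helper).
phoenix_zk_url() {
  host=$1
  port=${2:-2181}
  znode=${3:-/hbase}
  printf '%s:%s:%s\n' "$host" "$port" "$znode"
}

# Run a SQL file non-interactively (hypothetical helper).
# Assumption: the wrapper forwards a second argument to sqlline as a script file.
run_phoenix_sql() {
  "$PHOENIX_BIN/phoenix-sqlline.py" "$(phoenix_zk_url "$1")" "$2"
}

# Usage: run_phoenix_sql node-01.flyfish /tmp/query.sql
```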

II. Basic Phoenix operations

2.1 Create a table with Phoenix

  create table hbase_test
  (
    s1 varchar not null primary key,
    s2 varchar,
    s3 varchar,
    s4 varchar
  );

  # Log in through the HBase shell
  hbase shell
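
Phoenix folds unquoted identifiers to upper case, so a table created as hbase_test shows up in HBase as HBASE_TEST. The sketch below generates HBase shell commands for inspecting such a table; the helper names and the row limit of 5 are choices made for this example.

```shell
# Map an unquoted Phoenix identifier to the upper-case name stored in HBase
# (hypothetical helper).
phoenix_to_hbase_name() {
  printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
}

# Print HBase shell commands that describe the table and scan a few rows
# (hypothetical helper).
hbase_inspect_commands() {
  name=$(phoenix_to_hbase_name "$1")
  printf "describe '%s'\n" "$name"
  printf "scan '%s', {LIMIT => 5}\n" "$name"
}

# Usage: hbase_inspect_commands hbase_test | hbase shell
```

Values in the scan output appear as raw bytes, since Phoenix stores data in its own encoding.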

  upsert into hbase_test values('1','testname','testname1','testname2');
  upsert into hbase_test values('2','tom','jack','harry');

  -- Delete a row (the row is selected by its rowkey, s1)
  delete from hbase_test where s1='1';

  upsert into hbase_test values('1','hadoop','hive','zookeeper');
  upsert into hbase_test values('2','oozie','hue','spark');

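  -- Update test: Phoenix has no UPDATE statement; use UPSERT instead.
  -- Inserting several rows takes one UPSERT statement per row, because a single VALUES clause cannot hold more than one row.
  upsert into hbase_test values('1','zhangyy','hive','zookeeper');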
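
Because every row needs its own statement, generating the statements from a file saves typing. The following sketch turns simple comma-separated lines into UPSERT statements. It assumes the fields contain no commas or quoted sections, treats every column as varchar, and uses a helper name (`csv_to_upserts`) invented for this example.

```shell
# Emit one UPSERT statement per CSV line read from stdin.
# Every field is wrapped in single quotes; embedded single quotes are doubled.
# Assumes plain CSV without quoted fields (hypothetical helper).
csv_to_upserts() {
  table=$1
  awk -F',' -v table="$table" -v q="'" '
    NF > 0 {
      values = ""
      for (i = 1; i <= NF; i++) {
        field = $i
        gsub(q, q q, field)
        values = values (i > 1 ? "," : "") q field q
      }
      printf "upsert into %s values(%s);\n", table, values
    }'
}

# Usage: csv_to_upserts hbase_test < rows.csv > rows.sql
```

For large files the bulk load tool described in the next part is the better fit; this approach is only convenient for a handful of rows.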

III. Bulk loading data into HBase with Phoenix

3.1 Prepare the test file

  # Prepare the test file to import
  ls -ld ithbase.csv
  head -n 1 ithbase.csv

  # Upload the file to HDFS
  su - hdfs
  hdfs dfs -mkdir /flyfish
  hdfs dfs -put ithbase.csv /flyfish
  hdfs dfs -ls /flyfish

3.2 Create the table through Phoenix

  create table ithbase
  (
    i_item_sk varchar not null primary key,
    i_item_id varchar,
    i_rec_start_varchar varchar,
    i_rec_end_date varchar
  );
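
The CSV columns are matched to the table columns by position, so every line of ithbase.csv should carry four fields. A quick local check before loading might look like the sketch below. It assumes a plain comma delimiter with no quoted fields, and `check_csv_columns` is a name invented for this example.

```shell
# Report lines whose comma-separated field count differs from the expected one.
# Prints "<line number>: <field count>" for each bad line and returns 1 if any
# bad line exists (hypothetical helper).
check_csv_columns() {
  file=$1
  expected=$2
  awk -F',' -v n="$expected" '
    NF != n { printf "%d: %d\n", NR, NF; bad = 1 }
    END { exit bad }' "$file"
}

# Usage: check_csv_columns ithbase.csv 4 && echo "all lines have 4 fields"
```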

  # Run the bulk load command to import the data.
  # The hbase-protocol jar below is named for cdh5.12.1; adjust the file name to match the jar shipped with your CDH release.
  HADOOP_CLASSPATH=/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol-1.2.0-cdh5.12.1.jar:/opt/cloudera/parcels/CDH/lib/hbase/conf hadoop jar /opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-4.7.0-clabs-phoenix1.3.0-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool -t ithbase -i /flyfish/ithbase.csv
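
Since the jar file name changes with the CDH release, one option is to look it up instead of hard-coding it. The sketch below wraps the same command in a function. The helper names are invented for this example, and it assumes the jar follows the hbase-protocol-<version>.jar naming seen in the command above.

```shell
HBASE_LIB=/opt/cloudera/parcels/CDH/lib/hbase
PHOENIX_CLIENT_JAR=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-4.7.0-clabs-phoenix1.3.0-client.jar

# Print the first versioned hbase-protocol jar in a directory, skipping
# test jars. Returns 1 when none is found (hypothetical helper).
find_hbase_protocol_jar() {
  for jar in "$1"/hbase-protocol-*.jar; do
    case "$jar" in
      *-tests.jar) continue ;;
    esac
    if [ -f "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  return 1
}

# Bulk-load a CSV file from HDFS into a Phoenix table (hypothetical helper).
# Nothing runs until this function is called explicitly.
phoenix_bulkload() {
  table=$1
  input=$2
  protocol_jar=$(find_hbase_protocol_jar "$HBASE_LIB") || return 1
  HADOOP_CLASSPATH="$protocol_jar:$HBASE_LIB/conf" \
    hadoop jar "$PHOENIX_CLIENT_JAR" \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool -t "$table" -i "$input"
}

# Usage: phoenix_bulkload ithbase /flyfish/ithbase.csv
```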

  select * from ithbase;

IV. Exporting data from HBase to HDFS with Phoenix

  cat export.pig
  ----
  REGISTER /opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-4.7.0-clabs-phoenix1.3.0-client.jar;
  rows = load 'hbase://query/SELECT * FROM ITHBASE' USING org.apache.phoenix.pig.PhoenixHBaseLoader('node-01.flyfish:2181');
  STORE rows INTO 'flyfish1' USING PigStorage(',');
  ----
  # Run the Pig script
  pig -x mapreduce export.pig
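
The same script can be generated for other queries or output directories. The sketch below prints an equivalent Pig script from three parameters; `make_export_pig` is a name invented for this example, and the jar path is the one used above.

```shell
PHOENIX_CLIENT_JAR=/opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-4.7.0-clabs-phoenix1.3.0-client.jar

# Print a Pig script that exports a Phoenix query result to an HDFS directory
# as comma-separated text (hypothetical helper).
make_export_pig() {
  query=$1
  zk=$2
  out_dir=$3
  cat <<EOF
REGISTER $PHOENIX_CLIENT_JAR;
rows = load 'hbase://query/$query' USING org.apache.phoenix.pig.PhoenixHBaseLoader('$zk');
STORE rows INTO '$out_dir' USING PigStorage(',');
EOF
}

# Usage:
#   make_export_pig 'SELECT * FROM ITHBASE' node-01.flyfish:2181 flyfish1 > export.pig
#   pig -x mapreduce export.pig
```

The table name is written in upper case in the query because Phoenix stores unquoted identifiers that way.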

  # Check the exported files on HDFS
  hdfs dfs -ls /user/hdfs/flyfish1
  hdfs dfs -cat /user/hdfs/flyfish1/part-m-00000
