[关闭]
@zhangyy 2021-06-29T16:20:21.000000Z 字数 7675 阅读 314

greenplum 最新版本6.16.2部署

greenplum系列



一: greenplum 的使用

1.1 greenplum的概述:

  1. Greenplum。该公司成立于2003年,2006年推出了首款产品,其主营业务关注在数据仓库和商业智能方面,Greenplum DW/BI软件可以在虚拟化x86服务器上运行无分享(shared-nothing)的大规模并行处理(MPP)架构
  2. 当前使用的 OLTP程序中,用户访问一个中心数据库,如果采用SMP系统结构,它的效率要比采用MPP结构要快得多。而MPP系统在决策支持和数据挖掘方面显示了优势,
  3. 可以这样说,如果操作相互之间没有什么关系,处理单元之间需要进行的通信比较少,那采用MPP系统就要好,相反就不合适了。
  4. greenplum 的企业版本下载:
  5. https://network.pivotal.io/products/vmware-tanzu-greenplum

1.2 mpp系统

  1. Greenplum的架构采用了MPP(大规模并行处理)。在 MPP 系统中,每个 SMP节点也可以运行自己的操作系统、数据库等。换言之,每个节点内的 CPU 不能访问另一个节点的内存。
  2. 节点之间的信息交互是通过节点互联网络实现的,这个过程一般称为数据重分配(Data Redistribution) 。与传统的SMP架构明显不同,通常情况下,
  3. MPP系统因为要在不同处理单元之间传送信息,所以它的效率要比SMP要差一点,但是这也不是绝对的,因为 MPP系统不共享资源,因此对它而言,资源比SMP要多,
  4. 当需要处理的事务达到一定规模时,MPP的效率要比SMP好。这就是看通信时间占用计算时间的比例而定,如果通信时间比较多,那MPP系统就不占优势了,相反,如果通信时间比较少,
  5. MPP系统可以充分发挥资源的优势,达到高效率。

1.3 greenplum的体系架构

image_1f9b8di491d7qv2edds187r1n519.png-196.1kB

  1. Greenplum 由三部分组成:Master HostSegmentInterconnect
  2. Master Host:
  3. Master Host 节点是访问系统的入口,数据库侦听进程 (postgres),处理所有用户连接,建立查询计划,协调工作处理过程,管理工具,系统目录表和元数据(数据字典)并不存放任何用户数据。
  4. segment节点:
  5. 每段(Segment)存放一部分用户数据,一个系统可以有多段,用户不能直接存取访问,所有对段的访问都经过 Master,数据库监听进程(postgres)监听来自 Master 的连接
  6. Interconnect
  7. Greenplum 数据库之间的连接层,进程间协调和管理,基于千兆以太网架构,属于系统内部私网配置,并支持两种协议:TCP or UDP

1.4 greenplum的表的分布策略

image_1f9b8q16kp00rdjc3d1eo3108a2q.png-221.4kB

  1. Greenplum 5中,有2种分布策略:
  2. 哈希分布,随机分布
  3. Greenplum 6中,添加了另一个策略:
  4. 哈希分布,随机分布,复制分布
  5. 数据表的单个行会被分配到一个或多个segment上.

二: greenplum 的安装:

2.1 greenplum 的下载

  1. greenplum开源版本
  2. 其官方手册 https://greenplum.org/documentation/
  3. 其下载介质地址 https://github.com/greenplum-db/gpdb/releases
  4. greenplum商业版本(注:下载介质必须要付费,官方手册可以在线或离线看)
  5. 其官方手册 http://gpdb.docs.pivotal.io/570/main/index.html
  6. 其下载介质地址 https://network.pivotal.io/products/pivotal-gpdb/
  7. 本次安装收费版本:

image_1f9b915kemh11l92ful1ph8b9d9.png-255.4kB

2.2 安装greenplum系统环境准备:

  1. 1. 系统:
  2. CentOS7.9x64
  3. ----
  4. 2. 部署环境:
  5. master 一台
  6. standby 一台
  7. segment 三台
  8. 3. 系统主机名配置:
  9. cat /etc/hosts
  10. ----
  11. 192.168.100.11 node01.flyfish.com
  12. 192.168.100.12 node02.flyfish.com
  13. 192.168.100.13 node03.flyfish.com
  14. 192.168.100.14 node04.flyfish.com
  15. 192.168.100.15 node05.flyfish.com
  16. 192.168.100.16 node06.flyfish.com
  17. 192.168.100.17 node07.flyfish.com
  18. 192.168.100.18 node08.flyfish.com
  19. 192.168.100.19 node09.flyfish.com
  20. ----
  21. 本次安装前五台机器:
  22. 部署greenplum 建议生产使用实体机器,greenplum 的最小环境为,master 一台standby 一台 segment 三台

  1. 4. 系统环境初始化:(所有主机)
  2. vim /etc/sysctl.conf
  3. ---
  4. vm.swappiness = 0
  5. kernel.sysrq = 1
  6. net.ipv4.neigh.default.gc_stale_time = 120
  7. # see details in https://help.aliyun.com/knowledge_detail/39428.html
  8. net.ipv4.conf.all.rp_filter = 0
  9. net.ipv4.conf.default.rp_filter = 0
  10. net.ipv4.conf.default.arp_announce = 2
  11. net.ipv4.conf.lo.arp_announce = 2
  12. net.ipv4.conf.all.arp_announce = 2
  13. # see details in https://help.aliyun.com/knowledge_detail/41334.html
  14. net.ipv4.tcp_max_tw_buckets = 5000
  15. net.ipv4.tcp_syncookies = 1
  16. net.ipv4.tcp_max_syn_backlog = 1024
  17. net.ipv4.tcp_synack_retries = 2
  18. net.ipv4.tcp_slow_start_after_idle = 0
  19. fs.aio-max-nr = 1048576
  20. fs.file-max = 6815744
  21. net.ipv4.ip_local_port_range = 9000 65500
  22. net.ipv4.conf.default.accept_source_route = 0
  23. net.ipv4.tcp_max_syn_backlog = 4096
  24. net.ipv4.conf.all.arp_filter = 1
  25. net.core.rmem_default = 262144
  26. net.core.rmem_max = 4194304
  27. net.core.wmem_default = 262144
  28. net.core.wmem_max = 1048586
  29. kernel.sem = 204800 512000 3000 20480
  30. kernel.shmmax = 107374182400000
  31. kernel.shmall = 262144
  32. kernel.shmmni = 4096
  33. kernel.sysrq = 1
  34. kernel.core_uses_pid = 1
  35. kernel.msgmnb = 65536
  36. kernel.msgmax = 65536
  37. kernel.msgmni = 2048
  38. vm.swappiness = 10
  39. vm.overcommit_memory = 2
  40. vm.overcommit_ratio = 95
  41. vm.zone_reclaim_mode = 0
  42. vm.dirty_expire_centisecs = 500
  43. vm.dirty_writeback_centisecs = 100
  44. vm.dirty_background_ratio = 3
  45. vm.dirty_ratio = 10
  46. #64g-
  47. #vm.dirty_background_ratio = 3
  48. #vm.dirty_ratio = 10
  49. #64g+
  50. #vm.dirty_background_ratio = 0
  51. #vm.dirty_ratio = 0
  52. #vm.dirty_background_bytes = 1610612736
  53. #vm.dirty_bytes = 4294967296
  54. ----
  55. sysctl -p

  1. 资源句柄数限制:
  2. cat >> /etc/security/limits.conf << EOF
  3. * soft nproc unlimited
  4. * hard nproc unlimited
  5. * soft nofile 524288
  6. * hard nofile 524288
  7. * soft stack unlimited
  8. * hard stack unlimited
  9. * hard memlock unlimited
  10. * soft memlock unlimited
  11. EOF

image_1f9ba4ejqedn46126fpsdlbf9.png-186.9kB


  1. 系统优化启动:
  2. echo "session required pam_limits.so" >> /etc/pam.d/login
  3. cat /etc/pam.d/login
  4. echo "RemoveIPC=no" >> /etc/systemd/logind.conf
  5. service systemd-logind restart
  1. 系统关闭firewalld,iptables,selinux
  2. echo "SELINUX=disabled" > /etc/selinux/config
  3. setenforce 0
  4. systemctl stop firewalld.service
  5. systemctl disable firewalld.service
  6. systemctl status firewalld.service
  7. systemctl set-default multi-user.target

  1. 系统最大透明页
  2. yum install numactl
  3. vim /etc/default/grub
  4. GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet numa=off transparent_hugepage=never elevator=deadline"
  5. grub2-mkconfig -o /etc/grub2.cfg
  6. numastat
  7. numactl --show
  8. numactl --hardware

image_1f9bagam319un1r0rihjmqdnc0m.png-46.5kB

  1. 硬盘读写优化
  2. XFS:
  3. rw,noatime,inode64,allocsize=16m
  4. cat /etc/fstab
  5. /dev/mapper/centos-root / xfs rw,noatime,inode64,allocsize=16m 0 0
  6. UUID=df6d5e29-33c8-4d43-ae34-5573447ef51e /boot xfs defaults 0 0
  7. UUID=f8097a9d-74ed-45b7-ae1b-c0467cbf235d swap swap defaults 0 0
  8. echo "blockdev --setra 65536 /dev/sda " >> /etc/rc.d/rc.local
  9. echo deadline > /sys/block/sda/queue/scheduler

image_1f9bat4b9fcu3vh1k42a1uu0113.png-61.5kB

  1. 系统时间同步问题:(所有节点同步阿里云时间服务器)
  2. vim /etc/chrony.conf
  3. ----
  4. server ntp1.aliyun.com iburst
  5. ----
  6. service chronyd stop
  7. service chrnoyd start

image_1f9bb7u6vad1hk27l5vs2195o1t.png-103.2kB


2.3 greenplum 的依赖包

  1. yum -y install openssh-clients gcc gcc-c++ make automake autoconf libtool krb5-devel perl rsync coreutils glib2 ed lrzsz sysstat e4fsprogs xfsprogs ntp readline-devel zlib zlib-devel unzip bzip2 libyaml zip libevent

image_1f9bb1dui1dvr1sehd2edf1v671g.png-137.7kB


2.4 greenplum 安装环境

  1. 下载安装包(官方下载)
  2. 全部节点配置:
  3. 安装目录:/usr/local/greeplum-db
  4. mkdir -p /greenplum/gpdata
  5. mkdir -p /greenplum/soft
  6. 安装rpm 安装
  7. rpm -ivh greenplum-db-6.16.2-rhel7-x86_64.rpm

image_1f9bbevt61rpd156f4pd1d261pe72a.png-246.9kB

image_1f9bbhvbs146k18kd1l6r1ipv40f37.png-54.4kB

  1. 设置greenplum 的安装用户: (所有节点)
  2. useradd gpadmin
  3. echo gpadmin |passwd gpadmin --stdin
  4. 设置安装目录权限:
  5. chown -R gpadmin:gpadmin /usr/local/greenplum*
  6. chmod -R 775 /usr/local/greenplum*
  7. chmod -R 775 /greenplum
  8. chown -R gpadmin:gpadmin /greenplum

image_1f9bbmas71p4okoo10pj1bpk193f3k.png-106.2kB

image_1f9bbo552s5d1m0e3rh1gvu1teg4u.png-176.2kB

image_1f9bbmtpe1an71qle1o1lmu3gv741.png-137.2kB

image_1f9bboh2o1ajkhlh1t7i1herrva5b.png-135.9kB

image_1f9bbnel3v6g1c8c1gvb1q0s1o2s4h.png-131.7kB


  1. 需要做gpadmin用户的无密钥登录:
  2. su - gpadmin
  3. ssh-keygen ---一直敲回车到最后
  4. cat id_rsa.pub >> authorized_keys
  5. chmod 600 authorized_keys
  6. 将所有的公钥导入authorized_keys 分发到 所有的 机器的.ssh/ 下面
  7. 然后测试

image_1f9bccigpqs1gp2qvl1nb2k1t9.png-183.7kB

  1. 设置主机名配置: (在master节点配置)
  2. su - gpadmin
  3. mkdir /usr/local/greenplum-db/config
  4. vim /usr/local/greenplum-db/config/all_hosts.txt
  5. ----
  6. node01.flyfish.com
  7. node02.flyfish.com
  8. node03.flyfish.com
  9. node04.flyfish.com
  10. node05.flyfish.com
  11. ----

  1. vim /usr/local/greenplum-db/config/all_seg.txt
  2. ---
  3. node03.flyfish.com
  4. node04.flyfish.com
  5. node05.flyfish.com
  6. ---

  1. 设置gpadmin的环境变量 (全部节点)
  2. su - gpadmin
  3. vim .bash_profile
  4. ----
  5. source /usr/local/greenplum-db/greenplum_path.sh
  6. #source /greenplum/greenplum-cc-6.3.1/gpcc_path.sh
  7. export MASTER_DATA_DIRECTORY=/greenplum/gpdata/master/gpseg-1
  8. export PGHOME=/usr/local/greenplum-db
  9. export PGPORT=5432
  10. export PGDATABASE=postgres
  11. export PGUSER=gpadmin
  12. ----

  1. vim .bashrc
  2. ----
  3. source /usr/local/greenplum-db/greenplum_path.sh
  4. #source /greenplum/greenplum-cc-6.3.1/gpcc_path.sh
  5. export MASTER_DATA_DIRECTORY=/greenplum/gpdata/master/gpseg-1
  6. export PGHOME=/usr/local/greenplum-db
  7. export PGPORT=5432
  8. export PGDATABASE=postgres
  9. export PGUSER=gpadmin
  10. ----
  11. source /home/gpadmin/.bashrc
  12. source /home/gpadmin/.bash_profile

image_1f9bcq9oqadq1irsom21cqn1t4cm.png-222.1kB

  1. 所有主机测试:
  2. gpssh -f /usr/local/greenplum-db/config/all_hosts.txt -e 'ls -ls /greenplum'
  3. gpssh -f /usr/local/greenplum-db/config/all_hosts.txt -e 'date'

image_1f9bd25li1gntr3qt3v1pj97co13.png-233.1kB

image_1f9bd2rd3f48r2v16rpeeha91g.png-129.9kB

  1. 设置主机安装目录:
  2. su - gpadmin
  3. mkdir -p /greenplum/gpdata/master (主节点与备份节点配置)
  4. gpssh -f /usr/local/greenplum-db/config/all_hosts.txt -e 'mkdir -p /greenplum/gpdata/primary'
  5. gpssh -f /usr/local/greenplum-db/config/all_hosts.txt -e 'mkdir -p /greenplum/gpdata/mirror'

image_1f9bd5h2l103u2ru11ru1bqc11is1t.png-195.3kB

  1. 集群节点安装
  2. vim /usr/local/greenplum-db/config/gpinitsystem_config
  3. ---
  4. ARRAY_NAME="greenplum"
  5. SEG_PREFIX=gpseg
  6. PORT_BASE=55000
  7. declare -a DATA_DIRECTORY=(/greenplum/gpdata/primary /greenplum/gpdata/primary /greenplum/gpdata/primary)
  8. MASTER_HOSTNAME=node01.flyfish.com
  9. MASTER_DIRECTORY=/greenplum/gpdata/master
  10. MASTER_PORT=5432
  11. MACHINE_LIST_FILE=/usr/local/greenplum-db/config/all_seg.txt
  12. TRUSTED_SHELL=ssh
  13. CHECK_POINT_SEGMENTS=8
  14. ENCODING=UNICODE
  15. MIRROR_PORT_BASE=56000
  16. REPLICATION_PORT_BASE=57000
  17. MIRROR_REPLICATION_PORT_BASE=58000
  18. declare -a MIRROR_DATA_DIRECTORY=(/greenplum/gpdata/mirror /greenplum/gpdata/mirror /greenplum/gpdata/mirror)
  19. ----

image_1f9bd8v521b1fggvm8v1ls2116d2a.png-159.4kB

  1. 初始化集群:
  2. vim gpinit.sh
  3. ----
  4. gpinitsystem -c /usr/local/greenplum-db/config/gpinitsystem_config -a -h /usr/local/greenplum-db/config/all_seg.txt -s node02.flyfish.com -D -B 2
  5. ----
  6. chmod +x gpinit.sh
  7. ./gpinit.sh

image_1f9bdeldca69110qc6i12kj4862n.png-43.3kB

  1. 初始化到最后:

image_1f9bdfbe5v0p19na1llf13cb3ma34.png-324.1kB

image_1f9bdgeba1fbk149j1k3mh8jnvh3h.png-100kB

  1. 查看GP 分段情况:
  2. select * from gp_segment_configuration order by content asc,dbid;

image_1f9bdin14195ud5pqmh6if1al3u.png-297.6kB

image_1f9bdk1c0kfdtio2q1opq1hbi4b.png-162.2kB

image_1f9bdknvm13l217snikc1gju6l64o.png-377.6kB

添加新批注
在作者公开此批注前,只有你和作者可见。
回复批注