@devilogic
        
        2017-07-22T10:47:51.000000Z

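For the past two years I have not worked on any particularly concrete technology and have mostly been studying algorithms, reading through a messy pile of them. Honestly, it is not that I don't want to put in the effort. The competition has advanced so little technically, and customers care so much about product form, that my enthusiasm has mostly drained away. Some hardening vendors put no real effort into technology and think only about how to write ad copy. After more than three years they still can't manage a decent so protection: they encrypt the whole so file and have a proxy dex decrypt and load it. Some testing organizations actually claim this is stronger because not even the ELF format remains, and have written that into their official test reports. I would really like to say to their faces: could you please use your heads?
Certain hardening vendors (no names, to keep the peace) boast about so-called dual VMP protection. If you didn't brag like that, I honestly couldn't be bothered to argue with you.
I haven't paid attention to the dex side since founding the company. I took a look recently, though, and it is quite fun. I may share some interesting protection schemes in the second half of the year.
娜迦 has had an so protection shell since 2013, and later became able to link two so files together and merge them into a single so file. After promoting this for a while, though, we found that customers don't really care about it; as long as the effect is visible statically, that is enough. I had planned to develop a 64-bit version of the merging technique, but company chores got in the way. The main reason, though, is that customers don't care. After all, most customers have no need for real-time adversarial defense.
Having taken over the sales department last month, I can see that our ad copy is poor and our publicity misses the point. Several million a year in conference fees bought us a pile of trophies that did nothing for the company's results, whereas one of our competitors does its marketing through offline industry salons, which I personally think works much better.
My principle is that even advertising shouldn't, as some hardening vendors' does, make software protection technology feel like an instant noodle commercial. If I want to help sales, then as a technical person I have to go back and study the technology again.
Yesterday at 10 a.m., during a delay at Hongqiao Airport, I was thinking about where to start in getting back into hardening technology, so I browsed through some programs I had written before. Alibaba's automatic update technology only this year reached the point of needing no SDK integration. We had done that two years ago, but we weren't big enough to maintain two flagship products, the market didn't pick it up either, and so we dropped the project. Then last year the company wanted automatic update to work silently even when configuration files were modified, and the project lead asked me to write a program that could uniformly replace all the symbol names in an so. I happened to have some experience with this from writing an so linker, so I wrote one for him. When the project was later abandoned, the code ended up buried on my hard drive.
Getting reacquainted needs a starting point, so let's start with how to modify a symbol name in an so.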
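Android has gradually come to support two hash table formats; the linker gained DT_GNU_HASH support in API level 23 (Android 6.0), so it is present in the Android 7 code discussed here.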
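Android provides two hash algorithms: the classic System V ELF hash (DT_HASH), which is the standard ELF scheme rather than something specific to the Android team, and the GNU hash (DT_GNU_HASH). See the function bool soinfo::prelink_image() in linker.cpp under android-linker-7_preview\. It walks every entry of the dynamic segment and records the information from each one. Below is the code that handles the hash tables.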

    /* SysV hash table (DT_HASH) */
    case DT_HASH:
      /* number of buckets in the hash table */
      nbucket_ = reinterpret_cast<uint32_t*>(load_bias + d->d_un.d_ptr)[0];
      /* number of chain entries in the hash table */
      nchain_ = reinterpret_cast<uint32_t*>(load_bias + d->d_un.d_ptr)[1];
      /* address of the bucket array */
      bucket_ = reinterpret_cast<uint32_t*>(load_bias + d->d_un.d_ptr + 8);
      /* address of the chain array */
      chain_ = reinterpret_cast<uint32_t*>(load_bias + d->d_un.d_ptr + 8 + nbucket_ * 4);
      break;

    /* GNU hash table (DT_GNU_HASH) */
    case DT_GNU_HASH:
      gnu_nbucket_ = reinterpret_cast<uint32_t*>(load_bias + d->d_un.d_ptr)[0];
      // skip symndx
      gnu_maskwords_ = reinterpret_cast<uint32_t*>(load_bias + d->d_un.d_ptr)[2];
      gnu_shift2_ = reinterpret_cast<uint32_t*>(load_bias + d->d_un.d_ptr)[3];

      gnu_bloom_filter_ = reinterpret_cast<ElfW(Addr)*>(load_bias + d->d_un.d_ptr + 16);
      gnu_bucket_ = reinterpret_cast<uint32_t*>(gnu_bloom_filter_ + gnu_maskwords_);
      // amend chain for symndx = header[1]
      gnu_chain_ = gnu_bucket_ + gnu_nbucket_ -
          reinterpret_cast<uint32_t*>(load_bias + d->d_un.d_ptr)[1];

      if (!powerof2(gnu_maskwords_)) {
        DL_ERR("invalid maskwords for gnu_hash = 0x%x, in \"%s\" expecting power to two",
               gnu_maskwords_, get_realpath());
        return false;
      }
      --gnu_maskwords_;

      /* mark this library as using the GNU hash */
      flags_ |= FLAG_GNU_HASH;
      break;

Judging from the way the structure information is read above, the two algorithms produce differently shaped hash tables.
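To make the difference between the two layouts concrete, here is a minimal sketch of how the words read in prelink_image() map onto a raw buffer. The names SysvHashView, parse_sysv_hash, sysv_hash_bytes and gnu_hash_bytes are made up for illustration; they are not part of the linker or of my library. The GNU size calculation assumes the caller already knows how many symbols are covered by the table (symbol count minus symndx) and the size of one bloom filter word (4 bytes on 32-bit, 8 bytes on 64-bit).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical read-only view over a DT_HASH table that is already in memory.
struct SysvHashView {
    uint32_t nbucket;        // word 0: number of buckets
    uint32_t nchain;         // word 1: number of chain entries
    const uint32_t* bucket;  // nbucket words, starting at byte offset 8
    const uint32_t* chain;   // nchain words, right after the buckets
};

// Mirrors the DT_HASH case above: two header words, then bucket[], then chain[].
SysvHashView parse_sysv_hash(const uint32_t* words) {
    assert(words != nullptr);
    SysvHashView v;
    v.nbucket = words[0];
    v.nchain = words[1];
    v.bucket = words + 2;
    v.chain = v.bucket + v.nbucket;
    return v;
}

// Total size in bytes of a DT_HASH table.
size_t sysv_hash_bytes(uint32_t nbucket, uint32_t nchain) {
    return sizeof(uint32_t) * (2 + static_cast<size_t>(nbucket) + nchain);
}

// Total size in bytes of a DT_GNU_HASH table:
// 16-byte header (nbucket, symndx, maskwords, shift2), the bloom filter,
// the buckets, and one chain word per hashed symbol.
size_t gnu_hash_bytes(uint32_t nbucket, uint32_t maskwords,
                      size_t bloom_word_bytes, uint32_t hashed_syms) {
    return 16 + static_cast<size_t>(maskwords) * bloom_word_bytes +
           sizeof(uint32_t) * (static_cast<size_t>(nbucket) + hashed_syms);
}
```

The DT_HASH table has no filter in front of it and one chain word per symbol table entry, while the GNU table puts a bloom filter before the buckets and only covers symbols from symndx onward.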
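If we want to replace symbols or add new ones, we have to construct a hash table. Implementing just one of the two algorithms would actually suffice, but to make our program more robust, this post looks at the structure of both hash tables and at building a symbol database.
Let's look first at the DT_HASH table structure.
Still in the same source, there is a function called find_symbol_by_name, which looks a symbol up by its name. Other functions call it as the low-level symbol lookup primitive.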

    bool soinfo::find_symbol_by_name(SymbolName& symbol_name,
                                     const version_info* vi,
                                     const ElfW(Sym)** symbol) const {
      uint32_t symbol_index;
      bool success =
          is_gnu_hash() ?
          gnu_lookup(symbol_name, vi, &symbol_index) :
          elf_lookup(symbol_name, vi, &symbol_index);

      if (success) {
        *symbol = symbol_index == 0 ? nullptr : symtab_ + symbol_index;
      }

      return success;
    }

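As the code shows, it in turn calls either gnu_lookup or elf_lookup. We will analyze these two functions in turn to work backwards to the structure of the two hash tables. The second parameter is a version_info structure carrying version information. Let's first cover how the symbol database is built, and analyze symbol versioning afterwards.
Let's start with the DT_HASH lookup.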
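
    bool soinfo::elf_lookup(SymbolName& symbol_name,
                            const version_info* vi,
                            uint32_t* symbol_index) const {
      /* compute the hash of the symbol name */
      uint32_t hash = symbol_name.elf_hash();

      TRACE_TYPE(LOOKUP, "SEARCH %s in %s@%p h=%x(elf) %zd",
                 symbol_name.get_name(), get_realpath(),
                 reinterpret_cast<void*>(base), hash, hash % nbucket_);

      /* resolve the required version index from the version info */
      ElfW(Versym) verneed = 0;
      if (!find_verdef_version_index(vi, &verneed)) {
        return false;
      }

      /* The hash table structure is visible here.
       * nbucket_ is the total number of buckets; the hash modulo nbucket_ selects
       * the bucket this symbol belongs to. The value stored in that bucket is the
       * first candidate, and it is simply an index into the symbol table.
       */
      for (uint32_t n = bucket_[hash % nbucket_]; n != 0; n = chain_[n]) {
        /* fetch the symbol from the symbol table */
        ElfW(Sym)* s = symtab_ + n;
        /* fetch the matching symbol version entry (can be skipped for this post) */
        const ElfW(Versym)* verdef = get_versym(n);

        /* this is also about symbol versions; skip it for now */
        // skip hidden versions when verneed == 0
        if (verneed == kVersymNotNeeded && is_versym_hidden(verdef)) {
          continue;
        }

        /* The part to focus on is
         * strcmp(get_string(s->st_name), symbol_name.get_name()) == 0:
         * if the candidate's name equals the target name, the symbol is found.
         */
        if (check_symbol_version(verneed, verdef) &&
            strcmp(get_string(s->st_name), symbol_name.get_name()) == 0 &&
            is_symbol_global_and_defined(this, s)) {
          TRACE_TYPE(LOOKUP, "FOUND %s in %s (%p) %zd",
                     symbol_name.get_name(), get_realpath(),
                     reinterpret_cast<void*>(s->st_value),
                     static_cast<size_t>(s->st_size));
          *symbol_index = n;
          return true;
        }
        /* If this was not the symbol, the current symbol index is used as an index
         * into chain_. In other words, symbols that land in the same bucket are
         * recorded in one linked list. Each node's value is a symbol index, and if
         * that is not the symbol being looked for, the same value also gives the
         * position of the next node in chain_. Because chain_ is indexed by symbol
         * index, it needs one entry per symbol table entry (nchain equals the
         * number of symbols, including the null symbol at index 0).
         */
      }

      /* symbol not found */
      TRACE_TYPE(LOOKUP, "NOT FOUND %s in %s@%p %x %zd",
                 symbol_name.get_name(), get_realpath(),
                 reinterpret_cast<void*>(base), hash, hash % nbucket_);

      *symbol_index = 0;
      return true;
    }
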
A simplified version of the code above:

    GElf_Sym *elf_symbase::find(std::string name, GElf_Sym *dst) {
        if (_hashtab == NULL ||
            _symtab == NULL ||
            _strtab == NULL) {
            return NULL;
        }

        uint32_t hv = hash(name.c_str());
        if (hv == 0) {
            return NULL;
        }

        unsigned nbucket = 0, nchain = 0, *bucket = NULL, *chain = NULL;
        if (_hash_type == DT_HASH) {
            unsigned char *pot = reinterpret_cast<unsigned char*>(_hashtab);
            nbucket = reinterpret_cast<unsigned*>(pot)[0];
            nchain = reinterpret_cast<unsigned*>(pot)[1];
            bucket = reinterpret_cast<unsigned*>(pot + 8);
            chain = reinterpret_cast<unsigned*>(pot + 8 + nbucket * 4);
        }
        /* only the DT_HASH layout is handled here; also avoids a modulo by zero */
        if (bucket == NULL || nbucket == 0) {
            return NULL;
        }

        unsigned n = bucket[hv % nbucket];
        for (; n != 0; n = chain[n]) {
            /* valid indices are 0 .. _symc - 1 */
            if (n >= _symc) {
                printf_msg("[-]symbol index is over range\r\n");
                return NULL;
            }

            if (_class == ELFCLASS32) {
                Elf32_Sym *src = reinterpret_cast<Elf32_Sym*>(_symtab) + n;
    #define COPY(name) dst->name = src->name
                COPY (st_name);
                COPY (st_info);
                COPY (st_other);
                COPY (st_shndx);
                COPY (st_value);
                COPY (st_size);
    #undef COPY
            } else {
                /* ELFCLASS64 */
                *dst = reinterpret_cast<Elf64_Sym*>(_symtab)[n];
            }

            char *finds = _strtab + dst->st_name;
            //if (strcmp(finds, name)) continue;
            if (name != finds) continue;

            /* got it */
            return dst;
        }/* end for */

        return NULL;
    }

Ignore some of the structures above; they are types exported by libelf. It is convenient to use, so I work from a modified copy of the libelf code.
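Both lookups begin by hashing the symbol name, but the hash functions themselves are not shown in the listings above. As a reference sketch, these are the two well-known functions as the ELF specification and the GNU toolchain define them. The names sysv_elf_hash and gnu_elf_hash are placeholders chosen here; my library calls its own versions elf_hash and elf_gnu_hash, whose bodies are assumed to be equivalent.

```cpp
#include <cstdint>

// Classic System V ELF hash, used with DT_HASH.
// The top four bits are folded back in, so the result always fits in 28 bits.
uint32_t sysv_elf_hash(const char* name) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(name);
    uint32_t h = 0;
    while (*p != 0) {
        h = (h << 4) + *p++;
        uint32_t g = h & 0xf0000000u;
        h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

// GNU hash (a djb2 variant: start at 5381, multiply by 33), used with DT_GNU_HASH.
uint32_t gnu_elf_hash(const char* name) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(name);
    uint32_t h = 5381;
    while (*p != 0) {
        h = h * 33 + *p++;
    }
    return h;
}
```

For example, the name "exit" hashes to 0x0006cf04 with the System V function and to 0x7c967e3f with the GNU one, so a table built for one algorithm cannot be searched with the other.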
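Constructing a hash table is not entirely easy, because the symbol table and string table have to be built along with it. The functions below come from a library I wrote, which implements the construction of the symbol database, hash table and string table more or less completely. Because there is a lot of code, only the most important operation, adding an entry to the hash table, is analyzed here, followed by the complete library code.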
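
    int elf_symbase::hashtab_chain_add(unsigned *chain,
                                       int index, int symtab_index) {
        /* Check whether this chain slot is free. If it is, store the symbol index
         * there and set the new node's own next pointer to 0.
         * This is the equivalent of node->next = NULL.
         */
        if (chain[index] == 0) {
            /* free slot: store and return, the add succeeded */
            chain[index] = symtab_index;
            chain[symtab_index] = 0;
            return 0;
        }

        /* No free slot: use the value of the current node as the index of the
         * next node and add recursively.
         */
        index = chain[index];
        return hashtab_chain_add(chain, index, symtab_index);
    }
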
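The recursive insert above can also be written as a loop that walks to the first empty slot. The following is a standalone sketch, not code from my library: build_sysv_hash and lookup_sysv_hash are hypothetical helpers that keep symbol names in a plain std::vector (index 0 standing in for the null symbol) instead of real symbol and string tables, so that building and lookup can be tried in isolation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Standard System V ELF hash.
static uint32_t sysv_hash(const char* s) {
    uint32_t h = 0;
    for (const unsigned char* p = reinterpret_cast<const unsigned char*>(s); *p; ++p) {
        h = (h << 4) + *p;
        uint32_t g = h & 0xf0000000u;
        h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

// Returns the table as words: [nbucket, nchain, bucket[nbucket], chain[nchain]].
// names[0] is the null symbol and is never inserted.
std::vector<uint32_t> build_sysv_hash(const std::vector<std::string>& names,
                                      uint32_t nbucket) {
    assert(nbucket > 0);
    const uint32_t nchain = static_cast<uint32_t>(names.size());
    std::vector<uint32_t> table(2 + nbucket + nchain, 0);
    table[0] = nbucket;
    table[1] = nchain;
    uint32_t* bucket = table.data() + 2;
    uint32_t* chain = bucket + nbucket;
    for (uint32_t i = 1; i < nchain; ++i) {
        // Start at the bucket, then follow chain links until an empty slot.
        uint32_t* slot = &bucket[sysv_hash(names[i].c_str()) % nbucket];
        while (*slot != 0) {
            slot = &chain[*slot];
        }
        *slot = i;  // chain[i] is still 0, which terminates the list
    }
    return table;
}

// Returns the symbol index of name, or 0 if it is not in the table.
uint32_t lookup_sysv_hash(const std::vector<uint32_t>& table,
                          const std::vector<std::string>& names,
                          const std::string& name) {
    const uint32_t nbucket = table[0];
    const uint32_t* bucket = table.data() + 2;
    const uint32_t* chain = bucket + nbucket;
    for (uint32_t n = bucket[sysv_hash(name.c_str()) % nbucket];
         n != 0 && n < names.size(); n = chain[n]) {
        if (names[n] == name) return n;
    }
    return 0;
}
```

With a single bucket every symbol ends up on one list, which makes the linking easy to inspect: the bucket holds 1, chain[1] holds 2, and so on until a 0 ends the list.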

    #define DEF_HASH_NBUCKET 0x20
    #define DEF_SYMNAME_LEN 0x100

    elf_symbase::elf_symbase(int cls, int hash_type) {
        _symc = 0;
        _strc = 0;
        _strsz = 0;
        _symtab = NULL;
        _strtab = NULL;
        _hashtab = NULL;
        _index = 0;
        _strtab_offset = 0;
        _symtab_size = 0;
        _hashtab_size = 0;
        _class = cls;
        _hash_type = hash_type;
    }

    elf_symbase::~elf_symbase() {
        close();
    }

    int elf_symbase::count_hashsym(void* hashtab,
                                   unsigned *nbucket, unsigned *nchain) {
        unsigned char *p = reinterpret_cast<unsigned char*>(hashtab);
        *nbucket = *reinterpret_cast<unsigned*>(p);
        *nchain = *reinterpret_cast<unsigned*>(p + 4);
        return 0;
    }

    int elf_symbase::init(int nbucket, int nchain,
                          int cls, int hash_type) {
        _class = cls;
        _hash_type = hash_type;

        if (nbucket == 0) nbucket = DEF_HASH_NBUCKET;
        int syms = nchain;
        int curr_size = 0;

        /* the first symbol is the null symbol */
        _symc = syms + 1;
        _strc = 0;
        /* allocate _symc entries so the null symbol plus all syms symbols fit */
        void *symtab = symtab_create(_symc, &curr_size);
        if (symtab == NULL) return -1;
        _symtab = symtab;

        _hashtab = hashtab_create(nbucket, syms);
        if (_hashtab == NULL) return -2;

        _strsz = _symc * DEF_SYMNAME_LEN;
        _strtab = new char [_strsz];
        if (_strtab == NULL)
            return -3;
        memset(_strtab, 0, _strsz);
        _strtab_offset = 1;

        /* record the symbol table and hash table sizes */
        _symtab_size = curr_size;
        /* two header words, then nbucket buckets and nchain + 1 chain entries,
         * matching what hashtab_create allocates */
        _hashtab_size = (2 + nbucket + nchain + 1) * sizeof(unsigned);

        /* skip the first (null) symbol */
        _index = 1;

        return 0;
    }

    int elf_symbase::close() {
        symtab_release();
        hashtab_release();
        if (_strtab) {
            delete [] _strtab;
            _strtab = NULL;
        }

        _symc = 0;
        _strc = 0;
        _strsz = 0;
        _symtab = NULL;
        _strtab = NULL;
        _hashtab = NULL;
        _index = 0;
        _strtab_offset = 0;
        _symtab_size = 0;
        _hashtab_size = 0;
        _class = ELFCLASS32;
        _hash_type = DT_HASH;

        return 0;
    }

    int elf_symbase::add(std::string name,
                         unsigned st_value,
                         unsigned st_size,
                         unsigned bind,
                         unsigned type,
                         unsigned char st_other,
                         unsigned short st_shndx) {
        if (name.empty()) return -1;

        /* add to the string table */
        unsigned st_name = strtab_add(name);

        /* add to the symbol table */
        int index = _index;
        int ret = symtab_add(index,
                             st_name,
                             st_value,
                             st_size,
                             bind,
                             type,
                             st_other,
                             st_shndx);
        if (ret != 0) return -2;

        /* add to the hash table */
        ret = hashtab_add(name.c_str(), index);
        if (ret != 0) return -3;

        /* advance the symbol index */
        index++;
        _index = index;

        return 0;
    }

    GElf_Sym *elf_symbase::find(std::string name, GElf_Sym *dst) {
        if (_hashtab == NULL ||
            _symtab == NULL ||
            _strtab == NULL) {
            return NULL;
        }

        uint32_t hv = hash(name.c_str());
        if (hv == 0) {
            return NULL;
        }

        unsigned nbucket = 0, nchain = 0, *bucket = NULL, *chain = NULL;
        if (_hash_type == DT_HASH) {
            unsigned char *pot = reinterpret_cast<unsigned char*>(_hashtab);
            nbucket = reinterpret_cast<unsigned*>(pot)[0];
            nchain = reinterpret_cast<unsigned*>(pot)[1];
            bucket = reinterpret_cast<unsigned*>(pot + 8);
            chain = reinterpret_cast<unsigned*>(pot + 8 + nbucket * 4);
        }
        /* only the DT_HASH layout is handled here; also avoids a modulo by zero */
        if (bucket == NULL || nbucket == 0) {
            return NULL;
        }

        unsigned n = bucket[hv % nbucket];
        for (; n != 0; n = chain[n]) {
            /* valid indices are 0 .. _symc - 1 */
            if (n >= _symc) {
                printf_msg("[-]symbol index is over range\r\n");
                return NULL;
            }

            if (_class == ELFCLASS32) {
                Elf32_Sym *src = reinterpret_cast<Elf32_Sym*>(_symtab) + n;
    #define COPY(name) dst->name = src->name
                COPY (st_name);
                COPY (st_info);
                COPY (st_other);
                COPY (st_shndx);
                COPY (st_value);
                COPY (st_size);
    #undef COPY
            } else {
                /* ELFCLASS64 */
                *dst = reinterpret_cast<Elf64_Sym*>(_symtab)[n];
            }

            char *finds = _strtab + dst->st_name;
            //if (strcmp(finds, name)) continue;
            if (name != finds) continue;

            /* got it */
            return dst;
        }/* end for */

        return NULL;
    }

    /* returns -1 if not found */
    int elf_symbase::find_index(std::string name) {
        if (_hashtab == NULL ||
            _symtab == NULL ||
            _strtab == NULL) {
            return -1;
        }

        uint32_t hv = hash(name.c_str());
        if (hv == 0) {
            return -1;
        }

        unsigned nbucket = 0, nchain = 0, *bucket = NULL, *chain = NULL;
        if (_hash_type == DT_HASH) {
            unsigned char *pot = reinterpret_cast<unsigned char*>(_hashtab);
            nbucket = reinterpret_cast<unsigned*>(pot)[0];
            nchain = reinterpret_cast<unsigned*>(pot)[1];
            bucket = reinterpret_cast<unsigned*>(pot + 8);
            chain = reinterpret_cast<unsigned*>(pot + 8 + nbucket * 4);
        }
        /* only the DT_HASH layout is handled here; also avoids a modulo by zero */
        if (bucket == NULL || nbucket == 0) {
            return -1;
        }

        size_t st_name = 0;
        unsigned n = bucket[hv % nbucket];
        for (; n != 0; n = chain[n]) {
            /* valid indices are 0 .. _symc - 1 */
            if (n >= _symc) {
                printf_msg("[-]symbol index is over range\r\n");
                return -1;
            }

            if (_class == ELFCLASS32) {
                Elf32_Sym *s = reinterpret_cast<Elf32_Sym*>(_symtab) + n;
                st_name = s->st_name;
            } else {
                /* ELFCLASS64 */
                Elf64_Sym *s = reinterpret_cast<Elf64_Sym*>(_symtab) + n;
                st_name = s->st_name;
            }

            char *finds = _strtab + st_name;
            //if (strcmp(finds, name)) continue;
            if (name != finds) continue;

            /* got it */
            return n;
        }/* end for */

        return -1;
    }

    void elf_symbase::print() {
        unsigned strlens = 0;
        char *name = _strtab + 1;    /* skip the leading 0 byte */
        unsigned offset = 1, symc = 0, strc = 0;
        for (unsigned i = 0; i < _strc; i++) {
            GElf_Sym sym_mem;
            GElf_Sym *sym = find(name, &sym_mem);
            if (sym == NULL) {
                printf_msg("[symbase]string : %s\r\n", name);
                strc++;
            } else {
                /* GElf_Sym fields are 64-bit wide, so cast to match the format */
                printf_msg("[symbase]symbol : %s(%d) value:0x%04llx, size:%llu, info:%d\r\n",
                           name, sym->st_name,
                           static_cast<unsigned long long>(sym->st_value),
                           static_cast<unsigned long long>(sym->st_size),
                           sym->st_info);
                symc++;
            }
            strlens = strlen(name) + 1;
            offset += strlens;
            name = _strtab + offset;
        }
        printf_msg("[symbase]%d symbols, %d strings\r\n", symc, strc);
        return;
    }

    void *elf_symbase::symtab_create(unsigned count, int* psize) {
        void *res = NULL;
        unsigned size = 0;
        unsigned char *s = NULL;
        if (_class == ELFCLASS32) {
            size = sizeof(Elf32_Sym) * count;
        } else {
            size = sizeof(Elf64_Sym) * count;
        }

        s = new unsigned char [size];
        if (s == NULL) {
            printf_msg("[-]new unsigned char [%d] failed", size);
            return NULL;
        }
        memset(s, 0, size);
        if (psize) *psize = size;
        res = reinterpret_cast<void*>(s);

        return res;
    }

    void elf_symbase::symtab_release() {
        if (_symtab) {
            delete [] reinterpret_cast<unsigned char*>(_symtab);
            _symtab = NULL;
        }
    }

    int elf_symbase::symtab_add(int index,
                                unsigned st_name,
                                unsigned st_value,
                                unsigned st_size,
                                unsigned bind,
                                unsigned type,
                                unsigned char st_other,
                                unsigned short st_shndx) {
        if (_symtab == NULL) return -1;

        if (_class == ELFCLASS32) {
            Elf32_Sym v;
            unsigned char st_info = ELF32_ST_INFO(bind, type);
            v.st_name = st_name;
            v.st_value = st_value;
            v.st_size = st_size;
            v.st_info = st_info;
            v.st_other = st_other;
            v.st_shndx = st_shndx;
            memcpy(reinterpret_cast<Elf32_Sym*>(_symtab) + index,
                   &v, sizeof(Elf32_Sym));
        } else {
            Elf64_Sym v;
            unsigned char st_info = ELF64_ST_INFO(bind, type);
            v.st_name = st_name;
            v.st_value = st_value;
            v.st_size = st_size;
            v.st_info = st_info;
            v.st_other = st_other;
            v.st_shndx = st_shndx;
            memcpy(reinterpret_cast<Elf64_Sym*>(_symtab) + index,
                   &v, sizeof(Elf64_Sym));
        }

        return 0;
    }

    void *elf_symbase::hashtab_create(unsigned n, unsigned syms) {
        unsigned nbucket = n;
        unsigned nchain = syms + 1;    /* plus the null symbol at index 0 */
        unsigned hashtab_size = 4 + 4 + (4 * nbucket) + (4 * nchain);
        unsigned char *hashtab = new unsigned char [hashtab_size];
        memset(hashtab, 0, hashtab_size);
        *reinterpret_cast<unsigned*>(hashtab) = nbucket;
        *(reinterpret_cast<unsigned*>(hashtab) + 1) = nchain;
        return reinterpret_cast<void*>(hashtab);
    }

    void elf_symbase::hashtab_release() {
        if (_hashtab) {
            delete [] reinterpret_cast<unsigned char*>(_hashtab);
            _hashtab = NULL;
        }
    }

    int elf_symbase::hashtab_chain_add(unsigned *chain,
                                       int index, int symtab_index) {
        if (chain[index] == 0) {
            /* free slot: store and return, the add succeeded */
            chain[index] = symtab_index;
            chain[symtab_index] = 0;
            return 0;
        }

        /* no free slot: follow the link and keep adding */
        index = chain[index];
        return hashtab_chain_add(chain, index, symtab_index);
    }

    int elf_symbase::hashtab_add(const char* name, int symtab_index) {
        if (_hashtab == NULL) return -1;
        if (strlen(name) == 0) return -2;
        if (symtab_index <= 0) return -3;

        unsigned nbucket = *reinterpret_cast<unsigned*>(_hashtab);
        unsigned nchain = *(reinterpret_cast<unsigned*>(_hashtab) + 1);
        if (static_cast<unsigned>(symtab_index) > nchain) return -4;

        unsigned hv = hash(name);
        unsigned index = hv % nbucket;
        unsigned *bucket =
            reinterpret_cast<unsigned*>(reinterpret_cast<unsigned char*>(_hashtab) + 8);
        unsigned *chain =
            reinterpret_cast<unsigned*>(reinterpret_cast<unsigned char*>(_hashtab) +
                                        8 + (4 * nbucket));
        if (bucket[index] == 0) {
            bucket[index] = symtab_index;
        } else {
            index = bucket[index];
            return hashtab_chain_add(chain, index, symtab_index);
        }

        return 0;
    }

    int elf_symbase::strtab_find(std::string s) {
        unsigned strlens = 0;
        char *name = _strtab + 1;    /* skip the leading 0 byte */
        unsigned offset = 1;
        for (unsigned i = 0; i < _strc; i++) {
            //if (strcmp(name, s) == 0) {
            if (s == name) {
                return offset;
            }
            strlens = strlen(name) + 1;
            offset += strlens;
            name = _strtab + offset;
        }
        return 0;
    }

    int elf_symbase::strtab_add(std::string s) {
        if (_strtab == NULL) {
            return -1;
        }

        /* look the string up first; if it is already there, return its offset */
        int ret = strtab_find(s);
        if (ret) return ret;

        int strtab_offset = _strtab_offset;
        int strlens = s.length();

        /* grow the string table if the new string does not fit */
        unsigned x = static_cast<unsigned>(strtab_offset + strlens + 1);
        if (x > _strsz) {
            /* keep growing until the new string fits */
            while (x > _strsz) {
                _strsz += (0x10 * DEF_SYMNAME_LEN);
            }
            char* tmp = new char [_strsz];
            if (tmp == NULL) {
                printf_msg("[-]new string table failed\r\n");
                return -2;
            }
            /* zero the new buffer so the unused tail stays NUL-filled */
            memset(tmp, 0, _strsz);
            memcpy(tmp, _strtab, strtab_offset);
            delete [] _strtab;
            _strtab = tmp;
        }

        /* copy the new string in */
        memcpy(_strtab + strtab_offset, s.c_str(), strlens);
        *(_strtab + strtab_offset + strlens) = '\0';

        /* advance the write offset */
        int ret_offset = strtab_offset;
        strtab_offset += (strlens+1);
        _strtab_offset = strtab_offset;

        /* bump the string count */
        _strc++;

        return ret_offset;
    }

    unsigned elf_symbase::hash(const char *s) {
        if (_hash_type == DT_HASH) {
            return elf_hash(s);
        } else if (_hash_type == DT_GNU_HASH) {
            return elf_gnu_hash(s);
        }
        printf_msg("[-]unknown hash type %d\r\n", _hash_type);
        return 0;
    }

    unsigned elf_symbase::get_symbol_count() {
        return _symc;
    }

    unsigned elf_symbase::get_string_count() {
        return _strc;
    }

    unsigned elf_symbase::get_symtab_size() {
        return _symtab_size;
    }

    unsigned elf_symbase::get_hashtab_size() {
        return _hashtab_size;
    }

    unsigned elf_symbase::get_strtab_size() {
        return _strtab_offset;
    }

    int elf_symbase::read_symtab(void *dst) {
        memcpy(dst, _symtab, _symtab_size);
        return 0;
    }

    int elf_symbase::read_strtab(void *dst) {
        memcpy(dst, _strtab, _strtab_offset);
        return 0;
    }

    int elf_symbase::read_hashtab(void *dst) {
        memcpy(dst, _hashtab, _hashtab_size);
        return 0;
    }

    int elf_symbase::get_hash_type() {
        return _hash_type;
    }

    const void *elf_symbase::get_symtab() {
        return _symtab;
    }

    const void *elf_symbase::get_strtab() {
        return _strtab;
    }

    const void *elf_symbase::get_hashtab() {
        return _hashtab;
    }