@smilence
2016-04-10T01:56:13.000000Z
字数 5531
阅读 13325
1.Hash Table
Hash table几乎是最为重要的数据结构,主要用于基于"key"的查找,存储的基本元素是key-value的pair。
两种实现方式: Open hashing & Closed hashing
区别: whether the collisions are stored outside of the table or at other slots inside the table
Open hashing(separate chaining): Easier to implement; more efficient for large records or sparse tables; Typically, table index = hash(key) % array_length.
Closed hashing(open addressing): More complex but can be more efficient, esp for small data records.
特别地,Hash table还可以用Balanced BST来实现,如std::map, lookup的时间由增长为,但可能节省space cost ,且可以按照key的值有序输出pairs。
2.标准库的使用
对于C++,
map<Key,Val,key_comp> table ,对于严格弱序判断的不同定义,可以通过重载<操作符或者定义functor class实现。
注意对于map无需重载==运算符,因为 A==B 等效于 !(A<B)&&!(B<A)。
对于unordered_map<Key,Val>则更加复杂,需要定义:
(1) A hash function, 用于确定Index即hash值;(2) 重载==操作符,用于区分hash值相同的不同key。
详见:http://stackoverflow.com/questions/17016175/c-unordered-map-using-a-custom-class-type-as-the-key
对于JAVA,需要对HashMap重定义equals()与hashCode()。
3.Reverse array or string:
(1) 用头尾两个指针,相向遍历,swap元素;
(2) 顺序遍历元素,push入stack,再依次pop。
4.C++语言中2D-Array的初始化与参数传递
(1) Array的边界在Compile Time内确定
void func(int arr[M][N]){ // M is optional, but N is required...}
(2) Array的边界在Runtime内确定,则传递二维指针
void func( int **arr, int M, int N ){...}void main(){int **arr = new int*[M]; // or (int**)malloc( M * sizeof(int*) );for( int i = 0; i < M; i++)arr[i] = new int[N]; // or (int*)malloc( N * sizeof(int) );}
5.String Matching 常见算法
http://en.wikipedia.org/wiki/String_searching_algorithm
Let m be the length of the pattern and let n be the length of the searchable text.
Brute-Force算法: 顺序遍历字符串,遇到首字符匹配,比较substring, O(mn)
Rabin-Karp算法: Use hashing for shift searching. 将每一个匹配子串映射为一个hash值。例如,将子串看做一个多进制数,计算它的值与目标字符串的hash值比较,如果相同,再比较两个字符串是否相同。顺序shift过程中,可以增量计算hash值(扣除最高位,增加最低位)。因此能在average case下做到O(m+n)。
6.Stringstream 的基本用法
stringstream继承iostream,因此与cin,cout输入输出流等价,如>>, <<, cin.get(),getline(cin,str);
常用函数:stringstream可以将string作为参数构造新变量,或str( const string&)传入,或string str() const返回;总是可以用 while( stream.good() ) 判断字符串流的终止,用stream.ignore( 256, '.' );来跳过分隔符。
1.统计一个元素集中元素出现的次数,用Hash Table,即std::unordered_map或std::map。只统计元素出现与否的问题,可以用bitvector即bitset,二进制数来记录。
e.g. 1 Implement an algorithm to determine if a string has all unique characters.
e.g. 2 Write a method to decide if one string is a permutation of the other.
e.g. 3 Given a newspaper and message as two strings, check if the message can be composed using letters in the newspaper.(Google interview)
e.g. 4 Find a line which passes the most number of points. ( CtCI 7.6 )
2. 处理vector,array或string问题的几点细节:
a.数组或string内前驱subproblem推出problem的问题,总是以为范围,检索前驱节点 求解当前节点 。如果只是检索特定的value,则可以用将当前value作为key建立hash table(value取决于你需要什么前驱子问题的信息),使得后驱节点每次检索的复杂度降低到。实际上类似于dynamic programming的思维方式,记录前驱节点的信息,用空间换取重新扫描需要的时间。
e.g.1 Given an array of integers, find two numbers such that they add up to a specific number.(Leetcode: 2Sum)
vector<int> twoSum(vector<int> &numbers, int target) {unordered_map<int,int> numToIndex;vector<int> vi(2);for(auto it = numbers.begin(); it != numbers.end(); it++){if( numToIndex.count(target-*it) ){vi[0] = numToIndex[target-*it] + 1;vi[1] = it - numbers.begin() + 1;return vi;}numToIndex[*it] = it - numbers.begin();}}
e.g.2 Given an unsorted array of integers, find the length of the longest consecutive elements sequence. (Leetcode: Longest Consecutive Sequence)
struct Bound{int high;int low;Bound(int h = 0 , int l = 0){high = h;low = l;}};class Solution {public:int longestConsecutive(vector<int> &num) {unordered_map<int,Bound> table;int local;int max_len = 0;for(int i = 0; i < num.size(); i++){if(table.count(num[i])) continue;local = num[i];int low = local, high = local;if(table.count(local-1))low = table[local-1].low;if(table.count(local+1))high = table[local+1].high;table[low].high = table[local].high = high;table[high].low = table[local].low = low;if(high - low + 1 > max_len)max_len = high - low + 1;}return max_len;}};
b.求最值或总和的问题,总是可以在update的过程中不断记录更新,而不用再次遍历。
c.处理两个不等长的数组或string,总是可以以while( aIndex < aSize && bIndex < bSize )作为结束条件,再处理可能剩下的元素。
d.对于要求in-place的删除或修改,可以用两个int变量分别记录新index与原index,观察:如果改动后size增大,则先resize(int),再倒序赋值;反之,则先顺序赋值,再resize.
e.g. 1 Given input "Mr John Smith", output "Mr%20John%20Smith" (CtCI 1.4)
e.g. 2 Given input "Mr%20John%20Smith" , output "Mr John Smith" (Facebook interview)
e.g. 3 Given two sorted arrays, merge B into A in sorted order. (CtCI 11.1, Leetcode: Merge Sorted Array)
3.对于修改长字符串的问题,通常可以顺序或倒序一次遍历,在O(n)内解决。对于分段处理的情况,则使用两个int变量记录段首、段尾Index来处理(slicing window)。 可以利用C++ string类的 += 操作符来增量输出结果。
e.g.1 Given input -> "I have 36 books, 40 pens2."; output -> "I evah 36 skoob, 40 2snep."(Ebay interview)
e.g.2 Compress the string "aabcccccaaa" to "a2b1c5a3"( CtCI 1.5 )
e.g.3 Leetcode: Longest Substring Without Repeating Characters.e.g.4 Given a string S and a string T, find the minimum window in S which will contain all the characters in T in complexity O(n). (Leetcode: Minimum Window Substring)
*典型的将复杂问题划归为简单问题的思维过程:
http://paste.ofcode.org/auYff2uez54SBCzYetA6ik
如果需要将长字符串分段转换为其他变量,在可以舍弃其分隔符(空格或其他标点)的情况下,可以使用stringstream。注意+或-如果不跳过,会被认为是数字的一部分。
e.g. Convert a version( in a string format ) to a vector of integers
vector<int> verConvert( const string& str ){ //for example, "24.3.17"stringstream stream( str );vector<int> vi;int tmp;while( stream.good() ){stream>>tmp;vi.push_back(tmp);stream.ignore( 256, '.' );}return vi;}
4.对于处理array或string的问题,如果要求 constant space 且允许修改输入,则总是可以尝试排序后再操作(Sorting 或 建立hash table,本质上都是一种分类的过程,将同类元素归并到一起)。
e.g.1 Given an array of integers, find two numbers such that they add up to a specific number.(Leetcode: 2Sum)
e.g.2 Find all unique triplets in the array which gives the sum of zero.(Leetcode: 3Sum)
e.g.3 Remove duplicates from a vector of int.
5.处理2D-Array,通常可以按行遍历,按列遍历,或由外圈向内圈遍历。
e.g.1 Given an NxN matrix, rotate the matrix by 90 degrees.
e.g.2 If mat[i][j] is 0, then set the entire row and column to 0.
6.需要用数字或者bitset来映射/hash的情境, 可以利用进制数的正交性,即:数字的高位和低位是互相正交、独立的。或者反过来说,一个数用m进制来表示,只有唯一一种可能。
e.g. 1 Design a hash function that is suitable for words in a dictionary.(EPI: 12.1)
e.g. 2 A hash function for the state of a chess game.(EPI: 12.2)
e.g. 3 Implement a method rand7() given rand5().(CtCI:17.11)