Hashing and unordered_set, unordered_map

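1. The unordered family of associative containers

1.1. unordered_map interface examples

1.2. Underlying structure

Differences in the underlying structure

The concept of hashing

2. Simulating a hash table

3. Wrapping the unordered containers

3.1. Adapting the hash table

3.2. Upper-layer wrappers

3.2.1. Wrapping unordered_set

3.2.2. Wrapping unordered_map and implementing operator[]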

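1. The unordered family of associative containers

C++11 added four unordered associative containers to the STL:

unordered_set

unordered_multiset

unordered_map

unordered_multimap

These four containers are used in much the same way as the red-black-tree-based associative containers (set, multiset, map, multimap). The difference is the underlying structure: they are built on a hash table.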

1.1. unordered_map interface examples

Some commonly used unordered_map functions are listed below.

1). Construction

Function	Description
(constructor)	Constructs an unordered_map object

2). Capacity

Function	Description
empty	Returns whether the container is empty
size	Returns the number of elements stored in the container

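3). Modifiers and element access

Function	Description
operator[]	Accesses the value mapped to the given key; if the key is absent, inserts it with a default-constructed value first
insert	Inserts an element
erase	Removes the element with the given key
clear	Removes all elements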

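4). Lookup

Function	Description
iterator find(const key_type& k)	Finds the element with key k and returns an iterator to it (end() if it is not found)
size_type count(const key_type& k)	Returns the number of key-value pairs whose key equals k (0 or 1 for unordered_map, since keys are unique)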

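5). Iterators

Function	Description
begin	Returns an iterator to the first element of the unordered_map
end	Returns an iterator to the position past the last element of the unordered_map
cbegin	Returns a const iterator to the first element of the unordered_map
cend	Returns a const iterator to the position past the last element of the unordered_map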
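The interfaces listed above can be tried out with a short sketch against std::unordered_map. The helper names count_words and remove_word are made up for this illustration; only the container member functions come from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: counts how often each word occurs.
std::unordered_map<std::string, std::size_t>
count_words(const std::vector<std::string>& words)
{
    std::unordered_map<std::string, std::size_t> counts;
    for (const std::string& w : words)
    {
        // operator[] inserts {w, 0} when w is absent, then returns a reference.
        ++counts[w];
    }
    return counts;
}

// Hypothetical helper: erases a key and reports whether it was present.
bool remove_word(std::unordered_map<std::string, std::size_t>& counts,
                 const std::string& word)
{
    auto it = counts.find(word);   // end() when the key is absent
    if (it == counts.end())
        return false;
    counts.erase(it);
    assert(counts.count(word) == 0);   // keys are unique, so none remain
    return true;
}
```

Calling count_words on {"apple", "pear", "apple"} yields a map of size 2 in which "apple" maps to 2; the iteration order of that map is unspecified.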

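1.2. Underlying structure

The unordered associative containers are efficient because they are built on a hash structure.

Differences in the underlying structure

1. Different requirements on the key

set: the key must support ordering comparison

unordered_set: the key must be convertible to an integer (hashable) and support equality comparison

2. Traversing a set yields the elements in sorted order; traversing an unordered_set yields them in no particular order

3. Performance (time complexity of a lookup)

set: O(logN)

unordered_set: O(1) on average, degrading to O(N) in the worst case when many keys collide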
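The different key requirements can be seen with a user-defined key type. The sketch below is only an illustration: Point, PointLess, PointHash, PointEqual and the two helper functions are hypothetical names, and the way the two coordinates are combined in the hash is one simple choice among many.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <unordered_set>
#include <vector>

struct Point { int x; int y; };

// std::set only needs an ordering on the key.
struct PointLess
{
    bool operator()(const Point& a, const Point& b) const
    {
        return a.x != b.x ? a.x < b.x : a.y < b.y;
    }
};

// std::unordered_set needs a hash (key -> integer) and an equality test.
struct PointHash
{
    std::size_t operator()(const Point& p) const
    {
        return std::hash<int>()(p.x) * 31 + std::hash<int>()(p.y);
    }
};

struct PointEqual
{
    bool operator()(const Point& a, const Point& b) const
    {
        return a.x == b.x && a.y == b.y;
    }
};

// Returns the x coordinates in std::set iteration order (sorted, deduplicated).
std::vector<int> sorted_xs(const std::vector<Point>& pts)
{
    std::set<Point, PointLess> s(pts.begin(), pts.end());
    std::vector<int> xs;
    for (const Point& p : s)
        xs.push_back(p.x);
    return xs;
}

// Returns the number of distinct points; the iteration order would be unspecified.
std::size_t distinct_count(const std::vector<Point>& pts)
{
    std::unordered_set<Point, PointHash, PointEqual> us(pts.begin(), pts.end());
    assert(us.size() <= pts.size());
    return us.size();
}
```

Both containers drop the duplicate point, but only the set guarantees an order when it is traversed.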

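The concept of hashing

Construct a storage structure in which some function (hashFunc) maps each element's key to its storage position, ideally one-to-one, so that a lookup can locate the element quickly through that function.

The idea of hashing is to map keys to storage positions.

○ To insert an element, compute its storage position from its key with this function and store it at that position

○ To search for an element, apply the same computation to its key, treat the result as the storage position, and compare against the element found there; if the keys are equal, the search succeeds

This approach is called the hash method. The conversion function it uses is called the hash function, and the resulting structure is called a hash table.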

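There are two common ways to establish the mapping.

1. Direct addressing: the key itself (or a simple offset of it) is used as the storage position

Advantages: fast, and there are no hash collisions

Disadvantages: only suitable when the keys fall in a relatively compact range; otherwise a lot of space is wasted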
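As a small illustration of direct addressing, lowercase letters form a compact key range, so each letter can index an array slot directly. This is a sketch under that assumption; count_letters and first_unique_letter are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Direct addressing: the key 'a'..'z' maps to slot key - 'a'.
std::array<std::size_t, 26> count_letters(const std::string& text)
{
    std::array<std::size_t, 26> counts{};   // all slots start at 0
    for (char c : text)
    {
        if (c >= 'a' && c <= 'z')
        {
            const std::size_t slot = static_cast<std::size_t>(c - 'a');
            assert(slot < counts.size());   // the key range fits the table exactly
            ++counts[slot];
        }
    }
    return counts;
}

// Returns the first lowercase letter that occurs exactly once, or '\0' if none.
char first_unique_letter(const std::string& text)
{
    const std::array<std::size_t, 26> counts = count_letters(text);
    for (char c : text)
    {
        if (c >= 'a' && c <= 'z' && counts[c - 'a'] == 1)
            return c;
    }
    return '\0';
}
```

No two keys share a slot here, which is why there are no collisions; the price is that the table must cover the whole key range, which stops being practical for keys such as arbitrary 64-bit integers.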

2. Division method

hash(key) = key % capacity

Hash collision: different keys are mapped to the same position by the hash function
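The division method and a collision can be shown with a few integers. The sketch below assumes a table capacity of 10 in its usage note; hash_slot and group_by_slot are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Division method: hash(key) = key % capacity.
std::size_t hash_slot(std::size_t key, std::size_t capacity)
{
    assert(capacity > 0);
    return key % capacity;
}

// Groups keys by the slot they map to. Any slot that ends up holding
// more than one key is a collision.
std::vector<std::vector<std::size_t>>
group_by_slot(const std::vector<std::size_t>& keys, std::size_t capacity)
{
    std::vector<std::vector<std::size_t>> slots(capacity);
    for (std::size_t key : keys)
        slots[hash_slot(key, capacity)].push_back(key);
    return slots;
}
```

With a capacity of 10, the keys 4 and 44 both map to slot 4, so they collide even though they are different keys.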

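How are hash collisions resolved?

1. Closed hashing (open addressing): when a collision occurs, look for a free slot elsewhere in the table according to some rule (a. linear probing; b. quadratic probing)

2. Open hashing (hash buckets / separate chaining): first compute the hash address of every key with the hash function; keys with the same address belong to the same subset, and each subset is called a bucket. The elements of a bucket are linked into a singly linked list, and the head of each list is stored in the hash table.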
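The two probing rules named for open addressing differ only in which slot is tried on the i-th attempt. The following sketch uses the common textbook forms (home + i) % capacity and (home + i*i) % capacity; the function names are hypothetical, and find_free_slot uses linear probing only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Slot tried on the i-th attempt by linear probing.
std::size_t linear_probe(std::size_t home, std::size_t i, std::size_t capacity)
{
    assert(capacity > 0);
    return (home + i) % capacity;
}

// Slot tried on the i-th attempt by quadratic probing.
std::size_t quadratic_probe(std::size_t home, std::size_t i, std::size_t capacity)
{
    assert(capacity > 0);
    return (home + i * i) % capacity;
}

// Linear probing from the home slot: returns the first free slot,
// or occupied.size() when every slot is taken.
std::size_t find_free_slot(const std::vector<bool>& occupied, std::size_t home)
{
    const std::size_t capacity = occupied.size();
    for (std::size_t i = 0; i < capacity; ++i)
    {
        const std::size_t slot = linear_probe(home, i, capacity);
        if (!occupied[slot])
            return slot;
    }
    return capacity;
}
```

Linear probing tends to build long runs of occupied slots around a popular home position; quadratic probing spreads the attempts further apart at the cost of not necessarily visiting every slot.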

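2. Simulating a hash table

A simulated implementation of a hash table is given below.

HashFunc is a functor that converts a key into an integer.

// The hash table keeps _n, the number of elements stored in it.
// The load factor is _n / _tables.size().

size_t _n;

// Once _n / _tables.size() reaches a threshold, the table is expanded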

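// Load factor too high: expand (hash-bucket version, threshold 1.0)
if (_n * 10 / _tables.size() >= 10)
{
    HashTable<K, T, KeyOfT, Hash> newtable;
    newtable._tables.resize(_tables.size() * 2);

    for (Node* cur : _tables)
    {
        // cur is a copy of the bucket head, so the old chains stay intact
        while (cur)
        {
            newtable.Insert(cur->_data);
            cur = cur->_next;
        }
    }

    // Reinsert through the class's own Insert so that every element follows
    // the new table's mapping, then swap. After the swap newtable holds the
    // old nodes, and its destructor frees them.
    _tables.swap(newtable._tables);
}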

#include <string>
#include <utility>
#include <vector>
using namespace std;

// HashFunc converts a key to an integer; the tables below then apply
// the division method (hash % table size) to that integer
template<class K>
struct HashFunc
{
    size_t operator()(const K& key) const
    {
        return (size_t)key;
    }
};

// Specialization so that the hash table supports string keys
template<>
struct HashFunc<string>
{
    size_t operator()(const string& key) const
    {
        size_t hash = 0;
        for (auto e : key)
        {
            // Multiplying by 31 reduces the chance of collisions
            hash *= 31;
            hash += e;
        }
        return hash;
    }
};

// Open addressing: collisions are resolved by linear probing
namespace open_address
{
    // State of each slot: holds an element, empty, or element deleted
    enum State
    {
        EXIST,
        EMPTY,
        DELETE
    };

    template<class K, class V>
    struct HashData
    {
        pair<K, V> _kv;
        State _state = EMPTY;
    };

    template<class K, class V, class Hash = HashFunc<K>>
    class HashTable
    {
    public:
        HashTable()
        {
            _tables.resize(10);
        }

        bool Insert(const pair<K, V>& kv)
        {
            if (Find(kv.first))
            {
                return false;
            }

            // Load factor reached 0.7: expand
            if (_n * 10 / _tables.size() >= 7)
            {
                HashTable<K, V, Hash> newtable;
                newtable._tables.resize(_tables.size() * 2);
                for (auto& e : _tables)
                {
                    if (e._state == EXIST)
                    {
                        newtable.Insert(e._kv);
                    }
                }
                // Reinsert through the class's own Insert, then swap
                _tables.swap(newtable._tables);
            }

            Hash hashfun;
            size_t hashi = hashfun(kv.first) % _tables.size();
            // Probe for a slot that is EMPTY or DELETE
            while (_tables[hashi]._state == EXIST)
            {
                hashi++;
                hashi %= _tables.size();
            }
            _tables[hashi]._kv = kv;
            _tables[hashi]._state = EXIST;
            ++_n;
            return true;
        }

        HashData<K, V>* Find(const K& key)
        {
            Hash hashfun;
            size_t hashi = hashfun(key) % _tables.size();
            // DELETE slots must be probed through as well, because an element
            // with the same mapping may have been erased in the middle of the run.
            // The probe count is bounded so that a table with no EMPTY slot left
            // cannot cause an endless loop.
            for (size_t probes = 0;
                 probes < _tables.size() && _tables[hashi]._state != EMPTY;
                 ++probes)
            {
                if (_tables[hashi]._state == EXIST && _tables[hashi]._kv.first == key)
                {
                    return &_tables[hashi];
                }
                hashi++;
                hashi %= _tables.size();
            }
            return nullptr;
        }

        bool Erase(const K& key)
        {
            // Reuse Find, then mark the slot as deleted
            HashData<K, V>* pdata = Find(key);
            if (pdata == nullptr)
            {
                return false;
            }
            pdata->_state = DELETE;
            --_n;
            return true;
        }

    private:
        vector<HashData<K, V>> _tables;
        size_t _n = 0;  // number of elements stored in the table
    };
}

// Hash buckets / separate chaining
namespace hash_bucket
{
    template<class K, class V>
    struct HashNode
    {
        pair<K, V> _kv;
        HashNode<K, V>* _next;

        HashNode(const pair<K, V>& kv)
            :_kv(kv), _next(nullptr)
        {}
    };

    // Hash converts the key to an integer, because the hash function
    // uses the division method
    template<class K, class V, class Hash = HashFunc<K>>
    class HashTable
    {
        typedef HashNode<K, V> Node;
    public:
        HashTable()
        {
            _tables.resize(10, nullptr);
        }

        // Destroy the buckets: free every node of every chain
        ~HashTable()
        {
            for (Node* cur : _tables)
            {
                while (cur)
                {
                    Node* next = cur->_next;
                    delete cur;
                    cur = next;
                }
            }
        }

        // Insert kv; if the key already exists, do not insert
        bool Insert(const pair<K, V>& kv)
        {
            if (Find(kv.first))
            {
                return false;
            }

            // Load factor reached 1.0: expand
            if (_n * 10 / _tables.size() >= 10)
            {
                HashTable<K, V, Hash> newtable;
                newtable._tables.resize(_tables.size() * 2);
                for (Node* cur : _tables)
                {
                    // cur is a copy of the bucket head, so the old chains stay intact
                    while (cur)
                    {
                        newtable.Insert(cur->_kv);
                        cur = cur->_next;
                    }
                }
                // Reinsert through the class's own Insert, then swap; newtable
                // now holds the old nodes and frees them in its destructor
                _tables.swap(newtable._tables);
            }

            Hash hashfun;
            size_t hashi = hashfun(kv.first) % _tables.size();
            // Push the new node onto the front of its bucket
            Node* newnode = new Node(kv);
            newnode->_next = _tables[hashi];
            _tables[hashi] = newnode;
            ++_n;
            return true;
        }

        // Look up key in the buckets; returns true if present, false otherwise
        bool Find(const K& key)
        {
            Hash hashfun;
            size_t hashi = hashfun(key) % _tables.size();
            Node* cur = _tables[hashi];
            while (cur)
            {
                if (cur->_kv.first == key)
                {
                    return true;
                }
                cur = cur->_next;
            }
            return false;
        }

        // Remove key from the buckets; returns true on success, false otherwise
        bool Erase(const K& key)
        {
            Hash hashfun;
            size_t hashi = hashfun(key) % _tables.size();
            Node* cur = _tables[hashi];
            Node* parent = nullptr;
            while (cur)
            {
                if (cur->_kv.first == key)
                {
                    Node* next = cur->_next;
                    if (cur == _tables[hashi])
                    {
                        _tables[hashi] = next;
                    }
                    else
                    {
                        parent->_next = next;
                    }
                    delete cur;
                    --_n;
                    return true;
                }
                parent = cur;
                cur = cur->_next;
            }
            return false;
        }

    private:
        vector<Node*> _tables;  // array of bucket head pointers
        size_t _n = 0;          // number of elements stored in the table
    };
}

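3. Wrapping the unordered containers

The unordered containers are wrapped in the following steps:

1. Implement the hash table

2. Wrap unordered_set and unordered_map, solving the KeyOfT problem (extracting the key from the stored data type)

3. Implement the Iterator

4. Implement operator[]

3.1. Adapting the hash table

The hash table has been implemented above. Next it is adapted in two ways: solving the KeyOfT problem and implementing the Iterator.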
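The KeyOfT problem arises because one table template stores either a bare key (for a set) or a key-value pair (for a map), yet must compare and hash by key in both cases. A minimal sketch of the idea follows; SetKeyOfT, MapKeyOfT and find_by_key are illustrative names, and a linear search over a vector stands in for the real bucket lookup.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Set-style storage: the element itself is the key.
template<class K>
struct SetKeyOfT
{
    const K& operator()(const K& key) const { return key; }
};

// Map-style storage: the key is the first member of the pair.
template<class K, class V>
struct MapKeyOfT
{
    const K& operator()(const std::pair<const K, V>& kv) const { return kv.first; }
};

// Stand-in for the table's lookup. It never looks inside T directly;
// it always goes through KeyOfT to obtain the key.
template<class K, class T, class KeyOfT>
const T* find_by_key(const std::vector<T>& items, const K& key)
{
    KeyOfT kot;
    for (const T& item : items)
    {
        if (kot(item) == key)
            return &item;
    }
    return nullptr;
}
```

The same find_by_key body works for both element types, which is the property the adapted hash table relies on.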

// Hash buckets / separate chaining
namespace hash_bucket
{
    template<class T>
    struct HashNode
    {
        T _data;
        HashNode<T>* _next;

        HashNode(const T& data)
            :_data(data), _next(nullptr)
        {}
    };

    // Forward declaration of the hash table
    template<class K, class T, class KeyOfT, class Hash>
    class HashTable;

    // Hash table iterator
    template<class K, class T, class Ptr, class Ref, class KeyOfT, class Hash = HashFunc<K>>
    struct HashTableIterator
    {
        typedef HashNode<T> Node;
        typedef HashTable<K, T, KeyOfT, Hash> HashBucket;
        typedef HashTableIterator Self;

        HashTableIterator(Node* node, const HashBucket* pht)
            :_node(node), _pht(pht)
        {}

        Self& operator++()
        {
            if (_node->_next)
            {
                // Still inside the current bucket
                _node = _node->_next;
            }
            else
            {
                // Move on to the next non-empty bucket
                Hash hashfun;
                KeyOfT kot;
                size_t hashi = hashfun(kot(_node->_data)) % _pht->_tables.size();
                ++hashi;
                while (hashi < _pht->_tables.size() && _pht->_tables[hashi] == nullptr)
                {
                    ++hashi;
                }
                if (hashi >= _pht->_tables.size())
                {
                    _node = nullptr;
                    return *this;
                }
                _node = _pht->_tables[hashi];
            }
            return *this;
        }

        Ref operator*() const
        {
            return _node->_data;
        }

        Ptr operator->() const
        {
            return &_node->_data;
        }

        // end() returns a temporary object, so the parameter must be a const reference
        bool operator!=(const Self& ito) const
        {
            return _node != ito._node;
        }

        Node* _node;
        const HashBucket* _pht;
    };

    // Hash converts the key to an integer, because the hash function
    // uses the division method
    template<class K, class T, class KeyOfT, class Hash = HashFunc<K>>
    class HashTable
    {
    public:
        typedef HashNode<T> Node;
        typedef HashTableIterator<K, T, T*, T&, KeyOfT, Hash> Iterator;
        typedef HashTableIterator<K, T, const T*, const T&, KeyOfT, Hash> ConstIterator;

        // The iterator reads _tables, so it is declared a friend. The parameter
        // names differ from the class's own to avoid shadowing them.
        template<class K1, class T1, class Ptr1, class Ref1, class KeyOfT1, class Hash1>
        friend struct HashTableIterator;

    public:
        HashTable()
        {
            _tables.resize(10, nullptr);
        }

        // Destroy the buckets
        ~HashTable()
        {
            for (Node* cur : _tables)
            {
                while (cur)
                {
                    Node* next = cur->_next;
                    delete cur;
                    cur = next;
                }
            }
        }

        Iterator Begin()
        {
            if (_n == 0)
                return End();
            size_t hashi = 0;
            // Check the bound before indexing (hashi < size, not <=)
            while (hashi < _tables.size() && _tables[hashi] == nullptr)
            {
                ++hashi;
            }
            if (hashi >= _tables.size())
            {
                return Iterator(nullptr, this);
            }
            return Iterator(_tables[hashi], this);
        }

        Iterator End()
        {
            return Iterator(nullptr, this);
        }

        ConstIterator Begin() const
        {
            size_t hashi = 0;
            while (hashi < _tables.size() && _tables[hashi] == nullptr)
            {
                ++hashi;
            }
            if (hashi >= _tables.size())
            {
                return ConstIterator(nullptr, this);
            }
            return ConstIterator(_tables[hashi], this);
        }

        ConstIterator End() const
        {
            return ConstIterator(nullptr, this);
        }

        // Insert data; if its key already exists, do not insert and
        // return an iterator to the existing element
        pair<Iterator, bool> Insert(const T& data)
        {
            KeyOfT kot;
            Iterator ret = Find(kot(data));
            if (ret._node != nullptr)
            {
                return make_pair(ret, false);
            }

            // Load factor reached 1.0: expand
            if (_n * 10 / _tables.size() >= 10)
            {
                HashTable<K, T, KeyOfT, Hash> newtable;
                newtable._tables.resize(_tables.size() * 2);
                for (Node* cur : _tables)
                {
                    // cur is a copy of the bucket head, so the old chains stay intact
                    while (cur)
                    {
                        newtable.Insert(cur->_data);
                        cur = cur->_next;
                    }
                }
                // Reinsert through the class's own Insert, then swap; newtable
                // now holds the old nodes and frees them in its destructor
                _tables.swap(newtable._tables);
            }

            Hash hashfun;
            size_t hashi = hashfun(kot(data)) % _tables.size();
            Node* newnode = new Node(data);
            newnode->_next = _tables[hashi];
            _tables[hashi] = newnode;
            ret._node = newnode;
            ++_n;
            return make_pair(ret, true);
        }

        // Look up key in the buckets; returns an iterator to the element,
        // or End() if it is not present
        Iterator Find(const K& key)
        {
            KeyOfT kot;
            Hash hashfun;
            size_t hashi = hashfun(key) % _tables.size();
            Node* cur = _tables[hashi];
            while (cur)
            {
                if (kot(cur->_data) == key)
                {
                    return Iterator(cur, this);
                }
                cur = cur->_next;
            }
            return Iterator(nullptr, this);
        }

        // Remove key from the buckets; returns true on success, false otherwise
        bool Erase(const K& key)
        {
            KeyOfT kot;
            Hash hashfun;
            size_t hashi = hashfun(key) % _tables.size();
            Node* cur = _tables[hashi];
            Node* parent = nullptr;
            while (cur)
            {
                if (kot(cur->_data) == key)
                {
                    Node* next = cur->_next;
                    if (cur == _tables[hashi])
                    {
                        _tables[hashi] = next;
                    }
                    else
                    {
                        parent->_next = next;
                    }
                    delete cur;
                    --_n;
                    return true;
                }
                parent = cur;
                cur = cur->_next;
            }
            return false;
        }

    private:
        vector<Node*> _tables;  // array of bucket head pointers
        size_t _n = 0;          // number of elements stored in the table
    };
}

3.2. Upper-layer wrappers

Next, unordered_set and unordered_map are wrapped on top of the hash table, and unordered_map implements operator[].

3.2.1. Wrapping unordered_set

namespace bit
{
    using namespace hash_bucket;

    template<class K>
    class unordered_set
    {
    public:
        // For a set, the stored element is the key itself
        struct setKeyOfT
        {
            const K& operator()(const K& key)
            {
                return key;
            }
        };

        // Elements are stored as const K so they cannot be modified through an iterator
        typedef typename HashTable<K, const K, setKeyOfT>::Iterator iterator;
        typedef typename HashTable<K, const K, setKeyOfT>::ConstIterator const_iterator;

        pair<iterator, bool> insert(const K& data)
        {
            return _pht.Insert(data);
        }

        bool erase(const K& key)
        {
            return _pht.Erase(key);
        }

        iterator find(const K& key)
        {
            return _pht.Find(key);
        }

        iterator begin()
        {
            return _pht.Begin();
        }

        iterator end()
        {
            return _pht.End();
        }

        const_iterator begin() const
        {
            return _pht.Begin();
        }

        const_iterator end() const
        {
            return _pht.End();
        }

    private:
        HashTable<K, const K, setKeyOfT> _pht;
    };
}

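3.2.2. Wrapping unordered_map and implementing operator[]

Implementing operator[] depends on how the underlying iterator and Insert are implemented: Insert must return an iterator to the element together with a flag saying whether the insertion took place.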
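The same insert-then-dereference pattern can be demonstrated on std::unordered_map, whose insert also returns a pair of iterator and bool. The free function subscript below is a hypothetical stand-in for operator[], written only to show the pattern.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical free function that mirrors the operator[] logic:
// insert {key, V()} and return a reference to the mapped value.
template<class K, class V>
V& subscript(std::unordered_map<K, V>& m, const K& key)
{
    // If the key already exists, insert changes nothing and ret.second is
    // false; either way ret.first points at the element for this key.
    std::pair<typename std::unordered_map<K, V>::iterator, bool> ret =
        m.insert(std::make_pair(key, V()));
    assert(ret.first != m.end());
    return ret.first->second;
}
```

Because insert never overwrites an existing element, calling subscript on a present key returns the stored value unchanged, while calling it on an absent key creates a default-constructed value first.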

namespace bit
{
    template<class K, class V>
    class unordered_map
    {
    public:
        // For a map, the key is the first member of the stored pair.
        // The parameter type matches the stored type exactly, so no
        // temporary pair is created when the key is extracted.
        struct mapKeyOfT
        {
            const K& operator()(const pair<const K, V>& t)
            {
                return t.first;
            }
        };

        typedef typename hash_bucket::HashTable<K, pair<const K, V>, mapKeyOfT>::Iterator iterator;
        typedef typename hash_bucket::HashTable<K, pair<const K, V>, mapKeyOfT>::ConstIterator const_iterator;

        pair<iterator, bool> insert(const pair<K, V>& data)
        {
            return _pht.Insert(data);
        }

        bool erase(const K& key)
        {
            return _pht.Erase(key);
        }

        iterator find(const K& key)
        {
            return _pht.Find(key);
        }

        // The key point is the underlying iterator and Insert: insert returns
        // an iterator to the element whether or not it was newly inserted
        V& operator[](const K& key)
        {
            pair<iterator, bool> pa = insert(make_pair(key, V()));
            return pa.first->second;
        }

        iterator begin()
        {
            return _pht.Begin();
        }

        iterator end()
        {
            return _pht.End();
        }

        const_iterator begin() const
        {
            return _pht.Begin();
        }

        const_iterator end() const
        {
            return _pht.End();
        }

    private:
        hash_bucket::HashTable<K, pair<const K, V>, mapKeyOfT> _pht;
    };
}