Dissecting HashMap
These are reading notes on the books 《Java编程的逻辑》[1] and 《剑指Java:核心原理与应用实践》[2].
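1.1 The Map Interface
A Map is a mapping. It has the concepts of keys and values and represents the correspondence between them: each key maps to one value. A Map stores and retrieves values by key. Keys cannot repeat, so each key is stored only once, and setting a value for an existing key overwrites the old value. A Map is convenient whenever objects need to be looked up by key, for example: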
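- A dictionary application, where the key is a word and the value is a word-information object holding the meaning, pronunciation, example sentences, and so on;
- Counting how often each word occurs in a book, with the word as the key and the count as the value;
- Managing the items of a configuration file, which are typical key-value pairs;
- Looking up a person's record by ID card number, with the ID number as the key and the record as the value.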
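The Map interface is defined as follows:
    public interface Map<K, V> { // K and V are type parameters: the key type and the value type
        V put(K key, V value); // store a key-value pair; if the key already exists, overwrite it and return the old value
        V get(Object key); // get the value for a key; returns null if not found
        V remove(Object key); // remove the pair for a key; returns the old value, or null if absent
        int size(); // number of key-value pairs in the Map
        boolean isEmpty(); // whether the Map is empty
        boolean containsKey(Object key); // whether the Map contains the key
        boolean containsValue(Object value); // whether the Map contains the value
        void putAll(Map<? extends K, ? extends V> m); // copy all pairs of m into this Map
        void clear(); // remove all key-value pairs
        Set<K> keySet(); // the set of keys in the Map
        Collection<V> values(); // the collection of all values in the Map
        Set<Map.Entry<K, V>> entrySet(); // all key-value pairs in the Map
        interface Entry<K, V> { // nested interface representing one key-value pair
            K getKey(); // the key of the pair
            V getValue(); // the value of the pair
            V setValue(V value);
            boolean equals(Object o);
            int hashCode();
        }
        boolean equals(Object o);
        int hashCode();
    }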
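Java 8 added a number of default methods, such as getOrDefault, forEach, replaceAll, putIfAbsent, replace, computeIfAbsent, and merge. Java 9 added several overloaded of methods that make it easy to build an unmodifiable Map from one or more key-value pairs. See the API documentation or the source code for details.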
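As a minimal sketch of a few of these methods, the class below counts words with merge, groups words with computeIfAbsent, and builds an unmodifiable map with Map.of. The class name MapDefaultMethodsDemo and its helper methods are made up for this illustration; only the Map methods themselves come from the JDK.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MapDefaultMethodsDemo {

    // Counts occurrences with merge(): store 1 for a new key, otherwise add 1 to the old count.
    static Map<String, Integer> countWords(String... words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Groups words by length with computeIfAbsent(): the list is created on first use of a key.
    static Map<Integer, List<String>> groupByLength(String... words) {
        Map<Integer, List<String>> groups = new HashMap<>();
        for (String word : words) {
            groups.computeIfAbsent(word.length(), len -> new ArrayList<>()).add(word);
        }
        return groups;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("a", "b", "a");
        System.out.println(counts.get("a"));              // 2
        System.out.println(counts.getOrDefault("z", 0));  // 0, because "z" was never added
        System.out.println(counts.putIfAbsent("b", 100)); // 1, the existing value is kept

        Map<String, Integer> fixed = Map.of("x", 1, "y", 2); // Java 9+, unmodifiable
        try {
            fixed.put("z", 3);
        } catch (UnsupportedOperationException e) {
            System.out.println("a Map created by Map.of cannot be modified");
        }
    }
}
```

Compared with the get-then-put pattern, merge expresses "insert or update" in a single call, which is usually easier to read for counters.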
Set is an interface that represents the mathematical notion of a set, that is, a collection with no duplicate elements. In Java, Set is defined as:
public interface Set<E> extends Collection<E> {
}
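It extends Collection. We will not go through its methods here, but every implementation must honor the semantic constraint of Set: no duplicate elements. The keys of a Map are unique, so keySet() returns a Set. keySet(), values(), and entrySet() have one thing in common: they all return views rather than copies, so a modification made through the returned object directly modifies the Map itself. For example: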
    @Test
    public void testHashMapView() {
        HashMap<String, String> hashMap = new HashMap<>();
        hashMap.put("name", "nwq");
        hashMap.put("age", "18");
        hashMap.keySet().clear(); // clearing the key view empties the map itself
        assertTrue(hashMap.isEmpty());
    }
hashMap.keySet().clear() removes all key-value pairs.
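The other two views behave the same way. The following sketch, with the made-up class name MapViewDemo, removes entries through values() and rewrites values through entrySet(); both changes show up in the underlying map.

```java
import java.util.HashMap;
import java.util.Map;

final class MapViewDemo {

    // Removes every entry whose value equals the given one, working only through the values() view.
    static Map<String, String> removeByValue(Map<String, String> source, String value) {
        Map<String, String> map = new HashMap<>(source);
        map.values().removeIf(value::equals);
        return map;
    }

    // Upper-cases every value through entrySet(); Entry.setValue writes through to the map.
    static Map<String, String> upperCaseValues(Map<String, String> source) {
        Map<String, String> map = new HashMap<>(source);
        for (Map.Entry<String, String> entry : map.entrySet()) {
            entry.setValue(entry.getValue().toUpperCase());
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> source = new HashMap<>();
        source.put("name", "nwq");
        source.put("age", "18");
        System.out.println(removeByValue(source, "18").containsKey("age")); // false
        System.out.println(upperCaseValues(source).get("name"));            // NWQ
    }
}
```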
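1.2 Basic Usage
HashMap implements the Map interface. Let us look at a simple example of how to use it: a program that checks whether randomly generated numbers are evenly distributed. It generates 1000 random numbers in the range 0 to 3 and counts how often each one occurs: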
    @Test
    public void testHashMapBasics() {
        Random rnd = new Random(150);
        Map<Integer, Integer> countMap = new HashMap<>();
        for (int i = 0; i < 1000; i++) {
            int num = rnd.nextInt(4);
            Integer count = countMap.get(num);
            if (count == null) {
                countMap.put(num, 1);
            } else {
                countMap.put(num, count + 1);
            }
        }
        StringBuilder stringBuilder = new StringBuilder();
        for (Map.Entry<Integer, Integer> kv : countMap.entrySet()) {
            stringBuilder.append(kv.getKey() + ", " + kv.getValue() + ", ");
        }
        assertTrue("0, 253, 1, 230, 2, 243, 3, 274, ".equals(stringBuilder.toString()));
    }
The counts are 253, 230, 243, and 274, respectively.
Besides the default constructor, HashMap has the following constructors:
public HashMap(int initialCapacity)
public HashMap(int initialCapacity, float loadFactor)
public HashMap(Map<? extends K, ? extends V> m)
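The last one builds the map from an existing Map, copying all of its key-value pairs into the new Map. The first two take the parameters initialCapacity and loadFactor. What do they mean? To answer that, we need to look at how HashMap is implemented.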
1.3 Implementation
Let us look at the internal structure of HashMap and its main methods. The analysis is based on the Java 17 source code.
1.3.1 Internal Structure
HashMap has the following main instance variables:
    /**
     * The table, initialized on first use, and resized as
     * necessary. When allocated, length is always a power of two.
     * (We also tolerate length zero in some operations to allow
     * bootstrapping mechanics that are currently not needed.)
     */
    transient Node<K,V>[] table;
    /**
     * The number of key-value mappings contained in this map.
     */
    transient int size;
    /**
     * The next size value at which to resize (capacity * load factor).
     *
     * @serial
     */
    int threshold;
    /**
     * The load factor for the hash table.
     *
     * @serial
     */
    final float loadFactor;
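size is the actual number of key-value pairs. table is an array of type Node, called the hash table or the hash buckets. Each element points to a singly linked list (or, as discussed later, a red-black tree), and each node in the list represents one key-value pair. Node is a static nested class; its code is as follows: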
    /**
     * Basic hash bin node, used for most entries. (See below for
     * TreeNode subclass, and in LinkedHashMap for its Entry subclass.)
     */
    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;
        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
        public final K getKey() { return key; }
        public final V getValue() { return value; }
        public final String toString() { return key + "=" + value; }
        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }
        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }
        public final boolean equals(Object o) {
            if (o == this)
                return true;
            return o instanceof Map.Entry<?, ?> e
                    && Objects.equals(key, e.getKey())
                    && Objects.equals(value, e.getValue());
        }
    }
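Here key and value are the key and the value, next points to the next Node, and hash is the hash value of key; we will discuss how it is computed shortly. The hash value is stored directly to speed up comparisons.
The initial value of table is null. When a key-value pair is added and table is null, resize() is called to allocate the table. As with ArrayList, the storage grows on demand, although HashMap doubles the table length on each expansion. When the first element is added, the default allocated length is 16. However, the table is not expanded only once size exceeds 16; when the next expansion happens depends on threshold.
threshold is the resize threshold: the table is expanded once the number of key-value pairs, size, exceeds threshold. How is threshold computed? In general, threshold equals table.length multiplied by loadFactor. For example, if table.length is 16 and loadFactor is 0.75, threshold is 12, so the expansion happens when the 13th pair is added. loadFactor is the load factor, which indicates how full the table is allowed to get overall. It is a floating-point number that defaults to 0.75 and can be changed through the constructor public HashMap(int initialCapacity, float loadFactor).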
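The arithmetic above can be sketched as follows. This is a simplified model written for these notes, not JDK code: ThresholdDemo and its methods are made-up names, the load factor is fixed at the default 0.75, and resizes triggered for other reasons (such as treeifyBin on a small table) are ignored.

```java
final class ThresholdDemo {

    // Threshold as described in the text: table length times load factor, truncated to an int.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Simulates the table length after inserting `entries` distinct keys into a default map:
    // start at 16 and double whenever the size exceeds the current threshold.
    static int capacityAfter(int entries) {
        int capacity = 16;
        for (int size = 1; size <= entries; size++) {
            if (size > threshold(capacity, 0.75f)) {
                capacity <<= 1;
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // 12
        System.out.println(capacityAfter(12));    // 16, the threshold is not yet exceeded
        System.out.println(capacityAfter(13));    // 32, the 13th pair triggers a resize
    }
}
```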
1.3.2 Constructors
The code of the default constructor is:
    /**
     * Constructs an empty {@code HashMap} with the default initial capacity
     * (16) and the default load factor (0.75).
     */
    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
    }
DEFAULT_LOAD_FACTOR is 0.75. As the code shows, threshold is not assigned here. Its assignment is deferred to the first put, where it is set to DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY = 0.75 * 16 = 12.
There is another constructor, HashMap(int initialCapacity, float loadFactor):
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }
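As the code shows, loadFactor is stored exactly as given. threshold is set to the result of tableSizeFor. At this point it temporarily holds the initial table capacity; the first call to resize() uses it as the table length and then computes the real threshold. The code of tableSizeFor is: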
    /**
     * Returns a power of two size for the given target capacity.
     */
    static final int tableSizeFor(int cap) {
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }
tableSizeFor returns the smallest power of two that is greater than or equal to the given cap. For example:
cap | Power of two | tableSizeFor return value |
---|---|---|
7 | $2^3=8$ | 8 |
8 | $2^3=8$ | 8 |
9 | $2^4=16$ | 16 |
10 | $2^4=16$ | 16 |
For 7, the smallest power of two that is not less than it is 8, with exponent 3.
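To check the table by hand, the formula can be copied into a small standalone class, since the JDK method itself is not public. TableSizeForDemo is a made-up name, and MAXIMUM_CAPACITY is assumed here to be 1 << 30, the value used by java.util.HashMap.

```java
final class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30; // assumed to match java.util.HashMap

    // Same formula as the JDK 17 method quoted above, copied here for experimentation.
    static int tableSizeFor(int cap) {
        // cap - 1 keeps exact powers of two unchanged; the shift builds a mask of low one bits.
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        for (int cap : new int[] {1, 7, 8, 9, 10, 17}) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
    }
}
```

For cap = 9, cap - 1 is 8 (binary 1000), the mask is 1111 (15), and the result is 16, which matches the table.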
1.3.3 Storing Key-Value Pairs
Next, let us see how HashMap stores a key-value pair. The code is:
    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }
Before putVal runs, the hash method is called to compute the hash value of key:
    /**
     * Computes key.hashCode() and spreads (XORs) higher bits of hash
     * to lower. Because the table uses power-of-two masking, sets of
     * hashes that vary only in bits above the current mask will
     * always collide. (Among known examples are sets of Float keys
     * holding consecutive whole numbers in small tables.) So we
     * apply a transform that spreads the impact of higher bits
     * downward. There is a tradeoff between speed, utility, and
     * quality of bit-spreading. Because many common sets of hashes
     * are already reasonably distributed (so don't benefit from
     * spreading), and because we use trees to handle large sets of
     * collisions in bins, we just XOR some shifted bits in the
     * cheapest possible way to reduce systematic lossage, as well as
     * to incorporate impact of the highest bits that would otherwise
     * never be used in index calculations because of table bounds.
     */
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
Note that key may be null. In that case the computed hash value is 0, so the entry is stored in the first bucket of the HashMap (that is, at index 0 of the table array).
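The effect of XOR-ing the high 16 bits into the low 16 bits can be seen with two hash codes that differ only above bit 15. The sketch below copies the expression from hash into a made-up class HashSpreadDemo; the helper index applies the mask (n - 1) & hash that putVal uses to pick a bucket.

```java
import java.util.HashMap;
import java.util.Map;

final class HashSpreadDemo {

    // Same expression as HashMap.hash in the JDK 17 source quoted above.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table whose length is a power of two.
    static int index(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int a = 0x10000, b = 0x20000; // Integer.hashCode() is the int value itself
        // Without spreading, both keys fall into bucket 0 of a 16-slot table.
        System.out.println(index(a, 16) + " " + index(b, 16));                 // 0 0
        // With spreading, the high bits influence the index.
        System.out.println(index(spread(a), 16) + " " + index(spread(b), 16)); // 1 2

        Map<String, String> map = new HashMap<>();
        map.put(null, "value for the null key"); // hash 0, so it lives in bucket 0
        System.out.println(map.get(null));
    }
}
```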
The code of the internal method putVal is:
    /**
     * Implements Map.put and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
On the first insertion, putVal first calls resize to allocate the actual space for table:
    /**
     * Initializes or doubles table size. If null, allocates in
     * accord with initial capacity target held in field threshold.
     * Otherwise, because we are using power-of-two expansion, the
     * elements from each bin must either stay at same index, or move
     * with a power of two offset in the new table.
     *
     * @return the table
     */
    final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }
By default, the capacity is 16, threshold becomes 12, and table is allocated as a Node array of length 16.
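Next, i = (n - 1) & hash computes the position in table where the key-value pair should be placed. In HashMap the table length is always a power of two, so (n - 1) & hash keeps only the low bits of hash. This is equivalent to the non-negative remainder Math.floorMod(hash, length), even when hash is negative, and it is cheaper than a division. Once the position i is found, table[i] points to a singly linked list (or a tree). The next step is to walk this list and check whether the key is already present. The traversal code is: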
    for (int binCount = 0; ; ++binCount) {
        if ((e = p.next) == null) {
            p.next = newNode(hash, key, value, null);
            if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                treeifyBin(tab, hash);
            break;
        }
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            break;
        p = e;
    }
The comparison first checks the hash values; only when they are equal does it compare the keys, by reference or with the equals method. The code is:
if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
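Why compare hash first? Because hash is an integer, comparing it is usually much cheaper than calling equals. If the hash values differ, there is no need to call equals at all, which improves overall comparison performance. If the key is found, only the value in the Node needs to be updated. If it is not found, newNode is called to append an entry at that position, and modCount is incremented. As in ArrayList and LinkedList, modCount records the number of structural modifications so that iterators can detect structural changes during iteration. The append is done by: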
p.next = newNode(hash, key, value, null)
After the node is added, the code checks binCount >= TREEIFY_THRESHOLD - 1 and, if it holds, calls treeifyBin(tab, hash). What does this do? First, here is the code of treeifyBin(tab, hash):
    /**
     * Replaces all linked nodes in bin at index for given hash unless
     * table is too small, in which case resizes instead.
     */
    final void treeifyBin(Node<K,V>[] tab, int hash) {
        int n, index; Node<K,V> e;
        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
            resize();
        else if ((e = tab[index = (n - 1) & hash]) != null) {
            TreeNode<K,V> hd = null, tl = null;
            do {
                TreeNode<K,V> p = replacementTreeNode(e, null);
                if (tl == null)
                    hd = p;
                else {
                    p.prev = tl;
                    tl.next = p;
                }
                tl = p;
            } while ((e = e.next) != null);
            if ((tab[index] = hd) != null)
                hd.treeify(tab);
        }
    }
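treeifyBin is called when the bin already held at least TREEIFY_THRESHOLD (8) nodes before the new node was appended. If the table length is still below MIN_TREEIFY_CAPACITY (64), treeifyBin only resizes the table. Otherwise it converts the linked list in that bin into a red-black tree to speed up lookups.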
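The two conditions can be summarized in a small decision function. This is a simplified model of the code above for bins that are still linked lists; TreeifyDecisionDemo, Action, and afterAppend are made-up names, and the regular size-based resize at the end of putVal is deliberately left out.

```java
final class TreeifyDecisionDemo {
    static final int TREEIFY_THRESHOLD = 8;     // same value as in java.util.HashMap
    static final int MIN_TREEIFY_CAPACITY = 64; // same value as in java.util.HashMap

    enum Action { NONE, RESIZE, TREEIFY }

    // nodesBefore: nodes already chained in the bin before the new node is appended (at least 1).
    static Action afterAppend(int nodesBefore, int tableLength) {
        int binCount = nodesBefore - 1; // value of the loop counter when the tail is reached
        if (binCount < TREEIFY_THRESHOLD - 1) {
            return Action.NONE;
        }
        // treeifyBin: a small table is resized instead of building a tree.
        return tableLength < MIN_TREEIFY_CAPACITY ? Action.RESIZE : Action.TREEIFY;
    }

    public static void main(String[] args) {
        System.out.println(afterAppend(7, 64)); // NONE
        System.out.println(afterAppend(8, 16)); // RESIZE
        System.out.println(afterAppend(8, 64)); // TREEIFY
    }
}
```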
After putVal has added a new key-value pair, it checks ++size > threshold and, if the check holds, calls resize() again. This is the same resize() method listed earlier in this section. In other words, once the number of key-value pairs exceeds threshold, resize() allocates a Node array twice the previous length and moves the existing key-value pairs into it. Because the length doubles, each node either stays at its old index j or moves to index j + oldCap, depending on whether (e.hash & oldCap) is 0.
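The index arithmetic used by putVal and resize can be checked in isolation. In the sketch below, ResizeSplitDemo and its two methods are made-up names; indexFor applies the mask from putVal, and indexAfterDoubling applies the lo/hi rule from the resize loop.

```java
final class ResizeSplitDemo {

    // Bucket index for a power-of-two table length, as computed in putVal.
    static int indexFor(int hash, int length) {
        return (length - 1) & hash;
    }

    // Index after doubling, as decided in resize: unchanged if the oldCap bit of the hash is 0,
    // otherwise shifted up by oldCap.
    static int indexAfterDoubling(int hash, int oldCap) {
        int oldIndex = indexFor(hash, oldCap);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        for (int hash : new int[] {5, 21, 37, -11}) {
            System.out.printf("hash=%d old=%d new=%d direct=%d floorMod=%d%n",
                    hash,
                    indexFor(hash, oldCap),
                    indexAfterDoubling(hash, oldCap),
                    indexFor(hash, oldCap * 2),         // recomputed from scratch
                    Math.floorMod(hash, oldCap));       // same as the masked old index
        }
    }
}
```

The "new" and "direct" columns agree for every hash, which is why resize can split each list into two without recomputing any index.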
1.3.4 Lookup
The code of get, which returns the value for a key, is:
    public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(key)) == null ? null : e.value;
    }
It calls getNode(Object key), whose code is:
    /**
     * Implements Map.get and related methods.
     *
     * @param key the key
     * @return the node, or null if none
     */
    final Node<K,V> getNode(Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n, hash; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & (hash = hash(key))]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
getNode works as follows:
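- Check that table is not null, that its length is greater than zero, and that the first node of the bucket the key maps to is not null; otherwise return null. The hash value of key is computed while locating that first node. The code is:
      if ((tab = table) != null && (n = tab.length) > 0 &&
          (first = tab[(n - 1) & (hash = hash(key))]) != null)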
- Check whether the first node is the target and return it if so. The code is:
      if (first.hash == hash && // always check first node
          ((k = first.key) == key || (key != null && key.equals(k))))
          return first;
- Check whether the first node is a tree node. If it is, use the tree lookup; otherwise walk the remaining nodes of the list in order until the key is found or the list ends:
      if ((e = first.next) != null) {
          if (first instanceof TreeNode)
              return ((TreeNode<K,V>)first).getTreeNode(hash, key);
          do {
              if (e.hash == hash &&
                  ((k = e.key) == key || (key != null && key.equals(k))))
                  return e;
          } while ((e = e.next) != null);
      }
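That the stored hash is compared before equals can be observed from the outside with a key type that counts its equals calls. The sketch below is an experiment under stated assumptions, not part of the JDK: HashFirstDemo and Key are made-up names, and the hash codes 1 and 17 are chosen because both map to bucket 1 of the default 16-slot table.

```java
import java.util.HashMap;
import java.util.Map;

final class HashFirstDemo {

    // Hypothetical key type: the hash code is chosen by the caller and every equals call is counted.
    static final class Key {
        static int equalsCalls = 0;
        final int hash;
        final String name;

        Key(int hash, String name) {
            this.hash = hash;
            this.name = name;
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Key && ((Key) o).hash == hash && ((Key) o).name.equals(name);
        }
    }

    // Stores two keys that share a bucket, then looks one up with a fresh but equal instance
    // and returns how many times Key.equals ran during that lookup.
    static int equalsCallsForLookup() {
        Map<Key, String> map = new HashMap<>();
        map.put(new Key(1, "a"), "first");
        map.put(new Key(17, "b"), "second");
        Key.equalsCalls = 0;
        String found = map.get(new Key(17, "b"));
        if (!"second".equals(found)) {
            throw new IllegalStateException("lookup failed");
        }
        return Key.equalsCalls;
    }

    public static void main(String[] args) {
        // Only the node whose stored hash matches is passed to equals.
        System.out.println(equalsCallsForLookup()); // 1
    }
}
```

The node with hash 1 sits first in the bucket, yet equals is never called on it, because the integer comparison already rules it out.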
1.3.5 Removing a Key-Value Pair by Key
The code for removing a key-value pair by key is:
    /**
     * Removes the mapping for the specified key from this map if present.
     *
     * @param key key whose mapping is to be removed from the map
     * @return the previous value associated with {@code key}, or
     *         {@code null} if there was no mapping for {@code key}.
     *         (A {@code null} return can also indicate that the map
     *         previously associated {@code null} with {@code key}.)
     */
    public V remove(Object key) {
        Node<K,V> e;
        return (e = removeNode(hash(key), key, null, false, true)) == null ?
            null : e.value;
    }
The code of removeNode is:
    /**
     * Implements Map.remove and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to match if matchValue, else ignored
     * @param matchValue if true only remove if value is equal
     * @param movable if false do not move other nodes while removing
     * @return the node, or null if none
     */
    final Node<K,V> removeNode(int hash, Object key, Object value,
                               boolean matchValue, boolean movable) {
        Node<K,V>[] tab; Node<K,V> p; int n, index;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (p = tab[index = (n - 1) & hash]) != null) {
            Node<K,V> node = null, e; K k; V v;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                node = p;
            else if ((e = p.next) != null) {
                if (p instanceof TreeNode)
                    node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
                else {
                    do {
                        if (e.hash == hash &&
                            ((k = e.key) == key ||
                             (key != null && key.equals(k)))) {
                            node = e;
                            break;
                        }
                        p = e;
                    } while ((e = e.next) != null);
                }
            }
            if (node != null && (!matchValue || (v = node.value) == value ||
                                 (value != null && value.equals(v)))) {
                if (node instanceof TreeNode)
                    ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
                else if (node == p)
                    tab[index] = node.next;
                else
                    p.next = node.next;
                ++modCount;
                --size;
                afterNodeRemoval(node);
                return node;
            }
        }
        return null;
    }
The basic logic is as follows:
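- Check that table is not null, that its length is greater than zero, and that the first node of the bucket the key maps to is not null; otherwise return null. The code is:
      if ((tab = table) != null && (n = tab.length) > 0 &&
          (p = tab[index = (n - 1) & hash]) != null)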
- Check whether the first node is the target node and, if so, record it in node. The code is:
      if (p.hash == hash &&
          ((k = p.key) == key || (key != null && key.equals(k))))
          node = p;
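- Otherwise, check whether the first node is a tree node. If it is, use the tree lookup; if the bucket is a linked list, walk it in order until the key is found or the list ends, keeping p as the predecessor of the current node:
      else if ((e = p.next) != null) {
          if (p instanceof TreeNode)
              node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
          else {
              do {
                  if (e.hash == hash &&
                      ((k = e.key) == key ||
                       (key != null && key.equals(k)))) {
                      node = e;
                      break;
                  }
                  p = e;
              } while ((e = e.next) != null);
          }
      }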
- Remove the node that was found, provided the value matches when matchValue is true:
      if (node != null && (!matchValue || (v = node.value) == value ||
                           (value != null && value.equals(v)))) {
          if (node instanceof TreeNode)
              ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
          else if (node == p)
              tab[index] = node.next;
          else
              p.next = node.next;
          ++modCount;
          --size;
          afterNodeRemoval(node);
          return node;
      }
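From the caller's side, these steps show up in the return values of remove. The sketch below uses the made-up class RemoveDemo with sample data; the two-argument remove(key, value) is the standard Map method that removes an entry only when the value also matches, which corresponds to matchValue being true.

```java
import java.util.HashMap;
import java.util.Map;

final class RemoveDemo {

    // Builds a small sample map; HashMap allows null values.
    static Map<String, String> sample() {
        Map<String, String> map = new HashMap<>();
        map.put("name", "nwq");
        map.put("age", "18");
        map.put("nickname", null);
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = sample();
        System.out.println(map.remove("name"));      // nwq, the old value
        System.out.println(map.remove("missing"));   // null, there was no mapping
        System.out.println(map.remove("nickname"));  // null, the mapping existed with a null value
        System.out.println(map.remove("age", "20")); // false, the value does not match
        System.out.println(map.remove("age", "18")); // true, key and value both match
        System.out.println(map.isEmpty());           // true
    }
}
```

Because a null return is ambiguous, containsKey is the reliable way to tell a missing key from a key mapped to null.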
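1.4 Summary
HashMap contains a hash table, the array table, in which each element table[i] points to a singly linked list or a red-black tree. Values are stored and retrieved by key: the hash value is computed from the key and reduced modulo the table length to get the array index bucketIndex, and the operation then works on the list (or tree) that table[bucketIndex] points to. Stores and lookups touch only the bucket selected by the key's hash value and never visit other buckets. Within that bucket, hash values are compared first, and equals is called only when they match. This requires that objects that are equal according to equals return the same hashCode; pay special attention to this when the key is a custom class. It is a key part of the contract between hashCode and equals. Note also that Java 8 optimized the HashMap implementation: when hash collisions are severe, that is, when many elements map to the same list (specifically, when a list grows beyond 8 nodes and the table length is at least 64), the list is converted to a red-black tree to speed up lookups.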
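The hashCode and equals constraint is easy to violate with a custom key class. The sketch below contrasts two hypothetical key types written for this illustration: GoodKey overrides both methods consistently, while BadKey overrides only equals and therefore keeps the identity-based hashCode inherited from Object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class KeyContractDemo {

    // Hypothetical key that overrides equals and hashCode consistently.
    static final class GoodKey {
        final String id;

        GoodKey(String id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof GoodKey && ((GoodKey) o).id.equals(id);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id);
        }
    }

    // Hypothetical key that overrides equals only; equal instances usually get different hash codes.
    static final class BadKey {
        final String id;

        BadKey(String id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id.equals(id);
        }
    }

    static String lookupWithGoodKey() {
        Map<GoodKey, String> map = new HashMap<>();
        map.put(new GoodKey("42"), "found");
        return map.get(new GoodKey("42")); // equal key, equal hash: the entry is found
    }

    static String lookupWithBadKey() {
        Map<BadKey, String> map = new HashMap<>();
        map.put(new BadKey("42"), "found");
        return map.get(new BadKey("42")); // equal key, but almost certainly a different hash
    }

    public static void main(String[] args) {
        System.out.println(lookupWithGoodKey()); // found
        System.out.println(lookupWithBadKey());  // usually null, since the hash comparison fails first
    }
}
```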
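HashMap implements the Map interface and makes it easy to store and retrieve values by key. Internally it combines an array, linked lists, and hashing, which gives it the following characteristics: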
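- Storing and retrieving a value by key is very efficient, $O(1)$ on average: each singly linked list usually has only one or a few nodes, and the hash value locates the bucket directly;
- The key-value pairs in a HashMap have no guaranteed order, because their positions are determined by the hash values and the table length rather than by insertion order.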
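If you frequently need to store and retrieve values by key and do not care about order, HashMap is an ideal choice. To keep insertion order, use LinkedHashMap, a subclass of HashMap. Map has another important implementation, TreeMap, which keeps its keys sorted. Note that HashMap is not thread-safe. Java also has the class Hashtable, one of the earliest container classes in Java. It implements the Map interface and works on a principle similar to HashMap, but without the special optimizations, and it achieves thread safety internally through synchronized. In HashMap both keys and values may be null, whereas in Hashtable neither may be. When thread safety is not required, HashMap is recommended; under high concurrency, ConcurrentHashMap is recommended.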
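The ordering difference between the three implementations can be seen by inserting the same keys into each. MapOrderDemo and keyOrder are made-up names for this sketch; no output is claimed for HashMap, since its iteration order is unspecified.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class MapOrderDemo {

    // Inserts the same three keys into the given map and returns the iteration order of its keys.
    static List<String> keyOrder(Map<String, Integer> map) {
        String[] keys = {"pear", "apple", "fig"};
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], i);
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keyOrder(new LinkedHashMap<>())); // [pear, apple, fig], insertion order
        System.out.println(keyOrder(new TreeMap<>()));       // [apple, fig, pear], sorted by key
        System.out.println(keyOrder(new HashMap<>()));       // some order derived from the hash values
    }
}
```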
[1] 马俊昌. Java编程的逻辑[M]. 北京: 机械工业出版社, 2018.
[2] 尚硅谷教育. 剑指Java:核心原理与应用实践[M]. 北京: 电子工业出版社, 2023.