浅析Java集合类源码（二）--- HashSet, HashMap, Hashtable

4.HashSet

HashSet继承了AbstractSet，实现了Set，Cloneable，Serializable等接口。HashSet不是线程安全类。

public class HashSet<E>
    extends AbstractSet<E>
    implements Set<E>, Cloneable, java.io.Serializable

HashSet通过一个HashMap来保存对象，Map的键是每次添加的元素，值是一个 static final Object对象PRESENT。

// Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();

（1）初始化

初始化可设置初始容量initialCapacity和加载因子loadFactor，HashSet调用HashMap的初始化方法，因此默认初始容量initialCapacity和HashMap相同，是16；默认加载因子loadFactor也和HashMap相同，是0.75。

当【元素数量（size） > 容量（capacity） * 加载因子（loadFactor）】时，HashSet的容量会相应的进行扩展至原来的2倍

public HashSet() {
    map = new HashMap<>();
}

public HashSet(int initialCapacity) {
    map = new HashMap<>(initialCapacity);
}

public HashSet(int initialCapacity, float loadFactor) {
    map = new HashMap<>(initialCapacity, loadFactor);
}

（2）添加

HashSet的添加方法调用HashMap的添加方法

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

（3）删除

HashSet的删除方法调用HashMap的删除方法

public boolean remove(Object o) {
    return map.remove(o)==PRESENT;
}

5.HashMap

HashMap继承了AbstractMap，实现了Map，Cloneable，Serializable。HashMap不是线程安全类。

public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable

HashMap对键key取hash值，然后将同一个hash值的元素保存在数组某个位置中。所谓同一个hash值，是指将元素的hash值与（数组容量-1）取（与&）关系后得到的值，目的是去除hash值的高位，保留hash值的低位，将其作为key的hash值。

HashMap中hash值的计算程序：将Key的原始hash值与（数组容量-1）取与（&）关系得到的值。因为数组容量是2的幂次方，减1后低位则全部变为1，对hash值进行与（&）运算后得到hash值的低位，去除了高位。

if ((tab = table) == null || (n = tab.length) == )
    n = (tab = resize()).length;
if ((p = tab[i = (n - ) & hash]) == null)
    tab[i] = newNode(hash, key, value, null);

当HashMap中的元素不太多时，使用Node链表来存储；当HashMap中的元素过多时，使用红黑树TreeNode的存储结构来保存元素。

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
Node<K,V> next;
}

static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;  // red-black tree links
    TreeNode<K,V> left;
    TreeNode<K,V> right;
    TreeNode<K,V> prev;    // needed to unlink next upon deletion
    boolean red;
}

使用Node节点保存HashMap元素的数据结构：

浅析Java集合类源码（二）--- HashSet, HashMap, Hashtable

（1）初始化

HashMap初始化可设置初始容量initialCapacity和加载因子loadFactor。默认初始容量是16，会在添加第一个元素的时候初始hash数组的长度。默认加载因子是0.75。

public HashMap(int initialCapacity, float loadFactor)

public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

初始容量需为2的幂次方，当初始容量不为2的幂次方时，HashMap会调用tableSizeFor()方法把容量变为2的幂次方。

static final int tableSizeFor(int cap) {
    int n = cap - ;
    n |= n >>> ;
    n |= n >>> ;
    n |= n >>> ;
    n |= n >>> ;
    n |= n >>> ;
    return (n < ) ?  : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + ;
}

（2）查找

当调用get()方法对查找某一个Key值的元素时，getNode()方法计算键Key的hash值，先找到对应hash值在数组中的位置，然后对数组中对应位置的元素链表遍历比较，直到节点Node的Key值与查找的Key值相同。

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) >  &&
        (first = tab[(n - ) & hash]) != null) {
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}

（3）添加

添加元素时，若Key的hash值对应的数组位置还没有元素，则直接添加到该位置

if ((p = tab[i = (n - ) & hash]) == null)
    tab[i] = newNode(hash, key, value, null);

若该位置已有元素，则先判断该元素的Key值是否与添加的Key值相同，若相同直接替换该Node节点的value值。

Node<K,V> e; K k;
if (p.hash == hash &&
    ((k = p.key) == key || (key != null && key.equals(k))))
    e = p;

否则遍历该链表的元素。若没有发现相同的Key值，则将该元素添加到链表的末尾。若发现相同的Key值，则替换该Node节点的value值。

else {
    for (int binCount = ; ; ++binCount) {
        if ((e = p.next) == null) {
            p.next = newNode(hash, key, value, null);
            break;
        }
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            break;
        p = e;
    }
}
if (e != null) { // existing mapping for key
    V oldValue = e.value;
    if (!onlyIfAbsent || oldValue == null)
        e.value = value;
    afterNodeAccess(e);
return oldValue;

（4）删除

删除某个节点和添加节点相似，当查找到某个节点后，将后继节点赋值到前驱节点的next，相当于把查找的节点移除

if (node != null && (!matchValue || (v = node.value) == value ||
                     (value != null && value.equals(v)))) {
    if (node instanceof TreeNode)
        ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
    else if (node == p)
        tab[index] = node.next;
    else
        p.next = node.next;
}

（5）调整元素

若HashMap中的元素数量达到了threshold门限值（当前数组大小Capacity * 加载因子loadFactor），便通过resize()方法调整数组大小。

基本思想：设置loHead，loTail，hiHead，hiTail四个变量：其中lo变量是用于保存hash值在原数组的节点，hi变量是用于保存hash值在新数组的节点；Head变量用于保存同一个hash值链表的第一个元素，对于一组hash链表元素进行遍历时，只被初始化一次，Tail变量用于保存同一个hash值链表的最后一个元素。

例子：

1.初始HashMap的数组长度是16，先插入Key的hash值为{1}的节点，再插入Key的hash值为{17}的节点，再插入Key的hash值为{31}的节点。由于{1}，{17}，{31}对15取（与&）运算的值均为1，因此均保存在hash数组的table[1]元素中，以链表的方式存储。

2.当数组长度需要扩充至32时：

（1）{1}对16取（与&）运算后仍为0，因此仍然在hash[1]位置，此时loHead和loTail均为{1}；

（2）{17}对16取（与&）运算后不为0，而原来{17}位于hash[1]的位置，因此其位于新table数组的[1+16]=[17]的位置，此时hiHead和hiTail均为{17}；

（3）{31}对16取（与&）运算后仍为0，因此仍在table[1]位置，此时loTail变为{31}，loHead不变，仍为{1}。

for (int j = ; j < oldCap; ++j) {
    Node<K,V> e;
    if ((e = oldTab[j]) != null) {
        oldTab[j] = null;
        if (e.next == null)
            newTab[e.hash & (newCap - )] = e;
        else if (e instanceof TreeNode)
            ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
        else { // preserve order
            Node<K,V> loHead = null, loTail = null;
            Node<K,V> hiHead = null, hiTail = null;
            Node<K,V> next;
            do {
                next = e.next;
                if ((e.hash & oldCap) == ) {
                    if (loTail == null)
                        loHead = e;
                    else
                        loTail.next = e;
                    loTail = e;
                }
                else {
                    if (hiTail == null)
                        hiHead = e;
                    else
                        hiTail.next = e;
                    hiTail = e;
                }
            } while ((e = next) != null);
            if (loTail != null) {
                loTail.next = null;
                newTab[j] = loHead;
            }
            if (hiTail != null) {
                hiTail.next = null;
                newTab[j + oldCap] = hiHead;
            }
        }
    }
}

5.Hashtable

Hashtable继承Dictionary，实现了Map，Cloneable，Serializable接口。Hashtable是线程安全类。

public class Hashtable<K,V>
    extends Dictionary<K,V>
    implements Map<K,V>, Cloneable, java.io.Serializable

和HashMap一样，Hashtable也是将Key的hash值相同但Key不同的元素放入数组的同一个位置，保存为Entry节点，然后将节点用链表的形式串起来。

private static class Entry<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Entry<K,V> next;
}

Hashtable的hash值计算程序：将hash值与0x7FFFFFFF取与（&）后，对当前数组容量取模（%）。取模（%）的目的是保证hash值在当前table[]数组的范围内。

int index = (hash & ) % tab.length;

Hashtable的数据结构如下图

浅析Java集合类源码（二）--- HashSet, HashMap, Hashtable

（1）初始化

Hashtable在初始化时设置初始容量initialCapacity和加载因子loadFactor，默认initialCapacity为11，loadFactor为0.75

public Hashtable() {
    this(, f);
}

public Hashtable(int initialCapacity) {
    this(initialCapacity, f);
}

public Hashtable(int initialCapacity, float loadFactor)

（2）查找

通过调用get()方法来查找相应Key的Value时，先通过Key的hash值计算获取在数组中的索引index值，再遍历该index值的链表，找到相同Key值的元素Entry，返回其值。

public synchronized V get(Object key) {
    Entry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & ) % tab.length;
    for (Entry<?,?> e = tab[index] ; e != null ; e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {
            return (V)e.value;
        }
    }
    return null;
}

（3）添加

添加的过程和查找很相似，都是先通过Key的hash值计算获取在数组中的索引index值，再遍历该index值的链表。当发现相同Key值的元素Entry，则更改此Entry的为新值，返回旧值。否则，添加一个新的Entry作为链表的头节点。

public synchronized V put(K key, V value) {
    // Makes sure the key is not already in the hashtable.
    Entry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & ) % tab.length;
    Entry<K,V> entry = (Entry<K,V>)tab[index];
    for(; entry != null ; entry = entry.next) {
        if ((entry.hash == hash) && entry.key.equals(key)) {
            V old = entry.value;
            entry.value = value;
            return old;
        }
    }
    addEntry(hash, key, value, index);
    return null;
}

private void addEntry(int hash, K key, V value, int index) {
    modCount++;
    Entry<?,?> tab[] = table;
    // Creates the new entry.
    @SuppressWarnings("unchecked")
    Entry<K,V> e = (Entry<K,V>) tab[index];
    tab[index] = new Entry<>(hash, key, value, e);
    count++;
}

（4）删除

删除时先计算出数组的索引值，然后遍历该位置的链表。找到符合的Key值，然后进行链表的删除操作。

public synchronized V remove(Object key) {
    Entry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & ) % tab.length;
    @SuppressWarnings("unchecked")
    Entry<K,V> e = (Entry<K,V>)tab[index];
    for(Entry<K,V> prev = null ; e != null ; prev = e, e = e.next) {
        if ((e.hash == hash) && e.key.equals(key)) {
            modCount++;
            if (prev != null) {
                prev.next = e.next;
            } else {
                tab[index] = e.next;
            }
            count--;
            V oldValue = e.value;
            e.value = null;
            return oldValue;
        }
    }
    return null;
}

（5）调整元素

当数组需要扩展时，Hashtable调用rehash()方法调整元素的位置。

当数组需要扩展时，需要先对旧数组的每个索引位置的链表进行扫描，计算链表中每个元素的新的索引值，将其插入到该索引值的链表的头部。

for (int i = oldCapacity ; i-- >  ;) {
    for (Entry<K,V> old = (Entry<K,V>)oldMap[i] ; old != null ; ) {
        Entry<K,V> e = old;
        old = old.next;

        int index = (e.hash & ) % newCapacity;
        e.next = (Entry<K,V>)newMap[index];
        newMap[index] = e;
    }
}

浅析Java集合类源码（二）--- HashSet, HashMap, Hashtable

4.HashSet

（1）初始化

（2）添加

（3）删除

5.HashMap

（1）初始化

（2）查找

（3）添加

（4）删除

（5）调整元素

5.Hashtable

（1）初始化

（2）查找

（3）添加

（4）删除

（5）调整元素

继续阅读

关于Gradle配置的小结

Java小案例——随机数猜测随机数猜测

nginx location中斜线的位置的重要性

27 Best Free Eclipse Plug-ins for Java Developer to be ProductiveCode Quality PluginsText Editor PluginsDependency ManagementVersion Control Integration PluginsFramework Development Continuous Integration Related PluginsOther Utility Plugins

Java String.format方法的简单使用

neo4j之cypher使用文档

GitHub连夜封杀！这份阿里 10W 字内部 Java 字面试手册到底有多强？

spark/scala关于【资源文件】加载方法概述外部文件加载方案测试资源文件打包入jar包中小结

mybatis_入门程序Mybatis入门

AOP编程_Android优雅权限框架(1)概念基础，2021金三银四前言正文大纲正文

Effective Java 8:通用程序设计

OOM三种类型

工厂模式-三种类型

【递归】高效率求2的n次幂

win10本地scala和spark安装安装scala安装spark

scala (3) Function 和 Method