Java HashMap執行個體源碼分析

引言

hashmap在鍵值對存儲中被經常使用，那麼它到底是如何實作鍵值存儲的呢？

一 entry

entry是map接口中的一個内部接口，它是實作鍵值對存儲關鍵。在hashmap中，有entry的實作類，叫做entry。entry類很簡

單，裡面包含key，value，由外部引入的hash，還有指向下一個entry對象的引用，和資料結構中學的連結清單中的note節點很類似。

entry類的屬性和構造函數:

final k key;

v value;

entry<k,v> next;

int hash;

/**

* creates new entry.

entry(int h, k k, v v, entry<k,v> n) {

value = v;

next = n;

key = k;

hash = h;

}

二 hashmap的初始化

//hashmap構造方法

public hashmap(int initialcapacity, float loadfactor) {

if (initialcapacity < 0)

throw new illegalargumentexception("illegal initial capacity: " +

initialcapacity);

if (initialcapacity > maximum_capacity)

initialcapacity = maximum_capacity;

if (loadfactor <= 0 || float.isnan(loadfactor))

throw new illegalargumentexception("illegal load factor: " +

loadfactor);

this.loadfactor = loadfactor;

threshold = initialcapacity;

init();

這是hashmap的構造函數之一，其他構造函數都引用這個構造函數進行初始化。參數initialcapacity指的是hashmap中

table數組最初的大小，參數loadfactory指的是hashmap可容納鍵值對與數組長度的比值（舉個例子：數組長度預設值為

16，loadfactory預設值為0.75，如果hashmap中存儲的鍵值對即entry多于12，則會進行擴容，擴容後大小為目前數組長度的2

倍）。在構造函數中不會對數組進行初始化，隻有在put等操作方法内會進行判斷是否要初始化或擴容。

三 table數組

在hashmap中有一個概念叫做threshold(實際可容納量)，實際可容納量指的是在hashmap中允許存在最多的entry的個數，它

是由hashmap中内置的數組table的長度*load factory(負載因子)得來。其作用是保證hashmap的效率。

table數組是hashmap實作鍵值對存儲的又一關鍵，具體鍵值對是怎麼存的呢？請看下圖

如圖中的[key，value]就是entry對象來實作的，而table數組是用來存放entry對象的。

//數組的初始化：

private static int rounduptopowerof2(int number) {

return number >= maximum_capacity

? maximum_capacity

: (number > 1) ? integer.highestonebit((number - 1) << 1) : 1;

private void inflatetable(int tosize) {

// find a power of 2 >= tosize

int capacity = rounduptopowerof2(tosize);

threshold = (int) math.min(capacity * loadfactor, maximum_capacity + 1);

table = new entry[capacity];

inithashseedasneeded(capacity);

在put等方法中發現數組未進行初始化時會調用inflatetable方法進行初始化，輸入參數為初始設定的initialcapacity，實

際上他會調用rounduptopowerof2方法傳回一個比初始容量大的最小的2的幂數（其中一個原因是在得到entry所在數組位置時友善）。

四 put方法

public v put(k key, v value) {

if (table == empty_table) {

inflatetable(threshold);

if (key == null)

return putfornullkey(value);

int hash = hash(key);

int i = indexfor(hash, table.length);

for (entry<k,v> e = table[i]; e != null; e = e.next) {

object k;

if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {

v oldvalue = e.value;

e.value = value;

e.recordaccess(this);

return oldvalue;

}

modcount++;

addentry(hash, key, value, i);

return null;

private v putfornullkey(v value) {

for (entry<k,v> e = table[0]; e != null; e = e.next) {

if (e.key == null) {

addentry(0, null, value, 0);

void addentry(int hash, k key, v value, int bucketindex) {

if ((size >= threshold) && (null != table[bucketindex])) {

resize(2 * table.length);

hash = (null != key) ? hash(key) : 0;

bucketindex = indexfor(hash, table.length);

createentry(hash, key, value, bucketindex);

void createentry(int hash, k key, v value, int bucketindex) {

entry<k,v> e = table[bucketindex];

table[bucketindex] = new entry<>(hash, key, value, e);

size++;

在put方法中

1. 首先會判斷數組是否為空，如果為空會對數組進行初始化。

2. 接下來判斷key是否為null，如果為null就采用第二個方法對鍵值對進行put。

3. 接下來對key進行hash得到一個數值，再對這個數值進行處理（indexfor方法）得到所在數組中的位置。

4. 接下來會周遊所在數組位置的連結清單，如果key的hash和傳入key的hash相同且（key記憶體位址相等或 equals方法相等），則意味着會更新在連結清單中的value值，并傳回舊的value值。

5. 如果上邊的方法都沒有奏效，則會調用第三個方法，建立一個新的entry對象。

在putfornullkey方法中，我們看到它是為了null值專門設定的，null值的hash始終為0，是以key為null的entry對象肯定在數組的第0個位置。同樣，如果找到則更新，沒有找到則添加。

調用addentry方法

意味着要往這個數組連結清單中添加一個entry，是以會在最開始判斷已經存在的entry數量是否超過了實際可容納量。如果超過了，則會調用resize方

法将數組擴大兩倍，注意在擴大之後會對已經存入的entry進行重排，原因是當初存入時indexfor方法與數組長度有關系。接着會調用第四個方法。

createentry方法很簡單，就是将原本在數組中存放的連結清單頭置入到新的entry之後，将新的entry放入數組中。從這裡我們可以看出hashmap不保證順序問題。

get方法和contains方法原理和put方法一緻，即先通過對key的hash得到其value值所在的連結清單頭在數組中的位置，再通過equals方法判斷value是否存在。

五其他

//hash方法

final int hash(object k) {

int h = hashseed;

if (0 != h && k instanceof string) {

return sun.misc.hashing.stringhash32((string) k);

h ^= k.hashcode();

// this function ensures that hashcodes that differ only by

// constant multiples at each bit position have a bounded

// number of collisions (approximately 8 at default load factor).

h ^= (h >>> 20) ^ (h >>> 12);

return h ^ (h >>> 7) ^ (h >>> 4);

hash方法中最終傳回值與key的hashcode方法有關。

總結

最終數組初始化的容量大小會是大于等于你傳入初始容量的最小2的幂數。

key為null或value為null能存入hashmap的原因是對null值會進行單獨的操作。

在table數組中的連結清單中每個entry的共同點是key的hash(key.hashcode)部分相同。

注意對key的hashcode和equals方法的重寫當你想讓兩個key映射一個對象，因為判定key相等的條件是（hashcode相等+（記憶體相等或 equals相等））。

最早存入的鍵值對會在連結清單的末端。

當數組沒有連結清單存在時，hashmap性能最好為o(1)。而最差為o(threshould)。

來源：51cto

Java HashMap執行個體源碼分析

繼續閱讀

關于Gradle配置的小結

Java小案例——随機數猜測随機數猜測

nginx location中斜線的位置的重要性

27 Best Free Eclipse Plug-ins for Java Developer to be ProductiveCode Quality PluginsText Editor PluginsDependency ManagementVersion Control Integration PluginsFramework Development Continuous Integration Related PluginsOther Utility Plugins

Java String.format方法的簡單使用

neo4j之cypher使用文檔

GitHub連夜封殺！這份阿裡 10W 字内部 Java 字面試手冊到底有多強？

spark/scala關于【資源檔案】加載方法概述外部檔案加載方案測試資源檔案打包入jar包中小結

mybatis_入門程式Mybatis入門

AOP程式設計_Android優雅權限架構(1)概念基礎，2021金三銀四前言正文大綱正文

Effective Java 8:通用程式設計

OOM三種類型

工廠模式-三種類型

【遞歸】高效率求2的n次幂

win10本地scala和spark安裝安裝scala安裝spark

scala (3) Function 和 Method