天天看點

【JVM技術探索】各種類型對象占用記憶體情況分析(下)前提回顧對象頭執行個體資料:枚舉類ArrayListHashMap

前提回顧

建議大家從【Java技術專題-JVM研究系列(39)Java各種類型對象占用記憶體情況分析(上)】開始學習比較好,這樣子會有一個承接和過度。根據前面的學習的記憶體占用計算規則,可以計算出一個對象在記憶體中的占用空間大小情況,下面舉例分析下Java中的Enum, ArrayList及HashMap的記憶體占用情況,讀者可以仿照分析計算過程來計算其他資料結構的記憶體占用情況。

注: 下面的分析計算基于HotSpot Jvm, JDK1.8, 64位機器,開啟指針壓縮。。

對象頭

這裡隻關注其記憶體占用大小。在64位機器上,預設不開啟指針壓縮(-XX:-UseCompressedOops)的情況下,對象頭占用12bytes,開啟指針壓縮(-XX:+UseCompressedOops)則占用16bytes。

執行個體資料:

對象引用(reference)類型在64位機器上,關閉指針壓縮時占用8bytes, 開啟時占用4bytes,一般指的是局部變量表或者操作數棧中的reference類型或者針對于成員變量情況下的位址引用(shallow size)。

注: 下面的分析計算基于HotSpot Jvm, JDK1.8, 64位機器,開啟指針壓縮。

枚舉類

建立enum時,編譯器會生成一個相關的類,這個類繼承自

java.lang.Enum

。Enum類擁有兩個屬性變量,分别為int的ordinal和String的name, 相關源碼如下:
public abstract class Enum<E extends Enum<E>>
        implements Comparable<E>, Serializable {
    /**
     * The name of this enum constant, as declared in the enum declaration.
     * Most programmers should use the {@link #toString} method rather than
     * accessing this field.
     */
    private final String name;

    /**
     * The ordinal of this enumeration constant (its position
     * in the enum declaration, where the initial constant is assigned
     * an ordinal of zero).
     *
     * Most programmers will have no use for this field.  It is designed
     * for use by sophisticated enum-based data structures, such as
     * {@link java.util.EnumSet} and {@link java.util.EnumMap}.
     */
    private final int ordinal;
}           

以下面的TestEnum為例進行枚舉類的記憶體占用分析

public enum TestEnum {
        ONE(1, "one"),
        TWO(2, "two");

        private int code;
        private String desc;

        TestEnum(int code, String desc) {
            this.code = code;
            this.desc = desc;
        }

        public int getCode() {
            return code;
        }

        public String getDesc() {
            return desc;
        }
}           

這裡TestEnum的每個執行個體除了父類的兩個屬性外,還擁有一個int的code及String的desc屬性,是以一個TestEnum的執行個體本身所占用的記憶體大小為:

12(header) + 4(ordinal) + 4(name reference) + 4(code) + 4(desc reference) = 28(padding) -> 32 bytes.           

總共占用的記憶體大小為:

按照上面對字元串類型的分析,desc和name都占用:48bytes。

是以TestEnum.ONE占用總記憶體大小為:

12(header) + 4(ordinal) + 4(code) + 48 * 2(desc, name) + 4(desc reference) + 4(name reference) = 128 (bytes)           

ArrayList

ArrayList實作List接口,底層使用數組儲存所有元素。其操作基本上是對數組的操作。下面分析下源代碼:

底層使用數組儲存資料:

/**
     * The array buffer into which the elements of the ArrayList are stored.
     * The capacity of the ArrayList is the length of this array buffer. Any
     * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
     * will be expanded to DEFAULT_CAPACITY when the first element is added.
     */
    transient Object[] elementData; // non-private to simplify nested class access           

構造方法

ArrayList提供了三種方式的構造器,可以構造一個預設的空清單、構造一個指定初始容量的空清單及構造一個包含指定collection元素的清單,這些元素按照該collection的疊代器傳回它們的順序排列。
/**
     * Shared empty array instance used for default sized empty instances. We
     * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
     * first element is added.
     */
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
    /**
     * Constructs an empty list with the specified initial capacity.
     *
     * @  initialCapacity  the initial capacity of the list
     * @throws IllegalArgumentException if the specified initial capacity
     *         is negative
     */
    public ArrayList(int initialCapacity) {
        if (initialCapacity > 0) {
            this.elementData = new Object[initialCapacity];
        } else if (initialCapacity == 0) {
            this.elementData = EMPTY_ELEMENTDATA;
        } else {
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        }
    }

    /**
     * Constructs an empty list with an initial capacity of ten.
     */
    public ArrayList() {
        this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    }

    /**
     * Constructs a list containing the elements of the specified
     * collection, in the order they are returned by the collection's
     * iterator.
     *
     * @ c the collection whose elements are to be placed into this list
     * @throws NullPointerException if the specified collection is null
     */
    public ArrayList(Collection<? extends E> c) {
        elementData = c.toArray();
        if ((size = elementData.length) != 0) {
            // c.toArray might (incorrectly) not return Object[] (see 6260652)
            if (elementData.getClass() != Object[].class)
                elementData = Arrays.copyOf(elementData, size, Object[].class);
        } else {
            // replace with empty array.
            this.elementData = EMPTY_ELEMENTDATA;
        }
    }           

存儲

ArrayList提供了set(int index, E element)、add(E e)、add(int index, E element)、addAll(Collection<? extends E> c)等,這裡着重介紹一下add(E e)方法。
/**
     * Appends the specified element to the end of this list.
     *
     * @ e element to be appended to this list
     * @return <tt>true</tt> (as specified by {@link Collection#add})
     */
    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }           
add方法将指定的元素添加到此清單的尾部。這裡注意下ensureCapacityInternal方法,這個方法會檢查添加後元素的個數是否會超過目前數組的長度,如果超出,數組将會進行擴容。
/**
     * Default initial capacity.
     */
    private static final int DEFAULT_CAPACITY = 10;

    /**
     * Increases the capacity of this <tt>ArrayList</tt> instance, if
     * necessary, to ensure that it can hold at least the number of elements
     * specified by the minimum capacity argument.
     *
     * @   minCapacity   the desired minimum capacity
     */
    public void ensureCapacity(int minCapacity) {
        int minExpand = (elementData != DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
            // any size if not default element table
            ? 0
            // larger than default for default empty table. It's already
            // supposed to be at default size.
            : DEFAULT_CAPACITY;

        if (minCapacity > minExpand) {
            ensureExplicitCapacity(minCapacity);
        }
    }

    private void ensureCapacityInternal(int minCapacity) {
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }

        ensureExplicitCapacity(minCapacity);
    }

    private void ensureExplicitCapacity(int minCapacity) {
        modCount++;

        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }           
如果初始時沒有指定ArrayList大小,在第一次調用add方法時,會初始化數組預設最小容量為10。看下grow方法的源碼:
/**
     * Increases the capacity to ensure that it can hold at least the
     * number of elements specified by the minimum capacity argument.
     *
     * @ minCapacity the desired minimum capacity
     */
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }           
從上述代碼可以看出,數組進行擴容時,會将老數組中的元素重新拷貝一份到新的數組中,每次數組擴容的增長是原容量的1.5倍。這種操作的代價是很高的,是以在實際使用時,應該盡量避免數組容量的擴張。當可預知要儲存的元素的數量時,要在構造ArrayList執行個體時,就指定其容量,以避免數組擴容的發生。或者根據實際需求,通過調用ensureCapacity方法來手動增加ArrayList執行個體的容量。

記憶體占用

下面開始分析ArrayList的記憶體占用情況。ArrayList繼承AbstractList類,AbstractList擁有一個int類型的modCount屬性,ArrayList本身擁有一個int類型的size屬性和一個數組屬性。

是以一個ArrayList執行個體本身的的大小為:

12(header) + 4(modCount) + 4(size) + 4(elementData reference) = 24 (bytes)

下面分析一個隻有一個Integer(1)元素的ArrayList<Integer>執行個體占用的記憶體大小。

ArrayList<Integer> testList = Lists.newArrayList();
   testList.add(1);           

根據上面對ArrayList原理的介紹,當調用add方法時,ArrayList會初始化一個預設大小為10的數組,而數組中儲存的Integer(1)執行個體大小為16 bytes。則testList占用的記憶體大小為:

*24(ArrayList itselft) + 16(elementData array header) + 10 4(elemetData reference) + 16(Integer) = 96 (bytes)**

HashMap

HashMap的資料結構

HashMap是一個“連結清單散列”的資料結構,即數組和連結清單的結合體。

【JVM技術探索】各種類型對象占用記憶體情況分析(下)前提回顧對象頭執行個體資料:枚舉類ArrayListHashMap

從圖上可以看出,HashMap底層是一個數組結構,數組中的每一項又是一個連結清單。當建立一個HashMap的時候,初始化一個數組,源碼如下:

/**
     * The table, initialized on first use, and resized as
     * necessary. When allocated, length is always a power of two.
     * (We also tolerate length zero in some operations to allow
     * bootstrapping mechanics that are currently not needed.)
     */
    transient Node<K,V>[] table;           

Node是連結清單中一個結點,一個Node對象儲存了一對HashMap的Key,Value以及指向下一個節點的指針,源碼如下:

/**
     * Basic hash bin node, used for most entries.  (See below for
     * TreeNode subclass, and in LinkedHashMap for its Entry subclass.)
     */
    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
     }           

HashMap提供了四種方式的構造器,分别為指定初始容量及負載因子構造器,指定初始容量構造器,不指定初始容量及負載因子構造器,以及根據已有Map生成新Map的構造器。
/**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and load factor.
     *
     * @  initialCapacity the initial capacity
     * @  loadFactor      the load factor
     * @throws IllegalArgumentException if the initial capacity is negative
     *         or the load factor is nonpositive
     */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and the default load factor (0.75).
     *
     * @  initialCapacity the initial capacity.
     * @throws IllegalArgumentException if the initial capacity is negative.
     */
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the default initial capacity
     * (16) and the default load factor (0.75).
     */
    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
    }

    /**
     * Constructs a new <tt>HashMap</tt> with the same mappings as the
     * specified <tt>Map</tt>.  The <tt>HashMap</tt> is created with
     * default load factor (0.75) and an initial capacity sufficient to
     * hold the mappings in the specified <tt>Map</tt>.
     *
     * @   m the map whose mappings are to be placed in this map
     * @throws  NullPointerException if the specified map is null
     */
    public HashMap(Map<? extends K, ? extends V> m) {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        putMapEntries(m, false);
    }           
如果不指定初始容量及負載因子,預設的初始容量為16, 負載因子為0.75。

負載因子衡量的是一個散清單的空間的使用程度,負載因子越大表示散清單的裝填程度越高,反之愈小。對于使用連結清單法的散清單來說,查找一個元素的平均時間是O(1+a),是以如果負載因子越大,對空間的利用更充分,然而後果是查找效率的降低;如果負載因子太小,那麼散清單的資料将過于稀疏,對空間造成嚴重浪費。

HashMap有一個容量門檻值屬性threshold,是根據初始容量和負載因子計算得出threshold=capacity*loadfactor, 如果HashMap中數組元素的個數超過這個門檻值,則HashMap會進行擴容。HashMap底層的數組長度總是2的n次方,每次擴容容量為原來的2倍。

擴容的目的是為了減少hash沖突,提高查詢效率。而在HashMap數組擴容之後,最消耗性能的點就出現了:原數組中的資料必須重新計算其在新數組中的位置,并放進去,這就是resize。

資料的存儲

public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

    /**
     * Implements Map.put and related methods
     * @ hash hash for key
     * @ key the key
     * @ value the value to put
     * @ onlyIfAbsent if true, don't change existing value
     * @ evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        //初始化數組的大小為16,容量門檻值為16*0.75=12
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        //如果key的hash值對應的數組位置沒有元素,則建立Node放入此位置
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }           

從上面的源代碼中可以看出:當我們往HashMap中put元素的時候,先根據key的hashCode重新計算hash值,根據hash值得到這個元素在數組中的位置(即下标)。

如果數組該位置上已經存放有其他元素了,那麼在這個位置上的元素将以連結清單的形式存放,新加入的放在鍊頭,最先加入的放在鍊尾。如果數組該位置上沒有元素,就直接将該元素放到此數組中的該位置上。

HashMap記憶體占用

這裡分析一個隻有一組鍵值對的HashMap, 結構如下:

Map<Integer, Integer> testMap = Maps.newHashMap();
testMap.put(1, 2);           

首先分析HashMap本身的大小。HashMap對象擁有的屬性包括:

/**
     * The table, initialized on first use, and resized as
     * necessary. When allocated, length is always a power of two.
     * (We also tolerate length zero in some operations to allow
     * bootstrapping mechanics that are currently not needed.)
     */
    transient Node<K,V>[] table;

    /**
     * Holds cached entrySet(). Note that AbstractMap fields are used
     * for keySet() and values().
     */
    transient Set<Map.Entry<K,V>> entrySet;

    /**
     * The number of key-value mappings contained in this map.
     */
    transient int size;

    /**
     * The number of times this HashMap has been structurally modified
     * Structural modifications are those that change the number of mappings in
     * the HashMap or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the HashMap fail-fast.  (See ConcurrentModificationException).
     */
    transient int modCount;

    /**
     * The next size value at which to resize (capacity * load factor).
     *
     * @serial
     */
    // (The javadoc description is true upon serialization.
    // Additionally, if the table array has not been allocated, this
    // field holds the initial array capacity, or zero signifying
    // DEFAULT_INITIAL_CAPACITY.)
    int threshold;

    /**
     * The load factor for the hash table.
     *
     * @serial
     */
    final float loadFactor;           

HashMap繼承了AbstractMap<K,V>, AbstractMap有兩個屬性:

transient Set<K> keySet;
transient Collection<V> values;           

是以一個HashMap對象本身的大小為:

12(header) + 4(table reference) + 4(entrySet reference) + 4(size) + 4(modCount) + 4(threshold) + 8(loadFactor) + 4(keySet reference) + 4(values reference) = 48(bytes)

接着分析testMap執行個體在總共占用的記憶體大小。

根據上面對HashMap原理的介紹,可知每對鍵值對對應一個Node對象。根據上面的Node的資料結構,一個Node對象的大小為:

12(header) + 4(hash reference) + 4(key reference) + 4(value reference) + 4(next pointer reference) = 28 (padding) -> 32(bytes)           

加上Key和Value兩個Integer對象,一個Node占用記憶體總大小為:

32 + 2 * 16 = 64(bytes)

下面分析HashMap的Node數組的大小。

根據上面HashMap的原理可知,在不指定容量大小的情況下,HashMap初始容量為16,是以testMap的Node[]占用的記憶體大小為:

*16(header) + 16 4(Node reference) + 64(Node) = 144(bytes)**