7.1. 散清單
散清單的基本思想,是通過一定的函數将需搜尋的鍵值映射為一個整數,将這個整數視為索引值去通路某片連續的記憶體區域。理論上,在最優情況下,散清單能提供O(1)複雜度的搜尋效率。
用于映射的函數稱為散列函數(hash function),而映射後的值稱為元素的散列值(hash value)。在散清單的實作中,所選擇的散列函數的優劣将直接決定所實作的散清單的搜尋效率的高低。
在使用散清單的過程中,不同的對象經過散列函數的作用,可能被映射為相同的散列值。而且随着需要存儲的資料的增多,這樣的沖突就會發生得越來越頻繁。散列沖突是散列技術與生俱來的問題。
裝載率是散清單中已使用空間和總空間的比值。如果散清單一共可以容納10個元素,而目前已經裝入了6個元素,那麼裝載率就是6/10。研究表明,當散清單的裝載率大于2/3時,散列沖突發生的機率就會大大增加。
當産生散列沖突時,Python會通過一個二次探測函數f,計算下一個候選位置addr,如果位置addr可用,則可将待插入元素放到位置addr;如果位置addr不可用,則Python會再次使用探測函數f,獲得下一個候選位置,如此不斷探測,總會找到一個可用的位置。
當需要删除某條探測鍊上的某個元素時,問題就産生了。假如這條鍊的首元素位置為a,尾元素的位置為c,現在需要删除中間的某個位置b上的元素。如果直接将位置b上的元素删除,則會導緻探測鍊的斷裂,進而找不到c。是以,在采用開放定址的沖突解決政策的散清單中,删除某條探測鍊上的元素時不能進行真正的删除,而是進行一種“僞删除”操作,必須要讓該元素還存在于探測鍊上,擔當承前啟後的重任。
7.2. Dict對象資料結構
7.2.1. Dict對象的C源碼
PyDictObject的C源碼如下:
// dictobject.h
typedef struct _dictkeysobject PyDictKeysObject;
/* The ma_values pointer is NULL for a combined table
* or points to an array of PyObject* for a split table
*/
typedef struct {
PyObject_HEAD
/* Number of items in the dictionary */
Py_ssize_t ma_used;
/* Dictionary version: globally unique, value change each time the dictionary is modified */
uint64_t ma_version_tag;
PyDictKeysObject *ma_keys;
/* If ma_values is NULL, the table is "combined": keys and values are stored in ma_keys.
If ma_values is not NULL, the table is splitted: keys are stored in ma_keys and values are stored in ma_values */
PyObject **ma_values;
} PyDictObject;
PyDictObject結構裡最重要的一個屬性是ma_keys,在combined table模式下ma_keys存着key和value。ma_keys為PyDictKeysObject 類型,PyDictKeysObject的C源碼如下:
// dict-common.h
typedef struct {
/* Cached hash code of me_key. */
Py_hash_t me_hash;
PyObject *me_key;
PyObject *me_value; /* This field is only meaningful for combined tables */
} PyDictKeyEntry;
/* dict_lookup_func() returns index of entry which can be used like DK_ENTRIES(dk)[index].
* -1 when no entry found, -3 when compare raises error.
*/
typedef Py_ssize_t (*dict_lookup_func)
(PyDictObject *mp, PyObject *key, Py_hash_t hash, PyObject **value_addr);
/* See dictobject.c for actual layout of DictKeysObject */
struct _dictkeysobject {
Py_ssize_t dk_refcnt;
/* Size of the hash table (dk_indices). It must be a power of 2. */
Py_ssize_t dk_size;
/* Function to lookup in the hash table (dk_indices): */
dict_lookup_func dk_lookup;
/* Number of usable entries in dk_entries. */
Py_ssize_t dk_usable;
/* Number of used entries in dk_entries. */
Py_ssize_t dk_nentries;
/* Actual hash table of dk_size entries. It holds indices in dk_entries,
or DKIX_EMPTY(-1) or DKIX_DUMMY(-2).
Indices must be: 0 <= indice < USABLE_FRACTION(dk_size).
The size in bytes of an indice depends on dk_size:
- 1 byte if dk_size <= 0xff (char*)
- 2 bytes if dk_size <= 0xffff (int16_t*)
- 4 bytes if dk_size <= 0xffffffff (int32_t*)
- 8 bytes otherwise (int64_t*)
Dynamically sized, 8 is minimum. */
union {
int8_t as_1[8];
int16_t as_2[4];
int32_t as_4[2];
#if SIZEOF_VOID_P > 4
int64_t as_8[1];
#endif
} dk_indices;
};
7.2.2. PyDictKeysObject記憶體布局
記憶體布局如下:
+---------------+
| dk_refcnt |
| dk_size |
| dk_lookup |
| dk_usable |
| dk_nentries |
+---------------+
| dk_indices |
| |
+---------------+
| dk_entries |
| |
+---------------+
其中:
- dk_indices是實際的hash表,它儲存dk_entries中的index,或DKIX_EMPTY(-1),或DKIX_DUMMY(-2)。dk_indices的size為dk_size。
- dk_size不同,index的類型也不同,如下表。值得注意的是,由于-1和-2有特殊的含義,是以索引必須是有符号整數,是以dk_size <= 128的時候使用int8,而不是dk_size <=256的時候使用int8。
int8 for dk_size <= 128
int16 for 256 <= dk_size <= 2**15
int32 for 2**16 <= dk_size <= 2**31
int64 for 2**32 <= dk_size
- dk_entries是PyDictKeyEntry類型的數組,數組的size是USABLE_FRACTION(dk_size),即可dk_size * 2/3。通過DK_ENTRIES(dk)可以擷取dk_entries的指針。
7.2.3. PyDictKeysObject的兩種形式
分為combined table和split table:
- combined table:ma_values == NULL,dk_refcnt == 1,ma_keys存儲key和值;
- split table:ma_values != NULL,dk_refcnt >= 1,ma_keys存儲key,me_value數組裡存儲值;
下面的分析基于combined table。
7.3. Dict對象
Dict對象是“變長對象”。
7.3.1. Python中的建立
Python中Dict對象最重要的建立方法為PyDict_New,如下Python語句最終會調用到PyDict_New:
test = {1:'hello'}
7.3.2. PyDict_New的C調用棧
// compile.c
PyAST_CompileObject
// symtable.c
=>PySymtable_BuildObject
=>symtable_new
// dictobject.c
=> PyDict_New
7.3.3. PyDict_New源碼
// dictobject.c
#define USABLE_FRACTION(n) (((n) << 1)/3)
#define PyDict_MINSIZE 8
#define DK_SIZE(dk) ((dk)->dk_size)
#define DK_IXSIZE(dk) \
(DK_SIZE(dk) <= 0xff ? \
1 : DK_SIZE(dk) <= 0xffff ? \
2 : DK_SIZE(dk) <= 0xffffffff ? \
4 : sizeof(int64_t))
#define DK_ENTRIES(dk) \
((PyDictKeyEntry*)(&(dk)->dk_indices.as_1[DK_SIZE(dk) * DK_IXSIZE(dk)]))
PyObject *
PyDict_New(void)
{
PyDictKeysObject *keys = new_keys_object(PyDict_MINSIZE);
if (keys == NULL)
return NULL;
return new_dict(keys, NULL);
}
new_keys_object用于建立PyDictKeysObject對象。計算usable,如果緩沖區沒有可用的PyDictKeysObject對象則計算PyDictKeysObject對象的實際大小并配置設定記憶體,然後各種設定,最後設定查找函數lookdict_unicode_nodummy,并memset記憶體。源碼如下:
// dictobject.c
static PyDictKeysObject *new_keys_object(Py_ssize_t size)
{
PyDictKeysObject *dk;
Py_ssize_t es, usable;
assert(size >= PyDict_MINSIZE);
assert(IS_POWER_OF_2(size));
usable = USABLE_FRACTION(size);
if (size <= 0xff) {
es = 1;
}
else if (size <= 0xffff) {
es = 2;
}
#if SIZEOF_VOID_P > 4
else if (size <= 0xffffffff) {
es = 4;
}
#endif
else {
es = sizeof(Py_ssize_t);
}
if (size == PyDict_MINSIZE && numfreekeys > 0) {
dk = keys_free_list[--numfreekeys];
}
else {
dk = PyObject_MALLOC(sizeof(PyDictKeysObject)
- Py_MEMBER_SIZE(PyDictKeysObject, dk_indices)
+ es * size
+ sizeof(PyDictKeyEntry) * usable);
if (dk == NULL) {
PyErr_NoMemory();
return NULL;
}
}
DK_DEBUG_INCREF dk->dk_refcnt = 1;
dk->dk_size = size;
dk->dk_usable = usable;
dk->dk_lookup = lookdict_unicode_nodummy;
dk->dk_nentries = 0;
memset(&dk->dk_indices.as_1[0], 0xff, es * size);
memset(DK_ENTRIES(dk), 0, sizeof(PyDictKeyEntry) * usable);
return dk;
}
new_dict建立PyDictObject 對象,PyDictKeysObject對象的殼。如果緩沖區沒有可用的PyDictObject對象則配置設定記憶體,并且各種設定。源碼如下:
// dictobject.c
static PyObject *
new_dict(PyDictKeysObject *keys, PyObject **values)
{
PyDictObject *mp;
assert(keys != NULL);
if (numfree) {
mp = free_list[--numfree];
assert (mp != NULL);
assert (Py_TYPE(mp) == &PyDict_Type);
_Py_NewReference((PyObject *)mp);
}
else {
mp = PyObject_GC_New(PyDictObject, &PyDict_Type);
if (mp == NULL) {
DK_DECREF(keys);
free_values(values);
return NULL;
}
}
mp->ma_keys = keys;
mp->ma_values = values;
mp->ma_used = 0;
mp->ma_version_tag = DICT_NEXT_VERSION();
assert(_PyDict_CheckConsistency(mp));
return (PyObject *)mp;
}
可以看到:
- PyDictKeysObject對象緩沖:通過操作numfreekeys實作,numfreekeys在free_keys_object和dictresize方法中進行調整。
// dictobject.c
#define PyDict_MAXFREELIST 80
static PyDictKeysObject *keys_free_list[PyDict_MAXFREELIST];
static int numfreekeys = 0;
- PyDictObject對象緩沖:通過操作numfree實作,numfree在dict_dealloc方法中進行調整。
// dictobject.c
#define PyDict_MAXFREELIST 80
static PyDictObject *free_list[PyDict_MAXFREELIST];
static int numfree = 0;
7.4. Dict對象的查找
Dict對象的查找是Dict對象最重要的方法。Python Dict對象預設的查找方法為lookdict_unicode_nodummy,在lookdict_unicode_nodummy方法裡會判斷如果key不是unicode類型,則将查找方法設定為lookdict,并調用lookdict進行查找。
lookdict_unicode_nodummy與lookdict最重要的差別在于,hash相同的情況下對key的比對,lookdict_unicode_nodummy調用的unicode_eq,而lookdict調用的PyObject_RichCompareBool。由于Python源碼實作中大量的使用string作為key的dict對象,是以這是一項優化。但是對于整個查找機制而言,隻要分析明白其中一個方法即可。
lookdict_unicode_nodummy源碼如下:
// dictobject.c
#define DK_MASK(dk) (((dk)->dk_size)-1)
static Py_ssize_t _Py_HOT_FUNCTION
lookdict_unicode_nodummy(PyDictObject *mp, PyObject *key,
Py_hash_t hash, PyObject **value_addr)
{
assert(mp->ma_values == NULL);
if (!PyUnicode_CheckExact(key)) {
mp->ma_keys->dk_lookup = lookdict;
return lookdict(mp, key, hash, value_addr);
}
PyDictKeyEntry *ep0 = DK_ENTRIES(mp->ma_keys);
size_t mask = DK_MASK(mp->ma_keys);
size_t perturb = (size_t)hash;
size_t i = (size_t)hash & mask;
for (;;) {
Py_ssize_t ix = dk_get_index(mp->ma_keys, i);
assert (ix != DKIX_DUMMY);
if (ix == DKIX_EMPTY) {
*value_addr = NULL;
return DKIX_EMPTY;
}
PyDictKeyEntry *ep = &ep0[ix];
assert(ep->me_key != NULL);
assert(PyUnicode_CheckExact(ep->me_key));
if (ep->me_key == key ||
(ep->me_hash == hash && unicode_eq(ep->me_key, key))) {
*value_addr = ep->me_value;
return ix;
}
perturb >>= PERTURB_SHIFT;
i = mask & (i*5 + perturb + 1);
}
Py_UNREACHABLE();
}
lookdict_unicode_nodummy中通過:
i = (size_t)hash & mask;
Py_ssize_t ix = dk_get_index(dk, i);
計算i在dk_entries中的索引。
如果該索引:
1. DKIX_EMPTY,即沒有找到,則value=NULL,并傳回索引值(DKIX_EMPTY);
2. DKIX_DUMMY,跳轉到4;
3. 索引值ix >= 0:
3.1. 如果key對象完全一緻,則傳回value和索引值;
3.2. 如果hash值一緻,則調用unicode_eq或PyObject_RichCompareBool 比較key
3.2.1. 如果key相等,則傳回value和索引;
3.2.2. 否則跳轉到4;
3.3. 否則跳轉到4
4. 根據探測函數(i = mask & (i*5 + perturb + 1))計算下一個ix;
是以lookdict_unicode_nodummy方法:
1. 要麼傳回NULL和DKIX_EMPTY;
2. 要麼傳回value和索引;
7.5. Dict對象的維護
7.5.1. 設定/添加元素
如下Python語句最終會調用到PyDict_SetItem:
test = {100: 200}
PyDict_SetItem的C調用棧:
// pystate.c
PyInterpreterState_New
// ceval.c
=>_PyEval_EvalFrameDefault (case BUILD_MAP)
// dictobject.c
=> PyDict_SetItem
PyDict_SetItem源碼:
// dictobject.c
int
PyDict_SetItem(PyObject *op, PyObject *key, PyObject *value)
{
PyDictObject *mp;
Py_hash_t hash;
if (!PyDict_Check(op)) {
PyErr_BadInternalCall();
return -1;
}
assert(key);
assert(value);
mp = (PyDictObject *)op;
if (!PyUnicode_CheckExact(key) ||
(hash = ((PyASCIIObject *) key)->hash) == -1)
{
hash = PyObject_Hash(key);
if (hash == -1)
return -1;
}
/* insertdict() handles any resizing that might be necessary */
return insertdict(mp, key, hash, value);
}
PyDict_SetItem方法中做了一下檢查,調用insertdict方法:
// dictobject.c
static int
insertdict(PyDictObject *mp, PyObject *key, Py_hash_t hash, PyObject *value)
{
PyObject *old_value;
PyDictKeyEntry *ep;
Py_INCREF(key);
Py_INCREF(value);
if (mp->ma_values != NULL && !PyUnicode_CheckExact(key)) {
if (insertion_resize(mp) < 0)
goto Fail;
}
Py_ssize_t ix = mp->ma_keys->dk_lookup(mp, key, hash, &old_value);
if (ix == DKIX_ERROR)
goto Fail;
assert(PyUnicode_CheckExact(key) || mp->ma_keys->dk_lookup == lookdict);
MAINTAIN_TRACKING(mp, key, value);
/* When insertion order is different from shared key, we can't share
* the key anymore. Convert this instance to combine table.
*/
if (_PyDict_HasSplitTable(mp) &&
((ix >= 0 && old_value == NULL && mp->ma_used != ix) ||
(ix == DKIX_EMPTY && mp->ma_used != mp->ma_keys->dk_nentries))) {
if (insertion_resize(mp) < 0)
goto Fail;
ix = DKIX_EMPTY;
}
if (ix == DKIX_EMPTY) {
/* Insert into new slot. */
assert(old_value == NULL);
if (mp->ma_keys->dk_usable <= 0) {
/* Need to resize. */
if (insertion_resize(mp) < 0)
goto Fail;
}
Py_ssize_t hashpos = find_empty_slot(mp->ma_keys, hash);
ep = &DK_ENTRIES(mp->ma_keys)[mp->ma_keys->dk_nentries];
dk_set_index(mp->ma_keys, hashpos, mp->ma_keys->dk_nentries);
ep->me_key = key;
ep->me_hash = hash;
if (mp->ma_values) {
assert (mp->ma_values[mp->ma_keys->dk_nentries] == NULL);
mp->ma_values[mp->ma_keys->dk_nentries] = value;
}
else {
ep->me_value = value;
}
mp->ma_used++;
mp->ma_version_tag = DICT_NEXT_VERSION();
mp->ma_keys->dk_usable--;
mp->ma_keys->dk_nentries++;
assert(mp->ma_keys->dk_usable >= 0);
assert(_PyDict_CheckConsistency(mp));
return 0;
}
if (_PyDict_HasSplitTable(mp)) {
mp->ma_values[ix] = value;
if (old_value == NULL) {
/* pending state */
assert(ix == mp->ma_used);
mp->ma_used++;
}
}
else {
assert(old_value != NULL);
DK_ENTRIES(mp->ma_keys)[ix].me_value = value;
}
mp->ma_version_tag = DICT_NEXT_VERSION();
Py_XDECREF(old_value); /* which **CAN** re-enter (see issue #22653) */
assert(_PyDict_CheckConsistency(mp));
Py_DECREF(key);
return 0;
Fail:
Py_DECREF(value);
Py_DECREF(key);
return -1;
}
insertdict中調用lookdict_unicode_nodummy或lookdict方法尋找Dice對象:
1. 沒有找到key
1.1. 檢查是否已經用完空間,如果用完,調用insertion_resize調整dict大小;
1.2. 調用find_empty_slot方法尋找探測鍊上第一個為DKIX_DUMMY或DKIX_EMPTY的,設定索引值,各種設定和調整
2. 否則設定value即可
find_empty_slot方法源碼如下:
// dictobject.c
static Py_ssize_t
find_empty_slot(PyDictKeysObject *keys, Py_hash_t hash)
{
assert(keys != NULL);
const size_t mask = DK_MASK(keys);
size_t i = hash & mask;
Py_ssize_t ix = dk_get_index(keys, i);
for (size_t perturb = hash; ix >= 0;) {
perturb >>= PERTURB_SHIFT;
i = (i*5 + perturb + 1) & mask;
ix = dk_get_index(keys, i);
}
return i;
}
7.5.2. 删除元素
如下Python語句最終會調用到PyDict_DelItem:
test = {100: 200}
del test[100]
// dictobject.c
dict_ass_sub
=> PyDict_DelItem
// dictobject.c
int
PyDict_DelItem(PyObject *op, PyObject *key)
{
Py_hash_t hash;
assert(key);
if (!PyUnicode_CheckExact(key) ||
(hash = ((PyASCIIObject *) key)->hash) == -1) {
hash = PyObject_Hash(key);
if (hash == -1)
return -1;
}
return _PyDict_DelItem_KnownHash(op, key, hash);
}
PyDict_SetItem方法中做了一下檢查,調用_PyDict_DelItem_KnownHash方法:
// dictobject.c
int
_PyDict_DelItem_KnownHash(PyObject *op, PyObject *key, Py_hash_t hash)
{
Py_ssize_t ix;
PyDictObject *mp;
PyObject *old_value;
if (!PyDict_Check(op)) {
PyErr_BadInternalCall();
return -1;
}
assert(key);
assert(hash != -1);
mp = (PyDictObject *)op;
ix = (mp->ma_keys->dk_lookup)(mp, key, hash, &old_value);
if (ix == DKIX_ERROR)
return -1;
if (ix == DKIX_EMPTY || old_value == NULL) {
_PyErr_SetKeyError(key);
return -1;
}
// Split table doesn't allow deletion. Combine it.
if (_PyDict_HasSplitTable(mp)) {
if (dictresize(mp, DK_SIZE(mp->ma_keys))) {
return -1;
}
ix = (mp->ma_keys->dk_lookup)(mp, key, hash, &old_value);
assert(ix >= 0);
}
return delitem_common(mp, hash, ix, old_value);
}
查找key,查找到了調用delitem_common。需要注意的是查找到了,隻是把狀态設定為DKIX_DUMMY, 并沒有從探測鍊上摘除。delitem_common源碼如下:
// dictobject.c
static int
delitem_common(PyDictObject *mp, Py_hash_t hash, Py_ssize_t ix,
PyObject *old_value)
{
PyObject *old_key;
PyDictKeyEntry *ep;
Py_ssize_t hashpos = lookdict_index(mp->ma_keys, hash, ix);
assert(hashpos >= 0);
mp->ma_used--;
mp->ma_version_tag = DICT_NEXT_VERSION();
ep = &DK_ENTRIES(mp->ma_keys)[ix];
dk_set_index(mp->ma_keys, hashpos, DKIX_DUMMY);
ENSURE_ALLOWS_DELETIONS(mp);
old_key = ep->me_key;
ep->me_key = NULL;
ep->me_value = NULL;
Py_DECREF(old_key);
Py_DECREF(old_value);
assert(_PyDict_CheckConsistency(mp));
return 0;
}
static Py_ssize_t
lookdict_index(PyDictKeysObject *k, Py_hash_t hash, Py_ssize_t index)
{
size_t mask = DK_MASK(k);
size_t perturb = (size_t)hash;
size_t i = (size_t)hash & mask;
for (;;) {
Py_ssize_t ix = dk_get_index(k, i);
if (ix == index) {
return i;
}
if (ix == DKIX_EMPTY) {
return DKIX_EMPTY;
}
perturb >>= PERTURB_SHIFT;
i = mask & (i*5 + perturb + 1);
}
Py_UNREACHABLE();
}
7.5.3. 調整大小
無論是設定、插入還是删除,在滿足一定條件下(參見上面的代碼分析),都會調用insertion_resize。insertion_resize方法會重新建立PyDictKeysObject對象,在這個過程中會拷貝舊的對象所有的資料。
// dictobject.c
#define GROWTH_RATE(d) (((d)->ma_used*2)+((d)->ma_keys->dk_size>>1))
static int
insertion_resize(PyDictObject *mp)
{
return dictresize(mp, GROWTH_RATE(mp));
}
static int
dictresize(PyDictObject *mp, Py_ssize_t minsize)
{
Py_ssize_t newsize, numentries;
PyDictKeysObject *oldkeys;
PyObject **oldvalues;
PyDictKeyEntry *oldentries, *newentries;
/* Find the smallest table size > minused. */
for (newsize = PyDict_MINSIZE;
newsize < minsize && newsize > 0;
newsize <<= 1)
;
if (newsize <= 0) {
PyErr_NoMemory();
return -1;
}
oldkeys = mp->ma_keys;
/* Allocate a new table. */
mp->ma_keys = new_keys_object(newsize);
if (mp->ma_keys == NULL) {
mp->ma_keys = oldkeys;
return -1;
}
// New table must be large enough.
assert(mp->ma_keys->dk_usable >= mp->ma_used);
if (oldkeys->dk_lookup == lookdict)
mp->ma_keys->dk_lookup = lookdict;
numentries = mp->ma_used;
oldentries = DK_ENTRIES(oldkeys);
newentries = DK_ENTRIES(mp->ma_keys);
oldvalues = mp->ma_values;
if (oldvalues != NULL) {
/* Convert split table into new combined table.
* We must incref keys; we can transfer values.
* Note that values of split table is always dense.
*/
for (Py_ssize_t i = 0; i < numentries; i++) {
assert(oldvalues[i] != NULL);
PyDictKeyEntry *ep = &oldentries[i];
PyObject *key = ep->me_key;
Py_INCREF(key);
newentries[i].me_key = key;
newentries[i].me_hash = ep->me_hash;
newentries[i].me_value = oldvalues[i];
}
DK_DECREF(oldkeys);
mp->ma_values = NULL;
if (oldvalues != empty_values) {
free_values(oldvalues);
}
}
else { // combined table.
if (oldkeys->dk_nentries == numentries) {
memcpy(newentries, oldentries, numentries * sizeof(PyDictKeyEntry));
}
else {
PyDictKeyEntry *ep = oldentries;
for (Py_ssize_t i = 0; i < numentries; i++) {
while (ep->me_value == NULL)
ep++;
newentries[i] = *ep++;
}
}
assert(oldkeys->dk_lookup != lookdict_split);
assert(oldkeys->dk_refcnt == 1);
if (oldkeys->dk_size == PyDict_MINSIZE &&
numfreekeys < PyDict_MAXFREELIST) {
DK_DEBUG_DECREF keys_free_list[numfreekeys++] = oldkeys;
}
else {
DK_DEBUG_DECREF PyObject_FREE(oldkeys);
}
}
build_indices(mp->ma_keys, newentries, numentries);
mp->ma_keys->dk_usable -= numentries;
mp->ma_keys->dk_nentries = numentries;
return 0;
}
由于擴/縮容,是以要調整索引,調用build_indices方法:
// dictobject.c
static void
build_indices(PyDictKeysObject *keys, PyDictKeyEntry *ep, Py_ssize_t n)
{
size_t mask = (size_t)DK_SIZE(keys) - 1;
for (Py_ssize_t ix = 0; ix != n; ix++, ep++) {
Py_hash_t hash = ep->me_hash;
size_t i = hash & mask;
for (size_t perturb = hash; dk_get_index(keys, i) != DKIX_EMPTY;) {
perturb >>= PERTURB_SHIFT;
i = mask & (i*5 + perturb + 1);
}
dk_set_index(keys, i, ix);
}
}
7.6. Dict對象的特性
支援tp_as_sequence、tp_as_mapping兩種操作。
7.6.1. 序列操作
// dictobject.c
&dict_as_sequence, /* tp_as_sequence */
// dictobject.c
static PySequenceMethods dict_as_sequence = {
0, /* sq_length */
0, /* sq_concat */
0, /* sq_repeat */
0, /* sq_item */
0, /* sq_slice */
0, /* sq_ass_item */
0, /* sq_ass_slice */
PyDict_Contains, /* sq_contains */
0, /* sq_inplace_concat */
0, /* sq_inplace_repeat */
};
- PyDict_Contains
test = {200:100}
200 in test # True
100 in test # False
7.6.2. 關聯操作
// dictobject.c
&dict_as_mapping, /* tp_as_mapping */
// dictobject.c
static PyMappingMethods dict_as_mapping = {
(lenfunc)dict_length, /*mp_length*/
(binaryfunc)dict_subscript, /*mp_subscript*/
(objobjargproc)dict_ass_sub, /*mp_ass_subscript*/
};
- dict_length
test = {200:100}
len(test)
- dict_subscript
test = {200:100}
test[200]
- dict_ass_sub
test = {}
test[200] = 100
7.6.3. to string
// dictobject.c
(reprfunc)dict_repr, /* tp_repr */
0, /* tp_str */
7.6.4. hash
// dictobject.c
PyObject_HashNotImplemented, /* tp_hash */
7.6.5. 比較
// dictobject.c
dict_richcompare, /* tp_richcompare */
7.6.6. 内置方法
// dictobject.c
mapp_methods, /* tp_methods */
7.7. 參考
- Python源碼剖析