我们知道 SET 和 GET 命令的底层依赖都是 dict 这个数据结构。本篇文章就展开讲讲 dict 数据结构和 rehash 过程。

dict 数据结构

dict 使用两个 hash 表来存储数据。

为什么要这样做呢？这和 dict 的扩展性有关，dict 的结构如下：

大多数情况下 ht_table[1] 这个 hash 表是空的，所有希望存储于 dict 的 key 先计算 hash 再取余，定位到 bucket 后，以链表的形式插入到 ht_table[0] 中，也就是一种非常经典的使用链表法解决 hash 冲突的 hash 表结构。那么这种结构存在一个问题，即随着 key 的增加，hash 冲突的概率越来越大，链表越来越长，查找 key 的复杂度将从 O(1) 不断地向 O(n) 退化。

Redis 是如何解决这个问题的呢？

当 ht_table[0] 塞满时（即链表长度超过某个阈值），Redis 将创建 ht_table[1]。ht_table[1] 的 bucket 数量一般要比 ht_table[0] 多出一倍。然后将 ht_table[0] 中的数据迁移到 ht_table[1] 中去。这个过程我们称为 rehash。

我们知道 Redis 是单线程的，上述的 rehash 涉及大量的数据迁移，Redis 是如何做到在 rehash 过程中不影响业务访问的呢？我们带着这个问题来看看代码。

// dict 结构体的定义
struct dict {
    // dict 的类型
    dictType *type;
    // 两个 hash 表
    dictEntry **ht_table[2];
    // 两个 hash 表分别拥有多少元素
    unsigned long ht_used[2];

    // 和 rehash 相关，rehash 的进展
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
		
    // 和 rehash 相关
    /* Keep small vars at end for optimal (minimal) struct padding */
    int16_t pauserehash; /* If >0 rehashing is paused (<0 indicates coding error) */
    signed char ht_size_exp[2]; /* exponent of size. (size = 1<<exp) */
};

/* Create a new hash table */
dict *dictCreate(dictType *type)
{
    // 申请一段内存，用于存储 dict 结构体
    dict *d = zmalloc(sizeof(*d));

    _dictInit(d,type);
    return d;
}

/* Initialize the hash table */
int _dictInit(dict *d, dictType *type)
{
    _dictReset(d, 0);
    _dictReset(d, 1);
    d->type = type;
    d->rehashidx = -1;
    d->pauserehash = 0;
    return DICT_OK;
}

/* Reset hash table parameters already initialized with _dictInit()*/
static void _dictReset(dict *d, int htidx)
{
    d->ht_table[htidx] = NULL;
    d->ht_size_exp[htidx] = -1;
    d->ht_used[htidx] = 0;
}

dict 结构体中主要是包含维护两个 hash 表所必要的一些变量，初始化 dict 这个结构体的过程也很好理解，就是赋初值。

Rehash

我们先来看看 rehash 的过程。

// 对 dict 进行 rehash 操作，*d 是 dict 结构体指针，n 代表本次操作要迁移多少个 bucket
int dictRehash(dict *d, int n) {
    int empty_visits = n*10; /* Max number of empty buckets to visit. */
    if (!dictIsRehashing(d)) return 0;

    // 判断当 n 降低到 0 或者 0 号 hash 表中已经没有数据时，跳出循环
    while(n-- && d->ht_used[0] != 0) {
        dictEntry *de, *nextde;

        /* Note that rehashidx can't overflow as we are sure there are more
         * elements because ht[0].used != 0 */
        assert(DICTHT_SIZE(d->ht_size_exp[0]) > (unsigned long)d->rehashidx);
	// 从 d->rehashidx （初始化为0）开始便利 0 号 hash 表，找到一个存有 key 的 bucket
        while(d->ht_table[0][d->rehashidx] == NULL) {
            d->rehashidx++;
            if (--empty_visits == 0) return 1;
        }
        de = d->ht_table[0][d->rehashidx];
        //将这个 bucket 中所有的 key 迁移到新的 bucket 中
        while(de) {
            uint64_t h;

            nextde = de->next;
            /* Get the index in the new hash table */
            h = dictHashKey(d, de->key) & DICTHT_SIZE_MASK(d->ht_size_exp[1]);
            de->next = d->ht_table[1][h];
            d->ht_table[1][h] = de;
            d->ht_used[0]--;
            d->ht_used[1]++;
            de = nextde;
        }
	// 将旧的 bucket 置空
        d->ht_table[0][d->rehashidx] = NULL;
	// rehash 的进度加一
        d->rehashidx++;
    }

    // 如果旧的 hash 已经空了，那意味着这一轮 rehash 完毕
    if (d->ht_used[0] == 0) {
	// 释放 0 号 hash 表的内存
        zfree(d->ht_table[0]);
        // 让 1 号 hash 表 成为新的 0 号
        d->ht_table[0] = d->ht_table[1];
        d->ht_used[0] = d->ht_used[1];
        d->ht_size_exp[0] = d->ht_size_exp[1];
	// 重置 1 号 hash 表
        _dictReset(d, 1);
	// 重置 rehash 进度，为下一次 rehash 做准备
        d->rehashidx = -1;
	// 如果本轮 rehash 完毕，则返回 0
        return 0;
    }

    // 当前函数执行完毕，部分 key 被迁移到了新的 hash 表
    // 但是旧的 hash 表还存在 key，需要下一次再执行这个函数
    // 这种情况会返回 1
    return 1;
}

通过这个函数，我们可以了解到 Redis 执行 rehash 过程并不是一次性执行完毕的。dictRehash 每被执行一次，就有一部分 key 被迁移到新的 hash 表，直到所有的 key 都被迁移完毕。dictRehash 会使用 1 号 hash 表替代旧的 0 号，为下一次 rehash 做准备。这样做的目的是化大为小，避免一次操作太多的 key 耗时过长而影响前台业务。

插入键值对

dictEntry *dictAddRaw(dict *d, void *key, dictEntry **existing)
{
    long index;
    dictEntry *entry;
    int htidx;

    // 如果在 rehash 过程中，则进行 rehash 操作
    if (dictIsRehashing(d)) _dictRehashStep(d);

    /* Get the index of the new element, or -1 if
     * the element already exists. */
    // 判断 key 是否是已经存在了
    if ((index = _dictKeyIndex(d, key, dictHashKey(d,key), existing)) == -1)
        return NULL;

    /* Allocate the memory and store the new entry.
     * Insert the element in top, with the assumption that in a database
     * system it is more likely that recently added entries are accessed
     * more frequently. */
    // 如果正处于 rehash 过程中，则采用 1 号 hash 表，否则采用 0 号 hash 表
    htidx = dictIsRehashing(d) ? 1 : 0;

    // 初始化 entry 
    size_t metasize = dictMetadataSize(d);
    entry = zmalloc(sizeof(*entry) + metasize);
    if (metasize > 0) {
        memset(dictMetadata(entry), 0, metasize);
    }

    // 将 event 插入到 bucket 链表中
    entry->next = d->ht_table[htidx][index];
    d->ht_table[htidx][index] = entry;
    d->ht_used[htidx]++;

    // 将 Key 设置进 entry 中
    dictSetKey(d, entry, key);
    return entry;
}

dictAddRaw 计算 key 的 hash 找到对应的 bucket，将新的 key 插进链表头，这是比较常规的 hash 表操作。

当 dict 正在进行 rehash 时，dictAddRaw 函数会执行一次 rehash 操作（_dictRehashStep 函数实质上调用了 dictRehash 函数），然后直接将新的 key 插入到 1 号 hash 表中。

查找键值对

dictEntry *dictFind(dict *d, const void *key)
{
    dictEntry *he;
    uint64_t h, idx, table;

    // dict 中没有任何 key，直接退出
    if (dictSize(d) == 0) return NULL; /* dict is empty */

    // 如果在 rehash 过程中，则进行 rehash 操作
    if (dictIsRehashing(d)) _dictRehashStep(d);
    h = dictHashKey(d, key);

    // 在 0 号 hash 表中查找 key， 找不到就去 1 号里面找，如果都找不到，则返回空
    for (table = 0; table <= 1; table++) {
        idx = h & DICTHT_SIZE_MASK(d->ht_size_exp[table]);
        he = d->ht_table[table][idx];
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key))
                return he;
            he = he->next;
        }
	// 如果没在 rehash 过程中，就不去 1 号里面找
        if (!dictIsRehashing(d)) return NULL;
    }
    return NULL;
}

在查找 key 的过程中，主要流程就是先去 0 号 hash 表中找对应的 key，如果找不到，就再去 1 号 hash 表中找。因为在 rehash 过程中，一个 key 可能存在任一 hash 表中。

总结

dict 中使用 hash + 链表的结构来存储 key。
当 hash 表过大，dict 通过渐进式 rehash 的方式扩展 hash 表，简单的说就是一次迁移一不分 key，而不是一次性迁移全部的 key。
当 dict 进行 rehash 的时候，仍然可以正常插入、查询 key，对外部而言，rehash 操作是无感的。

作者：Serendipity96

链接：https://juejin.cn/post/7177417841669308477

来源：稀土掘金

Redis 源码阅读：dict 数据结构和 rehash 过程

dict 数据结构

Rehash

插入键值对

查找键值对

总结

继续阅读

Nacos 2.0 升级前后性能对比压测

Spring数据和Redis

redis集群数据一致性_RedisRaft为Redis集群带来强大的数据一致性

D. Ehab the Xorcist(构造+思维)

查找算法之二分查找查找算法之二分查找

JAVA高效编程指南

CQ V1.0分词bates(基于双数组tire树)—应该是目前最快的中文分词算法

supervisor 管理redis 和httpd 环境centos7

Redis简介一(单机版)发展历程Redis

【数据结构】医院选址

[转]ISUP信令的IAM消息详细内容

笔试面试题目：滑动窗口(二)

交通/城市相关的公开的数据集上学时整理的Xie et al., 2020收集的

数据结构与算法（27）——排序（二）

GitHub连夜封杀！这份阿里 10W 字内部 Java 字面试手册到底有多强？

Linux设备模型（中）之上层容器