分布式 ID 的设计方案

分布式ID基本的要求

分布式系统内全局唯一。
有序性，通常都需要保证生成的 ID 是有序递增的。例如，在数据库存储等场景中，有序 ID便于确定数据位置，往往更加高效。

典型方案(twitter的snowflake算法):

基于数据库自增序列的实现。好处是简单易用，但是在扩展性和可靠性等方面存在局限性。
基于 Twitter 早期开源的 Snowflake 的实现，以及相关改动方案。

分布式 ID 的设计方案
整体长度通常是 64 （1 + 41 + 10+ 12 = 64）位，适合使用 Java 语言中的 long 类型来存储。
- 头部是 1 位的正负标识位。
- 紧跟着的高位部分包含 41 位时间戳，通常使用 System.currentTimeMillis()。
- 后面是 10 位的 WorkerID，标准定义是 5 位数据中心 + 5 位机器 ID，组成了机器编号，以区分不同的集群节点。
- 最后的 12 位就是单位毫秒内可生成的序列号数目的理论极限。

Snowflake 算法的Java 实现依赖于 System.currentTimeMillis()，它是返回当前时间和 1970 年 1 月 1 号 UTC 时间相差的毫秒数.

简单使用

核心实现类如下：

package util;

import java.lang.management.ManagementFactory;
import java.net.InetAddress;
import java.net.NetworkInterface;

/**
 * <p>名称：IdWorker.java</p>
 * <p>描述：分布式自增长ID</p>
 * <pre>
 *     Twitter的 Snowflake　JAVA实现方案
 * </pre>
 * 核心代码为其IdWorker这个类实现，其原理结构如下，我分别用一个0表示一位，用—分割开部分的作用：
 * 1||0---0000000000 0000000000 0000000000 0000000000 0 --- 00000 ---00000 ---000000000000
 * 在上面的字符串中，第一位为未使用（实际上也可作为long的符号位），接下来的41位为毫秒级时间，
 * 然后5位datacenter标识位，5位机器ID（并不算标识符，实际是为线程标识），
 * 然后12位该毫秒内的当前毫秒内的计数，加起来刚好64位，为一个Long型。
 * 这样的好处是，整体上按照时间自增排序，并且整个分布式系统内不会产生ID碰撞（由datacenter和机器ID作区分），
 * 并且效率较高，经测试，snowflake每秒能够产生26万ID左右，完全满足需要。
 * <p>
 * 64位ID (42(毫秒)+5(机器ID)+5(业务编码)+12(重复累加))
 *
 * @author Polim
 */
public class IdWorker {
    // 时间起始标记点，作为基准，一般取系统的最近时间（一旦确定不能变动）
    private final static long twepoch = 1288834974657L;
    // 机器标识位数
    private final static long workerIdBits = 5L;
    // 数据中心标识位数
    private final static long datacenterIdBits = 5L;
    // 机器ID最大值
    private final static long maxWorkerId = -1L ^ (-1L << workerIdBits);
    // 数据中心ID最大值
    private final static long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);
    // 毫秒内自增位
    private final static long sequenceBits = 12L;
    // 机器ID偏左移12位
    private final static long workerIdShift = sequenceBits;
    // 数据中心ID左移17位
    private final static long datacenterIdShift = sequenceBits + workerIdBits;
    // 时间毫秒左移22位
    private final static long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;

    private final static long sequenceMask = -1L ^ (-1L << sequenceBits);
    /* 上次生产id时间戳 */
    private static long lastTimestamp = -1L;
    // 0，并发控制
    private long sequence = 0L;

    private final long workerId;
    // 数据标识id部分
    private final long datacenterId;

    public IdWorker(){
        this.datacenterId = getDatacenterId(maxDatacenterId);
        this.workerId = getMaxWorkerId(datacenterId, maxWorkerId);
    }
    /**
     * @param workerId
     *            工作机器ID
     * @param datacenterId
     *            序列号
     */
    public IdWorker(long workerId, long datacenterId) {
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }
    /**
     * 获取下一个ID
     *
     * @return
     */
    public synchronized long nextId() {
        long timestamp = timeGen();
        if (timestamp < lastTimestamp) {
            throw new RuntimeException(String.format("Clock moved backwards.  Refusing to generate id for %d milliseconds", lastTimestamp - timestamp));
        }

        if (lastTimestamp == timestamp) {
            // 当前毫秒内，则+1
            sequence = (sequence + 1) & sequenceMask;
            if (sequence == 0) {
                // 当前毫秒内计数满了，则等待下一秒
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = timestamp;
        // ID偏移组合生成最终的ID，并返回ID
        long nextId = ((timestamp - twepoch) << timestampLeftShift)
                | (datacenterId << datacenterIdShift)
                | (workerId << workerIdShift) | sequence;

        return nextId;
    }

    private long tilNextMillis(final long lastTimestamp) {
        long timestamp = this.timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = this.timeGen();
        }
        return timestamp;
    }

    private long timeGen() {
        return System.currentTimeMillis();
    }

    /**
     * <p>
     * 获取 maxWorkerId
     * </p>
     */
    protected static long getMaxWorkerId(long datacenterId, long maxWorkerId) {
        StringBuffer mpid = new StringBuffer();
        mpid.append(datacenterId);
        String name = ManagementFactory.getRuntimeMXBean().getName();
        if (!name.isEmpty()) {
         /*
          * GET jvmPid
          */
            mpid.append(name.split("@")[0]);
        }
      /*
       * MAC + PID 的 hashcode 获取16个低位
       */
        return (mpid.toString().hashCode() & 0xffff) % (maxWorkerId + 1);
    }

    /**
     * <p>
     * 数据标识id部分
     * </p>
     */
    protected static long getDatacenterId(long maxDatacenterId) {
        long id = 0L;
        try {
            InetAddress ip = InetAddress.getLocalHost();
            NetworkInterface network = NetworkInterface.getByInetAddress(ip);
            if (network == null) {
                id = 1L;
            } else {
                byte[] mac = network.getHardwareAddress();
                id = ((0x000000FF & (long) mac[mac.length - 1])
                        | (0x0000FF00 & (((long) mac[mac.length - 2]) << 8))) >> 6;
                id = id % (maxDatacenterId + 1);
            }
        } catch (Exception e) {
            System.out.println(" getDatacenterId: " + e.getMessage());
        }
        return id;
    }


}

1.在spring配置文件中添加配置

<bean id="idWorker" class="util.IdWorker">
    	<!-- 进程ID -->
    	<constructor-arg index="0" value="0"></constructor-arg>
    	<!-- 数据中心ID -->
    	<constructor-arg index="1" value="0"></constructor-arg>
 </bean>

2.在相关服务实现层生成ID即可

@Autowired
	private IdWorker idWorker;
	
	long orderId = idWorker.nextId();

需要考虑的其他要素

除了唯一和有序,通常还会额外希望分布式 ID 保证：

有意义，或者说包含更多信息，例如时间、业务等信息。这一点和有序性要求存在一定关

联，如果 ID 中包含时间，本身就能保证一定程度的有序，虽然并不能绝对保证。ID 中包含

额外信息，在分布式数据存储等场合中，有助于进一步优化数据访问的效率。
高可用性
紧凑性，ID 的大小可能受到实际应用的制约，例如数据库存储往往对长 ID 不友好，太长的

ID 会降低 MySQL 等数据库索引的性能；

局限性

时钟偏斜问题（Clock Skew），发生时钟回拨等问题

这就会导致时间戳不准确，进而产生重复 ID。

解决办法

开启 NTP，只是可以考虑将 stepback 设置为 0，以禁止回调。
缓存历史时间戳，在序列生成之前进行检验，如果出现当前时间落后于历史时间的不合理情况，可以采取相应的动作，要么重试、等

待时钟重新一致，或者就直接提示服务不可用。

分布式 ID 的设计方案

分布式ID基本的要求

典型方案(twitter的snowflake算法):

简单使用

需要考虑的其他要素

局限性

解决办法

继续阅读

ZooKeeper ： Curator框架之分布式屏障DistributedDoubleBarrier

RabbitMQ：交换机（fanout exchange）

Doris SQL 原理解析

ZooKeeper ： Curator框架之分布式锁InterProcessMutex

解决Failure to transfer org.apache.maven.plugins:maven-surefire-plugin:pom:2.12.4 from http://maven.al

阿里巴巴分布式服务框架 Dubbo 团队成员梁飞专访

数据迁移方法数据迁移原则数据迁移之双写方案数据迁移之级联同步方案

微服务-性能压测\缓存redis和分布式锁redisson和SpringCache

Nacos 2.0 升级前后性能对比压测

Spring数据和Redis

redis集群数据一致性_RedisRaft为Redis集群带来强大的数据一致性

Centos7 下 Hadoop 2.6.4 分布式集群环境搭建摘要集群准备安装JDK 安装 Hadoop 2.6.4 部署 slaver1-slaver4 启动 hadoop 集群成功了

【Tomcat】Tomcat参数调优:连接数和并发数

celery使用入门

MapReduce的几个企业级经典面试案例MapReduce的几个企业级经典面试案例

Ajax发送和获取json数据到Spring mvc 1.spring mvc后端2.web前段