全局时间戳

在分布式系统中，由于各个机器的时间可能存在差异，那么很多场景需要一个全局时间戳。比如分布式事务系统，每一个事务号需要全局唯一且能体现时间序。另外全局时间戳还能作为分布式id使用。本小节将参照tidb Pd的tso的实现介绍如何实现全局时间戳。

Tso全称是timestamp oracle，作为全局时间戳系统他需要有如下两个特性

要满足快速大量分配
分配的时间一定是单调递增的，不能出现回退的情况

Tso结构

Tso由两个部分组成，物理时间和逻辑时间组成，物理时间由当前unix时间戳到毫秒数，逻辑时间是一个最大值为262144（1 << 18）的计数器，所以说1ms可以分配262144个全局时间戳，这个量在大多数场景都是可以满足的。

type Tso struct {
	physical time.Time
	logical  int64
}

量可以满足了，性能如何保证的？我们看下分配函数

func (s *Server) GetTs(count int32) proto.Timestamp {
	ts := s.ts.Load().(*Tso)
	resp := proto.Timestamp{}
	resp.Logical = atomic.AddInt64(&ts.logical, int64(count))
	resp.Physical = ts.physical.UnixNano() / 1e6
	return resp
}

上面的代码所示，获取Tso的计算都在内存中，用了原子操作给逻辑时间增加，性能能得到保障。

如何保证单调递增的，不能出现回退的情况

Tso服务器工作的时候是单点的，只有这样才能保证物理时间是递增的。单点带来的风险就是如果机器故障了那么这服务如何做到高可用? Tso服务会每隔一段时间把自己的加了3秒后的时间戳（纳秒）存到etcd中，etcd会保证这个数据的安全，当tso宕机之后，备用tso服务器会立即启动，从etcd中读取数值然后和自己的时间比对，如果发现自己的时间比etcd中的值小则会等待。

last, err := s.loadTimestamp()
if err != nil {
    return errors.Trace(err)
}

var now time.Time

for {
    now = time.Now()
    if wait := last.Sub(now) + updateTimestampGuard; wait > 0 {
        log.Warnf("wait %v to guarantee valid generated timestamp", wait)
        time.Sleep(wait)
        continue
    }
    break
}

经过如上的处理后，分配的时间一定是单调递增的，不能出现回退的情况要求就可以完美解决。但是引入一个两个新的问题

tso宕机后另一个备用服务器怎么知道，如何快速接管服务

假设tso有两台机器A和B，A和B启动的时候会发送Leader选举，选举的机制利用etcd的租约（lease）和事务，代码如下

// CampaignLeader is used to campaign the leader.
func (m *Member) CampaignLeader(lease *LeaderLease, leaseTimeout int64) error {
    err := lease.Grant(leaseTimeout)
    if err != nil {
        return err
    }

    leaderKey := m.GetLeaderPath()
    // The leader key must not exist, so the CreateRevision is 0.
    resp, err := NewSlowLogTxn(m.client).
        If(clientv3.Compare(clientv3.CreateRevision(leaderKey), "=", 0)).
        Then(clientv3.OpPut(leaderKey, m.MemberInfo(), clientv3.WithLease(lease.ID))).
        Commit()
    if err != nil {
        return errors.WithStack(err)
    }
    if !resp.Succeeded {
        return errors.New("failed to campaign leader, other server may campaign ok")
    }
    return nil
}

所以A和B只有一台能成为leader。假定A成为了leader，那么ectd存{tso_leader:A}，租约超时为1s，B和集群中其他client会watch这个key，那么A会每500毫秒会去续约，如果超过1秒A没有去续约，{tso_leader:A}这条数据会失效，这个时候会再次选举B就能顺利接管服务。

停3秒时间是否有办法优化

上面如果发生A和B tso服务发生了切换有可能会发生3s内没办法服务的情况，tidb官方老早就解决了这个问题， A在宕机前存在etcd的时间为A的时间加了3秒，这个时候A宕机，B被选成leader，B拿到etcd中值，发现这个比自己的大，B不会在等待（sleep）直接把这个值作为自己的物理时间，然后在下个时间更新周期可能不更新。

func (s *Server) SyncTime(lease *cluster.LeaderLease) error {
    //从etcd
    last, err := s.loadTs()
    if err != nil {
        return errors.WithStack(err)
    }
    next := time.Now()
    if tSub(next, last) < updateTimestampGuard {
        next = last.Add(updateTimestampGuard)
    }
    save := next.Add(saveInterval)
    if err := s.SaveTime(save); err != nil {
        return err
    }
    s.lease = lease
    tso := Tso{physical: next}
    s.ts.Store(&tso)

    return nil
}

func (s *Server) UpdateTime() error {
    prev := s.ts.Load().(*Tso)
    now := time.Now()
    since := tSub(now, prev.physical)
    if since > 3*UpdateTimestampStep {
        log.Log.Warn("clock offset: %v, prev: %v, now %v", since, prev.physical, now)
    }
    if since < 0 { // 这里轮空
        return nil
    }

    var next time.Time
    prevLogical := atomic.LoadInt64(&prev.logical)
    if since > updateTimestampGuard {
        next = now
    } else if prevLogical > maxLogical/2 {
        log.Log.Warn("the logical time may be not enough  prev-logical :%d", prevLogical)
        next = prev.physical.Add(time.Millisecond)
    } else {
        return nil
    }

    if tSub(s.lastSavedTime, next) <= updateTimestampGuard {
        save := next.Add(saveInterval)
        if err := s.SaveTime(save); err != nil {
            return err
        }
    }
    current := &Tso{
        physical: next,
        logical:  0,
    }
    s.ts.Store(current)
    return nil
}

总结

本小节参照tidb Pd的tso的实现介绍如何实现全局时间戳，以及如何保证全局递增和如何保证高效分配。还顺带介绍了如何利用etcd选主。在分布式系统系统中全局时间戳系统的运用场景非常广，实现稳定可靠的全局时间戳系统非常有价值。

参考资料

TiKV 功能介绍 - Placement Driver

Go一站式编程

全局时间戳

Tso结构

如何保证单调递增的，不能出现回退的情况

总结

参考资料