IT博客汇 | 通过源码解析MySQL Seconds_Behind

通过源码解析MySQL Seconds_Behind_Master

admin发表于 2014-09-15 06:37:15

官方文档的说明

官方文档说的很清楚（估计是更新过），建议仔细阅读

This field is an indication of how “late” the slave is:

When the slave is actively processing updates, this field shows the difference between the current timestamp on the slave and the original timestamp logged on the master for the most event currently being processed on the slave.
When no event is currently being processed on the slave, this value is 0.

In essence, this field measures the time difference in seconds between the slave SQL thread and the slave I/O thread. If the network connection between master and slave is fast, the slave I/O thread is very close to the master, so this field is a good approximation of how late the slave SQL thread is compared to the master. If the network is slow, this is not a good approximation; the slave SQL thread may quite often be caught up with the slow-reading slave I/O thread, so Seconds_Behind_Master often shows a value of 0, even if the I/O thread is late compared to the master. In other words, this column is useful only for fast networks.

This time difference computation works even if the master and slave do not have identical clock times, provided that the difference, computed when the slave I/O thread starts, remains constant from then on. Any changes—including NTP updates—can lead to clock skews that can make calculation of Seconds_Behind_Master less reliable.

This field is NULL (undefined or unknown) if the slave SQL thread is not running, or if the slave I/O thread is not running or is not connected to the master. For example, if the slave I/O thread is running but is not connected to the master and is sleeping for the number of seconds given by the CHANGE MASTER TO statement or --master-connect-retry option (default 60) before reconnecting, the value is NULL. This is because the slave cannot know what the master is doing, and so cannot say reliably how late it is.

The value of Seconds_Behind_Master is based on the timestamps stored in events, which are preserved through replication. This means that if a master M1 is itself a slave of M0, any event from M1′s binary log that originates from M0′s binary log has M0′s timestamp for that event. This enables MySQL to replicate TIMESTAMPsuccessfully. However, the problem for Seconds_Behind_Master is that if M1 also receives direct updates from clients, the Seconds_Behind_Master value randomly fluctuates because sometimes the last event from M1 originates from M0 and sometimes is the result of a direct update on M1.

入口函数

sql/slave.cc

//执行show slave status会执行下面的方法
bool show_master_info(THD* thd, Master_info* mi)
{
    /*省略*/
    /*
      Seconds_Behind_Master: if SQL thread is running and I/O thread is
      connected, we can compute it otherwise show NULL (i.e. unknown).
    */
    if ((mi-&gt;slave_running == MYSQL_SLAVE_RUN_CONNECT) &amp;&amp;
        mi-&gt;rli.slave_running)
    {
      long time_diff= ((long)(time(0) - mi-&gt;rli.last_master_timestamp)
                       - mi-&gt;clock_diff_with_master);
      /*
        Apparently on some systems time_diff can be store((longlong)(mi->rli.last_master_timestamp ?max(0, time_diff) : 0));*/
 
    }
    else
    {
      protocol-&gt;store_null();
    }
    /*省略*/
}

可以通过上面函数看到sql thread正在执行并且io thread连接状态时，通过下面公式进行计算，其余时间为0

long time_diff= ((long)(time(0) – mi->rli.last_master_timestamp) – mi->clock_diff_with_master);

对上述公式进行逐个剖析

1.(long)time(0)从库当前系统时间
2.mi->rli.last_master_timestamp

class Relay_log_info : public Slave_reporting_capability
{
public:
  time_t last_master_timestamp;
}
//last_master_timestamp表示sql_thread执行的binlog中的timestamp值
void Relay_log_info::stmt_done(my_off_t event_master_log_pos,                                   time_t event_creation_time)
{
  last_master_timestamp= event_creation_time;
}

3.mi->clock_diff_with_master

通过下面注释既可以看明白，io_thread在第一次执行的时候会在master上执行select unix_timestamp()然后计算master和slave的系统时间差，因此可以明白在测试过程，如果io thread已经运行的情况下，修改系统时间会存在问题（阅读代码的版本中不存在负数的问题）

class Master_info : public Slave_reporting_capability
{
/*
The difference in seconds between the clock of the master and the clock of
the slave (second - first). It must be signed as it may be 0.
clock_diff_with_master is computed when the I/O thread starts; for this the
I/O thread does a SELECT UNIX_TIMESTAMP() on the master.
"how late the slave is compared to the master" is computed like this:
clock_of_slave - last_timestamp_executed_by_SQL_thread - clock_diff_with_master
*/
long clock_diff_with_master;
}

总结

Seconds_Behind_Master在sql_thread正在运行时，通过当前系统时间减去sql_thread最近接受到的binlog的timestamp，然后在减去master和slave的系统时间差；如果sql_thread当前没有运行，则显示0。

存在的问题是：如果io_thread因为网络等缘故延迟非常大，会导致binlog迟迟没有发过来，这个时候sql_thread已经执行完毕手头的binlog，处于空闲状态，因此Seconds_Behind_Master会显示为0，实际上已经产生延迟，在网络延迟较大的环境下需要才用类似heartbeat的机制来解决。

参照：
seconds_behind_master解密
 官方文档