用HandlerSocket大量并发插入数据到多个InnoDB的表(都使用自增id)的时候就会大量出现167的错误,从而TPS下降极为厉害.
让我们来分析这个bug是怎么产生的.
我们先来重现这个bug:
先建两个表:
CREATE TABLE test1 ( id int(20) unsigned NOT NULL AUTO_INCREMENT, data varchar(200) DEFAULT NULL, PRIMARY KEY (id) ) ENGINE=InnoDB AUTO_INCREMENT=1; CREATE TABLE test2 ( id int(20) unsigned NOT NULL AUTO_INCREMENT, data varchar(200) DEFAULT NULL, PRIMARY KEY (id) ) ENGINE=InnoDB AUTO_INCREMENT=1;
然后用HandlerSocket大并发的执行类似这样SQL的命令:
insert into test1 (data) values ('1'); insert into test1 (data) values ('2'); insert into test2 (data) values ('3'); insert into test2 (data) values ('4');
那么很快就会出现167的错误
那我们来看看167是什么意思, 167是
Handler->ha_write_row()
返回的错误码,即
#define HA_ERR_AUTOINC_ERANGE 167 /* Failed to set row autoinc value */
表示已经越界了
查看可能返回HA_ERR_AUTOINC_ERANGE的地方,就在int handler::update_auto_increment()这个方法里,当
if (unlikely(nr == ULONGLONG_MAX)) DBUG_RETURN(HA_ERR_AUTOINC_ERANGE);
时,会有问题
而什么时候nr会等于ULONGLONG_MAX呢,看上面的代码
/* This call ignores all its parameters but nr, currently */ get_auto_increment(variables->auto_increment_offset, variables->auto_increment_increment, nb_desired_values, &nr, &nb_reserved_values); if (nr == ULONGLONG_MAX) DBUG_RETURN(HA_ERR_AUTOINC_READ_FAILED); // Mark failure /* That rounding below should not be needed when all engines actually respect offset and increment in get_auto_increment(). But they don't so we still do it. Wonder if for the not-first-in-index we should do it. Hope that this rounding didn't push us out of the interval; even if it did we cannot do anything about it (calling the engine again will not help as we inserted no row). */ nr= compute_next_insert_id(nr-1, variables);
在nr= compute_next_insert_id(nr-1, variables); 这一步会导致nr等于ULONGLONG_MAX
那么在get_auto_increment()这步的时候nr返回就必须是0,而nr传进去的时候就是0
那么就是get_auto_increment()没有起效果
我们再看InnoDB的get_auto_increment()实现
UNIV_INTERN void ha_innobase::get_auto_increment( /*============================*/ ulonglong offset, /*!< in: table autoinc offset */ ulonglong increment, /*!< in: table autoinc increment */ ulonglong nb_desired_values, /*!< in: number of values reqd */ ulonglong *first_value, /*!< out: the autoinc value */ ulonglong *nb_reserved_values) /*!< out: count of reserved values */ { trx_t* trx; ulint error; ulonglong autoinc = 0; /* Prepare prebuilt->trx in the table handle */ update_thd(ha_thd()); error = innobase_get_autoinc(&autoinc); if (error != DB_SUCCESS) { *first_value = (~(ulonglong) 0); return; } /* This is a hack, since nb_desired_values seems to be accurate only for the first call to get_auto_increment() for multi-row INSERT and meaningless for other statements e.g, LOAD etc. Subsequent calls to this method for the same statement results in different values which don't make sense. Therefore we store the value the first time we are called and count down from that as rows are written (see write_row()). */ trx = prebuilt->trx; /* Note: We can't rely on *first_value since some MySQL engines, in particular the partition engine, don't initialize it to 0 when invoking this method. So we are not sure if it's guaranteed to be 0 or not. */ /* We need the upper limit of the col type to check for whether we update the table autoinc counter or not. */ ulonglong col_max_value = innobase_get_int_col_max_value( table->next_number_field); /* Called for the first time ? */ if (trx->n_autoinc_rows == 0) { trx->n_autoinc_rows = (ulint) nb_desired_values; /* It's possible for nb_desired_values to be 0: e.g., INSERT INTO T1(C) SELECT C FROM T2; */ if (nb_desired_values == 0) { trx->n_autoinc_rows = 1; } set_if_bigger(*first_value, autoinc); /* Not in the middle of a mult-row INSERT. */ } else if (prebuilt->autoinc_last_value == 0) { set_if_bigger(*first_value, autoinc); /* Check for -ve values. */ } else if (*first_value > col_max_value && trx->n_autoinc_rows > 0) { /* Set to next logical value. */ ut_a(autoinc > trx->n_autoinc_rows); *first_value = (autoinc - trx->n_autoinc_rows) - 1; } *nb_reserved_values = trx->n_autoinc_rows; /* With old style AUTOINC locking we only update the table's AUTOINC counter after attempting to insert the row. */ if (innobase_autoinc_lock_mode != AUTOINC_OLD_STYLE_LOCKING) { ulonglong need; ulonglong current; ulonglong next_value; current = *first_value > col_max_value ? autoinc : *first_value; need = *nb_reserved_values * increment; /* Compute the last value in the interval */ next_value = innobase_next_autoinc( current, need, offset, col_max_value); prebuilt->autoinc_last_value = next_value; if (prebuilt->autoinc_last_value < *first_value) { *first_value = (~(ulonglong) 0); } else { /* Update the table autoinc variable */ dict_table_autoinc_update_if_greater( prebuilt->table, prebuilt->autoinc_last_value); } } else { /* This will force write_row() into attempting an update of the table's AUTOINC counter. */ prebuilt->autoinc_last_value = 0; } /* The increment to be used to increase the AUTOINC value, we use this in write_row() and update_row() to increase the autoinc counter for columns that are filled by the user. We need the offset and the increment. */ prebuilt->autoinc_offset = offset; prebuilt->autoinc_increment = increment; dict_table_autoinc_unlock(prebuilt->table); }
能导致first_value不改变的请求只有
if (trx->n_autoinc_rows == 0) { trx->n_autoinc_rows = (ulint) nb_desired_values; /* It's possible for nb_desired_values to be 0: e.g., INSERT INTO T1(C) SELECT C FROM T2; */ if (nb_desired_values == 0) { trx->n_autoinc_rows = 1; } set_if_bigger(*first_value, autoinc); /* Not in the middle of a mult-row INSERT. */ } else if (prebuilt->autoinc_last_value == 0) { set_if_bigger(*first_value, autoinc); /* Check for -ve values. */ } else if (*first_value > col_max_value && trx->n_autoinc_rows > 0) { /* Set to next logical value. */ ut_a(autoinc > trx->n_autoinc_rows); *first_value = (autoinc - trx->n_autoinc_rows) - 1; }
这3个判断都没有满足
而HandlerSocket再并发插入多表时确实可能导致3个条件都不满足.
为什么呢,在于trx->n_autoinc_rows和prebuilt->autoinc_last_value这两个变量
由于新的InnoDB自增id的策略,在多行插入的时候为了减少自增id的锁堵塞,所以会一次生成一个大于auto_increment_offset跨度的id序列,trx->n_autoinc_rows表示就是这个序列的大小
prebuilt->autoinc_last_value表示的是这个表的这个序列的最后一个值
1.第一个条件不满足
大家注意到了n_autoinc_rows是在trx上的,autoinc_last_value是在prebuilt上的,trx又是在THD上的.这就导致多表自增的时候一个表生成的n_autoinc_rows,可能会作用于两个表上.
比如nb_desired_values=2而实际只插入了一条记录,那么循环后trx->n_autoinc_rows = 1 那么换成另一个表的时候后trx->n_autoinc_rows = 1,那么第一个条件就不满足.
2.第二个条件不满足
而第二个条件,即当第二个表也有自增过之后
/* Compute the last value in the interval */ next_value = innobase_next_autoinc( current, need, offset, col_max_value); prebuilt->autoinc_last_value = next_value;
prebuilt->autoinc_last_value必然是大于0的,就是有错误,*first_value=0 最后prebuilt->autoinc_last_value也会等于1,那么这个时候第二个条件也会不满足!
3.第三个条件不满足
而当*first_value=0 时,第三个条件自然也是满足不了的.
所以最后就导致update_auto_increment()返回了167错误~
解决办法:
1.设置innodb_autoinc_lock_mode=0 是InnoDB采用老式的自增id分配算法可以避免这种请求
2.修改HandlerSocket代码,在dbcontext::cmd_insert_internal()的const int r = hnd->ha_write_row(buf)之前加上hnd->start_stmt(thd,TL_WRITE)对trx->n_autoinc_rows进行重置来避免这个问题