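In production Kafka clusters that handle large data volumes, disk usage across the disks of a single machine often becomes uneven, and it is common to see one disk carrying far more data than the others.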
Root cause
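This happens because Kafka only guarantees that the number of partitions is spread evenly across the disks; it does not take into account how much disk space each partition actually occupies. A few partitions with very large message volumes can therefore consume a disproportionate share of one disk. Before version 1.1 there was no clean way to deal with this. Even if you moved the log files and offset information by hand, the broker had to be restarted for the change to take effect, which is very risky. The reason is that before 1.1 Kafka only supported reassigning partition data between brokers, not between disks on the same broker. Version 1.1 officially supports moving replicas between log directories; for the implementation details, see KIP-113 on the official Kafka wiki.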
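To see the gap between partition count and actual bytes on a given broker, a small script can report both per log directory. The following is a minimal sketch, assuming each subdirectory of a log directory is one partition; the function names `partition_sizes` and `usage_by_log_dir` are hypothetical helpers, not part of Kafka.

```python
import os


def partition_sizes(log_dir):
    """Return {partition_dir_name: bytes} for one Kafka log directory.

    Assumption: every subdirectory of log_dir holds one partition's files.
    """
    sizes = {}
    for entry in os.scandir(log_dir):
        if not entry.is_dir():
            continue  # skip checkpoint files and other top-level files
        total = 0
        for root, _dirs, files in os.walk(entry.path):
            for name in files:
                try:
                    total += os.path.getsize(os.path.join(root, name))
                except OSError:
                    # A segment may be deleted by retention while we scan.
                    continue
        sizes[entry.name] = total
    return sizes


def usage_by_log_dir(log_dirs):
    """Return {log_dir: (partition_count, total_bytes)}."""
    report = {}
    for log_dir in log_dirs:
        sizes = partition_sizes(log_dir)
        report[log_dir] = (len(sizes), sum(sizes.values()))
    return report
```

Calling `usage_by_log_dir(["/data1/kafka-logs", "/data2/kafka-logs", "/data3/kafka-logs"])` on the broker host would show directories with the same partition count but very different byte totals, which is exactly the imbalance described above.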
Steps for moving replicas between directories
Suppose I have configured several log directories in `server.properties`, each on a separate disk, as follows:
```properties
log.dirs=/data1/kafka-logs,/data2/kafka-logs,/data3/kafka-logs
```
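I then created a topic with 9 partitions and sent 9 million messages to it. Listing the directories shows that Kafka spread the 9 partitions evenly across the three paths: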
```shell
> ll /data1/kafka-logs/ |grep test-topic
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-3
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-4
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-5

> ll /data2/kafka-logs/ |grep test-topic
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-0
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-1
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-2

> ll /data3/kafka-logs/ |grep test-topic
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-6
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-7
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-8
```
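Now suppose that, because of how the data of other topics is distributed, disk usage has become unbalanced. We need to move partitions 6, 7, and 8 of `test-topic` to `/data2`, and partition 1 of `test-topic` to `/data1`. To do this, we first write a JSON file, `migrate-replica.json`: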
```json
{
  "partitions": [
    {
      "topic": "test-topic",
      "partition": 1,
      "replicas": [0],
      "log_dirs": ["/data1/kafka-logs"]
    },
    {
      "topic": "test-topic",
      "partition": 6,
      "replicas": [0],
      "log_dirs": ["/data2/kafka-logs"]
    },
    {
      "topic": "test-topic",
      "partition": 7,
      "replicas": [0],
      "log_dirs": ["/data2/kafka-logs"]
    },
    {
      "topic": "test-topic",
      "partition": 8,
      "replicas": [0],
      "log_dirs": ["/data2/kafka-logs"]
    }
  ],
  "version": 1
}
```
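Writing this file by hand gets tedious once more than a few partitions are involved. The sketch below generates the same structure from a partition-to-directory mapping. It assumes a single replica per partition on one broker, as in this article; `build_log_dir_reassignment` is a hypothetical helper name, not a Kafka API.

```python
import json


def build_log_dir_reassignment(topic, moves, broker_id=0):
    """Build the reassignment document for intra-broker moves.

    moves: {partition_number: target_log_dir}
    Assumption: each partition has exactly one replica, on broker_id.
    """
    partitions = [
        {
            "topic": topic,
            "partition": partition,
            "replicas": [broker_id],
            "log_dirs": [log_dir],
        }
        for partition, log_dir in sorted(moves.items())
    ]
    return {"partitions": partitions, "version": 1}


plan = build_log_dir_reassignment(
    "test-topic",
    {
        1: "/data1/kafka-logs",
        6: "/data2/kafka-logs",
        7: "/data2/kafka-logs",
        8: "/data2/kafka-logs",
    },
)
print(json.dumps(plan, indent=2))
```

Redirecting the printed output to `migrate-replica.json` gives a file equivalent to the one shown above. For topics with more than one replica, the helper would need to emit one `log_dirs` entry per replica.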
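Here, the `0` in `replicas` is the broker ID. This article runs a single broker with `broker.id = 0`, so `0` is the only value needed. The `log_dirs` list has one entry per replica, in the same order as `replicas`, and names the target directory on that replica's broker. You can list several brokers to move replicas on multiple brokers in one operation. The `version` field is currently fixed at 1.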
After saving this JSON file, we run the following command to perform the replica move:
```shell
> bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --bootstrap-server localhost:9092 --reassignment-json-file ../migrate-replica.json --execute
Current partition replica assignment

{"version":1,"partitions":[{"topic":"test-topic","partition":8,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":4,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":5,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":2,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":6,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":3,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":1,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":7,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":0,"replicas":[0],"log_dirs":["any"]}]}

Save this to use as the --reassignment-json-file option during rollback
Successfully started reassignment of partitions.
```
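The tool asks you to save the printed assignment for rollback. A minimal sketch of pulling that JSON out of captured output follows; it assumes the output layout shown above (a "Current partition replica assignment" line followed by one JSON line), and `extract_current_assignment` is a hypothetical helper name.

```python
import json


def extract_current_assignment(tool_output):
    """Return the parsed JSON that follows the
    'Current partition replica assignment' line.

    Assumption: the JSON is the first non-empty line after that header.
    """
    lines = tool_output.splitlines()
    for index, line in enumerate(lines):
        if line.strip().startswith("Current partition replica assignment"):
            for candidate in lines[index + 1:]:
                candidate = candidate.strip()
                if candidate:
                    return json.loads(candidate)
    raise ValueError("no current assignment found in tool output")
```

Note that the saved assignment lists `"any"` for every `log_dirs` entry, so it records which brokers held each replica but not the directories they were in. To be able to move partitions back to their original directories, record the directory listing yourself before running `--execute`.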
Migration result
Once the command has finished, we check how the replicas are distributed across the storage directories again:
```shell
> ll /data1/kafka-logs/ |grep test-topic
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-1
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-3
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-4
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-5

> ll /data2/kafka-logs/ |grep test-topic
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-0
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-1
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-2
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-6
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-7
drwxr-xr-x 6 kafka staff 192 Dec 14 17:21 test-topic-8

> ll /data3/kafka-logs/ |grep test-topic
```
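Checking directory listings by eye becomes error-prone with many partitions. The following sketch maps each partition of a topic to the log directories that contain it and flags any partition found in more than one place. It assumes partition directories are named `<topic>-<partition>`, as in the listings in this article; directories with any other suffix are ignored. `locate_partitions` and `duplicated_partitions` are hypothetical helper names.

```python
import os
import re


def locate_partitions(log_dirs, topic):
    """Return {partition_number: [log_dirs containing that partition]}."""
    pattern = re.compile(r"^" + re.escape(topic) + r"-(\d+)$")
    found = {}
    for log_dir in log_dirs:
        for entry in os.scandir(log_dir):
            match = pattern.match(entry.name)
            if match and entry.is_dir():
                found.setdefault(int(match.group(1)), []).append(log_dir)
    return found


def duplicated_partitions(found):
    """Return only the partitions present in more than one log directory."""
    return {p: dirs for p, dirs in found.items() if len(dirs) > 1}
```

After a completed move, `duplicated_partitions(locate_partitions(log_dirs, "test-topic"))` would be expected to return an empty dict; a non-empty result suggests a move still in progress or a stale directory worth investigating.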
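As you can see, partitions 6, 7, and 8 have been moved to `/data2`, and partition 1 has been moved to `/data1`. (The listing above still shows a `test-topic-1` entry under `/data2`. Once the move has finished, that partition should exist only under `/data1`, so that line looks like a leftover in the captured output.) It is worth noting that more than the log segments and index files is moved: the checkpoint files at the top level of each log directory, outside the partition directories, are updated as well. For example, the `replication-offset-checkpoint` file under `/data2` now contains offset entries for partitions 6, 7, and 8 and no entry for partition 1, as shown below: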
```shell
> cat replication-offset-checkpoint
0
7
test-topic 8 1000000
test-topic 2 1000000
test 0 1285714
test-topic 6 1000000
test-topic 7 1000000
test-topic 0 1000000
test 2 1285714
```
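The checkpoint file is plain text, so it is easy to inspect programmatically. The sketch below is based only on the layout visible in the file above: a version line, an entry-count line, then one `topic partition offset` line per entry. `parse_offset_checkpoint` is a hypothetical helper name, and the format is internal to Kafka, so treat this as an illustration rather than a stable interface.

```python
def parse_offset_checkpoint(text):
    """Parse checkpoint text into (version, {(topic, partition): offset}).

    Assumed layout: line 1 is the version, line 2 is the number of entries,
    and each remaining line is 'topic partition offset'.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    version = int(lines[0])
    expected_count = int(lines[1])
    entries = {}
    for line in lines[2:]:
        topic, partition, offset = line.split()
        entries[(topic, int(partition))] = int(offset)
    if len(entries) != expected_count:
        raise ValueError(
            f"expected {expected_count} entries, found {len(entries)}"
        )
    return version, entries
```

Applied to the file above, the result contains `("test-topic", 6)`, `("test-topic", 7)`, and `("test-topic", 8)`, and has no key for `("test-topic", 1)`, which matches the moves requested in `migrate-replica.json`.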