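Background:
The problem to solve: a single shard holds too many documents and exceeds the Lucene limit (2,147,483,519 documents per shard).
Analysis
1. This is usually log or OLAP data, so the simplest fix is to delete the index and rebuild it.
2. Keep the existing index and create a new one
- Write new data to the new index, and include both old_index and new_index in queries.
3. Try the split API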
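To spot shards that are approaching this limit before writes start failing, a small check over `_cat/shards` output can help. This is a minimal sketch: it assumes the text comes from `GET _cat/shards?h=index,shard,prirep,docs`, the 0.8 warning threshold is an arbitrary choice, and the index name in the sample is hypothetical. The `docs` column is only an approximation of what Lucene counts (deleted and nested documents may not be reflected), so treat the result as an early warning rather than an exact measure.

```python
# Lucene's hard per-shard document limit (Integer.MAX_VALUE - 128).
LUCENE_MAX_DOCS = 2_147_483_519


def shards_near_limit(cat_shards_text, threshold=0.8):
    """Return (index, shard, docs) for primaries above threshold * limit.

    Expects text from: GET _cat/shards?h=index,shard,prirep,docs
    The threshold is an assumption, not an Elasticsearch setting.
    """
    risky = []
    for line in cat_shards_text.strip().splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # e.g. unassigned shards have no docs value
        index, shard, prirep, docs = parts[:4]
        if prirep == "p" and int(docs) > threshold * LUCENE_MAX_DOCS:
            risky.append((index, int(shard), int(docs)))
    return risky


# Hypothetical sample output; the index name is made up.
sample = """
logs-2023 0 p 2000000000
logs-2023 0 r 2000000000
logs-2023 1 p 150000000
"""
print(shards_near_limit(sample))
```

Only primaries are reported, since each replica holds the same documents as its primary.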
split index API
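When you need to increase the number of primary shards of an existing index, you can use the split index API.
It creates a new index and keeps the original one.
Steps:
Make the source index read-only (block writes)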
PUT source_index/_settings
{"settings": {"index.blocks.write": true }
}
Use the split API to set the new number of primary shards
POST source_index/_split/new_index
{"settings": {"index.number_of_shards": 10}
}
Monitor the progress
GET _cat/recovery/new_index
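The monitoring step can be automated by checking the `stage` column of the recovery output. The sketch below assumes the request is narrowed to three columns with `GET _cat/recovery/new_index?h=index,shard,stage`; the function name and sample rows are illustrative only.

```python
def split_finished(cat_recovery_text):
    """Return True when every recovery row reports stage "done".

    Expects text from: GET _cat/recovery/new_index?h=index,shard,stage
    """
    rows = [line.split() for line in cat_recovery_text.strip().splitlines()]
    rows = [row for row in rows if len(row) == 3]
    # An empty result means nothing has been reported yet, so not finished.
    return bool(rows) and all(stage == "done" for _, _, stage in rows)


# Hypothetical sample rows for illustration.
in_progress = """
new_index 0 done
new_index 1 index
"""
finished = """
new_index 0 done
new_index 1 done
"""
print(split_finished(in_progress), split_finished(finished))
```

A caller would poll the cat API and stop once the function returns True.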
Test
Version 7.17.5
# Create a test index
PUT test_split
{}
# Block writes on the source index
PUT /test_split/_settings
{"settings": {"index.blocks.write": true }
}
# Run the split API
POST /test_split/_split/test_split_new
{"settings": {"index.number_of_shards": 12}
}
Errors encountered (and resolved) while running the split API:
1. The source index must be read-only:
{"error": {"root_cause": [{"type": "illegal_state_exception","reason": "index test_split must be read-only to resize index. use \"index.blocks.write=true\""}],"type": "illegal_state_exception","reason": "index test_split must be read-only to resize index. use \"index.blocks.write=true\""},"status": 500
}
2. The number of source shards (3) must be a factor of the number of target shards (so the target cannot be 11, but can be 12):
{"error": {"root_cause": [{"type": "illegal_argument_exception","reason": "the number of source shards [3] must be a factor of [11]"}],"type": "illegal_argument_exception","reason": "the number of source shards [3] must be a factor of [11]"},"status": 400
}
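The factor rule from the second error can be checked before sending the request. This is a minimal sketch with a hypothetical helper name; it covers only the factor rule, so a candidate that passes here may still be rejected by other constraints such as `index.number_of_routing_shards`.

```python
def passes_factor_rule(source_shards, target_shards):
    """Check only the rule from the error above: the source shard count
    must be a factor of the (larger) target shard count."""
    return target_shards > source_shards and target_shards % source_shards == 0


print(passes_factor_rule(3, 11))  # False: 3 is not a factor of 11
print(passes_factor_rule(3, 12))  # True: 12 = 3 * 4
```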
Application
Cluster version 6.8.5
After setting "index.blocks.write": true on the source index, the split API call failed:
{"error": {"root_cause": [{"type": "remote_transport_exception","reason": "[es-log-all-2][10.xx.x.xx:9300][indices:admin/resize]"}],"type": "illegal_state_exception","reason": "the number of routing shards [5] must be a multiple of the target shards [20]"},"status": 500
}
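That is, the number of primary shards of the target index must be a factor of index.number_of_routing_shards.
Note: number_of_routing_shards cannot be changed dynamically; it is a static setting fixed when the index is created.
Conclusion: on ES 6.8 the split API cannot fix an index with too few shards unless the index was created with a suitable index.number_of_routing_shards.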
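Both constraints (source shards must divide the target, and the target must divide the routing shards) can be combined to list the shard counts an index can actually be split into. The sketch below is illustrative: the source shard count of 5 is an assumption (the error only reports 5 routing shards), and the value 40 is a hypothetical setting chosen for contrast.

```python
def split_targets(source_shards, routing_shards):
    """List target shard counts that satisfy both rules seen in the errors:
    a multiple of the source shard count and a factor of routing shards."""
    return [
        target
        for target in range(source_shards + 1, routing_shards + 1)
        if target % source_shards == 0 and routing_shards % target == 0
    ]


# Assumed case matching the error: 5 source shards, 5 routing shards.
print(split_targets(5, 5))   # [] : there is nothing to split into
# Hypothetical index created with number_of_routing_shards = 40.
print(split_targets(5, 40))  # [10, 20, 40]
```

An empty list is the situation described in the conclusion: with routing shards equal to the current shard count, no larger target is valid.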
Official docs: Split index API | Elasticsearch Guide [8.9] | Elastic
Shrink index API
When you need to reduce the number of primary shards of an existing index, you can use the shrink index API.
It creates a new index and keeps the original one.
(Shrinks an existing index into a new index with fewer primary shards.)
POST /my-index-000001/_shrink/shrunk-my-index-000001
Steps
# Create the index
PUT test_shrink
{}
# Check which nodes hold the index's shards
GET _cat/shards/test_shrink?v
# Relocate all primary shards to a single node (node-es-0 here), set replicas to 0, and block writes
PUT test_shrink/_settings
{"settings": {"index.number_of_replicas": 0,"index.routing.allocation.require._name": "node-es-0","index.blocks.write": true}
}
# Run the shrink API
POST /test_shrink/_shrink/new_test_shrink
{"settings": {"index.number_of_replicas": 1,"index.number_of_shards": 1, "index.codec": "best_compression" },"aliases": {"my_search_indices": {}}
}
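Shard relocation to the chosen node takes time, so it is worth confirming that it has completed before sending the shrink request. The sketch below assumes the output of `GET _cat/shards/test_shrink?h=index,shard,prirep,state,node`; the helper name and sample rows are illustrative.

```python
def ready_for_shrink(cat_shards_text, node):
    """Return True when every shard number has a STARTED copy on `node`.

    Expects text from:
    GET _cat/shards/test_shrink?h=index,shard,prirep,state,node
    """
    all_shards, on_node = set(), set()
    for line in cat_shards_text.strip().splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # skip malformed rows
        shard, state = parts[1], parts[3]
        all_shards.add(shard)
        # Unassigned or relocating copies do not count as ready.
        if state == "STARTED" and len(parts) >= 5 and parts[4] == node:
            on_node.add(shard)
    return bool(all_shards) and on_node == all_shards


# Hypothetical sample rows for illustration.
moving = """
test_shrink 0 p STARTED node-es-0
test_shrink 1 p STARTED node-es-1
test_shrink 2 p STARTED node-es-0
"""
settled = """
test_shrink 0 p STARTED node-es-0
test_shrink 1 p STARTED node-es-0
test_shrink 2 p STARTED node-es-0
"""
print(ready_for_shrink(moving, "node-es-0"), ready_for_shrink(settled, "node-es-0"))
```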
If the shrink request is changed to:
POST /test_shrink/_shrink/new_test_shrink
{"settings": {"index.number_of_replicas": 1,"index.number_of_shards": 2, "index.codec": "best_compression" },"aliases": {"my_search_indices": {}}
}
the new number_of_shards is not a factor of the source index's number_of_shards, so the following error is returned:
{"error": {"root_cause": [{"type": "illegal_argument_exception","reason": "the number of source shards [3] must be a multiple of [2]"}],"type": "illegal_argument_exception","reason": "the number of source shards [3] must be a multiple of [2]"},"status": 400
}
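The valid shrink targets are simply the proper divisors of the source shard count. A minimal sketch, with a hypothetical helper name:

```python
def shrink_targets(source_shards):
    """List shard counts smaller than the source that divide it evenly."""
    return [t for t in range(1, source_shards) if source_shards % t == 0]


print(shrink_targets(3))   # [1] : which is why 2 is rejected above
print(shrink_targets(12))  # [1, 2, 3, 4, 6]
```

For a 3-shard source index the only option is a single shard, which matches the successful request shown earlier.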
Official docs: Shrink index API | Elasticsearch Guide [8.9] | Elastic