Elasticsearch: Cluster health showing red and unassigned shards

Question:

When we check cluster health, it shows red and there are unassigned shards. Indexing of new documents fails, although read operations still work.

Answer:
This issue occurs when one or more shards become unassigned from the nodes of the Elasticsearch cluster. A red status means that at least one primary shard is unassigned. Elasticsearch normally tries to balance shards evenly across all nodes in the cluster. To find the cause, run the command below, which reports an unassigned shard and the reason it cannot be allocated.

curl -XGET "ES-IP:9200/_cluster/allocation/explain?pretty"

{
  "index" : "vunet-1-1-vublock-sources",
  "shard" : 1,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "ALLOCATION_FAILED",
    "at" : "2021-12-11T07:24:07.468Z",
    "failed_allocation_attempts" : 5,
    "details" : "failed shard on node [fHA4Cq__TcuZWM8RRpAanw]: failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[vunet-1-1-vublock-sources][1]: obtaining shard lock timed out after 5000ms]; ",
    "last_allocation_status" : "no"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes that hold an in-sync shard copy",
  "node_allocation_decisions" : [
    {
      "node_id" : "XgmtzW7eT4avKM3YJsgCRw",
      "node_name" : "elasticsearch-3",
      "transport_address" : "10.10.0.97:9300",
      "node_decision" : "no",
      "store" : {
        "found" : false
      }
    },
    {
      "node_id" : "cNFaI423Qgeh8h1lTy3Piw",
      "node_name" : "elasticsearch-1",
      "transport_address" : "10.10.0.102:9300",
      "node_decision" : "no",
      "store" : {
        "found" : false
      }
    },
    {
      "node_id" : "fHA4Cq__TcuZWM8RRpAanw",
      "node_name" : "elasticsearch-2",
      "transport_address" : "10.10.0.104:9300",
      "node_decision" : "no",
      "store" : {
        "in_sync" : true,
        "allocation_id" : "YBPFgc0gTaCaSAjXpllq1w",
        "store_exception" : {
          "type" : "shard_lock_obtain_failed_exception",
          "reason" : "[vunet-1-1-vublock-sources][1]: obtaining shard lock timed out after 5000ms",
          "index_uuid" : "0XsasG9PQhK2uswi8F_eLw",
          "shard" : "1",
          "index" : "vunet-1-1-vublock-sources"
        }
      },
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "NO",
          "explanation" : "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2021-12-11T07:24:07.468Z], failed_attempts[5], delayed=false, details[failed shard on node [fHA4Cq__TcuZWM8RRpAanw]: failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[vunet-1-1-vublock-sources][1]: obtaining shard lock timed out after 5000ms]; ], allocation_status[deciders_no]]]"
        }
      ]
    }
  ]
}
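When the explain API is called without a request body, it reports on a single unassigned shard, so it can help to list every unassigned shard first and then ask about a specific one. The following is a minimal sketch, not a definitive procedure. ES_URL and the function names list_unassigned_shards and explain_shard are hypothetical names chosen for illustration; the _cat/shards columns and the explain request body fields (index, shard, primary) are standard Elasticsearch API parameters.

```shell
#!/bin/sh
# ES_URL is a placeholder (assumption); point it at any node in the cluster.
ES_URL="${ES_URL:-http://ES-IP:9200}"

# Print index, shard number, primary/replica flag, state and unassigned
# reason for every shard whose state is UNASSIGNED.
list_unassigned_shards() {
  curl -s -XGET "$ES_URL/_cat/shards?h=index,shard,prirep,state,unassigned.reason" \
    | awk '$4 == "UNASSIGNED"'
}

# Explain one specific shard instead of whichever one the cluster picks.
# Arguments: index name, shard number, true (primary) or false (replica).
explain_shard() {
  curl -s -XGET "$ES_URL/_cluster/allocation/explain?pretty" \
    -H 'Content-Type: application/json' \
    -d "{\"index\": \"$1\", \"shard\": $2, \"primary\": $3}"
}
```

For the shard in the output above, the call would be: explain_shard vunet-1-1-vublock-sources 1 true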

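Called without a request body, this command explains one unassigned shard and the reason it is unassigned. In the output above, allocation of the shard failed because the in-memory shard lock could not be obtained (ShardLockObtainFailedException), and the max_retry decider now blocks further attempts because the maximum of 5 retries has been reached. Elasticsearch will not retry on its own at this point, so we have to retry the allocation manually using the command below.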

curl -XPOST "ES-IP:9200/_cluster/reroute?retry_failed=true"

This command asks the cluster to retry allocation of the shards that were blocked after too many failed allocation attempts. After running it, check the cluster health to confirm that the status is green and that no shards remain unassigned.
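The health check after the retry can be scripted. This is a rough sketch under stated assumptions: ES_URL, cluster_status and wait_for_green are hypothetical names, and the default of 30 attempts at 10-second intervals is an arbitrary choice, not a recommended value. It relies only on the _cat/health API with the status column.

```shell
#!/bin/sh
# ES_URL is a placeholder (assumption); point it at any node in the cluster.
ES_URL="${ES_URL:-http://ES-IP:9200}"

# Print only the cluster status word: green, yellow or red.
cluster_status() {
  curl -s -XGET "$ES_URL/_cat/health?h=status"
}

# Poll until the status is green, or give up after a number of attempts.
# Arguments: attempts (default 30), seconds between attempts (default 10).
# Note: POSIX sh has no local variables, so these names are global.
wait_for_green() {
  attempts="${1:-30}"
  delay="${2:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status="$(cluster_status | tr -d '[:space:]')"
    [ "$status" = "green" ] && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  echo "cluster status is still ${status:-unknown}" >&2
  return 1
}
```

Running wait_for_green after the reroute command returns 0 once the cluster reports green and 1 if it does not get there in time. If it gives up, run the allocation explain command again to see why shards are still unassigned.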