How to solve Elasticsearch Unassigned Shards on Reboot

We have had our Elasticsearch cluster in production for a few weeks. While checking on drive space, we decided to increase it, since usage was sitting at 67, 88 and 95 percent across our three nodes. The sysadmin added the space and rebooted the box (OK, this was the first mistake; you can do this online nowadays). When it rebooted, horror set in: all three shards were unassigned and appeared to be gone. Cluster status: yellow.

Elasticsearch Unallocated Shards

A quick online search led to a shard allocation setting.

Elasticsearch Shard Allocation
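
For readers without the screenshot, here is a minimal Python sketch of what such a call can look like. The setting name `cluster.routing.allocation.enable` is a real cluster setting, but the post does not say which setting was actually used, so treat it as an assumption. The address `http://localhost:9200` is also a placeholder.

```python
import json
import urllib.request

# Assumed address of one cluster node; adjust for your environment.
ES_URL = "http://localhost:9200"


def allocation_settings_body(mode="all"):
    """Build the body for PUT /_cluster/settings that sets shard allocation."""
    return {"transient": {"cluster.routing.allocation.enable": mode}}


def enable_allocation(base_url=ES_URL):
    """Send the settings update and return the parsed JSON response."""
    request = urllib.request.Request(
        base_url + "/_cluster/settings",
        data=json.dumps(allocation_settings_body()).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

If allocation was never disabled in the first place, a call like this succeeds but changes nothing.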

Tried it, nothing. I would show the same image as above again, but that would be a waste of pixels.

Then I hit pay dirt: I found a way to ask Elasticsearch to explain why a shard for an index didn’t exist. You can pass it a node or a shard number. This is a great API.

Elasticsearch Allocation Explain
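
A rough Python sketch of that kind of request, under a few assumptions. The endpoint `/_cluster/allocation/explain` is the one documented for Elasticsearch 5.0 and later, as far as I know. The index name, the host, and the response shape (`node_allocation_decisions` with a `deciders` list, as in newer releases) are illustrative and may differ on your version.

```python
import json
import urllib.request


def explain_request_body(index, shard, primary=True):
    """Build the body that names the shard we want explained."""
    return {"index": index, "shard": shard, "primary": primary}


def blocking_deciders(explanation):
    """Return (node_name, decider, explanation) for every decider that said NO.

    Assumes the newer response layout; other versions may use different keys.
    """
    blocked = []
    for node in explanation.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") == "NO":
                blocked.append(
                    (node.get("node_name"), decider.get("decider"), decider.get("explanation"))
                )
    return blocked


def explain_shard(index, shard, primary=True, base_url="http://localhost:9200"):
    """Ask the cluster why the given shard is unassigned (host is a placeholder)."""
    request = urllib.request.Request(
        base_url + "/_cluster/allocation/explain",
        data=json.dumps(explain_request_body(index, shard, primary)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Something like `blocking_deciders(explain_shard("my-index", 0))` would then list only the deciders that refused the shard, which is the part you actually care about.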

The answer: the drive space watermark check came back “NO”. Bingo! Alright, let us fix that drive now.

Drive space added, shards are back online. The thing to remember is that everything else was fine, but it appears that with Elasticsearch 1.6, once a node hits the watermark, it has to go back under it before that node will come fully online, no matter what else happens. The cluster was processing data, serving up content and working overall; only the shards were missing. We are migrating to Elasticsearch 2.3 soon and will find out whether it has the same watermark behavior.
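
To make the watermark idea concrete, here is a small Python sketch that flags nodes sitting above a disk watermark. The 85% and 90% thresholds are the documented defaults for `cluster.routing.allocation.disk.watermark.low` and `.high` in the versions I am aware of, so check your own cluster settings. The node names are made up, and the usage figures are the ones from this post, assumed to be percent of disk used.

```python
# Assumed defaults; your cluster may override them.
LOW_WATERMARK = 85.0
HIGH_WATERMARK = 90.0


def watermark_status(used_percent, low=LOW_WATERMARK, high=HIGH_WATERMARK):
    """Classify a node's disk usage against the low and high watermarks."""
    if used_percent > high:
        return "over high watermark"
    if used_percent > low:
        return "over low watermark"
    return "ok"


# Hypothetical node names with the usage figures from this post.
usage = {"node-1": 67.0, "node-2": 88.0, "node-3": 95.0}
report = {name: watermark_status(pct) for name, pct in usage.items()}

for name, status in report.items():
    print(name, usage[name], status)
```

With those numbers, two of the three nodes sit above the default low watermark, which is worth knowing before anyone reaches for the reboot button.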