How to solve Elasticsearch Unassigned Shards on Reboot

We have had our ElasticSearch cluster in production for a few weeks. While checking on drive space we decided to increase it, since usage was sitting at 67%, 88%, and 95% across our three nodes. The sysadmin added the space and rebooted the box (OK, this was the first mistake; you can add disk online these days). When it rebooted, horror set in: the shards were unassigned and appeared gone. Cluster status: yellow.

Elasticsearch Unallocated Shards

A quick online search led to a shard allocation setting.

Elasticsearch Shard Allocation

Tried it, nothing. I would just reuse the same image above, but it would be a waste of pixels.
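For reference, that setting is toggled through the cluster settings API; a sketch of the request we tried:

```
PUT /_cluster/settings
{
  "transient": { "cluster.routing.allocation.enable": "all" }
}
```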

Then I hit pay dirt: I found a way to ask ElasticSearch to explain why a shard for an index wasn't assigned. You can pass the node or shard numbers to it; this is a great API.
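In newer releases (5.0+) this is a first-class allocation explain API; a sketch of that request, with placeholder index and shard values:

```
GET /_cluster/allocation/explain
{
  "index": "myindex",
  "shard": 0,
  "primary": true
}
```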

Elasticsearch Allocation Explain

The decision came back "NO" because of the drive space watermark. Bingo! Alright, let us fix that drive now.

Drive space added, and the shards are back online. The thing to remember: everything else appeared fine, but with ElasticSearch 1.6 it seems that once a node hits the watermark, no matter what else happens, it has to get back under it before that node will come fully online. The node was processing data, serving up content, and overall working; only the shards were missing. We are migrating to ElasticSearch 2.3 soon, so we will find out very soon whether it has the same watermark behavior.
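If you can't add disk right away, the watermarks themselves are cluster settings you can (temporarily) raise; a sketch, showing the documented defaults as a starting point:

```
PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%"
  }
}
```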


ElasticSearch Online ReIndex With Writes C#

We just deployed an application using ElasticSearch (Elastic.co) solely as our backend full-text search provider. We migrated away from SQL Full Text Search for reasons I will post about later. ElasticSearch runs on top of Lucene, adding a REST API and cluster goodness.

Quick setup:
1. You need Java, with the Java Runtime (JRE) in your path. I cheat and update the elasticsearch batch files to point at a download.
2. Run elasticsearch. On Windows/Linux you just run it from the command line; later you can register it as a background service.
3. Install a management plugin. I use one from mobz; "plugin install mobz/elasticsearch-head" in the elasticsearch bin folder should do the trick.
4. The port is 9200, and plugins run under the "_plugin" path. If you get it up and running you should see this screen, minus the indexes.
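On a typical install, those steps boil down to roughly these commands (paths assumed, 2.x plugin syntax shown):

```
# from the unzipped elasticsearch folder
bin/elasticsearch                            # bin\elasticsearch.bat on Windows
bin/plugin install mobz/elasticsearch-head
# then browse to http://localhost:9200/_plugin/head/
```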

ElasticSearch-Head-Home

Terms
Index == a table, in SQL terms.
Mapping == telling ElasticSearch what each field means and how to store it. In SQL we just have data types; in ElasticSearch you have much more control than "this is a string".
Alias == a view, in SQL terms; this is important later.
Replica == a backup copy of a shard.
Shard == a way to split one really large index into smaller pieces. Keep in mind you need to know the routing details to update a document if you shard.
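To make the terms concrete, here is a sketch of creating a versioned index with an explicit mapping (the index name, type, and single string field are assumptions for illustration; "string" is the 1.x/2.x-era field type):

```
PUT /datav1
{
  "mappings": {
    "company": {
      "properties": {
        "name": { "type": "string" }
      }
    }
  }
}
```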

The first problem you run into with ElasticSearch is "How do I change my mappings?" Easy: you make a new index and point everything at it. In SQL Server, SSMS will do this for you by creating a new table, moving the data, dropping the old table, and renaming the new one. ElasticSearch doesn't do this for you, so you have to code it yourself. The second problem is how to handle this while the data is live. ElasticSearch doesn't have the concept of "locks" in the SQL Server sense. Plus, relying on locking data to perform an update is so 2004 of an application. Large ElasticSearch instances contain billions of documents, which are hard to lock or file-copy all at once.

Source Code C# (If you want to cut to the chase)
https://github.com/joshbartley/ElasticSearch_ReIndex

ElasticSearch-Single-Index
You always start with one index.

You get some data into ElasticSearch, then you find out your mapping is wrong and you need to change it. Four records is easy; four million would be harder. First step: version your indexes. Dates, numbers, a random word generator, it doesn't matter; use whatever makes the most sense to you.

You create your new secondary index and add in a lowercase mapping. The mapping change itself is incidental to this post, just an example.

ElasticSearch-Second-Index-ReIndex

If you move all the data now, you may miss records that are being written into "datav1" during the re-index. Here is where aliases come into play. You can name them however you like; I named them *_r(ead) and *_rw (read/write). Note that you cannot write through an alias that points at more than one index.
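Creating both aliases is a single _aliases call; a sketch assuming the index names from the screenshots:

```
POST /_aliases
{
  "actions": [
    { "add": { "index": "datav1", "alias": "data_r" } },
    { "add": { "index": "datav1", "alias": "data_rw" } },
    { "add": { "index": "datav2", "alias": "data_rw" } }
  ]
}
```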

ElasticSearch-Read-Write-Alias

The "data_rw" alias only lets your application know where to write the data. You have to code your application to accept the alias, grab its indexes, and write to both. In this example I am not using the bulk API since I am only dealing with two records per operation. Anyone want to PR a change to the C# client?

[csharp]
private static void WriteSecondaryObjects(ElasticClient client)
{
    // "data_rw" points at every index that should receive writes
    var indexes = client.GetAlias(x => x.Name("data_rw"));

    foreach (var index in indexes.Indices)
    {
        client.Index(new Company() { Name = "Mega Acme Corp" }, idx => idx.Index(index.Key));
        client.Index(new Company() { Name = "Global World Domination Corp Acme LLC" }, idx => idx.Index(index.Key));
    }
}
[/csharp]

After writing the two records, the document count goes from 2 to 4 on the first index, and from 0 to 2 on the second. Notice the "data_r" alias still points at the old index; this is because the new index doesn't have all the data yet.

chrome_2016-04-05_21-23-44

The following code I grabbed from a StackOverflow answer and updated to the latest NEST (the C# client for ElasticSearch): http://stackoverflow.com/a/34867857/32963

[csharp]
public static void Reindex(ElasticClient client, string aliasName, string currentIndexName, string nextIndexName)
{
    Console.WriteLine("Reindexing documents to new index…");

    // Open a scan/scroll over the old index; a scan search returns no hits
    // until the first Scroll call
    var searchResult = client.Search<object>(s => s
        .Index(currentIndexName)
        .AllTypes()
        .From(0)
        .Size(100)
        .Query(q => q.MatchAll())
        .SearchType(Elasticsearch.Net.SearchType.Scan)
        .Scroll("2m"));

    if (searchResult.Total <= 0)
    {
        Console.WriteLine("Existing index has no documents, nothing to reindex.");
    }
    else
    {
        var page = 0;
        IBulkResponse bulkResponse = null;
        do
        {
            var result = searchResult;
            searchResult = client.Scroll<object>(new Time("2m"), result.ScrollId);
            if (searchResult.Documents != null && searchResult.Documents.Any())
            {
                ThrowOnError(searchResult, "reindex scroll " + page);

                // Bulk-index this page of hits into the new index, preserving ids and types
                bulkResponse = (IBulkResponse)ThrowOnError(client.Bulk(b =>
                {
                    foreach (var hit in searchResult.Hits)
                    {
                        b.Index<object>(bi => bi.Document(hit.Source).Type(hit.Type).Index(nextIndexName).Id(hit.Id));
                    }

                    return b;
                }), "reindex page " + page);
                Console.WriteLine("Reindexing progress: " + (page + 1) * 100);
            }

            ++page;
        }
        while (searchResult.IsValid && bulkResponse != null && bulkResponse.IsValid && searchResult.Documents != null && searchResult.Documents.Any());
        Console.WriteLine("Reindexing complete!");
    }

    Console.WriteLine("Updating alias to point to new index…");

    // Add and remove in a single call so the alias swap is atomic
    client.Alias(a => a
        .Add(aa => aa.Alias(aliasName).Index(nextIndexName))
        .Remove(aa => aa.Alias(aliasName).Index(currentIndexName)));
}
[/csharp]

The NEST Reindex helper creates a new index and requires a mapping at creation time, which is not what we need here. Once our version runs, it grabs all the old data. Meanwhile, if you have data being pumped into ElasticSearch, your application should be writing it to both indexes.

elasticsearch-reindex

At this point you would use some feature toggles/flags to test out the new index's mappings. If all is well, switch over the read alias.
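The read-alias swap is itself an atomic, single _aliases request; a sketch using the example index names:

```
POST /_aliases
{
  "actions": [
    { "remove": { "index": "datav1", "alias": "data_r" } },
    { "add":    { "index": "datav2", "alias": "data_r" } }
  ]
}
```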

elasticsearch-reindex-complete

DONE!

If you made it this far and want to know what the rollback plan is: well, we didn't delete anything, so just swap the alias back over. Ideally, if you used the feature toggle/flag, you have already reduced your rollback risk. We are testing this out on Thursday, so we will see how it works with half a million records 🙂


S.L.A.B. To ElasticSearch

I always like to find new ways to avoid writing the same code twice. Logging is one of those features that every application needs, and every application does differently. Do you log to SQL or a text file? For how long? In what format?

The Semantic Logging Application Block (SLAB) supports SQL, Azure, ElasticSearch, or any other "sink" you want to write. It supports buffering and retries for when a sink isn't available. Awesome: all the stuff I want but don't want to write my own version of. More details on SLAB here: Semantic Logging Application Block @ CodePlex

So how do you use this wonderful tool?

1. Install ElasticSearch. It is more like an XCopy deploy than an MSI install. Installing ElasticSearch @ ElasticSearch.org

2. Install the template from the ElasticSearch Sink to help ElasticSearch understand your log messages.

3. Download the NuGet package:

PM> Install-Package EnterpriseLibrary.SemanticLogging.Elasticsearch

4. Put this Gist into your code somewhere and point it to your ElasticSearch server. Make sure your log type matches the one used in the ElasticSearch template. Best to just use "etw" as below.

[gist id="7e367d7bf5fbcab5129c" synhi=true lang="csharp"]
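If the Gist is unavailable, the wiring typically looks something like this minimal sketch. To be clear, this is not the Gist's contents: the event source name, index, and connection string are all assumptions for illustration.

[csharp]
using System;
using System.Diagnostics.Tracing;

[EventSource(Name = "MyCompany-MyApp")] // name is an assumption
public sealed class LoggingEvents : EventSource
{
    public static readonly LoggingEvents Log = new LoggingEvents();

    // Helper that flattens the exception; EventSource payloads must be primitives
    [NonEvent]
    public void Error(Exception ex) { ErrorMessage(ex.ToString()); }

    [Event(1, Level = EventLevel.Error)]
    public void ErrorMessage(string message) { if (IsEnabled()) WriteEvent(1, message); }
}

// At application startup: attach an in-process listener and ship events to
// ElasticSearch with the "etw" type, matching the template from step 2.
// var listener = new ObservableEventListener();
// listener.LogToElasticsearch("MyApp", "http://localhost:9200", "logstash", "etw");
// listener.EnableEvents(LoggingEvents.Log, EventLevel.LogAlways, Keywords.All);
[/csharp]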

5. Log an error

[csharp]LoggingEvents.Log.Error(ex);
[/csharp]

6. Done.

Next post I will show how to use two different ElasticSearch plugins to monitor and view your errors.