When using Elasticsearch (or Elastic), reindexing is an important task to deal with. It must be done with zero downtime, and nothing should be visible to users.
Reindexing can be useful in many cases, such as:
- Type/mapping updates.
- New physical infrastructure with more (or less) nodes.
- Splitting an index into many others.
- Any type of cluster/nodes/indexes/configuration updates.
If you want more information, the documentation is very clear.
Here we are going to see how to achieve this zero downtime with Symfony and FOSElasticaBundle.
Elasticsearch (elastic)
To work with Elasticsearch, the first thing we need is to install it. As a developer, I very often run Elasticsearch from this Vagrant box or this Dockerfile and container.
Docker
Start the docker container:
Then open:
- http://localhost:9200/_cluster/health?pretty=true
- http://localhost:9200/_plugin/marvel
- http://localhost:9200/_plugin/paramedic
- http://localhost:9200/_plugin/HQ
- http://localhost:9200/_plugin/bigdesk
- http://localhost:9200/_plugin/head
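If you prefer the command line to the browser, the first URL can be checked with a small script. This is only a sketch: the extract_status and cluster_status helpers are hypothetical names introduced here, and ES_URL assumes the Docker address used above.

```shell
#!/bin/sh
# Assumption: Elasticsearch answers on the Docker address listed above.
ES_URL="${ES_URL:-http://localhost:9200}"

# Reads a _cluster/health JSON response on stdin and prints its "status" value.
extract_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Not called here because it needs a running node.
cluster_status() {
    curl -s "$ES_URL/_cluster/health?pretty=true" | extract_status
}
```

Once the container is up, calling cluster_status should print the cluster status reported by Elasticsearch (green, yellow or red).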
Vagrant
Start the Vagrant box:
Then open (with the default IP pattern config):
- http://10.0.0.11:9200/_plugin/marvel
- http://10.0.0.11:9200/_plugin/paramedic/
- http://10.0.0.11:9200/_plugin/head/
- http://10.0.0.11:9200/_plugin/bigdesk
- http://10.0.0.11:9200/_plugin/HQ/
FOSElasticaBundle
Install
…then…
Configuration
The important part of the configuration is the following:
You must define use_alias: true to tell FOSElasticaBundle to use an alias for your index.
This way, indexing and search queries will be performed using the alias and not the real index name.
Once the entire configuration (indexes, mapping, types,…) of your application is done, start the indexing process:
The real index name will follow this pattern: app_prod_YYYY-MM-DD-HHMMSS.
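To make the indexing step and the naming pattern concrete, here is a minimal sketch. It assumes a Symfony 2 layout where the console lives at app/console (adapt the path to your project), and timestamped_index_name is a hypothetical helper that only illustrates the pattern, not the bundle's own code.

```shell
#!/bin/sh
# Hypothetical helper: builds a name following the <alias>_YYYY-MM-DD-HHMMSS pattern.
timestamped_index_name() {
    alias_name="$1"
    printf '%s_%s\n' "$alias_name" "$(date +%Y-%m-%d-%H%M%S)"
}

# Assumption: Symfony 2 project, console at app/console.
# fos:elastica:populate is the FOSElasticaBundle command that indexes the data.
# Not called here because it needs the Symfony project.
populate() {
    php app/console fos:elastica:populate "$@"
}

# Prints something shaped like app_prod_2015-05-28-213059, with the current date and time.
timestamped_index_name app_prod
```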
When the indexing process is finished, we can see our app_prod alias on our app_prod_2015-05-28-213059 index:
The alias is created at the end of the (first) indexing process.
Zero downtime indexing
The magic is that with FOSElasticaBundle we can start a reindexing process simply by running the previous command again:
The command reindexes the data into a new index with a different name. More importantly, the previous index still exists and still carries our alias:
At the end of the reindexing process, the command changes the target of the alias to the new index and destroys the previous index:
This is how zero downtime reindexing is achieved with Symfony and FOSElasticaBundle.
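For the curious, the final step (moving the alias, then dropping the old index) can be reproduced by hand with the Elasticsearch _aliases API. The sketch below is not the bundle's actual code: alias_swap_body and swap_alias are hypothetical helpers, ES_URL assumes the Docker address, and the second index name is made up for the example.

```shell
#!/bin/sh
# Assumption: Elasticsearch answers on the Docker address used earlier.
ES_URL="${ES_URL:-http://localhost:9200}"

# Prints the JSON body of an alias switch: remove the alias from the old
# index and add it to the new one, both actions in a single request.
alias_swap_body() {
    alias_name="$1"; old_index="$2"; new_index="$3"
    printf '{"actions":[{"remove":{"index":"%s","alias":"%s"}},{"add":{"index":"%s","alias":"%s"}}]}\n' \
        "$old_index" "$alias_name" "$new_index" "$alias_name"
}

# Not called here because it needs a running node.
# Sends the switch, then deletes the old index only if the switch succeeded.
swap_alias() {
    curl -s -XPOST "$ES_URL/_aliases" -H 'Content-Type: application/json' \
        -d "$(alias_swap_body "$1" "$2" "$3")" &&
    curl -s -XDELETE "$ES_URL/$2"
}

# The second index name is hypothetical.
alias_swap_body app_prod app_prod_2015-05-28-213059 app_prod_2015-05-28-220000
```

Because both actions travel in one _aliases request, searches going through app_prod never hit a moment where the alias points nowhere.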
With Marvel we can follow the process on graphs:
On the Marvel shard allocation dashboard you can see and (re)play the history, automatically or step by step. This is really amazing:
Conclusion
Elasticsearch/Elastic is a fantastic tool to make search easy and awesome. Coupled with the other tools from the Elastic company (Marvel, Kibana, Logstash,…), the possibilities are limitless.