Elasticsearch (ES) is a distributed, scalable search and analytics engine that enables fast data retrieval. It exposes a RESTful API, has Java and Python client libraries, and can be extended with plugins such as the ZooKeeper cluster integration plugin.
In this post I'll deploy a small Elasticsearch cluster consisting of one Nginx load balancer, two ES client nodes, three ES master nodes and two ES data nodes.
The roles of the three ES node types are:
- Client nodes: act as load balancers that route queries and indexing requests. The client nodes do not hold any data.
- Data nodes: hold data, merge segments and execute queries. The data nodes are the main workers.
- Master nodes: manage the cluster and elect a master among themselves using unicast discovery. The master nodes hold configuration data and the mappings of all the indexes in the cluster.
By default, a simple one-node deployment configures the ES server to be both a master and a data node. To scale further, however, you'll need multiple data nodes, at least three masters (to prevent split-brain scenarios) and two client nodes to route the requests and results.
Installing and configuring an ES cluster is rather simple. First download the Oracle JRE and install it on all ES nodes:
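The reason for three masters comes down to majority arithmetic: a master can only be elected when more than half of the master-eligible nodes can see each other. The sketch below works that out for a few cluster sizes. It assumes a Zen-discovery-era release, where this number is what you would set in `discovery.zen.minimum_master_nodes`; the helper name `quorum` is made up for this post.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest majority of the master-eligible nodes."""
    return master_eligible_nodes // 2 + 1


# For each cluster size, show the majority needed and how many
# master-eligible nodes can fail before no majority is left.
for n in (1, 2, 3, 5):
    print(f"{n} masters: quorum {quorum(n)}, tolerates {n - quorum(n)} failure(s)")
```

With two masters the quorum is also two, so losing either one halts elections. Three is the smallest count that survives a single master failure without allowing two halves of the cluster to each elect their own master.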
With Java installed, download and install Elasticsearch on all ES nodes:
Next configure the different types of ES nodes, starting with the three masters:
The two data nodes:
And finally the two client nodes (note how the main difference between the node types is the node.data and node.master stanzas):
On Ubuntu, Elasticsearch ships with configurable defaults in /etc/default/elasticsearch. Let's set the heap size to 2GB and raise the maximum number of open file descriptors on all nodes in the cluster:
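As a minimal sketch of that difference, the Python snippet below renders the node.master and node.data lines for each node type, following the role descriptions at the top of the post. The `ROLES` table and `render_role_settings` helper are hypothetical names, and a real elasticsearch.yml would also carry the cluster name, node name and discovery settings, which are not covered here.

```python
# Role flags per node type, as described earlier in the post:
# masters manage the cluster, data nodes hold data, clients do neither.
ROLES = {
    "master": {"node.master": True, "node.data": False},
    "data": {"node.master": False, "node.data": True},
    "client": {"node.master": False, "node.data": False},
}


def render_role_settings(role: str) -> str:
    """Render the two role lines for one node type in YAML syntax."""
    settings = ROLES[role]
    return "\n".join(f"{key}: {str(value).lower()}" for key, value in settings.items())


for role in ROLES:
    print(f"# {role} node")
    print(render_role_settings(role))
```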
Start all the masters, data nodes and client nodes and watch the formation of the cluster and election of a leading master:
Fail the current master and observe the election of a new one:
Elasticsearch exposes a great API to check the cluster status or gather various metrics:
And finally, to add and retrieve an index:
We can connect directly to the two client nodes to perform operations, or front them with a load balancer:
To test, connect to the LB:
To export an index from ES and import it into a different ES server, we can use a tool called elasticdump:
Here are a few other useful examples:
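One way to use that API from a script is to hit the `_cluster/health` endpoint and reduce the JSON it returns to a single line. The sketch below uses only the standard library; the function names are made up for this post, and the sample body is hand-written to match the seven-node layout described above rather than captured from a live cluster.

```python
import json
import urllib.request


def health_url(host: str, port: int = 9200) -> str:
    """Build the cluster health URL for one node (9200 is the default HTTP port)."""
    return f"http://{host}:{port}/_cluster/health"


def summarize_health(body: str) -> str:
    """Reduce a cluster health JSON response to a one-line summary."""
    health = json.loads(body)
    return (
        f"{health['status']}: {health['number_of_nodes']} nodes, "
        f"{health['number_of_data_nodes']} data"
    )


def fetch_health(host: str) -> str:
    """Query a live node; this needs network access to the cluster."""
    with urllib.request.urlopen(health_url(host), timeout=5) as response:
        return summarize_health(response.read().decode("utf-8"))


# Hand-written sample shaped like a health response, not real output.
SAMPLE = '{"status": "green", "number_of_nodes": 7, "number_of_data_nodes": 2}'
print(summarize_health(SAMPLE))
```

Against a running cluster you would call `fetch_health` with the address of any node, for example one of the client nodes.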
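To illustrate the shape of those calls, the sketch below builds (without sending) the HTTP requests that store one document in an index and read it back. The host, index, type and document are hypothetical, and the `/{index}/{type}/{id}` path assumes an older release that still uses mapping types; on newer releases the type segment would be `_doc`.

```python
import json
import urllib.request


def index_request(base_url, index, doc_type, doc_id, document):
    """Build (but do not send) a PUT request that stores one JSON document."""
    return urllib.request.Request(
        f"{base_url}/{index}/{doc_type}/{doc_id}",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def get_request(base_url, index, doc_type, doc_id):
    """Build (but do not send) a GET request that retrieves the same document."""
    return urllib.request.Request(f"{base_url}/{index}/{doc_type}/{doc_id}", method="GET")


# Hypothetical host, index, type and document, chosen only for illustration.
put = index_request("http://es-client-1:9200", "blog", "post", 1, {"title": "hello"})
get = get_request("http://es-client-1:9200", "blog", "post", 1)
print(put.get_method(), put.full_url)
print(get.get_method(), get.full_url)
```

Passing either request to `urllib.request.urlopen` would send it to the cluster; indexing a document into an index that does not exist yet normally creates the index on the fly.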
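What the load balancer does for us here is, in its simplest form, round-robin: each new request goes to the next client node in turn. The snippet below is a client-side stand-in for that idea, not the Nginx configuration itself, and the node addresses are hypothetical.

```python
from itertools import cycle

# Hypothetical addresses for the two client nodes.
CLIENT_NODES = ["http://es-client-1:9200", "http://es-client-2:9200"]


def round_robin(nodes):
    """Return a function that yields the next node on each call, wrapping around."""
    pool = cycle(nodes)
    return lambda: next(pool)


next_node = round_robin(CLIENT_NODES)
picked = [next_node() for _ in range(4)]
print(picked)
```

A real load balancer adds what this sketch leaves out, such as skipping a client node that stops responding.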
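As a rough sketch of such a copy, the helper below assembles the two elasticdump invocations usually needed: one for the mapping and one for the data. It only builds the argument lists and does not run anything. The server addresses and index name are hypothetical, and it assumes elasticdump is installed and on the PATH.

```python
def elasticdump_commands(source_url, target_url, index):
    """Return the argv lists for copying one index: mapping first, then data."""
    return [
        [
            "elasticdump",
            f"--input={source_url}/{index}",
            f"--output={target_url}/{index}",
            f"--type={kind}",
        ]
        for kind in ("mapping", "data")
    ]


# Hypothetical source and target servers and index name.
commands = elasticdump_commands("http://es-old:9200", "http://es-new:9200", "blog")
for argv in commands:
    print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` to execute the copy; the mapping goes first so the target index has the right field types before documents arrive.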