Sunday, February 24, 2019

ElasticSearch - Part 1 - How To Deploy Single Node Cluster On AWS EC2

This post shows you how to deploy and manage your own ElasticSearch cluster on AWS EC2.

Environment:

  • Instance: t2.2xlarge (32GB Mem, 8vCPUs)
  • ElasticSearch version: 6.6.1

This post assumes your AWS environment is already set up, so you can launch an EC2 instance and connect to it over SSH.

Provision AWS EC2 instance:

Elasticsearch runs on various operating systems, including CentOS, Red Hat, Ubuntu, and Amazon Linux. We suggest using the latest Amazon Linux AMI.

Choose the "t2.2xlarge" instance type, which provides 8 vCPUs, 32 GB of memory, and an EBS volume for data; this is a reasonable starting point. Go ahead and launch the instance. Two things to note:
  1. The security group opens these ports:
    • port 22: SSH
    • port 9200: ElasticSearch requests
  2. The storage:
    • 30 GB; you can expand it later
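
If you prefer to script the security group rules instead of clicking through the console, the two ports above could be opened with the AWS CLI. The sketch below is a dry run that only prints the commands. The helper name `sg_ingress_cmd`, the group ID `sg-EXAMPLE`, and the CIDR `203.0.113.10/32` are placeholders for illustration, not values from this setup.

```shell
# sg_ingress_cmd GROUP_ID PORT CIDR  (hypothetical helper)
# Prints the AWS CLI command that would allow inbound TCP traffic on PORT
# from CIDR. Nothing is executed against AWS; review the output, then run it.
sg_ingress_cmd() {
    group_id="$1"
    port="$2"
    cidr="$3"
    printf 'aws ec2 authorize-security-group-ingress --group-id %s --protocol tcp --port %s --cidr %s\n' \
        "$group_id" "$port" "$cidr"
}

# Placeholder values: replace with your own security group and client address.
sg_ingress_cmd sg-EXAMPLE 22 203.0.113.10/32     # SSH
sg_ingress_cmd sg-EXAMPLE 9200 203.0.113.10/32   # ElasticSearch requests
```

Restricting the CIDR to your own address, rather than 0.0.0.0/0, keeps port 9200 from being reachable by the whole internet.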


Install ElasticSearch

Once the EC2 instance is up and running, we can start installing ElasticSearch.

1. Log in to the EC2 instance:
$ ssh -i ~/.ssh/tony-aws.pem ec2-user@54.1xx.1xx.xxx
$ sudo su -
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs         16G     0   16G   0% /dev
tmpfs            16G     0   16G   0% /dev/shm
tmpfs            16G  472K   16G   1% /run
tmpfs            16G     0   16G   0% /sys/fs/cgroup
/dev/xvda1       30G  1.2G   29G   4% /
tmpfs           3.2G     0  3.2G   0% /run/user/0
tmpfs           3.2G     0  3.2G   0% /run/user/1000


2. Install Java 1.8.0
$ yum install java-1.8.0-openjdk
Installed:
  java-1.8.0-openjdk.x86_64 1:1.8.0.191.b12-0.amzn2


3. Download ElasticSearch 6.6.1
$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.6.1.rpm
--2019-02-25 01:02:20--  https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.6.1.rpm
Resolving artifacts.elastic.co (artifacts.elastic.co)... 151.101.250.222, 2a04:4e42:3b::734
Connecting to artifacts.elastic.co (artifacts.elastic.co)|151.101.250.222|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 114067654 (109M) [application/octet-stream]
Saving to: ‘elasticsearch-6.6.1.rpm’

2019-02-25 01:02:22 (115 MB/s) - ‘elasticsearch-6.6.1.rpm’ saved [114067654/114067654]
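
Before installing, it can be worth checking that the download is intact. The sketch below assumes a `.sha512` checksum file is published next to the RPM (as far as I know, Elastic provides one at the same URL with `.sha512` appended) and that `sha512sum` is available on the instance. `verify_sha512` is a hypothetical helper name.

```shell
# verify_sha512 CHECKSUM_FILE  (hypothetical helper)
# Runs "sha512sum -c" from the directory that holds the checksum file, so the
# file name recorded inside it resolves to the downloaded package.
# Returns 0 when the checksum matches, non-zero otherwise.
verify_sha512() {
    ( cd "$(dirname "$1")" && sha512sum -c "$(basename "$1")" )
}

# Usage (assumed URL; run in the directory where the RPM was saved):
#   wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.6.1.rpm.sha512
#   verify_sha512 ./elasticsearch-6.6.1.rpm.sha512
#
# The NOKEY warning printed by rpm during installation should go away if
# Elastic's signing key is imported first:
#   rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
```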

4. Install the Elasticsearch RPM package on the EC2 instance.
$ rpm -iv elasticsearch-6.6.1.rpm
warning: elasticsearch-6.6.1.rpm: Header V4 RSA/SHA512 Signature, key ID d88e42b4: NOKEY
Preparing packages...
Creating elasticsearch group... OK
Creating elasticsearch user... OK
elasticsearch-0:6.6.1-1.noarch

### NOT starting on installation, please execute the following statements to configure elasticsearch service to start automatically using systemd
 sudo systemctl daemon-reload
 sudo systemctl enable elasticsearch.service

### You can start elasticsearch service by executing
 sudo systemctl start elasticsearch.service
Created elasticsearch keystore in /etc/elasticsearch

5. By default the Elasticsearch service doesn’t log information in the systemd journal. To enable journalctl logging, the "--quiet" option must be removed from the ExecStart command line in the elasticsearch.service file.
$ vim /usr/lib/systemd/system/elasticsearch.service
# Remove --quiet by Tony
#ExecStart=/usr/share/elasticsearch/bin/elasticsearch -p ${PID_DIR}/elasticsearch.pid --quiet
ExecStart=/usr/share/elasticsearch/bin/elasticsearch -p ${PID_DIR}/elasticsearch.pid
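
The same change can be made without opening an editor. This is a minimal sketch, assuming the unit file has a single ExecStart line that ends with " --quiet" as in the RPM used here. `strip_quiet` is a hypothetical helper name.

```shell
# strip_quiet UNIT_FILE  (hypothetical helper)
# Saves a backup as UNIT_FILE.bak, then rewrites UNIT_FILE with " --quiet"
# removed from the ExecStart line. All other lines are copied unchanged.
strip_quiet() {
    unit_file="$1"
    cp "$unit_file" "$unit_file.bak"
    sed '/^ExecStart=/ s/ --quiet//' "$unit_file.bak" > "$unit_file"
}

# Usage on the real unit file (as root), followed by a reload so that
# systemd picks up the edited file:
#   strip_quiet /usr/lib/systemd/system/elasticsearch.service
#   systemctl daemon-reload
```

Note that a package upgrade may overwrite this file, in which case the edit has to be applied again.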

6. Configure Elasticsearch
Elasticsearch keeps its runtime configuration in /etc/elasticsearch and, by default, loads its settings from /etc/elasticsearch/elasticsearch.yml. The format of this config file is explained in Configuring Elasticsearch (https://www.elastic.co/guide/en/elasticsearch/reference/current/settings.html).

Update the bind host:
$ vim /etc/elasticsearch/elasticsearch.yml
Locate "network.host: 192.168.0.1", and update it to
network.host: 0.0.0.0

This makes ElasticSearch listen on all network interfaces, so it accepts requests from any host that the security group lets through on port 9200.

Update the cluster name:
$ vim /etc/elasticsearch/elasticsearch.yml
Locate "cluster.name: my-application", and update it to
cluster.name: tony-es-cluster
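
Both edits to elasticsearch.yml can also be scripted. The sketch below assumes each setting sits on its own line, possibly commented out with a leading "#" (which is how I believe the stock file ships them). `set_es_option` is a hypothetical helper, not an Elasticsearch tool.

```shell
# set_es_option FILE KEY VALUE  (hypothetical helper)
# If FILE already has a "KEY:" line (commented out or not), replace it with
# "KEY: VALUE"; otherwise append the setting at the end of the file.
set_es_option() {
    file="$1"
    key="$2"
    value="$3"
    if grep -Eq "^#? *${key}:" "$file"; then
        sed -E "s|^#? *${key}:.*|${key}: ${value}|" "$file" > "$file.tmp"
        # Write back through the original file so its owner and mode are kept.
        cat "$file.tmp" > "$file"
        rm -f "$file.tmp"
    else
        printf '%s: %s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (as root):
#   set_es_option /etc/elasticsearch/elasticsearch.yml network.host 0.0.0.0
#   set_es_option /etc/elasticsearch/elasticsearch.yml cluster.name tony-es-cluster
```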

Setting the heap size:
By default, Elasticsearch tells the JVM to use a heap with a minimum and maximum size of 1 GB. That is rarely enough for real workloads, so it is important to set the heap size explicitly. Elasticsearch assigns the entire heap specified in jvm.options via the Xms (minimum heap size) and Xmx (maximum heap size) settings, and the two should be set to the same value.
The values for these settings depend on the amount of RAM available on the instance. A rule of thumb is "Set Xmx to no more than 50% of your physical RAM, to ensure that there is enough physical RAM left for kernel file system caches." In our case, with 32 GB of RAM, the value is 16 GB.
$ vim /etc/elasticsearch/jvm.options
Locate the "-Xms1g" and "-Xmx1g" lines, and update them to
-Xms16g
-Xmx16g
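
The 50% rule of thumb is simple arithmetic, so it can be computed from the instance itself. This is a rough sketch, assuming a Linux host with /proc/meminfo. `heap_mb_for_kb` is a hypothetical helper, and the 31 GB cap is my own conservative choice for staying under the roughly 32 GB point where the JVM stops using compressed object pointers.

```shell
# heap_mb_for_kb MEMTOTAL_KB  (hypothetical helper)
# Prints half of the given memory size, in MB, capped at 31 GB.
heap_mb_for_kb() {
    total_kb="$1"
    half_mb=$(( total_kb / 2048 ))   # kB -> MB is /1024; half of that is /2048
    cap_mb=$(( 31 * 1024 ))          # cap chosen for this sketch
    if [ "$half_mb" -gt "$cap_mb" ]; then
        half_mb="$cap_mb"
    fi
    echo "$half_mb"
}

# Print suggested jvm.options lines for this machine (Linux only).
if [ -r /proc/meminfo ]; then
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    heap_mb=$(heap_mb_for_kb "$mem_kb")
    printf '%s\n' "-Xms${heap_mb}m" "-Xmx${heap_mb}m"
fi
```

On a 32 GB instance the kernel reports slightly less than 32 GB as MemTotal, so the result will come out a little under 16384 MB; rounding to -Xms16g and -Xmx16g as above is fine.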


7. Start ElasticSearch
$ systemctl start elasticsearch.service
$ systemctl status elasticsearch.service
● elasticsearch.service - Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2019-02-25 01:32:29 UTC; 2s ago
     Docs: http://www.elastic.co
 Main PID: 13803 (java)
   CGroup: /system.slice/elasticsearch.service
           └─13803 /bin/java -Xms16g -Xmx16g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Des.networkaddress.cache.ttl=60 -Des.networkaddr.


8. Verify with an API request
$ curl -X GET http://54.1xx.1xx.1xx:9200/
{
  "name" : "ZAvN4SU",
  "cluster_name" : "tony-es-cluster",
  "cluster_uuid" : "bYSZ8nkqS-mnI8x2F3eHhQ",
  "version" : {
    "number" : "6.6.1",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "1fd8f69",
    "build_date" : "2019-02-13T17:10:04.160291Z",
    "build_snapshot" : false,
    "lucene_version" : "7.6.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
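
For scripted checks, the fields of interest can be pulled out of that response. The sketch below is a crude line-based extraction, not a real JSON parser, and it assumes the pretty-printed layout shown above with one field per line. `json_string_field` is a hypothetical helper name.

```shell
# json_string_field FIELD  (hypothetical helper)
# Reads the response on stdin and prints the value of the first line that
# looks like:  "FIELD" : "value"
json_string_field() {
    sed -n "s/.*\"$1\" *: *\"\([^\"]*\)\".*/\1/p" | head -n 1
}

# Usage from the instance itself:
#   curl -s http://localhost:9200/ | json_string_field number
#   curl -s http://localhost:9200/ | json_string_field cluster_name
```

With the response above, the two calls would print 6.6.1 and tony-es-cluster.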


Since we enabled journal logging for ElasticSearch, you can now list its entries:
$ journalctl --unit elasticsearch
-- Logs begin at Mon 2019-02-25 00:58:47 UTC, end at Mon 2019-02-25 01:32:29 UTC. --

Feb 25 01:26:55 ip-172-31-88-104.ec2.internal systemd[1]: Started Elasticsearch.
Feb 25 01:26:55 ip-172-31-88-104.ec2.internal systemd[1]: Starting Elasticsearch...
Feb 25 01:28:13 ip-172-31-88-104.ec2.internal systemd[1]: Stopping Elasticsearch...
Feb 25 01:28:14 ip-172-31-88-104.ec2.internal systemd[1]: Stopped Elasticsearch.
Feb 25 01:29:38 ip-172-31-88-104.ec2.internal systemd[1]: Started Elasticsearch.
Feb 25 01:29:38 ip-172-31-88-104.ec2.internal systemd[1]: Starting Elasticsearch...
Feb 25 01:31:01 ip-172-31-88-104.ec2.internal systemd[1]: Stopping Elasticsearch...
Feb 25 01:31:01 ip-172-31-88-104.ec2.internal systemd[1]: Stopped Elasticsearch.
Feb 25 01:32:29 ip-172-31-88-104.ec2.internal systemd[1]: Started Elasticsearch.
Feb 25 01:32:29 ip-172-31-88-104.ec2.internal systemd[1]: Starting Elasticsearch...

At this point, you have a running ElasticSearch 6.6.1 single-node cluster.
