Getting started with ElasticSearch

Posted by Jai on March 15, 2013

A quick introduction to ElasticSearch, an open source, distributed and RESTful search engine based on Lucene, and how easily you can start working with it.

Search Solution

As mentioned in the earlier post, Choosing the right search solution for your site, you should first analyze which search solution best suits your requirements. In the sections below, we cover some of the functionality and capabilities offered by ElasticSearch.

ElasticSearch

In brief, ElasticSearch is an open source, distributed, schema-less and RESTful search engine based on Lucene. Some typical features of ES are,

Distributed: a search runs across multiple shards/indices and the results are aggregated.

Schema Less: document oriented. Documents are JSON, and field mappings are detected automatically.

RESTful: exposes a REST interface.

Faceted Search: supports navigational (drill-down) search.

Replication: supports index replication.

Fail over: replication and the distributed design provide built-in failover.

Near Real time: supports near real time updates.

Versioning: every document carries a version number, which supports optimistic concurrency control on updates. Older versions of a document are not kept.

Percolation: lets you register queries against an index and then find which registered queries match a given document.

Index Aliasing: lets you create aliases for indices.
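
Percolation is probably the least familiar item in the list above, so here is a toy sketch of the idea in plain Java. It is not the ES percolator API: the class PercolatorSketch, its method names and the single-term "query" model are all hypothetical, made up for illustration. The point is only the reversal of the usual flow: queries are stored first, and a document is then matched against them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model of percolation (hypothetical names, not an ES class).
class PercolatorSketch {
    // query name -> the single term a document must contain to match
    private final Map<String, String> registered = new LinkedHashMap<>();

    // Register a query up front, before any document is seen.
    void register(String queryName, String requiredTerm) {
        registered.put(queryName, requiredTerm.toLowerCase(Locale.ROOT));
    }

    // Return the names of all registered queries that match the document.
    List<String> percolate(String documentText) {
        String[] tokens = documentText.toLowerCase(Locale.ROOT).split("[^a-z0-9]+");
        Set<String> terms = new HashSet<>(Arrays.asList(tokens));
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> query : registered.entrySet()) {
            if (terms.contains(query.getValue())) {
                matches.add(query.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        PercolatorSketch percolator = new PercolatorSketch();
        percolator.register("elastic-alert", "elastic");
        percolator.register("hadoop-alert", "hadoop");
        System.out.println(percolator.percolate("trying out Elastic Search"));
    }
}
```

A typical use is alerting: store one query per subscriber, then percolate each incoming document to find out who should be notified.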

Installing ElasticSearch

Download the latest version of ES from the download page on the ES site.

Refer to the installation guide for environment-specific steps.

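Extract the zip file to a destination folder and go to the installation folder. To start the process in the foreground (the 0.20.x releases used in this post need the -f flag; from 1.0 onwards the foreground is the default),

$ bin/elasticsearch -f

To start the process in the background (0.20.x detaches by default, so no trailing & is needed),

$ bin/elasticsearch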

Running ElasticSearch as a service:

Download the Service Wrapper from its GitHub repository.

Check the README file for how to install the service wrapper. Once it is in place under bin/service,

$ bin/service/elasticsearch start
$ bin/service/elasticsearch stop

Install plugin

Refer to the Plugin Guide page for a detailed list of available plugins.

To install the ElasticSearch Head plugin, go to the installation directory

$ bin/plugin -install mobz/elasticsearch-head

To browse the installed plugin, open http://localhost:9200/_plugin/head/

Configure ES server

Refer to the Configuration page for all the configuration options.

To change the server configuration, go to the installation directory

$ vi config/elasticsearch.yml

Change the relevant settings for your environment, e.g. the cluster name (cluster.name: localtestsearch). Restart the server for the changes to take effect, then browse with the head plugin to see them.

Testing ES server from command line

Refer to the Index section of the online guide for creating an index and adding documents.


#Create Index
$ curl -XPUT 'http://localhost:9200/twitter/'

#Add document
$ curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
    "tweet" : {
        "user" : "kimchy",
        "post_date" : "2009-11-15T14:12:12",
        "message" : "trying out Elastic Search"
    }
}'

#Get document by id
$ curl -XGET 'http://localhost:9200/twitter/tweet/1'

#Search document
$ curl -XGET 'http://localhost:9200/twitter/tweet/_search?q=user:kimchy'
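
The same four calls can be sketched in Java with only the JDK, which makes the REST structure (index / type / id) explicit. This is a minimal sketch, assuming Java 11+ for java.net.http; the class EsRestRequests and its method names are hypothetical, and the Content-Type header is an addition that the curl commands above do not send. The requests are only built and printed here. Sending them would need HttpClient.send(...) against a running node.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Builds the REST requests from the curl examples (hypothetical helper class).
class EsRestRequests {
    // Assumed address of a locally running node, as in the curl examples.
    static final String BASE = "http://localhost:9200";

    // PUT /{index}/ creates the index.
    static HttpRequest createIndex(String index) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/"))
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // PUT /{index}/{type}/{id} with a JSON body adds (or replaces) a document.
    static HttpRequest indexDocument(String index, String type, String id, String json) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/" + type + "/" + id))
                .header("Content-Type", "application/json") // assumption, not in the curl call
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    // GET /{index}/{type}/{id} fetches a document by id.
    static HttpRequest getDocument(String index, String type, String id) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/" + type + "/" + id))
                .GET()
                .build();
    }

    // GET /{index}/{type}/_search?q=... runs a query-string search.
    static HttpRequest search(String index, String type, String query) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/" + type + "/_search?q=" + q))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        String json = "{ \"tweet\" : { \"user\" : \"kimchy\", "
                + "\"post_date\" : \"2009-11-15T14:12:12\", "
                + "\"message\" : \"trying out Elastic Search\" } }";
        HttpRequest[] requests = {
            createIndex("twitter"),
            indexDocument("twitter", "tweet", "1", json),
            getDocument("twitter", "tweet", "1"),
            search("twitter", "tweet", "user:kimchy")
        };
        for (HttpRequest request : requests) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```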

Testing ES server using head plugin

Browse the index data using the Head plugin at http://localhost:9200/_plugin/head/.

Accessing from Java

As a Java developer, you would probably prefer to connect to the server from a quick test case or Java application. Here is how to start using the Java API,

Maven integration

Use the ElasticSearch Java API through a Maven dependency,


<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>0.20.5</version>
</dependency>

Using Java API

To connect to the locally installed ES server,

create client
create index and set settings and mappings for document type
add documents to the index
get document


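//Imports (package names as of ElasticSearch 0.20.x)
import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
import org.elasticsearch.action.admin.indices.create.CreateIndexRequestBuilder;
import org.elasticsearch.action.get.GetRequestBuilder;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequestBuilder;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.common.xcontent.XContentBuilder;

//Create Client (9300 is the transport port, not the 9200 HTTP port)
Settings settings = ImmutableSettings.settingsBuilder().put("cluster.name", "localtestsearch").build();
TransportClient transportClient = new TransportClient(settings);
transportClient = transportClient.addTransportAddress(new InetSocketTransportAddress("localhost", 9300));
Client client = transportClient;

//Create Index (custom settings and mappings could be set on the builder before executing)
CreateIndexRequestBuilder createIndexRequestBuilder = client.admin().indices().prepareCreate(indexName);
createIndexRequestBuilder.execute().actionGet();

//Add documents
IndexRequestBuilder indexRequestBuilder = client.prepareIndex(indexName, documentType, documentId);
//build json object (jsonBuilder() can throw IOException)
XContentBuilder contentBuilder = jsonBuilder().prettyPrint().startObject();
contentBuilder.field("name", "jai");
contentBuilder.endObject();
indexRequestBuilder.setSource(contentBuilder);
IndexResponse indexResponse = indexRequestBuilder.execute().actionGet();

//Get document
GetRequestBuilder getRequestBuilder = client.prepareGet(indexName, documentType, documentId);
getRequestBuilder.setFields(new String[]{"name"});
GetResponse getResponse = getRequestBuilder.execute().actionGet();
String name = getResponse.field("name").getValue().toString();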

In later posts, we will discuss the more advanced options of the Java API.

Online documentation

Some quick links to the ES online documentation,

Glossary: First get familiar with the terms in the Glossary to understand the concepts better.
Guide: The online Guide details the different sections, along with quick examples to start with.
Blog: The Blog section covers regular updates.
Tutorials: The Tutorials section covers common concepts around ES usage.
API docs: Check the Java API section, which includes sample examples.
Video: Check the Video section covering various ES topics.
Github: Check the different ElasticSearch projects on GitHub, which also include sample examples.

Starting with basic concepts

Lucene Concepts

Lucene is a text search engine library.

Get familiar with the basic Lucene terminology,
Document: collection of fields
Field: string-based key-value pair
Collection: set of documents
Precision: fraction of the returned documents that are relevant
Recall: fraction of the relevant documents that are returned
Inverted index: maps each term to the list of documents that contain it
Lucene blocks: IndexWriter, Analyzer, Tokenizer, QueryParser, Query, IndexSearcher etc.
Index and Segments: an index is written as immutable segments.
Score: relevance of each document matching the query
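
To make a few of these terms concrete, here is a toy inverted index in plain Java. It is only a sketch of the concept, not how Lucene implements it: the class InvertedIndexSketch, the crude analyze() step (lowercase, split on non-alphanumerics) and the hand-picked "relevant" set are all hypothetical. Precision is computed as the fraction of returned documents that are relevant, and recall as the fraction of relevant documents that are returned.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy inverted index (hypothetical class, not part of Lucene).
class InvertedIndexSketch {
    // term -> sorted ids of the documents that contain the term
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Stand-in for an Analyzer: lowercase, then split on non-alphanumerics.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Index a document: record its id under every term it contains.
    void add(int docId, String text) {
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Single-term lookup: the ids of all documents containing the term.
    SortedSet<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySortedSet());
    }

    // Fraction of the returned documents that are relevant.
    static double precision(Set<Integer> returned, Set<Integer> relevant) {
        if (returned.isEmpty()) return 0.0;
        Set<Integer> hits = new HashSet<>(returned);
        hits.retainAll(relevant);
        return (double) hits.size() / returned.size();
    }

    // Fraction of the relevant documents that were returned.
    static double recall(Set<Integer> returned, Set<Integer> relevant) {
        if (relevant.isEmpty()) return 0.0;
        Set<Integer> hits = new HashSet<>(returned);
        hits.retainAll(relevant);
        return (double) hits.size() / relevant.size();
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "trying out Elastic Search");
        index.add(2, "Lucene is a text search engine library");
        index.add(3, "Elastic clusters scale out");

        SortedSet<Integer> returned = index.search("search"); // documents 1 and 2
        Set<Integer> relevant = Set.of(1); // hand-labelled for this example
        System.out.println("hits=" + returned
                + " precision=" + precision(returned, relevant)
                + " recall=" + recall(returned, relevant));
        // prints: hits=[1, 2] precision=0.5 recall=1.0
    }
}
```

Here the query finds the one relevant document (recall 1.0) but also returns an irrelevant one (precision 0.5), which is the usual trade-off between the two measures.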

Refer to the Lucene online Documentation and Wiki for further details.

ElasticSearch Concepts

Document: JSON document with data
Field: string-based key-value pair
Type: like a table in a relational database
Index: like a database with multiple types
Mapping: like a schema for a database
Distributed nature: nodes, shards, replicas etc.
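
As a rough illustration of how an index is spread over shards: ES picks the shard for a document by hashing a routing value (the document id by default) modulo the number of primary shards. The sketch below shows only that idea. The class ShardRoutingSketch is hypothetical, and String.hashCode() is a stand-in for the hash function ES actually uses, so the shard numbers it produces will not match a real cluster.

```java
// Toy model of document-to-shard routing (hypothetical class, not ES code).
class ShardRoutingSketch {
    // shard = hash(routing value) mod number of primary shards.
    // String.hashCode() is an assumption standing in for ES's own hash.
    static int shardFor(String documentId, int numberOfShards) {
        return Math.floorMod(documentId.hashCode(), numberOfShards);
    }

    public static void main(String[] args) {
        int numberOfShards = 5; // assumed primary shard count for the example
        for (String id : new String[] {"1", "2", "3", "tweet-42"}) {
            System.out.println("document " + id + " -> shard " + shardFor(id, numberOfShards));
        }
    }
}
```

Because the shard depends on the shard count, the number of primary shards of an index is fixed at creation time, while the number of replicas per shard can be changed later.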

Refer to the ElasticSearch Glossary section for further details.

This entry was posted on March 15, 2013 at 3:46 am and is filed under ElasticSearch, Java, Maven. Tagged: ElasticSearch, Java, Lucene, Search.

5 Responses to “Getting started with ElasticSearch”

vatsalad said

March 15, 2013 at 9:44 am
Hi Jaibeer, Looking forward to more posts on the next steps to take after this. Thanks for posting this one.
regards
Vatsala

Praveenkumar Arepalli said

August 30, 2013 at 8:47 pm
Hi Jaibeer,
Bit confused with many classes
can you organize them properly
I am looking for elastic search java API
please share any document with me

Regards & Thanks
Praveenkumar Arepalli

- Jai said
  
  August 30, 2013 at 9:54 pm
  Easy way would be to download the source code and have a look, https://github.com/elasticsearch/elasticsearch
  
Jai’s Weblog – Tech, Security & Fun…
Jaibeer Malik