Jai’s Weblog – Tech, Security & Fun…


  • Jaibeer Malik


Posts Tagged ‘Search’

ElasticSearch: Text analysis for content enrichment

Posted by Jai on March 26, 2013


A text search solution is only as powerful as the text analysis capabilities it offers. Lucene is one such open source information retrieval library, offering many text analysis possibilities. In this post, we will cover some of the main text analysis features that ElasticSearch offers to enrich your search content.

Content Enrichment

Taking the example of a typical eCommerce site, serving the right content in search to the end customer is very important for the business. The text analysis strategy provided by a search solution plays a very big role in it. As a search user, I would like my query to automatically get some typical search behavior. The search:

  • should look for synonyms matching my query text
  • should match singular and plural words, or words that sound similar to the entered query text
  • should not allow searching on protected words
  • should allow search for words mixed with numeric or special characters
  • should not allow search on HTML tags
  • should allow search based on the proximity of the letters and the number of matching letters
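
Several of these wishes map onto standard ElasticSearch analysis components. The following is a minimal sketch, not the configuration from the full post: it builds the JSON settings for one custom analyzer as a Python dict. The names `catalog_text`, `catalog_synonyms`, `catalog_protected` and `catalog_stemmer` are hypothetical, and reading "protected words" as the `keyword_marker` filter (which shields words from stemming) is an assumption. Phonetic and n-gram matching are left out to keep the example short.

```python
import json


def build_analysis_settings(synonyms, protected_words):
    """Build index settings with one custom analyzer (illustrative names only)."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Expands or maps equivalent terms, e.g. "laptop, notebook".
                    "catalog_synonyms": {"type": "synonym", "synonyms": list(synonyms)},
                    # Marks words as keywords so the stemmer leaves them alone.
                    "catalog_protected": {"type": "keyword_marker", "keywords": list(protected_words)},
                    # Reduces singular and plural forms to a common stem.
                    "catalog_stemmer": {"type": "stemmer", "language": "english"},
                },
                "analyzer": {
                    "catalog_text": {
                        "type": "custom",
                        # Strips HTML tags before tokenizing.
                        "char_filter": ["html_strip"],
                        "tokenizer": "whitespace",
                        # word_delimiter splits tokens that mix letters, digits
                        # and special characters; keyword_marker must run
                        # before the stemmer to have any effect.
                        "filter": [
                            "word_delimiter",
                            "lowercase",
                            "catalog_synonyms",
                            "catalog_protected",
                            "catalog_stemmer",
                        ],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    settings = build_analysis_settings(["laptop, notebook"], ["elasticsearch"])
    print(json.dumps(settings, indent=2))
```

The printed JSON could be sent as the body of an index creation request. The order of the filter chain matters, so it is worth testing a chain like this against sample text before relying on it.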

Read the rest of this entry »

Posted in ElasticSearch, Java, Quality | 1 Comment »

ElasticSearch: Faceted Search for Hierarchical data

Posted by Jai on March 19, 2013


Faceted search is navigational search that lets a business clearly define the properties or characteristics of its product catalog and guide users to relevant products with minimum effort. Most available search solutions support this functionality nowadays. In this post we will cover how to implement faceted search for hierarchical data, using a flattened data approach, with ElasticSearch for a typical eCommerce platform.

Search Scenarios/Business Example:

An earlier post, Data Modeling approach for search content and tagging, explains the different characteristics of a typical eCommerce platform serving hierarchical data, in terms of the categorization and sub-categorization of data.

Take the example of such a typical eCommerce platform, where on one site you need to display navigational browsing of your hierarchical data based on some search solution. For example, you need to display products like Books, Clothes, etc. Each product has its own specific characteristics and can be categorized in different categories and sub categories.

Hierarchical Data:

The hierarchical data, in business terms, represents the taxonomy for your data: the way you characterize your data in the form of different category types, categories and sub categories for the product catalog.
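
One way to picture a flattened representation of such a taxonomy is to store, for every product, each ancestor path of its category as a separate token, and then count products per token to obtain facet values. The sketch below is only an illustration under that assumption, done in plain Python rather than in ElasticSearch; the `/` separator, the `category_paths` field and the function names are hypothetical and may differ from the approach in the full post.

```python
from collections import Counter


def flatten_path(path, sep="/"):
    """Turn 'Books/Fiction/Thriller' into every ancestor path, root first."""
    parts = [p.strip() for p in path.split(sep) if p.strip()]
    return [sep.join(parts[:i]) for i in range(1, len(parts) + 1)]


def facet_counts(products):
    """Count how many products fall under each flattened category token."""
    counts = Counter()
    for product in products:
        # A set keeps a product from being counted twice for the same token.
        for token in set(product["category_paths"]):
            counts[token] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical catalog entries, for illustration only.
    catalog = [
        {"name": "Book A", "category_paths": flatten_path("Books/Fiction/Thriller")},
        {"name": "Book B", "category_paths": flatten_path("Books/Science")},
        {"name": "Shirt C", "category_paths": flatten_path("Clothes/Men")},
    ]
    for token, count in sorted(facet_counts(catalog).items()):
        print(token, count)
```

Because every level of the hierarchy becomes its own token, a count for a parent category automatically includes the products of all its sub categories.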
Read the rest of this entry »

Posted in Architecture, ElasticSearch, Java | 1 Comment »

Getting started with ElasticSearch

Posted by Jai on March 15, 2013


A quick introduction to ElasticSearch, an open source, distributed and RESTful search engine based on Lucene, and how easily you can start working with it.

Search Solution

As mentioned in the earlier post, Choosing the right search solution for your site, feel free to analyze which search solution best suits your requirements. In the section below, we will cover some of the functionality and capabilities offered by ElasticSearch.

ElasticSearch

In brief, ElasticSearch is an open source, distributed, schema-less and RESTful search engine based on Lucene. Some of the typical features of ES are:

Distributed: aggregates the results of searches performed on multiple shards/indices.

Schema Less: document oriented. Supports the JSON format, with automatic mapping of types.

RESTful: supports a REST interface.

Faceted Search: supports navigational search functionality.

Replication: supports index replication.

Fail over: replication and the distributed nature provide built-in failover.

Near Real time: supports near real time updates.

Versioning: keeps a version number for each document, which can be used for optimistic concurrency control.

Percolation: allows registering queries against an index and returning the matching queries for a document.

Index Aliasing: allows creating aliases for indices.
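
To make the RESTful and schema-less points concrete, here is a small sketch that only builds the HTTP requests for indexing a JSON document and for searching it; nothing is sent. The base URL `http://localhost:9200`, the index name `catalog` and the type name `product` are assumptions for illustration. The `index/type/id` URL layout matches releases from the time of this post; newer releases use `_doc` in place of a custom type.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: a local node on the default port
JSON_HEADERS = {"Content-Type": "application/json"}


def index_request(index, doc_type, doc_id, document):
    """Build a PUT request that stores a JSON document under index/type/id."""
    url = f"{BASE_URL}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return Request(url, data=body, method="PUT", headers=JSON_HEADERS)


def search_request(index, field, text):
    """Build a POST request running a simple match query on one field."""
    url = f"{BASE_URL}/{index}/_search"
    query = {"query": {"match": {field: text}}}
    body = json.dumps(query).encode("utf-8")
    return Request(url, data=body, method="POST", headers=JSON_HEADERS)


if __name__ == "__main__":
    # Hypothetical document; no mapping is declared up front.
    req = index_request("catalog", "product", 1, {"title": "Lucene in Action", "price": 39.5})
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

With a node actually running, a request built this way could be passed to `urllib.request.urlopen`. No schema is declared before indexing; the field types are derived from the JSON values, which is the automatic mapping mentioned above.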

Read the rest of this entry »

Posted in ElasticSearch, Java, Maven | 5 Comments »

Data Modeling approach for search content and tagging

Posted by Jai on March 13, 2013


For an effective search solution, converting unstructured data into a structured format is very important for a successful business. The process includes understanding the user requirements, analyzing large volumes of unstructured data for different formats and specific system properties, and enhancing the result over time. In this article we will discuss the data modeling part, taking the example of a typical eCommerce platform, including the business process and the relational database format used to tag the search content in a structured format.

eCommerce Platform for search:

A typical eCommerce site offers:

Product catalog with unlimited categories and sub categories.
Featured/hot products on the home page.
Product search facility.
Search results page with an advanced search option at the top, which the user can use to refine the search.
Option to compare products on the search listing page.
Product listing page with links to the product detail pages.

Read the rest of this entry »

Posted in Architecture, Database, ElasticSearch | 1 Comment »