Jai’s Weblog – Tech, Security & Fun…

  • Jaibeer Malik

ElasticSearch: Boosting score for content relevancy

Posted by Jai on April 10, 2013


Every search solution is built to serve relevant content to its users. Each one provides different algorithms and mechanisms for serving content that matches your business requirements. The flexibility to influence and manipulate the relevancy of search results allows a business to serve its end customers better. In this post, we will cover in detail how ElasticSearch helps you retrieve relevant content and the ways to affect the scoring of your search data.

Content Relevance Ranking

Content relevancy means returning the relevant documents for a search query. Every business domain defines relevancy differently. For example, a typical eCommerce platform will rank products differently for different customers, different search criteria and different times.

[Image: content relevancy scoring]

You would like each product to score differently based on different criteria. Some typical business requirements for product relevancy:

  • Relevancy based on the search term: how many times a particular term occurs in a document
  • Relevancy based on different content fields of the document, e.g. title is more important than description
  • Relevancy based on a combination of different field values
  • Relevancy based on current conditions and field values at query time
  • Negative relevancy for some products based on certain criteria or field values
  • Higher relevancy for new data
  • Higher relevancy for products which are most liked, visited etc.
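
Two of the requirements above (title weighing more than description, and negative relevancy for certain products) can be sketched as a query body. This is a minimal sketch, not the approach taken in the full post: it assumes the `multi_match` and `boosting` queries of the ElasticSearch query DSL, and the `discontinued` field, the boost values and the function name are hypothetical.

```python
import json


def build_relevancy_query(search_text, title_boost=2.0, negative_boost=0.2):
    """Build a query body that favors title matches and demotes flagged products."""
    return {
        "query": {
            "boosting": {
                # Documents must match the positive query to be returned at all.
                "positive": {
                    "multi_match": {
                        "query": search_text,
                        # "title^2.0" weights title matches above description matches.
                        "fields": [f"title^{title_boost}", "description"],
                    }
                },
                # Documents matching the negative query stay in the results, but
                # their score is multiplied by negative_boost.
                # "discontinued" is a hypothetical field used for illustration.
                "negative": {"term": {"discontinued": True}},
                "negative_boost": negative_boost,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_relevancy_query("running shoes"), indent=2))
```

The resulting dictionary would be sent as the JSON body of a search request. The remaining requirements, such as freshness or popularity, need score functions that this sketch does not cover.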


Posted in ElasticSearch, Java

ElasticSearch: Text analysis for content enrichment

Posted by Jai on March 26, 2013


Every text search solution is only as powerful as the text analysis capabilities it offers. Lucene is one such open source information retrieval library, offering many text analysis possibilities. In this post, we will cover some of the main text analysis features offered by ElasticSearch to enrich your search content.

Content Enrichment

Taking the example of a typical eCommerce site, serving the right content in search to the end customer is very important for the business. The text analysis strategy provided by a search solution plays a very big role in this. As a search user, I would expect the following typical behavior to apply to my query automatically. The search,

  • should look for synonyms matching my query text
  • should match singular and plural words, or words sounding similar to the entered query text
  • should not allow searching on protected words
  • should allow search for words mixed with numeric or special characters
  • should not allow search on HTML tags
  • should allow search based on the proximity of letters and the number of matching letters
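
Several of these behaviors map onto a custom analyzer in the index settings. The sketch below is one possible arrangement, not the configuration from the full post. It assumes the built-in `html_strip` character filter and the `word_delimiter`, `lowercase`, `synonym`, `keyword_marker` and `porter_stem` token filters. The names `catalog_text`, `catalog_synonyms` and `catalog_protected` are hypothetical, and "protected words" is read here as words shielded from stemming, which is only one interpretation. Sound-alike and fuzzy matching are left out.

```python
import json


def build_analysis_settings(synonyms, protected_words):
    """Build index settings with a custom analyzer for catalog text."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Expands or maps equivalent terms, e.g. "tv, television".
                    "catalog_synonyms": {"type": "synonym", "synonyms": synonyms},
                    # Marks the listed words so the stemmer leaves them unchanged.
                    "catalog_protected": {
                        "type": "keyword_marker",
                        "keywords": protected_words,
                    },
                },
                "analyzer": {
                    "catalog_text": {
                        "type": "custom",
                        # Removes HTML tags before tokenizing.
                        "char_filter": ["html_strip"],
                        "tokenizer": "whitespace",
                        "filter": [
                            "word_delimiter",     # splits mixed tokens such as "wi-fi500"
                            "lowercase",
                            "catalog_synonyms",
                            "catalog_protected",  # must run before the stemmer
                            "porter_stem",        # folds singular and plural forms
                        ],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    settings = build_analysis_settings(["tv, television"], ["ipod"])
    print(json.dumps(settings, indent=2))
```

The filter order matters: lowercasing comes before the synonym rules so that lowercase rules match, and the keyword marker comes before the stemmer so that protected words survive intact.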


Posted in ElasticSearch, Java, Quality

ElasticSearch: Faceted Search for Hierarchical data

Posted by Jai on March 19, 2013


Faceted search is navigational search that lets a business clearly define the properties or characteristics of its product catalog and guide users to relevant products with minimum effort. Most available search solutions support this functionality nowadays. In this post we will cover how to implement faceted search for hierarchical data using a flattened data approach with ElasticSearch, for a typical eCommerce platform.

Search Scenarios/Business Example:

The earlier post, Data Modeling approach for search content and tagging, explains the characteristics of a typical eCommerce platform serving hierarchical data, in terms of categorization and sub-categorization of the data.

Take the example of a typical eCommerce platform where, on one site, you need to display navigational browsing of your hierarchical data backed by a search solution. For example, you need to display products like Books, Clothes etc. Each product has its own specific characteristics and can be categorized into different categories and sub categories.

Hierarchical Data:

In business terms, the hierarchical data represents the taxonomy of your data: the way you characterize the product catalog in the form of category types, categories and sub categories.
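
To make the flattened idea concrete, here is a small sketch in plain Python. It is an illustration under assumptions, not necessarily the layout the full post uses: each level of a category path becomes its own field, and counting documents per field value mimics what a facet over that field would return. The `category_level_N` field names and the sample products are hypothetical.

```python
from collections import Counter


def flatten_category_path(path):
    """Turn ["Books", "Fiction", "Thriller"] into one field per hierarchy level."""
    flat = {}
    for depth in range(1, len(path) + 1):
        # Keep the full path prefix so equal names under different parents stay distinct.
        flat[f"category_level_{depth}"] = "/".join(path[:depth])
    return flat


def facet_counts(documents, field):
    """Count documents per value of one flattened field, like a facet on that field."""
    return Counter(doc[field] for doc in documents if field in doc)


# Hypothetical sample catalog.
products = [
    {"name": "Thriller novel", **flatten_category_path(["Books", "Fiction", "Thriller"])},
    {"name": "Cookbook", **flatten_category_path(["Books", "Cooking"])},
    {"name": "T-shirt", **flatten_category_path(["Clothes", "Men"])},
]

if __name__ == "__main__":
    print(facet_counts(products, "category_level_1"))
    print(facet_counts(products, "category_level_2"))
```

Faceting on `category_level_1` gives the top-level navigation, and once the user picks a value, faceting on `category_level_2` gives the next level down.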

Posted in Architecture, ElasticSearch, Java

Getting started with ElasticSearch

Posted by Jai on March 15, 2013


A quick introduction to ElasticSearch, an open source, distributed and RESTful search engine based on Lucene, and how easily you can start working with it.

Search Solution

As mentioned in the earlier post, Choosing the right search solution for your site, feel free to analyze which search solution best suits your requirements. In the sections below, we will cover some of the functionality and capabilities offered by ElasticSearch.

ElasticSearch

In brief, ElasticSearch is an open source, distributed, schema-less and RESTful search engine based on Lucene. Some of the typical functionality of ES:

Distributed: a search runs across multiple shards/indices and the results are aggregated.

Schema Less: document oriented. Supports the JSON format, with automatic type mapping.

RESTful: supports a REST interface.

Faceted Search: support for navigational search functionality.

Replication: supports index replication.

Fail over: replication and the distributed nature provide built-in failover.

Near Real time: supports near real time updates.

Versioning: each document carries a version number, which supports optimistic concurrency control on updates.

Percolation: allows registering queries against an index, then returns the queries that match a given document.

Index Aliasing: allows creating aliases for indices.
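
As a rough illustration of the RESTful, JSON-based and aliasing points, the sketch below only builds request descriptions and sends nothing. It assumes the `/{index}/{type}/{id}` document path used by ElasticSearch releases of this post's era (recent releases use `_doc` in place of a custom type) and the `_aliases` endpoint with an `add` action. The index, type and alias names and both helper functions are hypothetical.

```python
import json


def index_request(index, doc_type, doc_id, document):
    """Describe the REST call that stores one JSON document."""
    return {
        "method": "PUT",
        "path": f"/{index}/{doc_type}/{doc_id}",
        # No schema is declared up front; field types are derived from the JSON.
        "body": json.dumps(document),
    }


def alias_request(index, alias):
    """Describe the REST call that points an alias at an index."""
    actions = {"actions": [{"add": {"index": index, "alias": alias}}]}
    return {"method": "POST", "path": "/_aliases", "body": json.dumps(actions)}


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(index_request("catalog_v1", "product", 1, {"title": "Red shirt", "price": 19.99}))
    print(alias_request("catalog_v1", "catalog"))
```

With an alias in place, clients can keep querying `catalog` while the index behind it is rebuilt and swapped.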


Posted in ElasticSearch, Java, Maven

Data Modeling approach for search content and tagging

Posted by Jai on March 13, 2013


For an effective search solution, converting unstructured data into a structured format is very important for a successful business. The process includes understanding the user requirements, analyzing large volumes of unstructured data for different formats and system-specific properties, and enhancing the result over time. In this article we will discuss the data modelling part using the example of a typical eCommerce platform, including the business process and the relational database format used to tag search content in a structured form.

eCommerce Platform for search:

A typical eCommerce site has:

  • Product catalog with unlimited categories and sub categories.
  • Featured/hot products on the home page.
  • Product search facility.
  • Search results page with an advanced search option at the top, which the user can use to refine the search.
  • Option to compare products on the search listing page.
  • Product listing page with links to the product detail pages.


Posted in Architecture, Database, ElasticSearch

 