Archive for the ‘ElasticSearch’ Category

Exploring Enterprise Search Solution Critical Capabilities

Posted by Jai on March 30, 2023

In this series of blog posts we will review enterprise search solution capabilities and the available software solutions, from basic to advanced search, along with the use of AI/ML models to achieve both consumer and business value. We will also cover search for the healthcare domain: common user interactions, the domain dataset and how ML models align with it. This post covers the critical enterprise search solution capabilities at a high level.

History

A lot has changed in search solution capabilities since the initial posts on this blog about ElasticSearch, and about using ElasticSearch for user-behavior clickstream data together with Hadoop big data tooling to process that information. A quick look back at those posts:

ElasticSearch

Getting familiar with ElasticSearch and trying out different techniques hands-on:

Getting started with ElasticSearch

ElasticSearch: Faceted Search for Hierarchical data

ElasticSearch: Text analysis for content enrichment

ElasticSearch: Boosting score for content relevancy

ElasticSearch: Learn Java API usage with test cases

ElasticSearch: Indexing setup using Akka tutorial

Read the rest of this entry »

Posted in Architecture, Artificial intelligence, Data Security, ElasticSearch, Healthcare, Machine Learning, NLP, Quality, Security | Tagged: Architecture, Artificial intelligence, BERT, Consumer Experience, ElasticSearch, Healthcare, Machine Learning, MORO, NER, NLP | Leave a Comment »

Oozie: Scheduling Coordinator/Bundle jobs for Hive partitioning and ElasticSearch indexing

Posted by Jai on May 28, 2014

This post covers using Oozie to schedule a Hive add-partition step every hour with Coordinator jobs, and to automatically update the ElasticSearch data served to customers from nightly jobs using the Bundle jobs functionality. Automating the procedure with Oozie jobs keeps up to date the statistical data used on the website to display product view counts and top search query strings.

Continuing from the previous posts: as described earlier, the Hive partitioning strategy is based on the current time, and so is the ElasticSearch indexing of the analytic data. Here we will automate the process using Oozie, adding a Hive partition once data is available in the Hadoop directory.
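To make the time-based partitioning concrete, here is a minimal sketch of the statement an hourly job would need to run. It is only an illustration: the table name `search_clicks`, the base directory `/searchevents` and the `year/month/day/hour` partition columns are assumptions, not the layout used in the original posts.

```python
from datetime import datetime


def add_partition_statement(table: str, base_dir: str, when: datetime) -> str:
    """Build a Hive ALTER TABLE statement for the hour that contains `when`.

    The partition columns (year, month, day, hour) and the directory layout
    are assumptions made for this sketch.
    """
    year, month, day, hour = (when.strftime(f) for f in ("%Y", "%m", "%d", "%H"))
    spec = f"year='{year}', month='{month}', day='{day}', hour='{hour}'"
    location = f"{base_dir}/{year}/{month}/{day}/{hour}"
    # IF NOT EXISTS keeps the statement safe to re-run for the same hour.
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION ({spec}) "
        f"LOCATION '{location}'"
    )


# Hypothetical table and directory names, used only for illustration.
stmt = add_partition_statement(
    "search_clicks", "/searchevents", datetime(2014, 5, 28, 9, 30)
)
print(stmt)
```

A scheduler such as an Oozie Coordinator job would then pass the nominal time of each hourly run into a step like this, so that a partition is registered only for the hour whose data has arrived.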

Oozie

Oozie is a workflow scheduler system to manage Apache Hadoop jobs.

Read the rest of this entry »

Posted in ElasticSearch, Hadoop, Hive, Java, Oozie | Tagged: ElasticSearch-Hadoop, Hadoop, Hive, Oozie | 3 Comments »

ElasticSearch-Hadoop: Indexing product views count and customer top search query from Hadoop to ElasticSearch

Posted by Jai on May 22, 2014

This post covers using ElasticSearch-Hadoop to read data from Hadoop and index it in ElasticSearch. The functionality covered is indexing the product view counts and the top search query per customer over the last n days. The analyzed data can then be used on the website to display a customer's recently viewed items, product view counts and top search query string.
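As a rough illustration of the "top search query per customer in the last n days" statistic, the following sketch computes it in plain Python. In the posts this aggregation is done with Hive; the tuple layout `(customer_id, query, timestamp)` and the function name are assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta


def top_queries(events, now, days):
    """Return the most frequent query per customer within the last `days` days.

    `events` is an iterable of (customer_id, query, timestamp) tuples; this
    shape is an assumption for the sketch, not the schema from the posts.
    """
    cutoff = now - timedelta(days=days)
    counts = defaultdict(Counter)
    for customer_id, query, timestamp in events:
        if timestamp >= cutoff:  # ignore events older than the window
            counts[customer_id][query] += 1
    return {
        customer_id: counter.most_common(1)[0][0]
        for customer_id, counter in counts.items()
    }


# Made-up sample data, for illustration only.
sample_events = [
    ("c1", "laptop", datetime(2014, 5, 20)),
    ("c1", "laptop", datetime(2014, 5, 21)),
    ("c1", "phone", datetime(2014, 5, 21)),
    ("c2", "camera", datetime(2014, 5, 19)),
    ("c2", "tv", datetime(2014, 4, 1)),
    ("c2", "tv", datetime(2014, 4, 2)),
]
result = top_queries(sample_events, now=datetime(2014, 5, 22), days=7)
print(result)
```

With a seven-day window, the two older "tv" searches fall outside the cutoff, so they do not influence the result for customer `c2`.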

Continuing from the previous posts, we already have customer search click data gathered using Flume and stored in Hadoop HDFS and ElasticSearch, and we have seen how to analyze the same data using Hive to generate statistical data. Here we will see how to use the analyzed data to enhance the customer experience on the website and make it relevant for end customers.

ElasticSearch-Hadoop

Elasticsearch for Apache Hadoop allows Hadoop jobs to interact with ElasticSearch through a small library with easy setup.

The elasticsearch-hadoop-hive module allows access to ElasticSearch from Hive. As shared in the previous post, we have the product view counts and the customer top search query data extracted into Hive tables. We will read the same data and index it in ElasticSearch so that it can be used for display purposes on the website.

Read the rest of this entry »

Posted in ElasticSearch, Hadoop, Java, Spring Data | Tagged: ElasticSearch, ElasticSearch-Hadoop, Hadoop, Spring Data | 4 Comments »

Flume: Gathering customer product search clicks data using Apache Flume

Posted by Jai on May 19, 2014

This post covers using Apache Flume to gather customer product search clicks and store the information using the Hadoop and ElasticSearch sinks. The data may consist of different product search events, such as filtering on different facets, sorting information and pagination information, as well as the products viewed and the products marked as favorite by customers. In later posts we will analyze the data further and use the same information for display and analytics.
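One way to picture such an event is as a small record serialized to one JSON line, which is a common shape for data handed to a log collector. The sketch below is hypothetical: the class name `SearchClickEvent` and all of its fields are assumptions derived from the event types listed above, not the structure used in the original post.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SearchClickEvent:
    """Hypothetical search click event; all field names are assumptions."""

    customer_id: str
    query: str
    filters: dict = field(default_factory=dict)  # facet name -> selected value
    sort_by: str = ""                            # sorting information
    page: int = 1                                # pagination information
    viewed_product_id: str = ""                  # product viewed, if any
    favorite: bool = False                       # product marked as favorite


def to_json_line(event: SearchClickEvent) -> str:
    """Serialize an event to a single JSON line with stable key order."""
    return json.dumps(asdict(event), sort_keys=True)


# Made-up sample event, for illustration only.
event = SearchClickEvent(
    customer_id="c1",
    query="laptop",
    filters={"brand": "acme"},
    sort_by="price",
    page=2,
)
line = to_json_line(event)
print(line)
```

Keeping each event on one line makes it straightforward to append events to a file or stream and to parse them back independently later.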

Product Search Functionality

Any eCommerce platform offers different products to customers, and search functionality is one of its basics. Guided navigation through different facets/filters and free-text search of the content are standard parts of any existing search functionality.

SearchQueryInstruction

Consider a similar scenario where a customer can search for a product, which allows us to capture the product search behavior with the following information:

Read the rest of this entry »

Posted in ElasticSearch, Flume, Hadoop, Java | Tagged: ElasticSearch, Flume, Hadoop | 6 Comments »

Customer product search clicks analytics using big data

Posted by Jai on May 14, 2014

The application demonstrates how to set up customer product search click analytics using big data tools: Hadoop, Hive, Pig, Oozie, ElasticSearch, Akka, Spring Data etc.

Github Repository

URL: https://github.com/jaibeermalik/searchanalytics-bigdata

Analyzing Search Clicks Data Using Flume, Hadoop, Hive, Pig, Oozie, ElasticSearch, Akka, Spring Data.

The repository contains unit/integration test cases that generate analytics based on click events related to product search on any e-commerce website.

Getting Started

The project is a Maven project and can be built with Eclipse. Check the pom dependencies for the relevant version of each application. It uses the Cloudera Hadoop distribution, version 2.3.0-cdh5.0.0.

Functionality

The scenario covered in the application for search analytics using big data is as follows:
Read the rest of this entry »

Posted in Akka, ElasticSearch, Flume, Hadoop, Hive, Java, Oozie, Pig, Spring, Spring Data | Tagged: Akka, Big Data, ElasticSearch, Flume, Hadoop, Hive, Oozie, Pig, Spring Data | 6 Comments »

Jai’s Weblog – Tech, Security & Fun…

Jaibeer Malik
