Big Data

Data Lakes – Is it Time for Your Business to Wade In?

As data continues to grow in both volume and structural variety, traditional relational database approaches fall increasingly short of providing the flexibility, agility, scalability, and economy needed to process it. Alternative and complementary approaches to managing information have been pioneered, and given time to mature, over the last few years to satisfy today’s big […]

ThoughtSpot – For Near Instant Analytics Gratification

ThoughtSpot ups the ante when it comes to rapidly and effortlessly delivering insightful, completely ad hoc data analytics and visuals to your business, even for large, multi-terabyte data sets.

ThoughtSpot has blazed a trail in a new area of BI called Search BI. This type of BI differs from the current genre of more established BI tools […]

Big Data Trends in 2016

It is 2016 and data is growing more rapidly than ever. 2015 was big data’s year, with conferences on the subject held everywhere. Professionals working in industries such as healthcare, insurance, and banking were eager to learn more about big data to solve their big data problems or, perhaps, to seek […]

HBase Data Extraction

Our client is an NE-based data solution provider in the healthcare industry. The client manages a single-node CDH5 cluster, version 5.3.2, on Ubuntu (Trusty Tahr). The client had two main concerns, one of which was extracting data from HBase. Each table in HBase has its own metadata file. The metadata files provide […]


Click here to download the Centos5.4Hadoop2.0Multicluster tutorial: Centos5.4Hadoop2.0Multicluster

Pivotal GemFire

Hello everyone. This is an installation guide for Pivotal GemFire, which is a “distributed data management platform”. Pivotal is a company spun out of VMware and EMC. It is a relatively new company, founded in 2013, but with GE’s $105 million investment it has been running strong with its own Hadoop distribution, Pivotal HD.

On […]

Hortonworks Sandbox

In this tutorial, students will learn how to set up an environment for using the Sandbox. We are using CentOS 6.6 for this tutorial. Although we used the CentOS 6.6 GUI, I’ve written the tutorial so that even users of a server operating system can follow it without any problem. I hope you all enjoy it.

Click here to […]

Hadoop 2.x on Amazon EC2

This is an Amazon EC2 tutorial. It will help students understand the current stable Hadoop 2.x release and how it can be deployed on an Amazon EC2 instance. In the tutorial, students will learn to create a production-ready Hadoop cluster.

Click here to download the Amazon EC2 tutorial: AmazonEC2

Hadoop Multi-node Cluster Installation on Centos 6

This is a Hadoop multi-node cluster installation guide, which will help you understand how each node operates in Hadoop. Everything in this guide is straightforward. We are using CentOS 6.6 since it is widely used on production servers. Every step is explained with pictures and comments. Just follow through all the steps and you […]

CDH5 Single Node Installation Guide

This is a CDH5 installation guide, which will give you a basic idea of how to install Hadoop. Everything in this guide is straightforward. We are using Ubuntu Desktop in this tutorial so that even users of non-Linux operating systems can follow the guide easily. Every step is explained with pictures and comments. Just follow through […]