Excellent Big Data Training


Excellent Big Data training is available at www.mybigdatacoach.com

Big Data has arrived! If you are an IT professional who wishes to move your career into Big Data and become a Big Data expert in a month, you have come to the right place.
We provide personalized Hadoop training with hands-on, real-life use cases. Our mission is to ensure that you are a Big Data expert within a month.

For further queries, contact us at:

Email: bigdatacoach@gmail.com

Private Browsing Modes in Firefox and Chrome

If we don't want the browser to save our browsing history, search history, download history, web form history, cookies, or temporary internet files, we can use the option called private browsing. This option is available in most browsers and is helpful when we browse from public computers. The shortcuts for entering private browsing mode in Chrome and Firefox are shown below.

Google Chrome

In Google Chrome, press Ctrl+Shift+N to open a private browsing (incognito) window.

Mozilla Firefox

In Firefox, press Ctrl+Shift+P to open a private browsing window.

Sorting Algorithms using Java

Yesterday I was thinking about the different ways of sorting. Previously I was familiar with only two sorting methods, but when I searched Wikipedia I found several more. Here I implement some of these sorting algorithms using Java.

1)      Selection sorting

2)      Bubble sorting

3)      Insertion sorting

4)      Merge sorting

5)      Quick sorting

6)      Shell sorting


Selection Sorting

Selection sort divides the input list into two parts:

a)    the sublist of items already sorted, which is built up from left to right

b)    the sublist of items remaining to be sorted that occupy the rest of the list.

Initially the sorted sublist is empty and the unsorted sublist is the entire input list. The algorithm proceeds by finding the smallest (or largest, depending on the sorting order) element in the unsorted sublist, exchanging it with the leftmost unsorted element (putting it in sorted order), and moving the sublist boundary one element to the right.
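
The steps described above can be sketched in Java as follows. This is a minimal illustration for ascending order on an int array, not the original sample code; the class name SelectionSortSketch and the method name selectionSort are chosen here for the example.

```java
import java.util.Arrays;

// Illustrative sketch; class and method names are assumptions for this example.
class SelectionSortSketch {

    // Sorts the array in place, in ascending order.
    static void selectionSort(int[] items) {
        // 'boundary' is the index of the leftmost unsorted element.
        for (int boundary = 0; boundary < items.length - 1; boundary++) {
            // Find the index of the smallest element in the unsorted sublist.
            int smallest = boundary;
            for (int i = boundary + 1; i < items.length; i++) {
                if (items[i] < items[smallest]) {
                    smallest = i;
                }
            }
            // Exchange it with the leftmost unsorted element.
            int tmp = items[boundary];
            items[boundary] = items[smallest];
            items[smallest] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] data = {29, 10, 14, 37, 13};
        selectionSort(data);
        System.out.println(Arrays.toString(data)); // prints [10, 13, 14, 29, 37]
    }
}
```

The outer loop stops one element early because a single remaining element is already in its final position.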

Bubble Sorting

Bubble sort is a simple sorting algorithm that works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent items and swapping them if they are in the wrong order. The pass through the list is repeated until no swaps are needed, which indicates that the list is sorted. The algorithm gets its name from the way smaller elements "bubble" to the top of the list. Because it only uses comparisons to operate on elements, it is a comparison sort. Although the algorithm is simple, most other sorting algorithms are more efficient for large lists.
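
A minimal Java sketch of this idea is shown here, assuming ascending order on an int array. It is not the original sample code, and the names BubbleSortSketch and bubbleSort are chosen for the example. As a small optimization that goes beyond the description above, each pass stops one element earlier than the previous one, because the largest remaining element has already settled at the end.

```java
import java.util.Arrays;

// Illustrative sketch; class and method names are assumptions for this example.
class BubbleSortSketch {

    // Sorts the array in place, in ascending order.
    static void bubbleSort(int[] items) {
        int unsortedLength = items.length;
        boolean swapped;
        do {
            swapped = false;
            // Compare each pair of adjacent items in the unsorted part.
            for (int i = 1; i < unsortedLength; i++) {
                if (items[i - 1] > items[i]) {
                    // Wrong order: swap the pair.
                    int tmp = items[i - 1];
                    items[i - 1] = items[i];
                    items[i] = tmp;
                    swapped = true;
                }
            }
            // The largest unsorted element is now in its final place.
            unsortedLength--;
        } while (swapped); // a pass with no swaps means the list is sorted
    }

    public static void main(String[] args) {
        int[] data = {5, 1, 4, 2, 8};
        bubbleSort(data);
        System.out.println(Arrays.toString(data)); // prints [1, 2, 4, 5, 8]
    }
}
```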

Insertion Sorting

Insertion sort is a simple sorting algorithm that builds the final sorted array (or list) one item at a time. On large lists it is much less efficient than more advanced algorithms such as quicksort or merge sort. In this method we take one element at a time, compare it with the elements already sorted, and insert it at the position between the elements smaller than it and the elements larger than it. Once every element has been inserted, we have a sorted list.
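
The insertion step can be sketched in Java as below, again assuming ascending order on an int array. This is an illustration rather than the original sample code; the names InsertionSortSketch and insertionSort are chosen for the example.

```java
import java.util.Arrays;

// Illustrative sketch; class and method names are assumptions for this example.
class InsertionSortSketch {

    // Sorts the array in place, in ascending order.
    static void insertionSort(int[] items) {
        // Everything to the left of index i is already sorted.
        for (int i = 1; i < items.length; i++) {
            int current = items[i];
            int j = i - 1;
            // Shift larger sorted elements one position to the right.
            while (j >= 0 && items[j] > current) {
                items[j + 1] = items[j];
                j--;
            }
            // Insert the element into the gap that was opened up.
            items[j + 1] = current;
        }
    }

    public static void main(String[] args) {
        int[] data = {12, 11, 13, 5, 6};
        insertionSort(data);
        System.out.println(Arrays.toString(data)); // prints [5, 6, 11, 12, 13]
    }
}
```

Shifting elements instead of swapping them pairwise keeps the number of writes down, which is one reason insertion sort performs well on small or nearly sorted lists.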

Twitter Open-Sources Summingbird on GitHub

Summingbird is a library that lets you write streaming MapReduce programs that look like native Scala or Java collection transformations and execute them on a number of well-known distributed MapReduce platforms like Storm and Scalding.
The main feature of Summingbird is that you can execute the same Summingbird program in:

  • batch mode (using Scalding on Hadoop)
  • real-time mode (using Storm)
  • hybrid batch/real-time mode (offers attractive fault-tolerance properties)

For more details, please check the Summingbird project on GitHub.


Google Chrome Browser Shortcuts

Google Chrome offers a lot of keyboard shortcuts, and most of us are unaware of them.

Alt+F – Open the wrench menu (i.e. the Chrome settings menu)
Ctrl+J – Open the Downloads page
Ctrl+H – Open the History page
Ctrl+Tab – Switch to the next tab
Alt+Home – Go to the home page
Ctrl+U – View the source code of the current page
Ctrl+K – Search quickly from the address bar
Ctrl+L – Highlights the URL in the address bar (use this to copy/paste the URL quickly)
Ctrl+N – Open a new Chrome browser window
Ctrl+Shift+N – Open a new incognito window (for private browsing)
Ctrl+Shift+B – Toggle bookmark display
Ctrl+W – Close the current Tab
Alt+Left Arrow – Go to the previous page from your history
Alt+Right Arrow – Go to the next page from your history
Space bar – Scroll down the current web page

Big data – Latest Trend in Industry


Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.
Put simply, a big data problem arises from three factors:
1) Volume
2) Velocity
3) Variety
Data with a huge volume can be big data.
Data that arrives at high velocity is also big data. Consider 1 GB of data arriving within microseconds: it is difficult to handle that much in such a short time, so this is also big data.
If data comes in a large variety of formats, it is difficult to process. So variety is also a factor that makes data big data.
The main challenge with big data is processing: we need the data to be processed within a limited time. The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization. The trend to larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data, allowing correlations to be found to "spot business trends, determine quality of research, prevent diseases, link legal citations, combat crime, and determine real-time roadway traffic conditions."
Consider the case of Facebook: everyone is uploading pictures, text, and so on. Taken as a whole, the data to be handled is huge. We do not experience any delay on Facebook because the data is handled effectively using big data technologies.
Big data technologies include a lot of open source projects such as Hadoop, Hive, Pig, Oozie, Flume, ZooKeeper, HBase, Storm, Solr, Elasticsearch, etc.
All these technologies run in a clustered environment, i.e. not on a single server.
They can be scaled horizontally, by adding more machines, depending on the load. Conventional data management systems scale vertically, by upgrading a single machine, and this has a lot of problems, such as the hard limit on how far one machine can be upgraded. These big data technologies are not a replacement for conventional data handling technologies; they work alongside conventional systems and make data handling more effective.