Course Features

  • Lectures 0
  • Quizzes 0
  • Duration 10 weeks
  • Skill level Mid Level
  • Language English
  • Students 0
  • Assessments Yes

Course Details

Big Data has become indispensable to many businesses. Big Data analytics can be categorized into:

  • Prescriptive Analytics
  • Predictive Analytics
  • Descriptive Analytics

Big Data analytics draws on techniques such as machine learning, data mining, natural language processing, and statistics. Data is extracted, prepared, and blended to produce analysis for the business. Large enterprises and multinational organizations now use these techniques widely.

Big Data technologies help analyze large volumes of structured or unstructured data to gain business insights and inform strategic decisions.

Module 1 – Introduction to Hadoop and its Ecosystem, Map Reduce and HDFS

  • Big Data, Factors constituting Big Data
  • What is Hadoop?
  • Overview of Hadoop Ecosystem
  • Map Reduce – Concepts of Map, Reduce, Ordering, Concurrency, Shuffle
  • Hadoop Distributed File System (HDFS) Concepts and its Importance
  • Deep Dive in Map Reduce – Execution Framework, Partitioner, Combiner, Data Types, Key-Value pairs
  • HDFS Deep Dive – Architecture, Data Replication, Name Node, Data Node, Data Flow
  • Parallel Copying with DISTCP, Hadoop Archives
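
As a rough illustration of the Map, Partitioner, Shuffle and Reduce concepts listed above, the following is a minimal in-memory sketch in plain Python. It does not use the Hadoop API. The function names (map_phase, partition, shuffle, reduce_phase, word_count) and the sample input are hypothetical, and a CRC32 hash stands in for whatever partitioning function a real job would use.

```python
import zlib
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def partition(key, num_reducers):
    """Partitioner: choose a reducer for a key using a deterministic hash (assumed CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def shuffle(pairs, num_reducers):
    """Shuffle: route each pair to its reducer and group the values by key."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition(key, num_reducers)][key].append(value)
    return buckets

def reduce_phase(key, values):
    """Reduce: collapse all values seen for one key into a single count."""
    return key, sum(values)

def word_count(lines, num_reducers=2):
    """Run map, shuffle and reduce in sequence, entirely in memory."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    result = {}
    for bucket in shuffle(pairs, num_reducers):
        for key in sorted(bucket):  # each simulated reducer sees its keys in sorted order
            word, total = reduce_phase(key, bucket[key])
            result[word] = total
    return result

counts = word_count(["big data needs big clusters", "data flows to reducers"])
print(counts["big"], counts["data"])
```

In a real cluster the map and reduce calls run in parallel on different nodes and the shuffle moves data over the network. The sketch only shows how keys are routed and grouped.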

Module 2 – Hands on Exercises

  • Installing Hadoop in Pseudo Distributed Mode, Understanding Important configuration files, their Properties and Daemon Threads
  • Accessing HDFS from Command Line
  • Map Reduce – Basic Exercises
  • Understanding Hadoop Eco-system
  • Introduction to Sqoop, use cases.
  • Introduction to Hive, use cases.
  • Introduction to Pig, use cases.
  • Introduction to Oozie, use cases.
  • Introduction to Flume, use cases.
  • Introduction to YARN

Module 3 – Deep Dive in Map Reduce and YARN

  • How to develop Map Reduce Application, writing unit test
  • Best Practices for developing and writing, Debugging Map Reduce applications
  • Joining Data sets in Map Reduce
  • Hadoop APIs
  • Introduction to Hadoop YARN
  • Difference between Hadoop 1.0 and 2.0
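
One common way to approach the "Joining Data sets in Map Reduce" topic above is a reduce-side join: each mapper tags its records with their source, and the reducer combines records that share a key. The sketch below simulates that idea in plain Python, not with the Hadoop API. The record layouts, function names and sample data are assumptions made for illustration.

```python
from collections import defaultdict

def map_customers(record):
    """Map a (customer_id, name) record to a key and a tagged value."""
    customer_id, name = record
    return customer_id, ("customer", name)

def map_orders(record):
    """Map an (order_id, customer_id, amount) record to a key and a tagged value."""
    order_id, customer_id, amount = record
    return customer_id, ("order", (order_id, amount))

def reduce_join(customer_id, tagged_values):
    """Reduce: pair every customer name with every order that shares the key."""
    names = [value for tag, value in tagged_values if tag == "customer"]
    orders = [value for tag, value in tagged_values if tag == "order"]
    return [(customer_id, name, order_id, amount)
            for name in names
            for order_id, amount in orders]

def reduce_side_join(customers, orders):
    """Simulate the shuffle by grouping tagged values by key, then reduce each group."""
    grouped = defaultdict(list)
    for key, value in map(map_customers, customers):
        grouped[key].append(value)
    for key, value in map(map_orders, orders):
        grouped[key].append(value)
    joined = []
    for key in sorted(grouped):
        joined.extend(reduce_join(key, grouped[key]))
    return joined

customers = [(1, "Asha"), (2, "Ben")]
orders = [(100, 1, 25.0), (101, 1, 40.0), (102, 3, 10.0)]
rows = reduce_side_join(customers, orders)
print(rows)
```

This behaves like an inner join: the customer with no orders and the order with no matching customer both drop out of the result.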

Module 4 – Deep Dive in Pig

  • Introduction to Pig
  • Basic Data Analysis with Pig

Module 5 – Deep Dive in Hive

  • Introduction to Hive
  • Relational Data Analysis with Hive
  • Hive Data Management
  • Hive Optimization

Module 6 – Introduction to HBase architecture

  • What is HBase?
  • Where does HBase fit?
  • What is NoSQL?

Module 7 – Hadoop Cluster Setup and Running Map Reduce Jobs

  • Running Map Reduce Jobs on Cluster

Module 8 – Advanced Map Reduce

  • Delving Deeper Into The Hadoop API
  • More Advanced Map Reduce Programming, Joining Data Sets in Map Reduce
  • Graph Manipulation in Hadoop

Module 9 – Job and certification support

  • Major Project, Hadoop Development, Cloudera Certification Tips and Guidance, Mock Interview Preparation, Practical Development Tips and Techniques