HADOOP

Posted on June 11, 2020 (updated July 13, 2020) by admin

COURSE : HADOOP | ONLINE TRAINING | DURATION : 40 HOURS

ABOUT COURSE

Data keeps flooding in from social networking sites, public information sites, Internet archives and many other sources. Big Data technologies exist to manage data at this scale, and Hadoop is their backbone. Hadoop is an open-source framework of programs and procedures for the distributed storage and processing of very large data sets. Its MapReduce programming model enables large-scale processing of data within a reasonable time frame. Understanding Hadoop is a highly valuable skill for anyone working with large amounts of data.

At EazyGurus, we provide a detailed understanding of Hadoop concepts and the practical usage of the technology. The training starts with an introduction to the scope of Hadoop and the scenarios in which it can be applied. It then focuses on the two pillars of Hadoop: the Hadoop Distributed File System (HDFS) and MapReduce. The remaining part of the training program covers the tools that make up the Hadoop ecosystem, such as Hive, Pig, HBase, Sqoop, Flume and NoSQL stores.

Course Objective

  • Master the Hadoop Distributed File System
  • Learn the MapReduce architecture and understand its programming model
  • Work with the Hive Query Language and learn more about the Hive architecture

Career Opportunities in Hadoop

With the popularity of Big Data increasing rapidly, opportunities for Hadoop administrators, consultants and analysts have been growing in all major industry sectors, such as financial applications, enterprise processing and business services. Hadoop training programs by EazyGurus focus on empowering students with the latest concepts and industry-specific topics. Our experienced trainers and well-planned course materials are aimed at 100% success in interviews.

Who can learn?

Targeted Audience

  • Java consultants
  • DBA consultants
  • SQL Experts
  • College freshers with a programming background
  • ETL Professionals

Prerequisite to learn the course

Basic knowledge of Linux is helpful in learning Hadoop. Knowing the basic programming principles of Java is an added advantage, and knowledge of SQL will improve the overall learning experience.

Course Syllabus

Big Data Introduction:

  • What is Big Data
  • Evolution of Big Data
  • Benefits of Big Data
  • Operational vs Analytical Big Data
  • Need for Big Data Analytics
  • Big Data Challenges

Hadoop cluster:

  • Master Nodes
    • Name Node
    • Secondary Name Node
    • Job Tracker
  • Client Nodes
  • Slaves
  • Hadoop configuration
  • Setting up a Hadoop cluster

HDFS:

  • Introduction to HDFS
  • HDFS Features
  • HDFS Architecture
  • Blocks
  • Goals of HDFS
  • The Name node & Data Node
  • Secondary Name node
  • The Job Tracker
  • The Process of a File Read
  • How does a File Write work
  • Data Replication
  • Rack Awareness
  • HDFS Federation
  • Configuring HDFS
  • HDFS Web Interface
  • Fault tolerance
  • Name node failure management
  • Access HDFS from Java
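
As a rough illustration of the Blocks and Data Replication topics above, the following minimal Python sketch estimates how many blocks a file occupies and how much raw cluster storage it consumes. It assumes the common defaults of a 128 MB block size (`dfs.blocksize`) and a replication factor of 3 (`dfs.replication`); both are configurable per cluster, and the function name `hdfs_footprint` is hypothetical, not part of any Hadoop API.

```python
import math

BLOCK_SIZE_MB = 128  # assumed default dfs.blocksize (configurable)
REPLICATION = 3      # assumed default dfs.replication (configurable)


def hdfs_footprint(file_size_mb, block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Return (number of blocks, raw storage in MB) for one file."""
    # A file is split into fixed-size blocks; the last block may be smaller
    # and only occupies as much disk space as the data it actually holds.
    blocks = math.ceil(file_size_mb / block_size_mb) if file_size_mb > 0 else 0
    # Every block is stored `replication` times across the cluster.
    raw_storage_mb = file_size_mb * replication
    return blocks, raw_storage_mb


blocks, raw_mb = hdfs_footprint(1000)
print(blocks, raw_mb)  # 8 3000
```

A 1000 MB file therefore needs 8 blocks (seven full blocks and one partial block) and about 3000 MB of raw disk space under these assumptions.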

YARN:

  • Introduction to Yarn
  • Why Yarn
  • Classic MapReduce v/s Yarn
  • Advantages of Yarn
  • Yarn Architecture
    • Resource Manager
    • Node Manager
    • Application Master
  • Application submission in YARN
  • Node Manager containers
  • Resource Manager components
  • Yarn applications
  • Scheduling in Yarn
    • Fair Scheduler
    • Capacity Scheduler
  • Fault tolerance

MapReduce:

  • What is MapReduce
  • Why MapReduce
  • How MapReduce works
  • Difference between Hadoop 1 & Hadoop 2
  • Identity mapper & reducer
  • Data flow in MapReduce
  • Input Splits
  • Relation Between Input Splits and HDFS Blocks
  • Flow of Job Submission in MapReduce
  • Job submission & Monitoring
  • MapReduce algorithms
    • Sorting
    • Searching
    • Indexing
    • TF-IDF
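
The data flow listed above (map, then shuffle and sort, then reduce) can be sketched in plain Python without a cluster. This is a single-process simulation for teaching purposes only, not Hadoop's Java API; the function names `map_phase`, `shuffle_phase`, `reduce_phase` and `word_count` are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs):
    """Shuffle and sort: group all values by key, with keys in sorted order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over a list of input lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return [reduce_phase(key, values) for key, values in shuffle_phase(mapped)]


print(word_count(["big data", "big cluster"]))
# [('big', 2), ('cluster', 1), ('data', 1)]
```

On a real cluster, each mapper would run on a separate input split and the shuffle would move data across the network; the sketch keeps only the logical steps.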

Hadoop Fundamentals:

  • What is Hadoop
  • History of Hadoop
  • Hadoop Architecture
  • Hadoop Ecosystem Components
  • How does Hadoop work
  • Why Hadoop & Big Data
  • Hadoop Cluster introduction
  • Cluster Modes
    • Standalone
    • Pseudo-distributed
    • Fully distributed
  • HDFS Overview
  • Introduction to MapReduce
  • Hadoop in demand

HDFS Operations:

  • Starting HDFS
  • Listing files in HDFS
  • Writing a file into HDFS
  • Reading data from HDFS
  • Shutting down HDFS

HDFS Command Reference:

  • Listing contents of directory
  • Displaying and printing disk usage
  • Moving files & directories
  • Copying files and directories
  • Displaying file contents

Java Overview For Hadoop:

  • Object oriented concepts
  • Variables and Data types
  • Static data type
  • Primitive data types
  • Objects & Classes
  • Java Operators
  • Method and its types
  • Constructors
  • Conditional statements
  • Looping in Java
  • Access Modifiers
  • Inheritance
  • Polymorphism
  • Method overloading & overriding
  • Interfaces

MapReduce Programming:

  • Hadoop data types
  • The Mapper Class
    • Map method
  • The Reducer Class
    • Shuffle Phase
    • Sort Phase
    • Secondary Sort
    • Reduce Phase
  • The Job class
    • Job class constructor
  • JobContext interface
  • Combiner Class
    • How Combiner works
    • Record Reader
    • Map Phase
    • Combiner Phase
    • Reducer Phase
    • Record Writer
  • Partitioners
    • Input Data
    • Map Tasks
    • Partitioner Task
    • Reduce Task
    • Compilation & Execution
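
To make the Combiner and Partitioner topics above more concrete, here is a minimal Python sketch of what each one does logically. It is an illustration under stated assumptions, not Hadoop code: the names `combine` and `partition` are hypothetical, and `zlib.crc32` stands in for Java's `hashCode()` so that results are repeatable between runs. Hadoop's default hash partitioner follows the same idea of taking a key's hash modulo the number of reduce tasks.

```python
import zlib
from collections import Counter


def combine(map_output):
    """Combiner: pre-aggregate (word, count) pairs from ONE map task."""
    totals = Counter()
    for word, count in map_output:
        totals[word] += count
    # Fewer pairs now need to be sent over the network to the reducers.
    return sorted(totals.items())


def partition(key, num_reducers):
    """Partitioner: choose the reduce task for a key by hash modulo."""
    # Assumption: crc32 replaces Java's hashCode() for a stable result.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


map_output = [("big", 1), ("data", 1), ("big", 1)]
combined = combine(map_output)
print(combined)  # [('big', 2), ('data', 1)]

for word, count in combined:
    # The same key always maps to the same reducer index.
    print(word, count, "-> reducer", partition(word, 2))
```

Because the partition depends only on the key, all counts for a given word end up at the same reducer, which is what makes the final sum correct.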

Hadoop Ecosystem

Pig:

  • What is Apache Pig?
  • Why Apache Pig?
  • Pig features
  • Where should Pig be used
  • Where not to use Pig
  • The Pig Architecture
  • Pig components
  • Pig v/s MapReduce
  • Pig v/s SQL
  • Pig v/s Hive
  • Pig Installation
  • Pig Execution Modes & Mechanisms
  • Grunt Shell Commands
  • Pig Latin – Data Model
  • Pig Latin Statements
  • Pig data types
  • Pig Latin operators
  • Case sensitivity
  • Grouping & co-grouping in Pig Latin
  • Sorting & Filtering
  • Joins in Pig Latin
  • Built-in functions
  • Writing UDFs
  • Macros in Pig

HBase:

  • What is HBase
  • History Of HBase
  • The NoSQL Scenario
  • HBase & HDFS
  • Physical Storage
  • HBase v/s RDBMS
  • Features of HBase
  • HBase Data model
  • Master server
  • Region servers & Regions
  • HBase Shell
  • Create table and column family
  • The HBase Client API

Spark:

  • Introduction to Apache Spark
  • Features of Spark
  • Spark built on Hadoop
  • Components of Spark
  • Resilient Distributed Datasets
  • Data Sharing using Spark RDD
  • Iterative Operations on Spark RDD
  • Interactive Operations on Spark RDD
  • Spark shell
  • RDD transformations
  • Actions
  • Programming with RDD
    • Start Shell
    • Create RDD
    • Execute Transformations
    • Caching Transformations
    • Applying Action
    • Checking output
  • GraphX overview

Impala:

  • Introducing Cloudera Impala
  • Impala Benefits
  • Features of Impala
  • Relational databases vs Impala
  • How Impala works
  • Architecture of Impala
  • Components of Impala
    • The Impala Daemon
    • The Impala Statestore
    • The Impala Catalog Service
  • Query Processing Interfaces
  • Impala Shell Command Reference
  • Impala Data Types
  • Creating & deleting databases and tables
  • Inserting & overwriting table data
  • Record Fetching and ordering
  • Grouping records
  • Using the Union clause
  • Working of Impala with Hive
  • Impala v/s Hive v/s HBase

MongoDB Overview:

  • Introduction to MongoDB
  • MongoDB v/s RDBMS
  • Why & Where to use MongoDB
  • Databases & Collections
  • Inserting & querying documents
  • Schema Design
  • CRUD Operations

Oozie & Hue Overview:

  • Introduction to Apache Oozie
  • Oozie Workflow
  • Oozie Coordinators
  • Property File
  • Oozie Bundle system
  • CLI and extensions
  • Overview of Hue

Hive:

  • What is Hive?
  • Features of Hive
  • The Hive Architecture
  • Components of Hive
  • Installation & configuration
  • Primitive types
  • Complex types
  • Built in functions
  • Hive UDFs
  • Views & Indexes
  • Hive Data Models
  • Hive vs Pig
  • Co-groups
  • Importing data
  • Hive DDL statements
  • Hive Query Language
  • Data types & Operators
  • Type conversions
  • Joins
  • Sorting & controlling data flow
  • Local vs MapReduce mode
  • Partitions
  • Buckets

Sqoop:

  • Introducing Sqoop
  • Sqoop installation
  • Working of Sqoop
  • Understanding connectors
  • Importing data from MySQL to Hadoop HDFS
  • Selective imports
  • Importing data to Hive
  • Importing to HBase
  • Exporting data to MySQL from Hadoop
  • Controlling import process

Flume:

  • What is Flume?
  • Applications of Flume
  • Advantages of Flume
  • Flume architecture
  • Data flow in Flume
  • Flume features
  • Flume Event
  • Flume Agent
    • Sources
    • Channels
    • Sinks
  • Log Data in Flume

Zookeeper Overview:

  • Zookeeper Introduction
  • Distributed Application
  • Benefits of Distributed Applications
  • Why use Zookeeper
  • Zookeeper Architecture
  • Hierarchical Namespace
  • Znodes
  • Stat structure of a Znode
  • Electing a leader

Kafka Basics:

  • Messaging Systems
    • Point-to-Point
    • Publish-Subscribe
  • What is Kafka
  • Kafka Benefits
  • Kafka Topics & Logs
  • Partitions in Kafka
  • Brokers
  • Producers & Consumers
  • What are Followers
  • Kafka Cluster Architecture
  • Kafka as a Pub-Sub Messaging
  • Kafka as a Queue Messaging
  • Role of Zookeeper
  • Basic Kafka Operations
    • Creating a Kafka Topic
    • Listing out topics
    • Starting Producer
    • Starting Consumer
    • Modifying a Topic
    • Deleting a Topic
  • Integration With Spark

Scala Basics:

  • Introduction to Scala
  • Spark & Scala interdependence
  • Objects & Classes
  • Class definition in Scala
  • Creating Objects
  • Scala Traits
  • Basic Data Types
  • Operators in Scala
  • Control structures
  • Fields in Scala
  • Functions in Scala
  • Collections in Scala
    • Mutable collection
    • Immutable collection
