Best training institute in Hyderabad for Hadoop & Big Data

Hadoop Course Content

What is Big Data?
What is Hadoop?
The relation between Big Data and Hadoop.
What is the need for going ahead with Hadoop?
Scenarios for Adopting Hadoop Technology in Real-Time Projects
Challenges with Big Data

Storage
Processing

How Hadoop Addresses Big Data Challenges
Comparison with Other Technologies

RDBMS
Data Warehouse
Teradata

Different Components of the Hadoop Ecosystem

Storage Components
Processing Components

HDFS (Hadoop Distributed File System)

What is a Cluster Environment?
Cluster Vs Hadoop Cluster.
The significance of HDFS in Hadoop
Features of HDFS
Storage aspects of HDFS

Block
How to Configure block size
Default Vs Configurable Block size
Why is the HDFS Block size so large?
Design Principles of Block Size

HDFS Architecture – 5 Daemons of Hadoop

Name Node and its functionality
Data Node and its functionality
Job Tracker and its functionality
Task Tracker and its functionality
Secondary Name Node and its functionality.

Replication in Hadoop – Fail Over Mechanism

Data Storage in Data Nodes
Fail Over Mechanism in Hadoop – Replication
Replication Configuration
Custom Replication
Design Constraints with Replication Factor

Accessing HDFS

CLI (Command Line Interface) and HDFS Commands
Java Based Approach

Map Reduce

Why is Map Reduce essential in Hadoop?
Processing Daemons of Hadoop
Job Tracker

Roles Of Job Tracker
Drawbacks of Job Tracker failure in a Hadoop Cluster
How to Configure Job Tracker in the Hadoop cluster

Task Tracker

Roles of Task Tracker
Drawbacks of Task Tracker Failure in a Hadoop Cluster

Input Split

Input Split
Need for Input Split in Map Reduce
Input Split Size
Input Split Size Vs Block Size
Input Split Vs Mappers

Map Reduce Life Cycle

Communication Mechanism of Job Tracker & Task Tracker
Input Format Class
Record Reader Class
Success Case Scenarios
Failure Case Scenarios
Retry Mechanism in Map Reduce

Map Reduce Programming Model

Different phases of Map Reduce Algorithm
Different Data Types in Map Reduce

Primitive Data Types Vs Map Reduce Data Types

How to write a basic Map Reduce Program

Driver Code
Mapper Code
Reducer Code

Driver Code

Importance of Driver Code in a Map-Reduce Program
How to Identify the Driver Code in Map Reduce Program
Different sections of Driver code

Mapper Code

Importance of Mapper Phase in Map Reduce
How to Write a Mapper Class?
Methods in Mapper Class

Reducer Code

Importance of Reduce phase in Map Reduce
How to Write Reducer Class?
Methods in Reducer Class

IDENTITY MAPPER & IDENTITY REDUCER

Input Formats in Map Reduce

Text Input Format
Key Value Text Input Format
NLine Input Format
DB Input Format
Sequence File Input Format.
How to Use a specific Input format in Map Reduce

Output Formats in Map Reduce

Text Output Format
DB Output Format
Sequence File Output Format.
How to Use the specific Output format in Map Reduce

Combiner in Map Reduce

Is a Combiner mandatory in Map Reduce?
How to Use the Combiner class in Map Reduce
Performance trade-offs of using a Combiner

Partitioner in Map Reduce

Importance of Partitioner class in Map Reduce
How to use the Partitioner class in Map Reduce
Hash Partitioner Functionality
How to write a custom Partitioner

Compression Techniques in Map Reduce

Importance of Compression in Map Reduce
What is a Codec?
Compression Types
Gzip Codec
Bzip2 Codec
LZO Codec
Snappy Codec
Configurations for Compression Techniques
How to configure Compression for a single job Vs all jobs.

Joins in Map Reduce

Map Side Join
Reduce Side Join
Distributed cache

How to debug MapReduce jobs in Local and Pseudo cluster Mode
Data Locality in Map Reduce

Apache PIG

Introduction to Apache Pig
SQL Vs Apache Pig
Different data types in Pig
Modes of Execution in Pig

Local Mode
Map Reduce OR Distributed Mode

Execution Mechanism

Grunt Shell
Script
Embedded

Transformations in Pig
How to develop the Complex Pig Script
Bags, Tuples, and fields in PIG
UDFs in Pig
Driver Code

Need for using UDFs in PIG
How to use UDFs
REGISTER keyword in PIG

When to use Map Reduce & Apache PIG in Real-Time Projects

APACHE HIVE

Hive Introduction
Need for Apache HIVE in Hadoop
Hive Architecture

Driver
Compiler
Executor (Semantic Analyzer)

Meta Store in Hive

Importance of Hive Meta Store
Embedded meta store configuration
External meta store configuration
Communication mechanisms with the Metastore

Hive Integration with Hadoop
Hive Query Language (Hive QL)
Configuring Hive With MySQL Metastore
SQL Vs Hive QL
Data Slicing Mechanisms

Partitions in Hive
Buckets In Hive
Partitioning Vs Bucketing
Real-Time Use Cases

Collection Data Types in HIVE

Array
Struct
Map

User Defined Functions (UDFs) in HIVE

UDFs
UDAFs
UDTFs
Need for UDFs in HIVE

Hive Serializer/De-serializer – SerDe
HIVE – HBase Integration

APACHE SQOOP

Introduction to Sqoop.
MySQL client and Server Installation
How to connect to Relational Database using Sqoop
Different Sqoop Commands

Different Flavors of Imports
Export
Hive-Imports

APACHE HBase

HBase Introduction
HDFS Vs HBase
HBase Use cases
HBase basics

Column Families
Scans

HBase Architecture
Clients

REST
Thrift
Java Based
Avro

Map Reduce Integration
Map Reduce over HBase
HBase Admin

Schema Definition
Basic CRUD Operations

APACHE Flume

Flume Introduction
Flume Architecture
Flume Master, Flume Collector, and Flume Agent
Flume Configurations
Real-Time Use Case using Apache Flume

APACHE Oozie

Oozie Introduction
Oozie Architecture
Oozie Configuration Files
Oozie Job Submission

workflow.xml
coordinator.xml
Job.coordinator.properties

YARN (Yet Another Resource Negotiator) – Next-Generation Map Reduce

What is YARN?
YARN Architecture

Resource Manager
Application Master
Node Manager

When should we go ahead with YARN?
Classic Map Reduce Vs YARN Map Reduce
Different Configuration Files for YARN

MongoDB (As part of NoSQL Databases)

The need for NoSQL Database
Relational Vs Non-Relational Databases
Introduction to MongoDB
Installation of MongoDB
MongoDB Basic operations

APACHE SPARK

Spark Architecture
Spark Processing with Use cases
Spark with Scala
Spark With SQL

Hadoop Administration

Hadoop Single Node Cluster Setup (Hands-on Installation on Laptops)
Operating System Installation
JDK Installation
SSH Configuration
Dedicated Group & User Creation
Hadoop Installation
Different Configuration Files Setting
Name node format
Starting the Hadoop Daemons
PIG Installation (Hands-on Installation on Laptops)

Local Mode
Clustered Mode
Bashrc file configuration

SQOOP Installation (Hands-on Installation on Laptops)

Sqoop installation with MySQL Client

HIVE Installation (Hands-on Installation on Laptops)

Local Mode
Clustered Mode

HBase Installation (Hands-on Installation on Laptops)

Local Mode
Clustered Mode

Offers from Training:

Provided 2 POCs to work with Hadoop and its Components
Provided Soft Copies of All the Materials with Use Cases
Provided Certification Assistance
Provided Project Exposure and Discussion

Course Features

Language: English
Lectures: 1
Certification: Yes
Project: 1 Minor + 2 Major
Duration: 50 hrs
Max-Students: 20

© Copyright - 2021 | Cyberaegis. All Rights Reserved.