Case Study of Hadoop

Hadoop is an open-source java-based software framework sponsored by the Apache Software Foundation for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware.

It provides storage for big data at reasonable cost. Hadoop process big data in a single place as in a storage cluster doubling as a compute cluster.

Hadoop Architecture and Components:

Apache Hadoop consist of two major parts:

Hadoop Distributed File System (HDFS)
MapReduce

1. Hadoop Distributed File System:

HDFS is a file system or storage layer of Hadoop. It can store data and can handle very large amount of data.

When capacity of file is large then it is necessary to partition it. And the file systems manage the storage across a network of machine are called distributed file systems.

An HDFS cluster has two types of node operating in a master-worker pattern- Name Node and No. of Data Nodes.

Hadoop keep data safe by duplicating data across nodes.

2. MapReduce:

MapReduce is a programming framework. It organize multiple computers in a cluster in order to perform the calculations. It takes care of distributing the work between computers and putting results together.

Hadoop works in a Master-Worker / Master-slave fashion:-

1. Master:
Master contains Name node and Job tracker components.

Name node: It holds information about all the other nodes in the Hadoop Cluster, files in the cluster, blocks of files, their locations etc.
Job tracker: It keeps track of the individual tasks assigned to each of the nodes and coordinates the exchange of information and result.

2. Worker:
Worker contains Task tracker and Data node components.

Task Tracker: It is responsible for running the task assigned to it.
Data node: It is responsible for holding the data.

Other components of Hadoop architecture are : -Chukwa, Hive, HBase, Mahoutetc.

Characteristics of Hadoop:

Hadoop provides a reliable shared storage(HDFS) and analysis system (Map Reduce).
Hadoop is highly scalable. It can contain thousands of servers.
Hadoop works on the principles of write once and read multiple times.
Hadoop is highly flexible, can process both structured as well as unstructured data.

Download as PDF

3 thoughts on “Case Study of Hadoop”

www.myclevernest.com

March 4, 2026 at 2:31 pm

I am really loving the theme/design of your website. Do you ever run into
any internet browser compatibility problems? A few of my blog visitors have complained about my blog not operating correctly in Explorer but looks great in Firefox.
Do you have any ideas to help fix this issue?
gudangbet88 login

January 17, 2026 at 5:34 pm

You actually make it seem so easy together with your presentation but I to find this matter to be really one thing which I believe
I would never understand. It sort of feels too complicated and extremely broad for me.
I am looking forward for your next submit, I’ll attempt to get the
hang of it!
ancient rome travel

September 15, 2024 at 1:38 am

In the end, the goal of counseling psychology is to enhance the psychological well being and nicely-being of people and their loved ones, and to help clients to lead happier and extra fulfilling lives.

Hadoop Architecture and Components:

1. Hadoop Distributed File System:

2. MapReduce:

Characteristics of Hadoop:

3 thoughts on “Case Study of Hadoop”

Leave a Comment