Jayvardhan Reddy
Oct 27, 2017 · 2 min read

BigData - part3 “Hadoop 1.0 Architecture”

Hadoop 1.0

A Hadoop 1.0 cluster is made up of a NameNode (master), a Secondary NameNode (the checkpointing node, often loosely called the backup node) and a collection of DataNodes (slaves).

Internal Process of 1.0 architecture

When a client triggers a job, it is submitted to the JobTracker, the master daemon for MapReduce. The JobTracker asks the NameNode where the input blocks are stored and checks which slave nodes are available. Based on that availability, it breaks the job down into tasks and assigns them to the TaskTrackers running on the DataNodes. The TaskTrackers then perform the required operations side by side, which is the parallel processing behind the MapReduce algorithm.

Job Distribution into DataNodes
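To make the job flow above a bit more concrete, here is a minimal sketch in plain Python. It is not Hadoop code: the function names (`split_job`, `assign_tasks`, `map_task`, `reduce_counts`, `run_job`), the round-robin assignment and the word-count job are all assumptions chosen for illustration, and the tasks run one after another here rather than in parallel on separate machines.

```python
from collections import Counter, defaultdict
from itertools import cycle


def split_job(lines, chunk_size):
    """Break the input into fixed-size chunks, one map task per chunk."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def assign_tasks(tasks, trackers):
    """Hand tasks to the available trackers in round-robin order (an assumption)."""
    assignment = defaultdict(list)
    for tracker, task in zip(cycle(trackers), tasks):
        assignment[tracker].append(task)
    return dict(assignment)


def map_task(chunk):
    """Map step: count the words in one chunk of lines."""
    counts = Counter()
    for line in chunk:
        counts.update(line.split())
    return counts


def reduce_counts(partials):
    """Reduce step: merge the partial counts from every map task."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_job(lines, trackers, chunk_size=2):
    """Split the job, assign the tasks, run them and combine the results."""
    tasks = split_job(lines, chunk_size)
    assignment = assign_tasks(tasks, trackers)
    partials = [map_task(chunk)
                for chunks in assignment.values()
                for chunk in chunks]
    return reduce_counts(partials)
```

Calling `run_job(["a b", "b c", "c a", "a"], ["tt1", "tt2"])` splits the four lines into two tasks, gives one to each hypothetical tracker and merges their word counts into a single result.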

During this process the NameNode watches for DataNode failures. Every working DataNode sends a heartbeat signal to the NameNode at regular intervals. When the heartbeats from a DataNode stop arriving, the NameNode treats that node as failed and re-replicates its blocks on other DataNodes.
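The heartbeat check can be sketched roughly as follows. This is an illustrative model only: the class name `HeartbeatMonitor`, its methods and the timeout value used in the example are assumptions, not Hadoop's actual API or defaults. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each DataNode (hypothetical model)."""

    def __init__(self, timeout_seconds):
        # A node is considered failed once it has been silent for longer than this.
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}

    def record_heartbeat(self, node_id, now):
        """Remember when the given node last reported in."""
        self.last_seen[node_id] = now

    def dead_nodes(self, now):
        """Return the nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node_id
            for node_id, seen_at in self.last_seen.items()
            if now - seen_at > self.timeout_seconds
        )
```

For example, with a timeout of 10 seconds, a node that last reported at time 0 is listed by `dead_nodes(12)`, while a node that reported at time 8 is not.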

The default replication factor for each block stored in HDFS is 3, and the default size of each data block is 64 MB. The three copies are stored on different DataNodes spread across more than one rack, so that losing a single node or rack does not lose the data.
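A quick back-of-the-envelope calculation shows what these defaults mean for a single file. The helper names below (`block_count`, `replica_count`, `raw_storage_mb`) are made up for this sketch, and the 200 MB file is just an example.

```python
import math

BLOCK_SIZE_MB = 64        # default block size in Hadoop 1.0
REPLICATION_FACTOR = 3    # default replication factor


def block_count(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Number of blocks a file is split into (the last block may be partial)."""
    return math.ceil(file_size_mb / block_size_mb)


def replica_count(file_size_mb, replication=REPLICATION_FACTOR):
    """Total number of block copies stored across the cluster."""
    return block_count(file_size_mb) * replication


def raw_storage_mb(file_size_mb, replication=REPLICATION_FACTOR):
    """Disk space used across the cluster; a partial block only uses its real size."""
    return file_size_mb * replication
```

A 200 MB file is split into 4 blocks (three full 64 MB blocks and one 8 MB block), stored as 12 block copies, and takes up 600 MB of raw disk space across the cluster.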

While the cluster is running, a checkpoint of the file system metadata, known as the FsImage, is taken at regular intervals and stored on the Secondary NameNode, so that it can be used for recovery in case the NameNode goes down.
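As background for the checkpoint described above: in Hadoop 1.x the new image is produced by applying the log of recent changes (the edit log) to the previous FsImage. The sketch below models that idea with a plain dictionary. The function name `apply_edits`, the operation names and the record layout are assumptions for illustration and do not reflect Hadoop's on-disk format.

```python
def apply_edits(fs_image, edit_log):
    """Return a new image with the logged operations applied in order.

    fs_image maps a file path to its block count (a simplified stand-in
    for real metadata); edit_log is a list of (operation, path, value).
    """
    image = dict(fs_image)  # copy, so the previous checkpoint stays intact
    for operation, path, value in edit_log:
        if operation == "create":
            image[path] = value
        elif operation == "delete":
            image.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {operation}")
    return image
```

Keeping the old image untouched mirrors the point of a checkpoint: if anything goes wrong while building the new one, the previous state is still available for recovery.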

Pros and cons of this architecture

Pros:

  • Batch processing of large volumes of data became possible
  • Efficient storage with fault tolerance

Cons:

  • Scalability: a cluster has exactly one NameNode, so you cannot scale by adding more NameNodes.
  • If the NameNode fails, manual intervention (boot up) is required to get the Secondary NameNode up and running.
  • There is only a single JobTracker, so a large number of MapReduce tasks can overload it and slow down processing.
  • Bringing the Secondary NameNode up takes a considerable amount of time, during which the cluster is unavailable.


Written by Jayvardhan Reddy

Data Engineer. I write about Big Data architecture, the tools and techniques used to build Big Data pipelines, and other general topics.
