This is the second part of a blog post series dedicated to Apache HBase basics. The first part can be found here.

This chapter is dedicated to HBase administration topics such as cluster architecture, replication, and the data storage format. It will be helpful for system administrators as well as developers who want to know how HBase works internally.

HBase architecture

We start with the components that make up an HBase cluster and how they interact with each other.

An HBase cluster consists of a few Master servers and many RegionServers.

HBase runs on top of Apache Hadoop (it mostly requires only HDFS, where it stores the data) and Apache ZooKeeper. The Apache ZooKeeper cluster is used for failure detection of HBase nodes and stores the distributed configuration of the HBase cluster (more info in the following sections). …

This article is meant to be a “beginner’s guide” to Apache HBase.

HBase is a very mature product with extensive documentation that covers it in great detail. Nothing can replace the official documentation; it is the source of truth :)

But for people seeing the documentation for the first time, it is very hard to get a quick overview of the system’s capabilities and to understand whether it is suitable for their task. That’s why I wrote this post.

Let’s see what you can learn from it:

  • data model: how HBase stores your data, what is a table in HBase, etc.
  • how to access data in HBase at client…

Igor Skokov
