This is the second part of a blog post dedicated to Apache HBase basics. The first part can be found here.
This chapter is dedicated to HBase administration topics, e.g. HBase cluster architecture, replication, and data storage format. It will be helpful for system administrators as well as developers who want to know how HBase works inside.
We start with the components an HBase cluster has under the hood and how they interact with each other.
HBase runs on top of Apache Hadoop (it mostly requires only HDFS, where it stores its data) and Apache ZooKeeper. The Apache ZooKeeper cluster is used for failure detection of HBase nodes and stores the distributed configuration of the HBase cluster (more info in the following sections). …
This article aims to be a “beginner's guide” to Apache HBase.
HBase is a very mature product with extensive documentation that covers it in great detail. Nothing can replace the official documentation; it is the source of truth :)
But for people who see the documentation for the first time, it is very hard to get a quick overview of the system's capabilities and understand whether it is suitable for their task. That’s why I wrote this post.
Let’s see what you can learn from it: