Photo by Viktor Talashuk on Unsplash

Google File System Architecture

It is a classic case study that anyone learning distributed systems cannot avoid.

JIN
13 min read · Jul 10, 2022

Existing distributed file systems such as Red Hat’s GFS (Global File System), IBM’s GPFS, and Lustre are used in high-performance computing and large-scale data centers, and they place high demands on hardware. The Google File System (GFS) is not an exotic technology. It is a large distributed system built on cheap servers. It treats server failure as the normal case and tolerates faults automatically in software, which greatly reduces the cost of the system while preserving its reliability and availability. It is closely integrated with technologies such as Chubby, MapReduce, Bigtable, Megastore, and Percolator, and sits at the bottom of all of these core technologies. Since it is not an open-source system, we can only learn about it from the technical papers published by Google.

When I read this paper again, I felt that many of the problems it describes could only be solved this way, because it is the best solution. Google has one of the largest datasets in the world, so how to store data at this scale efficiently and reliably becomes an important issue.

The characteristics of Google Applications

  1. The dataset is huge
