I’ve been using HDFS as storage for almost 3 years, reading data from it and writing data to it with Hive and Spark, but I never learned the details. I finally had some time to watch the Big Data Essentials course on Coursera, which inspired me to take a deep dive into the HDFS architecture. This post covers so much about HDFS that it took me 3 days to summarize and write it all down. If anything is wrong, please let me know and I’ll fix it! Let’s take a look.