Google File System
Learning Notes of CMU Course 15-640 Distributed Systems
Posted by haohanz · May 08, 2019 · Stay Hungry · Stay Foolish
Table of Contents
GFS’s Focus
GFS’s Assumptions
Master-Slave Pattern
Chunk Size
Client Operations
Fault Tolerance
GFS Consistency Model
GFS’s Limitations
GFS’s Focus
Fault tolerance
Scalability (focus on aggregate throughput; sacrifice response time/latency for individual reads and writes)
Low synchronization overhead between the entities of GFS
No caching
Allow inconsistency on appends, so there is no synchronization overhead between clients
Decoupling control flow and data flow for throughput and scalability
GFS’s Assumptions
Files are large (hence the 64 MB chunk size)
Streaming reads and writes (write once, read once)
Atomicity for appends, without synchronization overhead among clients
Master-Slave Pattern
Role of the master:
Garbage collection
Stores all metadata
Migrates chunks between servers for fault tolerance
Delegates consistency management
Role of a chunk server:
Knows nothing besides itself
Sends heartbeats to the master
Handles client requests
No caching!
Role of client:
No caching!
Reduces consistency overhead and keeps the design simple
With streaming reads and writes assumed, caching brings no benefit
Exchanges data directly with chunk servers (data flow)
Avoids a master bottleneck and exploits parallelism
Exchanges only control messages with the master (control flow); a read-path sketch follows
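To make the control/data split concrete, here is a minimal, runnable Python sketch of the read path. All names (`Master`, `Replica`, `lookup`, `read_chunk`, the `hops` distance) are illustrative stand-ins, not the real GFS API:

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 64 * 2**20  # 64 MB, per the assumptions above

@dataclass
class Replica:
    hops: int            # crude stand-in for network distance
    data: bytes = b""

    def read_chunk(self, handle, offset, length):
        # Data flow: bulk bytes come straight from the chunk server.
        return self.data[offset:offset + length]

@dataclass
class Master:
    # filename -> list of (chunk handle, replicas); metadata only, no file bytes.
    files: dict = field(default_factory=dict)

    def lookup(self, filename, chunk_index):
        # Control flow: a small metadata answer.
        return self.files[filename][chunk_index]

def read(master, filename, offset, length):
    chunk_index = offset // CHUNK_SIZE             # which chunk holds this offset
    handle, replicas = master.lookup(filename, chunk_index)
    closest = min(replicas, key=lambda r: r.hops)  # "read the closest"
    return closest.read_chunk(handle, offset % CHUNK_SIZE, length)
```

Note that the master never touches file bytes, which is exactly what keeps it off the data path.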
Chunk Size
Why a 64 MB chunk size?
Cons:
GFS overhead per chunk
Pros:
Reduces the master bottleneck: less metadata to store
Reduces communication, hence less master workload and better scalability
Matches the large-file assumption
With a smaller chunk size, streaming reads would be harder and require more communication (a back-of-the-envelope calculation follows)
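A back-of-the-envelope illustration of the metadata saving. The roughly 64 bytes of master metadata per chunk is the figure cited in the GFS paper; the 1 PB total is an arbitrary assumption:

```python
# Master metadata footprint for 1 PB of data at two candidate chunk sizes.
TOTAL_DATA = 10**15                 # 1 PB (assumed)
METADATA_PER_CHUNK = 64             # ~64 bytes/chunk, per the GFS paper

for chunk_size in (64 * 2**20, 1 * 2**20):        # 64 MB vs. 1 MB
    chunks = TOTAL_DATA // chunk_size
    print(f"{chunk_size >> 20:>2} MB chunks: {chunks:>12,} chunks, "
          f"~{chunks * METADATA_PER_CHUNK / 2**30:.1f} GiB of master RAM")
```

With 64 MB chunks the metadata fits comfortably in RAM (~0.9 GiB here); at 1 MB chunks it balloons to tens of GiB.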
Three replicas: two in the same rack, one in a different rack
A tradeoff between access time and safety (a placement sketch follows)
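A sketch of this "2 + 1" placement policy; the data shapes here are assumptions for illustration only:

```python
import random

def place_replicas(racks):
    """Pick three chunk servers: two in one rack (cheap intra-rack traffic
    for the write daisy chain), one in another rack (survives a whole-rack
    failure). `racks` maps rack id -> list of chunk servers (assumed shape,
    with at least two racks and two servers in the chosen rack)."""
    same_rack, other_rack = random.sample(sorted(racks), 2)
    return random.sample(racks[same_rack], 2) + [random.choice(racks[other_rack])]

# Example: place_replicas({"r1": ["cs1", "cs2", "cs3"], "r2": ["cs4", "cs5"]})
```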
Client Operations
Append:
Atomicity is ensured: concurrent appends do not overwrite each other
At-least-once semantics: a record may be duplicated after a retry (see the sketch below)
Why append?
Clients rarely overwrite; they append far more often
Higher performance for writing lots of small records than for rewriting large chunks, e.g. log data, CPU info…
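A sketch of what at-least-once record append means for a client. `record_append` is a stand-in for the GFS call in which the system, not the client, chooses the offset; tagging records with a unique id is one common way applications cope with the duplicates that retries can create:

```python
import uuid

def append_record(client, filename, payload, retries=3):
    # Tag the record so readers can deduplicate: a retry after a partial
    # failure may land the same record in the file twice (at-least-once).
    record = uuid.uuid4().bytes + payload
    for attempt in range(retries):
        try:
            # GFS picks the offset and returns it; the client never does.
            return client.record_append(filename, record)
        except IOError:
            continue  # retrying may duplicate the record; that's accepted
    raise IOError("record append failed after %d attempts" % retries)
```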
Read:
Read from the closest replica
Write:
Write to all replicas
Data is passed along a daisy chain of replicas (sketch below)
All replicas apply mutations in the same order
The client's request decides the offset (unlike appends, where GFS picks the offset)
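A store-and-forward caricature of the daisy chain (real GFS pipelines the bytes as they arrive); `ChunkServerStub` and its `receive` method are invented for the sketch:

```python
class ChunkServerStub:
    def __init__(self, name):
        self.name = name
        self.buffer = None   # staged data, not yet committed

    def receive(self, data, rest_of_chain):
        self.buffer = data   # stage locally first
        if rest_of_chain:
            # Forward to the next replica in the chain, so each machine's
            # outgoing link carries the data exactly once.
            rest_of_chain[0].receive(data, rest_of_chain[1:])

def push_data(data, chain):
    """The client sends bytes only to the first (closest) replica;
    the replicas forward them along the chain."""
    chain[0].receive(data, chain[1:])
```

Once the data is staged everywhere, the client sends the small write request (control flow) to the primary, which assigns the serial order that all replicas apply.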
Delete:
The file is renamed to a hidden name
A background scan later deletes the metadata in RAM and tells chunk servers to delete the chunks (sketch below)
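A toy model of this lazy deletion. The hidden-name convention is invented; the three-day grace period is the default mentioned in the GFS paper:

```python
import time

HIDDEN_PREFIX = ".deleted."
GRACE_PERIOD = 3 * 24 * 3600          # three days, the GFS paper's default

def delete(namespace, filename):
    # Deletion is only a metadata rename; chunk data stays put for now.
    hidden = f"{HIDDEN_PREFIX}{int(time.time())}.{filename}"
    namespace[hidden] = namespace.pop(filename)

def gc_scan(namespace, now=None):
    # Background scan: purge hidden files older than the grace period;
    # their now-orphaned chunks get reclaimed via heartbeat exchanges.
    now = time.time() if now is None else now
    for name in list(namespace):
        if name.startswith(HIDDEN_PREFIX):
            stamp = int(name.split(".", 3)[2])
            if now - stamp > GRACE_PERIOD:
                del namespace[name]
```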
Fault Tolerance
Chunk server fault tolerance
Missed heartbeats mark a chunk server as dead
The master decrements the replica count of that server's chunks
Under-replicated chunks are re-replicated in the background (sketch below)
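A sketch of the master-side bookkeeping; the 60-second timeout and the dictionary shapes are assumptions:

```python
import time

HEARTBEAT_TIMEOUT = 60.0   # assumed: seconds of silence before declaring death
TARGET_REPLICAS = 3

def check_heartbeats(last_seen, server_chunks, replica_count, now=None):
    """`last_seen`: server -> timestamp of last heartbeat.
    `server_chunks`: server -> chunks it holds.  Returns chunks that fell
    below the replication target and need background re-replication."""
    now = time.time() if now is None else now
    need_rereplication = []
    for server, seen in list(last_seen.items()):
        if now - seen > HEARTBEAT_TIMEOUT:
            del last_seen[server]                  # declare the server dead
            for chunk in server_chunks.pop(server, []):
                replica_count[chunk] -= 1          # one fewer live replica
                if replica_count[chunk] < TARGET_REPLICAS:
                    need_rereplication.append(chunk)
    return need_rereplication
```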
The chunk-server consistency mechanism: leases and version numbers
The master grants a lease to one replica, the primary (only during mutation operations!)
The lease is revoked if the file is being renamed or deleted!
The chunk version number is bumped each time a new lease is assigned
A lease lasts 60 seconds and is refreshed if the mutation is not yet finished
Why leases and version numbers?
Network partitions
Primary failure
The lease is revoked when it expires or when the file is renamed/deleted
Version numbers detect outdated chunk servers: a server that was down during a lease renewal misses the version bump and is treated as failed
Master fault tolerance
Write-ahead logging (WAL)
Master-slave replication of the log
The master replies to a client only after the log record is safely persisted
The log cannot grow too long: replication becomes a bottleneck, log replay lengthens recovery time, and the metadata must fit in RAM
Build periodic checkpoints (a WAL-plus-checkpoint sketch follows)
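A minimal single-node sketch of the log-then-apply discipline with checkpoints (real GFS also replicates the log to remote machines before replying); the file formats and names are invented:

```python
import json, os

class MasterLog:
    """Sketch: mutate -> append log record -> fsync -> apply -> reply.
    Checkpoints truncate the log so replay stays short on recovery."""

    def __init__(self, log_path="master.log", ckpt_path="master.ckpt"):
        self.log_path, self.ckpt_path = log_path, ckpt_path
        self.metadata = {}
        self.log = open(log_path, "a")

    def apply(self, key, value):
        # 1. Persist the record BEFORE touching in-memory state or replying.
        self.log.write(json.dumps({"k": key, "v": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # 2. Only now mutate the metadata (and answer the client).
        self.metadata[key] = value

    def checkpoint(self):
        # Snapshot the state, then truncate the log: recovery will replay
        # only records written after this point.
        with open(self.ckpt_path, "w") as f:
            json.dump(self.metadata, f)
        self.log.close()
        self.log = open(self.log_path, "w")

    def recover(self):
        # Load the last checkpoint, then replay the (short) log tail.
        if os.path.exists(self.ckpt_path):
            with open(self.ckpt_path) as f:
                self.metadata = json.load(f)
        with open(self.log_path) as f:
            for line in f:
                rec = json.loads(line)
                self.metadata[rec["k"]] = rec["v"]
```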
Recovery steps
Metadata recovery
File-to-chunk-ID mapping: recovered from the checkpoint plus log replay
Chunk-ID-to-chunk-server mapping: rebuilt by asking chunk servers what they hold
A replica reporting an older version number is stale (that server was down during a lease renewal); take its copy out of service
A replica reporting a newer version number is accepted (a reconciliation sketch follows)
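A sketch of the version-number reconciliation as chunk servers report in; `drop_replica` and the dictionary shapes are placeholders:

```python
def drop_replica(chunk, server):
    # Placeholder: remove the stale copy from the replica set and
    # schedule it for garbage collection.
    print(f"dropping stale replica of {chunk} on {server}")

def reconcile(master_versions, server, reported):
    """`reported`: chunk id -> version number held by `server`."""
    for chunk, version in reported.items():
        known = master_versions.get(chunk)
        if known is None:
            continue                      # unknown chunk: a GC candidate
        if version < known:
            # Stale: the server was down during a lease grant and missed
            # the version bump, so its copy must not serve reads.
            drop_replica(chunk, server)
        elif version > known:
            # Newer than the master's record: the master failed between
            # granting the lease and logging it; accept the higher version.
            master_versions[chunk] = version
```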
GFS Consistency Model
A failure during an append can leave replicas inconsistent
Changes to the namespace are consistent
Guaranteed by the WAL, master-slave replication, and the single master
GFS’s Limitations
Data corruption: must be caught by application-level checksums
The master is the biggest bottleneck to scaling:
It takes a long time to rebuild/recover the metadata
Performance and availability suffer
Solution: multiple masters
Not very secure: users can interfere with each other
The chunk size cannot be made smaller
Read-once, write-once workloads are assumed
No caching