HPC, MPI, MapReduce

Learning Notes of CMU Course 15-640 Distributed Systems

Posted by haohanz May 09, 2019 · Stay Hungry · Stay Foolish


(Supercomputers + MPI)

  • MPI’s focus:
    • Low latency
    • High scalability
  • HPC:
    • Characteristics
      • High bandwidth
      • Long-lived process
      • In memory computation
      • Spatial locality exploited via partitioning
    • Pros
      • High resource utilization
      • Effectiveness
    • Cons
      • Requires careful tuning of the application to the resources
      • Low generality; cannot tolerate variability
      • Poor fault tolerance
    • Fault tolerance
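      • Checkpointing: periodically save state and restore from the last checkpoint after a failure
        • Large I/O traffic
        • Wasted intervening computation (work done since the last checkpoint must be redone)
      • Scaling is sensitive to the number of failing components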
  • MPI:
    • Pros
      • Handle iterative, interactive computations.
      • Handle tightly-coupled computations
      • Handle communication components
    • Cons
      • Poor fault tolerance
      • Cannot be more complex
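
The checkpointing trade-off noted above (extra I/O, plus recomputation of the work done since the last checkpoint) can be illustrated with a small single-process sketch. This is only a toy model, not how any real HPC checkpoint library works: the function name `run_with_checkpoints`, the `fail_at` parameter, and the pickle-file format are all assumptions made up for illustration.

```python
import os
import pickle
import tempfile


def run_with_checkpoints(total_steps, checkpoint_every, path, fail_at=None):
    """Sum 0..total_steps-1, saving (step, acc) every `checkpoint_every` steps.

    Returns (result, steps_executed_in_this_call).
    `fail_at` is a hypothetical knob that simulates a node failure.
    """
    # Restore from the last checkpoint if one exists, else start fresh.
    if os.path.exists(path):
        with open(path, "rb") as f:
            step, acc = pickle.load(f)
    else:
        step, acc = 0, 0

    executed = 0
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated node failure")
        acc += step          # the "computation" for this step
        step += 1
        executed += 1
        if step % checkpoint_every == 0:
            # Write to a temp file, then rename, so a crash mid-write
            # never leaves a corrupt checkpoint behind.
            tmp = path + ".tmp"
            with open(tmp, "wb") as f:
                pickle.dump((step, acc), f)
            os.replace(tmp, path)
    return acc, executed


if __name__ == "__main__":
    ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.pkl")
    try:
        run_with_checkpoints(100, 10, ckpt, fail_at=37)
    except RuntimeError:
        pass  # the last checkpoint was taken at step 30
    result, executed = run_with_checkpoints(100, 10, ckpt)
    print(result, executed)
```

In this run the failure hits at step 37 and the last checkpoint is from step 30, so steps 30 to 36 are computed twice. A shorter checkpoint interval wastes less computation but writes state more often, which is the I/O cost listed above.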

Cluster Computing

(MapReduce + Implementation)

  • MapReduce deals with stragglers:
    • Solved: stragglers not caused by poor partitioning (e.g., caused by I/O latency, slow servers, or bugs)
      • Reschedule the tasks that have not finished
    • Not solved: stragglers caused by poor partitioning (Spark addresses this)
      • Partitioning is done by hashing on the mapper: with M mappers and R reducers, each mapper writes R output files
  • Why use cluster file system on the input and output? (HDFS)
    • Fault tolerance
    • Dynamic scheduling of tasks over replicated files
    • Heterogeneous nodes
    • Maximize cost-performance
    • Minimal downtime, minimal staffing
  • MapReduce:
    • Pros
      • Failure model
      • High throughput
      • Simple programming
      • Loosely-coupled, coarse-grained parallel tasks
    • Cons
      • Disk I/O between stages, so latency is high
      • Poor fit for iterative and interactive computation
      • The abstraction is not expressive enough, so specialized analytics systems have to be built on top
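
The hash partitioning mentioned above (M mappers, R reducers, each mapper writing R outputs) can be sketched as an in-memory word count. This is a rough illustration rather than Hadoop's actual implementation: the names `partition`, `map_task`, `reduce_task`, and `map_reduce` are hypothetical, Python lists stand in for the intermediate files, and `zlib.crc32` is used only as a convenient deterministic hash.

```python
import zlib
from collections import defaultdict


def partition(key, R):
    # Deterministic hash: every mapper must send a given key to the same reducer.
    return zlib.crc32(key.encode("utf-8")) % R


def map_task(chunk, R):
    """One mapper: emit (word, 1) pairs into R buckets (the "R files")."""
    buckets = [[] for _ in range(R)]
    for word in chunk.split():
        buckets[partition(word, R)].append((word, 1))
    return buckets


def reduce_task(pairs):
    """One reducer: sum the counts for each word it received."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


def map_reduce(chunks, R):
    # M mappers (one per input chunk), each producing R buckets.
    mapper_outputs = [map_task(chunk, R) for chunk in chunks]
    results, loads = [], []
    for r in range(R):
        # Shuffle: reducer r reads bucket r from every mapper.
        pairs = [p for out in mapper_outputs for p in out[r]]
        loads.append(len(pairs))
        results.append(reduce_task(pairs))
    return results, loads


if __name__ == "__main__":
    results, loads = map_reduce(["a b a", "b a c", "a a a"], R=2)
    print(results)
    print(loads)  # pairs received per reducer; a hot key makes this uneven
```

Because all occurrences of a key hash to the same reducer, a very frequent key (here `a`) concentrates work on one reducer. Rescheduling that task on another node does not help, since the backup copy receives exactly the same input, which is why this kind of straggler is listed as not solved.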