Virtualization, Auto-Scaling, RPC

Virtualization
Caching Review
Scaling Architectures
RPC

Virtualization

Reasons for success of virtualization
- Scaling systems: dynamically growing resources in the cloud is hard without virtualization
  - e.g. “software-defined storage” virtualizes storage components
  - e.g. “software-defined networking” virtualizes network components
- Change CAPEX -> OPEX: change from capital expense to operational expense
- Flexible allocation of resources in cloud – “elasticity”
Why virtualization
- Elasticity - CAPEX->OPEX
- Resource & failure isolation -> monitor & limit resource usage, contain failure locally.
- Mixed-OS environment
- Security isolation -> fully control any resource access and possible actions of tenant
Virtualization interfaces
In the legacy world the hardware interface is the waistline, and it is narrow and stable
- Narrow - freer innovation, vendor neutrality
- Stable - longevity & ubiquity
Hardware virtualization (requirements, implementation ideas, implications)
- Overview: high fidelity, the virtual machine behaves nearly exactly like the real hardware
- Thin waistline interface (see above), narrow and stable
- Implementation
  - Create a virtualization layer, an extra level of indirection, and divide the OS into multiple parts
  - Originally the OS controls the hardware, and software is tightly coupled with hardware; with a virtualization layer, OS and hardware are decoupled
- Goal: fully emulate architecture
- Requirements
  - Portability
  - Encapsulation: (Mixed-OS) capture all VM state
  - Isolation: resource, fault, performance, software isolation
  - Interposition:
    - Completely control access to all system resources
    - Transformations on instructions, CPU, I/O, Memory, Disks
- Type 1:
  - Bare-metal (native): hypervisor, OS, and VMs are separated; the hypervisor runs directly on the hardware
  - Live migration
  - Higher performance
- Type 2:
  - Hosted: OS > hypervisor > VM > software
  - Easy to install, leverages the host’s device drivers
  - e.g. VMware Workstation, Parallels
- Can add significant overhead
  - Memory/Disk overhead (duplicate data)
  - I/O overhead
  - OS-startup overhead per VM
  - Hypervisor overhead
OS-level virtualization (requirements, implementation ideas, implications)
- Wide interface (OS interface, syscalls)
  - Brittle abstraction, os-specific, hard to secure, maintain, deploy
- Perception: VMs have too much overhead and are hard to scale up.
- Idea:
  - Multiple isolated instances of programs (within a VM, programs are not isolated from each other)
  - Running in user-space (shared kernel)
  - Instances see only resources (files, devices) assigned to their container
- Requirements:
  - Same as VM: resource & failure isolation, encapsulation, portability
  - Different from VM: no interposition, since there is no hypervisor (VMM)
- Problems:
  - Isolating which resources containers see
  - Isolating resource usage
  - Efficient per-container filesystems
- Implementation
  - Resource view isolation: a container’s processes see only the resources that belong to their own container
    - Each process has a namespace. Containers make direct syscalls, so security depends on the kernel.
  - Resource usage isolation: each container has its own process group (cgroup) with usage counters; rate limits for CPU and I/O, hard limits for disk, and the OOM killer for shutting down
  - File system isolation: a VM uses a virtual disk, a container does not
    - Use two layers: one read-only (RO) and one read-write (RW, per container, ephemeral)
- Advantage
  - Fast boot time
  - High density
  - Small I/O overhead
  - Require no CPU support
- Disadvantages of containers
  - Difficult to implement, large waistline: overlayfs, namespaces, cgroups
  - Less general
  - Hard to migrate
  - Insecure: large attack surface, containers have access to all syscalls!
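The two-layer filesystem from the implementation notes above can be sketched as a toy model. This is only an illustration under assumptions: the class `OverlayFS`, its methods, and the in-memory dicts are hypothetical names, and the real overlayfs lives in the kernel and works on directories, not Python dicts.

```python
class OverlayFS:
    """Toy two-layer filesystem: a shared read-only lower layer plus an
    ephemeral read-write upper layer per container (hypothetical model)."""

    _DELETED = object()  # whiteout marker recorded in the upper layer

    def __init__(self, lower):
        self.lower = lower  # shared image layer, never modified
        self.upper = {}     # per-container layer, discarded with the container

    def read(self, path):
        # The upper layer shadows the lower layer.
        if path in self.upper:
            data = self.upper[path]
            if data is self._DELETED:
                raise FileNotFoundError(path)
            return data
        if path in self.lower:
            return self.lower[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        # Writes never touch the shared lower layer.
        self.upper[path] = data

    def delete(self, path):
        self.read(path)  # raises FileNotFoundError if the file is not visible
        self.upper[path] = self._DELETED


image = {"/etc/hostname": "base", "/bin/app": "v1"}
a = OverlayFS(image)
b = OverlayFS(image)
a.write("/etc/hostname", "container-a")
a.delete("/bin/app")
print(a.read("/etc/hostname"))  # container-a
print(b.read("/etc/hostname"))  # base
```

Both containers share one copy of the image, which is where the high density comes from; each container's changes stay in its own upper layer.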
Summary
- VMs
  - Strengths: strong isolation guarantees, can run different OSes, VM migration practical
  - Weaknesses: OS startup, disk, memory, hypervisor overhead
- Containers
  - Strengths: fast startup times, negligible I/O overheads, very high density
  - Weaknesses: weak security isolation

Caching Review

Lease -> ∞ == callback
Lease -> 0 == check on use
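A minimal sketch of why the two extremes behave this way. Everything here is an assumption for illustration: `LeaseCache`, `fetch`, `clock`, and `invalidate` are hypothetical names, and the server is just a function call.

```python
import math


class LeaseCache:
    """Client cache where every entry is valid until its lease expires."""

    def __init__(self, fetch, lease_term, clock):
        self.fetch = fetch            # key -> value, stands in for the server
        self.lease_term = lease_term  # seconds; 0 or math.inf are the extremes
        self.clock = clock            # function returning the current time
        self.entries = {}             # key -> (value, lease expiry time)
        self.server_calls = 0

    def read(self, key):
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]           # lease still valid: no server contact
        self.server_calls += 1
        value = self.fetch(key)
        self.entries[key] = (value, now + self.lease_term)
        return value

    def invalidate(self, key):
        """Server callback: the cached copy is no longer valid."""
        self.entries.pop(key, None)


store = {"x": 1}
now = [0.0]
check_on_use = LeaseCache(store.get, 0, lambda: now[0])
callback = LeaseCache(store.get, math.inf, lambda: now[0])
check_on_use.read("x"); check_on_use.read("x")
callback.read("x"); callback.read("x")
print(check_on_use.server_calls, callback.server_calls)  # 2 1
```

With a lease of 0 the entry has already expired by the next read, so every use goes to the server. With an infinite lease the entry never expires on its own, so the server has to call `invalidate` when the data changes.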
At most once
- Requests carry sequence numbers
- Server detects duplicates
- TCP simplifies this
- Server computes at most once
- Client may send the request multiple times
- Has a final timeout
- Most likely to have orphans
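The at-most-once points above can be sketched as follows. This is an illustration under assumptions, not a real RPC library: `AtMostOnceServer`, `call_with_retries`, and the `drop_replies` parameter (which simulates lost replies) are hypothetical.

```python
class AtMostOnceServer:
    """Executes each (client_id, seq) request at most once and replays the
    stored reply for duplicates."""

    def __init__(self, handler):
        self.handler = handler
        self.replies = {}    # (client_id, seq) -> cached reply
        self.executions = 0

    def handle(self, client_id, seq, arg):
        key = (client_id, seq)
        if key in self.replies:
            return self.replies[key]  # duplicate: replay, do not re-execute
        self.executions += 1
        reply = self.handler(arg)
        self.replies[key] = reply
        return reply


def call_with_retries(server, client_id, seq, arg, drop_replies, max_attempts):
    """Client: resend the same sequence number until a reply gets through,
    giving up after max_attempts (the final timeout)."""
    for attempt in range(max_attempts):
        reply = server.handle(client_id, seq, arg)
        if attempt >= drop_replies:  # the first drop_replies replies are "lost"
            return reply
    raise TimeoutError("no reply before the final timeout")


balance = [0]

def deposit(amount):
    balance[0] += amount
    return balance[0]

server = AtMostOnceServer(deposit)
print(call_with_retries(server, "c1", 1, 10, drop_replies=2, max_attempts=5))  # 10
print(server.executions)  # 1
```

The request is sent three times but the deposit happens once. When the client hits the final timeout it cannot tell whether the server executed the call, which is the price of at-most-once.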
At least once
- Resend, compute, resend, compute, ...
- Server side is stateless
- No duplicate detection
- Doesn’t store the computed result
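For contrast, a sketch of at-least-once under the same simulated lost replies. The helper `at_least_once_call` and its `lost_replies` parameter are hypothetical. The usual assumption, not spelled out in these notes, is that operations are idempotent, so recomputing is harmless.

```python
def at_least_once_call(operation, lost_replies, max_attempts=10):
    """Resend until a reply arrives. The server keeps no state, so every
    resend is computed again. Returns (result, number of executions)."""
    executions = 0
    for attempt in range(max_attempts):
        result = operation()  # stateless server recomputes every time
        executions += 1
        if attempt >= lost_replies:  # the first lost_replies replies are "lost"
            return result, executions
    raise TimeoutError("no reply received")


store = {"k": 7}
print(at_least_once_call(lambda: store["k"], lost_replies=2))  # (7, 3)

# A non-idempotent operation shows why the assumption matters.
counter = [0]

def increment():
    counter[0] += 1
    return counter[0]

at_least_once_call(increment, lost_replies=2)
print(counter[0])  # 3, although the client asked for one increment
```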

Scaling Architectures

What is scalability, when do we need it
- Ability to grow the system rapidly and easily. We need load scalability when adding more concurrent users.
Scale up vs scale out (comparison, implementation, performance)
- Scale up: add resources to a node
  - Easy and fast, but the limit is reached quickly; no new failure modes; latency concerns between nodes.
- Scale out: add nodes
  - Requires a rewrite, more failure modes, latency concerns across nodes
- Latency:
  - Load = avg arrival rate / avg service rate
  - Avg request latency = load / ((1 − load) ∗ service_rate)
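A small worked example of the latency formula above. The function names and the sample rates (requests per second) are made up for illustration.

```python
def load(arrival_rate, service_rate):
    """Load = avg arrival rate / avg service rate."""
    return arrival_rate / service_rate


def avg_request_latency(arrival_rate, service_rate):
    """Avg request latency = load / ((1 - load) * service_rate)."""
    rho = load(arrival_rate, service_rate)
    if rho >= 1:
        # The formula only holds below full load; at or above it the
        # queue grows without bound.
        raise ValueError("overloaded: load >= 1")
    return rho / ((1 - rho) * service_rate)


for arrival in (50, 90, 99):
    print(arrival, avg_request_latency(arrival, 100))
```

At a service rate of 100, going from load 0.5 to 0.9 raises the latency from about 0.01 to about 0.09 seconds, and load 0.99 gives about 0.99 seconds: latency explodes as load approaches 1.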
Scale-out architectures, load balancers
- JSQ: join shortest queue
  - Stateful, hard to implement
  - Deciding at arrival time is insufficient, since jobs may have different running times
- Power-of-two: query two random servers and send the request to the one with the shorter queue
- Central-queue LB:
  - Even better than scale up
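The JSQ and power-of-two policies above can be sketched as selection functions over a list of queue lengths. The function names and the `rng` parameter are assumptions for illustration; a real load balancer would also have to keep the queue lengths up to date.

```python
import random


def join_shortest_queue(queue_lengths):
    """JSQ: needs the state of every server, picks the shortest queue."""
    return min(range(len(queue_lengths)), key=lambda i: queue_lengths[i])


def power_of_two(queue_lengths, rng=random):
    """Power-of-two: probe two distinct random servers, pick the shorter."""
    a, b = rng.sample(range(len(queue_lengths)), 2)
    return a if queue_lengths[a] <= queue_lengths[b] else b


queues = [3, 1, 2, 5]
print(join_shortest_queue(queues))  # 1
print(power_of_two(queues))         # any index except 3, the longest queue
```

Power-of-two only needs two probes per request instead of global state, yet it never picks the single longest queue, because that queue loses every comparison.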
What is a node, scalable web services
- Node: thread, process, VM, physical machine, RAM…
- Reliability: lower. Nodes fail easily, but they are stateless, and the file system uses replication.
Elastic Scaling
- sources of latency, what happens under overload
  - Temporary overload
  - Persistent overload
- options of dealing with overload
  - Live with high latency
  - Drop job
  - Scale out
- sub-linear scaling intuition (no need to know square root formula)
- scaling based on load signals and with cost-aware load balancers
- challenges of this approach, what is the right scaling signal
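One possible way to turn a load signal into a scaling decision, as a rough sketch. The rule, the function names, and the `target_load` parameter are all assumptions, not something the course prescribes. It sizes the cluster so that per-node load stays at the target, which grows the node count linearly and therefore does not model the sub-linear intuition mentioned above.

```python
import math


def nodes_needed(arrival_rate, service_rate, target_load):
    """Smallest node count that keeps per-node load at or below target_load."""
    return max(1, math.ceil(arrival_rate / (service_rate * target_load)))


def scaling_decision(current_nodes, arrival_rate, service_rate, target_load):
    """Compare the desired node count with the current one."""
    desired = nodes_needed(arrival_rate, service_rate, target_load)
    if desired > current_nodes:
        return "scale out", desired
    if desired < current_nodes:
        return "scale in", desired
    return "hold", desired


print(scaling_decision(4, 420, 100, 0.5))   # ('scale out', 9)
print(scaling_decision(12, 100, 100, 0.5))  # ('scale in', 2)
```

Arrival rate is only one candidate signal. Picking the right one, and reacting to persistent rather than temporary overload, is the hard part the notes point at.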

RPC

Motivation
Advantages / disadvantages, limitations
Failure independence
DS challenges (packet loss, unpredictable delays, failures, distinguishing server death from network failure/delay)
Synchronous model
Stubs, interfaces
Serialization
Reliable transport (timeouts, retransmission, duplicate elimination)
RPC semantics and how to apply them
Exactly-once (theoretical idea, why theoretical)
At-most-once (practical, implementation)
At-least-once (assumption)
Safety and liveness properties
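To make the stub and serialization items concrete, here is a minimal sketch of a synchronous RPC round trip. All names (`ClientStub`, `server_dispatch`, `PROCEDURES`, the JSON message layout) are hypothetical, and the "network" is a direct function call, so it shows none of the failure cases listed above.

```python
import json


def add(a, b):
    return a + b


PROCEDURES = {"add": add}  # server-side table of callable procedures


def server_dispatch(request_bytes):
    """Server stub: deserialize the request, call the procedure,
    serialize the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


class ClientStub:
    """Client stub: looks like a local object, but each method call is
    serialized and sent through the transport."""

    def __init__(self, transport):
        self.transport = transport  # callable bytes -> bytes, stands in for the network

    def add(self, a, b):
        request = json.dumps({"proc": "add", "args": [a, b]}).encode("utf-8")
        reply = self.transport(request)  # synchronous: blocks until the reply
        return json.loads(reply.decode("utf-8"))["result"]


stub = ClientStub(server_dispatch)
print(stub.add(2, 3))  # 5
```

The caller writes `stub.add(2, 3)` as if it were local. Timeouts, retransmission, and duplicate elimination would sit inside the transport.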

Learning Notes of CMU Course 15-640 Distributed Systems (Midterm Review)