cloud-lab · service models, virtualization, scaling & reliability

1. Service models — who manages what

The stack has nine layers, from networking up to the application, and is the diagram commonly used to illustrate the NIST service models. Pick a delivery model and see where the line between provider-managed and you-managed falls: that boundary is the defining difference between On-Prem, IaaS, PaaS and SaaS.

Control: delivery model (On-Prem, IaaS, PaaS or SaaS)

Readouts: you manage · provider manages · example

Green = your responsibility · indigo = the cloud provider's.
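The split can be sketched as a single boundary index into the layer list. The layer names below follow the conventional nine-layer diagram and are an assumption, since the text only says the stack runs from networking to application; the boundary positions are the usual textbook ones, not something a given provider guarantees.

```python
# The nine layers, bottom to top. Names are assumed from the conventional
# diagram; the source only says "from networking to application".
LAYERS = [
    "networking", "storage", "servers", "virtualization", "operating system",
    "middleware", "runtime", "data", "application",
]

# How many layers, counted from the bottom, the provider manages.
PROVIDER_MANAGED = {"On-Prem": 0, "IaaS": 4, "PaaS": 7, "SaaS": 9}


def responsibility_split(model):
    """Return (you_manage, provider_manages) for a delivery model."""
    boundary = PROVIDER_MANAGED[model]
    return LAYERS[boundary:], LAYERS[:boundary]


for model in PROVIDER_MANAGED:
    yours, theirs = responsibility_split(model)
    print(f"{model:8} you: {len(yours)} layers, provider: {len(theirs)} layers")
```

Under these assumptions IaaS leaves you the operating system and everything above it, while PaaS leaves you only data and application.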

2. Virtual machines vs containers

Both pack workloads onto one host, but VMs each carry a full guest OS over a hypervisor, while containers share the host kernel. Set the per-workload footprint and watch how many of each fit before the host's RAM is exhausted.

Controls: host RAM (GB) 32 · app needs (GB each) 1.0 · guest OS overhead per VM (GB) 2.0

Readouts: VMs that fit · containers that fit · density gain
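The packing arithmetic can be sketched in a few lines. This is a simplified model that assumes RAM is the only constraint and ignores the hypervisor's, host OS's and container runtime's own memory; the function names are illustrative.

```python
import math


def vms_that_fit(host_gb, app_gb, guest_os_gb):
    """Each VM needs the app's RAM plus its own guest OS."""
    return math.floor(host_gb / (app_gb + guest_os_gb))


def containers_that_fit(host_gb, app_gb):
    """Containers share the host kernel, so only the app's RAM counts."""
    return math.floor(host_gb / app_gb)


# Default values from the controls above.
vms = vms_that_fit(32, 1.0, 2.0)            # 10
containers = containers_that_fit(32, 1.0)   # 32
density_gain = containers / vms             # about 3.2x
print(vms, containers, round(density_gain, 2))
```

The gain grows as the guest OS overhead becomes large relative to the app itself, and shrinks toward 1x for memory-hungry apps.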

3. Horizontal autoscaling

A fluctuating request load arrives; an autoscaler adds or removes instances to keep per-instance utilization inside a target band. Watch instances spin up and down in real time and see how the target threshold trades cost against headroom.

Controls: capacity / instance (req/s) 100 · scale-up target util 70% · max instances 12

Readouts: load · instances · utilization
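One way to sketch the scaling decision is to size the fleet so that each instance sits at or below the target utilization, then clamp to the allowed range. This is an illustrative rule, not the demo's actual controller; `min_instances` is an added assumption, and real autoscalers also apply cooldowns and smoothing that are omitted here.

```python
import math


def desired_instances(load, capacity=100, target_util=0.70,
                      max_instances=12, min_instances=1):
    """Smallest instance count that keeps utilization at or below target."""
    needed = math.ceil(load / (capacity * target_util))
    return max(min_instances, min(max_instances, needed))


def utilization(load, instances, capacity=100):
    """Fraction of total fleet capacity consumed by the current load."""
    return load / (instances * capacity)


for load in [50, 500, 900, 1500]:
    n = desired_instances(load)
    print(f"load={load} req/s -> {n} instances, util={utilization(load, n):.2f}")
```

A lower target buys more headroom for spikes at the price of more instances. At 1500 req/s the cap of 12 instances is hit and utilization exceeds 1.0, meaning the fleet is saturated.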

4. Load-balancing algorithms

A load balancer spreads incoming requests across a backend pool. Compare round-robin, least-connections and weighted distribution, and pull a backend offline to see how traffic redistributes.

Controls: algorithm · backends 4

Readouts: requests sent 0 · max imbalance

Click a backend to toggle it healthy / down.
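The three policies can be sketched as small functions. The backend names are hypothetical, the weighted variant uses smooth weighted round-robin as one possible implementation (the demo may distribute weights differently), and "max imbalance" is assumed here to mean the gap between the busiest and the idlest backend.

```python
def round_robin(backends, n_requests):
    """Hand requests to backends in strict rotation."""
    counts = dict.fromkeys(backends, 0)
    for i in range(n_requests):
        counts[backends[i % len(backends)]] += 1
    return counts


def least_connections(open_connections):
    """Pick the backend with the fewest open connections (ties by name)."""
    return min(open_connections, key=lambda b: (open_connections[b], b))


def weighted(weights, n_requests):
    """Smooth weighted round-robin: shares are proportional to weight."""
    total = sum(weights.values())
    current = dict.fromkeys(weights, 0)
    counts = dict.fromkeys(weights, 0)
    for _ in range(n_requests):
        for b in weights:
            current[b] += weights[b]
        chosen = max(current, key=current.get)
        current[chosen] -= total
        counts[chosen] += 1
    return counts


def max_imbalance(counts):
    """Gap between the most and least loaded backend."""
    return max(counts.values()) - min(counts.values())


pool = ["b1", "b2", "b3", "b4"]           # hypothetical backend names
all_up = round_robin(pool, 10)
one_down = round_robin([b for b in pool if b != "b3"], 10)  # b3 offline
print(all_up, one_down, weighted({"b1": 3, "b2": 1}, 8))
```

Taking a backend offline is modeled simply by leaving it out of the healthy list, so its share is spread over the survivors.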

5. Availability & redundancy

A "number of nines" SLA caps annual downtime. For $n$ identical, independent components, each up with probability $p$, parallel redundancy (the system works if any one component is up) gives $1-(1-p)^n$, while a series chain (every component must be up) gives $p^n$. See how composition shifts the system's effective availability.

Controls: component availability 99.0% · components $n$ 3 · topology

Readouts: system availability · nines · downtime / year
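The two formulas can be evaluated directly. The sketch below assumes component failures are independent, counts a year as 365 days, and defines "nines" as $-\log_{10}(1-A)$; the function names are illustrative.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600


def system_availability(p, n, topology):
    """Availability of n identical, independent components."""
    if topology == "parallel":   # up if at least one component is up
        return 1 - (1 - p) ** n
    if topology == "series":     # up only if every component is up
        return p ** n
    raise ValueError(f"unknown topology: {topology}")


def nines(availability):
    """Number of nines, e.g. 0.999 -> 3. Undefined for availability == 1."""
    return -math.log10(1 - availability)


def downtime_seconds_per_year(availability):
    return (1 - availability) * SECONDS_PER_YEAR


for topology in ("parallel", "series"):
    a = system_availability(0.99, 3, topology)
    print(topology, f"{a:.6f}", f"{nines(a):.2f} nines",
          f"{downtime_seconds_per_year(a):.0f} s/year")
```

With the defaults, three 99% components in parallel reach about six nines (roughly half a minute of downtime per year), while the same three in series drop to about 97.03%, which is more than ten days per year.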

6. Serverless — cold starts & concurrency

A FaaS platform (Azure Functions, AWS Lambda) can scale to zero rather than keeping servers running for you: each concurrent request needs its own worker, and a request that finds no warm worker pays a cold-start penalty while a fresh one boots. Fire bursts of events and watch workers warm, run and idle out.

Controls: request rate (req/s) 8 · exec time (ms) 300 · cold start (ms) 800

Readouts: warm workers · cold starts · cold-start rate
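A rough sketch of the worker pool, assuming evenly spaced arrivals and a hypothetical `idle_timeout_ms` after which an unused worker is reclaimed (the source does not state a timeout). Little's law gives the steady-state concurrency; the simulation shows the extra cold starts during ramp-up.

```python
import math


def steady_state_workers(rate_per_s, exec_ms):
    """Little's law: mean concurrency = arrival rate x service time."""
    return math.ceil(rate_per_s * exec_ms / 1000)


def simulate(n_requests, rate_per_s=8, exec_ms=300, cold_ms=800,
             idle_timeout_ms=10_000):
    """Return (cold_starts, live_workers) after n evenly spaced requests."""
    interval = 1000 / rate_per_s
    free_at = []   # time (ms) at which each live worker finishes its request
    cold = 0
    for i in range(n_requests):
        t = i * interval
        # Reclaim workers that have been idle longer than the timeout.
        free_at = [f for f in free_at if t - f <= idle_timeout_ms]
        idle = [k for k, f in enumerate(free_at) if f <= t]
        if idle:
            free_at[idle[0]] = t + exec_ms           # warm: run immediately
        else:
            cold += 1                                # cold: boot, then run
            free_at.append(t + cold_ms + exec_ms)
    return cold, len(free_at)


print(steady_state_workers(8, 300))   # 3
print(simulate(40))                   # (9, 9)
```

With the defaults only three workers are needed in steady state, yet nine cold starts occur: the first worker is not ready until 1100 ms, and a new request arrives every 125 ms in the meantime.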

7. Cost models & FinOps

The same VM workload can cost several times more or less depending on the pricing model. Compare on-demand, 1- and 3-year reserved/committed use, and spot pricing for a chosen running fraction (hours used out of a 730-hour month) and discount, each measured against the equivalent always-on monthly bill.

Controls: on-demand rate ($/hr) 0.50 · hours used / month 730 · reserved discount 40% · spot discount 70%

Readouts: cheapest · monthly saving
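The comparison can be sketched as below. It assumes a reserved commitment is billed for all 730 hours whether or not the VM runs, that on-demand and spot are billed only for hours used, and that spot interruptions are ignored; these are modeling assumptions, not any provider's exact terms.

```python
HOURS_PER_MONTH = 730


def monthly_costs(rate, hours_used, reserved_discount, spot_discount):
    """Monthly bill in dollars under each pricing model."""
    return {
        "on-demand": rate * hours_used,
        # Assumed: a commitment is paid for every hour of the month.
        "reserved": rate * (1 - reserved_discount) * HOURS_PER_MONTH,
        "spot": rate * (1 - spot_discount) * hours_used,
    }


def cheapest_and_saving(costs):
    """Cheapest model and its saving against the on-demand bill."""
    best = min(costs, key=costs.get)
    return best, costs["on-demand"] - costs[best]


always_on = monthly_costs(0.50, 730, 0.40, 0.70)
part_time = monthly_costs(0.50, 300, 0.40, 0.70)
print(always_on, cheapest_and_saving(always_on))
print(part_time, cheapest_and_saving(part_time))
```

Under these assumptions a 40% reserved discount only pays off when the VM runs more than 60% of the month (438 hours); at 300 hours the reservation costs more than on-demand.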

8. CAP theorem & distributed data

A network partition splits a distributed datastore into two halves that cannot reach each other. While the partition lasts you must choose: keep serving on both sides (Availability, risking stale reads) or refuse writes that cannot be replicated (Consistency). Toggle the partition and the policy to see what each replica returns.

Controls: policy under partition · network partition (active)

Readouts: left replica reads · right replica reads · guarantee held
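A toy two-replica store makes the trade-off concrete. The class, the `"AP"`/`"CP"` policy labels and the initial value are all hypothetical, and the model is deliberately minimal: when there is no partition a write reaches both replicas instantly, and under `"CP"` only writes are refused.

```python
class Unavailable(Exception):
    """Raised when the store refuses a request to stay consistent."""


class TwoReplicaStore:
    def __init__(self, policy):
        if policy not in ("AP", "CP"):
            raise ValueError("policy must be 'AP' or 'CP'")
        self.policy = policy
        self.partitioned = False
        self.values = {"left": "v1", "right": "v1"}

    def write(self, side, value):
        if not self.partitioned:
            # Healthy network: the write is replicated to both sides.
            self.values["left"] = self.values["right"] = value
        elif self.policy == "AP":
            # Stay available: accept locally, the other side goes stale.
            self.values[side] = value
        else:
            # Stay consistent: refuse what cannot be replicated.
            raise Unavailable("write refused during partition")

    def read(self, side):
        return self.values[side]


ap = TwoReplicaStore("AP")
ap.partitioned = True
ap.write("left", "v2")
print(ap.read("left"), ap.read("right"))   # v2 v1 (right is stale)
```

With `"CP"` the same write raises `Unavailable`, and both replicas keep returning the last agreed value.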

9. Kubernetes — scheduling & self-healing

A Deployment declares a desired replica count; the scheduler bin-packs pods onto nodes by available CPU. Kill a node and the controller creates replacements for its pods, which the scheduler places on the remaining nodes. That reconciliation loop is the essence of declarative, self-healing orchestration.

Controls: desired replicas 6 · nodes 3 · pod CPU request 2

Readouts: scheduled · pending (unschedulable) · status

Click a node to take it out of service; its pods reschedule onto the remaining nodes.
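Placement by CPU request can be sketched as first-fit packing. The per-node capacity of 4 CPUs and the node names are assumptions (the source gives no node size), and re-running the placement from scratch after a failure is a simplification: a real controller only replaces the pods that were lost.

```python
def schedule(desired_replicas, node_cpu, pod_cpu):
    """First-fit placement. Returns (pods per node, pending pod count)."""
    free = dict(node_cpu)                   # CPU still unrequested per node
    placed = {name: 0 for name in node_cpu}
    pending = 0
    for _ in range(desired_replicas):
        for name in free:
            if free[name] >= pod_cpu:
                free[name] -= pod_cpu
                placed[name] += 1
                break
        else:
            pending += 1                    # no node has room: unschedulable
    return placed, pending


# Assumed capacity: 4 CPUs per node (hypothetical).
nodes = {"node-1": 4, "node-2": 4, "node-3": 4}
placed, pending = schedule(6, nodes, 2)

# Kill node-2 and reconcile against the surviving nodes.
survivors = {n: c for n, c in nodes.items() if n != "node-2"}
placed_after, pending_after = schedule(6, survivors, 2)
print(placed, pending)
print(placed_after, pending_after)
```

With these numbers all six pods fit on three nodes, but after one node dies only four fit and two stay pending until capacity returns.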

10. DDoS & rate limiting

A token-bucket rate limiter is a common first line of defense against request floods. Legitimate users and an attacker both hit the edge; the bucket refills at a fixed rate, each admitted request spends one token, and requests are dropped when the bucket is empty. Tune the bucket and watch good vs malicious traffic get admitted or shed.

Controls: legit traffic (req/s) 20 · attack traffic (req/s) 120 · bucket refill (tokens/s) 60 · bucket size 80

Readouts: admitted · dropped · legit served
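A minimal token bucket, driven by explicit timestamps so the run is reproducible. The simulation assumes a single bucket shared by all clients, evenly spaced arrivals, and a seeded random choice of which requests are legitimate; none of these details are stated in the source.

```python
import random


class TokenBucket:
    def __init__(self, refill_per_s, size):
        self.refill_per_s = refill_per_s
        self.size = size
        self.tokens = float(size)   # start full
        self.last = 0.0

    def allow(self, now):
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = now - self.last
        self.tokens = min(self.size, self.tokens + elapsed * self.refill_per_s)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def simulate(seconds=10, legit=20, attack=120, refill=60, size=80, seed=0):
    rng = random.Random(seed)
    bucket = TokenBucket(refill, size)
    offered = legit + attack        # total requests per second
    stats = {"admitted": 0, "dropped": 0, "legit_sent": 0, "legit_served": 0}
    for i in range(seconds * offered):
        now = i / offered
        is_legit = rng.random() < legit / offered
        ok = bucket.allow(now)
        stats["admitted" if ok else "dropped"] += 1
        if is_legit:
            stats["legit_sent"] += 1
            stats["legit_served"] += ok
    return stats


print(simulate())
```

Once the initial 80 tokens are spent, admissions settle near the refill rate of 60 req/s. Because the bucket cannot tell the two traffic classes apart in this sketch, legitimate requests are shed in about the same proportion as attack requests.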

11. Infrastructure as Code — plan & drift

Terraform compares your declared desired state against the real current state and prints a plan: resources to +create, ~update or -destroy. Toggle resources in each column and read the plan, then apply to converge.

Readouts: + create 0 · ~ update 0 · - destroy 0 · plan

Click a resource: the left column toggles the desired state (.tf), the right column toggles the deployed reality (drift).
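The plan step can be sketched as a diff between two dictionaries. This only illustrates the idea and is not how Terraform is implemented; the resource names and attributes are hypothetical.

```python
import copy


def plan(desired, actual):
    """Diff desired vs actual state into create / update / destroy lists."""
    return {
        "create": sorted(desired.keys() - actual.keys()),
        "update": sorted(k for k in desired.keys() & actual.keys()
                         if desired[k] != actual[k]),
        "destroy": sorted(actual.keys() - desired.keys()),
    }


def apply(desired):
    """Converge: after apply, reality matches the declared state."""
    return copy.deepcopy(desired)


# Hypothetical resources: what the .tf files declare vs what is deployed.
desired = {"vm.web": {"size": "large"}, "bucket.logs": {"versioned": True}}
actual = {"vm.web": {"size": "small"}, "db.legacy": {"engine": "mysql"}}

p = plan(desired, actual)
print(f"+ create {p['create']}  ~ update {p['update']}  - destroy {p['destroy']}")

actual = apply(desired)
print(plan(desired, actual))   # all three lists are empty after apply
```

Drift shows up the same way: a manual change to deployed reality makes `actual` differ from `desired`, so the next plan lists that resource under update.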