Cloud Computing — course structure
An in-depth, syllabus-driven outline of the Cloud Computing course (BCSAI, IE University): every module and every numbered session, with the core concept behind each topic, cross-linked to the interactive demos in this lab.
Cloud computing is one of the most significant technology developments of our lifetimes. It has made many new businesses possible and lets large enterprises move from a CAPEX to an OPEX, consumption-based model — benefiting from high availability, scalability, elasticity and agility while reducing expense and development time through pay-as-you-use.
This is a practical course designed to give computer scientists the business, architectural and hands-on experience to tackle small-to-medium Cloud projects. All paradigms are studied (SaaS, PaaS, IaaS), with Microsoft Azure and Amazon Web Services as the platforms of choice for demonstrating concepts and design patterns. A solid foundation in the enabling technologies (virtualization, containers and Linux automation) underpins the work. Students build an end-to-end Cloud Architecture in groups; this project is the common thread across all sessions.
The course is organised as a deliberate arc: it starts from the physical data center and the virtualization layer, climbs through containers and the public-cloud service catalogues, then specialises into serverless, IaaS automation, PaaS data services and, finally, the cross-cutting concerns of scaling, security and operations. Each block feeds the group project so that theory is applied immediately rather than studied in isolation.
Learning objectives
By the end of the course, students will be able to:
- Trace the evolution of the data center — understand the historical path from mainframes and on-premise server rooms to the hyperscale data centers that led to the cloud revolution.
- Reason about cloud models — know the cloud architectures and the service, delivery and business models (IaaS / PaaS / SaaS; public / private / hybrid) and when each fits.
- Navigate the two leading platforms — acquire working familiarity with Microsoft Azure and AWS terminology and their most popular services.
- Virtualize and containerize — develop hands-on experience with virtualization (VMs) and container (Docker) technologies, including images, volumes and multi-container apps.
- Automate infrastructure — gain practical experience with the automation tools cloud engineers use: Linux/Bash, Ansible (configuration management) and Terraform (infrastructure as code).
- Apply automation to real tasks — use those tools to provision and configure cloud resources repeatably rather than by hand.
- Architect cloud solutions — design basic cloud systems using industry-standard design patterns for scalability, resilience, security and cost.
Teaching methodology — activity weighting
IE's method is collaborative, active and applied: students build knowledge by participating, and the professor leads and guides. Across the course a student is expected to dedicate 75 hours total, distributed across these learning activities.
AI policy. Specific use cases of GenAI are encouraged: GenAI tools may be used to aid group-project design and development with appropriate acknowledgement. GenAI may not be used for quizzes, exams or any in-class activity; inappropriate use is treated as academic misconduct and may result in failing the assignment or the course. A short acknowledgement format is suggested in the syllabus.
Assessment & evaluation criteria
Final grade composition. The two knowledge quizzes carry the most weight, followed by the group Cloud Architecture project.
- Knowledge quizzes — 40%. Deliverable: two in-class quizzes (Session 8 covers Modules 1–4; Session 13 covers Modules 5–8). Evaluated on understanding of cloud concepts from the lectures and from the readings shared on the discussion board. No GenAI permitted.
- Group assignments — 25%. Deliverable: the end-to-end Cloud Architecture, built in groups of 4–6 and presented in Sessions 14–15 (10–15 min per team). Evaluated on the architecture's design, implementation and the final presentation; it is the common thread running through every session.
- Class participation — 20%. Split into 5% discussion-forum activity (timeliness and relevance of contributions to professor-shared readings) and 15% other in-class activities (in-class exercises, applying concepts to real-world problems and teamwork).
- Individual assignments — 15%. Deliverable: multiple individual lab assignments, one per course block (e.g. the Docker labs, the Bash scripting lab). Evaluated per submission.
- Attendance. A minimum 80% attendance is required. Students who fall below it fail both calls (ordinary and extraordinary) for the year and must re-take (re-enroll) the next academic year.
- Four calls. Each student has four chances to pass a course across two consecutive academic years: in each year, an ordinary call plus an extraordinary call (re-sit) in June/July.
- Re-sit exam. The June/July re-sit is a single comprehensive exam taken in person (Segovia or Madrid); continuous-evaluation marks do not carry over. The minimum passing grade is 5 and the maximum obtainable in the re-sit is 8.0 (“notable”). Retakers (3rd call) may obtain up to 10.0 and should confirm criteria with the assigned professor.
- Appeals. A review session follows grading; attending it is a prerequisite for any grade appeal. Failing more than 18 ECTS in the year after the re-sits leads to being asked to leave the program.
Program — modules & sessions
15 live in-person sessions, grouped into the syllabus's thematic blocks. Each module opens with an overview and learning outcomes; each topic carries a one-line explanation and, where useful, the core concept and a key idea. Topic bullets link to the matching interactive demo where one exists.
The foundation block. It answers “what is the cloud?” by grounding it in the physical data center, the NIST reference model, and the virtualization layer that turns hardware into elastic, software-defined resources. The group project is launched here so every later topic has a home.
By the end of this module you can
- Define cloud computing using NIST's five essential characteristics and three service models.
- Distinguish public, private and hybrid deployment and explain the CAPEX→OPEX shift.
- Explain how virtualization and software-defined data centers enable elasticity, and contrast cloud, fog and edge computing.
Set up the course and the group project, then frame cloud computing with the canonical NIST reference model and its delivery/deployment taxonomy.
- Introduction to the course — how the sessions, assessment and the project thread fit together.
- Group project: a Cloud Architecture implementation — the brief for the end-to-end system each team builds and presents in Sessions 14–15.
- Data center environment — racks, compute/storage/network, power and cooling — the physical substrate the cloud abstracts away.
- NIST model of cloud computing — the standard vocabulary for what counts as “cloud”.
- Cloud delivery models — IaaS, PaaS, SaaS (see demo 1) — how much of the stack the provider runs versus you.
- Deployment / ownership models — public, private and hybrid clouds and the trade-offs between control, cost and reach.
NIST defines cloud computing by five essential characteristics — on-demand self-service, broad network access, resource pooling, rapid elasticity and measured service — across three service models (IaaS/PaaS/SaaS) and four deployment models (public, private, community, hybrid).
- Lisdorf, Cloud Computing Basics — non-technical framing of the CAPEX→OPEX story and why the cloud emerged; sets context for the whole module.
- Erl, Concepts & Architecture, ch. on roles & models — the formal NIST-aligned taxonomy of service and deployment models.
Move from physical data centers to software-defined infrastructure and the virtualization layer that makes the cloud possible.
- Software-defined data centers (SDDC) — compute, storage and networking delivered as software, provisioned through APIs rather than cabling.
- Cloud / fog / edge computing — a spectrum of where computation happens: centralized cloud, intermediate fog, and edge close to devices for low latency.
- Virtualization technology — VM vs container density (see demo 2) — a hypervisor slices one physical host into many isolated VMs.
- VirtualBox & Vagrant — a desktop hypervisor plus a tool to define and spin up reproducible VMs from code.
- VMware products — the enterprise virtualization stack (ESXi/vSphere) common in private clouds.
- Container technology — OS-level virtualization sharing one kernel — lighter and faster to start than VMs.
- Lab: VMs in the cloud — first hands-on: create and use virtual machines on a public cloud.
A hypervisor (Type 1 bare-metal, e.g. ESXi; or Type 2 hosted, e.g. VirtualBox) runs full guest OSes with strong isolation but heavyweight overhead. A container virtualizes at the OS level — many containers share one kernel — giving far higher density and near-instant start at the cost of weaker isolation.
- Gupta, The Cloud Computing Journey — practical walkthrough of building cloud infrastructure, useful for the SDDC and virtualization picture.
- Erl, Concepts & Architecture, virtualization chapter — mechanics of hypervisors and how virtualization underpins resource pooling.
Two hands-on Docker labs that turn the container idea from Module 1 into working artefacts: building images, running and persisting containers, then composing several into one application. This is where the group project starts to take a runnable shape.
By the end of this module you can
- Build a Docker image from a Dockerfile and manage the container lifecycle.
- Persist state with volumes and understand why containers are otherwise ephemeral.
- Wire a multi-container app (web + database) together with docker-compose.
First hands-on Docker lab: understand the engine, build images and manage container lifecycle and persistent data.
- Introduction to Docker — the daemon, client and registry, and how a container is just an isolated process from an image.
- Image management — pulling, tagging and inspecting images; the layered, cache-friendly filesystem.
- Creating Docker images — writing a Dockerfile and building reproducible images layer by layer.
- Managing containers — run, stop, exec, logs and the container lifecycle.
- Storing data in volumes — decoupling persistent data from the container's ephemeral filesystem.
An image is an immutable, layered template; a container is a running instance of it with a thin writable layer on top. Anything written outside a volume dies with the container — volumes are how state survives.
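The image/container/volume relationship can be modelled in a few lines. This is a toy sketch, not how Docker stores data: the `Image` and `Container` classes, the dict-based layers and the `/data/` mount prefix are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Image:
    """Immutable template: an ordered tuple of read-only layers (path -> content)."""
    layers: tuple

    def read(self, path):
        # Later layers shadow earlier ones, like a union filesystem.
        for layer in reversed(self.layers):
            if path in layer:
                return layer[path]
        return None


@dataclass
class Container:
    """Running instance: the image plus a thin writable layer and optional volumes."""
    image: Image
    writable: dict = field(default_factory=dict)
    volumes: dict = field(default_factory=dict)  # mount prefix -> store that outlives the container

    def write(self, path, content):
        for prefix, store in self.volumes.items():
            if path.startswith(prefix):
                store[path] = content  # lands in the volume
                return
        self.writable[path] = content  # lands in the ephemeral layer

    def read(self, path):
        for prefix, store in self.volumes.items():
            if path.startswith(prefix):
                return store.get(path)
        if path in self.writable:
            return self.writable[path]
        return self.image.read(path)


image = Image(layers=({"/etc/os": "base"}, {"/app/main.py": "print('hi')"}))
db_volume = {}  # hypothetical named volume

first = Container(image, volumes={"/data/": db_volume})
first.write("/tmp/cache", "scratch")
first.write("/data/rows", "42")
del first  # removing the container discards its writable layer

second = Container(image, volumes={"/data/": db_volume})
print(second.read("/tmp/cache"), second.read("/data/rows"))
```

A second container started from the same image sees the image's files and the volume's data, but nothing the first container wrote to its own writable layer.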
- Titmus, Cloud Native Go — containerization and twelve-factor app principles that motivate why we package services as images.
Compose multiple containers into a working application and wire a web tier to a database.
- Linking Docker containers — connecting containers over a user-defined network so they can address each other by name.
- Web server + database — running a two-tier app where the web container talks to a database container.
- Configuring with docker-compose — declaring the whole multi-container stack in one YAML file and bringing it up with one command.
docker-compose.yml describes the desired set of services, networks and volumes; docker compose up reconciles reality to match it. This is the same declarative idea you meet again in Kubernetes and Terraform.
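The "declare, then reconcile" idea can be sketched independently of any tool. The snippet below is a simplified model, not Compose's actual algorithm: the `reconcile` and `apply` helpers, the service names and the image tags are hypothetical, and treating every undeclared service as something to remove is a simplification.

```python
def reconcile(desired, running):
    """Return the actions needed to make `running` match `desired`.

    Both arguments map a service name to an image tag.
    """
    actions = []
    for name, image in sorted(desired.items()):
        if name not in running:
            actions.append(("create", name))
        elif running[name] != image:
            actions.append(("recreate", name))
    for name in sorted(running):
        if name not in desired:
            actions.append(("remove", name))
    return actions


def apply(actions, desired, running):
    """Carry out the planned actions against the (simulated) running set."""
    for verb, name in actions:
        if verb == "remove":
            running.pop(name)
        else:
            running[name] = desired[name]
    return running


desired = {"web": "nginx:1.27", "db": "postgres:16"}
running = {"web": "nginx:1.25", "cache": "redis:7"}

plan = reconcile(desired, running)
running = apply(plan, desired, running)
print(plan)
print(reconcile(desired, running))  # nothing left to do once reality matches
```

Running the reconciliation a second time produces an empty plan, which is what makes the declarative approach safe to repeat.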
- Titmus, Cloud Native Go — service decomposition and how cloud-native apps are assembled from small, independently deployable parts.
A two-session tour of the two dominant public clouds. Sessions 5 and 6 follow the same structure — Azure overview, AWS overview, then FinOps — so students learn to map a need onto the equivalent service on either platform and to keep an eye on cost from day one.
By the end of this module you can
- Name and compare the most popular Azure and AWS services (compute, storage, networking, identity).
- Translate a requirement into the right managed service on either cloud.
- Apply FinOps thinking — tagging, right-sizing and pricing models — to control spend.
Tour the two leading public clouds, comparing their flagship services, then introduce FinOps as the discipline for managing cloud spend.
- Microsoft Azure overview — basics, terminology (regions, resource groups, subscriptions) and popular services, with hands-on demos.
- AWS overview — basics, terminology (regions, AZs, IAM, VPC) and popular services, with hands-on demos.
- FinOps: managing & optimizing spend — cost models (see demo 7) — bringing engineering, finance and product together to make cost a first-class metric.
FinOps is the operating model for cloud cost: real-time visibility (tagging and showback), optimization (right-sizing, choosing the right pricing model), and accountability shared across teams. It reframes the OPEX model so that the people provisioning resources also own their cost.
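Showback by tag amounts to grouping billing line items by an ownership tag. The sketch below uses invented line items and an invented `showback` helper; real billing exports have provider-specific schemas that are not modelled here.

```python
from collections import defaultdict

# Hypothetical billing line items: (resource id, tags, monthly cost in USD).
line_items = [
    ("vm-1", {"team": "checkout", "env": "prod"}, 310.0),
    ("vm-2", {"team": "checkout", "env": "dev"}, 95.0),
    ("db-1", {"team": "search", "env": "prod"}, 420.0),
    ("disk-9", {}, 40.0),  # untagged: nobody owns this spend yet
]


def showback(items, tag_key):
    """Sum cost per value of `tag_key`, bucketing untagged resources separately."""
    totals = defaultdict(float)
    for _resource, tags, cost in items:
        totals[tags.get(tag_key, "UNTAGGED")] += cost
    return dict(totals)


by_team = showback(line_items, "team")
print(by_team)
```

The `UNTAGGED` bucket is the useful part in practice: it makes visible the spend that no team is currently accountable for.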
- Storment & Fuller, Cloud FinOps — the canonical FinOps text: phases (inform, optimize, operate) and concrete cost-control practices.
- Mulder, Multi-Cloud Strategy — how enterprises reason about choosing and combining clouds; context for the Azure-vs-AWS comparison.
Continue the Azure/AWS comparison in greater depth and reinforce cost-optimization practice with FinOps.
- Microsoft Azure overview (cont.) — deeper dive into services and terminology with further hands-on demos.
- AWS overview (cont.) — deeper dive into services and terminology with further hands-on demos.
- FinOps (cont.) — reserved vs spot (see demo 7) — comparing commitment-based discounts against interruptible spot capacity.
On-demand = pay per second/hour, no commitment, highest unit price. Reserved / savings plans = commit for a 1- or 3-year term in exchange for a large discount. Spot = run on spare capacity at up to ~90% off, but the provider can reclaim it at short notice, which makes it best suited to fault-tolerant batch work.
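The trade-off between the three models is easiest to see as arithmetic. The figures below are illustrative assumptions only (a $0.10/hour on-demand rate, a 40% reserved discount, a 70% spot discount, 730 hours per month), not real price quotes from either provider.

```python
HOURS_PER_MONTH = 730  # common approximation for one month


def monthly_cost(on_demand_rate, hours_used, model,
                 reserved_discount=0.40, spot_discount=0.70):
    """Monthly cost of one instance under a given pricing model (assumed discounts)."""
    if model == "on_demand":
        return on_demand_rate * hours_used
    if model == "reserved":
        # A commitment is billed for every hour of the month, used or not.
        return on_demand_rate * (1 - reserved_discount) * HOURS_PER_MONTH
    if model == "spot":
        return on_demand_rate * (1 - spot_discount) * hours_used
    raise ValueError(f"unknown pricing model: {model}")


rate = 0.10
for hours in (200, 730):
    costs = {m: round(monthly_cost(rate, hours, m), 2)
             for m in ("on_demand", "reserved", "spot")}
    print(hours, costs)

# Utilisation above which a reservation beats on-demand, under these assumptions.
break_even_hours = (1 - 0.40) * HOURS_PER_MONTH
```

Under these assumed numbers a reservation only pays off when the instance runs more than about 60% of the month; below that, on-demand is cheaper despite its higher unit price.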
- Storment & Fuller, Cloud FinOps — pricing-model trade-offs and how to build a commitment portfolio.
- Zikopoulos et al., Cloud Without Compromise — hybrid/multi-cloud lens on portability and avoiding lock-in across Azure and AWS.
Serverless removes the server from the developer's mental model: you ship functions and the platform handles provisioning, scaling and idling-to-zero. This session positions serverless against VM- and container-based designs and gets first functions running on both clouds.
By the end of this module you can
- Explain when serverless beats containers or VMs (and when it does not).
- Distinguish FaaS from BaaS and describe event-driven triggers and sources.
- Deploy a basic Azure Function and AWS Lambda.
Contrast serverless against server- and container-based architectures, and build first functions on both clouds.
- Serverless design patterns — composing systems from event-triggered functions and managed back-end services.
- Comparison with server-/container-based architectures — trade-offs in cost, scaling, operational burden and control.
- Function as a Service (FaaS) — cold starts & concurrency (see demo 6) — run code in stateless functions billed per invocation/duration.
- Backend as a Service (BaaS) — consuming managed back-ends (auth, databases, storage) instead of building them.
- Event-driven computing — events, triggers and event sources that invoke functions reactively.
- Azure Functions example — a simple hands-on function on Azure.
- AWS Lambda example — a simple hands-on function on AWS.
In FaaS the platform scales function instances with load and to zero when idle — you pay only for execution. The trade-off is the cold start: when no warm instance exists, the first request waits for the runtime to initialize. Concurrency limits cap how many instances run at once.
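A small simulation makes the cold-start penalty concrete. It is a deliberately simplified model with assumed numbers: one function instance, an idle timeout after which the platform scales to zero, and fixed cold-start and execution times. Concurrency limits are not modelled.

```python
def simulate(request_times, idle_timeout, cold_start_ms, run_ms):
    """Return the latency (ms) of each request for a scale-to-zero function.

    request_times are arrival times in seconds, in ascending order.
    """
    latencies = []
    warm_until = None  # time until which a warm instance still exists
    for t in request_times:
        cold = warm_until is None or t > warm_until
        latencies.append(run_ms + (cold_start_ms if cold else 0))
        warm_until = t + idle_timeout  # each request keeps the instance warm
    return latencies


# Three requests in quick succession, then a long pause, then two more.
arrivals = [0, 1, 2, 600, 601]
latencies = simulate(arrivals, idle_timeout=300, cold_start_ms=800, run_ms=50)
print(latencies)
```

Only the first request after an idle period pays the initialization cost; steady traffic keeps the instance warm.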
- Erl, Concepts & Architecture — cloud architecture patterns that frame serverless within the broader catalogue of designs.
IaaS gives you raw machines — and on the cloud those machines almost always run Linux. This block (after Knowledge Quiz 1) builds the Linux literacy a cloud engineer needs, then puts it to work automating tasks with Bash.
By the end of this module you can
- Navigate the Linux filesystem and use the core command line confidently.
- Explain the kernel/userspace split and the Filesystem Hierarchy Standard.
- Write Bash scripts that automate repetitive cloud-engineering tasks.
First knowledge quiz, then the Linux foundations every cloud engineer needs: kernel, filesystem and the command line.
- Knowledge Quiz 1 — in-class quiz covering Modules 1–4 and the discussion-board readings (part of the 40%).
- Refresher: VMs in the cloud — re-grounding the IaaS layer before going deep on Linux.
- Linux concepts for cloud engineers — why Linux dominates cloud compute and what you must know to operate it.
- Linux architecture & kernel — the kernel/userspace boundary, processes, and system calls.
- Filesystem Hierarchy Standard (FHS) — the standard layout (/etc, /var, /usr…) so you know where things live on any distro.
- Basic command-line usage — navigation, files, permissions, pipes and redirection.
The kernel manages hardware, processes and memory and exposes system calls; everything else runs in userspace. Permissions, processes and the FHS layout are the day-to-day surface a cloud engineer interacts with.
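The userspace/kernel boundary is visible from any program: the `os` functions used below are thin wrappers around system calls. This is a small sketch that assumes a Linux (or other POSIX) host; the file content and the `0o640` mode are arbitrary choices for illustration.

```python
import os
import stat
import tempfile

# Create a scratch file; this issues open/write/close system calls.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello from userspace\n")
os.close(fd)

os.chmod(path, 0o640)              # owner read/write, group read, others nothing
info = os.stat(path)               # ask the kernel for the file's metadata
permissions = stat.filemode(info.st_mode)  # symbolic form, as shown by `ls -l`
size = info.st_size

os.remove(path)
print(os.getpid(), permissions, size)
```

On a Linux host the permission string printed is `-rw-r-----`, the same notation the command line uses.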
- Hausenblas, Learning Modern Linux — modern, cloud-oriented Linux: kernel, filesystem, processes and the shell. Primary reading for Sessions 8–9.
Apply Linux skills to automate cloud-engineering tasks with shell scripts.
- Cloud engineering: Bash scripting lab — variables, conditionals, loops and functions to script provisioning, log-wrangling and routine ops tasks.
A Bash script captures a manual procedure as a repeatable, version-controllable artefact. It is the first rung on the automation ladder that continues with Ansible and Terraform in Module 6.
- Hausenblas, Learning Modern Linux, shell chapter — the shell, scripting constructs and tooling used in the lab.
Two complementary automation tools. Ansible handles configuration management — bringing existing machines to a desired state — while Terraform (and its fork OpenTofu) provisions the infrastructure itself as declarative code. Together they replace click-ops with reproducible, reviewable automation.
By the end of this module you can
- Write Ansible playbooks with variables, facts, control structures and Jinja2 templates.
- Explain the Terraform workflow (write → plan → apply) and the role of state and drift.
- Choose configuration management vs provisioning for a given automation task.
Automate configuration with Ansible and provision declarative infrastructure with Terraform / OpenTofu.
- Understanding Ansible & setup — agentless, push-based config management over SSH; setting up the control node and inventory.
- Ad-hoc commands & modules — one-off tasks and the reusable modules that do the actual work.
- Playbooks, variables, facts & control structures — YAML descriptions of desired state, parameterized and conditional.
- Templating with Jinja2 — generating config files dynamically from variables and facts.
- Terraform & OpenTofu — declarative provisioning across providers; OpenTofu is the open-source fork.
- Terraform workflow & best practices — plan & drift (see demo 11) — init → plan → apply, remote state, and keeping config the source of truth.
- Hands-on + advanced — exercise; advanced note on Terraform and the Go programming language.
Infrastructure as Code declares the desired end state; the tool computes the diff against recorded state and applies only what's needed (idempotence). Drift is when reality diverges from the code (a manual change in the console); terraform plan surfaces it.
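The relationship between config, recorded state and real infrastructure can be sketched as three dictionaries. This is a toy model, not Terraform's implementation: the `plan` and `apply` functions, the resource names and the attribute values are hypothetical.

```python
def plan(config, state, real):
    """Compare desired config with recorded state and real infrastructure."""
    changes = []
    for name in sorted(set(config) | set(state)):
        want, recorded, actual = config.get(name), state.get(name), real.get(name)
        if want is None:
            changes.append(("destroy", name))
        elif recorded is None:
            changes.append(("create", name))
        elif actual != recorded:
            changes.append(("drift", name))   # reality no longer matches state
        elif want != recorded:
            changes.append(("update", name))
    return changes


def apply(config, state, real):
    """Make state and reality match the config, touching only what the plan lists."""
    for verb, name in plan(config, state, real):
        if verb == "destroy":
            state.pop(name, None)
            real.pop(name, None)
        else:
            state[name] = dict(config[name])
            real[name] = dict(config[name])


config = {"vm": {"size": "small"}, "bucket": {"versioning": True}}
state = {"vm": {"size": "small"}}
real = {"vm": {"size": "large"}}   # someone resized the VM by hand in the console

first_plan = plan(config, state, real)
apply(config, state, real)
second_plan = plan(config, state, real)
print(first_plan, second_plan)
```

The first plan reports the missing bucket and the hand-edited VM; after applying, a second plan is empty, which is the idempotence property described above.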
- Morris, Infrastructure as Code — primary reading: principles, patterns and pitfalls of IaC; directly underpins the Terraform portion.
- Titmus, Cloud Native Go — supports the advanced note on Terraform and Go.
Managed data services let teams store and query data without running database servers. This session surveys cloud storage and big-data platforms and confronts the central trade-off of any distributed datastore: the CAP theorem.
By the end of this module you can
- Choose between object, block and file storage and between SQL and NoSQL datastores.
- State the CAP theorem and reason about consistency-vs-availability trade-offs.
- Use Azure Storage accounts and AWS S3 buckets for object storage.
Survey managed storage and data services and the consistency trade-offs of distributed datastores.
- Storage, databases & big-data platforms — CAP theorem (see demo 8) — object/block/file storage, SQL/NoSQL, and managed analytics services.
- Azure Storage accounts — Azure's unified object/blob, file, queue and table storage.
- AWS S3 buckets — highly durable object storage with lifecycle policies and tiers.
In a distributed datastore you can guarantee at most two of Consistency, Availability and Partition tolerance. Since network partitions are unavoidable, real systems choose between CP (reject some requests to stay consistent) and AP (stay available, accept eventual consistency).
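The CP/AP choice can be illustrated with two replicas and a simulated partition. This is a minimal sketch under strong assumptions: the `Store` class, the two-replica layout and the "replica A wins" repair rule are invented for the example and do not describe any particular database.

```python
class Replica:
    def __init__(self):
        self.value = None


class Store:
    def __init__(self, mode):
        self.mode = mode              # "CP" or "AP"
        self.a, self.b = Replica(), Replica()
        self.partitioned = False

    def write(self, replica, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Stay consistent by refusing the request.
                raise RuntimeError("unavailable: cannot reach peer")
            replica.value = value     # AP: accept locally, replicas diverge
            return
        replica.value = other.value = value

    def heal(self):
        # Hypothetical repair rule for the sketch: replica A's value wins.
        self.partitioned = False
        self.b.value = self.a.value


cp = Store("CP")
cp.write(cp.a, 1)
cp.partitioned = True
try:
    cp.write(cp.a, 2)
    cp_accepted = True
except RuntimeError:
    cp_accepted = False

ap = Store("AP")
ap.write(ap.a, 1)
ap.partitioned = True
ap.write(ap.a, 2)
diverged = ap.a.value != ap.b.value
ap.heal()
print(cp_accepted, diverged, ap.a.value, ap.b.value)
```

The CP store rejects the write during the partition and keeps both replicas equal; the AP store accepts it, serves stale data from the other replica for a while, and converges once the partition heals.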
- Zburivsky & Partner, Designing Cloud Data Platforms — reference design for ingestion, storage and serving layers; primary reading for this session.
The cross-cutting block. It combines the patterns that make systems scale and stay available (load balancing, API gateways, Kubernetes, autoscaling) with the security model and the operational practices — observability, SRE, Zero Trust — that keep them safe and healthy in production.
By the end of this module you can
- Apply scaling/resilience patterns: load balancing, API gateways, Kubernetes orchestration, autoscaling.
- Describe the shared-responsibility and Zero-Trust security models and key cloud security tools.
- Explain observability and the SRE role, including how to defend against DDoS.
Combine the scalability and resilience design patterns with the security and operations practices that keep cloud systems healthy.
- Load balancing — algorithms (see demo 4) — spreading traffic across healthy instances (round-robin, least-connections, etc.).
- API gateways — a single managed entry point handling routing, auth, throttling and versioning.
- Event-driven & stream processing — reacting to and processing continuous data streams asynchronously.
- Kubernetes — scheduling & self-healing (see demo 9), autoscaling (see demo 3) — declarative container orchestration that schedules, heals and scales workloads.
- Cloud security concepts & tools — identity, network and data security in the cloud.
- Azure Security Center & AWS Security Hub — posture-management dashboards aggregating findings across resources; Azure Security Center has since been renamed Microsoft Defender for Cloud.
- DDoS attacks — rate limiting (see demo 10) — volumetric/protocol/application attacks and mitigations like rate limiting and scrubbing.
- Critical-infrastructure security & Zero Trust — “never trust, always verify”: authenticate and authorize every request regardless of network location.
- Observability & the SRE role — logs, metrics and traces; SLOs and error budgets to balance reliability and velocity.
- Shared responsibility model — availability & redundancy (see demo 5) — who secures what: the provider handles security “of the cloud”, the customer handles security “in the cloud”.
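The two load-balancing algorithms named in the list above, round-robin and least-connections, can be sketched as follows. The backend names and connection counts are made up, and health checking is reduced to a set of currently healthy backends passed in by the caller.

```python
class RoundRobin:
    """Hand out backends in a fixed rotation, skipping unhealthy ones."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.next_index = 0

    def pick(self, healthy):
        # Walk the ring at most once so an all-unhealthy pool cannot loop forever.
        for _ in range(len(self.backends)):
            candidate = self.backends[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.backends)
            if candidate in healthy:
                return candidate
        raise RuntimeError("no healthy backend")


def least_connections(active, healthy):
    """Pick the healthy backend with the fewest open connections."""
    return min(sorted(healthy), key=lambda backend: active[backend])


rr = RoundRobin(["a", "b", "c"])
healthy = {"a", "c"}               # backend "b" is failing its health check
picks = [rr.pick(healthy) for _ in range(4)]

choice = least_connections({"a": 5, "b": 1, "c": 2}, healthy)
print(picks, choice)
```

Round-robin ignores how busy each backend is, while least-connections routes to the least-loaded healthy one; note that "b" is never chosen in either case, even though it has the fewest connections.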
Autoscaling adjusts capacity to demand against a target metric (e.g. keep CPU ≈ 60%), adding instances when load rises and removing them when it falls. Kubernetes generalizes this: you declare the desired state and a control loop continuously reconciles reality — rescheduling failed pods (self-healing) and scaling replicas.
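Target tracking can be written as a one-line proportional rule: scale the replica count by the ratio of the observed metric to the target, then round up. The sketch below follows the ratio formula documented for the Kubernetes Horizontal Pod Autoscaler but leaves out its tolerance band and stabilization windows; the 60% target and the replica bounds are assumptions.

```python
import math


def desired_replicas(current, cpu_now, cpu_target=60.0,
                     min_replicas=1, max_replicas=10):
    """Replica count that would bring average CPU back to roughly the target."""
    wanted = math.ceil(current * cpu_now / cpu_target)
    # Clamp to the configured bounds so scaling never runs away.
    return max(min_replicas, min(max_replicas, wanted))


for replicas, cpu in [(4, 90), (4, 30), (4, 60), (8, 120), (1, 5)]:
    print(replicas, cpu, "->", desired_replicas(replicas, cpu))
```

With four replicas at 90% CPU the rule asks for six; at 30% it shrinks to two; at exactly the target nothing changes.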
The shared responsibility model draws the security line: the provider secures the underlying cloud; the customer secures what they put in it (data, identity, configuration).
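Rate limiting, listed above as a DDoS mitigation, is commonly implemented with a token bucket. The sketch below is one possible version under stated assumptions: the clock is passed in explicitly so the behaviour is deterministic, and the rate and capacity values are arbitrary.

```python
class TokenBucket:
    """Allow bursts up to `capacity`, then a steady `rate_per_sec` requests."""

    def __init__(self, rate_per_sec, capacity):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = 0.0

    def allow(self, now):
        # Refill for the elapsed time, never beyond capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate_per_sec=1, capacity=3)
burst = [bucket.allow(0.0) for _ in range(5)]   # five requests at the same instant
later = bucket.allow(2.0)                       # two seconds later
print(burst, later)
```

The first three requests of the burst pass and the rest are rejected; after two seconds the bucket has refilled enough to admit traffic again. A gateway would typically keep one bucket per client or API key.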
- Buckwell et al., Security Architecture for Hybrid Cloud — Zero Trust, shared responsibility and security architecture methods for the security half of the session.
- Majors et al., Observability Engineering — logs/metrics/traces, SLOs and the SRE practices behind the operations half.
The closing block consolidates everything: the second quiz, a forward look at where the cloud is heading, and the capstone presentations where each team defends the architecture it has built across the semester.
By the end of this module you can
- Demonstrate consolidated understanding across Modules 5–8 (Quiz 2).
- Discuss emerging trends and future directions in cloud computing.
- Present and justify a complete end-to-end cloud architecture.
Second knowledge quiz, followed by a look ahead at where cloud computing is heading.
- Knowledge Quiz 2 — in-class quiz covering Modules 5–8 and the discussion-board readings (part of the 40%).
- Emerging trends & future directions — where cloud is heading: edge, AI workloads, multi-cloud, sustainability and beyond.
- Mulder, Multi-Cloud Strategy — strategic, forward-looking view that frames the trends discussion.
Each team presents the end-to-end Cloud Architecture it has built across the semester.
- Team presentations — each team gets 10–15 minutes to present their Cloud Architecture implementation.
- Capstone of the project thread — the culmination of the group work assessed as part of the 25% group grade.
Key concepts — glossary
Quick-reference definitions for the recurring terms across the course. The tag after each term (S1, S2, and so on) shows the session where it first matters.
- NIST cloud model S1
- The standard definition of cloud computing: five essential characteristics, three service models and four deployment models.
- IaaS / PaaS / SaaS S1
- Service models that differ by how much of the stack the provider manages — from raw VMs (IaaS) to fully managed apps (SaaS).
- Public / private / hybrid S1
- Deployment models trading control and isolation against reach and cost; hybrid blends on-prem with public cloud.
- CAPEX → OPEX S1
- Shifting from large up-front capital purchases to pay-as-you-use operating expense — a core economic driver of cloud.
- Elasticity S1
- The ability to scale resources up and down automatically and quickly to match demand.
- Software-defined data center S2
- Compute, storage and networking abstracted into software and provisioned via APIs rather than hardware.
- Hypervisor S2
- Software that creates and runs virtual machines, partitioning one physical host into many isolated guests.
- Container S2
- OS-level virtualization sharing the host kernel — lighter and faster to start than a VM, with weaker isolation.
- Cloud / fog / edge S2
- A spectrum of where computing happens, from centralized cloud to fog to the edge near devices for low latency.
- Docker image vs container S3
- An image is an immutable layered template; a container is a running instance of it with a writable layer.
- Volume S3
- Persistent storage attached to containers so data survives the container's ephemeral lifecycle.
- docker-compose S4
- The Docker Compose tool and its YAML file, which declares a multi-container app's services, networks and volumes so the whole stack comes up with one command.
- FinOps S5
- The discipline of cloud financial operations: cost visibility, optimization and shared accountability.
- Pricing models S6
- On-demand, reserved/savings-plan, and spot — trading commitment and interruptibility for unit price.
- FaaS S7
- Function as a Service: run stateless functions billed per invocation, scaling to zero when idle.
- BaaS S7
- Backend as a Service: managed back-end building blocks (auth, database, storage) consumed via API.
- Cold start S7
- The latency incurred when a function platform must initialize a new instance to serve a request.
- Event-driven computing S7
- Architecture where events and triggers from sources invoke compute reactively rather than on a schedule.
- Linux kernel / FHS S8
- The kernel manages hardware, processes and memory and exposes them to userspace through system calls; the Filesystem Hierarchy Standard fixes the directory layout.
- Infrastructure as Code S10
- Declaring infrastructure in version-controlled code so it can be reviewed and recreated reproducibly.
- Idempotence S10
- Applying the same configuration repeatedly yields the same end state — central to Ansible and Terraform.
- State & drift S10
- Terraform records expected state; drift is when real infrastructure diverges from that recorded state.
- CAP theorem S11
- A distributed store can guarantee at most two of consistency, availability and partition tolerance.
- Object storage S11
- Durable, scalable key-based storage (S3, Azure Blob) for unstructured data, billed per use.
- Load balancing S12
- Distributing incoming traffic across healthy instances to improve throughput and availability.
- Kubernetes S12
- Declarative container orchestration that schedules, self-heals and autoscales containerized workloads.
- Autoscaling S12
- Automatically adding/removing capacity to track a target metric such as CPU utilization.
- Zero Trust S12
- “Never trust, always verify” — every request is authenticated and authorized regardless of network location.
- Shared responsibility S12
- Security is split: the provider secures the cloud; the customer secures what they run in it.
- Observability / SRE S12
- Understanding system state from logs, metrics and traces; SREs use SLOs and error budgets to manage reliability.
Bibliography (recommended) — annotated
Recommended digital readings supporting the course modules. Each entry notes what it covers and the sessions it best supports. Readings shared on the discussion board are also examinable in the quizzes.
- Jeroen Mulder (2023). Multi-Cloud Strategy for Cloud Architects, 2nd ed. Packt. ISBN 9781804616734. Strategic view of choosing and combining clouds and avoiding lock-in. → Sessions 5–6, 13
- Thomas Erl, Eric Barceló Monroy (2023). Cloud Computing: Concepts, Technology, Security, and Architecture, 2nd ed. Pearson. ISBN 9780138052256. The course's architectural backbone: NIST models, virtualization, patterns and serverless. → Sessions 1, 2, 7
- Anders Lisdorf (2021). Cloud Computing Basics: A Non-Technical Introduction. Apress. ISBN 9781484269213. Gentle, business-oriented intro to why the cloud exists and the CAPEX→OPEX story. → Session 1
- Paul Zikopoulos et al. (2021). Cloud Without Compromise. O'Reilly. ISBN 9781098103736. Hybrid and multi-cloud portability without lock-in. → Session 6
- Divit Gupta (2024). The Cloud Computing Journey. Packt. ISBN 9781805122289. Practical end-to-end build of cloud infrastructure; complements the project thread. → Session 2
- J.R. Storment, Mike Fuller (2023). Cloud FinOps, 2nd ed. O'Reilly. ISBN 9781492098355. The canonical FinOps reference: cost phases, pricing models and accountability. → Sessions 5–6
- Danil Zburivsky, Lynda Partner (2021). Designing Cloud Data Platforms. Manning. ISBN 9781617296444. Reference architecture for cloud storage, databases and big-data platforms. → Session 11
- Mark Buckwell, Stefaan Van daele, Carsten Horst (2024). Security Architecture for Hybrid Cloud. O'Reilly. ISBN 9781098157777. Zero Trust, shared responsibility and a method for designing secure architectures. → Session 12
- Kief Morris (2020). Infrastructure as Code, 2nd ed. O'Reilly. ISBN 9781098114671. Principles and patterns of IaC; primary reading behind Terraform. → Session 10
- Matthew A. Titmus (2021). Cloud Native Go. O'Reilly. ISBN 9781492076339. Containerization, cloud-native service design and the Terraform-with-Go advanced note. → Sessions 3–4, 10
- Michael Hausenblas (2022). Learning Modern Linux. O'Reilly. ISBN 9781098108946. Modern, cloud-oriented Linux: kernel, filesystem, processes and the shell. → Sessions 8–9
- Charity Majors, Liz Fong-Jones, George Miranda (2022). Observability Engineering. O'Reilly. ISBN 9781492076445. Logs/metrics/traces, SLOs and SRE practice for operating cloud systems. → Session 12