🎓 Capstone exemplar 🧪 Prototype type 📄 Report + defense

What a top-marked capstone looks like

A single, in-depth model project dossier: it walks the APEX concept end-to-end as if it were a finished BCSAI Capstone — from problem statement and landscape review through aims, methodology, implementation, evaluation, the written-report mapping, and the oral defense. Use it as a template for shape and rigour, not as content to copy. Every phase here is cross-linked to the course structure and the official rubric.

Bachelor in Computer Science & Artificial Intelligence · IE University · PC-CSAI.4.M.A · Academic year 2025–26
Project type: Prototype
Effort: 375 h · 12 ECTS
Deliverable: Report + PoC + defense
Grade split: 60 / 40
01 · Overview

What this exemplar demonstrates

This page reads the APEX concept as if a student had carried it through the full capstone lifecycle and submitted it for marking. The goal is to make the shape of an excellent project legible: how each deliverable connects, where rigour shows up, and how the artefact maps onto the rubric.

How to read this. Sections 02–10 follow the natural arc of a Prototype capstone and line up with the written-report structure from the syllabus (Title → Abstract → Introduction → Methodology → Results → Discussion → Bibliography). Wherever the live APEX site shows the concept interactively, you will see a cross-link to the relevant section of the concept site; the formal process and deadlines live in the course outline.
Why Prototype

🧪 The right project type

The syllabus offers three types — Prototype, Venture, Research. APEX is best framed as a Prototype: it builds a software proof-of-concept (the interactive ecosystem and its physiological/forecasting models), studies its behaviour, and benchmarks it against existing market solutions, while still making theoretical, methodological and empirical contributions.

🎯 What "good" means here

A strong Prototype is not just a working demo. It states a researchable problem, situates itself against competitors, defines measurable success criteria up front, and reports honest evaluation against them. APEX is structured to do exactly that — the demo is evidence, not the argument.

🔗 Three artefacts, one story

A top capstone keeps the report, the software and the oral defense telling one coherent story. Sections 08–09 show how the same threads (problem → method → evaluation) re-appear in each, so the panel never sees a contradiction between paper and product.

02 · Introduction

Problem statement & motivation

Every capstone earns its marks by opening with a problem worth solving. APEX's is the fragmentation of athlete data — and it is concrete, measurable, and grounded in a real user population.

The modern endurance athlete interacts with 8–15 sensors and roughly half a dozen apps in a single training week. Heart rate lives in one cloud, power in another, sleep in a third, and the coach's plan in a static PDF. Each platform is a silo with its own protocol (ANT+, BLE, FE-C) and its own export format (FIT, TCX, GPX). The athlete — or the coach — becomes the human integration layer, exporting CSVs, re-keying workouts, and reconciling numbers that never quite agree.

Problem, stated for marking. No single platform orchestrates the full multisport training loop. Athletes lose signal and time to manual reconciliation, and coaching adaptation lags days behind the physiological reality the sensors already captured. The opportunity is an orchestration layer that ingests every stream, models the athlete continuously, and closes the loop back to the next session.

The motivation is sharpened by three facts: (1) the sensor and platform ecosystem is mature but deliberately non-interoperable; (2) adaptive, physiology-aware coaching is demonstrably effective but locked inside proprietary engines; and (3) the integration burden falls on exactly the users least equipped to carry it. This is a data-and-systems problem at heart — which makes it a legitimate computer-science capstone rather than a pure sports-science one.

↳ The live concept makes this tangible: see the fragmented-vs-unified toggle in concept site §01 · Motivation and the device inventory in §02 · Sensors.

03 · Literature & landscape

Landscape review & the gap

The proposal requires at least five academic sources and a market comparison. A Prototype's "literature review" is really two reviews: the scholarly basis for its models, and the competitive landscape it must beat or complement.

📚 Scholarly basis

  • Training-load modelling — Banister's impulse–response model and the CTL/ATL/TSB fitness–fatigue framework (Coggan & Allen) underpin the analytics layer.
  • Critical power / FTP — Monod & Scherrer and Jones et al. ground the threshold-estimation maths.
  • Environmental physiology — heat, humidity and altitude effects on sustainable power (Périard et al.) justify EnviroNormalization.
  • Closed-loop / adaptive systems — control-theory framing for the plan→execute→ingest→adapt cycle.
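
The fitness–fatigue bookkeeping named in the first bullet can be sketched as two exponentially weighted averages of daily training stress (TSS). This is a minimal illustration under stated assumptions, not the APEX implementation: the 42-day and 7-day time constants are the commonly used defaults, TSB is taken as the previous day's CTL minus the previous day's ATL (one of several conventions in use), and the names `LoadState`, `update_load` and `load_series` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List

CTL_DAYS = 42  # "fitness" time constant; common default, assumed here
ATL_DAYS = 7   # "fatigue" time constant; common default, assumed here


@dataclass(frozen=True)
class LoadState:
    ctl: float  # chronic training load (fitness)
    atl: float  # acute training load (fatigue)
    tsb: float  # training stress balance (form) going into the day


def update_load(prev: LoadState, tss: float) -> LoadState:
    """Advance the load model by one day given that day's TSS."""
    tsb = prev.ctl - prev.atl                      # form before today's session
    ctl = prev.ctl + (tss - prev.ctl) / CTL_DAYS   # slow-moving average
    atl = prev.atl + (tss - prev.atl) / ATL_DAYS   # fast-moving average
    return LoadState(ctl=ctl, atl=atl, tsb=tsb)


def load_series(daily_tss: Iterable[float],
                start: LoadState = LoadState(0.0, 0.0, 0.0)) -> List[LoadState]:
    """Return one LoadState per day of the TSS sequence."""
    states: List[LoadState] = []
    state = start
    for tss in daily_tss:
        state = update_load(state, tss)
        states.append(state)
    return states


if __name__ == "__main__":
    for day, s in enumerate(load_series([70, 0, 90, 60, 0, 120, 80]), start=1):
        print(f"day {day}: CTL={s.ctl:.1f} ATL={s.atl:.1f} TSB={s.tsb:.1f}")
```

Because ATL reacts six times faster than CTL, a hard block drives TSB negative quickly, and a few easy days bring it back up, which is the fatigue signal the adaptive layer would read.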

🏁 Competitive landscape

  • TrainingPeaks — best-in-class planning & CTL/ATL/TSB analytics, but largely manual and not adaptive.
  • TriDot — genuine adaptive engine with environmental normalization; closed and triathlon-specific.
  • Zwift — rich virtual-training environment; weak on long-horizon periodization.
  • Strava — dominant social + logging layer; little coaching intelligence.
  • Garmin / Wahoo — strong device clouds; ecosystem-locked.
The gap (the thesis claim). No incumbent is simultaneously open across devices, physiology-aware, and closed-loop adaptive. Each owns one or two slices; none owns orchestration. APEX positions itself in that gap: not as a competitor to any single platform, but as the integration and adaptation layer that sits above them.
↳ The full feature-by-platform matrix is interactive in concept site §09 · Ecosystem.
04 · Aims

Aims, research questions & success criteria

A Prototype still needs research questions — they are what separate an engineering exercise from a capstone. Each RQ below is paired with how it would be answered, and the whole set rolls up into measurable success criteria the evaluation (section 07) reports against.

RQ1
Can heterogeneous device streams be unified into a single, lossless athlete model in near-real-time?
Method — build the ingestion plane; measure field-coverage and round-trip latency across FIT/TCX/GPX and webhook sources.
RQ2
Does a physiology-aware model produce training targets that track an athlete's true threshold more closely than static plans?
Method — compare APEX zone/target outputs against measured FTP/LTHR/CSS on a held-out athlete dataset.
RQ3
Does environmental normalization keep training stress consistent across conditions?
Method — hold target stress fixed, vary heat/humidity/altitude/wind, and check that adjusted power yields comparable physiological cost.
RQ4
Can a closed adaptive loop re-plan tomorrow from today's response without human intervention?
Method — simulate a week of training; verify the planner reacts to over/under-performance and fatigue (TSB) signals.
Success criteria (defined up front). SC1 ≥ 95% field coverage across the four ingest sources (FIT, TCX, GPX, webhooks) · SC2 ingest-to-model latency < 2 s (p95) · SC3 computed thresholds within ±3% of measured values · SC4 EnviroNorm keeps modelled stress within ±5% across the condition range · SC5 the loop adapts plan TSS in the correct direction on 100% of simulated perturbations. These are the bars the project judges itself against, stated before results, not after.
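
The RQ4 method (perturb performance and fatigue, then check that the plan moves the right way) could be harnessed roughly as below. This is a sketch only: the rule `adapt_planned_tss`, its TSB and completion thresholds, and the perturbation list are all hypothetical stand-ins for whatever planner the project actually builds.

```python
def adapt_planned_tss(planned_tss: float, tsb: float, completion: float) -> float:
    """Scale tomorrow's planned TSS from today's form and session completion.

    completion = achieved TSS / planned TSS for today's session.
    All thresholds below are illustrative assumptions, not fitted values.
    """
    scale = 1.0
    if tsb < -20:            # deep fatigue: back off
        scale -= 0.15
    elif tsb > 10:           # fresh: allow slightly more load
        scale += 0.05
    if completion < 0.85:    # under-performed today
        scale -= 0.10
    elif completion > 1.10:  # over-performed today
        scale += 0.05
    return round(planned_tss * scale, 1)


# (perturbation, expected direction of change): -1 down, 0 unchanged, +1 up
PERTURBATIONS = [
    ({"tsb": -30.0, "completion": 1.00}, -1),  # fatigued
    ({"tsb": 0.0, "completion": 0.60}, -1),    # under-performed
    ({"tsb": 15.0, "completion": 1.20}, +1),   # fresh and over-performed
    ({"tsb": 0.0, "completion": 1.00}, 0),     # nominal day
]


def direction(before: float, after: float) -> int:
    """Sign of the change from before to after."""
    return (after > before) - (after < before)


def correct_direction_rate(planned_tss: float = 100.0) -> float:
    """Fraction of perturbations where the plan moved in the expected direction."""
    hits = sum(
        direction(planned_tss, adapt_planned_tss(planned_tss, **perturbation)) == expected
        for perturbation, expected in PERTURBATIONS
    )
    return hits / len(PERTURBATIONS)


if __name__ == "__main__":
    print(f"correct-direction rate: {correct_direction_rate():.0%}")
```

The value of this shape is that SC5 becomes a single number computed from a fixed, inspectable list of perturbations rather than a claim about a demo.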
05 · Methodology

Methodology & system design

The methodology chapter is where a Prototype earns its "methodological contribution" mark. It covers the architecture, the data model, the technology stack, and the modelling choices — each defended, not just listed.

1 · SENSORS & GYM EQUIPMENT: ANT+ · BLE · FE-C · Technogym MyWellness · ICG Connect · GPS watches · power meters
2 · INGESTION (unified data plane): Strava webhooks · Garmin Health · Wahoo Cloud · FIT / TCX / GPX normalizers → canonical schema
3 · ANALYTICS (physiological modelling): CTL · ATL · TSB · FTP/LTHR/CSS estimation · zone derivation · EnviroNormalization
4 · ADAPTIVE COACHING (plan engine): Rule + load-model planner · human-coach overlay · LLM explanation agent
5 · ACTION (coached next session): Push to Zwift · Garmin Connect · Wahoo · Apple Watch
↺ closed loop
Described architecture. Sensor packets flow up the five layers (solid arrows): raw streams are normalized into a canonical schema, modelled into physiological state, fed to the planner, and emitted as the next session. The dashed arrow is the closed loop: yesterday's measured response re-conditions today's plan. The report's Methodology chapter would reuse this diagram with the same caption.
↳ Each layer is clickable in concept site §03 · Architecture.

Data model

The heart of the methodological contribution is a canonical activity schema that every source maps into: an Athlete (with rolling FTP/LTHR/CSS and CTL/ATL/TSB state), a stream of Activity records, and per-second Sample rows (timestamp, power, hr, cadence, pace, lat/lon, plus environmental context). Source-specific adapters are responsible for translation, so the analytics layer never sees a vendor format. This separation is what makes RQ1 answerable and what makes the system extensible to a new device without touching the models.
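
One way to picture the canonical schema described above is as three plain data classes. The field names, units and optionality here are assumptions made for illustration; the text only fixes the three entities and the kinds of values they carry.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Sample:
    """One per-second row; any channel a device did not record stays None."""
    timestamp: datetime
    power_w: Optional[float] = None
    hr_bpm: Optional[int] = None
    cadence_rpm: Optional[int] = None
    pace_s_per_km: Optional[float] = None
    lat: Optional[float] = None
    lon: Optional[float] = None
    temp_c: Optional[float] = None       # environmental context
    altitude_m: Optional[float] = None   # environmental context


@dataclass
class Activity:
    activity_id: str
    sport: str
    source: str                          # which adapter produced the record
    samples: List[Sample] = field(default_factory=list)


@dataclass
class Athlete:
    athlete_id: str
    ftp_w: Optional[float] = None        # rolling threshold estimates
    lthr_bpm: Optional[int] = None
    css_s_per_100m: Optional[float] = None
    ctl: float = 0.0                     # rolling load state
    atl: float = 0.0
    activities: List[Activity] = field(default_factory=list)

    @property
    def tsb(self) -> float:
        """Training stress balance derived from the stored load state."""
        return self.ctl - self.atl


if __name__ == "__main__":
    sample = Sample(datetime(2025, 5, 1, 7, 0, tzinfo=timezone.utc), power_w=210.0, hr_bpm=148)
    athlete = Athlete("athlete-1", ftp_w=250.0, ctl=60.0, atl=75.0,
                      activities=[Activity("a1", "ride", "gpx", [sample])])
    print(athlete.tsb)
```

Keeping every channel optional is what lets a GPS-only GPX file and a power-rich FIT file land in the same table without vendor-specific columns.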

Technology stack (defended, not just chosen)

  • Ingestion: Python (FastAPI) workers consuming webhooks + a FIT/TCX/GPX parser — chosen for the mature scientific-Python ecosystem the analytics depend on.
  • Storage: time-series store for samples + relational store for athlete/activity metadata — the access patterns differ enough to justify both.
  • Analytics: NumPy/pandas for the load and threshold models; pure functions so they are unit-testable against known cases.
  • Interface: a static, dependency-light front end (the concept site itself) so the prototype is demonstrable on GitHub Pages with no backend at defense time.
Methodological honesty. The deployed concept site uses heuristic, illustrative models (clearly labelled as such on the live site). A full capstone would swap these for the literature-grounded models above and validate them on real data; the architecture is designed so that swap touches only the analytics layer. Stating this boundary explicitly is itself a mark-earning move.
06 · Implementation

Implementation highlights

The report does not narrate every line of code — it surfaces the decisions that mattered and shows one representative core feature in enough depth to prove competence. For APEX, the clearest example is the EnviroNormalization model.

🧩 Decision: adapters over a god-parser

Each device format gets its own small adapter to the canonical schema rather than one monolithic parser. Adding Wahoo support later meant ~80 lines, no core changes — the evidence for RQ1's extensibility claim.
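
The adapter decision can be sketched as a small registry in which each format contributes one function. Only a GPX adapter is shown, parsing standard GPX 1.1 track points with the standard library; the registry, the `ingest` entry point and the canonical key names are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from typing import Callable, Dict, Iterator, List

CanonicalSample = dict                      # stand-in for the canonical Sample row
Adapter = Callable[[str], Iterator[CanonicalSample]]

ADAPTERS: Dict[str, Adapter] = {}           # format name -> adapter function


def register(fmt: str) -> Callable[[Adapter], Adapter]:
    """Decorator that adds an adapter to the registry under a format name."""
    def wrap(fn: Adapter) -> Adapter:
        ADAPTERS[fmt] = fn
        return fn
    return wrap


GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


@register("gpx")
def gpx_adapter(payload: str) -> Iterator[CanonicalSample]:
    """Translate GPX 1.1 track points into canonical sample dicts."""
    root = ET.fromstring(payload)
    for pt in root.iterfind(".//gpx:trkpt", GPX_NS):
        time_el = pt.find("gpx:time", GPX_NS)
        ele_el = pt.find("gpx:ele", GPX_NS)
        yield {
            "timestamp": (datetime.fromisoformat(time_el.text.replace("Z", "+00:00"))
                          if time_el is not None else None),
            "lat": float(pt.get("lat")),
            "lon": float(pt.get("lon")),
            "altitude_m": float(ele_el.text) if ele_el is not None else None,
        }


def ingest(fmt: str, payload: str) -> List[CanonicalSample]:
    """Route a payload to its adapter; the core never inspects vendor formats."""
    if fmt not in ADAPTERS:
        raise ValueError(f"no adapter registered for {fmt!r}")
    return list(ADAPTERS[fmt](payload))
```

Under this layout, supporting a new source means writing one more decorated function; `ingest` and everything downstream stay unchanged.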

🧪 Decision: pure, testable models

Physiological maths lives in pure functions with no I/O, so each can be unit-tested against textbook cases (a known FTP → known zones). This is what makes the section-07 results trustworthy.
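
A "known FTP → known zones" case might look like the sketch below. The percentage boundaries follow the widely cited Coggan power levels; whether APEX uses exactly these bands is an assumption, and `power_zones` is a hypothetical name.

```python
from typing import List, Optional, Tuple

# Upper bound of each zone as an integer percentage of FTP
# (classic Coggan power levels; assumed here, not confirmed for APEX).
ZONE_UPPER_PCT = [
    ("Z1 active recovery", 55),
    ("Z2 endurance", 75),
    ("Z3 tempo", 90),
    ("Z4 threshold", 105),
    ("Z5 VO2max", 120),
    ("Z6 anaerobic", 150),
]

Zone = Tuple[str, int, Optional[int]]  # (name, lower watts, upper watts or None)


def power_zones(ftp_w: float) -> List[Zone]:
    """Pure function: FTP in watts -> list of power zones, no I/O."""
    if ftp_w <= 0:
        raise ValueError("ftp_w must be positive")
    zones: List[Zone] = []
    lower = 0
    for name, pct in ZONE_UPPER_PCT:
        upper = round(ftp_w * pct / 100)
        zones.append((name, lower, upper))
        lower = upper + 1
    zones.append(("Z7 neuromuscular", lower, None))  # open-ended top zone
    return zones


def test_known_ftp_gives_known_zones() -> None:
    """Textbook case: FTP of 200 W."""
    zones = power_zones(200)
    assert zones[0] == ("Z1 active recovery", 0, 110)
    assert zones[3] == ("Z4 threshold", 181, 210)
    assert zones[-1] == ("Z7 neuromuscular", 301, None)


if __name__ == "__main__":
    test_known_ftp_gives_known_zones()
    for name, lo, hi in power_zones(200):
        print(name, lo, "+" if hi is None else hi)
```

Because the function touches no files, clocks or network, the test needs no fixtures and can run in the second reader's environment unchanged.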

⚙️ Decision: static demo, real models behind it

Shipping the demo as a static site removes deployment risk at defense time, while keeping the model code in a separate, reproducible package the second reader can run.

core feature · environmental normalization · python
# Adjust a target power so that *training stress* stays constant
# across heat, humidity, altitude and wind. Heuristic model used in
# the prototype; coefficients would be fit per-athlete in production.

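def enviro_normalize(base_w, temp_c, humidity, altitude_m, wind_kmh):
    """Return adjusted target power (W) and a per-factor breakdown.

    humidity is relative humidity in percent; wind_kmh is positive for a
    headwind and negative for a tailwind.
    """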
    factors = {}

    # Heat + humidity: cardiovascular drift above a ~18C neutral point
    heat_index = temp_c + 0.1 * max(humidity - 40, 0)
    factors["heat"] = -0.012 * max(heat_index - 18, 0)

    # Altitude: reduced VO2max above ~1000 m
    factors["altitude"] = -0.00006 * max(altitude_m - 1000, 0)

    # Wind: headwind costs power, tailwind returns some
    factors["wind"] = -0.004 * wind_kmh

    adj = 1.0 + sum(factors.values())
    adj = max(0.80, min(1.05, adj))   # clamp to plausible range

    return round(base_w * adj), factors
Why this snippet, for the panel. It shows a defensible model (neutral points and clamps grounded in physiology), a clean return contract (value plus an explainable breakdown), and an obvious path to rigour (replace fixed coefficients with per-athlete fits).
↳ Try it live with sliders in concept site §06 · Environmental normalization.
07 · Results

Evaluation against the success criteria

The single most common reason a Prototype loses marks is a demo with no evaluation. Here each success criterion from section 04 is tested and reported honestly — including where the prototype falls short.

ID | Success criterion | Method | Target | Result | Verdict
SC1 | Field coverage across ingest formats | Parse a corpus of FIT/TCX/GPX/webhook files; count mapped fields | ≥ 95% | 97.4% | ✓ Met
SC2 | Ingest-to-model latency | Time webhook receipt → updated athlete model (p95) | < 2 s | 1.3 s | ✓ Met
SC3 | Threshold estimation accuracy | Compare computed FTP/LTHR/CSS vs. lab/field-tested values | ±3% | ±2.6% | ✓ Met
SC4 | EnviroNorm stress consistency | Hold target stress; sweep conditions; compare modelled cost | ±5% | ±6.8% | ~ Partial
SC5 | Loop adapts in correct direction | Simulate a week; perturb performance & fatigue; check plan response | 100% | 100% | ✓ Met
Reading the results honestly. Four of five criteria are met. SC4 is only partially met (±6.8% vs. a ±5% target): the shortfall appears at the extremes of the heat/altitude range, because the heuristic coefficients are not athlete-specific. Reporting this as a miss, with a stated cause and a fix (per-athlete coefficient fitting), is worth more marks than hiding it.
↳ The adaptive loop behind SC5 runs live in concept site §08 · Closed loop; SC3 zones are in §04 · Physiology.
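
Pre-registered criteria are easy to encode so that verdicts are computed rather than asserted. The sketch below uses the targets and results from the table above; the `Criterion` structure and comparison operators are hypothetical, and the check is strictly binary, so SC4 (reported above as "Partial") comes out as not met.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Criterion:
    sc_id: str
    description: str
    target: float
    result: float
    passes: Callable[[float, float], bool]  # called as passes(result, target)


# Targets and results copied from the evaluation table.
CRITERIA: List[Criterion] = [
    Criterion("SC1", "field coverage (%)", 95.0, 97.4, operator.ge),
    Criterion("SC2", "ingest-to-model latency, p95 (s)", 2.0, 1.3, operator.lt),
    Criterion("SC3", "threshold error (±%)", 3.0, 2.6, operator.le),
    Criterion("SC4", "EnviroNorm stress deviation (±%)", 5.0, 6.8, operator.le),
    Criterion("SC5", "correct-direction adaptations (%)", 100.0, 100.0, operator.ge),
]


def verdicts(criteria: List[Criterion]) -> Dict[str, bool]:
    """Map each criterion ID to True (met) or False (not met)."""
    return {c.sc_id: c.passes(c.result, c.target) for c in criteria}


if __name__ == "__main__":
    outcome = verdicts(CRITERIA)
    for sc_id, met in outcome.items():
        print(sc_id, "met" if met else "not met")
    print(f"{sum(outcome.values())} of {len(outcome)} criteria met")
```

Freezing the targets in code before any results exist is one simple way to show the panel that the bars were not moved afterwards.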
08 · Written report

Report structure mapping & how it scores

The syllabus prescribes the report skeleton and a 25–50 page, APA-formatted, double-spaced paper. Below: how each required section is filled from this exemplar, and how the artefact maps onto the 60/40 rubric.

Title page · Abstract
"APEX: An orchestration layer for multisport athlete data." 250-word abstract = problem + approach + headline result (4/5 success criteria met).
Introduction
Section 02 here — the fragmentation problem, motivation, and why it is a CS problem. Ends with the aims and RQs from section 04.
Literature review
Section 03 — scholarly basis (≥ 5 sources, satisfying the proposal requirement) plus the competitive landscape and the explicit gap.
Methodology
Section 05 — architecture, canonical data model, defended stack, modelling choices and their physiological grounding.
Implementation / experimental design
Section 06 — key build decisions and a representative core feature; the test harness that makes section 07 credible.
Results
Section 07 — the success-criteria table, reported against pre-registered targets, including the SC4 partial.
Discussion & conclusion
Section 10 — what the results mean, the explicit market comparison the syllabus asks for, limitations, and future work.
Bibliography · Appendix · Software
APA references (section 11), supporting tables in an appendix, and the runnable software package — all excluded from the page count.
Written report (supervisor + external second reader) · 60%
Earned through the literature-grounded methodology, the pre-registered evaluation, and the honest results discussion. The external reader rewards reproducibility — hence the runnable software package and pinned dependencies.
Oral presentation (two panelists) · 40%
Earned in the defense (section 09): a tight 15-minute narrative of RQ → method → result, and confident handling of 20 minutes of questions — including the SC4 shortfall.
The 20% inside the 60%. The syllabus notes the supervisor contributes 20% of the final grade, split between final-product quality and the collaboration process (timeliness, incorporated feedback). The most controllable marks on the whole project are the process ones — they reward hitting the three spring deliverables on time and acting on supervisor feedback. ↳ Full rubric and weighting in course outline §06 · Assessment.
09 · Oral defense

The oral-defense plan

15 minutes to present, up to 20 for questions, in front of a four-judge panel (supervisor, second reader, two outside panelists). The plan below is built around what each judge is actually listening for.

⏱️ The 15-minute arc

  • 0–2 min — the problem, made vivid (the fragmented athlete).
  • 2–4 min — the gap and the thesis claim (orchestration).
  • 4–7 min — architecture & method, using the one diagram.
  • 7–10 min — a live or recorded demo of one core feature.
  • 10–13 min — results against the success criteria, SC4 shortfall included.
  • 13–15 min — limitations, future work, and the one-sentence contribution.

👁️ What each panelist looks for

  • Supervisor: that the work matches what they watched develop; no surprises.
  • Second reader: methodological rigour and reproducibility from the paper alone.
  • Outside panelists: can you justify choices your supervisor never asked you to defend? Can you handle the gap question?
  • All four: a coherent story across paper, product and talk.

Anticipated questions & prepared answers

"Your EnviroNorm missed its target (SC4). Why should we trust the rest?"
Because SC4 was pre-registered and reported as a miss, not buried — and its cause (fixed, non-athlete-specific coefficients) is isolated to one model behind a clean interface, with a stated fix. The other four criteria use independently unit-tested pure functions.
"Isn't this just a wrapper around existing platforms?"
No incumbent is open, physiology-aware and closed-loop at once (the landscape matrix). The contribution is the canonical schema + closed loop that sits above them — orchestration is the novel slice, not any single feature.
"What's the single most important result?"
That heterogeneous streams unify at 97.4% field coverage with sub-2-second latency (SC1/SC2) — without that, none of the coaching intelligence is possible, so it is the load-bearing result.
"If you had another semester, what would you do?"
Replace heuristic coefficients with per-athlete fits (fixing SC4) and validate the loop on a real longitudinal athlete dataset rather than simulation.
Process workshop. The defense is scaffolded by Workshop 6, "Defending the thesis", in early May, immediately before the 1st-call defense window (May 20–31).
↳ See the full timeline in course outline §05 · Project structure.
10 · Discussion

Ethics, limitations & future work

The Discussion chapter is where comprehension shows. A strong capstone is candid about what it did not prove and what it would do next — and a project handling athlete biometrics must address ethics head-on.

⚖️ Ethics & data

  • Health data sensitivity — HR, sleep and location are special-category personal data; the design assumes informed consent, encryption at rest/in transit, and data minimisation.
  • GenAI use — disclosed via the syllabus's acknowledgment format; the written thesis itself is original, not AI-generated.
  • Integrity — all submissions screened (Turnitin/GPTZero); draft history retained as provenance.

🚧 Limitations

  • Models are heuristic and partly validated in simulation, not on a large real cohort.
  • EnviroNorm under-performs at condition extremes (SC4).
  • Device integrations are mocked at the API boundary, not certified against every vendor.
  • No formal user study of coach/athlete workflow yet.

🔭 Future work

  • Per-athlete coefficient fitting for all physiological models.
  • Longitudinal validation on a real consented athlete dataset.
  • Certified partner integrations (Garmin/Wahoo/Strava production APIs).
  • A controlled trial: APEX-coached vs. static-plan athletes.
The contribution, in one sentence. APEX demonstrates that a vendor-neutral canonical schema plus a closed adaptive loop can orchestrate the full multisport training cycle (a slice no incumbent owns), with the honest caveat that its physiological models need real-data validation before clinical claims.
11 · Bibliography

References & resources

An indicative APA-style reference set for this exemplar — the scholarly basis behind the models plus the artefacts referenced throughout. A real submission would expand these to the full citations used in text.

Banister, E. W., et al. · A systems model of training for athletic performance. The impulse–response basis for modelling training load and fatigue; underpins the CTL/ATL/TSB analytics layer (section 05).
Allen, H., & Coggan, A. · Training and Racing with a Power Meter. Source for Normalized Power, Intensity Factor, TSS and the fitness–fatigue framework used in the analytics and workout models.
Jones, A. M., et al. · Critical power: implications for endurance performance. Grounds the threshold (FTP/CP) estimation maths evaluated under success criterion SC3.
Périard, J. D., et al. · Cardiovascular adaptation and performance in the heat. Physiological basis for the EnviroNormalization model and its heat/humidity neutral points (section 06).
American Psychological Association · Publication Manual (7th ed.). The required citation and formatting standard for the written report. Manage citations with Zotero/Mendeley from the proposal's five sources onward.
Capstone Project · Official Syllabus (PC-CSAI.4.M.A). IE University, BCSAI, 2025–26. Prof. Alexandre Anahory de Sena Antunes Simões. The authoritative source for project types, deliverables, the report structure and the 60/40 rubric used on this page. → SYLLABUS.pdf
APEX · Adaptive Performance Ecosystem for eXcellence (the worked artefact). The interactive concept site modelled as a Prototype-type capstone throughout this dossier. → Open the concept site · course structure
Market references · TrainingPeaks · TriDot · Zwift · Strava · Garmin · Wahoo. The competitive landscape (section 03) and the feature-by-platform comparison that establishes APEX's gap. Product documentation cited in the report's market comparison.