affect-lab · course outline — AI: Personality & Emotion for AI Design

Course structure — AI: Personality & Emotion for AI Design

A compulsory second-year course in the Bachelor in Computer Science & Artificial Intelligence (BCSAI) at IE University, taught by Prof. Luciano Dyballa. Across 15 in-person sessions it builds from the foundations of cognitive science and the artificial neuron up through deep learning, perception, language, and the simulation of emotion and personality in AI — closing on the philosophy and ethics of machine minds. Many lectures pair theory with a live coding demo; the interactive labs on this site mirror those demos.

Program: BCSAI (Bachelor in CS & AI)
Code: AIPE-CSAI.2.M.A
Area: Computer Science
Sessions: 15 (live in-person)
Credits: 3.0 ECTS
Academic year: 2025–26 · SEP-2025
Degree course: Second
Semester: 2nd

Learning objectives

By the end of the course, students should be able to:

Understand core concepts of cognitive science, including perception, learning, memory, attention and consciousness.
Understand the key differences between learning in artificial vs. biological neural networks.
Grasp the foundational principles of neural networks and deep learning through in-class demos.
Explore how AI systems can be designed to simulate human traits such as emotion and personality.
Critically assess the potential of AI to replicate human-level intelligence.
Explore the societal impacts of AI technologies.

Methodology & assessment

Before each class, students prepare assignments and readings at home. Lectures combine theoretical explanation with practical examples, and most include a live coding demonstration. Each new technique is followed by worked examples, and problem sets build both the intuition behind the theory and the coding skills to implement the algorithms. Students are encouraged to collaborate on understanding the material, but all written work, including code, must be their own; any hints or outside solutions must be acknowledged.

Learning-activity weighting · ~75 h total

Activity: share of workload (estimated student time)
Lectures: 24.0% (18 h)
Group work: 22.7% (17 h)
Exercises / async / field work: 20.0% (15 h)
Individual studying: 20.0% (15 h)
Discussions: 13.3% (10 h)

Assessment — components & deliverables

Final exam: 40%
Group presentation: 30%
Individual work: 20%
Class participation: 10%

Final exam · 40%

Deliverable: comprehensive written exam (Session 13). Evaluation: covers the full program — theory and the algorithms behind each demo. Gate: you must score at least 3.5/10 on the exam to pass the course, even if every other component is already passed.

Group presentation · 30%

Deliverable: a written report plus an in-class presentation (Sessions 14–15). Evaluation: graded as a group on the report and the talk. Note: GenAI may not be used for group project submissions.

Individual work · 20%

Deliverable: selected exercises / problem sets announced throughout the course. Evaluation: assesses individual progress; all code must be your own with outside help acknowledged.

Class participation · 10%

Deliverable: active engagement in this highly interactive class — questions, remarks, occasional short in-class presentations or exercises. Evaluation: punctuality and class conduct are also taken into account.

Pass, attendance & re-sit rules

Exam gate: a minimum exam score of 3.5/10 is required to pass the overall course.
Attendance: students who do not meet the 80% attendance rule fail both the ordinary and extraordinary calls for the year and must re-enroll in (re-take) the course the following academic year.
Four calls: each course allows four chances to pass over two consecutive academic years (an ordinary call plus an extraordinary re-sit in June/July each year).
Re-sit: the June/July re-sit is a single comprehensive exam taken in person on campus (Segovia or Madrid); continuous evaluation is not carried over and the grade is capped at 8.0/10 ("notable"). Retakers (3rd call) may reach 10.0 and must contact the professor for their specific criteria.
GenAI policy: permitted for some specific tasks with acknowledgment; prohibited for group submissions, quizzes and exams. Misuse counts as academic misconduct and may fail the assignment or course.

Program — 15 sessions

The full program grouped into four thematic modules. Each module opens with its arc and learning outcomes; each session lists its objective, annotated topics, a key idea, the core model or formula where the content is technical, annotated readings, and links to the matching interactive lab where one exists.

Module I · Foundations: cognition & the artificial neuron

Sessions 1–4 · What intelligence is, how biological and artificial neurons work, and how networks learn.

Module learning outcomes

Situate cognitive science and AI historically and define what "intelligence" could mean.
Explain how a biological neuron fires and how the perceptron abstracts it.
Distinguish supervised, statistical and Hebbian learning rules.
Derive how stacking neurons and training them with backpropagation lets a network learn non-linear functions like XOR.

Session 1 · live in-person

Introduction to Cognitive Science and AI

Frame the course: what cognition is, what AI is, and what "intelligent" even means.

What is cognitive science? The interdisciplinary study of mind spanning psychology, neuroscience, linguistics, philosophy and CS; key theories model cognition as information processing (the mind as a symbol- or signal-manipulating system).
What is artificial intelligence? A short history from symbolic AI and the 1956 Dartmouth workshop through expert systems to the deep-learning era, and the field's shifting goals (mimic vs. rival human thought).
What is "intelligence"? Competing definitions — goal-directed rational action, adaptation, the Turing Test as a behavioural benchmark — and why none is fully settled.

Key idea

Cognitive science treats the mind as an information-processing system; AI asks how far that processing can be built in a machine — making "intelligence" a moving target rather than a fixed property.

Friedenberg — Cognitive Science, ch. 1: defines the field and the information-processing view of mind.

Russell & Norvig — AIMA, ch. 1: the four schools of AI (thinking/acting × humanly/rationally) and a history of the field.


Session 2 · live in-person

Neurons and the Basics of Neural Networks

From biological neurons to the first computational model that can learn.

How biological neurons work: dendrites integrate incoming signals, and the neuron fires an all-or-none action potential down its axon once the membrane potential crosses threshold — a noisy, spiking, energy-efficient computer.
Models of artificial neurons: the McCulloch–Pitts / threshold unit replaces this with a weighted sum passed through a step or activation function.
The perceptron: a single linear unit that learns a decision boundary by nudging weights whenever it misclassifies an input.

Key idea

A neuron — biological or artificial — is fundamentally a thresholded weighted sum; learning means adjusting the weights.

Artificial neuron: y = φ(Σᵢ wᵢxᵢ + b) · Perceptron update: wᵢ ← wᵢ + η(t − y)xᵢ (η = learning rate, t = target).
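The update rule above can be sketched in a few lines. This is a minimal illustration rather than the course's own demo: the AND dataset, the learning rate η = 1.0, the convention that the step function fires only when the weighted sum is strictly positive, and all function names are assumptions made for the example.

```python
def step(z):
    """Threshold activation (assumed convention: fire only when z > 0)."""
    return 1 if z > 0 else 0


def predict(weights, bias, x):
    """y = phi(sum_i w_i * x_i + b) with a step activation."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return step(z)


def train_perceptron(samples, eta=1.0, epochs=20):
    """Apply w_i <- w_i + eta * (t - y) * x_i after every sample."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, t in samples:
            y = predict(weights, bias, x)
            error = t - y  # 0 when correct, +1 or -1 when misclassified
            weights = [w + eta * error * xi for w, xi in zip(weights, x)]
            bias += eta * error  # the bias behaves like a weight on a constant input 1
    return weights, bias


# Logical AND is linearly separable, so a single perceptron can learn it.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_DATA)
predictions = [predict(weights, bias, x) for x, _ in AND_DATA]
print(predictions)  # → [0, 0, 0, 1]
```

Weights change only on misclassified inputs, because (t − y) is zero whenever the prediction is already correct. Swapping the targets for XOR would leave the loop cycling without ever classifying all four points, which is the limitation Session 4 addresses.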

Nielsen — NNDL, ch. 1: perceptrons, sigmoid neurons and why a smooth activation matters for learning.

Interactive labs: artificial neuron · perceptron

Session 3 · live in-person

Principles of learning in biological vs. artificial neural networks

Contrast supervised, statistical and Hebbian routes to learning.

Supervised learning in AI: training on labelled (input, target) pairs so the network minimises the gap between its output and the known answer.
Pattern recognition: mapping raw inputs to category labels — the canonical task that perceptrons and their successors are built to solve.
Hebbian learning: the biological rule "cells that fire together wire together" — an unsupervised, local rule that strengthens correlated connections.

Key idea

Brains rely heavily on local, unsupervised Hebbian adjustment; most AI relies on global, supervised error correction — two very different answers to "how should a weight change?"

Hebbian rule: Δwᵢⱼ = η · xᵢ · xⱼ (weight grows with the correlation of the two units' activity).
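As a rough illustration of the rule, the sketch below tracks one weight between units that fire together and one between units that never do. The binary activity traces, the learning rate and the function name are made up for the example.

```python
def hebbian_update(w, x_i, x_j, eta=0.1):
    """Plain Hebbian step: delta_w = eta * x_i * x_j (no targets, no error signal)."""
    return w + eta * x_i * x_j


# Hypothetical binary activity of one presynaptic unit and two postsynaptic units.
pre = [1, 0, 1, 1, 0]
post_correlated = [1, 0, 1, 1, 0]    # active exactly when `pre` is active
post_uncorrelated = [0, 1, 0, 0, 1]  # active only when `pre` is silent

w_corr = 0.0
w_uncorr = 0.0
for x_pre, y_c, y_u in zip(pre, post_correlated, post_uncorrelated):
    w_corr = hebbian_update(w_corr, x_pre, y_c)
    w_uncorr = hebbian_update(w_uncorr, x_pre, y_u)

print(round(w_corr, 3), w_uncorr)  # co-active pair is strengthened, the other stays at 0
```

The update uses only the activity of the two connected units, which is what makes the rule local. In this plain form the weight can only grow while the units stay co-active, so practical variants add a normalising or decay term.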

Nielsen — NNDL, ch. 1–2: how networks learn from data and the gradient-based alternative to Hebbian updates.

Interactive labs: perceptron · Hebbian / PCA

Session 4 · live in-person

Multi-layer perceptrons

How depth and backpropagation overcome the limits of a single neuron.

Feedforward networks: stacking layers of neurons so the network can represent non-linear functions a single perceptron cannot (e.g. XOR).
Loss functions and optimization: a loss (e.g. mean-squared error or cross-entropy) scores how wrong the output is; training is searching weights that minimise it.
Stochastic Gradient Descent (SGD): step the weights downhill along the loss gradient, estimating that gradient from one mini-batch at a time instead of the full dataset, which makes each update much cheaper.
Backpropagation: the chain rule applied layer-by-layer to compute each weight's contribution to the loss efficiently.

Key idea

Depth gives expressive power; backprop + gradient descent make that power trainable by assigning credit and blame across every weight.

Gradient-descent step: w ← w − η ∂L/∂w · backprop propagates ∂L/∂w via the chain rule.
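To make the chain rule concrete, here is a small sketch of backpropagation through a 2-2-1 sigmoid network, checked against a finite-difference estimate and followed by one gradient-descent step. The architecture, the squared-error loss L = ½(y − t)², the parameter values and every name below are assumptions chosen for illustration, not the course's demo code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Return hidden activations h and output y of a 2-2-1 network."""
    W1, b1, W2, b2 = params
    h = [sigmoid(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j]) for j in range(2)]
    y = sigmoid(W2[0] * h[0] + W2[1] * h[1] + b2)
    return h, y


def loss(params, x, t):
    _, y = forward(params, x)
    return 0.5 * (y - t) ** 2


def backprop(params, x, t):
    """Chain rule, output layer first, then the hidden layer."""
    W1, b1, W2, b2 = params
    h, y = forward(params, x)
    delta_out = (y - t) * y * (1.0 - y)  # dL/dz at the output unit
    grad_W2 = [delta_out * h[j] for j in range(2)]
    delta_hid = [delta_out * W2[j] * h[j] * (1.0 - h[j]) for j in range(2)]
    grad_W1 = [[delta_hid[j] * x[i] for i in range(2)] for j in range(2)]
    return grad_W1, delta_hid, grad_W2, delta_out  # bias gradients equal the deltas


def numeric_grad_W1_00(params, x, t, eps=1e-6):
    """Central finite difference for dL/dW1[0][0], used only as a sanity check."""
    W1, b1, W2, b2 = params
    up = ([[W1[0][0] + eps, W1[0][1]], W1[1]], b1, W2, b2)
    down = ([[W1[0][0] - eps, W1[0][1]], W1[1]], b1, W2, b2)
    return (loss(up, x, t) - loss(down, x, t)) / (2 * eps)


params = ([[0.5, -0.3], [0.8, 0.2]], [0.1, -0.1], [1.0, -1.5], 0.05)
x, t = (1.0, 0.0), 1.0
grad_W1, grad_b1, grad_W2, grad_b2 = backprop(params, x, t)
print(grad_W1[0][0], numeric_grad_W1_00(params, x, t))  # the two should agree closely

# One gradient-descent step: w <- w - eta * dL/dw for every parameter.
eta = 0.5
W1, b1, W2, b2 = params
new_params = (
    [[W1[j][i] - eta * grad_W1[j][i] for i in range(2)] for j in range(2)],
    [b1[j] - eta * grad_b1[j] for j in range(2)],
    [W2[j] - eta * grad_W2[j] for j in range(2)],
    b2 - eta * grad_b2,
)
print(loss(params, x, t), loss(new_params, x, t))  # loss before and after the step
```

The hidden-layer deltas reuse the output delta instead of recomputing it, which is the efficiency backpropagation buys: one backward sweep yields every gradient.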

Nielsen — NNDL, ch. 2: the backpropagation algorithm derived from first principles, plus SGD.

Interactive labs: gradient descent · MLP / XOR

Module II · Deep learning, memory & representation

Sessions 5–7 · Training deep networks in practice, generalization, and how representations compress information.

Module learning outcomes

Relate neural development and plasticity to the practical levers of training deep nets.
Choose sensible weight initialization, activation functions and regularization.
Explain overfitting, generalization and why deep nets are data-hungry compared with humans.
Describe how hidden layers learn compressed latent representations of data.

Session 5 · live in-person

Developing brains vs. training ANNs

Compare neural development with the practical craft of training deep nets.

Development & plasticity: brains wire themselves through experience-dependent plasticity; the nature-vs-nurture balance shapes which circuits form.
Training deep networks in practice: the engineering analogue — the choices that decide whether a deep net actually converges.
Weight initialization, activation functions, regularization: good init avoids vanishing/exploding gradients; non-linear activations (ReLU, sigmoid, tanh) give expressive power; regularization (L2, dropout) curbs overfitting.

Key idea

Both brains and deep nets need the right starting conditions and constraints to learn well — initialization and regularization are the network's "developmental environment."

Activations: ReLU(x)=max(0,x) · σ(x)=1/(1+e⁻ˣ) · tanh(x). L2 penalty adds λΣw² to the loss.
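The formulas above translate directly into code. The sketch below is illustrative only; the sample weights and the value of λ are arbitrary assumptions.

```python
import math


def relu(x):
    """ReLU(x) = max(0, x): zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def sigmoid(x):
    """sigma(x) = 1 / (1 + e^-x): squashes any input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def l2_penalty(weights, lam):
    """lambda * sum(w^2): added to the data loss to discourage large weights."""
    return lam * sum(w * w for w in weights)


for x in (-2.0, 0.0, 2.0):
    print(x, relu(x), round(sigmoid(x), 4), round(math.tanh(x), 4))

weights = [1.0, -2.0, 0.5]  # arbitrary example weights
print(l2_penalty(weights, lam=0.01))
```

Sigmoid and tanh flatten out for large |x|, so their gradients shrink there, while ReLU keeps a gradient of 1 for every positive input. That difference is one reason the choice of activation interacts with initialization.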

Nielsen — NNDL, ch. 3: better initialization, the cross-entropy cost, and regularization techniques (incl. dropout) for improving how nets learn.

Interactive labs: activations

Session 6 · live in-person

Memory, learning and problem solving

Why humans generalize from few examples while networks often need many.

Learning from limited examples: humans generalise from a handful of instances via inductive reasoning and strong priors — few-shot by default.
Generalization in AI: the gap between training and test performance; overfitting is memorising noise, fought with data augmentation and regularization.
Data hunger: with millions of parameters and weak priors, deep nets usually need large datasets to generalise reliably.

Key idea

Generalization, not memorisation, is the goal — and the human brain's strong priors are exactly what data-hungry networks lack.

Overfitting shows as low training loss but high test loss; augmentation and regularization shrink that generalization gap.
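One way to see the generalization gap without any neural network is a nearest-neighbour classifier on noisy labels. The sketch below swaps in k-nearest neighbours as a stand-in model (it is not a technique named in the course outline); the synthetic data, the 20% label noise, the seed and the values of k are all assumptions. With k = 1 the model memorises every training point, noise included; a larger k smooths the decision, loosely analogous to regularization.

```python
import random


def make_data(n, rng, noise=0.2):
    """1-D points labelled 1 when x > 0.5, with a fraction of labels flipped."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = 1 if x > 0.5 else 0
        if rng.random() < noise:
            label = 1 - label  # label noise the model should not memorise
        data.append((x, label))
    return data


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def error_rate(train, data, k):
    wrong = sum(1 for x, label in data if knn_predict(train, x, k) != label)
    return wrong / len(data)


rng = random.Random(0)
train = make_data(50, rng)
test = make_data(500, rng)

for k in (1, 15):
    train_err = error_rate(train, train, k)
    test_err = error_rate(train, test, k)
    print(f"k={k}: train error {train_err:.2f}, test error {test_err:.2f}")
```

With k = 1 each training point is its own nearest neighbour, so the training error is zero by construction while the test error is not. That pairing of low training loss with high test loss is the signature of overfitting described above.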

Nielsen — NNDL, ch. 3: overfitting, regularization and why more data helps.

Friedenberg — Cognitive Science (memory): human memory systems and inductive reasoning as a contrast case.

Interactive labs: MLP / XOR

Session 7 · live in-person

Dimensionality reduction

How latent representations compress high-dimensional data onto meaningful axes.

Dimensionality reduction: projecting high-dimensional data onto a few informative axes that capture most of its variance (e.g. PCA).
Latent representations in hidden layers: hidden units learn compressed codes for the input — the network's internal "concepts."
Hebbian networks: a Hebbian unit with a normalising term (Oja's rule) can converge to the first principal component of its inputs, linking biological plasticity to PCA.

Key idea

Both PCA and Hebbian learning find the directions of maximal variance — so a simple "fire-together" rule recovers the same compressed representation as a classic statistical method.

PCA keeps the top eigenvectors of the covariance matrix C = (1/n)Σ xxᵀ (for zero-mean data x); Oja's Hebbian rule, Δw = η·y·(x − y·w) with y = wᵀx, converges to the leading one.
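The link between Hebbian learning and PCA can be checked numerically. The sketch below applies Oja's rule, Δw = η·y·(x − y·w) with y = wᵀx, to synthetic zero-mean 2-D data stretched along the (1, 1) diagonal. The data generator, seed, learning rate and number of steps are assumptions chosen so the example stays small and stable.

```python
import math
import random

rng = random.Random(0)


def sample():
    """Zero-mean 2-D point with most of its variance along the (1, 1) diagonal."""
    major = rng.gauss(0.0, 2.0)  # standard deviation along the diagonal
    minor = rng.gauss(0.0, 0.5)  # standard deviation across it
    s = math.sqrt(0.5)
    return (s * (major - minor), s * (major + minor))


w = [1.0, 0.0]  # arbitrary starting weights
eta = 0.005
for _ in range(5000):
    x = sample()
    y = w[0] * x[0] + w[1] * x[1]  # the unit's output
    # Hebbian term eta*y*x plus a decay term -eta*y^2*w that keeps |w| near 1.
    w = [w[i] + eta * y * (x[i] - y * w[i]) for i in range(2)]

norm = math.hypot(w[0], w[1])
# |cosine| of the angle between w and the true leading direction (1, 1)/sqrt(2).
alignment = abs(w[0] + w[1]) / (math.sqrt(2.0) * norm)
print(w, norm, alignment)
```

The sign of the learned vector is arbitrary, as it is for any eigenvector, which is why the check uses the absolute cosine.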

Nielsen — NNDL: what hidden-layer representations encode and why compression aids learning.

Interactive labs: Hebbian / PCA

Module III · Perception, vision & language

Sessions 8–10 · How humans and machines perceive images and sound, see, and process language.

Module learning outcomes

Contrast bottom-up, top-down, Gestalt and ecological theories of perception.
Explain how convolutional networks exploit local structure the way the visual cortex does.
Account for adversarial examples and where machine and human vision diverge.
Connect language and thought to NLP and modern language models.

Session 8 · live in-person

Perception in Humans and Artificial Neural Networks

Theories of perception and the convolutional networks that echo the visual cortex.

Processing visual and auditory information: how sensory signals are transduced and built up into stable percepts.
Theories of perception: bottom-up (data-driven) vs. top-down (expectation-driven) processing, Gestalt grouping principles, and Gibson's ecological view that the environment affords perception directly.
Convolutional Neural Networks (CNNs): local receptive fields, shared weights and pooling — directly inspired by the hierarchy of the visual cortex.

Key idea

CNNs encode the visual-cortex insight that nearby pixels are related: weight sharing across the image yields translation-tolerant feature detectors.

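Convolution: (I∗K)(x,y)=ΣᵢΣⱼ I(x+i,y+j)·K(i,j), i.e. a small kernel K slid across the image I. Strictly, this unflipped form is cross-correlation; deep-learning practice conventionally calls it convolution, since the kernel is learned either way.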
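The sum above can be written as two nested loops over kernel offsets. This sketch assumes "valid" borders (no padding) and stride 1; the toy image and the vertical-edge kernel are invented for the example.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with no padding: out[r][c] = sum_ij I[r+i][c+j] * K[i][j]."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy 4x4 image: dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Kernel that responds where brightness increases from left to right.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d_valid(image, vertical_edge)
for row in feature_map:
    print(row)  # each row → [0, 2, 0]
```

The same four kernel weights are reused at every position, so the edge is detected wherever it appears. That reuse is the weight sharing behind the translation tolerance noted in the key idea.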

Friedenberg — Cognitive Science (perception): bottom-up/top-down processing, Gestalt and ecological theories.

Nielsen — NNDL, ch. 6: convolutional networks and deep learning for image recognition.

Interactive labs: convolution

Session 9 · live in-person

Vision

Deepen CNNs and probe where machine and human vision diverge.

CNNs, part II: deeper architectures, stacked feature hierarchies and what successive layers come to detect (edges → textures → objects).
Adversarial examples: tiny, often imperceptible input perturbations that flip a CNN's prediction — exposing that machine vision does not "see" the way humans do.

Key idea

Matching human accuracy does not mean matching human robustness: adversarial examples reveal that CNNs latch onto statistical cues invisible and irrelevant to people.

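An adversarial input adds a crafted perturbation: x′ = x + ε·sign(∇ₓL). This is the fast gradient sign method (FGSM), which moves every input dimension by ε in the direction that increases the loss while staying near x.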
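The perturbation formula can be tried on a model small enough to differentiate by hand. The sketch below uses a fixed logistic-regression classifier instead of a CNN; its weights, the input, the label and ε = 0.25 are assumptions (a perturbation that large would not be imperceptible on a real image, but it keeps a three-dimensional toy readable). For this model with cross-entropy loss, the input gradient is ∇ₓL = (p − t)·w.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def cross_entropy(p, t):
    return -(t * math.log(p) + (1 - t) * math.log(1 - p))


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, t, eps):
    """x' = x + eps * sign(grad_x L), with grad_x L = (p - t) * w for this model."""
    p = predict_proba(w, b, x)
    grad_x = [(p - t) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad_x)]


w, b = [2.0, -1.0, 0.5], 0.0  # hypothetical trained weights
x, t = [0.3, 0.2, 0.4], 1     # an input the model classifies correctly as class 1

x_adv = fgsm(w, b, x, t, eps=0.25)
p_clean = predict_proba(w, b, x)
p_adv = predict_proba(w, b, x_adv)
print(p_clean, cross_entropy(p_clean, t))  # confident and correct
print(p_adv, cross_entropy(p_adv, t))      # pushed across the decision boundary
```

No coordinate moves by more than ε, yet the prediction flips because every coordinate moves in the direction that hurts the model most.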

Nielsen — NNDL, ch. 6: deep convolutional networks; basis for discussing their failure modes.

Interactive labs: convolution

Session 10 · live in-person

Language

From the language–thought relationship to NLP and language models.

Language and thought: how language structures cognition (e.g. linguistic-relativity debates) and what it reveals about the mind.
Human language processing: how people parse, produce and comprehend language in real time.
NLP and language models: representing words as vectors and predicting text statistically — the foundation that leads into attention and LLMs.

Key idea

Modern NLP treats meaning as geometry: words become vectors and a language model learns P(next word | context), sidestepping explicit grammar.
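The quantity P(next word | context) can be estimated in its simplest form by counting. The sketch below is a bigram model on an invented twelve-token corpus: the context is just the previous word, and there are no word vectors or neural networks, so it illustrates the prediction target, not the models covered in class.

```python
from collections import Counter, defaultdict

# Toy corpus (an assumption for illustration); "." is treated as an ordinary token.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# counts[prev][nxt] = number of times `nxt` directly follows `prev`.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1


def next_word_probs(context):
    """Estimate P(next word | previous word) by relative frequency."""
    following = counts[context]
    total = sum(following.values())
    return {word: c / total for word, c in following.items()}


probs = next_word_probs("the")
print(probs)  # → {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
```

Nothing here encodes grammar explicitly; the distribution falls out of co-occurrence statistics. Neural language models keep the same objective but replace the count table with learned vector representations and a much longer context.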

Friedenberg — Cognitive Science (language): human language processing and the language–thought relationship.

Interactive labs: attention

Module IV · Emotion, personality, philosophy & assessment

Sessions 11–15 · Attention and LLMs, simulating emotion and personality, the philosophy of mind, and final assessment.

Module learning outcomes

Explain the attention mechanism and how it underpins Large Language Models.
Represent emotion (valence–arousal, PAD) and personality (Big Five) computationally.
Identify sources of bias in AI systems and their societal impact.
Engage critically with the Chinese Room and the question of machine understanding.

Session 11 · live in-person

Attention, Bias, Personality & Emotion

The heart of the course: how AI recognizes and simulates affect and character.

Attention in deep learning: letting the network weigh which parts of the input matter most for each output — the mechanism behind Transformers.
Large Language Models: Transformer-based models trained on vast text that can be steered to express a persona or tone.
Recognising & simulating personality & emotion: emotion modelled as valence–arousal or PAD (pleasure–arousal–dominance); personality as the Big Five (OCEAN) trait dimensions.
Bias in AI systems: models inherit and can amplify the social biases present in their training data.

Key idea

Emotion and personality become tractable for AI once represented as low-dimensional coordinates — valence–arousal, PAD, or the Big Five — that a model can read off and reproduce.

Attention: softmax(QKᵀ/√d)·V · Emotion: 2-D (valence, arousal) or 3-D PAD · Personality: Big Five (O, C, E, A, N).
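The attention formula can be computed by hand for a tiny case. The sketch below implements scaled dot-product attention for a single head with plain lists; the query, key and value vectors are made up, and d is taken to be the key dimension.

```python
import math


def softmax(scores):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V, computed one query row at a time."""
    d = len(K[0])
    outputs, all_weights = [], []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)  # how much this query attends to each position
        out = [sum(w * v[c] for w, v in zip(weights, V)) for c in range(len(V[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# One query and three input positions (hypothetical 2-D vectors).
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

outputs, weights = attention(Q, K, V)
print(weights[0])  # positions 0 and 2 match the query and get equal, larger weights
print(outputs[0])  # weighted average of the value vectors
```

Each output is a weighted average of the value vectors, with weights that depend on how well the query matches each key. This is the sense in which the network "weighs which parts of the input matter most."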

Minsky — The Emotion Machine: emotions as alternative "ways to think," motivating their place in AI design.

Interactive labs: attention · valence–arousal · PAD · Big Five

Session 12 · live in-person

Philosophy of mind vs. AI

Can a machine truly understand? Classic critiques of strong AI.

What is "understanding"? The gap between manipulating symbols and grasping their meaning (syntax vs. semantics).
The Chinese Room: Searle's argument that running a program to produce fluent answers does not entail genuine understanding.
Can AI be truly intelligent? Weighing strong vs. weak AI, and what behavioural success can and cannot prove about minds.

Key idea

Passing a behavioural test (Turing) is not the same as understanding (Searle): the course closes by separating competence from comprehension.

Minsky — The Emotion Machine: a constructivist counterpoint to the idea that mind is irreducibly non-computational.

Russell & Norvig — AIMA (philosophy): the weak/strong-AI distinction and the Chinese Room debate.


Session 13 · live in-person

Final Exam

Comprehensive exam — 40% of the grade; minimum 3.5/10 required to pass the course.

Covers the full program: cognitive-science concepts and the algorithms behind every demo. A score below 3.5 fails the course even if all other components passed.

assessment

Session 14 · live in-person

Group Project presentations

In-class presentation of the group project (written report + presentation = 30%).

Graded as a group; GenAI may not be used for the project submission.

assessment

Session 15 · live in-person

Group Project presentations

Continuation of group project presentations.

Remaining groups present; closes the continuous-evaluation portion of the course.

assessment

Key concepts — glossary

A quick reference to the core terms used across the program, in roughly the order they appear.

Cognitive science: The interdisciplinary study of mind that models cognition as information processing across psychology, neuroscience, linguistics, philosophy and CS.
Turing Test: A behavioural benchmark: a machine is deemed intelligent if its conversation is indistinguishable from a human's.
Action potential: The all-or-none electrical spike a biological neuron fires once its membrane potential crosses threshold.
Artificial neuron: A unit computing a weighted sum of inputs plus bias passed through an activation function, y=φ(Σᵢwᵢxᵢ+b).
Perceptron: A single linear threshold neuron that learns a decision boundary by correcting its weights on misclassified examples.
Activation function: The non-linearity (ReLU, sigmoid, tanh) that gives a network expressive power beyond a linear map.
Supervised learning: Learning from labelled (input, target) pairs by minimising the error between output and target.
Hebbian learning: An unsupervised, local rule ("cells that fire together wire together"), Δwᵢⱼ=η·xᵢ·xⱼ.
Multi-layer perceptron: A feedforward network of stacked neuron layers able to represent non-linear functions such as XOR.
Loss function: A scalar measure of how wrong predictions are (e.g. MSE, cross-entropy) that training seeks to minimise.
Gradient descent: Iteratively stepping weights downhill along the loss gradient, w←w−η∂L/∂w.
Backpropagation: Efficient computation of loss gradients for every weight by applying the chain rule backward through layers.
Regularization: Techniques (L2 weight decay, dropout) that constrain a model to curb overfitting.
Overfitting: Memorising training noise so that test performance lags far behind training performance.
Generalization: A model's ability to perform well on unseen data — the actual goal of learning.
Dimensionality reduction: Projecting high-dimensional data onto fewer informative axes that retain most variance (e.g. PCA).
Latent representation: The compressed internal code a hidden layer learns for its inputs.
Convolutional neural network: A vision network using local receptive fields, shared weights and pooling, inspired by the visual cortex.
Adversarial example: A tiny, often imperceptible input perturbation that flips a model's prediction.
Attention: A mechanism weighting which parts of the input matter for each output, softmax(QKᵀ/√d)V; basis of Transformers.
Large Language Model: A Transformer trained on vast text to predict the next token, steerable toward a persona or tone.
Valence–arousal: A 2-D model of affect: pleasantness (valence) by activation level (arousal) — the circumplex of emotion.
PAD model: A 3-D emotion space of Pleasure, Arousal and Dominance.
Big Five (OCEAN): The five trait dimensions — Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism — used to model personality.
Algorithmic bias: Systematic unfairness a model inherits and can amplify from biased training data.
Chinese Room: Searle's argument that symbol manipulation producing fluent output need not constitute genuine understanding.

Bibliography

Recommended texts, annotated with where each is used across the program.

Stuart Russell & Peter Norvig (2020). Artificial Intelligence: A Modern Approach, 4th ed. Pearson. ISBN 978013461099 (printed). Use: the AI backbone — history and definitions of intelligence (Session 1) and the philosophy of AI / Chinese Room (Session 12).
Jay Friedenberg, Gordon Silverman & Michael James Spivey (2021). Cognitive Science: An Introduction to the Study of Mind, 4th ed. SAGE. ISBN 9781544380155 (printed). Use: the cognitive-science side — foundations (S1), memory & inductive reasoning (S6), perception theories (S8), language and thought (S10).
Marvin Minsky (2007). The Emotion Machine. Simon & Schuster. ISBN 978074327664 (printed). Use: emotion and personality in AI (Session 11) and the philosophy-of-mind discussion (Session 12).
Michael A. Nielsen (2015). Neural Networks and Deep Learning. Determination Press — neuralnetworksanddeeplearning.com (digital). Use: the technical core for the network sessions — neurons & perceptrons (S2–3, ch. 1–2), backprop & SGD (S4, ch. 2), training/regularization & generalization (S5–7, ch. 3), and CNNs/vision (S8–9, ch. 6).

Additional policies (Code of Conduct, Attendance, Ethics) follow the University's general regulations; the Program Director may provide further indications. See the full syllabus PDF for the complete text.