affect-lab AI: Personality & Emotion for AI Design

Course structure — AI: Personality & Emotion for AI Design

A compulsory second-year course in the Bachelor in Computer Science & Artificial Intelligence (BCSAI) at IE University, taught by Prof. Luciano Dyballa. Across 15 in-person sessions it builds from the foundations of cognitive science and the artificial neuron up through deep learning, perception, language, and the simulation of emotion and personality in AI — closing on the philosophy and ethics of machine minds. Many lectures pair theory with a live coding demo; the interactive labs on this site mirror those demos.

Program: BCSAI — Bachelor in CS & AI
Code: AIPE-CSAI.2.M.A
Area: Computer Science
Sessions: 15 (live in-person)
Credits: 3.0 ECTS
Academic year: 2025–26 · SEP-2025
Degree course: Second
Semester:
Category: Compulsory
Language: English
Professor: Luciano Dyballa
Contact: ldyballa@faculty.ie.edu

Subject description: this course introduces the foundational concepts of cognitive science and deep learning, exploring the intersection of human cognition and artificial intelligence — how the brain processes information and how neural networks attempt to replicate it. Key topics include perception, learning, memory, attention and consciousness, alongside the more nuanced questions of whether emotion and personality can be simulated in AI. Students critically analyse whether AI can exhibit human-like intelligence and discuss the ethical implications of building such systems. The instructor, Prof. Luciano Dyballa, holds a Ph.D. in Computer Science from Yale University in machine learning, vision and computational neuroscience, and researches the principles bridging biological and deep neural networks.

Learning objectives

By the end of the course, students should be able to:

Methodology & assessment

Before each class students prepare assignments and readings at home; lectures combine theoretical explanation with practical examples, and most include a live coding demonstration. Each new technique is followed by worked examples, and problem sets build both the intuition behind the theory and the coding skills to implement the algorithms. Students are encouraged to collaborate on understanding the material, but all written work, including code, must be their own; any hints or outside solutions must be acknowledged.

Learning-activity weighting · ~75 h total
Lectures: 24.0%
Group work: 22.7%
Exercises / async / field work: 20.0%
Individual studying: 20.0%
Discussions: 13.3%

Estimated student time: Lectures 18 h · Discussions 10 h · Exercises/async/field work 15 h · Group work 17 h · Individual studying 15 h.

Assessment — components & deliverables
Final exam: 40%
Group presentation: 30%
Individual work: 20%
Class participation: 10%
Final exam · 40%

Deliverable: comprehensive written exam (Session 13). Evaluation: covers the full program — theory and the algorithms behind each demo. Gate: you must score at least 3.5/10 on the exam to pass the course, even if every other component is already passed.

Group presentation · 30%

Deliverable: a written report plus an in-class presentation (Sessions 14–15). Evaluation: graded as a group on the report and the talk. Note: GenAI may not be used for group project submissions.

Individual work · 20%

Deliverable: selected exercises / problem sets announced throughout the course. Evaluation: assesses individual progress; all code must be your own with outside help acknowledged.

Class participation · 10%

Deliverable: active engagement in this highly interactive class — questions, remarks, occasional short in-class presentations or exercises. Evaluation: punctuality and class conduct are also taken into account.
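
The weights and the exam gate above combine into a simple calculation. The sketch below is only an illustration of that arithmetic: the 0–10 scale for every component and the `pass_mark` of 5.0 are assumptions, since the pass rules are not spelled out in this section, and the function names are hypothetical.

```python
# Component weights and the exam gate as stated in the assessment section.
WEIGHTS = {"exam": 0.40, "group": 0.30, "individual": 0.20, "participation": 0.10}
EXAM_GATE = 3.5  # minimum exam score (out of 10) required to pass the course


def weighted_grade(scores):
    """Weighted average of the four components, each assumed to be on a 0-10 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def passes(scores, pass_mark=5.0):
    """Apply the exam gate first, then the overall mark.

    pass_mark=5.0 is an assumed threshold, not taken from the syllabus text.
    """
    return scores["exam"] >= EXAM_GATE and weighted_grade(scores) >= pass_mark


strong_but_gated = {"exam": 3.0, "group": 9.0, "individual": 9.0, "participation": 9.0}
print(round(weighted_grade(strong_but_gated), 2), passes(strong_but_gated))
```

In this example the weighted grade is well above the assumed pass mark, yet the student still fails because the exam score is below the 3.5 gate.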

Pass, attendance & re-sit rules

Program — 15 sessions

The full program grouped into four thematic modules. Each module opens with its arc and learning outcomes; each session lists its objective, annotated topics, a key idea, the core model or formula where the content is technical, annotated readings, and links to the matching interactive lab where one exists.

Module I · Foundations: cognition & the artificial neuron

Sessions 1–4 · What intelligence is, how biological and artificial neurons work, and how networks learn.

Module learning outcomes
  • Situate cognitive science and AI historically and define what "intelligence" could mean.
  • Explain how a biological neuron fires and how the perceptron abstracts it.
  • Distinguish supervised, statistical and Hebbian learning rules.
  • Derive how stacking neurons + backpropagation lets a network learn non-linear functions like XOR.
Session 1 · live in-person
Introduction to Cognitive Science and AI
Frame the course: what cognition is, what AI is, and what "intelligent" even means.
  • What is cognitive science? The interdisciplinary study of mind spanning psychology, neuroscience, linguistics, philosophy and CS; key theories model cognition as information processing (the mind as a symbol- or signal-manipulating system).
  • What is artificial intelligence? A short history from symbolic AI and the 1956 Dartmouth workshop through expert systems to the deep-learning era, and the field's shifting goals (mimic vs. rival human thought).
  • What is "intelligence"? Competing definitions — goal-directed rational action, adaptation, the Turing Test as a behavioural benchmark — and why none is fully settled.
Key idea

Cognitive science treats the mind as an information-processing system; AI asks how far that processing can be built in a machine — making "intelligence" a moving target rather than a fixed property.

Friedenberg — Cognitive Science, ch. 1: defines the field and the information-processing view of mind.
Russell & Norvig — AIMA, ch. 1: the four schools of AI (thinking/acting × humanly/rationally) and a history of the field.
Session 2 · live in-person
Neurons and the Basics of Neural Networks
From biological neurons to the first computational model that can learn.
  • How biological neurons work: dendrites integrate incoming signals, and the neuron fires an all-or-none action potential down its axon once the membrane potential crosses threshold — a noisy, spiking, energy-efficient computer.
  • Models of artificial neurons: the McCulloch–Pitts / threshold unit replaces this with a weighted sum passed through a step or activation function.
  • The perceptron: a single linear unit that learns a decision boundary by nudging weights whenever it misclassifies an input.
Key idea

A neuron — biological or artificial — is fundamentally a thresholded weighted sum; learning means adjusting the weights.

Artificial neuron: y = φ(Σᵢ wᵢxᵢ + b)  ·  Perceptron update: wᵢ ← wᵢ + η(t − y)xᵢ (η = learning rate, t = target).
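
The update rule above can be traced on a tiny problem. This is a minimal sketch, not the course's demo: the AND dataset, the step activation with threshold at zero, and the choice η = 1 (so the arithmetic stays exact) are all assumptions made for illustration.

```python
def step(z):
    """Step activation: output 1 when the weighted sum is non-negative, else 0."""
    return 1 if z >= 0 else 0


def predict(w, b, x):
    """Artificial neuron y = step(sum_i w_i * x_i + b)."""
    return step(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_perceptron(data, eta=1.0, epochs=20):
    """Perceptron rule: w_i <- w_i + eta * (t - y) * x_i, with the bias updated likewise."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, t in data:
            err = t - predict(w, b, x)  # zero when the example is already correct
            w = [wi + eta * err * xi for wi, xi in zip(w, x)]
            b += eta * err
    return w, b


# Logical AND is linearly separable, so a single perceptron can learn it.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND_DATA)
predictions = [predict(w, b, x) for x, _ in AND_DATA]
print(w, b, predictions)
```

Weights only change on misclassified inputs, because `err` is zero otherwise; that is the "nudging" described in the session topics.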

Nielsen — NNDL, ch. 1: perceptrons, sigmoid neurons and why a smooth activation matters for learning.
Interactive labs: artificial neuron · perceptron
Session 3 · live in-person
Principles of learning in biological vs. artificial neural networks
Contrast supervised, statistical and Hebbian routes to learning.
  • Supervised learning in AI: training on labelled (input, target) pairs so the network minimises the gap between its output and the known answer.
  • Pattern recognition: mapping raw inputs to category labels — the canonical task that perceptrons and their successors are built to solve.
  • Hebbian learning: the biological rule "cells that fire together wire together" — an unsupervised, local rule that strengthens correlated connections.
Key idea

Brains rely heavily on local, unsupervised Hebbian adjustment; most AI relies on global, supervised error correction — two very different answers to "how should a weight change?"

Hebbian rule: Δwᵢⱼ = η · xᵢ · xⱼ (weight grows with the correlation of the two units' activity).
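
To see what "weight grows with the correlation" means in practice, the rule can be applied to a handful of activity patterns. This is an illustrative sketch only: the three-unit ±1 patterns and η = 0.25 are made-up values, and `hebbian_update` is a hypothetical helper name.

```python
def hebbian_update(W, x, eta=0.25):
    """One Hebbian step for every pair of units: W[i][j] += eta * x[i] * x[j]."""
    n = len(x)
    return [[W[i][j] + eta * x[i] * x[j] for j in range(n)] for i in range(n)]


# Units 0 and 1 are always active together; unit 2 is unrelated to them.
patterns = [(1, 1, -1), (-1, -1, 1), (1, 1, 1), (-1, -1, -1)]

W = [[0.0] * 3 for _ in range(3)]
for x in patterns:
    W = hebbian_update(W, x)

print("w(0,1) =", W[0][1])  # correlated pair: the weight has grown
print("w(0,2) =", W[0][2])  # uncorrelated pair: the updates cancel out
```

No target or error signal appears anywhere in the update, which is what makes the rule local and unsupervised.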

Nielsen — NNDL, ch. 1–2: how networks learn from data and the gradient-based alternative to Hebbian updates.
Interactive labs: perceptron · Hebbian / PCA
Session 4 · live in-person
Multi-layer perceptrons
How depth and backpropagation overcome the limits of a single neuron.
  • Feedforward networks: stacking layers of neurons so the network can represent non-linear functions a single perceptron cannot (e.g. XOR).
  • Loss functions and optimization: a loss (e.g. mean-squared error or cross-entropy) scores how wrong the output is; training is searching weights that minimise it.
  • Stochastic Gradient Descent (SGD): step the weights downhill along the loss gradient, using one mini-batch at a time for speed.
  • Backpropagation: the chain rule applied layer-by-layer to compute each weight's contribution to the loss efficiently.
Key idea

Depth gives expressive power; backprop + gradient descent make that power trainable by assigning credit and blame across every weight.

Gradient-descent step: w ← w − η ∂L/∂w  ·  backprop computes ∂L/∂w for every weight by applying the chain rule backward from the output layer.
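
The two pieces above, the chain rule and the downhill step, fit in a short pure-Python sketch. The network size (2 inputs, 3 sigmoid hidden units, 1 sigmoid output), the squared-error loss, the seed, η = 0.5 and the use of all four XOR cases per step (full-batch rather than mini-batch) are assumptions for illustration; whether the network fully solves XOR depends on those choices.

```python
import math
import random

# XOR truth table as (inputs, target) pairs.
XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_params(n_hidden=3, seed=0):
    """Small random weights; n_hidden and seed are arbitrary illustrative choices."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(p, x):
    """Return hidden activations h and the output y for one input x."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(p["W1"], p["b1"])]
    y = sigmoid(sum(w * hj for w, hj in zip(p["W2"], h)) + p["b2"])
    return h, y


def mse_loss(p):
    """Mean of 0.5 * (y - t)^2 over the four XOR cases."""
    return sum(0.5 * (forward(p, x)[1] - t) ** 2 for x, t in XOR) / len(XOR)


def backprop(p):
    """Gradient of mse_loss with respect to every parameter, via the chain rule."""
    n_hidden = len(p["W2"])
    g = {"W1": [[0.0, 0.0] for _ in range(n_hidden)], "b1": [0.0] * n_hidden,
         "W2": [0.0] * n_hidden, "b2": 0.0}
    for x, t in XOR:
        h, y = forward(p, x)
        # Output layer: dL/dz_out = (y - t) * sigmoid'(z_out), averaged over cases.
        delta_out = (y - t) * y * (1.0 - y) / len(XOR)
        g["b2"] += delta_out
        for j in range(n_hidden):
            g["W2"][j] += delta_out * h[j]
            # Hidden layer: pass the output error back through W2 and sigmoid'.
            delta_h = delta_out * p["W2"][j] * h[j] * (1.0 - h[j])
            g["b1"][j] += delta_h
            for i in range(2):
                g["W1"][j][i] += delta_h * x[i]
    return g


def gradient_step(p, g, eta):
    """In-place update w <- w - eta * dL/dw for every parameter."""
    for j in range(len(p["W2"])):
        p["W2"][j] -= eta * g["W2"][j]
        p["b1"][j] -= eta * g["b1"][j]
        for i in range(2):
            p["W1"][j][i] -= eta * g["W1"][j][i]
    p["b2"] -= eta * g["b2"]


params = init_params()
initial_loss = mse_loss(params)
for _ in range(2000):
    gradient_step(params, backprop(params), eta=0.5)
final_loss = mse_loss(params)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

A useful sanity check on any hand-written backprop is to compare one of its gradients with a finite-difference estimate of the same derivative; the two should agree to several decimal places.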

Nielsen — NNDL, ch. 2: the backpropagation algorithm derived from first principles, plus SGD.
Interactive labs: gradient descent · MLP / XOR
Module II · Deep learning, memory & representation

Sessions 5–7 · Training deep networks in practice, generalization, and how representations compress information.

Module learning outcomes
  • Relate neural development and plasticity to the practical levers of training deep nets.
  • Choose sensible weight initialization, activation functions and regularization.
  • Explain overfitting, generalization and why deep nets are data-hungry compared with humans.
  • Describe how hidden layers learn compressed latent representations of data.
Session 5 · live in-person
Developing brains vs. training ANNs
Compare neural development with the practical craft of training deep nets.
  • Development & plasticity: brains wire themselves through experience-dependent plasticity; the nature-vs-nurture balance shapes which circuits form.
  • Training deep networks in practice: the engineering analogue — the choices that decide whether a deep net actually converges.
  • Weight initialization, activation functions, regularization: good init avoids vanishing/exploding gradients; non-linear activations (ReLU, sigmoid, tanh) give expressive power; regularization (L2, dropout) curbs overfitting.
Key idea

Both brains and deep nets need the right starting conditions and constraints to learn well — initialization and regularization are the network's "developmental environment."

Activations: ReLU(x)=max(0,x) · σ(x)=1/(1+e⁻ˣ) · tanh(x). L2 penalty adds λΣw² to the loss.
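
The formulas above translate directly into code. The sketch below is illustrative: the function names are hypothetical, and `sigmoid_grad` is added only to show why saturating activations are linked to vanishing gradients.

```python
import math


def relu(x):
    """ReLU(x) = max(0, x)."""
    return max(0.0, x)


def sigmoid(x):
    """sigma(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative sigma(x) * (1 - sigma(x)); never larger than 0.25, tiny for large |x|."""
    s = sigmoid(x)
    return s * (1.0 - s)


def l2_penalty(weights, lam):
    """L2 penalty lambda * sum(w^2), added to the data loss during training."""
    return lam * sum(w * w for w in weights)


for x in (-4.0, 0.0, 4.0):
    print(x, relu(x), round(sigmoid(x), 4), round(math.tanh(x), 4), round(sigmoid_grad(x), 4))

print("penalty:", l2_penalty([1.0, -2.0], lam=0.1))
```

Because the sigmoid's slope is at most 0.25, multiplying many such slopes together across layers shrinks the gradient, which is one reason initialization and the choice of activation matter.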

Nielsen — NNDL, ch. 3: better initialization, the cross-entropy cost, and regularization techniques (incl. dropout) for improving how nets learn.
Interactive labs: activations
Session 6 · live in-person
Memory, learning and problem solving
Why humans generalize from few examples while networks often need many.
  • Learning from limited examples: humans generalise from a handful of instances via inductive reasoning and strong priors — few-shot by default.
  • Generalization in AI: the gap between training and test performance; overfitting is memorising noise, fought with data augmentation and regularization.
  • Data hunger: with millions of parameters and weak priors, deep nets usually need large datasets to generalise reliably.
Key idea

Generalization, not memorisation, is the goal — and the human brain's strong priors are exactly what data-hungry networks lack.

Overfitting shows as low training loss but high test loss; augmentation and regularization shrink that generalization gap.
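
The gap between training and test performance can be shown with a deliberately extreme memoriser. This sketch swaps in a 1-nearest-neighbour classifier instead of a neural network, and measures accuracy rather than loss; the synthetic data, the 20% label-noise rate and the sample sizes are all assumptions chosen for illustration.

```python
import random


def make_noisy_data(n, rng, flip_prob=0.2):
    """Points x in [0, 1) with label 1 if x > 0.5, each label flipped with flip_prob."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = 1 if x > 0.5 else 0
        if rng.random() < flip_prob:
            label = 1 - label  # label noise that a memoriser will happily store
        data.append((x, label))
    return data


def one_nn_predict(train, x):
    """Return the label of the closest training point (pure memorisation)."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def accuracy(train, data):
    return sum(one_nn_predict(train, x) == y for x, y in data) / len(data)


rng = random.Random(0)
train = make_noisy_data(50, rng)
test = make_noisy_data(200, rng)

train_acc = accuracy(train, train)
test_acc = accuracy(train, test)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

The memoriser reproduces every training label, noise included, so it looks perfect on the training set and noticeably worse on fresh data. That difference is the generalization gap.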

Nielsen — NNDL, ch. 3: overfitting, regularization and why more data helps.
Friedenberg — Cognitive Science (memory): human memory systems and inductive reasoning as a contrast case.
Interactive labs: MLP / XOR
Session 7 · live in-person
Dimensionality reduction
How latent representations compress high-dimensional data onto meaningful axes.
  • Dimensionality reduction: projecting high-dimensional data onto a few informative axes that capture most of its variance (e.g. PCA).
  • Latent representations in hidden layers: hidden units learn compressed codes for the input — the network's internal "concepts."
  • Hebbian networks: a Hebbian unit with a normalising term (Oja's rule) converges to the first principal component of its inputs, linking biological plasticity to PCA.
Key idea

Both PCA and Hebbian learning find the directions of maximal variance — so a simple "fire-together" rule recovers the same compressed representation as a classic statistical method.

PCA keeps the top eigenvectors of the covariance matrix C = (1/n)Σ xxᵀ (for mean-centred x); Oja's Hebbian rule converges to the leading one.
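
The link between Oja's rule and PCA can be checked numerically. The sketch below is illustrative: the synthetic 2-D data (stretched along the (1, 1) diagonal), the learning rate and the sample count are assumptions, and Oja's update is written in its usual form Δw = η·y·(x − y·w) with y = w·x.

```python
import math
import random


def make_data(n=5000, seed=0):
    """Zero-mean 2-D points with large variance along u and small variance along v."""
    rng = random.Random(seed)
    u = (1 / math.sqrt(2), 1 / math.sqrt(2))    # high-variance direction
    v = (1 / math.sqrt(2), -1 / math.sqrt(2))   # low-variance direction
    data = []
    for _ in range(n):
        s, t = rng.gauss(0, 2.0), rng.gauss(0, 0.3)
        data.append((s * u[0] + t * v[0], s * u[1] + t * v[1]))
    return data, u


def oja(data, eta=0.005):
    """Run Oja's rule once over the data, starting from w = (1, 0)."""
    w = [1.0, 0.0]
    for x in data:
        y = w[0] * x[0] + w[1] * x[1]  # the unit's output
        # Hebbian term eta*y*x plus a decay term -eta*y^2*w that keeps |w| near 1.
        w = [w[i] + eta * y * (x[i] - y * w[i]) for i in range(2)]
    return w


data, u = make_data()
w = oja(data)
norm = math.hypot(w[0], w[1])
alignment = abs(w[0] * u[0] + w[1] * u[1]) / norm  # |cosine| between w and u
print(f"|w| = {norm:.3f}, alignment with leading direction = {alignment:.3f}")
```

The learned weight vector ends up close to unit length and nearly parallel to the direction of greatest variance, which is the first principal component of this data.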

Nielsen — NNDL: what hidden-layer representations encode and why compression aids learning.
Interactive labs: Hebbian / PCA
Module III · Perception, vision & language

Sessions 8–10 · How humans and machines perceive images and sound, how machine vision compares with human vision, and how both process language.

Module learning outcomes
  • Contrast bottom-up, top-down, Gestalt and ecological theories of perception.
  • Explain how convolutional networks exploit local structure the way the visual cortex does.
  • Account for adversarial examples and where machine and human vision diverge.
  • Connect language and thought to NLP and modern language models.
Session 8 · live in-person
Perception in Humans and Artificial Neural Networks
Theories of perception and the convolutional networks that echo the visual cortex.
  • Processing visual and auditory information: how sensory signals are transduced and built up into stable percepts.
  • Theories of perception: bottom-up (data-driven) vs. top-down (expectation-driven) processing, Gestalt grouping principles, and Gibson's ecological view that perception is direct, picking up the affordances the environment offers.
  • Convolutional Neural Networks (CNNs): local receptive fields, shared weights and pooling — directly inspired by the hierarchy of the visual cortex.
Key idea

CNNs encode the visual-cortex insight that nearby pixels are related: weight sharing across the image yields translation-tolerant feature detectors.

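Convolution: (I∗K)(x,y)=ΣᵢΣⱼ I(x+i,y+j)·K(i,j), a small kernel K slid across the image I. Strictly this is a cross-correlation, which is what CNN layers compute in practice; a textbook convolution flips the kernel, using I(x−i,y−j).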
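
The sliding-kernel sum can be written as two nested loops. This is a minimal sketch that follows the formula as given, with (x, y) read as (row, column), no padding and a stride of 1; the toy image and the edge-detecting kernel are made-up examples.

```python
def conv2d(image, kernel):
    """out[x][y] = sum_i sum_j image[x+i][y+j] * kernel[i][j] over valid positions."""
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out = []
    for x in range(img_h - k_h + 1):
        row = []
        for y in range(img_w - k_w + 1):
            row.append(sum(image[x + i][y + j] * kernel[i][j]
                           for i in range(k_h) for j in range(k_w)))
        out.append(row)
    return out


# A dark-to-bright vertical edge between the second and third columns.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# The same 1x2 kernel is reused at every position: this is weight sharing.
kernel = [[-1, 1]]

feature_map = conv2d(image, kernel)
print(feature_map)
```

The feature map is non-zero only where the edge sits, and it would respond the same way wherever the edge appeared in the image, because the same kernel weights are applied everywhere.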

Friedenberg — Cognitive Science (perception): bottom-up/top-down processing, Gestalt and ecological theories.
Nielsen — NNDL, ch. 6: convolutional networks and deep learning for image recognition.
Interactive labs: convolution
Session 9 · live in-person
Vision
Deepen CNNs and probe where machine and human vision diverge.
  • CNNs, part II: deeper architectures, stacked feature hierarchies and what successive layers come to detect (edges → textures → objects).
  • Adversarial examples: tiny, often imperceptible input perturbations that flip a CNN's prediction — exposing that machine vision does not "see" the way humans do.
Key idea

Matching human accuracy does not mean matching human robustness: adversarial examples reveal that CNNs latch onto statistical cues invisible and irrelevant to people.

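An adversarial input adds a crafted perturbation, e.g. the fast gradient sign method x′ = x + ε·sign(∇ₓL): a single step that, to first order, increases the loss as much as possible while keeping every input component within ε of x.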
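
The sign-of-gradient step is easiest to follow on a model small enough to differentiate by hand. The sketch below swaps the CNN for a toy logistic classifier with fixed, made-up weights; `W`, `B`, the input `x` and ε = 0.5 are all hypothetical values. In three dimensions ε has to be fairly large to flip the prediction, whereas real images have so many pixels that a much smaller ε per pixel can be enough.

```python
import math

W = [2.0, -3.0, 1.0]  # hypothetical fixed weights of a toy logistic classifier
B = 0.0               # hypothetical bias


def predict(x):
    """Probability of class 1 under the logistic model."""
    z = sum(w * xi for w, xi in zip(W, x)) + B
    return 1.0 / (1.0 + math.exp(-z))


def loss(x, t):
    """Cross-entropy loss for target t in {0, 1}."""
    p = predict(x)
    return -(t * math.log(p) + (1 - t) * math.log(1 - p))


def input_gradient(x, t):
    """For this model, dL/dx_i = (p - t) * w_i."""
    p = predict(x)
    return [(p - t) * w for w in W]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(x, t, eps):
    """x' = x + eps * sign(grad_x L): each component moves by at most eps."""
    return [xi + eps * sign(g) for xi, g in zip(x, input_gradient(x, t))]


x, t = [0.5, -0.5, 0.2], 1
x_adv = fgsm(x, t, eps=0.5)

print(f"clean:       p(class 1) = {predict(x):.3f}, loss = {loss(x, t):.3f}")
print(f"adversarial: p(class 1) = {predict(x_adv):.3f}, loss = {loss(x_adv, t):.3f}")
```

The perturbation uses only the sign of each gradient component, so every input coordinate is pushed by the same small amount in whichever direction hurts the model most.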

Nielsen — NNDL, ch. 6: deep convolutional networks; basis for discussing their failure modes.
Interactive labs: convolution
Session 10 · live in-person
Language
From the language–thought relationship to NLP and language models.
  • Language and thought: how language structures cognition (e.g. linguistic-relativity debates) and what it reveals about the mind.
  • Human language processing: how people parse, produce and comprehend language in real time.
  • NLP and language models: representing words as vectors and predicting text statistically — the foundation that leads into attention and LLMs.
Key idea

Modern NLP treats meaning as geometry: words become vectors and a language model learns P(next word | context), sidestepping explicit grammar.
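
The quantity P(next word | context) can be estimated in the simplest possible way by counting. The sketch below is a count-based bigram model, used here as a stand-in: neural language models replace these counts with learned word vectors, but the prediction target is the same. The toy corpus is made up, and `next_word_probs` is a hypothetical helper name.

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# counts[prev][nxt] = how often the word nxt directly follows the word prev
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1


def next_word_probs(prev):
    """Estimate P(next word | previous word) from relative bigram frequencies."""
    total = sum(counts[prev].values())
    return {word: c / total for word, c in counts[prev].items()}


print(next_word_probs("sat"))
print(next_word_probs("the"))
```

Nothing in the model encodes grammar explicitly; regularities such as "sat" being followed by "on" emerge purely from the statistics of the text.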

Friedenberg — Cognitive Science (language): human language processing and the language–thought relationship.
Interactive labs: attention
Module IV · Emotion, personality, philosophy & assessment

Sessions 11–15 · Attention and LLMs, simulating emotion and personality, the philosophy of mind, and final assessment.

Module learning outcomes
  • Explain the attention mechanism and how it underpins Large Language Models.
  • Represent emotion (valence–arousal, PAD) and personality (Big Five) computationally.
  • Identify sources of bias in AI systems and their societal impact.
  • Engage critically with the Chinese Room and the question of machine understanding.
Session 11 · live in-person
Attention, Bias, Personality & Emotion
The heart of the course: how AI recognizes and simulates affect and character.
  • Attention in deep learning: letting the network weigh which parts of the input matter most for each output — the mechanism behind Transformers.
  • Large Language Models: Transformer-based models trained on vast text that can be steered to express a persona or tone.
  • Recognising & simulating personality & emotion: emotion modelled as valence–arousal or PAD (pleasure–arousal–dominance); personality as the Big Five (OCEAN) trait dimensions.
  • Bias in AI systems: models inherit and can amplify the social biases present in their training data.
Key idea

Emotion and personality become tractable for AI once represented as low-dimensional coordinates — valence–arousal, PAD, or the Big Five — that a model can read off and reproduce.
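
Treating affect and character as coordinates makes them easy to store and compare. The sketch below is only an illustration: the valence–arousal positions in `EMOTION_COORDS` are rough made-up values rather than figures from a validated dataset, the [−1, 1] and [0, 1] ranges are assumptions, and the class and function names are hypothetical.

```python
import math
from dataclasses import dataclass

# Illustrative (valence, arousal) coordinates in [-1, 1]; not empirical values.
EMOTION_COORDS = {
    "joy": (0.8, 0.6),
    "anger": (-0.7, 0.7),
    "sadness": (-0.7, -0.5),
    "calm": (0.6, -0.6),
}


def nearest_emotion(valence, arousal):
    """Read off the label whose coordinates are closest to the given point."""
    return min(EMOTION_COORDS,
               key=lambda name: math.dist(EMOTION_COORDS[name], (valence, arousal)))


@dataclass
class BigFive:
    """A personality profile as five trait scores, assumed here to lie in [0, 1]."""
    openness: float
    conscientiousness: float
    extraversion: float
    agreeableness: float
    neuroticism: float


persona = BigFive(openness=0.9, conscientiousness=0.6, extraversion=0.3,
                  agreeableness=0.8, neuroticism=0.2)

print(nearest_emotion(0.9, 0.5))
print(persona)
```

A model that outputs a point in this space can be mapped to a label by nearest neighbour, and a profile such as `persona` could equally be used as an input that conditions how a system responds.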

Attention: softmax(QKᵀ/√d)·V (d = dimension of the key vectors) · Emotion: 2-D (valence, arousal) or 3-D PAD · Personality: Big Five (O, C, E, A, N).
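
The attention formula can be unpacked into three small steps: score each key against the query, turn the scores into weights with a softmax, and average the values with those weights. The sketch below uses plain Python lists and made-up 2-D vectors; it shows a single attention head with no learned projections, which is a simplification.

```python
import math


def softmax(scores):
    """Numerically stable softmax: exponentiate shifted scores and normalise."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention, softmax(Q K^T / sqrt(d)) V, one query at a time."""
    d = len(K[0])  # dimension of the key vectors
    outputs, all_weights = [], []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        all_weights.append(weights)
        outputs.append([sum(w * v[c] for w, v in zip(weights, V))
                        for c in range(len(V[0]))])
    return outputs, all_weights


Q = [[1.0, 0.0]]                # one query
K = [[1.0, 0.0], [0.0, 1.0]]    # two keys; the first matches the query
V = [[10.0, 0.0], [0.0, 10.0]]  # the values attached to each key

outputs, weights = attention(Q, K, V)
print("weights:", [round(w, 3) for w in weights[0]])
print("output: ", [round(o, 3) for o in outputs[0]])
```

The query puts more weight on the key it resembles, so the output leans toward that key's value; this weighting is how the network decides which parts of the input matter for each output.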

Minsky — The Emotion Machine: emotions as alternative "ways to think," motivating their place in AI design.
Interactive labs: attention · valence–arousal · PAD · Big Five
Session 12 · live in-person
Philosophy of mind vs. AI
Can a machine truly understand? Classic critiques of strong AI.
  • What is "understanding"? The gap between manipulating symbols and grasping their meaning (syntax vs. semantics).
  • The Chinese Room: Searle's argument that running a program to produce fluent answers does not entail genuine understanding.
  • Can AI be truly intelligent? Weighing strong vs. weak AI, and what behavioural success can and cannot prove about minds.
Key idea

Passing a behavioural test (Turing) is not the same as understanding (Searle): the course closes by separating competence from comprehension.

Minsky — The Emotion Machine: a constructivist counterpoint to the idea that mind is irreducibly non-computational.
Russell & Norvig — AIMA (philosophy): the weak/strong-AI distinction and the Chinese Room debate.
Session 13 · live in-person
Final Exam
Comprehensive exam — 40% of the grade; minimum 3.5/10 required to pass the course.

Covers the full program: cognitive-science concepts and the algorithms behind every demo. A score below 3.5 fails the course even if all other components passed.

assessment
Session 14 · live in-person
Group Project presentations
In-class presentation of the group project (written report + presentation = 30%).

Graded as a group; GenAI may not be used for the project submission.

assessment
Session 15 · live in-person
Group Project presentations
Continuation of group project presentations.

Remaining groups present; closes the continuous-evaluation portion of the course.

assessment

Key concepts — glossary

A quick reference to the core terms used across the program, in roughly the order they appear.

Cognitive science
The interdisciplinary study of mind that models cognition as information processing across psychology, neuroscience, linguistics, philosophy and CS.
Turing Test
A behavioural benchmark: a machine is deemed intelligent if its conversation is indistinguishable from a human's.
Action potential
The all-or-none electrical spike a biological neuron fires once its membrane potential crosses threshold.
Artificial neuron
A unit computing a weighted sum of inputs plus bias passed through an activation function, y=φ(Σwᵢxᵢ+b).
Perceptron
A single linear threshold neuron that learns a decision boundary by correcting its weights on misclassified examples.
Activation function
The non-linearity (ReLU, sigmoid, tanh) that gives a network expressive power beyond a linear map.
Supervised learning
Learning from labelled (input, target) pairs by minimising the error between output and target.
Hebbian learning
An unsupervised, local rule, "cells that fire together wire together": Δwᵢⱼ=η·xᵢ·xⱼ.
Multi-layer perceptron
A feedforward network of stacked neuron layers able to represent non-linear functions such as XOR.
Loss function
A scalar measure of how wrong predictions are (e.g. MSE, cross-entropy) that training seeks to minimise.
Gradient descent
Iteratively stepping weights downhill along the loss gradient, w←w−η∂L/∂w.
Backpropagation
Efficient computation of loss gradients for every weight by applying the chain rule backward through layers.
Regularization
Techniques (L2 weight decay, dropout) that constrain a model to curb overfitting.
Overfitting
Memorising training noise so that test performance lags far behind training performance.
Generalization
A model's ability to perform well on unseen data — the actual goal of learning.
Dimensionality reduction
Projecting high-dimensional data onto fewer informative axes that retain most variance (e.g. PCA).
Latent representation
The compressed internal code a hidden layer learns for its inputs.
Convolutional neural network
A vision network using local receptive fields, shared weights and pooling, inspired by the visual cortex.
Adversarial example
A tiny, often imperceptible input perturbation that flips a model's prediction.
Attention
A mechanism weighting which parts of the input matter for each output, softmax(QKᵀ/√d)V with d the dimension of the key vectors; basis of Transformers.
Large Language Model
A Transformer trained on vast text to predict the next token, steerable toward a persona or tone.
Valence–arousal
A 2-D model of affect: pleasantness (valence) by activation level (arousal) — the circumplex of emotion.
PAD model
A 3-D emotion space of Pleasure, Arousal and Dominance.
Big Five (OCEAN)
The five trait dimensions — Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism — used to model personality.
Algorithmic bias
Systematic unfairness a model inherits and can amplify from biased training data.
Chinese Room
Searle's argument that symbol manipulation producing fluent output need not constitute genuine understanding.

Bibliography

Recommended texts, annotated with where each is used across the program.

Additional policies (Code of Conduct, Attendance, Ethics) follow the University's general regulations; the Program Director may provide further indications. See the full syllabus PDF for the complete text.