Course structure — AI: Personality & Emotion for AI Design
A compulsory second-year course in the Bachelor in Computer Science & Artificial Intelligence (BCSAI) at IE University, taught by Prof. Luciano Dyballa. Across 15 in-person sessions it builds from the foundations of cognitive science and the artificial neuron up through deep learning, perception, language, and the simulation of emotion and personality in AI — closing on the philosophy and ethics of machine minds. Many lectures pair theory with a live coding demo; the interactive labs on this site mirror those demos.
Learning objectives
By the end of the course, students should be able to:
- Understand core concepts of cognitive science, including perception, learning, memory, attention and consciousness.
- Understand the key differences between learning in artificial vs. biological neural networks.
- Grasp the foundational principles of neural networks and deep learning through in-class demos.
- Explore how AI systems can be designed to simulate human traits such as emotion and personality.
- Critically assess the potential of AI to replicate human-level intelligence.
- Explore the societal impacts of AI technologies.
Methodology & assessment
Before each class, students prepare assignments and readings at home; lectures combine theoretical explanation with practical examples, and most include a live coding demonstration. Each new technique is followed by worked examples, and problem sets build both the intuition behind the theory and the coding skills to implement the algorithms. Students are encouraged to collaborate on understanding the material, but all written work, including code, must be their own; any hints or outside solutions must be acknowledged.
Deliverable: comprehensive written exam (Session 13). Evaluation: covers the full program — theory and the algorithms behind each demo. Gate: you must score at least 3.5/10 on the exam to pass the course, even if every other component is already passed.
Deliverable: a written report plus an in-class presentation (Sessions 14–15). Evaluation: graded as a group on the report and the talk. Note: GenAI may not be used for group project submissions.
Deliverable: selected exercises / problem sets announced throughout the course. Evaluation: assesses individual progress; all code must be your own with outside help acknowledged.
Deliverable: active engagement in this highly interactive class — questions, remarks, occasional short in-class presentations or exercises. Evaluation: punctuality and class conduct are also taken into account.
- Exam gate: a minimum exam score of 3.5/10 is required to pass the overall course.
- Attendance: students who do not meet the 80% attendance rule fail both the ordinary and extraordinary calls for the year and must re-enroll (re-take) the following Academic Year.
- Four calls: each course allows four chances to pass over two consecutive Academic Years (ordinary + extraordinary re-sits in June/July).
- Re-sit: the June/July re-sit is a single comprehensive exam taken in person on campus (Segovia or Madrid); continuous evaluation is not carried over and the grade is capped at 8.0/10 ("notable"). Retakers (3rd call) may reach 10.0 and must contact the professor for their specific criteria.
- GenAI policy: permitted for some specific tasks with acknowledgment; prohibited for group submissions, quizzes and exams. Misuse counts as academic misconduct and may fail the assignment or course.
Program — 15 sessions
The full program grouped into four thematic modules. Each module opens with its arc and learning outcomes; each session lists its objective, annotated topics, a key idea, the core model or formula where the content is technical, annotated readings, and links to the matching interactive lab where one exists.
Sessions 1–4 · What intelligence is, how biological and artificial neurons work, and how networks learn.
- Situate cognitive science and AI historically and define what "intelligence" could mean.
- Explain how a biological neuron fires and how the perceptron abstracts it.
- Distinguish supervised, statistical and Hebbian learning rules.
- Derive how stacking neurons and training them with backpropagation lets a network learn non-linear functions like XOR.
- What is cognitive science? The interdisciplinary study of mind spanning psychology, neuroscience, linguistics, philosophy and CS; key theories model cognition as information processing (the mind as a symbol- or signal-manipulating system).
- What is artificial intelligence? A short history from symbolic AI and the 1956 Dartmouth workshop through expert systems to the deep-learning era, and the field's shifting goals (mimic vs. rival human thought).
- What is "intelligence"? Competing definitions — goal-directed rational action, adaptation, the Turing Test as a behavioural benchmark — and why none is fully settled.
Cognitive science treats the mind as an information-processing system; AI asks how far that processing can be built in a machine — making "intelligence" a moving target rather than a fixed property.
- How biological neurons work: dendrites integrate incoming signals, and the neuron fires an all-or-none action potential down its axon once the membrane potential crosses threshold — a noisy, spiking, energy-efficient computer.
- Models of artificial neurons: the McCulloch–Pitts / threshold unit replaces this with a weighted sum passed through a step function or another activation function.
- The perceptron: a single linear unit that learns a decision boundary by nudging weights whenever it misclassifies an input.
A neuron — biological or artificial — is fundamentally a thresholded weighted sum; learning means adjusting the weights.
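The perceptron rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own demo: the AND dataset, the zero initial weights, the learning rate and the names `predict` and `train_perceptron` are all assumptions chosen for the example.

```python
# Minimal perceptron sketch: a thresholded weighted sum plus an
# error-driven weight update. Dataset and hyperparameters are illustrative.

AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def predict(weights, bias, x):
    """Step activation: fire (1) if the weighted sum exceeds zero."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0


def train_perceptron(data, lr=1.0, max_epochs=100):
    """Nudge the weights after every misclassified example."""
    weights, bias = [0.0] * len(data[0][0]), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in data:
            error = target - predict(weights, bias, x)  # -1, 0 or +1
            if error != 0:
                mistakes += 1
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
        if mistakes == 0:  # every example classified correctly
            break
    return weights, bias


weights, bias = train_perceptron(AND_DATA)
for x, target in AND_DATA:
    print(x, "->", predict(weights, bias, x), "(target", target, ")")
```

AND is linearly separable, so the loop stops once an epoch passes without mistakes; the same code would never settle on XOR, which is the motivation for the multi-layer networks of Session 4.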
- Supervised learning in AI: training on labelled (input, target) pairs so the network minimises the gap between its output and the known answer.
- Pattern recognition: mapping raw inputs to category labels — the canonical task that perceptrons and their successors are built to solve.
- Hebbian learning: the biological rule "cells that fire together wire together" — an unsupervised, local rule that strengthens correlated connections.
Brains rely heavily on local, unsupervised Hebbian adjustment; most AI relies on global, supervised error correction — two very different answers to "how should a weight change?"
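To make the contrast concrete, the sketch below applies both kinds of update to the same linear unit. It is an illustration under stated assumptions: the unit is linear, the supervised rule shown is the simple delta rule (one common error-correcting update, not necessarily the one used in class), and the function names are hypothetical.

```python
def output(weights, x):
    """Linear unit: weighted sum of the inputs."""
    return sum(w * xi for w, xi in zip(weights, x))


def hebbian_update(weights, x, lr=0.1):
    """Local and unsupervised: strengthen w_i when input x_i and output y are co-active."""
    y = output(weights, x)
    return [w + lr * xi * y for w, xi in zip(weights, x)]


def delta_update(weights, x, target, lr=0.1):
    """Supervised: move each weight in proportion to the output error."""
    y = output(weights, x)
    return [w + lr * (target - y) * xi for w, xi in zip(weights, x)]


w = [0.5, 0.5]
x = [1.0, 0.0]
print("Hebbian:", hebbian_update(w, x))
print("Delta rule (target 0):", delta_update(w, x, target=0.0))
```

With identical input and weights, the Hebbian rule strengthens the active connection because input and output are both positive, while the delta rule weakens it because the output overshoots the target. The Hebbian update never sees a target at all.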
- Feedforward networks: stacking layers of neurons so the network can represent non-linear functions a single perceptron cannot (e.g. XOR).
- Loss functions and optimization: a loss (e.g. mean-squared error or cross-entropy) scores how wrong the output is; training is searching weights that minimise it.
- Stochastic Gradient Descent (SGD): step the weights downhill, against the loss gradient, estimating that gradient from one mini-batch at a time for speed.
- Backpropagation: the chain rule applied layer-by-layer to compute each weight's contribution to the loss efficiently.
Depth gives expressive power; backprop + gradient descent make that power trainable by assigning credit and blame across every weight.
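The pieces of this session (a feedforward network, a loss, SGD and backpropagation) fit together as in the following sketch. It is a small pure-Python illustration rather than the in-class demo; the 2-4-1 architecture, tanh hidden units, squared-error loss, learning rate and seed are all assumptions made for the example.

```python
import math
import random

XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_params(n_hidden, rng):
    return {
        "W1": [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(p, x):
    """Hidden layer with tanh, then a single sigmoid output unit."""
    hidden = [math.tanh(w[0] * x[0] + w[1] * x[1] + b) for w, b in zip(p["W1"], p["b1"])]
    y = sigmoid(sum(w * h for w, h in zip(p["W2"], hidden)) + p["b2"])
    return hidden, y


def mean_loss(p):
    """Mean of 0.5 * (y - t)^2 over the four XOR patterns."""
    return sum(0.5 * (forward(p, x)[1] - t) ** 2 for x, t in XOR) / len(XOR)


def sgd_step(p, x, t, lr):
    hidden, y = forward(p, x)
    # Backpropagation: chain rule from the loss back to each layer's pre-activation.
    delta_out = (y - t) * y * (1.0 - y)  # dL/dz at the output
    delta_hidden = [delta_out * w2 * (1.0 - h * h) for w2, h in zip(p["W2"], hidden)]
    # Gradient descent: step every weight against its gradient.
    for j, h in enumerate(hidden):
        p["W2"][j] -= lr * delta_out * h
        p["W1"][j][0] -= lr * delta_hidden[j] * x[0]
        p["W1"][j][1] -= lr * delta_hidden[j] * x[1]
        p["b1"][j] -= lr * delta_hidden[j]
    p["b2"] -= lr * delta_out


params = init_params(n_hidden=4, rng=random.Random(0))
loss_before = mean_loss(params)
for _ in range(5000):
    for x, t in XOR:
        sgd_step(params, x, t, lr=0.5)
loss_after = mean_loss(params)
print(f"loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

Here each "mini-batch" is a single example, the simplest form of SGD. Whether the network fully solves XOR depends on the random initialization, so the sketch reports the loss rather than promising a particular result.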
Sessions 5–7 · Training deep networks in practice, generalization, and how representations compress information.
- Relate neural development and plasticity to the practical levers of training deep nets.
- Choose sensible weight initialization, activation functions and regularization.
- Explain overfitting, generalization and why deep nets are data-hungry compared with humans.
- Describe how hidden layers learn compressed latent representations of data.
- Development & plasticity: brains wire themselves through experience-dependent plasticity; the nature-vs-nurture balance shapes which circuits form.
- Training deep networks in practice: the engineering analogue — the choices that decide whether a deep net actually converges.
- Weight initialization, activation functions, regularization: good init avoids vanishing/exploding gradients; non-linear activations (ReLU, sigmoid, tanh) give expressive power; regularization (L2, dropout) curbs overfitting.
Both brains and deep nets need the right starting conditions and constraints to learn well — initialization and regularization are the network's "developmental environment."
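The effect of initialization can be seen without any training. The sketch below pushes a random input through a stack of random ReLU layers and compares two weight scales: a fixed small standard deviation and the He scheme, sqrt(2 / fan_in). It uses the forward signal as a stand-in for the gradient, since the same layer-by-layer scaling acts on the backward pass; the width, depth and seed are arbitrary assumptions.

```python
import math
import random


def relu_forward(x, depth, weight_std, rng):
    """Push x through `depth` random ReLU layers and return the final activations."""
    width = len(x)
    for _ in range(depth):
        # Each output unit draws its own Gaussian weights on the fly.
        x = [
            max(0.0, sum(rng.gauss(0.0, weight_std) * xi for xi in x))
            for _ in range(width)
        ]
    return x


def rms(values):
    """Root-mean-square size of a list of activations."""
    return math.sqrt(sum(v * v for v in values) / len(values))


WIDTH, DEPTH = 128, 8
rng = random.Random(0)
inputs = [rng.gauss(0.0, 1.0) for _ in range(WIDTH)]

rms_tiny = rms(relu_forward(inputs, DEPTH, 0.01, rng))
rms_he = rms(relu_forward(inputs, DEPTH, math.sqrt(2.0 / WIDTH), rng))

print(f"std=0.01 : final RMS activation = {rms_tiny:.2e}")
print(f"He init  : final RMS activation = {rms_he:.2e}")
```

With the small fixed scale the signal shrinks by roughly the same factor at every layer and has all but vanished after eight layers, whereas the He scale keeps it at about the size of the input.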
- Learning from limited examples: humans generalise from a handful of instances via inductive reasoning and strong priors — few-shot by default.
- Generalization in AI: the gap between training and test performance; overfitting is memorising noise, fought with data augmentation and regularization.
- Data hunger: with millions of parameters and weak priors, deep nets usually need large datasets to generalise reliably.
Generalization, not memorisation, is the goal — and the human brain's strong priors are exactly what data-hungry networks lack.
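The gap between training and test performance can be illustrated with two deliberately simple models standing in for the general idea (neither is a deep network). One is a least-squares line, which carries a strong prior; the other is a 1-nearest-neighbour lookup that memorises every training label. The data-generating rule, noise level and sample sizes are assumptions made for the example.

```python
import random

rng = random.Random(0)


def make_data(n):
    """Noisy samples of the underlying rule y = 2x."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0.0, 0.3)) for x in xs]


def fit_line(data):
    """Closed-form least-squares line: a strong prior with two parameters."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in data) / sum(
        (x - mean_x) ** 2 for x, _ in data
    )
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memoriser(data):
    """1-nearest-neighbour lookup: reproduces every training label, noise included."""
    return lambda x: min(data, key=lambda pair: abs(pair[0] - x))[1]


def mse(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


train, test = make_data(30), make_data(200)
for name, model in [("line", fit_line(train)), ("memoriser", fit_memoriser(train))]:
    print(f"{name:10s} train MSE={mse(model, train):.3f}  test MSE={mse(model, test):.3f}")
```

The memoriser scores a perfect zero on its own training set because it has stored the noise along with the signal; its error only becomes visible on held-out data, which is why generalization is measured on examples the model has not seen.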
- Dimensionality reduction: projecting high-dimensional data onto a few informative axes that capture most of its variance (e.g. PCA).
- Latent representations in hidden layers: hidden units learn compressed codes for the input — the network's internal "concepts."
- Hebbian networks: a Hebbian unit whose weights are kept normalised (e.g. by Oja's rule) can converge to the first principal component of its inputs, linking biological plasticity to PCA.
Both PCA and Hebbian learning find the directions of maximal variance — so a simple "fire-together" rule recovers the same compressed representation as a classic statistical method.
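The link between Hebbian learning and PCA can be checked numerically. The sketch below uses Oja's rule, a normalised variant of the Hebbian update (plain Hebbian growth is unbounded), on 2-D data whose direction of largest variance is known by construction. The 30-degree direction, the variances, the learning rate and the seed are assumptions chosen for the example.

```python
import math
import random

rng = random.Random(0)
THETA = math.radians(30)  # direction of largest variance, set by construction
TRUE_PC = (math.cos(THETA), math.sin(THETA))


def sample():
    """2-D Gaussian: std 3 along TRUE_PC, std 0.5 along the orthogonal axis."""
    a, b = rng.gauss(0.0, 3.0), rng.gauss(0.0, 0.5)
    return (a * TRUE_PC[0] - b * TRUE_PC[1], a * TRUE_PC[1] + b * TRUE_PC[0])


def oja_step(w, x, lr):
    """Hebbian term lr*y*x plus a decay term lr*y^2*w that keeps |w| near 1."""
    y = w[0] * x[0] + w[1] * x[1]
    return (w[0] + lr * y * (x[0] - y * w[0]), w[1] + lr * y * (x[1] - y * w[1]))


w = (0.0, 1.0)
for _ in range(10000):
    w = oja_step(w, sample(), lr=0.005)

# Absolute cosine between the learned weights and the known principal direction.
alignment = abs(w[0] * TRUE_PC[0] + w[1] * TRUE_PC[1]) / math.hypot(*w)
print("learned weights:", w)
print("alignment with first principal component:", round(alignment, 3))
```

The unit only ever sees its own input and output, yet its weight vector rotates toward the axis a PCA of the same data would report first. The sign of the learned vector is arbitrary, which is why the alignment is taken as an absolute value.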
Sessions 8–10 · How humans and machines perceive images and sound, how machine vision works and where it fails, and how both process language.
- Contrast bottom-up, top-down, Gestalt and ecological theories of perception.
- Explain how convolutional networks exploit local structure the way the visual cortex does.
- Account for adversarial examples and where machine and human vision diverge.
- Connect language and thought to NLP and modern language models.
- Processing visual and auditory information: how sensory signals are transduced and built up into stable percepts.
- Theories of perception: bottom-up (data-driven) vs. top-down (expectation-driven) processing, Gestalt grouping principles, and Gibson's ecological view that the environment affords perception directly.
- Convolutional Neural Networks (CNNs): local receptive fields, shared weights and pooling — directly inspired by the hierarchy of the visual cortex.
CNNs encode the visual-cortex insight that nearby pixels are related: weight sharing across the image yields translation-tolerant feature detectors.
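Weight sharing is easiest to see in a hand-written convolution. The sketch below slides one small kernel over a toy image with no padding (strictly a cross-correlation, as in most deep-learning libraries) and then repeats the operation on a shifted copy. The image, the kernel values and the function name `conv2d_valid` are assumptions made for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (cross-correlation, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


VERTICAL_EDGE = [[-1, 1], [-1, 1]]  # responds where brightness rises left-to-right

image = [[0, 0, 1, 1, 1, 1] for _ in range(4)]    # edge between columns 1 and 2
shifted = [[0, 0, 0, 1, 1, 1] for _ in range(4)]  # same edge, one pixel to the right

print(conv2d_valid(image, VERTICAL_EDGE)[0])    # [0, 2, 0, 0, 0]
print(conv2d_valid(shifted, VERTICAL_EDGE)[0])  # [0, 0, 2, 0, 0]
```

Because the same four weights are applied at every position, moving the edge by one pixel moves the response by one pixel and leaves its strength unchanged. Pooling over neighbouring positions then turns this into tolerance to small shifts.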
- CNNs, part II: deeper architectures, stacked feature hierarchies and what successive layers come to detect (edges → textures → objects).
- Adversarial examples: tiny, often imperceptible input perturbations that flip a CNN's prediction — exposing that machine vision does not "see" the way humans do.
Matching human accuracy does not mean matching human robustness: adversarial examples reveal that CNNs latch onto statistical cues invisible and irrelevant to people.
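A toy version of an adversarial attack shows why small perturbations can be so effective. The sketch below applies the fast gradient sign method (FGSM) to a linear logistic classifier instead of a CNN, which keeps the gradient a one-liner. The weights, bias, input and epsilon are hypothetical values picked for the example, not a trained model.

```python
import math

N = 100
WEIGHTS = [1.0 if i % 2 == 0 else -1.0 for i in range(N)]  # hypothetical trained weights
BIAS = 0.3


def prob_class1(x):
    score = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-score))


def predict(x):
    return 1 if prob_class1(x) > 0.5 else 0


def fgsm(x, label, eps):
    """Move every feature by eps in the direction that increases the loss."""
    # For the logistic (cross-entropy) loss, dL/dx_i = (p - label) * w_i.
    p = prob_class1(x)
    grad = [(p - label) * w for w in WEIGHTS]
    return [xi + eps * math.copysign(1.0, g) for xi, g in zip(x, grad)]


x = [0.5] * N  # a mid-grey "image" of 100 pixels
x_adv = fgsm(x, label=1, eps=0.01)

print("clean prediction:", predict(x))            # 1
print("adversarial prediction:", predict(x_adv))  # 0
print("largest pixel change:", max(abs(a - b) for a, b in zip(x, x_adv)))
```

No pixel moves by more than 0.01, yet the hundred small changes all push the score the same way and together shift it by about 1.0, enough to cross the decision boundary. Real images have far more than 100 dimensions, so the per-pixel change needed is smaller still.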
- Language and thought: how language structures cognition (e.g. linguistic-relativity debates) and what it reveals about the mind.
- Human language processing: how people parse, produce and comprehend language in real time.
- NLP and language models: representing words as vectors and predicting text statistically — the foundation that leads into attention and LLMs.
Modern NLP treats meaning as geometry: words become vectors and a language model learns P(next word | context), sidestepping explicit grammar.
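The quantity P(next word | context) can be estimated by plain counting when the context is a single word. The bigram sketch below is the simplest possible language model and only a stand-in for the neural models discussed in class; the toy corpus and function names are assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each word follows each one-word context."""
    counts = defaultdict(Counter)
    for context, nxt in zip(tokens, tokens[1:]):
        counts[context][nxt] += 1
    return counts


def next_word_prob(counts, context, word):
    """Maximum-likelihood estimate of P(word | context)."""
    total = sum(counts[context].values())
    return counts[context][word] / total if total else 0.0


corpus = "the cat sat on the mat the cat ate".split()
model = train_bigram(corpus)

print(next_word_prob(model, "the", "cat"))  # 2 of the 3 continuations of "the"
print(model["the"].most_common(1))          # [('cat', 2)]
```

Nothing in the model encodes grammar: the preference for "cat" after "the" comes entirely from co-occurrence statistics. Neural language models keep the same objective but replace the count table with learned vector representations, which lets them share statistics between similar words and longer contexts.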
Sessions 11–15 · Attention and LLMs, simulating emotion and personality, the philosophy of mind, and final assessment.
- Explain the attention mechanism and how it underpins Large Language Models.
- Represent emotion (valence–arousal, PAD) and personality (Big Five) computationally.
- Identify sources of bias in AI systems and their societal impact.
- Engage critically with the Chinese Room and the question of machine understanding.
- Attention in deep learning: letting the network weigh which parts of the input matter most for each output — the mechanism behind Transformers.
- Large Language Models: Transformer-based models trained on vast text that can be steered to express a persona or tone.
- Recognising & simulating personality & emotion: emotion modelled as valence–arousal or PAD (pleasure–arousal–dominance); personality as the Big Five (OCEAN) trait dimensions.
- Bias in AI systems: models inherit and can amplify the social biases present in their training data.
Emotion and personality become tractable for AI once represented as low-dimensional coordinates — valence–arousal, PAD, or the Big Five — that a model can read off and reproduce.
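As a rough sketch of what "reading off" an emotion from coordinates could look like, the code below places a few labels in valence-arousal space and returns the nearest one to a given point. The prototype coordinates, the [-1, 1] scaling and the helper names are hypothetical choices for illustration only; they are not taken from the course or from any published dataset.

```python
import math

# Hypothetical prototype coordinates (valence, arousal), each in [-1, 1].
# The placements are illustrative, not taken from the course or a dataset.
PROTOTYPES = {
    "joy": (0.8, 0.5),
    "anger": (-0.6, 0.7),
    "sadness": (-0.7, -0.5),
    "calm": (0.6, -0.6),
}


def nearest_emotion(valence, arousal):
    """Read off the label whose prototype is closest to the given point."""
    return min(
        PROTOTYPES,
        key=lambda name: math.dist(PROTOTYPES[name], (valence, arousal)),
    )


def blend(state, target, rate=0.5):
    """Move an affective state part of the way toward a target point."""
    return tuple(s + rate * (t - s) for s, t in zip(state, target))


print(nearest_emotion(0.7, 0.4))    # joy
print(nearest_emotion(-0.5, -0.4))  # sadness
print(blend((0.0, 0.0), (1.0, -1.0)))
```

The same pattern extends to PAD by adding a third coordinate, and to the Big Five by using a five-dimensional trait vector; in each case a label or persona becomes a point that can be compared, interpolated or moved.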
- What is "understanding"? The gap between manipulating symbols and grasping their meaning (syntax vs. semantics).
- The Chinese Room: Searle's argument that running a program to produce fluent answers does not entail genuine understanding.
- Can AI be truly intelligent? Weighing strong vs. weak AI, and what behavioural success can and cannot prove about minds.
Passing a behavioural test (Turing) is not the same as understanding (Searle): the course closes by separating competence from comprehension.
Key concepts — glossary
A quick reference to the core terms used across the program, in roughly the order they appear.
- Cognitive science
- The interdisciplinary study of mind that models cognition as information processing across psychology, neuroscience, linguistics, philosophy and CS.
- Turing Test
- A behavioural benchmark: a machine is deemed intelligent if its conversation is indistinguishable from a human's.
- Action potential
- The all-or-none electrical spike a biological neuron fires once its membrane potential crosses threshold.
- Artificial neuron
- A unit computing a weighted sum of inputs plus bias passed through an activation function, y=φ(Σwᵢxᵢ+b).
- Perceptron
- A single linear threshold neuron that learns a decision boundary by correcting its weights on misclassified examples.
- Activation function
- The non-linearity (ReLU, sigmoid, tanh) that gives a network expressive power beyond a linear map.
- Supervised learning
- Learning from labelled (input, target) pairs by minimising the error between output and target.
- Hebbian learning
- An unsupervised, local rule ("cells that fire together wire together"): Δwᵢⱼ=η·xᵢ·xⱼ, where xᵢ and xⱼ are the activities of the two connected units and η is the learning rate.
- Multi-layer perceptron
- A feedforward network of stacked neuron layers able to represent non-linear functions such as XOR.
- Loss function
- A scalar measure of how wrong predictions are (e.g. MSE, cross-entropy) that training seeks to minimise.
- Gradient descent
- Iteratively stepping weights downhill along the negative loss gradient, w←w−η∂L/∂w, where η is the learning rate.
- Backpropagation
- Efficient computation of loss gradients for every weight by applying the chain rule backward through layers.
- Regularization
- Techniques (L2 weight decay, dropout) that constrain a model to curb overfitting.
- Overfitting
- Memorising training noise so that test performance lags far behind training performance.
- Generalization
- A model's ability to perform well on unseen data — the actual goal of learning.
- Dimensionality reduction
- Projecting high-dimensional data onto fewer informative axes that retain most variance (e.g. PCA).
- Latent representation
- The compressed internal code a hidden layer learns for its inputs.
- Convolutional neural network
- A vision network using local receptive fields, shared weights and pooling, inspired by the visual cortex.
- Adversarial example
- A tiny, often imperceptible input perturbation that flips a model's prediction.
- Attention
- A mechanism weighting which parts of the input matter for each output, softmax(QKᵀ/√d)V, with queries Q, keys K, values V and key dimension d; basis of Transformers.
- Large Language Model
- A Transformer trained on vast text to predict the next token, steerable toward a persona or tone.
- Valence–arousal
- A 2-D model of affect: pleasantness (valence) by activation level (arousal) — the circumplex of emotion.
- PAD model
- A 3-D emotion space of Pleasure, Arousal and Dominance.
- Big Five (OCEAN)
- The five trait dimensions — Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism — used to model personality.
- Algorithmic bias
- Systematic unfairness a model inherits and can amplify from biased training data.
- Chinese Room
- Searle's argument that symbol manipulation producing fluent output need not constitute genuine understanding.
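The attention formula in the glossary, softmax(QKᵀ/√d)V, can be computed directly on small lists. The sketch below is a single-head, unbatched illustration in pure Python; the example matrices are assumptions, and production Transformers add learned projections, multiple heads and masking that are omitted here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one row of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V for lists-of-lists Q, K, V."""
    d = len(K[0])
    # One row of weights per query: how much each key matters to it.
    weights = [
        softmax([sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K])
        for q in Q
    ]
    # Each output is the weighted average of the value vectors.
    outputs = [
        [sum(w * v[col] for w, v in zip(row, V)) for col in range(len(V[0]))]
        for row in weights
    ]
    return outputs, weights


Q = [[10.0, 0.0]]  # one query that strongly matches the first key
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]

outputs, weights = attention(Q, K, V)
print(weights)  # nearly all weight on the first key
print(outputs)  # close to V[0]
```

Each row of weights sums to 1, so every output is a weighted average of the value vectors, and the query decides which values dominate.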
Bibliography
Recommended texts, annotated with where each is used across the program.
- Stuart Russell & Peter Norvig (2020). Artificial Intelligence: A Modern Approach, 4th ed. Pearson. ISBN 978013461099 (printed). Use: the AI backbone — history and definitions of intelligence (Session 1) and the philosophy of AI / Chinese Room (Session 12).
- Jay Friedenberg, Gordon Silverman & Michael James Spivey (2021). Cognitive Science: An Introduction to the Study of Mind, 4th ed. SAGE. ISBN 9781544380155 (printed). Use: the cognitive-science side — foundations (S1), memory & inductive reasoning (S6), perception theories (S8), language and thought (S10).
- Marvin Minsky (2007). The Emotion Machine. Simon & Schuster. ISBN 978074327664 (printed). Use: emotion and personality in AI (Session 11) and the philosophy-of-mind discussion (Session 12).
- Michael A. Nielsen (2015). Neural Networks and Deep Learning. Determination Press — neuralnetworksanddeeplearning.com (digital). Use: the technical core for the network sessions — neurons & perceptrons (S2–3, ch. 1–2), backprop & SGD (S4, ch. 2), training/regularization & generalization (S5–7, ch. 3), and CNNs/vision (S8–9, ch. 6).