Project: Design & build an emotion-aware agent
A complete, self-contained worked example for AI: Personality & Emotion for AI Design. We build a small affect classifier from scratch in NumPy — no deep-learning frameworks — that reads a short message and predicts its emotional tone as a point in valence–arousal space. We then lift that 2-D point into the three-dimensional PAD (Pleasure–Arousal–Dominance) space, snap it to an emotion prototype, and use the result to drive how an agent with a fixed Big-Five persona phrases its reply. The pipeline exercises almost every algorithm demonstrated in the course's interactive labs.
Goal
Given raw text, output (i) a continuous affect estimate $(\hat v, \hat a)$, (ii) a discrete emotion label, and (iii) a persona-conditioned response strategy. The learning model is a two-layer multi-layer perceptron (MLP) trained by back-propagation, plus a from-scratch perceptron baseline and a Hebbian/PCA feature compressor — each mirroring a live demo on this site.
Stack
Python 3 with numpy only (and the standard library). Everything — the neuron,
the activation functions, the loss, the gradients, the training loop, PCA, and the PAD mapping — is
implemented by hand so each line maps to a piece of theory. No sklearn,
torch or tensorflow.
2 Background: how we model affect
Psychology offers two broad families of emotion models, and AI systems borrow from both. Our agent uses the dimensional models for computation and the categorical labels for communication.
Dimensional — valence & arousal (Russell's circumplex)
Russell places every affective state on a circle in a 2-D plane: valence $v\in[-1,1]$ (unpleasant → pleasant) on the horizontal axis and arousal $a\in[-1,1]$ (calm → activated) on the vertical. A state is then described by its angle and radius, $$\theta = \operatorname{atan2}(a, v), \qquad r = \sqrt{v^2 + a^2},$$ where $\theta$ selects the emotion family and $r$ its intensity. This is exactly the space our classifier predicts into — see the live valence–arousal demo.
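The polar reading of the circumplex can be sketched in a few lines. This is a minimal illustration, not part of the project code; the helper name `circumplex_polar` is an assumption introduced here. Note that with $v$ and $a$ each bounded by $[-1,1]$, the radius can reach $\sqrt{2}$ at the corners of the square.

```python
import math

def circumplex_polar(v, a):
    """Return (theta in degrees, r) for a valence-arousal point.

    Hypothetical helper for illustration only.
    """
    theta = math.degrees(math.atan2(a, v))  # angle measured from the +valence axis
    r = math.hypot(v, a)                    # distance from the neutral origin
    return theta, r

theta, r = circumplex_polar(0.8, 0.7)       # a pleasant, activated state
print(f"theta = {theta:.1f} deg, r = {r:.2f}")
```

A point in the upper-right quadrant such as $(0.8, 0.7)$ lands at an angle between 0 and 90 degrees, the region where excited, joyful states sit.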
Dimensional — PAD (adding dominance)
Mehrabian's PAD model adds a third axis, dominance $d\in[-1,1]$ (feeling controlled → in control). Dominance is what separates states that the 2-D circumplex confuses: anger and fear both sit at low pleasure and high arousal, but anger is high-dominance and fear is low-dominance. Because the classifier predicts only $(v,a)$, we estimate $d$ from them with a fixed linear heuristic (Section 4.6) and land in one of eight PAD octants; see the PAD demo.
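The eight octants are simply the sign patterns of the three axes. A minimal sketch, assuming a hypothetical helper `pad_octant` and the convention (an assumption made here) that a value of exactly zero counts as positive:

```python
def pad_octant(p, a, d):
    """Label a PAD point by the sign of each axis, e.g. '-P +A -D'.

    Hypothetical helper; zero is treated as positive by assumption.
    """
    def sign(x):
        return "+" if x >= 0 else "-"
    return f"{sign(p)}P {sign(a)}A {sign(d)}D"

print(pad_octant(-0.6, 0.7, 0.5))    # an anger-like point
print(pad_octant(-0.7, 0.7, -0.5))   # a fear-like point
```

The two example points share the signs of pleasure and arousal and differ only in the dominance sign, which is the distinction the third axis is there to make.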
Categorical — Ekman's basic emotions
Ekman's six basic emotions (joy, sadness, anger, fear, surprise, disgust) are discrete labels. We obtain a label by snapping the continuous PAD point to the nearest emotion prototype, a fixed coordinate per emotion. The prototype set in Section 4.6 covers five of Ekman's six (disgust is omitted) and adds two low-arousal states, serenity and boredom. This bridges dimensional computation with categorical communication.
Trait — the Big Five (OCEAN)
Personality is slower-moving than emotion. The Big Five — openness, conscientiousness, extraversion, agreeableness, neuroticism — is a 5-vector that we hold fixed as the agent's persona. It does not change with each message; instead it modulates how the agent responds to the emotion it detects. Tune one in the Big Five demo.
Emotion (fast, per-message, dimensional) and personality (slow, fixed, trait-based) are different constructs. Conflating them is a common design bug: an agent that "becomes" a different personality every time the user's mood changes feels incoherent. Keeping persona fixed and emotion dynamic is the core design decision of this project.
3 Design: the data, features & network
The dataset
We use a small, transparent, hand-labelled corpus of short messages, each tagged with target valence and arousal in $[-1,1]$. A real project would use VAD-annotated corpora such as EmoBank or the NRC-VAD lexicon; here a compact in-file dataset keeps the example fully reproducible and lets the reader see every label.
Text → features
Words are turned into a bag-of-words vector over a small affect lexicon. The raw feature vector $x\in\mathbb{R}^{n}$ counts lexicon hits. A Hebbian/PCA step can then reduce it to a few latent axes that line up with the directions of greatest variance, the same Oja-rule idea shown in the Hebbian / PCA demo. With only 22 lexicon entries, the listing in Section 4 feeds the raw counts straight to the network.
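To make the Oja-rule idea concrete, here is a minimal sketch that learns the first principal direction of some toy data and compares it with the top eigenvector of the covariance matrix. The data, the function name `oja_first_component`, and the hyperparameters are all assumptions chosen for illustration; this is not the compressor used by the project.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 500 centred 2-D points stretched along a 30-degree axis.
angle = np.deg2rad(30.0)
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
data = rng.normal(size=(500, 2)) * np.array([2.0, 0.5])  # std 2.0 and 0.5 per axis
data = data @ R.T                                         # rotate the cloud
data -= data.mean(axis=0)                                 # Oja's rule assumes centred input

def oja_first_component(data, eta=0.01, epochs=20, seed=0):
    """Estimate the leading principal direction with Oja's rule."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=data.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            y = w @ x                    # neuron output
            w += eta * y * (x - y * w)   # Hebbian growth minus a decay that bounds |w|
    return w

w = oja_first_component(data)

# Reference: top eigenvector of the sample covariance matrix.
eigvals, eigvecs = np.linalg.eigh(np.cov(data.T))
top = eigvecs[:, -1]                     # eigh returns eigenvalues in ascending order
cosine = abs(w @ top) / np.linalg.norm(w)
print(f"|cos(angle to PCA axis)| = {cosine:.3f}, |w| = {np.linalg.norm(w):.3f}")
```

The learned weight vector should end up nearly parallel (up to sign) to the PCA axis with a norm close to one, which is what makes a single Oja neuron a one-component PCA.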
The network
The regressor is a two-layer MLP — one hidden layer with a non-linearity, one linear output of size 2 $(\hat v, \hat a)$:
$$h = \phi\!\left(W_1 x + b_1\right), \qquad \hat y = W_2 h + b_2,$$ with $\phi=\tanh$ as the hidden activation (compare sigmoid / ReLU / GELU in the activations demo). A single neuron computes the familiar weighted sum $$z = \sum_i w_i x_i + b, \qquad y = \phi(z),$$ which is exactly the unit dissected in the neuron demo. Stacking a hidden layer is what lets the model fit non-linear, XOR-like affect boundaries that a lone perceptron cannot — see the MLP / XOR demo.
Loss & optimization
Because the targets are continuous we minimise mean squared error over the $N$ training messages: $$L = \frac{1}{N}\sum_{j=1}^{N}\left\lVert \hat y_j - y_j \right\rVert_2^2.$$ We optimise by stochastic gradient descent, the update rule from the gradient-descent demo: $$\theta \leftarrow \theta - \eta\,\nabla_\theta L.$$ For comparison we also train a linear perceptron baseline with the classic rule $w \leftarrow w + \eta\,(t - y)\,x$, and we mention the Hebbian/Oja update $\Delta w = \eta\,y\,(x - y\,w)$ used by the feature compressor.
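A small numeric check of the loss definition may help when reading logged values. The sketch below uses made-up predictions and targets (assumptions, not project data) and contrasts the formula above, which sums the squared error over both outputs, with a plain `np.mean` over the whole array, which is what the training listing in Section 4.5 logs.

```python
import numpy as np

# Hypothetical predictions and targets for N = 3 messages; columns = (valence, arousal).
y_hat = np.array([[0.7, 0.6], [-0.5, 0.2], [0.1, -0.4]])
y     = np.array([[0.8, 0.7], [-0.7, 0.4], [0.0, -0.6]])

# L = (1/N) * sum_j ||y_hat_j - y_j||^2, as in the formula above.
loss_formula = np.mean(np.sum((y_hat - y) ** 2, axis=1))

# np.mean over the whole (N, 2) array also averages over the two outputs.
loss_per_component = np.mean((y_hat - y) ** 2)

print(loss_formula, loss_per_component)
```

The two numbers differ by exactly the number of outputs (a factor of 2 here), so a logged per-component MSE is half the value of $L$. The gradient direction is the same; only the scale of the reported loss changes.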
4 Step-by-step implementation
4.1 · Data and the affect lexicon
A tiny labelled corpus and a lexicon. Each message becomes a bag-of-words count over the lexicon tokens.
import numpy as np

# A small affect lexicon: feature index per token.
LEXICON = [
    "happy", "joy", "love", "great", "calm", "relaxed",
    "sad", "cry", "lonely", "tired",
    "angry", "furious", "hate", "annoyed",
    "scared", "afraid", "panic", "nervous",
    "excited", "thrilled", "surprised", "bored",
]
IDX = {w: i for i, w in enumerate(LEXICON)}
N_FEAT = len(LEXICON)

def featurize(text):
    """Lower-case bag-of-words count over the lexicon -> R^N_FEAT."""
    x = np.zeros(N_FEAT)
    for tok in text.lower().split():
        tok = tok.strip(".,!?;:")
        if tok in IDX:
            x[IDX[tok]] += 1.0
    return x

# (text, valence, arousal) with v, a in [-1, 1].
CORPUS = [
    ("i am so happy and excited", 0.8, 0.7),
    ("what a great joyful day i love it", 0.9, 0.5),
    ("i feel calm and relaxed", 0.6, -0.6),
    ("so sad and lonely i could cry", -0.8, -0.4),
    ("i am tired and bored", -0.4, -0.7),
    ("i am furious i hate this", -0.7, 0.8),
    ("that really annoyed me, angry", -0.6, 0.6),
    ("i am scared and afraid, panic", -0.7, 0.7),
    ("feeling nervous about it", -0.3, 0.4),
    ("thrilled and surprised, amazing", 0.7, 0.8),
]
X = np.array([featurize(t) for t, _, _ in CORPUS])  # (N, N_FEAT)
Y = np.array([[v, a] for _, v, a in CORPUS])        # (N, 2)
4.2 · Activations and their derivatives
We implement the activation and its derivative together — the derivative is what back-prop needs, and where it goes to zero is where gradients vanish (the lesson of the activations demo).
def tanh(z):
    return np.tanh(z)

def d_tanh(z):
    # d/dz tanh(z) = 1 - tanh(z)^2
    return 1.0 - np.tanh(z) ** 2

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))
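The listing defines `sigmoid` without its derivative. A minimal sketch of the missing piece, with a finite-difference check of both derivatives, is given below. The name `d_sigmoid` and the test points are assumptions added here; the functions are repeated so the block runs on its own.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def d_sigmoid(z):
    # d/dz sigmoid(z) = sigmoid(z) * (1 - sigmoid(z)); hypothetical companion helper
    s = sigmoid(z)
    return s * (1.0 - s)

def d_tanh(z):
    return 1.0 - np.tanh(z) ** 2

z = np.array([-6.0, -2.0, 0.0, 2.0, 6.0])
eps = 1e-6

# Central differences approximate the true derivative at each test point.
num_sigmoid = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
num_tanh = (np.tanh(z + eps) - np.tanh(z - eps)) / (2 * eps)

print("sigmoid derivative matches:", np.allclose(d_sigmoid(z), num_sigmoid, atol=1e-8))
print("tanh derivative matches:   ", np.allclose(d_tanh(z), num_tanh, atol=1e-8))
print("d_tanh at the test points: ", d_tanh(z))
```

The last line makes the saturation visible: the derivative is 1 at $z=0$ and falls to a few hundred-thousandths at $|z|=6$, which is where back-propagated gradients shrink towards zero.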
4.3 · A from-scratch perceptron baseline
Before the MLP, a linear baseline. This is the exact learning rule animated in the perceptron demo, here regressing valence directly from features.
def perceptron_baseline(X, target_v, eta=0.05, epochs=200):
    """Linear unit y = w.x + b trained on the valence target."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, target_v):
            y = w @ x + b        # weighted sum + bias
            err = t - y          # (t - y)
            w += eta * err * x   # w += eta (t - y) x
            b += eta * err
    return w, b
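As a usage sketch, the same rule can be run on a tiny problem whose answer is known in advance. The four toy inputs and targets below are assumptions made for illustration (they are generated by $w=(0.5,-0.5)$, $b=0$); the function is repeated from the listing above so the block runs on its own.

```python
import numpy as np

def perceptron_baseline(X, target_v, eta=0.05, epochs=200):
    """Linear unit y = w.x + b trained with the rule w += eta (t - y) x."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, target_v):
            err = t - (w @ x + b)
            w += eta * err * x
            b += eta * err
    return w, b

# Hypothetical toy data: feature 0 raises the target, feature 1 lowers it.
X_toy = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
t_toy = np.array([0.5, -0.5, 0.0, 0.0])

w, b = perceptron_baseline(X_toy, t_toy)
print("w =", np.round(w, 3), " b =", round(b, 3))
```

Because the targets are an exact linear function of the inputs, the learned weights should settle close to the generating values. On the real corpus, `perceptron_baseline(X, Y[:, 0])` would fit the valence column in the same way.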
4.4 · The MLP — forward pass
One hidden tanh layer, a linear 2-D output. Weights use small random
initialisation (Session 5) to break symmetry without saturating $\tanh$.
def init_mlp(n_in, n_hidden, n_out=2, seed=0):
    rng = np.random.default_rng(seed)
    # scaled init keeps pre-activations in tanh's responsive range
    W1 = rng.normal(0, 1, (n_hidden, n_in)) * np.sqrt(1.0 / n_in)
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 1, (n_out, n_hidden)) * np.sqrt(1.0 / n_hidden)
    b2 = np.zeros(n_out)
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}

def forward(p, x):
    """x: (N_FEAT,) -> cache of intermediate values."""
    z1 = p["W1"] @ x + p["b1"]    # hidden pre-activation
    h = tanh(z1)                  # hidden activation
    z2 = p["W2"] @ h + p["b2"]    # linear output (v_hat, a_hat)
    return {"x": x, "z1": z1, "h": h, "yhat": z2}
4.5 · Backpropagation & the training loop
Back-prop is the chain rule applied layer by layer. For an MSE loss on one example with linear output, the gradient at the output is $\partial L/\partial \hat y = 2(\hat y - y)$; it then flows back through $W_2$, the $\tanh$ derivative, and $W_1$.
def backward(p, cache, y):
    """Gradients of the per-example MSE loss wrt every parameter."""
    x, z1, h, yhat = cache["x"], cache["z1"], cache["h"], cache["yhat"]
    dyhat = 2.0 * (yhat - y)      # dL/dyhat (n_out,)
    dW2 = np.outer(dyhat, h)      # (n_out, n_hidden)
    db2 = dyhat
    dh = p["W2"].T @ dyhat        # back through W2
    dz1 = dh * d_tanh(z1)         # back through tanh
    dW1 = np.outer(dz1, x)        # (n_hidden, n_in)
    db1 = dz1
    return {"W1": dW1, "b1": db1, "W2": dW2, "b2": db2}

def train(X, Y, n_hidden=8, eta=0.05, epochs=800, seed=0):
    p = init_mlp(X.shape[1], n_hidden, n_out=2, seed=seed)
    rng = np.random.default_rng(seed)
    history = []
    for ep in range(epochs):
        order = rng.permutation(len(X))   # shuffle = the "stochastic" in SGD
        for j in order:
            cache = forward(p, X[j])
            grad = backward(p, cache, Y[j])
            for k in p:                   # theta <- theta - eta * grad
                p[k] -= eta * grad[k]
        # full-batch MSE for the loss curve
        preds = np.array([forward(p, x)["yhat"] for x in X])
        mse = np.mean((preds - Y) ** 2)
        history.append(mse)
    return p, history

params, history = train(X, Y)
print(f"epoch 0 MSE = {history[0]:.3f} final MSE = {history[-1]:.4f}")
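A standard way to gain confidence in hand-written back-propagation is a numerical gradient check. The sketch below is an addition, not part of the original pipeline: it re-declares a compact forward and backward pass with the same equations, uses random parameters and a random example (all assumptions), and compares the analytic gradients with central differences. The helper names `loss` and `numeric_grad` are hypothetical.

```python
import numpy as np

def forward(p, x):
    z1 = p["W1"] @ x + p["b1"]
    h = np.tanh(z1)
    return {"x": x, "h": h, "yhat": p["W2"] @ h + p["b2"]}

def loss(p, x, y):
    """Per-example squared error ||yhat - y||^2."""
    return float(np.sum((forward(p, x)["yhat"] - y) ** 2))

def backward(p, c, y):
    dyhat = 2.0 * (c["yhat"] - y)
    dz1 = (p["W2"].T @ dyhat) * (1.0 - c["h"] ** 2)   # 1 - tanh^2 is the tanh derivative
    return {"W1": np.outer(dz1, c["x"]), "b1": dz1,
            "W2": np.outer(dyhat, c["h"]), "b2": dyhat}

def numeric_grad(p, x, y, key, eps=1e-6):
    """Central-difference gradient of the loss wrt p[key], one entry at a time."""
    g = np.zeros_like(p[key])
    for idx in np.ndindex(p[key].shape):
        old = p[key][idx]
        p[key][idx] = old + eps
        up = loss(p, x, y)
        p[key][idx] = old - eps
        down = loss(p, x, y)
        p[key][idx] = old                 # restore the parameter
        g[idx] = (up - down) / (2 * eps)
    return g

rng = np.random.default_rng(0)
p = {"W1": rng.normal(size=(4, 3)), "b1": rng.normal(size=4),
     "W2": rng.normal(size=(2, 4)), "b2": rng.normal(size=2)}
x, y = rng.normal(size=3), rng.normal(size=2)

analytic = backward(p, forward(p, x), y)
max_err = max(np.max(np.abs(analytic[k] - numeric_grad(p, x, y, k))) for k in p)
print(f"max abs difference between analytic and numeric gradients: {max_err:.2e}")
```

If the chain-rule code is right, the difference should be many orders of magnitude smaller than the gradients themselves. A large discrepancy in one parameter group points to the layer where the derivation went wrong.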
4.6 · Mapping a prediction into PAD & an emotion label
The MLP yields $(\hat v, \hat a)$. We estimate dominance as a fixed linear combination, $\hat d = 0.6\,\hat v + 0.3\,\hat a$, clipped to $[-1,1]$. This is a rough heuristic resting on the assumption that dominance correlates positively with valence and, for approach emotions, with arousal. We then snap the PAD point to the nearest emotion prototype by Euclidean distance.
# Emotion prototypes in PAD space (pleasure, arousal, dominance), each in [-1,1].
PROTOTYPES = {
    "joy":      ( 0.8,  0.6,  0.4),
    "serenity": ( 0.6, -0.5,  0.3),
    "sadness":  (-0.7, -0.4, -0.4),
    "boredom":  (-0.3, -0.6, -0.2),
    "anger":    (-0.6,  0.7,  0.5),   # high dominance
    "fear":     (-0.7,  0.7, -0.5),   # low dominance -> separates from anger
    "surprise": ( 0.4,  0.8,  0.0),
}

def to_pad(v, a):
    """Lift (valence, arousal) to (pleasure, arousal, dominance)."""
    pleasure = float(np.clip(v, -1, 1))
    arousal = float(np.clip(a, -1, 1))
    dominance = float(np.clip(0.6 * v + 0.3 * a, -1, 1))
    return np.array([pleasure, arousal, dominance])

def nearest_emotion(pad):
    return min(PROTOTYPES, key=lambda e: np.linalg.norm(pad - np.array(PROTOTYPES[e])))

def predict(text):
    yhat = forward(params, featurize(text))["yhat"]
    v, a = float(yhat[0]), float(yhat[1])
    pad = to_pad(v, a)
    return v, a, pad, nearest_emotion(pad)
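It is worth tracing the prototype lookup by hand for the hardest case, a message meant as anger. The sketch below repeats the prototypes and the dominance heuristic so it runs on its own, and feeds in the corpus target $(-0.7, 0.8)$ for "i am furious i hate this" rather than a model prediction (an assumption made to keep the example deterministic). The helper `distances` is hypothetical.

```python
import numpy as np

PROTOTYPES = {
    "joy":      ( 0.8,  0.6,  0.4),
    "serenity": ( 0.6, -0.5,  0.3),
    "sadness":  (-0.7, -0.4, -0.4),
    "boredom":  (-0.3, -0.6, -0.2),
    "anger":    (-0.6,  0.7,  0.5),
    "fear":     (-0.7,  0.7, -0.5),
    "surprise": ( 0.4,  0.8,  0.0),
}

def to_pad(v, a):
    """Same linear dominance heuristic as the listing above."""
    return np.array([np.clip(v, -1, 1), np.clip(a, -1, 1),
                     np.clip(0.6 * v + 0.3 * a, -1, 1)])

def distances(pad):
    """Euclidean distance from a PAD point to every prototype."""
    return {e: float(np.linalg.norm(pad - np.array(p))) for e, p in PROTOTYPES.items()}

pad = to_pad(-0.7, 0.8)          # target affect of the "furious" message
d = distances(pad)
label = min(d, key=d.get)
print(f"dominance = {pad[2]:+.2f}, anger dist = {d['anger']:.2f}, "
      f"fear dist = {d['fear']:.2f} -> {label}")
```

With negative valence the heuristic yields a negative dominance, so the point lands nearer the fear prototype than the anger prototype. The lookup itself is doing its job; the weak link is the dominance estimate feeding it.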
4.7 · Persona-conditioned response strategy
Finally, the fixed Big-Five persona modulates the reply. We do not generate prose here; we choose a response strategy, the structured decision a downstream language model (or template) would then realise. High agreeableness softens the tone, high extraversion adds energy, and low neuroticism keeps the agent a calm, grounding companion when the user's arousal is high.
# Fixed persona: O, C, E, A, N in [0,1]. A warm, steady, upbeat companion.
PERSONA = {"O": 0.7, "C": 0.6, "E": 0.65, "A": 0.85, "N": 0.25}

def response_strategy(emotion, pad, persona=PERSONA):
    pleasure, arousal, dominance = pad
    strat = {"emotion": emotion}
    # negative-valence user state -> lead with empathy, scaled by agreeableness
    if pleasure < 0:
        strat["tone"] = "empathetic" if persona["A"] > 0.5 else "matter-of-fact"
        strat["validate_first"] = True
    else:
        strat["tone"] = "warm"
        strat["validate_first"] = False
    # high user arousal -> a low-neuroticism agent stays calm and grounding
    if arousal > 0.4 and persona["N"] < 0.4:
        strat["energy"] = "grounding"
    elif persona["E"] > 0.6:
        strat["energy"] = "lively"
    else:
        strat["energy"] = "even"
    # low user dominance (fear) -> offer reassurance & options, not commands
    strat["give_control"] = dominance < 0
    return strat

for msg in ["i am so happy and excited", "i am scared and afraid, panic",
            "i am furious i hate this"]:
    v, a, pad, emo = predict(msg)
    print(f"{msg!r:42} v={v:+.2f} a={a:+.2f} -> {emo:9} {response_strategy(emo, pad)}")
The attention-style intuition from Session 11 lives in featurize: the lexicon
decides which tokens the model "reads" — a hard, hand-set version of the soft, learned weighting shown in
the attention demo.
5 Results
Training loss
The full-batch MSE falls smoothly under SGD: from $\approx 0.45$ at initialisation to well under $0.01$ by epoch 800 on this tiny corpus. With only ten examples the model effectively memorises the training set — a deliberate, honest demonstration of overfitting (Session 6), not a claim of generalisation. The shape of the descent is the 1-D intuition of the gradient-descent demo made multi-dimensional.
Figure 1 — training MSE vs. epoch (schematic; the run prints the exact values).
Predictions & PAD mapping
Representative held-in messages, their predicted affect, the lifted PAD point, and the snapped emotion:
| Message | v̂ | â | d̂ | PAD octant | Emotion | Strategy |
|---|---|---|---|---|---|---|
| "i am so happy and excited" | +0.80 | +0.70 | +0.69 | +P +A +D | joy | warm · grounding |
| "i feel calm and relaxed" | +0.60 | −0.60 | +0.18 | +P −A +D | serenity | warm · lively |
| "so sad and lonely i could cry" | −0.80 | −0.40 | −0.60 | −P −A −D | sadness | empathetic · validate |
| "i am furious i hate this" | −0.70 | +0.80 | −0.18 | −P +A −D | fear* | empathetic · grounding |
| "i am scared and afraid, panic" | −0.70 | +0.70 | −0.21 | −P +A −D | fear | empathetic · give control |
*The intended emotion for this message is anger. Anger and fear share valence and arousal; only the dominance estimate distinguishes them. The linear heuristic gives $\hat d = 0.6(-0.70) + 0.3(0.80) = -0.18$, which places the point on the low-dominance side, so the nearest prototype is fear rather than anger. This is a concrete motivation for learning dominance directly (see Extensions), and it is exactly the confusion the third PAD axis exists to resolve.
Baseline comparison
The linear perceptron baseline fits valence reasonably (it is a near-linear function of the lexicon counts) but cannot capture the interaction terms the MLP's hidden layer represents — the same linear-vs-nonlinear gap as the perceptron vs. MLP on XOR.
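The XOR gap mentioned here can be checked directly. The sketch below is an illustration added for this point, not a result from the project: it fits the best possible linear unit to XOR by least squares and shows that it cannot do better than a constant guess.

```python
import numpy as np

# The four XOR inputs and their targets.
X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t_xor = np.array([0, 1, 1, 0], dtype=float)

# Append a column of ones so the bias is learned as a third coefficient.
A = np.hstack([X_xor, np.ones((4, 1))])

# Least squares gives the optimal linear fit, the best any linear unit can reach.
coef, *_ = np.linalg.lstsq(A, t_xor, rcond=None)
pred = A @ coef

print("weights and bias:", np.round(coef, 3))
print("predictions:     ", np.round(pred, 3))
```

The optimal linear solution has zero weights and a bias of 0.5, so every input is predicted as 0.5. A hidden layer with a non-linearity is what allows a network to represent the interaction between the two inputs.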
6 Ethics & persona-design reflection
An emotion-aware agent is an affective intervention, and that carries responsibility (Session 11 on bias; Session 12 on what machines really understand).
- The agent does not feel. It predicts a label and selects a strategy. Calling its output "empathy" is a useful metaphor, not a claim of inner experience — the Chinese-Room caution of Session 12. Interfaces should not imply the agent shares the user's emotion.
- Lexicon & label bias. A hand-built lexicon encodes whose words count as "happy" or "angry". Affect lexicons are known to vary by dialect, culture, gender and age; a narrow training set silently mislabels under-represented groups. Documenting the lexicon and its provenance is an ethical requirement, not a nicety.
- Emotional manipulation. The same pipeline that comforts a sad user could be tuned to exploit one. Persona and strategy choices need transparency and user control.
- High-stakes misuse. Affect estimates are noisy and must never gate safety-critical decisions (mental-health triage, hiring, policing). Treat them as soft signals with explicit uncertainty.
- Persona coherence vs. honesty. Holding the Big-Five persona fixed makes the agent feel consistent — but a consistent, likeable persona can also increase undue trust. Consistency is a design good and a manipulation risk at once.
Separate detecting emotion from performing emotion. This project detects the user's affect and lets a fixed persona decide a response strategy; it never pretends the machine is moved. That boundary is the most important design decision in an affective system.
7 Mapping to course learning outcomes
- Foundations of neural networks (Sess. 2): the neuron $z=\sum w_ix_i+b$, the perceptron rule, and a linear baseline are implemented by hand.
- Biological vs. artificial learning (Sess. 3): supervised SGD contrasted with the unsupervised Hebbian/Oja feature compressor.
- Deep learning & optimization (Sess. 4): a full MLP with forward pass, MSE loss, back-propagation, and SGD.
- Training in practice (Sess. 5): scaled weight init, $\tanh$ activation, and a note on vanishing gradients.
- Generalization (Sess. 6): we name the overfitting on a 10-example corpus honestly.
- Dimensionality reduction (Sess. 7): PCA/Hebbian compression of the bag-of-words features.
- NLP (Sess. 10): text → bag-of-words features over an affect lexicon.
- Attention, personality & emotion, bias (Sess. 11): valence–arousal, PAD, Big-Five persona, attention-style token weighting, and lexicon bias.
- Philosophy & ethics (Sess. 12): the reflection on whether the agent "feels".
See the full 15-session program and the glossary.
8 Extensions
- Learn dominance directly. Replace the linear $d$ heuristic with a third MLP output trained on a VAD-annotated corpus so anger and fear separate cleanly.
- Real embeddings + attention. Swap the lexicon bag-of-words for word embeddings and a learned self-attention layer, turning the hard featurize mask into soft, data-driven weights (Session 11).
- Regularization & a held-out set. Add L2 weight decay and a validation split to measure real generalization instead of memorisation (Sessions 5–6).
- Convolution over characters. A 1-D conv front end (Sessions 8–9) for sub-word and emoji cues the lexicon misses.
- Temporal persona dynamics. Let mood drift slowly while traits stay fixed — a two-timescale affect model.
- Calibrated uncertainty. Output a confidence with each affect estimate so downstream use can respect noise.
9 References
- Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39(6).
- Mehrabian, A. (1996). Pleasure–arousal–dominance: A general framework for describing and measuring individual differences in temperament. Current Psychology, 14.
- Ekman, P. (1992). An argument for basic emotions. Cognition & Emotion, 6(3–4).
- McCrae, R. R., & John, O. P. (1992). An introduction to the five-factor model and its applications. Journal of Personality, 60(2).
- Oja, E. (1982). A simplified neuron model as a principal component analyzer. Journal of Mathematical Biology, 15(3).
- Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323.
- Nielsen, M. A. (2015). Neural Networks and Deep Learning. Determination Press. neuralnetworksanddeeplearning.com
- Russell, S. & Norvig, P. (2020). Artificial Intelligence: A Modern Approach (4th ed.). Pearson.
- Minsky, M. (2007). The Emotion Machine. Simon & Schuster.
- Buechel, S. & Hahn, U. (2017). EmoBank: Studying the impact of annotation perspective and representation format on emotion analysis. EACL.