modeling-lab random numbers · monte carlo · discrete events · regression · classification

1. Modeling change — a dynamic system

A model is a rule for how a quantity changes over time. Population $P$ grows logistically: $\frac{dP}{dt}=rP\left(1-\frac{P}{K}\right)$. Small populations grow nearly exponentially; growth slows as $P$ approaches the carrying capacity $K$. We integrate the rule forward with small time steps — the essence of simulation.

final $P$
inflection at
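
The stepping rule above can be sketched with the forward-Euler method. This is a minimal illustration, not the page's own implementation: the function names and the parameter values ($P_0=10$, $r=0.5$, $K=1000$, step $0.01$) are assumptions chosen for the example. The closed-form solution is included only as a check on the numerical result.

```python
import math

def simulate_logistic(p0, r, k, dt, t_end):
    """Forward-Euler integration of dP/dt = r*P*(1 - P/K)."""
    steps = int(round(t_end / dt))
    p = p0
    path = [p]
    for _ in range(steps):
        p += dt * r * p * (1.0 - p / k)  # one small time step
        path.append(p)
    return path

def logistic_exact(p0, r, k, t):
    """Closed-form logistic solution, used here only as a reference."""
    return k / (1.0 + (k / p0 - 1.0) * math.exp(-r * t))

def inflection_time(p0, r, k):
    """Time at which P reaches K/2, where growth is fastest (needs p0 < K/2)."""
    return math.log(k / p0 - 1.0) / r

# Illustrative values (assumptions, not taken from the page).
path = simulate_logistic(p0=10.0, r=0.5, k=1000.0, dt=0.01, t_end=30.0)
print("final P (Euler):", path[-1])
print("final P (exact):", logistic_exact(10.0, 0.5, 1000.0, 30.0))
print("inflection at t =", inflection_time(10.0, 0.5, 1000.0))
```

Shrinking `dt` brings the Euler path closer to the exact curve at the cost of more steps.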

2. Random numbers — the linear congruential generator

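Computers make "random" numbers deterministically. The LCG iterates $x_{n+1}=(a\,x_n+c)\bmod m$ and returns $u_n=x_n/m\in[0,1)$. Plotting successive pairs $(u_n,u_{n+1})$ shows the quality of the constants: good ones fill the unit square evenly, bad ones collapse onto a few lattice lines. Because the state $x_n$ takes at most $m$ distinct values, the sequence must eventually cycle; the length of that cycle is its period, which is at most $m$.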

view(uₙ, uₙ₊₁)
detected period
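
A small sketch of the generator and a period detector follows. The function names are hypothetical, and the constants are toy values picked so the periods are easy to verify by hand; they are not recommended for real use.

```python
from itertools import islice

def lcg(seed, a, c, m):
    """Yield u_n = x_n / m for the recurrence x_{n+1} = (a*x_n + c) mod m."""
    x = seed % m
    while True:
        yield x / m
        x = (a * x + c) % m

def lcg_period(seed, a, c, m):
    """Cycle length, found by recording the step at which each state first appears."""
    first_seen = {}
    x = seed % m
    step = 0
    while x not in first_seen:
        first_seen[x] = step
        x = (a * x + c) % m
        step += 1
    return step - first_seen[x]

def successive_pairs(seed, a, c, m, count):
    """The (u_n, u_{n+1}) points one would scatter-plot in the unit square."""
    us = list(islice(lcg(seed, a, c, m), count + 1))
    return list(zip(us, us[1:]))

# Toy constants (assumptions for illustration only).
print(lcg_period(1, a=5, c=3, m=16))   # full period for this choice
print(lcg_period(1, a=3, c=0, m=16))   # short cycle: 1, 3, 9, 11, 1, ...
print(successive_pairs(1, 5, 3, 16, 4))
```

The dictionary-based detector uses memory proportional to the period, which is fine for small $m$ but not for production-sized moduli.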

3. Generating random variables — inverse transform

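To sample from a distribution with CDF $F$ using a uniform draw $U\sim\mathrm{Unif}(0,1)$, invert the CDF: $X=F^{-1}(U)$. For the exponential with rate $\lambda$, $F(x)=1-e^{-\lambda x}$, so solving $U=F(X)$ gives $X=-\frac{1}{\lambda}\ln(1-U)$, with theoretical mean $1/\lambda$. We draw $U$, push it through $F^{-1}$, and the histogram of samples converges to the target density.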

sample mean
theoretical mean
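
The exponential case can be sketched in a few lines. The rate $\lambda=2$, the sample size, and the seed are assumptions for the example, and `sample_exponential` is a hypothetical helper name.

```python
import math
import random

def sample_exponential(lam, n, rng):
    """Draw n Exp(lam) samples via X = -ln(1 - U) / lam."""
    # rng.random() lies in [0, 1), so 1 - U lies in (0, 1] and the log is finite.
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]

rng = random.Random(42)      # fixed seed so the run is repeatable
lam = 2.0                    # illustrative rate (an assumption)
samples = sample_exponential(lam, 50_000, rng)
sample_mean = sum(samples) / len(samples)
theoretical_mean = 1.0 / lam
print(f"sample mean {sample_mean:.4f}, theoretical mean {theoretical_mean:.4f}")
```

Using `1 - U` rather than `U` inside the logarithm matches the formula above and avoids `log(0)`, since Python's `random()` can return exactly 0 but never 1.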

4. Monte Carlo simulation — estimating $\pi$

Throw random darts uniformly into the unit square. The fraction landing inside the quarter circle approaches $\frac{\pi}{4}$, so $\hat\pi=4\cdot\frac{\text{hits}}{\text{throws}}$. With $n$ throws the error shrinks like $1/\sqrt{n}$, the signature convergence rate of Monte Carlo: each extra decimal digit of accuracy costs about 100 times more throws.

throws
inside
estimate $\hat\pi$
error
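
A minimal sketch of the dart-throwing estimator, assuming Python's built-in generator as the source of uniforms; `estimate_pi` and the throw counts are illustrative choices.

```python
import math
import random

def estimate_pi(throws, rng):
    """Estimate pi as 4 * (darts inside the quarter circle) / (darts thrown)."""
    inside = 0
    for _ in range(throws):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / throws

rng = random.Random(0)  # fixed seed so the run is repeatable
for throws in (100, 10_000, 100_000):
    estimate = estimate_pi(throws, rng)
    print(f"throws={throws:>7}  estimate={estimate:.5f}  error={abs(estimate - math.pi):.5f}")
```

The printed errors fluctuate from run to run with a different seed, but their typical size should fall by roughly a factor of ten for every hundredfold increase in throws.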

5. Monte Carlo inference — the sampling distribution

Repeatedly draw a sample of size $n$ from a population, compute its mean, and plot the means. The Central Limit Theorem says this sampling distribution is approximately normal, centred on the population mean, with standard deviation $\sigma/\sqrt{n}$ (the standard error), where $\sigma$ is the population standard deviation: wider for small $n$, tighter for large $n$. This is how Monte Carlo lets us reason about estimators.

mean of means
std of means (SE)
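
The experiment can be sketched as follows. The population here is assumed to be exponential with rate 1 (mean 1, $\sigma=1$), chosen because it is clearly non-normal; the sample size, repetition count, and function name are likewise assumptions.

```python
import random
import statistics

def sampling_distribution(n, reps, rng):
    """Means of `reps` independent samples of size n from an Exp(1) population."""
    means = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

rng = random.Random(7)   # fixed seed so the run is repeatable
n, reps = 25, 4000       # illustrative sizes (assumptions)
means = sampling_distribution(n, reps, rng)
mean_of_means = statistics.fmean(means)
std_of_means = statistics.stdev(means)
print(f"mean of means     {mean_of_means:.4f}  (population mean 1)")
print(f"std of means (SE) {std_of_means:.4f}  (sigma/sqrt(n) = {1.0 / n ** 0.5:.4f})")
```

Quadrupling `n` should roughly halve the standard deviation of the means.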

6. Discrete-event simulation — an M/M/1 queue

Customers arrive as a Poisson process at rate $\lambda$, and a single server completes exponentially distributed services at rate $\mu$. The simulation jumps between discrete events (arrival, departure). As the traffic intensity $\rho=\lambda/\mu$ nears 1 the queue explodes; for $\rho<1$, theory predicts a mean number of customers waiting (excluding the one in service) of $L_q=\frac{\rho^2}{1-\rho}$.

utilization $\rho$
customers in system
mean queue $L_q$
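
An event-driven sketch of the queue, under the standard M/M/1 assumptions of exponential inter-arrival and service times. The function name, the rates, and the run length are illustrative assumptions; the simulation tracks the time-averaged number of waiting customers so it can be compared with $L_q$.

```python
import random

def simulate_mm1(lam, mu, t_end, rng):
    """Event-driven M/M/1 run; returns the time-averaged number waiting in queue."""
    t = 0.0
    in_system = 0
    next_arrival = rng.expovariate(lam)
    next_departure = float("inf")   # no one in service yet
    area_queue = 0.0                # integral of (number waiting) over time
    while True:
        t_next = min(next_arrival, next_departure, t_end)
        area_queue += max(in_system - 1, 0) * (t_next - t)
        t = t_next
        if t >= t_end:
            break
        if next_arrival <= next_departure:      # arrival event
            in_system += 1
            next_arrival = t + rng.expovariate(lam)
            if in_system == 1:                  # server was idle: start service
                next_departure = t + rng.expovariate(mu)
        else:                                   # departure event
            in_system -= 1
            if in_system > 0:
                next_departure = t + rng.expovariate(mu)
            else:
                next_departure = float("inf")
    return area_queue / t_end

lam, mu = 0.5, 1.0                  # illustrative rates (assumptions)
rho = lam / mu
simulated = simulate_mm1(lam, mu, t_end=200_000.0, rng=random.Random(11))
print(f"rho={rho:.2f}  simulated L_q={simulated:.3f}  theory L_q={rho**2 / (1 - rho):.3f}")
```

Runs with $\rho$ close to 1 need a much longer `t_end` before the average settles, because the queue length fluctuates more slowly.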

7. Simple linear regression — least squares

Fit $\hat y=\beta_0+\beta_1 x$ by minimising the sum of squared residuals. Drag the generating slope and noise; the fitted line and $R^2$ update. Click the canvas to add your own points. $R^2$ is the share of variance the line explains.

$\hat\beta_1$ (slope)
$\hat\beta_0$ (intercept)
$R^2$
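
For a single predictor, the least-squares estimates have a closed form: $\hat\beta_1=S_{xy}/S_{xx}$ and $\hat\beta_0=\bar y-\hat\beta_1\bar x$. A minimal sketch, with `fit_line` as a hypothetical name and synthetic data (true slope 2, intercept 1, noise standard deviation 0.5) assumed for the demonstration:

```python
import random

def fit_line(xs, ys):
    """Return (intercept, slope, R^2) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot   # share of variance explained by the line
    return intercept, slope, r_squared

rng = random.Random(3)
xs = [i / 10 for i in range(100)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 0.5) for x in xs]   # synthetic data (assumption)
b0, b1, r2 = fit_line(xs, ys)
print(f"intercept {b0:.3f}, slope {b1:.3f}, R^2 {r2:.4f}")
```

Raising the noise level in the synthetic data leaves the fitted slope roughly unchanged on average but pulls $R^2$ down.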

8. Polynomial regression — fitting curvature

A straight line cannot capture a curved trend. Raising the degree $d$ in $\hat y=\sum_{k=0}^{d}\beta_k x^k$ lets the fit bend. Watch training error fall as the degree rises; a very high degree, however, starts chasing noise, foreshadowing overfitting.

training RMSE
parameters fit
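
Although the fitted curve bends in $x$, the model is still linear in the coefficients, so the same least-squares machinery applies. As a sketch, with $n$ data points $(x_i,y_i)$ and a design matrix $X$ (notation introduced here, not used elsewhere on this page):

$$
X=\begin{pmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^d\\
1 & x_2 & x_2^2 & \cdots & x_2^d\\
\vdots & \vdots & \vdots & & \vdots\\
1 & x_n & x_n^2 & \cdots & x_n^d
\end{pmatrix},
\qquad
\hat\beta=\arg\min_{\beta}\,\lVert y-X\beta\rVert^2=(X^\top X)^{-1}X^\top y
$$

$$
\text{training RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(y_i-\hat y_i\bigr)^2},
\qquad
\text{parameters fit}=d+1
$$

The inverse exists when at least $d+1$ of the $x_i$ are distinct. Each extra degree adds a column to $X$, so the training RMSE can only stay the same or fall.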

9. Overfitting & cross-validation

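Splitting data into train and test sets reveals the bias–variance trade-off. As model complexity rises, training error keeps falling, but test error turns back up once the model starts memorising noise. The sweet spot is the bottom of the test curve. Cross-validation repeats the split over several folds and averages the test errors, so the chosen complexity depends less on one lucky split.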

train RMSE
test RMSE
best test degree
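
A self-contained sketch of the train/test comparison across polynomial degrees. Everything specific here is an assumption for illustration: the synthetic target $\sin(3x)$ with noise, the 30/30 split, the degree range, and the helper names. The fit solves the normal equations with a small Gaussian-elimination routine, which is adequate for low degrees on $[-1,1]$ but not a numerically robust general method.

```python
import math
import random

def solve(a, b):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def polyfit(xs, ys, degree):
    """Least-squares coefficients beta_0..beta_d from the normal equations."""
    p = degree + 1
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(p)] for i in range(p)]
    xty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(p)]
    return solve(xtx, xty)

def rmse(beta, xs, ys):
    """Root-mean-square error of the polynomial with coefficients beta."""
    sq = sum((y - sum(b * x ** k for k, b in enumerate(beta))) ** 2
             for x, y in zip(xs, ys))
    return math.sqrt(sq / len(xs))

rng = random.Random(1)
xs = [rng.uniform(-1.0, 1.0) for _ in range(60)]
ys = [math.sin(3.0 * x) + rng.gauss(0.0, 0.2) for x in xs]   # synthetic data (assumption)
train_x, test_x = xs[:30], xs[30:]
train_y, test_y = ys[:30], ys[30:]

degrees = range(1, 7)
train_err, test_err = [], []
for d in degrees:
    beta = polyfit(train_x, train_y, d)
    train_err.append(rmse(beta, train_x, train_y))
    test_err.append(rmse(beta, test_x, test_y))
    print(f"degree {d}: train RMSE {train_err[-1]:.3f}, test RMSE {test_err[-1]:.3f}")

best_degree = min(degrees, key=lambda d: test_err[d - 1])
print("best test degree:", best_degree)
```

To turn this single split into k-fold cross-validation, one would rotate which block of points is held out and average the test RMSE for each degree.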

10. Logistic regression — the logit curve

For a yes/no outcome we model the probability with the sigmoid $p(x)=\dfrac{1}{1+e^{-(\beta_0+\beta_1 x)}}$. The slope $\beta_1$ controls how sharply the curve switches; the points are 0/1 labels. A cutoff turns probabilities into predictions.

accuracy
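
The coefficients have no closed form, so they are usually found iteratively. The sketch below uses plain gradient ascent on the mean log-likelihood, one of several possible fitting methods and not necessarily the one behind this page. The synthetic data (true $\beta_0=0.5$, $\beta_1=2$), learning rate, step count, and function names are all assumptions.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, lr=0.5, steps=2000):
    """Gradient ascent on the mean log-likelihood of p(x) = sigmoid(b0 + b1*x)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)   # residual on the probability scale
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

def accuracy(xs, ys, b0, b1, cutoff=0.5):
    """Share of points whose thresholded probability matches the 0/1 label."""
    hits = sum((sigmoid(b0 + b1 * x) >= cutoff) == (y == 1) for x, y in zip(xs, ys))
    return hits / len(xs)

rng = random.Random(5)
xs = [rng.uniform(-3.0, 3.0) for _ in range(300)]
ys = [1 if rng.random() < sigmoid(0.5 + 2.0 * x) else 0 for x in xs]  # synthetic labels
b0, b1 = fit_logistic(xs, ys)
acc = accuracy(xs, ys, b0, b1)
print(f"b0={b0:.3f}, b1={b1:.3f}, accuracy at cutoff 0.5: {acc:.3f}")
```

The fitted curve crosses $p=0.5$ at $x=-\beta_0/\beta_1$, so with a cutoff of 0.5 that point is the decision boundary.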

11. Confusion matrix & ROC curve

Sliding the classification cutoff trades false positives against false negatives. The confusion matrix counts TP/FP/FN/TN; the ROC curve plots the true-positive rate against the false-positive rate across all cutoffs. The area under it (AUC) summarises the classifier.

TP · FP
FN · TN
AUC
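
The counting and the curve can be sketched directly from scores and labels. The function names and the four-point data set are illustrative assumptions; the AUC is computed with the trapezoid rule over the ROC points, and both classes are assumed to be present.

```python
def confusion(scores, labels, cutoff):
    """Return (TP, FP, FN, TN), predicting positive when score >= cutoff."""
    tp = fp = fn = tn = 0
    for s, y in zip(scores, labels):
        predicted_positive = s >= cutoff
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def roc_points(scores, labels):
    """(FPR, TPR) pairs as the cutoff sweeps from above the top score downwards."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]   # cutoff above every score: nothing predicted positive
    for cutoff in sorted(set(scores), reverse=True):
        tp, fp, fn, tn = confusion(scores, labels, cutoff)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Tiny illustrative data set (an assumption).
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(confusion(scores, labels, cutoff=0.5))   # (TP, FP, FN, TN)
print(auc(roc_points(scores, labels)))
```

An AUC of 0.5 corresponds to ranking positives and negatives at random, and 1.0 to a perfect ranking.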