RESEARCH

Mathematical Foundations

Deep Dives into Machine Learning & Optimization

01
Mathematics
[ PROBABILITY // V1.0 ]

Probability Theory

The mathematical foundation beneath every algorithm. Measure theory, concentration inequalities, and the probabilistic soul of ML.

25 min read
02
Information Theory
[ MATHEMATICS // V1.0 ]

Information Theory

The mathematics of uncertainty and learning. Explore Shannon entropy, KL divergence, cross-entropy, and the maximum entropy principle.

18 min read
03
Mathematics
[ CALCULUS // V1.0 ]

Matrix Calculus

∂y/∂x for every shape. Differentials, Jacobians, trace tricks, and full derivations of linear regression, PCA, and backprop.

40 min read
04
Supervised Learning
[ REGRESSION // V1.1 ]

Linear Regression

The foundation of predictive modeling. Complete mathematical derivation of Ordinary Least Squares, normal equations, and assumptions.

12 min read
05
Optimization
[ CALCULUS // V1.2 ]

Bias-Variance Tradeoff

The fundamental trade-off between underfitting and overfitting: how model complexity shifts prediction error between bias and variance.

15 min read
06
Optimization
[ OPTIMIZATION // V1.0 ]

Gradient Descent

The workhorse of machine learning optimization. Understand partial derivatives, learning rates, and convergence behavior from first principles.

10 min read
07
Classification
[ CLASSIFICATION // V2.0 ]

Logistic Regression

Moving from continuous to categorical. Explore sigmoid functions, maximum likelihood estimation, and cross-entropy loss gradients.

14 min read
08
Optimization
[ CALCULUS // V1.2 ]

Lagrange Multipliers

Constrained optimization unlocked. A deep dive into the method of Lagrange multipliers, dual problems, and their geometric intuition.

15 min read
09
Optimization
[ CONVEX // V1.0 ]

Convex Optimization

Convex sets, Jensen's inequality, duality, KKT conditions, proximal methods — the rigorous bridge between gradient descent and Lagrange multipliers.

35 min read
10
Linear Algebra
[ MATHEMATICS // V1.0 ]

Singular Value Decomposition

One of the most widely used factorizations in linear algebra. Exists for every matrix, reveals hidden geometry, and underlies PCA and compression.

22 min read
11
Dimensionality Reduction
[ UNSUPERVISED // V1.0 ]

Principal Component Analysis

From variance maximization to SVD equivalence. A definitive guide to understanding PCA's mathematical machinery from the ground up.

20 min read
12
Probabilistic ML
[ BAYESIAN // V1.0 ]

Bayesian Machine Learning

From the philosophical divide between frequentist and Bayesian thinking, through Bayes' theorem, to priors, posteriors, and conjugate families.

25 min read
13
Deep Learning
[ NEURAL NETS // V1.0 ]

Neural Networks

The mathematical foundations of deep learning. Explore forward propagation, backpropagation derivations, and the universal approximation theorem.

25 min read
14
Deep Learning
[ BACKPROP // V1.0 ]

Backpropagation

The algorithm that trains every neural network. Computational graphs, reverse-mode AD, Jacobians, VJPs, matrix calculus, and PyTorch autograd — derived from first principles.

45 min read
15
Deep Learning
[ ATTENTION // V1.0 ]

Transformers

From the failure modes of RNNs, through the mathematical derivation of attention, to multi-head attention and positional encoding.

28 min read
16
Deep Learning
[ OPTIMIZATION // V1.0 ]

DL Optimization

Loss landscapes, saddle points, flat vs sharp minima, SGD noise as implicit regularization, scaling laws, and grokking.

50 min read