My blogs
Explore by topic
Statistics & Inference (Theory)
Modeling
Systems & Infrastructure
Interpretability & Understanding
AI for Science (Domain)
Agents & Scientific Workflows
Notes coming later.
-
Training a Language Model from Scratch (Part 1: Building Blocks)
A future-me refresher on the main pieces behind a small Transformer language model: byte-level BPE, embeddings, RoPE, attention, normalization, loss, optimization, and decoding.
-
Data 145: Evidence and Uncertainty - Topic Map
A compact topic map for my Data 145 Phase 1 and Phase 2 notes.
-
Data 145 Phase 1: From MLE to Neyman-Pearson to Reward Models
My Data 145 Phase 1 notes: a broad roadmap of statistical inference, with connections to modern reward-based AI.
-
Explainable AI (XAI) and Model Interpretability (SHAP, Integrated Gradients, and Sparse Autoencoders)
A future-me-friendly toolbox of interpretability methods: feature attribution (Shapley/SHAP, Integrated Gradients), perturbation tests, and representation-level methods like Sparse Autoencoders.
-
Diffusion Language Models Deep Dive (Part 1: Method)
This post explains the general development of diffusion language models (DLMs), including discrete diffusion and "Simple and Effective Masked Diffusion Language Models."