Blog | Terry (Taehan) Kim

Technical Blogs

Scribbles and Notes

My blogs

Explore by topic

Statistics & Inference (Theory)

Modeling

Systems & Infrastructure

Interpretability & Understanding

Explainable AI (XAI) and Model Interpretability (SHAP, Integrated Gradients, and Sparse Autoencoders)

AI for Science (Domain)

Old Blog Notes: AI Drug Discovery

Agents & Scientific Workflows

Notes coming later.

Training a Language Model from Scratch (Part 1: Building Blocks)

A future-me refresher on the main pieces behind a small Transformer language model: byte-level BPE, embeddings, RoPE, attention, normalization, loss, optimization, and decoding.

37 min read · May 06, 2026

2026 · ml llm transformers notes from-scratch modeling-generative · technical-blogs

Data 145: Evidence and Uncertainty - Topic Map

A compact topic map for my Data 145 Phase 1 and Phase 2 notes.

2 min read · April 23, 2026

2026 · statistics data145 notes statistics-inference · technical-blogs

Data 145 Phase 1: From MLE to Neyman-Pearson to Reward Models

My Data 145 Phase 1 notes: a broad roadmap of statistical inference, with connections to modern reward-based AI.

66 min read · March 08, 2026

2026 · statistics ml notes statistics-inference · technical-blogs

Explainable AI (XAI) and Model Interpretability (SHAP, Integrated Gradients, and Sparse Autoencoders)

A future-me-friendly toolbox of interpretability methods: feature attribution (Shapley/SHAP, Integrated Gradients), perturbation tests, and representation-level methods like Sparse Autoencoders.

9 min read · February 15, 2026

2026 · ml notes interpretability · technical-blogs

Diffusion Language Models Deep Dive (Part 1: Method)

This post explains the general development of diffusion language models (DLMs), including discrete diffusion and Simple and Effective Masked Diffusion Language Models.

6 min read · January 10, 2026

2026 · ml notes modeling-generative · technical-blogs
