-
How Text-to-Speech Models Work
From raw waveforms to voice cloning: understanding Kokoro, CSM, and Pocket TTS from first principles. Neural audio codecs, vector quantization, flow matching, and why a 100M-parameter model can clone your voice from 5 seconds of audio.
Waveform → Codec → Tokens / Latents → Language Model → Speech
-
Building Ocean Sparkles from First Principles
Raymarching, halftone post-processing, and procedural sparkle generation in a single HTML file. From the mathematics of noise through the physics of Fresnel reflection to risograph-style rendering with blue noise dithering.
Noise & fBm → Raymarching → Kawase Blur → Halftone → Sparkles