Hacker News

Accelerating Gemma 4: faster inference with multi-token prediction drafters

542 points | blog.google | amrrs | 18 hrs