Cutting inference cold starts by 40x with LP, FUSE, C/R, and CUDA-checkpoint

74 points | modal.com | charles_irl | 10 hrs