
Apriel-H1-15b-Thinker-SFT uses incremental distillation from Apriel-Nemotron-15B-Thinker, selectively replacing less critical attention layers with linear Mamba blocks to reduce computational complexity while preserving reasoning quality.
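To make the idea of selective, incremental replacement concrete, here is a minimal sketch of how such a schedule could be planned. It is not the authors' procedure: the importance scores, the `plan_replacements` helper, and the `budget` and `step` parameters are all hypothetical, and the distillation pass itself is only marked by a comment.

```python
from typing import List


def plan_replacements(importance: List[float], budget: int, step: int) -> List[List[str]]:
    """Return the layer layout after each incremental stage.

    importance[i] is a hypothetical score for attention layer i
    (higher means more critical to keep as attention).
    budget is the total number of layers to convert; step is how many
    are converted per stage.
    """
    layout = ["attention"] * len(importance)
    # Least critical layers first; ties are broken by layer index.
    order = sorted(range(len(importance)), key=lambda i: (importance[i], i))
    stages = []
    for start in range(0, min(budget, len(order)), step):
        for i in order[start:min(start + step, budget)]:
            layout[i] = "mamba"
        # Assumption: a distillation pass against the teacher would run here
        # before the next group of layers is swapped.
        stages.append(list(layout))
    return stages


# Made-up scores for a six-layer toy model.
scores = [0.9, 0.2, 0.7, 0.1, 0.5, 0.3]
stages = plan_replacements(scores, budget=4, step=2)
for n, layout in enumerate(stages, 1):
    print(n, layout)
```

Swapping a few layers per stage, rather than all at once, leaves room for the student to recover quality through distillation before more attention is removed.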



