Tag

kernels

1 post tagged kernels.

Fused Linear Cross-Entropy: Why fusing the LM head projection with cross-entropy is the single biggest memory win for training LLMs at long context.