Rust+OpenCL+AVX2 implementation of LLaMA inference code
GitHub - Noeda/rllama: Rust+OpenCL+AVX2 implementation of LLaMA inference code
I simplified a bunch of warp routes with macros and attention to detail.