Exploring AI, HPC, robotics, and hydraulics through hobby projects.
llama distributed
I’ve been thinking about how to run very large language models at home on limited hardware while keeping the setup scalable. The goal is to see whether existing open-source inference software, combined with RDMA (Remote Direct Memory Access), can make this feasible.
To test this, I started extending the llama project to support RDMA… If it works, this approach could make running large LLMs locally with llama possible…
Read more ⟶
First
First!…
Read more ⟶