Exploring AI, HPC, robotics, and hydraulics through hobby projects.
llama distributed
I’ve been thinking about how to run very large language models at home on limited hardware while keeping the setup scalable. The goal is to see whether existing open-source inference software, combined with RDMA (Remote Direct Memory Access), can make this feasible.
To test this, I started extending the llama project to support RDMA… If it works, this approach could make running large LLMs locally with llama possible…
Read more ⟶
First
First!…
Read more ⟶