coding-assistant - easy way to run local LLM with llama.cpp

llama.cpp is one of those projects that makes you appreciate open source — a dependency-free framework for running quantized SLMs locally on your laptop.

I'm bullish on where local models are heading. Teacher/Student training is showing real promise, and while tokens-per-second still lags behind cloud APIs, we're building on the shoulders of open source contributors who came before us. That's how the internet was built too.

To make experimenting easier, I put together a lightweight shell wrapper that connects local SLMs to any CLI tool that supports local LLMs. Tested it with Opencode — works well, and runs faster than Ollama on Docker.

https://github.com/hamr0/co...github.com

coding-assistant - easy…

coding-assistant - easy way to run local LLM with llama.cpp

// comments · sort: