Wondering about services to test on either a 16gb ram “AI Capable” arm64 board or on a laptop with modern rtx. Only looking for open source options, but curious to hear what people say. Cheers!
Wondering about services to test on either a 16gb ram “AI Capable” arm64 board or on a laptop with modern rtx. Only looking for open source options, but curious to hear what people say. Cheers!
Well they are fully closed source except for the open source project they are a wrapper on. The open source part is llama.cpp
Fair enough, but it’s damn handy and simple to use. And I don’t know how to do speculative decoding with ollama, which massively speeds up the models for me.