bia@lemmy.ml 11 points 1 year ago

Do you have a degree in theoretical physics, or do you theoretically have a degree? ;)

bia@lemmy.ml 5 points 1 year ago* (last edited 1 year ago)

Hmm. I'd actually argue it's a good solution in some cases. We run multiple services where load is intermittent, services are short-lived, or the code is complex and hard to refactor. In those cases, just adding hardware resources can be much cheaper than optimizing the code.

1 point, submitted 1 year ago by bia@lemmy.ml to c/localllama@sh.itjust.works

I’ve been using llama.cpp, gpt-llama, and chatbot-ui for a while now, and I’m very happy with it. However, I’m now looking into a more stable setup that runs entirely on the GPU. Is llama.cpp still a good candidate for that?
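For context, a GPU-only llama.cpp setup is usually done by building with CUDA support and offloading all model layers to the GPU. A minimal sketch (the model path is a placeholder, and flag names reflect the llama.cpp CLI of that era):

```shell
# Build llama.cpp with cuBLAS/CUDA support (assumes CUDA toolkit is installed)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make LLAMA_CUBLAS=1

# Run with all layers offloaded to the GPU:
# -ngl (--n-gpu-layers) controls how many layers go to VRAM;
# a large value offloads the whole model if it fits.
./main -m ./models/model.gguf -ngl 99 -p "Hello"
```

Whether the whole model fits in VRAM depends on model size and quantization; if it doesn't, a lower `-ngl` value splits layers between GPU and CPU.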

bia

joined 1 year ago