New technique to run 70B LLM Inference on a single 4GB GPU
(ai.gopubby.com)
Yeah, I'm not sure how they pull that off, but if you want to run a model in-house, as many people would prefer, this would let you run much more capable models on consumer-grade hardware and save money compared to buying more expensive kit. Many people already have decent hardware, and this extends what they can run before they need to fork out for an upgrade.
I know, I'm guessing.
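For what it's worth, the usual trick behind claims like this is layer-by-layer (streamed) inference: a transformer executes its layers sequentially, so you only ever need one layer's weights on the GPU at a time. You load a layer from disk, run it, free it, and move on. I haven't verified that's exactly what the article does, but here's a minimal PyTorch-style sketch of the general idea; the weight paths, layer count, and `load_layer` helper are all hypothetical:

```python
import torch

# Sketch of layer-by-layer inference: keep only one transformer
# layer's weights on the GPU at a time, streaming them from disk.
# All names and paths here are hypothetical, not from the article.

NUM_LAYERS = 80   # 70B-class models have roughly 80 transformer layers
DEVICE = "cuda"

def load_layer(i: int) -> torch.nn.Module:
    # Hypothetical helper: deserialize layer i's weights from disk
    # straight onto the GPU.
    return torch.load(f"weights/layer_{i:02d}.pt", map_location=DEVICE)

def forward(hidden: torch.Tensor) -> torch.Tensor:
    # Run the layers sequentially; at any moment only one layer's
    # weights occupy GPU memory (on the order of 1-2 GB for a 70B
    # model split ~80 ways, before any quantization).
    for i in range(NUM_LAYERS):
        layer = load_layer(i)
        with torch.no_grad():
            hidden = layer(hidden)
        del layer                  # drop this layer's weights...
        torch.cuda.empty_cache()   # ...and return the memory to the pool
    return hidden
```

The trade-off is speed: every token now pays for reading the whole model off disk, so this makes big models possible on small GPUs rather than fast.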