Interesting. You're using a model without any special finetuning for this purpose and got it to work just by giving it a prompt; I didn't think that was possible. How would you piece together something like this? Can I just ask an AI to give me a prompt that I can then use on it or on another AI?
How much VRAM does your GPU have?
I had never heard of KoboldAI. I was going to self-host Ollama and try it, but I'll take a look at Kobold. I had never heard about controls for world-building or dialogue triggers either; there's a lot to learn.
Will more VRAM solve the problem of not retaining context? Can I throw 48GB of VRAM at an 8B model to help it remember stuff?
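(From what I've read so far, context length is a per-model setting rather than something extra VRAM fixes on its own; Ollama, for example, documents a `num_ctx` parameter in its Modelfile format. A sketch of what I mean, assuming a model tag like `llama3:8b`, which is just a placeholder:)

```
# Hypothetical Ollama Modelfile: raise the context window for an 8B model.
# num_ctx is Ollama's documented context-length parameter; a larger value
# uses more VRAM for the KV cache, but VRAM alone doesn't extend what the
# model can "remember" past that window.
FROM llama3:8b        # model tag is an assumption; substitute your own
PARAMETER num_ctx 8192
```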
Yes, I'm looking at image generation (Stable Diffusion) too. Thanks!