Early Santa: what is missing in selfhosted
That's because LLMs don't do that.
The companies that offer those services basically do some tricks behind the curtain.
Like, let's say you want an LLM to learn your corporate docs. LLMs can't do that, because they need millions of texts from across the internet just to learn to speak English. You can't feed in your 1,000 docs and 10,000 emails, point to them, and say "Forget the billion documents you ingested and pay attention to these... but also retain the ability to speak English."
What they actually implement is a standard text search engine that returns matching paragraphs from the relevant documents, then prompts the LLM with something like "This paragraph may contain an answer to user question X. If it does, please paraphrase it."
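To make that concrete, here is a rough sketch of the search-then-prompt idea, not what any particular vendor runs. The function names, the sample documents, and the word-overlap scoring are all made up for illustration; a real setup would use a proper search engine, and the resulting prompt would be handed to whatever LLM you run.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_paragraph(question, documents):
    """Return the paragraph sharing the most words with the question.

    Hypothetical stand-in for a real text search engine.
    """
    question_words = tokenize(question)
    best, best_score = None, 0
    for doc in documents:
        for para in doc.split("\n\n"):
            score = len(question_words & tokenize(para))
            if score > best_score:
                best, best_score = para.strip(), score
    return best


def build_prompt(question, paragraph):
    """Wrap the retrieved paragraph in the kind of prompt described above."""
    return (
        "This paragraph may contain an answer to the user question below. "
        "If it does, please paraphrase it.\n\n"
        f"Paragraph: {paragraph}\n\n"
        f"Question: {question}"
    )


# Made-up corporate docs, each with paragraphs separated by blank lines.
docs = [
    "Expense reports are due on the 5th of each month.\n\n"
    "Submit them through the finance portal.",
    "The VPN requires a hardware token.\n\n"
    "Request a token from the IT help desk.",
]

question = "Where do I request a token?"
paragraph = best_paragraph(question, docs)
prompt = build_prompt(question, paragraph)
print(prompt)  # this string is what would be sent to the LLM
```

The LLM never learns the documents here. It only sees the one paragraph the search step picked, which is why the quality of the search matters as much as the model.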
Yes, that’s exactly what I want it to do.
Most of the 60-70 emails I reply to each day get almost exactly the same answer.
I want it to read an email and then, using paragraphs or sentences from my previous emails, automatically generate a response.
There are already companies out there generating what they term small language models - basically hybrids of, say, GPT-3.5 plus a large volume of corporate data - but they are all cloud-based
Others offer plugins to help answer your emails
I’d like a combination of the same to run locally
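For the email case, the local version of this could look something like the sketch below. It assumes you have your past mail as (received email, your reply) pairs; the sample history, the function names, and the bag-of-words cosine similarity are illustrative choices only. It finds the most similar email you have already answered and builds a prompt around your old reply, which you would then pass to a locally hosted model.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def closest_past_reply(incoming, history):
    """Return the reply that was sent to the most similar past email."""
    incoming_vec = vectorize(incoming)
    _, reply = max(
        history, key=lambda pair: cosine(incoming_vec, vectorize(pair[0]))
    )
    return reply


def draft_prompt(incoming, past_reply):
    """Build a prompt asking the model to adapt an earlier reply."""
    return (
        "Write a reply to the new email below, reusing the wording of "
        "my earlier reply wherever it still applies.\n\n"
        f"New email: {incoming}\n\n"
        f"My earlier reply: {past_reply}"
    )


# Made-up history of (received email, reply I sent) pairs.
history = [
    ("Can you send me the invoice for last month?",
     "Hi, the invoice is attached. Let me know if anything is missing."),
    ("What are your opening hours over the holidays?",
     "Hi, we are closed from 24 to 26 December and open as usual otherwise."),
]

incoming = "Could you please send the invoice for October?"
past_reply = closest_past_reply(incoming, history)
print(draft_prompt(incoming, past_reply))
```

If most replies really are near-identical, the retrieved reply may be usable almost as-is, with the model only adjusting details.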
I think you will find most of these are not small language models but the thing I described above: an LLM like GPT plus a search engine. Even small language models require millions of texts and only perform very specialised tasks.