this post was submitted on 15 Mar 2025
6 points (100.0% liked)

LocalLLaMA


Community to discuss LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

founded 2 years ago

I don't care much about mathematical tasks, and code intelligence is only a minor preference; what I most want is overall comprehension and intelligence (for RAG and large-context handling). Anyway, I'm looking for an up-to-date benchmark that covers a wide variety of models.

[–] Smokeydope@lemmy.world 3 points 4 days ago* (last edited 4 days ago) (3 children)

The average of all the different benchmarks can be thought of as a kind of 'average intelligence', though in reality it's more of a gradient and vibe type thing.
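As a rough sketch of that "average intelligence" idea: take the unweighted mean across a model's benchmark scores. The model names and scores below are made up for illustration; real leaderboard numbers vary.

```python
# Hypothetical benchmark scores (0-100) for two imaginary models.
scores = {
    "model-a": {"MMLU": 68.2, "GSM8K": 54.1, "HumanEval": 40.9},
    "model-b": {"MMLU": 71.5, "GSM8K": 49.8, "HumanEval": 45.0},
}

def average_score(benchmarks: dict) -> float:
    """Unweighted mean across benchmarks -- a crude 'average intelligence' proxy."""
    return sum(benchmarks.values()) / len(benchmarks)

for name, b in scores.items():
    print(f"{name}: {average_score(b):.1f}")
```

An unweighted mean is the crudest possible aggregate; it treats a math benchmark and a coding benchmark as equally relevant, which they usually aren't for a given use case.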

Many models are "benchmaxxed": trained on the exact kinds of questions the tests ask, which often makes benchmark results unrelated to real-world performance. Use them as general indicators, but don't take them too seriously.

All model families differ in ways you only really understand by spending time with them. Don't forget to set the right chat template and appropriate sampler settings per model. The Open LLM Leaderboard is a good place to start.
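To show why the chat template matters, here's a minimal sketch that formats messages in ChatML style (used by Qwen and some other families). This is a hand-rolled illustration, not a real tokenizer's template; Llama and Mistral use different delimiters, and a mismatched template noticeably degrades output.

```python
def chatml_prompt(messages: list) -> str:
    """Format a message list in ChatML style.
    Other model families use different templates -- always match the model."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    parts.append("<|im_start|>assistant\n")  # cue the model to start its reply
    return "\n".join(parts)

print(chatml_prompt([{"role": "user", "content": "Hi"}]))
```

Most runtimes (llama.cpp, kobold.cpp, Transformers) can apply the model's own template for you; the point is just that one exists and has to match.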

[–] thickertoofan@lemm.ee 3 points 3 days ago (1 children)
[–] Smokeydope@lemmy.world 2 points 3 days ago* (last edited 3 days ago) (1 children)

Cool, Page Assist looks neat, I'll have to check it out sometime. My LLM engine is kobold.cpp, and I just use OpenWebUI in the browser to connect.
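For anyone curious how that connection works: kobold.cpp exposes an OpenAI-compatible API, which is what frontends like OpenWebUI talk to. The sketch below just builds the JSON payload; the port (5001 is kobold.cpp's default) and endpoint path are assumptions, so check your own launch settings.

```python
import json

# Assumed kobold.cpp OpenAI-compatible endpoint; verify against your setup.
URL = "http://localhost:5001/v1/chat/completions"

payload = {
    "model": "kobold",  # assumption: the backend serves one loaded model regardless
    "messages": [{"role": "user", "content": "Summarize this article: ..."}],
    "max_tokens": 512,
    "temperature": 0.7,
}

body = json.dumps(payload)  # POST this as application/json to URL
print(body[:60])
```

Any frontend that speaks the OpenAI chat-completions format can point at the same URL, which is why mixing and matching engines and UIs works.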

Sorry, I don't really have good suggestions for you beyond trying some of the more popular 1-4B models at a very high quant, if not full fp8, and seeing which works best for your use case.

Llama 4b, Mistral 4b, Phi-3-mini, tinyllm 1.5b, Qwen2-1.5B, etc. I assume you want a model with a large context size and good comprehension skills to summarize YouTube transcripts and webpage articles? At least, I think that's the purpose the add-on you mentioned suggested.

So look for models with those things over ones that try to specialize in a little bit of domain knowledge.
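Even with a large-context model, long transcripts often need chunking for RAG. Here's a minimal character-based splitter with overlap; it's a simplification (token-based splitting is more precise) and the chunk sizes are arbitrary placeholders.

```python
def chunk_text(text: str, chunk_chars: int = 2000, overlap: int = 200) -> list:
    """Split a long transcript into overlapping chunks so each fits
    a small model's context window. Character counts approximate tokens."""
    chunks = []
    step = chunk_chars - overlap  # overlap preserves continuity across boundaries
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_chars])
        if start + chunk_chars >= len(text):
            break
    return chunks

transcript = "word " * 1000  # 5000 chars of dummy transcript
print(len(chunk_text(transcript)))
```

The overlap means a sentence cut at a chunk boundary still appears whole in the next chunk, which helps retrieval quality more than it costs in storage.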

[–] thickertoofan@lemm.ee 2 points 1 day ago

I checked out most of them from the list, but 1B models are generally unusable for RAG.
