I'm running ollama with llama3.2:1b, smollm, all-minilm, moondream, and more. I've integrated it with coder/code-server, VS Code, VSCodium, Page Assist, and the CLI, and I've also created a Discord AI user.

I'm an infrastructure and automation guy, not so much a developer, although my field is technically DevOps.

Now I hear that some LLMs have "tools." How do I use them? How do I find the list of tools a model supports?

I don't think I can simply prompt "Hi llama3.2, list your tools." Is this part of prompt engineering?

What, do you take a model and retrain it or something?

Anybody able to point me in the right direction?

Comments

[–] sga
It is close to being a prompt. You have to look for models that were trained with tool calling (if I'm not wrong, Mistral 7B v0.3 is one such model; many big models have it, probably Phi too, and Gemma 3 12B/27B might also have it). This is something you can check in their Hugging Face README.
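If you'd rather check programmatically than read the README, one rough heuristic is to look at the model's chat template in tokenizer_config.json and see whether it handles tools. A minimal sketch, assuming the huggingface_hub package; the repo name is just an example, some repos (including Mistral's) are gated and need a Hugging Face login, and the model card remains the more reliable source:

```python
# Rough heuristic: a model whose chat template mentions tools was probably
# trained (or at least templated) for tool calling. Repo name is an example;
# gated repos need `huggingface-cli login` first.
import json
from huggingface_hub import hf_hub_download

path = hf_hub_download("mistralai/Mistral-7B-Instruct-v0.3", "tokenizer_config.json")
with open(path) as f:
    cfg = json.load(f)

template = cfg.get("chat_template") or ""
if not isinstance(template, str):  # some repos store a list of named templates
    template = json.dumps(template)

print("chat template mentions tools:", "tool" in template.lower())
```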

If it is in their training, then they can output a tool-calling token (I don't remember what it is actually called, but let's say it is [call-tool]). When you ask the model to do some calculation, it decides whether calling a tool is appropriate (you wouldn't call a math tool for "hello"). Many models on Hugging Face's chat interface have selectable tools that you can enable or disable, and math is one of them.

Once the model emits a [call-tool] token, it prepares some structured data (most models use JSON). Say the prompt was "find the integral of sin over 3/4 of its period," and the model is smart enough both to call a tool and to prepare a reasonable payload for it. Math is usually converted to either LaTeX or Python first, depending on what the interface was trained for. If this one is trained for LaTeX, it would generate something like:

[call-tool] { "type_of_tool": "math", "prompt": "$\int_0^{3\pi/2} \sin(x)\, dx$" }

The interface (where you are running the model) is expected to take over from here. Depending on the interface, generation is either frozen and then resumed with the tool's result spliced into the current chain, or the model is allowed to keep yapping and the result is pasted in a little afterwards before output continues. The interface parses the JSON and calls the relevant tool, which processes the input and produces its own output (most likely also JSON), and that output is fed back to the model as input.
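To make the hand-off concrete, here is a minimal sketch of the host-side part: the interface has spotted the [call-tool] marker and now has to parse the JSON payload, dispatch it to a tool, and produce a result to splice back into the conversation. The tool name "math" and the payload fields mirror the example above and are purely illustrative, not any particular model's real format:

```python
# Minimal sketch of what the interface does after the model emits [call-tool]:
# parse the JSON payload, run the matching tool, and return a JSON result
# that gets spliced back into the model's context. Payload shape is illustrative.
import json
import math

def run_math_tool(prompt: str) -> str:
    # Stand-in for a real math backend (SymPy, a CAS, a sandboxed interpreter...).
    # Here we only handle the integral from the example above:
    # the integral of sin(x) from 0 to 3*pi/2 is -cos(3*pi/2) + cos(0) = 1.
    if "sin" in prompt:
        return str(-math.cos(3 * math.pi / 2) + math.cos(0))
    return "unsupported expression"

TOOLS = {"math": run_math_tool}

def handle_tool_call(payload: str) -> str:
    """Dispatch the model's tool-call payload and return a result message."""
    call = json.loads(payload)
    result = TOOLS[call["type_of_tool"]](call["prompt"])
    return json.dumps({"tool": call["type_of_tool"], "result": result})

payload = json.dumps({
    "type_of_tool": "math",
    "prompt": "integral of sin(x) from 0 to 3*pi/2",
})
print(handle_tool_call(payload))  # prints roughly {"tool": "math", "result": "1.0..."}
```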

If you want to implement it yourself, you can take a look at the llama.cpp implementation, or just use it through llama.cpp: https://github.com/ggml-org/llama.cpp/blob/master/docs/function-calling.md

It would most likely be available in ollama too.
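For ollama specifically, the chat API accepts an OpenAI-style tools list, and the Python client exposes it. Here is a hedged sketch of the round trip: the model name, tool schema, and exact response fields are assumptions (llama3.2:1b may be too small to call tools reliably, so a bigger tool-capable model is used, and the ollama docs are the authority on the response shape):

```python
# Sketch of a tool-calling round trip with the ollama Python client
# (pip install ollama). Model name and response field access are assumptions;
# check the ollama docs for your version.
import ollama

def get_current_time(timezone: str) -> str:
    # Hypothetical tool implementation; replace with a real lookup.
    return f"12:00 in {timezone}"

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_time",
        "description": "Get the current time in a given timezone",
        "parameters": {
            "type": "object",
            "properties": {"timezone": {"type": "string"}},
            "required": ["timezone"],
        },
    },
}]

messages = [{"role": "user", "content": "What time is it in Berlin?"}]
response = ollama.chat(model="llama3.1", messages=messages, tools=tools)

# If the model decided to call the tool, run it and hand the result back
# so the model can produce its final answer.
for call in response["message"].get("tool_calls") or []:
    result = get_current_time(**call["function"]["arguments"])
    messages.append(response["message"])
    messages.append({"role": "tool", "content": result})
    final = ollama.chat(model="llama3.1", messages=messages)
    print(final["message"]["content"])
```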

Most agentic model stuff is basically tool calling on steroids.