I'm running ollama with llama3.2:1b, smollm, all-minilm, moondream, and more. I've integrated it with coder/code-server, VS Code, VSCodium, Page Assist, and the CLI, and I've also created a Discord AI user.

I'm an infrastructure and automation guy, not so much a developer, although my field is technically DevOps.

Now I hear that some LLMs have "tools." How do I use them? How do I find the list of tools a model supports?

I don't think I can simply prompt "Hi llama3.2, list your tools." Is this part of prompt engineering?

What, do you take a model and retrain it or something?

Anybody able to point me in the right direction?

Comments

[–] sga
It is close to being a prompt. You have to look for models that were trained with tool calling (if I'm not wrong, Mistral 7B v0.3 is one such model; many big models have it, probably Phi too, and Gemma 3 12B/27B might also have it). This is something you can check in their Hugging Face README.
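If you'd rather check programmatically than read the README, one rough heuristic is to look at the model's chat template in tokenizer_config.json and see whether it handles tools. A minimal sketch, assuming the huggingface_hub package; the repo name is just an example, some repos (including Mistral's) are gated and need a Hugging Face login, and the model card remains the more reliable source:

```python
# Rough heuristic: a model whose chat template mentions tools was probably
# trained (or at least templated) for tool calling. Repo name is an example;
# gated repos need `huggingface-cli login` first.
import json
from huggingface_hub import hf_hub_download

path = hf_hub_download("mistralai/Mistral-7B-Instruct-v0.3", "tokenizer_config.json")
with open(path) as f:
    cfg = json.load(f)

template = cfg.get("chat_template") or ""
if not isinstance(template, str):  # some repos store a list of named templates
    template = json.dumps(template)

print("chat template mentions tools:", "tool" in template.lower())
```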

If it is in their training, then they can output a tool-calling token (I don't remember what it is actually called, but let's say it is [call-tool]). When you ask the model to do some calculation, it decides whether calling a tool is appropriate (you wouldn't call a math tool for "hello"). Many models on Hugging Face's chat interface have selectable tools that you can enable or disable, and math is one of them.

Once the model emits a [call-tool] token, it prepares some structured data (most models use JSON). Say the prompt was "find the integral of sin over 3/4 of its period," and the model is smart enough both to call a tool and to prepare a reasonable payload for it. Math is usually converted to either LaTeX or Python first, depending on what the interface was trained for. If this one is trained for LaTeX, it would generate something like:

[call-tool] { "type_of_tool": "math", "prompt": "$\int_0^{3\pi/2} \sin(x)\, dx$" }

The interface (where you are running the model) is expected to take over from here. Depending on the interface, generation is either frozen and then resumed with the tool's result spliced into the current chain, or the model is allowed to keep yapping and the result is pasted in a little afterwards before output continues. The interface parses the JSON and calls the relevant tool, which processes the input and produces its own output (most likely also JSON), and that output is fed back to the model as input.
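To make the hand-off concrete, here is a minimal sketch of the host-side part: the interface has spotted the [call-tool] marker and now has to parse the JSON payload, dispatch it to a tool, and produce a result to splice back into the conversation. The tool name "math" and the payload fields mirror the example above and are purely illustrative, not any particular model's real format:

```python
# Minimal sketch of what the interface does after the model emits [call-tool]:
# parse the JSON payload, run the matching tool, and return a JSON result
# that gets spliced back into the model's context. Payload shape is illustrative.
import json
import math

def run_math_tool(prompt: str) -> str:
    # Stand-in for a real math backend (SymPy, a CAS, a sandboxed interpreter...).
    # Here we only handle the integral from the example above:
    # the integral of sin(x) from 0 to 3*pi/2 is -cos(3*pi/2) + cos(0) = 1.
    if "sin" in prompt:
        return str(-math.cos(3 * math.pi / 2) + math.cos(0))
    return "unsupported expression"

TOOLS = {"math": run_math_tool}

def handle_tool_call(payload: str) -> str:
    """Dispatch the model's tool-call payload and return a result message."""
    call = json.loads(payload)
    result = TOOLS[call["type_of_tool"]](call["prompt"])
    return json.dumps({"tool": call["type_of_tool"], "result": result})

payload = json.dumps({
    "type_of_tool": "math",
    "prompt": "integral of sin(x) from 0 to 3*pi/2",
})
print(handle_tool_call(payload))  # prints roughly {"tool": "math", "result": "1.0..."}
```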

If you want to implement it yourself, you can take a look at the llama.cpp implementation, or just use it through llama.cpp: https://github.com/ggml-org/llama.cpp/blob/master/docs/function-calling.md

It would most likely be available in ollama too.
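For ollama specifically, the chat API accepts an OpenAI-style tools list, and the Python client exposes it. Here is a hedged sketch of the round trip: the model name, tool schema, and exact response fields are assumptions (llama3.2:1b may be too small to call tools reliably, so a bigger tool-capable model is used, and the ollama docs are the authority on the response shape):

```python
# Sketch of a tool-calling round trip with the ollama Python client
# (pip install ollama). Model name and response field access are assumptions;
# check the ollama docs for your version.
import ollama

def get_current_time(timezone: str) -> str:
    # Hypothetical tool implementation; replace with a real lookup.
    return f"12:00 in {timezone}"

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_time",
        "description": "Get the current time in a given timezone",
        "parameters": {
            "type": "object",
            "properties": {"timezone": {"type": "string"}},
            "required": ["timezone"],
        },
    },
}]

messages = [{"role": "user", "content": "What time is it in Berlin?"}]
response = ollama.chat(model="llama3.1", messages=messages, tools=tools)

# If the model decided to call the tool, run it and hand the result back
# so the model can produce its final answer.
for call in response["message"].get("tool_calls") or []:
    result = get_current_time(**call["function"]["arguments"])
    messages.append(response["message"])
    messages.append({"role": "tool", "content": result})
    final = ollama.chat(model="llama3.1", messages=messages)
    print(final["message"]["content"])
```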

Most agentic model stuff is basically tool calling on steroids.