The OpenWebUI client retrieves available tools from the MCP server.
In this particular case, the MCP server is built on FastAPI and responds with the OpenAPI (Swagger) JSON that OpenWebUI expects:
{
  "openapi": "3.1.0",
  "info": {
    "title": "FastAPI",
    "version": "0.1.0"
  },
  "paths": {
    "/get_weather": {
      "get": {
        "summary": "Get Weather",
        "parameters": [
          {
            "name": "location",
            "in": "query",
            "required": true,
            "schema": {
              "type": "string",
              "description": "Location to retrieve weather for"
            }
          }
        ]
      }
    }
  }
}
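For reference, a minimal FastAPI server that produces roughly this schema might look like the sketch below. The endpoint and parameter names come straight from the spec above; the handler body is a hypothetical stand-in, since the real server's internals aren't shown. FastAPI serves the generated spec automatically at /openapi.json.

from fastapi import FastAPI, Query

app = FastAPI()

@app.get("/get_weather")
def get_weather(
    location: str = Query(..., description="Location to retrieve weather for"),
):
    # Stand-in for the real weather lookup (see the "blackbox shenanigans"
    # step later on); only the response shape matters here.
    return {"temperature": 102.4, "location": location, "unit": "fahrenheit"}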
OpenWebUI constructs an OpenAI-format payload containing the user prompt and available tools.
User prompt: What's the weather in Austin, TX?
{
  "model": "qwen-2.5:32b",
  "messages": [
    {
      "role": "user",
      "content": "What's the weather in Austin, TX?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "Location to retrieve weather for"
            }
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}
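You can reproduce this hop yourself with a plain HTTP client, since Ollama exposes an OpenAI-compatible API under /v1 (on port 11434 by default). A minimal sketch, with the payload mirroring the JSON above:

import requests

payload = {
    "model": "qwen-2.5:32b",
    "messages": [{"role": "user", "content": "What's the weather in Austin, TX?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "Location to retrieve weather for",
                    }
                },
                "required": ["location"],
            },
        },
    }],
    "tool_choice": "auto",
}

# Ollama's OpenAI-compatible endpoint; assumes a local default install.
resp = requests.post("http://localhost:11434/v1/chat/completions", json=payload)
print(resp.json()["choices"][0]["message"])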
Ollama receives the request on its OpenAI-compatible endpoint and formats the message using the chat template defined for the model.
In this case, I'm using a Qwen 2.5 32B Instruct model. Its chat template, defined in the model's tokenizer_config.json file, is a Jinja template; here's a representative excerpt:
{%- if tool_call.function is defined %}
{%- set tool_call = tool_call.function %}
{%- endif %}
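If you want to inspect the fully rendered prompt without going through Ollama, the Hugging Face tokenizer for the model ships the same Jinja template and can apply it directly. A quick sketch (Ollama does its own template handling internally; this is just a convenient way to see the rendering):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-32B-Instruct")

messages = [{"role": "user", "content": "What's the weather in Austin, TX?"}]
tools = [{  # same OpenAI-style tool definition as in the payload above
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "Location to retrieve weather for",
                }
            },
            "required": ["location"],
        },
    },
}]

# Renders the Jinja template to the raw prompt string shown below.
print(tokenizer.apply_chat_template(
    messages, tools=tools, tokenize=False, add_generation_prompt=True
))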
<|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.

# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{"name": "get_weather", "description": "Get the current weather for a location", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "Location to retrieve weather for"}}, "required": ["location"]}}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call><|im_end|>
<|im_start|>user
What's the weather in Austin, TX?<|im_end|>
<|im_start|>assistant
The LLM analyzes the user request and determines that a tool call is required. It outputs the desired tool call within the XML tags as specified by the chat template:
I'll help you get the current weather in Austin, TX.

<tool_call>
{"name": "get_weather", "arguments": {"location": "Austin, TX"}}
</tool_call>
Ollama parses the tool call block from the LLM output and formats the response back into OpenAI-compatible format with the tool calls array populated:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "qwen-2.5:32b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I'll help you get the current weather in Austin, TX.",
        "tool_calls": [
          {
            "id": "get_weather_1",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\":\"Austin, TX\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
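Conceptually, that parsing step is just pulling JSON out of the <tool_call> tags. Here's a simplified illustration of the idea (not Ollama's actual implementation, which also handles streaming and other template formats):

import json
import re

raw = ("I'll help you get the current weather in Austin, TX. "
       '<tool_call> {"name": "get_weather", "arguments": '
       '{"location": "Austin, TX"}} </tool_call>')

# Pull each <tool_call> block out as a parsed JSON object.
tool_calls = [
    json.loads(m.group(1))
    for m in re.finditer(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", raw, re.DOTALL)
]

# Whatever is left over becomes the assistant's "content" field.
content = re.sub(r"<tool_call>.*?</tool_call>", "", raw, flags=re.DOTALL).strip()

print(content)     # I'll help you get the current weather in Austin, TX.
print(tool_calls)  # [{'name': 'get_weather', 'arguments': {'location': 'Austin, TX'}}]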
OpenWebUI receives Ollama's response, detects the tool call request, and makes an HTTP request to the MCP server using the endpoint and parameters matching the Swagger schema in step 0:
GET /get_weather?location=Austin%2C%20TX
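In code, that dispatch step amounts to mapping the function name onto the path from the schema and passing the arguments through. A simplified sketch (the server address is a placeholder, and OpenWebUI's real implementation is more general):

import requests

tool_call = {"name": "get_weather", "arguments": {"location": "Austin, TX"}}

# The path comes from the function name; "in": "query" in the spec means
# the arguments travel as query parameters.
resp = requests.get(
    f"http://localhost:8000/{tool_call['name']}",
    params=tool_call["arguments"],
)
print(resp.json())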
The MCP server performs whatever black-box shenanigans are needed to determine the weather in Austin, TX and returns a response matching the Swagger schema from step 0:
{
  "temperature": 102.4,
  "location": "Austin, TX",
  "unit": "fahrenheit"
}
OpenWebUI receives the MCP server response, adds it to the conversation history as a tool response, and sends the updated conversation back to Ollama:
{
  "model": "qwen-2.5:32b",
  "messages": [
    {
      "role": "user",
      "content": "What's the weather in Austin, TX?"
    },
    {
      "role": "assistant",
      "content": "I'll help you get the current weather in Austin, TX.",
      "tool_calls": [
        {
          "id": "get_weather_1",
          "type": "function",
          "function": {
            "name": "get_weather",
            "arguments": "{\"location\":\"Austin, TX\"}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "tool_call_id": "get_weather_1",
      "content": "{\"temperature\": 102.4, \"location\": \"Austin, TX\", \"unit\": \"fahrenheit\"}"
    }
  ]
}
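Continuing the HTTP sketch from earlier, the second round trip is the same call with the longer message list; the messages below mirror the JSON above verbatim:

import requests

messages = [
    {"role": "user", "content": "What's the weather in Austin, TX?"},
    {
        "role": "assistant",
        "content": "I'll help you get the current weather in Austin, TX.",
        "tool_calls": [{
            "id": "get_weather_1",
            "type": "function",
            "function": {
                "name": "get_weather",
                "arguments": "{\"location\":\"Austin, TX\"}",
            },
        }],
    },
    {
        "role": "tool",
        "tool_call_id": "get_weather_1",
        "content": '{"temperature": 102.4, "location": "Austin, TX", "unit": "fahrenheit"}',
    },
]

# Same OpenAI-compatible endpoint as the first round; the model now
# sees the tool result and can answer in natural language.
resp = requests.post(
    "http://localhost:11434/v1/chat/completions",
    json={"model": "qwen-2.5:32b", "messages": messages},
)
print(resp.json()["choices"][0]["message"]["content"])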
Ollama formats the messages, including the new tool response, according to the model's chat template.
Note that although the OpenAI format tags tool results with the tool role, this particular model/chat template renders them in the prompt under user.
<|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.

# Tools
[... tool definitions ...]<|im_end|>
<|im_start|>user
What's the weather in Austin, TX?<|im_end|>
<|im_start|>assistant
I'll help you get the current weather in Austin, TX.
<tool_call>
{"name": "get_weather", "arguments": {"location": "Austin, TX"}}
</tool_call><|im_end|>
<|im_start|>user
{"temperature": 102.4, "location": "Austin, TX", "unit": "fahrenheit"}<|im_end|>
<|im_start|>assistant
The LLM processes the formatted chat message with the tool response data and generates a natural language response using the weather information:
Based on the current weather data, it's quite hot in Austin, TX right now with a temperature of 102.4°F. Make sure to stay hydrated and seek air conditioning if you're planning to be outside!
Ollama formats the LLM's natural language response back into OpenAI-compatible format for return to OpenWebUI:
{
  "id": "chatcmpl-def456",
  "object": "chat.completion",
  "created": 1234567920,
  "model": "qwen-2.5:32b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Based on the current weather data, it's quite hot in Austin, TX right now with a temperature of 102.4°F. Make sure to stay hydrated and seek air conditioning if you're planning to be outside!"
      },
      "finish_reason": "stop"
    }
  ]
}
OpenWebUI receives the final formatted response from Ollama and displays the natural language message to the user in the chat interface:
Based on the current weather data, it's quite hot in Austin, TX right now with a temperature of 102.4°F. Make sure to stay hydrated and seek air conditioning if you're planning to be outside!