Ollama's web search API can be used to augment models with up-to-date information from the web, reducing hallucinations and improving accuracy. Web search is offered as a REST API, with deeper tool integrations in the Python and JavaScript libraries. This also enables models such as OpenAI's gpt-oss to perform long-running research tasks.
Authentication
To access Ollama's web search API, create an API key. A free Ollama account is required.
Web search API
Performs a web search for a single query and returns relevant results.
POST https://ollama.ac.cn/api/web_search
query (string, required): the search query string
max_results (integer, optional): maximum number of results to return (default 5, maximum 10)
Returns an object containing:
results (array): an array of search result objects, each containing:
title (string): the title of the web page
url (string): the URL of the web page
content (string): a snippet of relevant content from the web page
Ensure OLLAMA_API_KEY is set, or pass the key directly in the Authorization header.
cURL request
curl https://ollama.ac.cn/api/web_search \
--header "Authorization: Bearer $OLLAMA_API_KEY" \
-d '{
"query":"what is ollama?"
}'
Response
{
"results": [
{
"title": "Ollama",
"url": "https://ollama.ac.cn/",
"content": "Cloud models are now available..."
},
{
"title": "What is Ollama? Introduction to the AI model management tool",
"url": "https://www.hostinger.com/tutorials/what-is-ollama",
"content": "Ariffud M. 6min Read..."
},
{
"title": "Ollama Explained: Transforming AI Accessibility and Language ...",
"url": "https://www.geeksforgeeks.org/artificial-intelligence/ollama-explained-transforming-ai-accessibility-and-language-processing/",
"content": "Data Science Data Science Projects Data Analysis..."
}
]
}
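Calling the REST endpoint directly from Python is also straightforward. A minimal sketch using the requests package (any HTTP client works), also showing the optional max_results parameter that the cURL example omits:
import os
import requests

# Same request as the cURL example above, with an explicit result cap.
response = requests.post(
    "https://ollama.ac.cn/api/web_search",
    headers={"Authorization": f"Bearer {os.environ['OLLAMA_API_KEY']}"},
    json={"query": "what is ollama?", "max_results": 3},
)
response.raise_for_status()
for result in response.json()["results"]:
    print(result["title"], "-", result["url"])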
Python library
import ollama
response = ollama.web_search("What is Ollama?")
print(response)
Example output
results = [
{
"title": "Ollama",
"url": "https://ollama.ac.cn/",
"content": "Cloud models are now available in Ollama..."
},
{
"title": "What is Ollama? Features, Pricing, and Use Cases - Walturn",
"url": "https://www.walturn.com/insights/what-is-ollama-features-pricing-and-use-cases",
"content": "Our services..."
},
{
"title": "Complete Ollama Guide: Installation, Usage & Code Examples",
"url": "https://collabnix.com/complete-ollama-guide-installation-usage-code-examples",
"content": "Join our Discord Server..."
}
]
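The library can also cap the number of results; a small sketch, assuming the helper accepts a max_results keyword mirroring the REST parameter:
import ollama

# max_results mirrors the REST parameter (default 5, maximum 10) -- an assumption here.
response = ollama.web_search("What is Ollama?", max_results=3)
for result in response.results:
    print(result.title, "-", result.url)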
More Ollama Python examples
JavaScript library
import { Ollama } from "ollama";
const client = new Ollama();
const results = await client.webSearch("what is ollama?");
console.log(JSON.stringify(results, null, 2));
Example output
{
"results": [
{
"title": "Ollama",
"url": "https://ollama.ac.cn/",
"content": "Cloud models are now available..."
},
{
"title": "What is Ollama? Introduction to the AI model management tool",
"url": "https://www.hostinger.com/tutorials/what-is-ollama",
"content": "Ollama is an open-source tool..."
},
{
"title": "Ollama Explained: Transforming AI Accessibility and Language Processing",
"url": "https://www.geeksforgeeks.org/artificial-intelligence/ollama-explained-transforming-ai-accessibility-and-language-processing/",
"content": "Ollama is a groundbreaking..."
}
]
}
More Ollama JavaScript examples
Web fetch API
Fetches a single web page by URL and returns its content.
POST https://ollama.ac.cn/api/web_fetch
url (string, required): the URL of the web page to fetch
Returns an object containing:
title (string): the title of the web page
content (string): the main content of the web page
links (array): an array of links found on the page
cURL request
curl --request POST \
--url https://ollama.com/api/web_fetch \
--header "Authorization: Bearer $OLLAMA_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"url": "ollama.com"
}'
Response
{
"title": "Ollama",
"content": "[Cloud models](https://ollama.ac.cn/blog/cloud-models) are now available in Ollama...",
"links": [
"https://ollama.ac.cn/",
"https://ollama.ac.cn/models",
"https://github.com/ollama/ollama"
]
}
Python SDK
from ollama import web_fetch
result = web_fetch('https://ollama.ac.cn')
print(result)
Result
WebFetchResponse(
title='Ollama',
content='[Cloud models](https://ollama.ac.cn/blog/cloud-models) are now available in Ollama\n\n**Chat & build
with open models**\n\n[Download](https://ollama.ac.cn/download) [Explore
models](https://ollama.com/models)\n\nAvailable for macOS, Windows, and Linux',
links=['https://ollama.ac.cn/', 'https://ollama.ac.cn/models', 'https://github.com/ollama/ollama']
)
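The response fields can be fed straight into a model. A minimal sketch combining web_fetch with chat to summarize a fetched page (the model name is just an example):
from ollama import chat, web_fetch

page = web_fetch('https://ollama.ac.cn')

# Pass the fetched page content to a local model for summarization.
summary = chat(
    model='qwen3:4b',
    messages=[{'role': 'user', 'content': f'Summarize this page:\n\n{page.content}'}],
)
print(summary.message.content)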
JavaScript SDK
import { Ollama } from "ollama";
const client = new Ollama();
const fetchResult = await client.webFetch("https://ollama.ac.cn");
console.log(JSON.stringify(fetchResult, null, 2));
Result
{
"title": "Ollama",
"content": "[Cloud models](https://ollama.ac.cn/blog/cloud-models) are now available in Ollama...",
"links": [
"https://ollama.ac.cn/",
"https://ollama.ac.cn/models",
"https://github.com/ollama/ollama"
]
}
Building a search agent
Build a tiny search agent using Ollama's web search API as a tool. This example uses Alibaba's Qwen 3 model (4B parameters).
from ollama import chat, web_fetch, web_search
available_tools = {'web_search': web_search, 'web_fetch': web_fetch}
messages = [{'role': 'user', 'content': "what is ollama's new engine"}]
while True:
    response = chat(
        model='qwen3:4b',
        messages=messages,
        tools=[web_search, web_fetch],
        think=True
    )
    if response.message.thinking:
        print('Thinking: ', response.message.thinking)
    if response.message.content:
        print('Content: ', response.message.content)
    messages.append(response.message)
    if response.message.tool_calls:
        print('Tool calls: ', response.message.tool_calls)
        for tool_call in response.message.tool_calls:
            function_to_call = available_tools.get(tool_call.function.name)
            if function_to_call:
                args = tool_call.function.arguments
                result = function_to_call(**args)
                print('Result: ', str(result)[:200] + '...')
                # Result is truncated for limited context lengths
                messages.append({'role': 'tool', 'content': str(result)[:2000 * 4], 'tool_name': tool_call.function.name})
            else:
                messages.append({'role': 'tool', 'content': f'Tool {tool_call.function.name} not found', 'tool_name': tool_call.function.name})
    else:
        break
Result
Thinking: Okay, the user is asking about Ollama's new engine. I need to figure out what they're referring to. Ollama is a company that develops large language models, so maybe they've released a new model or an updated version of their existing engine....
Tool calls: [ToolCall(function=Function(name='web_search', arguments={'max_results': 3, 'query': 'Ollama new engine'}))]
Result: results=[WebSearchResult(content='# New model scheduling\n\n## September 23, 2025\n\nOllama now includes a significantly improved model scheduling system. Ahead of running a model, Ollama’s new engine
Thinking: Okay, the user asked about Ollama's new engine. Let me look at the search results.
First result is from September 23, 2025, talking about new model scheduling. It mentions improved memory management, reduced crashes, better GPU utilization, and multi-GPU performance. Examples show speed improvements and accurate memory reporting. Supported models include gemma3, llama4, qwen3, etc...
Content: Ollama has introduced two key updates to its engine, both released in 2025:
1. **Enhanced Model Scheduling (September 23, 2025)**
- **Precision Memory Management**: Exact memory allocation reduces out-of-memory crashes and optimizes GPU utilization.
- **Performance Gains**: Examples show significant speed improvements (e.g., 85.54 tokens/s vs 52.02 tokens/s) and full GPU layer utilization.
- **Multi-GPU Support**: Improved efficiency across multiple GPUs, with accurate memory reporting via tools like `nvidia-smi`.
- **Supported Models**: Includes `gemma3`, `llama4`, `qwen3`, `mistral-small3.2`, and more.
2. **Multimodal Engine (May 15, 2025)**
- **Vision Support**: First-class support for vision models, including `llama4:scout` (109B parameters), `gemma3`, `qwen2.5vl`, and `mistral-small3.1`.
- **Multimodal Tasks**: Examples include identifying animals in multiple images, answering location-based questions from videos, and document scanning.
These updates highlight Ollama's focus on efficiency, performance, and expanded capabilities for both text and vision tasks.
Context length and agents
Web search results can return thousands of tokens. It is recommended to increase the model's context length to at least around 32,000 tokens. Search agents work best at full context length. Ollama's cloud models run at full context length.
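The context window can be raised per request through the num_ctx option; a minimal sketch (assuming the machine has enough memory for the larger context):
from ollama import chat

# Request a 32k context window for this call; larger contexts use more memory.
response = chat(
    model='qwen3:4b',
    messages=[{'role': 'user', 'content': 'what is ollama?'}],
    options={'num_ctx': 32000},
)
print(response.message.content)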
MCP server
You can enable web search in any MCP client through the Python MCP server; an illustrative sketch of such a server follows.
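The web-search-mcp.py script referenced below is not reproduced here; as an illustration, a minimal server exposing both tools might look like this sketch, using the FastMCP helper from the official mcp Python package (tool names and docstrings are our own):
from mcp.server.fastmcp import FastMCP
from ollama import web_fetch, web_search

mcp = FastMCP("web_search_and_fetch")

@mcp.tool()
def search(query: str, max_results: int = 5) -> str:
    """Search the web and return titles, URLs, and content snippets."""
    return str(web_search(query, max_results=max_results))

@mcp.tool()
def fetch(url: str) -> str:
    """Fetch a single page and return its title, content, and links."""
    return str(web_fetch(url))

if __name__ == "__main__":
    mcp.run()  # stdio transport by default, as the client configs below assume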
Cline
Ollama's web search integrates easily with Cline via an MCP server configuration. Go to Manage MCP Servers > Configure MCP Servers > and add the following configuration:
{
"mcpServers": {
"web_search_and_fetch": {
"type": "stdio",
"command": "uv",
"args": ["run", "path/to/web-search-mcp.py"],
"env": { "OLLAMA_API_KEY": "your_api_key_here" }
}
}
}
Codex
Ollama works well with OpenAI's Codex tool. Add the following configuration to ~/.codex/config.toml:
[mcp_servers.web_search]
command = "uv"
args = ["run", "path/to/web-search-mcp.py"]
env = { "OLLAMA_API_KEY" = "your_api_key_here" }
Goose
Ollama can integrate with Goose via its MCP feature.

Other integrations
Ollama can be integrated into most available tools through direct integration with Ollama's API, the Python and JavaScript libraries, the OpenAI-compatible API, or MCP server integrations.
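As an illustration of the OpenAI-compatible path, a minimal sketch pointing the official openai Python client at a local Ollama instance (the api_key value is required by the client but ignored by a local server; the model name is just an example):
from openai import OpenAI

# Ollama serves an OpenAI-compatible API under /v1 on its default local port.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

completion = client.chat.completions.create(
    model="qwen3:4b",
    messages=[{"role": "user", "content": "what is ollama?"}],
)
print(completion.choices[0].message.content)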