Custom Tools#
MassGen allows you to give agents access to your own custom Python functions as tools. This enables agents to use your domain-specific functionality, business logic, or specialized algorithms alongside built-in tools and MCP servers.
Note
Quick Setup Summary:
Write a Python function that returns ExecutionResult
Reference it in your YAML config under custom_tools
Run MassGen - agents can now use your function
For long-running calls, use the background lifecycle in Background Tool Execution
Quick Start: Try It Now#
MassGen includes working examples you can try immediately:
# Single agent with custom tool
massgen \
--config massgen/configs/tools/custom_tools/gemini_custom_tool_example.yaml \
"What's the sum of 123 and 456?"
# Custom tool + MCP weather integration
massgen \
--config massgen/configs/tools/custom_tools/gemini_custom_tool_with_mcp_example.yaml \
"What's the sum of 123 and 456? And what's the weather in Tokyo?"
The agent will use the custom two_num_tool to calculate and respond with “The sum of 123 and 456 is 579”.
How The Example Works#
The Tool (massgen/tool/_basic/_two_num_tool.py):
from massgen.tool._result import ExecutionResult, TextContent


async def two_num_tool(x: int, y: int) -> ExecutionResult:
    """Add two numbers together.

    Args:
        x: First number
        y: Second number

    Returns:
        Sum of the two numbers
    """
    result = x + y
    return ExecutionResult(
        output_blocks=[
            TextContent(data=f"The sum of {x} and {y} is {result}"),
        ],
    )
The Config (gemini_custom_tool_example.yaml):
agents:
  - id: "gemini2.5flash_custom_tool"
    backend:
      type: "gemini"
      model: "gemini-2.5-flash"
      custom_tools:
        - name: ["two_num_tool"]
          category: "math"
          path: "massgen/tool/_basic/_two_num_tool.py"
          function: ["two_num_tool"]
    system_message: |
      You are an AI assistant with access to a custom math calculation tool.
      When users ask about adding two numbers together, use the two_num_tool.
ui:
  display_type: "rich_terminal"
That’s the complete pattern! Now let’s see how to create your own tools.
How It Works#
Custom tools in MassGen follow a simple pattern:
Function Signature: Write an async function with type hints
Docstring: Add a Google-style docstring (used for tool description)
Return Type: Return ExecutionResult with your output
YAML Config: Reference the function in your agent’s custom_tools
MassGen automatically:
Generates JSON schema from your function signature
Makes the tool available to agents
Handles execution and result streaming
Works across all backends (Claude, Gemini, OpenAI, etc.)
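The schema-generation step can be pictured with the standard library alone. The sketch below is not MassGen's implementation: sketch_schema, the type mapping, and the output layout are hypothetical and only illustrate how a signature with type hints and a docstring carries enough information to describe a tool.

```python
import inspect
from typing import Any, Dict, get_type_hints

# Hypothetical mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number",
               bool: "boolean", list: "array", dict: "object"}


def sketch_schema(func) -> Dict[str, Any]:
    """Build a JSON-schema-like description from a function signature."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # only parameters are described
    properties: Dict[str, Any] = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unknown types fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "parameters": {"type": "object", "properties": properties,
                       "required": required},
    }


async def two_num_tool(x: int, y: int) -> str:
    """Add two numbers together."""
    return f"The sum of {x} and {y} is {x + y}"


schema = sketch_schema(two_num_tool)
print(schema)
```

This is why missing type hints or docstrings matter: without them there is nothing to derive parameter types or a description from.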
Creating Your Own Custom Tools#
To create your own custom tool, follow the same pattern as two_num_tool.
Step-by-Step: Create a Custom Tool#
1. Create your tool file (e.g., my_tools/calculator.py):
from massgen.tool import ExecutionResult, TextContent


async def calculator(operation: str, x: float, y: float) -> ExecutionResult:
    """Perform basic math operations.

    Args:
        operation: The operation (add, subtract, multiply, divide)
        x: First number
        y: Second number

    Returns:
        ExecutionResult with calculation result
    """
    # Division by zero is stored as None so it is reported as an error below.
    operations = {
        "add": x + y,
        "subtract": x - y,
        "multiply": x * y,
        "divide": x / y if y != 0 else None,
    }

    if operation in operations and operations[operation] is not None:
        result = operations[operation]
        return ExecutionResult(
            output_blocks=[TextContent(data=f"{operation}({x}, {y}) = {result}")]
        )
    else:
        return ExecutionResult(
            output_blocks=[TextContent(data="Error: Invalid operation or division by zero")]
        )
2. Create a config file (e.g., my_calculator_config.yaml):
agents:
  - id: "calculator_agent"
    backend:
      type: "gemini"
      model: "gemini-2.5-flash"
      custom_tools:
        - name: ["calculator"]
          category: "math"
          path: "my_tools/calculator.py"
          function: ["calculator"]
    system_message: |
      You are an AI assistant with access to a calculator tool.
      Use it when users ask for math operations.
ui:
  display_type: "simple"
3. Run it:
massgen --config my_calculator_config.yaml "What's 15 times 27?"
Basic Tool Structure#
Every custom tool follows this pattern:
from massgen.tool import ExecutionResult, TextContent


async def my_tool_name(param1: str, param2: int) -> ExecutionResult:
    """Brief description of what this tool does.

    Args:
        param1: Description of first parameter
        param2: Description of second parameter

    Returns:
        ExecutionResult with the tool output
    """
    # Your logic here
    output = f"Processed {param1} with {param2}"
    return ExecutionResult(
        output_blocks=[TextContent(data=output)]
    )
Key Requirements:
Use async def (even if your function doesn’t use await)
Include type hints for all parameters
Write a Google-style docstring with Args and Returns sections
Return ExecutionResult with at least one content block
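When the logic behind a tool is an ordinary blocking function, one common asyncio pattern is to keep the async def wrapper thin and push the blocking call onto a worker thread with asyncio.to_thread (Python 3.9+). The sketch below is general asyncio practice rather than something MassGen is documented to require; slow_lookup and lookup_tool are hypothetical names, and a real tool would wrap the value in an ExecutionResult instead of returning a plain string.

```python
import asyncio
import time


def slow_lookup(key: str) -> str:
    """Hypothetical blocking helper, e.g. a legacy client or heavy computation."""
    time.sleep(0.01)  # stands in for blocking work
    return key.upper()


async def lookup_tool(key: str) -> str:
    """Async wrapper that keeps the event loop free while the helper runs."""
    return await asyncio.to_thread(slow_lookup, key)


value = asyncio.run(lookup_tool("abc"))
print(value)
```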
Understanding ExecutionResult#
ExecutionResult is the container for all tool outputs. It tells MassGen what to return to the agent.
Basic Usage:
from massgen.tool import ExecutionResult, TextContent
return ExecutionResult(
    output_blocks=[TextContent(data="Your output here")]
)
Available Content Types:
TextContent - Plain text output (most common)
TextContent(data="The result is 42")
ImageContent - Base64-encoded image data
ImageContent(data="base64_encoded_image_string")
AudioContent - Base64-encoded audio data
AudioContent(data="base64_encoded_audio_string")
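Both ImageContent and AudioContent take their payload as a base64 string. A minimal sketch of producing one from raw bytes with the standard library follows; encode_bytes_for_content is a hypothetical helper name, and the eight-byte PNG signature stands in for real file contents.

```python
import base64


def encode_bytes_for_content(raw: bytes) -> str:
    """Return raw bytes as an ASCII base64 string."""
    return base64.b64encode(raw).decode("ascii")


# Stand-in payload: the fixed signature that starts every PNG file.
png_signature = b"\x89PNG\r\n\x1a\n"
encoded = encode_bytes_for_content(png_signature)
print(encoded)
```

For a file on disk, the same helper can be applied to the result of pathlib.Path(...).read_bytes().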
ExecutionResult Parameters:
ExecutionResult(
    output_blocks=[...],          # Required: List of content blocks
    meta_info={"key": "value"},   # Optional: Metadata (not shown to agent)
    is_streaming=False,           # Optional: Is this a streaming result?
    is_final=True,                # Optional: Is this the final result?
    was_interrupted=False         # Optional: Was execution interrupted?
)
Multimodal Results#
Tools can return multiple content types:
from massgen.tool import ExecutionResult, TextContent, ImageContent


async def generate_chart(data: list) -> ExecutionResult:
    """Generate a chart from data.

    Args:
        data: Values to plot
    """
    # create_chart_image is a placeholder for your own code; it should
    # return the rendered chart as a base64-encoded string.
    chart_base64 = create_chart_image(data)

    return ExecutionResult(
        output_blocks=[
            TextContent(data="Chart generated successfully"),
            ImageContent(data=chart_base64)
        ],
        meta_info={"chart_type": "bar", "data_points": len(data)}
    )
Streaming Results#
For long-running operations, stream progress updates:
import asyncio
from typing import AsyncGenerator

from massgen.tool import ExecutionResult, TextContent


async def process_large_dataset(file_path: str) -> AsyncGenerator[ExecutionResult, None]:
    """Process a large dataset with progress updates."""
    # Initial status
    yield ExecutionResult(
        output_blocks=[TextContent(data="Starting processing...")],
        is_streaming=True,
        is_final=False
    )

    # Process in chunks
    for i in range(10):
        await asyncio.sleep(1)  # Simulate work
        yield ExecutionResult(
            output_blocks=[TextContent(data=f"Progress: {(i+1)*10}%")],
            is_streaming=True,
            is_final=False
        )

    # Final result
    yield ExecutionResult(
        output_blocks=[TextContent(data="Processing complete!")],
        is_streaming=True,
        is_final=True
    )
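To see what a streaming tool looks like from the consuming side, the sketch below iterates an async generator and stops at the update flagged as final. It is only an illustration of the pattern, not MassGen's internal consumer: Update is a hypothetical stand-in for ExecutionResult, and count_to and collect are made-up names.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, List


@dataclass
class Update:
    """Hypothetical stand-in for ExecutionResult with only the fields used here."""
    text: str
    is_final: bool = False


async def count_to(n: int) -> AsyncGenerator[Update, None]:
    """Yield one progress update per step, marking the last one as final."""
    for i in range(1, n + 1):
        await asyncio.sleep(0)  # yield control, as real work would
        yield Update(text=f"Progress: {i}/{n}", is_final=(i == n))


async def collect(stream: AsyncGenerator[Update, None]) -> List[str]:
    """Gather update texts until the final update arrives."""
    seen: List[str] = []
    async for update in stream:
        seen.append(update.text)
        if update.is_final:
            break
    return seen


messages = asyncio.run(collect(count_to(3)))
print(messages)
```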
YAML Configuration#
Basic Configuration#
Reference your tool in the agent’s backend config:
agents:
  - id: "agent_id"
    backend:
      type: "claude"
      model: "claude-sonnet-4"
      custom_tools:
        # Reference external file
        - name: "my_function"
          path: "path/to/my_tools.py"
          function: "my_function"
          category: "utilities"
        # Use built-in tool (no path needed)
        - name: "run_python_script"
          function: "run_python_script"
Configuration Options#
custom_tools:
  - name: "tool_name"                 # Unique identifier
    path: "path/to/file.py"           # Path to Python file (optional for built-ins)
    function: "function_name"         # Function name in the file
    category: "category_name"         # Group related tools (optional)
    description: "Tool description"   # Override auto-generated description (optional)
Multiple Tools Example:
custom_tools:
  - name: "calculator"
    path: "tools/math.py"
    function: "calculator"
    category: "math"
  - name: "text_analyzer"
    path: "tools/text.py"
    function: "analyze_text"
    category: "text_processing"
  # Use built-in tool
  - name: "run_python_script"
    function: "run_python_script"
Built-in Tool Functions#
Important
When to use the standard approach instead:
File Operations: Use Claude Code’s native tools or File Operations & Workspace Management with MCP filesystem servers
Code Execution: Use backend built-in code execution or Code Execution with MCP
These built-in functions are primarily for:
Building blocks when creating your own custom tools (import and use them in your code)
Backends that don’t have native file/code execution support
Available Functions#
MassGen provides these built-in functions, which you can import into your own custom tools as building blocks or study as examples of what custom tools can do:
Code Execution:
run_python_script - Execute Python code in isolated subprocess
run_shell_script - Execute shell commands
File Operations:
read_file_content - Read files with optional line range
save_file_content - Write content to files
append_file_content - Append or insert content into files
See Tool System API for complete API documentation of these functions.
Example Configurations#
MassGen includes 58 working config examples in massgen/configs/tools/custom_tools/. The basic examples (Examples 1 to 3 below) use the two_num_tool shown above; the later examples use specialized tools.
Example 1: Claude Code with Custom Tool#
massgen \
--config massgen/configs/tools/custom_tools/claude_code_custom_tool_example.yaml \
"What's the sum of 15 and 27?"
Config: claude_code_custom_tool_example.yaml
orchestrator:
  snapshot_storage: "claude_code_snapshots"
  agent_temporary_workspace: "claude_code_temp"
agents:
  - id: "claude_code_custom_tools"
    backend:
      type: "claude_code"
      model: "claude-sonnet-4-20250514"
      cwd: "claude_code_workspace"
      custom_tools:
        - name: ["two_num_tool"]
          category: "math"
          path: "massgen/tool/_basic/_two_num_tool.py"
          function: ["two_num_tool"]
          description: ["Add two numbers together"]
      append_system_prompt: |
        You are an AI assistant with access to custom calculation tools
        in addition to your built-in Claude Code tools.
ui:
  display_type: "simple"
  logging_enabled: true
Example 2: Gemini with Custom Tool#
massgen \
--config massgen/configs/tools/custom_tools/gemini_custom_tool_example.yaml \
"What's the sum of 123 and 456?"
Config: gemini_custom_tool_example.yaml
agents:
  - id: "gemini2.5flash_custom_tool"
    backend:
      type: "gemini"
      model: "gemini-2.5-flash"
      custom_tools:
        - name: ["two_num_tool"]
          category: "math"
          path: "massgen/tool/_basic/_two_num_tool.py"
          function: ["two_num_tool"]
    system_message: |
      You are an AI assistant with access to a custom math calculation tool.
      When users ask about adding two numbers together, use the two_num_tool.
ui:
  display_type: "rich_terminal"
  logging_enabled: true
Example 3: Custom Tool + MCP Integration#
massgen \
--config massgen/configs/tools/custom_tools/gemini_custom_tool_with_mcp_example.yaml \
"What's the sum of 123 and 456? And what's the weather in Tokyo?"
Config: gemini_custom_tool_with_mcp_example.yaml
agents:
  - id: "gemini2.5flash_custom_tool"
    backend:
      type: "gemini"
      model: "gemini-2.5-flash"
      # Custom tools
      custom_tools:
        - name: ["two_num_tool"]
          category: "math"
          path: "massgen/tool/_basic/_two_num_tool.py"
          function: ["two_num_tool"]
      # MCP servers
      mcp_servers:
        - name: "weather"
          type: "stdio"
          command: "npx"
          args: ["-y", "@fak111/weather-mcp"]
    system_message: |
      You are an AI assistant with access to a custom math calculation tool
      and a weather information MCP tool.
ui:
  display_type: "simple"
  logging_enabled: true
Example 4: Multimodal Understanding Tools#
New in v0.1.3+: MassGen provides custom tools for analyzing multimodal content (images, audio, video, documents) using OpenAI’s gpt-4.1 API.
# Analyze an image
massgen \
--config massgen/configs/tools/custom_tools/multimodal_tools/understand_image.yaml \
"Describe the content in this image"
# Transcribe audio
massgen \
--config massgen/configs/tools/custom_tools/multimodal_tools/understand_audio.yaml \
"What is being said in this audio?"
# Analyze video
massgen \
--config massgen/configs/tools/custom_tools/multimodal_tools/understand_video.yaml \
"What's happening in this video?"
# Process documents
massgen \
--config massgen/configs/tools/custom_tools/multimodal_tools/understand_file.yaml \
"Summarize this PDF document"
Config Example: understand_image.yaml
agents:
  - id: "understand_image_tool"
    backend:
      type: "openai"
      model: "gpt-5-nano"
      cwd: "workspace1"
      custom_tools:
        - name: ["understand_image"]
          category: "multimodal"
          path: "massgen/tool/_multimodal_tools/understand_image.py"
          function: ["understand_image"]
    system_message: |
      You are an AI assistant with access to image understanding capabilities.
      Use the understand_image tool to analyze and understand images using OpenAI's gpt-4.1 API.
orchestrator:
  context_paths:
    - path: "massgen/configs/resources/v0.1.3-example/multimodality.jpg"
      permission: "read"
ui:
  display_type: "rich_terminal"
  logging_enabled: true
Available Multimodal Tools:
understand_image - Analyze images (PNG, JPEG, JPG)
understand_audio - Transcribe and analyze audio files
understand_video - Extract key frames and analyze videos
understand_file - Process documents (PDF, DOCX, XLSX, PPTX)
Key Features:
Works with any backend - uses OpenAI’s gpt-4.1 for analysis
Processes files from agent workspaces
Structured JSON responses with detailed metadata
Path validation for security
See Multimodal Capabilities for complete multimodal capabilities documentation.
Example 5: Crawl4AI Web Scraping Tools#
New in v0.1.4: Docker-based web scraping with multiple output formats via crawl4ai custom tools.
# Start crawl4ai Docker container (one-time setup)
docker pull unclecode/crawl4ai:latest
docker run -d -p 11235:11235 --name crawl4ai --shm-size=1g unclecode/crawl4ai:latest
# Use crawl4ai tools
massgen \
--config massgen/configs/tools/custom_tools/crawl4ai_example.yaml \
"Please scrape the MassGen docs, take a screenshot, and explain that screenshot"
Config Example: crawl4ai_example.yaml
agents:
  - id: "web_scraper_agent"
    backend:
      type: "openai"
      model: "gpt-5-mini"
      cwd: "workspace1"
      # Register crawl4ai custom tools
      custom_tools:
        - name: ["crawl4ai_md", "crawl4ai_html", "crawl4ai_screenshot", "crawl4ai_pdf", "crawl4ai_execute_js", "crawl4ai_crawl"]
          category: "web_scraping"
          path: "massgen/tool/_web_tools/crawl4ai_tool.py"
          function: ["crawl4ai_md", "crawl4ai_html", "crawl4ai_screenshot", "crawl4ai_pdf", "crawl4ai_execute_js", "crawl4ai_crawl"]
        - name: ["understand_image"]
          category: "multimodal"
          path: "massgen/tool/_multimodal_tools/understand_image.py"
          function: ["understand_image"]
ui:
  display_type: "rich_terminal"
  logging_enabled: true
Available Crawl4AI Tools:
crawl4ai_md - Extract clean markdown from web content
crawl4ai_html - Get preprocessed HTML
crawl4ai_screenshot - Capture webpage screenshots
crawl4ai_pdf - Generate PDF documents
crawl4ai_execute_js - Run JavaScript on web pages
crawl4ai_crawl - Perform multi-URL crawling
Key Features:
Docker-based isolation (no Python dependencies needed)
Multiple output formats (markdown, HTML, screenshots, PDFs)
JavaScript execution for dynamic content
Concurrent crawling (up to 5 simultaneous crawls)
Automatic Docker health checks with clear error messages
Requirements:
Docker installed and running
crawl4ai container accessible at http://localhost:11235
If the Docker container isn’t running, agents receive a helpful error message with setup instructions.
Example 6: Computer Use Tools#
New in v0.1.8: MassGen provides browser and desktop automation tools for AI agents.
MassGen offers three computer use tools optimized for different providers:
gemini_computer_use - Google Gemini Computer Use (autonomous browser/desktop control)
claude_computer_use - Anthropic Claude Computer Use (thorough automation with enhanced actions)
browser_automation - Simple browser automation (works with ANY model: gpt-4.1, gpt-4o, etc.)
Quick Example:
# Simple browser automation (any model)
massgen \
--config massgen/configs/tools/custom_tools/simple_browser_automation_example.yaml \
"Go to Wikipedia and search for Jimmy Carter"
# Gemini Computer Use
massgen \
--config massgen/configs/tools/custom_tools/gemini_computer_use_example.yaml \
"Go to cnn.com and get the top headline"
# Claude Computer Use
massgen \
--config massgen/configs/tools/custom_tools/claude_computer_use_docker_example.yaml \
"Navigate to Wikipedia and search for Artificial Intelligence"
See also
For complete documentation on computer use tools including:
Detailed tool comparisons and performance benchmarks
Configuration examples for browser and Docker environments
Visualization and monitoring with VNC/non-headless mode
Multi-agent computer use coordination
Troubleshooting and best practices
See Computer Use Tools - Complete Computer Use Tools guide
Example 7: Terminal Evaluation Tools#
MassGen can evaluate its own terminal display and frontend UX by recording sessions with VHS and analyzing them using AI vision models.
MassGen provides terminal evaluation tools for assessing display quality and user experience:
run_massgen_with_recording - Record MassGen terminal sessions as video (MP4/GIF/WebM)
understand_video - Analyze video recordings using GPT-4.1 vision
understand_image - Analyze screenshots and frames
Quick Example:
# Record and evaluate a MassGen session
massgen \
--config massgen/configs/tools/custom_tools/terminal_evaluation.yaml \
"Record and evaluate the terminal display for the todo example config"
Config Example: terminal_evaluation.yaml
agents:
  - id: "terminal_evaluator"
    backend:
      type: "openai"
      model: "gpt-5-nano"
      cwd: "workspace1"
      # Terminal evaluation tools
      custom_tools:
        - name: ["run_massgen_with_recording"]
          category: "terminal_recording"
          path: "massgen/tool/_multimodal_tools/run_massgen_with_recording.py"
          function: ["run_massgen_with_recording"]
        - name: ["understand_video"]
          category: "multimodal"
          path: "massgen/tool/_multimodal_tools/understand_video.py"
          function: ["understand_video"]
        - name: ["understand_image"]
          category: "multimodal"
          path: "massgen/tool/_multimodal_tools/understand_image.py"
          function: ["understand_image"]
ui:
  display_type: "rich_terminal"
  logging_enabled: true
Available Terminal Evaluation Tools:
run_massgen_with_recording - Records MassGen sessions as MP4/GIF/WebM videos using VHS
understand_video - Extracts frames and analyzes videos with GPT-4.1
understand_image - Analyzes individual frames or screenshots
Key Features:
VHS integration for high-quality terminal recording
Video frame extraction (configurable frame count)
AI-powered UX evaluation using GPT-4.1 vision
Automatic workspace management for recordings
Support for multiple output formats (MP4, GIF, WebM)
Prerequisites:
VHS terminal recorder: brew install vhs (macOS) or go install github.com/charmbracelet/vhs@latest
OpenAI API key configured in .env
Workflow:
Agent creates VHS tape script to record terminal session
Runs MassGen command (without --automation to capture rich display)
VHS records the session as video
Extracts key frames from video
Analyzes frames using GPT-4.1 vision model
Returns detailed UX evaluation with recommendations
Use Cases:
Frontend development - Evaluate UI/UX changes to terminal display
Quality assurance - Verify status indicators and agent outputs
Case study creation - Record demos and generate video content
User testing - Analyze how well terminal communicates progress
See also
For complete documentation on terminal evaluation including:
Detailed recording workflow and VHS configuration
Frame extraction and analysis techniques
Evaluation criteria and best practices
Integration with case study creation
Troubleshooting and monitoring
See Terminal Evaluation - Complete Terminal Evaluation guide
Available Example Configs#
The massgen/configs/tools/custom_tools/ directory contains examples for the supported backends:
Basic Custom Tools:
Claude API: claude_custom_tool_example.yaml
Claude Code: claude_code_custom_tool_example.yaml
Gemini: gemini_custom_tool_example.yaml
OpenAI (GPT): gpt5_nano_custom_tool_example.yaml, gpt_oss_custom_tool_example.yaml
Grok: grok3_mini_custom_tool_example.yaml
Qwen: qwen_api_custom_tool_example.yaml, qwen_local_custom_tool_example.yaml
With MCP: *_custom_tool_with_mcp_example.yaml variants for each backend
Multimodal Understanding Tools:
multimodal_tools/understand_image.yaml - Image analysis
multimodal_tools/understand_audio.yaml - Audio transcription
multimodal_tools/understand_video.yaml - Video analysis
multimodal_tools/understand_file.yaml - Document processing
Web Scraping Tools:
crawl4ai_example.yaml- Docker-based web scraping with multiple output formats
Computer Use Tools:
gemini_computer_use_example.yaml - Google Gemini computer use automation
claude_computer_use_docker_example.yaml - Anthropic Claude computer use automation
simple_browser_automation_example.yaml - Simple browser automation for any model
Terminal Evaluation Tools:
terminal_evaluation.yaml- Record and evaluate MassGen terminal sessions with VHS and GPT-4.1
Backend Support#
Custom tools work with most MassGen backends:
✅ Supported Backends:
OpenAI (openai) - OpenAI’s GPT models
Claude (claude) - Anthropic’s Claude API
Claude Code (claude_code) - Claude with native file/code tools
Gemini (gemini) - Google’s Gemini models
Grok (grok) - xAI’s Grok models
Chat Completions (chatcompletion) - Generic OpenAI-compatible APIs
LM Studio (lmstudio) - Local model hosting
Inference (inference) - vLLM, SGLang, custom inference servers
❌ Not Supported:
Azure OpenAI (azure_openai) - Does not implement custom tools interface
AG2 Framework (ag2) - Does not implement custom tools interface
Why Some Backends Don’t Support Custom Tools:
Azure OpenAI and AG2 inherit from the base LLMBackend class directly without the custom tools layer. These backends focus on their native capabilities rather than custom tool integration.
Troubleshooting#
Tool Not Found#
Error: ToolNotFound: No tool named 'my_tool' exists
Solutions:
Verify the file path is correct relative to where you run the command
Check function name matches exactly
Ensure the function is imported/defined in the file
Custom tool names are prefixed with custom_tool__ internally
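The first three checks can be run by hand before starting MassGen. The sketch below uses only the standard library and is not part of MassGen; check_tool is a hypothetical helper, and the temporary demo_tool.py file exists only to make the example self-contained. Note that loading the file executes its top-level code, including its imports.

```python
import importlib.util
import inspect
import tempfile
from pathlib import Path


def check_tool(path: str, function: str) -> str:
    """Report whether `function` can be loaded from the file at `path`."""
    file = Path(path).resolve()  # relative paths resolve against the current directory
    if not file.is_file():
        return f"missing file: {file}"
    spec = importlib.util.spec_from_file_location(file.stem, file)
    if spec is None or spec.loader is None:
        return f"cannot load {file.name} as a Python module"
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    func = getattr(module, function, None)
    if func is None:
        return f"no function named {function!r} in {file.name}"
    # Accept regular async tools and streaming (async generator) tools.
    if not (inspect.iscoroutinefunction(func) or inspect.isasyncgenfunction(func)):
        return f"{function!r} is not declared with async def"
    return "ok"


with tempfile.TemporaryDirectory() as tmp:
    tool_file = Path(tmp) / "demo_tool.py"
    tool_file.write_text("async def demo(x: int) -> int:\n    return x\n")
    status_ok = check_tool(str(tool_file), "demo")
    status_missing = check_tool(str(tool_file), "other")

print(status_ok)
print(status_missing)
```

Run it from the same directory where you invoke massgen so that relative paths resolve the same way.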
Function Import Errors#
Error: ModuleNotFoundError or ImportError
Solutions:
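Check that the path in your config resolves correctly: relative paths are resolved from the directory where you run the command, or use an absolute path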
Ensure all imports in your tool file are available
Check that dependencies are installed
Schema Generation Fails#
Error: TypeError: cannot create schema for function
Solutions:
Add type hints to all parameters
Use async def even for non-async functions
Return ExecutionResult (not plain values)
Tool Execution Errors#
Check the error in the agent’s output. Common issues:
Missing required parameters
Wrong parameter types
Exceptions in your function code
Add error handling to your tools:
from massgen.tool import ExecutionResult, TextContent


async def safe_tool(param: str) -> ExecutionResult:
    """A tool with error handling."""
    try:
        # Your logic; process() is a placeholder for your own code
        result = process(param)
        return ExecutionResult(
            output_blocks=[TextContent(data=f"Success: {result}")]
        )
    except Exception as e:
        return ExecutionResult(
            output_blocks=[TextContent(data=f"Error: {str(e)}")]
        )
Best Practices#
Clear Function Names: Use descriptive names that indicate what the tool does
Type Hints Required: Always include type hints for parameters and return type
Detailed Docstrings: Agents use these to understand when to use your tool
Error Handling: Return errors as ExecutionResult rather than raising exceptions
Test Independently: Verify that your function works on its own before adding it to MassGen
Keep Functions Focused: One tool should do one thing well
Use Categories: Group related tools together
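Testing a tool independently needs nothing more than asyncio.run. The sketch below defines small stand-in classes so that it runs without MassGen installed; in a real project you would import ExecutionResult and TextContent from massgen.tool instead, and the attribute access shown assumes the constructor keywords (output_blocks, data) are also readable as attributes.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextContent:
    """Stand-in for massgen.tool.TextContent (assumption for this sketch)."""
    data: str


@dataclass
class ExecutionResult:
    """Stand-in for massgen.tool.ExecutionResult (assumption for this sketch)."""
    output_blocks: List[TextContent] = field(default_factory=list)


async def two_num_tool(x: int, y: int) -> ExecutionResult:
    """Add two numbers together."""
    return ExecutionResult(
        output_blocks=[TextContent(data=f"The sum of {x} and {y} is {x + y}")]
    )


# Call the tool directly, outside of any agent, and inspect its output.
result = asyncio.run(two_num_tool(123, 456))
text = result.output_blocks[0].data
assert text == "The sum of 123 and 456 is 579"
print(text)
```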
Advanced Usage (Developer API)#
Note
The sections below are for advanced users and developers who want to programmatically manage tools or understand internal APIs. Most users don’t need this.
For most use cases, the YAML configuration above is sufficient. However, if you’re building on top of MassGen or need programmatic control, you can use the ToolManager API.
ToolManager API#
The ToolManager class provides programmatic control over tools:
from massgen.tool import ToolManager

# Create manager
manager = ToolManager()

# Add tool from file
manager.add_tool_function(
    path="my_tools/calculator.py",
    func="calculator",
    category="math"
)

# Get available tools
schemas = manager.fetch_tool_schemas()

# Execute a tool (await must run inside an async function)
result = await manager.execute_tool({
    "name": "custom_tool__calculator",
    "input": {"operation": "add", "x": 5, "y": 3}
})
Tool Categories#
Programmatically manage tool categories:
# Create category
manager.setup_category(
    category_name="data_science",
    description="Data analysis tools",
    enabled=True
)

# Enable/disable categories
manager.modify_categories(["data_science"], enabled=False)

# Delete categories
manager.delete_categories("old_category")
See also
Tool System API - Complete ToolManager API reference with all methods, parameters, and examples.
Next Steps#
Related Guides:
MCP Integration - External tools via MCP
Background Tool Execution - Non-blocking lifecycle for long-running tool calls
Tools and Capabilities - Tools and capabilities overview
Backend Configuration - Backend capabilities
YAML Configuration Reference - Complete YAML reference
Developer API Documentation:
For programmatic tool management and internal APIs:
Tool System API - Complete Tool System API reference (ToolManager, ExecutionResult, exceptions, built-in tools)
Examples:
Config Examples - 58 configuration examples
Test Examples - Python usage examples