Backend Configuration#
Backends connect MassGen agents to AI model providers. Each backend is configured in YAML and provides specific capabilities like web search, code execution, and file operations.
Overview#
Each agent in MassGen requires a backend configuration that specifies:
Provider: Which AI service to use (OpenAI, Claude, Gemini, etc.)
Model: Which specific model within that provider
Capabilities: Which built-in tools are enabled
Parameters: Model settings like temperature, max_tokens, etc.
Available Backends#
Backend Types#
MassGen supports these backend types (configured via type field in YAML):
| Backend Type | Provider | Models |
|---|---|---|
| openai | OpenAI | GPT-5, GPT-5-mini, GPT-5-nano, GPT-4, GPT-4o |
| claude | Anthropic | Claude Haiku 3.5, Claude Sonnet 4, Claude Opus 4 |
| claude_code | Anthropic (SDK) | Claude Sonnet 4, Claude Opus 4 (with dev tools) |
| codex | OpenAI (CLI) | GPT-5.4, GPT-5.3-Codex, GPT-5.2-Codex, GPT-5.1-Codex, GPT-4.1 |
| gemini | Google | Gemini 2.5 Flash, Gemini 2.5 Pro |
| gemini_cli | Google (CLI) | Gemini 3, Gemini 2.5 Models (via Gemini CLI) |
| grok | xAI | Grok-4, Grok-3, Grok-3-mini |
| azure_openai | Microsoft Azure | GPT-4, GPT-4o, GPT-5 (Azure deployments) |
| zai | ZhipuAI | GLM-4.5 |
| ag2 | AG2 Framework | Any AG2-compatible agent |
| lmstudio | LM Studio | Local open-source models |
| copilot | GitHub Copilot | GPT-5-mini, GPT-4.1, Claude Sonnet 4, Gemini 2.5 Pro |
| inference | vLLM / SGLang | Any locally served model |
| chatcompletion | Generic | Any OpenAI-compatible API |
Backend Capabilities#
Different backends support different built-in tools:
| Backend | Web Search | Code Execution | Bash/Shell | Image | Audio | Video | MCP Support | Filesystem | Custom Tools |
|---|---|---|---|---|---|---|---|---|---|
| openai | ⭐ | ⭐ | ✅ | ⭐ Both | ⭐ Both | ⭐ Generation | ✅ | ✅ | ✅ |
| claude | ⭐ | ⭐ | ✅ | 🔧 | 🔧 | 🔧 | ✅ | ✅ | ✅ |
| claude_code | ⭐ | ❌ | ⭐ | 🔧 | 🔧 | 🔧 | ✅ | ⭐ | ✅ |
| codex | ⭐ | ❌ | ⭐ | 🔧 | 🔧 | 🔧 | ✅ | ⭐ | ✅ |
| copilot | ⭐ | ❌ | ✅ | 🔧 | 🔧 | 🔧 | ✅ | ✅ | ✅ |
| gemini | ⭐ | ⭐ | ✅ | 🔧 | 🔧 | 🔧 | ✅ | ✅ | ✅ |
| gemini_cli | ⭐ | ⭐ | ⭐ | 🔧 | 🔧 | 🔧 | ✅ | ⭐ | ✅ |
| grok | ⭐ | ❌ | ✅ | 🔧 | 🔧 | 🔧 | ✅ | ✅ | ✅ |
| azure_openai | ⭐ | ⭐ | ✅ | ⭐ Both | ❌ | ❌ | ✅ | ✅ | ❌ |
| chatcompletion | ❌ | ❌ | ✅ | 🔧 | 🔧 | 🔧 | ✅ | ✅ | ✅ |
| lmstudio | ❌ | ❌ | ✅ | 🔧 | 🔧 | 🔧 | ✅ | ✅ | ✅ |
| zai | ❌ | ❌ | ✅ | 🔧 | 🔧 | 🔧 | ✅ | ✅ | ✅ |
| inference | ❌ | ❌ | ✅ | 🔧 | 🔧 | 🔧 | ✅ | ✅ | ✅ |
| ag2 | ❌ | ⭐ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
Notes:
Symbol Legend:
⭐ Built-in - Native backend feature (e.g., Anthropic’s web search, OpenAI’s native image API, Claude Code’s Bash tool)
🔧 Via Custom Tools - Available through custom tools (requires OPENAI_API_KEY for multimodal understanding)
✅ MCP-based or Available - Feature available via MCP integration or standard capability
❌ Not available - Feature not supported
Custom Tools:
Custom tools allow you to give agents access to your own Python functions
Most backends support custom tools (OpenAI, Claude, Claude Code, Codex, Copilot, Gemini, Grok, Chat Completions, LM Studio, ZAI, Inference)
Azure OpenAI and AG2 do not support custom tools as they inherit from the base backend class without the custom tools layer
Custom tools are essential for multimodal understanding features (understand_image, understand_video, understand_audio, understand_file)
See Custom Tools for complete documentation on creating and using custom tools
Code Execution vs Bash/Shell:
Warning
Common Confusion: enable_code_execution and enable_code_interpreter run code in the provider’s sandbox (cloud environment) with NO access to your local filesystem. If you need agents to read/write files in your project, use MCP-based bash instead.
Code Execution (⭐): Backend provider’s native code execution tool (runs in provider sandbox - no access to MassGen workspaces)
openai: OpenAI code interpreter for calculations and data analysis
claude: Anthropic’s code execution tool
gemini: Google’s code execution tool
azure_openai: Azure OpenAI code interpreter
ag2: AG2 framework code executors (Local, Docker, Jupyter, Cloud)
When to use: Quick calculations, data analysis, isolated code snippets that don’t need filesystem access
Bash/Shell: MassGen-level feature with direct workspace access
⭐ (claude_code, codex, gemini_cli): Native shell/bash tools built into the backend CLI/SDK
✅ (all MCP-enabled backends): Universal bash/shell via enable_mcp_command_line: true
When to use: Code that needs to interact with your project files, run tests, execute scripts
See Code Execution for detailed setup and comparison
Recommendation: Choose one approach based on your needs. Use built-in code execution for isolated computational tasks, and MCP bash/shell for operations that need to affect your workspace files.
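As a rough illustration of the distinction above, the following Python sketch inspects a backend section (already parsed into a dict) and reports whether the agent can reach workspace files. The helper names are hypothetical and not part of MassGen; only the flag names and backend types are taken from this page.

```python
# Hypothetical helpers for illustration; not a MassGen API.
NATIVE_SHELL_BACKENDS = {"claude_code", "codex", "gemini_cli"}
SANDBOX_FLAGS = ("enable_code_execution", "enable_code_interpreter")


def can_touch_workspace(backend: dict) -> bool:
    """True if the agent has a shell that runs against the workspace."""
    if backend.get("type") in NATIVE_SHELL_BACKENDS:
        return True  # native shell tools come with the CLI/SDK
    return bool(backend.get("enable_mcp_command_line"))


def uses_provider_sandbox(backend: dict) -> bool:
    """True if a provider-side code execution flag is enabled."""
    return any(backend.get(flag) for flag in SANDBOX_FLAGS)


backend = {"type": "openai", "model": "gpt-5-nano", "enable_code_interpreter": True}
if uses_provider_sandbox(backend) and not can_touch_workspace(backend):
    print("Code runs in the provider sandbox only; project files are not reachable.")
```

Adding enable_mcp_command_line: true to the same backend would make can_touch_workspace return True.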
Filesystem:
⭐ (claude_code, codex, gemini_cli): Native filesystem tools via CLI/SDK (Read, Write, Edit, Bash, etc.)
✅ (all backends with cwd parameter): Filesystem operations handled automatically through workspace configuration
See File Operations & Workspace Management for detailed filesystem configuration
Multimodal Capabilities:
⭐ Native Multimodal Support: The backend/model API directly handles multimodal content
⭐ Both (e.g., openai, azure_openai): Native API supports BOTH understanding (analyze) AND generation (create)
⭐ Generation (e.g., openai video): Can create videos via Sora-2 API but not analyze them
🔧 Via Custom Tools: Multimodal understanding through custom tools (understand_image, understand_video, understand_audio)
Works with any backend that supports custom tools
Requires OPENAI_API_KEY in .env file (tools use OpenAI’s API for processing)
Examples: claude, claude_code, gemini, grok, chatcompletion, lmstudio, inference
Does NOT work with azure_openai or ag2 (these backends don’t support custom tools)
See Multimodal Capabilities for complete setup instructions
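The two requirements for the custom-tool path (a backend with custom tool support and an OPENAI_API_KEY) can be expressed as a small check. This is a sketch with a hypothetical function name; it assumes the .env file has already been loaded into the process environment.

```python
import os

# Backends this page lists as lacking custom tool support.
NO_CUSTOM_TOOLS = {"azure_openai", "ag2"}


def can_use_understanding_tools(backend_type: str, env=None) -> bool:
    """True if understand_image / understand_video / understand_audio can run."""
    env = os.environ if env is None else env
    has_key = bool(env.get("OPENAI_API_KEY"))
    return backend_type not in NO_CUSTOM_TOOLS and has_key


print(can_use_understanding_tools("claude", {"OPENAI_API_KEY": "sk-example"}))  # True
print(can_use_understanding_tools("ag2", {"OPENAI_API_KEY": "sk-example"}))     # False
```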
Understanding vs Generation:
Understanding: Analyze existing content (images, audio, video)
Generation: Create new content from text prompts
Both: Supports both understanding AND generation
See Supported Models & Backends for the complete backend capabilities reference.
Configuring Backends#
Basic Backend Configuration#
Every agent needs a backend section in the YAML configuration:
agents:
  - id: "my_agent"
    backend:
      type: "openai" # Backend type (required)
      model: "gpt-5-nano" # Model name (required)
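A minimal sketch of checking the two required fields before a run, assuming the YAML has already been parsed into a Python dict. The function check_backend is hypothetical and is not MassGen's validator; backends such as ag2 that nest the model elsewhere would need their own rule.

```python
def check_backend(agent: dict) -> list[str]:
    """Return a list of problems for one agent entry (empty if none)."""
    agent_id = agent.get("id", "?")
    backend = agent.get("backend")
    if not isinstance(backend, dict):
        return [f"agent {agent_id!r}: missing 'backend' section"]
    problems = []
    for key in ("type", "model"):
        if not backend.get(key):
            problems.append(f"agent {agent_id!r}: backend.{key} is required")
    return problems


config = {
    "agents": [
        {"id": "my_agent", "backend": {"type": "openai", "model": "gpt-5-nano"}},
        {"id": "broken_agent", "backend": {"type": "openai"}},
    ]
}
for agent in config["agents"]:
    print(agent["id"], check_backend(agent))
```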
Backend-Specific Examples#
OpenAI Backend#
Basic Configuration:
agents:
  - id: "gpt_agent"
    backend:
      type: "openai"
      model: "gpt-5-nano"
      enable_web_search: true
      enable_code_interpreter: true
With Reasoning Parameters:
agents:
  - id: "reasoning_agent"
    backend:
      type: "openai"
      model: "gpt-5-nano"
      text:
        verbosity: "medium" # low, medium, high
      reasoning:
        effort: "high" # low, medium, high
        summary: "auto" # auto, concise, detailed
Supported Models: GPT-5, GPT-5-mini, GPT-5-nano, GPT-4, GPT-4o, GPT-4-turbo, GPT-3.5-turbo
Claude Backend#
Basic Configuration:
agents:
  - id: "claude_agent"
    backend:
      type: "claude"
      model: "claude-sonnet-4"
      enable_web_search: true
      enable_code_interpreter: true
With MCP Integration:
agents:
  - id: "claude_mcp"
    backend:
      type: "claude"
      model: "claude-sonnet-4"
      mcp_servers:
        - name: "weather"
          type: "stdio"
          command: "npx"
          args: ["-y", "@modelcontextprotocol/server-weather"]
Supported Models: claude-haiku-4-5-20251001, claude-sonnet-4-5-20250929, claude-opus-4-1-20250805, claude-sonnet-4-20250514, claude-3-5-sonnet-latest, claude-3-5-haiku-latest
Claude Code Backend#
With Workspace Configuration:
agents:
  - id: "code_agent"
    backend:
      type: "claude_code"
      model: "claude-sonnet-4"
      cwd: "workspace" # Working directory for file operations
orchestrator:
  snapshot_storage: "snapshots"
  agent_temporary_workspace: "temp_workspaces"
Authentication:
The Claude Code backend supports flexible authentication:
API key: Set CLAUDE_CODE_API_KEY or ANTHROPIC_API_KEY environment variable
Subscription: If no API key is set, uses Claude subscription authentication
This allows you to use Claude Code with a subscription while using a separate API key for standard Claude backend agents.
Special Features:
Native file operations (Read, Write, Edit, Bash, Grep, Glob)
Workspace isolation
Snapshot sharing between agents
Full development tool suite
Codex Backend#
Basic Configuration:
agents:
  - id: "codex_agent"
    backend:
      type: "codex"
      model: "gpt-5.4"
      cwd: "workspace"
Authentication:
The Codex backend supports flexible authentication:
API key: Set OPENAI_API_KEY environment variable
ChatGPT subscription: If no API key, uses OAuth via codex login
Supported Models: gpt-5.4 (default), gpt-5.3-codex, gpt-5.2-codex, gpt-5.1-codex, gpt-5-codex, gpt-4.1
Reasoning Effort Configuration:
agents:
  - id: "codex_reasoning"
    backend:
      type: "codex"
      model: "gpt-5.4"
      model_reasoning_effort: "xhigh" # low | medium | high | xhigh
      # reasoning:
      #   effort: "xhigh" # OpenAI-style alias (also supported)
If both model_reasoning_effort and reasoning.effort are provided,
model_reasoning_effort takes precedence.
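The precedence rule can be sketched in a few lines of Python. The function name is hypothetical and the code only mirrors the rule stated above; it is not taken from the Codex backend's source.

```python
def effective_reasoning_effort(backend: dict):
    """Resolve the effort level: model_reasoning_effort wins over reasoning.effort."""
    explicit = backend.get("model_reasoning_effort")
    if explicit is not None:
        return explicit
    # Fall back to the OpenAI-style alias, if present.
    return (backend.get("reasoning") or {}).get("effort")


both = {"model_reasoning_effort": "xhigh", "reasoning": {"effort": "low"}}
alias_only = {"reasoning": {"effort": "medium"}}
print(effective_reasoning_effort(both))        # xhigh
print(effective_reasoning_effort(alias_only))  # medium
```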
Special Features:
Native shell and file operations via Codex CLI
Web search capability
Session persistence and resumption
MCP server support via workspace config
Warning
Sandbox Limitation: Codex uses OS-level sandboxing (Seatbelt/Landlock) which only restricts writes, NOT reads. Codex can read any file on the filesystem. For security-sensitive workloads, use Docker mode or consider Claude Code instead. See Native Tool Backends for details.
Recommended: Docker Mode for Security:
agents:
  - id: "secure_codex"
    backend:
      type: "codex"
      model: "gpt-5.4"
      cwd: "workspace"
      enable_mcp_command_line: true
      command_line_execution_mode: "docker"
      command_line_docker_network_mode: "bridge" # Required for Codex
Gemini CLI Backend#
The gemini_cli backend (alias: gemini-cli) wraps Google’s Gemini CLI (@google/gemini-cli) for local or Docker execution.
Basic Configuration (Local):
agents:
  - id: "gemini_cli_agent"
    backend:
      type: "gemini_cli"
      model: "gemini-2.5-pro"
      cwd: "workspace"
Authentication:
CLI login: Run gemini interactively to log in with Google (preferred)
API key: Set GOOGLE_API_KEY or GEMINI_API_KEY environment variable
Installation: npm install -g @google/gemini-cli
Docker Mode: Requires command_line_docker_network_mode: "bridge". Add @google/gemini-cli to
command_line_docker_packages.preinstall.npm or use an image with Gemini CLI pre-installed.
Supported Models: gemini-2.5-pro (default), gemini-2.5-flash, gemini-2.5-flash-lite, gemini-3-flash-preview, gemini-3-pro-preview, gemini-3.1-pro-preview
Example configs: massgen/configs/providers/gemini/gemini_cli_local.yaml, gemini_cli_docker.yaml
GitHub Copilot Backend#
Prerequisites:
An active GitHub Copilot subscription
Install the Copilot CLI:
# macOS / Linux
brew install copilot-cli
# npm (all platforms)
npm install -g @github/copilot
# Windows
winget install GitHub.Copilot
Authenticate: run copilot and use the /login slash command, or set a GH_TOKEN / GITHUB_TOKEN environment variable with a fine-grained PAT that has the Copilot Requests permission.
Basic Configuration:
agents:
  - id: "copilot-assistant"
    backend:
      type: "copilot"
      model: "gpt-5-mini"
Supported Models: gpt-5-mini (default), gpt-4.1, claude-sonnet-4, gemini-2.5-pro
Special Features:
No API key required — authentication is handled through your GitHub subscription
Web search capability
MCP server support
Session persistence and resumption
Gemini Backend#
Basic Configuration:
agents:
  - id: "gemini_agent"
    backend:
      type: "gemini"
      model: "gemini-2.5-flash"
      enable_web_search: true
      enable_code_execution: true
With Safety Settings:
agents:
  - id: "safe_gemini"
    backend:
      type: "gemini"
      model: "gemini-2.5-pro"
      safety_settings:
        HARM_CATEGORY_HARASSMENT: "BLOCK_MEDIUM_AND_ABOVE"
        HARM_CATEGORY_HATE_SPEECH: "BLOCK_MEDIUM_AND_ABOVE"
Supported Models: gemini-2.5-flash, gemini-2.5-pro, gemini-2.5-flash-thinking
Grok Backend#
Basic Configuration:
agents:
  - id: "grok_agent"
    backend:
      type: "grok"
      model: "grok-3-mini"
      enable_web_search: true
Supported Models: grok-4, grok-4-fast, grok-3, grok-3-mini
Azure OpenAI Backend#
Configuration:
agents:
  - id: "azure_agent"
    backend:
      type: "azure_openai"
      model: "gpt-4"
      deployment_name: "my-gpt4-deployment"
      api_version: "2024-02-15-preview"
Required Environment Variables:
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_VERSION=YOUR-AZURE-OPENAI-API-VERSION
AG2 Backend#
Configuration:
agents:
  - id: "ag2_agent"
    backend:
      type: "ag2"
      agent_type: "ConversableAgent"
      llm_config:
        config_list:
          - model: "gpt-4"
            api_key: "${OPENAI_API_KEY}"
      code_execution_config:
        executor: "local"
        work_dir: "coding"
See General Framework Interoperability for detailed AG2 configuration.
LM Studio Backend#
For Local Models:
agents:
  - id: "local_agent"
    backend:
      type: "lmstudio"
      model: "lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF"
      port: 1234
Features:
Automatic LM Studio CLI installation
Auto-download and loading of models
Zero-cost usage
Full privacy (local inference)
OpenRouter Backend#
OpenRouter provides unified access to multiple AI providers through a single API.
Use the chatcompletion backend type with OpenRouter’s base URL.
Basic Configuration:
agents:
  - id: "openrouter_agent"
    backend:
      type: "chatcompletion"
      model: "openai/gpt-5-mini"
      base_url: "https://openrouter.ai/api/v1"
With Reasoning Tokens:
OpenRouter normalizes reasoning tokens across providers. Configure reasoning for models that support it (OpenAI o-series, GPT-5, Claude 3.7+, Gemini 2.5+, DeepSeek R1, Grok):
agents:
  - id: "reasoning_agent"
    backend:
      type: "chatcompletion"
      model: "openai/gpt-5-mini"
      base_url: "https://openrouter.ai/api/v1"
      reasoning:
        effort: "medium" # xhigh, high, medium, low, minimal, none
        max_tokens: 2000 # Optional: direct token limit (Anthropic-style)
        exclude: false # Optional: set true to hide reasoning from response
With Web Search:
agents:
  - id: "search_agent"
    backend:
      type: "chatcompletion"
      model: "openai/gpt-5-mini"
      base_url: "https://openrouter.ai/api/v1"
      enable_web_search: true
      engine: "exa" # exa (AI-native) or native (traditional)
      max_results: 10
      search_context_size: "high" # low, medium, high
Reasoning Effort Levels:
xhigh: ~95% of max_tokens for reasoning
high: ~80% of max_tokens for reasoning
medium: ~50% of max_tokens for reasoning (default)
low: ~20% of max_tokens for reasoning
minimal: ~10% of max_tokens for reasoning
none: Disable reasoning entirely
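To make the percentages concrete, the sketch below turns an effort level into an approximate reasoning-token budget. The shares are the approximate figures listed above; the function name is hypothetical, and the actual allocation is decided by OpenRouter and the underlying provider.

```python
# Approximate share of max_tokens used for reasoning, in percent.
EFFORT_PERCENT = {
    "xhigh": 95,
    "high": 80,
    "medium": 50,
    "low": 20,
    "minimal": 10,
    "none": 0,
}


def approx_reasoning_budget(max_tokens: int, effort: str = "medium") -> int:
    """Estimate how many of max_tokens may be spent on reasoning."""
    return max_tokens * EFFORT_PERCENT[effort] // 100


print(approx_reasoning_budget(2000, "high"))  # 1600
print(approx_reasoning_budget(2000))          # 1000
```

Because reasoning tokens are billed as output tokens, an estimate like this can help when choosing max_tokens for a cost-sensitive agent.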
Environment Variable:
OPENROUTER_API_KEY=your-openrouter-api-key
Note
Reasoning tokens are output tokens and billed accordingly. Models automatically
include reasoning in responses when appropriate. Use exclude: true if you
want the model to reason internally without returning the reasoning text.
Local Inference Backends (vLLM & SGLang)#
Unified Inference Backend (v0.0.24-v0.0.25)
MassGen supports high-performance local model serving through vLLM and SGLang with automatic server detection:
agents:
  - id: "local_vllm"
    backend:
      type: "chatcompletion"
      model: "meta-llama/Llama-3.1-8B-Instruct"
      base_url: "http://localhost:8000/v1" # vLLM default port
      api_key: "EMPTY"
  - id: "local_sglang"
    backend:
      type: "chatcompletion"
      model: "meta-llama/Llama-3.1-8B-Instruct"
      base_url: "http://localhost:30000/v1" # SGLang default port
      api_key: "${SGLANG_API_KEY}"
Auto-Detection:
vLLM: Default port 8000
SGLang: Default port 30000
Automatically detects server type based on configuration
Unified InferenceBackend class handles both
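This page does not show how the detection is implemented. One plausible approach, sketched here purely as an assumption, is to look at the port in base_url and compare it with the default ports listed above. The function guess_server_type is hypothetical.

```python
from urllib.parse import urlparse

# Default ports from the list above.
DEFAULT_PORTS = {8000: "vllm", 30000: "sglang"}


def guess_server_type(base_url: str) -> str:
    """Guess the server type from the port in base_url; 'unknown' if no match."""
    port = urlparse(base_url).port  # None when the URL has no explicit port
    return DEFAULT_PORTS.get(port, "unknown")


print(guess_server_type("http://localhost:8000/v1"))      # vllm
print(guess_server_type("http://localhost:30000/v1"))     # sglang
print(guess_server_type("https://openrouter.ai/api/v1"))  # unknown
```

A server started on a non-default port would fall through to "unknown" with this approach, so a real implementation would likely need an additional signal.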
SGLang-Specific Parameters:
backend:
  type: "chatcompletion"
  model: "meta-llama/Llama-3.1-8B-Instruct"
  base_url: "http://localhost:30000/v1"
  separate_reasoning: true # SGLang guided generation
  top_k: 50 # Sampling parameter
  repetition_penalty: 1.1 # Prevent repetition
Mixed Deployments:
Run both vLLM and SGLang simultaneously:
agents:
  - id: "vllm_agent"
    backend:
      type: "chatcompletion"
      model: "Qwen/Qwen2.5-7B-Instruct"
      base_url: "http://localhost:8000/v1"
      api_key: "EMPTY"
  - id: "sglang_agent"
    backend:
      type: "chatcompletion"
      model: "Qwen/Qwen2.5-7B-Instruct"
      base_url: "http://localhost:30000/v1"
      api_key: "${SGLANG_API_KEY}"
      separate_reasoning: true
Benefits of Local Inference:
Cost Savings: Zero API costs after initial setup
Privacy: No data sent to external services
Control: Full control over model selection and parameters
Performance: Optimized for high-throughput inference
Customization: Fine-tune models for specific use cases
Setup vLLM Server:
# Install vLLM
pip install vllm
# Start vLLM server
vllm serve meta-llama/Llama-3.1-8B-Instruct \
--host 0.0.0.0 \
--port 8000
Setup SGLang Server:
# Install SGLang
pip install "sglang[all]"
# Start SGLang server
python -m sglang.launch_server \
--model-path meta-llama/Llama-3.1-8B-Instruct \
--host 0.0.0.0 \
--port 30000
Configuration Example:
See @examples/basic/multi/two_qwen_vllm_sglang.yaml for a complete mixed deployment example.
Common Backend Parameters#
Model Parameters#
All backends support these common parameters:
backend:
  type: "openai"
  model: "gpt-5-nano"
  # Generation parameters
  temperature: 0.7 # Randomness (0.0-2.0, default 0.7)
  max_tokens: 4096 # Maximum response length
  top_p: 1.0 # Nucleus sampling (0.0-1.0)
  # API configuration
  api_key: "${OPENAI_API_KEY}" # Optional - uses env var by default
  timeout: 60 # Request timeout in seconds
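The ranges given in the comments above can be checked before a run. This is a sketch with a hypothetical function name, assuming the backend section is already parsed into a dict; the rule that max_tokens must be positive is an assumption added here, not something this page states.

```python
def check_sampling_params(backend: dict) -> list[str]:
    """Return range problems for temperature, top_p and max_tokens."""
    problems = []
    temperature = backend.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        problems.append(f"temperature {temperature} is outside 0.0-2.0")
    top_p = backend.get("top_p")
    if top_p is not None and not 0.0 <= top_p <= 1.0:
        problems.append(f"top_p {top_p} is outside 0.0-1.0")
    max_tokens = backend.get("max_tokens")
    if max_tokens is not None and max_tokens <= 0:
        problems.append("max_tokens must be positive")  # assumed rule
    return problems


print(check_sampling_params({"temperature": 0.7, "top_p": 1.0, "max_tokens": 4096}))  # []
print(check_sampling_params({"temperature": 2.5}))
```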
Tool Configuration#
Enable or disable built-in tools:
backend:
  type: "gemini"
  model: "gemini-2.5-flash"
  # Enable tools
  enable_web_search: true
  enable_code_execution: true
  # MCP servers (see MCP Integration guide)
  mcp_servers:
    - name: "server_name"
      type: "stdio"
      command: "npx"
      args: ["..."]
Multi-Backend Configurations#
Using Different Backends#
Each agent can use a different backend:
agents:
  - id: "fast_researcher"
    backend:
      type: "gemini"
      model: "gemini-2.5-flash"
      enable_web_search: true
  - id: "deep_analyst"
    backend:
      type: "openai"
      model: "gpt-5"
      reasoning:
        effort: "high"
  - id: "code_expert"
    backend:
      type: "claude_code"
      model: "claude-sonnet-4"
      cwd: "workspace"
This is the recommended approach - use each backend’s strengths:
Gemini 2.5 Flash: Fast research with web search
GPT-5: Advanced reasoning and analysis
Claude Code: Development with file operations
Backend Selection Guide#
Choosing the Right Backend#
Consider these factors when selecting backends:
For Research Tasks:
Gemini 2.5 Flash: Fast, cost-effective, excellent web search
GPT-5-nano: Good reasoning with web search
Grok: Real-time information access
For Coding Tasks:
Claude Code: Best for file operations, full dev tools
GPT-5: Advanced code generation with reasoning
Gemini 2.5 Pro: Complex code analysis
For Analysis Tasks:
GPT-5: Deep reasoning and complex analysis
Claude Sonnet 4: Long context, detailed analysis
Gemini 2.5 Pro: Comprehensive multimodal analysis
For Cost-Sensitive Tasks:
GPT-5-nano: Low-cost OpenAI model
Grok-3-mini: Fast and affordable
Gemini 2.5 Flash: Very cost-effective
LM Studio: Free (local inference)
For Privacy-Sensitive Tasks:
LM Studio: Fully local, no data sharing
Azure OpenAI: Enterprise security
Self-hosted vLLM: Private cloud deployment
Native Tool Backends (Claude Code, Codex & Gemini CLI)#
MassGen supports three “native tool” agent backends that wrap CLI/SDK tools rather than just API calls: Claude Code (Anthropic’s Claude Code SDK), Codex (OpenAI’s Codex CLI), and Gemini CLI (Google’s Gemini CLI). For all three, an API key is optional because each can authenticate through its own CLI login or subscription flow. They come with built-in filesystem and shell tools, which gives a more integrated development experience but different security characteristics than API-only backends.
Architecture Differences#
| Aspect | Agent Backends (Claude Code, Codex, Gemini CLI) | API Backends (OpenAI, Claude, Gemini, etc.) |
|---|---|---|
| Tool Execution | Native tools (Read, Write, Bash) run locally via CLI/SDK | Tools run via MassGen’s MCP servers |
| Permission Control | Backend’s own sandbox + limited MassGen hooks | Full MassGen PathPermissionManager control |
| Filesystem Access | Direct local filesystem access | Controlled through MCP filesystem tools |
| State Management | Stateful (session persistence, conversation history) | Stateless (each call is independent) |
| Authentication | CLI login (no API key required) | API key required |
Agent Backend Comparison#
| Feature | Claude Code | Codex | Gemini CLI |
|---|---|---|---|
| Provider | Anthropic (Claude Code SDK) | OpenAI (Codex CLI) | Google (Gemini CLI) |
| Authentication | Subscription or CLAUDE_CODE_API_KEY / ANTHROPIC_API_KEY | OPENAI_API_KEY or codex login (OAuth) | gemini login or GOOGLE_API_KEY / GEMINI_API_KEY |
| Models | Claude Sonnet 4, Claude Opus 4 | GPT-5.4, GPT-5.3-Codex, GPT-5.2-Codex, GPT-5.1-Codex | gemini-2.5-pro, gemini-2.5-flash, gemini-3.1-pro-preview |
| Native Tools | Read, Write, Edit, Bash, Grep, Glob, WebSearch, WebFetch | shell, apply_patch, web_search, image_view | ReadFile, WriteFile, RunShellCommand, WebSearch, WebFetch |
| MCP Support | Yes (SDK-native) | Yes (via .codex/config.toml) | Yes (via .gemini/settings.json) |
| Sandbox Type | SDK permission hooks | OS-level (Seatbelt on macOS, Landlock on Linux) | Process-level (workspace isolation) |
| Read Restrictions | Yes - SDK hooks block reads outside allowed paths | No - OS sandbox only restricts writes | Yes - workspace-scoped |
| Write Restrictions | Yes - SDK hooks enforce write permissions | Yes - OS sandbox restricts writes to writable_roots | Yes - workspace-scoped |
Warning
Codex Sandbox Limitation: Codex uses OS-level sandboxing (Seatbelt on macOS, Landlock on Linux) which only restricts writes, NOT reads. This means Codex can read any file on the filesystem, including sensitive files outside the workspace and context_paths (SSH keys, credentials, environment files, etc.).
MassGen’s permission hooks cannot intercept Codex’s native tool calls because they run directly through the Codex CLI’s internal tools.
Security Recommendations#
For security-sensitive workloads, prefer Docker mode which provides full filesystem isolation via container boundaries:
# Recommended: Docker mode for Codex with sensitive data
agents:
  - id: "secure_codex"
    backend:
      type: "codex"
      model: "gpt-5.4"
      cwd: "workspace"
      enable_mcp_command_line: true
      command_line_execution_mode: "docker"
      command_line_docker_network_mode: "bridge" # Required for Codex
      command_line_docker_enable_sudo: true
Important
Codex in Docker mode requires command_line_docker_network_mode: "bridge".
Without this setting, Codex will fail to execute. The validator will check for this.
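The validator itself is not shown on this page. As an illustration only, a check of this kind could look like the sketch below. The function name is hypothetical, and including gemini_cli reflects the Docker Mode note in the Gemini CLI section rather than any confirmed validator behavior.

```python
# Backend types this page says need bridge networking in Docker mode.
BRIDGE_REQUIRED = {"codex", "gemini_cli", "gemini-cli"}


def docker_network_problem(backend: dict):
    """Return an error message if Docker mode lacks bridge networking, else None."""
    if backend.get("type") not in BRIDGE_REQUIRED:
        return None
    if backend.get("command_line_execution_mode") != "docker":
        return None
    if backend.get("command_line_docker_network_mode") == "bridge":
        return None
    return f'{backend["type"]}: Docker mode requires command_line_docker_network_mode: "bridge"'


bad = {"type": "codex", "command_line_execution_mode": "docker"}
print(docker_network_problem(bad))
```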
In Docker mode:
The container itself is the sandbox - Codex’s native tools can only access what’s mounted
Host filesystem is fully isolated from the agent
~/.codex/ is mounted read-only for OAuth token access
The Codex CLI runs with --sandbox danger-full-access since the container provides isolation
When Docker is not available, consider:
Use Claude Code or Gemini CLI instead - Both provide read/write restrictions via their own permission model
Limit context_paths - Only grant access to directories that need agent access
Avoid sensitive data - Don’t run Codex in directories with credentials or secrets
Use API-only backends - For maximum control, use openai or claude backends with MCP tools
Backend Configuration Best Practices#
Start with defaults: Test with default parameters before tuning
Use environment variables: Never hardcode API keys
Match backend to task: Use each backend’s strengths
Enable only needed tools: Disable unused capabilities
Set appropriate timeouts: Longer timeouts for complex tasks
Monitor costs: Track API usage across backends
Test configurations: Verify settings before production use
Advanced Backend Configuration#
For detailed backend-specific parameters, see:
YAML Configuration Reference - Complete YAML schema
MCP Integration#
See MCP Integration for:
Adding MCP servers to backends
Tool filtering (allowed_tools, exclude_tools)
Planning mode configuration (v0.0.29)
HTTP-based MCP servers
File Operations#
See File Operations & Workspace Management for:
Workspace configuration
Snapshot storage
Permission management
Cross-agent file sharing
Troubleshooting#
Backend not found:
Ensure the backend type is correct:
# Correct backend types
type: "openai" # ✅
type: "claude" # ✅ (consider "claude_code" for dev tools)
type: "claude_code" # ✅
type: "codex" # ✅
type: "copilot" # ✅
type: "gemini" # ✅
type: "gemini_cli" # ✅
# Incorrect (common mistakes)
type: "gpt" # ❌ Use "openai"
type: "google" # ❌ Use "gemini"
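A small helper can catch these mistakes early. The sketch below is hypothetical and not part of MassGen: the list of types is collected from this page, and the fuzzy matching via difflib is an added convenience for typos.

```python
import difflib

# Backend types mentioned on this page.
KNOWN_TYPES = [
    "openai", "claude", "claude_code", "codex", "copilot", "gemini",
    "gemini_cli", "gemini-cli", "grok", "azure_openai", "zai", "ag2",
    "lmstudio", "chatcompletion", "inference",
]
# Common mistakes listed above, mapped to the correct type.
COMMON_MISTAKES = {"gpt": "openai", "google": "gemini"}


def suggest_backend_type(value: str):
    """Return a valid backend type for value, or None if nothing is close."""
    if value in KNOWN_TYPES:
        return value
    if value in COMMON_MISTAKES:
        return COMMON_MISTAKES[value]
    matches = difflib.get_close_matches(value, KNOWN_TYPES, n=1)
    return matches[0] if matches else None


print(suggest_backend_type("gpt"))     # openai
print(suggest_backend_type("google"))  # gemini
```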
API key not found:
Check your .env file has the correct variable name:
# Backend type → Environment variable
openai → OPENAI_API_KEY
claude → ANTHROPIC_API_KEY
claude_code → CLAUDE_CODE_API_KEY (falls back to ANTHROPIC_API_KEY)
codex → OPENAI_API_KEY (or use `codex login` for OAuth)
copilot → GH_TOKEN or GITHUB_TOKEN (or use /login in Copilot CLI)
gemini → GOOGLE_API_KEY
gemini_cli → GOOGLE_API_KEY or GEMINI_API_KEY (or use `gemini` login)
grok → XAI_API_KEY
zai → ZAI_API_KEY
azure_openai → AZURE_OPENAI_API_KEY
Note
Separate API keys for Claude Code: The claude_code backend checks
CLAUDE_CODE_API_KEY first, then falls back to ANTHROPIC_API_KEY.
This allows you to use a Claude subscription (no API key needed) or a
separate API key for Claude Code agents while using a different API key
for standard Claude backend agents.
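The mapping above can be turned into a quick pre-flight check. This is a sketch with hypothetical names; it assumes the .env file has already been loaded into the environment and treats the CLI-based backends as having a login fallback, as described on this page.

```python
import os

# Backend type -> environment variables, in the order they are checked.
ENV_VARS = {
    "openai": ["OPENAI_API_KEY"],
    "claude": ["ANTHROPIC_API_KEY"],
    "claude_code": ["CLAUDE_CODE_API_KEY", "ANTHROPIC_API_KEY"],
    "codex": ["OPENAI_API_KEY"],
    "copilot": ["GH_TOKEN", "GITHUB_TOKEN"],
    "gemini": ["GOOGLE_API_KEY"],
    "gemini_cli": ["GOOGLE_API_KEY", "GEMINI_API_KEY"],
    "grok": ["XAI_API_KEY"],
    "zai": ["ZAI_API_KEY"],
    "azure_openai": ["AZURE_OPENAI_API_KEY"],
}
# Backends that can fall back to a CLI login or subscription.
LOGIN_FALLBACK = {"claude_code", "codex", "copilot", "gemini_cli"}


def key_status(backend_type: str, env=None) -> str:
    """Describe which credential, if any, a backend type would pick up."""
    env = os.environ if env is None else env
    names = ENV_VARS.get(backend_type, [])
    for name in names:
        if env.get(name):
            return f"found {name}"
    if backend_type in LOGIN_FALLBACK:
        return "no key set; CLI login or subscription is used instead"
    if not names:
        return "no variable listed for this backend type"
    return "missing: set " + " or ".join(names)


print(key_status("claude_code", {"ANTHROPIC_API_KEY": "sk-example"}))  # found ANTHROPIC_API_KEY
print(key_status("grok", {}))  # missing: set XAI_API_KEY
```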
Model not supported:
Verify the model name matches the backend’s supported models:
# Check supported models in README.md or use --model flag
backend:
  type: "openai"
  model: "gpt-5-nano" # ✅ Supported
  # model: "gpt-6" # ❌ Not yet available
Next Steps#
Configuration - Full configuration guide
MCP Integration - Add external tools via MCP
File Operations & Workspace Management - Enable file system operations
Supported Models & Backends - Complete model list
Basic Examples - See backends in action