Supported Models & Backends#
MassGen supports a wide range of LLM providers and models. This page lists each backend type, the models it supports, and what it needs for setup.
Quick Reference: Backend Setup#
| Backend Type | Setup Requirements |
|---|---|
| Claude API | |
| Claude Code | Native tools: Read, Write, Edit, Bash, Grep, Glob, TodoWrite. If logged in via Anthropic account, |
| Gemini API | |
| OpenAI API | |
| Grok API | |
| Azure OpenAI | Azure deployment config: |
| Z AI | |
| ChatCompletion | |
| LM Studio | Local LM Studio server running |
| vLLM/SGLang | Local inference server on port 8000 (vLLM) or 30000 (SGLang) |
| AG2 Framework | AG2 installation + LLM API keys for chosen provider |
For detailed backend capabilities (web search, code execution, MCP support), see: Backend Configuration
API-Based Models#
Azure OpenAI#
| Models | GPT-4, GPT-4o, GPT-3.5-turbo, GPT-4.1, GPT-5-chat |
| Backend Type | |
| Tools Support | Code interpreter, Azure deployment management |
| MCP Support | ❌ Not yet supported |
Claude (Anthropic)#
| Models | Haiku 3.5, Sonnet 4, Opus 4 series |
| Backend Type | |
| Tools Support | ✅ Web search, code execution, file operations |
| MCP Support | ✅ Full integration |
Claude Code#
| Models | Native Claude Code SDK |
| Backend Type | |
| Tools Support | ✅ Native dev tools: Read, Write, Edit, Bash, Grep, Glob, TodoWrite |
| MCP Support | ✅ Full integration |
Gemini (Google)#
| Models | Gemini 2.5 Flash, Gemini 2.5 Pro series |
| Backend Type | |
| Tools Support | ✅ Web search, code execution, file operations |
| MCP Support | ✅ Full integration with planning mode |
Grok (xAI)#
| Models | Grok-4, Grok-3, Grok-3-mini series |
| Backend Type | |
| Tools Support | ✅ Web search, file operations |
| MCP Support | ✅ Full integration |
OpenAI#
| Models | GPT-5.2, GPT-5, GPT-5-mini, GPT-5-nano, GPT-4 series |
| Backend Type | |
| Tools Support | ✅ Web search, code interpreter, file operations |
| MCP Support | ✅ Full integration |
Z AI#
| Models | GLM-4.5 |
| Backend Type | |
| Tools Support | File operations |
| MCP Support | ✅ Integration available |
ChatCompletion (Generic OpenAI-Compatible)#
The chatcompletion backend provides a generic way to connect to any OpenAI-compatible API endpoint. This is the most flexible backend type and works with many providers.
| Backend Type | |
| Compatible Providers | Cerebras AI, Together AI, Fireworks AI, Groq, OpenRouter, POE, and any OpenAI-compatible API |
| Required Config | |
| API Key | Provider-specific (e.g., |
| MCP Support | ✅ Full integration |
| Tools Support | Depends on provider’s function calling support |
Configuration Example:
```yaml
backend:
  type: "chatcompletion"
  model: "gpt-oss-120b"                   # Model name
  base_url: "https://api.cerebras.ai/v1"  # Provider endpoint
  api_key: "${CEREBRAS_API_KEY}"          # Provider API key
  temperature: 0.7
  max_tokens: 2000
  mcp_servers:                            # Optional MCP tools
    - name: "weather"
      type: "stdio"
      command: "npx"
      args: ["-y", "@modelcontextprotocol/server-weather"]
```
Supported Providers:
| Provider | Base URL | Environment Variable |
|---|---|---|
| Cerebras AI | | |
| Together AI | | |
| Fireworks AI | | |
| Groq | | |
| OpenRouter | | |
| Kimi/Moonshot | | |
| Nebius AI Studio | Provider-specific | |
| POE | Platform-specific | Platform credentials |
Common Models:
Cerebras: gpt-oss-120b, gpt-oss-70b
Together AI: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo, mistralai/Mixtral-8x7B-Instruct-v0.1
Fireworks AI: accounts/fireworks/models/llama-v3p1-405b-instruct
Groq: llama-3.1-70b-versatile, mixtral-8x7b-32768
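
Switching providers on the chatcompletion backend should only require changing the model, endpoint, and key. The following is a minimal sketch for Groq, not a verified configuration: the model name comes from the Common Models list, while the base URL is Groq's publicly documented OpenAI-compatible endpoint (it is not stated on this page) and the `GROQ_API_KEY` variable name is an illustrative assumption.

```yaml
backend:
  type: "chatcompletion"
  model: "llama-3.1-70b-versatile"            # From the Common Models list
  base_url: "https://api.groq.com/openai/v1"  # Assumption: confirm against Groq's documentation
  api_key: "${GROQ_API_KEY}"                  # Assumption: illustrative variable name
```

Confirm that the provider still serves the chosen model before relying on this, since hosted model catalogs change often.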
Tool Enablement Reference#
This section shows exactly which configuration parameters work with which backends.
Backend-Level Tool Parameters#
| Backend | | | | Notes |
|---|---|---|---|---|
| claude | ✅ | ✅ | ❌ | Built-in tools via Anthropic API |
| claude_code | N/A | N/A | N/A | Native tools always available: Read, Write, Edit, Bash, Grep, Glob, TodoWrite. Control via |
| gemini | ✅ | ✅ | ❌ | Google Search and code execution tools |
| openai | ✅ | ❌ | ✅ | Web search via Responses API, code interpreter for calculations |
| grok | ✅ | ❌ | ❌ | Built-in Live Search feature |
| azure_openai | ❌ | ❌ | ❌ | Limited tool support |
| zai | ❌ | ❌ | ❌ | Basic file operations only |
| chatcompletion | Varies | Varies | Varies | Depends on provider (Cerebras, Together AI, etc.) |
| lmstudio | ❌ | ❌ | ❌ | Local models, tool support varies |
| vllm | ❌ | ❌ | ❌ | Local inference server |
| sglang | ❌ | ❌ | ❌ | Local inference server |
| ag2 | N/A | N/A | N/A | Uses AG2 code execution config |
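
As a rough illustration of how a row of this table might translate into a config, the sketch below enables the two tools marked ✅ for the `openai` backend. The names of the three flag columns are missing from the table on this page, so `enable_web_search` and `enable_code_interpreter` are assumed names for the first and third columns; check them against the Backend Configuration page before use.

```yaml
backend:
  type: "openai"
  model: "gpt-5-mini"
  enable_web_search: true        # Assumed flag name (first column)
  enable_code_interpreter: true  # Assumed flag name (third column)
```

For a backend whose second column is ✅ (claude, gemini), the corresponding flag would be set instead of the code interpreter one.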
MCP Backend Parameters#
These parameters are available for all backends with MCP support (Claude, Gemini, OpenAI, Grok, ChatCompletion, etc.).
| Parameter | Type | Description & Usage |
|---|---|---|
| | string | Working directory for MCP filesystem operations. Relative or absolute path. Available for all MCP-enabled backends. |
| | list | Whitelist specific tools. Only listed tools will be available. Example: |
| | list | Blacklist specific tools. All tools available except those listed. Example: |
| | list | Exclude specific MCP tools from being available to the agent. Similar to |
Claude Code Additional Parameters#
These parameters are specific to the Claude Code backend only.
| Parameter | Type | Description & Usage |
|---|---|---|
| | integer | Maximum tokens for internal reasoning. Default: 8000. Increase for complex tasks. |
| | string | Custom system prompt for the agent. Prepended to default instructions. |
| | string | |
| | list | For Claude Code native tools (Read, Write, Edit, Bash, etc.). Default: |
Example MCP Configuration (any backend):
```yaml
backend:
  type: "gemini"              # or claude, openai, grok, etc.
  model: "gemini-2.5-flash"
  cwd: "my_project"           # File operations handled via cwd
  disallowed_tools: ["mcp__weather__set_location"]
  mcp_servers:
    - name: "weather"
      type: "stdio"
      command: "npx"
      args: ["-y", "@modelcontextprotocol/server-weather"]
```
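
The whitelist parameter described in the MCP table works the other way round: only the listed tools are exposed. The sketch below mirrors the blacklist example under two assumptions: the parameter is spelled `allowed_tools` (its name is missing from the table on this page), and the weather server offers a tool named `get_forecast`, which is hypothetical.

```yaml
backend:
  type: "gemini"
  model: "gemini-2.5-flash"
  cwd: "my_project"
  allowed_tools: ["mcp__weather__get_forecast"]  # Assumed key name, hypothetical tool name
  mcp_servers:
    - name: "weather"
      type: "stdio"
      command: "npx"
      args: ["-y", "@modelcontextprotocol/server-weather"]
```

A whitelist is usually the safer default when a server exposes many tools and the agent needs only one or two of them.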
Example Claude Code Configuration:
```yaml
backend:
  type: "claude_code"
  model: "claude-sonnet-4-20250514"
  cwd: "my_project"
  disallowed_tools: ["Bash(rm*)", "Bash(sudo*)", "WebSearch"]
  max_thinking_tokens: 10000
  system_prompt: "You are an expert Python developer"
```
Local Models#
LM Studio#
| Models | LLaMA, Mistral, Qwen, and other open-weight models |
| Backend Type | |
| Features | Automatic CLI installation, auto-download, zero-cost usage |
| MCP Support | Limited |
vLLM & SGLang#
Unified inference backend supporting both vLLM and SGLang servers.
| Port Detection | Auto-detection: vLLM (8000), SGLang (30000) |
| Parameters | Supports both vLLM and SGLang-specific params (top_k, repetition_penalty, separate_reasoning) |
| Mixed Deployment | Can run both vLLM and SGLang servers simultaneously |
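
A local vLLM backend might be configured along the following lines. This is a sketch under stated assumptions: the backend name `vllm` and the parameters `top_k` and `repetition_penalty` appear on this page, but the `base_url` key (borrowed from the chatcompletion example), the `/v1` path, the placement of the sampling parameters directly under `backend`, and the model name are assumptions.

```yaml
backend:
  type: "vllm"
  model: "your-served-model-name"       # Placeholder: use the model your server loaded
  base_url: "http://localhost:8000/v1"  # Assumption: vLLM on its default port 8000
  top_k: 40                             # Illustrative value
  repetition_penalty: 1.1               # Illustrative value
```

For SGLang, the same shape would presumably apply with `type: "sglang"` and port 30000. Since both servers can run at once, two agents could each point at a different port.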
External Frameworks#
AG2#
| Agent Types | ConversableAgent, AssistantAgent |
| Backend Type | |
| Features | Code execution (Local, Docker, Jupyter, Cloud) |
| LLM Support | OpenAI, Azure, Anthropic, Google via AG2 config |
See Also#
Backend Configuration - Detailed backend configuration
MCP Integration - MCP tool setup
General Framework Interoperability - Framework interoperability (including AG2)
YAML Configuration Reference - YAML configuration reference