Supported Models & Backends#

MassGen supports a wide range of LLM providers and models. This page provides comprehensive information about backend types, model support, and setup requirements.

Quick Reference: Backend Setup#

Backend Type

Setup Requirements

Claude API

ANTHROPIC_API_KEY

Claude Code

Native tools: Read, Write, Edit, Bash, Grep, Glob, TodoWrite. If logged in via Anthropic account, ANTHROPIC_API_KEY is NOT needed (comment it out in .env or it will default to using the API key)

Gemini API

GEMINI_API_KEY

OpenAI API

OPENAI_API_KEY

Grok API

XAI_API_KEY

Azure OpenAI

Azure deployment config: AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_VERSION

Z AI

ZAI_API_KEY

ChatCompletion

base_url + provider-specific API key (e.g., CEREBRAS_API_KEY, TOGETHER_API_KEY)

LM Studio

Local LM Studio server running

vLLM/SGLang

Local inference server on port 8000 (vLLM) or 30000 (SGLang)

AG2 Framework

AG2 installation + LLM API keys for chosen provider

For detailed backend capabilities (web search, code execution, MCP support), see: Backend Configuration

API-Based Models#

Azure OpenAI#

Models

GPT-4, GPT-4o, GPT-3.5-turbo, GPT-4.1, GPT-5-chat

Backend Type

azure_openai

Tools Support

Code interpreter, Azure deployment management

MCP Support

❌ Not yet supported

Claude (Anthropic)#

Models

Haiku 3.5, Sonnet 4, Opus 4 series

Backend Type

claude

Tools Support

✅ Web search, code execution, file operations

MCP Support

✅ Full integration

Claude Code#

Models

Native Claude Code SDK

Backend Type

claude_code

Tools Support

Native dev tools: Read, Write, Edit, Bash, Grep, Glob, TodoWrite

MCP Support

✅ Full integration

Gemini (Google)#

Models

Gemini 2.5 Flash, Gemini 2.5 Pro series

Backend Type

gemini

Tools Support

✅ Web search, code execution, file operations

MCP Support

✅ Full integration with planning mode

Grok (xAI)#

Models

Grok-4, Grok-3, Grok-3-mini series

Backend Type

grok

Tools Support

✅ Web search, file operations

MCP Support

✅ Full integration

OpenAI#

Models

GPT-5.2, GPT-5, GPT-5-mini, GPT-5-nano, GPT-4 series

Backend Type

openai

Tools Support

✅ Web search, code interpreter, file operations

MCP Support

✅ Full integration

Z AI#

Models

GLM-4.5

Backend Type

zai

Tools Support

File operations

MCP Support

✅ Integration available

ChatCompletion (Generic OpenAI-Compatible)#

The chatcompletion backend provides a generic way to connect to any OpenAI-compatible API endpoint. This is the most flexible backend type and works with many providers.

Backend Type

chatcompletion

Compatible Providers

Cerebras AI, Together AI, Fireworks AI, Groq, OpenRouter, POE, and any OpenAI-compatible API

Required Config

base_url pointing to the provider’s API endpoint

API Key

Provider-specific (e.g., CEREBRAS_API_KEY, TOGETHER_API_KEY)

MCP Support

✅ Full integration

Tools Support

Depends on provider’s function calling support

Configuration Example:

backend:
  type: "chatcompletion"
  model: "gpt-oss-120b"              # Model name
  base_url: "https://api.cerebras.ai/v1"  # Provider endpoint
  api_key: "${CEREBRAS_API_KEY}"    # Provider API key
  temperature: 0.7
  max_tokens: 2000
  mcp_servers:                       # Optional MCP tools
    - name: "weather"
      type: "stdio"
      command: "npx"
      args: ["-y", "@modelcontextprotocol/server-weather"]

Supported Providers:

Provider

Base URL

Environment Variable

Cerebras AI

https://api.cerebras.ai/v1

CEREBRAS_API_KEY

Together AI

https://api.together.xyz/v1

TOGETHER_API_KEY

Fireworks AI

https://api.fireworks.ai/inference/v1

FIREWORKS_API_KEY

Groq

https://api.groq.com/openai/v1

GROQ_API_KEY

OpenRouter

https://openrouter.ai/api/v1

OPENROUTER_API_KEY

Kimi/Moonshot

https://api.moonshot.cn/v1

MOONSHOT_API_KEY

Nebius AI Studio

Provider-specific

NEBIUS_API_KEY

POE

Platform-specific

Platform credentials

Common Models:

  • Cerebras: gpt-oss-120b, gpt-oss-70b

  • Together AI: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo, mistralai/Mixtral-8x7B-Instruct-v0.1

  • Fireworks AI: accounts/fireworks/models/llama-v3p1-405b-instruct

  • Groq: llama-3.1-70b-versatile, mixtral-8x7b-32768

Tool Enablement Reference#

This section shows exactly which configuration parameters work with which backends.

Backend-Level Tool Parameters#

Backend

enable_web_search

enable_code_execution

enable_code_interpreter

Notes

claude

Built-in tools via Anthropic API

claude_code

N/A

N/A

N/A

Native tools always available: Read, Write, Edit, Bash, Grep, Glob, TodoWrite. Control via allowed_tools or disallowed_tools

gemini

Google Search and code execution tools

openai

Web search via Responses API, code interpreter for calculations

grok

Built-in Live Search feature

azure_openai

Limited tool support

zai

Basic file operations only

chatcompletion

Varies

Varies

Varies

Depends on provider (Cerebras, Together AI, etc.)

lmstudio

Local models, tool support varies

vllm

Local inference server

sglang

Local inference server

ag2

N/A

N/A

N/A

Uses AG2 code execution config

MCP Backend Parameters#

These parameters are available for all backends with MCP support (Claude, Gemini, OpenAI, Grok, ChatCompletion, etc.).

Parameter

Type

Description & Usage

cwd

string

Working directory for MCP filesystem operations. Relative or absolute path. Available for all MCP-enabled backends.

allowed_tools

list

Whitelist specific tools. Only listed tools will be available. Example: ["read_file", "write_file", "list_directory"]

disallowed_tools

list

Blacklist specific tools. All tools available except those listed. Example: ["write_file", "create_directory", "move_file"]

exclude_tools

list

Exclude specific MCP tools from being available to the agent. Similar to disallowed_tools for MCP servers.

Claude Code Additional Parameters#

These parameters are specific to the Claude Code backend only.

Parameter

Type

Description & Usage

max_thinking_tokens

integer

Maximum tokens for internal reasoning. Default: 8000. Increase for complex tasks.

system_prompt

string

Custom system prompt for the agent. Prepended to default instructions.

permission_mode

string

"bypassPermissions" to skip confirmation prompts (use in automation)

disallowed_tools

list

For Claude Code native tools (Read, Write, Edit, Bash, etc.). Default: ["Bash(rm*)", "Bash(sudo*)", "Bash(su*)", "Bash(chmod*)", "Bash(chown*)"]. Example to block web access: ["Bash(rm*)", "WebSearch"]

Example MCP Configuration (any backend):

backend:
  type: "gemini"  # or claude, openai, grok, etc.
  model: "gemini-2.5-flash"
  cwd: "my_project"  # File operations handled via cwd
  disallowed_tools: ["mcp__weather__set_location"]
  mcp_servers:
    - name: "weather"
      type: "stdio"
      command: "npx"
      args: ["-y", "@modelcontextprotocol/server-weather"]

Example Claude Code Configuration:

backend:
  type: "claude_code"
  model: "claude-sonnet-4-20250514"
  cwd: "my_project"
  disallowed_tools: ["Bash(rm*)", "Bash(sudo*)", "WebSearch"]
  max_thinking_tokens: 10000
  system_prompt: "You are an expert Python developer"

Local Models#

LM Studio#

Models

LLaMA, Mistral, Qwen, and other open-weight models

Backend Type

lmstudio

Features

Automatic CLI installation, auto-download, zero-cost usage

MCP Support

Limited

vLLM & SGLang#

Unified inference backend supporting both vLLM and SGLang servers.

Port Detection

Auto-detection: vLLM (8000), SGLang (30000)

Parameters

Supports both vLLM and SGLang-specific params (top_k, repetition_penalty, separate_reasoning)

Mixed Deployment

Can run both vLLM and SGLang servers simultaneously

External Frameworks#

AG2#

Agent Types

ConversableAgent, AssistantAgent

Backend Type

ag2

Features

Code execution (Local, Docker, Jupyter, Cloud)

LLM Support

OpenAI, Azure, Anthropic, Google via AG2 config

See Also#