MassGen: Multi-Agent Scaling System for GenAI
What is MassGen?
MassGen is a multi-agent system that uses collaborative AI to solve complex tasks. It assigns a task to multiple AI agents that work in parallel, observe each other’s progress, and refine their approaches until they converge on the best solution.
How It Works:
Work in Parallel - Multiple agents tackle the problem simultaneously, each bringing unique capabilities
See Recent Answers - At each step, agents view the most recent answers from other agents
Decide Next Action - Each agent chooses to provide a new answer or vote for an existing answer
Share Workspaces - When agents provide answers, their workspace is captured so others can review their work
Natural Consensus - Coordination continues until all agents vote, then the agent with the most votes presents the final answer
Think of it as a “parallel study group” for AI - inspired by advanced systems like xAI’s Grok Heavy and Google DeepMind’s Gemini Deep Think. Agents learn from each other to produce better results than any single agent could achieve alone.
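The coordination steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not MassGen's implementation: `coordinate`, `NewAnswer`, `Vote`, and `make_agent` are hypothetical names, agents are plain Python callables rather than LLM backends, workspace sharing is omitted, and the sketch assumes that a round only counts as consensus when every agent votes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union


@dataclass(frozen=True)
class NewAnswer:
    text: str


@dataclass(frozen=True)
class Vote:
    target: str  # id of the agent whose answer receives the vote


Action = Union[NewAnswer, Vote]
# Hypothetical agent type: (task, latest answers by agent id) -> action.
Agent = Callable[[str, Dict[str, str]], Action]


def coordinate(task: str, agents: Dict[str, Agent], max_rounds: int = 10) -> Tuple[str, str]:
    """Run rounds until every agent votes; return (winner id, winning answer)."""
    answers: Dict[str, str] = {}
    for _ in range(max_rounds):
        snapshot = dict(answers)  # every agent sees the same recent answers
        # Work in parallel: each agent decides its next action concurrently.
        with ThreadPoolExecutor(max_workers=len(agents)) as pool:
            futures = {aid: pool.submit(agent, task, snapshot) for aid, agent in agents.items()}
            actions = {aid: fut.result() for aid, fut in futures.items()}

        new_answers = {aid: act.text for aid, act in actions.items() if isinstance(act, NewAnswer)}
        if new_answers:
            # Assumption of this sketch: a new answer means another round is needed.
            answers.update(new_answers)
            continue

        # All agents voted: tally the votes (ties fall to the first target counted).
        votes = Counter(act.target for act in actions.values())
        winner, _ = votes.most_common(1)[0]
        if winner not in answers:
            raise ValueError(f"vote for unknown answer: {winner!r}")
        return winner, answers[winner]
    raise RuntimeError("no consensus within max_rounds")


def make_agent(name: str, draft: str) -> Agent:
    """Toy agent: answers once, then votes for the longest answer it has seen."""
    def agent(task: str, latest: Dict[str, str]) -> Action:
        if name not in latest:
            return NewAnswer(draft)
        # Answer length is only a stand-in for a real quality judgment.
        return Vote(max(latest, key=lambda aid: len(latest[aid])))
    return agent


agents = {
    "a": make_agent("a", "short"),
    "b": make_agent("b", "a longer, more detailed answer"),
}
winner, final = coordinate("Your question", agents)
print(winner, "->", final)
```

In the toy run, both agents answer in the first round and vote in the second, so coordination ends after two rounds. A real system would replace the toy agents with model calls and would likely need a tie-breaking policy.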
Quick Start
pip install uv # if needed
uv venv && source .venv/bin/activate
uv pip install massgen
uv run massgen # Setup wizard, then ask your first question
# Or call MassGen from Python through LiteLLM:
from dotenv import load_dotenv
load_dotenv()  # Load OPENROUTER_API_KEY from .env

import litellm
from massgen import register_with_litellm

register_with_litellm()  # Register MassGen as a LiteLLM custom provider

response = litellm.completion(
    model="massgen/build",
    messages=[{"role": "user", "content": "Your question"}],
    # Models the agents will use
    optional_params={
        "models": ["openrouter/openai/gpt-5", "openrouter/anthropic/claude-sonnet-4.5"]
    },
)
print(response.choices[0].message.content)
Key Features
Use Claude, Gemini, GPT, and Grok together - each agent can use a different model.
Multiple agents work simultaneously with voting and consensus detection.
Model Context Protocol (MCP) support for web search, code execution, file operations, and custom tools.
Full async Python API and LiteLLM integration for seamless application embedding.
Real-time terminal display showing agents’ working processes and coordination.
Interactive conversations with context preservation across turns.
Integrate external frameworks (AG2, LangGraph, AgentScope, OpenAI, SmolAgent) as tools.
Work directly with your codebase using context paths with granular read/write permissions.
Recent Releases
v0.1.19 (December 2, 2025) - LiteLLM Provider & Claude Strict Tool Use
LiteLLM custom provider integration with programmatic API (run(), build_config()). Claude strict tool use with structured outputs support via enable_strict_tool_use and output_schema. Gemini exponential backoff for rate limit resilience.
v0.1.18 (November 28, 2025) - Agent Communication & Claude Advanced Tooling
Agent-to-agent and human broadcast communication via ask_others() tool with three modes (disabled, agents-only, human-only). Claude programmatic tool calling from code execution via enable_programmatic_flow flag. Claude tool search for deferred tool discovery via enable_tool_search.
v0.1.17 (November 26, 2025) - Textual Terminal Display
Interactive terminal UI using the Textual library with dark/light theme support. Multi-panel layout with dedicated views for each agent and orchestrator status. Real-time streaming with syntax highlighting and emoji fallback.
Supported Models
Claude (Anthropic) · Gemini (Google) · GPT (OpenAI) · Grok (xAI) · Azure OpenAI · Groq · Together · LM Studio · and more…