Architecture#
MassGen’s architecture is designed for scalability, flexibility, and extensibility.
System Overview#
┌─────────────────────────────────────────┐
│ User Application │
└─────────────┬───────────────────────────┘
│
┌─────────────▼───────────────────────────┐
│ Orchestrator Layer │
│ ┌─────────────┬──────────────────┐ │
│ │ Strategy │ Consensus │ │
│ │ Manager │ Engine │ │
│ └─────────────┴──────────────────┘ │
└─────────────┬───────────────────────────┘
│
┌─────────────▼───────────────────────────┐
│ Agent Layer │
│ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ │
│ │Agent1│ │Agent2│ │Agent3│ │AgentN│ │
│ └──┬───┘ └──┬───┘ └──┬───┘ └──┬───┘ │
└─────┼────────┼────────┼────────┼──────┘
│ │ │ │
┌─────▼────────▼────────▼────────▼──────┐
│ Backend Abstraction │
│ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ │
│ │OpenAI│ │Claude│ │Gemini│ │ Grok │ │
│ └──────┘ └──────┘ └──────┘ └──────┘ │
└─────────────────────────────────────────┘
Core Components#
Orchestrator#
The orchestrator manages agent coordination:
Task distribution
Strategy execution
Consensus building
Result aggregation
Agent#
Agents are autonomous units with:
Unique identity and role
Backend connection
Tool access
Memory management
Backend#
Backends provide LLM capabilities:
API abstraction
Model management
Response handling
Error recovery
Design Principles#
Modularity: Components are loosely coupled
Extensibility: Easy to add new agents, backends, tools
Scalability: Supports horizontal scaling
Resilience: Fault-tolerant design
Flexibility: Multiple orchestration strategies
Coordination Protocol#
MassGen uses a “parallel study group” coordination protocol inspired by advanced systems like xAI’s Grok Heavy and Google DeepMind’s Gemini Deep Think.
Vote-Based Consensus#
The coordination process follows these steps:
Parallel Execution: All agents receive the same query and work simultaneously
Answer Observation: Agents can see recent answers from other agents
Decision Making: Each agent chooses to either:
Provide a new/refined answer (
new_answertool)Vote for an existing answer they think is best (
votetool)
Dynamic Updates: When an agent provides
new_answer:Other agents receive update injection mid-work
Agents continue with preserved context (inject-and-continue)
All existing votes are cleared (new answer invalidates votes)
Consensus Detection: Coordination continues until all agents have voted
Winner Selection: The agent with the most votes is selected
Final Presentation: The winning agent delivers the final answer
Key Features:
Natural Convergence: No forced consensus, agents naturally agree on best answer
Iterative Refinement: Agents can refine their answers after seeing others’ work
Workspace Sharing: When agents answer, their workspace is snapshotted for others to review
Tie Resolution: Deterministic tie-breaking based on answer order
Inject-and-Continue (Preempt-Not-Restart)#
When an agent provides a new_answer while other agents are working, MassGen uses an inject-and-continue approach instead of restarting agents from scratch.
Traditional Approach (Restart):
Agent A: Working on solution... [deep in analysis]
Agent B: Provides new_answer
↓
Agent A: KILL stream → Clear context → Restart from zero
❌ Lost all partial work and thinking
MassGen Approach (Inject-and-Continue):
Agent A: Working on solution... [deep in analysis]
Agent B: Provides new_answer
↓
Agent A: Receive UPDATE → Inject new context → Continue working
✅ Preserved all partial work and thinking
✅ Can now build on Agent B's answer
Benefits:
Context Preservation: Agents keep their full thinking history
Efficiency: No wasted computation regenerating ideas
Better Collaboration: Agents can synthesize multiple perspectives
Natural Building: Agents reference and improve each other’s work
Update Injection:
Updates are injected at safe points during agent execution:
Between iteration loops (after completing a response)
When agent checks for new context
NOT mid-stream (would break agent reasoning)
Race Condition: If an agent is deep in its first response when a new answer arrives, it won’t see the injection until completing that response. By then, it may already have full context from the orchestrator’s normal flow. This is acceptable - the agent still gets all answers, just via different mechanism (full context on next spawn vs. injection mid-work).
Implementation: massgen/orchestrator.py:_inject_update_and_continue()
Answer Labeling#
Each answer gets a unique identifier: agent{N}.{attempt}
agent1.1= Agent 1’s first answeragent2.1= Agent 2’s first answeragent1.2= Agent 1’s second answer (after restart)agent1.final= Agent 1’s final answer (if winner)
This labeling system enables:
Clear vote tracking
Answer evolution visualization
Transparent decision history
Implementation: massgen/orchestrator.py
Workspace Management#
Each agent gets an isolated workspace for safe file operations.
Directory Structure#
.massgen/
├── workspaces/ # Agent working directories
│ ├── agent1/ # Agent 1's isolated workspace
│ └── agent2/ # Agent 2's isolated workspace
├── snapshots/ # Workspace snapshots for coordination
│ ├── agent1_20250113_143022/ # Snapshot of agent1's work
│ └── agent2_20250113_143025/ # Snapshot of agent2's work
├── temp_workspaces/ # Previous turn results for multi-turn
│ ├── agent1_turn_1/ # Agent 1's turn 1 results
│ └── agent2_turn_1/ # Agent 2's turn 1 results
├── sessions/ # Multi-turn conversation history
│ └── session_20250113_143000/
│ ├── turn_1/
│ └── turn_2/
└── massgen_logs/ # All logging output
└── log_20250113_143000/
Snapshot System#
When an agent provides an answer during coordination:
Capture: Their workspace is copied to
snapshots/Share: Other agents receive read-only access to the snapshot
Review: Agents can examine files, code, and outputs
Build: Agents build on insights from other agents’ work
This enables agents to:
See concrete work, not just descriptions
Catch errors in code or logic
Build incrementally on each other’s contributions
Provide informed votes based on actual outputs
Implementation: massgen/filesystem_manager/
Multi-Turn Conversations#
MassGen supports interactive multi-turn conversations with full context preservation.
Session Management#
Each multi-turn session maintains:
Session ID: Unique identifier (e.g.,
session_20250113_143000)Turn History: Numbered turns (
turn_1,turn_2, …)Workspace Persistence: Each turn’s workspace is preserved
Context Paths: Previous turns become read-only context for next turns
Turn Lifecycle#
Turn Start: Increment turn counter, create turn directory
Context Loading: Previous turn’s workspace becomes read-only context
Execution: Agents work with fresh writeable workspace + previous context
Persistence: Winning agent’s workspace is saved to turn directory
Summary Update: SESSION_SUMMARY.txt is updated with turn details
This allows agents to:
Compare “what I changed” vs “what was originally there”
Build incrementally across multiple turns
Reference previous results explicitly
Maintain project continuity
Implementation: massgen/cli.py (multi-turn mode)
MCP Integration#
MassGen integrates Model Context Protocol (MCP) for external tool access.
Architecture#
Backend → MCP Client → MCP Server → External Tools
↓
Tools List → Agent → Tool Calls → Tool Results
Supported Backends:
Claude: Native MCP support via
claude_messagesAPIGemini: MCP support via function calling
Others: Via tool conversion layer
Planning Mode#
Special coordination mode for MCP tools:
During Coordination: Agents can plan tool usage without execution
After Consensus: Winner executes tools in their final answer
Safety: Prevents irreversible actions during collaboration
This is critical for:
File operations (create, delete, modify)
API calls with side effects
Database operations
External service integrations
Implementation: massgen/backend/gemini.py, massgen/backend/claude.py
Backend Abstraction#
All LLM interactions go through a unified backend interface.
Backend Interface#
Each backend implements:
class Backend:
async def chat(messages, stream=True):
"""Stream responses with tool calls"""
async def get_available_tools():
"""Return tools for this backend"""
def format_messages(messages):
"""Convert to backend-specific format"""
Supported Backends:
API-based: OpenAI, Claude, Gemini, Grok, Azure OpenAI
Local: LM Studio, vLLM, SGLang
External: AG2 framework agents
Custom: Claude Code CLI with filesystem access
Implementation: massgen/backend/
File Permission System#
MassGen enforces granular file permissions for safe project integration.
Context Paths#
Agents can access specific directories with permissions:
orchestrator:
context_paths:
- path: "/path/to/project"
permission: "write"
protected_paths:
- ".git"
- "node_modules"
Permission Types:
read: View files onlywrite: Read, create, modify, delete files (except protected)
Protected Paths:
Immune from modification/deletion
Relative to context path
Supports files and directories
Safety Features:
Read-Before-Delete: Agents must read files before deletion
Permission Validation: All file operations are checked
Audit Trail: All operations logged to massgen.log
Implementation: massgen/filesystem_manager/_path_permission_manager.py
Code Organization#
massgen/
├── orchestrator.py # Coordination engine
├── chat_agent.py # Agent implementations
├── cli.py # Command-line interface
├── config_builder.py # Interactive config wizard
├── agent_config.py # Configuration models
├── backend/ # LLM backend implementations
│ ├── claude.py # Anthropic Claude
│ ├── gemini.py # Google Gemini
│ ├── response.py # OpenAI
│ ├── grok.py # xAI Grok
│ ├── claude_code.py # Claude Code CLI
│ ├── external.py # External frameworks (AG2)
│ └── ...
├── frontend/ # UI components
│ └── coordination_ui.py # Terminal UI
├── filesystem_manager/ # File operations & permissions
│ ├── _path_permission_manager.py
│ ├── _workspace_tools_server.py
│ └── ...
├── logger_config.py # Logging configuration
└── adapters/ # External framework adapters
└── ag2/ # AG2 adapter
Key Modules:
orchestrator.py: Vote tracking, consensus detection, workspace snapshots
chat_agent.py: Agent lifecycle, message handling, tool execution
backend/: LLM-specific implementations with unified interface
filesystem_manager/: Permission system, workspace isolation
frontend/: Real-time coordination display with Rich
Extension Points#
Adding New Backends#
Subclass
Backendbase classImplement
chat()andformat_messages()Register in
cli.py’screate_backend()Add to
AgentConfigfactory methods
Example: massgen/backend/grok.py
Adding MCP Servers#
Configure in YAML:
backend: type: "claude" mcp_servers: - name: "weather" command: "npx" args: ["-y", "@modelcontextprotocol/server-weather"]
Servers auto-start when backend initializes
Tools automatically discovered and presented to agent
Example: All MCP configs in massgen/configs/tools/mcp/
Adding External Frameworks#
Create adapter in
massgen/adapters/{framework}/Implement
ExternalAgentAdapterinterfaceRegister in
adapters/__init__.pyAgents work seamlessly with native MassGen agents
Example: massgen/adapters/ag2/
Performance Considerations#
Parallel Execution: All agents run concurrently
Streaming: All responses stream in real-time
Workspace Isolation: Copy-on-write for efficiency
Async I/O: All file operations are non-blocking
Token Management: Per-backend rate limiting
See Also#
Contributing to MassGen - How to contribute code
Writing Configuration Files - Configuration authoring guide
massgen/orchestrator.py- Core coordination logicmassgen/backend/- Backend implementationsmassgen/filesystem_manager/- Permission system