MassGen Roadmap

Contents

MassGen Roadmap#

Full Roadmap#

MassGen Roadmap#

Current Version: v0.1.87

Release Schedule: Mondays, Wednesdays, Fridays @ 9am PT

Last Updated: May 15, 2026

This roadmap outlines MassGen’s development priorities for upcoming releases. Each release focuses on specific capabilities with real-world use cases.


πŸ‘₯ Contributors & Contact#

Want to contribute or collaborate on a specific track? Reach out to the track owners below:

Track

GitHub

Discord

Tool System Refactoring

@qidanrui

danrui2020

Multimodal Support

@qidanrui

danrui2020

General Interoperability

@qidanrui

danrui2020

RL Integration

@qidanrui @praneeth999

danrui2020, ram2561

Agent Adapter System

@Eric-Shang

ericshang.

Framework Streaming

@Eric-Shang

ericshang.

Irreversible Actions Safety

@franklinnwren

zhichengren

Computer Use

@franklinnwren

zhichengren

Memory Module

@qidanrui @ncrispino

danrui2020, nickcrispino

Rate Limiting System

@AbhimanyuAryan

abhimanyuaryan

DSPy Integration

@praneeth999

ram2561

MassGen Handbook

@a5507203 @Henry-811

crinvo, henry_weiqi

Session Management

@ncrispino

nickcrispino

Automatic MCP Tool Selection

@ncrispino

nickcrispino

Parallel File Operations

@ncrispino

nickcrispino

MassGen Terminal Evaluation

@ncrispino

nickcrispino

Textual Terminal Display

@praneeth999

ram2561

Web UI

@voidcenter

justin_zhang

For general questions, join the #massgen channel on Discord


Release

Target

Feature

Owner

Use Case

v0.1.88

05/18/26

Image/Video Edit Capabilities

@ncrispino

Check and support img/video editing capabilities β€” deferred from v0.1.86/v0.1.87 (#959)

Discriminative Criteria Refinements

@ncrispino

Selection, ranking, and retirement of stale criteria for long-running refinement loops

All releases ship on MWF @ 9am PT when ready


βœ… v0.1.87 - Documentation: Framework Comparisons & llms.txt (Completed)#

Released: May 15, 2026 | PRs: #1094

Features#
  • Framework Comparison Pages: Three new β€œMassGen vs …” pages β€” crewai.rst, langgraph.rst, autogen.rst β€” under docs/source/reference/comparisons/, positioning MassGen against each framework’s coordination shape

  • llms.txt Index: Curated llmstxt.org-spec index published at the docs site root via Sphinx html_extra_path

  • llms-full.txt Corpus: Concatenated full-docs dump (~1 MB, 59 files), generated by a Sphinx build-finished hook in conf.py

  • Docs Landing Page Update: β€œHow Does MassGen Compare?” now lists all four comparisons (LLM Council + the three new ones); parent comparisons.rst drops β€œcoming soon” and gains a toctree

  • README Pointers: One-line pointer in README.md / README_PYPI.md directing AI agents to llms.txt / llms-full.txt

  • bootstrap_subagent Single-Shot Fix: Orchestrator._run_bootstrap_discriminator_step passes refine=False to spawn_subagent β€” the canonical knob SubagentManager respects at the orchestrator level (the orchestrator’s max_new_answers_per_agent: 3 default was shadowing coordination-dict overrides)

Notes#
  • Image/Video Edit Capabilities (#959) and Discriminative Criteria Refinements deferred to v0.1.88.

  • Closes #1082 (llms.txt + llms-full.txt) and #1083 (CrewAI / LangGraph / AutoGen comparison pages).


βœ… v0.1.86 - bootstrap_subagent Discriminator + Codex MCP Approval Fix (Completed)#

Released: May 13, 2026 | PRs: #1090

Features#
  • bootstrap_subagent Variant (fully functional): The dedicated critic-driven criteria path now runs an in-process LLM discriminator between rounds. The critic reads the task and each agent’s latest answer, emits proposed_criteria as JSON, and merges them into the accumulator for the next round’s checklist

  • Answer-Snapshot Gate: The discriminator runs once per unique answer snapshot, avoiding repeated critiques when the answer set has not changed

  • Session-End Drain: Late stdio JSONL criteria emissions are drained before final presentation so they are not stranded after the final checklist resolution pass

  • Codex MCP Approval Fix: Codex workspaces now write both non-interactive approval bypasses needed for external MCP tool calls under codex exec

  • Tests: Bootstrap criteria coverage expanded to 35 tests; Codex workspace approval policy coverage added across approval modes

Notes#
  • Image/Video Edit Capabilities (#959) remain deferred to v0.1.87.


βœ… v0.1.85 - Discriminative Criteria Emergence (criteria_mode) (Completed)#

Released: May 11, 2026

Features#
  • bootstrap_inline Variant (fully functional): Agents emit proposed_criteria alongside submit_checklist; proposals are deduped, FIFO-capped, persisted to bootstrap_criteria_accumulator.json, and merged into the next round’s checklist via EvaluationSection. Works on all backends with checklist tool support β€” SDK (Claude Code) via in-process schema, stdio backends via a JSONL emission channel

  • bootstrap_subagent Variant (wired, LLM step deferred): Same accumulator pipeline; in-process LLM discriminator pass queued for v0.1.86

  • New Module: massgen/bootstrap_criteria.py with merge_proposals, augment_with_accumulator, is_bootstrap_mode, validate_criteria_mode

  • Config Fields: CoordinationConfig.{criteria_mode, bootstrap_max_per_agent_per_round, bootstrap_max_total}

  • Anti-Goodhart by construction: Criteria come from observed gaps, not priors

  • Example Configs: massgen/configs/coordination/{bootstrap_inline_criteria,bootstrap_subagent_criteria}.yaml

  • Tests: 30 new tests in massgen/tests/test_bootstrap_criteria.py (476 lines) covering merge/dedup/cap, config validation, augmentation, rendering gating, and round-N β†’ round-N+1 propagation

Notes#
  • Originally-planned Image/Video Edit Capabilities (#959) deferred to v0.1.86.


βœ… v0.1.84 - TUI Consensus Map (Completed)#

Released: May 8, 2026 | PRs: #1085

Features#
  • TUI Consensus Map: Compact visual map below the agent status ribbon during multi-agent runs that summarizes coordination state without replacing the timeline

  • Visibility Logic: Hidden on welcome screen and single-agent runs; only shown when more than one active agent is coordinating

  • Event-Driven State: Map state updates from existing coordination events (answer_submitted, vote, agent_stopped, winner_selected, final_presentation_start, agent_restart, phase_change, context_received) without backend schema changes

  • Direct-Callback Fallback: Map remains accurate even when direct TUI callbacks update agent status or votes outside the unified event pipeline

  • OpenSpec Coverage: Full change proposal, scenarios, and tasks under openspec/changes/add-tui-consensus-map/

Notes#
  • Originally-planned Image/Video Edit Capabilities (#959) deferred to v0.1.85.


βœ… v0.1.83 - In-Session Standalone Checkpoint MCP Integration (Completed)#

Released: May 1, 2026 | PRs: #1079

Features#
  • In-Session Standalone Checkpoint: Standalone checkpoint MCP server (originally for external hosts like Claude Code) can now run inside a normal MassGen single-agent session, exposing the richer init + checkpoint tools backed by its own reviewer team

  • coordination.standalone_checkpoint Config Block: New YAML block with enabled, team_config, mode (generate | verify), single_checkpoint, include_workspace_context; invalid mode falls back to generate with a warning

  • Single-Agent-Only Affordance Gating: Multi-agent parents skip the standalone server with a warning β€” the standalone server runs its own reviewer panel

  • Enhanced Checkpoint Tool Card: TUI tool card visualization distinguishes primary checkpoint operations from system tasks

  • Example Configs: massgen/configs/checkpoint/standalone_mcp/{fast_iteration,reviewers}.yaml

Notes#
  • Originally-planned Checkpoint Safety Mode (#1026) and Round Evaluator over-indexing fix (#994) deferred to a future release.


βœ… v0.1.82 - TUI Copy Mode & Checkpoint Quality Improvements (Completed)#

Released: April 29, 2026 | PRs: #1076

Features#
  • TUI Copy Mode: New Ctrl+Shift+S toggle releases terminal mouse tracking so users can drag-select text natively; auto-restores on exit

  • Checkpoint Workspace Context: New include_workspace_context config option for standalone checkpoint MCP server (default off)

  • Checkpoint Plan Quality Criteria: Mode-aware quality criteria with selective branch depth scoring for single vs. multi-checkpoint modes

  • Single-Checkpoint Agent Recovery: Detailed recovery workflow for agents when a plan branch resolves to terminate

  • TUI Visual Polish: Ribbon dividers changed from β”‚ to Β·

Notes#
  • Cloud Modal MVP deferred from v0.1.82 to v0.1.83.


βœ… v0.1.81 - Multi-Region Circuit Breaker Failover (Phase 6) (Completed)#

Released: April 27, 2026 | PRs: #1072

Features#
  • Multi-Region Failover: LLM circuit breaker fails over to backup regions when the primary trips OPEN, with automatic recovery when the primary returns to healthy

  • Production-Grade Resilience: Builds on Phase 4 (distributed store) and Phase 5 (adaptive thresholds) for full multi-region resilience

Notes#
  • Cloud Modal MVP originally planned for v0.1.81 β€” deferred again to v0.1.82.


βœ… v0.1.80 - Adaptive Circuit Breaker & Checkpoint Modes (Completed)#

Released: April 22, 2026 | PRs: #1065, #1070

Features#
  • Circuit Breaker Adaptive Thresholds (Phase 5): Self-tuning thresholds that respond to each backend’s actual failure patterns

  • Single Checkpoint Mode: New standalone checkpoint mode β€” no recheckpointing within a single operation

  • Draft Plan Verify Mode: New standalone checkpoint mode β€” verify a draft plan before executing

Notes#
  • Cloud Modal MVP originally planned for v0.1.80 β€” deferred again to v0.1.81.


βœ… v0.1.79 - Fast Mode Speed Control & Broader Checkpoint Framing (Completed)#

Released: April 20, 2026

Features#
  • Better Fast Mode Options: New options to control coordination speed β€” fine-grained speed vs. quality tradeoff

  • Broader Checkpoint Framing: Checkpoint mode framing broadened from safety-only to high-stakes and coordinated phases

  • Checkpoint Instructions Clarity: More clarity in trust settings for checkpoint agents

Notes#
  • Cloud Modal MVP originally planned for v0.1.79 β€” deferred again to v0.1.80.


βœ… v0.1.78 - Circuit Breaker Distributed Store (Phase 4) (Completed)#

Released: April 17, 2026 | PRs: #1061

Features#
  • Pluggable CB state store: The LLM circuit breaker’s state is now held behind a CircuitBreakerStore Protocol and can be shared across workers and processes. Default (store=None) preserves single-process behavior.

  • In-memory CB state store: Thread-safe, zero-dependency implementation for single-process deployments and tests.

  • Redis-backed CB state store: Distributed implementation via optional redis>=4.0; install with pip install massgen[redis-store].

  • Atomic CB transitions: atomic_record_failure / atomic_record_success make CB state transitions linearizable when workers race on the same upstream backend.

Notes#
  • Cloud Modal MVP originally planned for v0.1.78 β€” deferred to v0.1.79.


βœ… v0.1.77 - Answer Now Button (Completed)#

Released: April 15, 2026 | PRs: #1062

Features#
  • Answer Now Button: Agents can submit answers more quickly, both within a round, and bypassing additional refinement rounds when quality is already sufficient


βœ… v0.1.76 - Exa Search & Circuit Breaker Observability (Completed)#

Released: April 13, 2026 | PRs: #1056, #1057, #1058

Features#
  • Exa AI Search Tool: New Exa AI-powered search tool for MCP with example config

  • Circuit Breaker Observability (Phase 3): Probe ownership, lock release, per-attempt latency tracking across all backends

  • Checkpoint Agent Instructions: Copyable custom instructions for agent memory files with checkpoint MCP information

  • Docker Dependency Fixes: Fixed Dockerfile installs for reliable container builds


βœ… v0.1.75 - Codex Hooks & Checkpoint WebUI (Completed)#

Released: April 10, 2026 | PRs: #1053

Features#
  • Codex Native Hooks: Hybrid hook system for Codex backend combining native hooks and MCP capabilities

  • Checkpoint WebUI Auto-Launch: Checkpoint workflows auto-launch WebUI with configurable host/port for visual monitoring

  • Standalone MCP Server Docs: Guide for massgen-checkpoint-mcp with safety policy integration

  • Safety Policy Update: Updated safety policy for checkpoint based on Claude Code safe mode


πŸ“‹ v0.1.88 - Image/Video Edit & Criteria Refinements (Deferred from v0.1.86/v0.1.87)#

Features#

1. Image/Video Edit Capabilities (@ncrispino)

  • Issue: #959

  • Investigate and support image and video editing capabilities across providers

  • Multi-turn editing workflows with continuation IDs

  • Use Case: Enable iterative media editing within multi-agent workflows

2. Discriminative Criteria Refinements (@ncrispino)

  • Build on v0.1.85/v0.1.86 criteria emergence with selection, ranking, and retirement of stale criteria

  • Use Case: Keep long-running refinement loops focused on the most useful current criteria

Success Criteria#
  • Image editing capabilities documented and tested

  • Video editing capabilities documented and tested

  • Multi-turn editing flow works end-to-end

  • Criteria refinement behavior is documented and covered by tests


πŸ”¨ Ongoing Work & Continuous Releases#

These features are being actively developed on separate parallel tracks and will ship incrementally on the MWF release schedule:

Track: Agent Adapter System (@Eric-Shang, ericshang.)#
  • PR: #283

  • Unified agent interface for easier backend integration

  • Shipping: Continuous improvements

Track: Irreversible Actions Safety (@franklinnwren, zhichengren)#
  • Human-in-the-loop approval system for dangerous operations

  • LLM-based tool risk detection

  • Target: v0.1.3 and beyond

Track: Multimodal Support (@qidanrui, danrui2020)#
  • PR: #252

  • Image, audio, video processing across backends

  • Shipping: Incremental improvements each release

Track: Memory Module (@qidanrui, @ncrispino, danrui2020, nickcrispino)#
  • Issues: #347, #348

  • Short and long-term memory implementation with persistence

  • Status: βœ… Completed in v0.1.5

Track: Agent Task Planning (@ncrispino, nickcrispino)#
  • Agent task planning with dependency tracking

  • Status: βœ… Completed in v0.1.7

Track: Automation & Meta-Coordination (@ncrispino, nickcrispino)#
  • LLM agent automation with status tracking and silent execution

  • MassGen running MassGen for self-improvement workflows

  • Status: βœ… Completed in v0.1.8

  • Case Study: Meta-level self-analysis demonstrating automation mode (meta-self-analysis-automation-mode.md)

Track: DSPy Integration (@praneeth999, ram2561)#
  • Question paraphrasing for multi-agent diversity

  • Semantic validation and caching system

  • Status: βœ… Completed in v0.1.8

Track: Framework Streaming (@Eric-Shang, ericshang.)#
  • PR: #462

  • Real-time streaming for LangGraph and SmoLAgent intermediate steps

  • Enhanced debugging and monitoring for external framework tools

  • Status: βœ… Completed in v0.1.10

Track: Rate Limiting System (@AbhimanyuAryan, abhimanyuaryan)#
  • PR: #383

  • Multi-dimensional rate limiting for Gemini models

  • Model-specific limits with sliding window tracking

  • Status: βœ… Completed in v0.1.11

Track: MassGen Handbook (@a5507203, @Henry-811, crinvo, henry_weiqi)#
  • Issue: #387

  • Comprehensive user documentation and handbook at https://massgen.github.io/Handbook/

  • Centralized policies and resources for development and research teams

  • Status: βœ… Completed in v0.1.10

Track: Computer Use (@franklinnwren, zhichengren)#
  • PR: #402

  • Browser and desktop automation with OpenAI, Claude, and Gemini integration

  • Visual perception through screenshot processing and action execution

  • Status: βœ… Completed in v0.1.9

Track: Session Management (@ncrispino, nickcrispino)#
  • PR: #466

  • Complete session state tracking and restoration

  • Resume previous MassGen conversations with full context

  • Status: βœ… Completed in v0.1.9

Track: Semtools & Serena Skills (@ncrispino, nickcrispino)#
  • PR: #515

  • Semantic search capabilities via semtools (embedding-based similarity)

  • Symbol-level code understanding via serena (LSP integration)

  • Package as reusable skills within MassGen framework

  • Status: βœ… Completed in v0.1.12

Track: System Prompt Architecture (@ncrispino, nickcrispino)#
  • PR: #515

  • Complete refactoring of system prompt assembly

  • Hierarchical structure with improved LLM attention management

  • Skills system local execution support

  • Status: βœ… Completed in v0.1.12

Track: Multi-Agent Computer Use (@franklinnwren, zhichengren)#
  • PR: #513

  • Enhanced Gemini computer use with Docker integration

  • Multi-agent coordination for computer automation

  • VNC visualization and debugging support

  • Status: βœ… Completed in v0.1.12

Track: Code-Based Tools System / Automatic MCP Tool Selection (@ncrispino, nickcrispino)#
  • Issue: #414

  • Tool integration via importable Python code instead of schema-based tools

  • MCP server registry with auto-discovery

  • Reduces token usage through on-demand tool loading

  • Status: βœ… Completed in v0.1.13

Track: NLIP Integration (@praneeth999, @qidanrui, ram2561, danrui2020)#
  • PR: #475

  • Natural Language Integration Platform for advanced tool routing

  • Multi-backend support across Claude, Gemini, and OpenAI

  • Per-agent and orchestrator-level configuration

  • Status: βœ… Completed in v0.1.13

Track: Parallel Tool Execution (@praneeth999, ram2561)#
  • PR: #520

  • Configurable concurrent tool execution across all backends

  • Model-level and local execution controls

  • Asyncio-based scheduling with semaphore limits

  • Status: βœ… Completed in v0.1.14

Track: Gemini 3 Pro Support (@ncrispino, nickcrispino)#
  • PR: #530

  • Full integration for Google’s Gemini 3 Pro model

  • Function calling support with parallel tool capabilities

  • Status: βœ… Completed in v0.1.14

Track: Parallel File Operations (@ncrispino, nickcrispino)#
  • Issue: #441

  • Increase parallelism of file read operations

  • Standard efficiency evaluation and benchmarking methodology

  • Status: βœ… Completed in v0.1.14

Track: Persona Generation System (@ncrispino, nickcrispino)#
  • PR: #547

  • Automatic generation of diverse system messages for multi-agent configurations

  • Multiple generation strategies: complementary, diverse, specialized, adversarial

  • Status: βœ… Completed in v0.1.15

Track: Docker Distribution Enhancement (@ncrispino, nickcrispino)#
  • PR: #545, #538

  • GitHub Container Registry integration with ARM support

  • MassGen pre-installed in Docker images for immediate use

  • Status: βœ… Completed in v0.1.15

Track: Launch Custom Tools in Docker (@ncrispino, nickcrispino)#
  • Issue: #510

  • Enable custom tools to run in isolated Docker containers

  • Security isolation and portability for custom tool execution

  • Status: βœ… Completed in v0.1.15

Track: MassGen Terminal Evaluation (@ncrispino, nickcrispino)#
  • Issue: #476

  • PR: #553

  • Self-evaluation and improvement of frontend/UI through terminal recording

  • Automated video generation and case study creation using VHS

  • Status: βœ… Completed in v0.1.16

Track: LiteLLM Cost Tracking Integration (@ncrispino, nickcrispino)#
  • Issue: #543

  • PR: #553

  • Accurate cost calculation using LiteLLM’s pricing database

  • Integration with LiteLLM pricing for 500+ models with auto-updates

  • Status: βœ… Completed in v0.1.16

Track: Memory Archiving System (@ncrispino, nickcrispino)#
  • PR: #555

  • Persistent memory with multi-turn session support

  • Memory archiving for session persistence and continuity

  • Status: βœ… Completed in v0.1.16

Track: MassGen Self-Evolution Skills (@ncrispino, nickcrispino)#
  • Issue: #476

  • Four new skills for MassGen to develop and maintain itself

  • Self-documenting release workflows and configuration generation

  • Status: βœ… Completed in v0.1.16

Track: Improve Consistency of Memory & Tool Reminders (@ncrispino, nickcrispino)#
  • Issue: #537

  • Enhance consistency of memory retrieval across agents

  • Improve tool reminder system for better agent awareness

  • Standardize memory access patterns

  • Status: βœ… Completed in v0.1.16

Track: Textual Terminal Display (@praneeth999, ram2561)#
  • Issue: #539

  • PR: #482

  • Rich terminal UI using Textual framework with dark/light themes

  • Enhanced visualization for multi-agent coordination

  • Status: βœ… Completed in v0.1.17

Track: Broadcasting to Humans/Agents (@ncrispino, nickcrispino)#
  • Issue: #437

  • PR: #569

  • Enable agents to broadcast questions when facing implementation uncertainties

  • Human-in-the-loop and agent-to-agent communication for clarification

  • Status: βœ… Completed in v0.1.18

Track: Claude Advanced Tooling (@praneeth999, ram2561)#
  • PR: #568

  • Programmatic tool calling from code execution sandbox

  • Server-side tool search with deferred loading

  • Status: βœ… Completed in v0.1.18

Track: LiteLLM Integration & Programmatic API (@ncrispino, nickcrispino)#
  • PR: #580

  • MassGen as a LiteLLM custom provider with MassGenLLM class

  • New run() and build_config() functions for programmatic execution

  • NoneDisplay for silent output in programmatic/LiteLLM use

  • Status: βœ… Completed in v0.1.19

Track: Claude Strict Tool Use & Structured Outputs (@praneeth999, ram2561)#
  • PR: #572

  • enable_strict_tool_use config flag with recursive schema patching

  • output_schema parameter for structured JSON outputs

  • Status: βœ… Completed in v0.1.19

Track: Gemini Exponential Backoff (@praneeth999, ram2561)#
  • PR: #576

  • Automatic retry mechanism for rate limit errors (429, 503)

  • Jittered exponential backoff with Retry-After header support

  • Status: βœ… Completed in v0.1.19

Track: CUA Dockerfile / Auto Docker Setup (@franklinnwren, zhichengren)#
  • Issue: #552

  • Automatic Docker container setup for Computer Use Agent

  • Auto-detection of CUA configs with automatic container creation

  • Status: βœ… Completed in v0.1.20

Track: Web UI (@voidcenter, justin_zhang)#
  • PR: #588

  • Browser-based real-time visualization for multi-agent coordination

  • FastAPI server with WebSocket streaming and React frontend

  • Status: βœ… Completed in v0.1.20

Track: Response API Formatter Enhancement (@praneeth999, ram2561)#
  • Improved function call handling for multi-turn contexts

  • Preserves function_call entries and generates stub outputs

  • Status: βœ… Completed in v0.1.20

Track: Computer Use Documentation (@franklinnwren, zhichengren)#
  • Issue: #562

  • Comprehensive documentation for computer use workflows

  • Environment naming conventions and automatic setup instructions

  • Status: βœ… Completed in v0.1.20

Track: Graceful Cancellation (@ncrispino, nickcrispino)#
  • PR: #596

  • Ctrl+C saves partial progress during multi-agent coordination

  • Session restoration for incomplete turns with --continue

  • Multi-turn mode returns to prompt instead of exiting

  • Status: βœ… Completed in v0.1.21

Track: Shadow Agent Architecture (@ncrispino, nickcrispino)#
  • PR: #600

  • Shadow agents for non-blocking broadcast responses

  • Full context inheritance (conversation history + current turn)

  • Parallel spawning with asyncio.gather()

  • Status: βœ… Completed in v0.1.22

Track: Web UI Automation Mode (@voidcenter, @ncrispino, justin_zhang, nickcrispino)#
  • PR: #607

  • Automation-friendly Web UI view with status header and session polling

  • LOG_DIR and STATUS path output for programmatic monitoring

  • Session persistence API for completed sessions

  • Status: βœ… Completed in v0.1.23

Track: Multi-Turn Cancellation Improvements (@ncrispino, nickcrispino)#
  • PR: #608

  • Flag-based cancellation handling in multi-turn mode

  • Terminal state restoration after Rich display cancellation

  • Cancelled turns build proper history entries with partial results

  • Status: βœ… Completed in v0.1.23

Track: Docker Container Persistence (@ncrispino, nickcrispino)#
  • Commit: 34279c88

  • SessionMountManager for pre-mounting session directories to Docker containers

  • Eliminates container recreation between turns (sub-second vs 2-5 second transitions)

  • Status: βœ… Completed in v0.1.23

Track: Turn History Inspection (@ncrispino, nickcrispino)#
  • Commits: 028f591d, 477423a6

  • New /inspect command for reviewing agent outputs from any turn

  • /inspect all to list all turns with summaries

  • Interactive menu for viewing agent outputs, final answers, and coordination logs

  • Status: βœ… Completed in v0.1.23

Track: Async Execution Consistency (@ncrispino, nickcrispino)#
  • PR: #608

  • New run_async_safely() helper for nested event loop handling

  • Fixed mem0 adapter async lifecycle issues

  • Status: βœ… Completed in v0.1.23

Track: Enhanced Cost Tracking (@ncrispino, nickcrispino)#
  • Expanded token counting and cost calculation across multiple providers

  • Real-time token usage for OpenRouter, xAI/Grok, Gemini, Claude Code backends

  • Per-agent token breakdown with cost inspection command

  • Status: βœ… Completed in v0.1.24

Track: UI-TARS Backend Support (@franklinnwren, zhichengren)#
  • PR: #584

  • New backend for ByteDance’s UI-TARS-1.5-7B model for GUI automation

  • OpenAI-compatible API via HuggingFace Inference Endpoints

  • Tool implementation with Docker and browser automation examples

  • Status: βœ… Completed in v0.1.25

Track: Evolving Skill Creator System (@ncrispino, nickcrispino)#
  • PR: #629

  • Framework for creating and iterating on reusable workflow plans

  • Skills capture steps, Python scripts, and learnings through iteration

  • Support for loading skills from previous sessions

  • Status: βœ… Completed in v0.1.25

Track: Textual Terminal Display Enhancement (@praneeth999, ram2561)#
  • PR: #589

  • Adaptive layout management for different terminal sizes

  • Enhanced dark/light themes with modals and panels

  • Improved agent coordination visualization

  • Status: βœ… Completed in v0.1.25

Track: Shadow Agent Response Depth (@ncrispino, nickcrispino)#
  • PR: #634

  • Test-time compute scaling via response_depth parameter (low/medium/high)

  • Controls solution complexity in shadow agent broadcast responses

  • Status: βœ… Completed in v0.1.26

Track: Docker Diagnostics Module (@ncrispino, nickcrispino)#
  • PR: #634

  • Comprehensive Docker error detection with platform-specific resolution

  • Distinguishes binary not installed, daemon not running, permission denied, images missing

  • Status: βœ… Completed in v0.1.26

Track: Web UI Setup System (@ncrispino, nickcrispino)#
  • PR: #634

  • Guided first-run setup with SetupPage, ConfigEditorModal, CoordinationStep

  • API key management endpoints and environment checks

  • Status: βœ… Completed in v0.1.26

Track: Multimodal Backend Integration (@ncrispino, @qidanrui, nickcrispino, danrui2020)#
  • Commits: 598a32f8, dc920078

  • Native multimodal understanding for Gemini and OpenAI backends

  • Image, audio, video understanding via read_media with backend-native APIs

  • Status: βœ… Completed in v0.1.28

Track: Multimodal Generation Consolidation (@ncrispino, nickcrispino)#
  • Commit: dc920078

  • Unified generate_media tool with provider selection

  • New generation/ module for OpenAI (DALL-E, Sora, TTS), Google (Imagen, Veo), OpenRouter

  • Status: βœ… Completed in v0.1.28

Track: Web UI Artifact Previewer (@ncrispino, @voidcenter, nickcrispino, justin_zhang)#
  • Commit: 598a32f8

  • Preview workspace artifacts directly in web interface

  • Support for PDF, DOCX, PPTX, XLSX, images, HTML, SVG, Markdown, Mermaid

  • Status: βœ… Completed in v0.1.28

Track: Minimum Answers Before Voting (@ncrispino, nickcrispino)#
  • Commit: bc7881d2

  • New min_answers_before_voting orchestrator configuration option

  • Integrated into CLI quickstart wizard and Web UI CoordinationStep

  • Status: βœ… Completed in v0.1.28

Track: Azure OpenAI Workflow Fixes (@AbhimanyuAryan, abhimanyuaryan)#
  • Commit: c71094ac

  • Parameter filtering for unsupported Azure parameters

  • Fixed tool_choice handling, message validation, and response format extraction

  • Status: βœ… Completed in v0.1.28

Track: OpenRouter Tool-Capable Model Filtering (@shubham2345)#
  • Commit: 40acf82c

  • Model list filters to only show models supporting tool calling

  • Checks supported_parameters for β€œtools” capability

  • Status: βœ… Completed in v0.1.28

Track: Subagent System (@ncrispino, nickcrispino)#
  • PR: #690

  • Spawn parallel child MassGen processes for independent task execution

  • Process isolation with independent workspaces per subagent

  • New spawn_subagents tool with result aggregation and token tracking

  • Status: βœ… Completed in v0.1.29

Track: Async Subagent Execution (@ncrispino, @HenryQi, nickcrispino, henry_weiqi)#
  • PR: #801

  • Linear: MAS-214

  • Background subagent execution with async_=True parameter

  • Poll for subagent completion and retrieve results

  • Status: βœ… Completed in v0.1.41

Track: TUI Visual Redesign (@ncrispino, @praneeth999, nickcrispino, ram2561)#
  • PR: #806

  • Comprehensive visual overhaul with modern β€œConversational AI” aesthetic

  • Rounded corners, desaturated colors, edge-to-edge layouts, polished modals

  • Human Input Queue for injecting messages to agents mid-stream

  • Status: βœ… Completed in v0.1.42

Track: AG2 Single-Agent Coordination Fix (@db-ol)#
  • PR: #804

  • Fixed coordination issues for single-agent AG2 setups

  • Single agent can now vote for itself after producing its first answer

  • Status: βœ… Completed in v0.1.42

Track: Tool Call Batching (@ncrispino, nickcrispino)#
  • PR: #815

  • Consecutive MCP tool calls grouped into collapsible tree views

  • Shows 3 items by default with β€œ+N more” indicator, click to expand

  • Respects Timeline Chronology Rule: tools only batch when consecutive

  • New ToolBatchCard widget and ToolBatchTracker state machine

  • Status: βœ… Completed in v0.1.43

Track: Interactive Case Studies & Documentation (@franklinnwren, zhichengren)#
  • PR: #812

  • New documentation page with visual SVG comparisons (MassGen vs single-agent)

  • Video tutorials section with Getting Started and Development videos

  • Iterative refinement examples showing multi-round improvements

  • Status: βœ… Completed in v0.1.43

Track: TUI UX Polish (@ncrispino, nickcrispino)#
  • PR: #815

  • Final presentation display fix (reasoning vs answer separation)

  • Plan mode enhancements with PlanOptionsPopover

  • Quoted path support for paths with spaces

  • Various bug fixes (status bar, scrolling, mode buttons)

  • Status: βœ… Completed in v0.1.43

Track: Tool Metrics Distribution Statistics (@ncrispino, nickcrispino)#
  • Commit: 30aca047

  • Enhanced get_tool_metrics_summary() with per-call averages

  • Output distribution stats (min/max/median) for bottleneck analysis

  • Status: βœ… Completed in v0.1.29

Track: CLI Per-Agent System Messages (@ncrispino, nickcrispino)#
  • Commit: 78177372

  • New mode for assigning different system messages per agent in quickstart

  • Options: β€œSkip”, β€œSame for all”, β€œDifferent per agent”

  • Status: βœ… Completed in v0.1.29

Track: OpenAI Responses API Fixes (@ncrispino, nickcrispino)#
  • PR: #685

  • Fixed duplicate item errors when using previous_response_id

  • Preserved function call ID for proper reasoning item pairing

  • Status: βœ… Completed in v0.1.29

Track: OpenRouter Web Search Plugin (@shubham2345)#
  • PR: #693

  • Native web search integration via OpenRouter’s plugins array

  • Maps enable_web_search to {"id": "web"} plugin format

  • Status: βœ… Completed in v0.1.30

Track: Persona Generator Diversity Modes (@ncrispino, nickcrispino)#
  • PR: #699

  • Two diversity modes: perspective (values/priorities) and implementation (solution types)

  • Phase-based adaptation with softened personas for convergence

  • Status: βœ… Completed in v0.1.30

Track: Azure OpenAI Multi-Endpoint Support (@AbhimanyuAryan, abhimanyuaryan)#
  • PR: #698

  • Support both Azure-specific and OpenAI-compatible endpoints

  • Environment variable expansion (${VAR}) in config files

  • Status: βœ… Completed in v0.1.30

Track: Test Suite Fixes (@maxim-saplin)#
  • PR: #688

  • Comprehensive test fixes with xfail registry

  • Fixed persistent memory retrieval and backend tool registration

  • Status: βœ… Completed in v0.1.30

Track: Logfire Observability Integration (@ncrispino, nickcrispino)#
  • PR: #708

  • Comprehensive structured logging and tracing via Logfire (Pydantic team)

  • Automatic LLM instrumentation for OpenAI, Anthropic Claude, and Google Gemini backends

  • Tool execution tracing with timing metrics and agent coordination observability

  • Enable via --logfire CLI flag or MASSGEN_LOGFIRE_ENABLED=true environment variable

  • Status: βœ… Completed in v0.1.31

Track: Azure OpenAI Native Tool Call Streaming (@AbhimanyuAryan, abhimanyuaryan)#
  • PR: #705

  • Tool calls accumulated and yielded as structured tool_calls chunks

  • Fixed streaming behavior for Azure OpenAI tool calling

  • Status: βœ… Completed in v0.1.31

Track: OpenRouter Web Search Logging (@shubham2345)#
  • PR: #704

  • Fixed logging output for web search operations

  • Status: βœ… Completed in v0.1.31

Track: Session Export Multi-Turn Support (@ncrispino, nickcrispino)#
  • PR: #715

  • Enhanced massgen export with turn range selection and workspace options

  • Multi-turn file collection preserving turn/attempt structure

  • Status: βœ… Completed in v0.1.32

Track: Logfire Optional Dependency (@AbhimanyuAryan, abhimanyuaryan)#
  • PR: #711

  • Moved Logfire from required to optional [observability] extra

  • Helpful error message when --logfire used without Logfire installed

  • Status: βœ… Completed in v0.1.32

Track: Per-Attempt Logging (@ncrispino, nickcrispino)#
  • Commit: a808d730

  • Separate log files per orchestration restart attempt

  • Handler reconfiguration via set_log_attempt() function

  • Status: βœ… Completed in v0.1.32

Track: Office Document PDF Conversion (@ncrispino, nickcrispino)#
  • Commit: 7c7a32e3

  • Automatic DOCX/PPTX/XLSX to PDF conversion for session sharing

  • Docker + LibreOffice headless conversion with image fallback

  • Status: βœ… Completed in v0.1.32

Track: Reactive Context Compression (@ncrispino, nickcrispino)#
  • Issue: #617

  • PR: #697

  • Automatic context compression when context length errors are detected

  • Streaming buffer system for compression recovery

  • Status: βœ… Completed in v0.1.33

Track: Backend Model List Auto-Update (@ncrispino, nickcrispino)#
  • Issue: #645

  • PR: #669

  • Native model listing APIs for providers (Groq, Together, and others)

  • Research third-party wrappers; document manual update processes

  • Status: βœ… Completed in v0.1.34

Track: OpenAI-Compatible Chat Server (@maxim-saplin)#
  • Issue: #628

  • PR: #689

  • Run MassGen as an OpenAI-compatible API server

  • Status: βœ… Completed in v0.1.34

Track: Code-Based Tools in Web UI (@ncrispino, nickcrispino)#
  • Issue: #612

  • Ensure code-based tools work properly in Web UI

  • Integration with new Web UI features

  • Status: βœ… Completed in v0.1.34

Track: Test MassGen for PPTX Slides (@ncrispino, nickcrispino)#
  • Issue: #686

  • Verify and improve PPTX generation capabilities

  • Test slide generation workflows and output quality

  • Status: βœ… Completed in v0.1.34

Track: OpenRouter Tool-Use Model Filtering (@shubham2345)#
  • Issue: #647

  • Restrict OpenRouter model list to only show models that support tool use

  • Filter based on supported_parameters capability checks

  • Status: βœ… Completed in v0.1.34

Track: OpenAI Responses /compact Endpoint (@ncrispino, nickcrispino)#
  • Issue: #739

  • Use OpenAI’s native /compact endpoint instead of custom summarization

  • Leverage API-level context compression for better efficiency

  • Status: βœ… Completed in v0.1.48

Track: Improve Logging (@ncrispino, nickcrispino)#
  • Issue: #683

  • PR: #761

  • Enhanced logging for better debugging and observability via Logfire workflow attributes

  • New massgen logs analyze CLI command with self-analysis mode

  • Status: βœ… Completed in v0.1.35

Track: Add Model Selector for Log Analysis (@ncrispino, nickcrispino)#
  • Issue: #766

  • Allow users to choose which model to use for massgen logs analyze self-analysis mode

  • Configurable model selection for different analysis requirements

  • Status: βœ… Completed in v0.1.50

Track: General Hook Framework (@ncrispino, nickcrispino)#
  • Issue: #745

  • PR: #769

  • Extensible hook system for agent lifecycle events

  • Enable custom actions at key orchestration points

  • Status: βœ… Completed in v0.1.36

Track: Plan and Execute Workflow (@ncrispino, nickcrispino)#
  • PR: #794

  • Complete plan-then-execute workflow separating β€œwhat to build” from β€œhow to build it”

  • --plan-and-execute and --execute-plan CLI options

  • Task verification workflow with verified status and verification groups

  • Plan storage system in .massgen/plans/ with frozen snapshots

  • Status: βœ… Completed in v0.1.39

Track: Improve Log Sharing and Analysis (@ncrispino, nickcrispino)#
  • Issue: #722

  • Enhanced log sharing workflows

  • Improved analysis tools and visualizations

  • Target: v0.1.50+

Track: Claude Code Plugin for MassGen Agents (@ncrispino, nickcrispino)#
  • Issue: #773

  • Plugin/extension for spawning MassGen agents directly from Claude Code interface

  • Seamless integration with Claude Code workflows

  • Target: v0.1.50+

Track: Refactor ask_others for Targeted Agent Queries (@ncrispino, nickcrispino)#
  • Issue: #809

  • Support targeted queries to specific agents via subagent spawning

  • Three modes: broadcast to all, selective broadcast, targeted ask

  • Pass full _streaming_buffer to shadow agents for improved context

  • Target: v0.1.52

Track: Decomposition Coordination Mode (@ncrispino, nickcrispino)#
  • PR: #858

  • New coordination mode that decomposes tasks into subtasks assigned to individual agents

  • Task decomposer with presenter agent role for final synthesis

  • TUI mode bar toggle, subtask assignment display, and generation modals

  • Status: βœ… Completed in v0.1.48

Track: Worktree Isolation (@ncrispino, nickcrispino)#
  • PR: #857

  • Linear: MAS-272

  • Git worktree-based isolation for agent file writes with review workflow

  • Review modal for approving/rejecting changes before applying to original paths

  • Shadow repo support for non-git directories

  • Status: βœ… Completed in v0.1.48

Track: Quickstart Wizard Docker Setup (@ncrispino, nickcrispino)#
  • PR: #857

  • Linear: MAS-267

  • Docker setup step in quickstart wizard with animated pull progress

  • Real-time stdout streaming for image downloads

  • Status: βœ… Completed in v0.1.48

Track: Fairness Gate for Coordination (@ncrispino, nickcrispino)#
  • PR: #869

  • Prevents fast agents from dominating coordination rounds

  • Configurable fairness_lead_cap_answers and max_midstream_injections_per_round

  • Status: βœ… Completed in v0.1.49

Track: Persona Easing TUI Integration (@ncrispino, nickcrispino)#
  • PR: #869

  • Persona easing toggle now accessible from TUI mode bar

  • Status: βœ… Completed in v0.1.49

Track: Checklist Voting Tool (@ncrispino, nickcrispino)#
  • PR: #869

  • New checklist_tools_server.py MCP server for structured quality evaluation

  • Binary pass/fail scoring for objective quality assessment

  • Status: βœ… Completed in v0.1.49

Track: Log Analysis Mode in TUI (@ncrispino, nickcrispino)#
  • PR: #869

  • New β€œAnalyzing” state in TUI mode bar for in-app run analysis

  • Configurable analysis profiles with log directory and turn selection

  • Status: βœ… Completed in v0.1.49

Track: Automated Testing Infrastructure (@ncrispino, nickcrispino)#
  • PR: #869

  • CI/CD workflow (tests.yml), SVG snapshot baselines, 16+ new test files

  • Testing strategy specification and visual regression testing

  • Status: βœ… Completed in v0.1.49

Track: Shadow Agent Chunk Type Fix (@MuL1ian)#
  • PR: #861

  • Fixed β€œ[No response generated]” errors from incorrect chunk type comparison

  • Status: βœ… Completed in v0.1.49

Track: Chunked Plan Execution (@ncrispino, nickcrispino)#
  • PR: #877

  • Plans divided into chunks executed one at a time with progress checkpoints

  • Chunk browsing in TUI, frozen plan snapshots, target_steps/target_chunks parameters

  • Iterative planning review modal with Continue/Edit/Finalize options

  • Status: βœ… Completed in v0.1.50

Track: Skill Lifecycle Management (@ncrispino, nickcrispino)#
  • PR: #878

  • New lifecycle modes (create_or_update, create_new, consolidate)

  • Skill organizer for merging overlapping skills, SKILL_REGISTRY.md routing guide

  • Previous-session skill loading with load_previous_session_skills config

  • Local Skills MCP for Docker/local execution contexts

  • Status: βœ… Completed in v0.1.50

Track: Worktree Improvements (@ncrispino, nickcrispino)#
  • PR: #877

  • Branch accumulation across rounds, cross-agent diff visibility via generate_branch_summaries()

  • Orphan worktree cleanup

  • Status: βœ… Completed in v0.1.50

Track: Responsive TUI Mode Bar (@ncrispino, nickcrispino)#
  • PR: #877

  • Vertical/horizontal adaptive layout with compact labels on narrow terminals

  • TUI homescreen and theming improvements

  • Status: βœ… Completed in v0.1.50

Track: Subagent Delegation Protocol (@ncrispino, nickcrispino)#
  • PR: #955

  • File-based delegation protocol for container-to-host subagent spawning

  • SubagentLaunchWatcher with atomic JSON request/response exchange

  • Workspace path validation against allowlist for security

  • Status: βœ… Completed in v0.1.57

Track: Iterative Refinement Improvements (@ncrispino, nickcrispino)#
  • PR: #955

  • Issue: #874

  • Substantiveness tracking (transformative/structural/incremental) for convergence decisions

  • Builder subagent type for large artifact generation with fresh context

  • Diagnostic report gating and per-agent checklist scoring

  • Claude Code reasoning parameters for updated SDK

  • Status: βœ… Completed in v0.1.57

Track: Multimodal Revamp (@ncrispino, nickcrispino)#
  • Issues: #942, #951

  • ElevenLabs TTS & STT integration for high-quality voice synthesis and transcription

  • Nano Banana 2 as default image generation model

  • Grok Imagine image/video generation (#958) via xAI API

  • Media generation skills (image, video, audio) and multi-turn image editing with continuation IDs

  • Status: βœ… Completed in v0.1.58

Track: Nvidia NIM Backend (@ncrispino, nickcrispino)#
  • PR: #962

  • First-class provider integration for NVIDIA Inference Microservices

  • Support for NVIDIA-hosted models via NIM API

  • Status: βœ… Completed in v0.1.58

Track: Quality Rethinking Subagent (@ncrispino, nickcrispino)#
  • PR: #964

  • New quality_rethinking subagent type for targeted per-element craft improvements

  • Explicit improve/preserve listings in checklists with better label refresh ordering

  • Subagent hardening: better β€˜@’ parsing, error handling for multiple submit_checklist calls

  • Status: βœ… Completed in v0.1.58

Track: Coding Agent Enhancements (@ncrispino, nickcrispino)#
  • PR: #251

  • Enhanced file operations and workspace management

  • Shipping: Continuous improvement


🎯 Long-Term Vision (v0.2.0+)#

Advanced Orchestration Patterns

  • Advanced task decomposition strategies and parallel coordination

  • Assignment of agents to specific tasks and increasing of diversity

  • Improvement in voting as tasks continue

Self-Learning & Adaptation

  • Agents learn from past executions to improve future performance

  • Automatic skill acquisition from successful task completions

  • Feedback loops for continuous improvement

  • Memory systems for retaining learned patterns across sessions

Visual Workflow Designer

  • No-code multi-agent workflow creation

  • Drag-and-drop agent configuration

  • Real-time testing and debugging

Enterprise Features

  • Role-based access control (RBAC)

  • Audit logs and compliance reporting

  • Multi-user collaboration

  • Advanced analytics and cost tracking

Additional Framework Integrations

  • LangChain agent support

  • CrewAI compatibility

  • Custom framework adapters

Complete Multimodal Pipeline

  • End-to-end audio processing (speech-to-text, text-to-speech)

  • Video understanding and generation

  • Advanced document processing (PDF, Word, Excel)


πŸ”— GitHub Integration#

Track development progress:


🀝 Contributing#

Interested in contributing? You have two options:

Option 1: Join an Existing Track

  1. See Contributors & Contact table above for active tracks

  2. Contact the track owner via Discord to discuss your ideas

  3. Follow CONTRIBUTING.md for development process

Option 2: Create Your Own Track

  1. Have a significant feature idea? Propose a new track!

  2. Reach out via the #massgen channel on Discord

  3. Work with the MassGen dev team to integrate your track into the roadmap

  4. Become a track owner and guide other contributors

See CONTRIBUTING.md for development setup, code standards, testing, and documentation requirements.


See Also#

β€”