Agent Communication#
MassGen supports collaborative problem-solving through agent-to-agent communication and optional human participation. This enables agents to coordinate, ask for help, and work together more effectively during complex tasks.
Overview#
The communication system allows agents to:
Ask questions to other agents using the
ask_others()toolRequest input, suggestions, or help during coordination
Coordinate on shared resources or dependencies
Optionally include the human user in discussions
Communication is handled through a broadcast channel that:
Spawns shadow agents in parallel to generate responses (agent mode)
Collects responses asynchronously without interrupting working agents
Returns responses to the requesting agent
Optionally prompts the human user for input (human mode)
Note
Backend Limitation: The claude_code backend does not currently support
broadcasting/ask_others(). When Claude Code agents attempt to use ask_others(),
they will see an error message. This is a known limitation tracked in
GitHub Issue #648.
Use other backends (openai, claude, gemini, etc.) for agents that need
to participate in broadcasts.
Communication Modes#
There are three broadcast modes:
- Disabled (default)
Broadcasting is completely disabled. Agents work independently.
orchestrator: coordination: broadcast: false
- Agent-to-agent
Agents can communicate with each other. Questions are broadcast to all other agents who can respond.
orchestrator: coordination: broadcast: "agents"
- Human-only
Agents can ask questions directly to the human user. Other agents are NOT prompted - only the human responds. This is useful when you want human guidance without agent cross-talk.
orchestrator: coordination: broadcast: "human"
Basic Usage#
The ask_others() tool waits for all responses before returning:
# Agent calls ask_others()
result = ask_others("What authentication patterns are already implemented in the codebase?")
# Tool blocks and waits for responses
# Returns: {
# "status": "complete",
# "responses": [
# {"responder_id": "agent_b", "content": "Use OAuth2...", "is_human": False},
# {"responder_id": "agent_c", "content": "I agree...", "is_human": False}
# ]
# }
# Agent can now use responses
for response in result["responses"]:
print(f"{response['responder_id']}: {response['content']}")
Configuration Options#
All broadcast settings are in the orchestrator’s coordination config:
orchestrator:
coordination:
# Broadcast mode: false (disabled) | "agents" | "human"
broadcast: "agents"
# Maximum time to wait for responses (seconds)
broadcast_timeout: 300
# Maximum active broadcasts per agent
max_broadcasts_per_agent: 10
# How frequently agents use ask_others() (optional)
# Options: "low" | "medium" | "high"
broadcast_sensitivity: "medium"
# Response depth for shadow agents (test-time compute scaling)
# Controls how thorough/complex suggested solutions should be
# Options: "low" | "medium" | "high"
response_depth: "medium"
Complete Configuration Examples#
Agent-to-Agent Communication
# massgen/configs/broadcast/test_broadcast_agents.yaml
agents:
- id: agent_a
backend:
type: openai
model: gpt-5
cwd: workspace1
enable_mcp_command_line: true
command_line_execution_mode: docker
- id: agent_b
backend:
type: gemini
model: gemini-3-pro-preview
cwd: workspace2
enable_mcp_command_line: true
command_line_execution_mode: docker
orchestrator:
snapshot_storage: snapshots
agent_temporary_workspace: temp_workspaces
coordination:
broadcast: "agents" # Enable agent-to-agent communication
broadcast_sensitivity: "high" # Agents use ask_others() frequently
response_depth: "medium" # Balanced solution complexity
broadcast_timeout: 300
max_broadcasts_per_agent: 10
Human-in-the-Loop Communication
# massgen/configs/broadcast/test_broadcast_human.yaml
agents:
- id: agent_a
backend:
type: openai
model: gpt-5
cwd: workspace1
enable_mcp_command_line: true
command_line_execution_mode: docker
- id: agent_b
backend:
type: gemini
model: gemini-3-pro-preview
cwd: workspace2
enable_mcp_command_line: true
command_line_execution_mode: docker
orchestrator:
snapshot_storage: snapshots
agent_temporary_workspace: temp_workspaces
coordination:
broadcast: "human" # Human will be prompted for responses
broadcast_sensitivity: "high"
response_depth: "medium" # Balanced solution complexity
broadcast_timeout: 60 # Shorter timeout for interactive sessions
max_broadcasts_per_agent: 5
Human Participation#
When broadcast: "human" is enabled, the human user is the sole responder.
Other agents are NOT prompted - only the human answers questions:
orchestrator:
coordination:
broadcast: "human"
What happens:
Agent calls
ask_others("Question here")Human sees notification in terminal (other agents are NOT notified):
====================================================================== 📢 BROADCAST FROM AGENT_A ====================================================================== What authentication patterns are already implemented in the codebase? ────────────────────────────────────────────────────────────────────── Options: • Type your response and press Enter • Press Enter alone to skip • You have 300 seconds to respond ====================================================================== Your response (or Enter to skip):
Human can: - Type response and press Enter - Press Enter to skip (no response) - Wait for timeout (no response)
Human’s response is returned to the requesting agent
Human Q&A Context Injection#
When multiple agents run in parallel, they may ask similar questions to the human. MassGen prevents redundant prompts through serialization and Q&A history reuse.
Serialization:
In human mode, ask_others() calls are serialized - only one agent can prompt the
human at a time. If Agent B calls ask_others() while Agent A is waiting for a
response, Agent B waits until Agent A’s request completes.
Q&A History Reuse:
Once a human has answered any question, subsequent ask_others() calls return
the existing Q&A history without prompting the human again:
{
"status": "deferred",
"responses": [],
"human_qa_history": [
{"question": "What color theme?", "answer": "Dark mode"}
],
"human_qa_note": "The human has already answered questions this session. Review the history above..."
}
The agent receives the existing Q&A history to check if their question was already
answered. If the existing Q&A answers their question, they can use it directly.
If their question needs different information, they can call ask_others() again
with a more specific question (which will prompt the human for new input).
How it works:
Agent A calls
ask_others("What color theme?")→ acquires lock → prompts humanAgent B calls
ask_others("What style?")→ waits for lock…Human answers “Dark mode” → Q&A stored → Agent A gets response → lock released
Agent B acquires lock → sees Q&A history exists → returns “deferred” with Q&A (NO prompt!)
Agent B uses existing Q&A or asks a different question
Key points:
Human is only prompted once - subsequent calls return existing Q&A
ask_others()calls are serialized (one at a time) in human modeQ&A history persists across all turns within a session
Agents can call
ask_others()again with a different question if neededNote: Q&A history reuse only applies to
broadcast: "human"mode, not agent-to-agent communication
Examples#
Example 2: Getting Help When Stuck#
# Agent is stuck on a bug
result = ask_others(
"I'm seeing a weird authentication error: 'Token signature invalid'. "
"I've verified the secret key is correct. Any ideas what might cause this?"
)
# Review suggestions
for response in result["responses"]:
print(f"💡 Suggestion from {response['responder_id']}: {response['content']}")
Example 3: Hitting a Blocker#
# Agent hits a blocker and needs human input
result = ask_others(
"The Google Maps API requires authentication and I don't have credentials. "
"Should I use a free alternative like OpenStreetMap, or do you have API keys I should use?"
)
# Use human's guidance to proceed
for response in result["responses"]:
print(f"Guidance: {response['content']}")
Example 4: Multiple Viable Approaches#
# Agent needs help choosing between approaches
result = ask_others(
"I can implement the data visualization using either Chart.js (simpler, lighter) "
"or D3.js (more powerful, steeper learning curve). "
"Which would you prefer given your needs?"
)
# Implement based on preference
for response in result["responses"]:
print(f"Decision: {response['content']}")
Example 5: Human Participation (Design Preferences)#
With broadcast: "human" enabled:
# Agent asks for human preferences on design decisions
result = ask_others(
"For the portfolio website, would you prefer: "
"(A) a single-page design with smooth scrolling, or "
"(B) multiple pages with navigation? "
"Also, should I include a contact form or just list your email?"
)
# In human mode, only the human responds
for response in result["responses"]:
print(f"👤 Human: {response['content']}")
Example 6: Clarifying Requirements#
# Agent needs clarification on vague requirements
result = ask_others(
"You mentioned 'professional styling' for the landing page. "
"Do you want a corporate/minimalist look, or something more colorful/creative? "
"Any specific colors or brand guidelines to follow?"
)
# Use clarified requirements
for response in result["responses"]:
print(f"Requirements: {response['content']}")
Example 7: Using Human Q&A History (Deferred Response)#
When Q&A history exists, ask_others() returns immediately with status
“deferred” instead of prompting the human again:
# Agent B calls ask_others (Agent A already asked the human earlier)
result = ask_others("What database should we use?")
# Check if we got a deferred response (Q&A history exists)
if result["status"] == "deferred":
print("Human was NOT prompted - using existing Q&A history:")
for qa in result["human_qa_history"]:
print(f" Q: {qa['question']}")
print(f" A: {qa['answer']}")
# Use existing answers or call ask_others with a different question
# if more specific information is needed
elif result["status"] == "complete":
# This was the first ask_others call - human was prompted
for response in result["responses"]:
print(f"Human: {response['content']}")
Technical Details#
Shadow Agent Architecture (Agent Mode)#
When an agent calls ask_others() in agent mode (broadcast: "agents"),
MassGen uses a shadow agent architecture to generate responses:
Broadcast created with unique
request_idShadow agents are spawned in parallel for each target agent
Each shadow agent:
Shares the parent agent’s backend (stateless, safe to share)
Copies the parent agent’s full conversation history (complete context)
Includes the parent’s current turn streaming content (work in progress)
Uses a simplified system prompt (preserves identity/persona, removes workflow tools)
Generates a tool-free text response
All shadow agent responses are collected simultaneously via
asyncio.gather()Parent agents continue working uninterrupted throughout
Informational messages are injected into parent agents (“FYI, you were asked X and responded Y”)
Requesting agent receives all responses when complete
Why Shadow Agents:
The shadow agent architecture was chosen for two key reasons:
True Parallelization: Shadow agents run completely in parallel without blocking or interrupting the parent agents. The parent agent continues its work while its shadow responds to the broadcast. This maximizes throughput and prevents deadlocks.
System Prompt Control: Shadow agents use a simplified system prompt that:
Preserves the parent’s identity and persona (user-configured system message)
Removes workflow tools (no
vote,new_answer,ask_others)Focuses purely on responding to the question
Prevents the shadow from taking unintended actions or getting confused
Full Context Responses:
Shadow agents have access to the parent agent’s complete context, including:
Conversation history: All prior messages and decisions
Current turn content: The parent’s in-progress streaming output (work being generated)
This means:
Responding agents understand their own prior work and current activities
Responses can reference context from earlier in the conversation
Responses account for what the parent is actively working on
Questions can assume the responder has relevant background
Example:
# ✅ Both work well with shadow agents
ask_others("What do you think about this approach?") # Shadow has context!
ask_others(
"I'm considering using React with TypeScript for the frontend. "
"Do you see any issues with this choice?"
)
Parent Agent Awareness:
After a shadow agent responds, an informational message is injected into the parent agent’s conversation history:
[INFO] While you were working, agent_a asked: "Should we use PostgreSQL?"
Your shadow agent responded: "Yes, PostgreSQL is a good choice because..."
(This is just for your awareness - you may continue your work.)
Response Depth (Test-Time Compute Scaling)#
The response_depth parameter controls test-time compute scaling for shadow
agent responses. This concept is inspired by recent AI research showing that allocating
more compute at inference time leads to better results (e.g., OpenAI’s o1/o3 models).
In MassGen, shadow agents responding to broadcasts represent a form of test-time compute.
The response_depth parameter allows you to control this scaling - similar to how a
human might decide how much effort to put into a response based on task importance.
Options:
lowQuick, simple responses with minimal solutions. Shadow agents will:
Prefer basic technologies (vanilla HTML/CSS/JS, simple libraries)
Avoid complex frameworks or architectures
Focus on getting the job done with minimal dependencies
Keep responses brief and to the point
medium(default)Balanced effort with standard solutions. Shadow agents will:
Use appropriate technology for the task complexity
Include standard best practices without over-engineering
Balance simplicity with maintainability
Be concise but thorough
highThorough, comprehensive responses with sophisticated solutions. Shadow agents will:
Recommend modern frameworks and best practices (React, Next.js, TypeScript, etc.)
Include architecture considerations (SSR, component libraries, testing, CI/CD)
Suggest professional-grade tooling and patterns
Provide detailed responses with examples
Example:
orchestrator:
coordination:
broadcast: "agents"
response_depth: "high" # Get sophisticated, comprehensive suggestions
Use Cases:
Use
lowfor quick prototypes or learning projectsUse
medium(default) for standard development workUse
highfor production systems requiring enterprise-grade solutions
Serialized Human Mode#
When an agent calls ask_others() in human mode (broadcast: "human"):
Agent acquires the
ask_otherslock (waits if another agent holds it)If Q&A history exists → returns “deferred” with history (no human prompt)
If no Q&A history → prompts human and waits for response
Response stored in Q&A history
Lock released, next waiting agent proceeds
This ensures:
Human sees only one prompt at a time
Subsequent agents get existing Q&A without re-prompting
Q&A history persists across all turns in the session
Broadcast Tools#
Built-in tools automatically available when broadcasts are enabled:
ask_others(question: str)Ask question to other agents or human. Waits for and returns all responses.
respond_to_broadcast(answer: str)(Deprecated) This tool is no longer needed. With the shadow agent architecture, broadcast responses are handled automatically by shadow agents. If an agent calls this tool, it will receive a message indicating that responses are handled automatically and it should continue its work.
These are built-in workflow tools, not MCP servers.
Rate Limiting#
Each agent can have at most max_broadcasts_per_agent active broadcasts
(default: 10). This prevents agents from spamming broadcasts.
Troubleshooting#
- Claude Code backend shows “No such tool available: ask_others”
The
claude_codebackend does not currently support broadcastingSee GitHub Issue #648 for status
Use other backends (
openai,claude,gemini) for broadcast-enabled agents
- Broadcasts not working
Check that
broadcastis set to"agents"or"human"(notfalse)Verify all agents are initialized and have
_orchestratorreferenceCheck logs for MCP tool injection messages
- Human prompts not appearing
Ensure
broadcast: "human"is set (not just"agents")Check that
coordination_uiis initializedVerify timeout hasn’t expired
- Timeouts occurring
Increase
broadcast_timeoutif agents need more time to respondCheck agent logs to see if they’re receiving broadcasts
Verify agents aren’t stuck or errored
See Also#
Agent Task Planning - Task planning system for organizing work
YAML Configuration Reference - Complete YAML configuration reference