| 1 | +# Multi-Agent Speech Debate |
| 2 | + |
| 3 | +This tutorial covers a more advanced use case: a turn-based debate between two agents, each of which speaks its responses aloud. Optionally, Speech-to-Text (STT) supplies the initial debate topic. |
| 4 | + |
| 5 | +## Prerequisites |
| 6 | + |
| 7 | +- Python 3.10+ |
| 8 | +- OpenAI API key |
| 9 | +- `swarms` library |
| 10 | +- `voice-agents` library |
| 11 | +- A working microphone (if using STT) |
| 12 | + |
| 13 | +## Tutorial Steps |
| 14 | + |
| 15 | +1. **Install Dependencies** |
| 16 | +    ```bash |
| 17 | +    pip3 install -U swarms voice-agents |
| 18 | +    ``` |
| 19 | + |
| 20 | +2. **Define Agent Personalities** |
| 21 | +    Create distinct system prompts for your agents to ensure a dynamic debate. In this example, we use Socrates and Simone de Beauvoir. |
| 22 | + |
| 23 | +3. **Initialize Agents** |
| 24 | +    Set up two agents with `streaming_on=True` so that each response can be passed to a TTS callback as it is generated. |
| 25 | + |
| 26 | +4. **Create a Debate Loop** |
| 27 | +    Implement a function that alternates turns between the agents, plays each response through that agent's TTS voice, and passes one agent's response to the other as its next input. |
| 28 | + |
| 29 | +5. **Integrate STT (Optional)** |
| 30 | +    Use `record_audio` and `speech_to_text` to capture your own voice as the starting prompt for the debate. |
| 31 | + |
| 32 | +## Code Example |
| 33 | + |
| 34 | +```python |
| 35 | +from swarms import Agent |
| 36 | +from swarms.structs.conversation import Conversation |
| 37 | +from voice_agents.main import speech_to_text, record_audio, StreamingTTSCallback |
| 38 | + |
| 39 | +def debate_with_speech( |
| 40 | +    agents: list, |
| 41 | +    max_loops: int = 1, |
| 42 | +    task: str | None = None, |
| 43 | +    use_stt_for_input: bool = False, |
| 44 | +) -> str: |
| 45 | +    """ |
| 46 | +    Simulate a turn-based debate between two agents with speech capabilities. |
| 47 | + |
| 48 | +    Args: |
| 49 | +        agents (list): A list containing exactly two Agent instances who will debate. |
| 50 | +        max_loops (int): The total number of turns; one agent speaks per turn. |
| 51 | +        task (str | None): The initial prompt or question to start the debate; replaced by the transcription when use_stt_for_input is True. |
| 52 | +        use_stt_for_input (bool): If True, use speech-to-text for the initial task input. |
| 53 | + |
| 54 | +    Returns: |
| 55 | +        str: The formatted conversation history. |
| 56 | +    """ |
| 57 | +    conversation = Conversation() |
| 58 | + |
| 59 | +    # Create TTS callbacks with different voices to differentiate speakers |
| 60 | +    tts_callback1 = StreamingTTSCallback(voice="onyx", model="tts-1")  # Deeper voice |
| 61 | +    tts_callback2 = StreamingTTSCallback(voice="nova", model="tts-1")  # Softer voice |
| 62 | + |
| 63 | +    # Get initial task from STT or provided string |
| 64 | +    if use_stt_for_input: |
| 65 | +        print("Please speak your question or topic for the debate...") |
| 66 | +        audio = record_audio(duration=5.0) |
| 67 | +        task = speech_to_text(audio_data=audio, sample_rate=16000) |
| 68 | +        print(f"Transcribed: {task}\n") |
| 69 | + |
| 70 | +    message = task |
| 71 | +    speaker = agents[0] |
| 72 | +    other = agents[1] |
| 73 | +    current_callback = tts_callback1 |
| 74 | +    other_callback = tts_callback2 |
| 75 | + |
| 76 | +    for i in range(max_loops): |
| 77 | +        print(f"--- Turn {i+1}: {speaker.agent_name} speaking ---") |
| 78 | + |
| 79 | +        # Agent generates response and speaks in real-time |
| 80 | +        response = speaker.run( |
| 81 | +            task=message, |
| 82 | +            streaming_callback=current_callback, |
| 83 | +        ) |
| 84 | +        current_callback.flush() |
| 85 | + |
| 86 | +        conversation.add(speaker.agent_name, response) |
| 87 | + |
| 88 | +        # Swap roles (and voices) for the next turn |
| 89 | +        message = response |
| 90 | +        speaker, other = other, speaker |
| 91 | +        current_callback, other_callback = other_callback, current_callback |
| 92 | + |
| 93 | +    return conversation.return_history_as_string() |
| 94 | + |
| 95 | +# Define System Prompts |
| 96 | +socratic_prompt = "You are Socrates. Challenge every assumption with logic." |
| 97 | +beauvoir_prompt = "You are Simone de Beauvoir. Focus on freedom and existence." |
| 98 | + |
| 99 | +# Instantiate Agents |
| 100 | +agent1 = Agent( |
| 101 | +    agent_name="Socrates", |
| 102 | +    system_prompt=socratic_prompt, |
| 103 | +    model_name="gpt-4o", |
| 104 | +    streaming_on=True, |
| 105 | +) |
| 106 | +agent2 = Agent( |
| 107 | +    agent_name="Simone de Beauvoir", |
| 108 | +    system_prompt=beauvoir_prompt, |
| 109 | +    model_name="gpt-4o", |
| 110 | +    streaming_on=True, |
| 111 | +) |
| 112 | + |
| 113 | +# Run the debate |
| 114 | +history = debate_with_speech( |
| 115 | +    agents=[agent1, agent2], |
| 116 | +    max_loops=3, |
| 117 | +    task="Is freedom an illusion?", |
| 118 | +) |
| 119 | + |
| 120 | +print(history) |
| 121 | +``` |
| 122 | + |
| 123 | +## Key Components |
| 124 | + |
| 125 | +- **Differentiated Voices**: Using "onyx" and "nova" helps the listener distinguish which agent is currently speaking. |
| 126 | +- **Turn-based Logic**: Each agent's response becomes the other agent's next input, creating a continuous dialogue. |
| 127 | +- **STT Integration**: `speech_to_text` allows for hands-free interaction with the swarm. |
| 128 | +- **Conversation Tracking**: The `Conversation` struct helps maintain a record of the entire exchange. |
| 129 | + |
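
The turn-based logic can be tried without an API key, microphone, or speakers. The following is a minimal sketch, not part of the tutorial's code: `EchoAgent` and `alternate_turns` are hypothetical names introduced here for illustration, and `EchoAgent` only mimics the `agent_name` attribute and `run(task=...)` call used in the example. It also adds two input checks that the tutorial function leaves out (exactly two agents, and a non-empty starting task).

```python
from dataclasses import dataclass, field


@dataclass
class EchoAgent:
    """Hypothetical stand-in for an agent: no model call, no audio."""

    agent_name: str
    heard: list = field(default_factory=list)

    def run(self, task: str) -> str:
        # Record what this agent was asked, then produce a canned reply.
        self.heard.append(task)
        return f"{self.agent_name} replies to: {task}"


def alternate_turns(agents, task, max_loops=1):
    """Return (speaker_name, response) pairs for a two-agent exchange."""
    if len(agents) != 2:
        raise ValueError("exactly two agents are required")
    if not task:
        raise ValueError("an initial task is required")

    speaker, other = agents
    message = task
    turns = []
    for _ in range(max_loops):
        response = speaker.run(task=message)
        turns.append((speaker.agent_name, response))
        # The response becomes the other agent's input on the next turn.
        message = response
        speaker, other = other, speaker
    return turns


a = EchoAgent("A")
b = EchoAgent("B")
turns = alternate_turns([a, b], task="Is freedom an illusion?", max_loops=3)
for name, text in turns:
    print(f"{name}: {text}")
```

Because one agent speaks per loop iteration, `max_loops=3` gives the first agent two turns and the second agent one; an even value gives both agents the same number of turns.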