| 1 | +# MAKER |
| 2 | + |
| 3 | +**MAKER** (**M**aximal **A**gentic decomposition, first-to-ahead-by-**K** **E**rror correction, and **R**ed-flagging) is a task-agnostic orchestrator for long-horizon problems. It decomposes work into many small steps; at each step it samples LLM outputs, discards bad ones (red-flagging), and commits only when one parsed answer leads the next-best by `k` votes (“first-to-ahead-by-k”). |
| 4 | + |
| 5 | +This implementation follows the framework described in *Solving a Million-Step LLM Task with Zero Errors* (Meyerson et al., 2025) — [arXiv:2511.09030](https://arxiv.org/abs/2511.09030). |
| 6 | + |
| 7 | +**Import:** `from swarms.structs.maker import MAKER` |
| 8 | + |
| 9 | +## When to use MAKER |
| 10 | + |
| 11 | +| Use MAKER when… | Consider something else when… | |
| 12 | +|-----------------|-------------------------------| |
| 13 | +| You can express the problem as a fixed or conditionally bounded sequence of steps | You need a fixed DAG of different agents ([GraphWorkflow](graph_workflow.md)) | |
| 14 | +| Each step should be a single focused LLM call with statistical agreement | You want multi-agent debate + judge ([DebateWithJudge](debate_with_judge.md)) | |
| 15 | +| You care about per-step reliability (voting + validation) over raw speed | You only need one-shot or simple majority across agents ([MajorityVoting](majorityvoting.md)) | |
| 16 | + |
| 17 | +## How it works |
| 18 | + |
| 19 | +```mermaid |
| 20 | +flowchart TD |
| 21 | + T[Task + max_steps] --> S[For each step] |
| 22 | + S --> V[Sample votes via Agent.run] |
| 23 | + V --> R{Red-flag?} |
| 24 | + R -->|invalid / exception| V |
| 25 | + R -->|valid| P[parse_response → hashable result] |
| 26 | + P --> C{Leader ahead by k?} |
| 27 | + C -->|no| V |
| 28 | + C -->|yes| U[update_state, append result] |
| 29 | + U --> S |
| 30 | +``` |
| 31 | + |
| 32 | +1. **MAD (maximal agentic decomposition):** `run` executes `max_steps` iterations (`run_until_condition` may stop earlier). Each iteration is one micro-step whose prompt `format_prompt` builds from the task, the optional state, the step index, and the previous step’s result. |
| 33 | +2. **First-to-ahead-by-k voting:** Parsed answers are counted until one candidate’s count is at least `k` greater than every other candidate’s count (`do_voting`). The optional **`run_parallel_voting`** batches the first round of samples with a thread pool. |
| 34 | +3. **Red-flagging:** Before parsing, `validate_response` can reject a sample. The default rejects empty responses and responses whose estimated length exceeds `max_tokens`. |
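
The voting and red-flagging rules above can be sketched in a few lines of plain Python. This is an illustration only, not the code in `swarms/structs/maker.py`: the function name `first_to_ahead_by_k` and its arguments are hypothetical, and the samples are a fixed list instead of live LLM calls.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable, Optional


def first_to_ahead_by_k(
    samples: Iterable[str],
    k: int,
    parse: Callable[[str], Hashable],
    validate: Callable[[str], bool],
) -> Optional[Hashable]:
    """Return the first candidate that leads every other by k votes, else None.

    Hypothetical helper for illustration; MAKER's own `do_voting` may differ.
    """
    counts: Counter = Counter()
    for raw in samples:
        if not validate(raw):
            continue  # red-flagged: discarded, never counted as a vote
        counts[parse(raw)] += 1
        ranked = counts.most_common(2)
        leader, lead_count = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        if lead_count - runner_up >= k:
            return leader
    return None  # samples exhausted before any candidate led by k


# Stand-in samples; the empty string is rejected by the validator.
votes = ["B", "", "A", "A", "B", "A", "A"]
winner = first_to_ahead_by_k(
    votes, k=2, parse=str.strip, validate=lambda s: bool(s.strip())
)
print(winner)  # A
```

Note that a simple majority would have committed earlier; the lead requirement keeps sampling while the race is close.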
| 35 | + |
| 36 | +## Constructor parameters |
| 37 | + |
| 38 | +| Parameter | Role | |
| 39 | +|-----------|------| |
| 40 | +| `model_name`, `system_prompt`, `max_tokens`, `temperature`, `temperature_first` | Passed through to per-step `Agent` instances (first vote often uses `temperature_first=0`). | |
| 41 | +| `k` | Votes a winner must lead the runner-up by (higher ⇒ more reliable, more cost). | |
| 42 | +| `format_prompt(task, state, step_idx, previous_result)` | Builds the user prompt for the current step. | |
| 43 | +| `parse_response(text)` | Turns raw LLM output into a **hashable** result for voting (strings, numbers, tuples of primitives, etc.). | |
| 44 | +| `validate_response(text, max_tokens)` | Returns `False` to discard a sample. | |
| 45 | +| `update_state(state, result, step_idx)` | Fold step output into state (default: unchanged). | |
| 46 | +| `initial_state` | Starting state for `run` / `run_until_condition`. | |
| 47 | +| `max_workers` | Thread pool size for `run_parallel_voting` (default: `k`). | |
| 48 | +| `max_retries_per_step` | Cap on samples per step before `RuntimeError`. | |
| 49 | +| `agents` | Optional list of pre-built `Agent`s; votes cycle through this pool instead of creating fresh micro-agents. | |
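
To get a feel for how `k` trades reliability against cost, the sketch below uses a deliberately simplified model: every sample is independent, correct with probability `p`, and all wrong samples agree on a single alternative. Under those assumptions the vote is a gambler's-ruin race, and the correct answer wins a step with probability 1 / (1 + ((1 - p) / p)^k). The function names are hypothetical, and this is not necessarily the calculation `estimate_cost` performs.

```python
from typing import Optional


def step_success_probability(p: float, k: int) -> float:
    """Chance the correct answer is first to lead by k (two-candidate model)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    ratio = (1.0 - p) / p
    return 1.0 / (1.0 + ratio ** k)


def min_k_for_target(
    p: float, total_steps: int, target: float, k_max: int = 50
) -> Optional[int]:
    """Smallest k whose all-steps success probability reaches the target."""
    for k in range(1, k_max + 1):
        if step_success_probability(p, k) ** total_steps >= target:
            return k
    return None  # no k up to k_max is enough (for example when p <= 0.5)


# Illustrative inputs only, not measured values.
print(min_k_for_target(p=0.99, total_steps=1_000_000, target=0.95))
```

In this model the wrong-answer odds shrink geometrically with `k`, so a small increase in `k` buys a large gain in reliability whenever `p` is clearly above 0.5.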
| 50 | + |
| 51 | +## Main methods |
| 52 | + |
| 53 | +| Method | Description | |
| 54 | +|--------|-------------| |
| 55 | +| `run(task, max_steps)` | Run exactly `max_steps` voting rounds; returns `list` of per-step results. | |
| 56 | +| `run_until_condition(task, stop_condition, max_steps=1000)` | Like `run`, but before each step the loop checks `stop_condition(state, results, step_idx)`; if true, it exits without running another vote for that index. | |
| 57 | +| `run_parallel_voting(task, max_steps)` | Like `run` but uses parallel sampling for the first batch of votes per step. | |
| 58 | +| `get_statistics()` | Copy of internal counters (samples, votes, red-flags, per-step vote/sample lists). | |
| 59 | +| `reset()` | Clears stats and conversation. | |
| 60 | +| `estimate_cost(total_steps, target_success_probability=0.95)` | Heuristic cost / `k` guidance from paper-style estimates (uses run statistics when available). | |
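
The `update_state` and `stop_condition` callbacks are ordinary functions, so they can be tried without any LLM call. The sketch below uses the signatures from the tables above, but the state layout, the `"DONE"` sentinel, and the stand-in loop are assumptions for illustration. It also assumes `update_state` returns the new state; check `swarms/structs/maker.py` before relying on that.

```python
from typing import Any, Dict, List


def update_state(state: Dict[str, Any], result: str, step_idx: int) -> Dict[str, Any]:
    """Return a new state with the committed result appended (state is not mutated)."""
    return {**state, "lines": state["lines"] + [result]}


def stop_condition(state: Dict[str, Any], results: List[str], step_idx: int) -> bool:
    """Stop once the most recently committed result is the sentinel 'DONE'."""
    return bool(results) and results[-1] == "DONE"


# Stand-in for the run_until_condition loop: each string plays the role of
# the answer a voting round would have committed.
state: Dict[str, Any] = {"lines": []}
results: List[str] = []
committed_answers = ["draft outline", "write tests", "DONE", "never reached"]
for step_idx, committed in enumerate(committed_answers):
    if stop_condition(state, results, step_idx):  # checked before each step
        break
    state = update_state(state, committed, step_idx)
    results.append(committed)

print(results)  # ['draft outline', 'write tests', 'DONE']
```

With the real class, these functions would be passed as `update_state=` to the constructor and as `stop_condition=` to `run_until_condition`, together with an `initial_state` such as `{"lines": []}`.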
| 61 | + |
| 62 | +## Minimal example |
| 63 | + |
| 64 | +```python |
| 65 | +from swarms.structs.maker import MAKER |
| 66 | + |
| 67 | + |
| 68 | +def format_prompt(task, state, step_idx, previous_result): |
| 69 | + prev = f"\nPrevious: {previous_result}" if previous_result is not None else "" |
| 70 | + return f"{task}\nStep {step_idx + 1} of the plan. One short line only.{prev}" |
| 71 | + |
| 72 | + |
| 73 | +def parse_response(response: str) -> str: |
| 74 | + return response.strip().splitlines()[0] |
| 75 | + |
| 76 | + |
| 77 | +def validate_response(response: str, max_tokens: int) -> bool: |
| 78 | + if not response.strip(): |
| 79 | + return False |
| 80 | + return len(response) // 4 <= max_tokens # rough token estimate, same idea as default |
| 81 | + |
| 82 | + |
| 83 | +maker = MAKER( |
| 84 | + name="LineByLine", |
| 85 | + model_name="gpt-4.1-mini", |
| 86 | + system_prompt="Answer in one short line per step.", |
| 87 | + format_prompt=format_prompt, |
| 88 | + parse_response=parse_response, |
| 89 | + validate_response=validate_response, |
| 90 | + k=2, |
| 91 | + verbose=True, |
| 92 | +) |
| 93 | + |
| 94 | +results = maker.run(task="List three benefits of unit tests, one per step.", max_steps=3) |
| 95 | +print(results) |
| 96 | +``` |
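
Voting only works if equivalent answers parse to the same hashable value. The example above votes on raw first lines, so two answers that differ only in wording or case count as different candidates. A stricter `parse_response` can canonicalize the output into a tuple, as in this sketch. The `move disk N from X to Y` format and the name `parse_move` are made up for illustration. Raising on unparseable text relies on the flowchart's "invalid / exception" branch, which resamples in that case.

```python
import re
from collections import Counter
from typing import Tuple

# Hypothetical answer format: "move disk <n> from <peg> to <peg>".
_MOVE = re.compile(
    r"move\s+disk\s+(\d+)\s+from\s+([A-C])\s+to\s+([A-C])", re.IGNORECASE
)


def parse_move(response: str) -> Tuple[int, str, str]:
    """Canonicalize a move description into a hashable (disk, source, target) tuple."""
    match = _MOVE.search(response)
    if match is None:
        raise ValueError(f"no move found in response: {response!r}")
    disk, source, target = match.groups()
    return int(disk), source.upper(), target.upper()


# Two differently worded samples collapse into a single voting candidate.
samples = ["Move disk 1 from a to C.", "I will move  disk 1 from A to c"]
votes = Counter(parse_move(sample) for sample in samples)
print(votes)
```

Lists and dicts are not hashable, so structured answers need to be converted to tuples or canonical strings before they are returned.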
| 97 | + |
| 98 | +## Related |
| 99 | + |
| 100 | +- Source: `swarms/structs/maker.py` (module and class docstrings mirror this behavior). |
| 101 | +- [MajorityVoting](majorityvoting.md) — multi-agent loops with a consensus agent, not step-wise first-to-ahead-by-k on a decomposed trajectory. |