Best of N

Best of N is triggered when you explicitly ask for it in your prompt, for example best of 3, best of 5, or best of 10. Mux interprets that as a request to launch N sibling sub-agents on the same task. The parent does a small amount of up-front analysis to capture shared context and reduce duplicated setup, while leaving room for the children to approach the problem independently. It then waits for the child runs to finish and synthesizes the strongest result.

Use it when you care more about answer quality than speed. In practice it often improves:

plans
deep analysis
debugging
code review
anomaly detection in production metrics
hard math or proof-style work

It tends to help most on tasks where the first plausible answer is often incomplete.
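
The flow described above (a little shared analysis by the parent, N independent child runs, then synthesis) can be sketched roughly as follows. This is an illustrative Python sketch, not Mux's implementation: `best_of_n`, `run_child`, and `fake_child` are hypothetical names, and synthesis is reduced here to picking the highest-scoring candidate, whereas the parent agent may combine results in richer ways.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical child runner: receives shared context, the task, and a child
# index, and returns a (score, answer) pair. The numeric score is an
# assumption made purely for illustration.
Child = Callable[[str, str, int], Awaitable[tuple[float, str]]]


async def best_of_n(task: str, n: int, run_child: Child) -> str:
    # Parent step: a small amount of up-front analysis shared by all children.
    shared_context = f"Shared notes for: {task}"

    # Launch N sibling runs on the same task and wait for all of them.
    results = await asyncio.gather(
        *(run_child(shared_context, task, i) for i in range(n))
    )

    # Simplified synthesis: keep the answer with the highest score.
    _best_score, best_answer = max(results, key=lambda r: r[0])
    return best_answer


async def fake_child(context: str, task: str, index: int) -> tuple[float, str]:
    # Stand-in for a real sub-agent; later children score higher here.
    await asyncio.sleep(0)
    return (float(index), f"candidate {index} for {task!r}")


result = asyncio.run(best_of_n("find the root cause", 3, fake_child))
print(result)
```

Because the children run concurrently, wall-clock time is bounded by the slowest child plus the parent's own work, which is why the pattern trades speed and cost for answer quality.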

Good fits

Gnarly debugging: multiple plausible root causes, weak logs, or flaky repro steps
Deep math: different solution paths may unlock the problem
Code review: broader coverage of correctness, tests, design, and edge cases
Production metrics: several competing explanations for an anomaly

Best-of vs variants

Best-of retries the same ask; variants reuse the same ask with a labeled focus or scope change. Use Best of N when you want multiple independent attempts at one question. Use variants instead when one prompt template should be reused across a few parallel lanes, for example:

Issue lists: “Solve GitHub issues 23, 32, 45” → one issue-solving template, one variant per issue
Commit-range investigation: “Find the source of this week’s regression” → one investigation template, one variant per commit window such as A..B and B..C
Review lanes: “Review the changes” → one review template, one variant per lane such as frontend, backend, tests, and docs
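
One way to picture the difference is in how the child prompts are built. The sketch below is hypothetical (the function names and the `{lane}` placeholder are assumptions for illustration, not something Mux exposes): best-of sends the identical ask to every child, while variants fill one template with a different label per lane.

```python
def best_of_prompts(ask: str, n: int) -> list[str]:
    # Best-of: the same ask for every child, with no per-child differences.
    return [ask] * n


def variant_prompts(template: str, lanes: list[str]) -> dict[str, str]:
    # Variants: one shared template, filled in once per labeled lane.
    return {lane: template.format(lane=lane) for lane in lanes}


best = best_of_prompts("Review the changes", 3)
variants = variant_prompts(
    "Review the changes, focusing on {lane}",
    ["frontend", "backend", "tests", "docs"],
)

for lane, prompt in variants.items():
    print(f"{lane}: {prompt}")
```

With best-of, diversity comes only from the children's independent reasoning; with variants, it comes from the explicit scope given to each lane.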

Start with best of 3 or best of 5. Larger batches cost more and take longer, but can pay off on high-stakes or unusually open-ended problems.
