README.md: 9 additions & 10 deletions
@@ -30,25 +30,24 @@ more complex codebases.
 
 ## What Our "End Users" Say
 
-While it is humans who download and set up Serena, our end users are essentially AI agents,
-so they are also in the best position to evaluate Serena.
-We crafted an unbiased and detailed prompt which leads the agent to estimate the value of adding Serena's tools
-to its built-in capabilities. The thorough evaluation usually takes around 25 minutes (per project)
-and tests every single aspect of Serena. A summary of what the agents had to say:
+While it is humans who download and set up Serena, our end users are essentially AI agents,
+so they are also in the best position to evaluate Serena.
+We crafted an unbiased and detailed evaluation prompt
+that leads the agent to perform ~20 routine coding tasks, representative of everyday development work, using both Serena's tools and its own built-ins,
+measure the differences, and report the results. A one-sentence summary of what the agents had to say:
 
 **Opus 4.6 (high effort) in Claude Code on a large Python codebase:**
 > "Serena's IDE-backed semantic tools are the single most impactful addition to my toolkit — cross-file renames, moves, and reference lookups that
 would cost me 8–12 careful, error-prone steps collapse into one atomic call, and I would absolutely ask any developer I work with to set them up."
 
-**Gpt 5.4 (high) in Codex CLI on a Java codebase:**
+**GPT 5.4 (high) in Codex CLI on a Java codebase:**
 > "As a coding AI agent, I would ask my owner to add Serena because it gives me the missing IDE-level understanding of symbols, references, and
 refactorings, turning fragile text surgery into calmer, faster, more confident code changes where semantics matter."
 
-Your agent deserves the best coding tools, give them Serena!
+Give your agent the tools it's been asking for and add Serena MCP to your client!
 
-See the documentation on our [evaluation methods](https://oraios.github.io/serena/04-evaluation/000_intro.html) and the
-detailed results beyond the brief recommendations above. You can easily run your own evaluation of Serena on a project of your choice
-by reusing our methods or adapting them to your needs.
+See our [documentation](https://oraios.github.io/serena/04-evaluation/000_intro.html) for the full methodology and much more detailed evaluation results
+beyond these brief summaries, or run your own evaluation on a project of your choice.