README.md (+10 -7: 10 additions & 7 deletions)
@@ -30,11 +30,14 @@ more complex codebases.

## What Our "End Users" Say

-While it is humans who download and set up Serena, our end users are essentially AI agents,
-so they are also in the best position to evaluate Serena.
-We crafted an unbiased and detailed evaluation prompt
-that leads the agent to perform ~20 routine coding tasks, representative of everyday development work, using both Serena's tools and its own built-ins,
-measure the differences, and report the results. A one-sentence summary of what the agents had to say:
+While it is humans who download and set up Serena, our end users are essentially AI agents.
+As the ones actually applying Serena's tools, they are in the best position to evaluate Serena.
+
+We crafted an unbiased evaluation prompt that leads the agent to perform ~20 routine coding tasks,
+representative of everyday development work,
+in order to compare Serena's tools with its own built-ins, measure the differences, and report the results.
+
+Here's a one-sentence summary of what the agents had to say:

**Opus 4.6 (high effort) in Claude Code on a large Python codebase:**
> "Serena's IDE-backed semantic tools are the single most impactful addition to my toolkit — cross-file renames, moves, and reference lookups that
@@ -44,9 +47,9 @@ would cost me 8–12 careful, error-prone steps collapse into one atomic call, a
> "As a coding AI agent, I would ask my owner to add Serena because it gives me the missing IDE-level understanding of symbols, references, and
refactorings, turning fragile text surgery into calmer, faster, more confident code changes where semantics matter."

-Give your agent the tools it's been asking for and add Serena MCP to your client!
+Give your agent the tools it has been asking for and add Serena MCP to your client!

-See our [documentation](https://oraios.github.io/serena/04-evaluation/000_intro.html) for the full methodology and much more detailed evaluation results
+See our [documentation](https://oraios.github.io/serena/04-evaluation/000_evaluation-intro.html) for the full methodology and much more detailed evaluation results
beyond these brief summaries, or run your own evaluation on a project of your choice.