fix(opencypher): collect variables from ListPredicateExpression in WHERE clause#3891

Draft
subha0319 wants to merge 2 commits into ArcadeData:main from subha0319:main

Conversation

@subha0319

What does this PR do?

Fixes a bug where any(), none(), all(), and single() list predicates were silently ignored or inverted when used as filter conditions in a WHERE clause. The predicates worked correctly when used as expressions in RETURN, but had no effect (or the opposite effect) in WHERE.

The fix adds handling for ListPredicateExpression in WhereClause.collectExpressionVariables() so that outer variables referenced inside the predicate (e.g. p from p.name) are correctly collected, while the loop-scoped iterator variable (e.g. x) is excluded as it is locally bound.
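The scoping rule described above can be sketched in isolation. This is a minimal illustration on hypothetical, simplified AST types (`Variable`, `Property`, `ListPredicate`, and so on are invented here and are not ArcadeDB's real classes); it only shows the idea of collecting outer variables while excluding the iterator.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified AST; these are NOT ArcadeDB's real classes.
class ListPredicateVariablesSketch {

  interface Expr {}

  // A bare variable reference such as `x`.
  record Variable(String name) implements Expr {}

  // A property access such as `p.name`.
  record Property(String variable, String property) implements Expr {}

  // A string literal such as 'Alice'.
  record Literal(String value) implements Expr {}

  // A list literal such as ['Alice'].
  record ListLiteral(List<Expr> items) implements Expr {}

  // A comparison such as `x = p.name`.
  record Equals(Expr left, Expr right) implements Expr {}

  // any/all/none/single(iterator IN list WHERE filter)
  record ListPredicate(String kind, String iterator, Expr list, Expr filter) implements Expr {}

  // Collects the variables an expression needs from the enclosing scope.
  static Set<String> collect(Expr expr) {
    Set<String> out = new TreeSet<>();
    collectInto(expr, out);
    return out;
  }

  private static void collectInto(Expr expr, Set<String> out) {
    if (expr instanceof Variable v) {
      out.add(v.name());
    } else if (expr instanceof Property p) {
      out.add(p.variable());
    } else if (expr instanceof ListLiteral l) {
      for (Expr item : l.items())
        collectInto(item, out);
    } else if (expr instanceof Equals e) {
      collectInto(e.left(), out);
      collectInto(e.right(), out);
    } else if (expr instanceof ListPredicate lp) {
      // The list expression is evaluated in the outer scope.
      collectInto(lp.list(), out);
      // Gather the filter's variables separately, then drop the iterator,
      // which is bound only inside the predicate.
      Set<String> inner = new TreeSet<>();
      collectInto(lp.filter(), inner);
      inner.remove(lp.iterator());
      out.addAll(inner);
    }
    // Literals contribute no variables.
  }

  public static void main(String[] args) {
    // any(x IN ['Alice'] WHERE x = p.name)
    Expr predicate = new ListPredicate("any", "x",
        new ListLiteral(List.of(new Literal("Alice"))),
        new Equals(new Variable("x"), new Property("p", "name")));
    System.out.println(collect(predicate)); // prints [p]
  }
}
```

Collecting the list expression before removing the iterator keeps an outer variable that happens to share the iterator's name, which is one reasonable way to handle shadowing in this sketch.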

Motivation

Issue #3888 was reported with a clear reproduction case. A query like:

MATCH (p:Person)
WHERE any(x IN ['Alice'] WHERE x = p.name)
RETURN p.name AS name
ORDER BY name

was returning all rows (Alice, Bob, Charlie) instead of only Alice. The complementary NOT any(...) form also behaved incorrectly, returning no rows when all non-matching rows should survive. Neo4j handles both cases correctly.

The root cause was that WhereClause.collectExpressionVariables() had no branch for ListPredicateExpression. As a result, variable extraction returned an empty set, which made the extractForVariables() logic skip the predicate entirely during WHERE filter evaluation, effectively treating it as always true.
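As a toy model of the mechanism described above (not ArcadeDB's actual extractForVariables(); the `Condition` record, the `selectFor` method, and its selection rule are assumptions made for illustration), the sketch below shows how a condition whose variable set is empty can end up never being applied as a filter:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy model only; names and the selection rule are hypothetical.
class ConditionSelectionSketch {

  // A WHERE condition together with the outer variables it was found to reference.
  record Condition(String text, Set<String> variables) {}

  // Selects the conditions that should be applied once `bound` variables are available.
  // Assumed rule: a condition is selected only if it references at least one
  // variable and all of its variables are bound.
  static List<Condition> selectFor(Set<String> bound, List<Condition> all) {
    List<Condition> selected = new ArrayList<>();
    for (Condition c : all) {
      if (!c.variables().isEmpty() && bound.containsAll(c.variables()))
        selected.add(c);
    }
    return selected;
  }

  public static void main(String[] args) {
    String text = "any(x IN ['Alice'] WHERE x = p.name)";

    // Variable collection that misses the list predicate yields an empty set.
    Condition missed = new Condition(text, Set.of());
    // Variable collection that handles the list predicate yields {p}.
    Condition collected = new Condition(text, Set.of("p"));

    System.out.println(selectFor(Set.of("p"), List.of(missed)).size());    // prints 0
    System.out.println(selectFor(Set.of("p"), List.of(collected)).size()); // prints 1
  }
}
```

Under this assumed rule, the condition with no collected variables is never selected, so no row is ever rejected by it.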

Related issues

Additional Notes

  • The logic change is confined to a single method in WhereClause.java. No changes to ListPredicateExpression.java or the executor layer were needed, as the per-item evaluation logic (evaluateAny, testItem, etc.) was already correct.
  • Several test failures observed during mvn clean package on Windows are pre-existing and reproducible on the upstream ArcadeData/arcadedb main branch without any of this PR's changes. These failures are unrelated to the list predicate fix.
  • The two new test cases added for this fix, testAnyPredicateInWhere and testNotAnyPredicateInWhere, pass successfully.
  • Fix applies equally to all four predicate types: any(), none(), all(), single().
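For reference, the intended semantics of the four predicates can be sketched with plain Java collections. This is an illustration only: the class and method names are invented here, Cypher's null handling is ignored, and it does not reflect how ListPredicateExpression is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Illustrative semantics only; null handling in Cypher is ignored.
class ListPredicateSemanticsSketch {

  // any(x IN items WHERE test): at least one item matches.
  static <T> boolean any(List<T> items, Predicate<T> test) {
    return items.stream().anyMatch(test);
  }

  // all(x IN items WHERE test): every item matches.
  static <T> boolean all(List<T> items, Predicate<T> test) {
    return items.stream().allMatch(test);
  }

  // none(x IN items WHERE test): no item matches.
  static <T> boolean none(List<T> items, Predicate<T> test) {
    return items.stream().noneMatch(test);
  }

  // single(x IN items WHERE test): exactly one item matches.
  static <T> boolean single(List<T> items, Predicate<T> test) {
    return items.stream().filter(test).count() == 1;
  }

  // Models `WHERE [NOT] any(x IN allowed WHERE x = name)` applied to each name.
  static List<String> filterNames(List<String> names, List<String> allowed, boolean negate) {
    List<String> kept = new ArrayList<>();
    for (String name : names) {
      boolean matches = any(allowed, x -> x.equals(name));
      if (matches != negate)
        kept.add(name);
    }
    return kept;
  }

  public static void main(String[] args) {
    List<String> names = List.of("Alice", "Bob", "Charlie");
    List<String> allowed = List.of("Alice");
    System.out.println(filterNames(names, allowed, false)); // prints [Alice]
    System.out.println(filterNames(names, allowed, true));  // prints [Bob, Charlie]
  }
}
```

The two printed results correspond to the rows expected by testAnyPredicateInWhere and testNotAnyPredicateInWhere.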

Checklist

  • I have run the build using mvn clean package command
  • My unit tests cover both failure and success scenarios

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request implements variable collection for list predicate expressions in OpenCypher WHERE clauses, ensuring that variables within predicates like 'any()' are correctly identified. Feedback was provided regarding the placement and implementation of the new test cases. Specifically, the tests were added to an inappropriate nested class lacking the necessary vertex type setup, used inconsistent indentation, and utilized JUnit assertions instead of the project's standard AssertJ library. An unnecessary JUnit import should also be removed.

Comment on lines 934 to 967
@Test
public void testAnyPredicateInWhere() {
// Cleanup first to ensure a clean state
database.command("opencypher", "MATCH (n:Person) DETACH DELETE n");

database.command("opencypher",
"CREATE (:Person {name:'Alice'}), (:Person {name:'Bob'}), (:Person {name:'Charlie'})");

ResultSet rs = database.query("opencypher",
"MATCH (p:Person) WHERE any(x IN ['Alice'] WHERE x = p.name) RETURN p.name AS name ORDER BY name");

List<String> names = new ArrayList<>();
while (rs.hasNext()) names.add(rs.next().getProperty("name"));

assertEquals(List.of("Alice"), names);
}

@Test
public void testNotAnyPredicateInWhere() {
// Cleanup first to ensure a clean state
database.command("opencypher", "MATCH (n:Person) DETACH DELETE n");

database.command("opencypher",
"CREATE (:Person {name:'Alice'}), (:Person {name:'Bob'}), (:Person {name:'Charlie'})");

ResultSet rs = database.query("opencypher",
"MATCH (p:Person) WHERE NOT any(x IN ['Alice'] WHERE x = p.name) RETURN p.name AS name ORDER BY name");

List<String> names = new ArrayList<>();
while (rs.hasNext()) names.add(rs.next().getProperty("name"));

assertEquals(List.of("Bob", "Charlie"), names);
}
}
Contributor

high

These tests are logically misplaced inside the NotOperatorParenthesesRegression nested class, which is intended for a different set of regression tests. Furthermore, the setUp method of this nested class does not create the Person vertex type, which will cause these tests to fail. Moving them to the main OpenCypherWhereClauseTest class ensures they use the correct database setup and follow the file's 2-space indentation and AssertJ usage style.

  }

  @Test
  void testAnyPredicateInWhere() {
    database.command("opencypher", "MATCH (n:Person) DETACH DELETE n");
    database.command("opencypher",
        "CREATE (:Person {name:'Alice'}), (:Person {name:'Bob'}), (:Person {name:'Charlie'})");

    ResultSet rs = database.query("opencypher",
        "MATCH (p:Person) WHERE any(x IN ['Alice'] WHERE x = p.name) RETURN p.name AS name ORDER BY name");

    List<String> names = new ArrayList<>();
    while (rs.hasNext()) names.add(rs.next().getProperty("name"));

    assertThat(names).containsExactly("Alice");
  }

  @Test
  void testNotAnyPredicateInWhere() {
    database.command("opencypher", "MATCH (n:Person) DETACH DELETE n");
    database.command("opencypher",
        "CREATE (:Person {name:'Alice'}), (:Person {name:'Bob'}), (:Person {name:'Charlie'})");

    ResultSet rs = database.query("opencypher",
        "MATCH (p:Person) WHERE NOT any(x IN ['Alice'] WHERE x = p.name) RETURN p.name AS name ORDER BY name");

    List<String> names = new ArrayList<>();
    while (rs.hasNext()) names.add(rs.next().getProperty("name"));

    assertThat(names).containsExactly("Bob", "Charlie");
  }


import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;
import static org.junit.jupiter.api.Assertions.assertEquals;
Contributor

medium

This import is unnecessary if we stick to AssertJ (assertThat) which is used throughout the rest of the file. It's better to maintain consistency in the testing framework used.

@codacy-production

codacy-production bot commented Apr 17, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues


@lvca
Member

lvca commented Apr 17, 2026

Thanks for the PR! Here are my findings after reviewing the change and reproducing locally.

Critical

1. The new tests pass without the fix.
I checked out current main (PR fix NOT applied), cherry-picked only the test file from this PR, and ran OpenCypherWhereClauseTest: all 48 tests pass, including testAnyPredicateInWhere and testNotAnyPredicateInWhere. I also wrote an independent reproduction at the outer-class level (with the proper Person schema) and it also passes without the fix. The described symptom (query returning Alice, Bob, Charlie) is not reproducible on current main, so these are not valid regression tests.

2. Tests are placed in the wrong nested class.
They were appended at the end of the file, which puts them inside NotOperatorParenthesesRegression. That nested class has its own @BeforeEach creating DOCUMENT/CHUNK/IMAGE only - there is no Person vertex type. The tests only work by implicit auto-creation on CREATE (:Person ...), which is fragile and unrelated to what the inner class is testing. Please either move them to the outer class or add a new nested class (e.g. AnyPredicateInWhereRegression) with a proper setUp that creates Person.

3. Project conventions not followed.
Uses JUnit assertEquals instead of AssertJ assertThat(...).containsExactly(...). The codebase (and CLAUDE.md) mandate AssertJ. The added import static org.junit.jupiter.api.Assertions.assertEquals; should be removed.

Style

  • WhereClause.java:182-184: indentation is broken. The else if sits on a separate line after the previous }, and the first body line is 4-space indented while the rest uses 6-space. Please join } else if (...) onto one line and normalize indentation to match the surrounding branches.
  • The new test methods use inconsistent indentation (8 vs 4 spaces) instead of the project's 2-space standard.

Design / completeness

What's good

  • Diff is minimal and non-invasive.
  • The fix logic (collect outer variables, exclude the loop-bound iterator) is semantically correct and matches Neo4j behavior.
  • OpenCypherListPredicateTest (20 tests) and the rest of OpenCypherWhereClauseTest still pass with the fix applied.

Requested follow-up

Because the tests don't distinguish "with fix" from "without fix" on main, could you:

  1. Provide a failing query/test on a specific main SHA that actually exercises the bug (possibly only via the HTTP API serializer: studio path mentioned in the issue), or
  2. Reframe the change on optimization grounds (enabling filter pushdown for list predicates) rather than as a correctness fix?

Once the points above are addressed, I'm happy to take another look.

@codecov

codecov bot commented Apr 18, 2026

Codecov Report

❌ Patch coverage is 0% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.50%. Comparing base (e076207) to head (1766f52).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...com/arcadedb/query/opencypher/ast/WhereClause.java | 0.00% | 6 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3891      +/-   ##
==========================================
- Coverage   64.71%   57.50%   -7.22%     
==========================================
  Files        1581     1581              
  Lines      117023   117030       +7     
  Branches    24858    24860       +2     
==========================================
- Hits        75735    67299    -8436     
- Misses      30925    40148    +9223     
+ Partials    10363     9583     -780     


@subha0319 subha0319 marked this pull request as draft April 18, 2026 07:51

Development

Successfully merging this pull request may close these issues.

any(...) list predicates may be evaluated correctly as expressions in RETURN, but ignored or inverted when used in WHERE.

2 participants