Skip to content

fix(content): replace unbounded query loop with LIMIT-1 DB call in validateRelationships() (#35489)#35491

Open
fabrizzio-dotCMS wants to merge 6 commits intomainfrom
fix/issue-35489-validate-relationships-limit
Open

fix(content): replace unbounded query loop with LIMIT-1 DB call in validateRelationships() (#35489)#35491
fabrizzio-dotCMS wants to merge 6 commits intomainfrom
fix/issue-35489-validate-relationships-limit

Conversation

@fabrizzio-dotCMS
Copy link
Copy Markdown
Member

Problem

Re-saving a contentlet with required many-to-many relationships took up to 10 minutes on environments with ~500 related records and ~40 configured languages. Profiling showed 99% of CPU time in ESContentletAPIImpl.validateRelationships().

Two hotspots caused an N×M SQL query explosion (500 identifiers × 40 languages = 20 000+ queries per save):

  1. Required-relationship existence check (the !foundInRelationships path) called getRelatedContent(), which routes through RelationshipCache and fires one contentlet_version_info SELECT per identifier per language — just to answer a boolean "does anything exist?".

  2. ONE_TO_MANY cardinality check inside the per-child loop also called getRelatedContent() unconditionally for every cardinality, including MANY_TO_MANY, triggering the same cache-miss explosion even when the result was never needed.

Fix

Both calls are replaced with FactoryLocator.getRelationshipFactory().dbRelatedContent(relationship, contentlet, hasParent, false, null, 1, 0), which issues a single LIMIT 1 query directly against tree + contentlet_version_info and stops at the first matching row. The cardinality check is also guarded behind if (ONE_TO_MANY) so it is skipped entirely for other cardinalities.

This mirrors the approach from PR #35407 (which fixed a similar bottleneck in RelationshipFactoryImpl for an older branch) and adapts it to the refactored validateRelationships() in main.

Changes

  • dotCMS/src/main/java/com/dotcms/content/elasticsearch/business/ESContentletAPIImpl.java
    • Replaced getRelatedContent() in required-relationship existence check with dbRelatedContent(…, 1, 0)
    • Wrapped ONE_TO_MANY cardinality DB check in if (ONE_TO_MANY) guard and replaced getRelatedContent() with dbRelatedContent(…, 1, 0)

Acceptance Criteria

  • Re-saving a contentlet with required relationships completes in under 10 seconds (baseline: ~10 min)
  • Total JDBC executions during save are in the hundreds, not tens of thousands
  • Required-relationship validation still rejects saves where a required relationship has no children in the DB
  • Required-relationship validation still passes when the relationship is omitted from the payload but children exist in the DB
  • ONE_TO_ONE cardinality validation still rejects saves that link more than one contentlet
  • Content types with no required relationships are unaffected

Test

./mvnw verify -pl :dotcms-integration -Dcoreit.test.skip=false \
  -Dit.test=ESContentletAPIImplTest#testValidateRelationships

References

🤖 Generated with Claude Code

…uery in validateRelationships() (#35489)

Two hotspots in ESContentletAPIImpl.validateRelationships() caused N×M query explosion
(500 related identifiers × 40 languages = 20 000+ SQL queries per save):

1. The required-relationship existence check called getRelatedContent(), which routes
   through RelationshipCache and fires one query per identifier per language.
   Replaced with FactoryLocator.getRelationshipFactory().dbRelatedContent(…, 1, 0)
   — a single LIMIT-1 query on tree + contentlet_version_info that stops at first hit.

2. The ONE_TO_MANY cardinality check inside the per-child loop called
   getRelatedContent() unconditionally for every cardinality (including MANY_TO_MANY),
   triggering the same cache-miss explosion. Guarded the DB call behind an
   if (ONE_TO_MANY) check and replaced it with the same dbRelatedContent(…, 1, 0) call.

Refs: #35489

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Apr 28, 2026

Claude finished @fabrizzio-dotCMS's task in 4m 38s —— View job


PR Review

  • Gather context
  • Read changed file and diff
  • Analyze correctness, edge cases, and design
  • Post review

The fix is structurally sound. Two specific things worth a second look:


1. hasParent=false is hardcoded in the ONE_TO_MANY check — confirm this is always right

ESContentletAPIImpl.java:8702-8706

if (relationship.getCardinality() == RELATIONSHIP_CARDINALITY.ONE_TO_MANY.ordinal()) {
    final List<Contentlet> relatedContents = FactoryLocator
            .getRelationshipFactory()
            .dbRelatedContent(relationship, contentInRelationship,
                    false, false, null, 1, 0);  // hasParent=false always

false (hasParent) means "treat contentInRelationship as the child, fetch its parents." That's correct here because this block is guarded by isParent(relationship, contentType) && isRelationshipParent, so contentsInRelationship are always children from contentlet's perspective. The original passed pullByParent=false explicitly, so semantics match. ✓ — but it would be clearer with a named constant (IS_CHILD = false or similar) since a reader can't tell at a glance why it's hardcoded.


2. No break/continue after ONE_TO_MANY violation — still O(n) DB queries on the error path

ESContentletAPIImpl.java:8713-8716

hasError = true;
builder.addBadCardinalityRelationship(relationship, contentsInRelationship);
// no break — loop continues, firing another LIMIT 1 query per remaining child

After the first violated child is found, the loop keeps running and fires one dbRelatedContent query per remaining element in contentsInRelationship. For the valid case (no violation) this is unavoidable. For the error case with 500 children, it wastes up to 499 extra queries. The ONE_TO_ONE path uses continue to skip the rest of the outer records loop — the ONE_TO_MANY path has no equivalent.

This is pre-existing, and adding a break here is a small separate improvement, not a blocker. Fix this →


Things verified correct

  • checkParenthasParent mapping: checkParent=true (contentlet is parent, fetches children) and checkParent=false (contentlet is child, fetches parents) map exactly to dbRelatedContent's hasParent semantics. ✓
  • live=false matches original behavior: the original getRelatedContent(... live=null, language=-1) fast-path returns filterRelatedContent()'s result, which internally calls dbRelatedContent with WORKING_VERSION=false. No behavior change. ✓
  • DotSecurityException removal: dbRelatedContent only declares DotDataException. Correct cleanup. ✓
  • ONE_TO_ONE check is unaffected: the ONE_TO_ONE cardinality validation at line 8664 is outside the per-child loop and is not touched. ✓
  • Permission bypass is intentional improvement: the original getRelatedContent called PermissionAPI.filterCollection() before returning, which could silently remove a parent the caller had no read access to — causing a false "no conflict" for the ONE_TO_MANY check. dbRelatedContent bypasses permission filtering, making the structural check more reliable. ✓

The two issues above are minor (one is a naming clarity concern, one is pre-existing). The performance fix itself is correct and the logic is sound.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI: Safe To Rollback Area : Backend PR changes Java/Maven backend code

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Slow contentlet save with required relationships — unbounded query loop in validateRelationships()

1 participant