fix(block-editor): support spaces in contentlet search (#34416) by oidacra · Pull Request #35510 · dotCMS/core

oidacra · 2026-04-29T20:52:23Z

Summary

Fixes the Block Editor contentlet search to support multi-word queries with spaces.

When typing inside the Block Editor's contentlet picker (after / → Contentlets → <ContentType>), spaces appear in the input but the result list does not refine.

Root cause: SuggestionsService.getContentlets interpolated the user filter into +catchall:*${filter}* title:'${filter}'^15. With a multi-word filter like White Water, Lucene parses +catchall:*White Water* as two clauses, +catchall:*White (mandatory) and Water* (optional), so the query degrades to a single-word filter on the first token only. The single-quoted title clause is also broken, because Lucene phrase queries require double quotes.


Fix

In suggestions.service.ts (getContentlets):

Tokenize the filter on whitespace and emit one mandatory +catchall:*token* clause per token, joined with spaces. Multi-word queries now require ALL tokens to match.
Switch the title boost from title:'${filter}'^15 to title:"${filter}"^15 for proper Lucene phrase semantics.
Compose the final query via array + filter + join so empty parts (e.g. missing identifierQuery) don't introduce double spaces.
Empty / whitespace-only filters omit the catchall and title clauses entirely.
Hyphen-bearing filters keep the existing identifier branch unchanged (+catchall:${filter}, no wildcards).
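
The composition steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code from suggestions.service.ts: the function name buildContentletQuery, the parameter names, and the order of the clauses are assumptions, and the hyphen/identifier branch and any other clauses the real query carries are left out.

```typescript
// Hypothetical sketch of the described query composition.
// Names and clause order are assumptions, not the PR's actual code.
function buildContentletQuery(
    contentType: string,
    filter: string,
    identifierQuery = ''
): string {
    const trimmed = filter.trim();

    // One mandatory wildcard clause per whitespace-separated token,
    // so every token has to match.
    const catchall = trimmed
        ? trimmed
              .split(/\s+/)
              .map((token) => `+catchall:*${token}*`)
              .join(' ')
        : '';

    // Double quotes give the title boost Lucene phrase semantics.
    const title = trimmed ? `title:"${trimmed}"^15` : '';

    // Dropping empty parts before joining avoids double spaces.
    return [`+contentType:${contentType}`, identifierQuery, catchall, title]
        .filter((part) => part.length > 0)
        .join(' ');
}

console.log(buildContentletQuery('Activity', 'White Water'));
// +contentType:Activity +catchall:*White* +catchall:*Water* title:"White Water"^15
console.log(buildContentletQuery('Activity', '   '));
// +contentType:Activity
```

Building the parts as an array and filtering out the empty ones keeps the spacing rule in one place, instead of guarding every interpolation separately.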

Closes

Closes #34416

Acceptance Criteria

Search input accepts space characters (was already preserved by TipTap; the fix targets the resulting query).
Multi-word query (e.g., White Water) narrows results to contentlets matching ALL tokens.
Single-word query still works — no regression.
Empty filter returns the default contentlet list (no catchall/title clauses emitted).
Filter containing - preserves the identifier branch.

Test Plan

Automated tests added in suggestions.service.spec.ts (now 8 tests, was 1) using HttpTestingController to assert the exact query string sent to /api/content/_search for: multi-word, single-word, empty, whitespace-only, hyphen, and contentletIdentifier exclusion cases, plus response mapping.

Manual verification:

Open a content type with a Block Editor field, type /, choose Contentlets → Activity.
Type White Water — list filters to contentlets containing both words (e.g., White Water Rafting).
Type Water — single-word search still works.
Clear the filter — full default list returns.

Changed Files

core-web/libs/block-editor/src/lib/shared/services/suggestions/suggestions.service.ts
core-web/libs/block-editor/src/lib/shared/services/suggestions/suggestions.service.spec.ts
core-web/libs/block-editor/src/lib/elements/dot-bubble-menu/dot-bubble-menu.component.spec.ts (drive-by: 1-line import order fix to satisfy nx affected:lint)

Visual Changes

User-facing behavior change in the Block Editor contentlet search popup; the visual UI itself is unchanged. A before/after screen recording can be shared if needed.

Tokenize the search filter on whitespace and emit one mandatory +catchall:*token* clause per token so multi-word queries (e.g. "White Water") narrow results to contentlets matching ALL tokens. Previously the unquoted space caused Lucene to parse the catchall as a single mandatory term + remaining optional terms, so multi-word filters degraded to filtering by only the first word. Also switch the title relevance boost from single quotes (literal, non-functional) to double quotes (Lucene phrase syntax). Empty filter now omits the catchall/title clauses entirely; hyphen-bearing filters preserve the existing identifier/UUID branch unchanged. Closes #34416

Move the rxjs import above @angular imports to satisfy the existing import/order eslint rule. Pre-existing failure surfaced when running nx affected:lint for #34416; fixed here to keep the affected lint target green.

claude · 2026-04-29T20:53:18Z

Claude finished @oidacra's task in 2m 38s

PR Review

Gather context and read changed files
Review suggestions.service.ts — core logic change
Review suggestions.service.spec.ts — test coverage
Review dot-bubble-menu.component.spec.ts — drive-by fix
Post findings

Issues

1. UUID_LIKE regex silently kills searches for hex English words (suggestions.service.ts:15)

const UUID_LIKE = /^[0-9a-f]+(-[0-9a-f]+)+$/i;

The comment claims it's "narrow enough to skip ordinary hyphenated English titles," but it matches any hyphen-separated sequence where every segment is all-hex. Short, real English words that are also valid hex: ace, cafe, bead, dead, beef, face, babe, fade. A user searching for content titled ace-cafe or dead-beef silently hits the no-wildcard exact-match path and gets nothing. This is user-invisible: no error, just empty results.

A proper UUID pattern (/^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$/i) would be narrow enough to cover the actual identifier use case without these false positives. The test at spec:112 uses abc-def as the UUID case — that's also not a UUID and would be better served by a proper UUID regex.
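
To make the false positives concrete, the two patterns quoted above can be run side by side. The constant name STRICT_UUID and the sample strings are chosen here for illustration; only the two regular expressions come from the review.

```typescript
// Pattern currently in the PR, as quoted in the review.
const UUID_LIKE = /^[0-9a-f]+(-[0-9a-f]+)+$/i;
// Stricter 8-4-4-4-12 pattern suggested in the review (the name is illustrative).
const STRICT_UUID = /^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$/i;

const samples = [
    'ace-cafe', // English words that happen to be valid hex
    'dead-beef',
    'abc-def', // the value used by the spec's UUID case
    'self-care', // contains non-hex letters
    '123e4567-e89b-12d3-a456-426614174000' // a well-formed UUID
];

for (const sample of samples) {
    console.log(sample, UUID_LIKE.test(sample), STRICT_UUID.test(sample));
}
// ace-cafe true false
// dead-beef true false
// abc-def true false
// self-care false false
// 123e4567-e89b-12d3-a456-426614174000 true true
```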


2. Hyphenated-word tokens won't match via wildcard (suggestions.service.ts:63)

When filter = 'White-Water Falls', whitespace-split produces ['White-Water', 'Falls'], and the query includes +catchall:*White\-Water*. Standard Lucene text analysis would index White-Water as two terms (white, water), so a single wildcard over White\-Water finds nothing. The +Falls clause still narrows results, but the hyphenated token is effectively dead weight.

This existed before (the old - branch also used no wildcards), so it's not a regression introduced here — but the PR now routes this through the "works correctly" path without addressing the actual matching issue. Worth a comment acknowledging the limitation, or splitting tokens on [-\s]+ instead of just \s+.
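
The alternative split suggested above could look like the sketch below. The helper names tokenize and toCatchallClauses are hypothetical, and whether the resulting clauses actually match depends on how the catchall field is analyzed at index time, which this thread does not confirm.

```typescript
// Hypothetical helper: split on whitespace only, or on hyphens as well.
function tokenize(filter: string, splitOnHyphen: boolean): string[] {
    const separator = splitOnHyphen ? /[-\s]+/ : /\s+/;

    // A leading or trailing hyphen would leave an empty token behind,
    // so empty strings are filtered out explicitly.
    return filter
        .trim()
        .split(separator)
        .filter((token) => token.length > 0);
}

// One mandatory wildcard clause per token (escaping omitted for brevity).
function toCatchallClauses(tokens: string[]): string {
    return tokens.map((token) => `+catchall:*${token}*`).join(' ');
}

console.log(tokenize('White-Water Falls', false));
// [ 'White-Water', 'Falls' ]
console.log(toCatchallClauses(tokenize('White-Water Falls', true)));
// +catchall:*White* +catchall:*Water* +catchall:*Falls*
```

Splitting on hyphens trades phrase precision for recall: "White-Water" would then also match content that contains "White" and "Water" far apart.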

3. contentletIdentifier is interpolated without escaping (suggestions.service.ts:52)

const identifierQuery = contentletIdentifier ? `-identifier:${contentletIdentifier}` : '';

The PR adds escapeLucene for user-typed input but leaves this call site unescaped. In practice, dotCMS identifiers are UUIDs and safe, but it's an inconsistency: if someone passes an identifier with a Lucene special char, it bypasses the new escaping logic. One-liner fix: escapeLucene(contentletIdentifier).
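
A sketch of that one-liner is shown here. The escapeLucene body below is an assumption written for this example (a per-character backslash escape); the PR's actual helper may differ, and buildIdentifierExclusion is a hypothetical wrapper around the quoted line.

```typescript
// Assumed implementation: backslash-escape each Lucene query-syntax character.
// The PR's real escapeLucene helper may be written differently.
function escapeLucene(value: string): string {
    return value.replace(/[+\-&|!(){}\[\]^"~*?:\\\/]/g, '\\$&');
}

// Hypothetical wrapper around the identifier exclusion clause.
function buildIdentifierExclusion(contentletIdentifier?: string): string {
    return contentletIdentifier
        ? `-identifier:${escapeLucene(contentletIdentifier)}`
        : '';
}

console.log(buildIdentifierExclusion('123e4567-e89b-12d3-a456-426614174000'));
// -identifier:123e4567\-e89b\-12d3\-a456\-426614174000
console.log(buildIdentifierExclusion());
// (empty string)
```

Note that escaping turns the hyphens of a normal UUID into \-, so any test asserting the exact query string for the exclusion clause would need to expect the escaped form.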

Minor

The getContentletsByLink method (line 115) interpolates link unescaped — pre-existing, not this PR's problem, but the new escapeLucene helper is sitting right there.
&&/|| multi-char operator handling in LUCENE_SPECIAL_CHARS produces \&& / \|| (escapes only the first char). This neutralises the operator in practice but isn't the canonical form (\&\& / \|\|). Harmless given typical content search input.
Tests are thorough: 8 cases covering multi-word, single-word, empty, whitespace-only, UUID, injection, identifier exclusion, and response mapping. The flushEmpty helper and httpMock.verify() in afterEach are clean. The injection test (spec:143) is a good addition.
dot-bubble-menu.component.spec.ts drive-by: correct, no concerns.
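
The difference between the two escaped forms of && and || mentioned above can be reproduced with two patterns. Both regular expressions here are assumptions written for this example; the real LUCENE_SPECIAL_CHARS definition is not shown in this thread.

```typescript
// Assumed shape: && and || matched as whole operators, so a single backslash
// is placed in front of the pair.
const OPERATOR_AWARE = /(&&|\|\||[+\-!(){}\[\]^"~*?:\\\/])/g;
// Per-character class: every & and | gets its own backslash.
const PER_CHARACTER = /[+\-&|!(){}\[\]^"~*?:\\\/]/g;

const escapeWith = (pattern: RegExp, value: string): string =>
    value.replace(pattern, '\\$&');

console.log(escapeWith(OPERATOR_AWARE, 'a && b || c'));
// a \&& b \|| c
console.log(escapeWith(PER_CHARACTER, 'a && b || c'));
// a \&\& b \|\| c
```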


Address PR review on #34416:

- Narrow the identifier/UUID branch from filter.includes('-') to a hex-only segmented pattern (UUID_LIKE). Hyphenated English titles like "self-care", "follow-up", or "White-Water Falls" now go through the regular tokenized search path instead of degrading to a non-wildcard exact-match clause.
- Escape Lucene query-syntax characters (+ - && || ! ( ) { } [ ] ^ " ~ * ? : \ /) before interpolating user input into the catchall and title clauses, preventing a user from injecting arbitrary clauses that would bypass the +contentType restriction.
- Drop the redundant .filter(token => token.length > 0) after .trim().split(/\s+/): the regex already collapses interior runs and trim removes leading/trailing whitespace, so empty tokens are impossible.

Adds tests for the hyphenated-title path, the injection-escaping behavior, and updates the existing UUID-branch assertion to expect the escaped hyphen.
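
The reasoning behind dropping the extra .filter() can be checked in isolation. The snippet below is a standalone illustration (the helper name splitTokens is hypothetical) and shows one edge worth keeping in mind: the guarantee holds for non-empty input, which matches the description's statement that empty and whitespace-only filters skip the catchall and title clauses before any tokens are built.

```typescript
// Hypothetical helper mirroring the .trim().split(/\s+/) chain from the commit message.
const splitTokens = (filter: string): string[] => filter.trim().split(/\s+/);

console.log(splitTokens('  White   Water '));
// [ 'White', 'Water' ]  (no empty tokens for non-empty input)

console.log(splitTokens('   '));
// [ '' ]  (an empty string still splits into one empty token)

// So the empty case has to be handled before splitting, for example:
const safeTokens = (filter: string): string[] =>
    filter.trim() ? splitTokens(filter) : [];

console.log(safeTokens('   '));
// []
```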

oidacra added 2 commits April 29, 2026 16:51

github-project-automation Bot added this to dotCMS - Product Planning Apr 29, 2026

github-actions Bot mentioned this pull request Apr 29, 2026

[DEFECT] Block Editor search not allowing spaces. #34416

Open

github-actions Bot added the Area : Frontend PR changes Angular/TypeScript frontend code label Apr 29, 2026

claude Bot added the AI: Safe To Rollback label Apr 29, 2026

oidacra added 2 commits April 29, 2026 17:25

chore(block-editor): drop stray blank line in dot-bubble-menu spec

2b73749

Prettier flagged a double blank line left over from the previous import/order fix. nx format:check is now clean.

oidacra marked this pull request as ready for review April 29, 2026 21:32

oidacra requested review from KevinDavilaDotCMS, adrianjm-dotCMS, hmoreras, nicobytes, rjvelazco and zJaaal April 29, 2026 21:33

oidacra enabled auto-merge April 29, 2026 21:34

oidacra wants to merge 4 commits into main from issue-34416
