
Honor deletedSegmentsRetentionPeriod for REFRESH tables in segment lineage cleanup #18257

Closed
swaminathanmanish wants to merge 2 commits into apache:master from swaminathanmanish:configurable-deleted-segments-retention

Conversation


@swaminathanmanish (Contributor) commented Apr 19, 2026

Problem

REFRESH (non-APPEND) offline tables have their superseded segments deleted via manageSegmentLineageCleanupForTable. That method called deleteSegments(tableNameWithType, segmentsToDelete), the two-argument overload that ignores deletedSegmentsRetentionPeriod from the table config and always falls back to the cluster-level default (7 days).

manageRetentionForTable (which does read per-table retention config) exits early for REFRESH tables:

if (tableConfig.getTableType() == TableType.OFFLINE && !"APPEND".equalsIgnoreCase(segmentPushType)) {
    return; // REFRESH tables skip this entirely
}

So even if an operator sets deletedSegmentsRetentionPeriod: "0d" on a REFRESH table, the staging retention is silently ignored — segments always sit in Deleted_Segments/ for the full cluster default before permanent deletion.

Fix

Pass the table's deletedSegmentsRetentionPeriod to deleteSegments in manageSegmentLineageCleanupForTable:

// before
_pinotHelixResourceManager.deleteSegments(tableNameWithType, segmentsToDelete);

// after
String retentionPeriod = tableConfig.getValidationConfig().getDeletedSegmentsRetentionPeriod();
_pinotHelixResourceManager.deleteSegments(tableNameWithType, segmentsToDelete, retentionPeriod);

When deletedSegmentsRetentionPeriod is null (not configured), the existing deleteSegments(String, List, String) overload falls through to re-read the table config and then the cluster default, so behavior is identical to before.
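The resolution order described above (explicit argument, then table config, then cluster default) can be sketched as follows. This is a hypothetical model of the fall-through logic, not the actual Pinot method; resolveRetention and its parameters are illustrative names only.

```java
// Sketch of the retention fall-through order: explicit argument ->
// per-table config value -> cluster-level default. Names are hypothetical.
public class RetentionResolution {
  static String resolveRetention(String explicitPeriod, String tableConfigPeriod,
      String clusterDefault) {
    if (explicitPeriod != null) {
      return explicitPeriod;    // caller passed the table's configured period
    }
    if (tableConfigPeriod != null) {
      return tableConfigPeriod; // overload re-reads the table config
    }
    return clusterDefault;      // cluster-level default, e.g. "7d"
  }

  public static void main(String[] args) {
    // REFRESH table with "0d" configured: the per-table value wins
    System.out.println(resolveRetention("0d", "0d", "7d"));
    // Not configured anywhere: cluster default applies, same as before the fix
    System.out.println(resolveRetention(null, null, "7d"));
  }
}
```

With this ordering, passing null for the explicit period reproduces the pre-fix behavior exactly, which is why the change is backward compatible.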

Usage

To enable immediate deletion of superseded segments for a REFRESH table:

"segmentsConfig": {
  "deletedSegmentsRetentionPeriod": "0d"
}

Backward Compatibility

No behavior change for tables that do not set deletedSegmentsRetentionPeriod. APPEND and realtime tables are unaffected.

…neage cleanup

manageSegmentLineageCleanupForTable called deleteSegments() without
passing the table config, so REFRESH tables (which are skipped by
manageRetentionForTable) always fell back to the cluster-level default
staging retention (7 days), ignoring any per-table
deletedSegmentsRetentionPeriod set in segmentsConfig.

Pass the table's deletedSegmentsRetentionPeriod to deleteSegments so
operators can control the staging window per REFRESH table, including
setting "0d" for immediate deletion.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
@swaminathanmanish force-pushed the configurable-deleted-segments-retention branch from 90dccea to 1306e99 on April 19, 2026 at 15:48
@swaminathanmanish changed the title from "Make deleted segments staging retention configurable for REFRESH tables" to "Honor deletedSegmentsRetentionPeriod for REFRESH tables in segment lineage cleanup" on Apr 19, 2026
@codecov-commenter

codecov-commenter commented Apr 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 63.47%. Comparing base (3eff094) to head (06cdb8e).

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18257      +/-   ##
============================================
- Coverage     63.48%   63.47%   -0.01%     
  Complexity     1627     1627              
============================================
  Files          3244     3244              
  Lines        197365   197366       +1     
  Branches      30540    30540              
============================================
- Hits         125306   125287      -19     
- Misses        62019    62037      +18     
- Partials      10040    10042       +2     
Flag                 Coverage           Δ
custom-integration1  100.00% <ø>        (ø)
integration          100.00% <ø>        (ø)
integration1         100.00% <ø>        (ø)
integration2         0.00% <ø>          (ø)
java-11              63.46% <100.00%>   (-0.01%) ⬇️
java-21              63.45% <100.00%>   (-0.02%) ⬇️
temurin              63.47% <100.00%>   (-0.01%) ⬇️
unittests            63.47% <100.00%>   (-0.01%) ⬇️
unittests1           55.45% <ø>         (-0.01%) ⬇️
unittests2           34.96% <100.00%>   (-0.02%) ⬇️

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.