
KAFKA-14490: Throw UncheckedIOException from LazyIndex#22330

Open
Phixsura wants to merge 2 commits into apache:trunk from Phixsura:04-KAFKA-14490-unchecked-ioexception-lazy-index


@Phixsura

Background

KAFKA-14490 proposes replacing checked IOException with UncheckedIOException in the log layer, following the pattern already adopted by the raft module. The ticket description specifically calls out LazyIndex's private constructor as a candidate for simplification once IOException is unchecked.

This is the first PR in the KAFKA-14490 series. The plan posted on the ticket walks through the remaining 17 files in dependency order.

Changes

  • LazyIndex.get / renameTo / deleteIfExists / close no longer declare throws IOException. IOException is caught at the file-IO boundary inside IndexFile, IndexValue and the loader, and re-thrown as UncheckedIOException. This mirrors FileRawSnapshotReader / FileRawSnapshotWriter / Snapshots / FileQuorumStateStore / KafkaRaftLog in the raft module.
  • The private constructor changes from (IndexWrapper, long, int, IndexType) to (IndexWrapper, IndexLoader<T>). The new IndexLoader<T> is a small @FunctionalInterface that allows the lambda to declare throws IOException. forOffset and forTime now pass a lambda that captures baseOffset and maxIndexSize directly.
  • The previous loadIndex private method and its @SuppressWarnings("unchecked") switch on IndexType are removed. The remaining @SuppressWarnings("unchecked") on get is for the (IndexValue<T>) wrapper cast, which is structurally required by the <?> wildcard on instanceof IndexValue<?> and cannot be eliminated without a larger refactor.
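As a rough illustration of the loader change described above, here is a minimal sketch, not the code in this PR. The method name `load`, the `LazyHolder` class, and the `String` payload are assumptions made for the example; the real `LazyIndex` also tracks the backing file and uses its own locking. Only the message "Error loading index file" is taken from the PR's test description.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical shape of the loader interface: a functional interface
// whose single method is allowed to throw the checked IOException.
@FunctionalInterface
interface IndexLoader<T> {
    T load() throws IOException;
}

// Simplified lazy holder standing in for LazyIndex (assumption, not the real class).
final class LazyHolder<T> {
    private final IndexLoader<T> loader;
    private T value; // stays null until the first successful get()

    LazyHolder(IndexLoader<T> loader) {
        this.loader = loader;
    }

    // No `throws IOException`: the checked exception is wrapped at the boundary.
    synchronized T get() {
        if (value == null) {
            try {
                value = loader.load();
            } catch (IOException e) {
                throw new UncheckedIOException("Error loading index file", e);
            }
        }
        return value;
    }
}

class LazyHolderDemo {
    public static void main(String[] args) {
        long baseOffset = 42L;
        // The lambda captures baseOffset directly, as forOffset / forTime are described to do.
        LazyHolder<String> ok = new LazyHolder<>(() -> "index@" + baseOffset);
        System.out.println(ok.get() == ok.get()); // second call returns the cached instance

        LazyHolder<String> failing = new LazyHolder<>(() -> {
            throw new IOException("disk unavailable");
        });
        try {
            failing.get();
        } catch (UncheckedIOException e) {
            System.out.println(e.getMessage() + " / cause: " + e.getCause().getMessage());
        }
    }
}
```

Passing the loader as a lambda means the holder no longer needs to know which index subtype it is building, which is why a switch on a type tag can be dropped in a design like this.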

Scope kept minimal

External callers' method signatures (e.g. LogSegment.offsetIndex() throws IOException) are left untouched. Their throws IOException clauses become redundant because LazyIndex.get no longer throws a checked exception, but Java permits over-declared throws clauses, and pruning them belongs in the follow-up PRs that convert the surrounding files. This keeps the blast radius of the first PR limited to LazyIndex.java.
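A small sketch of why the untouched callers still compile. `ConvertedIndex` and `CallerSegment` are hypothetical stand-ins for LazyIndex and LogSegment, not the real classes.

```java
import java.io.IOException;

// Hypothetical stand-in for the converted class: no checked exception declared.
final class ConvertedIndex {
    static String get() {
        return "offset-index";
    }
}

// Hypothetical stand-in for an unconverted caller.
final class CallerSegment {
    // The throws clause is now redundant, but a method may declare a
    // checked exception that its body never throws.
    static String offsetIndex() throws IOException {
        return ConvertedIndex.get();
    }
}

class OverDeclaredDemo {
    public static void main(String[] args) throws IOException {
        System.out.println(CallerSegment.offsetIndex());
    }
}
```

One caveat worth keeping in mind for the follow-ups: the compiler does reject a `catch (IOException e)` whose `try` body cannot throw IOException. Because the callers here keep their throws clauses, code that catches IOException around them is unaffected for now.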

Verification

  • :storage:compileJava clean.
  • :storage:test --tests "*LogSegment*" --tests "*Index*" all green.
  • :core:test --tests "*LogSegment*" --tests "*UnifiedLog*" --tests "*LazyIndex*" all green (covers the Scala LogTestUtils and RemoteLogManagerTest callers).
  • :storage:spotlessCheck, :storage:checkstyleMain, :storage:checkstyleTest, :storage:spotbugsMain --rerun-tasks all clean.

Committer review notes

Happy to split out the constructor cleanup into its own PR if reviewers prefer the IOException conversion be reviewed in isolation first.

Wraps IOException at the file-IO boundary inside IndexFile, IndexValue and
the index loader so LazyIndex's public methods no longer declare
`throws IOException`, matching the pattern already in place in the raft
module (FileRawSnapshotReader and peers).

Folds in the constructor cleanup called out in the ticket description:
`(IndexWrapper, baseOffset, maxIndexSize, IndexType)` becomes
`(IndexWrapper, IndexLoader<T>)`, removing the private `loadIndex` switch
on `IndexType` and its `@SuppressWarnings("unchecked")` cast. `forOffset`
and `forTime` pass a lambda factory instead.

External callers' signatures are left untouched; their `throws IOException`
clauses become redundant but are safe to keep until the follow-up files in
KAFKA-14490 are converted.

First PR in the KAFKA-14490 series.
@github-actions (bot) added the labels triage (PRs from the community) and storage (Pull requests that target the storage module) on May 19, 2026
…dexTest

Aligns the wrap sites with the raft module's existing pattern
(FileRawSnapshotReader, Snapshots, FileQuorumStateStore): each
UncheckedIOException now carries a String.format message naming the
operation and the affected index file, so stack traces in production
identify the failing path without needing the cause chain.
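A minimal sketch of a wrap site carrying such a message, assuming a helper named `IndexFileOps` and the wording "Error deleting index file" (both made up for the example; the PR's actual messages may differ):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not a class from the PR.
final class IndexFileOps {
    static boolean deleteIfExists(Path file) {
        try {
            return Files.deleteIfExists(file);
        } catch (IOException e) {
            // Name the operation and the affected file in the message itself,
            // and keep the original exception as the cause.
            throw new UncheckedIOException(
                String.format("Error deleting index file %s", file), e);
        }
    }
}

class WrapMessageDemo {
    public static void main(String[] args) throws IOException {
        // Deleting a non-empty directory fails with an IOException subclass,
        // which makes the wrap path easy to trigger.
        Path dir = Files.createTempDirectory("index-demo");
        Files.createFile(dir.resolve("00000000000000000000.index"));
        try {
            IndexFileOps.deleteIfExists(dir);
        } catch (UncheckedIOException e) {
            System.out.println(e.getMessage());
            System.out.println(e.getCause().getClass().getSimpleName());
        }
    }
}
```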

LazyIndexTest covers the public API end-to-end with real files via
TestUtils.tempFile, including:

- forOffset / forTime return the right index subtype on get
- get caches and returns the same instance on subsequent calls
- get does not touch disk until first call
- get wraps loader IOException as UncheckedIOException carrying
  "Error loading index file" and the original IOException as cause
- renameTo swallows NoSuchFileException when the source has already
  been deleted (preserves the existing tolerance contract)
- deleteIfExists and updateParentDir behave correctly before and
  after load
- close is safe both before and after load
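The renameTo tolerance case in the list above can be sketched roughly as follows. This is a simplification built on plain `java.nio.file.Files.move`; the class name `RenameSketch`, the method name, and the message text are assumptions, and the real LazyIndex may use a different move helper and extra checks.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch, not the PR's implementation.
final class RenameSketch {
    static void renameTolerant(Path source, Path target) {
        try {
            Files.move(source, target);
        } catch (NoSuchFileException e) {
            // Source already deleted: tolerated, nothing to rename.
        } catch (IOException e) {
            // Any other I/O failure surfaces as an unchecked exception.
            throw new UncheckedIOException(
                String.format("Error renaming index file %s to %s", source, target), e);
        }
    }
}

class RenameSketchDemo {
    public static void main(String[] args) {
        Path source = Paths.get(System.getProperty("java.io.tmpdir"),
            "missing-" + System.nanoTime() + ".index");
        Path target = source.resolveSibling(source.getFileName() + ".deleted");
        RenameSketch.renameTolerant(source, target); // returns normally
        System.out.println(Files.exists(target));
    }
}
```

The order of the catch clauses matters: NoSuchFileException is a subclass of IOException, so it has to be caught first for the tolerance path to apply.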
@Phixsura
Author

@ijuma First PR in the KAFKA-14490 series is up. This batch covers LazyIndex: the IOException-to-UncheckedIOException conversion plus the private-constructor cleanup you called out in the description (removes the loadIndex switch on IndexType and its @SuppressWarnings("unchecked")). Added LazyIndexTest with 11 unit tests covering both the happy path and the wrap-on-IOException paths. Happy to tune the pattern before the larger files (AbstractIndex / OffsetIndex / TimeIndex / TransactionIndex) land in subsequent PRs.


Labels

ci-approved, storage (Pull requests that target the storage module), triage (PRs from the community)
