
[PostgreSQL] Extend the allowed Debezium properties#1247

Merged
gayaldassanayake merged 19 commits into ballerina-platform:main from gayaldassanayake:all-fields
Mar 19, 2026

@gayaldassanayake
Contributor

@gayaldassanayake gayaldassanayake commented Feb 26, 2026

Purpose

Issue : ballerina-platform/ballerina-library#8572

Examples

Checklist

  • Linked to an issue
  • Updated the changelog
  • Added tests
  • Updated the spec
  • Checked native-image compatibility

PostgreSQL Debezium Properties Extension

Overview

This PR extends the PostgreSQL CDC module to support a broader set of Debezium configuration properties, enabling finer-grained control over replication, publication, streaming, and snapshot behavior. It also refactors configuration handling to separate Debezium engine properties from listener-specific options and updates module dependencies.

Key Changes

Dependency and Version Updates

  • Bumped module version to 1.17.0 (from 1.16.4) and updated related compiler/native artifact references.
  • Upgraded CDC runtime dependency to 1.3.0.
  • Removed Kafka/Avro/Confluent schema-registry related dependencies to narrow the module's scope to PostgreSQL CDC.

Configuration Extensions

  • Extended Postgres connection model with fields for table/column filtering and message key mapping: includedTables, excludedTables, includedColumns, excludedColumns, messageKeyColumns.
  • Introduced hierarchical configuration types: ReplicationConfiguration, PublicationConfiguration, StreamingConfiguration.
  • Added enums PublicationAutocreateMode and LsnFlushMode for stronger typing of specific options.
  • Added PostgreSqlOptions to expose extendedSnapshot, dataTypeConfig, heartbeatConfig and related snapshot controls.
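
As a rough illustration of the filtering fields above, the following Python sketch (not the module's Ballerina code) maps list-valued fields to Debezium's comma-separated filter properties. The property names `table.include.list`, `table.exclude.list`, `column.include.list`, and `column.exclude.list` are Debezium's; the dictionary-shaped `connection` argument and the function name are assumptions made for the example. `messageKeyColumns` is left out on purpose because Debezium's `message.key.columns` uses a different, semicolon-separated format.

```python
from typing import Dict, List, Union

# Field names come from the PR summary; property names are Debezium's.
FILTER_KEYS = {
    "includedTables": "table.include.list",
    "excludedTables": "table.exclude.list",
    "includedColumns": "column.include.list",
    "excludedColumns": "column.exclude.list",
}


def populate_table_and_column_filtering(
    connection: Dict[str, Union[str, List[str], None]],
) -> Dict[str, str]:
    """Build Debezium filter properties from a (hypothetical) connection dict."""
    props: Dict[str, str] = {}
    for field, key in FILTER_KEYS.items():
        value = connection.get(field)
        if value is None:
            continue  # unset fields produce no property at all
        # Accept either a ready-made string or a list to be comma-joined.
        props[key] = value if isinstance(value, str) else ",".join(value)
    return props
```

Unset fields are skipped rather than written as empty strings, so Debezium's own defaults stay in effect.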

Architecture and Behavioral Changes

  • Refactored CdcListener to maintain two readonly maps: debeziumConfigs (map) and listenerConfigs (map), instead of a single combined config map.
  • Listener startup now invokes an extended start path that accepts split configuration maps.
  • Deprecated top-level pluginName, slotName, and publicationName in favor of nested configuration with backward-compatible fallbacks.
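
The split into two read-only maps with a fallback to the deprecated fields can be sketched as follows. This is a Python illustration under stated assumptions, not the listener's actual code: the nested field names, the `database` key, and the `livenessInterval` listener option used in the usage note are hypothetical. Only the Debezium property names (`plugin.name`, `slot.name`, `publication.name`) are taken from the PR discussion.

```python
from types import MappingProxyType
from typing import Any, Dict, Mapping, Tuple

# Hypothetical field-to-property tables; the property names are Debezium's.
REPLICATION_KEYS = {"pluginName": "plugin.name", "slotName": "slot.name"}
PUBLICATION_KEYS = {"publicationName": "publication.name"}


def _pick(db: Mapping[str, Any], nested_field: str, keys: Dict[str, str]) -> Dict[str, str]:
    """Read from the nested record when present, else from the legacy top level."""
    nested = db.get(nested_field)
    source = nested if nested is not None else db
    return {prop: str(source[f]) for f, prop in keys.items() if source.get(f) is not None}


def build_configs(config: Mapping[str, Any]) -> Tuple[Mapping[str, str], Mapping[str, Any]]:
    """Return (debezium_configs, listener_configs) as read-only mappings."""
    db = config.get("database", {})
    debezium: Dict[str, str] = {}
    debezium.update(_pick(db, "replicationConfig", REPLICATION_KEYS))
    debezium.update(_pick(db, "publicationConfig", PUBLICATION_KEYS))
    # Everything outside the database block is treated as listener-specific.
    listener = {k: v for k, v in config.items() if k != "database"}
    return MappingProxyType(debezium), MappingProxyType(listener)
```

For example, `build_configs({"database": {"slotName": "legacy"}, "livenessInterval": 30})` yields a Debezium map holding `slot.name` and a listener map holding `livenessInterval`; writing to either map afterwards raises `TypeError`.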

Utilities and Population Logic

  • New utility functions to translate high-level configuration into Debezium properties:
    • populateDatabaseConfigurations (replaces previous populatePostgresConfigurations)
    • populateDebeziumProperties and populateOptions (including extended snapshot and heartbeat handling)
    • populateTableAndColumnFiltering, populateSchemaConfigurations, populateStreamingConfiguration
  • Helpers convert interval/decimal values to millisecond strings and assemble Debezium-compatible key/value maps.
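
A minimal Python sketch of the interval helper described above, assuming intervals are given in seconds as decimals. The function names and the `statusUpdateInterval` / `xminFetchInterval` field names are hypothetical; the two `*.interval.ms` keys are the ones asserted in this PR's tests.

```python
from decimal import Decimal
from typing import Dict, Mapping, Optional


def seconds_to_millis(seconds: Decimal) -> str:
    """Convert a decimal number of seconds to a whole-millisecond string."""
    return str(int(seconds * 1000))


def populate_intervals(config: Mapping[str, Optional[Decimal]], props: Dict[str, str]) -> None:
    """Write interval properties into props, skipping fields that are unset."""
    mapping = {
        "statusUpdateInterval": "status.update.interval.ms",
        "xminFetchInterval": "xmin.fetch.interval.ms",
    }
    for field, key in mapping.items():
        value = config.get(field)
        if value is not None:
            props[key] = seconds_to_millis(value)
```

Values are emitted as strings because Debezium properties are plain string key/value pairs.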

Documentation and Tests

  • README updated with PostgreSQL-specific connection examples, advanced options, and guidance on passing raw Debezium properties.
  • Added seven unit tests covering replication, publication, streaming, relational filtering, extended snapshot, and heartbeat option population.
  • Added test-suite cleanup hook to remove temporary test artifacts.
  • Added changelog entry: "Introduce additional Debezium properties."

Impact

  • Improves flexibility and control for PostgreSQL change-data-capture configurations, supporting advanced replication and streaming scenarios.
  • Separates engine-level Debezium configuration from listener-specific settings for clearer responsibilities and easier extension.
  • Reduces external dependency footprint by removing Kafka/Avro/Confluent dependencies.

@coderabbitai

coderabbitai bot commented Feb 26, 2026

Warning

Rate limit exceeded

@gayaldassanayake has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 2 minutes and 41 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7cd57e4e-bbd0-479d-a9d7-0905fd48b732

📥 Commits

Reviewing files that changed from the base of the PR and between ee32c0d and 944a758.

📒 Files selected for processing (2)
  • ballerina/cdc_listener.bal
  • gradle.properties
📝 Walkthrough

Walkthrough

Refactors CDC listener configuration into separate Debezium and listener maps, adds rich PostgreSQL-specific config types (replication, publication, streaming, extended snapshot, options), updates version/dependencies to 1.17.0 and cdc to 1.3.0, removes unused packages, expands tests and docs.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Version & Compiler Plugin | ballerina/Ballerina.toml, ballerina/CompilerPlugin.toml, gradle.properties | Bump package to 1.17.0, update compiler plugin JAR paths to 1.17.0-SNAPSHOT, and update gradle properties (stdlibCdc → 1.3.0). |
| Dependency Cleanup | ballerina/Dependencies.toml, build.gradle | Remove Avro/Kafka/Confluent-related package entries and properties; mark some packages testOnly; update ballerinax:postgresql and ...driver versions. |
| Configuration Types | ballerina/listener_types.bal | Add enums/records: PublicationAutocreateMode, LsnFlushMode, ReplicationConfiguration, PublicationConfiguration, StreamingConfiguration, PostgreSqlOptions, ExtendedSnapshotConfiguration; extend PostgresDatabaseConnection and PostgresListenerConfiguration and deprecate legacy fields. |
| Config Population Logic | ballerina/utils.bal | Introduce helpers to populate Debezium and listener configs: database/replication/publication/streaming/filtering/extended-snapshot/options; rename populatePostgresConfigurations → populateDatabaseConfigurations; add new Postgres-specific constants. |
| Listener Initialization & Start | ballerina/cdc_listener.bal | Replace single config field with two readonly maps (debeziumConfigs, listenerConfigs); init now populates both via utilities; start calls externStartWithExtendedConfigs(self, debeziumConfigs, listenerConfigs). |
| Tests & Cleanup | ballerina/tests/listener_tests.bal, ballerina/tests/test_cleanup.bal | Add seven Postgres CDC unit tests covering replication, publication, streaming, relational filtering, extended snapshot, heartbeat options; add @test:AfterSuite cleanup to remove ./tmp. |
| Documentation & Changelog | ballerina/README.md, changelog.md | Document database connection and advanced PostgreSQL CDC options, expand CDC examples, and add changelog entry for new Debezium properties. |

Sequence Diagram(s)

sequenceDiagram
    participant App as CdcListener
    participant Utils as utils.bal
    participant Native as NativeBackend
    participant Debez as DebeziumEngine

    App->>Utils: populateDebeziumProperties(config)
    App->>Utils: populateListenerProperties(config)
    Utils-->>App: debeziumConfigs, listenerConfigs
    App->>Native: externStartWithExtendedConfigs(debeziumConfigs, listenerConfigs)
    Native->>Debez: initialize engine with debeziumConfigs
    Native->>App: started / status events

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested reviewers

  • shafreenAnfar
  • niveathika
  • aashikam

Poem

🐰 I hopped through configs, split them in two,
Debezium hums while listeners chew,
Replication rules and snapshots in sight,
Tables and columns now all set right,
A carrot cheer for CDC tonight! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The description is incomplete. While it references an issue link and includes the checklist template, all checklist items are unchecked and no substantive examples or explanation of changes are provided. | Complete the checklist items by marking relevant ones as done (e.g., 'Updated the changelog', 'Added tests'), and provide a brief explanation of the key changes made in the PR. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title '[PostgreSQL] Extend the allowed Debezium properties' clearly summarizes the main change: expanding Debezium properties support in the PostgreSQL module. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@gayaldassanayake
Contributor Author

@coderabbitai please review

@coderabbitai

coderabbitai bot commented Feb 26, 2026

@gayaldassanayake, I'll review the changes in this PR.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@gayaldassanayake gayaldassanayake marked this pull request as ready for review March 10, 2026 10:53

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (3)
changelog.md (1)

13-15: Clarify the compatibility story for the filter-field move.

This entry explains where the fields live now, but not whether the previous placement is still accepted. A one-line migration note would make upgrades much safer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@changelog.md` around lines 13 - 15, Add a one-line migration note clarifying
compatibility after the bullet about PostgresDatabaseConnection: state whether
the filter fields (includedTables, excludedTables, includedColumns,
excludedColumns) previously accepted on cdc:DatabaseConnection are still
supported (e.g., deprecated aliasing/forwarding) or must be moved to
PostgresDatabaseConnection, and give the recommended upgrade action (move fields
to PostgresDatabaseConnection or leave as-is) so users know if they need to
update configs; reference the symbol names PostgresDatabaseConnection and
cdc:DatabaseConnection and the four field names in the sentence.
ballerina/README.md (1)

520-522: Add one raw Debezium-property example here.

The paragraph says extra fields can be passed through, but a tiny snippet would remove ambiguity about where those keys belong in the listener config.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ballerina/README.md` around lines 520 - 522, Add a minimal inline example
showing how to pass a raw Debezium property into the listener config under the
options parameter: demonstrate using the options parameter (type
PostgreSqlOptions) with an extra field like
"snapshot.select.statement.overrides" (or another real Debezium key) so readers
see that raw keys are placed directly on options in the listener config; keep it
one small config snippet and label it as a raw Debezium property example for
clarity.
ballerina/tests/listener_tests.bal (1)

270-318: Add one precedence test for the deprecated top-level fields.

The implementation in ballerina/utils.bal explicitly prefers replicationConfig and publicationConfig over pluginName, slotName, and publicationName, but the new suite only exercises nested-only inputs. A mixed legacy+nested case would lock the migration behavior down.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ballerina/tests/listener_tests.bal` around lines 270 - 318, Add a precedence
test that mixes deprecated top-level fields with the new nested configs to
assert nested configs win; create a test (e.g.,
testPostgresReplicationPrecedence / testPostgresPublicationPrecedence) that
constructs a PostgresDatabaseConnection containing both top-level fields
pluginName/slotName/publicationName (and/or
slotDropOnStop/publicationAutocreateMode) set to one value and
replicationConfig/publicationConfig set to a different value, call
populateDatabaseConfigurations(connection, actualProperties), and assert
actualProperties["plugin.name"], ["slot.drop.on.stop"], and ["publication.name"]
/ ["publication.autocreate.mode"] equal the values from
replicationConfig/publicationConfig (not the deprecated top-level ones).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ballerina/tests/listener_tests.bal`:
- Around line 263-291: The test seeds expectedProperties and then calls
populateDatabaseConfigurations(connection, actualProperties) but only asserts
two keys; add assertions for the remaining seeded properties so the mapping is
fully validated: assert actualProperties["slot.name"] ==
expectedProperties["slot.name"], actualProperties["slot.stream.params"] ==
expectedProperties["slot.stream.params"], and
actualProperties["xmin.fetch.interval.ms"] ==
expectedProperties["xmin.fetch.interval.ms"] (and any other seeded keys) using
test:assertEquals with explanatory msg strings; locate these in the same test
after the existing plugin.name and slot.drop.on.stop assertions and mirror their
style.

In `@ballerina/utils.bal`:
- Around line 122-128: The code writes LSN_FLUSH_MODE inside
populateStreamingConfiguration even though the bundled Debezium may not support
lsn.flush.mode; either upgrade the PostgreSQL CDC driver to a version that
bundles Debezium >= 3.4 or change populateStreamingConfiguration to avoid
writing LSN_FLUSH_MODE when the runtime driver lacks support: detect support (or
add a feature flag/validation) and if unsupported either skip setting
LSN_FLUSH_MODE and log/return a clear validation error when config.lsnFlushMode
is set; update the logic around populateStreamingConfiguration, LSN_FLUSH_MODE,
and config.lsnFlushMode accordingly so users don’t get a runtime error.

In `@gradle.properties`:
- Around line 61-63: Update the property stdlibPostgresqlDriverVersion in
gradle.properties to match the pinned postgresql.driver version in
ballerina/Dependencies.toml (change 1.6.1 → 1.6.2), so the value consumed by
build.gradle (the property stdlibPostgresqlDriverVersion) and the package
metadata (postgresql.driver in Dependencies.toml) remain consistent.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fd423c2c-69c6-40f9-9117-0caf4dafe545

📥 Commits

Reviewing files that changed from the base of the PR and between e0702ad and f7a79a3.

📒 Files selected for processing (12)
  • ballerina/Ballerina.toml
  • ballerina/CompilerPlugin.toml
  • ballerina/Dependencies.toml
  • ballerina/README.md
  • ballerina/cdc_listener.bal
  • ballerina/listener_types.bal
  • ballerina/tests/listener_tests.bal
  • ballerina/tests/test_cleanup.bal
  • ballerina/utils.bal
  • build.gradle
  • changelog.md
  • gradle.properties
💤 Files with no reviewable changes (1)
  • build.gradle


@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (3)
gradle.properties (1)

61-62: ⚠️ Potential issue | 🟠 Major

Keep the Gradle PostgreSQL driver pin aligned with the lockfile.

Line 61 still pins stdlibPostgresqlDriverVersion to 1.6.1, but ballerina/Dependencies.toml now locks ballerinax/postgresql.driver to 1.6.2. That can leave Gradle resolving a different driver than the package metadata.

🔧 Minimal fix
-stdlibPostgresqlDriverVersion=1.6.1
+stdlibPostgresqlDriverVersion=1.6.2
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@gradle.properties` around lines 61 - 62, The gradle property
stdlibPostgresqlDriverVersion is pinned to 1.6.1 while Dependencies.toml locks
ballerinax/postgresql.driver to 1.6.2; update the stdlibPostgresqlDriverVersion
value to 1.6.2 to match the lockfile so Gradle won't resolve a mismatched
driver, and re-run dependency resolution/verification to ensure versions are
aligned.
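
One way to catch this kind of drift automatically is a small consistency check. The Python sketch below is an assumption-laden illustration, not part of the PR: it uses regular expressions rather than a real TOML parser and assumes the lockfile uses `[[package]]` blocks with `org`, `name`, and `version` keys. The helper names are hypothetical.

```python
import re
from typing import Optional


def gradle_property(text: str, name: str) -> Optional[str]:
    """Return the value of `name=value` from gradle.properties text, if present."""
    match = re.search(rf"^{re.escape(name)}\s*=\s*(\S+)\s*$", text, re.MULTILINE)
    return match.group(1) if match else None


def locked_version(toml_text: str, org: str, name: str) -> Optional[str]:
    """Return the locked version of org/name from Dependencies.toml text."""
    for block in toml_text.split("[[package]]")[1:]:
        # Only top-level `key = "value"` lines are read; nested tables are ignored.
        fields = dict(re.findall(r'^(\w+)\s*=\s*"([^"]*)"', block, re.MULTILINE))
        if fields.get("org") == org and fields.get("name") == name:
            return fields.get("version")
    return None


def pins_match(gradle_text: str, toml_text: str) -> bool:
    """True when the Gradle pin equals the lockfile version for the driver."""
    pinned = gradle_property(gradle_text, "stdlibPostgresqlDriverVersion")
    locked = locked_version(toml_text, "ballerinax", "postgresql.driver")
    return pinned is not None and pinned == locked
```

Running `pins_match` over the two files' contents in CI would flag the 1.6.1 versus 1.6.2 mismatch described in this comment.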
ballerina/tests/listener_tests.bal (1)

261-291: ⚠️ Potential issue | 🟡 Minor

Assert every property you seed in these mapping tests.

These cases populate slot.name, slot.stream.params, and xmin.fetch.interval.ms, but never assert them. That leaves part of the new mapping surface unverified.

🧪 Missing assertions
     test:assertEquals(actualProperties["plugin.name"],
         expectedProperties["plugin.name"],
         msg = "Plugin name does not match.");
+    test:assertEquals(actualProperties["slot.name"],
+        expectedProperties["slot.name"],
+        msg = "Slot name does not match.");
     test:assertEquals(actualProperties["slot.drop.on.stop"],
         expectedProperties["slot.drop.on.stop"],
         msg = "Slot drop on stop does not match.");
+    test:assertEquals(actualProperties["slot.stream.params"],
+        expectedProperties["slot.stream.params"],
+        msg = "Slot stream params does not match.");
@@
     test:assertEquals(actualProperties["status.update.interval.ms"],
         expectedProperties["status.update.interval.ms"],
         msg = "Status update interval does not match.");
+    test:assertEquals(actualProperties["xmin.fetch.interval.ms"],
+        expectedProperties["xmin.fetch.interval.ms"],
+        msg = "XMIN fetch interval does not match.");
     test:assertEquals(actualProperties["lsn.flush.mode"],
         expectedProperties["lsn.flush.mode"],
         msg = "LSN flush mode does not match.");

Also applies to: 321-349

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ballerina/tests/listener_tests.bal` around lines 261 - 291, The test seeds
additional replication properties but only asserts two of them; update the
testPostgresReplicationConfiguration (and the similar test at lines 321-349) to
assert all seeded keys by adding assertions that compare
actualProperties["slot.name"], actualProperties["slot.stream.params"], and
actualProperties["xmin.fetch.interval.ms"] against expectedProperties entries;
locate the assertions around populateDatabaseConfigurations and add
test:assertEquals calls for those three keys to ensure
populateDatabaseConfigurations produces the full mapping.
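
Rather than adding one assertion per key by hand, a test can compare every seeded key in a loop, so newly seeded properties are covered automatically. The following Python sketch shows the idea; the helper name is hypothetical and the actual suite is written in Ballerina with `test:assertEquals`.

```python
from typing import Dict, List


def mismatched_keys(expected: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """Return every seeded key whose populated value is missing or different."""
    return [key for key in expected if actual.get(key) != expected[key]]
```

A test would then assert that `mismatched_keys(expectedProperties, actualProperties)` is empty and report the returned keys on failure.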
ballerina/utils.bal (1)

122-128: ⚠️ Potential issue | 🟠 Major

Don't emit lsn.flush.mode against the current CDC driver.

Line 127 writes lsn.flush.mode, but this PR still depends on ballerinax/postgresql.cdc.driver:1.0.0, which Ballerina Central documents as bundling Debezium Postgres Driver 3.0.8.Final. Debezium introduced lsn.flush.mode in 3.4, so streamingConfig.lsnFlushMode is ahead of the bundled runtime here. Please gate or reject this field until the driver is upgraded in the same PR. (central.ballerina.io)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ballerina/utils.bal` around lines 122 - 128, The code is emitting
LSN_FLUSH_MODE even though the bundled Debezium driver
(ballerinax/postgresql.cdc.driver:1.0.0) is older and doesn't support
lsn.flush.mode; update populateStreamingConfiguration to not set
debeziumConfigs[LSN_FLUSH_MODE] from StreamingConfiguration.lsnFlushMode unless
the runtime/driver version supports it (or reject the field); specifically, gate
the assignment that checks lsnFlushMode (the lsnFlushMode variable and
LSN_FLUSH_MODE constant) behind a runtime/compile check or remove/ignore
lsnFlushMode until the driver dependency is bumped, so
populateStreamingConfiguration does not emit the unsupported config key.
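
The gating suggested here can be sketched as a version check that rejects the field early instead of failing inside the engine. This Python sketch is illustrative only: the 3.4 threshold is taken from the review comment rather than verified independently, and the `debezium_version` parameter is an assumption, since the PR does not show how the bundled version would be discovered.

```python
from typing import Dict, Mapping, Optional, Tuple

LSN_FLUSH_MODE = "lsn.flush.mode"
MIN_SUPPORTED = (3, 4)  # threshold as stated in the review comment


def parse_version(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '3.0.8.Final'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def populate_streaming_configuration(
    config: Mapping[str, Optional[str]],
    props: Dict[str, str],
    debezium_version: str,
) -> None:
    """Emit lsn.flush.mode only when the bundled Debezium is new enough."""
    mode = config.get("lsnFlushMode")
    if mode is None:
        return  # nothing requested, nothing to validate
    if parse_version(debezium_version) < MIN_SUPPORTED:
        raise ValueError(
            f"lsnFlushMode requires Debezium {MIN_SUPPORTED[0]}.{MIN_SUPPORTED[1]}+, "
            f"but the bundled version is {debezium_version}"
        )
    props[LSN_FLUSH_MODE] = mode
```

Raising a validation error keeps the failure close to the configuration that caused it; silently dropping the field is the other option the comment mentions.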

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bd83eb45-5c46-4184-b71e-b6fc5f2cfccd

📥 Commits

Reviewing files that changed from the base of the PR and between f7a79a3 and d998d6f.

📒 Files selected for processing (12)
  • ballerina/Ballerina.toml
  • ballerina/CompilerPlugin.toml
  • ballerina/Dependencies.toml
  • ballerina/README.md
  • ballerina/cdc_listener.bal
  • ballerina/listener_types.bal
  • ballerina/tests/listener_tests.bal
  • ballerina/tests/test_cleanup.bal
  • ballerina/utils.bal
  • build.gradle
  • changelog.md
  • gradle.properties
💤 Files with no reviewable changes (1)
  • build.gradle
🚧 Files skipped from review as they are similar to previous changes (4)
  • changelog.md
  • ballerina/Ballerina.toml
  • ballerina/README.md
  • ballerina/CompilerPlugin.toml

ayeshLK previously approved these changes Mar 17, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ballerina/listener_types.bal`:
- Around line 92-94: The PostgresDatabaseConnection public record's doc block is
missing documentation for the new streamingConfig field; update the
PostgresDatabaseConnection documentation to add a description for
streamingConfig (what it configures, expected type/shape, and any
relation/priority with replicationConfig/publicationConfig), ensuring the doc
mentions the streamingConfig field name and its purpose so the generated API
docs include this new entry point.
- Around line 48-62: The code accepts both legacy fields and new nested records
(ReplicationConfiguration and PublicationConfiguration) but currently overrides
legacy values when nested records are present; update the logic around
replicationConfig and publicationConfig (in the code that currently switches
wholesale) to either reject mixed usage or merge field-by-field: detect if both
legacy fields and the nested record are provided and return an error, or iterate
each field in ReplicationConfiguration and PublicationConfiguration and copy
only non-nil/non-default values into the final config so legacy values are
preserved when nested records omit fields (reference ReplicationConfiguration,
PublicationConfiguration, replicationConfig, publicationConfig, and the
functions that perform the conversion).
- Around line 35-40: The public enum LsnFlushMode exposes a value
CONNECTOR_AND_DRIVER that requires Debezium 3.4+ but the bundled CDC driver
(ballerinax/postgresql.cdc.driver:1.0.0) contains Debezium 3.0.8 and will fail
at runtime; fix by either removing the CONNECTOR_AND_DRIVER enum member from
LsnFlushMode (and any other references to it) so the public API only exposes
supported values (MANUAL, CONNECTOR) and update any docs/tests that reference
it, or instead update the CDC driver dependency to a version that bundles
Debezium 3.4+ and adjust build/dependency metadata accordingly; target symbols:
enum LsnFlushMode and the CONNECTOR_AND_DRIVER variant (also remove/update the
duplicate occurrence referenced at the later LsnFlushMode definition).
- Around line 28-33: Add the missing enum member to PublicationAutocreateMode so
callers can use the Debezium-supported mode "no_tables": add a new enum constant
named NO_TABLES with the string value "no_tables" to the public enum
PublicationAutocreateMode (alongside ALL_TABLES, DISABLED, FILTERED) so the
public API exposes that valid mode.
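
The two options raised for mixed legacy and nested configuration (reject it, or merge field by field) can be sketched together. This is a Python illustration with hypothetical field names and a hypothetical `strict` switch, not the module's behaviour.

```python
from typing import Any, Dict, Mapping

# Hypothetical names for the deprecated top-level replication fields.
LEGACY_FIELDS = ("pluginName", "slotName")


def merge_replication(db: Mapping[str, Any], strict: bool = False) -> Dict[str, Any]:
    """Merge legacy top-level fields with replicationConfig, field by field.

    Nested values win where both are set; legacy values survive where the
    nested record omits a field. With strict=True, mixed usage is rejected.
    """
    legacy = {f: db[f] for f in LEGACY_FIELDS if db.get(f) is not None}
    nested_record = db.get("replicationConfig") or {}
    nested = {f: v for f, v in nested_record.items() if v is not None}
    if strict and legacy and nested:
        raise ValueError("use either the deprecated fields or replicationConfig, not both")
    return {**legacy, **nested}
```

With `strict=False`, a legacy `slotName` is kept when `replicationConfig` sets only `pluginName`, which is the case the wholesale override would lose.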

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b9e8ed6a-2125-499f-b91a-19672d353022

📥 Commits

Reviewing files that changed from the base of the PR and between d998d6f and ee32c0d.

📒 Files selected for processing (12)
  • ballerina/Ballerina.toml
  • ballerina/CompilerPlugin.toml
  • ballerina/Dependencies.toml
  • ballerina/README.md
  • ballerina/cdc_listener.bal
  • ballerina/listener_types.bal
  • ballerina/tests/listener_tests.bal
  • ballerina/tests/test_cleanup.bal
  • ballerina/utils.bal
  • build.gradle
  • changelog.md
  • gradle.properties
💤 Files with no reviewable changes (1)
  • build.gradle
✅ Files skipped from review due to trivial changes (4)
  • changelog.md
  • ballerina/Ballerina.toml
  • gradle.properties
  • ballerina/CompilerPlugin.toml
🚧 Files skipped from review as they are similar to previous changes (6)
  • ballerina/tests/test_cleanup.bal
  • ballerina/cdc_listener.bal
  • ballerina/utils.bal
  • ballerina/README.md
  • ballerina/tests/listener_tests.bal
  • ballerina/Dependencies.toml

Comment on lines +28 to +33
# Represents publication autocreate modes.
public enum PublicationAutocreateMode {
    ALL_TABLES = "all_tables",
    DISABLED = "disabled",
    FILTERED = "filtered"
}


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

curl -fsSL https://debezium.io/documentation/reference/stable/connectors/postgresql.html |
  sed -n '/publication.autocreate.mode/,/replica.identity.autoset.values/p' |
  rg -n '`(all_tables|disabled|filtered|no_tables)`'

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 73


🏁 Script executed:

# Check the actual file to see current enum members
cat -n ballerina/listener_types.bal | head -40

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 1719


🏁 Script executed:

# Search for dependency declarations (look for build files, gradle, maven, etc.)
fd -e 'gradle|xml|toml|yaml|yml|lock' -type f | head -20

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 261


🏁 Script executed:

# Try alternative approach to check Debezium docs - simpler curl without piping to external tools
curl -fsSL 'https://debezium.io/documentation/reference/stable/connectors/postgresql.html' 2>&1 | grep -i 'publication.autocreate.mode' | head -5

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 1680


🏁 Script executed:

# Search for dependency/version files in the repository
fd -type f \( -name "*.gradle" -o -name "pom.xml" -o -name "Ballerina.toml" -o -name "*.lock" \)

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 261


🌐 Web query:

Debezium PostgreSQL 3.0.8 publication.autocreate.mode no_tables

💡 Result:

In Debezium PostgreSQL connector version 3.0.8, the publication.autocreate.mode configuration property controls how the connector handles creation of the PostgreSQL publication used for logical replication with the pgoutput plugin.

The no_tables mode creates (or uses an existing) publication with no tables assigned to it initially. The user must then manually add tables to the publication using SQL commands like ALTER PUBLICATION mypub ADD TABLE myschema.mytable;. This is useful when the connector lacks privileges to auto-add tables or when fine-grained control over publication tables is needed. This option was added via DBZ-8156 and documented in a merged pull request in August 2024, present in Debezium 3.0.8 (released March 2025).

Other common modes include:

  • disabled: No auto-creation; publication must pre-exist.
  • all_tables: Creates publication for all tables (default in older versions).
  • filtered: Creates publication based on table.include.list/table.exclude.list filters.

Configure as: "publication.autocreate.mode": "no_tables"
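
The four modes named in this query result can be modelled as a closed set, which is what the review asks the public enum to cover. The Python sketch below mirrors the Ballerina enum's member names and adds `NO_TABLES`; it is an illustration only, and the helper function name is hypothetical.

```python
from enum import Enum
from typing import Dict, Union


class PublicationAutocreateMode(str, Enum):
    ALL_TABLES = "all_tables"
    DISABLED = "disabled"
    FILTERED = "filtered"
    NO_TABLES = "no_tables"  # the member the review asks to add


def to_debezium_property(mode: Union[str, PublicationAutocreateMode]) -> Dict[str, str]:
    """Validate the mode and return the matching Debezium property."""
    # Enum lookup by value raises ValueError for anything outside the four modes.
    return {"publication.autocreate.mode": PublicationAutocreateMode(mode).value}
```

Unknown strings are rejected when the configuration is built, which is the stronger typing the enum is meant to provide.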



🏁 Script executed:

# Find dependency files using simpler approach
find . -maxdepth 3 -type f \( -name "Ballerina.toml" -o -name "gradle.build" -o -name "pom.xml" -o -name "*.gradle" \) 2>/dev/null

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 688


🏁 Script executed:

# Search for "debezium" references in the repository
rg -i "debezium|publication.autocreate" --type-list | head -5 && rg -i "debezium|publication" -A 2 -B 2 | head -40

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 2881


🏁 Script executed:

# Check main build.gradle for dependencies
cat build.gradle | grep -i debezium -A 2 -B 2

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 73


🏁 Script executed:

# Check native build.gradle
cat native/build.gradle | grep -i debezium -A 2 -B 2

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 73


🏁 Script executed:

# Read the native/build.gradle file to see dependencies
cat native/build.gradle | head -80

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 2803


🏁 Script executed:

# Broader search for debezium anywhere in the repo
rg 'debezium' -i --max-count 20

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 3615


🏁 Script executed:

# Check gradle.properties for version variables
cat gradle.properties | grep -i 'cdc\|debezium' -A 1 -B 1

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 186


🏁 Script executed:

# Check ballerina/Ballerina.toml for dependencies
cat ballerina/Ballerina.toml | head -40

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 830


🏁 Script executed:

# Check what version of cdc driver is being used and search for its Debezium version
# Look at settings.gradle to understand dependency resolution
cat settings.gradle | grep -i cdc -A 3 -B 3

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 73


🌐 Web query:

ballerinax postgresql.cdc.driver 1.0.0 Debezium version

💡 Result:

The ballerinax/postgresql.cdc.driver version 1.0.0 bundles Debezium Postgres Driver version 3.0.8.Final.



🏁 Script executed:

# Check if there are any other references to the actual Debezium version in this repo
rg 'debezium.*3\.0' -i

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 73


Add the missing NO_TABLES enum member to PublicationAutocreateMode.

The currently published ballerinax/postgresql.cdc.driver 1.0.0 bundles Debezium Postgres Driver 3.0.8.Final, which supports publication.autocreate.mode=no_tables. Omitting it from the enum blocks users from configuring this valid mode through the public API.

Fix
 public enum PublicationAutocreateMode {
     ALL_TABLES = "all_tables",
     DISABLED = "disabled",
-    FILTERED = "filtered"
+    FILTERED = "filtered",
+    NO_TABLES = "no_tables"
 }
📝 Committable suggestion


Suggested change
 # Represents publication autocreate modes.
 public enum PublicationAutocreateMode {
     ALL_TABLES = "all_tables",
     DISABLED = "disabled",
-    FILTERED = "filtered"
+    FILTERED = "filtered",
+    NO_TABLES = "no_tables"
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ballerina/listener_types.bal` around lines 28 - 33, Add the missing enum
member to PublicationAutocreateMode so callers can use the Debezium-supported
mode "no_tables": add a new enum constant named NO_TABLES with the string value
"no_tables" to the public enum PublicationAutocreateMode (alongside ALL_TABLES,
DISABLED, FILTERED) so the public API exposes that valid mode.

Comment on lines +35 to +40
# Represents LSN flush modes.
public enum LsnFlushMode {
    MANUAL = "manual",
    CONNECTOR = "connector",
    CONNECTOR_AND_DRIVER = "connector_and_driver"
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Candidate manifest files =="
fd '^(Ballerina\.toml|Dependencies\.toml|Package\.md|README\.md)$' -x sh -c '
  printf "\n--- %s ---\n" "$1"
  sed -n "1,220p" "$1"
' sh {}

echo
echo "== Debezium / CDC driver references =="
rg -n 'postgresql\.cdc\.driver|debezium-connector-postgres|Debezium Postgres Driver|lsn\.flush\.mode|connector_and_driver' .

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 29179


🏁 Script executed:

# Check the full postgresql.cdc.driver version in Dependencies.toml
cat ballerina/Dependencies.toml | grep -A 10 'name = "postgresql.cdc.driver"'

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 583


🏁 Script executed:

# Check build.gradle for CDC driver version
grep -n 'stdlibPostgresCdcDriverVersion\|postgresql.*cdc.*driver' build.gradle | head -20

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 204


🏁 Script executed:

# Check if there's a gradle.properties or similar with version definitions
fd '(gradle\.properties|version\.txt|versions\.gradle)$' -x cat {}

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 1257


🏁 Script executed:

# Search for any Debezium version references in the native build or pom files
fd '(pom\.xml|build\.gradle)' -x grep -l 'debezium\|cdc.*driver' {} 2>/dev/null | head -5

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 105


🏁 Script executed:

# Find and examine native build files for Debezium/CDC driver dependencies
find ./native -name 'pom.xml' -o -name 'build.gradle' 2>/dev/null | head -5

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 112


🏁 Script executed:

# Check for any native module files that might reference Debezium version
fd 'pom\.xml' ./native -x grep -l 'debezium' {} 2>/dev/null

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 73


🏁 Script executed:

# Search for any version or release documentation
fd '(CHANGELOG|RELEASE|NEWS|VERSION)' -i | grep -i 'cdc\|debezium' | head -10

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 73


🏁 Script executed:

# Check native build gradle files for CDC driver or Debezium references
find ./native -name '*.gradle' -x grep -n 'debezium\|cdc.*driver\|postgres.*connector' {} 2>/dev/null

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 73
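
The `find ... -x grep` invocation above is probably not doing what was intended: `-x` is not a `find` primary (it resembles `fd`'s `-x`/`--exec` flag), and `2>/dev/null` hides any resulting error, which would be consistent with the short output recorded. A minimal sketch of the same search using `-exec`, assuming a POSIX `find` and `grep`; the helper name `search_gradle_files` is hypothetical and not part of the repository:

```shell
#!/bin/sh
# Hypothetical helper: search every *.gradle file under a directory for a
# case-insensitive extended regular expression, printing the file name and
# line number of each match.
search_gradle_files() {
    dir=$1
    pattern=$2
    # "-exec ... {} +" batches file names into as few grep calls as possible.
    # The extra /dev/null operand makes grep print file names even when only
    # a single file is found.
    find "$dir" -type f -name '*.gradle' \
        -exec grep -inE "$pattern" /dev/null {} +
}

# Example call, mirroring the search attempted above:
#   search_gradle_files ./native 'debezium|cdc.*driver|postgres.*connector'
```

Because errors are no longer discarded, a missing directory or a malformed pattern shows up in the log instead of looking like "no matches".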


🏁 Script executed:

# Examine native/build.gradle for CDC driver dependencies
cat ./native/build.gradle | grep -A 5 -B 5 'cdc\|debezium\|postgres' -i

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 934


🏁 Script executed:

# Check if there's a separate CDC driver repository or if it's defined in build
rg 'postgresql\.cdc\.driver|Debezium' ballerina/Dependencies.toml -A 5 -B 2

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 705


🏁 Script executed:

# Search for any gradle property or variable that tracks CDC driver bundled Debezium version
rg 'debezium.*version|cdc.*version' -i gradle.properties build.gradle

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 400


🏁 Script executed:

# Check if this PR changes any CDC driver version by looking for gradle version assignments
grep -r 'stdlibPostgresCdcDriverVersion\|postgresql.cdc.driver' gradle.properties build.gradle native/build.gradle 2>/dev/null

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 268


🏁 Script executed:

# Search for any Debezium or version configuration in tests
rg 'debezium|3\.0\.|3\.4\.' -i ballerina/tests listener_tests.bal | head -20

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 149


🏁 Script executed:

# Check README or documentation for Debezium version requirements
cat ballerina/README.md | grep -A 5 -B 5 'debezium\|cdc.*driver\|3\.0\|3\.4' -i

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 999


🏁 Script executed:

# Search for any version constraints or compatibility notes
rg 'version.*3\.' ballerina/ -A 2 -B 2

Repository: ballerina-platform/module-ballerinax-postgresql

Length of output: 336


🌐 Web query:

ballerinax/postgresql.cdc.driver version 1.0.0 Debezium

💡 Result:

ballerinax/postgresql.cdc.driver version 1.0.0 is a Ballerina library that provides the Debezium PostgreSQL connector dependencies for Change Data Capture (CDC) functionality in Ballerina applications. It enables seamless listening to changes (inserts, updates, deletes) in PostgreSQL databases.

Key details:
- Released on 2025-05-29 (latest and only release as of 2026-03-19).
- Compatible with Ballerina Language version 2201.12.0.
- Bundles Debezium Postgres Driver version 3.0.8.Final.
- Usage: import it alongside the ballerinax/cdc and ballerinax/postgresql modules:
    import ballerinax/cdc;
    import ballerinax/postgresql;
    import ballerinax/postgresql.cdc.driver as _;
  Then configure a postgresql:CdcListener to capture events with remote functions such as onCreate, onUpdate, and onDelete.

It is a support module for Ballerina's CDC framework, which is built on Debezium. No additional driver configuration is needed because the driver is pre-bundled. Available on Ballerina Central: https://central.ballerina.io/ballerinax/postgresql.cdc.driver/latest. Source: https://github.com/ballerina-platform/module-ballerinax-postgresql.cdc.driver.



Remove CONNECTOR_AND_DRIVER from the public API or upgrade the bundled CDC driver to support Debezium 3.4+.

The enum value CONNECTOR_AND_DRIVER is only recognized for the lsn.flush.mode property by Debezium Postgres Driver 3.4 or later. The bundled ballerinax/postgresql.cdc.driver:1.0.0 currently provides Debezium 3.0.8.Final, which does not recognize this value, so code using the enum member will compile but fail at runtime when Debezium parses the configuration. This PR does not upgrade the CDC driver version, so either remove the member from the public API for now or upgrade the CDC driver dependency to a version that bundles Debezium 3.4+.

Also applies to: lines 69-72

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ballerina/listener_types.bal` around lines 35 - 40, The public enum
LsnFlushMode exposes a value CONNECTOR_AND_DRIVER that requires Debezium 3.4+
but the bundled CDC driver (ballerinax/postgresql.cdc.driver:1.0.0) contains
Debezium 3.0.8 and will fail at runtime; fix by either removing the
CONNECTOR_AND_DRIVER enum member from LsnFlushMode (and any other references to
it) so the public API only exposes supported values (MANUAL, CONNECTOR) and
update any docs/tests that reference it, or instead update the CDC driver
dependency to a version that bundles Debezium 3.4+ and adjust build/dependency
metadata accordingly; target symbols: enum LsnFlushMode and the
CONNECTOR_AND_DRIVER variant (also remove/update the duplicate occurrence
referenced at the later LsnFlushMode definition).

Comment on lines +48 to +62
public type ReplicationConfiguration record {|
    PostgreSQLLogicalDecodingPlugin pluginName = PGOUTPUT;
    string slotName = "debezium";
    boolean slotDropOnStop = false;
    string slotStreamParams?;
|};

# PostgreSQL publication configuration (pgoutput plugin).
#
# + publicationName - Name of PostgreSQL publication
# + publicationAutocreateMode - Mode for auto-creating publications
public type PublicationConfiguration record {|
    string publicationName = "dbz_publication";
    PublicationAutocreateMode publicationAutocreateMode = ALL_TABLES;
|};

⚠️ Potential issue | 🟠 Major

Reject mixed deprecated and nested configs, or merge them field-by-field.

ballerina/utils.bal:74-100 switches wholesale to replicationConfig / publicationConfig when those records are present. Because these new records prefill pluginName, slotName, and publicationName, a partial migration such as {pluginName: DECODERBUFS, replicationConfig: {slotDropOnStop: true}} silently resolves to pgoutput / debezium instead of preserving the legacy values. That is a behavior change, not just a deprecation path.

Also applies to: 108-119
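
The difference between the wholesale switch and a field-by-field merge can be illustrated outside Ballerina. The following is a minimal shell sketch of the precedence logic only, not the module's implementation: the function names `resolve_wholesale` and `resolve_merged` are hypothetical, an empty string stands in for a field the user did not provide, and the lowercase values `pgoutput` and `decoderbufs` are illustrative stand-ins for the plugin enum members.

```shell
#!/bin/sh
# Illustration only: positional arguments stand in for configuration fields.
# An empty string means "field not provided by the user".

# Wholesale switch: once the nested record is present, its (defaulted) value
# wins and the deprecated top-level value is ignored.
resolve_wholesale() {
    legacy_plugin=$1
    nested_present=$2
    nested_plugin=$3
    if [ "$nested_present" = yes ]; then
        printf '%s\n' "${nested_plugin:-pgoutput}"
    else
        printf '%s\n' "${legacy_plugin:-pgoutput}"
    fi
}

# Field-by-field merge: an explicit nested value wins, then the legacy value;
# the default applies only when neither was provided.
resolve_merged() {
    legacy_plugin=$1
    nested_plugin=$2
    printf '%s\n' "${nested_plugin:-${legacy_plugin:-pgoutput}}"
}

# Partial migration: legacy plugin set, nested record present but silent on it.
resolve_wholesale decoderbufs yes ''   # prints pgoutput (legacy value lost)
resolve_merged decoderbufs ''          # prints decoderbufs (legacy value kept)
```

In the Ballerina records the nested fields carry default values, so an omitted field cannot be told apart from an explicit default after construction. A merge of this kind would therefore likely need optional nested fields, or the alternative of rejecting mixed usage outright.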

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ballerina/listener_types.bal` around lines 48 - 62, The code accepts both
legacy fields and new nested records (ReplicationConfiguration and
PublicationConfiguration) but currently overrides legacy values when nested
records are present; update the logic around replicationConfig and
publicationConfig (in the code that currently switches wholesale) to either
reject mixed usage or merge field-by-field: detect if both legacy fields and the
nested record are provided and return an error, or iterate each field in
ReplicationConfiguration and PublicationConfiguration and copy only
non-nil/non-default values into the final config so legacy values are preserved
when nested records omit fields (reference ReplicationConfiguration,
PublicationConfiguration, replicationConfig, publicationConfig, and the
functions that perform the conversion).

Comment on lines +92 to 94
# + replicationConfig - Replication configuration (logical decoding plugin, slot name and parameters). Takes priority over deprecated top-level fields
# + publicationConfig - Publication configuration (publication name and autocreate mode). Takes priority over deprecated top-level fields
public type PostgresDatabaseConnection record {|

⚠️ Potential issue | 🟡 Minor

Document streamingConfig in the public record docs.

Line 119 adds a new public field, but the PostgresDatabaseConnection doc block never lists or describes streamingConfig. That leaves the generated API docs incomplete for one of the new entry points.

Also applies to: 117-119

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ballerina/listener_types.bal` around lines 92 - 94, The
PostgresDatabaseConnection public record's doc block is missing documentation
for the new streamingConfig field; update the PostgresDatabaseConnection
documentation to add a description for streamingConfig (what it configures,
expected type/shape, and any relation/priority with
replicationConfig/publicationConfig), ensuring the doc mentions the
streamingConfig field name and its purpose so the generated API docs include
this new entry point.

@codecov

codecov bot commented Mar 19, 2026

Codecov Report

❌ Patch coverage is 96.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.77%. Comparing base (46d45da) to head (944a758).
⚠️ Report is 20 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ballerina/utils.bal | 95.65% | 3 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1247      +/-   ##
============================================
+ Coverage     81.63%   81.77%   +0.13%     
+ Complexity      864      863       -1     
============================================
  Files            28       28              
  Lines          3523     3566      +43     
  Branches        484      494      +10     
============================================
+ Hits           2876     2916      +40     
- Misses          431      433       +2     
- Partials        216      217       +1     


@gayaldassanayake gayaldassanayake merged commit 2d3b8a7 into ballerina-platform:main Mar 19, 2026
6 checks passed