Skip to content

feat: add DuckDB::Value.create_varchar#1254

Merged
suketa merged 2 commits intosuketa:mainfrom
otegami:feature/value-create-varchar
Apr 9, 2026
Merged

feat: add DuckDB::Value.create_varchar#1254
suketa merged 2 commits intosuketa:mainfrom
otegami:feature/value-create-varchar

Conversation

@otegami
Copy link
Copy Markdown
Contributor

@otegami otegami commented Apr 9, 2026

refs: GH-695

Add DuckDB::Value.create_varchar to create a Value object for the VARCHAR type by wrapping the following C API:

The method validates that the input is a String before passing it to the C layer.

Summary by CodeRabbit

  • New Features

    • Added DuckDB::Value.create_varchar to create VARCHAR values from Ruby strings; validates UTF-8/US-ASCII compatibility and raises ArgumentError for invalid encodings or non-strings.
  • Documentation

    • Updated changelog to document the new varchar constructor.
  • Tests

    • Added unit and integration tests, including prepared-statement binding and cases for empty, valid, and invalid/non-UTF-8 inputs.

refs: suketaGH-695

Add `DuckDB::Value.create_varchar` to create a Value object for the VARCHAR type by wrapping the following C API:

- [duckdb_value duckdb_create_varchar_length(const char *text, idx_t length);](https://duckdb.org/docs/stable/clients/c/api.html#duckdb_create_varchar_length)

The method validates that the input is a String before passing it to the C layer.
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 9, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d56d187f-67e5-4bbb-b2aa-047755a1a619

📥 Commits

Reviewing files that changed from the base of the PR and between 3f83a12 and 5976ec4.

📒 Files selected for processing (2)
  • lib/duckdb/value.rb
  • test/duckdb_test/value_test.rb
✅ Files skipped from review due to trivial changes (1)
  • test/duckdb_test/value_test.rb
🚧 Files skipped from review as they are similar to previous changes (1)
  • lib/duckdb/value.rb

📝 Walkthrough

Walkthrough

Adds a new public factory DuckDB::Value.create_varchar(String) with Ruby-side encoding checks, a private C singleton helper that builds a DuckDB VARCHAR value, a changelog entry, and unit + integration tests.

Changes

Cohort / File(s) Summary
Changelog & Ruby API
CHANGELOG.md, lib/duckdb/value.rb
Documented feature in Unreleased; added DuckDB::Value.create_varchar(value) with String type check and UTF-8/US-ASCII + valid-encoding validation via check_utf8_compatible!.
C extension
ext/duckdb/value.c
Added private singleton helper _create_varchar(str) that extracts Ruby string pointer/length, calls duckdb_create_varchar_length, and wraps the resulting DuckDB value into a DuckDB::Value.
Tests
test/duckdb_test/value_test.rb
Added unit tests for valid, empty, and US-ASCII strings; invalid inputs (non-String, invalid UTF-8 bytes, binary/non-text encoding, Shift_JIS); added prepared-statement binding/integration test for VARCHAR.

Sequence Diagram(s)

sequenceDiagram
  participant Ruby as Ruby (caller)
  participant CExt as C extension (ext/duckdb/value.c)
  participant DuckDB as DuckDB C API
  Ruby->>Ruby: DuckDB::Value.create_varchar(str)\n(check_type!, check_utf8_compatible!)
  Ruby->>CExt: _create_varchar(str)
  CExt->>DuckDB: duckdb_create_varchar_length(ptr, len)
  DuckDB-->>CExt: duckdb_value (allocated)
  CExt-->>Ruby: rbduckdb_value_new(duckdb_value)
  Ruby-->>Caller: DuckDB::Value instance
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • suketa

Poem

🐰 I hopped from Ruby to DuckDB's strand,
I checked each byte and held your hand.
UTF‑8 cosy, ASCII too,
A VARCHAR popped — hiccup? not you! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: add DuckDB::Value.create_varchar' directly and clearly describes the main change—adding a new factory method for VARCHAR values, which is the primary objective of the pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
test/duckdb_test/value_test.rb (1)

519-527: Consider one extra edge-case round-trip test for string payloads.

An additional bind/round-trip case with an embedded null byte would harden regression coverage for tricky string inputs.

✅ Suggested additional test
+    def test_create_varchar_bind_value_with_null_byte
+      `@con.query`('CREATE TABLE e2e_varchar_null_byte (id INTEGER, val VARCHAR)')
+      stmt = DuckDB::PreparedStatement.new(`@con`, 'INSERT INTO e2e_varchar_null_byte VALUES (1, ?)')
+      stmt.bind_value(1, DuckDB::Value.create_varchar("a\0b"))
+      stmt.execute
+      result = `@con.query`('SELECT val FROM e2e_varchar_null_byte WHERE id = 1')
+
+      assert_equal("a\0b", result.first[0])
+    end
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/duckdb_test/value_test.rb` around lines 519 - 527, Add a new edge-case
test alongside test_create_varchar_bind_value that verifies round-trip behavior
for strings containing embedded null bytes: create the same table (or reuse
e2e_varchar), prepare an INSERT via DuckDB::PreparedStatement, bind a value
using DuckDB::Value.create_varchar with a string containing an embedded "\0"
(e.g. "Hello\0DuckDB"), execute the statement, SELECT the value back via
`@con.query` and assert the returned string equals the original byte-for-byte
(including the null). Use the same call sites (DuckDB::PreparedStatement,
bind_value, DuckDB::Value.create_varchar, `@con.query`) so the test covers both
binding and retrieval.
ext/duckdb/value.c (1)

19-19: Align new C helper with extension naming and docs conventions.

Please rename the new helper to the rbduckdb_ prefix and add a call-seq: comment block for the function.

♻️ Proposed fix
-static VALUE duckdb_value_s__create_varchar(VALUE klass, VALUE str);
+static VALUE rbduckdb_value_s__create_varchar(VALUE klass, VALUE str);
@@
-static VALUE duckdb_value_s__create_varchar(VALUE klass, VALUE str) {
+/*
+ * call-seq:
+ *   DuckDB::Value._create_varchar(str) -> DuckDB::Value
+ *
+ * Internal constructor for VARCHAR values from a Ruby String.
+ */
+static VALUE rbduckdb_value_s__create_varchar(VALUE klass, VALUE str) {
     const char *str_ptr = StringValuePtr(str);
     idx_t str_len = RSTRING_LEN(str);
     duckdb_value value = duckdb_create_varchar_length(str_ptr, str_len);
     return rbduckdb_value_new(value);
 }
@@
-    rb_define_private_method(rb_singleton_class(cDuckDBValue), "_create_varchar", duckdb_value_s__create_varchar, 1);
+    rb_define_private_method(rb_singleton_class(cDuckDBValue), "_create_varchar", rbduckdb_value_s__create_varchar, 1);

As per coding guidelines, ext/duckdb/**/*.c: “C symbols must be prefixed with rbduckdb_ to avoid namespace conflicts” and “All C functions should use comment blocks with call-seq: for documentation”.

Also applies to: 99-104, 247-247

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ext/duckdb/value.c` at line 19, The new helper function
duckdb_value_s__create_varchar should be renamed to follow the extension's C
symbol prefix (rename to rbduckdb_value_s__create_varchar) and you must add a
documentation comment block above its declaration with a call-seq: line
describing the Ruby-facing usage; update both the forward declaration and the
definition sites to the new name and adjust any callers (e.g., usages named
duckdb_value_s__create_varchar) accordingly, and apply the same rename+call-seq
comment convention to the other similar helpers in this file that were flagged
(ensure all C symbols use the rbduckdb_ prefix and each function has a call-seq:
comment).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@ext/duckdb/value.c`:
- Line 19: The new helper function duckdb_value_s__create_varchar should be
renamed to follow the extension's C symbol prefix (rename to
rbduckdb_value_s__create_varchar) and you must add a documentation comment block
above its declaration with a call-seq: line describing the Ruby-facing usage;
update both the forward declaration and the definition sites to the new name and
adjust any callers (e.g., usages named duckdb_value_s__create_varchar)
accordingly, and apply the same rename+call-seq comment convention to the other
similar helpers in this file that were flagged (ensure all C symbols use the
rbduckdb_ prefix and each function has a call-seq: comment).

In `@test/duckdb_test/value_test.rb`:
- Around line 519-527: Add a new edge-case test alongside
test_create_varchar_bind_value that verifies round-trip behavior for strings
containing embedded null bytes: create the same table (or reuse e2e_varchar),
prepare an INSERT via DuckDB::PreparedStatement, bind a value using
DuckDB::Value.create_varchar with a string containing an embedded "\0" (e.g.
"Hello\0DuckDB"), execute the statement, SELECT the value back via `@con.query`
and assert the returned string equals the original byte-for-byte (including the
null). Use the same call sites (DuckDB::PreparedStatement, bind_value,
DuckDB::Value.create_varchar, `@con.query`) so the test covers both binding and
retrieval.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9aabfe77-a7a2-43f8-8f5b-327a7a03dc92

📥 Commits

Reviewing files that changed from the base of the PR and between 6bb3cd2 and 0dd6338.

📒 Files selected for processing (4)
  • CHANGELOG.md
  • ext/duckdb/value.c
  • lib/duckdb/value.rb
  • test/duckdb_test/value_test.rb

Copy link
Copy Markdown
Owner

@suketa suketa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one question.

# @param value [String] the string value.
# @return [DuckDB::Value] the created Value object.
# @raise [ArgumentError] if +value+ is not a String.
def create_varchar(value)
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happened when passing "\x00\x01\x02\xff".b (binary data)? How about other encoding strings?

Copy link
Copy Markdown
Contributor Author

@otegami otegami Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix: 5976ec4
Thank you for reviewing it!
I checked the DuckDB C API documentation and found that duckdb_create_varchar_length returns nullptr when the input is not valid UTF-8.

So we definitely need to validate the encoding.
I added check_utf8_compatible! which accepts only UTF-8 and US-ASCII and raises ArgumentError for other encodings like binary data ("\x00\x01\x02\xff".b), Shift_JIS strings, and so on.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
lib/duckdb/value.rb (1)

170-173: ⚠️ Potential issue | 🟡 Minor

Update @raise docs to include encoding validation failures.

Line 172 documents only type errors, but this method also raises ArgumentError for non UTF-8/US-ASCII strings via check_utf8_compatible!. Please document that behavior in the YARD comment.

Proposed doc update
-      # `@raise` [ArgumentError] if +value+ is not a String.
+      # `@raise` [ArgumentError] if +value+ is not a String, or if its encoding is not UTF-8/US-ASCII.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/duckdb/value.rb` around lines 170 - 173, Update the YARD `@raise`
documentation for the method create_varchar to mention that it can raise
ArgumentError not only for non-String types but also when the string fails
encoding validation via check_utf8_compatible! (e.g., non UTF-8/US-ASCII data);
locate the create_varchar method and expand the `@raise` line to include both the
type error and encoding validation failure, referencing check_utf8_compatible!
in the description for clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/duckdb/value.rb`:
- Around line 175-176: The UTF-8 compatibility check in check_utf8_compatible!
currently only inspects the string's encoding label and can let strings with
invalid UTF-8 bytes through; update check_utf8_compatible! to also call
value.valid_encoding? (i.e., require both value.encoding == Encoding::UTF_8 and
value.valid_encoding?) before allowing _create_varchar(value) to proceed so
invalid UTF-8 byte sequences are rejected at the Ruby layer with a clear error.

---

Duplicate comments:
In `@lib/duckdb/value.rb`:
- Around line 170-173: Update the YARD `@raise` documentation for the method
create_varchar to mention that it can raise ArgumentError not only for
non-String types but also when the string fails encoding validation via
check_utf8_compatible! (e.g., non UTF-8/US-ASCII data); locate the
create_varchar method and expand the `@raise` line to include both the type error
and encoding validation failure, referencing check_utf8_compatible! in the
description for clarity.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 17ee2e7f-06b2-46f8-b609-c09dfc5f0711

📥 Commits

Reviewing files that changed from the base of the PR and between 0dd6338 and 3f83a12.

📒 Files selected for processing (2)
  • lib/duckdb/value.rb
  • test/duckdb_test/value_test.rb
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/duckdb_test/value_test.rb

refs: suketaGH-695

The DuckDB C API `duckdb_create_varchar_length` returns nullptr when the
input is not valid UTF-8. Add `check_utf8_compatible!` validation in the
Ruby layer to raise ArgumentError for non-UTF-8 strings (e.g., binary
data, Shift_JIS) and invalid UTF-8 byte sequences before passing them
to the C layer.
@otegami otegami force-pushed the feature/value-create-varchar branch from 3f83a12 to 5976ec4 Compare April 9, 2026 12:19
Copy link
Copy Markdown
Owner

@suketa suketa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

@suketa suketa merged commit a1081f1 into suketa:main Apr 9, 2026
41 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants