feat: add DuckDB::Value.create_varchar by otegami · Pull Request #1254 · suketa/ruby-duckdb

otegami · 2026-04-09T02:41:48Z

Add DuckDB::Value.create_varchar to create a Value object for the VARCHAR type by wrapping the following C API:

duckdb_value duckdb_create_varchar_length(const char *text, idx_t length);

The method validates that the input is a String before passing it to the C layer.

Summary by CodeRabbit

New Features
- Added DuckDB::Value.create_varchar to create VARCHAR values from Ruby strings; validates UTF-8/US-ASCII compatibility and raises ArgumentError for invalid encodings or non-strings.
Documentation
- Updated changelog to document the new varchar constructor.
Tests
- Added unit and integration tests, including prepared-statement binding and cases for empty, valid, and invalid/non-UTF-8 inputs.

refs: suketaGH-695

Add `DuckDB::Value.create_varchar` to create a Value object for the VARCHAR type by wrapping the following C API:

- [duckdb_value duckdb_create_varchar_length(const char *text, idx_t length);](https://duckdb.org/docs/stable/clients/c/api.html#duckdb_create_varchar_length)

The method validates that the input is a String before passing it to the C layer.

coderabbitai · 2026-04-09T02:42:02Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d56d187f-67e5-4bbb-b2aa-047755a1a619

📥 Commits

Reviewing files that changed from the base of the PR and between 3f83a12 and 5976ec4.

📒 Files selected for processing (2)

lib/duckdb/value.rb
test/duckdb_test/value_test.rb

✅ Files skipped from review due to trivial changes (1)

test/duckdb_test/value_test.rb

🚧 Files skipped from review as they are similar to previous changes (1)

lib/duckdb/value.rb

📝 Walkthrough

Adds a new public factory DuckDB::Value.create_varchar(String) with Ruby-side encoding checks, a private C singleton helper that builds a DuckDB VARCHAR value, a changelog entry, and unit + integration tests.
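
The flow described in this walkthrough can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the gem's source: the module name `VarcharSketch` and the error messages are hypothetical, and `_create_varchar` here is a stand-in that returns the bytes and length a C helper would receive instead of calling DuckDB.

```ruby
# Hypothetical sketch of the "public factory + private helper" pattern.
# VarcharSketch is NOT part of ruby-duckdb; it only mirrors the described flow.
module VarcharSketch
  class << self
    # Public factory: validate in Ruby first, then delegate to the helper.
    def create_varchar(value)
      check_type!(value)
      check_utf8_compatible!(value)
      _create_varchar(value)
    end

    private

    # Reject anything that is not a String (message text is an assumption).
    def check_type!(value)
      return if value.is_a?(String)

      raise ArgumentError, "expected String, got #{value.class}"
    end

    # Accept only UTF-8 or US-ASCII strings whose bytes are actually valid.
    def check_utf8_compatible!(value)
      allowed = [Encoding::UTF_8, Encoding::US_ASCII].include?(value.encoding)
      return if allowed && value.valid_encoding?

      raise ArgumentError, 'expected a valid UTF-8 or US-ASCII string'
    end

    # Stand-in for the private C helper: returns the raw bytes and byte length
    # that would be handed to a length-based C function.
    def _create_varchar(str)
      [str.b, str.bytesize]
    end
  end
end
```

Calling `VarcharSketch.create_varchar("duck")` returns the stand-in pair, while a non-String or a binary-encoded string raises `ArgumentError` before the helper is reached.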

Changes

| Cohort / File(s) | Summary |
|---|---|
| Changelog & Ruby API `CHANGELOG.md`, `lib/duckdb/value.rb` | Documented feature in Unreleased; added `DuckDB::Value.create_varchar(value)` with `String` type check and UTF-8/US-ASCII + valid-encoding validation via `check_utf8_compatible!`. |
| C extension `ext/duckdb/value.c` | Added private singleton helper `_create_varchar(str)` that extracts Ruby string pointer/length, calls `duckdb_create_varchar_length`, and wraps the resulting DuckDB value into a `DuckDB::Value`. |
| Tests `test/duckdb_test/value_test.rb` | Added unit tests for valid, empty, and US-ASCII strings; invalid inputs (non-String, invalid UTF-8 bytes, binary/non-text encoding, Shift_JIS); added prepared-statement binding/integration test for VARCHAR. |

Sequence Diagram(s)

sequenceDiagram
  participant Ruby as Ruby (caller)
  participant CExt as C extension (ext/duckdb/value.c)
  participant DuckDB as DuckDB C API
  Ruby->>Ruby: DuckDB::Value.create_varchar(str)\n(check_type!, check_utf8_compatible!)
  Ruby->>CExt: _create_varchar(str)
  CExt->>DuckDB: duckdb_create_varchar_length(ptr, len)
  DuckDB-->>CExt: duckdb_value (allocated)
  CExt-->>Ruby: rbduckdb_value_new(duckdb_value)
  Ruby-->>Caller: DuckDB::Value instance

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

feat: add DuckDB::Value.create_int32 #1249: Adds an analogous DuckDB::Value factory method implemented with a private C helper (similar pattern to this VARCHAR addition).
rename DuckDB::ValueImpl to DuckDB::Value #1243: Changes in ext/duckdb/value.c touching value construction and related init/constructor functions.
feat: add DuckDB::Value.create_null #1252: Adds new DuckDB::Value.create_* factory methods with corresponding C helpers and singleton registrations.

Suggested reviewers

suketa

Poem

🐰 I hopped from Ruby to DuckDB's strand,
I checked each byte and held your hand.
UTF‑8 cosy, ASCII too,
A VARCHAR popped — hiccup? not you! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: add DuckDB::Value.create_varchar' directly and clearly describes the main change: adding a new factory method for VARCHAR values, which is the primary objective of the pull request. |


coderabbitai

🧹 Nitpick comments (2)

test/duckdb_test/value_test.rb (1)

519-527: Consider one extra edge-case round-trip test for string payloads.

An additional bind/round-trip case with an embedded null byte would harden regression coverage for tricky string inputs.

✅ Suggested additional test

+    def test_create_varchar_bind_value_with_null_byte
+      @con.query('CREATE TABLE e2e_varchar_null_byte (id INTEGER, val VARCHAR)')
+      stmt = DuckDB::PreparedStatement.new(@con, 'INSERT INTO e2e_varchar_null_byte VALUES (1, ?)')
+      stmt.bind_value(1, DuckDB::Value.create_varchar("a\0b"))
+      stmt.execute
+      result = @con.query('SELECT val FROM e2e_varchar_null_byte WHERE id = 1')
+
+      assert_equal("a\0b", result.first[0])
+    end

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@test/duckdb_test/value_test.rb` around lines 519 - 527, Add a new edge-case
test alongside test_create_varchar_bind_value that verifies round-trip behavior
for strings containing embedded null bytes: create the same table (or reuse
e2e_varchar), prepare an INSERT via DuckDB::PreparedStatement, bind a value
using DuckDB::Value.create_varchar with a string containing an embedded "\0"
(e.g. "Hello\0DuckDB"), execute the statement, SELECT the value back via
`@con.query` and assert the returned string equals the original byte-for-byte
(including the null). Use the same call sites (DuckDB::PreparedStatement,
bind_value, DuckDB::Value.create_varchar, `@con.query`) so the test covers both
binding and retrieval.
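
To illustrate why an embedded null byte is a worthwhile edge case, the following sketch contrasts an explicit byte length with what a NUL-terminated reader would see. It uses only core Ruby; the method names are hypothetical, and it does not exercise DuckDB itself, so whether the value round-trips through the database remains something only the suggested test can confirm.

```ruby
# Byte length as an explicit-length API (pointer + length) would receive it.
def explicit_length(str)
  str.bytesize
end

# What a NUL-terminated reader would see: everything up to the first "\0".
# The "Z*" unpack directive stops at the first null byte.
def nul_terminated_view(str)
  str.unpack1('Z*')
end

sample = "a\0b"
puts explicit_length(sample)             # 3
puts nul_terminated_view(sample).inspect # "a"
```

Because the walkthrough says the C helper passes both pointer and length to `duckdb_create_varchar_length`, the full three bytes should reach the C API; the sketch only shows how a length-unaware path would truncate the string.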

ext/duckdb/value.c (1)

19-19: Align new C helper with extension naming and docs conventions.

Please rename the new helper to the rbduckdb_ prefix and add a call-seq: comment block for the function.

♻️ Proposed fix

-static VALUE duckdb_value_s__create_varchar(VALUE klass, VALUE str);
+static VALUE rbduckdb_value_s__create_varchar(VALUE klass, VALUE str);
@@
-static VALUE duckdb_value_s__create_varchar(VALUE klass, VALUE str) {
+/*
+ * call-seq:
+ *   DuckDB::Value._create_varchar(str) -> DuckDB::Value
+ *
+ * Internal constructor for VARCHAR values from a Ruby String.
+ */
+static VALUE rbduckdb_value_s__create_varchar(VALUE klass, VALUE str) {
     const char *str_ptr = StringValuePtr(str);
     idx_t str_len = RSTRING_LEN(str);
     duckdb_value value = duckdb_create_varchar_length(str_ptr, str_len);
     return rbduckdb_value_new(value);
 }
@@
-    rb_define_private_method(rb_singleton_class(cDuckDBValue), "_create_varchar", duckdb_value_s__create_varchar, 1);
+    rb_define_private_method(rb_singleton_class(cDuckDBValue), "_create_varchar", rbduckdb_value_s__create_varchar, 1);

As per coding guidelines, ext/duckdb/**/*.c: “C symbols must be prefixed with rbduckdb_ to avoid namespace conflicts” and “All C functions should use comment blocks with call-seq: for documentation”.

Also applies to: 99-104, 247-247

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@ext/duckdb/value.c` at line 19, The new helper function
duckdb_value_s__create_varchar should be renamed to follow the extension's C
symbol prefix (rename to rbduckdb_value_s__create_varchar) and you must add a
documentation comment block above its declaration with a call-seq: line
describing the Ruby-facing usage; update both the forward declaration and the
definition sites to the new name and adjust any callers (e.g., usages named
duckdb_value_s__create_varchar) accordingly, and apply the same rename+call-seq
comment convention to the other similar helpers in this file that were flagged
(ensure all C symbols use the rbduckdb_ prefix and each function has a call-seq:
comment).


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9aabfe77-a7a2-43f8-8f5b-327a7a03dc92

📥 Commits

Reviewing files that changed from the base of the PR and between 6bb3cd2 and 0dd6338.

📒 Files selected for processing (4)

CHANGELOG.md
ext/duckdb/value.c
lib/duckdb/value.rb
test/duckdb_test/value_test.rb

suketa

I have one question.

suketa · 2026-04-09T10:15:39Z

lib/duckdb/value.rb

+      # @param value [String] the string value.
+      # @return [DuckDB::Value] the created Value object.
+      # @raise [ArgumentError] if +value+ is not a String.
+      def create_varchar(value)


What happens when "\x00\x01\x02\xff".b (binary data) is passed? What about strings in other encodings?

fix: 5976ec4
Thank you for reviewing it!
I checked the DuckDB C API documentation and found that duckdb_create_varchar_length returns nullptr when the input is not valid UTF-8.

https://github.com/duckdb/duckdb/blob/161b7454005b8e5bb6d68e5ae59b548928f83b9f/src/include/duckdb.h#L2408-L2415

So we definitely need to validate the encoding.
I added check_utf8_compatible! which accepts only UTF-8 and US-ASCII and raises ArgumentError for other encodings like binary data ("\x00\x01\x02\xff".b), Shift_JIS strings, and so on.
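
As a rough illustration of why an allow-list of UTF-8 and US-ASCII is reasonable, the sketch below reinterprets the bytes of a few sample strings as UTF-8. The helper name `valid_as_utf8?` is hypothetical and is not the gem's `check_utf8_compatible!`; it only shows that US-ASCII bytes are already valid UTF-8, while Shift_JIS and arbitrary binary bytes generally are not.

```ruby
# Hypothetical helper: would these exact bytes be valid if read as UTF-8?
def valid_as_utf8?(str)
  str.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

ascii  = 'duck'.encode(Encoding::US_ASCII)      # 7-bit text
sjis   = "\u3042".encode(Encoding::Shift_JIS)   # hiragana "a" as Shift_JIS bytes
binary = "\x00\x01\x02\xff".b                   # the binary sample from the question

puts ascii.encoding   # US-ASCII
puts binary.encoding  # ASCII-8BIT or BINARY, depending on the Ruby version

puts valid_as_utf8?(ascii)   # true
puts valid_as_utf8?(sjis)    # false
puts valid_as_utf8?(binary)  # false
```

Rejecting the non-allow-listed encodings in Ruby gives callers a clear `ArgumentError` rather than leaving them to handle a null result from the C layer.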

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

lib/duckdb/value.rb (1)
170-173: ⚠️ Potential issue | 🟡 Minor

Update @raise docs to include encoding validation failures.

Line 172 documents only type errors, but this method also raises ArgumentError for non UTF-8/US-ASCII strings via check_utf8_compatible!. Please document that behavior in the YARD comment.
Proposed doc update
-      # @raise [ArgumentError] if +value+ is not a String.
+      # @raise [ArgumentError] if +value+ is not a String, or if its encoding is not UTF-8/US-ASCII.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/duckdb/value.rb` around lines 170 - 173, Update the YARD `@raise`
documentation for the method create_varchar to mention that it can raise
ArgumentError not only for non-String types but also when the string fails
encoding validation via check_utf8_compatible! (e.g., non UTF-8/US-ASCII data);
locate the create_varchar method and expand the `@raise` line to include both the
type error and encoding validation failure, referencing check_utf8_compatible!
in the description for clarity.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/duckdb/value.rb`:
- Around line 175-176: The UTF-8 compatibility check in check_utf8_compatible!
currently only inspects the string's encoding label and can let strings with
invalid UTF-8 bytes through; update check_utf8_compatible! to also call
value.valid_encoding? (i.e., require both value.encoding == Encoding::UTF_8 and
value.valid_encoding?) before allowing _create_varchar(value) to proceed so
invalid UTF-8 byte sequences are rejected at the Ruby layer with a clear error.
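
The gap this comment describes can be reproduced with core Ruby alone. In the sketch below, the two predicate names are hypothetical and simply contrast a label-only check with a check that also inspects the bytes; neither is the gem's actual implementation.

```ruby
# Label-only check: looks at the encoding tag and nothing else.
def label_only_ok?(str)
  str.encoding == Encoding::UTF_8
end

# Label + bytes check: also requires the byte sequence to be well-formed.
def label_and_bytes_ok?(str)
  str.encoding == Encoding::UTF_8 && str.valid_encoding?
end

# Bytes that are not valid UTF-8, tagged as UTF-8 anyway.
mislabeled = "\xff\xfe".b.force_encoding(Encoding::UTF_8)

puts label_only_ok?(mislabeled)      # true
puts label_and_bytes_ok?(mislabeled) # false
```

`String#force_encoding` changes only the tag, never the bytes, which is why a string can claim to be UTF-8 while containing sequences that a UTF-8 consumer would reject.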



ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 17ee2e7f-06b2-46f8-b609-c09dfc5f0711

📥 Commits

Reviewing files that changed from the base of the PR and between 0dd6338 and 3f83a12.

📒 Files selected for processing (2)

lib/duckdb/value.rb
test/duckdb_test/value_test.rb

🚧 Files skipped from review as they are similar to previous changes (1)

test/duckdb_test/value_test.rb

lib/duckdb/value.rb

refs: suketaGH-695

The DuckDB C API `duckdb_create_varchar_length` returns nullptr when the input is not valid UTF-8. Add `check_utf8_compatible!` validation in the Ruby layer to raise ArgumentError for non-UTF-8 strings (e.g., binary data, Shift_JIS) and invalid UTF-8 byte sequences before passing them to the C layer.

suketa

Thank you.

coderabbitai bot reviewed Apr 9, 2026

suketa reviewed Apr 9, 2026

coderabbitai bot reviewed Apr 9, 2026

otegami force-pushed the feature/value-create-varchar branch from 3f83a12 to 5976ec4 Compare April 9, 2026 12:19

suketa approved these changes Apr 9, 2026

suketa merged commit a1081f1 into suketa:main Apr 9, 2026
41 checks passed

suketa merged 2 commits into suketa:main from otegami:feature/value-create-varchar
