feat: add DuckDB::Value.create_varchar#1254
Conversation
refs: suketaGH-695 Add `DuckDB::Value.create_varchar` to create a Value object for the VARCHAR type by wrapping the following C API: - [duckdb_value duckdb_create_varchar_length(const char *text, idx_t length);](https://duckdb.org/docs/stable/clients/c/api.html#duckdb_create_varchar_length) The method validates that the input is a String before passing it to the C layer.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds a new public factory Changes
Sequence Diagram(s)sequenceDiagram
participant Ruby as Ruby (caller)
participant CExt as C extension (ext/duckdb/value.c)
participant DuckDB as DuckDB C API
Ruby->>Ruby: DuckDB::Value.create_varchar(str)\n(check_type!, check_utf8_compatible!)
Ruby->>CExt: _create_varchar(str)
CExt->>DuckDB: duckdb_create_varchar_length(ptr, len)
DuckDB-->>CExt: duckdb_value (allocated)
CExt-->>Ruby: rbduckdb_value_new(duckdb_value)
Ruby-->>Caller: DuckDB::Value instance
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
test/duckdb_test/value_test.rb (1)
519-527: Consider one extra edge-case round-trip test for string payloads.An additional bind/round-trip case with an embedded null byte would harden regression coverage for tricky string inputs.
✅ Suggested additional test
+ def test_create_varchar_bind_value_with_null_byte + `@con.query`('CREATE TABLE e2e_varchar_null_byte (id INTEGER, val VARCHAR)') + stmt = DuckDB::PreparedStatement.new(`@con`, 'INSERT INTO e2e_varchar_null_byte VALUES (1, ?)') + stmt.bind_value(1, DuckDB::Value.create_varchar("a\0b")) + stmt.execute + result = `@con.query`('SELECT val FROM e2e_varchar_null_byte WHERE id = 1') + + assert_equal("a\0b", result.first[0]) + end🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/duckdb_test/value_test.rb` around lines 519 - 527, Add a new edge-case test alongside test_create_varchar_bind_value that verifies round-trip behavior for strings containing embedded null bytes: create the same table (or reuse e2e_varchar), prepare an INSERT via DuckDB::PreparedStatement, bind a value using DuckDB::Value.create_varchar with a string containing an embedded "\0" (e.g. "Hello\0DuckDB"), execute the statement, SELECT the value back via `@con.query` and assert the returned string equals the original byte-for-byte (including the null). Use the same call sites (DuckDB::PreparedStatement, bind_value, DuckDB::Value.create_varchar, `@con.query`) so the test covers both binding and retrieval.ext/duckdb/value.c (1)
19-19: Align new C helper with extension naming and docs conventions.Please rename the new helper to the
rbduckdb_prefix and add acall-seq:comment block for the function.♻️ Proposed fix
-static VALUE duckdb_value_s__create_varchar(VALUE klass, VALUE str); +static VALUE rbduckdb_value_s__create_varchar(VALUE klass, VALUE str); @@ -static VALUE duckdb_value_s__create_varchar(VALUE klass, VALUE str) { +/* + * call-seq: + * DuckDB::Value._create_varchar(str) -> DuckDB::Value + * + * Internal constructor for VARCHAR values from a Ruby String. + */ +static VALUE rbduckdb_value_s__create_varchar(VALUE klass, VALUE str) { const char *str_ptr = StringValuePtr(str); idx_t str_len = RSTRING_LEN(str); duckdb_value value = duckdb_create_varchar_length(str_ptr, str_len); return rbduckdb_value_new(value); } @@ - rb_define_private_method(rb_singleton_class(cDuckDBValue), "_create_varchar", duckdb_value_s__create_varchar, 1); + rb_define_private_method(rb_singleton_class(cDuckDBValue), "_create_varchar", rbduckdb_value_s__create_varchar, 1);As per coding guidelines,
ext/duckdb/**/*.c: “C symbols must be prefixed withrbduckdb_to avoid namespace conflicts” and “All C functions should use comment blocks withcall-seq:for documentation”.Also applies to: 99-104, 247-247
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ext/duckdb/value.c` at line 19, The new helper function duckdb_value_s__create_varchar should be renamed to follow the extension's C symbol prefix (rename to rbduckdb_value_s__create_varchar) and you must add a documentation comment block above its declaration with a call-seq: line describing the Ruby-facing usage; update both the forward declaration and the definition sites to the new name and adjust any callers (e.g., usages named duckdb_value_s__create_varchar) accordingly, and apply the same rename+call-seq comment convention to the other similar helpers in this file that were flagged (ensure all C symbols use the rbduckdb_ prefix and each function has a call-seq: comment).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@ext/duckdb/value.c`:
- Line 19: The new helper function duckdb_value_s__create_varchar should be
renamed to follow the extension's C symbol prefix (rename to
rbduckdb_value_s__create_varchar) and you must add a documentation comment block
above its declaration with a call-seq: line describing the Ruby-facing usage;
update both the forward declaration and the definition sites to the new name and
adjust any callers (e.g., usages named duckdb_value_s__create_varchar)
accordingly, and apply the same rename+call-seq comment convention to the other
similar helpers in this file that were flagged (ensure all C symbols use the
rbduckdb_ prefix and each function has a call-seq: comment).
In `@test/duckdb_test/value_test.rb`:
- Around line 519-527: Add a new edge-case test alongside
test_create_varchar_bind_value that verifies round-trip behavior for strings
containing embedded null bytes: create the same table (or reuse e2e_varchar),
prepare an INSERT via DuckDB::PreparedStatement, bind a value using
DuckDB::Value.create_varchar with a string containing an embedded "\0" (e.g.
"Hello\0DuckDB"), execute the statement, SELECT the value back via `@con.query`
and assert the returned string equals the original byte-for-byte (including the
null). Use the same call sites (DuckDB::PreparedStatement, bind_value,
DuckDB::Value.create_varchar, `@con.query`) so the test covers both binding and
retrieval.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9aabfe77-a7a2-43f8-8f5b-327a7a03dc92
📒 Files selected for processing (4)
CHANGELOG.mdext/duckdb/value.clib/duckdb/value.rbtest/duckdb_test/value_test.rb
| # @param value [String] the string value. | ||
| # @return [DuckDB::Value] the created Value object. | ||
| # @raise [ArgumentError] if +value+ is not a String. | ||
| def create_varchar(value) |
There was a problem hiding this comment.
What happened when passing "\x00\x01\x02\xff".b (binary data)? How about other encoding strings?
There was a problem hiding this comment.
fix: 5976ec4
Thank you for reviewing it!
I checked the DuckDB C API documentation and found that duckdb_create_varchar_length returns nullptr when the input is not valid UTF-8.
So we definitely need to validate the encoding.
I added check_utf8_compatible! which accepts only UTF-8 and US-ASCII and raises ArgumentError for other encodings like binary data ("\x00\x01\x02\xff".b), Shift_JIS strings, and so on.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
lib/duckdb/value.rb (1)
170-173:⚠️ Potential issue | 🟡 MinorUpdate
@raisedocs to include encoding validation failures.Line 172 documents only type errors, but this method also raises
ArgumentErrorfor non UTF-8/US-ASCII strings viacheck_utf8_compatible!. Please document that behavior in the YARD comment.Proposed doc update
- # `@raise` [ArgumentError] if +value+ is not a String. + # `@raise` [ArgumentError] if +value+ is not a String, or if its encoding is not UTF-8/US-ASCII.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/duckdb/value.rb` around lines 170 - 173, Update the YARD `@raise` documentation for the method create_varchar to mention that it can raise ArgumentError not only for non-String types but also when the string fails encoding validation via check_utf8_compatible! (e.g., non UTF-8/US-ASCII data); locate the create_varchar method and expand the `@raise` line to include both the type error and encoding validation failure, referencing check_utf8_compatible! in the description for clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/duckdb/value.rb`:
- Around line 175-176: The UTF-8 compatibility check in check_utf8_compatible!
currently only inspects the string's encoding label and can let strings with
invalid UTF-8 bytes through; update check_utf8_compatible! to also call
value.valid_encoding? (i.e., require both value.encoding == Encoding::UTF_8 and
value.valid_encoding?) before allowing _create_varchar(value) to proceed so
invalid UTF-8 byte sequences are rejected at the Ruby layer with a clear error.
---
Duplicate comments:
In `@lib/duckdb/value.rb`:
- Around line 170-173: Update the YARD `@raise` documentation for the method
create_varchar to mention that it can raise ArgumentError not only for
non-String types but also when the string fails encoding validation via
check_utf8_compatible! (e.g., non UTF-8/US-ASCII data); locate the
create_varchar method and expand the `@raise` line to include both the type error
and encoding validation failure, referencing check_utf8_compatible! in the
description for clarity.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 17ee2e7f-06b2-46f8-b609-c09dfc5f0711
📒 Files selected for processing (2)
lib/duckdb/value.rbtest/duckdb_test/value_test.rb
🚧 Files skipped from review as they are similar to previous changes (1)
- test/duckdb_test/value_test.rb
refs: suketaGH-695 The DuckDB C API `duckdb_create_varchar_length` returns nullptr when the input is not valid UTF-8. Add `check_utf8_compatible!` validation in the Ruby layer to raise ArgumentError for non-UTF-8 strings (e.g., binary data, Shift_JIS) and invalid UTF-8 byte sequences before passing them to the C layer.
3f83a12 to
5976ec4
Compare
refs: GH-695
Add
DuckDB::Value.create_varcharto create a Value object for the VARCHAR type by wrapping the following C API:The method validates that the input is a String before passing it to the C layer.
Summary by CodeRabbit
New Features
Documentation
Tests