Skip to content

implement DuckDB::InstanceCache#get_or_create#939

Merged
suketa merged 10 commits intomainfrom
instance_cache_get_or_create
May 10, 2025
Merged

implement DuckDB::InstanceCache#get_or_create#939
suketa merged 10 commits intomainfrom
instance_cache_get_or_create

Conversation

@suketa
Copy link
Owner

@suketa suketa commented Apr 12, 2025

  • DuckDB::InstanceCache#get_or_create is wrapping C-API duckdb_get_or_create_from_cache.

Summary by CodeRabbit

  • New Features
    • Introduced functionality to create database objects from existing handles.
    • Implemented a caching mechanism for efficient retrieval or creation of database instances with customizable options.
    • Added a new class to manage database instances with keyword arguments for path and configuration.
  • Bug Fixes
    • Enhanced error handling for database instance creation failures.
  • Tests
    • Added comprehensive tests for the get_or_create method under various scenarios to ensure reliability.

- `DuckDB::InstanceCache#get_or_create` is wrapping C-API `duckdb_get_or_create_from_cache`.
@coderabbitai
Copy link

coderabbitai bot commented Apr 12, 2025

Walkthrough

This pull request adds a new function to create a Ruby database object from an existing DuckDB handle and implements a method to retrieve or create a DuckDB database instance from a cache with optional path and configuration arguments. It introduces the DuckDB::InstanceCache Ruby class with a get method, updates the Ruby interface to include this class, and adds comprehensive tests for the new caching functionality. The Windows GitHub Actions workflow is also updated to exclude testing on the 'head' Ruby version.

Changes

File(s) Change Summary
ext/duckdb/database.c
ext/duckdb/database.h
Added new function rbduckdb_create_database_obj to create a Ruby object from a DuckDB database handle, including its declaration and implementation.
ext/duckdb/instance_cache.c Added method duckdb_instance_cache_get_or_create to retrieve or create a database instance from cache, handling optional path and config arguments, error checking, and registered as get_or_create.
lib/duckdb/instance_cache.rb Added new DuckDB::InstanceCache class with a get method to manage database instances by path and config, delegating to the native extension method for retrieval or creation.
lib/duckdb.rb Added require "duckdb/instance_cache" to load the new InstanceCache class.
test/duckdb_test/instance_cache_test.rb Added multiple tests in InstanceCacheTest to verify get_or_create behavior with various arguments including path, empty path, memory, config, and multi-threading, ensuring correct instance creation.
CHANGELOG.md Documented the addition of the DuckDB::InstanceCache class, highlighting new instance caching functionality.
.github/workflows/test_on_windows.yml Modified Ruby version matrix to exclude the 'head' Ruby version for Windows tests, retaining other Ruby versions and Windows variants.

Sequence Diagram(s)

sequenceDiagram
    participant RubyApp
    participant rbduckdb_create_database_obj
    participant Allocator
    participant RubyDuckDB
    
    RubyApp->>rbduckdb_create_database_obj: Call with duckdb_database handle
    rbduckdb_create_database_obj->>Allocator: Allocate cDuckDBDatabase object
    Allocator-->>rbduckdb_create_database_obj: Return allocated object
    rbduckdb_create_database_obj->>RubyDuckDB: Retrieve RubyDuckDB structure
    rbduckdb_create_database_obj->>RubyDuckDB: Assign provided duckdb_database to db field
    rbduckdb_create_database_obj->>RubyApp: Return new Ruby DB object
Loading
sequenceDiagram
    participant RubyApp
    participant InstanceCache
    participant ArgumentProcessor
    participant DuckDBCache
    
    RubyApp->>InstanceCache: get_or_create(vpath, vconfig)
    InstanceCache->>ArgumentProcessor: Validate & parse arguments
    ArgumentProcessor->>DuckDBCache: Request DB instance (or create)
    DuckDBCache-->>ArgumentProcessor: Return DB instance or error
    ArgumentProcessor-->>InstanceCache: Pass result or raise error
    InstanceCache->>RubyApp: Return DB object
Loading

Poem

I'm a hopping coder, a rabbit in the night,
Celebrating new functions shining bright.
With DuckDB objects crafted sly and neat,
And caches that cache with a rhythmic beat.
My ears twitch with every line of code,
A joyful leap on our development road! 🐇

Tip

⚡️ Faster reviews with caching
  • CodeRabbit now supports caching for code and dependencies, helping speed up reviews. This means quicker feedback, reduced wait times, and a smoother review experience overall. Cached data is encrypted and stored securely. This feature will be automatically enabled for all accounts on May 16th. To opt out, configure Review - Disable Cache at either the organization or repository level. If you prefer to disable all data retention across your organization, simply turn off the Data Retention setting under your Organization Settings.

Enjoy the performance boost—your workflow just got faster.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5de5ba3 and 9103361.

📒 Files selected for processing (1)
  • .github/workflows/test_on_windows.yml (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .github/workflows/test_on_windows.yml
⏰ Context from checks skipped due to timeout of 90000ms (17)
  • GitHub Check: test (mswin, 1.1.3)
  • GitHub Check: test (mswin, 1.1.1)
  • GitHub Check: test (ucrt, 1.2.2)
  • GitHub Check: test (3.2.6, 1.1.1)
  • GitHub Check: test (mswin, 1.2.2)
  • GitHub Check: test (3.3.6, 1.1.3)
  • GitHub Check: test (ucrt, 1.1.3)
  • GitHub Check: test (3.3.6, 1.1.1)
  • GitHub Check: test (3.1.6, 1.1.1)
  • GitHub Check: test (3.4.1, 1.1.3)
  • GitHub Check: test (3.3.6, 1.2.2)
  • GitHub Check: test (3.4.1, 1.2.2)
  • GitHub Check: test (3.4.1, 1.1.1)
  • GitHub Check: test (3.2.6, 1.1.3)
  • GitHub Check: test (3.1.6, 1.1.3)
  • GitHub Check: test (3.2.6, 1.2.2)
  • GitHub Check: test (3.1.6, 1.2.2)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 91425fc and 7221838.

📒 Files selected for processing (3)
  • ext/duckdb/database.c (1 hunks)
  • ext/duckdb/database.h (1 hunks)
  • ext/duckdb/instance_cache.c (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
ext/duckdb/instance_cache.c (2)
ext/duckdb/database.h (1)
  • rbduckdb_create_database_obj (11-11)
ext/duckdb/database.c (1)
  • rbduckdb_create_database_obj (121-127)
⏰ Context from checks skipped due to timeout of 90000ms (20)
  • GitHub Check: test (head, 1.1.3)
  • GitHub Check: test (3.4.1, 1.2.2)
  • GitHub Check: test (3.2.6, 1.1.3)
  • GitHub Check: test (head, 1.1.3)
  • GitHub Check: test (head, 1.2.2)
  • GitHub Check: test (mswin, 1.1.1)
  • GitHub Check: test (ucrt, 1.1.3)
  • GitHub Check: test (ucrt, 1.1.1)
  • GitHub Check: test (ucrt, 1.2.2)
  • GitHub Check: test (mingw, 1.1.1)
  • GitHub Check: test (3.4.1, 1.1.1)
  • GitHub Check: test (mingw, 1.2.2)
  • GitHub Check: test (3.2.6, 1.1.1)
  • GitHub Check: test (3.3.6, 1.1.3)
  • GitHub Check: test (3.1.6, 1.2.2)
  • GitHub Check: test (3.2.6, 1.2.2)
  • GitHub Check: test (3.1.6, 1.1.1)
  • GitHub Check: test (3.3.6, 1.2.2)
  • GitHub Check: test (3.1.6, 1.1.3)
  • GitHub Check: test (3.1.6, 1.2.2)
🔇 Additional comments (4)
ext/duckdb/database.h (1)

11-11: Clean function declaration for database object creation.

This new function declaration follows the existing coding style and naming conventions. It will allow creating Ruby database objects from existing DuckDB handles, which complements the new instance cache functionality.

ext/duckdb/database.c (1)

121-127: Good implementation of database object creation.

The implementation correctly allocates a new Ruby object, associates it with the provided DuckDB database handle, and returns the object. This function follows the established patterns in the codebase.

ext/duckdb/instance_cache.c (2)

10-10: Appropriate function prototype for get_or_create method.

The function signature correctly uses the variable arguments approach (int argc, VALUE *argv) which is consistent with Ruby's C extension conventions.


101-101: Correct method registration for get_or_create.

The registration properly uses -1 to indicate variable arguments, which matches the implementation.

Comment on lines 50 to 81
static VALUE duckdb_instance_cache_get_or_create(int argc, VALUE *argv, VALUE self) {
VALUE vpath = Qnil;
VALUE vconfig = Qnil;
const char *path = NULL;
char *error = NULL;
duckdb_config config = NULL;
duckdb_database db;
rubyDuckDBInstanceCache *ctx;

rb_scan_args(argc, argv, "02", &vpath, &vconfig);
if (!NIL_P(vpath)) {
path = StringValuePtr(vpath);
}
if (!NIL_P(vconfig)) {
if (!rb_obj_is_kind_of(vconfig, cDuckDBConfig)) {
rb_raise(rb_eTypeError, "The second argument must be DuckDB::Config object.");
}
rubyDuckDBConfig *ctx_config = get_struct_config(vconfig);
config = ctx_config->config;
}

TypedData_Get_Struct(self, rubyDuckDBInstanceCache, &instance_cache_data_type, ctx);

if (duckdb_get_or_create_from_cache(ctx->instance_cache, path, &db, config, &error) == DuckDBError) {
if (error) {
rb_raise(eDuckDBError, "%s", error);
} else {
rb_raise(eDuckDBError, "Failed to get or create database from instance cache");
}
}
return rbduckdb_create_database_obj(db);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Missing type checking for path argument and RDoc documentation.

The implementation looks solid overall, but has a few areas for improvement:

  1. There's no check that vpath is a String before calling StringValuePtr().
  2. The method lacks RDoc documentation explaining its purpose and parameters.
  3. Potential memory leak if error is allocated by DuckDB but not freed after use.

Consider adding these improvements:

+/*
+ * call-seq:
+ *   instance_cache.get_or_create -> DuckDB::Database
+ *   instance_cache.get_or_create(path) -> DuckDB::Database
+ *   instance_cache.get_or_create(path, config) -> DuckDB::Database
+ *
+ * Gets an existing database from the instance cache or creates a new one.
+ * The path and config are optional. If provided, path must be a String and
+ * config must be a DuckDB::Config object.
+ */
 static VALUE duckdb_instance_cache_get_or_create(int argc, VALUE *argv, VALUE self) {
     VALUE vpath = Qnil;
     VALUE vconfig = Qnil;
     const char *path = NULL;
     char *error = NULL;
     duckdb_config config = NULL;
     duckdb_database db;
     rubyDuckDBInstanceCache *ctx;
 
     rb_scan_args(argc, argv, "02", &vpath, &vconfig);
     if (!NIL_P(vpath)) {
+        Check_Type(vpath, T_STRING);
         path = StringValuePtr(vpath);
     }
     if (!NIL_P(vconfig)) {
         if (!rb_obj_is_kind_of(vconfig, cDuckDBConfig)) {
             rb_raise(rb_eTypeError, "The second argument must be DuckDB::Config object.");
         }
         rubyDuckDBConfig *ctx_config = get_struct_config(vconfig);
         config = ctx_config->config;
     }
 
     TypedData_Get_Struct(self, rubyDuckDBInstanceCache, &instance_cache_data_type, ctx);
+    if (ctx->instance_cache == NULL) {
+        rb_raise(eDuckDBError, "InstanceCache has been destroyed");
+    }
 
     if (duckdb_get_or_create_from_cache(ctx->instance_cache, path, &db, config, &error) == DuckDBError) {
         if (error) {
-            rb_raise(eDuckDBError, "%s", error);
+            VALUE message = rb_str_new_cstr(error);
+            duckdb_free(error);
+            rb_raise(eDuckDBError, "%s", StringValuePtr(message));
         } else {
             rb_raise(eDuckDBError, "Failed to get or create database from instance cache");
         }
     }
     return rbduckdb_create_database_obj(db);
 }

Note: Please verify that duckdb_free is the correct function to free memory allocated by DuckDB. If there's a different API function for this purpose, use that instead.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
static VALUE duckdb_instance_cache_get_or_create(int argc, VALUE *argv, VALUE self) {
VALUE vpath = Qnil;
VALUE vconfig = Qnil;
const char *path = NULL;
char *error = NULL;
duckdb_config config = NULL;
duckdb_database db;
rubyDuckDBInstanceCache *ctx;
rb_scan_args(argc, argv, "02", &vpath, &vconfig);
if (!NIL_P(vpath)) {
path = StringValuePtr(vpath);
}
if (!NIL_P(vconfig)) {
if (!rb_obj_is_kind_of(vconfig, cDuckDBConfig)) {
rb_raise(rb_eTypeError, "The second argument must be DuckDB::Config object.");
}
rubyDuckDBConfig *ctx_config = get_struct_config(vconfig);
config = ctx_config->config;
}
TypedData_Get_Struct(self, rubyDuckDBInstanceCache, &instance_cache_data_type, ctx);
if (duckdb_get_or_create_from_cache(ctx->instance_cache, path, &db, config, &error) == DuckDBError) {
if (error) {
rb_raise(eDuckDBError, "%s", error);
} else {
rb_raise(eDuckDBError, "Failed to get or create database from instance cache");
}
}
return rbduckdb_create_database_obj(db);
}
/*
* call-seq:
* instance_cache.get_or_create -> DuckDB::Database
* instance_cache.get_or_create(path) -> DuckDB::Database
* instance_cache.get_or_create(path, config) -> DuckDB::Database
*
* Gets an existing database from the instance cache or creates a new one.
* The path and config are optional. If provided, path must be a String and
* config must be a DuckDB::Config object.
*/
static VALUE duckdb_instance_cache_get_or_create(int argc, VALUE *argv, VALUE self) {
VALUE vpath = Qnil;
VALUE vconfig = Qnil;
const char *path = NULL;
char *error = NULL;
duckdb_config config = NULL;
duckdb_database db;
rubyDuckDBInstanceCache *ctx;
rb_scan_args(argc, argv, "02", &vpath, &vconfig);
if (!NIL_P(vpath)) {
Check_Type(vpath, T_STRING);
path = StringValuePtr(vpath);
}
if (!NIL_P(vconfig)) {
if (!rb_obj_is_kind_of(vconfig, cDuckDBConfig)) {
rb_raise(rb_eTypeError, "The second argument must be DuckDB::Config object.");
}
rubyDuckDBConfig *ctx_config = get_struct_config(vconfig);
config = ctx_config->config;
}
TypedData_Get_Struct(self, rubyDuckDBInstanceCache, &instance_cache_data_type, ctx);
if (ctx->instance_cache == NULL) {
rb_raise(eDuckDBError, "InstanceCache has been destroyed");
}
if (duckdb_get_or_create_from_cache(ctx->instance_cache, path, &db, config, &error) == DuckDBError) {
if (error) {
VALUE message = rb_str_new_cstr(error);
duckdb_free(error);
rb_raise(eDuckDBError, "%s", StringValuePtr(message));
} else {
rb_raise(eDuckDBError, "Failed to get or create database from instance cache");
}
}
return rbduckdb_create_database_obj(db);
}

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
test/duckdb_test/instance_cache_test.rb (2)

14-30: Well-structured concurrency test, with one suggestion.

The test effectively checks multi-threaded access to the same database path through the cache. Good job ensuring proper cleanup in each iteration.

Consider adding an assertion to verify that calling get_or_create with the same path returns the same database instance, which would validate the caching behavior.

 def test_get_or_create
   cache = DuckDB::InstanceCache.new
   path = 'test_shared_db.db'
   30.times do
+    db1 = cache.get_or_create(path)
+    db2 = cache.get_or_create(path)
+    assert_same db1, db2, "Cache should return same instance for the same path"
     thread = Thread.new do
       db = cache.get_or_create(path)
       assert_instance_of DuckDB::Database, db
       db.close
     end
     db = cache.get_or_create(path)
     assert_instance_of DuckDB::Database, db
     db.close
     thread.join

     FileUtils.rm_f(path)
   end
 end

14-51: Consider adding error case testing.

The current tests verify the happy path scenarios, but there's no test for potential error conditions. Consider adding tests for:

  1. Invalid paths or configurations
  2. Error handling when the database cannot be created
  3. Behavior when trying to get a database that was previously closed
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd16d3a and e7d2625.

📒 Files selected for processing (1)
  • test/duckdb_test/instance_cache_test.rb (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (19)
  • GitHub Check: test (head, 1.1.3)
  • GitHub Check: test (head, 1.1.1)
  • GitHub Check: test (3.4.1, 1.1.3)
  • GitHub Check: test (mingw, 1.1.1)
  • GitHub Check: test (mingw, 1.1.3)
  • GitHub Check: test (mswin, 1.1.3)
  • GitHub Check: test (3.4.1, 1.2.2)
  • GitHub Check: test (mswin, 1.1.1)
  • GitHub Check: test (ucrt, 1.1.3)
  • GitHub Check: test (3.4.1, 1.1.1)
  • GitHub Check: test (3.2.6, 1.1.1)
  • GitHub Check: test (3.3.6, 1.1.1)
  • GitHub Check: test (3.3.6, 1.1.3)
  • GitHub Check: test (3.2.6, 1.1.3)
  • GitHub Check: test (3.3.6, 1.2.2)
  • GitHub Check: test (3.1.6, 1.1.3)
  • GitHub Check: test (3.1.6, 1.2.2)
  • GitHub Check: test (3.1.6, 1.1.1)
  • GitHub Check: test (3.2.6, 1.2.2)
🔇 Additional comments (4)
test/duckdb_test/instance_cache_test.rb (4)

4-4: Good addition of FileUtils for test cleanup.

The addition of the FileUtils module is appropriate for managing the test database file cleanup at line 28.


32-37: Good test for default behavior without path.

This test properly verifies that get_or_create works correctly when called without any path argument.


39-44: Good edge case test with empty path.

Testing with an empty string path is a good edge case to include.


46-51: Good test for in-memory database creation.

Testing with the special :memory: path ensures compatibility with in-memory database functionality.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
test/duckdb_test/instance_cache_test.rb (3)

14-30: Thread safety testing looks good, but could be enhanced.

The test correctly checks basic thread safety by creating threads that access the same database path. However, there's no verification that the same database instance is being returned for the same path, which is a key aspect of the caching functionality.

Consider adding an assertion to verify cache behavior by checking object identity:

 def test_get_or_create
   cache = DuckDB::InstanceCache.new
   path = 'test_shared_db.db'
+  first_db = nil
   30.times do
+    # Store the first database instance for comparison
+    first_db ||= cache.get_or_create(path)
+    
     thread = Thread.new do
       db = cache.get_or_create(path)
       assert_instance_of DuckDB::Database, db
+      # Verify we get the same instance from the cache
+      assert_same first_db, db
       db.close
     end
     db = cache.get_or_create(path)
     assert_instance_of DuckDB::Database, db
+    # Verify we get the same instance from the cache
+    assert_same first_db, db
     db.close
     thread.join

     FileUtils.rm_f(path)
   end
 end

53-67: Good comprehensive test for configuration.

This test is well-structured and tests important functionality:

  1. Creates a configuration with non-default settings
  2. Creates a database with the configuration
  3. Verifies the configuration is applied by checking query behavior
  4. Properly cleans up resources

One small suggestion to make this test more robust:

 def test_get_or_create_with_config
   cache = DuckDB::InstanceCache.new
   config = DuckDB::Config.new
   config['default_order'] = 'DESC'
   db = cache.get_or_create(nil, config)
   con = db.connect
   con.query('CREATE TABLE numbers (number INTEGER)')
   con.query('INSERT INTO numbers VALUES (2), (1), (4), (3)')

   result = con.query('SELECT number FROM numbers ORDER BY number')
   assert_equal(4, result.first.first)
+  
+  # Additional verification to ensure all results are in descending order
+  expected = [4, 3, 2, 1]
+  actual = result.map { |row| row.first }
+  assert_equal expected, actual
+  
   con.close
   db.close
   cache.destroy
 end

14-67: Consider standardizing resource cleanup across tests.

All tests properly close database connections, but only test_get_or_create removes database files and only test_get_or_create_with_config calls cache.destroy. Consider standardizing cleanup approach across all tests to ensure consistent resource management.

You could add a setup and teardown method to ensure consistent cleanup:

def setup
  @cache = DuckDB::InstanceCache.new
  @db_path = 'test_shared_db.db'
end

def teardown
  FileUtils.rm_f(@db_path) if File.exist?(@db_path)
  @cache.destroy if @cache
end

Then update tests to use these instance variables.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e7d2625 and 14ed640.

📒 Files selected for processing (1)
  • test/duckdb_test/instance_cache_test.rb (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (20)
  • GitHub Check: test (mswin, 1.2.2)
  • GitHub Check: test (ucrt, 1.1.3)
  • GitHub Check: test (head, 1.1.3)
  • GitHub Check: test (mingw, 1.1.1)
  • GitHub Check: test (mingw, 1.1.3)
  • GitHub Check: test (ucrt, 1.1.1)
  • GitHub Check: test (ucrt, 1.2.2)
  • GitHub Check: test (3.4.1, 1.1.3)
  • GitHub Check: test (mingw, 1.2.2)
  • GitHub Check: test (3.2.6, 1.1.1)
  • GitHub Check: test (3.3.6, 1.1.3)
  • GitHub Check: test (3.4.1, 1.1.1)
  • GitHub Check: test (3.2.6, 1.1.3)
  • GitHub Check: test (3.2.6, 1.2.2)
  • GitHub Check: test (3.3.6, 1.1.1)
  • GitHub Check: test (3.3.6, 1.2.2)
  • GitHub Check: test (3.1.6, 1.1.3)
  • GitHub Check: test (3.1.6, 1.1.1)
  • GitHub Check: test (3.1.6, 1.2.2)
  • GitHub Check: test (head, 1.2.2)
🔇 Additional comments (4)
test/duckdb_test/instance_cache_test.rb (4)

4-4: Adding FileUtils module - good addition.

The FileUtils module is required for file management operations, specifically for removing the test database files. This is a necessary addition since the tests will be creating physical database files that need to be cleaned up after testing.


32-37: Good test for default behavior.

This test properly verifies that get_or_create works correctly when called without a path parameter. It confirms that a valid DuckDB::Database instance is returned and properly cleans up by closing the connection.


39-44: Good test for empty path.

This test properly verifies that get_or_create works correctly when called with an empty string as the path. It confirms that a valid DuckDB::Database instance is returned and properly closes the connection afterward.


46-51: Good test for in-memory database.

The test correctly verifies that get_or_create works with the special ':memory:' path parameter which should create an in-memory database. It confirms that a valid DuckDB::Database instance is returned and properly closes the connection afterward.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
lib/duckdb/instance_cache.rb (2)

14-16: Add documentation for the get_or_create method

The get method calls an undocumented get_or_create method that appears to be implemented in the C extension. Consider adding a comment to indicate this, making the code more maintainable:

def get(path: nil, config: nil)
+  # Calls the C extension method implemented in ext/duckdb/instance_cache.c
  get_or_create(path, config)
end

Additionally, consider adding error handling for potential failures in the C extension method call.


5-18: Consider adding instance variable initialization

The InstanceCache class doesn't appear to have any instance variables or initialization logic. If the class needs to maintain state between calls to get, consider adding an initializer method:

class InstanceCache
  def initialize
    # Initialize any necessary instance variables
  end
  
  def get(path: nil, config: nil)
    get_or_create(path, config)
  end
end

If no state is needed, this might be a good candidate for class methods instead of instance methods.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 14ed640 and a53492d.

📒 Files selected for processing (1)
  • lib/duckdb/instance_cache.rb (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (18)
  • GitHub Check: test (3.4.1, 1.2.2)
  • GitHub Check: test (3.1.6, 1.2.2)
  • GitHub Check: test (ucrt, 1.2.2)
  • GitHub Check: test (mswin, 1.1.1)
  • GitHub Check: test (head, 1.1.1)
  • GitHub Check: test (3.3.6, 1.2.2)
  • GitHub Check: test (head, 1.2.2)
  • GitHub Check: test (3.1.6, 1.2.2)
  • GitHub Check: test (3.2.7, 1.1.3)
  • GitHub Check: test (3.3.6, 1.1.1)
  • GitHub Check: test (3.4.1, 1.1.3)
  • GitHub Check: test (3.4.1, 1.1.1)
  • GitHub Check: test (3.3.6, 1.1.3)
  • GitHub Check: test (3.2.6, 1.1.3)
  • GitHub Check: test (3.1.6, 1.1.1)
  • GitHub Check: test (3.2.6, 1.2.2)
  • GitHub Check: test (3.2.7, 1.2.2)
  • GitHub Check: test (3.1.6, 1.1.3)

Comment on lines 3 to 20
if defined?(DuckDB::InstanceCache)

module DuckDB
# The DuckDB::InstanceCache is necessary if a client/program (re)opens
# multiple databases to the same file within the same statement.
#
# require 'duckdb'
# cache = DuckDB::InstanceCache.new
# db1 = cache.get(path: 'db.duckdb')
# db2 = cache.get(path: 'db.duckdb')
class InstanceCache
def get(path: nil, config: nil)
get_or_create(path, config)
end
end
end

end
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Reconsider the conditional block structure

The current implementation wraps the entire module definition in an if defined?(DuckDB::InstanceCache) check. This is counterintuitive, as it means the class will only be defined if it's already defined elsewhere. Typically, this pattern is used to prevent redefinition, but the condition should be negated:

-if defined?(DuckDB::InstanceCache)
+unless defined?(DuckDB::InstanceCache)

This ensures the class is only defined if it doesn't already exist, preventing potential redefinition issues.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if defined?(DuckDB::InstanceCache)
module DuckDB
# The DuckDB::InstanceCache is necessary if a client/program (re)opens
# multiple databases to the same file within the same statement.
#
# require 'duckdb'
# cache = DuckDB::InstanceCache.new
# db1 = cache.get(path: 'db.duckdb')
# db2 = cache.get(path: 'db.duckdb')
class InstanceCache
def get(path: nil, config: nil)
get_or_create(path, config)
end
end
end
end
unless defined?(DuckDB::InstanceCache)
module DuckDB
# The DuckDB::InstanceCache is necessary if a client/program (re)opens
# multiple databases to the same file within the same statement.
#
# require 'duckdb'
# cache = DuckDB::InstanceCache.new
# db1 = cache.get(path: 'db.duckdb')
# db2 = cache.get(path: 'db.duckdb')
class InstanceCache
def get(path: nil, config: nil)
get_or_create(path, config)
end
end
end
end

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
CHANGELOG.md (1)

5-5: Clarify the new caching functionality in the changelog entry.

The changelog entry currently reads:

- implement `DuckDB::InstanceCache` class.

Since the PR focuses on adding the DuckDB::InstanceCache#get_or_create method—which wraps the C-API function duckdb_get_or_create_from_cache—it would be more informative to include this detail. Consider updating the entry to something like:

- implement `DuckDB::InstanceCache` class.
+ implement `DuckDB::InstanceCache` class with `#get_or_create` method as a wrapper for `duckdb_get_or_create_from_cache`.

This update helps users quickly understand the specific functionality added by this PR.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ea4b6c2 and a61592b.

📒 Files selected for processing (1)
  • CHANGELOG.md (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (19)
  • GitHub Check: test (head, 1.2.2)
  • GitHub Check: test (head, 1.1.3)
  • GitHub Check: test (3.1.6, 1.1.3)
  • GitHub Check: test (head, 1.2.2)
  • GitHub Check: test (head, 1.1.3)
  • GitHub Check: test (ucrt, 1.2.2)
  • GitHub Check: test (mingw, 1.1.3)
  • GitHub Check: test (3.3.6, 1.1.3)
  • GitHub Check: test (mingw, 1.2.2)
  • GitHub Check: test (ucrt, 1.1.3)
  • GitHub Check: test (3.1.6, 1.1.3)
  • GitHub Check: test (3.4.1, 1.2.2)
  • GitHub Check: test (3.2.6, 1.1.1)
  • GitHub Check: test (3.3.6, 1.2.2)
  • GitHub Check: test (3.2.6, 1.2.2)
  • GitHub Check: test (3.2.6, 1.1.3)
  • GitHub Check: test (3.3.6, 1.1.1)
  • GitHub Check: test (3.1.6, 1.1.1)
  • GitHub Check: test (3.1.6, 1.2.2)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
lib/duckdb/instance_cache.rb (1)

3-26: ⚠️ Potential issue

Incorrect conditional check for module definition

The current implementation wraps the entire module definition in an if defined?(DuckDB::InstanceCache) check. This is counterintuitive, as it means the class will only be defined if it's already defined elsewhere. Typically, this pattern is used to prevent redefinition, but the condition should be negated:

-if defined?(DuckDB::InstanceCache)
+unless defined?(DuckDB::InstanceCache)

This ensures the class is only defined if it doesn't already exist, preventing potential redefinition issues.

🧹 Nitpick comments (3)
lib/duckdb/instance_cache.rb (3)

14-16: Inconsistent method documentation for return value

The :call-seq: documentation indicates the method returns self, but according to the description on line 17, it should return a DuckDB::Database object.

-  #   instance_cache.get(path:, config:) -> self
+  #   instance_cache.get(path:, config:) -> DuckDB::Database

Ensure documentation accurately reflects the actual return type.


20-22: Missing documentation for native method dependency

The get method delegates to get_or_create, which appears to be a native method implemented in the C extension but isn't defined in this Ruby file. Consider adding a comment to make this dependency clear:

def get(path: nil, config: nil)
+  # Delegates to native method implemented in ext/duckdb/instance_cache.c
  get_or_create(path, config)
end

This improves code maintainability by explicitly documenting that get_or_create is a native method.


1-27: Implementation looks good with minor documentation improvements needed

The class implementation is clean and well-documented, with good usage examples. The class effectively serves its purpose of providing a cache for database instances, which is useful when multiple databases are opened to the same file.

Consider adding more details to the class documentation about:

  1. Thread safety considerations
  2. When to use this vs. directly creating database instances
  3. Any performance implications

This would provide more context for users of this class.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a61592b and 5de5ba3.

📒 Files selected for processing (2)
  • ext/duckdb/instance_cache.c (3 hunks)
  • lib/duckdb/instance_cache.rb (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • ext/duckdb/instance_cache.c
⏰ Context from checks skipped due to timeout of 90000ms (21)
  • GitHub Check: test (mswin, 1.1.1)
  • GitHub Check: test (3.2.7, 1.1.3)
  • GitHub Check: test (3.4.2, 1.1.3)
  • GitHub Check: test (3.3.7, 1.1.1)
  • GitHub Check: test (3.3.7, 1.1.3)
  • GitHub Check: test (3.1.6, 1.1.1)
  • GitHub Check: test (head, 1.2.2)
  • GitHub Check: test (3.3.7, 1.1.1)
  • GitHub Check: test (head, 1.1.1)
  • GitHub Check: test (3.2.7, 1.1.3)
  • GitHub Check: test (head, 1.1.3)
  • GitHub Check: test (3.3.7, 1.1.3)
  • GitHub Check: test (3.1.6, 1.2.2)
  • GitHub Check: test (3.3.7, 1.2.2)
  • GitHub Check: test (3.4.2, 1.1.1)
  • GitHub Check: test (3.4.2, 1.2.2)
  • GitHub Check: test (3.2.7, 1.2.2)
  • GitHub Check: test (3.1.6, 1.1.1)
  • GitHub Check: test (3.4.2, 1.1.3)
  • GitHub Check: test (3.2.7, 1.1.1)
  • GitHub Check: test (3.1.6, 1.1.3)

@suketa suketa merged commit 9d03351 into main May 10, 2025
55 checks passed
@suketa suketa deleted the instance_cache_get_or_create branch May 10, 2025 04:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant