feat(feature-store): add AI/ML Feature Store use case #60

Merged
robfrank merged 2 commits into main from feat/feature-store on Mar 23, 2026

@robfrank
Contributor

Summary

  • Adds new feature-store/ use case demonstrating ArcadeDB as a unified ML feature store
  • Serves three ML domains from a single database: fraud scoring (graph + vector + time-series), product recommendations (collaborative filtering + embeddings), and predictive maintenance (dependency graph + sensor aggregates)
  • 11 query patterns implemented across three runners: curl (queries.sh), Java (arcadedb-network), and JavaScript (pg driver via PostgreSQL protocol)
  • Includes CI workflow (matrix: [curl, java, js]), dependabot entries (maven, npm, docker-compose), design doc, and implementation plan

Files (16 new/modified)

  • feature-store/ — full use case directory (docker-compose, setup.sh, sql/, queries/, java/, js/, README)
  • .github/workflows/feature-store.yml — CI with [curl, java, js] matrix
  • .github/dependabot.yml — 3 new entries (maven, npm, docker-compose)
  • docs/plans/2026-03-23-feature-store-design.md — design document
  • docs/plans/2026-03-23-feature-store.md — implementation plan
  • README.md + CLAUDE.md — updated use cases tables

Test plan

  • docker compose up -d starts ArcadeDB with PostgreSQL plugin (ports 2480 + 5432)
  • ./setup.sh creates FeatureStore database and loads schema + data without errors
  • ./queries/queries.sh returns non-empty results for all 11 queries
  • mvn package && java -jar target/feature-store.jar runs all 11 queries
  • npm install && node feature-store.js runs all 11 queries via PostgreSQL protocol
  • CI workflow passes for all 3 matrix runners

🤖 Generated with Claude Code

Unified ML feature store serving three domains (fraud scoring, product
recommendations, predictive maintenance) from a single ArcadeDB instance.
Demonstrates graph, vector, time-series, and document features with 11
query patterns across curl, Java, and JavaScript (PostgreSQL protocol).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant new AI/ML Feature Store use case, showcasing ArcadeDB's robust multi-model capabilities as a consolidated platform for machine learning feature management. It provides a practical demonstration of how a single database can efficiently handle diverse data requirements for fraud detection, product recommendations, and predictive maintenance. The implementation includes comprehensive examples with various query patterns and client integrations, supported by detailed documentation and automated CI/CD configurations.

Highlights

  • New AI/ML Feature Store Use Case: A new feature-store/ module has been added, demonstrating ArcadeDB as a unified ML feature store capable of serving fraud scoring, product recommendations, and predictive maintenance from a single database.
  • Multi-Client Query Demonstrations: The use case includes 11 distinct query patterns implemented across three different runners: curl via shell scripts, Java using arcadedb-network, and JavaScript utilizing the PostgreSQL protocol.
  • Comprehensive Supporting Infrastructure: The new feature store comes with a dedicated CI workflow, updated Dependabot configurations for its dependencies (Maven, npm, Docker Compose), and detailed design and implementation plan documents.
  • Documentation Updates: The root README.md and CLAUDE.md files have been updated to include the newly added AI/ML Feature Store use case in their respective tables.

Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/feature-store.yml
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@claude

claude Bot commented Mar 23, 2026

Code Review — feat(feature-store): AI/ML Feature Store use case

Overall this is a well-structured addition that follows the repo conventions closely. The schema design, data seeding, and multi-runner CI setup are all solid. A few issues worth addressing before merge:


Bugs / Correctness

1. Query 9 is not recursive — only fetches direct dependents

queries.sh (and Java/JS equivalents):

MATCH {type: Equipment, where: (equipmentId = 'eq1')}
      .in('DEPENDS_ON'){as: dep}

There is no while: condition, so this is a single-hop traversal. With the sample data (eq5 → eq4 → eq1), eq5 won't appear in the results even though the description says "Find all downstream equipment affected if eq1 fails." The design doc showed .in('DEPENDS_ON'){while: ($depth < 5), as: dep}. The while: clause was dropped somewhere during implementation.
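
To make the single-hop versus recursive distinction concrete, here is a minimal in-memory sketch in JavaScript. It is not ArcadeDB code; it only approximates what a depth-bounded `while:` traversal would collect. The edge list encodes the eq5 → eq4 → eq1 chain mentioned above, and the function names `directDependents` and `allDependents` are hypothetical.

```javascript
// DEPENDS_ON edges from the sample data described above: eq5 -> eq4 -> eq1.
// Each pair reads [dependent, dependency].
const dependsOn = [
  ['eq5', 'eq4'],
  ['eq4', 'eq1'],
];

// Single hop: roughly what .in('DEPENDS_ON') without while: returns.
function directDependents(target) {
  return dependsOn.filter(([, dep]) => dep === target).map(([src]) => src);
}

// Bounded breadth-first walk over incoming edges. The depth bound is an
// approximation of while: ($depth < 5), not its exact semantics.
function allDependents(target, maxDepth = 5) {
  const seen = new Set([target]);
  const result = [];
  let frontier = [target];
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const node of frontier) {
      for (const src of directDependents(node)) {
        if (!seen.has(src)) {
          seen.add(src);
          result.push({ equipmentId: src, depth });
          next.push(src);
        }
      }
    }
    frontier = next;
  }
  return result;
}

console.log(directDependents('eq1')); // [ 'eq4' ]
console.log(allDependents('eq1'));    // eq4 at depth 1, eq5 at depth 2
```

The single-hop version misses eq5, which is exactly the gap described in this finding.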

2. Query 8 hardcodes the preference vector — not actually personalized

The design doc specified a Cypher query that joins the User node to read u.preferenceVec dynamically. The implementation uses a hardcoded vector [0.9, 0.1, 0.1, 0.1] in SQL:

ORDER BY vectorNeighbors('Product[embedding]', [0.9, 0.1, 0.1, 0.1], 20) DESC

This only works correctly for u1 whose preferenceVec happens to be [0.9, 0.1, 0.1, 0.1]. The query is mislabeled "Personalized Ranking" — it's a static vector search. Either implement the original Cypher version that reads u.preferenceVec, or rename the query to "Vector Product Search" and document the limitation.
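
As an illustration of what "personalized" would mean here, the sketch below looks up the query vector per user instead of hardcoding it. It is plain JavaScript over in-memory data, not the ArcadeDB vector index. Only u1's preferenceVec comes from this review; the second user, the product embeddings, and the names `cosine` and `rankFor` are assumptions made for the example.

```javascript
// u1's preferenceVec is taken from the review above; everything else is
// made-up illustration data.
const users = {
  u1: { preferenceVec: [0.9, 0.1, 0.1, 0.1] },
  u2: { preferenceVec: [0.1, 0.9, 0.1, 0.1] }, // hypothetical user
};
const products = [
  { name: 'p1', embedding: [1, 0, 0, 0] }, // hypothetical embedding
  { name: 'p2', embedding: [0, 1, 0, 0] }, // hypothetical embedding
];

// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const norm = (v) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Personalized: the query vector is read from the user record at call time.
function rankFor(userId) {
  const user = users[userId];
  if (!user) throw new Error(`unknown user: ${userId}`);
  return products
    .map((p) => ({ name: p.name, score: cosine(user.preferenceVec, p.embedding) }))
    .sort((a, b) => b.score - a.score)
    .map((p) => p.name);
}

console.log(rankFor('u1')); // [ 'p1', 'p2' ]
console.log(rankFor('u2')); // [ 'p2', 'p1' ]
```

With a hardcoded vector, both users would receive the first ordering.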

3. Misleading comment in feature-store.js for Query 8

// ── Query 8: Personalized Ranking (Cypher via {cypher} prefix) ──────────────

The actual implementation uses plain SQL (no {cypher} prefix). The comment is left over from the design doc's original Cypher approach. Should be updated.

4. Query 11 creates duplicate FeatureSnapshot rows on repeated runs

The INSERT in step 4 has no guard against re-insertion:

INSERT INTO FeatureSnapshot SET entityId = 'a4', ...

Running queries.sh or feature-store.js multiple times will accumulate duplicate snapshots. Consider adding an IF NOT EXISTS-style check or using UPDATE ... UPSERT, or at minimum add a note in the README that the verify step will show N rows if run N times.
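
One way to picture an idempotent write is an upsert keyed on the fields that identify a snapshot. The JavaScript sketch below uses an in-memory Map as a stand-in for the FeatureSnapshot type. The composite key (entityId plus a `version` field), the sample vector, and the function name `upsertSnapshot` are assumptions for illustration; the real schema may identify snapshots differently.

```javascript
// In-memory stand-in for the FeatureSnapshot type.
// Assumption: a snapshot is uniquely identified by entityId + version.
const snapshots = new Map();

function upsertSnapshot({ entityId, version, featureVector }) {
  const key = `${entityId}:${version}`;
  const existed = snapshots.has(key);
  // Overwrite in place so repeated runs never add a second row for the same key.
  snapshots.set(key, { entityId, version, featureVector });
  return existed ? 'updated' : 'inserted';
}

// Hypothetical sample values.
const first = upsertSnapshot({ entityId: 'a4', version: 'v1', featureVector: [1.0, 2.0] });
const second = upsertSnapshot({ entityId: 'a4', version: 'v1', featureVector: [1.0, 2.0] });

console.log(first, second, snapshots.size); // inserted updated 1
```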


Minor Issues

5. command function shadows bash built-in in queries.sh

command() {
  local lang="$1" cmd="$2"
  ...
}

command is a bash built-in used internally by the shell. Shadowing it can cause surprising failures in some shell environments. Rename to run_command or send_command (consistent with setup.sh's send_sql naming).

6. No package-lock.json committed; CI uses npm install not npm ci

The npm cache key hashes package.json, but without a lockfile CI installs latest-in-range packages on every run. This means a pg patch release can silently break CI. Commit package-lock.json and switch CI to npm ci for reproducible builds (matches how the Maven step uses a locked pom.xml).

7. CI action version comment may be incorrect

uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f # v6.3.0

The public actions/checkout and actions/setup-node are currently at v4. If these SHAs were copied from another workflow in this repo that already uses them successfully this is fine — just worth verifying the comments match the actual tags the SHAs resolve to, since stale version comments undermine the readability benefit of pinning.


Nits

  • The FeatureSnapshot.featureVector property is typed as LIST (untyped elements), but the stored vectors mix integers and a float: [4, 3, 5, 12, 2950, 0.15]. This works in ArcadeDB's schema-less mode, but a LIST OF FLOAT type would be more explicit and consistent with behaviorVec/embedding.
  • Query 8 is described as "Personalized Ranking" in both the README table and printHeader calls, but the signal type column in the README table says "Vector" (not "Graph + Vector" as the design doc noted). Minor inconsistency.
  • The implementation plan (docs/plans/2026-03-23-feature-store.md) has a directive comment at the top: > **For Claude:** REQUIRED SUB-SKILL: use superpowers:executing-plans…. That's an implementation-time artifact and could be stripped before merge.

Summary

| Issue | Severity |
| --- | --- |
| Query 9 missing while: (only direct dependents returned) | Bug |
| Query 8 hardcoded vector (not truly personalized) | Bug |
| Misleading comment on JS Query 8 | Minor |
| Duplicate FeatureSnapshot rows on re-run | Minor |
| command shadows bash built-in | Minor |
| Missing package-lock.json / npm install vs npm ci | Minor |
| Action version comments may be stale | Nit |

The core architecture, SQL patterns, and CI structure are all consistent with the rest of the repo. The two correctness issues (#1 and #2) are the main things to fix before merge.

🤖 Generated with Claude Code


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a comprehensive and well-structured AI/ML Feature Store use case, complete with data, queries, and demo applications in shell, Java, and JavaScript. The addition of detailed design and implementation documents is commendable. My review identified a recurring correctness issue in one of the queries (Query 9) across all three implementations, where a property was being incorrectly accessed from a vertex instead of an edge. I've provided fixes for this. Additionally, I found a bug in the JavaScript implementation related to an unescaped variable in a query string and suggested a security enhancement for the docker-compose.yml file to avoid hardcoding credentials.

Comment on lines +248 to +254
SELECT name, failureRate, criticality
FROM (
MATCH {type: Equipment, where: (equipmentId = 'eq1')}
.in('DEPENDS_ON'){as: dep}
RETURN dep.name AS name, dep.failureRate AS failureRate,
dep.out('DEPENDS_ON')[0].criticality AS criticality
)""";

high

This query to find the equipment dependency chain is incorrect. The expression dep.out('DEPENDS_ON')[0].criticality attempts to get the criticality property from a vertex, but this property is on the DEPENDS_ON edge. This will return null for criticality. Furthermore, the query only finds direct dependencies, while the description implies finding all downstream equipment. A traversal is needed. The query should be updated to perform a traversal and correctly retrieve criticality from the edge. The printing logic will also need to be updated to handle the new depth column.

            SELECT
              dep.name AS name,
              dep.failureRate AS failureRate,
              e.criticality AS criticality,
              $depth AS depth
            FROM (
              MATCH {type: Equipment, where: (equipmentId = 'eq1')}
                    .inE('DEPENDS_ON'){while: ($depth < 5), as: e}
                    .outV(){as: dep}
              RETURN dep, e, $depth
            )
            ORDER BY depth ASC

Comment thread feature-store/js/feature-store.js Outdated
Comment on lines +53 to +55
.both('TRANSFERRED'){while: ($depth < 4), as: hop}
{type: Account, where: (flagged = true), as: flagged}
RETURN flagged.accountId AS flaggedId, $depth AS depth

high

The $depth variable in this MATCH query needs to be escaped in the JavaScript template literal. Without escaping, it will be interpreted as a variable for interpolation, which will likely cause an error or unexpected behavior. It should be written as \$depth.

Suggested change
.both('TRANSFERRED'){while: ($depth < 4), as: hop}
{type: Account, where: (flagged = true), as: flagged}
RETURN flagged.accountId AS flaggedId, $depth AS depth
.both('TRANSFERRED'){while: (\$depth < 4), as: hop}
{type: Account, where: (flagged = true), as: flagged}
RETURN flagged.accountId AS flaggedId, \$depth AS depth

Comment on lines +185 to +197
const sql = `
SELECT name, failureRate, criticality
FROM (
MATCH {type: Equipment, where: (equipmentId = 'eq1')}
.in('DEPENDS_ON'){as: dep}
RETURN dep.name AS name, dep.failureRate AS failureRate,
dep.out('DEPENDS_ON')[0].criticality AS criticality
)`;

const res = await client.query(sql);
for (const row of res.rows) {
console.log(` ${String(row.name).padEnd(20)} | failureRate: ${row.failurerate} | criticality: ${row.criticality}`);
}

high

This query to find the equipment dependency chain is incorrect. The expression dep.out('DEPENDS_ON')[0].criticality attempts to get the criticality property from a vertex, but this property is on the DEPENDS_ON edge. This will return null for criticality. Furthermore, the query only finds direct dependencies, while the description implies finding all downstream equipment. A traversal is needed. The suggested code fixes the query to perform a traversal and correctly retrieve criticality from the edge, and also updates the printing logic. Note that $depth must be escaped as \$depth in template literals.

  const sql = `
    SELECT
      dep.name AS name,
      dep.failureRate AS failureRate,
      e.criticality AS criticality,
      \$depth AS depth
    FROM (
      MATCH {type: Equipment, where: (equipmentId = 'eq1')}
            .inE('DEPENDS_ON'){while: (\$depth < 5), as: e}
            .outV(){as: dep}
      RETURN dep, e, \$depth
    )
    ORDER BY depth ASC`;

  const res = await client.query(sql);
  for (const row of res.rows) {
    console.log(`  ${String(row.name).padEnd(20)} | failureRate: ${row.failurerate} | criticality: ${row.criticality} | depth: ${row.depth}`);
  }

Comment thread feature-store/queries/queries.sh Outdated
Comment on lines +156 to +162
SELECT name, failureRate, criticality
FROM (
MATCH {type: Equipment, where: (equipmentId = 'eq1')}
.in('DEPENDS_ON'){as: dep}
RETURN dep.name AS name, dep.failureRate AS failureRate,
dep.out('DEPENDS_ON')[0].criticality AS criticality
)

high

This query to find the equipment dependency chain is incorrect. The expression dep.out('DEPENDS_ON')[0].criticality attempts to get the criticality property from a vertex, but this property is on the DEPENDS_ON edge. This will return null for criticality. Furthermore, the query only finds direct dependencies, while the description implies finding all downstream equipment. A traversal is needed. Here is a corrected query that performs a traversal and correctly retrieves the criticality from the edge.

Suggested change
SELECT name, failureRate, criticality
FROM (
MATCH {type: Equipment, where: (equipmentId = 'eq1')}
.in('DEPENDS_ON'){as: dep}
RETURN dep.name AS name, dep.failureRate AS failureRate,
dep.out('DEPENDS_ON')[0].criticality AS criticality
)
SELECT
dep.name AS name,
dep.failureRate AS failureRate,
e.criticality AS criticality,
\$depth AS depth
FROM (
MATCH {type: Equipment, where: (equipmentId = 'eq1')}
.inE('DEPENDS_ON'){while: (\$depth < 5), as: e}
.outV(){as: dep}
RETURN dep, e, \$depth
)
ORDER BY depth ASC

Comment thread .github/dependabot.yml
schedule:
interval: weekly
day: "sunday"
open-pull-requests-limit: 20

medium

The open-pull-requests-limit for Maven is set to 20, which is higher than the limit of 10 for other package ecosystems in this file. While this is not necessarily an issue, it's worth ensuring this is intentional. A higher limit can sometimes lead to a large number of open PRs from dependabot, which might be overwhelming to manage. If the number of Maven dependencies is large and they are updated frequently, this might be justified.

Comment on lines +134 to +143
SELECT flaggedId, depth
FROM (
MATCH {type: Account, where: (accountId = 'a4')}
.both('TRANSFERRED'){while: ($depth < 4), as: hop}
{type: Account, where: (flagged = true), as: flagged}
RETURN flagged.accountId AS flaggedId, $depth AS depth
)
ORDER BY depth ASC
LIMIT 1
```

medium

The current MATCH query is a valid way to find the shortest path, but using the dedicated shortestPath() function is generally more idiomatic and can be more performant for this specific task. It's designed to efficiently find the shortest path between nodes. Consider using it to better demonstrate best practices.

Suggested change
SELECT flaggedId, depth
FROM (
MATCH {type: Account, where: (accountId = 'a4')}
.both('TRANSFERRED'){while: ($depth < 4), as: hop}
{type: Account, where: (flagged = true), as: flagged}
RETURN flagged.accountId AS flaggedId, $depth AS depth
)
ORDER BY depth ASC
LIMIT 1
```
SELECT path.size() - 1 AS depth, path[path.size()-1].accountId as flaggedId
FROM (
SELECT shortestPath(
(SELECT FROM Account WHERE accountId = 'a4'),
(SELECT FROM Account WHERE flagged = true),
'BOTH',
'TRANSFERRED'
) AS path
)
WHERE path IS NOT NULL
LIMIT 1
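
For readers unfamiliar with the idea behind both query variants, the following JavaScript sketch shows a breadth-first search that stops at the first flagged account it reaches. It is an in-memory illustration, not ArcadeDB's `shortestPath()` or MATCH engine. The edge list is limited to the TRANSFERRED edges quoted from the seed data elsewhere in this thread, and treating a6 as the only flagged account is an assumption based on the seed-data comment; the full data set may differ. The name `nearestFlagged` is hypothetical.

```javascript
// TRANSFERRED edges as quoted from the seed data in this thread.
// The complete data set may contain more edges.
const transferred = [
  ['a4', 'a5'], ['a6', 'a4'], ['a3', 'a1'],
  ['a2', 'a1'], ['a5', 'a6'], ['a1', 'a4'],
];
// Assumption: a6 is the only flagged account.
const flagged = new Set(['a6']);

function nearestFlagged(start, maxDepth = 4) {
  // Undirected adjacency list, mirroring .both('TRANSFERRED').
  const adj = new Map();
  for (const [a, b] of transferred) {
    if (!adj.has(a)) adj.set(a, []);
    if (!adj.has(b)) adj.set(b, []);
    adj.get(a).push(b);
    adj.get(b).push(a);
  }
  const seen = new Set([start]);
  let frontier = [start];
  for (let depth = 1; depth <= maxDepth; depth++) {
    const next = [];
    for (const node of frontier) {
      for (const neighbor of adj.get(node) ?? []) {
        if (seen.has(neighbor)) continue;
        // Early exit: the first flagged hit in BFS order is a nearest one.
        if (flagged.has(neighbor)) return { flaggedId: neighbor, depth };
        seen.add(neighbor);
        next.push(neighbor);
      }
    }
    frontier = next;
  }
  return null; // nothing flagged within maxDepth hops
}

console.log(nearestFlagged('a4')); // { flaggedId: 'a6', depth: 1 }
console.log(nearestFlagged('a3')); // { flaggedId: 'a6', depth: 3 }
```

The early return is the design point: a search that stops at the first flagged hit does less work than collecting every account within four hops and filtering afterwards.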

Comment on lines +216 to +223
CREATE EDGE TRANSFERRED FROM (SELECT FROM Account WHERE accountId = 'a4') TO (SELECT FROM Account WHERE accountId = 'a5') SET amount = 1500.00, recordedAt = '2026-03-11 04:00:00';
CREATE EDGE TRANSFERRED FROM (SELECT FROM Account WHERE accountId = 'a6') TO (SELECT FROM Account WHERE accountId = 'a4') SET amount = 4000.00, recordedAt = '2026-03-12 01:00:00';
CREATE EDGE TRANSFERRED FROM (SELECT FROM Account WHERE accountId = 'a3') TO (SELECT FROM Account WHERE accountId = 'a1') SET amount = 100.00, recordedAt = '2026-03-05 14:00:00';
CREATE EDGE TRANSFERRED FROM (SELECT FROM Account WHERE accountId = 'a2') TO (SELECT FROM Account WHERE accountId = 'a1') SET amount = 250.00, recordedAt = '2026-03-06 16:00:00';
CREATE EDGE TRANSFERRED FROM (SELECT FROM Account WHERE accountId = 'a5') TO (SELECT FROM Account WHERE accountId = 'a6') SET amount = 1800.00, recordedAt = '2026-03-13 05:00:00';
CREATE EDGE TRANSFERRED FROM (SELECT FROM Account WHERE accountId = 'a1') TO (SELECT FROM Account WHERE accountId = 'a4') SET amount = 300.00, recordedAt = '2026-03-08 12:00:00';
-- LINKED_DEVICE edges (a4 and a5 share devices with flagged a6)
CREATE EDGE LINKED_DEVICE FROM (SELECT FROM Account WHERE accountId = 'a4') TO (SELECT FROM Account WHERE accountId = 'a6') SET deviceId = 'dev-001';

medium

The query for finding the equipment dependency chain is incorrect. The expression dep.out('DEPENDS_ON')[0].criticality attempts to get the criticality property from a vertex, but it's defined on the DEPENDS_ON edge. This will result in a null value. The query should be updated to correctly retrieve criticality from the edge.

Suggested change
CREATE EDGE TRANSFERRED FROM (SELECT FROM Account WHERE accountId = 'a4') TO (SELECT FROM Account WHERE accountId = 'a5') SET amount = 1500.00, recordedAt = '2026-03-11 04:00:00';
CREATE EDGE TRANSFERRED FROM (SELECT FROM Account WHERE accountId = 'a6') TO (SELECT FROM Account WHERE accountId = 'a4') SET amount = 4000.00, recordedAt = '2026-03-12 01:00:00';
CREATE EDGE TRANSFERRED FROM (SELECT FROM Account WHERE accountId = 'a3') TO (SELECT FROM Account WHERE accountId = 'a1') SET amount = 100.00, recordedAt = '2026-03-05 14:00:00';
CREATE EDGE TRANSFERRED FROM (SELECT FROM Account WHERE accountId = 'a2') TO (SELECT FROM Account WHERE accountId = 'a1') SET amount = 250.00, recordedAt = '2026-03-06 16:00:00';
CREATE EDGE TRANSFERRED FROM (SELECT FROM Account WHERE accountId = 'a5') TO (SELECT FROM Account WHERE accountId = 'a6') SET amount = 1800.00, recordedAt = '2026-03-13 05:00:00';
CREATE EDGE TRANSFERRED FROM (SELECT FROM Account WHERE accountId = 'a1') TO (SELECT FROM Account WHERE accountId = 'a4') SET amount = 300.00, recordedAt = '2026-03-08 12:00:00';
-- LINKED_DEVICE edges (a4 and a5 share devices with flagged a6)
CREATE EDGE LINKED_DEVICE FROM (SELECT FROM Account WHERE accountId = 'a4') TO (SELECT FROM Account WHERE accountId = 'a6') SET deviceId = 'dev-001';
SELECT dep.name as name, dep.failureRate as failureRate, e.criticality as criticality, $depth as depth
FROM (
MATCH {type: Equipment, where: (equipmentId = 'eq1')}
.inE('DEPENDS_ON'){while: ($depth < 5), as: e}
.outV(){as: dep}
RETURN dep, e, $depth
)
ORDER BY depth ASC

- "5432:5432"
environment:
JAVA_OPTS: >-
-Darcadedb.server.rootPassword=arcadedb

medium

The root password for ArcadeDB is hardcoded in the docker-compose.yml file. For better security, it's recommended to source secrets from environment variables, which can be managed outside of version control (e.g., in a .env file). This prevents accidental exposure of credentials.

        -Darcadedb.server.rootPassword=${ARCADEDB_ROOT_PASSWORD:-arcadedb}

- Query 2: restructure MATCH to filter flagged in outer WHERE
  (terminal node after while: traversal is invalid ArcadeDB syntax)
- Query 9: add while: for recursive traversal, use .inE()/.outV()
  to access edge criticality (was reading from vertex, returning null)
- Query 8: rename "Personalized Ranking" to "Category Vector Search"
  (vector is hardcoded for u1, not dynamically read)
- Rename command() to send_command() in queries.sh (shadows bash built-in)
- Add package-lock.json, switch CI to npm ci for reproducible builds

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@robfrank
Contributor Author

Review response

Fixed in 327cd20:

| # | Issue | Fix |
| --- | --- | --- |
| Query 2 | MATCH syntax: terminal {type:...} after while: invalid | Moved flagged filter to outer WHERE clause |
| Query 9 | Single-hop only, criticality from vertex (null) | Added while:($depth < 5), switched to .inE()/.outV() for edge property |
| Query 8 | Mislabeled "Personalized Ranking", hardcoded vector | Renamed to "Category Vector Search", updated description |
| JS Q8 | Comment said "Cypher" but uses SQL | Fixed comment |
| command() | Shadows bash built-in | Renamed to send_command() |
| package-lock.json | Missing, CI used npm install | Generated lockfile, CI now uses npm ci |

Not implementing (with reasoning):

  • Gemini: $depth JS escaping - $depth without braces is literal text in JS template literals. Only ${...} triggers interpolation. Not a bug.
  • Gemini: shortestPath() for Q2 — Comment targets the design doc, not implementation code. shortestPath() availability in ArcadeDB 26.3.2 is unverified.
  • Gemini: Docker password env vars — All 9 existing use cases hardcode the demo password identically. This is a demo repository.
  • Gemini: dependabot limit 20 — Matches all other Maven entries in the file.
  • Claude: Action version comments — SHAs copied from working supply-chain.yml which has been passing CI since February.
  • Claude: Q11 duplicate snapshots — Expected demo behavior. Each run appends a new snapshot, which is actually realistic for a feature store audit trail.
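
The template-literal point in the first item can be checked directly. The short JavaScript sketch below compares an unescaped and an escaped `$depth` and shows that only `${...}` interpolates; the variable names are chosen for the example.

```javascript
// A bare $name inside a template literal is plain text.
const unescaped = `while: ($depth < 4)`;
// \$ is a valid escape that also yields a literal dollar sign.
const escaped = `while: (\$depth < 4)`;

console.log(unescaped);             // while: ($depth < 4)
console.log(unescaped === escaped); // true

// Only the ${...} form triggers interpolation.
const limit = 4;
const interpolated = `while: ($depth < ${limit})`;
console.log(interpolated);          // while: ($depth < 4)
```

Both spellings produce the same query text, so the suggested backslash is harmless but unnecessary in JavaScript. Shell scripts are a different case: inside a double-quoted string or an unquoted heredoc, the shell does expand `$depth`, so escaping can matter there.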

@claude

claude Bot commented Mar 23, 2026

Code Review: feat(feature-store)

Overall clean and well-structured.

Key findings

1. Query 2 materialises all accounts within 4 hops before the outer WHERE flagged=true filter; the design doc had a more efficient MATCH-internal filter.
2. Query 11 Step 4 INSERT is non-idempotent and accumulates FeatureSnapshot rows on re-runs.
3. JS Query 5 and 6 use a double-fallback for Cypher column names that silently swallows undefined.
4. Merchant has no unique index on merchantId, unlike Account/User/Equipment.
5. Java price cast in runQuery7/8 throws on null; tryRun catches but skips silently.
6. Design doc Query Language Mapping table says Query 8 is Cypher but the implementation uses SQL; the doc needs an update.

Nits

  • send_command in queries.sh deserves a comment on command vs query endpoint.
  • package.json is missing an engines field.
  • Port 2424 omitted intentionally.

What is done well

  • One-statement-per-line SQL.
  • IF NOT EXISTS guards.
  • Three-runner CI matrix with fail-fast false.
  • Correct Cypher prefix for pg wire.
  • Lowercased SQL column names in JS.
  • Action SHAs match supply-chain.yml.
  • LSM_VECTOR index syntax and vectorNeighbors pattern are exactly right per documented ArcadeDB quirks.

Generated with Claude Code
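
The silent-fallback concern in the findings above (JS Query 5 and 6) could be addressed with an accessor that fails loudly when none of the expected columns is present. The JavaScript sketch below is one possible shape, not the code in feature-store.js; the helper name `pickColumn` and the column names in the usage lines are hypothetical.

```javascript
// Returns the value of the first candidate column present in the row.
// Throws instead of returning undefined when none of them exists.
function pickColumn(row, ...candidates) {
  for (const name of candidates) {
    if (row[name] !== undefined) return row[name];
  }
  throw new Error(
    `none of [${candidates.join(', ')}] found; row has [${Object.keys(row).join(', ')}]`
  );
}

// Hypothetical row shapes: the same value may arrive under either casing.
console.log(pickColumn({ neighborcount: 3 }, 'neighborCount', 'neighborcount')); // 3

try {
  pickColumn({ other: 1 }, 'neighborCount', 'neighborcount');
} catch (err) {
  console.log(err.message); // names the missing columns and the keys actually present
}
```

A null value still passes through unchanged, so callers that cast or format the result would need their own null check.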

@robfrank robfrank merged commit b3474e2 into main Mar 23, 2026
8 checks passed