Commit 33c0ed1
feat: graph discovery, composite FK/PK support, and cross-schema joins (#560)
* feat: add schema discovery documents and real-time subscription support
Add automatic generation of schema discovery documents ("Schema Bible")
that provide rich metadata about database tables, relationships, and
query patterns. Documents are generated at startup and regenerated on
schema changes.
- Add GenerateDiscovery() and SubscribeDiscovery() to core API
- Add OnSchemaChange callback for schema change notifications
- Register discovery as MCP resource and REST endpoint
- Support WebSocket subscriptions for real-time discovery updates
- Fire callbacks on startup, reload, and DB watcher schema changes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: add query syntax reference to discovery docs and fix templates
The generated schema bible had no DSL reference, causing agents to guess
syntax incorrectly (using group_by instead of distinct, using the wrong
format for the "in" operator, etc.). Add a Query Syntax Reference section covering filter
operators, aggregation functions, grouping with distinct, pagination,
ordering, relationships, and common mistakes. Also fix query templates
to use distinct:[col] for time-series and breakdown grouping.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: auto-discover and resolve tables across all database schemas
GraphJin previously required tables to be in the default schema (e.g.,
"public" for PostgreSQL) or to have explicit schema configuration. This made it
fail silently on databases like AdventureWorks where tables live in
non-public schemas (production, sales, etc.).
Add a name-only secondary index (nameIndex) to DBSchema that enables
cross-schema table resolution as a fallback when exact schema:name
lookup fails. Resolution order: exact match → single cross-schema
match → default schema preference → ambiguity error with schema list.
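The resolution order above can be sketched in Go as follows. This is a minimal illustration, not GraphJin's actual DBSchema code: the tableIndex type, its fields, and the newTableIndex and resolve helpers are hypothetical names introduced here, and the error wording is made up.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// tableIndex is a hypothetical stand-in for the two lookups described above:
// an exact schema:name set plus a name-only secondary index.
type tableIndex struct {
	defaultSchema string
	exact         map[string]bool     // keys look like "schema:name"
	byName        map[string][]string // table name -> schemas that contain it
}

// newTableIndex builds both lookups from "schema:name" entries.
func newTableIndex(defaultSchema string, tables ...string) tableIndex {
	ix := tableIndex{
		defaultSchema: defaultSchema,
		exact:         map[string]bool{},
		byName:        map[string][]string{},
	}
	for _, t := range tables {
		schema, name, ok := strings.Cut(t, ":")
		if !ok {
			continue // skip malformed entries in this sketch
		}
		ix.exact[t] = true
		ix.byName[name] = append(ix.byName[name], schema)
	}
	return ix
}

// resolve returns the schema a table reference should bind to.
func (ix tableIndex) resolve(schema, name string) (string, error) {
	// 1. Exact schema:name match.
	if ix.exact[schema+":"+name] {
		return schema, nil
	}
	schemas := ix.byName[name]
	if len(schemas) == 0 {
		return "", fmt.Errorf("table %q not found in any schema", name)
	}
	// 2. Exactly one schema contains a table with this name.
	if len(schemas) == 1 {
		return schemas[0], nil
	}
	// 3. Several candidates: prefer the default schema.
	for _, s := range schemas {
		if s == ix.defaultSchema {
			return s, nil
		}
	}
	// 4. Still ambiguous: report every candidate schema.
	sorted := append([]string(nil), schemas...)
	sort.Strings(sorted)
	return "", fmt.Errorf("table %q is ambiguous; found in schemas: %s",
		name, strings.Join(sorted, ", "))
}

func main() {
	ix := newTableIndex("public",
		"production:product",
		"public:address", "person:address",
		"sales:store", "purchasing:store")
	for _, name := range []string{"product", "address", "store"} {
		schema, err := ix.resolve("", name)
		fmt.Println(name, schema, err)
	}
}
```

In this sketch an unqualified "product" binds to the only schema that has it, "address" binds to the default schema, and "store" fails with an error that lists both candidate schemas.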
Also fix the hardcoded "public" default in NewCompiler to use the
discovered schema for all database dialects (MSSQL→dbo, MySQL→db name,
etc.), and use the resolved schema in AddRole keys so role permissions
work correctly for tables in non-default schemas.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add table of contents to discovery docs and configurable workflow timeout
- Discovery documents now include a navigable Table of Contents section
with anchor links to all sections and individual tables
- Workflow script timeout is now configurable via mcp.workflow_timeout
(in seconds), defaulting to 5s when not set
- Timeout value is exposed in get_js_runtime_api response so LLM agents
can plan workflow strategies based on available headroom
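A minimal sketch of the timeout defaulting described above, assuming the configured value arrives as an integer number of seconds. The workflowTimeout function and the constant name are hypothetical, and treating negative values as unset is an assumption of this sketch rather than documented behaviour.

```go
package main

import (
	"fmt"
	"time"
)

// defaultWorkflowTimeout mirrors the 5s default mentioned above.
const defaultWorkflowTimeout = 5 * time.Second

// workflowTimeout converts a configured number of seconds into a duration.
// A zero value means "not set"; negative values are also treated as unset
// here, which is an assumption of this sketch.
func workflowTimeout(seconds int) time.Duration {
	if seconds <= 0 {
		return defaultWorkflowTimeout
	}
	return time.Duration(seconds) * time.Second
}

func main() {
	fmt.Println(workflowTimeout(0))  // unset, falls back to the default
	fmt.Println(workflowTimeout(30)) // explicit value from configuration
}
```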
* feat: add composite FK support, cross-schema JOIN fix, and AdventureWorks integration tests
- Fix Postgres composite FK discovery: change confkey[1] to
confkey[array_position(co.conkey, f.attnum)] so each local FK column
maps to its correct referenced column by position
- Add DiscoverCompositeFKs() to detect multi-column FK constraints and
merge them into single graph edges with ExtraPairs
- Add ColPair/CompositeFKInfo types, propagate ExtraPairs through
TEdge → TPath → DBRel → buildFilter → OpAnd expressions
- Fix renderJoin to schema-qualify intermediate JOIN table names
(e.g., INNER JOIN "person"."businessentity" instead of INNER JOIN businessentity)
- Add AdventureWorks as integration test database (full 760K-row dataset)
with 24 business-scenario tests verified against SQL ground truth
- Add make test-adventureworks target (23/24 tests passing)
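The merge of a multi-column FK into a single edge could look roughly like the Go sketch below. The fkColumn, colPair and edge types, their fields, and both functions are hypothetical shapes chosen for illustration; they are not the ColPair/CompositeFKInfo definitions from this change. The table and column names in main are also assumed, and identifier quoting is omitted for brevity.

```go
package main

import "fmt"

// fkColumn is one row of FK metadata: a local column paired with the
// referenced column at the same position in the constraint.
type fkColumn struct {
	Constraint       string
	Table, Col       string
	RefTable, RefCol string
}

// colPair is one local/referenced column pair.
type colPair struct{ Local, Ref string }

// edge is a single graph edge; ExtraPairs holds every pair after the first.
type edge struct {
	Table, RefTable string
	First           colPair
	ExtraPairs      []colPair
}

// mergeFKColumns groups rows by constraint name so that a multi-column FK
// yields one edge instead of one edge per column.
func mergeFKColumns(rows []fkColumn) []edge {
	index := map[string]int{} // constraint name -> position in edges
	var edges []edge
	for _, r := range rows {
		p := colPair{Local: r.Col, Ref: r.RefCol}
		if i, ok := index[r.Constraint]; ok {
			edges[i].ExtraPairs = append(edges[i].ExtraPairs, p)
			continue
		}
		index[r.Constraint] = len(edges)
		edges = append(edges, edge{Table: r.Table, RefTable: r.RefTable, First: p})
	}
	return edges
}

// joinCondition renders the AND-ed equality predicates for an edge.
func joinCondition(e edge) string {
	cond := fmt.Sprintf("%s.%s = %s.%s", e.Table, e.First.Local, e.RefTable, e.First.Ref)
	for _, p := range e.ExtraPairs {
		cond += fmt.Sprintf(" AND %s.%s = %s.%s", e.Table, p.Local, e.RefTable, p.Ref)
	}
	return cond
}

func main() {
	rows := []fkColumn{
		{"fk_item_variant", "order_items", "product_id", "product_variants", "product_id"},
		{"fk_item_variant", "order_items", "variant_id", "product_variants", "variant_id"},
	}
	for _, e := range mergeFKColumns(rows) {
		fmt.Println(joinCondition(e))
	}
}
```

Keeping the first pair separate from ExtraPairs means single-column FKs pass through unchanged, with an empty ExtraPairs slice.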
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: prevent FK columns from being misinterpreted as relationship joins in WHERE filters
FK columns like customer_id or territoryid were being treated as nested
relationship references (triggering EXISTS subqueries) instead of simple
column filters. Now processNestedTable checks if the field name matches
a column on the current table before attempting path resolution.
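The column-first check can be illustrated with a small Go sketch. The table type, the filterKind values, and classifyField are hypothetical; the real processNestedTable works on GraphJin's own schema types, which are not shown in this commit message.

```go
package main

import "fmt"

// table is a simplified, hypothetical view of a table: its own columns plus
// the names that relationship path resolution would accept.
type table struct {
	Name      string
	Columns   map[string]bool
	Relations map[string]bool
}

type filterKind int

const (
	columnFilter   filterKind = iota // plain "col = value" predicate
	relationFilter                   // nested relationship, e.g. an EXISTS subquery
	unknownField
)

// classifyField checks the current table's columns before attempting
// relationship resolution, so FK columns stay simple column filters.
func classifyField(t table, field string) filterKind {
	if t.Columns[field] {
		return columnFilter
	}
	if t.Relations[field] {
		return relationFilter
	}
	return unknownField
}

func main() {
	orders := table{
		Name:    "orders",
		Columns: map[string]bool{"id": true, "customer_id": true},
		// customer_id is listed here too, to simulate a resolver that would
		// otherwise match the FK column as a relationship.
		Relations: map[string]bool{"customer": true, "customer_id": true},
	}
	fmt.Println(classifyField(orders, "customer_id") == columnFilter)
	fmt.Println(classifyField(orders, "customer") == relationFilter)
}
```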
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: split discovery document into granular MCP resources
Split the monolithic Schema Bible into focused sections (overview, syntax,
tables, full_tables, insights) so MCP agents can load only what they need
without exceeding context limits. Add server instructions for MCP.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: update go.work.sum
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: add composite FK extra pair columns to parent subquery SELECT list
When a composite FK join references extra columns (e.g., specialofferid
in the salesorderdetail → specialofferproduct join), those columns must
be included in the parent subquery's SELECT list. Without this, the
aliased subquery (e.g., salesorderdetail_2) doesn't expose the column,
causing "column does not exist" errors.
Also fix CustomerGeography test to filter for customers with personid
(B2B store-only customers have NULL personid, causing empty joins).
24/24 AdventureWorks tests now pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: strip person.password sample data from AdventureWorks dump
Remove hashed password + salt values from the test data fixture to
resolve GitHub secret scanning alert. The person.password table
contains AdventureWorks demo data (not real credentials) but triggers
automated secret detection. No tests depend on this table.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add composite FK discovery for MySQL, MariaDB, SQLite, Oracle, MSSQL, Snowflake
Extend DiscoverCompositeFKs() with per-database SQL queries to detect
multi-column foreign key constraints. Each DB uses its native system
catalog (information_schema, pragma_foreign_key_list, all_constraints,
sys.foreign_key_columns, _gj_fk_metadata) with GROUP BY + HAVING COUNT
to identify composite FKs.
The downstream machinery (edge merging, ExtraPairs propagation, AND
filter generation) is already DB-agnostic from the Postgres implementation.
Includes unit tests for query constants, CSV parsing, normalization per
DB type (Oracle/MSSQL/Snowflake: snake_case + lowercase), and unsupported
DB fallback.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: move AdventureWorks test data to tests-large/ and enable git LFS
Large SQL fixtures (75MB data dump) moved out of tests/ into tests-large/
to keep the main test directory lean. The 75MB data file is now tracked
via git LFS. Updated init script paths in dbint_test.go and added a
test-large Makefile target.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test: add composite FK integration tests for all 6 database types
Add product_variants + order_items tables with composite FK
(product_id, variant_id) to Postgres, MySQL, MariaDB, SQLite, MSSQL,
and Oracle test schemas. Integration tests verify:
- Forward join: order_items → product_variants (correct variant matched)
- All-rows match: every order_item joins to its correct variant
- Reverse join: product_variants → order_items
All 18 tests pass (3 tests × 6 databases).
Also adds unit tests for composite FK query constants, CSV parsing
with per-DB normalization, and unsupported DB fallback.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: add composite primary key support across all 6 database dialects
Tables with multi-column primary keys (e.g., PRIMARY KEY (user_id, session_id))
now work correctly throughout GraphJin. Previously only the first PK column was
recognized and the rest were silently dropped.
Core changes:
- Add PrimaryCols []DBColumn to DBTable with HasCompositePK/PKColNames/IsPKCol helpers
- compileArgID accepts composite PK as object: id: {col1: val1, col2: val2}
- orderByIDCol adds all PK columns to ORDER BY
- Mutation helpers generate multi-column variable declarations and WHERE clauses
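As a rough illustration of the multi-column WHERE and ORDER BY handling listed above, the Go sketch below builds both clauses from a list of PK column names. The pkWhere and pkOrderBy helpers are hypothetical, and the double-quoted identifiers and $n placeholders assume Postgres-style SQL; the actual per-dialect renderers differ.

```go
package main

import (
	"fmt"
	"strings"
)

// pkWhere builds a parameterised WHERE clause that matches every primary key
// column against the corresponding entry of a composite id object.
func pkWhere(pkCols []string, id map[string]any) (string, []any, error) {
	if len(pkCols) == 0 {
		return "", nil, fmt.Errorf("table has no primary key")
	}
	conds := make([]string, 0, len(pkCols))
	args := make([]any, 0, len(pkCols))
	for i, col := range pkCols {
		v, ok := id[col]
		if !ok {
			return "", nil, fmt.Errorf("missing primary key column %q", col)
		}
		conds = append(conds, fmt.Sprintf(`"%s" = $%d`, col, i+1))
		args = append(args, v)
	}
	return strings.Join(conds, " AND "), args, nil
}

// pkOrderBy lists all primary key columns so ordering stays deterministic.
func pkOrderBy(pkCols []string) string {
	quoted := make([]string, len(pkCols))
	for i, col := range pkCols {
		quoted[i] = `"` + col + `"`
	}
	return strings.Join(quoted, ", ")
}

func main() {
	pk := []string{"user_id", "session_id"}
	where, args, err := pkWhere(pk, map[string]any{"user_id": 7, "session_id": "abc"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("WHERE", where, args)
	fmt.Println("ORDER BY", pkOrderBy(pk))
}
```

Rejecting an id object that omits a PK column avoids silently matching on a partial key, which is the failure mode this change set out to remove.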
Dialect updates (150 PrimaryCol refs across 18 files):
- Postgres: multi-column ON CONFLICT
- SQLite: JSON-encoded composite keys in _gj_ids, multi-col ON CONFLICT/RETURNING
- MySQL: multi-col JSON_TABLE, PK detection via IsPKCol
- MSSQL: multi-col MERGE ON, OPENJSON columns
- Oracle: multi-col ORDER BY, RETURNING INTO, JSON_TABLE
- Snowflake: multi-col identity updates, PK detection
Tested: 4 unit tests + 15 integration tests (3 tests × 5 DBs) all passing,
full Postgres and MySQL regression suites clean.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: don't use JSON_TABLE for connect/disconnect mutations on MySQL/MariaDB
Connect and disconnect mutations use a compiled WHERE filter (e.g., id IN
(1,2,3)), not a JSON record set. Passing the connect data through JSON_TABLE
caused MySQL Error 3666 ("Can't store an array in scalar JSON_TABLE column")
because the filter value is an array, not a scalar.
The fix removes the unnecessary RenderMutateToRecordSet calls from
RenderLinearConnect and RenderLinearDisconnect for both MySQL and MariaDB
dialects. The WHERE filter rendered by renderFilter() is sufficient.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: update test documentation for all 9 database targets
Rewrites tests/TESTS.md to cover the full test infrastructure:
- All 9 database targets (PG, MySQL, MariaDB, SQLite, Oracle, MSSQL,
Snowflake, MongoDB, AdventureWorks) with container images and make targets
- Composite FK and PK test sections with per-DB compatibility
- Ground truth verification pattern documentation
- Full compatibility matrix and schema file listing
- Known issues section for SQLite, MariaDB, MongoDB, Snowflake
Adds tests-large/TESTS.md explaining the large-scale test strategy,
AdventureWorks database stats, and how to add new large-scale fixtures.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 3dc3f88 commit 33c0ed1
File tree
57 files changed
+14515 -583 lines changed
- core
- internal
- dialect
- psql
- qcode
- sdata
- sql
- serv
- tests-large
- tests