Unify instanceof with TypeTag — JumpTable dispatch, fused IsType, InstanceOfRemoved diagnostic#3340
Conversation
Route class-typed and primitive-typed match arms through Terminator::Switch → analyze_switch (JumpTable/BinarySearch/IfElseChain) instead of sequential IsType chains. - Remove dead_code attributes from SwitchKind::TypeTag and class_type_tags - Add type_tag_for_ty helper for type tag lookup - Extend try_lower_match_as_switch to recognize Binding and TypedBinding patterns resolving to class/primitive types - Add SwitchKind::TypeTag arm in switch-operand and arm_names emission - Fix class_type_tags ordering to match emitter iteration order - Add snapshot tests for JumpTable, IfElseChain, and BinarySearch dispatch
Replace the two multi-instruction type-check patterns with a single fused Instruction::IsType(usize): - LoadConst + CmpOp::InstanceOf (classes) → IsType(const_idx) - TypeTag + LoadConst + CmpOp::Eq (primitives) → IsType(const_idx) The constant pool index resolves to either a class HeapPtr (identity check) or an integer tag (type tag comparison). CmpOp::InstanceOf is deleted entirely from bytecode and VM.
…nostic Remove BinaryOp::Instanceof from AST, TIR, MIR, visualization, and onionskin crates. Emit InstanceofRemoved (E0098) diagnostic at CST→AST lowering with a helpful message pointing users to match expressions. Lexer/parser/formatter kept for parse-level error recovery.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughRemoved the Changes
Sequence Diagram(s)sequenceDiagram
participant Parser
participant Lowerer as Lowerer (AST)
participant MIR as MIR/LoweringContext
participant Emitter as Emitter/Stackify
participant VM
Parser->>Lowerer: parse source (including `instanceof` / match type-patterns)
Lowerer->>Lowerer: detect `KW_INSTANCEOF` -> emit InstanceofRemoved\nreturn Expr::Missing
Note right of Lowerer: for match type-patterns, build class_type_tags and emit Rvalue::TypeTag
Lowerer->>MIR: produce SwitchKind::TypeTag / Rvalue::TypeTag
MIR->>Emitter: lower switch/arms, request type_tag constants
Emitter->>Emitter: materialize discriminant or emit `IsType(const_idx)`
Emitter->>VM: emit bytecode with `Instruction::IsType(idx)`
VM->>VM: `IsType` pops value, compares by class identity or typetag, pushes Bool
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Binary size checks passed✅ 7 passed
Generated by |
There was a problem hiding this comment.
Actionable comments posted: 7
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 6a7f463a-39ee-4efb-8547-6a4069ba0ac7
⛔ Files ignored due to path filters (13)
baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/__testing_std__/baml_tests____testing_std____06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/catch_all_keyword/baml_tests__catch_all_keyword__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/catch_throw/baml_tests__catch_throw__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/catch_throw_regressions/baml_tests__catch_throw_regressions__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/error_cases/baml_tests__error_cases__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/format_checks/baml_tests__format_checks__04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/format_checks/baml_tests__format_checks__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/match_exhaustiveness/baml_tests__match_exhaustiveness__04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/match_exhaustiveness/baml_tests__match_exhaustiveness__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded_unoptimized.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_textual.snapis excluded by!**/*.snap
📒 Files selected for processing (16)
baml_language/crates/baml_compiler2_ast/src/ast.rsbaml_language/crates/baml_compiler2_ast/src/lower_expr_body.rsbaml_language/crates/baml_compiler2_ast/src/lowering_diagnostic.rsbaml_language/crates/baml_compiler2_emit/src/emit.rsbaml_language/crates/baml_compiler2_mir/src/ir.rsbaml_language/crates/baml_compiler2_mir/src/lower.rsbaml_language/crates/baml_compiler2_tir/src/builder.rsbaml_language/crates/baml_compiler2_visualization/src/control_flow/from_ast.rsbaml_language/crates/baml_tests/tests/exceptions.rsbaml_language/crates/baml_tests/tests/instanceof.rsbaml_language/crates/baml_tests/tests/match_basics.rsbaml_language/crates/baml_tests/tests/match_types.rsbaml_language/crates/bex_vm/src/debug.rsbaml_language/crates/bex_vm/src/vm.rsbaml_language/crates/bex_vm_types/src/bytecode.rsbaml_language/crates/tools_onionskin/src/compiler.rs
💤 Files with no reviewable changes (3)
- baml_language/crates/baml_compiler2_visualization/src/control_flow/from_ast.rs
- baml_language/crates/baml_compiler2_ast/src/ast.rs
- baml_language/crates/tools_onionskin/src/compiler.rs
Resolve conflicts between canary's error type refactor (VmError → VmInternalError/VmBamlError/VmRustFnError) and this branch's removal of CmpOp::InstanceOf in favor of fused Instruction::IsType. Conflicts resolved: - bex_vm/src/vm.rs: Dropped CmpOp::InstanceOf match arms that canary updated with new error types, since InstanceOf is removed from the enum. - Bytecode snapshots: Regenerated to reflect both canary's new error classes (AllocFailure, etc.) and this branch's is_type instruction.
Avoids emitting wasteful type_tag + pop 1 when a match has exactly one exhaustive arm—no comparison is needed so the discriminant is never pushed.
Null patterns in match arms now go through SwitchKind::TypeTag instead of falling back to the chain path. Combined with the single-arm exhaustive optimization, this eliminates redundant null comparisons.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@baml_language/crates/baml_compiler2_mir/src/lower.rs`:
- Around line 3619-3627: The code sets is_switch_exhaustive using
otherwise_idx.is_none() && (is_exhaustive || matches!(switch_kind,
Some(SwitchKind::TypeTag))), which incorrectly treats any TypeTag switch without
a wildcard as exhaustive; change the condition to only treat the switch as
exhaustive when TIR marked it exhaustive. Update the expression for
is_switch_exhaustive to otherwise_idx.is_none() && is_exhaustive (remove the
matches!(switch_kind, Some(SwitchKind::TypeTag)) clause) so only true exhaustive
switches (as indicated by is_exhaustive) skip the final arm comparison.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d3a966da-1fbe-4e3a-a425-f863d3bc9cdc
⛔ Files ignored due to path filters (10)
baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/__testing_std__/baml_tests____testing_std____04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/__testing_std__/baml_tests____testing_std____06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/lambda_advanced/baml_tests__lambda_advanced__04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/match_exhaustiveness/baml_tests__match_exhaustiveness__04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/match_exhaustiveness/baml_tests__match_exhaustiveness__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded_unoptimized.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_textual.snapis excluded by!**/*.snap
📒 Files selected for processing (2)
baml_language/crates/baml_compiler2_mir/src/lower.rsbaml_language/crates/baml_tests/tests/match_basics.rs
Revert null-to-TypeTag routing (it made 2-arm optional matches worse) and instead teach lower_match_chain about exhaustiveness: when the match is exhaustive and we reach the last arm with no guard, skip the pattern test entirely since it must match.
There was a problem hiding this comment.
♻️ Duplicate comments (1)
baml_language/crates/baml_compiler2_mir/src/lower.rs (1)
3596-3604:⚠️ Potential issue | 🟠 MajorTypeTag switch is marked exhaustive without proof.
This can miscompile partial typed matches by turning the default path into
Unreachablewhenis_exhaustiveis false.Safe minimal fix
- let is_switch_exhaustive = otherwise_idx.is_none() - && (is_exhaustive || matches!(switch_kind, Some(SwitchKind::TypeTag))); + let is_switch_exhaustive = otherwise_idx.is_none() && is_exhaustive;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/baml_compiler2_mir/src/lower.rs` around lines 3596 - 3604, The code wrongly treats TypeTag switches as exhaustive without proof; update the is_switch_exhaustive computation (the binding named is_switch_exhaustive in lower.rs) to require the original is_exhaustive flag rather than special-casing SwitchKind::TypeTag — i.e., only set is_switch_exhaustive when otherwise_idx.is_none() && is_exhaustive, removing the matches!(..., SwitchKind::TypeTag) clause so partial typed matches don't turn the default path into Unreachable.
🧹 Nitpick comments (1)
baml_language/crates/baml_compiler2_mir/src/lower.rs (1)
465-493: Avoid recomputing class type tags for every lowering context.
build_class_type_tagswalks all files and is invoked in both constructors, which can add avoidable compile-time overhead. Consider memoizing this map behind a DB query and reusing it here.Also applies to: 621-623, 792-794
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/baml_compiler2_mir/src/lower.rs` around lines 465 - 493, build_class_type_tags currently recomputes the full map by walking all files (using compiler2_all_files, file_item_tree, file_package) each time it's called; convert this to a memoized DB query so the map is computed once and cached and callers (the constructors mentioned) call that query instead. Move the body of build_class_type_tags into a new salsa-tracked/DB query (e.g., class_type_tags_query or Db::class_type_tags) and have build_class_type_tags delegate to that query (or replace callers to call the new query directly); ensure the query returns IndexMap<TypeName,i64> using the same logic (module_path creation, TypeName construction, type_tag counter) so behavior is unchanged but cached. Update the two constructors that currently invoke build_class_type_tags to use the new DB query, and apply the same pattern to the other similar sites called out (lines ~621-623 and ~792-794).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@baml_language/crates/baml_compiler2_mir/src/lower.rs`:
- Around line 3596-3604: The code wrongly treats TypeTag switches as exhaustive
without proof; update the is_switch_exhaustive computation (the binding named
is_switch_exhaustive in lower.rs) to require the original is_exhaustive flag
rather than special-casing SwitchKind::TypeTag — i.e., only set
is_switch_exhaustive when otherwise_idx.is_none() && is_exhaustive, removing the
matches!(..., SwitchKind::TypeTag) clause so partial typed matches don't turn
the default path into Unreachable.
---
Nitpick comments:
In `@baml_language/crates/baml_compiler2_mir/src/lower.rs`:
- Around line 465-493: build_class_type_tags currently recomputes the full map
by walking all files (using compiler2_all_files, file_item_tree, file_package)
each time it's called; convert this to a memoized DB query so the map is
computed once and cached and callers (the constructors mentioned) call that
query instead. Move the body of build_class_type_tags into a new
salsa-tracked/DB query (e.g., class_type_tags_query or Db::class_type_tags) and
have build_class_type_tags delegate to that query (or replace callers to call
the new query directly); ensure the query returns IndexMap<TypeName,i64> using
the same logic (module_path creation, TypeName construction, type_tag counter)
so behavior is unchanged but cached. Update the two constructors that currently
invoke build_class_type_tags to use the new DB query, and apply the same pattern
to the other similar sites called out (lines ~621-623 and ~792-794).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d414997a-d5f5-42ec-8c71-c9ccf9dfc588
⛔ Files ignored due to path filters (11)
baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/__testing_std__/baml_tests____testing_std____06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/format_checks/baml_tests__format_checks__04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/format_checks/baml_tests__format_checks__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/match_exhaustiveness/baml_tests__match_exhaustiveness__04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/match_exhaustiveness/baml_tests__match_exhaustiveness__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/namespaces_type_resolution/baml_tests__namespaces_type_resolution__04_5_mir.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/namespaces_type_resolution/baml_tests__namespaces_type_resolution__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_expanded_unoptimized.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/tests/bytecode_format/snapshots/bytecode_format__bytecode_display_textual.snapis excluded by!**/*.snap
📒 Files selected for processing (2)
baml_language/crates/baml_compiler2_mir/src/lower.rsbaml_language/crates/baml_tests/tests/match_basics.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- baml_language/crates/baml_tests/tests/match_basics.rs
…-arm threshold - Add try_lower_catch_as_switch: catch arms with 4+ typed patterns now use type_tag + jump_table dispatch instead of sequential is_type chains. Handles throw_if_panic for wildcard arms and rethrow for default. - Raise TypeTag switch threshold to 4 arms for both match and catch. For <4 arms the sequential is_type chain is more compact than the type_tag + copy/cmp_op if-else chain. - Add operand metadata to IsType instructions so bytecode debug output shows `is_type Cat` instead of `is_type ?2`. - Add tests: null in TypeTag switch, catch switch optimization (4-arm jump_table, wildcard+throw_if_panic, 2-arm sequential, mixed no-switch).
Resolves conflicts between TypeTag switch dispatch (this branch) and bytecode optimization PR (canary): - lower.rs: adopt canary's classify_pattern_type_tag/format_type_tag_name helpers, updated to support class types via type_tag_for_ty - Fix exhaustive switch assertion: only infer TypeTag exhaustiveness for match expressions, not catch (any value can be thrown at runtime) - Fix otherwise-block lowering to use recomputed is_switch_exhaustive rather than caller's original is_exhaustive flag - exceptions.rs: merge both branch's new tests, update inline snapshots for canary's InitField optimization - Accept snapshot updates from canary's empty-block elimination
…ceof tests - Delete try_lower_catch_as_switch (~160 lines) — the unified try_lower_as_switch already handles catch via SwitchOtherwise::Catch - Delete baml_tests/tests/instanceof.rs (custom test, replaced by snapshots) - Update LSP tests to expect E0098 instanceof-removed diagnostic - Add instanceof_removed insta snapshot project for diagnostic coverage
The if-else switch emitter copied the discriminant before each arm's comparison so it survived for the next arm. On the last arm there is no next arm, so the copy + two pops (match/fallthrough) were wasted. New `emit_compare_and_branch_final` lets `cmp_op ==` consume the discriminant directly, saving 3 instructions per non-exhaustive switch. Also fixes stale `MatchHash` → `DenseTag` comment in bytecode.rs.
Instead of pushing the discriminant once and keeping it alive across arms with copy/pop, each arm now re-loads from the operand. This makes every arm self-contained and eliminates all copy+pop overhead in the if-else chain (2 instructions saved per non-last arm). The copy/pop pattern is retained only in emit_compare_and_branch for the binary search strategy where the discriminant must persist across the tree traversal.
Add JumpIfFalse (peek TOS, jump if false, no pop) and ShortCircuit terminator to eliminate redundant store/load pairs from &&/|| evaluation. The emitter produces the same pattern as the legacy compiler: &&: eval(lhs); JumpIfFalse(join); Pop(1); eval(rhs) ||: eval(lhs); JumpIfFalse(rhs); Jump(join); Pop(1); eval(rhs) The destination local is classified as PhiLike via is_short_circuit_phi, so the value stays on the operand stack with no store_var/load_var. Total codegen: -24,459 bytes (-7.4%) vs canary across all snapshots.
The stack_carry simulation conservatively rejected all locals whose use site was a ShortCircuit terminator, forcing redundant store_var/load_var spills. Since ShortCircuit peeks the operand (value stays on TOS), a PhiLike value already on the stack works correctly with JumpIfFalse. Removes ~3.2KB of redundant bytecode from format_checks and 56 bytes from null_handling.
Update inline snapshots for 3 ignored (compiler2 todo) assignment tests to reflect the InitField optimization from e817f17. The runtime assertions still fail due to unimplemented method-call-through-field patterns — these tests remain #[ignore].
Convert §12/§13 catch tests from external file snapshots to inline snapshots, consistent with every other test in exceptions.rs.
ShortCircuit is side-effect-free unlike Call/Await, so remove it from the force-alive list in eliminate_dead_locals. When a ShortCircuit destination is dead, replace the terminator with Goto to join so dead-block elimination cleans up the eval_rhs block. Also inline the last 2 external match_types snapshots and delete the tests/snapshots/ directory. null_handling codegen: -150 bytes vs canary (was +128 before).
instanceof with TypeTag — JumpTable dispatch, fused IsType, InstanceOfRemoved diagnostic
Artifacts | Task
What problems was I solving
Class-type match arms compiled to sequential
IsTypechains (O(n) per arm), while enum variants already enjoyed O(1) jump table dispatch viaDiscriminant. TheSwitchKind::TypeTaginfrastructure was fully plumbed through MIR -> bytecode -> VM but sat dead behind#[allow(dead_code)]. Meanwhile,instanceofremained in the V2 surface language despite being removed from V1, silently lowering toRvalue::IsTypeinstead of producing the intended E0098 diagnostic. Type checks used two different multi-instruction patterns:LoadConst + CmpOp::InstanceOffor classes andTypeTag + LoadConst + CmpOp::Eqfor primitives. Sparse switch dispatch used O(log K) binary search, which for K=8 arms emits ~4K instructions of comparisons and branching.After this PR:
TypeTag+Terminator::Switch->analyze_switchpicks JumpTable (dense, >=4 arms), PerfectHash (sparse, >=4), or IfElseChain (<4)Instruction::DenseTag) for O(1) dispatch, replacing the O(log K)BinarySearchpath. Memory scales with K (arms), not N (total classes)Instruction::IsType(usize)replaces both multi-instruction type-check patterns, reducing per-arm instruction count from 2-3 to 1instanceofkeyword emits E0098InstanceofRemoveddiagnostic at CST->AST loweringCmpOp::InstanceOfis deleted entirely from bytecode and VMWhat user-facing changes did I ship
instanceofkeyword now produces a helpful error: "instanceofis no longer supported. Use amatchexpression for type checking instead."is_typeinstruction instead of 2-3 instructionsinstanceofwas already non-functional in V1 and all existing tests usematchHow I implemented it
Six incremental commits, each independently testable, delivered in bottom-up order (MIR -> bytecode -> AST cleanup -> optimizations):
Phase 1: Activate SwitchKind::TypeTag (commit dc4a387)
lower.rs-- Core change. Removed#[allow(dead_code)]fromSwitchKind::TypeTagandclass_type_tags. Addedtype_tag_for_ty()helper. Extendedtry_lower_match_as_switchto recognizeAstPattern::Binding(resolving to a class type) andAstPattern::TypedBindingpatterns, routing them throughSwitchKind::TypeTag. Added TypeTag switch-operand emission (Rvalue::TypeTag), arm name generation, and exhaustiveness handling.lower.rs-- Addedbuild_class_type_tags()method that assigns tags iteratingcompiler2_all_filesin emitter order, ensuring MIR tags match runtimeclass.type_tagvalues.match_types.rs-- Integration tests for JumpTable, IfElseChain, and mixed class+primitive dispatch.Phase 2: Fused Instruction::IsType (commit 0f43b6e)
bytecode.rs-- AddedInstruction::IsType(usize). DeletedCmpOp::InstanceOffromCmpOpenum andDisplay.vm.rs-- AddedIsTypehandler: dispatches onValue::Objectfor class identity orValue::Intfor type tag comparison. Removed all sixCmpOp::InstanceOfmatch arms.emit.rs-- Rewroteis_type()to emit a singleInstruction::IsType(const_idx)for both class and primitive paths.debug.rs-- Display forIsTypein expanded, textual, and color modes. Textual mode resolves constant metadata for human-readable output (is_type Catinstead ofis_type ?2).Phase 3: Remove instanceof keyword (commit d32147f)
lower_expr_body.rs--KW_INSTANCEOFnow emitsLoweringDiagnostic::InstanceofRemovedand returnsExpr::Missing.lowering_diagnostic.rs-- AddedInstanceofRemovedvariant with E0098 error code.ast.rs-- DeletedBinaryOp::Instanceofvariant.Instanceofarms frombuilder.rs,lower.rs,from_ast.rs,compiler.rs.instanceof.rs-- Tests verify E0098 diagnostic.Phase 4: Perfect Hash + DenseTag for sparse switches (commits d365605, 98ae8f5, 119a7e9)
emit.rs-- AddedSwitchStrategy::PerfectHashvariant andfind_perfect_hash(): brute-force search over multiply-shift hash familyh(x) = ((x as u64).wrapping_mul(M) >> S) & maskto find collision-free mapping. For K <= 128 keys, tries table sizesnext_power_of_two(K)then2x, with M in1..10_000and S in0..64. Completes in microseconds for typical arm counts.emit.rs-- Addedemit_switch_perfect_hash(): emitstype_tag+DenseTag(table_idx)+JumpTable-- just 3 instructions for O(1) dispatch regardless of tag sparsity. TheDenseTaginstruction remaps sparse tags to dense[0, K-1]indices; unknown tags push-1as sentinel for the jump table's default arm.bytecode.rs-- AddedInstruction::DenseTag(usize),MatchHashTable(multiply/shift/mask + entries),MatchHashEntry(expected_tag + dense_index).Bytecodegainsmatch_hash_tables: Vec<MatchHashTable>field.vm.rs--DenseTaghandler: computesh(tag), verifiesentry.expected_tag == tag, pushesdense_indexor-1sentinel.debug.rs-- Display forDenseTag: showsdense_tag [Cat, Dog, Bird, Fish]with key names from the hash table.Phase 5: Catch arm TypeTag dispatch + threshold tuning (commit 119a7e9)
lower.rs-- Addedtry_lower_catch_as_switch(): catch arms with 4+ typed patterns now use TypeTag switch dispatch withthrow_if_panicfor wildcard arms and rethrow for default. Catch switches are never marked exhaustive (panics must always be rethrowable).lower.rs-- Raised TypeTag switch threshold to 4 arms for both match and catch. Below 4 arms, the sequentialis_typechain is more compact than thetype_tag+ copy/cmp if-else chain.lower.rs-- Adopted canary'sclassify_pattern_type_tag()helper, updated to support class types viatype_tag_for_ty(). Fixed otherwise-block lowering to use recomputedis_switch_exhaustiveflag.emit.rs-- Single-arm exhaustive switches skip the discriminant entirely (notype_tag + pop 1for guaranteed matches).exceptions.rs-- Tests:catch_four_typed_arms_jump_table,catch_four_typed_arms_plus_wildcard_jump_table,catch_two_typed_arms_sequential_chain,catch_mixed_literal_and_typed_no_switch, null routing, and canary's primitive TypeTag + instanceof chain tests.Bug fixes along the way
nullhas a stable type tag (typetag::NULL).type_tag + pop 1when a match has exactly one exhaustive arm.Deviations from the plan
Implemented as planned
type_tag_for_ty(),classify_pattern_type_tag(),try_lower_match_as_switchextension,IsTypeVM handler, emit rewrite,find_perfect_hash()with multiply-shift family,MatchHashTable/MatchHashEntrydata structuresDeviations/surprises
MatchHash->DenseTag: Plan specifiedInstruction::MatchHash(usize), implementation usesInstruction::DenseTag(usize). Naming only -- semantics identical.analyze_switchwould implicitly choose the right strategy. Implementation added an explicit 4-arm minimum for TypeTag switches since the copy/cmp overhead of small switches makes sequentialis_typechains more compact.build_class_type_tags: Plan described inline population during package iteration. Implementation extracted a dedicated method iteratingcompiler2_all_filesin emitter order -- required for correctness to ensure MIR tags match runtime values.required_match_casesreturnsNonefor class types. Not in plan but required for correctness.Additions not in plan
try_lower_catch_as_switch) -- the plan only covered match expressionsis_typeoperand metadata for human-readable debug output (is_type Catnotis_type ?2)Items planned but not implemented
How to verify it
Setup
git fetch git checkout unify-instanceof-with-typetag cd baml_languageAutomated Tests
cargo test -p baml_tests -p bex_vm -p bex_vm_typesManual Verification
cargo test -p baml_tests -- match_class_type_tag_jump_table-- 4+ dense class arms usetype_tag+ jump_tablecargo test -p baml_tests -- match_class_type_is_type_chain-- <4 class arms use sequentialis_typecargo test -p baml_tests -- match_mixed_class_primitive-- mixed class + primitive TypeTag switchcargo test -p baml_tests -- catch_four_typed_arms_jump_table-- catch TypeTag dispatchcargo test -p baml_tests -- instanceof-- E0098 diagnostic emittedis_typereplacingload_const + cmp_op instanceofandtype_tag + load_const + cmp_op ==dense_tag+jump_tablefor sparse >=4-arm TypeTag switchesDescription for the changelog
Unify type checking: activate TypeTag dispatch for class-type match/catch arms with jump table (dense) or perfect hash (sparse) for O(1) dispatch, replace multi-instruction type checks with fused
IsType, and removeinstanceofkeyword with E0098 diagnostic.