1 change: 1 addition & 0 deletions README.md
@@ -32,6 +32,7 @@ This repository contains multiple independent projects.
| [ghrelease](lib/ghrelease/) | beta | Helper for fetching GitHub release assets; API may change without notice. |
| [svghatch](lib/svghatch/) | alpha | Replaces solid colors in SVG files with line patterns for black and white printing. |
| [toml](lib/toml/) | beta | Surgical TOML editor with comment preservation; used by conf. |
| [ts-parser](lib/ts-parser/) | alpha | Pure-Go TypeScript/TSX parser using tree-sitter WASM and wazero; includes JSDoc extraction. |
| [version](lib/version/) | stable | Shared version command implementation for CLI tools. |
| [cli](lib/cli/) | stable | Unified CLI color/styling utilities for all tools. |
| [linters/uselesswrapper](linters/uselesswrapper/) | alpha | Static analysis tool that detects useless function wrappers; integrated via Dagger and mise. |
246 changes: 246 additions & 0 deletions lib/ts-parser/README.md
@@ -0,0 +1,246 @@
# ts-parser

A pure-Go library for parsing TypeScript and TSX source code using tree-sitter grammars compiled to WebAssembly and executed via wazero.

## Features

- Pure Go implementation - no CGO required
- Parses TypeScript (`.ts`) and TSX (`.tsx`) files
- Provides a syntax tree with byte ranges for each node
- Extracts JSDoc comments and links them to declarations
- Works offline - no network access required
- No external tools needed at runtime

## Installation

```bash
go get github.com/neongreen/mono/lib/ts-parser
```

## Usage

### Parsing TypeScript

```go
package main

import (
	"context"
	"fmt"
	"log"

	tsparser "github.com/neongreen/mono/lib/ts-parser"
)

func main() {
	ctx := context.Background()

	// Create a parser
	parser, err := tsparser.NewParser(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer parser.Close(ctx)

	// Parse TypeScript source
	src := []byte(`
function greet(name: string): string {
  return "Hello, " + name;
}
`)

	tree, err := parser.ParseTS(src)
	if err != nil {
		log.Fatal(err)
	}
	defer tree.Close()

	// Get root node
	root, err := tree.RootNode()
	if err != nil {
		log.Fatal(err)
	}

	// Print node information (errors ignored here for brevity)
	typeName, _ := root.TypeName()
	startByte, _ := root.StartByte()
	endByte, _ := root.EndByte()

	fmt.Printf("Root: type=%s, bytes=[%d, %d)\n", typeName, startByte, endByte)

	// Iterate children
	childCount, _ := root.ChildCount()
	for i := uint32(0); i < childCount; i++ {
		child, _ := root.Child(i)
		childType, _ := child.TypeName()
		text, _ := child.Text()
		fmt.Printf("  Child %d: type=%s, text=%q\n", i, childType, text)
	}
}
```

### Parsing TSX

```go
tree, err := parser.ParseTSX([]byte(`
function Greeting({ name }: { name: string }) {
  return <div>Hello, {name}!</div>;
}
`))
if err != nil {
	log.Fatal(err)
}
defer tree.Close()
```

### Extracting JSDoc Comments

```go
src := []byte(`
/** Greets a person by name */
function greet(name: string): string {
  return "Hello, " + name;
}

class Greeter {
  /** The greeting message */
  message: string;

  /** Creates a new Greeter */
  constructor(message: string) {
    this.message = message;
  }
}
`)

tree, err := parser.ParseTS(src)
if err != nil {
	log.Fatal(err)
}
defer tree.Close()

// Extract JSDoc comments
jsdocs, err := tsparser.ExtractJSDoc(src, tree)
if err != nil {
	log.Fatal(err)
}

for _, jsdoc := range jsdocs {
	commentText := string(src[jsdoc.CommentStart:jsdoc.CommentEnd])
	fmt.Printf("JSDoc: %s\n", commentText)
	fmt.Printf("  Attached to: %s (%s)\n", jsdoc.AttachedDecl.Name, jsdoc.AttachedDecl.TypeName)
	fmt.Printf("  Container: %s (%s)\n", jsdoc.ContainerDecl.Name, jsdoc.ContainerDecl.TypeName)
}
```
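
The comment text sliced out above still includes the `/**` and `*/` delimiters. If you want only the prose, a small helper can strip them. The sketch below is not part of the library; `cleanJSDoc` is a hypothetical name, and it uses only the standard library on the raw comment string.

```go
package main

import (
	"fmt"
	"strings"
)

// cleanJSDoc is a hypothetical helper (not part of ts-parser). It strips the
// /** and */ delimiters and the leading "*" gutter from a raw JSDoc comment.
func cleanJSDoc(raw string) string {
	s := strings.TrimSpace(raw)
	s = strings.TrimPrefix(s, "/**")
	s = strings.TrimSuffix(s, "*/")

	var lines []string
	for _, line := range strings.Split(s, "\n") {
		line = strings.TrimSpace(line)
		line = strings.TrimPrefix(line, "*")
		lines = append(lines, strings.TrimSpace(line))
	}
	return strings.TrimSpace(strings.Join(lines, "\n"))
}

func main() {
	raw := "/**\n * Greets a person by name.\n * @param name who to greet\n */"
	fmt.Println(cleanJSDoc(raw))
}
```

In the loop above you would call it as `cleanJSDoc(commentText)`. The helper deliberately does no tag parsing; lines such as `@param` are returned as plain text.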

## API Reference

### Parser

```go
// NewParser creates a new parser with compiled WASM modules.
func NewParser(ctx context.Context) (*Parser, error)

// ParseTS parses TypeScript source code.
func (p *Parser) ParseTS(src []byte) (*Tree, error)

// ParseTSX parses TSX source code.
func (p *Parser) ParseTSX(src []byte) (*Tree, error)

// Close releases all resources.
func (p *Parser) Close(ctx context.Context) error
```

### Tree

```go
// RootNode returns the root node of the syntax tree.
func (t *Tree) RootNode() (*Node, error)

// Close releases resources used by the tree.
func (t *Tree) Close() error

// Source returns the original source code.
func (t *Tree) Source() []byte
```

### Node

```go
// TypeName returns the type name of this node.
func (n *Node) TypeName() (string, error)

// StartByte returns the byte offset where this node starts.
func (n *Node) StartByte() (uint32, error)

// EndByte returns the byte offset where this node ends (exclusive).
func (n *Node) EndByte() (uint32, error)

// ChildCount returns the number of children.
func (n *Node) ChildCount() (uint32, error)

// Child returns the child at the given index.
func (n *Node) Child(index uint32) (*Node, error)

// NamedChildCount returns the number of named children.
func (n *Node) NamedChildCount() (uint32, error)

// NamedChild returns the named child at the given index.
func (n *Node) NamedChild(index uint32) (*Node, error)

// Parent returns the parent node.
func (n *Node) Parent() (*Node, error)

// NextSibling returns the next sibling.
func (n *Node) NextSibling() (*Node, error)

// PrevSibling returns the previous sibling.
func (n *Node) PrevSibling() (*Node, error)

// Text returns the source text corresponding to this node.
func (n *Node) Text() (string, error)

// IsNull returns true if this is a null node.
func (n *Node) IsNull() (bool, error)

// IsError returns true if this node represents a syntax error.
func (n *Node) IsError() (bool, error)
```
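
Because every accessor returns an error, a recursive traversal is easier to read when the error plumbing lives in one place. The sketch below shows one way to walk a tree depth-first and collect the byte ranges of syntax-error nodes. To keep it runnable on its own, `Node` here is a stand-in that copies only the method signatures documented above and is backed by an in-memory tree; `walk` and `errorRanges` are hypothetical helpers, not library functions. How the real library reports missing children (an error versus a null node) is not covered here.

```go
package main

import "fmt"

// Node is a stand-in (assumption): same method signatures as the documented
// Node, but backed by plain fields so the example runs without the library.
type Node struct {
	typ        string
	isErr      bool
	start, end uint32
	children   []*Node
}

func (n *Node) TypeName() (string, error) { return n.typ, nil }
func (n *Node) StartByte() (uint32, error) { return n.start, nil }
func (n *Node) EndByte() (uint32, error) { return n.end, nil }
func (n *Node) IsError() (bool, error) { return n.isErr, nil }
func (n *Node) ChildCount() (uint32, error) { return uint32(len(n.children)), nil }
func (n *Node) Child(i uint32) (*Node, error) {
	if int(i) >= len(n.children) {
		return nil, fmt.Errorf("child index %d out of range", i)
	}
	return n.children[i], nil
}

// walk visits n and all descendants depth-first, stopping at the first error.
func walk(n *Node, visit func(*Node) error) error {
	if err := visit(n); err != nil {
		return err
	}
	count, err := n.ChildCount()
	if err != nil {
		return err
	}
	for i := uint32(0); i < count; i++ {
		child, err := n.Child(i)
		if err != nil {
			return err
		}
		if err := walk(child, visit); err != nil {
			return err
		}
	}
	return nil
}

// errorRanges collects the [start, end) byte range of every error node.
func errorRanges(root *Node) ([][2]uint32, error) {
	var out [][2]uint32
	err := walk(root, func(n *Node) error {
		isErr, err := n.IsError()
		if err != nil || !isErr {
			return err
		}
		start, err := n.StartByte()
		if err != nil {
			return err
		}
		end, err := n.EndByte()
		if err != nil {
			return err
		}
		out = append(out, [2]uint32{start, end})
		return nil
	})
	return out, err
}

func main() {
	root := &Node{typ: "program", end: 20, children: []*Node{
		{typ: "function_declaration", end: 12},
		{typ: "ERROR", isErr: true, start: 13, end: 20},
	}}
	ranges, err := errorRanges(root)
	fmt.Println(ranges, err)
}
```

With the real library, the same `walk` shape should apply if you replace the stand-in with `*tsparser.Node`, assuming the signatures listed above.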

### JSDoc Extraction

```go
// ExtractJSDoc extracts all JSDoc comments from a parsed tree.
func ExtractJSDoc(src []byte, tree *Tree) ([]JSDoc, error)

// JSDoc represents a JSDoc comment with attachment information.
type JSDoc struct {
	CommentStart  int     // Byte offset where comment starts
	CommentEnd    int     // Byte offset where comment ends
	AttachedDecl  NodeRef // Declaration this JSDoc attaches to
	ContainerDecl NodeRef // Nearest enclosing declaration
}

// NodeRef is a stable reference to a node.
type NodeRef struct {
	StartByte int    // Byte offset where node starts
	EndByte   int    // Byte offset where node ends
	Name      string // Display name (may be empty)
	TypeName  string // Grammar type name
}
```
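
`NodeRef` and `JSDoc` carry byte offsets only, while diagnostics and editor integrations usually want line and column numbers. The sketch below shows one way to convert an offset using the standard library. `lineCol` is a hypothetical helper, and the `NodeRef` type is redeclared locally (mirroring the fields above) only so the example compiles on its own. Columns are counted in bytes, not characters.

```go
package main

import (
	"bytes"
	"fmt"
)

// NodeRef mirrors the documented fields so this sketch is self-contained.
type NodeRef struct {
	StartByte int
	EndByte   int
	Name      string
	TypeName  string
}

// lineCol (hypothetical helper) converts a byte offset into a 1-based line
// number and a 1-based byte column. Out-of-range offsets are clamped.
func lineCol(src []byte, offset int) (line, col int) {
	if offset < 0 {
		offset = 0
	}
	if offset > len(src) {
		offset = len(src)
	}
	prefix := src[:offset]
	line = bytes.Count(prefix, []byte("\n")) + 1
	// LastIndexByte returns -1 when there is no newline, which makes the
	// column of offset 0 come out as 1.
	col = offset - bytes.LastIndexByte(prefix, '\n')
	return line, col
}

func main() {
	src := []byte("/** doc */\nfunction greet() {}\n")
	ref := NodeRef{StartByte: 11, EndByte: 30, Name: "greet", TypeName: "function_declaration"}
	line, col := lineCol(src, ref.StartByte)
	fmt.Printf("%s (%s) at %d:%d\n", ref.Name, ref.TypeName, line, col)
}
```

For many lookups against the same file, precomputing a slice of line-start offsets and binary-searching it would avoid rescanning the prefix each time.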

## Building from Source

The WASM file is pre-built and embedded in the library. End users don't need to build it.

For maintainers who need to rebuild the WASM file, see [WASM_BUILD.md](WASM_BUILD.md).

## Constraints

This library is designed to work in restricted environments:

- **CGO_ENABLED=0** - Works without CGO
- **No network access** - Everything is embedded
- **No external tools** - No Node.js, tree-sitter CLI, or compilers needed
- **No C code** - Pure Go with WASM execution via wazero
132 changes: 132 additions & 0 deletions lib/ts-parser/WASM_BUILD.md
@@ -0,0 +1,132 @@
# Building the Tree-Sitter WASM Module

This document explains how the WASM file was produced for the ts-parser library.

**Note**: This documentation is for maintainers only. End users do not need to build the WASM file - it is pre-built and embedded in the library.

## Overview

The library uses a single WASM file:

- `internal/wasm/parser.wasm` - Tree-sitter runtime + TypeScript + TSX grammars

This file contains the tree-sitter parser runtime along with both language grammars, compiled to WASI-compatible WebAssembly using Zig.

## Source Repositories and Pinned Versions

| Component | Repository | Version/Commit |
|-----------|-----------|----------------|
| tree-sitter | https://github.com/tree-sitter/tree-sitter | v0.24.7 |
| tree-sitter-typescript | https://github.com/tree-sitter/tree-sitter-typescript | v0.23.2 |
| Zig compiler | https://ziglang.org | 0.13.0 |

## Prerequisites

### 1. Zig Compiler

Install Zig 0.13.0:

```bash
# Download and extract
curl -sL https://ziglang.org/download/0.13.0/zig-linux-x86_64-0.13.0.tar.xz | tar -xJ

# Add to PATH
export PATH="$PWD/zig-linux-x86_64-0.13.0:$PATH"

# Verify
zig version
# Should output: 0.13.0
```

### 2. Clone Source Repositories

```bash
# Create build directory
mkdir ts-wasm-build && cd ts-wasm-build

# Clone tree-sitter runtime
git clone --depth 1 --branch v0.24.7 https://github.com/tree-sitter/tree-sitter.git ts-runtime

# Clone tree-sitter-typescript grammar
git clone --depth 1 --branch v0.23.2 https://github.com/tree-sitter/tree-sitter-typescript.git ts-typescript
```

## Build Command

Build a single WASM file containing both TypeScript and TSX grammars:

```bash
zig cc --target=wasm32-wasi-musl -mexec-model=reactor \
-I ts-runtime/lib/include \
-I ts-runtime/lib/src \
-I ts-typescript/typescript/src \
-I ts-typescript/tsx/src \
-I ts-typescript/common \
-I ts-typescript \
ts-runtime/lib/src/lib.c \
ts-typescript/typescript/src/parser.c \
ts-typescript/typescript/src/scanner.c \
ts-typescript/tsx/src/parser.c \
ts-typescript/tsx/src/scanner.c \
-o parser.wasm \
-Oz -fPIC \
-Wl,--no-entry \
-Wl,-z -Wl,stack-size=65536 \
-Wl,--strip-debug \
-Wl,--export=malloc \
-Wl,--export=free \
-Wl,--export=strlen \
-Wl,--export=ts_parser_new \
-Wl,--export=ts_parser_parse_string \
-Wl,--export=ts_parser_set_language \
-Wl,--export=ts_parser_delete \
-Wl,--export=ts_language_version \
-Wl,--export=ts_tree_root_node \
-Wl,--export=ts_tree_delete \
-Wl,--export=ts_node_string \
-Wl,--export=ts_node_child_count \
-Wl,--export=ts_node_named_child_count \
-Wl,--export=ts_node_child \
-Wl,--export=ts_node_named_child \
-Wl,--export=ts_node_type \
-Wl,--export=ts_node_start_byte \
-Wl,--export=ts_node_end_byte \
-Wl,--export=ts_node_is_error \
-Wl,--export=ts_node_is_null \
-Wl,--export=ts_node_parent \
-Wl,--export=ts_node_next_sibling \
-Wl,--export=ts_node_prev_sibling \
-Wl,--export=ts_node_next_named_sibling \
-Wl,--export=ts_node_prev_named_sibling \
-Wl,--export=tree_sitter_typescript \
-Wl,--export=tree_sitter_tsx
```

## Installing WASM File

Copy the built file to this repository:

```bash
cp parser.wasm /path/to/mono/lib/ts-parser/internal/wasm/parser.wasm
```

## Verification

After installing the WASM file, verify the library works:

```bash
CGO_ENABLED=0 go test ./...
```

All tests must pass with CGO disabled.
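
Before running the test suite, it can be useful to confirm that the file you copied is a WebAssembly binary at all (for example, that the build did not leave an empty or truncated file). The sketch below is an optional sanity check, not part of the repository. It only inspects the 8-byte module preamble (the `\0asm` magic followed by binary format version 1) and says nothing about whether the expected exports are present. `looksLikeWasm` is a hypothetical name.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// wasmHeader is the magic "\0asm" followed by binary format version 1.
var wasmHeader = []byte{0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00}

// looksLikeWasm reports whether data starts with a WebAssembly v1 preamble.
func looksLikeWasm(data []byte) bool {
	return bytes.HasPrefix(data, wasmHeader)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: wasmcheck <file.wasm>")
		return
	}
	path := os.Args[1]
	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if !looksLikeWasm(data) {
		fmt.Fprintf(os.Stderr, "%s: not a WebAssembly v1 module\n", path)
		os.Exit(1)
	}
	fmt.Printf("%s: ok, %d bytes\n", path, len(data))
}
```

Saved as a standalone file, it could be run with `go run wasmcheck.go internal/wasm/parser.wasm`; the file name `wasmcheck.go` is illustrative.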

## Notes on the Build Process

The WASM file is built using Zig's cross-compilation to WASI (WebAssembly System Interface):

- **Target**: `wasm32-wasi-musl` - 32-bit WebAssembly with WASI for system calls and musl libc
- **Execution model**: `reactor` - The module exports functions rather than having a main entry point
- **Optimization**: `-Oz` for size optimization
- **Stack size**: 65536 bytes (64 KiB)
- **Exports**: Tree-sitter API functions plus both `tree_sitter_typescript` and `tree_sitter_tsx` language functions

The pre-built WASM files from tree-sitter-typescript releases are Emscripten-compiled and expect a JavaScript runtime, so they cannot be used with wazero. The Zig-compiled WASI version works directly with wazero.