TeamVLA – Two-Arm Newton-Guided Operations

TeamVLA is a benchmark and reference implementation for coordinating two robot arms in a shared Newton physics scene. The system targets three cooperative manipulation tasks—lift-and-place, hand-off, and bimanual drawer—conditioned on multimodal (vision + language) inputs and producing per-arm joint deltas plus gripper commands.

The repository is scaffolded in phases. Each phase document lives under planning/ and captures objectives, deliverables, task breakdowns, and testing strategy.

Getting Started

- Python: 3.10 or newer (managed by uv).
- Newton SDK: Not bundled; follow vendor instructions and ensure headers/binaries are on your library path once you reach the implementation phases.

```
# Create an isolated environment (uv-managed)
uv venv .venv
source .venv/bin/activate  # or .venv\Scripts\activate on Windows

# Install runtime dependencies from pyproject.toml / uv.lock
uv sync

# (Optional) install developer tooling (adds [dependency-groups.dev])
uv sync --dev
```

We default to uv for package management; feel free to translate the commands to your preferred tooling if necessary.

Repository Structure

```
TeamVLA/
  planning/             # Phase docs, architecture notes, prompt
  envs/                 # Newton environment and task registry (Phase 1+)
  control/              # IK utilities, phase machine, scripted controllers (Phase 2+)
  data/                 # Schema, writer, dataloader (Phase 3+)
  models/               # Vision/language encoders and VLA policies (Phase 4+)
  train/                # Losses, schedulers, BC trainer (Phase 4+)
  eval/                 # Metrics, rollouts, benchmarking utilities
  demos/                # Gradio demo entry-point
  scripts/              # CLI helpers for data collection/rendering
  configs/              # YAML configuration stubs (Phase 3+)
  tests/                # Pytest suite (Phase 0+)
  assets/               # Placeholder for Newton assets (not tracked)
```

Most packages now expose concrete functionality—encoders/models, scripted control, data pipelines, evaluation. The architecture still follows the phase roadmap, so extending a module rarely requires reshaping its neighbours.

Development Workflow

- Follow the phase roadmap in order; each step builds on prior scaffolding.
- Keep functions short and single-purpose; prefer private helpers over long public methods.
- Add or update tests alongside code. Every new feature must be covered by a granular unit test.
- Use logging (logging module) for runtime diagnostics; avoid printing in library code.
- Run the tooling stack regularly (install torch and pytest locally for the full test suite):
  - `uv run pytest`
  - `uv run ruff check .`
  - `uv run black --check .`
  - `uv run ty check`
  - `uv run mypy .` (for verbose type debugging when needed)
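The logging guideline above can be illustrated with a minimal sketch of the usual pattern: library modules obtain a module-level logger and leave handler configuration to entry points such as CLIs or the demo. The names `clamp_joint_delta` and `configure_logging` are hypothetical and are not part of TeamVLA.

```python
import logging

# Module-level logger; library code never configures handlers itself.
logger = logging.getLogger(__name__)


def clamp_joint_delta(delta: float, limit: float = 0.05) -> float:
    """Clamp a joint delta to [-limit, limit], logging when clipping occurs.

    Hypothetical helper, used only to illustrate logging in library code.
    """
    clamped = max(-limit, min(limit, delta))
    if clamped != delta:
        # Lazy %-formatting: the message is only built if DEBUG is enabled.
        logger.debug("Clamped joint delta %.4f to %.4f", delta, clamped)
    return clamped


def configure_logging(level: int = logging.INFO) -> None:
    """Configure root logging once, from a CLI or demo entry point."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )
```

Calling `configure_logging(logging.DEBUG)` from a script would surface the clamp messages, while importing the module on its own stays silent.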

Refer to docs/contributing.md and docs/tooling.md for the full contributor checklist and CLI references.

Phase Roadmap

| Phase | Description | Reference |
| --- | --- | --- |
| 0 | Bootstrap structure, metadata, documentation, baseline tests | `planning/phase_0.md` |
| 1 | Environment core + task registry scaffolding | `planning/phase_1.md` |
| 2 | IK utilities, phase machine, scripted demos | `planning/phase_2.md` |
| 3 | Data schema, writer, dataloader, configs | `planning/phase_3.md` |
| 4 | Models, encoders, training loop | `planning/phase_4.md` |
| 5 | Evaluation suite, benchmark CLI, Gradio demo | `planning/phase_5.md` |
| 6 | Comprehensive testing, linting, CI guidance | `planning/phase_6.md` |

Consult the architecture overview (planning/architecture.md) and the prompt scaffold (planning/prompt.md) for fine-grained interface requirements.

Data Collection & Training

- Environment & Control Interfaces: See docs/interfaces/env_control.md for a field-by-field breakdown of observations, action vectors, and scripted policy outputs owned by the simulation/control track.
- Data Pipeline Contract: docs/data_pipeline.md summarises schema expectations, episode writing, dataset transforms, and the artefacts Track B ships to the other teams.
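As a rough illustration of what a two-arm action made of per-arm joint deltas plus gripper commands might look like, here is a small sketch. It is not the repository's actual layout; docs/interfaces/env_control.md remains the authoritative reference. The class names, the 7-joint arm, the gripper convention, and the left-then-right ordering are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Assumption: 7 joints per arm; the real value depends on the robot model.
JOINTS_PER_ARM = 7


@dataclass
class ArmAction:
    """Joint deltas plus one gripper command for a single arm (hypothetical)."""

    joint_deltas: list[float] = field(
        default_factory=lambda: [0.0] * JOINTS_PER_ARM
    )
    gripper: float = 0.0  # assumed convention: 0.0 = open, 1.0 = closed

    def flatten(self) -> list[float]:
        """Return [joint deltas..., gripper] as one flat list."""
        return [*self.joint_deltas, self.gripper]


@dataclass
class TeamAction:
    """Actions for both arms (hypothetical container)."""

    left: ArmAction = field(default_factory=ArmAction)
    right: ArmAction = field(default_factory=ArmAction)

    def flatten(self) -> list[float]:
        """Concatenate both arms: left joints, left gripper, right joints, right gripper."""
        return self.left.flatten() + self.right.flatten()
```

Under these assumptions, `TeamAction().flatten()` yields a 16-element vector of zeros, and setting `right.gripper = 1.0` changes only the last element.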

Generate scripted demonstrations:

```
python -m scripts.collect_demos --task lift --episodes 10 --out data/episodes
```
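To collect demonstrations for several tasks in one go, the command above can be wrapped in a small driver. The sketch below only reuses the flags shown in this README (`--task`, `--episodes`, `--out`). It assumes that `scripts.collect_demos` accepts `handoff` and `drawer` as task names in the same way `eval.bench` does, which is not confirmed here. `build_collect_cmd` and `collect_all` are hypothetical helpers.

```python
import subprocess
import sys

# Task names as passed to eval.bench elsewhere in this README.
# Assumption: scripts.collect_demos accepts the same names.
TASKS = ("lift", "handoff", "drawer")


def build_collect_cmd(task: str, episodes: int, out_dir: str) -> list[str]:
    """Build the argv for one scripts.collect_demos invocation."""
    return [
        sys.executable, "-m", "scripts.collect_demos",
        "--task", task,
        "--episodes", str(episodes),
        "--out", out_dir,
    ]


def collect_all(
    episodes: int = 10,
    out_dir: str = "data/episodes",
    dry_run: bool = True,
) -> list[list[str]]:
    """Run (or, with dry_run, only print) the collection command per task."""
    commands = [build_collect_cmd(task, episodes, out_dir) for task in TASKS]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failing task.
            subprocess.run(cmd, check=True)
    return commands
```

Running `collect_all(dry_run=True)` first is a cheap way to confirm the commands before launching the simulator.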

Inspect or convert collected episodes to videos (falls back to .npy if imageio is unavailable):
```
python -m scripts.render_videos --episodes data/episodes --out videos
```

Launch a behaviour-cloning run (requires torch):

```
python -m train.bc_trainer --config configs/train_bc.yaml
```

Evaluate a policy (currently uses a placeholder zero-action policy until checkpoints are produced):
```
python -m eval.bench --tasks lift handoff drawer --episodes 2 --output results/summary.json
```
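For readers wondering what a placeholder zero-action policy amounts to, the sketch below shows one plausible shape. The `Policy` protocol, the `act` signature, and the `rollout` helper are assumptions made for illustration. They are not the interfaces used by `eval.bench`, and the action size of 16 in the usage note is arbitrary.

```python
from typing import Any, Protocol


class Policy(Protocol):
    """Hypothetical interface: observation + instruction -> flat action."""

    def act(self, observation: dict[str, Any], instruction: str) -> list[float]:
        ...


class ZeroActionPolicy:
    """Placeholder that always returns an all-zero action of a fixed size."""

    def __init__(self, action_dim: int) -> None:
        if action_dim <= 0:
            raise ValueError("action_dim must be positive")
        self.action_dim = action_dim

    def act(self, observation: dict[str, Any], instruction: str) -> list[float]:
        # Inputs are ignored; a trained policy would encode them here.
        return [0.0] * self.action_dim


def rollout(policy: Policy, steps: int, instruction: str) -> list[list[float]]:
    """Query the policy for a fixed number of steps with an empty observation."""
    return [policy.act({}, instruction) for _ in range(steps)]
```

For example, `rollout(ZeroActionPolicy(16), 3, "lift the box")` returns three all-zero actions. A checkpoint-backed policy could later satisfy the same protocol without changes to the calling code.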

Demo

Launch the Gradio-based interactive demo:

```
python -m demos.app
```

It resets the Newton environment per request and reports the placeholder actions returned by the policy shim. Once checkpoints are available, demos.app.load_policy can be extended to load them.

See docs/workflow.md for a detailed walkthrough that ties together setup, scripted data generation, training, evaluation, and demo usage.

Testing

The unit suite now covers control, data, models, evaluation, and scripts, and includes an end-to-end smoke test that links the pipeline stages. Install torch locally to run the full suite:

```
pytest
```

Tests that rely on optional dependencies fall back gracefully if the packages are missing. When authoring new tests, continue to mark Newton-dependent cases with @pytest.mark.requires_newton and use pytest -m "not slow" during quick iterations.
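The marker conventions above could be wired up roughly as follows. This is a sketch, not the repository's actual conftest.py. It assumes the Newton SDK is importable as a module named `newton` and that the `requires_newton` and `slow` markers are registered in the pytest configuration. In a real project the hook would live in conftest.py and the test in its own module; they are shown together only for brevity.

```python
import importlib.util

import pytest

# Assumption: the Newton SDK is importable as a module named "newton".
NEWTON_MODULE = "newton"


def newton_available(module_name: str = NEWTON_MODULE) -> bool:
    """Return True if the named top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def pytest_collection_modifyitems(config, items):
    """Skip tests marked requires_newton when Newton cannot be imported."""
    if newton_available():
        return
    skip = pytest.mark.skip(reason="Newton SDK not installed")
    for item in items:
        if "requires_newton" in item.keywords:
            item.add_marker(skip)


@pytest.mark.requires_newton
@pytest.mark.slow
def test_newton_module_imports():
    """Illustrative Newton-dependent case; skipped when Newton is absent."""
    newton = pytest.importorskip(NEWTON_MODULE)
    assert newton is not None
```

With this in place, `pytest -m "not slow"` deselects the example test during quick iterations, and a full run skips it on machines without Newton rather than failing.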

See docs/testing_strategy.md for category breakdowns, fixtures, and evaluation summary templates. Troubleshooting tips live in docs/troubleshooting.md.

License

MIT License. See LICENSE for details.
