Bidirectional converter between the Beneficial Ownership Data Standard (BODS) v0.4 and the Neo4j graph database, with built-in graph analysis queries for UBO detection, corporate group mapping, and circular ownership detection.
Part of the BODS Interoperability Toolkit.
BODS provides a universal, standardised format for beneficial ownership data — enabling interoperability across countries, registers, and data sources. Neo4j provides powerful graph traversal and analysis capabilities that are essential for understanding complex ownership structures.
This tool bridges the gap:
- BODS → Neo4j: Import any BODS v0.4 dataset into Neo4j for graph analysis, UBO detection, and visualisation
- Neo4j → BODS: Export graph data back to BODS format for standardised data exchange and sharing
- Graph Queries: Ready-made Cypher queries for common beneficial ownership analysis tasks
The Beneficial Ownership Data Standard was created by Open Ownership as the world's leading open standard for beneficial ownership information on legal persons and legal arrangements.
BODS data is published by multiple sources including:
- BODS Data Explorer — GLEIF and other datasets in BODS format
- UK PSC Pipeline — UK Companies House data
- GLEIF Pipeline — Legal Entity Identifiers
- OpenCorporates Pipeline — OpenCorporates relationship data
- Kyckr Pipeline — Kyckr UBO verification data
- ICIJ Offshore Leaks Pipeline — Offshore Leaks Database
pip install -e .

For development:

pip install -e ".[dev]"

docker compose up -d

This starts Neo4j Community Edition on bolt://localhost:7687 with the APOC plugin.
Generate Neo4j-importable CSV files and import scripts:
bods-neo4j to-csv examples/sample_data/sample_bods.json -o ./neo4j_export

This produces:
- entities.csv, persons.csv, relationships.csv
- import.cypher: for loading into a running Neo4j instance
- import.sh: for neo4j-admin bulk import
Load BODS data directly into Neo4j:
bods-neo4j to-neo4j examples/sample_data/sample_bods.json \
--uri bolt://localhost:7687 \
--username neo4j \
--password bodspassword

Export the graph back to BODS format:
bods-neo4j to-bods output.jsonl \
--uri bolt://localhost:7687 \
--username neo4j \
--password bodspassword \
--publisher-name "My Organisation"

# BODS file statistics
bods-neo4j info examples/sample_data/sample_bods.json
# Neo4j graph statistics
bods-neo4j graph-info --uri bolt://localhost:7687 --username neo4j --password bodspassword

BODS statements map to Neo4j's property graph model as follows:
| BODS Statement | Neo4j Label(s) | Key Properties |
|---|---|---|
| Entity (registeredEntity) | :Entity:RegisteredEntity | name, recordId, jurisdictionCode, entityType |
| Entity (trust) | :Entity:Arrangement:Trust | name, recordId |
| Entity (stateBody) | :Entity:StateBody | name, recordId |
| Entity (nomination) | :Entity:Arrangement:Nomination | name, recordId |
| Person (knownPerson) | :Person | name, recordId, birthDate, nationalityCode |
| BODS Concept | Neo4j Relationship | Direction | Key Properties |
|---|---|---|---|
| Ownership/Control interest | [:HAS_INTEREST] | (interestedParty)-[:HAS_INTEREST]->(subject) | interestTypes, shareMinimum, shareMaximum, isBeneficialOwnership |
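The label mapping in the tables above can be pictured as a simple lookup. The following is a minimal sketch, not the project's mapper: the names `LABELS_BY_KIND`, `labels_for`, and `cypher_label_string` are hypothetical, and how the real mapper reads the type out of a BODS statement is not shown here.

```python
# Label sets copied from the mapping table; the dictionary keys are the
# parenthesised names used in that table. All names here are hypothetical.
LABELS_BY_KIND = {
    "registeredEntity": ["Entity", "RegisteredEntity"],
    "trust": ["Entity", "Arrangement", "Trust"],
    "stateBody": ["Entity", "StateBody"],
    "nomination": ["Entity", "Arrangement", "Nomination"],
    "knownPerson": ["Person"],
}


def labels_for(kind: str) -> list[str]:
    """Return the Neo4j labels listed in the table for a statement kind."""
    try:
        return LABELS_BY_KIND[kind]
    except KeyError:
        # Assumption: kinds not listed in the table are rejected here.
        # The real tool may well fall back to a generic label instead.
        raise ValueError(f"no label mapping listed for {kind!r}") from None


def cypher_label_string(kind: str) -> str:
    """Render labels in Cypher syntax, e.g. ':Entity:Arrangement:Trust'."""
    return "".join(f":{label}" for label in labels_for(kind))
```

Multiple labels per node mean a query can match broadly on `:Entity` or narrowly on `:Trust` without any extra property filter.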
All BODS metadata is preserved as node/relationship properties:
- Complex nested structures (identifiers, addresses, interests arrays) are stored as *_json properties
- Statement IDs, record IDs, publication details, and source information are preserved
- Data can be converted back to valid BODS v0.4 format without loss
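Neo4j property values cannot be nested maps, which is why nested BODS structures need to be serialised. The sketch below illustrates the `*_json` idea under stated assumptions: `pack_properties` and `unpack_properties` are hypothetical helper names, and the sketch serialises every list and dict, which may differ from what the tool actually does.

```python
import json
from typing import Any


def pack_properties(details: dict[str, Any]) -> dict[str, Any]:
    """Flatten a dict so every value is a scalar or a JSON string."""
    props: dict[str, Any] = {}
    for key, value in details.items():
        if isinstance(value, (dict, list)):
            # Nested structure: store it as text under '<key>_json'.
            props[f"{key}_json"] = json.dumps(value, sort_keys=True)
        else:
            props[key] = value
    return props


def unpack_properties(props: dict[str, Any]) -> dict[str, Any]:
    """Reverse pack_properties by decoding every '<key>_json' value."""
    details: dict[str, Any] = {}
    for key, value in props.items():
        if key.endswith("_json"):
            details[key[: -len("_json")]] = json.loads(value)
        else:
            details[key] = value
    return details


# Round trip on a small, made-up record.
record = {
    "name": "Alpha Ltd",
    "identifiers": [{"id": "12345678", "scheme": "GB-COH"}],
}
assert unpack_properties(pack_properties(record)) == record
```

One limitation of this naive version: a source key that already ends in `_json` and holds a plain string would be misread on the way back.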
from bods_neo4j.config import Neo4jConfig
from bods_neo4j.graph_queries.ubo_detection import find_owners, find_all_ubos
from bods_neo4j.graph_queries.corporate_groups import find_corporate_group, find_top_level_parents
from bods_neo4j.graph_queries.circular_ownership import find_circular_ownership
config = Neo4jConfig(uri="bolt://localhost:7687", username="neo4j", password="bodspassword")
# Find all owners of an entity
owners = find_owners("rec-entity-alpha", config)
# Find all UBOs with >= 25% effective ownership
ubos = find_all_ubos(config, threshold=25.0)
# Map a corporate group
group = find_corporate_group("rec-entity-alpha", config)
# Find top-level parent entities
parents = find_top_level_parents(config, limit=50)
# Detect circular ownership
cycles = find_circular_ownership(config)

Find all owners of an entity:
MATCH path = (owner)-[:HAS_INTEREST*1..10]->(target:Entity {recordId: "rec-entity-alpha"})
WHERE owner:Person OR (owner:Entity AND NOT EXISTS {
MATCH (upstream)-[:HAS_INTEREST]->(owner)
})
RETURN owner.name, length(path) AS depth

Calculate effective ownership through chains:
MATCH path = (person:Person)-[:HAS_INTEREST*1..10]->(entity:Entity)
WITH person, entity, path,
reduce(pct = 1.0, r IN relationships(path) |
CASE WHEN r.shareMinimum IS NOT NULL
THEN pct * (toFloat(r.shareMinimum) / 100.0)
ELSE pct END) * 100.0 AS effectivePct
WHERE effectivePct >= 25
RETURN person.name, entity.name, effectivePct

Detect circular ownership:
MATCH path = (e:Entity)-[:HAS_INTEREST*2..10]->(e)
RETURN e.name, length(path) AS cycleLength,
[n IN nodes(path) | n.name] AS cycleNames

Find entities without identified UBOs:
MATCH (e:Entity)
WHERE NOT EXISTS { MATCH (p:Person)-[:HAS_INTEREST*]->(e) }
AND EXISTS { MATCH ()-[:HAS_INTEREST]->(e) }
RETURN e.name, e.jurisdictionCode

This tool was designed by comparing two approaches to modelling UK beneficial ownership data:
| Approach | Open Ownership (bods-uk-psc-pipeline) | Neo4j Team (neo4j-company-house-demo) |
|---|---|---|
| Output | Standardised BODS v0.4 JSON | Custom Neo4j property graph |
| Strength | Interoperability across data sources | Graph traversal and UBO analysis |
| Weakness | No graph analysis built in | Source-specific, not interoperable |
BODS-Neo4j bridges this gap — standardised data format with graph analysis capabilities.
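As an illustration of the graph analysis side, the arithmetic in the effective ownership query shown earlier can be reproduced in plain Python. This is a sketch of that `reduce` expression only; `effective_percentage` is a hypothetical name and is not claimed to be part of the package.

```python
from typing import Optional, Sequence


def effective_percentage(share_minimums: Sequence[Optional[float]]) -> float:
    """Multiply shareMinimum values along one ownership path.

    Each item is the shareMinimum (in percent) of one HAS_INTEREST
    relationship on the path, or None when the value is missing.
    """
    fraction = 1.0
    for share in share_minimums:
        if share is not None:
            fraction *= float(share) / 100.0
        # As in the Cypher CASE, a missing value leaves the product
        # unchanged, which amounts to treating that link as 100%.
    return fraction * 100.0


# A person holds 50% of A, and A holds 60% of B: roughly 30% of B.
chain = [50.0, 60.0]
print(round(effective_percentage(chain), 2))
```

Because missing `shareMinimum` values do not reduce the product, results for paths with undisclosed shares are best read as upper bounds. The calculation also covers one path at a time and does not sum holdings that reach the same entity through several paths.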
pytest

Tests cover:
- BODS file reading (JSON and JSONL)
- Statement mapping (entities, persons, relationships)
- CSV export with correct structure
- Round-trip fidelity (BODS → Neo4j → BODS)
- BODS schema utilities
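For the first item in the list, the difference between the two input formats is that a JSON file holds one array of statements while a JSONL file holds one statement per line. A minimal sketch of a reader that accepts both follows; `read_statements` is a hypothetical name, and unlike a true streaming reader this version loads a JSON array into memory in one go.

```python
import io
import itertools
import json
from typing import IO, Any, Iterator


def read_statements(handle: IO[str]) -> Iterator[dict[str, Any]]:
    """Yield statements from a JSON array or from JSON Lines text."""
    # Skip leading whitespace to find the first significant character.
    first = handle.read(1)
    while first and first.isspace():
        first = handle.read(1)

    if first == "[":
        # JSON array: parse the whole document at once (not streaming).
        yield from json.loads(first + handle.read())
        return

    # JSONL: one statement per non-empty line. The first line has lost
    # its first character to the sniffing above, so glue it back on.
    first_line = first + handle.readline()
    for line in itertools.chain([first_line], handle):
        line = line.strip()
        if line:
            yield json.loads(line)


as_json = io.StringIO('[{"recordId": "a"}, {"recordId": "b"}]')
as_jsonl = io.StringIO('{"recordId": "a"}\n{"recordId": "b"}\n')
assert list(read_statements(as_json)) == list(read_statements(as_jsonl))
```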
tests/test_bods_fixtures_conformance.py runs the mapper against every case in the canonical bods-v04-fixtures pack via the pytest-bods-v04-fixtures plugin. Tests are parametrized by fixture name so a failure like [edge-cases/10-circular-ownership] points straight at the offending case.
Graph-specific conformance checks include: every statement maps to a node or relationship (no silent None returns from shape divergence); circular ownership emits two distinct mirrored HAS_INTEREST edges; and declared-unknown UBOs (inline unspecifiedReason) don't crash the mapper and still leave the known subject entity as a usable node.
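To make the circular ownership case concrete, here is a small sketch of cycle detection over a plain edge list, where two mirrored edges (A holds an interest in B, and B in A) form the shortest cycle. The function `find_cycles` is hypothetical and runs in memory rather than in Neo4j. It looks for simple cycles of 2 to `max_depth` edges, which is close to, but not the same as, the `*2..10` Cypher pattern, since Cypher forbids repeated relationships rather than repeated nodes.

```python
from collections import defaultdict


def find_cycles(edges, max_depth=10):
    """Return simple cycles as node lists, one per starting node.

    edges: iterable of (interested_party, subject) pairs, each standing
    in for one HAS_INTEREST relationship.
    """
    graph = defaultdict(list)
    for owner, subject in edges:
        graph[owner].append(subject)

    cycles = []

    def walk(start, node, path):
        for nxt in graph[node]:
            if nxt == start and len(path) >= 2:
                # Closed a loop of at least two edges back to the start.
                cycles.append(path + [start])
            elif nxt not in path and len(path) < max_depth:
                walk(start, nxt, path + [nxt])

    # Copy the keys first: walk() may add empty entries to the defaultdict.
    for start in list(graph):
        walk(start, start, [start])
    return cycles


mirrored = [("A", "B"), ("B", "A")]
print(find_cycles(mirrored))  # [['A', 'B', 'A'], ['B', 'A', 'B']]
```

As with the Cypher query, the same loop is reported once for each entity on it, so results may need deduplicating before display.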
src/bods_neo4j/
├── cli.py # Click CLI commands
├── config.py # Configuration dataclasses
├── bods_to_neo4j/
│ ├── reader.py # BODS JSON/JSONL streaming reader
│ ├── mapper.py # BODS statements → Neo4j nodes/relationships
│ ├── csv_exporter.py # CSV export + Cypher/admin import scripts
│ └── driver_loader.py # Direct Neo4j loading via Python driver
├── neo4j_to_bods/
│ ├── extractor.py # Query Neo4j graph
│ ├── mapper.py # Neo4j nodes/rels → BODS statements
│ └── writer.py # Output BODS JSON/JSONL
├── graph_queries/
│ ├── ubo_detection.py # Ultimate beneficial owner traversal
│ ├── corporate_groups.py # Corporate group mapping and metrics
│ └── circular_ownership.py # Cycle detection
└── utils/
    ├── bods_schema.py # BODS v0.4 constants and helpers
    └── neo4j_helpers.py # Neo4j connection and batch operations
MIT