
join_near_values

Maarten Hilferink edited this page Apr 8, 2026 · 1 revision


The join_near_values function finds all pairs of points from two datasets that are within a specified distance of each other.

syntax

join_near_values(points_A: A->Point, points_B: B->Point, maxDist: Float) -> unit AB with:
    - attribute<A> first_rel (AB)
    - attribute<B> second_rel (AB)

Variants with explicit result type:

  • join_near_values_uint8 - result indices as UInt8
  • join_near_values_uint16 - result indices as UInt16
  • join_near_values_uint32 - result indices as UInt32 (default)
  • join_near_values_uint64 - result indices as UInt64

definition

Performs a spatial join between two point sets based on proximity. Returns all pairs (a, b) where the Euclidean distance between point a and point b is less than or equal to maxDist.

The result is a new unit (domain) with:

  • first_rel: relation to domain A (the first point in each pair)
  • second_rel: relation to domain B (the second point in each pair)
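The semantics above can be modelled outside GeoDMS with a brute-force sketch. This is not the GeoDMS implementation; the function name `join_near_values_reference`, the tuple representation of points, and the order in which pairs are emitted are all assumptions made for illustration.

```python
from math import hypot

def join_near_values_reference(points_a, points_b, max_dist):
    """Brute-force model: one result row per (a, b) pair within max_dist.

    Returns two parallel lists that play the role of first_rel and second_rel.
    The row order (by a, then by b) is an assumption of this sketch.
    """
    if max_dist < 0:
        raise ValueError("max_dist must be non-negative")
    first_rel, second_rel = [], []
    for a, (ax, ay) in enumerate(points_a):       # zero-based index into A
        for b, (bx, by) in enumerate(points_b):   # zero-based index into B
            if hypot(ax - bx, ay - by) <= max_dist:  # inclusive threshold
                first_rel.append(a)
                second_rel.append(b)
    return first_rel, second_rel

# Hypothetical sample data: three customers, two shops.
customers = [(0.0, 0.0), (10.0, 0.0), (100.0, 100.0)]
shops = [(3.0, 4.0), (10.0, 1.0)]
first_rel, second_rel = join_near_values_reference(customers, shops, 5.0)
print(list(zip(first_rel, second_rel)))
```

Customer 0 lies exactly 5.0 from shop 0 and is included because the threshold is inclusive. Customer 2 has no shop in range and therefore does not appear in the result at all.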

This is useful for:

  • Finding nearby facilities
  • Spatial clustering
  • Network analysis (connecting nodes within distance threshold)
  • Point matching between datasets

arguments

argument   description                                            type
points_A   First point set                                        A->Point
points_B   Second point set                                       B->Point
maxDist    Maximum distance threshold (parameter, Void domain)    {Void}->Float

performance

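Uses a spatial index to answer fixed-radius neighbor queries efficiently. For point sets of sizes n and m, an index-based join typically costs on the order of O((n + m) × log(n + m) + k), where k is the number of resulting pairs. When maxDist is large relative to the point spacing, k approaches n × m and the cost approaches that of comparing every pair.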

Performance is highly dependent on:

  • The maxDist threshold (smaller = faster)
  • Point density and spatial distribution
  • Available memory for spatial index

For large datasets, consider using smaller maxDist values or filtering points beforehand.
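To see why a smaller maxDist helps, consider a minimal uniform-grid sketch. The grid is a stand-in chosen for brevity; the source does not say which spatial index GeoDMS uses, and the function name `join_near_values_grid` is hypothetical.

```python
from collections import defaultdict
from math import floor, hypot

def join_near_values_grid(points_a, points_b, max_dist):
    """Grid-bucket model of a proximity join; returns sorted (a, b) pairs."""
    if max_dist < 0:
        raise ValueError("max_dist must be non-negative")
    # Cell size equals max_dist, so every match lies in the 3x3 block of
    # cells around the query point. Fall back to 1.0 when max_dist is 0.
    cell = max_dist if max_dist > 0 else 1.0
    grid = defaultdict(list)
    for b, (bx, by) in enumerate(points_b):
        grid[(floor(bx / cell), floor(by / cell))].append(b)
    pairs = []
    for a, (ax, ay) in enumerate(points_a):
        cx, cy = floor(ax / cell), floor(ay / cell)
        for gx in (cx - 1, cx, cx + 1):
            for gy in (cy - 1, cy, cy + 1):
                for b in grid.get((gx, gy), ()):   # only nearby candidates
                    bx, by = points_b[b]
                    if hypot(ax - bx, ay - by) <= max_dist:
                        pairs.append((a, b))
    return sorted(pairs)

# Hypothetical sample data.
customers = [(0.0, 0.0), (10.0, 0.0), (100.0, 100.0)]
shops = [(3.0, 4.0), (10.0, 1.0)]
print(join_near_values_grid(customers, shops, 5.0))
```

Each query inspects only the points stored in nine cells. A smaller maxDist shrinks those cells, so fewer candidates are tested and fewer pairs are produced; a large maxDist puts most points in the same few cells and the work approaches an all-pairs comparison.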

conditions

  • points_A and points_B must have the same point type (e.g., both DPoint or both FPoint)
  • maxDist must have Void domain (single parameter value)
  • maxDist must be non-negative
  • Domains A and B must be ordinal and zero-based

example

unit<uint32> Shops: nrofrows = 100;
unit<uint32> Customers: nrofrows = 10000;

attribute<DPoint> shop_location (Shops);
attribute<DPoint> customer_location (Customers);

// Find all customer-shop pairs within 500 meters
unit<uint32> NearbyPairs := join_near_values(
    customer_location,
    shop_location,
    500.0  // meters
);

attribute<Customers> customer_rel (NearbyPairs) := NearbyPairs/first_rel;
attribute<Shops>     shop_rel     (NearbyPairs) := NearbyPairs/second_rel;

// Distance for each pair
attribute<Float64> distance (NearbyPairs) := 
    dist(customer_location[customer_rel], shop_location[shop_rel]);
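
One way to read the pair table produced by this example is to aggregate it back to the first domain. The Python sketch below models that step outside GeoDMS: it counts the shops in range of each customer and picks the nearest one. The function name `summarize_pairs` and the sample data are hypothetical, and the sketch says nothing about how the same aggregation would be written in GeoDMS.

```python
from math import hypot, inf

def summarize_pairs(first_rel, second_rel, points_a, points_b):
    """Per element of A: number of matched B points and index of the nearest.

    Elements of A that occur in no pair get a count of 0 and nearest None.
    """
    n = len(points_a)
    count = [0] * n
    nearest = [None] * n
    best = [inf] * n
    for a, b in zip(first_rel, second_rel):
        (ax, ay), (bx, by) = points_a[a], points_b[b]
        d = hypot(ax - bx, ay - by)     # distance for this pair
        count[a] += 1
        if d < best[a]:                 # keep the closest match so far
            best[a], nearest[a] = d, b
    return count, nearest

# Hypothetical data: pairs within 5.0 units, as a proximity join would give.
customers = [(0.0, 0.0), (10.0, 0.0), (100.0, 100.0)]
shops = [(3.0, 4.0), (10.0, 1.0), (0.0, 1.0)]
first_rel = [0, 0, 1]
second_rel = [0, 2, 1]
count, nearest = summarize_pairs(first_rel, second_rel, customers, shops)
print(count, nearest)
```

Customer 2 has no row in the pair table, so it must be handled explicitly when results are mapped back to the Customers domain.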

see also

since version

7.100
