Partitioned layer
Several vector checks, e.g. mmu and neighbour, exploit what we call a partitioned layer.
A delivery may contain a layer with a very high number of vertices. Running checks on so many vertices risks geometry overflow in operations such as unioning, or unacceptable resource consumption, which may even lead the kernel to kill the PostgreSQL process. This problem blocks quality checks on products containing complex layers. To overcome it, the QC tool implementation introduces partitioning.
Before a particular check runs, the layer is recursively divided into smaller partitions until each partition contains no more than the maximum allowed number of vertices, e.g. 50000 vertices per partition. The specialized checks then run their algorithms on the partitioned layer.
A partition is a rectangular area, similar to a bounding box in a planar SRS. Partitioning is a recursive process in which every superpartition with too many vertices is divided in half into two subpartitions. The superpartition is always divided on its longer side. If a subpartition has an acceptable number of vertices, it is not divided further; otherwise it becomes a superpartition and is divided again. All features of the superpartition are relocated into the subpartition that covers them. If a feature is not fully covered by one of the two subpartitions, it is split into new features so that each new feature is fully covered by exactly one subpartition. Consequently, no feature ever crosses a partition boundary. The original fid of a split feature is preserved in the new features, so more than one feature may share the same fid at the end of the partitioning process. As superpartitions become empty, they are deleted. Partitioning starts with an initial superpartition set up to cover the whole layer.
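The recursion described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the QC tool's code: the layer is reduced to a bare list of vertices, so feature splitting and fid preservation are not modelled; "divided on its longer side" is read as halving the longer side; and `partition`, `max_vertices` and `max_depth` are hypothetical names. The depth guard is added here only so that coincident vertices cannot cause endless recursion.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def partition(points: List[Point], box: Box, max_vertices: int,
              max_depth: int = 32) -> List[Tuple[Box, List[Point]]]:
    """Halve `box` across its longer side until every leaf partition
    holds at most `max_vertices` points. Empty partitions are dropped."""
    if not points:
        return []
    if len(points) <= max_vertices or max_depth == 0:
        return [(box, points)]

    xmin, ymin, xmax, ymax = box
    if (xmax - xmin) >= (ymax - ymin):
        # The box is wider than tall: halve it along x.
        mid = (xmin + xmax) / 2.0
        first_box, second_box = (xmin, ymin, mid, ymax), (mid, ymin, xmax, ymax)
        first = [p for p in points if p[0] < mid]
        second = [p for p in points if p[0] >= mid]
    else:
        # The box is taller than wide: halve it along y.
        mid = (ymin + ymax) / 2.0
        first_box, second_box = (xmin, ymin, xmax, mid), (xmin, mid, xmax, ymax)
        first = [p for p in points if p[1] < mid]
        second = [p for p in points if p[1] >= mid]

    # Only leaves are returned, so the superpartition itself disappears.
    return (partition(first, first_box, max_vertices, max_depth - 1)
            + partition(second, second_box, max_vertices, max_depth - 1))


# Made-up sample: a dense cluster in one corner plus two distant vertices.
sample = [(0.1 * i, 0.1 * i) for i in range(8)] + [(90.0, 40.0), (95.0, 45.0)]
leaves = partition(sample, (0.0, 0.0, 100.0, 50.0), max_vertices=4)
for leaf_box, leaf_points in leaves:
    print(leaf_box, len(leaf_points))
```

With this sample, the dense corner ends up in two small partitions while the sparse remainder stays in a single large one, which is the behaviour fixed-size tiles cannot provide without tuning.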
An alternative partitioning algorithm would be to divide the layer into tiles of fixed dimensions. However, recursive division has several important benefits over tiles.
- No guesswork is needed to find appropriate tile dimensions.
- The number of vertices entering the computation during partitioning may be as low as one half of that for tiling.
- Most importantly, recursive partitioning is well suited to non-homogeneous layers, i.e. layers where the distribution of vertices or the complexity of features varies across the area. A notable example is the n2k product, where:
  - there are small AOIs sparsely scattered across a large delivery unit area;
  - such AOIs are composed of many small features complemented by a few large sprawling features;
  - one such large feature may have a bounding box covering the whole AOI, so that without partitioning its many vertices interfere with the processing of every other small feature.
Other examples of non-homogeneous products are riparian zones, coastal zones and the street tree layer.
A side effect of partitioning is a speedup of jobs working on layers containing millions of tangled features. Handling millions of rows is no trouble for PostgreSQL; handling millions of vertex operations, however, is real trouble for PostGIS. The situation gets especially bad in neighbourhood algorithms, where the number of computations may grow with the square of the number of vertices. Partitioning eliminates the problem of large sprawling features, i.e. repeatedly processing many vertices that lie far away from the location of interest. There were cases where a job checking a complex layer took two days of computation. With partitioning in place, the same job finished in two hours.
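The quadratic growth can be made concrete with a rough, idealized calculation. The sketch below assumes that the work is proportional to the square of the number of vertices handled together, that a layer of 1,000,000 vertices (a made-up figure) splits evenly into partitions of 50000 vertices, and that splitting adds no vertices. None of this is measured; `pairwise_cost` is a hypothetical helper.

```python
def pairwise_cost(vertex_counts):
    """Idealized cost: the sum of squared vertex counts per processed unit."""
    return sum(n * n for n in vertex_counts)


total_vertices = 1_000_000    # assumed layer size
max_per_partition = 50_000    # the example limit used in this page

unpartitioned = pairwise_cost([total_vertices])
partition_count = total_vertices // max_per_partition
partitioned = pairwise_cost([max_per_partition] * partition_count)

print(partition_count)               # 20
print(unpartitioned // partitioned)  # 20
```

Under these assumptions the work shrinks by a factor equal to the number of partitions. Real layers are not evenly filled, so the actual gain will differ.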
Partitioning is not performed automatically by every job. It is performed only if the job has enabled a check which exploits partitioning, e.g. the mmu check. Partitioning is performed only once per job, even if several checks exploiting it are enabled, e.g. mmu and neighbour.
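The "only when needed, and only once" rule can be illustrated as follows. This is a hypothetical sketch rather than the QC tool's job runner: `run_job`, `PARTITIONED_CHECKS`, `partition_layer` and the check name `example.other` are all invented for illustration; only `vector.mmu` and `vector.neighbour` are taken from this page.

```python
# Checks that need the partitioned layer (names from this page).
PARTITIONED_CHECKS = {"vector.mmu", "vector.neighbour"}


def run_job(enabled_checks, partition_layer):
    """Call `partition_layer` lazily, at most once per job, and only
    if an enabled check needs the partitioned layer."""
    partitioned = None
    report = []
    for check in enabled_checks:
        if check in PARTITIONED_CHECKS:
            if partitioned is None:
                partitioned = partition_layer()  # first use triggers partitioning
            report.append((check, partitioned))
        else:
            report.append((check, None))  # this check works on the plain layer
    return report


partition_calls = []


def partition_layer():
    # Stand-in for the expensive partitioning step.
    partition_calls.append("partitioned")
    return "partitioned-layer"


run_job(["example.other", "vector.mmu", "vector.neighbour"], partition_layer)
print(len(partition_calls))
```

Both partition-aware checks receive the same partitioned layer, and a job with neither of them enabled never calls `partition_layer` at all.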
Checks exploiting the partitioned layer:
- vector.mmu;
- vector.neighbour;
- TODO;