We added an extension to YARRRML to generate RML that generates a Linked Data Event Stream (LDES). LDES specifies how to model and publish changes in documents as a stream of events. Each event, called a member in LDES terminology, is a version of an original document.
We provide an ldes key in subjects mappings to generate the necessary LDES members and metadata.
We will explain the different options for generating LDES by showing some examples. The YARRRML mappings and output are abbreviated (prefixes and sources are omitted) to focus on the relevant parts, but the complete examples can be found here.
All examples use the same input data, temperature readings from two sensors:
SensorID,Timestamp,Temperature
1,2023-01-01T08:00:00,8
2,2023-01-01T08:00:00,9
1,2023-01-01T09:00:00,9
2,2023-01-01T09:00:00,9
A basic LDES can be generated by providing just the ldes key without values.
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
tree:member <1#0>, <2#0> ;
tree:shape <shape.shacl> .
By default, the generated subject IRI is checked for uniqueness to determine if a new member needs to be generated.
In this case the subject IRI is based on the SensorID, so there are only two members: one with id 1 and one with
id 2.
Before diving into the details of every property, we show an example that uses all properties that define how an LDES
gets generated, except memberIdFunction.
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        targets:
          - [ out.ttl~void, turtle ]
        ldes:
          id: ex:myldes
          # basically generate a member for each record
          watchedProperties: [$(SensorID), $(Timestamp), $(Temperature)]
          shape: ex:shape.shacl
          timestampPath: [ex:ts, $(Timestamp), xsd:dateTime]
          versionOfPath: [ex:hasVersion, ex:$(SensorID)]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
Output:
<1#0>
ex:hasVersion <1> ;
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00" ;
a ex:Thermometer .
<1#1>
ex:hasVersion <1> ;
ex:temp "9" ;
ex:ts "2023-01-01T09:00:00" ;
a ex:Thermometer .
<2#0>
ex:hasVersion <2> ;
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00" ;
a ex:Thermometer .
<2#1>
ex:hasVersion <2> ;
ex:temp "9" ;
ex:ts "2023-01-01T09:00:00" ;
a ex:Thermometer .
ex:myldes
a ldes:EventStream ;
ldes:timestampPath ex:ts ;
ldes:versionOfPath ex:hasVersion ;
tree:member <1#0>, <1#1>, <2#0>, <2#1> ;
tree:shape <shape.shacl> .
This example defines a custom LDES id and shape IRI: the id key sets the IRI of the ldes:EventStream metadata to ex:myldes.
It also defines a timestampPath and a versionOfPath, which specify the member triples they generate.
The watchedProperties key is used to define which data records end up as members in the LDES.
The watched properties, given as an array, are compared between members that would have the same subject IRI generated
by the subject value template:
- If at least one of these properties changes, the generated subject IRI will be made unique and the member is added.
- If the watched properties remain the same, or if none are given:
  - If the subject IRI template generates a unique IRI: add the new member.
  - If the subject IRI template doesn't generate a unique IRI: discard the new member, because in this case it is considered a duplicate of a previous one.
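The rules above can be sketched in a few lines of Python. This is only an illustration of the selection logic, not the code of the YARRRML parser or the RML processor: the function name generate_member_ids is hypothetical, the `#<counter>` suffix is inferred from the example output in this document, and comparing against the most recently added member of the same subject is an assumption.

```python
import csv
import io

# The sample input used throughout this document.
SAMPLE_CSV = """SensorID,Timestamp,Temperature
1,2023-01-01T08:00:00,8
2,2023-01-01T08:00:00,9
1,2023-01-01T09:00:00,9
2,2023-01-01T09:00:00,9
"""


def generate_member_ids(records, subject_key, watched=()):
    """Return the member IRIs selected by the watchedProperties rules.

    Hypothetical helper: the subject template is simplified to a single
    column, and the "#<n>" suffix mimics the output shown in the examples.
    """
    last_seen = {}  # base subject -> watched values of its latest member
    versions = {}   # base subject -> number of members generated so far
    members = []
    for record in records:
        base = record[subject_key]
        values = tuple(record[name] for name in watched)
        if base in last_seen and values == last_seen[base]:
            # Same subject IRI and no watched value changed (or none given):
            # the record is treated as a duplicate and discarded.
            continue
        # Either a new subject IRI or a changed watched value: add a member.
        index = versions.get(base, 0)
        members.append(f"{base}#{index}")
        versions[base] = index + 1
        last_seen[base] = values
    return members


records = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(generate_member_ids(records, "SensorID"))                   # ['1#0', '2#0']
print(generate_member_ids(records, "SensorID", ["Temperature"]))  # ['1#0', '2#0', '1#1']
print(generate_member_ids(records, "SensorID", ["Timestamp"]))    # ['1#0', '2#0', '1#1', '2#1']
```

The sketch lists members in arrival order, whereas the Turtle output in the examples lists them sorted; the sets of members are the same.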
Here are some examples:
In this case we're only interested in generating a new member if the temperature changes for a sensor:
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          watchedProperties: [$(Temperature)]
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<1#1>
ex:temp "9" ;
ex:ts "2023-01-01T09:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
tree:member <1#0>, <1#1>, <2#0> ;
tree:shape <shape.shacl> .
There are two members for sensor 1 (two readings with different temperature values) and only one for sensor 2 (same values for temperature in each reading).
The previous example showed how to create new members if temperature changes. This example creates new members if the timestamp changes.
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          watchedProperties: [$(Timestamp)]
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<1#1>
ex:temp "9" ;
ex:ts "2023-01-01T09:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#1>
ex:temp "9" ;
ex:ts "2023-01-01T09:00:00"^^xsd:dateTime ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
tree:member <1#0>, <1#1>, <2#0>, <2#1> ;
tree:shape <shape.shacl> .
This time every reading produces a member because for every sensor each reading has a different timestamp.
In the first example, which uses no configuration, no watchedProperties are given, so member generation depends on
the subject template, given by the value key in subjects.
Since the template uses the SensorID, it only generates a member when a reading from a new sensor arrives.
versionOfPath specifies the predicate and object of the LDES's ldes:versionOfPath.
- If not present, no ldes:versionOfPath is generated.
- If a predicate and IRI template are given, then ldes:versionOfPath is defined by the predicate and the value is defined by that template. E.g.:
  versionOfPath: [dcterms:isVersionOf, ex:$(SensorID)]
- If only a predicate is given, then ldes:versionOfPath is defined by that predicate and the value is defined by:
  - the corresponding object mapping for the predicate, if any, or
  - the subject template.
  E.g.:
  versionOfPath: [dcterms:isVersionOf]
  In this case the value template is the subject template: ex:$(SensorID)
- If an empty array is given, then the predicate defaults to dcterms:isVersionOf and the value template defaults to:
  - the corresponding object mapping for the predicate, if any, or
  - the subject value template.
  E.g.:
  versionOfPath: []
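To make the fallback order explicit, here is a minimal Python sketch of the rules above. It is not the parser's implementation: the function name resolve_version_of_path and its arguments are hypothetical, and predicateobject mappings are modelled as plain lists of [predicate, object].

```python
DEFAULT_PREDICATE = "dcterms:isVersionOf"


def resolve_version_of_path(version_of_path, subject_template, po_mappings):
    """Return (predicate, value_template), or None if the key is absent.

    Hypothetical helper that mirrors the documented rules:
    - key absent           -> no ldes:versionOfPath
    - [predicate, value]   -> both taken as given
    - [predicate] or []    -> value falls back to the object mapping for
                              the predicate, else to the subject template
    """
    if version_of_path is None:
        return None
    predicate = version_of_path[0] if version_of_path else DEFAULT_PREDICATE
    if len(version_of_path) >= 2:
        return predicate, version_of_path[1]
    for mapping in po_mappings:
        if mapping[0] == predicate:
            # An existing object mapping for the predicate wins.
            return predicate, mapping[1]
    return predicate, subject_template


po = [["a", "ex:Thermometer"], ["ex:temp", "$(Temperature)"]]
subject = "ex:$(SensorID)"

print(resolve_version_of_path([], subject, po))
# ('dcterms:isVersionOf', 'ex:$(SensorID)')
print(resolve_version_of_path(["ex:hasOriginal"], subject, po))
# ('ex:hasOriginal', 'ex:$(SensorID)')
print(resolve_version_of_path(["ex:hasOriginal", "ex:original/$(SensorID)"], subject, po))
# ('ex:hasOriginal', 'ex:original/$(SensorID)')
```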
Here are some examples:
This example shows that an empty versionOfPath generates the default ldes:versionOfPath with the predicate dcterms:isVersionOf.
The corresponding predicate and objects are generated for each member.
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          versionOfPath: []
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
dcterms:isVersionOf <1> ;
a ex:Thermometer .
<2#0>
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
dcterms:isVersionOf <2> ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
ldes:versionOfPath dcterms:isVersionOf ;
tree:member <1#0>, <2#0> ;
tree:shape <shape.shacl> .
The next example shows how a versionOfPath property with a given predicate results in members using that
predicate, without having to define it in the predicateobject mappings.
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          versionOfPath: [ex:hasOriginal]
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:hasOriginal <1> ;
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:hasOriginal <2> ;
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
ldes:versionOfPath ex:hasOriginal ;
tree:member <1#0>, <2#0> ;
tree:shape <shape.shacl> .
This example shows a versionOfPath with a custom predicate and an object that refers to
an IRI other than the one derived from the subject template.
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          versionOfPath: [ex:hasOriginal, ex:original/$(SensorID)]
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:hasOriginal <original/1> ;
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:hasOriginal <original/2> ;
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
ldes:versionOfPath ex:hasOriginal ;
tree:member <1#0>, <2#0> ;
tree:shape <shape.shacl> .
timestampPath specifies the predicate and optionally the object used to indicate the LDES's ldes:timestampPath.
- If no timestampPath is present, no ldes:timestampPath will be generated.
- If only a predicate is given, it has to be present in the predicateobject mappings. In that case the object is defined there. E.g.:
  timestampPath: [ex:ts]
  In this case a predicateobject mapping must exist, e.g.:
  po: [[ex:ts, $(Timestamp)]]
- If a predicate and an object are given, an implicit predicateobject mapping with the given object will be added. E.g.:
  timestampPath: [ex:ts, $(Timestamp)]
  This is equivalent to the previous example, but no explicit predicateobject mapping needs to be defined.
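The equivalence between the explicit and the implicit form can be illustrated with a short Python sketch. It only mirrors the rules listed above and is not the parser's code: resolve_timestamp_path is a hypothetical name, raising an error for a missing mapping is an assumption about how the "has to be present" requirement might be enforced, and the empty-array case is rejected because the text does not describe it.

```python
def resolve_timestamp_path(timestamp_path, po_mappings):
    """Return (timestampPath predicate or None, effective po mappings).

    Hypothetical helper; mappings are plain lists such as
    ["ex:ts", "$(Timestamp)"].
    """
    mappings = [list(m) for m in po_mappings]  # work on a copy
    if timestamp_path is None:
        # No timestampPath key: no ldes:timestampPath, mappings unchanged.
        return None, mappings
    if not timestamp_path:
        raise ValueError("an empty timestampPath is not covered by this sketch")
    predicate = timestamp_path[0]
    if len(timestamp_path) == 1:
        # Only a predicate: the object must come from an existing mapping.
        if not any(m[0] == predicate for m in mappings):
            raise ValueError(f"no predicateobject mapping found for {predicate}")
        return predicate, mappings
    # Predicate and object given: add the implicit mapping.
    mappings.append(list(timestamp_path))
    return predicate, mappings


po = [["a", "ex:Thermometer"], ["ex:temp", "$(Temperature)"]]
explicit = resolve_timestamp_path(["ex:ts"], po + [["ex:ts", "$(Timestamp)"]])
implicit = resolve_timestamp_path(["ex:ts", "$(Timestamp)"], po)
print(explicit == implicit)  # True
```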
Here are some examples:
This example defines a timestampPath using an existing predicateobject mapping for ex:ts:
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          timestampPath: [ex:ts]
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
ldes:timestampPath ex:ts ;
tree:member <1#0>, <2#0> ;
tree:shape <shape.shacl> .
It is possible to define a custom timestampPath, where the predicate and object are not present in the
predicateobject mappings.
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          timestampPath: [ex:ts, $(Timestamp), xsd:dateTime]
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
Output:
<1#0>
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
ex:eventstream
a ldes:EventStream ;
ldes:timestampPath ex:ts ;
tree:member <1#0>, <2#0> ;
tree:shape <shape.shacl> .
The IRI of the generated ldes:EventStream object defaults to http://example.org/eventStream. This is often not
what you want. This IRI is easily customized with the id key:
YARRRML:
mappings:
  temperature-reading:
    sources: data-source
    subjects:
      - value: ex:$(SensorID)
        ldes:
          id: http://ldes.org/thisisanldeswithacustomid
        targets:
          - [out.ttl~void, turtle]
    po:
      - [a, ex:Thermometer]
      - [ex:temp, $(Temperature)]
      - [ex:ts, $(Timestamp), xsd:dateTime]
Output:
<1#0>
ex:temp "8" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<2#0>
ex:temp "9" ;
ex:ts "2023-01-01T08:00:00"^^xsd:dateTime ;
a ex:Thermometer .
<http://ldes.org/thisisanldeswithacustomid>
a ldes:EventStream ;
tree:member <1#0>, <2#0> ;
tree:shape <shape.shacl> .
The shape key lets you refer to a SHACL shape that can be used to validate members, for instance
by an LDES Server implementation.
It defaults to ex:shape.shacl, but can be customized.
Note that the shape itself is not generated by the LDES extension.