[Discussion] Data conversion management #143

@cottinisimone

Esrs data conversion management

Topic

This thread is meant to decide which approach this library should take to manage data conversion.

State of the art

Currently esrs implements the weak schema technique. With this technique the library handles missing or superfluous attributes gracefully: the event shape is updated by hand, adding optional fields or removing existing fields from the event. Because this is an optimistic approach, retro-compatibility issues sometimes surface when older events are loaded from the store.

Techniques

Event versioning/multiple versions

In this technique multiple versions of an event type are supported throughout the application. The event structure is extended with a version number. This version number can be read by all the event listeners, and they have to contain knowledge of the different versions in order to support them. In this technique the event store remains intact as old versions are never transformed. There are no extra write operations needed to convert the store.
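To make the trade-off concrete, here is a minimal sketch of the multiple versions technique, not a proposal for the esrs API. Everything in it is hypothetical: the StoredEvent and OrderCreated types, the decode and describe functions, and the use of a string map as a stand-in for a JSON payload (chosen only to keep the example free of external crates).

```rust
use std::collections::HashMap;

/// An event as it might come back from the store: a version number plus a
/// loosely typed payload (a stand-in for a JSON object).
#[derive(Debug, Clone)]
pub struct StoredEvent {
    pub version: u32,
    pub payload: HashMap<String, String>,
}

/// Hypothetical "order created" event in the two shapes it has had so far.
#[derive(Debug, PartialEq)]
pub enum OrderCreated {
    V1 { customer: String },
    V2 { customer: String, currency: String },
}

/// Picks the decoder that matches the version stored with the event.
/// The store itself is never rewritten.
pub fn decode(stored: &StoredEvent) -> Result<OrderCreated, String> {
    let field = |name: &str| {
        stored
            .payload
            .get(name)
            .cloned()
            .ok_or_else(|| format!("missing field `{}`", name))
    };
    match stored.version {
        1 => Ok(OrderCreated::V1 { customer: field("customer")? }),
        2 => Ok(OrderCreated::V2 {
            customer: field("customer")?,
            currency: field("currency")?,
        }),
        other => Err(format!("unsupported version {}", other)),
    }
}

/// A listener: it has to know about every version that was ever written.
pub fn describe(event: &OrderCreated) -> String {
    match event {
        OrderCreated::V1 { customer } => format!("{} ordered (currency unknown)", customer),
        OrderCreated::V2 { customer, currency } => format!("{} ordered in {}", customer, currency),
    }
}
```

The cost shows up in describe: every listener grows a new match arm each time a version is added.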

Implementation

A macro placed on top of an event that, every time the code is compiled, optionally records the new event version in a local schema registry (a file, for example) and checks that the event is retro-compatible.
The downside of this approach is that it requires either a macro attribute to ignore a specific field (for example, if the event store has been updated manually) or asking the user to fix the local schema registry file by hand.
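The macro itself is out of scope for a short example, but the retro-compatibility check it would run against the registry can be sketched. The rule assumed here is the weak schema one: removing a field is fine, adding an optional field is fine, and a field that is required now but was missing or optional before is a breaking change. The FieldKind, Schema and breaking_fields names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Whether a field must be present in the serialized event.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FieldKind {
    Required,
    Optional,
}

/// One registry entry: field name -> kind, for a single event version.
pub type Schema = BTreeMap<&'static str, FieldKind>;

/// Returns the fields of `new` that events written with `old` may not satisfy.
/// An empty result means old events can still be loaded into the new shape.
pub fn breaking_fields(old: &Schema, new: &Schema) -> Vec<&'static str> {
    let mut breaking = Vec::new();
    for (name, kind) in new {
        // Old events are only guaranteed to carry fields that were required.
        let required_before = old.get(name) == Some(&FieldKind::Required);
        if *kind == FieldKind::Required && !required_before {
            breaking.push(*name);
        }
    }
    breaking
}
```

An "ignore this field" attribute, as mentioned above, would amount to filtering names out of the returned list before failing the build.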

Upcasting

Upcasting centralizes the update knowledge in an upcaster: a component that transforms an event before offering it to the application. Unlike in the multiple versions technique, the event listeners are not aware of the different versions of events. Because the upcaster changes the event, the listeners only need to support the latest version.

Implementation

Create a new trait Upcaster that the event must implement, with two functions the user has to write, such as upcast and actual_version. Inside the upcast function, should the user manually deserialize the JSON to read the version and the fields needed to build the latest event version?
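One possible shape for such a trait, as a rough sketch only. The trait and function names come from the idea above; the signatures, the RawEvent type, the UserRegistered event and the string map used in place of a JSON value are all assumptions made to keep the example free of external crates.

```rust
use std::collections::HashMap;

/// Raw event as read from the store, before any typed deserialization.
/// A real implementation would probably hold a JSON value instead of a map.
pub struct RawEvent {
    pub version: u32,
    pub fields: HashMap<String, String>,
}

pub trait Upcaster: Sized {
    /// Latest version of the event, the only one listeners ever see.
    fn actual_version() -> u32;
    /// Brings a raw event of any known version up to the latest shape.
    fn upcast(raw: RawEvent) -> Result<Self, String>;
}

/// Hypothetical event: v1 had a single `name`, v2 splits it in two.
#[derive(Debug, PartialEq)]
pub struct UserRegistered {
    pub first_name: String,
    pub last_name: String,
}

impl Upcaster for UserRegistered {
    fn actual_version() -> u32 {
        2
    }

    fn upcast(mut raw: RawEvent) -> Result<Self, String> {
        // v1 -> v2: split `name` on the first space.
        if raw.version == 1 {
            let name = raw.fields.remove("name").ok_or("v1 event without `name`")?;
            let (first, last) = name.split_once(' ').unwrap_or((name.as_str(), ""));
            raw.fields.insert("first_name".to_string(), first.to_string());
            raw.fields.insert("last_name".to_string(), last.to_string());
            raw.version = 2;
        }
        if raw.version != Self::actual_version() {
            return Err(format!("unsupported version {}", raw.version));
        }
        Ok(UserRegistered {
            first_name: raw.fields.remove("first_name").ok_or("missing `first_name`")?,
            last_name: raw.fields.remove("last_name").ok_or("missing `last_name`")?,
        })
    }
}
```

Each new version would add one more "vN -> vN+1" step at the top of upcast, so older events walk the whole chain while the stored data stays untouched.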

Others

There are three other, more exotic techniques to mention:

  • Lazy transformation: also uses an upcaster to transform every event before offering it to the application, but the result of the transformation is also stored in the event store.
  • In place transformation: transforms every event and updates the existing event in the store with the result of the transformation.
  • Copy and transformation: copies and transforms every event to a new store.
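The difference between the three comes down to where the transformed event is written. A toy sketch over an in-memory slice standing in for the event store; the Row type, the to_latest transformation and the three function names are hypothetical.

```rust
/// One stored event, reduced to a version and an opaque body.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub version: u32,
    pub body: String,
}

/// Hypothetical v1 -> v2 transformation: v2 added a currency, defaulted here.
pub fn to_latest(row: &Row) -> Row {
    if row.version == 1 {
        Row { version: 2, body: format!("{};currency=EUR", row.body) }
    } else {
        row.clone()
    }
}

/// Lazy transformation: upcast on read, then persist the result.
pub fn read_lazy(store: &mut [Row], index: usize) -> Row {
    let latest = to_latest(&store[index]);
    store[index] = latest.clone();
    latest
}

/// In place transformation: rewrite every stored event in one pass.
pub fn transform_in_place(store: &mut [Row]) {
    for row in store.iter_mut() {
        *row = to_latest(row);
    }
}

/// Copy and transformation: build a new store, leave the old one untouched.
pub fn copy_and_transform(old: &[Row]) -> Vec<Row> {
    old.iter().map(to_latest).collect()
}
```

Only copy_and_transform leaves the original rows readable afterwards; the other two overwrite them.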

The biggest downside of these three techniques is that they all write transformed events to the database, breaking the "events are immutable" dictum.

Moreover, lazy transformation and in place transformation are not reliable: because they change the event store permanently, backups become mandatory.

And ofc there's the "leave it as it is" option :).
