📄🚀 Accommodate other types of instantaneous events #99

@jaygordon

Describe the feature you want and how it meets your needs or solves a problem

As a software developer who works with a range of data sets from various transit agencies and ITS vendors, I’d like the TIDES data model to handle instantaneous events that don’t clearly fit into any of the existing tables.

TIDES currently has three tables for storing instantaneous events:

  • fare_transactions, for AFC events such as fare validations or ticket purchases.
  • vehicle_locations, for location snapshots of transit vehicles (a.k.a. “heartbeats” or “breadcrumbs”).
  • passenger_events, for various events that occur during a stop_visit but that aren't fare transactions (such as APC activity or bike rack deployments).

Putting aside the names of these tables for a moment and looking at them more abstractly, each contains instantaneous events with the following differences:

  • fare_transactions relates to an AFC device and optionally a vehicle, records no location information, and has several fields specific to fare transactions.
  • vehicle_locations relates to a vehicle and contains spatial coordinates and/or odometer information.
  • passenger_events relates to a stop_visit (indirectly, by vehicle and time and/or stop sequence) and a type of event (from an enum), and records no location information because it can be gleaned from the associated stop_visit.

These tables capture some important types of events, but there are other useful instantaneous events that don’t clearly fit into any of these tables. For example:

  • Passenger-related events that don’t entail fare transactions or vehicles, such as gate or turnstile entries or exits.
  • Vehicle-related events that are recorded by stationary equipment and aren’t necessarily relatable to a vehicle ID, such as a track circuit or signal block recording the passage of a train. (Such systems often reference an internal, temporary train identifier that differs from the persistent railcar identifiers.)
  • Onboard passenger-related events that occur between stop_visits and therefore have meaningful location information. For example, a passenger requesting a stop or the AVL system announcing the name of an upcoming stop (both of which can be ingredients for route, pattern, or stop-visit inference when that information isn't directly provided in the data set).
  • Log entries from various systems, such as a driver logging into an AVL system, a station gate being rebooted, or a station escalator turning on or reversing direction.

Describe the solution you'd like

I’d like some way of accommodating the record types mentioned above and, more generally, other types of instantaneous transit records that other users may require now or in the future. I think this could be achieved by redefining or modifying the vehicle_locations and passenger_events tables. (I think fare_transactions should remain relatively unchanged, as it has several fields suited to a very specific kind of event.)

Describe alternatives you've considered

1. Replace vehicle_locations and passenger_events with two somewhat more flexible tables: one for events generated by vehicles or vehicle-mounted devices and the other for events generated by stationary devices.

  • Pro: This accommodates record types that aren’t currently accommodated.
  • Cons:
    • Segregating records by vehicle-mounted vs stationary devices will force some data sources to be split into two tables. For example, logs from an AFC system would need to be split so that farebox records go to one table and vending machine and station-gate records go to another.
    • This ignores mobile devices that aren’t vehicle-mounted, such as a handheld fare validator used by a fare inspector who travels on different vehicles throughout a shift. These could be handled with a third, optional table (for truly mobile as opposed to vehicle-mounted devices), or the vehicle ID field could simply be made optional and these devices could record lat/lon when available.
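
To illustrate the cons of alternative 1, here is a minimal Python sketch of how records from a single AFC log would be routed. The table names `vehicle_device_events` and `stationary_device_events`, the `DeviceRecord` fields, and the `mobile` flag are all hypothetical and not part of TIDES.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DeviceRecord:
    """One instantaneous record from a source system (hypothetical shape)."""
    device_id: str
    timestamp: str  # ISO 8601 text, kept as a string for brevity
    event_type: str
    vehicle_id: Optional[str] = None  # set only for vehicle-mounted devices
    mobile: bool = False  # True for handheld devices (hypothetical flag)


def target_table(record: DeviceRecord) -> str:
    """Pick the hypothetical table a record would land in under alternative 1."""
    if record.vehicle_id is not None:
        return "vehicle_device_events"
    if record.mobile:
        # Handheld devices are neither vehicle-mounted nor stationary.
        return "unassigned"
    return "stationary_device_events"


# A single AFC log that mixes onboard, stationary, and handheld devices.
afc_log = [
    DeviceRecord("farebox-12", "2024-05-01T08:00:00Z", "fare_validation", vehicle_id="bus-301"),
    DeviceRecord("tvm-7", "2024-05-01T08:01:00Z", "ticket_purchase"),
    DeviceRecord("gate-3", "2024-05-01T08:02:00Z", "gate_entry"),
    DeviceRecord("handheld-2", "2024-05-01T08:03:00Z", "fare_inspection", mobile=True),
]

tables: Dict[str, List[str]] = {}
for rec in afc_log:
    tables.setdefault(target_table(rec), []).append(rec.device_id)
```

In this sketch the one input log ends up spread across two tables, and the handheld validator has no natural home unless a third table is added or the vehicle ID is made optional.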

2. Combine vehicle_locations and passenger_events into a single table that has fields for lat/lon, odometer, and references to vehicles and/or places, and name it something more generic.

  • Pros:
    • This avoids having to split an input data source into two tables (see the first con of alternative 1), and can be very flexible for accommodating future record types and data sources.
    • Some combined AVL/APC systems already provide data in this way, as a stream of vehicle, door, and passenger events that all record the current lat, lon, and odometer.
  • Con: It would be more onerous to validate and enforce referential integrity, as different record types should ideally require different fields to be populated. For example, records that currently reside in the vehicle_locations table should ideally require a vehicle_id, while a station-gate exit record should leave that field null but might require a device ID or location ID.
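
The validation burden described in this con could look roughly like the following Python sketch, in which each record type declares which fields it needs and which it must leave null. The event type names, the `REQUIRED_FIELDS` and `MUST_BE_NULL` mappings, and the choice to require `device_id` for gate exits are assumptions made for illustration; TIDES defines none of them.

```python
from typing import Any, Dict, List

# Hypothetical per-type rules for a single combined events table.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "vehicle_location": ["vehicle_id", "timestamp"],
    "station_gate_exit": ["device_id", "timestamp"],
}
MUST_BE_NULL: Dict[str, List[str]] = {
    "station_gate_exit": ["vehicle_id"],
}


def validate(record: Dict[str, Any]) -> List[str]:
    """Return a list of rule violations for one record (empty if valid)."""
    event_type = record.get("event_type")
    if event_type not in REQUIRED_FIELDS:
        return [f"unknown event_type: {event_type!r}"]

    errors: List[str] = []
    for field in REQUIRED_FIELDS[event_type]:
        if record.get(field) is None:
            errors.append(f"{event_type} requires {field}")
    for field in MUST_BE_NULL.get(event_type, []):
        if record.get(field) is not None:
            errors.append(f"{event_type} must leave {field} null")
    return errors
```

Every new record type would add an entry to these mappings, which is the extra upkeep that a single generic table implies compared with one schema per table.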

3. Keep vehicle_locations, add lat/lon/odom fields to passenger_events so that it can capture events between stop visits, and add one or more additional tables to capture the other types of records mentioned above.

  • Pro: Tables are more clearly suited to certain types of events, and less filtering would be required to extract certain record types from a table of many types.
  • Con: This may require adding new tables as TIDES users seek to capture other instantaneous record types, many of which will be structurally quite similar (a device and/or vehicle ID, a timestamp, some event type or message). This could be mitigated by adding one very generic table as a catch-all for all instantaneous events that don't fit into vehicle_locations or passenger_events.
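
The catch-all mitigation could be sketched in Python as below. The `GenericEvent` shape follows the common structure named in the con (a device and/or vehicle ID, a timestamp, an event type or message), but the class itself, the `other_events` table name, and the event type values are hypothetical, not TIDES definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenericEvent:
    """Common shape shared by many instantaneous records (hypothetical)."""
    timestamp: str  # ISO 8601 text, kept as a string for brevity
    event_type: str
    device_id: Optional[str] = None
    vehicle_id: Optional[str] = None
    message: Optional[str] = None

    def __post_init__(self) -> None:
        # The record must be attributable to a device, a vehicle, or both.
        if self.device_id is None and self.vehicle_id is None:
            raise ValueError("an event needs a device_id, a vehicle_id, or both")


# Hypothetical event types that would stay in passenger_events.
PASSENGER_EVENT_TYPES = {"stop_requested", "bike_rack_deployed"}


def target_table(event: GenericEvent) -> str:
    """Route an event under alternative 3 with a generic catch-all table."""
    if event.event_type == "vehicle_location":
        return "vehicle_locations"
    if event.event_type in PASSENGER_EVENT_TYPES:
        return "passenger_events"
    return "other_events"  # hypothetical catch-all table
```

With this routing, a driver login or an escalator reversal lands in the catch-all table without requiring a new table per record type.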

Metadata

Assignees

No one assigned

    Labels

    📄 spec: Pertains to the specification itself
    🚀 feature: Adds a new feature - to spec or code
