Interface VersionedDataSource<T extends VersionedDataUpdate>

Type Parameters:
T - type of the data
All Known Subinterfaces:
AttributeSubscription, ForestSource, ItemTracker

@PublicApi public interface VersionedDataSource<T extends VersionedDataUpdate>

VersionedDataSource establishes a way for sources of some data to version their content, to allow caching and incremental updates.

The data source uses DataVersion to tag every internal update of the data. Clients request updates through the getUpdate(DataVersion) method, passing the last version of the data they have seen. The data source replies with a subclass of VersionedDataUpdate, which carries the updated data and the new version.

A data source may provide incremental updates when the difference between the state of the data on the client and in the source can be identified. The DataVersion class enables that through its signature field, which identifies the sequence (or "version space") of the incremental numeric versions stored in the version field. See DataVersion.
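The role of the signature and the numeric version can be sketched as follows. This is a minimal illustration built on simplified stand-in types, not the real DataVersion or VersionedDataUpdate classes: the names Version, Kind, LoggingSource, kindOfUpdate and changesSince are all hypothetical, and the rule "start a new signature when a change cannot be expressed as a difference" is an assumption about how a source might behave.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for illustration; these are NOT the real DataVersion or VersionedDataUpdate classes.
final class VersionSpaceSketch {
    /** Hypothetical stand-in for DataVersion: a signature plus a numeric version. */
    static final class Version {
        static final Version ZERO = new Version(0, 0);
        final int signature;
        final int version;
        Version(int signature, int version) { this.signature = signature; this.version = version; }
    }

    enum Kind { EMPTY, INCREMENTAL, FULL }

    /** Toy source that logs every change made within the current version space. */
    static final class LoggingSource {
        private int signature = 1;                          // never 0, so ZERO always gets a full update
        private final List<String> log = new ArrayList<>(); // log.get(i) is the change that produced version i + 1

        void add(String item) { log.add(item); }

        /** Simulates a change that cannot be described as a difference: a new version space begins. */
        void reset() { signature++; log.clear(); }

        Version current() { return new Version(signature, log.size()); }

        Kind kindOfUpdate(Version from) {
            boolean sameSpace = from.signature == signature && from.version <= log.size();
            if (!sameSpace) return Kind.FULL;               // versions are not comparable across spaces
            return from.version == log.size() ? Kind.EMPTY : Kind.INCREMENTAL;
        }

        /** Changes made after the given version; only meaningful for an incremental update. */
        List<String> changesSince(Version from) {
            return new ArrayList<>(log.subList(from.version, log.size()));
        }
    }
}
```

In this sketch a client holding a version from an older signature always falls back to a full update, because numeric versions are compared only within one version space.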

Pseudo-code for working with VersionedDataSource would look like this:

   VersionedDataSource<MyDataUpdate> source = ...;
   DataVersion lastVersionSeen = DataVersion.ZERO;
   while (...) {  // this could be event-based or user-initiated periodical updates
     MyDataUpdate update = source.getUpdate(lastVersionSeen);
     if (update.isEmpty()) {
       // usually, do nothing -- nothing has changed, we have the latest version
     } else if (update.isFull()) {
       // full update -- previous state can be removed, and new state taken fully from the update
       ...
     } else if (update.isIncremental()) {
       // incremental update -- the new state can be produced by taking the difference from the update and applying it to the old state
       ...
     }
     lastVersionSeen = update.getVersion(); // update lastVersionSeen so next time we request updates based on what we have
   }
 

Of course, the loop is only an illustration and can be replaced with any other scheme, such as polling. Updates do not have to be requested periodically, but the more time has passed since the last update, the lower the chance that an incremental update is possible.

Laziness a.k.a. Pull Architecture

An important aspect of most services providing VersionedDataSource is laziness. They are ready to answer calls to getUpdate(DataVersion), but if that requires some calculations or getting data from other sources, those activities are delayed until the moment getUpdate(DataVersion) is called. That lets us keep many data sources, with potentially costly update operations, in memory, and spend resources only on those that are actually requested by someone.
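A lazy source of this kind might be sketched as below. This is an illustration under simplifying assumptions, not the actual implementation: versions are plain integers instead of DataVersion, an empty Optional plays the role of an empty update, and the names LazySource, markChanged and recalculations are hypothetical.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Simplified illustration; plain int versions stand in for DataVersion.
final class LazySourceSketch {
    static final class LazySource {
        private final Supplier<String> expensiveCalculation;
        private boolean dirty = true;   // pending work that has not been done yet
        private int version = 0;        // 0 means "nothing calculated yet"
        private String value;
        int recalculations = 0;         // exposed only to show when work really happens

        LazySource(Supplier<String> expensiveCalculation) {
            this.expensiveCalculation = expensiveCalculation;
        }

        /** Cheap: only records that something changed; no calculation happens here. */
        synchronized void markChanged() { dirty = true; }

        /** Cheap: reports the last calculated version and never recalculates. */
        synchronized int getCurrentVersion() { return version; }

        /** Performs pending work, if any, then returns the new value, or empty if the caller is up to date. */
        synchronized Optional<String> getUpdate(int fromVersion) {
            if (dirty) {
                value = expensiveCalculation.get();
                version++;
                recalculations++;
                dirty = false;
            }
            return fromVersion == version ? Optional.empty() : Optional.of(value);
        }
    }
}
```

However many times markChanged() is called, the costly calculation runs at most once per getUpdate() call, and not at all if nobody asks for an update.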

In the same spirit, data sources can be chained. Say, data source A combines output from data sources B and C. It subscribes to both B and C and keeps track of the last seen version in both data sources. When it receives a getUpdate() call, it proceeds to call getUpdate() on B and C and then combines the results. None of the data sources has to do anything until a user or an active component actually calls A.getUpdate().
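The chaining described above could look roughly like the following sketch. It uses hypothetical stand-in types (Update, Source, ListSource, Combined) with plain integer versions and full snapshots only; the real interfaces carry DataVersion and may return incremental updates. Here "subscribing" is reduced to holding references to B and C.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration of chained pull-based sources; not the real API.
final class ChainedSourcesSketch {
    /** Hypothetical minimal update: the new version plus a full snapshot (null when nothing changed). */
    static final class Update {
        final int version;
        final List<String> snapshot;
        Update(int version, List<String> snapshot) { this.version = version; this.snapshot = snapshot; }
        boolean isEmpty() { return snapshot == null; }
    }

    interface Source { Update getUpdate(int fromVersion); }

    /** Leaf source (B or C). Versions start at 1, so 0 means "nothing seen yet". */
    static final class ListSource implements Source {
        private final List<String> items = new ArrayList<>();
        private int version = 1;
        int calls = 0;                      // counts getUpdate() calls, for demonstration only

        void add(String item) { items.add(item); version++; }

        public Update getUpdate(int fromVersion) {
            calls++;
            return new Update(version, fromVersion == version ? null : new ArrayList<>(items));
        }
    }

    /** Source A: pulls from B and C only when its own getUpdate() is called. */
    static final class Combined implements Source {
        private final Source b;
        private final Source c;
        private int seenB = 0;              // last version of B that A has seen
        private int seenC = 0;              // last version of C that A has seen
        private List<String> fromB = new ArrayList<>();
        private List<String> fromC = new ArrayList<>();
        private int version = 0;

        Combined(Source b, Source c) { this.b = b; this.c = c; }

        public Update getUpdate(int fromVersion) {
            Update ub = b.getUpdate(seenB);
            Update uc = c.getUpdate(seenC);
            if (!ub.isEmpty()) fromB = ub.snapshot;
            if (!uc.isEmpty()) fromC = uc.snapshot;
            seenB = ub.version;
            seenC = uc.version;
            if (!ub.isEmpty() || !uc.isEmpty()) version++;  // A changes only when an upstream changed
            if (fromVersion == version) return new Update(version, null);
            List<String> all = new ArrayList<>(fromB);
            all.addAll(fromC);
            return new Update(version, all);
        }
    }
}
```

Constructing Combined does not touch B or C; their getUpdate() methods are first called when a client calls getUpdate() on the combined source.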

  • Method Summary

    Modifier and Type       Method                                Description
    @NotNull DataVersion    getCurrentVersion()                   Returns the current version of the data without triggering data source's recalculation.
    @NotNull T              getUpdate(DataVersion fromVersion)    Returns an update based on the version of the data that the client has.
  • Method Details

    • getUpdate

      @NotNull T getUpdate(@NotNull DataVersion fromVersion)

      Returns an update based on the version of the data that the client has.

      When the caller does not yet have a previous state and its version, it should pass DataVersion.ZERO. A full update is guaranteed in this case.

      If the data source depends on other data sources or has pending changes, this call causes the source to perform any necessary calculations and become up to date.

      Parameters:
      fromVersion - version of the data that is known to the caller
      Returns:
      an update that can be applied at the caller site to get the up-to-date state
    • getCurrentVersion

      @NotNull DataVersion getCurrentVersion()

      Returns the current version of the data without triggering data source's recalculation.

      This should be used only for diagnostics or similar purposes. A correct continuous update algorithm would use getUpdate(DataVersion) only. There are two reasons for that:

      • if you call getCurrentVersion() and then getUpdate(DataVersion), a concurrent modification may happen in between, and
      • if the data source needs to do some calculations to come up with a newer version of the data, this call will not trigger them (unlike getUpdate(DataVersion)).
      Returns:
      the most recent version of the data known to the source