Overview
Data Studio datasets may become outdated for various reasons (for example, a change in schema). Because client integrations depend on these datasets, PrecisionLender maintains a dataset versioning and deprecation process.
Dataset Versioning
Version definition: v{major}.{minor} - for example, v2.1 is major version 2 and minor version 1.
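As an illustration, a version label in this format can be split into its numeric parts before comparing versions. The sketch below is only one possible approach; the names DatasetVersion and parse_version are hypothetical and not part of any PrecisionLender API.

```python
import re
from typing import NamedTuple


class DatasetVersion(NamedTuple):
    """Numeric parts of a v{major}.{minor} label (hypothetical helper type)."""
    major: int
    minor: int


_VERSION_RE = re.compile(r"v(\d+)\.(\d+)")


def parse_version(label: str) -> DatasetVersion:
    """Parse a label such as 'v2.1' into DatasetVersion(major=2, minor=1)."""
    match = _VERSION_RE.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"not a v{{major}}.{{minor}} label: {label!r}")
    return DatasetVersion(int(match.group(1)), int(match.group(2)))
```

Comparing the parsed tuples rather than the raw strings avoids ordering mistakes: as text, "v2.10" sorts before "v2.9", but as numbers minor version 10 is later than minor version 9.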
Major version changes signify breaking changes: changes that make the dataset non-backwards compatible (or non-resolvable over the entire time series). These may require changes to client processes.
Examples include, but are not limited to:
- removal of fields
- renamed fields
- reordering of fields
- changes to the data type or meaning of fields
Minor version changes signify non-breaking changes. These typically do not require changes to client processes.
Examples include, but are not limited to:
- the addition of field(s) to the end of the dataset
- minor bug fixes
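To make the distinction concrete, the sketch below classifies a schema change from two ordered field lists. It is an assumption-laden illustration, not PrecisionLender's own logic: the Field type and classify_change function are hypothetical, and a schema is modeled simply as an ordered list of (name, data type) pairs.

```python
from typing import Sequence, Tuple

Field = Tuple[str, str]  # (field name, data type); hypothetical representation


def classify_change(old: Sequence[Field], new: Sequence[Field]) -> str:
    """Return 'none', 'minor', or 'major' for a change from old to new.

    A change is treated as minor only when every existing field keeps its
    name, type, and position and new fields are appended at the end.
    Removals, renames, reorderings, and type changes all break that
    prefix relationship and are therefore classified as major.
    """
    old, new = list(old), list(new)
    if new == old:
        return "none"
    if new[:len(old)] == old:
        return "minor"
    return "major"
```

A comparison like this cannot detect a change in the meaning of a field or a minor bug fix, since neither alters the field list; those cases still need to be identified from Release Notes.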
For some major version updates, PrecisionLender will backfill the entire time series with the updated data. This makes it possible to work with the entire time series without having to handle differences in the dataset across time. When we backfill the time series in this manner, we'll communicate that situation directly and via Release Notes.
Dataset Version Change Timelines
Minor Version Upgrade Timeline
Typically, when a minor version is incremented, the previous version stops being populated and there is no duplication of records across minor versions.
Major Version Upgrade Timeline
When there is a major version update, we will communicate the timeline for the version change.
- One month prior to the new major version release, users will be notified of the coming change.
- For the first two months after deployment of the new version, both the previous major version and the new major version will be populated. During this overlap period, there is the possibility of duplicate entries across versions.
- One month after the release of the new version, users will receive notification of the previous version discontinuation.
- Two months after the deployment of the new version, the previous version will no longer be populated, and end users will be notified.
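The timeline above can be sketched as a small date calculation. This is only an illustration: it assumes "one month" means one calendar month, the function names are hypothetical, and the dates PrecisionLender actually communicates take precedence over anything computed this way.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping to the month's last day."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def major_upgrade_milestones(release: date) -> dict:
    """Approximate milestone dates around a major version release."""
    return {
        # Users are notified of the coming change.
        "advance_notice": add_months(release, -1),
        # Both major versions are populated from this date.
        "new_version_released": release,
        # Users are notified that the previous version will be discontinued.
        "discontinuation_notice": add_months(release, 1),
        # The previous major version is no longer populated.
        "previous_version_stops": add_months(release, 2),
    }
```

Between new_version_released and previous_version_stops, both major versions are populated, so a process that reads more than one major version during that window should expect duplicate entries across versions.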
Best Practices
Specify Major Versions
It is best to specify the major version for each dataset (as opposed to taking the latest major version). Typically, you will not want your downstream processes to start using a new major version without intentionally checking and updating your processes to handle the changes.
Take All Minor Versions within a Major Version – OR – Take the Latest Minor Version within a Major Version
If you are working with time series data or dealing with a dataset whose time series is "incremental" (where each DatePartition does not contain a full set of records), you'll want to read in all minor versions within your specified major version. This is safe because minor version updates are non-breaking and there is no duplication of records across minor versions.
If you are dealing with a dataset whose time series is "snapshot" in nature (where each DatePartition contains a complete set of records), you'll only need the latest minor version within your specified major version (and may only actually need the latest DatePartition within that minor version).
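The two selection strategies can be sketched as follows. The example assumes you already have a list of the version labels available for a dataset; how those labels are discovered (folder names, a catalog, and so on) is not covered here, and the function name select_versions and the kind values are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

_VERSION_RE = re.compile(r"v(\d+)\.(\d+)")


def _parse(label: str) -> Tuple[int, int]:
    """Split a v{major}.{minor} label into (major, minor) integers."""
    match = _VERSION_RE.fullmatch(label)
    if match is None:
        raise ValueError(f"unexpected version label: {label!r}")
    return int(match.group(1)), int(match.group(2))


def select_versions(labels: Iterable[str], major: int, kind: str) -> List[str]:
    """Choose which versions to read for a pinned major version.

    kind='incremental': every minor version within the major version.
    kind='snapshot':    only the latest minor version within it.
    """
    # Keep only the pinned major version, ordered by numeric minor version.
    in_major = sorted(
        (label for label in labels if _parse(label)[0] == major),
        key=lambda label: _parse(label)[1],
    )
    if kind == "incremental":
        return in_major
    if kind == "snapshot":
        return in_major[-1:]
    raise ValueError(f"kind must be 'incremental' or 'snapshot', got {kind!r}")
```

For example, with available labels v1.3, v2.0, v2.1, v2.10, and v3.0 and a pinned major version of 2, the incremental case reads v2.0, v2.1, and v2.10, while the snapshot case reads only v2.10. Version 3.0 is ignored in both cases until the major version is changed deliberately.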
Generally, Do Not Specify Minor Versions
Minor version updates are non-breaking and are therefore released as soon as they are ready. If you specify a minor version, you can unknowingly stop receiving the latest data when the minor version is incremented, because the previous minor version typically stops being populated. Pinning a minor version may be desirable in specific situations, but it is generally not advisable.