Time Travel on S3 - Data Pipeline Examples

Time Travel on Amazon S3 with Upsolver

Being able to go back to a point in time and re-run source data into an output can be an invaluable tool in your toolbox. Consider the following scenarios:

Bug. You’ve discovered an error in your pipeline logic. You want to fix it and then re-run the data so the output reflects the corrected logic.
Schema evolution. A field was added to the source data a while back, and you’ve only just discovered it. You want to re-run the data from the point at which the new field was added.
Test a hypothesis. You believe that adding a new field will help with analytics. Time travel can populate that field historically so you can test the hypothesis. For instance, you believe the number of page views is predictive of purchases, but you never joined the page view data.

Upsolver can handle each of these situations. In this short interactive demo, we show how you can time travel to adapt to a change in your schema. In our example, a new field was added to the input data during the previous month. Upsolver will either alter the existing output table for you or create a new table based on the changes to the source schema. Finally, we make a small change to the SQL to specify a time window, so that the source data in that window is re-processed and the new field that was missed is picked up.
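
To make the idea of re-processing a time window concrete, here is a minimal Python sketch. It is not Upsolver's SQL or API. The event records, the coupon_code field, and the transform and replay helpers are all hypothetical and stand in for raw data kept on S3 and for updated pipeline logic.

```python
from datetime import datetime, timezone

# Hypothetical raw events as they might be retained on S3. The
# "coupon_code" field is an assumed example of a field that started
# appearing in the source during the previous month.
RAW_EVENTS = [
    {"event_time": datetime(2023, 3, 28, tzinfo=timezone.utc), "order_id": 1},
    {"event_time": datetime(2023, 4, 10, tzinfo=timezone.utc), "order_id": 2,
     "coupon_code": "SPRING"},
    {"event_time": datetime(2023, 4, 20, tzinfo=timezone.utc), "order_id": 3,
     "coupon_code": "APRIL10"},
    {"event_time": datetime(2023, 5, 2, tzinfo=timezone.utc), "order_id": 4,
     "coupon_code": "MAY5"},
]


def transform(event):
    """Updated pipeline logic: map the newly discovered field to the output."""
    return {
        "order_id": event["order_id"],
        # Events written before the field existed yield None here.
        "coupon_code": event.get("coupon_code"),
    }


def replay(events, start, end):
    """Re-run the transform over events with start <= event_time < end."""
    return [transform(e) for e in events if start <= e["event_time"] < end]


# Time window to re-process: the month in which the new field appeared.
window_start = datetime(2023, 4, 1, tzinfo=timezone.utc)
window_end = datetime(2023, 5, 1, tzinfo=timezone.utc)

reprocessed = replay(RAW_EVENTS, window_start, window_end)
for row in reprocessed:
    print(row)
```

The half-open window (start inclusive, end exclusive) is a design choice in this sketch: adjacent windows can be replayed back to back without processing any event twice.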
