Explore our expert-made templates & start with the right one for you.
Table of contents
Upsolver creates query-ready live tables for use by a data lake query engine, an external data store such as cloud data warehouse or enterprise search engine, or to a stream processing engine for further transport and downstream processing . In all cases, Upsolver allows you to select and configure the output visually.
You can see a current list of output options here.
Data lake query engines
Upsolver provides a high-performance way to use data lake query engines. Historically engines such as Presto have had slow response times as they had to access file systems directly. Upsolver stores your transformation output as data lake tables which have been auto-optimized for fast queries, via implementation of numerous data lake engineering best practices such as partitioning, compaction, vacuuming and in-memory indexing.
Transforming and writing data to Athena
External data stores
If you need to make data accessible in an external system outside your original cloud object storage bucket, you can configure a native connector to continuously output table-formatted data for query in other cloud object storage, data warehouses, search engine databases and streaming systems. This means you get to take advantage of affordable data lake compute for ingestion and data prep, and pass prepared tables for query on more expensive systems. To keep your output tables up to date, you can simply perform UPSERTs to the output tables through a SQL command (link to further down the page).
For any data source you can set an aggregation time frame to minimize storage requirements and query speed/cost on the output system. You can also configure the time window for partitions and set parameters to manage how long data is retained in the output. For cloud object stores you can store data in either table or hierarchical format.
You can output Upsolver tables to streaming engines such as Kafka and Kinesis with exactly once delivery guarantees. Simply specify the system and topic/stream in the output system. You can also tag streams for identification in external monitoring systems.