Explore our expert-made templates & start with the right one for you.
Businesses need to make information in relational databases broadly available for analysis, data science, BI and ML, without impacting the performance of the source database with resource-consuming queries.
The best practice for doing this is change data capture (CDC), which transmits change events based on database log activity. The Upsolver SQLake CDC solution creates and maintains a live database replica in your data lake and data warehouse. You can then access live operational data, with minimal impact on the source database. And since SQLake is a powerful stream processor, you can cleanse and transform the data in-stream – mask, normalize, join, aggregate and more.
SQLake replicates your MySQL or Postgres database in the data lake, performs transformations and outputs live tables to Snowflake, Redshift, query engines such as Athena or Redshift Spectrum, or search engines such as Elasticsearch.
SQLake’s CDC solution for database replication provides simplicity, data freshness, reliability and cost-effectiveness. It is based on the popular open source Debezium Engine, but uses SQLake for data movement and processing of transformations and stateful operations. A standard Debezium deployment, by contrast, typically runs on Kafka Connect and needs a Kafka cluster.
SQLake stores the initial database snapshot plus raw change events in Amazon S3 as an immutable log, which allows for easy time travel and reprocessing of past data.
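To see why an immutable log makes time travel straightforward, here is a minimal Python sketch. It is not SQLake code: the event shape is a simplified, Debezium-like envelope (`op`, `ts_ms`, `before`, `after`), and the `replay` function, the sample rows and the `as_of_ms` parameter are all hypothetical names chosen for illustration. Because the snapshot and the log are never modified, any past state can be rebuilt by replaying the log up to a chosen timestamp.

```python
from typing import Any, Dict, Iterable, Optional

Row = Dict[str, Any]
Event = Dict[str, Any]  # simplified, Debezium-like envelope (an assumption)


def replay(snapshot: Iterable[Row], log: Iterable[Event], key: str,
           as_of_ms: Optional[int] = None) -> Dict[Any, Row]:
    """Rebuild table state from a snapshot plus an append-only change log.

    The log is assumed to be ordered by ts_ms. Inputs are never mutated.
    """
    table = {row[key]: dict(row) for row in snapshot}
    for event in log:
        # Time travel: stop once events are newer than the requested point.
        if as_of_ms is not None and event["ts_ms"] > as_of_ms:
            break
        if event["op"] in ("c", "u"):      # create or update
            after = event["after"]
            table[after[key]] = dict(after)
        elif event["op"] == "d":           # delete
            table.pop(event["before"][key], None)
    return table


# Hypothetical sample data.
snapshot = [{"id": 1, "status": "new"}, {"id": 2, "status": "new"}]
log = [
    {"op": "u", "ts_ms": 100,
     "before": {"id": 1, "status": "new"}, "after": {"id": 1, "status": "paid"}},
    {"op": "c", "ts_ms": 200,
     "before": None, "after": {"id": 3, "status": "new"}},
    {"op": "d", "ts_ms": 300,
     "before": {"id": 2, "status": "new"}, "after": None},
]

state_at_150 = replay(snapshot, log, key="id", as_of_ms=150)  # past state
current_state = replay(snapshot, log, key="id")               # latest state
```

Reprocessing works the same way: changing the logic inside the loop and replaying from the start yields a corrected table without touching the source database again.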
CDC support will be expanded to other popular databases such as Oracle Database, Microsoft SQL Server, IBM DB2 and MongoDB in the near future.
Unlike most CDC systems, which focus solely on moving data, SQLake is a scalable, stateful processing engine that allows you to transform the data in-flow.
Unlike previous generations of agent- or appliance-based CDC solutions, SQLake is a cloud-native service that runs in your AWS VPC. Your data never leaves your control, and installation takes only a few minutes.
SQLake pricing is simple. Start with a free fully-featured 30-day trial. Then stay on board for as little as $99 / TB ingested – transformations are free.
To build a CDC pipeline, you use the SQLake IDE or CLI to write SQL statements that
1) Connect to your source database,
2) Snapshot your database and then ingest change events,
3) Transform (e.g. cleanse, filter, aggregate, enrich, join) the data, and
4) Output live tables (merging changed rows) to one or more destination systems.
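The four steps above can be pictured with a small Python sketch. This is an illustration of the general CDC pattern, not SQLake syntax: the source connection is replaced by a hard-coded list, the events use a simplified Debezium-like envelope (`r` for snapshot reads, `c`, `u` and `d` for changes), and `mask_email`, `transform`, `merge` and the sample rows are hypothetical names.

```python
import hashlib
from typing import Any, Dict, Iterable, Iterator

Event = Dict[str, Any]  # simplified, Debezium-like envelope (an assumption)


def mask_email(event: Event) -> Event:
    """Transform step: replace the email in the new row image with a SHA-256 digest."""
    after = event.get("after")
    if after and "email" in after:
        digest = hashlib.sha256(after["email"].encode("utf-8")).hexdigest()
        # Build a new event so the raw input is left untouched.
        event = {**event, "after": {**after, "email": digest}}
    return event


def transform(events: Iterable[Event]) -> Iterator[Event]:
    """Apply the transformation to each event as it streams through."""
    for event in events:
        yield mask_email(event)


def merge(target: Dict[Any, Dict[str, Any]], events: Iterable[Event], key: str) -> None:
    """Output step: upsert or delete rows in the live table by primary key."""
    for event in events:
        if event["op"] in ("r", "c", "u"):   # snapshot read, insert, update
            row = event["after"]
            target[row[key]] = row
        elif event["op"] == "d":             # delete
            target.pop(event["before"][key], None)


# Steps 1 and 2 stand-in: snapshot rows ("r") followed by change events.
source_events = [
    {"op": "r", "before": None, "after": {"id": 1, "email": "a@example.com"}},
    {"op": "r", "before": None, "after": {"id": 2, "email": "b@example.com"}},
    {"op": "u", "before": {"id": 1, "email": "a@example.com"},
     "after": {"id": 1, "email": "a2@example.com"}},
    {"op": "d", "before": {"id": 2, "email": "b@example.com"}, "after": None},
    {"op": "c", "before": None, "after": {"id": 3, "email": "c@example.com"}},
]

live_table: Dict[Any, Dict[str, Any]] = {}
merge(live_table, transform(source_events), key="id")  # steps 3 and 4
```

After the run, `live_table` holds one row per surviving primary key, with the email column masked. In a real pipeline the destination would be a warehouse or lake table rather than an in-memory dictionary.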