Leading in-app monetization and video advertising platform.
ironSource processes 500K events per second and over 20 billion events daily
Builds, manages, and orchestrates its data lake with minimal coding and maintenance
ironSource, in their own words, is the leading in-app monetization and video advertising platform, making free-to-play and free-to-use possible for over 1.5B people around the world. ironSource helps app developers take their apps to the next level, including the industry’s largest in-app video network. Over 80,000 apps use ironSource technologies to grow their businesses. The massive scale in which ironSource operates across its various monetization platforms—including apps, video, and mediation—leads to millions of end-devices generating massive amounts of streaming data. They need to collect, store, and prepare data to support multiple use cases while minimizing infrastructure and engineering overheads..
After working for several years in a database-focused approach, the rapid growth in ironSource’s data made their previous system unviable from a cost and maintenance perspective. Instead, they adopted a data lake architecture, storing raw event data on object storage, and creating customized output streams that power multiple applications and analytic flows.
When building its data lake, ironSource wanted to ensure that the following specs were met:
- Scale – ironSource processes 500K events per second and over 20 billion events daily. The ability to store near-infinite amounts of data in S3 without preprocessing the data is crucial.
- Flexibility – ironSource uses data to support multiple business processes. Because they need to feed the same data into multiple services to provide for different use cases, the company needed to bypass the rigidity and schema limitations entailed by a database approach. Instead, they store all the original data on S3 and create ad-hoc outputs and transformations as needed.
- Resilience – Because all historical data is on S3, recovery from failure is easier, and errors further down the pipeline are less likely to affect production environments.
"Upsolver has saved thousands of engineering hours and significantly reduced total cost of ownership, which enables us to invest these resources in continuing our hypergrowth rather than data pipelines."
- Seva Feldman, Vice President of Research and Development at ironSource Mobile.
The Solution Architecture
The following diagram shows the architecture ironSource uses: .
Step 1 - Building the input stream in Upsolver:
Using the Upsolver GUI, ironSource connects directly to the relevant Kafka topics and writes them to S3 precisely one time.
Step 2 - Sending tables to Athena in Upsolver
To understand production issues, developers and product teams need access to data. These teams can work with the data directly and answer their own questions by using Upsolver and Athena.
Upsolver simplifies and automates the process of preparing data for consumption in Athena, including compaction, compression, partitioning, and creating and managing tables in the AWS Glue Data Catalog.
Athena’s serverless architecture further compliments this independence, which means there’s no infrastructure to manage and analysts don’t need DevOps to use Amazon Redshift or query clusters for each new question.
Step 3 - Editing the table option for Outputs
IronSource’s BI analysts use Tableau to query and visualize data using SQL. However, performing this type of analysis on streaming data may require extensive ETL and data preparation, which can limit the scope of analysis and create reporting bottlenecks.
IronSource’s cloud data lake architecture enables BI teams to work with big data in Tableau. They use Upsolver to enrich and filter data and write it to Redshift to build reporting dashboards, or send tables to Athena for ad-hoc analytic queries. Tableau connects natively to both Redshift and Athena, so analysts can query the data using regular SQL and familiar tools, rather than relying on manual ETL processes.
Step 4 - Creating a reduced stream for Amazon ES
Engineering teams at IronSource use Amazon ES to monitor and analyze application logs. However, as with any database, storing raw data in Amazon ES is expensive and can lead to production issues.Because a large part of these logs are duplicates, Upsolver deduplicates the data. This reduces Amazon ES costs and improves performance. Upsolver cuts down the size of the data stored in Amazon ES by 70% by aggregating identical records. This makes it viable and cost-effective despite generating a high volume of logs. To do this, Upsolver adds a calculated field to the event stream, which indicates whether a particular log is a duplicate. If so, it filters the log out of the stream that it sends to Amazon ES.
Why ironSource chose Upsolver:
- Self-sufficiency for data consumers – As a self-service platform, Upsolver allows BI developers, Ops, and software teams to transform data streams into tabular data without writing code.
- Improved performance – Because Upsolver stores files in optimized Parquet storage on S3, ironSource benefits from high query performance without manual performance tuning.
- Elastic scaling – ironSource is in hyper-growth, so needs elastic scaling to handle increases in inbound data volume and peaks throughout the week, reprocessing of events from S3, and isolation between different groups that use the data.
- Data privacy – Because ironSource’s VPC deploys Upsolver with no access from outside, there is no risk to sensitive data.
- Thousands of engineering hours saved – ironSource’s DevOps and data engineers save thousands of hours that they would otherwise spend on infrastructure by replacing manual, coding-intensive processes with self-service tools and managed infrastructure.
- Fees reduction – Factoring infrastructure, workforce, and licensing costs, Upsolver significantly reduces ironSource’s total infrastructure costs.
- 15-minute latency from Kafka to end-user – Data consumers can respond and take action with near real-time data.
- X increase in scale – 9 Currently at 0.5M incoming events/sec and 3.5M outgoing events/sec.